Sunday, 15 May 2011

java - regex is very slow; how to quickly check whether a string contains only word chars?


I have a method that checks whether a string contains only word characters (most of the strings are a single CJK char). It is called a great many times, so its cost is unacceptable, but I do not know how to optimize it. Any suggestions?

  // requires: import java.util.regex.Pattern;

  /*
   * \w is equivalent to the character class [\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}].
   * For more details see Unicode TR-18, and bear in mind that the set of
   * characters in each class can vary between Unicode releases.
   */
  private static final Pattern sOnlyWordChars = Pattern.compile("\\w+");

  private boolean isOnlyWordChars(String s) {
      return sOnlyWordChars.matcher(s).matches();
  }

When s is "3g", "go_url", or "hao123", isOnlyWordChars should return true.
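One thing worth checking when the input is mostly CJK: in java.util.regex, `\w` by default matches only `[a-zA-Z_0-9]`, and it follows the Unicode definition only when the pattern is compiled with `Pattern.UNICODE_CHARACTER_CLASS` (Java 7 and later). The sketch below compares the two; the class and method names are made up for illustration and are not from the question.

```java
import java.util.regex.Pattern;

final class WordCharRegexDemo {
    // Default \w: ASCII letters, digits and underscore only.
    static final Pattern ASCII_WORD = Pattern.compile("\\w+");

    // With UNICODE_CHARACTER_CLASS, \w also covers non-ASCII letters such as CJK.
    static final Pattern UNICODE_WORD =
            Pattern.compile("\\w+", Pattern.UNICODE_CHARACTER_CLASS);

    static boolean asciiOnlyWordChars(String s) {
        return ASCII_WORD.matcher(s).matches();
    }

    static boolean unicodeOnlyWordChars(String s) {
        return UNICODE_WORD.matcher(s).matches();
    }

    public static void main(String[] args) {
        // "\u4e2d" is a single CJK character.
        for (String s : new String[] {"3g", "go_url", "hao123", "\u4e2d", "a b"}) {
            System.out.println(s + " ascii=" + asciiOnlyWordChars(s)
                    + " unicode=" + unicodeOnlyWordChars(s));
        }
    }
}
```

Both patterns are compiled once and reused, as in the question; only the flag differs.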

  private boolean isOnlyWordChars(String s) {
      char[] chars = s.toCharArray();
      for (char c : chars) {
          // Reject the string as soon as a non-letter is found.
          if (!Character.isLetter(c)) {
              return false;
          }
      }
      return true;
  }
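The question expects "3g", "go_url", and "hao123" to pass, and a letters-only check would reject the digits and the underscore in them. A minimal sketch of a loop that accepts those as well, assuming a word character means letter, digit, or `_` (the class name `WordCharLoop` is hypothetical):

```java
final class WordCharLoop {
    static boolean isOnlyWordChars(String s) {
        // Mirror the "+" in \w+: null or empty input is not a match.
        if (s == null || s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            // Accept Unicode letters and digits plus the underscore.
            if (!Character.isLetterOrDigit(c) && c != '_') {
                return false;
            }
        }
        return true;
    }
}
```

This works on `char` values, so characters outside the Basic Multilingual Plane would need the code-point variants of these methods instead.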

A better implementation:

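  public static boolean isAlpha(String str) {
      if (str == null) {
          return false;
      }
      int sz = str.length();
      // The listing was cut off inside the loop header; the rest is
      // assumed to follow Apache Commons Lang's StringUtils.isAlpha.
      for (int i = 0; i < sz; i++) {
          if (!Character.isLetter(str.charAt(i))) {
              return false;
          }
      }
      return true;
  }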

Or, if you are using Apache Commons, use StringUtils.isAlpha(). The second implementation in this answer is actually the source code of isAlpha.

UPDATE

Sorry for the late reply. I was not sure about the speed, although I have read in many places that a loop is faster than a regex. To make sure, I ran a test, and here are the results:

Running 5000000 times:

with your code: 4.99 seconds (after this it hits a runtime error, so it does not work for big data)

with my first code: 2.71 seconds

with my second code: 1.06 seconds

with my code 0.36 seconds with your code

with my second code: 0.33 seconds

I used sample code for these measurements.
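The sample code itself did not survive in this copy of the post. As a stand-in, here is a rough sketch of how such a comparison could be set up; it is not the author's code, all names in it are hypothetical, and the loop is restricted to ASCII word characters so that both methods accept exactly the same strings. Timings from a naive harness like this vary with JIT warm-up and hardware, so no expected numbers are given.

```java
import java.util.regex.Pattern;

final class WordCharBenchmark {
    static final Pattern WORD = Pattern.compile("\\w+");

    static boolean viaRegex(String s) {
        return WORD.matcher(s).matches();
    }

    // Hand-written equivalent of the default (ASCII-only) \w+.
    static boolean viaLoop(String s) {
        int n = s.length();
        if (n == 0) {
            return false;
        }
        for (int i = 0; i < n; i++) {
            char c = s.charAt(i);
            boolean word = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9') || c == '_';
            if (!word) {
                return false;
            }
        }
        return true;
    }

    // Returns elapsed nanoseconds for the given number of calls.
    static long timeNanos(boolean useRegex, String[] inputs, int iterations) {
        int hits = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            String s = inputs[i % inputs.length];
            if (useRegex ? viaRegex(s) : viaLoop(s)) {
                hits++;
            }
        }
        long elapsed = System.nanoTime() - start;
        // Use the counter so the calls cannot be optimized away entirely.
        if (hits < 0) {
            throw new IllegalStateException();
        }
        return elapsed;
    }

    public static void main(String[] args) {
        String[] inputs = {"3g", "go_url", "hao123", "\u4e2d", "a b"};
        int iterations = 5000000;
        // Warm-up pass for each method before measuring.
        timeNanos(true, inputs, iterations);
        timeNanos(false, inputs, iterations);
        System.out.printf("regex: %.2f s%n", timeNanos(true, inputs, iterations) / 1e9);
        System.out.printf("loop:  %.2f s%n", timeNanos(false, inputs, iterations) / 1e9);
    }
}
```

For measurements that need to be trusted, a dedicated harness such as JMH is the safer choice.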

N.B. There may be small mistakes. You can play with it to test different scenarios. As for January's comment, I think those are minor points, such as using private or public, though testing the condition is a good point.
