Regular expression matching is a powerful tool (in Java, and in other contexts) but it does have some drawbacks. One of these is that regular expressions tend to be rather expensive.

Pattern and Matcher instances should be reused

Consider the following example:

/**
 * Test if all strings in a list consist of English letters and numbers.
 * @param strings the list to be checked
 * @return 'true' if and only if all strings satisfy the criteria
 * @throws NullPointerException if 'strings' is 'null' or contains a 'null' element.
 */
public boolean allAlphanumeric(List<String> strings) {
    for (String s : strings) {
        if (!s.matches("[A-Za-z0-9]*")) {
            return false;
        }  
    }
    return true;
}

This code is correct, but it is inefficient. The problem is in the matches(...) call. Under the hood, s.matches("[A-Za-z0-9]*") is equivalent to this:

Pattern.matches("[A-Za-z0-9]*", s)

which is in turn equivalent to

Pattern.compile("[A-Za-z0-9]*").matcher(s).matches()

The Pattern.compile("[A-Za-z0-9]*") call parses the regular expression, analyzes it, and constructs a Pattern object that holds the data structure that will be used by the regex engine. This is a non-trivial computation. Then a Matcher object is created to wrap the s argument. Finally, matches() is called to do the actual pattern matching.
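
As a rough sketch, the three steps just described can be written out one per line. The class and method names (MatchSteps, isAlphanumeric) are made up for illustration; only the Pattern and Matcher calls come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchSteps {
    /** Spells out the work that s.matches("[A-Za-z0-9]*") performs on every call. */
    static boolean isAlphanumeric(String s) {
        // Step 1: parse and compile the regex (the expensive part).
        Pattern pattern = Pattern.compile("[A-Za-z0-9]*");
        // Step 2: create a Matcher bound to the input string.
        Matcher matcher = pattern.matcher(s);
        // Step 3: require the entire input to match the pattern.
        return matcher.matches();
    }

    public static void main(String[] args) {
        System.out.println(isAlphanumeric("abc123"));  // prints true
        System.out.println(isAlphanumeric("abc 123")); // prints false
    }
}
```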

The problem is that this work is all repeated for each loop iteration. The solution is to restructure the code as follows:

private static final Pattern ALPHA_NUMERIC = Pattern.compile("[A-Za-z0-9]*");

public boolean allAlphanumeric(List<String> strings) {
    Matcher matcher = ALPHA_NUMERIC.matcher("");
    for (String s : strings) {
        matcher.reset(s);
        if (!matcher.matches()) {
            return false;
        }  
    }
    return true;
}

Note that the javadoc for Pattern states:

Instances of this class are immutable and are safe for use by multiple concurrent threads. Instances of the Matcher class are not safe for such use.
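
In practice, this suggests sharing the compiled Pattern freely while keeping each Matcher confined to a single thread, for example as a local variable. The following is a minimal sketch under that assumption; the class name SharedPattern and the two-thread driver in main are illustrative only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SharedPattern {
    // Pattern is immutable, so one instance can be shared by all threads.
    private static final Pattern ALPHA_NUMERIC = Pattern.compile("[A-Za-z0-9]*");

    static boolean allAlphanumeric(List<String> strings) {
        // The Matcher is a local variable, so each call (and each thread)
        // works with its own instance. Do not keep it in a shared field.
        Matcher matcher = ALPHA_NUMERIC.matcher("");
        for (String s : strings) {
            if (!matcher.reset(s).matches()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> input = Arrays.asList("abc", "123", "x9");
        Runnable task = () -> System.out.println(allAlphanumeric(input));
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
    }
}
```

If a Matcher must outlive a single method call, one option is to give each thread its own instance rather than synchronizing on a shared one.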

Don’t use matches() when you should use find()

Suppose you want to test if a string s contains three or more digits in a row. You can express this in various ways including:

if (s.matches(".*[0-9]{3}.*")) {
    System.out.println("matches");
}

or

if (Pattern.compile("[0-9]{3}").matcher(s).find()) {
    System.out.println("matches");
}

The first one is more concise, but it is also likely to be less efficient. On the face of it, the first version has to match the entire string against the pattern. Furthermore, since “.*” is a “greedy” pattern, the pattern matcher is likely to advance “eagerly” to the end of the string and then backtrack until it finds a match.

By contrast, the second version will search from left to right and will stop searching as soon as it finds three digits in a row.
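
The two pieces of advice can be combined: compile the pattern once and call find() on a fresh Matcher for each input. This is only a sketch; the names ThreeDigits, THREE_DIGITS and containsThreeDigits are hypothetical.

```java
import java.util.regex.Pattern;

class ThreeDigits {
    // Compiled once and reused for every call.
    private static final Pattern THREE_DIGITS = Pattern.compile("[0-9]{3}");

    /** Returns true if s contains three digits in a row anywhere. */
    static boolean containsThreeDigits(String s) {
        // find() scans for the pattern anywhere in the input, so no
        // leading or trailing ".*" is needed.
        return THREE_DIGITS.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(containsThreeDigits("abc123def")); // prints true
        System.out.println(containsThreeDigits("a1b2c3"));    // prints false
    }
}
```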