Regex | Matches |
<a +href="http://([\w\.-]+) | OK. The domain name in an HTML link.
That is, "<a", one or more blanks, href="http://, then
one or more word characters, dots, or hyphens.
Group 1 captures the domain name.
The literal parts of the pattern (a, href, http) match only lowercase;
to match regardless of case, create the Pattern object with
Pattern.CASE_INSENSITIVE as the second argument to Pattern.compile. |
<a +href="http://(.+)["/?:] | BAD. This pattern would appear to work,
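stopping when a character after the domain name is found. However, "+" is a
greedy quantifier: ".+" first matches every character to the end of the line
("." does not match line terminators by default), then backs up until it finds
one of the terminating characters (", /, ?, or :). It therefore stops at the last
such character on the line, which almost surely is not the one that ends the domain name. |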
<a +href="http://(.+?)["/?:] | OK. Fixes the above pattern
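by using the lazy (reluctant) quantifier +?, which matches as few characters
as possible and so stops at the first terminating character. |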
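
The three patterns above can be compared with a short Java sketch. The class name LinkDomainDemo, the helper firstDomain, and the sample HTML lines are made up for illustration and are not part of the original notes; only Pattern, Matcher, and Pattern.CASE_INSENSITIVE come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LinkDomainDemo {
    // The three patterns from the table, written as Java string literals.
    static final String SIMPLE = "<a +href=\"http://([\\w\\.-]+)";
    static final String GREEDY = "<a +href=\"http://(.+)[\"/?:]";
    static final String LAZY   = "<a +href=\"http://(.+?)[\"/?:]";

    // Hypothetical helper: returns group 1 of the first match, or null if none.
    static String firstDomain(String regex, String input, int flags) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Sample line with two links (invented for this example).
        String line = "<a href=\"http://www.example.com/index.html\">Home</a> "
                    + "<a href=\"http://other.example.org/\">Other</a>";

        // Character class stops at the first character that is not
        // a word character, dot, or hyphen.
        System.out.println(firstDomain(SIMPLE, line, 0));

        // Greedy .+ runs to the end of the line and backs up only as far as
        // the last terminator, so group 1 swallows both links.
        System.out.println(firstDomain(GREEDY, line, 0));

        // Lazy .+? stops at the first terminator after the domain name.
        System.out.println(firstDomain(LAZY, line, 0));

        // Uppercase markup needs the CASE_INSENSITIVE flag.
        String upper = "<A HREF=\"HTTP://WWW.EXAMPLE.COM/\">";
        System.out.println(firstDomain(SIMPLE, upper, 0));
        System.out.println(firstDomain(SIMPLE, upper, Pattern.CASE_INSENSITIVE));
    }
}
```

On this sample line the first and third patterns should both capture www.example.com, while the greedy pattern captures text that extends into the second link.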