Java regex skipping matches
- by Mihail Burduja
I have some text; I want to extract pairs of words that are not separated by punctuation. Thi is the code:
//n-grams
Pattern p = Pattern.compile("[a-z]+");
if (n == 2) {
p = Pattern.compile("[a-z]+ [a-z]+");
}
if (n == 3) {
p = Pattern.compile("[a-z]+ [a-z]+ [a-z]+");
}
Matcher m = p.matcher(text.toLowerCase());
ArrayList<String>…