Java: How to get the positions of all matches in a String?
- by user692704
I have a text document and a query (the query may be more than one word). I want to find the positions of all occurrences of the query in the document.
I thought of using documentText.indexOf(query) or a regular expression, but I could not make either work.
I ended up with the following method.
First, I created a data type called QueryOccurrence:
import java.io.Serializable;

// Holds the start and end position of one occurrence of the query.
public class QueryOccurrence implements Serializable {
  private int start;
  private int end;
  public QueryOccurrence() {}
  // Note: nameText is accepted but not stored.
  public QueryOccurrence(int nameStart, int nameEnd, String nameText) {
    start = nameStart;
    end = nameEnd;
  }
  public int getStart() {
    return start;
  }
  public int getEnd() {
    return end;
  }
  public void SetStart(int i) {
    start = i;
  }
  public void SetEnd(int i) {
    end = i;
  }
}
Then I used this data type in the following method:
// Requires java.util.ArrayList and java.util.List.
public static List<QueryOccurrence> FindQueryPositions(String documentText, String query) {
    // Normalize does the following: lower case, trim, and remove punctuation
    String normalizedQuery = Normalize.Normalize(query);
    String normalizedDocument = Normalize.Normalize(documentText);
    String[] documentWords = normalizedDocument.split(" ");
    String[] queryArray = normalizedQuery.split(" ");
    List<QueryOccurrence> foundQueries = new ArrayList<>();
    QueryOccurrence foundQuery = new QueryOccurrence();
    int index = 0; // index of the current word in the document
    for (String word : documentWords) {
        if (word.equals(queryArray[0])) {
            foundQuery.SetStart(index);
        }
        if (word.equals(queryArray[queryArray.length - 1])) {
            foundQuery.SetEnd(index);
            // Only the first and last query words are compared; the span
            // between them must be as long as the query.
            if ((foundQuery.getEnd() - foundQuery.getStart()) + 1 == queryArray.length) {
                // add the found query to the list
                foundQueries.add(foundQuery);
                // reset the foundQuery variable to use it again
                foundQuery = new QueryOccurrence();
            }
        }
        index++;
    }
    return foundQueries;
}
This method returns a list of all occurrences of the query in the document, each with its start and end position (as word indexes).
Could you suggest an easier and faster way to accomplish this task?
Thanks
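
For comparison, here is a minimal sketch of the two ideas mentioned in the question: repeated indexOf calls and a regular expression. It is not the method above. It reports character offsets rather than word indexes, it does no normalization, and the names MatchPositions, findAllIndexOf and findAllRegex are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchPositions {

    // Returns {start, end} character offsets (end exclusive) of every
    // occurrence of query in text, using repeated indexOf calls.
    static List<int[]> findAllIndexOf(String text, String query) {
        List<int[]> hits = new ArrayList<>();
        if (query.isEmpty()) {
            return hits; // an empty query would match at every position
        }
        int from = 0;
        int pos;
        while ((pos = text.indexOf(query, from)) >= 0) {
            hits.add(new int[] {pos, pos + query.length()});
            from = pos + 1; // advance by one so overlapping matches are also found
        }
        return hits;
    }

    // Same result shape using java.util.regex. Pattern.quote makes the query
    // match literally; CASE_INSENSITIVE ignores case. Matcher.find() resumes
    // after the previous match, so overlapping occurrences are skipped.
    static List<int[]> findAllRegex(String text, String query) {
        List<int[]> hits = new ArrayList<>();
        if (query.isEmpty()) {
            return hits;
        }
        Pattern pattern = Pattern.compile(Pattern.quote(query), Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            hits.add(new int[] {matcher.start(), matcher.end()});
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "The cat and the hat";
        for (int[] hit : findAllRegex(text, "the")) {
            System.out.println(hit[0] + "-" + hit[1]); // prints 0-3 then 12-15
        }
    }
}
```

The indexOf version is case-sensitive and finds overlapping matches; the regex version ignores case and skips overlaps. If word indexes are needed, as in the original method, the character offsets would still have to be mapped back to word positions.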