Get the package of a Java source file
- by Oak
My goal is to find the package (as string) of a Java source file, given as plaintext and not already sorted in folders.
I can't just locate the first instance of the keyword package in the file, because it may appear inside a comment. So I was thinking about two alternatives:
Scan the file word-by-word, maintaining an "inside-a-comment" flag for the scanner. The first time the package keyword is encountered while not inside a comment, stop the scanning and report the result.
Use a regex - this should be theoretically possible because block comments do not nest in Java, but I tried making such a regex and it turned out to be quite complicated - for me, at least.
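For illustration, here is a rough sketch of what the regex alternative might look like. It relies on the fact that a package declaration can only be preceded by whitespace and comments, so the pattern skips those from the start of the file and then expects the keyword. The class name PackageRegex and the method findPackage are made up for this example. The sketch assumes there are no annotations before the declaration (as in package-info.java) and no comments or whitespace inside the package name itself.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class PackageRegex {
    // Skip any run of whitespace, line comments and block comments,
    // then require "package <name>;". DOTALL lets ".*?" span lines
    // inside a block comment; the possessive quantifiers (++, *+)
    // stop the engine from backtracking into already-skipped text.
    private static final Pattern PACKAGE_DECL = Pattern.compile(
        "(?:\\s++|//[^\\n]*+|/\\*.*?\\*/)*+"
            + "package\\s+([\\w$.]+)\\s*;",
        Pattern.DOTALL);

    // Returns the package name, or empty for the default package.
    static Optional<String> findPackage(String source) {
        Matcher m = PACKAGE_DECL.matcher(source);
        // lookingAt() anchors the match at the start of the input, so the
        // attempt fails as soon as the first real token is not "package".
        return m.lookingAt() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

Because the match is anchored at the start of the input, a file whose first non-comment token is something else (an import, a class) is rejected without scanning the rest of it.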
Another difference between the two approaches is that when scanning manually I can stop the scan when I can be certain the package keyword can no longer appear, saving some time... and I'm not sure I can do something similar with regexes. On the other hand, the decision "when it can no longer appear" is not necessarily simple, though I could use some heuristic for that.
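As for deciding when the keyword can no longer appear: in a compilation unit the package declaration, if present, comes before any import or type declaration, so the scan can stop at the first token outside a comment. If that token is not package, the file is in the default package. A minimal sketch of the manual scan under that rule follows. It scans character by character rather than word by word, since comment delimiters are not always separated from words by whitespace. The names PackageScanner and findPackage are hypothetical, and annotations before the declaration or comments inside the package name are not handled.

```java
import java.util.Optional;

// Hypothetical helper, not part of any library.
final class PackageScanner {
    static Optional<String> findPackage(String src) {
        int i = 0;
        int n = src.length();
        // Skip whitespace and comments until the first real token.
        while (i < n) {
            if (Character.isWhitespace(src.charAt(i))) {
                i++;
            } else if (src.startsWith("//", i)) {
                int eol = src.indexOf('\n', i);
                if (eol < 0) return Optional.empty(); // comment runs to EOF
                i = eol + 1;
            } else if (src.startsWith("/*", i)) {
                // Block comments do not nest, so the first "*/" closes it.
                int end = src.indexOf("*/", i + 2);
                if (end < 0) return Optional.empty(); // unterminated comment
                i = end + 2;
            } else {
                break; // first token outside a comment
            }
        }
        // Early stop: anything other than "package" here means default package.
        if (!src.startsWith("package", i)) return Optional.empty();
        i += "package".length();
        if (i >= n || !Character.isWhitespace(src.charAt(i))) return Optional.empty();
        int semi = src.indexOf(';', i);
        if (semi < 0) return Optional.empty();
        // Drop whitespace around and inside the dotted name.
        String name = src.substring(i, semi).replaceAll("\\s+", "");
        return name.isEmpty() ? Optional.empty() : Optional.of(name);
    }
}
```

With this rule the scan never reads past the file header, whether or not the declaration exists.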
I would like to hear any input on this problem, and would welcome any help with the regex. My solution is written in Java as well.