Any software for pattern-matching and -rewriting source code?
- by Steven A. Lowe
I have some old software (in a language that's not dead but is dead to me ;-)) that implements a basic pattern-matching and -rewriting system for source code. I am considering resurrecting this code, translating it into a modern language, and open-sourcing the project as a refactoring power-tool. Before I go much further, I want to know if anything like this exists already (my google-fu is fanning air on this tonight).
Here's how it works:
The pattern-matching part matches source-code patterns spanning multiple lines of code, using a template with binding variables.
The pattern-rewriting part uses a template to rewrite the matched code, inserting the contents of the variables bound by the matching template.
Matching and rewriting templates are associated (1:1) by a simple (unconditional) rewrite rule.
The software operates on the abstract syntax tree (AST) of the input application and outputs a modified AST, which can then be regenerated into new source code.
For example, suppose we find a bunch of while-loops that really should be for-loops. The following template will match the while-loop pattern:
Template oldLoopPtrn
int @cnt@ = 0;
while (@cnt@ < @max@)
{
… @body@
++@cnt@;
}
End_Template
while the following template will specify the output rewrite pattern:
Template newLoopPtrn
for(int @cnt@ = 0; @cnt@ < @max@; @cnt@++)
{
@body@
}
End_Template
and a simple rule associates the two:
Rule oldLoopPtrn --> newLoopPtrn
so code that looks like this:
int i=0;
while(i<arrlen)
{
printf("element %d: %f\n",i,arr[i]);
++i;
}
gets automatically rewritten to look like this:
for(int i = 0; i < arrlen; i++)
{
printf("element %d: %f\n",i,arr[i]);
}
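To make the match/bind/rewrite cycle above concrete, here is a minimal Python sketch. It is not the original implementation: it works on a flat token stream rather than an AST, and the names `tokenize`, `match`, `rewrite`, `OLD_LOOP`, and `NEW_LOOP` are hypothetical, chosen only for illustration. A `@name@` variable binds to one or more tokens (shortest binding first), and a repeated variable must match the same tokens every time it appears. The template text is the one shown above, minus the leading "…" marker before `@body@`.

```python
import re

# Token kinds: @var@ placeholders, C string literals, words/numbers, "++",
# or any other single non-space character.
TOKEN_RE = re.compile(r'@\w+@|"(?:\\.|[^"\\])*"|\w+|\+\+|\S')


def tokenize(text):
    """Split source or template text into a flat list of tokens."""
    return TOKEN_RE.findall(text)


def is_var(tok):
    """True for binding variables written as @name@."""
    return len(tok) > 2 and tok.startswith("@") and tok.endswith("@")


def match(pattern, tokens, bindings=None):
    """Return a dict of bindings if pattern matches ALL of tokens, else None."""
    bindings = bindings or {}
    if not pattern:
        return bindings if not tokens else None
    head, rest = pattern[0], pattern[1:]
    if not is_var(head):
        # Literal token: must be identical in the source.
        if tokens and tokens[0] == head:
            return match(rest, tokens[1:], bindings)
        return None
    name = head[1:-1]
    if name in bindings:
        # Variable already bound: the same tokens must appear again.
        bound = bindings[name]
        if tokens[:len(bound)] == bound:
            return match(rest, tokens[len(bound):], bindings)
        return None
    # Unbound variable: try the shortest binding first, backtrack on failure.
    for n in range(1, len(tokens) + 1):
        trial = dict(bindings)
        trial[name] = tokens[:n]
        result = match(rest, tokens[n:], trial)
        if result is not None:
            return result
    return None


def rewrite(template, bindings):
    """Instantiate the output template with the bound token lists."""
    out = []
    for tok in template:
        if is_var(tok):
            out.extend(bindings[tok[1:-1]])
        else:
            out.append(tok)
    return out


# Rule oldLoopPtrn --> newLoopPtrn, expressed as a pair of token lists.
OLD_LOOP = tokenize("int @cnt@ = 0; while (@cnt@ < @max@) { @body@ ++@cnt@; }")
NEW_LOOP = tokenize("for(int @cnt@ = 0; @cnt@ < @max@; @cnt@++) { @body@ }")

source = r'''
int i=0;
while(i<arrlen)
{
printf("element %d: %f\n",i,arr[i]);
++i;
}
'''

bindings = match(OLD_LOOP, tokenize(source))
result = rewrite(NEW_LOOP, bindings)
print(" ".join(result))
```

Because this sketch joins tokens with spaces, the original layout is lost; a tool that works on the AST would regenerate properly formatted source instead. It also only matches a snippet in its entirety, so a real tool would still need to search for the pattern inside a larger program.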
The closest things I've seen to this are some of the code-refactoring tools, but they seem to be geared towards interactive rewriting of selected snippets, not wholesale automated changes.
I believe that this kind of tool could supercharge refactoring, and would work on multiple languages (even HTML/CSS). I also believe that converting and polishing the code base would be a huge project that I simply cannot do alone in any reasonable amount of time.
So, is anything like this out there already? If not, are there any obvious features (besides rewrite-rule conditions) to consider?
EDIT: The one feature of this system that I like very much is that the template patterns are fairly obvious and easy to read because they're written in the same language as the target source code, not in some esoteric mutated regex/BNF format.