My company (let's call them Acme Technology) has a library of approximately one thousand source files that originally came from its Acme Labs research group, incubated in a development group for a couple years, and has more recently been provided to a handful of customers under non-disclosure.  Acme is getting ready to release perhaps 75% of the code to the open source community.  The other 25% would be released later, but for now, is either not ready for customer use or contains code related to future innovations they need to keep out of the hands of competitors.
The code is presently formatted with #ifdefs that permit the same code base to work with the pre-production platforms that will be available to university researchers and a much wider range of commercial customers once it goes to open source, while at the same time being available for experimentation and prototyping and forward compatibility testing with the future platform.  Keeping a single code base is considered essential for the economics (and sanity) of my group who would have a tough time maintaining two copies in parallel.
Files in our current base look something like this:
> // Copyright 2012 (C) Acme Technology, All Rights Reserved.
> // Very large, often varied and restrictive copyright license in English and French,
> // sometimes also embedded in make files and shell scripts with varied 
> // comment styles. 
> 
> 
>   ... Usual header stuff...
>
> void initTechnologyLibrary() {
>     nuiInterface(on);
> #ifdef  UNDER_RESEARCH
>     holographicVisualization(on);
> #endif
> }
And we would like to convert them to something like:
> // GPL Copyright (C) Acme Technology Labs 2012, Some rights reserved.
> // Acme appreciates your interest in its technology, please contact 
[email protected] 
> // for technical support, and www.acme.com/emergingTech for updates and RSS feed.
> 
>   ... Usual header stuff...
>
> void initTechnologyLibrary() {
>     nuiInterface(on);
> }
Is there a tool, parse library, or popular script that can replace the copyright and strip out not just #ifdefs, but variations like #if defined(UNDER_RESEARCH), etc.?
The code is presently in 
Git and would likely be hosted somewhere that uses 
Git.  Would there be a way to safely link repositories together so we can efficiently reintegrate our improvements with the open source versions?  Advice about other pitfalls is welcome.