Process files in a folder that haven't previously been processed

Posted by Paul on Super User
Published on 2012-07-11T01:01:26Z Indexed on 2012/07/11 3:18 UTC

I have a series of files in a directory that I need to carry an action out on using a script. Once the action is done, then I want to keep a log that the file has been processed, so that the next time the script is run, it does not attempt to carry out the action again.

So let's say I can find all the files that should be processed like this:

for i in `find /logfolder -name '20*.log'` ; do
    process_log "$i"
    echo "$i" >> processedlogsfile
done

So I have a file containing the logs I have processed, and my goal would be to modify the for loop such that these processed logs are not processed a second time.

Doing a manual scan each time seems inefficient, particularly as processedlogsfile gets bigger:

 if grep -Fxq "$i" processedlogsfile ; then continue; fi
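In full, the per-file check loop might look like the sketch below. The `-F` flag makes grep treat the path as a literal string and `-x` requires a whole-line match, so one path being a prefix of another can't cause a false skip. The `process_log` stub and the `/tmp` paths here are stand-ins for illustration only:

```shell
#!/bin/sh
# Stand-in for the real processing action.
process_log() { echo "processing $1" ; }

# Hypothetical test fixture: two logs, one already recorded.
mkdir -p /tmp/logfolder
touch /tmp/logfolder/2012-a.log /tmp/logfolder/2012-b.log
echo /tmp/logfolder/2012-a.log > /tmp/processedlogsfile

find /tmp/logfolder -name '20*.log' | while read -r i ; do
    # Skip files already recorded; match the whole line literally.
    if grep -Fxq "$i" /tmp/processedlogsfile ; then continue ; fi
    process_log "$i"
    echo "$i" >> /tmp/processedlogsfile
done
```

This still rescans the list once per file, which is the inefficiency noted above.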

It would be good if these files could be excluded when setting up the for loop.

Note that the OS in question is a Linux derivative on a management appliance with a limited toolset (no attr command, for example), so there is no way to install additional utilities (well, it is possible, but not an option). Most common bash shell commands are available, though.

Also, the filenames and locations of the processed files must remain where they are - they can't be altered to reflect their processed status.
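Given those constraints, another option that avoids keeping a growing list at all (a sketch, assuming the appliance's find supports `-newer` and that new logs always have a later mtime than the previous run) is to record the time of each run in a stamp file and process only files newer than it. The paths and the `process_log` stub are illustrative:

```shell
#!/bin/sh
STAMP=/tmp/lastrun.stamp
# Stand-in for the real processing action.
process_log() { echo "processing $1" ; }

# Hypothetical fixture: one log from before the last run, one after.
mkdir -p /tmp/logfolder
touch /tmp/logfolder/2099-old.log
touch "$STAMP"                      # pretend the previous run ended here
sleep 1                             # ensure a later mtime on the next file
touch /tmp/logfolder/2099-new.log   # arrives after the stamp

touch /tmp/newstamp                 # mark the start of this run
find /tmp/logfolder -name '20*.log' -newer "$STAMP" | while read -r i ; do
    process_log "$i"
done > /tmp/run.out
mv /tmp/newstamp "$STAMP"           # only files newer than this are seen next time
```

The trade-off is that it relies on mtimes: a log written with a backdated timestamp, or one arriving while a run is in progress, could be missed, which is why the new stamp is taken before the find rather than after.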

© Super User or respective owner
