Improving the performance of grepping over a huge file
Posted by rogerio_marcio on Programmers, 2012-05-29
I have FILE_A, which has over 300K lines, and FILE_B, which has over 30M lines. I created a bash script that greps each line of FILE_A in FILE_B and writes the result of the grep to a new file.
The whole process takes over 5 hours.
I'm looking for suggestions on any way to improve the performance of my script.
I'm using grep -F -m 1 as the grep command. FILE_A looks like this:
123456789
123455321
and FILE_B is like this:
123456789,123456789,730025400149993,
123455321,123455321,730025400126097,
So in bash I have a while loop that picks the next line from FILE_A and greps for it in FILE_B. When the pattern is found in FILE_B, I write it to result.txt.
while read -r line; do
  # Quote $line so each key is passed to grep as a single fixed string.
  grep -F -m1 "$line" 30MFile
done < 300KFile > result.txt
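For reference, a single-pass sketch of the same lookup, using grep's -f option to read all of FILE_A's keys at once (note that this prints every line of FILE_B that contains any key, not only the first match per key, so it is not an exact drop-in for the loop above), would look like this:

# Sketch: one grep invocation loads all 300K keys as fixed-string patterns.
# Unlike the per-key loop, this scans FILE_B only once.
grep -F -f 300KFile 30MFile > result.txt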
Thanks a lot in advance for your help.