UNIX-style RegExp Replace running extremely slowly under Windows. Help? EDIT: Negative lookahead ass
Posted by John Sullivan on Stack Overflow
Published on 2010-03-30T19:06:40Z
Indexed on 2010/03/30 19:53 UTC
regex | regex-negation
I'm trying to run a Unix-style regexp on every log file in a 1.12 GB directory and replace each match with ''. A test run on a 4 MB file took about 10 minutes, but it worked. Obviously something is hurting performance by several orders of magnitude.
Find: ^(?!.*155[0-2][0-9]{4}\s.*).*$
-- NOTE: the intent is to match any line NOT starting with 155[0-2]NNNN, where N is a digit 0-9.
Replace with: ''
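To make the intent of the Find pattern concrete, here is a minimal sketch in Python. Python's `re` module is an assumption made for illustration; grepWin uses its own regex engine and may behave differently. The sample lines and the names `ANYWHERE`, `AT_START`, and `blank_matches` are invented. The sketch suggests that the leading `.*` inside the lookahead makes the pattern skip lines that contain the number anywhere, while the NOTE describes lines that start with it.

```python
import re

# Pattern from the post: the lookahead scans the whole line for the number.
ANYWHERE = re.compile(r"^(?!.*155[0-2][0-9]{4}\s.*).*$", re.MULTILINE)
# Hypothetical anchored variant that follows the NOTE ("NOT starting with").
AT_START = re.compile(r"^(?!155[0-2][0-9]{4}\s).*$", re.MULTILINE)

# Invented sample log lines, for illustration only.
SAMPLE = "\n".join([
    "15511234 login ok",
    "16001234 login ok",
    "error near 15521234 retry",
])

def blank_matches(pattern, text):
    """Replace every matching line with '' (the line break itself stays)."""
    return pattern.sub("", text)

if __name__ == "__main__":
    print(repr(blank_matches(ANYWHERE, SAMPLE)))
    print(repr(blank_matches(AT_START, SAMPLE)))
```

With these samples, the pattern from the post leaves the third line untouched because the number appears mid-line, while the anchored variant blanks it. The anchored form also gives the lookahead far less text to scan per line, which may or may not matter in a given engine.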
Is there some justifiable reason for my regexp to take this long to replace matching text, or is the program I am using (this is on Windows, with a program called "grepWin") most likely poorly optimized?
Thanks.
UPDATE: I am noticing that searching for ^(155[0-2]).$ takes ~7 seconds in a 5.6 MB file with 77 matches. Adding the negative lookahead assertion, ?!, so that the regexp becomes ^(?!155[0-2]).$ causes it to take at least 5-10 minutes; granted, there will be thousands and thousands of matches.
Should a negative lookahead assertion, and/or a large number of matches, be this detrimental to performance?
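One way to get a baseline that avoids the lookahead entirely is to test each line against the positive pattern and blank the lines that fail. The following is a minimal sketch of that swapped-in approach in Python, not a description of how grepWin works. The names `WANTED`, `blank_unwanted`, and `process_file`, the file paths, and the UTF-8 encoding are all assumptions for illustration.

```python
import re

# Positive form of the check from the UPDATE: no lookahead needed.
WANTED = re.compile(r"155[0-2]")

def blank_unwanted(lines):
    """Yield each line unchanged if it starts with 155[0-2], else an empty line."""
    for line in lines:
        if WANTED.match(line):  # match() only succeeds at the start of the line
            yield line
        else:
            # Mirror "replace with ''": keep the line break, drop the text.
            yield "\n" if line.endswith("\n") else ""

def process_file(src_path, dst_path):
    """Stream src_path to dst_path line by line (paths and encoding are assumed)."""
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        dst.writelines(blank_unwanted(src))
```

Because the file is streamed one line at a time, memory use stays flat regardless of file size, and every line costs one anchored match attempt. If a run like this finishes in seconds on the same 5.6 MB file, that would point toward the tool's replace handling rather than the lookahead itself, though this sketch alone cannot confirm that.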
© Stack Overflow or respective owner