C#: Efficiently search a large string for occurrences of other strings
Posted by Jon on Stack Overflow
Published on 2009-06-23T08:19:08Z
Indexed on 2010/04/13 15:03 UTC
Hi,
I'm using C# to continuously search for multiple string "keywords" within large strings of 4 KB or more. The code loops constantly, and adding sleeps doesn't cut CPU usage enough while still maintaining a reasonable speed. The bottleneck is the keyword matching method.
I've found a few possibilities, and all of them give similar efficiency.
1) Aho-Corasick (http://tomasp.net/articles/ahocorasick.aspx). I do not have enough keywords for this to be the most efficient algorithm.
2) Regex, using an instance-level, compiled regex. It provides more functionality than I require, and not quite enough efficiency.
3) String.IndexOf. I would need a "smart" version of this for it to provide enough efficiency. Looping through each keyword and calling IndexOf doesn't cut it.
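For reference, options 2 and 3 can be sketched roughly as follows. This is a minimal illustration under assumptions, not the code from the original post: the class `KeywordSearch` and its method names are hypothetical, keywords are assumed to be non-empty, and matching is assumed to be case-sensitive. The IndexOf baseline passes `StringComparison.Ordinal`, because the `IndexOf(string)` overload without a comparison argument performs a culture-sensitive search, which is generally slower.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper; names are illustrative and do not come from the post.
public static class KeywordSearch
{
    // Option 2: build one compiled regex that alternates over all keywords,
    // so the text is scanned once instead of once per keyword.
    public static Regex BuildPattern(IEnumerable<string> keywords)
    {
        // Escape each keyword so it is matched literally; put longer
        // keywords first so the alternation prefers them at a given position.
        string pattern = string.Join("|",
            keywords.OrderByDescending(k => k.Length)
                    .Select(k => Regex.Escape(k)));
        return new Regex(pattern,
            RegexOptions.Compiled | RegexOptions.CultureInvariant);
    }

    // Returns the distinct keywords that the regex matched in the text.
    // Matches do not overlap, so a keyword that occurs only inside
    // another matched keyword is not reported.
    public static List<string> FindWithRegex(Regex pattern, string text)
    {
        var found = new HashSet<string>(StringComparer.Ordinal);
        foreach (Match match in pattern.Matches(text))
        {
            found.Add(match.Value);
        }
        return found.ToList();
    }

    // Option 3 baseline: one ordinal IndexOf call per keyword.
    public static List<string> FindWithIndexOf(IEnumerable<string> keywords, string text)
    {
        var found = new List<string>();
        foreach (string keyword in keywords)
        {
            // Ordinal comparison skips culture-specific comparison rules.
            if (text.IndexOf(keyword, StringComparison.Ordinal) >= 0)
            {
                found.Add(keyword);
            }
        }
        return found;
    }
}
```

With this sketch, the regex would be built once via `BuildPattern` and reused across iterations of the loop, since constructing a compiled regex is itself expensive. Whether either variant is fast enough depends on the keyword count and the input, so it would need to be measured.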
Does anyone know of any algorithms or methods that I can use to attain my goal?
© Stack Overflow or respective owner