Best way to split a string by word (SQL Batch separator)

Posted by Paul Kohler on Stack Overflow See other posts from Stack Overflow or by Paul Kohler
Published on 2010-04-12T01:59:32Z Indexed on 2010/04/12 2:03 UTC
Read the original article Hit count: 338

Filed under:
|
|

I have a class I use to "split" a string of SQL commands by a batch separator - e.g. "GO" - into a list of SQL commands that are run in turn etc.

...
private static IEnumerable<string> SplitByBatchIndecator(string script, string batchIndicator)
{
    string pattern = string.Concat("^\\s*", batchIndicator, "\\s*$");
    RegexOptions options = RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.Multiline;
    foreach (string batch in Regex.Split(script, pattern, options))
    {
        yield return batch.Trim();
    }
}

My current implementation uses a Regex with yield but I am not sure if it's the "best" way.

  • It should be quick
  • It should handle large strings (I have some scripts that are 10mb in size for example)
  • The hardest part (that the above code currently does not do) is to take quoted text into account

Currently the following SQL will incorrectly get split:

var batch = QueryBatch.Parse(@"-- issue...
insert into table (name, desc)
values('foo', 'if the
go
is on a line by itself we have a problem...')");

Assert.That(batch.Queries.Count, Is.EqualTo(1), "This fails for now...");

I have thought about a token based parser that tracks the state of the open closed quotes but am not sure if Regex will do it.

Any ideas!?

© Stack Overflow or respective owner

Related posts about c#

Related posts about regex