Search Results

Search found 1 result on 1 page for 'farcaller'.


  • Optimising ruby regexp -- lots of match groups

    - by Farcaller
    I'm working on a Ruby-based lexer. To improve performance, I joined all the tokens' regexps into one big regexp with named match groups. The resulting regexp looks like:

        /\A(?<__anonymous_-1038694222803470993>(?-mix:\n+))|\A(?<__anonymous_-1394418499721420065>(?-mix:\/\/[^\n]*))|\A(?<__anonymous_3077187815313752157>(?-mix:include\s+"[^"]+"))|\A(?<LET>(?-mix:let\s))|\A(?<IN>(?-mix:in\s))|\A(?<CLASS>(?-mix:class\s))|\A(?<DEF>(?-mix:def\s))|\A(?<DEFM>(?-mix:defm\s))|\A(?<MULTICLASS>(?-mix:multiclass\s))|\A(?<FUNCNAME>(?-mix:![a-zA-Z_][a-zA-Z0-9_]*))|\A(?<ID>(?-mix:[a-zA-Z_][a-zA-Z0-9_]*))|\A(?<STRING>(?-mix:"[^"]*"))|\A(?<NUMBER>(?-mix:[0-9]+))/

    I match it against my string, producing a MatchData in which exactly one token is parsed:

        bigregex =~ "\n ... garbage"
        puts $~.inspect

    which outputs:

        #<MatchData "\n" __anonymous_-1038694222803470993:"\n" __anonymous_-1394418499721420065:nil __anonymous_3077187815313752157:nil LET:nil IN:nil CLASS:nil DEF:nil DEFM:nil MULTICLASS:nil FUNCNAME:nil ID:nil STRING:nil NUMBER:nil>

    So the regexp actually matched the "\n" part. Now I need to figure out which match group it belongs to (it's clearly visible from the #inspect output that it's __anonymous_-1038694222803470993, but I need to get it programmatically). I could not find any option other than iterating over #names:

        m.names.each do |n|
          if m[n]
            type = n.to_sym
            resolved_type = (n.start_with?('__anonymous_') ? nil : type)
            val = m[n]
            break
          end
        end

    which verifies that the match group did have a match.

    The problem here is that it's slow: I spend about 10% of the time in the loop, and another 8% grabbing @input[@pos..-1] so that \A works as expected and matches at the start of the string (I do not discard input, I just shift @pos in it). You can check the full code at the GH repo.

    Any ideas on how to make it at least a bit faster? Is there an easier way to find the "successful" match group?
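For readers who want to reproduce the setup, here is a minimal, self-contained sketch of the approach described in the post: several token patterns joined into one alternation of named groups, followed by a scan over the group names to find the one that matched. The token table, the constant names and the `first_matched_group` helper are assumptions made for illustration. They are not taken from the author's repository, and the sketch shows the baseline being asked about, not an optimised version.

```ruby
# Hypothetical token table; the names and patterns are illustrative only.
TOKENS = {
  NEWLINE: /\n+/,
  LET:     /let\s/,
  ID:      /[a-zA-Z_][a-zA-Z0-9_]*/,
  NUMBER:  /[0-9]+/
}.freeze

# Join every token pattern into one alternation of named groups,
# each anchored at the start of the string with \A.
# Interpolating a Regexp inserts its (?-mix:...) form, as in the post.
BIG_REGEX = Regexp.new(
  TOKENS.map { |name, re| "\\A(?<#{name}>#{re})" }.join('|')
)

# Return [group_name, matched_text] for the first named group that
# captured something, or nil when no token matches at the start.
def first_matched_group(input)
  m = BIG_REGEX.match(input)
  return nil unless m

  m.names.each do |n|
    return [n.to_sym, m[n]] if m[n]
  end
  nil
end
```

Calling `first_matched_group("\n ... garbage")` with this table yields the `NEWLINE` group and the text `"\n"`, which mirrors the MatchData shown in the post. The linear scan over `names` is the part the question reports as slow.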
