regex trouble - Page 15

RegEx (or other) parsing of script

- by jpmyob

RegEx is powerful, that much is true, but I have a little query for you. I want to parse out the FUNCTIONS from some old JS code. However, I am RegEx-handicapped (I struggle to grasp the subtleties). The issue that makes me NOT EVEN TRY is twofold: 1) both myVar = function(x){ yadda yadda } AND function myVar(x) { yadda yadda } are found throughout. COULD I write a parser for each? Sure, but that seems inefficient. 2) MANY things may reside INSIDE the {}, including OTHER sets of {} or other Functions(){} blocks of text. HELP: does anyone have, or know of, some code-parsing snippets or examples that will parse out the info I want to collect? Thanks
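One possible approach, sketched in Python: let a regex find only the function headers (both styles in one alternation) and then count braces to find where each body ends, since a regex alone cannot balance nested {}. The names HEADER and extract_functions are made up for this illustration, and the sketch assumes no braces hide inside strings, comments, or regex literals.

```python
import re

# Matches `function name(args) {` and `name = function(args) {`.
HEADER = re.compile(
    r"\bfunction\s+(?P<decl>[A-Za-z_$][\w$]*)\s*\([^)]*\)\s*\{"
    r"|(?P<expr>[A-Za-z_$][\w$.]*)\s*=\s*function\s*\([^)]*\)\s*\{"
)

def extract_functions(source):
    """Return (name, full_text) for every function found, nested ones included."""
    found = []
    for m in HEADER.finditer(source):
        depth = 0
        # m.end() - 1 is the opening brace; walk forward until it is balanced.
        for i in range(m.end() - 1, len(source)):
            if source[i] == "{":
                depth += 1
            elif source[i] == "}":
                depth -= 1
                if depth == 0:
                    name = m.group("decl") or m.group("expr")
                    found.append((name, source[m.start():i + 1]))
                    break
    return found
```

For real-world code a proper JavaScript parser is probably the safer route; the brace counter is only meant to show why the nesting part is easier outside the regex.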

Read the article

Regex - Format with tabs and alphabetical

- by Sam

Is it possible to use regex to turn this <site-ui:header title="error" backURL="javascript:history.go(-1);" /> into this <site-ui:header backURL="javascript:history.go(-1);" title="error" /> Basically, my goal is to format this xml so that the fields are in alphabetical order (e.g. backURL comes before title), and each field should be tabbed two spaces. If this can be done, any pointers would be really helpful! Even more helpful is an exact regex for vim.
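A single substitution cannot sort, but a regex plus a callback can. The Python sketch below is not a vim one-liner; it assumes that "tabbed two spaces" means each attribute goes on its own line indented by two spaces. ATTR, TAG and sort_attributes are illustrative names.

```python
import re

# One attribute: name="value" or name='value'.
ATTR = re.compile(r'''([\w:.-]+)\s*=\s*("[^"]*"|'[^']*')''')
# A tag with at least one attribute, optionally self-closing.
TAG = re.compile(
    r'''<([\w:.-]+)((?:\s+[\w:.-]+\s*=\s*(?:"[^"]*"|'[^']*'))+)\s*(/?)>'''
)

def sort_attributes(xml, indent="  "):
    def rewrite(m):
        name, attrs, slash = m.group(1), m.group(2), m.group(3)
        # Sort case-insensitively so backURL comes before title.
        pairs = sorted(ATTR.findall(attrs), key=lambda p: p[0].lower())
        lines = ["<" + name] + [indent + k + "=" + v for k, v in pairs]
        return "\n".join(lines) + (" />" if slash else ">")
    return TAG.sub(rewrite, xml)
```

Attribute values containing an unescaped quote of the same kind would confuse it, so treat it as a sketch rather than an XML formatter.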

Read the article

RegEx to Reject Unescaped Character

- by JDV72

I want to restrict usage of unescaped ampersands in a particular input field. I'm having trouble getting a RegEx to kill usage of "&" unless followed by "amp;"...or perhaps just restrict usage of "& " (note the space). I tried to adapt the answer in this thread, but to no avail. Thanks. (FWIW, here's a RegEx I made to ensure that a filename field didn't contain restricted chars and ended in .mp3. It works fine, but does it look efficient?)
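The "unless followed by" part maps directly onto a negative lookahead. A minimal Python sketch (BARE_AMP and has_unescaped_ampersand are names invented here); it only treats "&amp;" as escaped, so other entities such as "&lt;" would be flagged too.

```python
import re

# An ampersand that is NOT the start of "&amp;".
BARE_AMP = re.compile(r"&(?!amp;)")

def has_unescaped_ampersand(text):
    """True when the input should be rejected."""
    return BARE_AMP.search(text) is not None
```

The same lookahead syntax, `&(?!amp;)`, is available in most regex flavours, including JavaScript and .NET.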

Read the article

Regex to validate for Unique Well Identifier in rails

- by Jasper502

I am a regex newbie and can't seem to figure this one out. Here is a link to the required string formats: http://earth.gov.bc.ca/royp-bin/phcgi.exe?PH_QKC=DOCUWI&PH_APP=RMSprodApp&PH_HTML=DOCUWI.htm For example: 100041506421W500 = 1+0+{01-16}+{01-36}+{001-129}+{01-36}+W+{1-6}+0+{0-9} I tried this: ^10[0|2-9]{1}0*([1-9]|1[0-6])0*([1-9]|[12][0-9]|3[0-6])0*([1-9][0-9]|1[0-2][0-9])0*([1-9]|[12][0-9]|3[0-6])W[1-6]0[0-9]$ In a regex validator and it sort of works except that 1041506421W500 and 10000000041506421W500 validates. The entire string can only be 16 characters long. I am pretty sure I am missing something obvious here regarding the leading zeros. Tried the NTS format and running into the same sort of problems.
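The length problem seems to come from the `0*` parts: they allow any number of leading zeros, including none. Giving every field a fixed width avoids that. Below is a Python sketch of the idea; the ranges are taken from the post, the third character class `[02-9]` is copied from the asker's own attempt, and UWI / is_valid_uwi are illustrative names.

```python
import re

UWI = re.compile(
    r"10[02-9]"                            # first three characters, as in the post
    r"(0[1-9]|1[0-6])"                     # two digits, 01-16
    r"(0[1-9]|[12][0-9]|3[0-6])"           # two digits, 01-36
    r"(00[1-9]|0[1-9][0-9]|1[0-2][0-9])"   # three digits, 001-129
    r"(0[1-9]|[12][0-9]|3[0-6])"           # two digits, 01-36
    r"W[1-6]"                              # W followed by 1-6
    r"0[0-9]"                              # a zero, then 0-9
)

def is_valid_uwi(text):
    # fullmatch anchors at both ends, so the total is always 16 characters.
    return UWI.fullmatch(text) is not None
```

In Rails the pattern would likely need `\A` and `\z` rather than `^` and `$`, because Ruby treats `^` and `$` as line anchors.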

Read the article

php regex to split invoice line item description

- by user1053700

I am attempting to split strings like the following: An item (Item A) which may contain 89798 numbers and letters @ $550.00 4 of Item B @ $420.00 476584 of Item C, with a larger quantity and different currency symbol @ £420.00 into: array( 0 => 1 1 => "some item which may contain 89798 numbers and letters" 2 => $550.00 ); does that make sense? I am looking for a regex pattern which will split the quantity, description, and price (including symbol). the strings will always be: qty x description @ price+symbol so i assume the regex would be something like: `(match a number and only a number) x (get description letters and numbers before the @ symbol) @ (match the currency symbol and price)` How should I approach this?
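A sketch of the idea in Python (the pattern should carry over to PHP's preg_match largely unchanged). It assumes, based on the samples, that a quantity looks like "4 of " at the start of the line and defaults to 1 when absent, and that the price is a single currency symbol followed by digits. LINE and split_line are hypothetical names.

```python
import re

LINE = re.compile(
    r"^(?:(?P<qty>\d+) of )?"               # optional "<number> of "
    r"(?P<desc>.+)"                          # description, greedy up to the last " @ "
    r" @ (?P<price>\D\d[\d,]*(?:\.\d+)?)$"   # symbol plus amount
)

def split_line(line):
    m = LINE.match(line.strip())
    if m is None:
        return None
    return [int(m.group("qty") or 1), m.group("desc"), m.group("price")]
```

Because the description is greedy, an "@" inside the description is tolerated as long as the final " @ " is followed by a price.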

Read the article

What is the REGEX of a CSS selector

- by user421563

I'd like to parse a CSS file and add another selector before every CSS selector. From: p{margin:0 0 10px;} .lead{margin-bottom:20px;font-size:21px;font-weight:200;line-height:30px;} I'd like: .mySelector p{margin:0 0 10px;} .mySelector .lead{margin-bottom:20px;font-size:21px;font-weight:200;line-height:30px;} But my CSS file is really complex (in fact it is the bootstrap css file) so the regex should match all CSS selectors. For now I have this regex: ([^\r\n,{};]+)(,|{) and you can see the result here http://regexr.com?328ps but as you can see there are a lot of matches that shouldn't match. For example, text-shadow:0 -1px 0 rgba(0, matches but it shouldn't. Does someone have a solution? Thanks
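The false positives come from matching inside the declaration block. One way around that, sketched in Python, is to match a whole rule (selector list plus `{...}`) and only split the part before the brace on commas. RULE and prefix_selectors are invented names. The sketch skips at-rules such as `@font-face`, but it would still wrongly prefix `@keyframes` steps and it ignores comments, so a real CSS parser is probably the better tool for the full Bootstrap file.

```python
import re

# A flat rule: everything before "{" is the selector list, the body has no braces.
RULE = re.compile(r"([^{}]+)\{([^{}]*)\}")

def prefix_selectors(css, prefix=".mySelector"):
    def rewrite(m):
        raw = m.group(1)
        if raw.strip().startswith("@"):
            return m.group(0)                       # leave at-rules such as @font-face alone
        lead = raw[:len(raw) - len(raw.lstrip())]   # keep whitespace between rules
        selectors = [s.strip() for s in raw.split(",") if s.strip()]
        scoped = ",".join(prefix + " " + s for s in selectors)
        return lead + scoped + "{" + m.group(2) + "}"
    return RULE.sub(rewrite, css)
```

Rules nested inside an `@media` block are still reached, because the pattern only matches the innermost brace pair.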

Read the article

C#: Using regular expression (Regex) to duplicate a specific character in a string

- by user3703944

Anyone know how to use regex to duplicate a specific character in a string? I have a path that is entered like this: C:/Example/example I would like to use regex (or any other method) to display it like this: C://Example//example Is it possible? This is where I'm getting the file path private void btnSearchImage_Click_1(object sender, EventArgs e) { OpenFileDialog ofd = new OpenFileDialog(); ofd.Filter = "Image Files(*.jpg; *.jpeg; *.gif; *.bmp)|*.jpg; *.jpeg; *.gif; *.bmp"; if (ofd.ShowDialog() == System.Windows.Forms.DialogResult.OK) { string filenName = ofd.FileName; pictureBox1.Image = new Bitmap(filenName); string path = filenName; txtimgPath.Text = path; } } Thanks
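For a single fixed character a plain string replacement is enough and a regex adds nothing. A tiny Python sketch showing both, with made-up function names; in C# the counterparts would presumably be string.Replace and Regex.Replace.

```python
import re

def double_slashes(path):
    # Plain replacement: every "/" becomes "//".
    return path.replace("/", "//")

def double_slashes_regex(path):
    # Same result via a regex, only useful if the pattern gets more complex.
    return re.sub(r"/", "//", path)
```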

Read the article

Generic validate input data via regex. Input error when match.count == 0

- by Valamas

Hi, I have a number of types of data fields on an input form, for example a web page. Some fields must be an email address, some must be a number, some must be a number within a range, and some must contain certain characters. Basically, the list is open-ended. I wish to come up with a generic way of validating the data entered. I thought I would use regex to validate the data. Each field that needs validation would be associated with a "regex expression" and a "regex error message" stating what the field should contain. In my current mock-up, a match count of zero signifies an error and triggers the message. While still a white-belt regex designer, I have come to understand that in certain situations it is difficult to write a regex that yields a match count of zero for every invalid input. A complex regex case I looked for help on was Link Here. The forum post was a disaster because I confused the people helping me. But one of the replies said that it is difficult to write a regex for which a match count of zero means the input data is invalid. Does anyone have comments or suggestions on this generic validation system I am trying to create? Thanks
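The "zero matches means invalid" rule gets much easier to satisfy if each pattern has to match the whole value rather than any part of it. A minimal Python sketch of such a rule table follows. The field names and patterns are purely illustrative, and the email pattern is deliberately loose.

```python
import re

# field name -> (pattern describing a VALID value, message shown otherwise)
RULES = {
    "email": (re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"), "must look like an email address"),
    "quantity": (re.compile(r"\d+"), "must be a whole number"),
    "percent": (re.compile(r"100|[1-9]?\d"), "must be a number between 0 and 100"),
}

def validate(form):
    """Return {field: message} for every field whose value fails its rule."""
    errors = {}
    for field, (pattern, message) in RULES.items():
        # fullmatch: the entire value must fit the pattern, not just a part of it.
        if pattern.fullmatch(form.get(field, "")) is None:
            errors[field] = message
    return errors
```

Numeric ranges are awkward to express as patterns, so a generic system might also allow a per-field check function next to the regex.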

Read the article

.NET Regex: Howto extract IPv6 address parts

- by Quandary

Question: How does the .NET regex string to extract IPv6 addresses look like ? I can get it to extract a simple IPv6 address like "1050:0:0:0:5:600:300c:326b" but not the colon format ("ff06::c3"); My problem is, it should extract a 0 for every omitted value between the :: How do I do that? Below my code + description. Specify IPv6 addresses by omitting leading zeros. For example, IPv6 address 1050:0000:0000:0000:0005:0600:300c:326b may be written as 1050:0:0:0:5:600:300c:326b. Double colon Specify IPv6 addresses by using double colons (::) in place of a series of zeros. For example, IPv6 address ff06:0:0:0:0:0:0:c3 may be written as ff06::c3. Double colons may be used only once in an IP address. strInputString = "ff06::c3"; strInputString = "1050:0000:0000:0000:0005:0600:300c:326b"; string strPattern = "([A-Fa-f0-9]{1,4}:){7}([A-Fa-f0-9]{1,4})"; //strPattern = @"\A(?:[0-9a-fA-F]{1,4}:){7}[0-9a-fA-F]{1,4}\z"; //strPattern = @"(\A([0-9a-f]{1,4}:){1,1}(:[0-9a-f]{1,4}){1,6}\Z)|(\A([0-9a-f]{1,4}:){1,2}(:[0-9a-f]{1,4}){1,5}\Z)|(\A([0-9a-f]{1,4}:){1,3}(:[0-9a-f]{1,4}){1,4}\Z)|(\A([0-9a-f]{1,4}:){1,4}(:[0-9a-f]{1,4}){1,3}\Z)|(\A([0-9a-f]{1,4}:){1,5}(:[0-9a-f]{1,4}){1,2}\Z)|(\A([0-9a-f]{1,4}:){1,6}(:[0-9a-f]{1,4}){1,1}\Z)|(\A(([0-9a-f]{1,4}:){1,7}|:):\Z)|(\A:(:[0-9a-f]{1,4}){1,7}\Z)|(\A((([0-9a-f]{1,4}:){6})(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3})\Z)|(\A(([0-9a-f]{1,4}:){5}[0-9a-f]{1,4}:(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3})\Z)|(\A([0-9a-f]{1,4}:){5}:[0-9a-f]{1,4}:(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3}\Z)|(\A([0-9a-f]{1,4}:){1,1}(:[0-9a-f]{1,4}){1,4}:(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3}\Z)|(\A([0-9a-f]{1,4}:){1,2}(:[0-9a-f]{1,4}){1,3}:(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3}\Z)|(\A([0-9a-f]{1,4}:){1,3}(:[0-9a-f]{1,4}){1,2}:(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3}\Z)|(\A([0-9a-f]{1,4}:){1,4}(:[0-9a-f]{1,4}
){1,1}:(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3}\Z)|(\A(([0-9a-f]{1,4}:){1,5}|:):(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3}\Z)|(\A:(:[0-9a-f]{1,4}){1,5}:(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3}\Z) "; //strPattern = @"/^\s*((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:)))(%.+)?\s*$/"; //strPattern = @"(:?[0-9a-fA-F]{1,4}:){7}([0-9a-fA-F]{1,4})\z"; //strPattern = @"\A((?:[0-9A-Fa-f]{1,4}(?::[0-9A-Fa-f]{1,4})*)?)::((?:[0-9A-Fa-f]{1,4}(?::[0-9A-Fa-f]{1,4})*)?)\z"; //strPattern = @"\A((?:[0-9A-Fa-f]{1,4}(?::[0-9A-Fa-f]{1,4})*)?)::((?:[0-9A-Fa-f]{1,4}:)*)(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3}\z"; //strPattern = 
@"/^(?:(?:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){7})|(?:(?!(?:.*[a-f0-9](?::|$)){7,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?)))|(?:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){5}:)|(?:(?!(?:.*[a-f0-9]:){5,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3}:)?))?(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))(?:\.(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))){3}))$/i"; System.Text.RegularExpressions.Regex reValidationRule = new System.Text.RegularExpressions.Regex("^" + strPattern + "$"); if (reValidationRule.Match(strInputString).Success) // If matching pattern { System.Text.RegularExpressions.Match maResult = System.Text.RegularExpressions.Regex.Match(strInputString, strPattern); // Console.WriteLine(maResult.Groups.Count) string[] astrReturnValues = new string[4]; System.Text.RegularExpressions.GroupCollection gc = maResult.Groups; System.Text.RegularExpressions.CaptureCollection cc; int counter; //System.Web.Script.Serialization.JavaScriptSerializer jssJSONserializer = new System.Web.Script.Serialization.JavaScriptSerializer(); //Console.WriteLine(jssJSONserializer.Serialize()); // Loop through each group. for (int i = 0; i < gc.Count; i++) { Console.WriteLine("Group: {0}", i); cc = gc[i].Captures; counter = cc.Count; // Print number of captures in this group. Console.WriteLine("Captures count = " + counter.ToString()); // Loop through each capture in group. for (int ii = 0; ii < counter; ii++) { Console.WriteLine("Capture: {0}", ii); // Print capture and position. Console.WriteLine(cc[ii] + " Starts at character " + cc[ii].Index); } }
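A single regex cannot capture groups that are not in the text, so the omitted zeros behind "::" have to be filled in by code. A Python sketch of that step follows; expand_ipv6 is a made-up name, and IPv4-mapped forms such as ::ffff:1.2.3.4 are deliberately not handled.

```python
import re

HEX_GROUP = re.compile(r"[0-9A-Fa-f]{1,4}")

def expand_ipv6(address):
    """Return the 8 groups of an IPv6 address, leading zeros dropped."""
    if address.count("::") > 1:
        raise ValueError("'::' may appear only once")
    if "::" in address:
        head, tail = address.split("::")
        left = head.split(":") if head else []
        right = tail.split(":") if tail else []
        missing = 8 - len(left) - len(right)
        if missing < 1:
            raise ValueError("too many groups")
        groups = left + ["0"] * missing + right   # one "0" per omitted group
    else:
        groups = address.split(":")
    if len(groups) != 8 or not all(HEX_GROUP.fullmatch(g) for g in groups):
        raise ValueError("not a valid IPv6 address: %r" % address)
    return [format(int(g, 16), "x") for g in groups]
```

The regex is reduced to validating one group at a time, which is far easier to read than the long alternations in the post.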

Read the article

RegEx to ignore / skip everything in html tags

- by Scott Sumpter

Looking for a way to combine two Regular Expressions. One to catch the urls and the other to ensure is skips text within html tags. See sample text below functions. Need to pass a block of news text and format text by wrapping urls and email addresses in html tags so users don't have to. The below code works great until there are already html tags within the text. In that case it doubles the html tags. There are plenty of examples to strip html, but I want to just ignore it since the url is already linkified. Also - if there is an easier was to accomplish this, with or without Regex, please let me know. none of my attempts to combine Regexs have worked. coding in ASP.NET VB but will take any workable example/direction. Thanks! ===== Functions ============= Public Shared Function InsertHyperlinks(ByVal inText As String) As String Dim strBuf As String Dim objMatches As Object Dim iStart, iEnd As Integer strBuf = "" iStart = 1 iEnd = 1 Dim strRegUrlEmail As String = "\b(www|http|\S+@)\S+\b" 'RegEx to find urls and email addresses Dim objRegExp As New Regex(strRegUrlEmail, RegexOptions.IgnoreCase) 'Match URLs and emails Dim MatchList As MatchCollection = objRegExp.Matches(inText) If MatchList.Count <> 0 Then objMatches = objRegExp.Matches(inText) For Each Match In MatchList iEnd = Match.Index strBuf = strBuf & Mid(inText, iStart, iEnd - iStart + 1) If InStr(1, Match.Value, "@") Then strBuf = strBuf & HrefGet(Match.Value, "EMAIL", "_BLANK") Else strBuf = strBuf & HrefGet(Match.Value, "WEB", "_BLANK") End If iStart = iEnd + Match.Length + 1 Next strBuf = strBuf & Mid(inText, iStart) InsertHyperlinks = strBuf Else 'No hyperlinks to replace InsertHyperlinks = inText End If End Function Shared Function HrefGet(ByVal url As String, ByVal urlType As String, ByVal Target As String) As String Dim strBuf As String strBuf = "<a href=""" If UCase(urlType) = "WEB" Then If LCase(Left(url, 3)) = "www" Then strBuf = "<a href=""http://" & url & """ Target=""" & _ Target & """>" & url & 
"</a>" Else strBuf = "<a href=""" & url & """ Target=""" & _ Target & """>" & url & "</a>" End If ElseIf UCase(urlType) = "EMAIL" Then strBuf = "<a href=""mailto:" & url & """ Target=""" & _ Target & """>" & url & "</a>" End If HrefGet = strBuf End Function ===== Sample Text ============= This would be the inText parameter. Midway through the ride, we see a Skip this too. But sometimes we go here [insert normal www dot link dot com]. If you'd like to join us contact Bill Smith at [email protected]. Thanks! sorry stack overflow won't allow multiple hyperlinks to be added. ===== End Sample Text =============

Read the article
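For the "skip what is already in tags" question above, one common trick is to put the things to ignore first in an alternation and return them unchanged from a replacement callback. A Python sketch of that idea follows (the .NET counterpart would be Regex.Replace with a MatchEvaluator). TOKEN and insert_hyperlinks are illustrative names, and the URL and email patterns are simplified.

```python
import re

TOKEN = re.compile(
    r"(?P<skip><a\b[^>]*>.*?</a>|<[^>]+>)"            # existing links and any other tag
    r"|(?P<email>\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b)"   # bare email address
    r"|(?P<url>\b(?:https?://|www\.)[^\s<>\"]+)",     # bare URL
    re.IGNORECASE | re.DOTALL,
)

def insert_hyperlinks(text):
    def rewrite(m):
        if m.group("skip"):
            return m.group("skip")                     # leave markup untouched
        if m.group("email"):
            addr = m.group("email")
            return '<a href="mailto:%s">%s</a>' % (addr, addr)
        url = m.group("url").rstrip(".,;:!?")          # keep sentence punctuation outside the link
        trailing = m.group("url")[len(url):]
        href = url if url.lower().startswith("http") else "http://" + url
        return '<a href="%s" target="_blank">%s</a>%s' % (href, url, trailing)
    return TOKEN.sub(rewrite, text)
```

Because the skip branch consumes whole tags, URLs inside attributes such as `src` or `href` are never seen by the other two branches.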

Advanced Regex: Smart auto detect and replace URLs with anchor tags

- by Robert Koritnik

I've written a regular expression that automatically detects URLs in free text that users enter. This is not such a simple task as it may seem at first. Jeff Atwood writes about it in his post. His regular expression works, but needs extra code after detection is done. I've managed to write a regular expression that does everything in a single go. This is how it looks like (I've broken it down into separate lines to make it more understandable what it does): 1 (?<outer>$)? 2 (?<scheme>http(?<secure>s)?://)? 3 (?<url> 4 (?(scheme) 5 (?:www\.)? 6 | 7 www\. 8 ) 9 [a-z0-9] 10 (?(outer) 11 [-a-z0-9/+&@#/%?=~_()|!:,.;cšžcd]+(?=$) 12 | 13 [-a-z0-9/+&@#/%?=~_()|!:,.;cšžcd]+ 14 ) 15 ) 16 (?<ending>(?(outer)\))) As you may see, I'm using named capture groups (used later in Regex.Replace()) and I've also included some local characters (cšžcd), that allow our localised URLs to be parsed as well. You can easily omit them if you'd like. Anyway. Here's what it does (referring to line numbers): 1 - detects if URL starts with open braces (is contained inside braces) and stores it in "outer" named capture group 2 - checks if it starts with URL scheme also detecting whether scheme is SSL or not 3 - start parsing URL itself (will store it in "url" named capture group) 4-8 - if statement that says: if "sheme" was present then www. part is optional, otherwise mandatory for a string to be a link (so this regular expression detects all strings that start with either http or www) 9 - first character after http:// or www. 
should be either a letter or a number (this can be extended if you'd like to cover even more links, but I've decided not to because I can't think of a link that would start with some obscure character) 10-14 - if statement that says: if "outer" (braces) was present capture everything up to the last closing braces otherwise capture all 15 - closes the named capture group for URL 16 - if open braces were present, capture closing braces as well and store it in "ending" named capture group First and last line used to have \s* in them as well, so user could also write open braces and put a space inside before pasting link. Anyway. My code that does link replacement with actual anchor HTML elements looks exactly like this: value = Regex.Replace( value, @"(?<outer>$)?(?<scheme>http(?<secure>s)?://)?(?<url>(?(scheme)(?:www\.)?|www\.)[a-z0-9](?(outer)[-a-z0-9/+&@#/%?=~_()|!:,.;cšžcd]+(?=$)|[-a-z0-9/+&@#/%?=~_()|!:,.;cšžcd]+))(?<ending>(?(outer)\)))", "${outer}<a href=\"http${secure}://${url}\">http${secure}://${url}</a>${ending}", RegexOptions.Compiled | RegexOptions.CultureInvariant | RegexOptions.IgnoreCase); As you can see I'm using named capture groups to replace link with an Anchor tag: "${outer}<a href=\"http${secure}://${url}\">http${secure}://${url}</a>${ending}" I could as well omit the http(s) part in anchor display to make links look friendlier, but for now I decided not to. Question I would like my links to be replaced with shortenings as well. So when user copies a very long link (for instance if they would copy a link from google maps that usually generates long links) I would like to shorten the visible part of the anchor tag. Link would work, but visible part of an anchor tag would be shortened to some number of characters. I could as well append ellipsis at the end of at all possible (and make things even more perfect). Does Regex.Replace() method support replacement notations so that I can still use a single call? 
Something similar as string.Format() method does when you'd like to format values in string format (decimals, dates etc...).
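As far as I know, replacement strings have no formatting notation of that kind, but Regex.Replace() also has overloads that take a MatchEvaluator delegate, and that keeps it to a single call. The Python sketch below shows the same callback idea with a deliberately simplified URL pattern (not the one from the post). URL, shorten and linkify are hypothetical names.

```python
import re

# Simplified: a scheme, or a bare "www." prefix, followed by non-space characters.
URL = re.compile(
    r"\b(?:(?P<scheme>https?)://|(?=www\.))(?P<rest>[^\s<>\"]+)",
    re.IGNORECASE,
)

def shorten(text, limit=30):
    """Cut the visible text to `limit` characters, ending in an ellipsis."""
    if len(text) <= limit:
        return text
    return text[:limit - 3] + "..."

def linkify(value, limit=30):
    def rewrite(m):
        scheme = m.group("scheme") or "http"
        href = scheme + "://" + m.group("rest")
        # The href stays complete; only the visible part is shortened.
        return '<a href="%s">%s</a>' % (href, shorten(href, limit))
    return URL.sub(rewrite, value)
```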

Read the article

C# Regex replace url

- by Martijn

I have a bunch of links in a document which have to be replaced by a javascript call. All the links look the same: <a href="http://domain/ViewDocument.aspx?id=3D1&doc=form" target="_blank">Document naam 1</a> <a href="http://domain/ViewDocument.aspx?id=3D2&doc=form" target="_blank">Document naam 2</a> <a href="http://domain/ViewDocument.aspx?id=3D3&doc=form" target="_blank">Document naam 3</a> Now I want all these links to be replaced with: <a href="javascript:loadDocument('1','form')">Document naam 1</a> <a href="javascript:loadDocument('2','form')">Document naam 2</a> <a href="javascript:loadDocument('3','form')">Document naam 3</a> So the value after id=3D in the url is the first parameter in the function call and the doc parameter is the second. I want to do this using Regex because I think this is the quickest way. But the problem is that my regex knowledge is too limited.
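Two capture groups and one replacement are enough here. A Python sketch follows; in C# the same pattern should work with Regex.Replace, using ${id} and ${doc} in the replacement. The "=3D" looks like a quoted-printable encoded "=", so the sketch treats "3D" as optional, which is an assumption on my part. LINK and to_js_links are invented names.

```python
import re

LINK = re.compile(
    r'<a\s+href="http://domain/ViewDocument\.aspx\?'
    r'id=(?:3D)?(?P<id>\d+)&(?:amp;)?doc=(?P<doc>\w+)"[^>]*>'
)

def to_js_links(html):
    # Only the opening tag is rewritten; link text and </a> stay as they are.
    return LINK.sub(r'''<a href="javascript:loadDocument('\g<id>','\g<doc>')">''', html)
```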

Read the article

Regex to remove conditional comments

- by cnu

I want a regex which can match conditional comments in an HTML source page so I can remove only those. I want to preserve the regular comments. I would also like to avoid using the .*? notation if possible. The text is foo  bar and I want to remove everything in  EDIT: It is because of BeautifulSoup that I want to remove these tags. BeautifulSoup fails to parse and gives an incomplete source. EDIT2: [if IE] isn't the only condition. There are lots more and I don't have a list of all possible combinations. EDIT3: Vinko Vrsalovic's solution works, but the actual reason BeautifulSoup failed was a rogue comment within the conditional comment. Like  <![endif]--> Notice the  comment? Though my problem was solved, I would love to get a regex solution for this.

Read the article

java regex: capture multiline sequence between tokens

- by Guillaume

I'm struggling with regex for splitting logs files into log sequence in order to match pattern inside these sequences. log format is: timestamp fieldA fieldB fieldn log message1 timestamp fieldA fieldB fieldn log message2 log message2bis timestamp fieldA fieldB fieldn log message3 The timestamp regex is known. I want to extract every log sequence (potentialy multiline) between timestamps. And I want to keep the timestamp. I want in the same time to keep the exact count of lines. What I need is how to decorate timestamp pattern to make it split my log file in log sequence. I can not split the whole file as a String, since the file content is provided in a CharBuffer Here is sample method that will be using this log sequence matcher: private void matches(File f, CharBuffer cb) { Matcher sequenceBreak = sequencePattern.matcher(cb); // sequence matcher int lines = 1; int sequences = 0; while (sequenceBreak.find()) { sequences++; String sequence = sequenceBreak.group(); if (filter.accept(sequence)) { System.out.println(f + ":" + lines + ":" + sequence); } //count lines Matcher lineBreak = LINE_PATTERN.matcher(sequence); while (lineBreak.find()) { lines++; } if (sequenceBreak.end() == cb.limit()) { break; } } }
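One way to "decorate" the timestamp pattern is to anchor it at a line start and let a lazy match run until a lookahead sees the next timestamp or the end of input. A Python sketch follows; the TIMESTAMP format is a placeholder assumption, since the post does not show the real one. Java should accept the same construction with Pattern.MULTILINE | Pattern.DOTALL and `\z` in place of `\Z`, and a Matcher works on any CharSequence, CharBuffer included.

```python
import re

# Hypothetical timestamp format; substitute the known timestamp regex here.
TIMESTAMP = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"

SEQUENCE = re.compile(
    r"^%s.*?(?=^%s|\Z)" % (TIMESTAMP, TIMESTAMP),
    re.MULTILINE | re.DOTALL,
)

def sequences(log):
    """Yield (first_line_number, text) for each timestamp-led sequence."""
    for m in SEQUENCE.finditer(log):
        # Count newlines before the match so line numbers stay exact.
        yield log.count("\n", 0, m.start()) + 1, m.group()
```

Each sequence keeps its timestamp and its trailing newline, so joining all sequences gives back the original text from the first timestamp onward.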

Read the article

Hyperlink regex including http(s):// not working in C#

- by Rory Fitzpatrick

I think this is sufficiently different from similar questions to warrant a new one. I have the following regex to match the beginning hyperlink tags in HTML, including the http(s):// part in order to avoid mailto: links <a[^>]*?href=[""'](?<href>\\b(https?)://[^\[\]""]+?)[""'][^>]*?> When I run this through Nregex (with escaping removed) it matches correctly for the following test cases: <a href="http://www.bbc.co.uk"> <a href="http://bbc.co.uk"> <a href="https://www.bbc.co.uk"> <a href="mailto:[email protected]"> However when I run this in my C# code it fails. Here is the matching code: public static IEnumerable<string> GetUrls(this string input, string matchPattern) { var matches = Regex.Matches(input, matchPattern, RegexOptions.Compiled | RegexOptions.IgnoreCase); foreach (Match match in matches) { yield return match.Groups["href"].Value; } } And my tests: @"<a href=""https://www.bbc.co.uk"">bbc</a>".GetUrls(StringExtensions.HtmlUrlRegexPattern).Count().ShouldEqual(1); @"<a href=""mailto:[email protected]"">bbc</a>".GetUrls(StringExtensions.HtmlUrlRegexPattern).Count().ShouldEqual(0); The problem seems to be in the \\b(https?):// part which I added, removing this passes the normal URL test but fails the mailto: test. Anyone shed any light?
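A likely culprit: if the pattern lives in a C# verbatim string (which the `""` escapes suggest), backslashes are not escape characters there, so `\\b` reaches the regex engine as "a literal backslash followed by b" rather than a word boundary. Python raw strings behave the same way, which the sketch below uses to show the effect. HREF and get_urls are illustrative names, and the `\b` is simply dropped because the quote before the URL already delimits it.

```python
import re

# In a raw string, \\b means: a literal backslash, then the letter b.
LITERAL_BACKSLASH = re.compile(r"\\bhttp")
WORD_BOUNDARY = re.compile(r"\bhttp")

HREF = re.compile(
    r"""<a[^>]*?href=["'](?P<href>https?://[^"'\[\]]+)["'][^>]*>""",
    re.IGNORECASE,
)

def get_urls(html):
    """Return the href of every anchor whose target starts with http(s)://."""
    return [m.group("href") for m in HREF.finditer(html)]
```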

Read the article

Python regex look-behind requires fixed-width pattern

- by invictus

Hi. When trying to extract the title of an HTML page I have always used the following regex: (?<=<title.*>)([\s\S]*)(?=</title>) which will extract everything between the tags in a document and ignore the tags themselves. However, when trying to use this regex in Python it raises the following exception: Traceback (most recent call last): File "test.py", line 21, in <module> pattern = re.compile('(?<=<title.*>)([\s\S]*)(?=</title>)') File "C:\Python31\lib\re.py", line 205, in compile return _compile(pattern, flags) File "C:\Python31\lib\re.py", line 273, in _compile p = sre_compile.compile(pattern, flags) File "C:\Python31\lib\sre_compile.py", line 495, in compile code = _code(p, flags) File "C:\Python31\lib\sre_compile.py", line 480, in _code _compile(code, p.data, flags) File "C:\Python31\lib\sre_compile.py", line 115, in _compile raise error("look-behind requires fixed-width pattern") sre_constants.error: look-behind requires fixed-width pattern The code I am using is: pattern = re.compile('(?<=<title.*>)([\s\S]*)(?=</title>)') m = pattern.search(f) If I make some minimal adjustments it works: pattern = re.compile('(?<=<title>)([\s\S]*)(?=</title>)') m = pattern.search(f) This will, however, not take into account potential HTML titles that for some reason have attributes or similar. Anyone know a good workaround for this issue? Any tips are appreciated.
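A common workaround is to drop the lookarounds altogether: match the tags normally and read the text from a capture group. A minimal sketch (TITLE and extract_title are names chosen here):

```python
import re

# <title ...> with optional attributes, lazy body, closing tag.
TITLE = re.compile(r"<title\b[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)

def extract_title(html):
    m = TITLE.search(html)
    # group(1) holds only the text between the tags.
    return m.group(1).strip() if m else None
```

The tags are still part of the overall match, but `group(1)` ignores them, which gives the same result the look-behind was meant to produce.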

Read the article

Bash and regex problem : check for tokens entered into a Coke vending machine

- by Michael Mao

Hi all: Here is a "challenge question" I've got from a Linux system programming lecture. Any of the following strings will give you a Coke if you kick: L = { aaaa, aab, aba, baa, bb, aaaa"a", aaaa"b", aab"a", … ab"b"a, ba"b"a, ab"bbbbbb"a, ... } The letters shown wrapped in double quotes indicate coins that would have fallen through (but those strings are still part of the language in this example). Exercise (a bit hard): show this is the language of a regular expression. And this is what I've got so far: #!/usr/bin/bash echo "A bottle of Coke costs you 40 cents" echo -e "Please enter tokens (a = 10 cents, b = 20 cents) in a sequence like 'abba' :\c" read tokens #if [ $tokens = aaaa ]||[ $tokens = aab ]||[ $tokens = bb ] #then # echo "Good! now a coke is yours!" #else echo "Thanks for your money, byebye!" if [[ $token =~ 'aaaa|aab|bb' ]] then echo "Good! now a coke is yours!" else echo "Thanks for your money, byebye!" fi Sadly it doesn't work... it always outputs "Thanks for your money, byebye!" I believe something is wrong with the syntax... We weren't provided with any good reference book, and the only instruction from the professor was to consult "anything you find useful online" and "research the problem yourself" :( I know how I could do it in a programming language such as Java, but getting it done with a bash script + regex seems not "a bit hard" but in fact "too hard" for anyone with little knowledge of something as advanced as "lookahead" (is this the terminology?). I don't know if there is a way to express the following concept in the language of regex: a valid entry would consist of exactly one of the three components aaaa, aab and bb, regardless of order, followed by an arbitrary sequence of a's or b's. So this is what it should be like: (a{4}Ua{2}bUb{2})(aUb)* where the content in the first braces is order-irrelevant. Thanks a lot in advance for any hints and/or tips :)
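Two things in the script itself stand out: the value is read into `tokens` but tested as `$token`, and in recent bash versions a quoted right-hand side of `=~` is matched as a literal string. As for the language, no lookahead is needed. Reading the machine as five states (0, 10, 20, 30, 40 cents), where a coin that would overshoot 40 falls through, gives the expression below. The Python sketch checks that regex against a direct simulation; COKE, pays_for_coke and agree are invented names, and the fall-through rule is my reading of the exercise.

```python
import re
from itertools import product

# Reach 20 via "aa" or "b"; from 20 a "b" pays up; otherwise reach 30,
# let extra b's fall through, finish with "a"; afterwards anything goes.
COKE = re.compile(r"((aa|b)b|((aa|b)a|ab)b*a)[ab]*")

def pays_for_coke(tokens):
    """Simulate the machine: a = 10c, b = 20c, overshooting coins fall through."""
    total = 0
    for coin in tokens:
        value = 10 if coin == "a" else 20
        if total + value <= 40:
            total += value
    return total == 40

def agree(max_len):
    """Brute-force check that the regex and the simulation accept the same strings."""
    return all(
        bool(COKE.fullmatch(s)) == pays_for_coke(s)
        for n in range(max_len + 1)
        for s in map("".join, product("ab", repeat=n))
    )
```

In bash the same pattern could be tried unquoted and anchored, for example `[[ $tokens =~ ^((aa|b)b|((aa|b)a|ab)b*a)[ab]*$ ]]`.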

Read the article

Regex statements for date ranges <=4/1/2009 and <=10/01/2009

- by reggiereg

Hi, I need serious help building two Regex statements for a project. The software we're using ONLY accepts Regex for validation. I need one that fires for any date <4/1/2009 and a second that fires for any date <10/1/2009 My co-worker gave me the following code to check for <=10/01/2010, but it checks leap years and all that stuff. I need something a little more streamlined than this in the MM/DD/YYYY format. Thanks in advance! ^(?:(?:0?[1-9])|(?:1[0-2]))(\/|-|.)(?:0?[1-9]|1\d|2[0-8])(\/|-|.)(?:2[0-9][2-9][0-9])$|^(?:(?:0?[1-9])|(?:1[0-2]))(\/|-|.)(?:0?[1-9]|1\d|2[0-8])(\/|-|.)(?:201[1-9])$|^(?:(?:(?:0?[13578]|1[02])(\/|-|.)31)|(?:(?:0?[1,3-9]|1[0-2])(\/|-|.)(?:29|30)))(\/|-|.)(?:201[1-9])$|^(?:(?:(?:11)(\/|-|.))(?:0?[1-9]|1\d|2[0-9]|30)(\/|-|.))(2010)$|^(?:(?:(?:10|12)(\/|-|.))(?:0?[1-9]|1\d|2[0-9]|30|31)(\/|-|.))(2010)$|^(?:(?:0?[1-9])|(?:1[0-2]))(\/|-|.)(?:0?[1-9]|1\d|2[0-8])(\/|-|.)(?:2[0-9][2-9][0-9])$|^(?:(?:(?:0?[13578]|1[02])(\/|-|.)31)\1|(?:(?:0?[1,3-9]|1[0-2])(\/|-|.)(?:29|30)))(\/|-|.)(?:2[0-9][2-9][0-9])$|^(?:(?:0?[1-9])|(?:1[0-2]))(\/|-|.)(?:0?[1-9]|1\d|2[0-8])(\/|-|.)(?:2011)$|^(?:0?2(\/|-|.)29\3(?:(?:(?:2[0-9][1-9]\d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))$
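If day-per-month and leap-year checks can be left out, "before a cutoff" splits into two simple cases: any date in an earlier year, or an early enough month in the cutoff year. The Python sketch below makes some assumptions: "/" is the only separator, leading zeros are optional, and years before 2009 are covered back to 1000. It also brute-force checks the patterns against the calendar. All names are invented for the example.

```python
import re
from datetime import date, timedelta

DAY = r"(?:0?[1-9]|[12]\d|3[01])"
EARLIER_YEAR = r"(?:0?[1-9]|1[0-2])/" + DAY + r"/(?:1\d{3}|200[0-8])"

# Dates before 4/1/2009: any earlier year, or January-March 2009.
BEFORE_APR_2009 = re.compile(EARLIER_YEAR + r"|0?[1-3]/" + DAY + r"/2009")
# Dates before 10/1/2009: any earlier year, or January-September 2009.
BEFORE_OCT_2009 = re.compile(EARLIER_YEAR + r"|0?[1-9]/" + DAY + r"/2009")

def agrees_with_calendar(pattern, cutoff, start=date(2008, 1, 1), end=date(2010, 12, 31)):
    """Check the pattern against every real date in [start, end], padded and unpadded."""
    d = start
    while d <= end:
        for text in ("%d/%d/%d" % (d.month, d.day, d.year), d.strftime("%m/%d/%Y")):
            if bool(pattern.fullmatch(text)) != (d < cutoff):
                return False
        d += timedelta(days=1)
    return True
```

For a validator that only accepts a regex, the pattern would need explicit `^(?:...)$` anchors around the whole alternation.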

Read the article

c# regex split and extract multiple parts from a string

- by nLL

Hi, I am trying to extract some parts of the "Video:" line from the text below. Seems stream 0 codec frame rate differs from container frame rate: 30000.00 (300 00/1) - 14.93 (1000/67) Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'C:\a.3gp': Metadata: major_brand : 3gp5 minor_version : 0 compatible_brands: 3gp5isom Duration: 00:00:45.82, start: 0.000000, bitrate: 357 kb/s Stream #0.0(und): Video: mpeg4, yuv420p, 352x276 [PAR 1:1 DAR 88:69], 344 kb /s, 14.93 fps, 14.93 tbr, 90k tbn, 30k tbc Stream #0.1(und): Audio: aac, 16000 Hz, mono, s16, 11 kb/s Stream #0.2(und): Data: mp4s / 0x7334706D, 0 kb/s Stream #0.3(und): Data: mp4s / 0x7334706D, 0 kb/s* This is the output of the ffmpeg command line, from which I can get the Video: part with private string ExtractVideoFormat(string rawInfo) { string v = string.Empty; Regex re = new Regex("[V|v]ideo:.*", RegexOptions.Compiled); Match m = re.Match(rawInfo); if (m.Success) { v = m.Value; } return v; } and the result is mpeg4, yuv420p, 352x276 [PAR 1:1 DAR 88:69], 344 kb What I am trying to do is to somehow split that line and get mpeg4 yuv420p 352x276 [PAR 1:1 DAR 88:69] 344 kb assigned to different string objects instead of a single one.
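Once the "Video:" line is isolated, splitting on commas is probably all that is needed. A Python sketch of the idea (in C# the equivalent would be a capture group plus string.Split). It assumes the line is not wrapped by the console, unlike the "344 kb /s" in the paste, and the names are made up.

```python
import re

# Capture everything after "Video:" up to the end of that line.
VIDEO = re.compile(r"Video:\s*(.+)", re.IGNORECASE)

def extract_video_fields(raw_info):
    m = VIDEO.search(raw_info)
    if m is None:
        return []
    # One list entry per comma-separated field, surrounding spaces removed.
    return [part.strip() for part in m.group(1).split(",")]
```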

Read the article

PHP regex - find and replace

- by jay

Hi, I am trying to do this regex match and replace but not able to do it. Example <SPAN class="one">first content here</SPAN> <SPAN class="two">second content here </SPAN> <SPAN class="three">one; two; three; and more.</span> <SPAN class="four">more content here.</span> I want to find each set of the span tags and replace with something like this Find <SPAN class="one">first content here</SPAN> Change to <one>first content here</one> same way the the rest of the span tags. class="one", class="two" and so on are the only key identifier which I use in the regex match expression. So if I find a span tag with these class then I want to do the replace. My main issue is that I am not able to find the occurrence of first closing tag so what it does is it finds from the start to end which is of no use. So far I have been trying to do this using notepad++ but just found that it has its limitations so any php help would be appreciated. regards
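Matching "from the start to the end" is the classic symptom of a greedy `.*`; a lazy `.*?` stops at the first closing tag instead. Sketched in Python below; the pattern should port to PHP's preg_replace with the `i` and `s` modifiers and `$1`/`$2` in the replacement. The class list is taken from the example, and nested spans are not handled.

```python
import re

SPAN = re.compile(
    r'<span\s+class="(one|two|three|four)"\s*>(.*?)</span>',
    re.IGNORECASE | re.DOTALL,
)

def spans_to_tags(html):
    # \1 is the class name, \2 the content up to the FIRST closing tag.
    return SPAN.sub(r"<\1>\2</\1>", html)
```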

Read the article

regex split and extract multiple parts from a string

- by nLL

I am trying to extract some parts of the "Video:" line from below text. Seems stream 0 codec frame rate differs from container frame rate: 30000.00 (300 00/1) -> 14.93 (1000/67) Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'C:\a.3gp': Metadata: major_brand : 3gp5 minor_version : 0 compatible_brands: 3gp5isom Duration: 00:00:45.82, start: 0.000000, bitrate: 357 kb/s Stream #0.0(und): Video: mpeg4, yuv420p, 352x276 [PAR 1:1 DAR 88:69], 344 kb /s, 14.93 fps, 14.93 tbr, 90k tbn, 30k tbc Stream #0.1(und): Audio: aac, 16000 Hz, mono, s16, 11 kb/s Stream #0.2(und): Data: mp4s / 0x7334706D, 0 kb/s Stream #0.3(und): Data: mp4s / 0x7334706D, 0 kb/s* This is an output from ffmpeg command line where i can get Video: part with private string ExtractVideoFormat(string rawInfo) { string v = string.Empty; Regex re = new Regex("[V|v]ideo:.*", RegexOptions.Compiled); Match m = re.Match(rawInfo); if (m.Success) { v = m.Value; } return v; } and result is mpeg4, yuv420p, 352x276 [PAR 1:1 DAR 88:69], 344 kb What i am trying to do is to somehow split that line and get mpeg4 yuv420p 352x276 [PAR 1:1 DAR 88:69] 344 kb assigned to different string objects instead of single

Read the article

Regex to check if exact string exists

- by Jayrox

I am looking for a way to check if an exact string match exists in another string using Regex or any better method suggested. I understand that you tell regex to match a space or any other non-word character at the beginning or end of a string. However, I don't know exactly how to set it up. Search String: t String 1: Hello World, Nice to see you! t String 2: Hello World, Nice to see you! String 3: T Hello World, Nice to see you! I would like to use the search string and compare it to String 1, String 2 and String 3 and only get a positive match from String 1 and String 3 but not from String 2. Requirements: Search String may be at any character position in the Subject. There may or may not be a white-space character before or after it. I do not want it to match if it is part of another string; such as part of a word. For the sake of this question: I think I would do this using this pattern: /\bt\b/gi /\b{$search_string}\b/gi Does this look right? Can it be made better? Any situations where this pattern wouldn't work? Additional info: this will be used in PHP 5
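The `\b` version works for plain words, but it can misbehave when the search string starts or ends with a non-word character, and special characters in the search string need escaping. A Python sketch using lookarounds instead (contains_exact is a made-up name); in PHP the escaping step would be preg_quote.

```python
import re

def contains_exact(search, subject):
    # Not preceded and not followed by a word character; search text taken literally.
    pattern = r"(?<!\w)" + re.escape(search) + r"(?!\w)"
    return re.search(pattern, subject, re.IGNORECASE) is not None
```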

Read the article

Regex expression is too greedy

- by alastairs

I'm writing a regular expression to match data from the IMDb soundtracks data file. My regexes are mostly working, although in places they are slurping too much text into my named groups. Take the following regex for example: "^ Performed by '?(?<performer>.*)('? \(qv\))?$" The performer group includes the string ' (qv) as well as the performer's name. Unfortunately, because the records are not consistently formatted, some performers' names are surrounded by single quotation marks whilst others are not. This means they are optional as far as the regex is concerned. I've tried marking the last group as a greedy group using the ?> group specifier, but this appeared to have no effect on the results. I can improve the results by changing the performer group to match a small range of characters, but this reduces my chances of parsing the name out correctly. Furthermore, if I were to just exclude the apostrophe character, I would then be unable to parse, e.g., band names containing apostrophes, such as Elia's Lonely Friends Band who performed Run For Your Life featured in Resident Evil: Apocalypse.
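Making the name group lazy is usually enough: `.*?` then grows only until the optional quote, the optional " (qv)" and the end of line can all match. A Python sketch follows. It uses `(?P<name>...)` because that is Python's named-group syntax, and "Some Artist" in the usage is a made-up record.

```python
import re

PERFORMED_BY = re.compile(
    r"^\s*Performed by '?(?P<performer>.*?)'?(?: \(qv\))?$"
)

def performer(line):
    m = PERFORMED_BY.match(line)
    return m.group("performer") if m else None
```

Apostrophes inside the name survive because the lazy group keeps extending until the rest of the pattern fits.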

Read the article

JavaScript Regex: Complicated input validation

- by ScottSEA

I'm trying to construct a regex to screen valid part and/or serial numbers in combination, with ranges. A valid part number is a two alpha, three digit pattern or /[A-z]{2}\d{3}/ i.e. aa123 or ZZ443 etc... A valid serial number is a five digit pattern, or /\d{5}/ 13245 or 31234 and so on. That part isn't the problem. I want combinations and ranges to be valid as well: 12345, ab123,ab234-ab245, 12346 - 12349 - the ultimate goal. Ranges and/or series of part and/or serial numbers in any combination. Note that spaces are optional when specifying a range or after a comma in a series. Note that a range of part numbers has the same two letter combination on both sides of the range (i.e. ab123 - ab239) I have been wrestling with this expression for two days now, and haven't come up with anything better than this: /^(?:[A-z]{2}\d{3}[, ]*)|(?:\d{5}[, ]*)|(?:([A-z]{2})\d{3} ?- ?\4\d{3}[, ]*)|(?:\d{5} ?- ?\d{5}[, ]*)$/ ... My Regex-Fu is weak.
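Two things stand out in the attempt: the `^` and `$` each bind to only one alternative, and `[A-z]` also matches the punctuation between "Z" and "a" in ASCII. Splitting on commas first keeps the regex small, as in this Python sketch (the same split-then-test approach should work in JavaScript). ITEM and is_valid are illustrative names.

```python
import re

ITEM = re.compile(
    r"\d{5}(?:\s*-\s*\d{5})?"                    # serial number or serial range
    r"|([A-Za-z]{2})\d{3}(?:\s*-\s*\1\d{3})?"    # part number or range with the same prefix
)

def is_valid(entry):
    """Every comma-separated item must be a serial, a part, or a range of either."""
    items = [item.strip() for item in entry.split(",")]
    return all(ITEM.fullmatch(item) for item in items)
```

The backreference `\1` is what enforces the same two letters on both sides of a part range.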

Read the article

Python re module becomes 20 times slower when called on greater than 101 different regex

- by Wiil

My problem is about parsing log files and removing the variable parts of each line so that the lines can be grouped. For instance: s = re.sub(r'(?i)User [_0-9A-z]+ is ', r"User .. is ", s) s = re.sub(r'(?i)Message rejected because : (.*?) \(.+\)', r'Message rejected because : \1 (...)', s) I have about 120+ matching rules like those above. I have found no performance issues when applying 100 different regexes in succession. But a huge slowdown comes when applying 101 regexes. Exactly the same behavior happens when replacing my rule set with for a in range(100): s = re.sub(r'(?i)caught here'+str(a)+':.+', r'( ... )', s) It gets 20 times slower when using range(101) instead. # range(100) % ./dashlog.py file.bz2 == Took 2.1 seconds. == # range(101) % ./dashlog.py file.bz2 == Took 47.6 seconds. == Why does this happen? And is there any known workaround? (Happens on Python 2.6.6/2.7.2 on Linux/Windows.)
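A plausible explanation: module-level functions such as re.sub keep compiled patterns in a bounded internal cache, and as far as I know that cache held 100 entries in Python 2 and was emptied when full, so the 101st pattern forces constant recompilation. Compiling each rule once and reusing the pattern objects sidesteps the cache entirely. A minimal sketch, with the two rules from the post and `[A-z]` narrowed to `[A-Za-z]`:

```python
import re

# Compile every rule once, up front; the module-level cache is never involved.
RULES = [
    (re.compile(r"(?i)User [_0-9A-Za-z]+ is "), r"User .. is "),
    (re.compile(r"(?i)Message rejected because : (.*?) \(.+\)"),
     r"Message rejected because : \1 (...)"),
]

def normalize(line):
    """Apply every rule in order and return the line with variable parts removed."""
    for pattern, replacement in RULES:
        line = pattern.sub(replacement, line)
    return line
```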

Search Results

Search found 10005 results on 401 pages for 'regex trouble'.
