Search Results

Search found 4539 results on 182 pages for 'regex grouping'.

Page 17/182 | < Previous Page | 13 14 15 16 17 18 19 20 21 22 23 24  | Next Page >

  • RegEx to ignore / skip everything in html tags

    - by Scott Sumpter
    Looking for a way to combine two Regular Expressions. One to catch the urls and the other to ensure is skips text within html tags. See sample text below functions. Need to pass a block of news text and format text by wrapping urls and email addresses in html tags so users don't have to. The below code works great until there are already html tags within the text. In that case it doubles the html tags. There are plenty of examples to strip html, but I want to just ignore it since the url is already linkified. Also - if there is an easier was to accomplish this, with or without Regex, please let me know. none of my attempts to combine Regexs have worked. coding in ASP.NET VB but will take any workable example/direction. Thanks! ===== Functions ============= Public Shared Function InsertHyperlinks(ByVal inText As String) As String Dim strBuf As String Dim objMatches As Object Dim iStart, iEnd As Integer strBuf = "" iStart = 1 iEnd = 1 Dim strRegUrlEmail As String = "\b(www|http|\S+@)\S+\b" 'RegEx to find urls and email addresses Dim objRegExp As New Regex(strRegUrlEmail, RegexOptions.IgnoreCase) 'Match URLs and emails Dim MatchList As MatchCollection = objRegExp.Matches(inText) If MatchList.Count <> 0 Then objMatches = objRegExp.Matches(inText) For Each Match In MatchList iEnd = Match.Index strBuf = strBuf & Mid(inText, iStart, iEnd - iStart + 1) If InStr(1, Match.Value, "@") Then strBuf = strBuf & HrefGet(Match.Value, "EMAIL", "_BLANK") Else strBuf = strBuf & HrefGet(Match.Value, "WEB", "_BLANK") End If iStart = iEnd + Match.Length + 1 Next strBuf = strBuf & Mid(inText, iStart) InsertHyperlinks = strBuf Else 'No hyperlinks to replace InsertHyperlinks = inText End If End Function Shared Function HrefGet(ByVal url As String, ByVal urlType As String, ByVal Target As String) As String Dim strBuf As String strBuf = "<a href=""" If UCase(urlType) = "WEB" Then If LCase(Left(url, 3)) = "www" Then strBuf = "<a href=""http://" & url & """ Target=""" & _ Target & """>" & url & "</a>" Else strBuf = "<a href=""" & url & """ Target=""" & _ Target & """>" & url & "</a>" End If ElseIf UCase(urlType) = "EMAIL" Then strBuf = "<a href=""mailto:" & url & """ Target=""" & _ Target & """>" & url & "</a>" End If HrefGet = strBuf End Function ===== Sample Text ============= This would be the inText parameter. Midway through the ride, we see a Skip this too. But sometimes we go here [insert normal www dot link dot com]. If you'd like to join us contact Bill Smith at [email protected]. Thanks! sorry stack overflow won't allow multiple hyperlinks to be added. ===== End Sample Text =============

    Read the article

  • Advanced Regex: Smart auto detect and replace URLs with anchor tags

    - by Robert Koritnik
    I've written a regular expression that automatically detects URLs in free text that users enter. This is not such a simple task as it may seem at first. Jeff Atwood writes about it in his post. His regular expression works, but needs extra code after detection is done. I've managed to write a regular expression that does everything in a single go. This is how it looks like (I've broken it down into separate lines to make it more understandable what it does): 1 (?<outer>\()? 2 (?<scheme>http(?<secure>s)?://)? 3 (?<url> 4 (?(scheme) 5 (?:www\.)? 6 | 7 www\. 8 ) 9 [a-z0-9] 10 (?(outer) 11 [-a-z0-9/+&@#/%?=~_()|!:,.;cšžcd]+(?=\)) 12 | 13 [-a-z0-9/+&@#/%?=~_()|!:,.;cšžcd]+ 14 ) 15 ) 16 (?<ending>(?(outer)\))) As you may see, I'm using named capture groups (used later in Regex.Replace()) and I've also included some local characters (cšžcd), that allow our localised URLs to be parsed as well. You can easily omit them if you'd like. Anyway. Here's what it does (referring to line numbers): 1 - detects if URL starts with open braces (is contained inside braces) and stores it in "outer" named capture group 2 - checks if it starts with URL scheme also detecting whether scheme is SSL or not 3 - start parsing URL itself (will store it in "url" named capture group) 4-8 - if statement that says: if "sheme" was present then www. part is optional, otherwise mandatory for a string to be a link (so this regular expression detects all strings that start with either http or www) 9 - first character after http:// or www. should be either a letter or a number (this can be extended if you'd like to cover even more links, but I've decided not to because I can't think of a link that would start with some obscure character) 10-14 - if statement that says: if "outer" (braces) was present capture everything up to the last closing braces otherwise capture all 15 - closes the named capture group for URL 16 - if open braces were present, capture closing braces as well and store it in "ending" named capture group First and last line used to have \s* in them as well, so user could also write open braces and put a space inside before pasting link. Anyway. My code that does link replacement with actual anchor HTML elements looks exactly like this: value = Regex.Replace( value, @"(?<outer>\()?(?<scheme>http(?<secure>s)?://)?(?<url>(?(scheme)(?:www\.)?|www\.)[a-z0-9](?(outer)[-a-z0-9/+&@#/%?=~_()|!:,.;cšžcd]+(?=\))|[-a-z0-9/+&@#/%?=~_()|!:,.;cšžcd]+))(?<ending>(?(outer)\)))", "${outer}<a href=\"http${secure}://${url}\">http${secure}://${url}</a>${ending}", RegexOptions.Compiled | RegexOptions.CultureInvariant | RegexOptions.IgnoreCase); As you can see I'm using named capture groups to replace link with an Anchor tag: "${outer}<a href=\"http${secure}://${url}\">http${secure}://${url}</a>${ending}" I could as well omit the http(s) part in anchor display to make links look friendlier, but for now I decided not to. Question I would like my links to be replaced with shortenings as well. So when user copies a very long link (for instance if they would copy a link from google maps that usually generates long links) I would like to shorten the visible part of the anchor tag. Link would work, but visible part of an anchor tag would be shortened to some number of characters. I could as well append ellipsis at the end of at all possible (and make things even more perfect). Does Regex.Replace() method support replacement notations so that I can still use a single call? Something similar as string.Format() method does when you'd like to format values in string format (decimals, dates etc...).

    Read the article

  • C# Regex replace url

    - by Martijn
    I have a bunch of links in a document which has to be replaced by a javascript call. All the links looks the same: <a href="http://domain/ViewDocument.aspx?id=3D1&doc=form" target="_blank">Document naam 1</a> <a href="http://domain/ViewDocument.aspx?id=3D2&doc=form" target="_blank">Document naam 2</a> <a href="http://domain/ViewDocument.aspx?id=3D3&doc=form" target="_blank">Document naam 3</a> Now I want all this links to be replaced to: <a href="javascript:loadDocument('1','form')">Document naam 1</a> <a href="javascript:loadDocument('2','form')">Document naam 2</a> <a href="javascript:loadDocument('3','form')">Document naam 3</a> So the Id=3D in the url is the first parameter in the function and the doc parameter is the second parameter in the function call. I want to do this using Regex because I think this is the quickest way. But the problem is my regex knowledge is too limited

    Read the article

  • Regex to remove conditional comments

    - by cnu
    I want a regex which can match conditional comments in a HTML source page so I can remove only those. I want to preserve the regular comments. I would also like to avoid using the .*? notation if possible. The text is foo <!--[if IE]> <style type="text/css"> ul.menu ul li{ font-size: 10px; font-weight:normal; padding-top:0px; } </style> <![endif]--> bar and I want to remove everything in <!--[if IE]> and <![endif]--> EDIT: It is because of BeautifulSoup I want to remove these tags. BeautifulSoup fails to parse and gives an incomplete source EDIT2: [if IE] isn't the only condition. There are lots more and I don't have any list of all possible combinations. EDIT3: Vinko Vrsalovic's solution works, but the actual problem why beautifulsoup failed was because of a rogue comment within the conditional comment. Like <!--[if lt IE 7.]> <script defer type="text/javascript" src="pngfix_253168.js"></script><!--png fix for IE--> <![endif]--> Notice the <!--png fix for IE--> comment? Though my problem was solve, I would love to get a regex solution for this.

    Read the article

  • java regex: capture multiline sequence between tokens

    - by Guillaume
    I'm struggling with regex for splitting logs files into log sequence in order to match pattern inside these sequences. log format is: timestamp fieldA fieldB fieldn log message1 timestamp fieldA fieldB fieldn log message2 log message2bis timestamp fieldA fieldB fieldn log message3 The timestamp regex is known. I want to extract every log sequence (potentialy multiline) between timestamps. And I want to keep the timestamp. I want in the same time to keep the exact count of lines. What I need is how to decorate timestamp pattern to make it split my log file in log sequence. I can not split the whole file as a String, since the file content is provided in a CharBuffer Here is sample method that will be using this log sequence matcher: private void matches(File f, CharBuffer cb) { Matcher sequenceBreak = sequencePattern.matcher(cb); // sequence matcher int lines = 1; int sequences = 0; while (sequenceBreak.find()) { sequences++; String sequence = sequenceBreak.group(); if (filter.accept(sequence)) { System.out.println(f + ":" + lines + ":" + sequence); } //count lines Matcher lineBreak = LINE_PATTERN.matcher(sequence); while (lineBreak.find()) { lines++; } if (sequenceBreak.end() == cb.limit()) { break; } } }

    Read the article

  • Hyperlink regex including http(s):// not working in C#

    - by Rory Fitzpatrick
    I think this is sufficiently different from similar questions to warrant a new one. I have the following regex to match the beginning hyperlink tags in HTML, including the http(s):// part in order to avoid mailto: links <a[^>]*?href=[""'](?<href>\\b(https?)://[^\[\]""]+?)[""'][^>]*?> When I run this through Nregex (with escaping removed) it matches correctly for the following test cases: <a href="http://www.bbc.co.uk"> <a href="http://bbc.co.uk"> <a href="https://www.bbc.co.uk"> <a href="mailto:[email protected]"> However when I run this in my C# code it fails. Here is the matching code: public static IEnumerable<string> GetUrls(this string input, string matchPattern) { var matches = Regex.Matches(input, matchPattern, RegexOptions.Compiled | RegexOptions.IgnoreCase); foreach (Match match in matches) { yield return match.Groups["href"].Value; } } And my tests: @"<a href=""https://www.bbc.co.uk"">bbc</a>".GetUrls(StringExtensions.HtmlUrlRegexPattern).Count().ShouldEqual(1); @"<a href=""mailto:[email protected]"">bbc</a>".GetUrls(StringExtensions.HtmlUrlRegexPattern).Count().ShouldEqual(0); The problem seems to be in the \\b(https?):// part which I added, removing this passes the normal URL test but fails the mailto: test. Anyone shed any light?

    Read the article

  • Python regex look-behind requires fixed-width pattern

    - by invictus
    Hi When trying to extract the title of a html-page I have always used the following regex: (?<=<title.*>)([\s\S]*)(?=</title>) Which will extract everything between the tags in a document and ignore the tags themselves. However, when trying to use this regex in Python it raises the following Exception: Traceback (most recent call last): File "test.py", line 21, in pattern = re.compile('(?<=)([\s\S]*)(?=)') File "C:\Python31\lib\re.py", line 205, in compile return _compile(pattern, flags) File "C:\Python31\lib\re.py", line 273, in _compile p = sre_compile.compile(pattern, flags) File "C:\Python31\lib\sre_compile.py", line 495, in compile code = _code(p, flags) File "C:\Python31\lib\sre_compile.py", line 480, in _code _compile(code, p.data, flags) File "C:\Python31\lib\sre_compile.py", line 115, in _compile raise error("look-behind requires fixed-width pattern") sre_constants.error: look-behind requires fixed-width pattern The code I am using is: pattern = re.compile('(?<=<title.*>)([\s\S]*)(?=</title>)') m = pattern.search(f) if I do some minimal adjustments it works: pattern = re.compile('(?<=<title>)([\s\S]*)(?=</title>)') m = pattern.search(f) This will, however, not take into account potential html titles that for some reason have attributes or similar. Anyone know a good workaround for this issue? Any tips are appreciated.

    Read the article

  • Regex statements for date ranges <=4/1/2009 and <=10/01/2009

    - by reggiereg
    Hi, I need serious help building two Regex statements for a project. The software we're using ONLY accepts Regex for validation. I need one that fires for any date <4/1/2009 and a second that fires for any date <10/1/2009 My co-worker gave me the following code to check for <=10/01/2010, but it checks leap years and all that stuff. I need something a little more streamlined than this in the MM/DD/YYYY format. Thanks in advance! ^(?:(?:0?[1-9])|(?:1[0-2]))(\/|-|.)(?:0?[1-9]|1\d|2[0-8])(\/|-|.)(?:2[0-9][2-9][0-9])$|^(?:(?:0?[1-9])|(?:1[0-2]))(\/|-|.)(?:0?[1-9]|1\d|2[0-8])(\/|-|.)(?:201[1-9])$|^(?:(?:(?:0?[13578]|1[02])(\/|-|.)31)|(?:(?:0?[1,3-9]|1[0-2])(\/|-|.)(?:29|30)))(\/|-|.)(?:201[1-9])$|^(?:(?:(?:11)(\/|-|.))(?:0?[1-9]|1\d|2[0-9]|30)(\/|-|.))(2010)$|^(?:(?:(?:10|12)(\/|-|.))(?:0?[1-9]|1\d|2[0-9]|30|31)(\/|-|.))(2010)$|^(?:(?:0?[1-9])|(?:1[0-2]))(\/|-|.)(?:0?[1-9]|1\d|2[0-8])(\/|-|.)(?:2[0-9][2-9][0-9])$|^(?:(?:(?:0?[13578]|1[02])(\/|-|.)31)\1|(?:(?:0?[1,3-9]|1[0-2])(\/|-|.)(?:29|30)))(\/|-|.)(?:2[0-9][2-9][0-9])$|^(?:(?:0?[1-9])|(?:1[0-2]))(\/|-|.)(?:0?[1-9]|1\d|2[0-8])(\/|-|.)(?:2011)$|^(?:0?2(\/|-|.)29\3(?:(?:(?:2[0-9][1-9]\d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))$

    Read the article

  • c# regex split and extract multiple parts from a string

    - by nLL
    Hi, I am trying to extract some parts of the "Video:" line from below text. Seems stream 0 codec frame rate differs from container frame rate: 30000.00 (300 00/1) - 14.93 (1000/67) Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'C:\a.3gp': Metadata: major_brand : 3gp5 minor_version : 0 compatible_brands: 3gp5isom Duration: 00:00:45.82, start: 0.000000, bitrate: 357 kb/s Stream #0.0(und): Video: mpeg4, yuv420p, 352x276 [PAR 1:1 DAR 88:69], 344 kb /s, 14.93 fps, 14.93 tbr, 90k tbn, 30k tbc Stream #0.1(und): Audio: aac, 16000 Hz, mono, s16, 11 kb/s Stream #0.2(und): Data: mp4s / 0x7334706D, 0 kb/s Stream #0.3(und): Data: mp4s / 0x7334706D, 0 kb/s* This is an output from ffmpeg command line where i can get Video: part with private string ExtractVideoFormat(string rawInfo) { string v = string.Empty; Regex re = new Regex("[V|v]ideo:.*", RegexOptions.Compiled); Match m = re.Match(rawInfo); if (m.Success) { v = m.Value; } return v; } and result is mpeg4, yuv420p, 352x276 [PAR 1:1 DAR 88:69], 344 kb What i am trying to do is to somehow split that line and get mpeg4 yuv420p 352x276 [PAR 1:1 DAR 88:69] 344 kb assigned to diffrent string objects instead of single

    Read the article

  • Bash and regex problem : check for tokens entered into a Coke vending machine

    - by Michael Mao
    Hi all: Here is a "challenge question" I've got from Linux system programming lecture. Any of the following strings will give you a Coke if you kick: L = { aaaa, aab, aba, baa, bb, aaaa"a", aaaa"b", aab"a", … ab"b"a, ba"b"a, ab"bbbbbb"a, ... } The letters shown in wrapped double quotes indicate coins that would have fallen through (but those strings are still part of the language in this example). Exercise (a bit hard) show this is the language of a regular expression And this is what I've got so far : #!/usr/bin/bash echo "A bottle of Coke costs you 40 cents" echo -e "Please enter tokens (a = 10 cents, b = 20 cents) in a sequence like 'abba' :\c" read tokens #if [ $tokens = aaaa ]||[ $tokens = aab ]||[ $tokens = bb ] #then # echo "Good! now a coke is yours!" #else echo "Thanks for your money, byebye!" if [[ $token =~ 'aaaa|aab|bb' ]] then echo "Good! now a coke is yours!" else echo "Thanks for your money, byebye!" fi Sadly it doesn't work... always outputs "Thanks for your money, byebye!" I believe something is wrong with syntax... We didn't provided with any good reference book and the only instruction from the professor was to consult "anything you find useful online" and "research the problem yourself" :( I know how could I do it in any programming language such as Java, but get it done with bash script + regex seems not "a bit hard" but in fact "too hard" for anyone with little knowledge on something advanced as "lookahead"(is this the terminology ?) I don't know if there is a way to express the following concept in the language of regex: Valid entry would consist of exactly one of the three components : aaaa, aab and bb, regardless of order, followed by an arbitrary sequence of a or b's So this is what is should be like : (a{4}Ua{2}bUb{2})(aUb)* where the content in first braces is order irrelevant. Thanks a lot in advance for any hints and/or tips :)

    Read the article

  • PHP regex - find and replace

    - by jay
    Hi, I am trying to do this regex match and replace but not able to do it. Example <SPAN class="one">first content here</SPAN> <SPAN class="two">second content here </SPAN> <SPAN class="three">one; two; three; and more.</span> <SPAN class="four">more content here.</span> I want to find each set of the span tags and replace with something like this Find <SPAN class="one">first content here</SPAN> Change to <one>first content here</one> same way the the rest of the span tags. class="one", class="two" and so on are the only key identifier which I use in the regex match expression. So if I find a span tag with these class then I want to do the replace. My main issue is that I am not able to find the occurrence of first closing tag so what it does is it finds from the start to end which is of no use. So far I have been trying to do this using notepad++ but just found that it has its limitations so any php help would be appreciated. regards

    Read the article

  • regex split and extract multiple parts from a string

    - by nLL
    I am trying to extract some parts of the "Video:" line from below text. Seems stream 0 codec frame rate differs from container frame rate: 30000.00 (300 00/1) -> 14.93 (1000/67) Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'C:\a.3gp': Metadata: major_brand : 3gp5 minor_version : 0 compatible_brands: 3gp5isom Duration: 00:00:45.82, start: 0.000000, bitrate: 357 kb/s Stream #0.0(und): Video: mpeg4, yuv420p, 352x276 [PAR 1:1 DAR 88:69], 344 kb /s, 14.93 fps, 14.93 tbr, 90k tbn, 30k tbc Stream #0.1(und): Audio: aac, 16000 Hz, mono, s16, 11 kb/s Stream #0.2(und): Data: mp4s / 0x7334706D, 0 kb/s Stream #0.3(und): Data: mp4s / 0x7334706D, 0 kb/s* This is an output from ffmpeg command line where i can get Video: part with private string ExtractVideoFormat(string rawInfo) { string v = string.Empty; Regex re = new Regex("[V|v]ideo:.*", RegexOptions.Compiled); Match m = re.Match(rawInfo); if (m.Success) { v = m.Value; } return v; } and result is mpeg4, yuv420p, 352x276 [PAR 1:1 DAR 88:69], 344 kb What i am trying to do is to somehow split that line and get mpeg4 yuv420p 352x276 [PAR 1:1 DAR 88:69] 344 kb assigned to different string objects instead of single

    Read the article

  • Regex to check if exact string exists

    - by Jayrox
    I am looking for a way to check if an exact string match exists in another string using Regex or any better method suggested. I understand that you tell regex to match a space or any other non-word character at the beginning or end of a string. However, I don't know exactly how to set it up. Search String: t String 1: Hello World, Nice to see you! t String 2: Hello World, Nice to see you! String 3: T Hello World, Nice to see you! I would like to use the search string and compare it to String 1, String 2 and String 3 and only get a positive match from String 1 and String 3 but not from String 2. Requirements: Search String may be at any character position in the Subject. There may or may not be a white-space character before or after it. I do not want it to match if it is part of another string; such as part of a word. For the sake of this question: I think I would do this using this pattern: /\bt\b/gi /\b{$search_string}\b/gi Does this look right? Can it be made better? Any situations where this pattern wouldn't work? Additional info: this will be used in PHP 5

    Read the article

  • Regex expression is too greedy

    - by alastairs
    I'm writing a regular expression to match data from the IMDb soundtracks data file. My regexes are mostly working, although they are in places slurping too much text into my named groups. Take the following regex for example: "^ Performed by '?(?<performer>.*)('? \(qv\))?$" The performer group includes the string ' (qv) as well as the performer's name. Unfortunately, because the records are not consistently formatted, some performers' names are surrounded by single quotation marks whilst others are not. This means they are optional as far as the regex is concerned. I've tried marking the last group as a greedy group using the ?> group specifier, but this appeared to have no effect on the results. I can improve the results by changing the performer group to match a small range of characters, but this reduces my chances of parsing the name out correctly. Furthermore, if I were to just exclude the apostrophe character, I would then be unable to parse, e.g., band names containing apostrophes, such as Elia's Lonely Friends Band who performed Run For Your Life featured in Resident Evil: Apocalypse.

    Read the article

  • JavaScript Regex: Complicated input validation

    - by ScottSEA
    I'm trying to construct a regex to screen valid part and/or serial numbers in combination, with ranges. A valid part number is a two alpha, three digit pattern or /[A-z]{2}\d{3}/ i.e. aa123 or ZZ443 etc... A valid serial number is a five digit pattern, or /\d{5}/ 13245 or 31234 and so on. That part isn't the problem. I want combinations and ranges to be valid as well: 12345, ab123,ab234-ab245, 12346 - 12349 - the ultimate goal. Ranges and/or series of part and/or serial numbers in any combination. Note that spaces are optional when specifying a range or after a comma in a series. Note that a range of part numbers has the same two letter combination on both sides of the range (i.e. ab123 - ab239) I have been wrestling with this expression for two days now, and haven't come up with anything better than this: /^(?:[A-z]{2}\d{3}[, ]*)|(?:\d{5}[, ]*)|(?:([A-z]{2})\d{3} ?- ?\4\d{3}[, ]*)|(?:\d{5} ?- ?\d{5}[, ]*)$/ ... My Regex-Fu is weak.

    Read the article

  • Parsing CSS by regex

    - by Ross
    I'm creating a CSS editor and am trying to create a regular expression that can get data from a CSS document. This regex works if I have one property but I can't get it to work for all properties. I'm using preg/perl syntax in PHP. Regex (?<selector>[A-Za-z]+[\s]*)[\s]*{[\s]*((?<properties>[A-Za-z0-9-_]+)[\s]*:[\s]*(?<values>[A-Za-z0-9#, ]+);[\s]*)*[\s]*} Test case body { background: #f00; font: 12px Arial; } Expected Outcome Array( [0] => Array( [0] => body { background: #f00; font: 12px Arial; } [selector] => Array( [0] => body ) [1] => Array( [0] => body ) [2] => font: 12px Arial; [properties] => Array( [0] => font ) [3] => Array( [0] => font ) [values] => Array( [0] => 12px Arial [1] => background: #f00 ) [4] => Array( [0] => 12px Arial [1] => background: #f00 ) ) ) Real Outcome Array( [0] => Array ( [0] => body { background: #f00; font: 12px Arial; } [selector] => body [1] => body [2] => font: 12px Arial; [properties] => font [3] => font [values] => 12px Arial [4] => 12px Arial ) ) Thanks in advance for any help - this has been confusing me all afternoon!

    Read the article

  • Python finding substring between certain characters using regex and replace()

    - by jCuga
    Suppose I have a string with lots of random stuff in it like the following: strJunk ="asdf2adsf29Value=five&lakl23ljk43asdldl" And I'm interested in obtaining the substring sitting between 'Value=' and '&', which in this example would be 'five'. I can use a regex like the following: match = re.search(r'Value=?([^&>]+)', strJunk) >>> print match.group(0) Value=five >>> print match.group(1) five How come match.group(0) is the whole thing 'Value=five' and group(1) is just 'five'? And is there a way for me to just get 'five' as the only result? (This question stems from me only having a tenuous grasp of regex) I am also going to have to make a substitution in this string such such as the following: val1 = match.group(1) strJunk.replace(val1, "six", 1) Which yields: 'asdf2adsf29Value=six&lakl23ljk43asdldl' Considering that I plan on performing the above two tasks (finding the string between 'Value=' and '&', as well as replacing that value) over and over, I was wondering if there are any other more efficient ways of looking for the substring and replacing it in the original string. I'm fine sticking with what I've got but I just want to make sure that I'm not taking up more time than I have to be if better methods are out there.

    Read the article

  • Python re module becomes 20 times slower when called on greater than 101 different regex

    - by Wiil
    My problem is about parsing log files and removing variable parts on each lines to be able to group them. For instance: s = re.sub(r'(?i)User [_0-9A-z]+ is ', r"User .. is ", s) s = re.sub(r'(?i)Message rejected because : (.*?) \(.+\)', r'Message rejected because : \1 (...)', s) I have about 120+ matching rules like those above. I have found no performances issues while searching successively on 100 different regex. But a huge slow down comes when applying 101 regex. Exact same behavior happens when replacing my rules set by for a in range(100): s = re.sub(r'(?i)caught here'+str(a)+':.+', r'( ... )', s) Got 20 times slower when putting range(101) instead. # range(100) % ./dashlog.py file.bz2 == Took 2.1 seconds. == # range(101) % ./dashlog.py file.bz2 == Took 47.6 seconds. == Why such thing is happening ? And is there any known workaround ? (Happens on Python 2.6.6/2.7.2 on Linux/Windows.)

    Read the article

  • Regex Replacing only whole matches

    - by Leen Balsters
    I am trying to replace a bunch of strings in files. The strings are stored in a datatable along with the new string value. string contents = File.ReadAllText(file); foreach (DataRow dr in FolderRenames.Rows) { contents = Regex.Replace(contents, dr["find"].ToString(), dr["replace"].ToString()); File.SetAttributes(file, FileAttributes.Normal); File.WriteAllText(file, contents); } The strings look like this _-uUa, -_uU, _-Ha etc. The problem that I am having is when for example this string "_uU" will also overwrite "_-uUa" so the replacement would look like "newvaluea" Is there a way to tell regex to look at the next character after the found string and make sure it is not an alphanumeric character? I hope it is clear what I am trying to do here. Here is some sample data: private function _-0iX(arg1:flash.events.Event):void { if (arg1.type == flash.events.Event.RESIZE) { if (this._-2GU) { this._-yu(this._-2GU); } } return; } The next characters could be ;, (, ), dot, comma, space, :, etc.

    Read the article

  • regex match css class name from single string containing multiple classes

    - by effectica
    I have a long string that contains multiple css classes. With regex I would like to match every class name as I then need to replace these class names like so: <span>CLASSNAME</span> I have tried for hours to come up with a solution and I think I am close however for certain class names I am not able to exclude closing curly brackets from the match. Here is a sample string I have been carrying out testing on: #main .items-Outer p{ font-family:Verdana; color: #000000; font-size: 50px; font-weight: bold; }#footer .footer-inner p.intro{ font-family:Arial; color: #444444; font-size: 30; font-weight: normal; }.genericTxt{ font-family:Courier; color: #444444; font-size: 30; font-weight: normal; } And here is the the regex I came up with so far: ((^(?:.+?)(?:[^{]*))|((?:\})(?:.+?)(?:[^{]*))) Please look at the screenshot I am attaching as it will show more clearly the matches I get. My problem is that I would obviously like to exclude curly brackets from any match.

    Read the article

  • Javascript Regex: Testing string for intelligent query

    - by Shyam
    Hi, I have a string that holds user input. This string can contain various types of data, like: a six digit id a zipcode that contains out of 4 digits and two alphanumeric characters a name (characters only) As I am using this string to search through a database, the query type is determined on the type of search, which i want to handle serverside using JavaScript (yes, I am using JavaScript serverside). Searching on StackOverflow, brought me some interesting information, like the .test-method, which seems perfect for my needs. The test-method returns either true or false based on the evaluation on the string using a regex object. I am using this page as a reference: http://www.javascriptkit.com/jsref/regexp.shtml So I am trying to determine the zipcode, by using the following very noobish regex. var r = /[A-Za-z]{2,2}/ As far I can understand, this should limit the amount of occurrences of alphanumeric characters to a maximum of two. See beneath the output of my JavaScript console. > var r = /[A-Za-z]{2,2}/ > var x = "2233AL" > r.test(x) true > var x = "2233A" > r.test(x) false > var x = "2233ALL" > r.test(x) true /* i want this to be false */ > A little help would be really appreciated!

    Read the article

  • Regex for ignoring consecutive quotation marks in string

    - by will-hart
    I have built a parser in Sprache and C# for files using a format I don't control. Using it I can correctly convert: a = "my string"; into my string The parser (for the quoted text only) currently looks like this: public static readonly Parser<string> QuotedText = from open in Parse.Char('"').Token() from content in Parse.CharExcept('"').Many().Text().Token() from close in Parse.Char('"').Token() select content; However the format I'm working with escapes quotation marks using "double doubles" quotes, e.g.: a = "a ""string""."; When attempting to parse this nothing is returned. It should return: a ""string"". Additionally a = ""; should be parsed into a string.Empty or similar. I've tried regexes unsuccessfully based on answers like this doing things like "(?:[^;])*", or: public static readonly Parser<string> QuotedText = from content in Parse.Regex("""(?:[^;])*""").Token() This doesn't work (i.e. no matches are returned in the above cases). I think my beginners regex skills are getting in the way. Does anybody have any hints? EDIT: I was testing it here - http://regex101.com/r/eJ9aH1

    Read the article

  • java regex for alpha and spaces is including [ ] \

    - by JayAvon
    This is my regex for my JTextField to not be longer than x characters and to not include anything other than letters or spaces. For some reason it is allowing [ ] and \ characters. This is driving me crazy. Is my regex wrong?? package com.jayavon.game.helper; import java.awt.Toolkit; import javax.swing.text.AttributeSet; import javax.swing.text.BadLocationException; import javax.swing.text.PlainDocument; public class CharacterNameCreationDocument extends PlainDocument { private static final long serialVersionUID = 1L; private int limit; public CharacterNameCreationDocument(int limit) { super(); this.limit = limit; } public void insertString(int offset, String str, AttributeSet attr) throws BadLocationException { if (str == null || (getLength() + str.length()) > limit || !str.matches("[a-zA-z\\s]*")){ Toolkit.getDefaultToolkit().beep(); return; } else { super.insertString(offset, str, attr); } } }

    Read the article

  • LLBLGen Pro feature highlights: grouping model elements

    - by FransBouma
    (This post is part of a series of posts about features of the LLBLGen Pro system) When working with an entity model which has more than a few entities, it's often convenient to be able to group entities together if they belong to a semantic sub-model. For example, if your entity model has several entities which are about 'security', it would be practical to group them together under the 'security' moniker. This way, you could easily find them back, yet they can be left inside the complete entity model altogether so their relationships with entities outside the group are kept. In other situations your domain consists of semi-separate entity models which all target tables/views which are located in the same database. It then might be convenient to have a single project to manage the complete target database, yet have the entity models separate of each other and have them result in separate code bases. LLBLGen Pro can do both for you. This blog post will illustrate both situations. The feature is called group usage and is controllable through the project settings. This setting is supported on all supported O/R mapper frameworks. Situation one: grouping entities in a single model. This situation is common for entity models which are dense, so many relationships exist between all sub-models: you can't split them up easily into separate models (nor do you likely want to), however it's convenient to have them grouped together into groups inside the entity model at the project level. A typical example for this is the AdventureWorks example database for SQL Server. This database, which is a single catalog, has for each sub-group a schema, however most of these schemas are tightly connected with each other: adding all schemas together will give a model with entities which indirectly are related to all other entities. LLBLGen Pro's default setting for group usage is AsVisualGroupingMechanism which is what this situation is all about: we group the elements for visual purposes, it has no real meaning for the model nor the code generated. Let's reverse engineer AdventureWorks to an entity model. By default, LLBLGen Pro uses the target schema an element is in which is being reverse engineered, as the group it will be in. This is convenient if you already have categorized tables/views in schemas, like which is the case in AdventureWorks. Of course this can be switched off, or corrected on the fly. When reverse engineering, we'll walk through a wizard which will guide us with the selection of the elements which relational model data should be retrieved, which we can later on use to reverse engineer to an entity model. The first step after specifying which database server connect to is to select these elements. below we can see the AdventureWorks catalog as well as the different schemas it contains. We'll include all of them. After the wizard completes, we have all relational model data nicely in our catalog data, with schemas. So let's reverse engineer entities from the tables in these schemas. We select in the catalog explorer the schemas 'HumanResources', 'Person', 'Production', 'Purchasing' and 'Sales', then right-click one of them and from the context menu, we select Reverse engineer Tables to Entity Definitions.... This will bring up the dialog below. We check all checkboxes in one go by checking the checkbox at the top to mark them all to be added to the project. As you can see LLBLGen Pro has already filled in the group name based on the schema name, as this is the default and we didn't change the setting. If you want, you can select multiple rows at once and set the group name to something else using the controls on the dialog. We're fine with the group names chosen so we'll simply click Add to Project. This gives the following result:   (I collapsed the other groups to keep the picture small ;)). As you can see, the entities are now grouped. Just to see how dense this model is, I've expanded the relationships of Employee: As you can see, it has relationships with entities from three other groups than HumanResources. It's not doable to cut up this project into sub-models without duplicating the Employee entity in all those groups, so this model is better suited to be used as a single model resulting in a single code base, however it benefits greatly from having its entities grouped into separate groups at the project level, to make work done on the model easier. Now let's look at another situation, namely where we work with a single database while we want to have multiple models and for each model a separate code base. Situation two: grouping entities in separate models within the same project. To get rid of the entities to see the second situation in action, simply undo the reverse engineering action in the project. We still have the AdventureWorks relational model data in the catalog. To switch LLBLGen Pro to see each group in the project as a separate project, open the Project Settings, navigate to General and set Group usage to AsSeparateProjects. In the catalog explorer, select Person and Production, right-click them and select again Reverse engineer Tables to Entities.... Again check the checkbox at the top to mark all entities to be added and click Add to Project. We get two groups, as expected, however this time the groups are seen as separate projects. This means that the validation logic inside LLBLGen Pro will see it as an error if there's e.g. a relationship or an inheritance edge linking two groups together, as that would lead to a cyclic reference in the code bases. To see this variant of the grouping feature, seeing the groups as separate projects, in action, we'll generate code from the project with the two groups we just created: select from the main menu: Project -> Generate Source-code... (or press F7 ;)). In the dialog popping up, select the target .NET framework you want to use, the template preset, fill in a destination folder and click Start Generator (normal). This will start the code generator process. As expected the code generator has simply generated two code bases, one for Person and one for Production: The group name is used inside the namespace for the different elements. This allows you to add both code bases to a single solution and use them together in a different project without problems. Below is a snippet from the code file of a generated entity class. //... using System.Xml.Serialization; using AdventureWorks.Person; using AdventureWorks.Person.HelperClasses; using AdventureWorks.Person.FactoryClasses; using AdventureWorks.Person.RelationClasses; using SD.LLBLGen.Pro.ORMSupportClasses; namespace AdventureWorks.Person.EntityClasses { //... /// <summary>Entity class which represents the entity 'Address'.<br/><br/></summary> [Serializable] public partial class AddressEntity : CommonEntityBase //... The advantage of this is that you can have two code bases and work with them separately, yet have a single target database and maintain everything in a single location. If you decide to move to a single code base, you can do so with a change of one setting. It's also useful if you want to keep the groups as separate models (and code bases) yet want to add relationships to elements from another group using a copy of the entity: you can simply reverse engineer the target table to a new entity into a different group, effectively making a copy of the entity. As there's a single target database, changes made to that database are reflected in both models which makes maintenance easier than when you'd have a separate project for each group, with its own relational model data. Conclusion LLBLGen Pro offers a flexible way to work with entities in sub-models and control how the sub-models end up in the generated code.

    Read the article

  • Spring security request matcher is not working with regex

    - by Felipe Cardoso Martins
    Using Spring MVC + Security I have a business requirement that the users from SEC (Security team) has full access to the application and FRAUD (Anti-fraud team) has only access to the pages that URL not contains the words "block" or "update" with case insensitive. Bellow, all spring dependencies: $ mvn dependency:tree | grep spring [INFO] +- org.springframework:spring-webmvc:jar:3.1.2.RELEASE:compile [INFO] | +- org.springframework:spring-asm:jar:3.1.2.RELEASE:compile [INFO] | +- org.springframework:spring-beans:jar:3.1.2.RELEASE:compile [INFO] | +- org.springframework:spring-context:jar:3.1.2.RELEASE:compile [INFO] | +- org.springframework:spring-context-support:jar:3.1.2.RELEASE:compile [INFO] | \- org.springframework:spring-expression:jar:3.1.2.RELEASE:compile [INFO] +- org.springframework:spring-core:jar:3.1.2.RELEASE:compile [INFO] +- org.springframework:spring-web:jar:3.1.2.RELEASE:compile [INFO] +- org.springframework.security:spring-security-core:jar:3.1.2.RELEASE:compile [INFO] | \- org.springframework:spring-aop:jar:3.0.7.RELEASE:compile [INFO] +- org.springframework.security:spring-security-web:jar:3.1.2.RELEASE:compile [INFO] | +- org.springframework:spring-jdbc:jar:3.0.7.RELEASE:compile [INFO] | \- org.springframework:spring-tx:jar:3.0.7.RELEASE:compile [INFO] +- org.springframework.security:spring-security-config:jar:3.1.2.RELEASE:compile [INFO] +- org.springframework.security:spring-security-acl:jar:3.1.2.RELEASE:compile Bellow, some examples of mapped URL path from spring log: Mapped URL path [/index] onto handler 'homeController' Mapped URL path [/index.*] onto handler 'homeController' Mapped URL path [/index/] onto handler 'homeController' Mapped URL path [/cellphone/block] onto handler 'cellphoneController' Mapped URL path [/cellphone/block.*] onto handler 'cellphoneController' Mapped URL path [/cellphone/block/] onto handler 'cellphoneController' Mapped URL path [/cellphone/confirmBlock] onto handler 'cellphoneController' Mapped URL path [/cellphone/confirmBlock.*] onto handler 'cellphoneController' Mapped URL path [/cellphone/confirmBlock/] onto handler 'cellphoneController' Mapped URL path [/user/update] onto handler 'userController' Mapped URL path [/user/update.*] onto handler 'userController' Mapped URL path [/user/update/] onto handler 'userController' Mapped URL path [/user/index] onto handler 'userController' Mapped URL path [/user/index.*] onto handler 'userController' Mapped URL path [/user/index/] onto handler 'userController' Mapped URL path [/search] onto handler 'searchController' Mapped URL path [/search.*] onto handler 'searchController' Mapped URL path [/search/] onto handler 'searchController' Mapped URL path [/doSearch] onto handler 'searchController' Mapped URL path [/doSearch.*] onto handler 'searchController' Mapped URL path [/doSearch/] onto handler 'searchController' Bellow, a test of the regular expressions used in spring-security.xml (I'm not a regex speciality, improvements are welcome =]): import java.util.Arrays; import java.util.List; public class RegexTest { public static void main(String[] args) { List<String> pathSamples = Arrays.asList( "/index", "/index.*", "/index/", "/cellphone/block", "/cellphone/block.*", "/cellphone/block/", "/cellphone/confirmBlock", "/cellphone/confirmBlock.*", "/cellphone/confirmBlock/", "/user/update", "/user/update.*", "/user/update/", "/user/index", "/user/index.*", "/user/index/", "/search", "/search.*", "/search/", "/doSearch", "/doSearch.*", "/doSearch/"); for (String pathSample : pathSamples) { System.out.println("Path sample: " + pathSample + " - SEC: " + pathSample.matches("^.*$") + " | FRAUD: " + pathSample.matches("^(?!.*(?i)(block|update)).*$")); } } } Bellow, the console result of Java class above: Path sample: /index - SEC: true | FRAUD: true Path sample: /index.* - SEC: true | FRAUD: true Path sample: /index/ - SEC: true | FRAUD: true Path sample: /cellphone/block - SEC: true | FRAUD: false Path sample: /cellphone/block.* - SEC: true | FRAUD: false Path sample: /cellphone/block/ - SEC: true | FRAUD: false Path sample: /cellphone/confirmBlock - SEC: true | FRAUD: false Path sample: /cellphone/confirmBlock.* - SEC: true | FRAUD: false Path sample: /cellphone/confirmBlock/ - SEC: true | FRAUD: false Path sample: /user/update - SEC: true | FRAUD: false Path sample: /user/update.* - SEC: true | FRAUD: false Path sample: /user/update/ - SEC: true | FRAUD: false Path sample: /user/index - SEC: true | FRAUD: true Path sample: /user/index.* - SEC: true | FRAUD: true Path sample: /user/index/ - SEC: true | FRAUD: true Path sample: /search - SEC: true | FRAUD: true Path sample: /search.* - SEC: true | FRAUD: true Path sample: /search/ - SEC: true | FRAUD: true Path sample: /doSearch - SEC: true | FRAUD: true Path sample: /doSearch.* - SEC: true | FRAUD: true Path sample: /doSearch/ - SEC: true | FRAUD: true Tests Scenario 1 Bellow, the important part of spring-security.xml: <security:http entry-point-ref="entryPoint" request-matcher="regex"> <security:intercept-url pattern="^.*$" access="ROLE_SEC" /> <security:intercept-url pattern="^(?!.*(?i)(block|update)).*$" access="ROLE_FRAUD" /> <security:access-denied-handler error-page="/access-denied.html" /> <security:form-login always-use-default-target="false" login-processing-url="/doLogin.html" authentication-failure-handler-ref="authFailHandler" authentication-success-handler-ref="authSuccessHandler" /> <security:logout logout-url="/logout.html" success-handler-ref="logoutSuccessHandler" /> </security:http> Behaviour: FRAUD group **can't" access any page SEC group works fine Scenario 2 NOTE that I only changed the order of intercept-url in spring-security.xml bellow: <security:http entry-point-ref="entryPoint" request-matcher="regex"> <security:intercept-url pattern="^(?!.*(?i)(block|update)).*$" access="ROLE_FRAUD" /> <security:intercept-url pattern="^.*$" access="ROLE_SEC" /> <security:access-denied-handler error-page="/access-denied.html" /> <security:form-login always-use-default-target="false" login-processing-url="/doLogin.html" authentication-failure-handler-ref="authFailHandler" authentication-success-handler-ref="authSuccessHandler" /> <security:logout logout-url="/logout.html" success-handler-ref="logoutSuccessHandler" /> </security:http> Behaviour: SEC group **can't" access any page FRAUD group works fine Conclusion I did something wrong or spring-security have a bug. The problem already was solved in a very bad way, but I need to fix it quickly. Anyone knows some tricks to debug better it without open the frameworks code? Cheers, Felipe

    Read the article

< Previous Page | 13 14 15 16 17 18 19 20 21 22 23 24  | Next Page >