Search Results

Search found 30 results on 2 pages for 'lookbehind'.


Need variable width negative lookbehind replacement

- by Technoh

I have looked at many questions here (and many more websites) and some provided hints but none gave me a definitive answer. I know regular expressions but I am far from being a guru. This particular question deals with regex in PHP. I need to locate words in a text that are not surrounded by a hyperlink of a given class. For example, I might have This <a href="blabblah" class="no_check">elephant</a> is green and this elephant is blue while this <a href="blahblah">elephant</a> is red. I would need to match against the second and third elephants but not the first (identified by test class "no_check"). Note that there could be more attributes than just href and class within hyperlinks. I came up with ((?<!<a .*class="no_check".*>)\belephant\b) which works beautifully in regex test software but not in PHP. Any help is greatly appreciated. If you cannot provide a regular expression but can find some sort of PHP code logic that would circumvent the need for it, I would be equally grateful.
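
One common way around a variable-width lookbehind is to match the unwanted context as its own alternative and keep only the other branch. The sketch below is in Python rather than PHP, and the names `PATTERN` and `find_unprotected` are made up for illustration; it assumes the links are not nested and that the class attribute is written exactly as class="no_check".

```python
import re

# Either consume a whole no_check link (so its contents are skipped)
# or capture the bare word in group 1.
PATTERN = re.compile(
    r'<a\b[^>]*\bclass="no_check"[^>]*>.*?</a>'  # link to skip
    r'|\b(elephant)\b',                           # word to report
    re.DOTALL,
)

def find_unprotected(text):
    """Return start offsets of 'elephant' outside class="no_check" links."""
    return [m.start(1) for m in PATTERN.finditer(text) if m.group(1)]

sample = ('This <a href="blabblah" class="no_check">elephant</a> is green '
          'and this elephant is blue while this '
          '<a href="blahblah">elephant</a> is red.')
offsets = find_unprotected(sample)
```

The same alternation idea should carry over to preg_match_all in PHP, since it relies only on ordinary alternation and a capture group.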

Remove leading whitespaces using variable length lookbehind in RegExp

- by Shizhidi

Hello, I'm wondering if variable length lookbehind assertions are supported in JavaScript's RegExp engine? For example, I'm trying to match the string "variable length" in the string "[a lot of whitespaces and/or tabs]variable length lookbehind", and I have something like this but it does not go well in various RegExp testers: ^(?<=[ \t]+).+(?= lookbehind) If it's an illegal pattern, what would be a good workaround to it? Thanks!
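
Where lookbehind is unavailable or limited to fixed width, a capture group usually does the same job: consume the indentation normally and read the wanted text from group 1. A minimal sketch in Python (the function name `after_indent` is hypothetical); the same pattern shape should also work with JavaScript's RegExp, since it uses only a group and a lookahead.

```python
import re

def after_indent(line):
    """Return the text between the leading blanks and ' lookbehind'."""
    # [ \t]+ is consumed, not asserted; group 1 holds the wanted text.
    m = re.match(r'[ \t]+(.+?)(?= lookbehind)', line)
    return m.group(1) if m else None
```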

Backreferences in lookbehind

- by polygenelubricants

Can you use backreferences in a lookbehind? Let's say I want to split wherever behind me a character is repeated twice. String REGEX1 = "(?<=(.)\\1)"; // DOESN'T WORK! String REGEX2 = "(?<=(?=(.)\\1)..)"; // WORKS! System.out.println(java.util.Arrays.toString( "Bazooka killed the poor aardvark (yummy!)" .split(REGEX2) )); // prints "[Bazoo, ka kill, ed the poo, r aa, rdvark (yumm, y!)]" Using REGEX2 (where the backreference is in a lookahead nested inside a lookbehind) works, but REGEX1 gives this error at run-time: Look-behind group does not have an obvious maximum length near index 8 (?<=(.)\1) ^ This sort of makes sense, I suppose, because in general the backreference can capture a string of any length (if the regex compiler were a bit smarter, though, it could determine that \1 is (.) in this case, and therefore has a finite length). So is there a way to use a backreference in a lookbehind? And if there isn't, can you always work around it using this nested lookahead? Are there other commonly-used techniques?
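
Another technique that sidesteps lookbehind entirely is to find the doubled characters with a lookahead and cut the string just after each pair. The sketch below uses Python instead of Java; `split_after_doubles` is an illustrative name, and unlike Java's split it keeps a trailing empty piece if the text ends in a doubled character.

```python
import re

def split_after_doubles(text):
    """Split text immediately after every doubled character."""
    # The lookahead finds each position where a character repeats;
    # the cut point is two characters further on.
    cuts = [m.start() + 2 for m in re.finditer(r'(?=(.)\1)', text)]
    pieces, prev = [], 0
    for cut in cuts:
        pieces.append(text[prev:cut])
        prev = cut
    pieces.append(text[prev:])
    return pieces
```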

grep with negative lookbehind

- by Dan Fabulich

I'm trying to grep through a bunch of files in nested subdirectories to look for regular expression matches; my regex requires negative lookbehind. Perl has negative lookbehind, but as far as I can tell GNU grep doesn't support negative lookbehinds. What's the easiest way to get an equivalent to GNU grep that supports negative lookbehinds? (I guess I could write my own mini-grep in Perl, but that doesn't seem like it should be necessary. My copy of the Perl Cookbook includes source for tcgrep; is that what I should use? If so, where's the latest version? Don't tell me I have to type this entire program!)

How to non-greedy multiple lookbehind matches

- by ArtK

Source: <prefix><content1><suffix1><prefix><content2><suffix2> Engine: PCRE RegEx1: (?<=<prefix>)(.*)(?=<suffix1>) RegEx2: (?<=<prefix>)(.*)(?=<suffix2>) Result1: <content1> Result2: <content1><suffix1><prefix><content2> The desired result for RegEx2 is just <content2> but it is obviously greedy. How do I make RegEx2 non-greedy and use only the last matching lookbehind? [I hope I have translated this correctly from the NoteTab syntax. I don't do much RegEx coding. The <prefix>, <content> & <suffix> terms are just meant to represent arbitrary strings. Only the "<" in the "?<=" lookbehind command is significant.] I suspect it is something simple but after too many hours of searching I'm giving up on solving it myself. Thanks for the help Art
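
Making .* lazy is not enough here, because the match can still start after the first prefix. One possible fix is to forbid a further prefix inside the captured content. A Python sketch of that idea, using the literal placeholder strings from the question (`PATTERN` and `last_segment` are illustrative names):

```python
import re

# After a prefix, take characters one at a time as long as no new
# "<prefix>" starts at the current position, up to "<suffix2>".
PATTERN = re.compile(r'(?<=<prefix>)(?:(?!<prefix>).)*?(?=<suffix2>)')

def last_segment(text):
    m = PATTERN.search(text)
    return m.group(0) if m else None

source = "<prefix><content1><suffix1><prefix><content2><suffix2>"
```

Both the fixed-width lookbehind and the negative lookahead are also available in PCRE, so the pattern should transfer unchanged.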

Naming convention of regex lookahead and lookbehind

- by user198729

Why is it counterintuitive? In /(?<!\d)\d{8}(?!\d)/, (?<!\d) comes first but is called lookbehind, while (?!\d) comes next but is called lookahead. Both names are counterintuitive. What's the reason for naming them this way?

[Regexp] Stop matching when meeting a sequence of chars: fixing a lookbehind

- by CFP

Hello everyone! I have the following regexp: (?P<question>.+(?<!\[\[)) It is designed to match hello world! in the string hello world! [[A string typically used in programming examples]] Yet it just matches the whole string, and I can't figure out why. I've tried all flavors of lookaround, but it just won't work... Does anyone know how to fix this problem? Thanks, CFP.
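
The pattern matches everything because .+ is greedy: it runs to the end of the string, where the two preceding characters are ]] rather than [[, so the negative lookbehind is satisfied. A lazy quantifier followed by a lookahead for the opening brackets behaves as intended. A minimal Python sketch (`QUESTION` and `extract_question` are illustrative names):

```python
import re

# Take as little as possible, stopping before optional whitespace
# plus "[[", or at the end of the string if there are no brackets.
QUESTION = re.compile(r'(?P<question>.+?)(?=\s*\[\[|$)')

def extract_question(text):
    m = QUESTION.match(text)
    return m.group('question') if m else None
```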

RegEx Advanced : Positive lookbehind

- by mpneuried

This is my test string: <img rel="{objectid:498,newobject:1,fileid:338}" width="80" height="60" align="left" src="../../../../files/jpg1/Desert1.jpg" alt="" /> I want to get each of the JSON-style elements inside the rel attribute. It works for the first element (objectid). Here is my regex, which works fine: (?<=(rel="\{objectid:))\d+(?=[,|\}]) But I want to do something like this, which doesn't work: (?<=(rel="\{.*objectid:))\d+(?=[,|\}]) so that I can parse every element of the search string. I'm using Java regex.
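
A two-step approach avoids the unbounded lookbehind: first capture the contents of the rel attribute, then pull key:value pairs out of that substring. The sketch is in Python and `parse_rel` is a hypothetical helper; the same two patterns should be usable with java.util.regex, as neither needs lookbehind. It assumes all values are plain integers, as in the sample.

```python
import re

def parse_rel(tag):
    """Return the key/value pairs found in rel="{...}" as a dict."""
    block = re.search(r'rel="\{([^}]*)\}"', tag)
    if not block:
        return {}
    # Every name:number pair inside the braces.
    pairs = re.findall(r'(\w+):(\d+)', block.group(1))
    return {key: int(value) for key, value in pairs}
```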

RegExp: want to find all links that do not end in ".html"

- by grovel

Hi, I'm a relative novice to regular expressions (although I've used them many times successfully). I want to find all links in a document that do not end in ".html" The regular expression I came up with is: href=\"([^"]*)(?<!html)\" In Notepad++, my editor, href=\"([^"]*)\" finds all the links (both those that end in "html" and those that do not). Why doesn't negative lookbehind work? I've also tried lookahead: href=\"[^"]*(?!html\") but that didn't work either. Can anybody help? Cheers, grovel
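
The lookahead attempt cannot work as written: once [^"]* has run up to the closing quote, the next character is the quote itself, so html" never follows and the negative lookahead always succeeds. The lookbehind form is sound in engines that support lookbehind. A short Python check, with the dot added so only a real .html extension is excluded (`NON_HTML` and `non_html_links` are illustrative names):

```python
import re

# The lookbehind inspects the five characters just before the closing quote.
NON_HTML = re.compile(r'href="([^"]*)(?<!\.html)"')

def non_html_links(document):
    return NON_HTML.findall(document)
```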

Regexp for selecting spaces between digits and decimal char

- by Tirithen

I want to remove spaces from strings where the space is preceded by a digit or a "." and followed by a digit or a ".". I have strings like: "50 .10", "50 . 10", "50. 10" and I want them all to become "50.10" but with an unknown number of digits on either side. I'm trying with lookahead/lookbehind assertions like this: $row = str_replace("/(?<=[0-9]+$)\s*[.]\s*(?=[0-9]+$)/", "", $row); But it does not work...
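
Three things stand out in the attempt: str_replace performs literal replacement (preg_replace is the regex function in PHP), the lookbehind only needs to check a single digit, and the replacement should be "." rather than an empty string, otherwise the decimal point is deleted too. The corrected pattern, sketched in Python (`tighten_decimal` is an illustrative name):

```python
import re

def tighten_decimal(value):
    """Collapse optional spaces around a dot that sits between digits."""
    # One digit on each side is enough to anchor the match.
    return re.sub(r'(?<=\d)\s*\.\s*(?=\d)', '.', value)
```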

Does lookaround affect which languages can be matched by regular expressions?

- by sepp2k

There are some features in modern regex engines which allow you to match languages that couldn't be matched without that feature. For example the following regex using back references matches the language of all strings that consist of a word that repeats itself: (.+)\1. This language is not regular and can't be matched by a regex, which does not use back references. My question: Does lookaround also affect which languages can be matched by a regular expression? I.e. are there any languages that can be matched using lookaround, which couldn't be matched otherwise? If so, is this true for all flavors of lookaround (negative or positive lookahead or lookbehind) or just for some of them?

Java RegEx API "Look-behind group does not have an obvious maximum length near index ..."

- by Foo Inc

Hello, I'm on to some SQL where clause parsing and designed a working RegEx to find a column outside string literals using "Rad Software Regular Expression Desginer" which is using the .NET API. To make sure the designed RegEx works with Java too, I tested it by using the API of course (1.5 and 1.6). But guess what, it won't work. I got the message "Look-behind group does not have an obvious maximum length near index 28". The string that I'm trying to get parsed is Column_1='test''the''stuff''all''day''long' AND Column_2='000' AND TheVeryColumnIWantToFind = 'Column_1=''test''''the''''stuff''''all''''day''''long'' AND Column_2=''000'' AND TheVeryColumnIWantToFind = '' TheVeryColumnIWantToFind = '' AND (Column_3 is null or Column_3 = ''Not interesting'') AND ''1'' = ''1''' AND (Column_3 is null or Column_3 = 'Still not interesting') AND '1' = '1' As you may have guessed, I tried to create some kind of worst case to ensure the RegEx won't fail on more complicated SQL where clauses. The RegEx itself looks like this (?i:(?<!=\s*'(?:[^']|(?:''))*)((?<=\s*)TheVeryColumnIWantToFind(?=(?:\s+|=)))) I'm not sure if there is a more elegant RegEx (there'll most likely be one), but that's not important right now as it does the trick. To explain the RegEx in a few words: If it finds the column I'm after, it does a negative look-behind to figure out if the column name is used in a string literal. If so, it won't match. If not, it'll match. Back to the question. As I mentioned before, it won't work with Java. What will work and result in what I want? I found out, that Java does not seem to support unlimited look-behinds but still I couldn't get it to work. Isn't it right that a look-behind is always putting a limit up on itself from the search offset to the current search position? So it would result in something like "position - offset"?

Regex negative look-behind in hgignore file

- by jco

I'm looking for a way to modify my .hgignore file to ignore all "Properties/AssemblyInfo.cs" files except those in either the "Test/" or the "Tests/" subfolders. I tried using the negative look-behind expression (?<!Test)/Properties/AssemblyInfo\.cs$, but I didn't find a way to "un-ignore" in both folders "Test/" and "Tests/".
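
Two fixed-width negative lookbehinds can be placed side by side, one per folder name, which avoids needing a variable-width (?<!Tests?). The Python check below illustrates the idea (`IGNORE` and `is_ignored` are illustrative names); it assumes the ignore file is interpreted with Python-style regex syntax, and note that a folder such as MyTest/ would be exempted as well.

```python
import re

# Both lookbehinds must pass at the same position before "/Properties".
IGNORE = re.compile(r'(?<!Test)(?<!Tests)/Properties/AssemblyInfo\.cs$')

def is_ignored(path):
    return IGNORE.search(path) is not None
```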

Java RegExp ViewState

- by CDSO1

I am porting some functionality from a C++ application to java. This involves reading non-modifiable data files that contain regular expressions. A lot of the data files contain regular expressions that look similar to the following: (?<=id="VIEWSTATE".*?value=").*?(?=") These regular expressions produce the following error: "Look-behind group does not have an obvious maximum length near index XX" In C++ the engine being used supported these expressions. Is there another form of regexp that can produce the same result that can be generated using expressions like my example as input?
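
If the patterns could be rewritten as they are loaded, the usual substitute for an unbounded lookbehind is to match the prefix normally and read the value from a capture group. Whether that transformation can be automated for every pattern in the data files is an open assumption. A Python sketch of the target form (`VIEWSTATE` and `viewstate_value` are illustrative names):

```python
import re

# The former lookbehind is now ordinary matched text; group 1 is the value.
VIEWSTATE = re.compile(r'id="VIEWSTATE".*?value="(.*?)"', re.DOTALL)

def viewstate_value(html):
    m = VIEWSTATE.search(html)
    return m.group(1) if m else None
```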

strange behavior in vim with negative look-behind

- by João Portela

So, I am doing this search in vim: /\(\(unum\)\|\(player\)=\)\@<!\"1\" and as expected it does not match lines that have: player="1" but it matches lines that have: unum="1". What am I doing wrong? Isn't the atom to be negated all of this: \(\(unum\)\|\(player\)=\) Naturally, just doing: /\(\(unum\)\|\(player\)=\) matches unum= or player=.

PHP - REGEX - use string for pattern but exclude it from being removed!

- by aSeptik

Hi all! I'm pretty new to regex. I have learned something along the way, but my knowledge is still poor, so I'd like some clarification on how it works. Assume I have the following strings (as you can see, they are formatted slightly differently from one another but are very similar): DTSTART;TZID="America/Chicago":20030819T000000 DTEND;TZID="America/Chicago":20030819T010000 DTSTART;TZID=US/Pacific DTSTART;VALUE=DATE Now I want to remove everything between the first A-Z block and the colon, so that I keep, for example: DTSTART:20030819T000000 DTEND:20030819T010000 DTSTART DTSTART With my very limited knowledge I have worked out this regex: preg_replace( '/^[A-Z](?!;[A-Z]=[\w\W]+):$/m' , '' , $data ); but I'm fairly sure it will not work. Why? Please help! PS: As the title says, I also want to know how to use a well-known string block to match another, for example preg_replace( '/^[DTSTART](?!;[A-Z]=[\w\W]+):$/m' , '' , $data ); without deleting DTSTART. Thanks for your time! Regards, Luca Filosofi
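
Capturing the property name and putting it back in the replacement keeps it from being removed, with no lookaround needed. Note also that [DTSTART] is a character class matching a single letter, not the word DTSTART. A Python sketch (`strip_params` is an illustrative name); it assumes parameter values never contain a colon, which real iCalendar data with quoted values may violate.

```python
import re

def strip_params(data):
    """Drop ';PARAM=value' parts between the property name and the colon."""
    # Group 1 is the leading A-Z block; it is re-inserted via \1.
    return re.sub(r'^([A-Z]+);[^:\n]*', r'\1', data, flags=re.MULTILINE)
```

In PHP the equivalent call would be preg_replace with the same pattern, the m flag, and '$1' as the replacement.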

Using lookahead assertions in regular expressions

- by Greg Jackson

I use regular expressions on a daily basis, as my daily work is 90% in Perl (legacy codebase, but that's a different issue). Despite this, I still find lookahead and lookbehind to be terribly confusing and often unreadable. Right now, if I were to get a code review with a lookahead or lookbehind, I would immediately send it back to see if the problem can be solved by using multiple regular expressions or a different approach. The following are the main reasons I tend not to like them: They can be terribly unreadable. Lookahead assertions, for example, start from the beginning of the string no matter where they are placed. That, among other things, can cause some very "interesting" and non-obvious behaviors. It used to be the case that many languages didn't support lookahead/lookbehind (or supported them as "experimental features"). This isn't the case quite as much, but there's still always the question as to how well it's supported. Quite frankly, they feel like a dirty hack. Regexps often already are, but they can also be quite elegant, and have gained widespread acceptance. I've gotten by without any need for them at all... sometimes I think that they're extraneous. Now, I'll freely admit that especially the last two reasons aren't really good ones, but I felt that I should enumerate what goes through my mind when I see one. I'm more than willing to change my mind about them, but I feel that they violate some of my core tenets of programming, including: Code should be as readable as possible without sacrificing functionality -- this may include doing something in a less efficient but clearer way, as long as the difference is negligible or unimportant to the application as a whole.
Code should be maintainable -- if another programmer comes along to fix my code, non-obvious behavior can hide bugs or make functional code appear buggy (see readability). "The right tool for the right job" -- I'm sure you can come up with contrived examples that could use lookahead, but I've never come across something that really needs them in my real-world development work. Is there anything that they're really the best tool for, as opposed to, say, multiple regexps (or, alternatively, are they the best tool for most cases they're used for today)? My question is this: Is it good practice to use lookahead/lookbehind in regular expressions, or are they simply a hack that has found its way into modern production code? I'd be perfectly happy to be convinced that I'm wrong about this, and simple examples are useful for illustration, but by themselves they won't be enough to convince me.

codingbat wordEnds using regex

- by polygenelubricants

I'm trying to solve wordEnds from codingbat.com using regex. This is the simplest as I can make it with my current knowledge of regex: public String wordEnds(String str, String word) { return str.replaceAll( String.format( ".*?(?=%s)(?<=(.|^))%1$s(?=(.|$))|.+", java.util.regex.Pattern.quote(word) ), "$1$2" ); } String.format is used to inject word into the pattern for both readability and convenience (it's injected twice). Pattern.quote isn't necessary to pass their tests, but I think it's required for a proper regex-based solution. The regex has two major parts: If after matching as few characters as possible ".*?", word can still be found "(?=%s)", then lookbehind to capture any character immediately preceding it "(?<=(.|^))", match word "%1$s" and lookforward to capture any character following it "(?=(.|$))". The initial "if" test ensures that the atomic lookbehind captures only if there's a word Using lookahead to capture the following character doesn't consume it, so it can be used as part of further matching Otherwise match what's left "|.+" Groups 1 and 2 would capture empty strings I think this works in all cases, but it's obviously quite complex. I'm just wondering if others can suggest a simpler regex to do this. Note: I'm not looking for a solution using indexOf and a loop. I want a regex-based replaceAll solution. I also need a working solution that I can just copy-paste into codingbat and passes.

Regex one-to-one mapping pattern replace

- by polygenelubricants

How would you use regex to write a function that replaces all lowercase letters with uppercase and vice versa? Note: this is NOT a homework question. See also my previous explorations of regex: Regex split into overlapping strings (Alan Moore's answer is especially instructive) Can you use zero-width matching regex in String split? (my solution exploits a known Java regex bug with regards to non-obvious length lookbehind!)

Regex match pattern inside a wrapping pattern

- by shivesh

I want to match all phone numbers that are wrapped between << and >> tags. This is the regex for phone numbers: 0[2349]{1}\-[1-9]{1}[0-9]{6} I tried to add a lookahead (and a lookbehind) like (?=(?:>>)) but this didn't work for me.
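
Since both markers have a fixed length, a lookbehind for the opening marker and a lookahead for the closing one can simply surround the number pattern. The Python sketch below assumes the closing marker is >>, as the attempted lookahead suggests; `PHONE` and `wrapped_numbers` are illustrative names, and the redundant {1} quantifiers are dropped.

```python
import re

# Lookbehind for "<<", the phone number itself, lookahead for ">>".
PHONE = re.compile(r'(?<=<<)0[2349]-[1-9][0-9]{6}(?=>>)')

def wrapped_numbers(text):
    return PHONE.findall(text)
```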

Regex to find A and not B on a line

- by Zach

I'm looking for a regex to search my python program to find all lines where foo, but not bar, is passed into a method as a keyword argument. I'm playing around with lookahead and lookbehind assertions, but not having much luck. Any help? Thanks
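
Two lookaheads anchored at the start of the line express "contains A and does not contain B" directly. The sketch below is only a heuristic: it treats any foo= as a keyword argument, so comparisons such as foo == 1 or text inside string literals would also count. `FOO_NOT_BAR` and `lines_with_foo_not_bar` are illustrative names.

```python
import re

# (?=.*foo=) requires foo somewhere on the line; (?!.*bar=) forbids bar.
FOO_NOT_BAR = re.compile(r'^(?=.*\bfoo\s*=)(?!.*\bbar\s*=).*$', re.MULTILINE)

def lines_with_foo_not_bar(source):
    return [m.group(0) for m in FOO_NOT_BAR.finditer(source)]
```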

Regex for circular replacement

- by polygenelubricants

How would you use regex to write functions to do the following: Replace lowercase 'a' with uppercase and vice versa Where words are separated by whitespaces and > and < are special markers, replace >word with word< and vice versa Replace postincrement (i++;) with preincrement (++i;) and vice versa. Variable names are [a-z]+. Input is just a bunch of these statements. Bonus: also do decrement. Also interested in solutions in other flavors. Note: this is NOT a homework question. See also my previous explorations of regex: Regex split into overlapping strings (Alan Moore's answer is especially instructive) Can you use zero-width matching regex in String split? (my solution exploits a known Java regex bug with regards to non-obvious length lookbehind!)
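
For swaps like these, one workable approach is a single pass with an alternation of both forms and a replacement callback that checks which branch matched; because the text is scanned once, a replaced piece is never swapped back. A Python sketch for the increment case (`swap_increment` is an illustrative name), assuming variable names are [a-z]+ as stated:

```python
import re

def swap_increment(code):
    """Turn 'i++;' into '++i;' and '++i;' into 'i++;' in one pass."""
    def flip(m):
        if m.group(1):                      # postincrement matched
            return '++' + m.group(1) + ';'
        return m.group(2) + '++;'           # preincrement matched
    return re.sub(r'\b([a-z]+)\+\+;|\+\+([a-z]+);', flip, code)
```

The case swap and the >word / word< swap can follow the same shape with different alternatives and a different callback.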

Regular expressions in URL Rewrite Module for IIS7

- by TN

I have following rewrite rule to append .aspx extension if url has no extension. <rule name="SimpleRewrite" stopProcessing="true"> <match url="^(.*(?<=/)([^/.]*))$" /> <action type="Rewrite" url="{R:1}.aspx" /> </rule> However the rule is not working: Error HTTP 500.52 - URL Rewrite Module Error. The expression "^(.*(?<=/)([^/.]*))$" has an invalid syntax. However, this regular expression works in .NET. What regular expressions are supported by IIS Url Rewrite Module? How to make positive lookbehind assertion?

Regex for [a-zA-Z0-9\-] with dashes allowed in between but not at the start or end

- by orokusaki

I'm using Python and I'm not trying to extract the value, but rather test to make sure it fits the pattern. Allowed values: spam123-spam-eggs-eggs1 spam123-eggs123 spam 123 eggs123 I just can't have a dash at the start or the end. There is a question on here that works in the opposite direction by getting the string value after the fact, but I simply need to test the value so that I can disallow it. Also, it can be a maximum of 25 chars long, but a minimum of 4 chars long. Here's what I've come up with after some experimentation with lookbehind, etc: # Nothing here
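
One way to express this is a negative lookahead at the start, a length-limited character class, and a one-character negative lookbehind at the end. A minimal sketch, assuming each value is tested on its own (`SLUG` and `is_valid` are illustrative names):

```python
import re

# (?!-) rejects a leading dash, (?<!-) rejects a trailing dash,
# and {4,25} enforces the length limits.
SLUG = re.compile(r'(?!-)[A-Za-z0-9-]{4,25}(?<!-)')

def is_valid(value):
    return SLUG.fullmatch(value) is not None
```

fullmatch is used so that the whole value must fit the pattern, which takes the place of ^ and $ anchors.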

Regexp look-behind to match internet speeds

- by Sandman

So the user may search for "10 mbit" after which I want to capture the "10" so I can use it in a speed-search rather than a string-search. This isn't a problem, the below regexp does this fine: if (preg_match("/(\d+)\smbit/", $string)){ ... } But, the user may search for something like "10/10 mbit" or "10-100 mbit". I don't want to match those with the above regexp - they should be handled in another fashion. So I would like a regexp that matches "10 mbit" if the number is all-numeric as a whole word (i.e. bounded by whitespace, a newline, or the start/end of the line). Using lookbehind, I did this: if (preg_match("#(?<!/)(\d+)\s+mbit#i", $string)){ just to catch those that don't have "/" before them, but this matched true for the string "10/10 mbit", so I'm obviously doing something wrong here, but what?
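
The lookbehind matches "10/10 mbit" because the engine retries one character later: the final "0" is preceded by "1", not "/", so "0 mbit" satisfies the pattern. Requiring that the number is not preceded by any non-whitespace character closes that gap and also covers the "10-100" case. A Python sketch (`SPEED` and `single_speed` are illustrative names); the same pattern should work with preg_match.

```python
import re

# (?<!\S) means: at the start of the string or right after whitespace.
SPEED = re.compile(r'(?<!\S)(\d+)\s+mbit\b', re.IGNORECASE)

def single_speed(query):
    """Return the speed as an int, or None for ranges like 10/10."""
    m = SPEED.search(query)
    return int(m.group(1)) if m else None
```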
