Search Results

Search found 3804 results on 153 pages for 'regex lookarounds'.

Page 12/153 | < Previous Page | 8 9 10 11 12 13 14 15 16 17 18 19 | Next Page >

PCRE/PHP regex not matching last "item"

- by superbarney

Here's a regex that I've cobbled up: /(.*={76}\s)?\s*(.*?)\s\-\-\s(\d{2}\/\d{2}\-\d{2}\s\d{2}:\d{2})\s\s(.*?)\s(http:\/\/service.*?)\s(\-{76})/is and here's the text I will be parsing: http://p.linode.com/7015 and here's the replacement for the matched text: <item>\n\t<title>$2</title>\n\t<pubDate>$pubDate</pubDate>\n\t<description>$4</description>\n\t<link>$5</link>\n</item>\n\n I have almost come up with a regex needed to parse a block of text into RSS 2.0 XML markup. I've tested it with RegExr and RegexBuddy and it works perfectly except for the last "item" where there is no line breaks after the link (Line 269). In short, the problem is the "iProperty" article in the text is not matched. Any regex gurus willing to help me troubleshoot what's wrong? I've tried /(.*={76}\s)?\s*(.*?)\s\-\-\s(\d{2}\/\d{2}\-\d{2}\s\d{2}:\d{2})\s\s(.*?)\s(http:\/\/service.*?)\s*(\-{76})/is Here's the output that I get: http://p.linode.com/7016 but that didn't work as I expected (i.e. a newline would be optional). Thanks!

Read the article
Regex to ensure group match doesn't end with a specific character

- by AJ

I'm having trouble coming up with a regular expression to match a particular case. I have a list of tv shows in about 4 formats: Name.Of.Show.S01E01 Name.Of.Show.0101 Name.Of.Show.01x01 Name.Of.Show.101 What I want to match is the show name. My main problem is that my regex matches the name of the show with a preceding '.'. My regex is the following: "^([0-9a-zA-Z\.]+)(S[0-9]{2}E[0-9]{2}|[0-9]{4}|[0-9]{2}x[0-9]{2}|[0-9]{3})" Some Examples: >>> import re >>> SHOW_INFO = re.compile("^([0-9a-zA-Z\.]+)(S[0-9]{2}E[0-9]{2}|[0-9]{4}|[0-9]{2}x[0-9]{2}|[0-9]{3})") >>> match = SHOW_INFO.match("Name.Of.Show.S01E01") >>> match.groups() ('Name.Of.Show.', 'S01E01') >>> match = SHOW_INFO.match("Name.Of.Show.0101") >>> match.groups() ('Name.Of.Show.0', '101') >>> match = SHOW_INFO.match("Name.Of.Show.01x01") >>> match.groups() ('Name.Of.Show.', '01x01') >>> match = SHOW_INFO.match("Name.Of.Show.101") >>> match.groups() ('Name.Of.Show.', '101') So the question is how do I avoid the first group ending with a period? I realize I could simply do: var.strip(".") However, that doesn't handle the case of "Name.Of.Show.0101". Is there a way I could improve the regex to handle that case better? Thanks in advance.

Read the article
Intelligencia URL ReWriter mapping with regex

- by alex

I am using the Intelligencia URL rewriter in my asp.net web application. I use the web.config mappings I'm trying to map the following url: www.mydomain.com/product-deals/manufacturer-model_PRODUCTId.aspx To: www.mydomain.com/ProductInfo.aspx?productID=xxx obviously in the above example, xxx is replaced from the "productId" from the "friendly" url. In my web.config, I've got so far: <rewrite url="~/contract-deals/([\w-_]+)/_(.+).aspx" to="~/ProductInfo.aspx?productId=$1"/> This isn't working however. I need the correct regex to use for my requirements (regex really isn't my strong point!!)

Read the article
Trying to parse links in an HTML directory listing using Java regex

- by DiskCrasher

Ok I know everyone is going to tell me not to use RegEx for parsing HTML, but I'm programming on Android and don't have ready access to an HTML parser (that I'm aware of). Besides, this is server generated HTML which should be more consistent than user-generated HTML. The regex looks like this: Pattern patternMP3 = Pattern.compile( "<A HREF=\"[^\"]+.+\\.mp3</A>", Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE); Matcher matcherMP3 = patternMP3.matcher(HTML); while (matcherMP3.find()) { ... } The input HTML is all on one line, which is causing the problem. When the HTML is on separate lines this pattern works. Any suggestions?

Read the article
Django URL regex question

- by shawnjan

Hey all! I had a quick question about Django URL configuration, and I guess REGEX as well. I have a small configuration application that takes in environments and artifacts in the following way: url(r'^env/(?P<env>\w+)/artifact/(?P<artifact>\w+)/$', 'config.views.ipview', name="bothlist"), Now, this works fine, but what I would like to do is have it be able to have additional parameters that are optional, such as a verbose mode or no formating mode. I know how to do this just fine in the views, but I can't wrap my head around the regex. the call would be something like GET /env/<env>/artifact/<artifact>/<opt:verbose>/<opt:noformat> Any help would be appreciated, thanks! -Shawn

Read the article
Regex: replace inner string

- by AllenG

I'm working with X12 EDI Files (Specifialy 835s for those of you in Health Care), and I have a particular vendor who's using a non-HIPAA compliant version (3090, I think). The problem is that in a particular segment (PLB- again, for those who care) they're sending a code which is no longer supported by the HIPAA Standard. I need to locate the specific code, and update it with a corrected code. I think a Regex would be best for this, but I'm still very new to Regex, and I'm not sure where to begin. My current methodology is to turn the file into an array of strings, find the array that starts with "PLB", break that into an array of strings, find the code, and change it. As you can guess, that's very verbose code for something which should be (I'd think) fairly simple. Here's a sample of what I'm looking for: ~PLB|1902841224|20100228|49KC15X078001104|.08~ And here's what I want to change it to: ~PLB|1902841224|20100228|CSKC15X078001104|.08~ Any suggestions?

Read the article
R regex to validate user input is correct.

- by John

I'm trying to practice writing better code, so I wanted to validate my input sequence with regex to make sure that the first thing I get is a single letter A to H only, and the second is a number 1 to 12 only. I'm new to regex and not sure what the expression should look like. I'm also not sure what type of error R would throw if this is invalidated? In Perl it would be something like this I think: =~ m/([A-M]?))/) Here is what I have so far for R: input_string = "A1" first_well_row = unlist(strsplit(input_string, ""))[1] # get the letter out first_well_col = unlist(strsplit(input_string, ""))[2] # get the number out

Read the article
Javascript RegEx Question

- by Sarfraz

Hello, How to find below comment block(s) with javascript/jquery regex: /* some description here */ It may even look like: /* some description here */ Or /* some description here */

Read the article
Basic Boost Regex question

- by shuttle87

I'm trying to write some c++ code that tests if a string is in a particular format. In this program there is a height followed by some decimal numbers: for example "height 123.45" or "height 12" would return true but "SomeOtherString 123.45" would return false. My first attempt at this was to write the following: string action; cin >> action; boost::regex EXPR( "^height \\d*(\\.\\d{1,2})?$/" ) ;//height format regex bool height_format_matches = boost::regex_match( action, EXPR ) ; if(height_format_matches==true){ \\do some stuff } However height_format_matches never seemed to be true. Any help is greatly appreciated!

Read the article
Regex - find and replace complete string occurrences only (not partial matches)

- by vittore

I'm not very good at regex but maybe there's a simple way to achieve this task. I'm given a string like "bla @a bla @a1 bla" I'm also given pairs like {"a", "a2"} , {"a1", "a13"}, and I need to replace @a with @a2 for the first pair, and @a1 with @a13 for the second one. The problem is when i use String.Replace and look for @a, it also replaces @a1 but it should not. I need it to completely match @a and avoid partially matching it in other places. Note: the given string could also be brackets, commas, dots and so on. However, pairs will always be [a-z]*[0-9]+ Help me with regex replace, please. Cheers

Read the article
boost's regex won't compile

- by myeviltacos

Hi everyone. I am using boost 1.45.0 on Ubuntu with Code::Blocks as my IDE, and I can't get basic_regex.hpp to compile. I'm pretty sure I set up boost correctly, because I can compile programs using boost::format without any errors. But I'm getting this annoying error, and I don't know how to get rid of it. The code that is provoking the error: boost::regex e("\"http:\\\\/\\\\/localhostr.com\\\\/files\\\\/.+?\""); Compiler output (GCC): obj/Debug/main.o In function `boost::basic_regex<char, boost::regex_traits<char, boost::cpp_regex_traits<char> > >::assign(char const*, char const*, unsigned int)' /home/neal/Documents/boost_1_45_0/boost/regex/v4/basic_regex.hpp|379| undefined reference to `boost::basic_regex<char, boost::regex_traits<char, boost::cpp_regex_traits<char> > >::do_assign(char const*, char const*, unsigned int)'| ||=== Build finished: 1 errors, 0 warnings ===| Did I miss a step when setting up boost, or should I downgrade to another version of boost?

Read the article
Poorly performing regex

- by Kieron

I've a really poorly performing piece of regex, currently it makes Firefox, Chrome and IE hang for a period of time. Here's the reg-ex: ^([a-zA-Z0-9]+[/]?)+[a-zA-Z0-9]+$ It's kind of a url matcher, but should only match the requested path (not starting with or ending with a slash). Valid examples: Segment Segment/Segment segment/segment/Segment (etc) Invalid examples: /Segment Segment/ Segment/Segment/ Using the regex above over all three browsers and using two or more slashes causes the browsers to hang. It's obviously a poorly formed reg-ex, but can anyone help build a better one? Thanks,

Read the article
regex search a mysql text column

- by Ian

Okay, I thought my head hurt with regular regex, but I can't seem to find what I'm looking for with regexp in mysql. I'm trying to look for situations in news articles where a Textile-formatted url has not ended with a slash so: "Catherine Zeta-Jones":/cr/catherinezeta-jones/ visited stack overflow is ok but "Catherine Zeta-Jones":/cr/catherinezeta-jones visited stack overflow is not. [just used Catherine as an example because I'm assuming an alpha search wouldn't catch the hyphen] One of these days I'll have to do that goat sacrifice so I can gain the proper knowledge of regex. Thanks everyone!

Read the article
Conditional regex in vim?

- by Vineeth

Is it possible to do conditional regex (like the one described in http://www.regular-expressions.info/conditional.html) in Vim?

Read the article
perl regex groups

- by Aaron Moodie

I've currenly trying to pull out dates from a file and feed them directly into an array. My regex is working, but I have 6 groups in it, all of which are being added to the array, when I only want the first one. @dates = (@dates, ($line =~ /((0[1-9]|[12][0-9]|3[01])(\/|\-)(0[1-9]|1[0-2])(\/|\-)([0-9][0-9][0-9][0-9]|[0-9][0-9]))/g )); is there a simple way to grab the $1 group of a perl regex? my output is looking like this: 13/04/2009, 13, /, 04, /, 2009, 14-12-09, 14, -, 12, -, 09

Read the article
How can I customise Zend_Form regex error messages?

- by Matt

I have the following code: $postcode = $form-createElement('text', 'postcode'); $postcode-setLabel('Post code:'); $postcode-addValidator('regex', false, array('/^[a-z]{1,3}[0-9]{1,3} ?[0-9]{1,3}[a-z]{1,3}$/i')); $postcode-addFilters(array('StringToUpper')); $postcode-setRequired(true); It creates an input field in a form and sets a regex validation rule and works just fine. The problem is that the error message it displays when a user enters an invalid postcode is this: 'POSTCODE' does not match against pattern '/^[a-z]{1,3}[0-9]{1,3} ?[0-9]{1,3}[a-z]{1,3}$/i' (where input was POSTCODE) How can I change this message to be a little more friendly?

Read the article
regex to filter all but whitelisted characters from a multi-language string

- by jeroen

I am trying to cleanup a string coming from a search box on a multi-language site. Normally I would use a regex like: $allowed = "-+?!,.;:\w\s"; $txt_search = preg_replace("/[^" . $allowed . "]?(.*?)[^" . $allowed . "]?/iu", "$1", $_GET['txt_search']); and that works fine for English texts. However, now I need to do the same when the texts entered can be in any language (Russian now, Chinese in the future). How can I clean up the string while preserving "normal texts" in the original language? I though about switching to a blacklist (although I´d rather not...) but at this moment the regex just completely destroys all original input.

Read the article
PHP DateTime Regex

- by CogitoErgoSum

Hey, long story short I have inherited some terrible code. As a result a string comparison is buggy when comparing dates due to the format of the date. I am trying to convert the date to a valid DateFormat syntax so I can run a proper comparison. These are some samples of the current format: 12/01/10 at 8:00PM 12/31/10 at 12:00PM 12/10/09 at 5:00AM and so forth. I'd like to convert this to a YYYYMMDDHHMM format i.e 201012012000 for comparison purposes. If anyone can give me a quick regex snippet to do this that'd be appreciated as right now i'm hitting a brick wall for a regex. I can do it by exploding the string over several times etc but I'd rather do it in a more efficient manner. Thanks!

Read the article
REGEX (.*) and newline

- by someonewhodoesintspeakenglish

How this fix? REGEX: //REGEX $match_expression = '/Rt..tt<\/td> <td>(.*)<\/td>/'; preg_match($match_expression,$text,$matches1); $final = $matches1[1]; //THIS WORKING <tr> <td class="rowhead vtop">RtÅ¡tt</td> <td><img border=0 src="http://somephoto"><br /> <br />INFO INFO INFO</td> </tr> //THIS NOT WORKING <tr> <td class="rowhead vtop">RtÅ¡tt</td> <td> <br /> IFNO<br /> INFO<br /></td></tr> Sorry for my poor English

Read the article
Regex notation meaning

- by Bi

What does this mean in regex? "[0-9A-Za-z._-]+" Thanks

Read the article
WordPress: Problem with the shortcode regex

- by peroyomas

This is the regular expression used for "shortcodes" in WordPress (one for the whole tag, other for the attributes). return '(.?)\[('.$tagregexp.')\b(.*?)(?:(\/))?\](?:(.+?)\[\/\2\])?(.?)'; $pattern = '/(\w+)\s*=\s*"([^"]*)"(?:\s|$)|(\w+)\s*=\s*\'([^\']*)\'(?:\s|$)|(\w+)\s*=\s*([^\s\'"]+)(?:\s|$)|"([^"]*)"(?:\s|$)|(\S+)(?:\s|$)/'; It parses stuff like [foo bar="baz"]content[/foo] or [foo /] In the WordPress trac they say it's a bit flawed, but my main problem is that it don't support shortcodes inside the attributes, like in [foo bar="[baz /]"]content[/foo] because the regex stops the main shortcode at the first appearance of a closing bracket, so in the example it renders [foo bar="[baz /] and "]content[/foo] shows as it is. Is there any way to change the regex so it bypass any ocurrence of [ with ] and its content when occurs between the opening tag or self-closing tag?

Read the article
Regex to represent "NOT" in a group

- by Joe Ijam

I have this Regex; <(\d+)(\w+\s\d+\s\d+(?::\d+){2})\s([\w\/.-])(.) What I want to do is to return FALSE(Not matched) if the third group is "MSWinEventLog" and returning "matched" for the rest. <166Apr 28 10:46:34 AMC the remaining phrase <11Apr 28 10:46:34 MSWinEventLog the remaining phrase <170Apr 28 10:46:34 Avantail the remaining phrase <171Apr 28 10:46:34 Avantail the remaining phrase <172Apr 28 10:46:34 AMC the remaining phrase <173Apr 28 10:46:34 AMC the remaining phrase <174Apr 28 10:46:34 Avantail the remaining phrase <175Apr 28 10:46:34 AMC the remaining phrase <176Apr 28 10:46:34 AMC the remaining phrase <177Apr 28 10:46:34 Avantail the remaining phrase <178Apr 28 10:46:34 AMC the remaining phrase <179Apr 28 10:46:34 Avantail the remaining phrase <180Apr 28 10:46:34 Avantail the remaining phrase How to put " NOT 'MSWinEventLog' " in the regex group ([\w\/.-]*) ? Note : The second phrase above should return "not matched"

Read the article
Regex for date.

- by Harikrishna

What should be the regex for matching date of any format like 26FEB2009 30 Jul 2009 27 Mar 2008 29/05/2008 27 Aug 2009 What should be the regular expression for that ? I have regex that matches with 26-Feb-2009 and 26 FEB 2009 with but not with 26FEB2009. So if any one know then please update it. (?:^|[^\d\w:])(?'day'\d{1,2})(?:-?st\s+|-?th\s+|-?rd\s+|-?nd\s+|-|\s+)(?'month'Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[uarychilestmbro]*(?:\s*,?\s*|-)(?:'?(?'year'\d{2})|(?'year'\d{4}))(?=$|[^\d\w]) EDIT The date 26FEB2009 is substring of other string like FUTIDX 26FEB2009 NIFTY 0 and parsed from html page, so I can not set the whitespace or delimiter.

Read the article
What's wrong with my regex

- by Tom Brown

Yes I know its usually a bad idea to parse HTML using RegEx, but that aside can someone explain the fault here: string outputString = Regex.Replace(inputString, @"<?(?i:script|embed|object|frameset|frame|iframe|metalink|style|html|img|layer|ilayer|meta|applet)(.|\n)*?>", ""); if (outputString != inputString) { Console.WriteLine("unwanted tags detected"); } It certainly detects the intended tags like: <script> and <html>, but it also rejects strings I want to allow such as <B>Description</B> and <A href="http://www.mylink.com/index.html">A Link containing 'HTML'</A>

Read the article
Regex matching very slow

- by Ali Lown

I am trying to parse a PDF to extract the text from it (please don't suggest any libraries to do this, as this is part of learning the format). I have already handled deflating it to put it in the alphanumeric format. I now need to extract the text from the text blocks. So, my current pattern is "BT.*?((.*?)).*?ET" (with DOTMATCHALL set) to match something like: BT /F13 12 Tf 288 720 Td (ABC) Tj ET The only bit I want is the text ABC in the brackets. The above pattern works, but is really slow, I assume it is because the regex library is failing to match the pattern that matches the text between BT and the (ABC) many times. The regex is pre-compiled in an attempt to speed it up, but it seems negligible. How may I speed this up?

Read the article

< Previous Page | 8 9 10 11 12 13 14 15 16 17 18 19 | Next Page >