Search Results

Search found 3804 results on 153 pages for 'regex'.

  • strange behavior in vim with negative look-behind

    - by João Portela
    So, I am doing this search in vim: /\(\(unum\)\|\(player\)=\)\@<!\"1\" and, as expected, it does not match lines that have player="1", but it does match lines that have unum="1". What am I doing wrong? Isn't the atom to be negated all of this: \(\(unum\)\|\(player\)=\)? Naturally, just doing /\(\(unum\)\|\(player\)=\) matches unum= or player=.
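
A likely cause is operator precedence: \| binds more loosely than concatenation, so the group reads as "unum" or "player=", and the = is only required after player. Grouping the alternation first, as in \(\(unum\|player\)=\)\@<!"1", should negate both prefixes. The sketch below illustrates the same precedence point with Python's re module rather than vim. Python look-behind must be fixed-width, so two look-behinds stand in for the alternation, and the sample lines are made up for illustration.

```python
import re

# Grouping as written in the question: "unum" OR "player=".
# The "=" only belongs to the second branch.
loose = re.compile(r'(?<!unum)(?<!player=)"1"')

# Intended grouping: ("unum" OR "player") followed by "=".
strict = re.compile(r'(?<!unum=)(?<!player=)"1"')

samples = ['unum="1"', 'player="1"', 'id="1"']  # hypothetical sample lines
for line in samples:
    print(line, bool(loose.search(line)), bool(strict.search(line)))
```

With the loose grouping, unum="1" still matches because the text right before "1" is "unum=", which does not end in "unum".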

  • How can I use Perl regular expressions to parse XML data?

    - by Luke
    I have a pretty long piece of XML that I want to parse. I want to remove everything except for the subclass-code and city. So that I am left with something like the example below. EXAMPLE TEST SUBCLASS|MIAMI CODE <?xml version="1.0" standalone="no"?> <web-export> <run-date>06/01/2010 <pub-code>TEST <ad-type>TEST <cat-code>Real Estate</cat-code> <class-code>TEST</class-code> <subclass-code>TEST SUBCLASS</subclass-code> <placement-description></placement-description> <position-description>Town House</position-description> <subclass3-code></subclass3-code> <subclass4-code></subclass4-code> <ad-number>0000284708-01</ad-number> <start-date>05/28/2010</start-date> <end-date>06/09/2010</end-date> <line-count>6</line-count> <run-count>13</run-count> <customer-type>Private Party</customer-type> <account-number>100099237</account-number> <account-name>DOE, JOHN</account-name> <addr-1>207 CLARENCE STREET</addr-1> <addr-2> </addr-2> <city>MIAMI</city> <state>FL</state> <postal-code>02910</postal-code> <country>USA</country> <phone-number>4014612880</phone-number> <fax-number></fax-number> <url-addr> </url-addr> <email-addr>[email protected]</email-addr> <pay-flag>N</pay-flag> <ad-description>DEANESTATES2BEDS2BATHSAPPLIANCED</ad-description> <order-source>Import</order-source> <order-status>Live</order-status> <payor-acct>100099237</payor-acct> <agency-flag>N</agency-flag> <rate-note></rate-note> <ad-content> MIAMI&#47;Dean Estates&#58; 2 beds&#44; 2 baths&#46; Applianced&#46; Central air&#46; Carpets&#46; Laundry&#46; 2 decks&#46; Pool&#46; Parking&#46; Close to everything&#46;No smoking&#46; No utilities&#46; &#36;1275 mo&#46; 401&#45;578&#45;1501&#46; </ad-content> </ad-type> </pub-code> </run-date> </web-export> PERL So what I want to do is open an existing file read the contents then use regular expressions to eliminate the unnecessary XML tags. 
open(READFILE, "FILENAME"); while(<READFILE>) { $_ =~ s/<\?xml version="(.*)" standalone="(.*)"\?>\n.*//g; $_ =~ s/<subclass-code>//g; $_ =~ s/<\/subclass-code>\n.*/|/g; $_ =~ s/(.*)PJ RER Houses /PJ RER Houses/g; $_ =~ s/\G //g; $_ =~ s/<city>//g; $_ =~ s/<\/city>\n.*//g; $_ =~ s/<(\/?)web-export>(.*)\n.*//g; $_ =~ s/<(\/?)run-date>(.*)\n.*//g; $_ =~ s/<(\/?)pub-code>(.*)\n.*//g; $_ =~ s/<(\/?)ad-type>(.*)\n.*//g; $_ =~ s/<(\/?)cat-code>(.*)<(\/?)cat-code>\n.*//g; $_ =~ s/<(\/?)class-code>(.*)<(\/?)class-code>\n.*//g; $_ =~ s/<(\/?)placement-description>(.*)<(\/?)placement-description>\n.*//g; $_ =~ s/<(\/?)position-description>(.*)<(\/?)position-description>\n.*//g; $_ =~ s/<(\/?)subclass3-code>(.*)<(\/?)subclass3-code>\n.*//g; $_ =~ s/<(\/?)subclass4-code>(.*)<(\/?)subclass4-code>\n.*//g; $_ =~ s/<(\/?)ad-number>(.*)<(\/?)ad-number>\n.*//g; $_ =~ s/<(\/?)start-date>(.*)<(\/?)start-date>\n.*//g; $_ =~ s/<(\/?)end-date>(.*)<(\/?)end-date>\n.*//g; $_ =~ s/<(\/?)line-count>(.*)<(\/?)line-count>\n.*//g; $_ =~ s/<(\/?)run-count>(.*)<(\/?)run-count>\n.*//g; $_ =~ s/<(\/?)customer-type>(.*)<(\/?)customer-type>\n.*//g; $_ =~ s/<(\/?)account-number>(.*)<(\/?)account-number>\n.*//g; $_ =~ s/<(\/?)account-name>(.*)<(\/?)account-name>\n.*//g; $_ =~ s/<(\/?)addr-1>(.*)<(\/?)addr-1>\n.*//g; $_ =~ s/<(\/?)addr-2>(.*)<(\/?)addr-2>\n.*//g; $_ =~ s/<(\/?)state>(.*)<(\/?)state>\n.*//g; $_ =~ s/<(\/?)postal-code>(.*)<(\/?)postal-code>\n.*//g; $_ =~ s/<(\/?)country>(.*)<(\/?)country>\n.*//g; $_ =~ s/<(\/?)phone-number>(.*)<(\/?)phone-number>\n.*//g; $_ =~ s/<(\/?)fax-number>(.*)<(\/?)fax-number>\n.*//g; $_ =~ s/<(\/?)url-addr>(.*)<(\/?)url-addr>\n.*//g; $_ =~ s/<(\/?)email-addr>(.*)<(\/?)email-addr>\n.*//g; $_ =~ s/<(\/?)pay-flag>(.*)<(\/?)pay-flag>\n.*//g; $_ =~ s/<(\/?)ad-description>(.*)<(\/?)ad-description>\n.*//g; $_ =~ s/<(\/?)order-source>(.*)<(\/?)order-source>\n.*//g; $_ =~ s/<(\/?)order-status>(.*)<(\/?)order-status>\n.*//g; $_ =~ 
s/<(\/?)payor-acct>(.*)<(\/?)payor-acct>\n.*//g; $_ =~ s/<(\/?)agency-flag>(.*)<(\/?)agency-flag>\n.*//g; $_ =~ s/<(\/?)rate-note>(.*)<(\/?)rate-note>\n.*//g; $_ =~ s/<ad-content>(.*)\n.*//g; $_ =~ s/\t(.*)\n.*//g; $_ =~ s/<\/ad-content>(.*)\n.*//g; } close(READFILE); Is there an easier way of doing this? I don't want to use any modules. I know that they might make this easier, but the file I am reading has a lot of data in it.
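
One shorter route, assuming each record contains a single subclass-code and a single city element, is to capture the two wanted values instead of deleting everything else. The sketch below uses Python's built-in re module rather than Perl; the helper name and the trimmed sample record are hypothetical. For markup less predictable than this, an XML parser is usually the safer tool.

```python
import re

def extract_subclass_and_city(xml_text):
    """Return 'SUBCLASS|CITY' for one record, or None if either tag is missing."""
    subclass = re.search(r"<subclass-code>(.*?)</subclass-code>", xml_text, re.S)
    city = re.search(r"<city>(.*?)</city>", xml_text, re.S)
    if subclass is None or city is None:
        return None
    return subclass.group(1).strip() + "|" + city.group(1).strip()

# Trimmed, hypothetical version of the record shown in the question.
sample = """<web-export>
<class-code>TEST</class-code>
<subclass-code>TEST SUBCLASS</subclass-code>
<city>MIAMI</city>
<state>FL</state>
</web-export>"""

print(extract_subclass_and_city(sample))
```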

  • Replacing <a href="mailto: with just the email address

    - by Lauri
    I want to replace all "mailto:" links in HTML with plain email addresses. In: text .... <a href="mailto:[email protected]">not needed</a> text Out: text .... [email protected] text I did this: $str = preg_replace("/\<a.+href=\"mailto:(.*)\".+\<\/a\>/", "$1", $str); But it fails if there are multiple emails in the string or HTML inside the "a" tag. In: <a href="mailto:[email protected]">not needed</a><a href="mailto:[email protected]"><font size="3">[email protected]</font></a> Out: [email protected]">
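
The failure is consistent with greedy quantifiers: .+ and .* run to the last possible match, so one match can swallow several links. A minimal sketch of the usual remedy, shown with Python's re module rather than PHP: restrict the address to non-quote characters and make the link body non-greedy. The addresses and the helper name are hypothetical.

```python
import re

MAILTO_LINK = re.compile(
    r'<a\s[^>]*href="mailto:([^"]*)"[^>]*>.*?</a>',
    re.IGNORECASE | re.DOTALL,
)

def unlink_mailto(html):
    """Replace each mailto: anchor with the bare address taken from its href."""
    return MAILTO_LINK.sub(r"\1", html)

html = ('<a href="mailto:a@example.com">not needed</a>'
        '<a href="mailto:b@example.com"><font size="3">b@example.com</font></a>')
print(unlink_mailto(html))
```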

  • Invert regexp in vim

    - by Chris J
    There are a few "how do I invert a regexp" questions here on Stack Overflow, but I can't find one for vim (if it does exist, my Google-fu is lacking today). In essence I want to match all non-printable characters and delete them. I could write a short script, or drop to a shell and use tr or something similar to delete them, but a vim solution would be dandy :-) Vim has the atom \p to match printable characters; however, trying :s/[^\p]//g to match the inverse failed and just left me with every 'p' in the file. I've seen the (?!xxx) sequence in other questions, but vim seems not to recognise it. I've not seen an atom for non-printable chars. In the interim, I'm going to drop to external tools, but if anyone's got any trick up their sleeve to do this, it'd be welcome :-) Ta!

  • Regexp match in Java

    - by tinti
    I want to write a regexp in Java that verifies whether a word looks like [0-9A-Za-z][._-'][0-9A-Za-z]. Examples of valid words: A21a_c32, daA.da2, das'2, dsada, ASDA, 12SA89. Invalid words: dsa#da2, 34$. Thanks
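
One reading of the examples is "runs of letters and digits, optionally joined by a single . _ - or '". Under that assumption, a minimal sketch in Python follows; the pattern syntax used here is also accepted by Java's regex engine, so the same pattern string should work with String.matches. The helper name is hypothetical.

```python
import re

# One or more alphanumerics, then zero or more (separator + alphanumerics) groups.
# The hyphen is placed last in the class so it is taken literally.
WORD = re.compile(r"[0-9A-Za-z]+(?:[._'-][0-9A-Za-z]+)*")

def is_valid_word(word):
    """True if the whole word fits the assumed pattern."""
    return WORD.fullmatch(word) is not None

for w in ["A21a_c32", "daA.da2", "das'2", "dsada", "ASDA", "12SA89", "dsa#da2", "34$"]:
    print(w, is_valid_word(w))
```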

  • Intersection of two regular expressions

    - by Henry
    Hi, I'm looking for a function (ideally in PHP) that returns true if there exists a string that matches both regexpA and regexpB. Example 1: $regexpA = '[0-9]+'; $regexpB = '[0-9]{2,3}'; hasRegularsIntersection($regexpA,$regexpB) returns TRUE because '12' matches both regexps. Example 2: $regexpA = '[0-9]+'; $regexpB = '[a-z]+'; hasRegularsIntersection($regexpA,$regexpB) returns FALSE because numbers never match letters. Thanks for any suggestions on how to solve this. Henry

  • Bash: Extract Range with Regular Expression (maybe sed?)

    - by sixtyfootersdude
    I have a file that is similar to this: <many lines of stuff> SUMMARY: <some lines of stuff> END OF SUMMARY I want to extract just the stuff between SUMMARY and END OF SUMMARY. I suspect I can do this with sed but I am not sure how. I know I can modify the stuff in between with this: sed "/SUMMARY/,/END OF SUMMARY/ s/replace/with/" fileName (but I am not sure how to just extract that stuff). I am using Bash on Solaris.
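
With sed, printing only an addressed range is usually done with -n plus the p command, for example sed -n '/SUMMARY/,/END OF SUMMARY/p' fileName, which includes the two marker lines. The same idea can be sketched as a small state machine; the version below is in Python, excludes the marker lines, and assumes each marker sits alone on its own line. The function name and sample data are hypothetical.

```python
def extract_summary(lines, start="SUMMARY:", end="END OF SUMMARY"):
    """Yield the lines strictly between the start and end marker lines."""
    inside = False
    for line in lines:
        text = line.rstrip("\n")
        if not inside and text.strip() == start:
            inside = True          # range opens on the line after the marker
        elif inside and text.strip() == end:
            inside = False         # range closes before the end marker
        elif inside:
            yield text

sample = ["header", "SUMMARY:", "first", "second", "END OF SUMMARY", "footer"]
print(list(extract_summary(sample)))
```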

  • regular expression of 0's and 1's

    - by Lopa
    Hello all, I got this question, which asks me to figure out why it is foolish to write a regular expression for the language that consists of strings of 0's and 1's that are palindromes (they read the same backwards and forwards). Part 2 of the question says: using any formal mechanism of your choice, show how it is possible to express the language that consists of strings of 0's and 1's that are palindromes.
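
The standard argument is that binary palindromes are not a regular language: checking the second half against the first needs unbounded memory, which a finite automaton lacks (the pumping lemma makes this precise). A context-free grammar expresses them directly: S -> empty | 0 | 1 | 0 S 0 | 1 S 1. As an illustration only, here is a small Python recognizer that peels off matching outer symbols exactly as that grammar does; the function name is hypothetical.

```python
def is_binary_palindrome(s):
    """Recognize S -> '' | '0' | '1' | '0' S '0' | '1' S '1'."""
    if any(ch not in "01" for ch in s):
        return False               # not a string over the alphabet {0, 1}
    while len(s) >= 2:
        if s[0] != s[-1]:
            return False           # no rule 0 S 1 or 1 S 0 exists
        s = s[1:-1]                # apply 0 S 0 or 1 S 1 in reverse
    return True                    # remaining '', '0' or '1' is a base case

for candidate in ["", "0", "0110", "010", "01", "1101"]:
    print(repr(candidate), is_binary_palindrome(candidate))
```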

  • Find ASCII "arrows" in text

    - by ulver
    I'm trying to find all the occurrences of "Arrows" in text, so in "<----=====><==->>" the arrows are: "<----", "=====>", "<==", "->", ">" This works: String[] patterns = {"<=*", "<-*", "=*>", "-*>"}; for (String p : patterns) { Matcher A = Pattern.compile(p).matcher(s); while (A.find()) { System.out.println(A.group()); } } but this doesn't: String p = "<=*|<-*|=*>|-*>"; Matcher A = Pattern.compile(p).matcher(s); while (A.find()) { System.out.println(A.group()); } No idea why. It often reports "<" instead of "<====" or similar. What is wrong?
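
A plausible explanation is that alternation in backtracking engines such as java.util.regex is ordered: the first branch that can match at a position wins, and <=* happily matches a bare "<" with zero "=" characters, so <-* never gets a chance. Requiring at least one "=" in the first branch lets the second branch handle "<----". The sketch below reproduces both behaviours with Python's re module, which uses the same first-match-wins rule for alternation.

```python
import re

s = "<----=====><==->>"

# As in the question: "<=*" matches "<" alone, shadowing "<-*".
original = re.findall(r"<=*|<-*|=*>|-*>", s)

# First branch now needs at least one "=", so "<----" falls through to "<-*".
reordered = re.findall(r"<=+|<-*|=*>|-*>", s)

print(original)   # ['<', '=====>', '<==', '->', '>']
print(reordered)  # ['<----', '=====>', '<==', '->', '>']
```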

  • Cleaning strings in R: add punctuation w/o overwriting last character

    - by spearmint
    I'm new to R and unable to find other threads with a similar issue. I'm cleaning data that requires punctuation at the end of each line. I am unable to add, say, a period without overwriting the final character of the line preceding the carriage return + line feed. Sample code: Data1 <- "%trn: dads sheep\r\n*MOT: hunn.\r\n%trn: yes.\r\n*MOT: ana mu\r\n%trn: where is it?" Data2 <- gsub("[^[:punct:]]\r\n\\*", ".\r\n\\*", Data1) The contents of Data2: [1] "%trn: dads shee.\r\n*MOT: hunn.\r\n%trn: yes.\r\n*MOT: ana mu\r\n%trn: where is it?" Notice the "p" of sheep was overwritten with the period. Any thoughts on how I could avoid this?
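
The "p" disappears because [^[:punct:]] consumes that character as part of the match, and the replacement never puts it back. Capturing it and re-emitting it should help; in R that would presumably look like gsub("([^[:punct:]])\r\n\\*", "\\1.\r\n*", Data1). The sketch below shows the same capture-and-reinsert idea in Python, where string.punctuation stands in for the POSIX [:punct:] class (an approximation for ASCII text).

```python
import re
import string

data1 = ("%trn: dads sheep\r\n*MOT: hunn.\r\n%trn: yes.\r\n"
         "*MOT: ana mu\r\n%trn: where is it?")

# Character class matching any single non-punctuation character.
not_punct = "[^" + re.escape(string.punctuation) + "]"

# Capture the last character of the line and put it back before the period.
data2 = re.sub("(" + not_punct + ")\r\n\\*",
               lambda m: m.group(1) + ".\r\n*",
               data1)
print(repr(data2))
```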

  • Building a regexp to split a string

    - by Kivin
    I'm seeking a solution to splitting a string which contains text in the following format: "abcd efgh 'ijklm no pqrs' tuv" which will produce the following results: ['abcd', 'efgh', 'ijklm no pqrs', 'tuv'] In other words, it splits by whitespace unless inside of a single-quoted string. I think it could be done with .NET regexps using "Lookaround" operators, particularly balancing operators. I'm not so sure about Perl.
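
Lookaround is not strictly needed if the problem is turned around: instead of describing the separators, match the tokens, where a token is either a single-quoted run or a run of non-whitespace. A minimal sketch of that approach in Python, assuming quotes are never nested or escaped; the function name is hypothetical.

```python
import re

# Branch 1: text between single quotes (captured without the quotes).
# Branch 2: any run of non-whitespace characters.
TOKEN = re.compile(r"'([^']*)'|(\S+)")

def split_respecting_quotes(text):
    """Split on whitespace, keeping single-quoted sections together."""
    return [m.group(1) if m.group(1) is not None else m.group(2)
            for m in TOKEN.finditer(text)]

print(split_respecting_quotes("abcd efgh 'ijklm no pqrs' tuv"))
```

In Python specifically, the standard library's shlex.split handles this input the same way, so the regex is mainly useful where the pattern has to be ported to another engine.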

  • How do you implement a good profanity filter?

    - by Ben Throop
    Many of us need to deal with user input, search queries, and situations where the input text can potentially contain profanity or undesirable language. Oftentimes this needs to be filtered out. Where can one find a good list of swear words in various languages and dialects? Are there APIs available to sources that contain good lists? Or maybe an API that simply says "yes this is clean" or "no this is dirty" with some parameters? What are some good methods for catching folks trying to trick the system, like a$$, azz, or a55? Bonus points if you offer solutions for PHP. :) Edit: Response to answers that say simply avoid the programmatic issue: I think there is a place for this kind of filter when, for instance, a user can use public image search to find pictures that get added to a sensitive community pool. If they can search for "penis", then they will likely get many pictures of, yep. If we don't want pictures of that, then preventing the word as a search term is a good gatekeeper, though admittedly not a foolproof method. Getting the list of words in the first place is the real question. So I'm really referring to a way to figure out if a single token is dirty or not and then simply disallow it. I'd not bother preventing a sentiment like the totally hilarious "long necked giraffe" reference. Nothing you can do there. :)

  • Building a Hashtag in Javascript without matching Anchor Names, BBCode or Escaped Characters

    - by Martindale
    I would like to convert any instances of a hashtag in a String into a linked URL: #hashtag - should have "#hashtag" linked. This is a #hashtag - should have "#hashtag" linked. This is a [url=http://www.mysite.com/#name]named anchor[/url] - should not be linked. This isn&#39;t a pretty way to use quotes - should not be linked. Here is my current code: String.prototype.parseHashtag = function() { return this.replace(/[^&][#]+[A-Za-z0-9-_]+(?!])/, function(t) { var tag = t.replace("#",""); return t.link("http://www.mysite.com/tag/"+tag); }); }; Currently, this appears to fix escaped characters (by excluding matches preceded by an ampersand) and handles named anchors, but it doesn't link the #hashtag if it's the first thing in the message, and it seems to include the 1-2 characters prior to the "#" in the link. Halp!

  • Convert regular expression to CFG

    - by user242581
    How can I convert a regular language to its equivalent context-free grammar (CFG)? Does the DFA corresponding to the regular expression need to be constructed, or is there some rule for the conversion? For example, considering the regular expression 01+10(11)*, how can I describe the grammar corresponding to it?

  • Regular Expression With Mask

    - by Kumar
    I have a regular expression for phone numbers as follows: ^[01]?[- .]?(\([2-9]\d{2}\)|[2-9]\d{2})[- .]?\d{3}[- .]?\d{4}$ I have a mask on the phone number textbox in the following format: (___)___-____ How can I modify the regular expression so that it accommodates the mask?

  • Pulling out two separate words from a string using reg expressions?

    - by Marvin
    I need to improve on a regular expression I'm using. Currently, here it is: ^[a-zA-Z\s/-]+ I'm using it to pull out medication names from a variety of formulation strings, for example: SULFAMETHOXAZOLE-TRIMETHOPRIM 200-40 MG/5ML PO SUSP AMOX TR/POTASSIUM CLAVULANATE 125 mg-31.25 mg ORAL TABLET, CHEWABLE AMOXICILLIN TRIHYDRATE 125 mg ORAL TABLET, CHEWABLE AMOX TR/POTASSIUM CLAVULANATE 125 mg-31.25 mg ORAL TABLET, CHEWABLE Amoxicillin 1000 MG / Clavulanate 62.5 MG Extended Release Tablet The resulting matches on these examples are: SULFAMETHOXAZOLE-TRIMETHOPRIM AMOX TR/POTASSIUM CLAVULANATE AMOXICILLIN TRIHYDRATE AMOX TR/POTASSIUM CLAVULANATE Amoxicillin The first four are what I want, but on the fifth, I really need "Amoxicillin / Clavulanate". How would I pull out patterns like "Amoxicillin / Clavulanate" (in fifth row) while missing patterns like "MG/5 ML" (in the first row)?

  • Regexp: Replace only in specific context

    - by blinry
    In a text, I would like to replace all occurrences of $word by [$word]($word) (to create a link in Markdown), but only if it is not already in a link. Example: [$word homepage](http://w00tw00t.org) should not become [[$word]($word) homepage](http://w00tw00t.org). Thus, I need to check whether $word is somewhere between [ and ] and only replace if it's not the case. Can you think of a preg_replace command for this?

  • How to split a space separated file?

    - by simplesimon
    Hi, I am trying to import this: http://en.wikipedia.org/wiki/List_of_countries_by_continent_%28data_file%29 which has a format like: AS AF AFG 004 Afghanistan, Islamic Republic of EU AX ALA 248 Åland Islands EU AL ALB 008 Albania, Republic of AF DZ DZA 012 Algeria, People's Democratic Republic of OC AS ASM 016 American Samoa EU AD AND 020 Andorra, Principality of AF AO AGO 024 Angola, Republic of NA AI AIA 660 Anguilla If I do <? explode(" ", $data); ?> that works fine apart from countries with more than one word. How can I split it so I get the first 4 bits of data (the chars/ints), with the 5th bit being whatever remains? This is in PHP. Thank you
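
Most split functions take a limit on the number of pieces, which fits this layout: four fixed fields, then the rest. In PHP, explode accepts a third limit argument, so explode(" ", $line, 5) should leave the country name intact in the last element. The sketch below shows the same idea in Python; the field names are assumptions made for illustration and are not taken from the data file.

```python
def parse_country_line(line):
    """Split one line into four fixed fields plus the remaining country name."""
    # maxsplit=4 means at most 5 pieces; the last keeps its internal spaces.
    continent, alpha2, alpha3, numeric, name = line.split(None, 4)
    return {
        "continent": continent,   # assumed meaning of column 1
        "alpha2": alpha2,         # assumed meaning of column 2
        "alpha3": alpha3,         # assumed meaning of column 3
        "numeric": numeric,       # kept as text to preserve leading zeros
        "name": name,
    }

print(parse_country_line("AS AF AFG 004 Afghanistan, Islamic Republic of"))
```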
