Search Results

Search found 5493 results on 220 pages for 'boost regex'.


  • Jakarta Regexp 1.5 Backreferences?

    - by Matt Smith
    Why does this match: String str = "099.9 102.2" + (char) 0x0D; RE re = new RE("^([0-9]{3}.[0-9]) ([0-9]{3}.[0-9])\r$"); System.out.println(re.match(str)); But this does not: String str = "099.9 102.2" + (char) 0x0D; RE re = new RE("^([0-9]{3}.[0-9]) \1\r$"); System.out.println(re.match(str)); The back references don't seem to be working... What am I missing?
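
    One thing worth checking first, independent of Jakarta Regexp: a backreference repeats the exact text captured by the group, not the group's sub-pattern, so `\1` can only match a second "099.9" and never "102.2". Separately, in a Java string literal "\1" is an octal escape, so the backslash would need doubling ("\\1") to reach the regex engine at all. A minimal sketch of the first point, using Python's re module rather than Jakarta Regexp, with the dot escaped as an assumption about the intent:

```python
import re

# Backreference \1 must repeat the text captured by group 1 exactly.
pattern = re.compile(r"^([0-9]{3}\.[0-9]) \1\r$")

different = "099.9 102.2\r"   # second number differs from the first
same = "099.9 099.9\r"        # second number repeats the first

print(pattern.match(different) is not None)  # False
print(pattern.match(same) is not None)       # True
```

    If the goal is "two numbers of the same shape", repeating the sub-pattern (as in the first, working expression) is the usual way to say it.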


  • Regular expressions in python unicode

    - by Remy
    I need to remove all the HTML tags from a given web page's data. I tried this using regular expressions: import urllib2 import re page = urllib2.urlopen("http://www.frugalrules.com") from bs4 import BeautifulSoup, NavigableString, Comment soup = BeautifulSoup(page) link = soup.find('link', type='application/rss+xml') print link['href'] rss = urllib2.urlopen(link['href']).read() souprss = BeautifulSoup(rss) description_tag = souprss.find_all('description') content_tag = souprss.find_all('content:encoded') print re.sub('<[^>]*>', '', content_tag) But the syntax of re.sub is: re.sub(pattern, repl, string, count=0) So I modified the code (replacing the print statement above) to: for row in content_tag: print re.sub(ur"<[^>]*>",'',row,re.UNICODE) But it gives the following error: Traceback (most recent call last): File "C:\beautifulsoup4-4.3.2\collocation.py", line 20, in <module> print re.sub(ur"<[^>]*>",'',row,re.UNICODE) File "C:\Python27\lib\re.py", line 151, in sub return _compile(pattern, flags).sub(repl, string, count) TypeError: expected string or buffer What am I doing wrong?
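
    Two things stand out in the call itself. The fourth positional parameter of re.sub is count, so re.UNICODE is being read as a count rather than as a flag. And re.sub expects a string, whereas each row here is presumably a BeautifulSoup tag object, which would explain the TypeError. A minimal sketch in Python 3 syntax, using a plain string in place of the feed content (the sample HTML is an assumption, and bs4 is left out so the snippet stays self-contained):

```python
import re

html = "<p>Caf\u00e9 <b>menu</b></p>"

# The fourth positional parameter of re.sub is count, so pass flags by keyword.
text = re.sub(r"<[^>]*>", "", html, flags=re.UNICODE)
print(text)  # Café menu
```

    With real tag objects, converting each row to text first (for example with str(row) in Python 3) before calling re.sub should avoid the type error.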


  • Regexp look-behind to match internet speeds

    - by Sandman
    The user may search for "10 mbit", after which I want to capture the "10" so I can use it in a speed search rather than a string search. This isn't a problem; the regexp below does this fine: if (preg_match("/(\d+)\smbit/", $string)){ ... } But the user may also search for something like "10/10 mbit" or "10-100 mbit". I don't want to match those with the above regexp; they should be handled in another fashion. So I would like a regexp that matches "10 mbit" only if the number is a whole word on its own (i.e. bounded by whitespace, a newline, or the start/end of the line). Using lookbehind, I did this: if (preg_match("#(?<!/)(\d+)\s+mbit#i", $string)){ just to exclude numbers that have "/" before them, but it still matched this string: "10/10 mbit". So I'm obviously doing something wrong here, but what?
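
    The lookbehind only constrains the position where `\d+` starts. For "10/10 mbit" the engine can start at the final "0", which is preceded by "1" rather than "/", so "0 mbit" still matches. A sketch of one possible fix in Python's re module (the sample queries are made up): require whitespace or start-of-string before the number with `(?<!\S)`.

```python
import re

# (?<!\S) requires start-of-string or whitespace immediately before the number.
speed = re.compile(r"(?<!\S)(\d+)\s+mbit", re.IGNORECASE)

for query in ["10 mbit", "fast 100 Mbit line", "10/10 mbit", "10-100 mbit"]:
    m = speed.search(query)
    print(query, "->", m.group(1) if m else None)
```

    PCRE supports the same `(?<!\S)` construct, so the pattern should carry over to preg_match largely unchanged.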


  • C# regular expression

    - by vert
    How would I write a regular expression (C#) which will check a given string to see if any of its characters are characters OTHER than the following: a-z A-Z Æ æ Å å Ø ø - ' Thanks!
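
    A common way to express "contains anything other than these characters" is a negated character class. The sketch below uses Python's re module; it assumes the spaces in the list above are separators rather than allowed characters, and is_valid is a hypothetical helper name.

```python
import re

# Any single character outside the allowed set makes the input invalid.
disallowed = re.compile(r"[^a-zA-ZÆæÅåØø'\-]")

def is_valid(s):
    """Return True when s contains only the allowed characters."""
    return disallowed.search(s) is None

print(is_valid("Ærø-O'Neil"))  # True
print(is_valid("abc1"))        # False
```

    The same character class should work with .NET's Regex.IsMatch, since negated classes and the escaped hyphen are standard syntax there as well.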


  • Weird error using preg_match and unicode

    - by Thorpe Obazee
    if (preg_match('(\p{Nd}{4}/\p{Nd}{2}/\p{Nd}{2}/\p{L}+)', '2010/02/14/this-is-something')) { // do stuff } The above code works. However this one doesn't. if (preg_match('/\p{Nd}{4}/\p{Nd}{2}/\p{Nd}{2}/\p{L}+/u', '2010/02/14/this-is-something')) { // do stuff } Maybe someone could shed some light as to why the one below doesn't work. This is the error that is being produced: A PHP Error was encountered Severity: Warning Message: preg_match() [function.preg-match]: Unknown modifier '\'


  • How to avoid resetting the java Scanner position

    - by Derek
    I have some code that looks more or less like this: while(scanner.hasNext()) { if(scanner.findInLine("Test") !=null) { //do some things }else{ scanner.nextLine(); } } I am using this to parse a ~10MB text file. The problem is, if I put a breakpoint on the while() and the scanner.nextLine(), I can see that sometimes the scanner's position (in the debug window) goes back to zero. I think this is causing some kind of loop blow-up, because the regex in findInLine() starts at zero, looks through some amount of text, advancing the position, and then it randomly gets set back to zero, so it has to re-parse all that text again. Any ideas what could be causing that? Am I even doing this the right way? Thanks. Some additional info: the Scanner is instantiated from an InputStream. After doing some debugging, it appears that there is a HeapCharBuffer that Scanner uses, and it only allows 1024 characters at a time and then resets. Is there a way to avoid this, or to do things differently? That seems like a small number of characters to be able to scan. Derek


  • Is there a way to get the PREMATCH ($`) and POSTMATCH ($') from pcrecpp?

    - by Eric Peers
    Is there a way to obtain the C++ equivalent of Perl's PREMATCH ($`) and POSTMATCH ($') from pcrecpp? I would be happy with a string, a char *, or pairs of indices/startpos+length that point at this. StringPiece seems like it might accomplish part of this, but I'm not certain how to get it. In Perl: $_ = "Hello world"; if (/lo\s/) { $pre = $`; #should be "Hel" $post = $'; #should be "world" } In C++ I would have something like: string mystr = "Hello world"; //do I need to map this in a StringPiece? if (pcrecpp::RE("lo\s").PartialMatch(mystr)) { //should I use Consume or FindAndConsume? //What should I do here to get pre+post matches??? } The plain C pcre API seems to be able to return the vector of match offsets, including the "end" portion of the string, so I could theoretically extract such pre/post variables, but that seems like a lot of work. I like the simplicity of the pcrecpp interface. Suggestions? Thanks! --Eric
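
    For reference, the offsets idea mentioned above (slicing the subject string at the start and end of the match) looks like this. The sketch uses Python's re module only to illustrate the technique; it is not pcrecpp code and makes no claim about what pcrecpp exposes.

```python
import re

text = "Hello world"
m = re.search(r"lo\s", text)
if m:
    pre = text[:m.start()]   # text before the match
    post = text[m.end():]    # text after the match
    print(repr(pre), repr(post))  # 'Hel' 'world'
```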


  • Matching several items inside one string with preg_match_all() and end characters

    - by nefo_x
    I have the following code: preg_match_all('/(.*) \((\d+)\) - ([\d\.\d]+)[,?]/U', "E-Book What I Didn't Learn At School... (2) - 3525.01, FREE Intro DVD/Vid (1) - 0.15", $match); var_dump($string, $match); and get the following output: array(4) { [0]=> array(1) { [0]=> string(54) "E-Book What I Didn't Learn At School... (2) - 3525.01," } [1]=> array(1) { [0]=> string(39) "E-Book What I Didn't Learn At School..." } [2]=> array(1) { [0]=> string(1) "2" } [3]=> array(1) { [0]=> string(7) "3525.01" } } which matches only one item. What I need is to get all items from such strings. When I added a "," to the end of the string, it worked fine, but it makes no sense to add a comma to each string. Any advice?
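
    In the pattern above, `[,?]` is a character class that requires a literal "," or "?", so the last item (which has neither) can never match. One possible approach is to let each item end at a comma or at the end of the string. A sketch in Python's re module, using a lazy `.*?` in place of PCRE's /U modifier:

```python
import re

line = "E-Book What I Didn't Learn At School... (2) - 3525.01, FREE Intro DVD/Vid (1) - 0.15"

# Each item ends at a comma (plus optional spaces) or at the end of the string.
item = re.compile(r"(.*?) \((\d+)\) - ([\d.]+)(?:,\s*|$)")

items = item.findall(line)
for name, qty, price in items:
    print(name, qty, price)
```

    The non-capturing group `(?:,\s*|$)` is also valid PCRE syntax, so the same idea should apply to preg_match_all.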


  • How to process this string via regular expression

    - by iiduce
    My string looks like this: expression1/field1+expression2*expression3+expression4/field2*expression5*expression6/field3 A real example might look like this: computer/(100)+web*mail+explorer/(200)*bbs*solution/(300) "+" and "*" are operators; "computer", "web", ... are expressions; (100), (200) are field numbers. A field number may not exist. I want to process the string into this: /(100)+web*+explorer/(200)bbs/(300) The rule is: if an expression's length is more than 3 and its field is not (200), then add brackets to it.


  • regular expression and escaping

    - by pstanton
    Sorry if this has been asked; my search brought up many off-topic posts. I'm trying to convert wildcards from a user-defined search string (the wildcard is "*") to the PostgreSQL LIKE wildcard "%". I'd like to handle escaping so that "%" => "\%" and "\*" => "*". I know I could replace \* with something else prior to replacing * and then swap it back, but I'd prefer not to, and instead only convert * using a pattern that selects it when not preceded by \. String convertWildcard(String like) { like = like.replaceAll("%", "\\%"); like = like.replaceAll("\\*", "%"); return like; } Assert.assertEquals("%", convertWildcard("*")); Assert.assertEquals("\%", convertWildcard("%")); Assert.assertEquals("*", convertWildcard("\*")); // FAIL Assert.assertEquals("a%b", convertWildcard("a*b")); Assert.assertEquals("a\%b", convertWildcard("a%b")); Assert.assertEquals("a*b", convertWildcard("a\*b")); // FAIL Ideas welcome.
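
    One way to avoid the placeholder swap is a single pass that recognizes the escaped form `\*` before the bare `*`, so each token is rewritten exactly once. The sketch below uses Python's re module; convert_wildcard mirrors the Java method name above, and the token table is an assumption about the intended mapping.

```python
import re

# One pass over the input; the escaped form \* is listed first so it wins over a bare *.
_TOKEN = re.compile(r"\\\*|\*|%")
_REPLACEMENTS = {"\\*": "*", "*": "%", "%": "\\%"}

def convert_wildcard(like):
    """Translate user wildcards to SQL LIKE syntax in a single substitution pass."""
    return _TOKEN.sub(lambda m: _REPLACEMENTS[m.group(0)], like)

print(convert_wildcard("a*b"))    # a%b
print(convert_wildcard("a\\*b"))  # a*b
```

    A negative lookbehind such as `(?<!\\)\*` is the other obvious route. In Java, Matcher.appendReplacement would be the usual way to do the single-pass version.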


  • Help with Regular Expression

    - by shivesh
    Hello, I need help with a regular expression. I want to match each section (the number and its text, as 2 groups). The text can be multi-line, and each section ends when another section starts (another number), when .END is reached, or at EOF. Demo expression: \(\d{1,3}\) ([\s\S]*?)(\.END|\(\d{1,3}\)) Input text: (1) some text some text some text some text some text some text (2) some text some textsome text (3) some textsome text some textsome textsome text (4) some text .END The first group should match the number (with brackets) and the second group should match the corresponding text.
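
    The demo expression consumes the next section's number as part of the current match, so every other section is skipped. A lookahead checks for the terminator without consuming it. A sketch in Python's re module, with the line breaks in the sample text assumed:

```python
import re

text = """(1) some text some text
some text (2) some text
some textsome text (3) some textsome text
(4) some text .END"""

# The lookahead ends a section at the next number, at .END, or at end of input,
# without consuming the terminator.
section = re.compile(r"(\(\d{1,3}\))\s*([\s\S]*?)\s*(?=\(\d{1,3}\)|\.END|\Z)")

sections = section.findall(text)
for number, body in sections:
    print(number, repr(body))
```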


  • best REGEXP friendly Text Editors + most powerful REGEXP syntax?

    - by John
    I am fluent with Microsoft Visual 2005 regular expressions and they are a big time saver. I seem to learn them best by having a vaguely organized cheat sheet thrown at me, at which point I read just a little and play with them until I understand what's going on. That learning approach has worked well for me, for now. I would really like to take this to the next level though. Basically -- What is the REGEXP convention that is generally regarded as the most open-ended and powerful? VS2005 Regexps seem kind of gimped, so maybe I'm a kid playing in a sandbox. Are there text editors out there that can perform a highlight all matches, list lines containing string, or some kind of powerful function like that in conjunction with the very strongest REGEXP language? If not I can just use multiple programs and a weird technique but I'd like to avoid that. I wonder if a stronger REGEXP language or a "stronger" regEXP writer might be able to have his search match all results on all lines even by clicking a "find next" by adding some simple criteria to the search. Anyway, please provide advice!


  • What is a good CPU/PC setup to speed up intensive C++/templates compilation?

    - by ApplePieIsGood
    I currently have a machine with an Opteron 275 (2.2Ghz), which is a dual core CPU, and 4GB of RAM, along with a very fast hard drive. I find that when compiling even somewhat simple projects that use C++ templates (think boost, etc.), my compile times can take quite a while (minutes for small things, much longer for bigger projects). Unfortunately only one of the cores is pegged at 100%, so I know it's not the I/O, and it would seem that there is no way to take advantage of the other core for C++ compilation?


  • Writing a PHP web crawler using cron

    - by Horse
    Hi all, I have written myself a web crawler using simplehtmldom, and have got the crawl process working quite nicely. It crawls the start page, adds all links into a database table, sets a session pointer, and meta-refreshes the page to carry on to the next page. That keeps going until it runs out of links. That works fine; however, the crawl time for larger websites is obviously pretty tedious. I want to be able to speed things up a bit, and possibly make it a cron job. Any ideas on making it as quick and efficient as possible, other than setting the memory limit / execution time higher?


  • string substitution regular expression not working in tcl

    - by Puneet Mittal
    I am trying to replace all the special characters in a Tcl string variable, including whitespace, hyphens, etc., with underscores. I wrote the code below, but it doesn't seem to be working. set varname $origVar puts "Variable Name :>> $varname" if {$varname != ""} { regsub -all {[\s-\]\[$^?+*()|\\%&#]} $varname "_" $newVar } puts "New Variable :>> $newVar" One issue is that, instead of replacing the string in $varname, it is replacing the data inside $origVar. I have no idea why. I also read the example code (for proper syntax) in my Tcl book, and according to that it should be something like this: regsub -all {[\s-][$^?+*()|\\%&#]} $varname "_" newVar So I used the same syntax, but it didn't work and gave the same result, modifying $origVar instead of the required $varname value.
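
    On the Tcl side, note that regsub's last argument is the name of the variable that receives the result, so it is written newVar rather than $newVar. For the character class itself, the sketch below shows the intended substitution in Python's re module (not Tcl), with the hyphen and both brackets escaped so they are read as literals. sanitize is a hypothetical helper name.

```python
import re

# Hyphen and brackets are escaped so they are literals, not a range or class delimiters.
special = re.compile(r"[\s\-\[\]$^?+*()|\\%&#]")

def sanitize(name):
    """Replace each special character in name with an underscore."""
    return special.sub("_", name)

print(sanitize("my var-name[0]"))  # my_var_name_0_
```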


  • Regular expression match, extracting only wanted segments of string

    - by Ben Carey
    I am trying to extract three segments from a string. As I am not particularly good with regular expressions, I think what I have done could probably be done better... I would like to extract the bold parts of the following string: SOMETEXT: ANYTHING_HERE (Old=ANYTHING_HERE, New=ANYTHING_HERE) Some examples could be: ABC: Some_Field (Old=,New=123) ABC: Some_Field (Old=ABCde,New=1234) ABC: Some_Field (Old=Hello World,New=Bye Bye World) So the above would return the following matches: $matches[0] = 'Some_Field'; $matches[1] = ''; $matches[2] = '123'; So far I have the following code: preg_match_all('/^([a-z]*\:(\s?)+)(.+)(\s?)+\(old=(.+)\,(\s?)+new=(.+)\)/i',$string,$matches); The issue with the above is that it returns a match for each separate segment of the string. I do not know how to ensure the string is the correct format using a regular expression without catching and storing the match if that makes sense? So, my question, if not already clear, how I can retrieve just the segments that I want from the above string?
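
    Parentheses are only needed around the parts that should be returned. Everything else can be matched without a group, or with a non-capturing `(?:...)` group where grouping is required. A sketch in Python's re module, where parse is a hypothetical helper name:

```python
import re

# Only the three wanted parts are captured; the rest is matched but not stored.
change = re.compile(r"^[a-z]*:\s*(.+?)\s*\(old=(.*?),\s*new=(.*)\)$", re.IGNORECASE)

def parse(line):
    """Return (field, old, new) for a matching line, or None."""
    m = change.match(line)
    return m.groups() if m else None

print(parse("ABC: Some_Field (Old=,New=123)"))
print(parse("ABC: Some_Field (Old=Hello World,New=Bye Bye World)"))
```

    In PHP the full match still occupies index 0 of $matches, so the three segments would land at indexes 1 to 3 rather than 0 to 2.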


  • preg_replace only replaces first occurrence then skips to next line

    - by Dom
    I've got a problem where preg_replace only replaces the first match it finds, then jumps to the next line and skips the remaining parts on the same line that I also want replaced. I read a CSS file that sometimes has multiple "url(media/pic.gif)" occurrences on one line and replace "media/pic.gif" (the file is then saved as a copy with the replaced parts). The content of the CSS file is put into the variable $resource_content: $resource_content = preg_replace('#(url\((\'|")?)(.*)((\'|")?\))#i', '${1}'.url::base(FALSE).'${3}'.'${4}', $resource_content); Does anyone know why it only replaces the first match per line, and how to fix it?
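
    The likely cause is the greedy `(.*)`: it runs to the last ")" on the line, so all the url(...) occurrences on that line are swallowed into a single match. Restricting the middle group to characters that cannot be a quote or a closing parenthesis keeps each match separate. A sketch in Python's re module, where base is a hypothetical stand-in for url::base(FALSE) and the CSS is made up:

```python
import re

css = "a{background:url(media/a.gif)} b{background:url('media/b.gif')}"
base = "/site/"  # hypothetical stand-in for url::base(FALSE)

# [^'")]* cannot run past the closing quote or parenthesis, unlike a greedy .*
url_ref = re.compile(r"""(url\(['"]?)([^'")]*)(['"]?\))""", re.IGNORECASE)

result = url_ref.sub(lambda m: m.group(1) + base + m.group(2) + m.group(3), css)
print(result)
```

    In the PHP pattern, a lazy `(.*?)` or the same negated class in place of `(.*)` should have a similar effect.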


  • How to get N random string from a {a1|a2|a3} format string?

    - by Pentium10
    Take this string as input: string s="planets {Sun|Mercury|Venus|Earth|Mars|Jupiter|Saturn|Uranus|Neptune}" How would I choose randomly N from the set, then join them with comma. The set is defined between {} and options are separated with | pipe. The order is maintained. Some output could be: string output1="planets Sun, Venus"; string output2="planets Neptune"; string output3="planets Earth, Saturn, Uranus, Neptune"; string output4="planets Uranus, Saturn";// bad example, order is not correct Java 1.5
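
    A possible approach: pull the options out of the braces, pick a random set of positions, and sort the positions before joining so the original order is preserved. The sketch below is in Python rather than Java 1.5; pick_in_order is a hypothetical helper, and choosing N uniformly between 1 and the number of options is an assumption.

```python
import random
import re

def pick_in_order(template, rng=random):
    """Replace the {a|b|c} group with a random, order-preserving subset."""
    m = re.search(r"\{([^{}]*)\}", template)
    if m is None:
        return template
    options = m.group(1).split("|")
    n = rng.randint(1, len(options))  # how many options to keep
    # Sampling indices and sorting them preserves the original order.
    keep = sorted(rng.sample(range(len(options)), n))
    chosen = ", ".join(options[i] for i in keep)
    return template[:m.start()] + chosen + template[m.end():]

s = "planets {Sun|Mercury|Venus|Earth|Mars|Jupiter|Saturn|Uranus|Neptune}"
print(pick_in_order(s))
```

    In Java the same idea could be built from String.split, Collections.shuffle on a list of indices, and a sort of the first N.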


  • Problem with regular expression for some special pattern.

    - by SpawnCxy
    Hi all, I ran into a problem when I tried to find some characters with the following code: preg_match_all('/[\w\uFF10-\uFF19\uFF21-\uFF3A\uFF41-\uFF5A]/',$str,$match); //line 5 print_r($match); I got the error below: Warning: preg_match_all() [function.preg-match-all]: Compilation failed: PCRE does not support \L, \l, \N, \U, or \u at offset 4 in E:\mycake\app\webroot\re.php on line 5 I'm not very familiar with regular expressions and have no idea what this error means. How can I fix this? Thanks.

