Search Results

Search found 10005 results on 401 pages for 'regex trouble'.

Page 119/401 | < Previous Page | 115 116 117 118 119 120 121 122 123 124 125 126  | Next Page >

  • regular expression code

    - by Gaia Andreoletti
    Deal all, I need to find match between two tab delimited files files like this: File 1: ID1 1 65383896 65383896 G C PCNXL3 ID1 2 56788990 55678900 T A ACT1 ID1 1 56788990 55678900 T A PRO55 File 2 ID2 34 65383896 65383896 G C MET5 ID2 2 56788990 55678900 T A ACT1 ID2 2 56788990 55678900 T A HLA what I would like to do is to retrive the matching line between the two file. What I would like to match is everyting after the gene ID So far I have written this code but unfortunately perl keeps giving me the error: use of "Use of uninitialized value in pattern match (m//)" Could you please help me figure out where i am doing it wrong? Thank you in advance! use strict; open (INA, $ARGV[0]) || die "cannot to open gene file"; open (INB, $ARGV[1]) || die "cannot to open coding_annotated.var files"; my @sample1 = <INA>; my @sample2 = <INB>; foreach my $line (@sample1) { my @tab = split (/\t/, $line); my $chr = $tab[1]; my $start = $tab[2]; my $end = $tab[3]; my $ref = $tab[4]; my $alt = $tab[5]; my $name = $tab[6]; foreach my $item (@sample2){ my @fields = split (/\t/,$item); if ($fields[1]=~ m/$chr(.*)/ && $fields[2]=~ m/$start(.*)/ && $fields[4]=~ m/$ref(.*)/ && $fields[5]=~ m/$alt(.*)/&& $fields[6]=~ m/$name(.*)/){ print $line,"\n",$item; } } }

    Read the article

  • Issue with my regular expression?

    - by Rubans
    I'm trying to locate the number matches in a relative path for directory up references("..\"). So I have the following pattern : "(..\)" which works as expected for the path "....\a\b" where it will give me 2 successfull groups ("..\") but when I try the path "..\a\b" it will also return 2 when it should be 1. I tried this in a reg ex tool such Expresso and it seems to work as expected in there but not in in .net, any ideas?

    Read the article

  • Java String.split

    - by user903772
    I have the following text: ARIYALUR:ARIYALUR|CHENNAI:CHENNAI|COIMBATORE:COIMBATORE|CUDDALORE:CUDDALORE|DINDIGUL:DINDIGUL|ERODE:ERODE|KANCHEEPURAM:KANCHEEPURAM|KANYAKUMARI:KANYAKUMARI|KRISHNAGIRI:KRISHNAGIRI|MADURAI:MADURAI|NAMAKKAL:NAMAKKAL|NILGIRIS:NILGIRIS|PERAMBALUR:PERAMBALUR|PONDICHERRY:PONDICHERRY|SALEM:SALEM|THANJAVUR:THANJAVUR|THENI:THENI|THIRUVALLUR:THIRUVALLUR|THOOTHUKUDI:THOOTHUKUDI|TIRUNELVELI:TIRUNELVELI|VELLORE:VELLORE|VILLUPURAM:VILLUPURAM|VIRUDHUNAGAR:VIRUDHUNAGAR| I tried to do a split("|") but my array is made up of alphabets and not each district. Please help.

    Read the article

  • Modify a reference numbered group to match against

    - by StuperUser
    I want to match YYYY-YY for sequential years. I at moment I'm trying to match where all the second YY is the 3rd and 4th characters in YYYY with 1 added to it. So far I've got {19|20}(\d{2})-(\d{2}), but not sure how to use ? with reference to (1) or whether I'm going about this the right way and finding out the inevitable "unknown unknowns" (like YY99) with this approach? Edit: Matches: 2010-11,2011-12,2029-30 Not matches: 2010-12, 2010-09,2011-2,2011-2012

    Read the article

  • Redirect visitor with .htaccess

    - by Aaron
    Hi all, I've got an e-shop on a virtual server that's been used as a subdirectory for the last few years, but now I'm finally giving the VS it's own domain name. What I really need is visitors to the old URL to be transparently (and 301) redirected to the new URL with everything after /eshop/ maintained and apended to the new host. I.e. http://www.example.com/eshop/page.php - http://www.newdomain.com/page.php Any help would be greatly appreciated.

    Read the article

  • questions on nfa and dfa..

    - by Loop
    Hi Guys... Hope you help me with this one.... I have a main question which is ''how to judge whether a regular expression will be accepted by NFA and/or DFA? For eg. My question says that which of the regular expressions are equivalent? explain... 1.(a+b)*b(a+b)*b(a+b)* 2.a*ba*ba* 3.a*ba*b(a+b)* do we have to draw the NFA and DFA and then find through minimisation algorithm? if we do then how do we come to know that which regular expression is accepted by NFA/DFA so that we can begin with the answer? its so confusing.... Second is a very similar one, the question asks me to show that the language (a^nb^n|n1} is not accepted by DFA...grrrrr...how do i know this? (BTW this is a set of all strings of where a number of a's is followed by the same number of b's).... I hope I explained clearly well....

    Read the article

  • Python comparing string against several regular expressions

    - by maerics
    I'm pretty experienced with Perl and Ruby but new to Python so I'm hoping someone can show me the Pythonic way to accomplish the following task. I want to compare several lines against multiple regular expressions and retrieve the matching group. In Ruby it would be something like this: STDIN.each_line do |line| case line when /^A:(.*?)$/ then puts "FOO: #{$1}" when /^B:(.*?)$/ then puts "BAR: #{$1}" # when ... else puts "NO MATCH: #{line}" end end My attempts in Python are turning out pretty ugly because the matching group is returned from a call to match/search on a regular expression and Python has no assignment in conditionals or switch statements. What's the Pythonic way to do (or think!) about this problem?

    Read the article

  • Remove newlines and spaces

    - by Cosmin
    How can I remove newline between <table> .... </table> and add \n after each ex: <table border="0" cellspacing="0" cellpadding="0" width="450" class="descriptiontable"><tr> <td width="50%" valign="top"> <span class="displayb">Model Procesor:</span> Intel Celeron<br><span class="displayb">Frecventa procesor (MHz):</span> 2660<br><span class="displayb">Placa Video:</span> Intel Extreme Graphics 2<br><span class="displayb">Retea integrata:</span> 10/100Mbps, RJ-45<br><span class="displayb">Chipset:</span> Intel 845G<br> </td> <td width="50%" valign="top"> <span class="displayb">Capacitate RAM (MB):</span> 512<br><span class="displayb">Tip RAM:</span> DDR<br> </td> </tr></table> and become : <table border="0" cellspacing="0" cellpadding="0" width="450" class="descriptiontable"><tr><td width="50%" valign="top"><span class="displayb">Model Procesor:</span> Intel Celeron<br><span class="displayb">Frecventa procesor (MHz):</span> 2660<br><span class="displayb">Placa Video:</span> Intel Extreme Graphics 2<br><span class="displayb">Retea integrata:</span> 10/100Mbps, RJ-45<br><span class="displayb">Chipset:</span> Intel 845G<br></td><td width="50%" valign="top"><span class="displayb">Capacitate RAM (MB):</span> 512<br><span class="displayb">Tip RAM:</span> DDR<br></td></tr></table>\n s.

    Read the article

  • perl ENV value avoid escape

    - by Michael
    In my makefile I have command in variable like this substitute := perl -p -e 's/@([^@]+)@/"$(update_url)"/ge' > output.txt update_url := em:updateURL=\"http:\/\/bla\/update.rdf\"\n this works fine when I run command in target and I have newline, quotes however I need to replace $(update_url)" with environment variable, using expression like this #substitute := perl -p -e 's/@([^@]+)@/defined $$ENV{$$1} ? $$ENV{$$1} : $$1/ge' I am exporting those variables from makefile. This gives me literally em:updateURL=\"http:\/\/bla\/update.rdf\"\n on output file... so how to make the second version to give output like first version?

    Read the article

  • A "smart" (forgiving) date parser?

    - by jdmuys
    I have to migrate a very large dataset from one system to another. One of the "source" column contains a date but is really a string with no constraint, while the destination system mandates a date in the format yyyy-mm-dd. Many, but not all, of the source dates are formatted as yyyymmdd. So to coerce them to the expected format, I do (in Perl): return "$1-$2-$3" if ($val =~ /(\d{4})[-\/]*(\d{2})[-\/]*(\d{2})/); The problem arises when the source dates moves away from the "generic" yyyymmdd. The goal is to salvage as many dates as possible, before giving up. Example source strings include: 21/3/1998, March 2004, 2001, 3/4/97 I can try to match as many of the examples I can find with a succession of regular expressions such as the one above. But is there something smarter to do? Am I not reinventing the wheel? Is there a library somewhere doing something similar? I couldn't find anything relevant googling "forgiving date parser". (any language is OK).

    Read the article

  • Regular Expression Question

    - by zyq524
    I'm trying to use regular expression to extract the comments in the heading of a file. For example, the source code may look like: //This is an example file. //Please help me. #include "test.h" int main() //main function { ... } What I want to extract from the code are the first two lines, i.e. //This is an example file. //Please help me. Any idea?

    Read the article

  • What is the RFC complicant and working regular expression to check if a string is a valid URL

    - by bestis
    There is question by the almost the same name already: What is the best regular expression to check if a string is a valid URL I don't understand this stackoverflow. It seems like I need reputation to comment an answer. As I don't have it, I don't know how to tell/ask that the proposed solution doesn't seem to work. So I'm forced to make a new question and ask for the solution this way? But that regexp seems to fail in input which has IPv6 address in it: For example facebook's IPv6 address: http://2620:0:1cfe:face:b00c::3/ Also link to localhost fails: http://::1/ Or is PHP to blame? /** * Validate URL - RFC 3987 (IRI) * * http://stackoverflow.com/questions/161738/what-is-the-best-regular-expression-to-check-if-a-string-is-a-valid-url * * @param string $str_url * @return boolean */ function is_url($str_url) { // RFC 3987 For absolute IRIs (internationalized): // @todo FIXME - Has bugs in IPv6 (http://2620:0:1cfe:face:b00c::3/) fails return (bool) preg_match('/^[a-z](?:[-a-z0-9\+\.])*:(?:\/\/(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=:])*@)?(?:\[(?:(?:(?:[0-9a-f]{1,4}:){6}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|::(?:[0-9a-f]{1,4}:){5}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:[0-9a-f]{1,4})?::(?:[0-9a-f]{1,4}:){4}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:[0-9a-f]{1,4}:[0-9a-f]{1,4})?::(?:[0-9a-f]{1,4}:){3}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:(?:[0-9a-f]{1,4}:){0,2}[0-9a-f]{1,4})?::(?:[0-9a-f]{1,4}:){2}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:(?:[0-9a-f]{1,4}:){0,3}[0-9a-f]{1,4})?::[0-9a-f]{1,4}:(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:(?:[0-9a-f]{1,4}:){0,4}[0-9a-f]{1,4})?::(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:(?:[0-9a-f]{1,4}:){0,5}[0-9a-f]{1,4})?::[0-9a-f]{1,4}|(?:(?:[0-9a-f]{1,4}:){0,6}[0-9a-f]{1,4})?::)|v[0-9a-f]+[-a-z0-9\._~!\$&\'\(\)\*\+,;=:]+)\]|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3}|(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=@])*)(?::[0-9]*)?(?:\/(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=:@]))*)*|\/(?:(?:(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=:@]))+)(?:\/(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=:@]))*)*)?|(?:(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=:@]))+)(?:\/(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=:@]))*)*|(?!(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=:@])))(?:\?(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=:@])|[\x{E000}-\x{F8FF}\x{F0000}-\x{FFFFD}|\x{100000}-\x{10FFFD}\/\?])*)?(?:\#(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=:@])|[\/\?])*)?$/iu',$str_url); } Here is the test for it: $urls=array('http://www.example.org/','http://www.example.org:80/','example.org','ftp://user:[email protected]/','http://example.org/?cat=5&test=joo','http://www.fi/?cat=5&amp;test=joo','http://::1/','http://2620:0:1cfe:face:b00c::3/','http://2620:0:1cfe:face:b00c::3:80/'); foreach ($urls as $a) { echo $a."\n"; $a=is_url($a); var_dump($a); } And that outputs: > `http://www.example.org/` bool(true) > `http://www.example.org:80/` bool(true) > example.org bool(false) > `ftp://user:[email protected]/` > bool(true) > `http://example.org/?cat=5&test=joo` > bool(true) > `http://www.fi/?cat=5&amp;test=joo` > bool(true) `http://::1/` bool(false) > `http://2620:0:1cfe:face:b00c::3/` > bool(false) > `http://2620:0:1cfe:face:b00c::3:80/` > bool(false) And it also seems that stackoverflow's code is miss behaving on those :) So what is the RFC compilicant and working regexp? ps. If you close this, please then tell me how this situation should be handled? I don't think that the answer is, just earn your reputation. Who wants to do that if they cannot even tell that some proposed solution isn't working correctly. pps. "we're sorry, but as a spam prevention mechanism, new users can only post a maximum of one hyperlink. Earn more than 10 reputation to post more hyperlinks.". Oh C'mon, I'm fine with plain text :D

    Read the article

  • Use Regular expression with fileinput

    - by chrissygormley
    Hello, I am trying to replace a variable stored in another file using regular expression. The code I have tried is: r = re.compile(r"self\.uid\s*=\s*('\w{12})'") for line in fileinput.input(['file.py'], inplace=True): print line.replace(r.match(line), sys.argv[1]), The format of the variable in the file is: self.uid = '027FC8EBC2D1' I am trying to pass in a parameter in this format and use regular expression to verify that the sys.argv[1] is correct format and to find the variable stored in this file and replace it with the new variable. Can anyone help. Thanks for the help.

    Read the article

  • preg_replace function to append a string to all the hyperlinks of a page

    - by KoolKabin
    hi guys, i want to append my own value to all hyperlinks in a page... e.g if there are links: <a href="abc.htm?val=1">abc 1</a> <br/> <a href="abc.htm?val=2">abc 1</a> <br/> <a href="abc.htm?val=3">abc 1</a> <br/> <a href="abc.htm?val=4">abc 1</a> <br/> I want to add next var like "type=int" to all hyperlinks output should be: <a href="abc.htm?val=1&type=int">abc 1</a> <br/> <a href="abc.htm?val=2&type=int">abc 1</a> <br/> <a href="abc.htm?val=3&type=int">abc 1</a> <br/> <a href="abc.htm?val=4&type=int">abc 1</a> <br/> I hope it can be done quite easily with preg_replace function

    Read the article

  • How to use regular expressions to pull a substring? (screen scraping)

    - by Diego
    Hey guys, i'm really trying to understand regular expressions while scraping a site, i've been using it in my code enough to pull the following, but am stuck here. I need to quickly grab this: http://www.example.com/online/store/TitleDetail?detail&sku=123456789 from this: ('<a href="javascript:if(handleDoubleClick(this.id)){window.location=\'http://www.example.com/online/store/TitleDetail?detail&sku=123456789\';}" id="getTitleDetails_123456789">\r\n\t\t\t \tcheck store inventory\r\n\t\t\t </a>', 1) This is where I got confused. any ideas?

    Read the article

  • How do you validate a URL with a regular expression in Python?

    - by Zachary Spencer
    I'm building a Google App Engine app, and I have a class to represent an RSS Feed. I have a method called setUrl which is part of the feed class. It accepts a url as an input. I'm trying to use the re python module to validate off of the RFC 3986 Reg-ex (http://www.ietf.org/rfc/rfc3986.txt) Below is a snipped which should work, right? I'm incredibly new to Python and have been beating my head against this for the past 3 days. p = re.compile('^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?') m = p.match(url) if m: self.url = url return url

    Read the article

< Previous Page | 115 116 117 118 119 120 121 122 123 124 125 126  | Next Page >