Search Results

Search found 3956 results on 159 pages for 'regex cookbook'.

Page 99/159 | < Previous Page | 95 96 97 98 99 100 101 102 103 104 105 106  | Next Page >

  • Remove newlines and spaces

    - by Cosmin
    How can I remove newline between <table> .... </table> and add \n after each ex: <table border="0" cellspacing="0" cellpadding="0" width="450" class="descriptiontable"><tr> <td width="50%" valign="top"> <span class="displayb">Model Procesor:</span> Intel Celeron<br><span class="displayb">Frecventa procesor (MHz):</span> 2660<br><span class="displayb">Placa Video:</span> Intel Extreme Graphics 2<br><span class="displayb">Retea integrata:</span> 10/100Mbps, RJ-45<br><span class="displayb">Chipset:</span> Intel 845G<br> </td> <td width="50%" valign="top"> <span class="displayb">Capacitate RAM (MB):</span> 512<br><span class="displayb">Tip RAM:</span> DDR<br> </td> </tr></table> and become : <table border="0" cellspacing="0" cellpadding="0" width="450" class="descriptiontable"><tr><td width="50%" valign="top"><span class="displayb">Model Procesor:</span> Intel Celeron<br><span class="displayb">Frecventa procesor (MHz):</span> 2660<br><span class="displayb">Placa Video:</span> Intel Extreme Graphics 2<br><span class="displayb">Retea integrata:</span> 10/100Mbps, RJ-45<br><span class="displayb">Chipset:</span> Intel 845G<br></td><td width="50%" valign="top"><span class="displayb">Capacitate RAM (MB):</span> 512<br><span class="displayb">Tip RAM:</span> DDR<br></td></tr></table>\n s.

    Read the article

  • How Do I Remove The First 4 Characters From A String If It Matches A Pattern In Ruby

    - by James
    I have the following string: "h3. My Title Goes Here" I basically want to remove the first 4 characters from the string so that I just get back: "My Title Goes Here". The thing is I am iterating over an array of strings and not all have the h3. part in front so I can't just ditch the first 4 characters blindly. I have checked the docs and the closest think I could find was chomp, but that only works for the end of a string. Right now I am doing this: "h3. My Title Goes Here".reverse.chomp(" .3h").reverse This gives me my desired output, but there has to be a better way right? I mean I don't want to reverse a string twice for no reason. I am new to programming so I might have missed something obvious, but I didn't see the opposite of chomp anywhere in the docs. Is there another method that will work? Thanks!

    Read the article

  • Django - urls.py - Filenames with a hash/pound (#) sign?

    - by miya
    I'm using django and realized that when the filename that the user wants to access (let's say a photo) has the pound sign, the entry in the url.py does not match. Any ideas? url(r'^static/(?P<path>.*)$', 'django.views.static.serve', {'document_root': MEDIA_ROOT}, it just says: "/home/user/project/static/upload/images/hello" does not exist when actually the name of the file is: hello#world.jpg Thanks, Nico

    Read the article

  • Find and Replace with Notepad++

    - by Levi
    I have a document that was converted from PDF to HTML for use on a company website to be referenced and indexed for search. I'm attempting to format the converted document to meet my needs and in doing so I am attempting to clean up some of the junk that was pulled over from when it was a PDF such as page numbers, headers, and footers. luckily all of these lines that need to be removed are in blocks of 4 lines unfortunately they are not exactly the same therefore cannot be removed with a simple literal replace. The lines contain numbers which are incremental as they correlate with the pages. How can I remove the following example from my html file. Title<br> 10<br> <hr> <A name=11></a>Footer<br> I've tried many different regular expression attempts but as my skill in that area is limited I can't find the proper syntax. I'm sure i'm missing something fairly easy as it would seem all I need is a wildcard replace for the two numbers in the code and the rest is literal. any help is apprciated

    Read the article

  • Regular expression for pipe delimited and double quoted string

    - by Hiren Amin
    I have a string something like this: "2014-01-23 09:13:45|\"10002112|TR0859657|25-DEC-2013>0000000000000001\"|10002112" I would like to split by pipe apart from anything wrapped in double quotes so I have something like (similar to how csv is done): [0] => 2014-01-23 09:13:45 [1] => 10002112|TR0859657|25-DEC-2013>0000000000000001 [2] => 10002112 I would like to know if there is a regular expression that can do this?

    Read the article

  • php - get content from second pair of quotes in string

    - by Aaron Turecki
    I'm trying to get the contents of the second quotes and only the second quotes from a string. Right now I'm able to get the contents of all three quotes. What am I doing wrong? Is it possible to just print the second value in the output array? Text 2014-06-02 11:48:41.519 -0700 Information 94 NICOLE Client "[WebDirect] (207.230.229.204) [207.230.229.204]" opening database "FMServer_Sample" as "Admin". PHP if (preg_match_all('~(["\'])([^"\']+)\1~', $line, $matches)) $database_names = $matches[2]; print_r($database); Output [WebDirect] (207.230.229.204) [207.230.229.204], FMServer_Sample, Admin

    Read the article

  • Delete all characters in a multline string up to a given pattern

    - by biffabacon
    Using Python I need to delete all charaters in a multiline string up to the first occurrence of a given pattern. In Perl this can be done using regular expressions with something like: #remove all chars up to first occurrence of cat or dog or rat $pattern = 'cat|dog|rat' $pagetext =~ s/(.*?)($pattern)/$2/xms; What's the best way to do it in Python?

    Read the article

  • How to detect identical part(s) inside string?

    - by Horace Ho
    I try to break down the http://stackoverflow.com/questions/2711961/decoding-algorithm-wanted question into smaller questions. This is Part I. Question: two strings: s1 and s2 part of s1 is identical to part of s2 space is separator how to extract the identical part(s)? example 1: s1 = "12 November 2010 - 1 visitor" s2 = "6 July 2010 - 100 visitors" the identical parts are "2010", "-", "1" and "visitor" example 2: s1 = "Welcome, John!" s2 = "Welcome, Peter!" the identical parts are "Welcome," and "!" Python and Ruby preferred. Thanks

    Read the article

  • regular expression code

    - by Gaia Andreoletti
    Deal all, I need to find match between two tab delimited files files like this: File 1: ID1 1 65383896 65383896 G C PCNXL3 ID1 2 56788990 55678900 T A ACT1 ID1 1 56788990 55678900 T A PRO55 File 2 ID2 34 65383896 65383896 G C MET5 ID2 2 56788990 55678900 T A ACT1 ID2 2 56788990 55678900 T A HLA what I would like to do is to retrive the matching line between the two file. What I would like to match is everyting after the gene ID So far I have written this code but unfortunately perl keeps giving me the error: use of "Use of uninitialized value in pattern match (m//)" Could you please help me figure out where i am doing it wrong? Thank you in advance! use strict; open (INA, $ARGV[0]) || die "cannot to open gene file"; open (INB, $ARGV[1]) || die "cannot to open coding_annotated.var files"; my @sample1 = <INA>; my @sample2 = <INB>; foreach my $line (@sample1) { my @tab = split (/\t/, $line); my $chr = $tab[1]; my $start = $tab[2]; my $end = $tab[3]; my $ref = $tab[4]; my $alt = $tab[5]; my $name = $tab[6]; foreach my $item (@sample2){ my @fields = split (/\t/,$item); if ($fields[1]=~ m/$chr(.*)/ && $fields[2]=~ m/$start(.*)/ && $fields[4]=~ m/$ref(.*)/ && $fields[5]=~ m/$alt(.*)/&& $fields[6]=~ m/$name(.*)/){ print $line,"\n",$item; } } }

    Read the article

  • Is there a way to get the PREMATCH ($`) and POSTMATCH ($') from pcrecpp?

    - by Eric Peers
    Is there a way to obtain the C++ equivalent of Perl's PREMATCH ($`) and POSTMATCH ($') from pcrecpp? I would be happy with a string, a char *, or pairs indices/startpos+length that point at this. StringPiece seems like it might accomplish part of this, but I'm not certain how to get it. in perl: $_ = "Hello world"; if (/lo\s/) { $pre = $`; #should be "Hel" $post = $'; #should be "world" } in C++ I would have something like: string mystr = "Hello world"; //do I need to map this in a StringPiece? if (pcrecpp::RE("lo\s").PartialMatch(mystr)) { //should I use Consume or FindAndConsume? //What should I do here to get pre+post matches??? } pcre plainjane c seems to have the ability to return the vector with the matches including the "end" portion of the string, so I could theoretically extract such a pre/post variable, but that seems like a lot of work. I like the simplicty of the pcrecpp interface. Suggestions? Thanks! --Eric

    Read the article

  • Split string on non-alphanumerics in PHP? Is it possible with php's native function?

    - by Jehanzeb.Malik
    I was trying to split a string on non-alphanumeric characters or simple put I want to split words. The approach that immediately came to my mind is to use regular expressions. Example: $string = 'php_php-php php'; $splitArr = preg_split('/[^a-z0-9]/i', $string); But there are two problems that I see with this approach. It is not a native php function, and is totally dependent on the PCRE Library running on server. An equally important problem is that what if I have punctuation in a word Example: $string = 'U.S.A-men's-vote'; $splitArr = preg_split('/[^a-z0-9]/i', $string); Now this will spilt the string as [{U}{S}{A}{men}{s}{vote}] But I want it as [{U.S.A}{men's}{vote}] So my question is that: How can we split them according to words? Is there a possibility to do it with php native function or in some other way where we are not dependent? Regards

    Read the article

  • Which is more efficient regular expression?

    - by Vagnerr
    I'm parsing some big log files and have some very simple string matches for example if(m/Some String Pattern/o){ #Do something } It seems simple enough but in fact most of the matches I have could be against the start of the line, but the match would be "longer" for example if(m/^Initial static string that matches Some String Pattern/o){ #Do something } Obviously this is a longer regular expression and so more work to match. However I can use the start of line anchor which would allow an expression to be discarded as a failed match sooner. It is my hunch that the latter would be more efficient. Can any one back me up/shoot me down :-)

    Read the article

  • Optimizing python link matching regular expression

    - by Matt
    I have a regular expression, links = re.compile('<a(.+?)href=(?:"|\')?((?:https?://|/)[^\'"]+)(?:"|\')?(.*?)>(.+?)</a>',re.I).findall(data) to find links in some html, it is taking a long time on certain html, any optimization advice? One that it chokes on is http://freeyourmindonline.net/Blog/

    Read the article

  • How do you validate a URL with a regular expression in Python?

    - by Zachary Spencer
    I'm building a Google App Engine app, and I have a class to represent an RSS Feed. I have a method called setUrl which is part of the feed class. It accepts a url as an input. I'm trying to use the re python module to validate off of the RFC 3986 Reg-ex (http://www.ietf.org/rfc/rfc3986.txt) Below is a snipped which should work, right? I'm incredibly new to Python and have been beating my head against this for the past 3 days. p = re.compile('^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?') m = p.match(url) if m: self.url = url return url

    Read the article

  • Match subpatterns in any order

    - by Yaroslav
    I have long regexp with two complicated subpatters inside. How i can match that subpatterns in any order? Simplified example: /(apple)?\s?(banana)?\s?(orange)?\s?(kiwi)?/ and i want to match both of apple banana orange kiwi apple orange banana kiwi It is very simplified example. In my case banana and orange is long complicated subpatterns and i don't want to do something like /(apple)?\s?((banana)?\s?(orange)?|(orange)?\s?(banana)?)\s?(kiwi)?/ Is it possible to group subpatterns like chars in character class? UPD Real data as requested: 14:24 26,37 Mb 108.53 01:19:02 06.07 24.39 19:39 46:00 my strings much longer, but it is significant part. Here you can see two lines what i need to match. First has two values: length (14 min 24 sec) and size 26.37 Mb. Second one has three values but in different order: size 108.53 Mb, length 01 h 19 m 02 s and date June, 07 Third one has two size and length Fourth has only length There are couple more variations and i need to parse all values. I have a regexp that pretty close except i can't figure out how to match patterns in different order without writing it twice. (?<size>\d{1,3}\[.,]\d{1,2}\s+(?:Mb)?)?\s? (?<length>(?:(?:01:)?\d{1,2}:\d{2}))?\s* (?<date>\d{2}\.\d{2}))? NOTE: that is only part of big regexp that forks fine already.

    Read the article

  • What is the RFC complicant and working regular expression to check if a string is a valid URL

    - by bestis
    There is question by the almost the same name already: What is the best regular expression to check if a string is a valid URL I don't understand this stackoverflow. It seems like I need reputation to comment an answer. As I don't have it, I don't know how to tell/ask that the proposed solution doesn't seem to work. So I'm forced to make a new question and ask for the solution this way? But that regexp seems to fail in input which has IPv6 address in it: For example facebook's IPv6 address: http://2620:0:1cfe:face:b00c::3/ Also link to localhost fails: http://::1/ Or is PHP to blame? /** * Validate URL - RFC 3987 (IRI) * * http://stackoverflow.com/questions/161738/what-is-the-best-regular-expression-to-check-if-a-string-is-a-valid-url * * @param string $str_url * @return boolean */ function is_url($str_url) { // RFC 3987 For absolute IRIs (internationalized): // @todo FIXME - Has bugs in IPv6 (http://2620:0:1cfe:face:b00c::3/) fails return (bool) preg_match('/^[a-z](?:[-a-z0-9\+\.])*:(?:\/\/(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=:])*@)?(?:\[(?:(?:(?:[0-9a-f]{1,4}:){6}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|::(?:[0-9a-f]{1,4}:){5}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:[0-9a-f]{1,4})?::(?:[0-9a-f]{1,4}:){4}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:[0-9a-f]{1,4}:[0-9a-f]{1,4})?::(?:[0-9a-f]{1,4}:){3}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:(?:[0-9a-f]{1,4}:){0,2}[0-9a-f]{1,4})?::(?:[0-9a-f]{1,4}:){2}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:(?:[0-9a-f]{1,4}:){0,3}[0-9a-f]{1,4})?::[0-9a-f]{1,4}:(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:(?:[0-9a-f]{1,4}:){0,4}[0-9a-f]{1,4})?::(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:(?:[0-9a-f]{1,4}:){0,5}[0-9a-f]{1,4})?::[0-9a-f]{1,4}|(?:(?:[0-9a-f]{1,4}:){0,6}[0-9a-f]{1,4})?::)|v[0-9a-f]+[-a-z0-9\._~!\$&\'\(\)\*\+,;=:]+)\]|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3}|(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=@])*)(?::[0-9]*)?(?:\/(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=:@]))*)*|\/(?:(?:(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=:@]))+)(?:\/(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=:@]))*)*)?|(?:(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=:@]))+)(?:\/(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=:@]))*)*|(?!(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=:@])))(?:\?(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=:@])|[\x{E000}-\x{F8FF}\x{F0000}-\x{FFFFD}|\x{100000}-\x{10FFFD}\/\?])*)?(?:\#(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=:@])|[\/\?])*)?$/iu',$str_url); } Here is the test for it: $urls=array('http://www.example.org/','http://www.example.org:80/','example.org','ftp://user:[email protected]/','http://example.org/?cat=5&test=joo','http://www.fi/?cat=5&amp;test=joo','http://::1/','http://2620:0:1cfe:face:b00c::3/','http://2620:0:1cfe:face:b00c::3:80/'); foreach ($urls as $a) { echo $a."\n"; $a=is_url($a); var_dump($a); } And that outputs: > `http://www.example.org/` bool(true) > `http://www.example.org:80/` bool(true) > example.org bool(false) > `ftp://user:[email protected]/` > bool(true) > `http://example.org/?cat=5&test=joo` > bool(true) > `http://www.fi/?cat=5&amp;test=joo` > bool(true) `http://::1/` bool(false) > `http://2620:0:1cfe:face:b00c::3/` > bool(false) > `http://2620:0:1cfe:face:b00c::3:80/` > bool(false) And it also seems that stackoverflow's code is miss behaving on those :) So what is the RFC compilicant and working regexp? ps. If you close this, please then tell me how this situation should be handled? I don't think that the answer is, just earn your reputation. Who wants to do that if they cannot even tell that some proposed solution isn't working correctly. pps. "we're sorry, but as a spam prevention mechanism, new users can only post a maximum of one hyperlink. Earn more than 10 reputation to post more hyperlinks.". Oh C'mon, I'm fine with plain text :D

    Read the article

  • Are .NET's regular expressions Turing complete?

    - by Robert
    Regular expressions are often pointed to as the classical example of a language that is not Turning complete. For example "regular expressions" is given in as the answer to this SO question looking for languages that are not Turing complete. In my, perhaps somewhat basic, understanding of the notion of Turning completeness, this means that regular expressions cannot be used check for patterns that are "balanced". Balanced meaning have an equal number of opening characters as closing characters. This is because to do this would require you to have some kind of state, to allow you to match the opening and closing characters. However the .NET implementation of regular expressions introduces the notion of a balanced group. This construct is designed to let you backtrack and see if a previous group was matched. This means that a .NET regular expressions: ^(?<p>a)*(?<-p>b)*(?(p)(?!))$ Could match a pattern that: ab aabb aaabbb aaaabbbb ... etc. ... Does this means .NET's regular expressions are Turing complete? Or are there other things that are missing that would be required for the language to be Turing complete?

    Read the article

< Previous Page | 95 96 97 98 99 100 101 102 103 104 105 106  | Next Page >