Search Results

Search found 3804 results on 153 pages for 'regex'.

Page 102/153 | < Previous Page | 98 99 100 101 102 103 104 105 106 107 108 109  | Next Page >

  • preg_match_all problems

    - by NeoNmaN
    i use preg_match_all and need to grab all a href="" tags in my code, but i not relly understand how to its work. i have this reg. exp. ( /(<([\w]+)[^])(.?)(<\/\2)/ ) its take all html codes, i need only all a href tags. i hobe i can get help :)

    Read the article

  • Convert a complicated string into an array in php

    - by Patrick Beardmore
    I have a php variable that comes from a form that needs tidying up. I hope you can help. The variable contains a list of items (possibly two or three word items with a space in between words). I want to convert it to a comma separated list with no superfluous white space. I want the divisions to fall only at commas, semi-colons or new-lines. Blank cannot be an item. Here's a comprehensive example (with a deliberately messy input): Variable In: "dog, cat ,car,tea pot,, ,,, ;;(++NEW LINE++)fly, cake" Variable Out "dog,cat,car,tea pot,fly,cake" Can anyone help?

    Read the article

  • Regular expressions in python unicode

    - by Remy
    I need to remove all the html tags from a given webpage data. I tried this using regular expressions: import urllib2 import re page = urllib2.urlopen("http://www.frugalrules.com") from bs4 import BeautifulSoup, NavigableString, Comment soup = BeautifulSoup(page) link = soup.find('link', type='application/rss+xml') print link['href'] rss = urllib2.urlopen(link['href']).read() souprss = BeautifulSoup(rss) description_tag = souprss.find_all('description') content_tag = souprss.find_all('content:encoded') print re.sub('<[^>]*>', '', content_tag) But the syntax of the re.sub is: re.sub(pattern, repl, string, count=0) So, I modified the code as (instead of the print statement above): for row in content_tag: print re.sub(ur"<[^>]*>",'',row,re.UNICODE But it gives the following error: Traceback (most recent call last): File "C:\beautifulsoup4-4.3.2\collocation.py", line 20, in <module> print re.sub(ur"<[^>]*>",'',row,re.UNICODE) File "C:\Python27\lib\re.py", line 151, in sub return _compile(pattern, flags).sub(repl, string, count) TypeError: expected string or buffer What am I doing wrong?

    Read the article

  • Dealing with regular expressions, Python

    - by Gusto
    I want to remove some symbols from a string using a regular expression, for example: == (that occur both at the beginning and at the end of a line), * (at the beginning of a line ONLY). def some_func(): clean = re.sub(r'= {2,}', '', clean) #Removes 2 or more occurrences of = at the beg and at the end of a line. clean = re.sub(r'^\* {1,}', '', clean) #Removes 1 or more occurrences of * at the beginning of a line. What's wrong with my code? It seems like expressions are wrong. How do I remove a character/symbol if it's at the beginning or at the end of the line (with one or more occurrences)?

    Read the article

  • Perl Regular expression remove double tabs, line breaks, white spaces

    - by Scoox
    Hi guys, I want to write a perl script that removes double tabs, line breaks and white spaces. What I have so far is: $txt=~s/\r//gs; $txt=~s/ +/ /gs; $txt=~s/\t+/\t/gs; $txt=~s/[\t\n]*\n/\n/gs; $txt=~s/\n+/\n/gs; But, 1. It's not beautiful. Should be possible to do that with far less regexps. 2. It just doesn't work and I really do not know why. It leaves some double tabs, white spaces and empty lines (i.e. lines with only a tab or whitespace) I could solve it with a while, but that is very slow and ugly. Any suggestions?

    Read the article

  • Help with this reg. exp. in PHP

    - by Jonathan
    Hi, i don't know about regular expressions, I asked here for one that: gets either anything up to the first parenthesis/colon or the first word inside the first parenthesis. This was the answer: preg_match('/(?:^[^(:]+|(?<=^\\()[^\\s)]+)/', $var, $match); I need an improvement, I need to get either anything up to the first parenthesis/colon/quotation marks or the first word inside the first parenthesis. So if I have something like: $var = 'story "The Town in Hell"s Backyard'; // I get this: $match = 'story'; $var = "screenplay (based on)"; // I get this: $match = 'screenplay'; $var = "(play)"; // I get this: $match = 'play'; $var = "original screen"; // I get this: $match = 'original screen'; Thanks!

    Read the article

  • Regular Expression Sanitize (PHP)

    - by atif089
    Hello, I would like to sanitize a string in to a URL so this is what I basically need. Everything must be removed except alphanumeric characters and spaces and dashed. Spaces should be converter into dashes. Eg. This, is the URL! must return this-is-the-url Thanks

    Read the article

  • Simple regular expression for decimal numbers?

    - by finch
    I know this may be the simplest question ever asked on Stack Overflow, but what is the regular expression for a decimal with a precision of 2? Valid examples: 123.12 2 56754 92929292929292.12 0.21 3.1 Invalid examples: 12.1232 2.23332 e666.76 Sorry for the lame question, but for the life of me I haven't been able to find anyone that can help! The decimal place may be option, and that integers may also be included.

    Read the article

  • regexp for detect that the url doesn´t end with an extension

    - by devnieL
    Hello. I'm using this regular expression for detect if an url ends with a jpg : var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]*^\.jpg)/ig; it detects the url : e.g. http://www.blabla.com/sdsd.jpg but now i want to detect that the url doesn't ends with an jpg extension, i try with this : var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]*[^\.jpg]\b)/ig; but only get http://www.blabla.com/sdsd then i used this : var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]*[^\.jpg]$)/ig; it works if the url is alone, but dont work if the text is e.g. : http://www.blabla.com/sdsd.jpg text

    Read the article

  • Regexs in Ruby getting filename

    - by user1290757
    i am extracting file names of html files using line: filename = File.basename(input_filename, ".*") which currently prints full file name excluding .html extension All files are stored in the form of http^x.x.edu^1^2 all file names begin with http^ and contain edu^ what i want is to extract 2 (which changes) but it is always the second element after .edu I have attempted destructive gsub! but i m weak with regular expressions.

    Read the article

  • A more elegant way to parse a string with ruby regular expression using variable grouping?

    - by i0n
    At the moment I have a regular expression that looks like this: ^(cat|dog|bird){1}(cat|dog|bird)?(cat|dog|bird)?$ It matches at least 1, and at most 3 instances of a long list of words and makes the matching words for each group available via the corresponding variable. Is there a way to revise this so that I can return the result for each word in the string without specifying the number of groups beforehand? ^(cat|dog|bird)+$ works but only returns the last match separately , because there is only one group.

    Read the article

  • Writing a PHP web crawler using cron

    - by Horse
    Hi all I have written myself a web crawler using simplehtmldom, and have got the crawl process working quite nicely. It crawls the start page, adds all links into a database table, sets a session pointer, and meta refreshes the page to carry onto the next page. That keeps going until it runs out of links That works fine however obviously the crawl time for larger websites is pretty tedious. I wanted to be able to speed things up a bit though, and possibly make it a cron job. Any ideas on making it as quick and efficient as possible other than setting the memory limit / execution time higher?

    Read the article

  • regular expression: extract last 2 characters

    - by dotnet-practitioner
    what is the best way to extract last 2 characters of a string using regular expression. For example, I want to extract state code from the following "A_IL" I want to extract IL as string.. please provide me C# code on how to get it.. string fullexpression = "A_IL"; string StateCode = some regular expression code.... thanks

    Read the article

  • Best way to correct garbled data caused by false encoding

    - by ercan
    Hi all, I have a set of data that contains garbled text fields because of encoding errors during many import/exports from one database to another. Most of the errors were caused by converting UTF-8 to ISO-8859-1. Strangely enough, the errors are not consistent: the word 'München' appears as 'München' in some place and as 'MÜnchen'. Is there a trick in SQL server to correct this kind of crap? The first thing that I can think of is to exploit the COLLATE clause, so that ü is interpreted as ü, but I don't exactly know how. If it isn't possible to make it in the DB level, do you know any tool that helps for a bulk correction? (no manual find/replace tool, but a tool that guesses the garbled text somehow and correct them)

    Read the article

  • Regular Expression for username

    - by neobie
    I need help on regular expression on the condition (4) below: Begin with a-z End with a-z0-9 allow 3 special characters like ._- The characters in (3) must be followed by alphanumeric characters, and it cannot be followed by any characters in (3) themselves. Not sure how to do this. Any help is appreciated, with the sample and some explanations.

    Read the article

  • regexp target last main li in list

    - by veilig
    I need to target the starting tag of the last top level LI in a list that may or may-not contain sublists in various positions - without using CSS or Javascript. Is there a simple/elegant regexp that can help with this? I'm no guru w/ them, but it appears the need for greedy/non-greedy selectors when I'm selecting all the middle text (.*) / (.+) changes as nested lists are added and moved around in the list - and this is throwing me off. $pattern = '/^(<ul>.*)<li>(.+<\/li><\/ul>)$/'; $replacement = '$1<li id="lastLi">$3'; Perhaps there is an easier approach?? converting to XML to target the LI and then convert back? ie: Single Element <ul> <li>TARGET</li> </ul> Multiple Elements <ul> <li>foo</li> <li>TARGET</li> </ul> Nested Lists before end <ul> <li> foo <ul> <li>bar</li> </ul> <li> <li>TARGET</li> </ul> Nested List at end <ul> <li>foo</li> <li> TARGET <ul> <li>bar</li> </ul> </li> </ul>

    Read the article

  • parse string with regular exression

    - by llamerr
    I trying to parse this string: $right = '34601)S(1,6)[2] - 34601)(11)[2] + 34601)(3)[2,4]'; with following regexp: const word = '(\d{3}\d{2}\)S{0,1}\([^\)]*\)S{0,1}\[[^\]]*\])'; preg_match('/'.word.'{1}(?:\s{1}([+-]{1})\s{1}'.word.'){0,}/', $right, $matches); print_r($matches); i want to return array like this: Array ( [0] => 34601)S(1,6)[2] - 34601)(11)[2] + 34601)(3)[2,4] [1] => 34601)S(1,6)[2] [2] => - [3] => 34601)(11)[2] [4] => + [5] => 34601)(3)[2,4] ) but i return only following: Array ( [0] => 34601)S(1,6)[2] - 34601)(11)[2] + 34601)(3)[2,4] [1] => 34601)S(1,6)[2] [2] => + [3] => 34601)(3)[2,4] ) i think, its becouse of [^)]* or [^]]* in the word, but how i should correct regexp for matching this in another way? i tryied to specify it: \d+(?:[,#]\d+){0,} so word become const word = '(\d{3}\d{2}\)S{0,1}\(\d+(?:[,#]\d+){0,}\)S{0,1}\[\d+(?:[,#]\d+){0,}\])'; but it gives nothing

    Read the article

  • javascript split() array contains

    - by Mahesha999
    While learning JavaScript, I did not get why the output when we print the array returned of the Sting.split() method (with regular expression as an argument) is as explained below. var colorString = "red,blue,green,yellow"; var colors = colorString.split(/[^\,]+/); document.write(colors); //this print 7 times comma: ,,,,,,, However when I print individual element of the array colors, it prints an empty string, three commas and an empty string: document.write(colors[0]); //empty string document.write(colors[1]); //, document.write(colors[2]); //, document.write(colors[3]); //, document.write(colors[4]); //empty string document.write(colors[5]); //undefined document.write(colors[6]); //undefined Then, why printing the array directly gives seven commas. Though I think its correct to have three commas in the second output, I did not get why there is a starting (at index 0) and ending empty string (at index 4). Please explain I am screwed up here.

    Read the article

  • Regular expressions in a Python find-and-replace script?

    - by Haidon
    I'm new to Python scripting, so please forgive me in advance if the answer to this question seems inherently obvious. I'm trying to put together a large-scale find-and-replace script using Python. I'm using code similar to the following: findreplace = [ ('term1', 'term2'), ] inF = open(infile,'rb') s=unicode(inF.read(),charenc) inF.close() for couple in findreplace: outtext=s.replace(couple[0],couple[1]) s=outtext outF = open(outFile,'wb') outF.write(outtext.encode('utf-8')) outF.close() How would I go about having the script do a find and replace for regular expressions? Specifically, I want it to find some information (metadata) specified at the top of a text file. Eg: Title: This is the title Author: This is the author Date: This is the date and convert it into LaTeX format. Eg: \title{This is the title} \author{This is the author} \date{This is the date} Maybe I'm tackling this the wrong way. If there's a better way than regular expressions please let me know! Thanks!

    Read the article

  • PHP: Regular Expression to get a URL from a string

    - by Matthew Iselin
    I'm working on some PHP code which takes input from various sources and needs to find the URLs and save them somewhere. The kind of input that needs to be handled is as follows: http://www.youtube.com/watch?v=IY2j_GPIqRA Try google: http://google.com! (note exclamation mark is not part of the URL) Is http://somesite.com/ down for anyone else? Output: http://www.youtube.com/watch?v=IY2j_GPIqRA http://google.com http://somesite.com/ I've already borrowed one regular expression from the internet which works, but unfortunately wipes the query string out - not good! Any help putting together a regular expression, or perhaps another solution to this problem, would be appreciated.

    Read the article

  • Reading a line backwards

    - by Jimmy
    Hi, I'm using regular expression to count the total spaces in a line (first occurrence). match(/^\s*/)[0].length; However this reads it from the start to end, How can I read it from end to start. Thanks

    Read the article

< Previous Page | 98 99 100 101 102 103 104 105 106 107 108 109  | Next Page >