regex - Page 101 - Developer IT

Regular Expression to isolate an html tag

- by orit cohen

I'm looking for a regular expression to isolate an HTML tag. This includes the TAG, the ATTRIBUTES, and the CONTENT inside. Let's say I have this: <html> <body> aajsdfkjaskd <TAGNAME name="bla" context="non">hfdfhdj </TAGNAME> </body> </html> I need a regular expression that would return: <TAGNAME name="bla" context="non">hfdfhdj </TAGNAME> Thanks, Joe
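
One possible approach, sketched in Python. It assumes the tag name is known in advance and that the element does not contain another element of the same name, since a plain regular expression cannot track arbitrary nesting. The helper name `isolate_tag` is hypothetical.

```python
import re

def isolate_tag(html, tag):
    """Return the first <tag ...>...</tag> element in html, or None."""
    # \b stops TAGNAME from also matching TAGNAME2; [^>]* skips the attributes;
    # .*? is lazy, so the match ends at the first closing tag.
    pattern = r"<{0}\b[^>]*>.*?</{0}\s*>".format(re.escape(tag))
    match = re.search(pattern, html, re.DOTALL)
    return match.group(0) if match else None

page = ('<html> <body> aajsdfkjaskd <TAGNAME name="bla" context="non">'
        'hfdfhdj </TAGNAME> </body> </html>')
print(isolate_tag(page, "TAGNAME"))
```

For nested or malformed markup, an HTML parser is usually the safer tool.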


php Dollar amount Regular Expression

- by Thildemar

I have completed JavaScript validation of a form using regular expressions and am now working on redundant verification server-side using PHP. I have copied this regular expression, which finds dollar values, from my JavaScript code and reformatted it to a PHP-friendly format: /\$?((\d{1,3}(,\d{3})*)|(\d+))(\.\d{2})?$/ Specifically: if (preg_match("/\$?((\d{1,3}(,\d{3})*)|(\d+))(\.\d{2})?$/", $_POST["cost"])){} While the expression works great in JavaScript, I get: Warning: preg_match() [function.preg-match]: Compilation failed: nothing to repeat at offset 1 when I run it in PHP. Does anyone have a clue why this error is coming up?


PHP: URL detection (regexp) includes line breaks

- by marco92w

I want to have a function which gets a text as the input and gives back the text with URLs made to HTML links as the output. My draft is as follows: function autoLink($text) { return preg_replace('/https?:\/\/[\S]+/i', '<a href="\0">\0</a>', $text); } But this doesn't work properly. For the input text which contains ... http://www.google.de/ ... I get the following output: <a href="http://www.google.de/<br">http://www.google.de/<br</a> /> Why does it include the line breaks? How could I limit it to the real URL? Thanks in advance!
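
The output above suggests that the text already contains `<br />` tags directly after the URL (for example from an earlier newline-to-`<br />` conversion), and `\S+` happily consumes `<br`. That is an assumption, not something the question confirms. A minimal Python sketch of the same idea, with the URL stopping at whitespace or `<`:

```python
import re

def auto_link(text):
    """Wrap http(s) URLs in anchor tags, stopping at whitespace or '<'."""
    # [^\s<]+ ends the URL before an HTML tag such as <br />,
    # which \S+ would swallow.
    return re.sub(r"https?://[^\s<]+",
                  r'<a href="\g<0>">\g<0></a>',
                  text,
                  flags=re.IGNORECASE)

print(auto_link("see http://www.google.de/<br />\nnext line"))
```

The same character class should carry over to the PHP pattern, since `[^\s<]` is ordinary PCRE syntax.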


regular expression

- by xyz

I need a regular expression that matches braces correctly, i.e. one closing brace for every opening brace. For example, from abc{abc{bc}xyz} I need to get all of {abc{bc}xyz}, not just {abc{bc}. I tried this: ({.*?})
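
The lazy `.*?` stops at the first `}`, which is why only `{abc{bc}` comes back. Some regex flavors offer recursion for this; Python's built-in `re` module does not, so the sketch below swaps the regex for a simple depth counter. The function name is hypothetical.

```python
def outer_brace_groups(text):
    """Return each top-level {...} group, including any nested braces."""
    groups, depth, start = [], 0, None
    for i, ch in enumerate(text):
        if ch == "{":
            if depth == 0:
                start = i          # remember where the outermost group opens
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
            if depth == 0:         # the outermost group just closed
                groups.append(text[start:i + 1])
    return groups

print(outer_brace_groups("abc{abc{bc}xyz}"))
```

Unbalanced closing braces are ignored and an unclosed group is dropped; whether that is the right policy depends on the input.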


Why does this regular expression for sed break inside Makefile?

- by jcrocholl

I'm using GNU Make 3.81, and I have the following rule in my Makefile: jslint : java org.mozilla.javascript.tools.shell.Main jslint.js mango.js \ | sed 's/Lint at line \([0-9]\+\) character \([0-9]\+\)/mango.js:\1:\2/' This works fine if I enter it directly on the command line, but the regular expression does not match if I run it with "make jslint". However, it works if I replace \+ with \{1,\} in the Makefile: jslint : java org.mozilla.javascript.tools.shell.Main jslint.js mango.js \ | sed 's/Lint at line \([0-9]\{1,\}\) character \([0-9]\{1,\}\)/mango.js:\1:\2/' Is there some special meaning to \+ in Makefiles, or is this a bug?


Need help parsing HTML with a regex in python

- by laspal

Hi, My string is mystring = "<tr><td>Total Amount : INR (Indian Rupees) 100.00</td></tr>" My problem is that I have to search for and extract the total amount: test = re.search("(Indian Rupees)(\d{2})(?:\D|$)", mystring) but test is None. How can I get the value? Values can be 10.00, 100.00, 1000.00. Thanks
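
In the attempt above, the unescaped parentheses form a capturing group instead of matching the literal `(` and `)`, and nothing allows for the space before the number. A minimal sketch that escapes the parentheses and captures the whole amount, assuming the amount always follows the `(Indian Rupees)` label:

```python
import re

mystring = "<tr><td>Total Amount : INR (Indian Rupees) 100.00</td></tr>"

# \( and \) match the literal parentheses; \s* allows the space before the
# number; the decimal part is optional so "100" would also match.
match = re.search(r"\(Indian Rupees\)\s*(\d+(?:\.\d{2})?)", mystring)
amount = match.group(1) if match else None
print(amount)  # prints 100.00
```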


How can I display a list of characters that fail to match a regular expression?

- by Matt

For example, if I'm doing some form input validation and I'm using the following code for the name field. preg_match("/^[a-zA-Z .-]$/", $firstname); If someone types in Mr. (Awkward) Double-Barrelled I want to be able to display a message saying Invalid character(s): (, )
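
One way to get that list is to negate the character class and collect every match. The sketch below shows the idea in Python rather than PHP; the function name is hypothetical.

```python
import re

def invalid_characters(name):
    """Return the distinct disallowed characters in name, in first-seen order."""
    # [^a-zA-Z .-] matches any single character that is NOT allowed.
    found = re.findall(r"[^a-zA-Z .-]", name)
    # dict.fromkeys removes duplicates while keeping order.
    return list(dict.fromkeys(found))

print(invalid_characters("Mr. (Awkward) Double-Barrelled"))  # prints ['(', ')']
```

Note that the original validation pattern has no quantifier after the class, so as written it only accepts a single-character name; `^[a-zA-Z .-]+$` is probably what was intended.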


How Do You Parse Column Data ?

- by discwiz

I am trying to parse a file generated by LGA Tracon that lists the position data for aircraft over a given time frame. The data of interest starts with TRACKING DATA and ends with SST, and there are thousands of entries per file. The system generating the file, Common ARTS, is very rigid in its formatting, so we can expect the column spacing to be consistent. Any help would be greatly appreciated. Thanks. Here is an image to preserve the exact formatting. Here is a reduced text file.


Writing a PHP web crawler using cron

- by Horse

Hi all, I have written myself a web crawler using simplehtmldom, and have got the crawl process working quite nicely. It crawls the start page, adds all links into a database table, sets a session pointer, and meta refreshes the page to carry on to the next page. That keeps going until it runs out of links. That works fine; however, the crawl time for larger websites is obviously pretty tedious. I want to be able to speed things up a bit, and possibly make it a cron job. Any ideas on making it as quick and efficient as possible, other than setting the memory limit / execution time higher?


regexp target last main li in list

- by veilig

I need to target the starting tag of the last top-level LI in a list that may or may not contain sublists in various positions, without using CSS or JavaScript. Is there a simple/elegant regexp that can help with this? I'm no guru with them, but it appears that whether I need greedy or non-greedy selectors when selecting all the middle text, (.*) / (.+), changes as nested lists are added and moved around in the list, and this is throwing me off. $pattern = '/^(<ul>.*)<li>(.+<\/li><\/ul>)$/'; $replacement = '$1<li id="lastLi">$3'; Perhaps there is an easier approach? Converting to XML to target the LI and then converting back? For example: Single Element <ul> <li>TARGET</li> </ul> Multiple Elements <ul> <li>foo</li> <li>TARGET</li> </ul> Nested Lists before end <ul> <li> foo <ul> <li>bar</li> </ul> <li> <li>TARGET</li> </ul> Nested List at end <ul> <li>foo</li> <li> TARGET <ul> <li>bar</li> </ul> </li> </ul>


Dealing with regular expressions, Python

- by Gusto

I want to remove some symbols from a string using a regular expression, for example: == (occurring both at the beginning and at the end of a line), * (at the beginning of a line ONLY). def some_func(): clean = re.sub(r'= {2,}', '', clean) #Removes 2 or more occurrences of = at the beginning and at the end of a line. clean = re.sub(r'^\* {1,}', '', clean) #Removes 1 or more occurrences of * at the beginning of a line. What's wrong with my code? It seems like the expressions are wrong. How do I remove a character/symbol if it's at the beginning or at the end of the line (with one or more occurrences)?
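
Two things stand out in the patterns above: the space before `{2,}` makes the quantifier apply to the space rather than to `=`, and `^` only matches at the very start of the string unless `re.MULTILINE` is set. A minimal sketch under those assumptions; it also trims the spaces next to the removed symbols, which is a choice rather than a requirement, and the function name is hypothetical.

```python
import re

def clean_markup(text):
    """Strip runs of '=' at line edges and runs of '*' at line starts,
    together with the spaces or tabs directly next to them."""
    # ={2,} means two or more '='; '= {2,}' would mean '=' then 2+ spaces.
    text = re.sub(r"^={2,}[ \t]*|[ \t]*={2,}$", "", text, flags=re.MULTILINE)
    # \*+ means one or more '*'; re.MULTILINE lets ^ match at every line start.
    text = re.sub(r"^\*+[ \t]*", "", text, flags=re.MULTILINE)
    return text

print(clean_markup("== Title ==\n* item\na = b"))
```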


Regular Expression .net flavor

- by user1440109

Don't ask how this works, but currently it does, kind of: ("^\|(.?)\|*$"). This removes all extra pipes, which is part one. I have searched all over and found no answer yet. I am using the VB2011 beta with an ASP web form, coded in VB. I want to capture the special character pipe (|), which is used to separate words, e.g. car|truck|van|cycle. The problem is that users lead with pipes, trail with them, use multiple pipes, and put spaces before and after them, e.g. |||car||truck | van || cycle. Another example: george bush|micheal jordon|bill gates|steve jobs <-- this would be correct, but when I remove spaces it takes the correct space out. So I want to get rid of leading and trailing whitespace and any space before or after |, and allow only one pipe (|) between alphanumeric words.
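
This kind of cleanup may be easier without a regex at all: split on the pipe, trim each piece, drop the empty ones, and join again. The sketch below shows the idea in Python rather than VB; the function name is hypothetical.

```python
def normalize_pipes(raw):
    """Collapse a messy pipe-separated string to the form 'a|b|c'."""
    # Split on '|' and trim each piece; spaces inside a piece are kept.
    parts = [part.strip() for part in raw.split("|")]
    # Drop pieces that are empty after trimming, then rejoin with one pipe.
    return "|".join(part for part in parts if part)

print(normalize_pipes("|||car||truck | van || cycle"))
print(normalize_pipes("george bush|micheal jordon|bill gates|steve jobs"))
```

Because only the edges of each piece are trimmed, the space inside a name such as "george bush" survives.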


How to capture strings using * or ? with groups in python regular expressions

- by user1334085

When the regular expression has a capturing group followed by "*" or "?", there is no value captured. Instead if you use "+" for the same string, you can see the capture. I need to be able to capture the same value using "?" >>> str1='This string has 29 characters' >>> re.search(r'(\d+)*', str1).group(0) '' >>> re.search(r'(\d+)*', str1).group(1) >>> >>> re.search(r'(\d+)+', str1).group(0) '29' >>> re.search(r'(\d+)+', str1).group(1) '29' More specific question is added below for clarity: I have str1 and str2 below, and I want to use just one regexp which will match both. In case of str1, I also want to be able to capture the number of QSFP ports >>> str1='''4 48 48-port and 6 QSFP 10GigE Linecard 7548S-LC''' >>> str2='''4 48 48-port 10GigE Linecard 7548S-LC''' >>> When I do not use a metacharacter, the capture works: >>> re.search(r'^4\s+48\s+.*(?:(\d+)\s+QSFP).*-LC', str1, re.I|re.M).group(1) '6' >>> It works even when I use the "+" to indicate one occurrence: >>> re.search(r'^4\s+48\s+.*(?:(\d+)\s+QSFP)+.*-LC', str1, re.I|re.M).group(1) '6' >>> But when I use "?" to match for 0 or 1 occurrence, the capture fails even for str1: >>> re.search(r'^4\s+48\s+.*(?:(\d+)\s+QSFP)?.*-LC', str1, re.I|re.M).group(1) >>>
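
What seems to be happening: the greedy `.*` in front of the optional group first consumes the rest of the line and then backtracks only far enough for `-LC` to match. Since the group is optional, the engine is never forced to try it at the position of `6 QSFP`. One way around this, sketched below, is to move a lazy `.*?` inside the optional group so the engine must look for the digits before it may skip the group.

```python
import re

str1 = "4 48 48-port and 6 QSFP 10GigE Linecard 7548S-LC"
str2 = "4 48 48-port 10GigE Linecard 7548S-LC"

# The lazy .*? lives inside the optional group, so the engine has to search
# for "<digits> QSFP" before it is allowed to skip the group entirely.
pattern = re.compile(r"^4\s+48\s+(?:.*?(\d+)\s+QSFP)?.*-LC", re.I | re.M)

print(pattern.search(str1).group(1))  # prints 6
print(pattern.search(str2).group(1))  # prints None
```

Both strings match; only `str1` fills group 1.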


How to avoid resetting the java Scanner position

- by Derek

I have some code that looks more or less like this: while(scanner.hasNext()) { if(scanner.findInLine("Test") !=null) { //do some things }else{ scanner.nextLine(); } } I am using this to parse a ~10MB text file. The problem is, if I put a breakpoint on the while() and the scanner.nextLine(), I can see that sometimes the scanner's position (in the debug window) goes back to zero. I think this is causing some kind of loop blow-up, because the regex in findInLine() starts at zero, looks through some amount of text, advancing the position, and then the position randomly gets set back to zero, so it has to re-parse all that text again. Any ideas what could be causing that? Am I even doing this the right way? Thanks. Some additional info: The Scanner is instantiated from an InputStream. After doing some debugging, it appears that there is a HeapCharBuffer that Scanner uses, and it only allows 1024 characters at a time and then resets. Is there a way to avoid this, or to do things differently? That seems like a small number of characters to be able to scan. Derek


string substitution regular expression not working in tcl

- by Puneet Mittal

I am trying to replace all the special characters, including whitespace, hyphens, etc., with underscores in a string variable in Tcl. I wrote the code below but it doesn't seem to be working. set varname $origVar puts "Variable Name :>> $varname" if {$varname != ""} { regsub -all {[\s-\]\[$^?+*()|\\%&#]} $varname "_" $newVar } puts "New Variable :>> $newVar" One issue is that, instead of replacing the string in $varname, it is replacing the data inside $origVar. I have no idea why. I also read the example code (for proper syntax) in my Tcl book, and according to that it should be something like this: regsub -all {[\s-][$^?+*()|\\%&#]} $varname "_" newVar So I used the same syntax, but it didn't work and gave the same result, modifying $origVar instead of the required $varname value.


Regular Expression with Names and Emails

- by Nina

I am having a problem with regular expressions at the moment. What I'm trying to do is that for each line through the iteration, it checks for this type of pattern: Lastname, Firstname If it finds the name, then it will take the first letter of the first name, and the first six letters of the lastname and form it as an email. I have the following: $checklast = "[A-z],"; $checkfirst = "[A-z]"; if (ereg($checklast, $parts[1])||ereg($checkfirst, $parts[2])){ $first = preg_replace($checkfirst, $checkfirst{1,1}, $parts[2]); print "<a href='mailto:[email protected];'> $parts[$i] </a>"; } This one obviously broke the code. But I was initially attempting to find only the first letter of the firstname and then after that the first six letters of the lastname followed by the @email.com This didn't work out too well. I'm not sure what to do at this point. Any help is much appreciated.


Replace all escape sequences with non-escaped equivalent strings in java

- by Mark

I have a string like this: &lt;![CDATA[&lt;ClinicalDocument&gt;rest of CCD here&lt;/ClinicalDocument&gt;]]&gt; I'd like to replace the escape sequences with their non-escaped characters, to end up with: <![CDATA[<ClinicalDocument>rest of CCD here</ClinicalDocument>]]>
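
The question is about Java, but the idea is easy to show in a few lines of Python: a library routine that decodes character entities handles every escape at once, so no hand-written list of replacements is needed. The sketch assumes the escapes are XML/HTML entities such as `&lt;` and `&gt;`.

```python
import html

escaped = ("&lt;![CDATA[&lt;ClinicalDocument&gt;rest of CCD here"
           "&lt;/ClinicalDocument&gt;]]&gt;")

# html.unescape converts named and numeric character references
# (for example &lt;, &gt;, &amp;) back to the characters they stand for.
unescaped = html.unescape(escaped)
print(unescaped)
```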


How to replace only part of the match with python re.sub

- by Arty

I need to match two cases with one regular expression and do a replacement: 'long.file.name.jpg' - 'long.file.name_suff.jpg' 'long.file.name_a.jpg' - 'long.file.name_suff.jpg' I'm trying to do the following: re.sub('(\_a)?\.[^\.]*$' , '_suff.',"long.file.name.jpg") But this cuts off the extension '.jpg', and I'm getting long.file.name_suff. instead of long.file.name_suff.jpg. I understand that this is because of the [^.]*$ part, but I can't exclude it, because I have to find the last occurrence of '_a' to replace, or the last '.'. Is there a way to replace only part of the match?
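
A common way to replace only part of a match is to capture the part that should survive and refer back to it in the replacement. A minimal sketch, with a hypothetical function name:

```python
import re

def add_suffix(filename):
    """Replace an optional trailing '_a' before the extension with '_suff'."""
    # Group 2 captures the extension so the replacement can put it back
    # with the \2 backreference.
    return re.sub(r"(_a)?(\.[^.]*)$", r"_suff\2", filename)

print(add_suffix("long.file.name.jpg"))    # prints long.file.name_suff.jpg
print(add_suffix("long.file.name_a.jpg"))  # prints long.file.name_suff.jpg
```

A lookahead such as `(?=\.[^.]*$)` would be another option, since it checks for the extension without consuming it.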


Python program to search for specific strings in hash values (coding help)

- by Diego

I'm trying to write code that searches hash values for specific strings (input by the user) and returns the hash if the search query is present in that line. I'm doing this mostly to learn Python a bit more, but it could be a real-world application used by an HR department to search a .csv resume database for specific words in each resume. I'd like this program to look through a .csv file that has three entries per line (id#;applicant name;resume text). I set it up so that it creates a hash, then creates a string for the resume text hash entry, and I am trying to use the .find() function to return the entire hash for each instance. What I'd like is this: if the word "gpa" is used as a search query and it is found in s['resumetext'] for three applicants (rows in the .csv file), the program prints the id, name, and resume for every row that has it (all three applicants). As it is right now, my program prints the first row in the .csv file (print resume['id'], resume['name'], resume['resumetext']) no matter what the search query is, whether it's in the resume text or not. Lastly, are there better ways of doing this, such as searching Word documents, PDFs and .txt files in a folder for specific words using Python? (I've just started reading about the re module and am wondering if this may be the route, rather than putting everything in a .csv file.) def find_details(id2find): resumes_f=open("resume_data.csv") for each_line in resumes_f: s={} (s['id'], s['name'], s['resumetext']) = each_line.split(";") resumetext = str(s['resumetext']) if resumetext.find(id2find): return(s) else: print "No data matches your search query. Please try again" searchquery = raw_input("please enter your search term") resume = find_details(searchquery) if resume: print resume['id'], resume['name'], resume['resumetext']
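
Two details in the code above probably explain the behaviour: `str.find()` returns -1 when nothing is found, and -1 is truthy, so the `if` succeeds on a miss; and `return` leaves the function at the first row that passes. A minimal Python 3 sketch that collects every matching row instead. The sample rows are made up for illustration, and the case-insensitive comparison is an assumption about what is wanted.

```python
def find_resumes(lines, query):
    """Return every record whose resume text contains query (case-insensitive)."""
    matches = []
    for line in lines:
        # maxsplit=2 keeps any ';' inside the resume text from breaking unpacking.
        record_id, name, resume_text = line.rstrip("\n").split(";", 2)
        # Test membership with 'in' rather than relying on find()'s return value.
        if query.lower() in resume_text.lower():
            matches.append({"id": record_id, "name": name,
                            "resumetext": resume_text})
    return matches

# Hypothetical sample data in the id;name;resume text layout.
rows = ["1;Ann;GPA 3.9, Python", "2;Bob;Java developer", "3;Cy;gpa 3.2"]
for resume in find_resumes(rows, "gpa"):
    print(resume["id"], resume["name"], resume["resumetext"])
```

An open file object can be passed as `lines`, since iterating over it yields one line at a time.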


Is it possible to parse a web page from the client side for a large number of words and if so, how?

- by Technoh

I have a list of keywords, about 25,000 of them. I would like people who add a certain <script> tag to their web page to have these keywords transformed into links. What would be the best way to achieve this? I have tried the simple JavaScript approach (an array with lots of elements, with a regex replace for each), and it obviously slows down the browser. I could always process the content server-side if there was a way, from the client, to send the page's content to a cross-domain server script (I'm partial to PHP but it could be anything), but I don't know of any way to do this. Any other working solution is also welcome.


Markdown implementation in PHP parses text within <a> tags — how does one disable this behavior?

- by Kyle

I'm using the Markdown library for PHP by Michel Fortin. I started noticing that it formats the text in <a> tags with Markdown rules, like so: http://foo.com/My_Url_With_Underscores essentially becomes: <a href="...">http://foo.com/MyUrlWith_Underscores</a> How do I disable that behavior or otherwise prevent the library from doing that?


What is the Regular Expression For "Not Whitespace and Not a hyphen"

- by rudimenter

I tried this but it doesn't work : [^\s-] Any Ideas?
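
The question does not say which regex flavor is in use. In Python's `re` module, at least, the class works as written, because a hyphen directly before the closing bracket is taken literally:

```python
import re

# [^\s-] matches any character that is neither whitespace nor a hyphen.
tokens = re.findall(r"[^\s-]+", "foo-bar baz\tqux")
print(tokens)  # prints ['foo', 'bar', 'baz', 'qux']
```

If a particular engine misreads the trailing hyphen as the start of a range, escaping it as `[^\s\-]` is a reasonable thing to try.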


Java: calculate linenumber from charwise position according to the number of "\n"

- by HH

I know the charwise positions of matches, like 1 3 7 8. I need to know their corresponding line numbers. Example: file.txt Match: X Matches: 1 3 7 8. Want: 1 2 4 4 $ cat file.txt X2 X 4 56XX [Added: this does not notice multiple matches on one line; there is probably an easier way to do it with stacks] $ java testt 1 2 4 $ cat testt.java import java.io.*; import java.util.*; public class testt { public static String data ="X2\nX\n4\n56XX"; public static String[] ar = data.split("\n"); public static void main(String[] args){ HashSet<Integer> hs = new HashSet<Integer>(); Integer numb = 1; for(String s : ar){ if(s.contains("X")){ hs.add(numb); numb++; }else{ numb++; } } for (Integer i : hs){ System.out.println(i); } } }
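
The HashSet in the Java code stores each line number only once, which is why the second X on line 4 is not reported. As the title suggests, the line number of a match is one plus the number of newline characters before it. A minimal sketch of that calculation in Python, using 0-based offsets into the raw text (an assumption, since the positions in the question appear to be counted without the newlines):

```python
import re

data = "X2\nX\n4\n56XX"

def line_numbers(text, needle):
    """Return the 1-based line number of every occurrence of needle in text."""
    # str.count with start/end arguments counts the newlines that appear
    # before the offset where each match starts.
    return [text.count("\n", 0, m.start()) + 1
            for m in re.finditer(re.escape(needle), text)]

print(line_numbers(data, "X"))  # prints [1, 2, 4, 4]
```

For very large inputs, recounting from the start for each match is wasteful; keeping a running count while scanning would avoid that.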


Regular expression problem

- by farka

I have an example: Term:a=27 B=90 C=65 .... I want only the values of C and A, with C first and A second. I tried (C=(\d+)^|A=(\d+)) but with no success. Why?
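
In the attempt above, `^` after the digits can never match in the middle of the string, and an alternation returns matches in the order they occur in the text, not the order written in the pattern. One way to fix the order of the captured values is a pair of lookaheads, sketched here in Python. Case-insensitive matching is an assumption, made because the sample has a lowercase `a`.

```python
import re

text = "Term:a=27 B=90 C=65"

# Each lookahead scans the whole string independently, so group 1 is always
# the value of C and group 2 the value of A, wherever they appear.
match = re.search(r"^(?=.*\bC=(\d+))(?=.*\bA=(\d+))", text, re.IGNORECASE)
c_value, a_value = match.groups() if match else (None, None)
print(c_value, a_value)  # prints 65 27
```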


How Do I grep For non-ASCII Characters in UNIX

- by Peter Conrey

I have several very large XML files and I'm trying to find the lines that contain non-ASCII characters. I've tried the following: grep -e "[\x{00FF}-\x{FFFF}]" file.xml But this returns every line in the file, regardless of whether the line contains a character in the range specified. Do I have the syntax wrong or am I doing something else wrong? I've also tried: egrep "[\x{00FF}-\x{FFFF}]" file.xml (with both single and double quotes surrounding the pattern).
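
Whether grep understands the `\x{...}` notation depends on the grep build and the regex mode it runs in. As a hedged alternative, the check is simple to express in Python: a line is non-ASCII if it contains any character outside the range 0x00 to 0x7F. The sketch assumes the lines have already been decoded to text (for example as UTF-8), and the sample lines are made up.

```python
import re

# Any character outside the 7-bit ASCII range.
NON_ASCII = re.compile(r"[^\x00-\x7F]")

def non_ascii_lines(lines):
    """Yield (line_number, line) for each line containing a non-ASCII character."""
    for number, line in enumerate(lines, start=1):
        if NON_ASCII.search(line):
            yield number, line

# Hypothetical sample; an open file object could be passed instead.
sample = ["<a>plain</a>", "<b>caf\u00e9</b>", "<c>ok</c>"]
print([number for number, _ in non_ascii_lines(sample)])  # prints [2]
```

Note that the range in the question starts at 00FF, which would skip characters between 0x80 and 0xFE even where the syntax is accepted.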

Search Results

Search found 3804 results on 153 pages for 'regex'.
