Search Results

Search found 4539 results on 182 pages for 'regex grouping'.


  • Difference in regex between Python and Rubular?

    - by Rosarch
    In Rubular, I have created a regular expression: (Prerequisite|Recommended): (\w|-| )* It matches the bolded: Recommended: good comfort level with computers and some of the arts. Summer. 2 credits. Prerequisite: pre-freshman standing or permission of instructor. Credit may not be applied toward engineering degree. S-U grades only. Here is a use of the regex in Python: note_re = re.compile(r'(Prerequisite|Recommended): (\w|-| )*', re.IGNORECASE) def prereqs_of_note(note): match = note_re.match(note) if not match: return None return match.group(0) Unfortunately, the code returns None instead of a match: >>> import prereqs >>> result = prereqs.prereqs_of_note("Summer. 2 credits. Prerequisite: pre-fres hman standing or permission of instructor. Credit may not be applied toward engi neering degree. S-U grades only.") >>> print result None What am I doing wrong here?
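    A likely cause, sketched below: Python's `re.match` only tries to match at the start of the string, while Rubular scans the whole text. The sketch swaps in `re.search` and keeps the pattern from the question; the `note` value is the sample text from the post with the console line wraps removed.

```python
import re

# Same pattern as in the question.
note_re = re.compile(r'(Prerequisite|Recommended): (\w|-| )*', re.IGNORECASE)

def prereqs_of_note(note):
    """Return the first Prerequisite/Recommended clause, or None."""
    # search() scans the whole string; match() only tries position 0.
    match = note_re.search(note)
    return match.group(0) if match else None

note = ("Summer. 2 credits. Prerequisite: pre-freshman standing or "
        "permission of instructor. Credit may not be applied toward "
        "engineering degree. S-U grades only.")

result = prereqs_of_note(note)
# result == 'Prerequisite: pre-freshman standing or permission of instructor'
```

    The match stops at the first period because the character alternatives only allow word characters, hyphens and spaces.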


  • Use Javascript RegEx to extract column names from SQLite Create Table SQL

    - by NimbusSoftware
    I'm trying to extract column names from a SQLite result set from sqlite_master's sql column. I get hosed up in the regular expressions in the match() and split() functions. t1.executeSql('SELECT name, sql FROM sqlite_master WHERE type="table" and name!="__WebKitDatabaseInfoTable__";', [], function(t1, result) { for(i = 0;i < result.rows.length; i++){ var tbl = result.rows.item(i).name; var dbSchema = result.rows.item(i).sql; // errors out on next line var columns = dbSchema.match(/.*CREATE\s+TABLE\s+(\S+)\s+\((.*)\).*/)[2].split(/\s+[^,]+,?\s*/); } }, function(){console.log('err1');} ); I want to parse SQL statements like these... CREATE TABLE sqlite_sequence(name,seq); CREATE TABLE tblConfig (Key TEXT NOT NULL,Value TEXT NOT NULL); CREATE TABLE tblIcon (IconID INTEGER NOT NULL PRIMARY KEY,png TEXT NOT NULL,img32 TEXT NOT NULL,img64 TEXT NOT NULL,Version TEXT NOT NULL) into strings like these... name,seq Key,Value IconID,png,img32,img64,Version Any help with a RegEx would be greatly appreciated.
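    One possible approach, sketched here in Python rather than the question's JavaScript: the pattern in the post requires whitespace before the opening parenthesis, so `sqlite_sequence(name,seq)` would not match. The sketch makes that whitespace optional and takes the first word of each comma-separated definition. The helper name `column_names` is hypothetical, and the split on commas assumes no column type or constraint contains a comma.

```python
import re

# Table name, optional whitespace, then everything up to the last ")".
CREATE_RE = re.compile(r'CREATE\s+TABLE\s+(\S+?)\s*\((.*)\)',
                       re.IGNORECASE | re.DOTALL)

def column_names(create_sql):
    """Return the column names of a simple CREATE TABLE statement."""
    m = CREATE_RE.search(create_sql)
    if not m:
        return []
    # The first whitespace-separated token of each definition is the name.
    return [part.split()[0] for part in m.group(2).split(',') if part.strip()]

statements = [
    "CREATE TABLE sqlite_sequence(name,seq);",
    "CREATE TABLE tblConfig (Key TEXT NOT NULL,Value TEXT NOT NULL);",
    "CREATE TABLE tblIcon (IconID INTEGER NOT NULL PRIMARY KEY,png TEXT NOT NULL,"
    "img32 TEXT NOT NULL,img64 TEXT NOT NULL,Version TEXT NOT NULL)",
]
joined = [",".join(column_names(s)) for s in statements]
# joined == ['name,seq', 'Key,Value', 'IconID,png,img32,img64,Version']
```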


  • Java Scanner newline parsing with regex (Bug?)

    - by SEK
    I'm developing a syntax analyzer by hand in Java, and I'd like to use regexes to parse the various token types. The problem is that I'd also like to be able to accurately report the current line number if the input doesn't conform to the syntax. Long story short, I've run into a problem when I try to actually match a newline with the Scanner class. To be specific, when I try to match a newline with a pattern using the Scanner class, it fails. Almost always. But when I perform the same matching using a Matcher and the same source string, it retrieves the newline exactly as you'd expect. Is there a reason for this that I can't seem to discover, or is this a bug, as I suspect? FYI: I was unable to find a bug in the Sun database that describes this issue, so if it is a bug, it hasn't been reported. Example Code: Pattern newLinePattern = Pattern.compile("(\\r\\n?|\\n)", Pattern.MULTILINE); String sourceString = "\r\n\n\r\r\n\n"; Scanner scan = new Scanner(sourceString); scan.useDelimiter(""); int count = 0; while (scan.hasNext(newLinePattern)) { scan.next(newLinePattern); count++; } System.out.println("found "+count+" newlines"); // finds 7 newlines Matcher match = newLinePattern.matcher(sourceString); count = 0; while (match.find()) { count++; } System.out.println("found "+count+" newlines"); // finds 5 newlines


  • Apply PHP regex replace on a multi-line repeated pattern

    - by Hussain
    Let's say I have this input: I can haz a listz0rs! # 42 # 126 I can haz another list plox? # Hello, world! # Welcome! I want to split it so that each set of hash-started lines becomes a list: I can haz a listz0rs! <ul> <li>42</li> <li>126</li> </ul> I can haz another list plox? <ul> <li>Hello, world!</li> <li>Welcome!</li> </ul> If I run the input against the regex "/(?:(?:(?<=^# )(.*)$)+)/m", I get the following result: Array ( [0] => Array ( [0] => 42 ) [1] => Array ( [0] => 126 ) [2] => Array ( [0] => Hello, world! ) [3] => Array ( [0] => Welcome! ) ) This is fine and dandy, but it doesn't distinguish between the two different lists. I need a way to either make the quantifier return a concatenated string of all the occurrences, or, ideally, an array of all the occurrences. Ideally, this should be my output: Array ( [0] => Array ( [0] => 42 [1] => 126 ) [1] => Array ( [0] => Hello, world! [1] => Welcome! ) ) Is there any way of achieving this, and if not, is there a close alternative? Thanks in advance!
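    A close alternative, sketched in Python rather than PHP: instead of asking one quantified group to return every repetition, match each run of consecutive "# " lines first, then extract the items from that run with a second pattern. The names `block_re` and `item_re` are illustrative only.

```python
import re

text = """I can haz a listz0rs!
# 42
# 126
I can haz another list plox?
# Hello, world!
# Welcome!
"""

# Pass 1: one match per run of consecutive lines starting with "# ".
block_re = re.compile(r'(?:^# .*\n?)+', re.MULTILINE)
# Pass 2: the text after "# " on each line of a run.
item_re = re.compile(r'^# (.*)$', re.MULTILINE)

lists = [item_re.findall(block.group(0)) for block in block_re.finditer(text)]
# lists == [['42', '126'], ['Hello, world!', 'Welcome!']]
```

    The same two-pass idea should carry over to PHP with a `preg_match_all` call per pass, although that translation is not tested here.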


  • Java Regex for matching hexadecimal numbers in a file

    - by Ranman
    So I'm reading in a file (like java program < trace.dat) which looks something like this: 58 68 58 68 40 c 40 48 FA If I'm lucky but more often it has several whitespace characters before and after each line. These are hexadecimal addresses that I'm parsing and I basically need to make sure that I can get the line using a scanner, buffered reader... whatever and make sure I can then convert the hexadecimal to an integer. This is what I have so far: Scanner scanner = new Scanner(System.in); int address; String binary; Pattern pattern = Pattern.compile("^\\s*[0-9A-Fa-f]*\\s*$", Pattern.CASE_INSENSITIVE); while(scanner.hasNextLine()) { address = Integer.parseInt(scanner.next(pattern), 16); binary = Integer.toBinaryString(address); //Do lots of other stuff here } //DO MORE STUFF HERE... So I've traced all my errors to parsing input and stuff so I guess I'm just trying to figure out what regex or approach I need to get this working the way I want.
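    A minimal sketch of the line-by-line approach, written in Python rather than Java: read whole lines, let the pattern absorb the surrounding whitespace, and capture only the hex digits. It requires at least one digit (`+` rather than the `*` in the post) so blank lines are skipped instead of being passed to the integer conversion. The helper name `parse_addresses` and the sample lines are assumptions for illustration.

```python
import re

# Optional whitespace, one or more hex digits (captured), optional whitespace.
HEX_LINE = re.compile(r'^\s*([0-9A-Fa-f]+)\s*$')

def parse_addresses(lines):
    """Convert each well-formed hex line to an int, skipping the rest."""
    addresses = []
    for line in lines:
        m = HEX_LINE.match(line)
        if m:
            addresses.append(int(m.group(1), 16))
    return addresses

sample = ["58", "  68 ", "\tc\n", "", "FA  ", "zz"]
parsed = parse_addresses(sample)
# parsed == [88, 104, 12, 250]
binary = [bin(a)[2:] for a in parsed]  # binary strings without the "0b" prefix
```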


  • regex using vb.net

    - by akmalizhar
    Hi, I have this HTML code: <div class="name"> <span id="businessNumOnMap2" class="resultNumberOnMap" style="display:none;">2</span> <span> <a href="/len/aapproximatch%20search/284096.php" onclick="loadBusinessInfo('1', '284096'); return false;" class="businessName">Bangsar Seafood Garden Restaurant</a></span><span id="phoneSpan1"></span> <script type="text/javascript"> var d=document.getElementById('phoneSpan1');d.innerHTML+='0';d.innerHTML+='3';d.innerHTML+=0?'8':'-';d.innerHTML+=1?'2':'1';d.innerHTML+='2';d.innerHTML+=1?'8':'1';d.innerHTML+=0?'0':'2';d.innerHTML+='2';d.innerHTML+=0?'4':'5';d.innerHTML+='5';d.innerHTML+=1?'5':'0'; </script> </div> I start my regex with this: <div class="name"[^>]*>[\s\S]+?</div> and I remove the HTML using this: <[^>]*> However, the output is Bangsar Seafood Garden Restaurant <script type = "text/javascript"> ...</script><div> The only part I want is Bangsar Seafood Garden Restaurant. Can anyone help me?


  • write regex code in java for following data

    - by giri
    <table> <tr> <td style="width:180px"> <a href="/search?q=user:240698+[java]" class="post-tag" title="show all posts by this user in 'java'">java</a><span class="item-multiplier">&times;&nbsp;176</span><br> <a href="/search?q=user:240698+[servlets]" class="post-tag" title="show all posts by this user in 'servlets'">servlets</a><span class="item-multiplier">&times;&nbsp;25</span><br> <a href="/search?q=user:240698+[jsp]" class="post-tag" title="show all posts by this user in 'jsp'">jsp</a><span class="item-multiplier">&times;&nbsp;11</span><br> <a href="/search?q=user:240698+[core]" class="post-tag" title="show all posts by this user in 'core'">core</a><span class="item-multiplier">&times;&nbsp;9</span><br> </tr> </table> From the above code I need to fetch only java, servlets, jsp and core. Can anybody please help me write a regex in Java to fetch those? Thanks


  • Looking for a Regex to get SccTeamFoundationServer value from .sln file

    - by Arthur
    I am looking for a Regex for C# to get SccTeamFoundationServer value from .sln file. Maybe someone has come across such need and found a solution. Could you help? File: Microsoft Visual Studio Solution File, Format Version 10.00 # Visual Studio 2008 Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "WebApplication", "WebApplication\WebApplication.csproj", "{AE0F6C02-1C8D-426D-AFA0-C07A52E6112F}" EndProject Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "ConsoleApplication", "ConsoleApplication\ConsoleApplication.csproj", "{2BD82C34-CF50-4559-A3CD-F85ACD657292}" EndProject Global GlobalSection(TeamFoundationVersionControl) = preSolution SccNumberOfProjects = 3 SccEnterpriseProvider = {4CA58AB2-18FA-4F8D-95D4-32DDF27D184C} SccTeamFoundationServer = http://ServerName:8080/ SccLocalPath0 = . SccProjectUniqueName1 = ConsoleApplication\\ConsoleApplication.csproj SccProjectName1 = ConsoleApplication SccLocalPath1 = ConsoleApplication SccProjectUniqueName2 = WebApplication\\WebApplication.csproj SccProjectName2 = WebApplication SccLocalPath2 = WebApplication EndGlobalSection GlobalSection(SolutionConfigurationPlatforms) = preSolution Debug|Any CPU = Debug|Any CPU Release|Any CPU = Release|Any CPU EndGlobalSection GlobalSection(ProjectConfigurationPlatforms) = postSolution {AE0F6C02-1C8D-426D-AFA0-C07A52E6112F}.Debug|Any CPU.ActiveCfg = Debug|Any CPU {AE0F6C02-1C8D-426D-AFA0-C07A52E6112F}.Debug|Any CPU.Build.0 = Debug|Any CPU {AE0F6C02-1C8D-426D-AFA0-C07A52E6112F}.Release|Any CPU.ActiveCfg = Release|Any CPU {AE0F6C02-1C8D-426D-AFA0-C07A52E6112F}.Release|Any CPU.Build.0 = Release|Any CPU {2BD82C34-CF50-4559-A3CD-F85ACD657292}.Debug|Any CPU.ActiveCfg = Debug|Any CPU {2BD82C34-CF50-4559-A3CD-F85ACD657292}.Debug|Any CPU.Build.0 = Debug|Any CPU {2BD82C34-CF50-4559-A3CD-F85ACD657292}.Release|Any CPU.ActiveCfg = Release|Any CPU {2BD82C34-CF50-4559-A3CD-F85ACD657292}.Release|Any CPU.Build.0 = Release|Any CPU EndGlobalSection GlobalSection(SolutionProperties) = preSolution HideSolutionNode = FALSE EndGlobalSection EndGlobal


  • Matching innermost braces with regex or strpos?

    - by rich97
    I have a sort of mini parsing syntax I made up to help me streamline my view code in cakephp. Basically I have created a table helper which, when given a dataset and (optionally) a set of options for how to format the data, will render out a table, as opposed to me looping through the data and editing it manually. It allows the user to be as complex or as simple as they like, and it can get pretty powerful. However, in order to achieve this I had to make a simple parsing syntax. As a quick example the user would do something like so: $this->Table->data = $userData; $this->Table->elements['td']['data'] = array( '{:User.username:}', '{:User.created:}' => array('Time::nice') ); echo $this->Table->render(); And when rendering the table would then generate: <table> <tbody> <tr><td>rich97</td><td>Sun 21st 02:30pm</td></tr> </tbody> </table> The problem occurs when I try to nest the braces like so: {:User.levels.iconClasses.{:User.access:}:} Is there any way I can get only the innermost braces on the first pass and loop through until there are no matches? Or even do it in one go? Or even better use strpos? Here is my regex as it stands: '/\{\:([^}]+)\:\}/'
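    The "innermost first, then loop" idea can be sketched as follows, in Python rather than PHP. The pattern only matches a `{:...:}` token whose body contains no further `{:` or `:}`, so each pass resolves the deepest level and the loop repeats until nothing is left. The flat `data` dictionary and the `resolve` helper are hypothetical stand-ins for the helper's real data lookup, and the sketch assumes replacement values never contain `{:` themselves.

```python
import re

# A token whose body contains neither "{:" nor ":}" is an innermost token.
INNERMOST = re.compile(r'\{:((?:(?!\{:|:\}).)+):\}')

def resolve(template, data):
    """Replace tokens from the inside out until none remain."""
    def lookup(match):
        return str(data[match.group(1)])
    while True:
        template, count = INNERMOST.subn(lookup, template)
        if count == 0:
            return template

data = {
    "User.access": "admin",
    "User.levels.iconClasses.admin": "icon-admin",
}
resolved = resolve("{:User.levels.iconClasses.{:User.access:}:}", data)
# First pass:  {:User.levels.iconClasses.admin:}
# Second pass: icon-admin
```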


  • String doesn't match regex when read from keyboard.

    - by athspk
    public static void main(String[] args) throws IOException { String str1 = "??123456"; System.out.println(str1+"-"+str1.matches("^\\p{InGreek}{2}\\d{6}")); //??123456-true BufferedReader br = new BufferedReader(new InputStreamReader(System.in)); String str2 = br.readLine(); //??123456 same as str1. System.out.println(str2+"-"+str2.matches("^\\p{InGreek}{2}\\d{6}")); //?”??123456-false System.out.println(str1.equals(str2)); //false } The same String doesn't match regex when read from keyboard. What causes this problem, and how can we solve this? Thanks in advance. EDIT: I used System.console() for input and output. public static void main(String[] args) throws IOException { PrintWriter pr = System.console().writer(); String str1 = "??123456"; pr.println(str1+"-"+str1.matches("^\\p{InGreek}{2}\\d{6}")+"-"+str1.length()); String str2 = System.console().readLine(); pr.println(str2+"-"+str2.matches("^\\p{InGreek}{2}\\d{6}")+"-"+str2.length()); pr.println("str1.equals(str2)="+str1.equals(str2)); } Output: ??123456-true-8 ??123456 ??123456-true-8 str1.equals(str2)=true


  • [MySQL] Load data from .csv applying regex before insert into table

    - by Gabriel L. Oliveira
    I know that there is a code to import .csv data into a mysql table, and I'm using this one: LOAD DATA INFILE "file.csv" INTO TABLE foo FIELDS TERMINATED BY "," LINES TERMINATED BY "\\r\\n"; The data inside this .csv are lines like this example: 08/e0/Breast_Cancer_Res_2001_Nov_2_3(1)_55-60.tar.gz Breast Cancer Res. 2001 Nov 2; 3(1):55-60 PMC13900 b0/ac/Breast_Cancer_Res_2001_Nov_9_3(1)_61-65.tar.gz Breast Cancer Res. 2001 Nov 9; 3(1):61-65 PMC13901 I just want the first part (the .tar.gz path), always on the pattern (letter or number)(letter or number) / (letter or number)(letter or number)/... and the part starting by 'PMC', always on the pattern PMC(number...) where 'number' means a number between 0 to 9 and a letter means a letter between a to z (both upper and lower case) So, applying the LOAD DATA, and the regex, and inserting the result entries on my sql table, the result table should be: 1 08/e0/Breast_Cancer_Res_2001_Nov_2_3(1)_55-60.tar.gz PMC13900 2 b0/ac/Breast_Cancer_Res_2001_Nov_9_3(1)_61-65.tar.gz PMC13901 What should be the SQL command to do all this?


  • php clean up regex

    - by David
    Hey, can I clean up a preg_match in PHP from this: preg_match_all("/(".$this->reg['wat'].")?(".$this->reg['wat'].")?(".$this->reg['wat'].")?(".$this->reg['wat'].")?(".$this->reg['wat'].")?(".$this->reg['wat'].")?(".$this->reg['wat'].")?/",$value,$match); to look like this: preg_match_all("/ (".$this->reg['wat'].")? (".$this->reg['wat'].")? (".$this->reg['wat'].")? (".$this->reg['wat'].")? (".$this->reg['wat'].")? (".$this->reg['wat'].")? (".$this->reg['wat'].")? /",$value,$match); Right now each space counts as a line break, so it won't return any matches when searching. I ask because the second form just looks cleaner and is easier to read. I was looking for one of those letters to add after the closing "/" in the regex. Thanks


  • Ruby - RegEx problem or maybe another solution altogether

    - by r3nrut
    Ok, the problem I'm having is that I have a block of JavaScript I've successfully scraped out of a website's source, and now I have to sift through the JS to get the specific values I'm looking for. Below is the chunk I need to deal with. I need to find "flvFileName" and get all the file names listed. In this case it's: trailer1,trailer2,trailer3. At first I started using regex to match the start and end tags and then match the file names and extract them to an array, but the problem is that there aren't always 3 videos in the list. Could be 0, 1, 2, 3, 4 etc. So matching doesn't work. Any thoughts on a way to approach this that won't make me continue to abuse my laptop? ["", "\r\n", "\n", "\r\n function IgnoreEnter(e) {\r\n var code;\r\n if (!e) // IE\r\n {\r\n var e = window.event;\r\n }\r\n if (e.keyCode) {\r\n code = e.keyCode;\r\n }\r\n else if (e.which) // Firefox, Opera\r\n {\r\n code = e.which;\r\n }\r\n\r\n if (code == 13) {\r\n e.cancelBubble = true;\r\n e.returnValue = false;\r\n }\r\n }\r\n\r\n function ResetDefault() {\r\n __defaultFired = false;\r\n }\r\n", "", "\r\n// <![CDATA[\r\n$(doc).ready(function () { $('#VideoObject').flash({ swf: '/scinema/video.swf', height: 300, width: 480, hasVersion: 8, menu: false, wmode: 'transparent', bgcolor: '#000',flashvars: {flvFileName: 'trailer1,trailer2,trailer3', age: 'no', isForced: 'true'} }); });
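    One way to sidestep the variable number of videos, sketched in Python rather than Ruby: capture the whole quoted value of `flvFileName` with a single group, then split it on commas outside the regex. The `js` string is a shortened copy of the snippet in the post and the helper name `flv_file_names` is hypothetical.

```python
import re

# Capture everything between the single quotes after "flvFileName:".
FLV_RE = re.compile(r"flvFileName:\s*'([^']*)'")

def flv_file_names(js_source):
    """Return the list of file names, which may be empty."""
    m = FLV_RE.search(js_source)
    if not m:
        return []
    # Splitting afterwards handles 0, 1 or many names uniformly.
    return [name for name in m.group(1).split(',') if name]

js = ("bgcolor: '#000',flashvars: {flvFileName: 'trailer1,trailer2,trailer3', "
      "age: 'no', isForced: 'true'} });")
names = flv_file_names(js)
# names == ['trailer1', 'trailer2', 'trailer3']
```

    The same pattern should work with Ruby's `String#[]` or `match` followed by `split(',')`, though that version is not tested here.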


  • matching images inside a link in regex

    - by user225269
    What is wrong with regex pattern that I created: $link_image_pattern = '/\<a\shref="([^"]*)"\>\<img\s.+\><\/a\>/'; preg_match_all($link_image_pattern, $str, $link_images); What I'm trying to do is to match all the links which has images inside of them. But when I try to output $link_images it contains everything inside the first index: <pre> <?php print_r($link_images); ?> </pre> The markup looks something like this: Array ( [0] = Array ([0] = " <p>&nbsp;</p> <p><strong><a href="url">Title</a></strong></p> <p>Desc</p> <p><a href="{$image_url2}"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border-width:0;" title="image" border="0" alt="image" src="{$image_url2}" width="569" height="409"></a></p> But when outputting the contents of the matches, it simply returns the first string that matches the pattern plus all the other markup in the page like this: <a href="{$image_url}"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border-width:0;" title="image" border="0" alt="image" src="{$image_url}" width="568" height="347"></a></p> <p>&nbsp;</p> <p><strong><a href="url">Title</a></strong></p> <p>Desc</p> <p><a href="{$image_url2}"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border-width:0;" title="image" border="0" alt="image" src="{$image_url2}" width="569" height="409"></a></p>")


  • regex numeric data processing: match a series of numbers greater than X

    - by Mu Mind
    Say I have some data like this: number_stream = [0,0,0,7,8,0,0,2,5,6,10,11,10,13,5,0,1,0,...] I want to process it looking for "bumps" that meet a certain pattern. Imagine I have my own customized regex language for working on numbers, where [[ >=5 ]] represents any number >= 5. I want to capture this case: ([[ >=5 ]]{3,})[[ <3 ]]{2,} In other words, I want to begin capturing any time I look ahead and see 3 or more values >= 5 in a row, and stop capturing any time I look ahead and see 2+ values < 3. So my output should be: >>> stream_processor.process(number_stream) [[5,6,10,11,10,13,5],...] Note that the first 7,8,... is ignored because it's not long enough, and that the capture ends before the 0,1,0.... I'd also like a stream_processor object I can incrementally pass more data into in subsequent process calls, and return captured chunks as they're completed. I've written some code to do it, but it was hideous and state-machiney, and I can't help feeling like I'm missing something obvious. Any ideas to do this cleanly?
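    One trick that may avoid a hand-written state machine: encode every number as a single class letter, run an ordinary regex over the encoded string, and use the match offsets to slice the original list. The sketch below follows the literal pattern from the question (a run of three or more values >= 5 followed directly by two or more values < 3). The letters H, L and M and the function names are illustrative choices, and the incremental `process` interface the post asks for is not covered.

```python
import re

def classify(n):
    """Map a number to one letter: H (>= 5), L (< 3) or M (anything else)."""
    if n >= 5:
        return 'H'
    if n < 3:
        return 'L'
    return 'M'

# ([[ >=5 ]]{3,})[[ <3 ]]{2,} rewritten over the class letters.
BUMP = re.compile(r'(H{3,})L{2,}')

def find_bumps(numbers):
    """Return each captured run of high values as a slice of the input."""
    encoded = ''.join(classify(n) for n in numbers)
    # One letter per number, so string offsets are also list indices.
    return [numbers[m.start(1):m.end(1)] for m in BUMP.finditer(encoded)]

number_stream = [0, 0, 0, 7, 8, 0, 0, 2, 5, 6, 10, 11, 10, 13, 5, 0, 1, 0]
bumps = find_bumps(number_stream)
# bumps == [[5, 6, 10, 11, 10, 13, 5]]
```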


  • LocationMatch Regex for versioning

    - by Aventus
    I've tried using the docs but I'm quite new to regex. I've had success with others but the same method is not working for what I'm actually after. I'm trying to send users to different servers based on the version number in the URL. In this case, older versions are to be sent to the new server for a particular service. <LocationMatch "/(1.0|2.0|3.0)/appname"> ... </LocationMatch> The following is working - <LocationMatch "/1/appname"> ... </LocationMatch> <LocationMatch "/2/appname"> ... </LocationMatch> What I would love to achieve is sending all those major releases with a single tag - <LocationMatch "/(1*|2*|3*)/appname"> ... </LocationMatch> I've already referred the documentation at http://httpd.apache.org/docs/2.2/mod/core.html#locationmatch but unfortunately it doesn't cover my case with enough detail to help me.
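    In a regex, `1*` means "zero or more of the character 1" and is not a wildcard, which is probably why the single-tag attempt does not behave as hoped. The Python sketch below checks one candidate pattern against a few sample paths. It assumes the version segment is a major number 1, 2 or 3 optionally followed by dotted numeric parts, and that the same expression behaves the same way inside `<LocationMatch>`, which has not been verified against Apache here.

```python
import re

# Major version 1-3, then any number of ".<digits>" parts, then /appname.
VERSIONED = re.compile(r'^/[123](?:\.\d+)*/appname')

paths = [
    "/1/appname",
    "/2.0/appname/login",
    "/3.1.4/appname",
    "/4.0/appname",
    "/10/appname",
]
matched = [p for p in paths if VERSIONED.search(p)]
# matched == ['/1/appname', '/2.0/appname/login', '/3.1.4/appname']
```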


  • How to capture strings using * or ? with groups in python regular expressions

    - by user1334085
    When the regular expression has a capturing group followed by "*" or "?", there is no value captured. Instead if you use "+" for the same string, you can see the capture. I need to be able to capture the same value using "?" >>> str1='This string has 29 characters' >>> re.search(r'(\d+)*', str1).group(0) '' >>> re.search(r'(\d+)*', str1).group(1) >>> >>> re.search(r'(\d+)+', str1).group(0) '29' >>> re.search(r'(\d+)+', str1).group(1) '29' More specific question is added below for clarity: I have str1 and str2 below, and I want to use just one regexp which will match both. In case of str1, I also want to be able to capture the number of QSFP ports >>> str1='''4 48 48-port and 6 QSFP 10GigE Linecard 7548S-LC''' >>> str2='''4 48 48-port 10GigE Linecard 7548S-LC''' >>> When I do not use a metacharacter, the capture works: >>> re.search(r'^4\s+48\s+.*(?:(\d+)\s+QSFP).*-LC', str1, re.I|re.M).group(1) '6' >>> It works even when I use the "+" to indicate one occurrence: >>> re.search(r'^4\s+48\s+.*(?:(\d+)\s+QSFP)+.*-LC', str1, re.I|re.M).group(1) '6' >>> But when I use "?" to match for 0 or 1 occurrence, the capture fails even for str1: >>> re.search(r'^4\s+48\s+.*(?:(\d+)\s+QSFP)?.*-LC', str1, re.I|re.M).group(1) >>>
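    A likely explanation, with a sketch: the greedy `.*` in front of the optional group consumes the rest of the line, and because the group is allowed to match zero times the engine never needs to backtrack to find the digits. Moving a lazy `.*?` inside the optional group makes the engine try the QSFP branch first and fall back only when it fails. The helper name `qsfp_ports` is hypothetical.

```python
import re

# The optional group is attempted first; ".*-LC" then matches the remainder.
LINECARD = re.compile(r'^4\s+48\s+(?:.*?(\d+)\s+QSFP)?.*-LC', re.I | re.M)

def qsfp_ports(line):
    """Return the QSFP port count as a string, or None if absent."""
    m = LINECARD.search(line)
    return m.group(1) if m else None

str1 = '4 48 48-port and 6 QSFP 10GigE Linecard 7548S-LC'
str2 = '4 48 48-port 10GigE Linecard 7548S-LC'

ports1 = qsfp_ports(str1)  # '6'
ports2 = qsfp_ports(str2)  # None, although the line itself still matches
```

    The simpler case in the post behaves the same way: `(\d+)*` is satisfied by an empty match at position 0, so `re.search` stops there before reaching the digits.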


  • RegEx strip html tags problem

    - by Aleksandar Mirilovic
    Hi, I've tried to strip html tags using regex replace with pattern "<[^]*" from word generated html that looks like this: <html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:st1="urn:schemas-microsoft-com:office:smarttags" xmlns="http://www.w3.org/TR/REC-html40"> <head> <meta http-equiv=Content-Type content="text/html; charset=iso-8859-2"> <meta name=Generator content="Microsoft Word 11 (filtered medium)"> <!--[if !mso]> <style> v\:* {behavior:url(#default#VML);} o\:* {behavior:url(#default#VML);} w\:* {behavior:url(#default#VML);} .shape {behavior:url(#default#VML);} </style> <![endif]--><o:SmartTagType namespaceuri="urn:schemas-microsoft-com:office:smarttags" name="place" downloadurl="http://www.5iantlavalamp.com/"/> <!--[if !mso]> <style> st1\:*{behavior:url(#default#ieooui) } </style> <![endif]--> <style> <!-- /* Font Definitions / @font-face {font-family:Tahoma; panose-1:2 11 6 4 3 5 4 4 2 4;} / Style Definitions */ p.MsoNormal, li.MsoNormal, div.MsoNormal {margin:0in; margin-bottom:.0001pt; font-size:12.0pt; font-family:"Times New Roman";} a:link, span.MsoHyperlink {color:blue; text-decoration:underline;} a:visited, span.MsoHyperlinkFollowed {color:purple; text-decoration:underline;} span.EmailStyle17 {mso-style-type:personal; font-family:Arial; color:windowtext;} span.EmailStyle18 {mso-style-type:personal-reply; font-family:Arial; color:navy;} @page Section1 {size:8.5in 11.0in; margin:1.0in 1.25in 1.0in 1.25in;} div.Section1 {page:Section1;} --> </style> </head> Everything works fine except for the bolded lines above, anybody got ideas how to match the them to? Thanks, Aleksandar


  • Python regex help

    - by Dormish
    I am trying to make a regex that finds all names, URLs and phone numbers in an HTML page. But I'm having trouble with the phone number part. I think the problem with the numbers part is that it searches until it finds the </strong> but in that process it skips people, instead of making an empty string if the person has no phone number (simply put, instead of a list like this: url1+name1+num1 | url2+name2+"" | url3+name3+num3 it returns a list like this: url1+name1+num1 | url2+name2+num3 , with url3+name3 deleted in the process) for url, name, pnumber in re.findall('Name"><div>(?:<a href="/si([^">]*)"> )?([^<]*)(?:.*?</strong>([^<]*))?',page): I am searching for people in a single very long line. A person could have a URL or phone number. An example of a person with a URL and a phone number <tr> <td class="lablinksName"><div><a href="/si/ivan-bratko/default.html"> dr. Ivan Bratko akad. prof.</a></div></td> <td class="lablinksMail"><a href="javascript:void(cmPopup('sendMessage', '/si/ivan-bratko/mailer.html', true, 350, 350));"><img src="/Static/images/gui/mail.gif" height="8" width="11"></a></td> <td class="lablinksPhone"><div><strong>T:</strong> +386 1 4768 393 </div></td> </tr> And an example of a person with no URL or phone number <tr> <td class="lablinksName"><div> dr. Branko Matjaž Juric prof.</div></td> <td class="lablinksMail"><a href="javascript:void(cmPopup('sendMessage', '/si/branko-matjaz-juric/mailer.html', true, 350, 350));"><img src="/Static/images/gui/mail.gif" height="8" width="11"></a></td> <td class="lablinksPhone"><div> </div></td> </tr> I hope I was clear enough and that someone can help me.


  • php regex guitar tab (tabs or tablature, a type of music notation)

    - by John
    I am in the process of creating a guitar tab to rtttl (Ring Tone Text Transfer Language) converter in PHP. In order to prepare a guitar tab for rtttl conversion I first strip out all comments (comments noted by #- and ended with -#), I then have a few lines that set tempo, note the tunning and define multiple instruments (Tempo 120\nDefine Guitar 1\nDefine Bass 1, etc etc) which are stripped out of the tab and set aside for later use. Now I essentially have nothing left except the guitar tabs. Each tab is prefixed with it's instrument name in conjunction with the instrument name noted prior. Some times we have tabs for 2 separate instruments that are linked because they are to be played together, ie a Guitar and a Bass Guitar playing together. Example 1, Standard Guitar Tab: |Guitar 1 e|--------------3-------------------3------------| B|------------3---3---------------3---3----------| G|----------0-------0-----------0-------0--------| D|--------0-----------0-------0-----------0------| A|------2---------------2---2---------------2----| E|----3-------------------3-------------------3--| Example 2, Conjunction Tab: |Guitar 1 e|--------------3-------------------3------------| B|------------3---3---------------3---3----------| G|----------0-------0-----------0-------0--------| D|--------0-----------0-------0-----------0------| A|------2---------------2---2---------------2----| E|----3-------------------3-------------------3--| | | |Bass 1 G|----------0-------0-----------0-------0--------| D|--------2-----------2-------2-----------2------| A|------3---------------3---3---------------3----| E|----3-------------------3-------------------3--| I have considered other methods of identifying the tabs with no solid results. I am hoping that someone who does regular expressions could help me find a way to identify a single guitar tab and if possible also be able to match a tab with multiple instruments linked together. 
Once the tabs are in an array I will go through them one line at a time and convert them into rtttl lines (exploded at each new line "\n"). I do not want to separate the guitar tabs in the document via explode "\n\n" or something similar because it does not identify the guitar tab, rather, it is identifying the space between the tabs - not on the tabs themselves. I have been messing with this for about a week now and this is the only major hold up I have. Everything else is fairly simple. As of current, I have tried many variations of the regex pattern. Here is one of the most recent test samples: <?php $t = " |Guitar 1 e|--------------3-------------------3------------| B|------------3---3---------------3---3----------| G|----------0-------0-----------0-------0--------| D|--------0-----------0-------0-----------0------| A|------2---------------2---2---------------2----| E|----3-------------------3-------------------3--| |Guitar 1 e|--------------3-------------------3------------| B|------------3---3---------------3---3----------| G|----------0-------0-----------0-------0--------| D|--------0-----------0-------0-----------0------| A|------2---------------2---2---------------2----| E|----3-------------------3-------------------3--| | | |Bass 1 G|----------0-------0-----------0-------0--------| D|--------2-----------2-------2-----------2------| A|------3---------------3---3---------------3----| E|----3-------------------3-------------------3--| "; preg_match_all("/^.*?(\\|).*?(\\|)/is",$t,$p); print_r($p); ?> It is also worth noting that inside the tabs, where the dashes and #'s are, you may also have any variation of letters, numbers and punctuation. The beginning of each line marks the tuning of each string with one of the following case insensitive: a,a#,b,c,c#,d,d#,e,f,f#,g or g. Thanks in advance for help with this most difficult problem.


  • IIS7 URL Redirect with Regex

    - by andyjv
    I'm preparing for a major overhaul of our shopping cart, which is going to completely change how the urls are structured. For what its worth, this is for Magento 1.7. An example URL would be: {domain}/item/sub-domain/sub-sub-domain-5-16-7-16-/8083770?plpver=98&categid=1027&prodid=8090&origin=keyword and redirect it to {domain}/catalogsearch/result/?q=8083710 My web.config is: <?xml version="1.0" encoding="UTF-8"?> <configuration> <system.webServer> <rewrite> <rules> <rule name="Magento Required" stopProcessing="false"> <match url=".*" ignoreCase="false" /> <conditions> <add input="{URL}" pattern="^/(media|skin|js)/" ignoreCase="false" negate="true" /> <add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" /> <add input="{REQUEST_FILENAME}" matchType="IsDirectory" negate="true" /> </conditions> <action type="Rewrite" url="index.php" /> </rule> <rule name="Item Redirect" stopProcessing="true"> <match url="^item/([_\-a-zA-Z0-9]+)/([_\-a-zA-Z0-9]+)/([_\-a-zA-Z0-9]+)(\?.*)" /> <action type="Redirect" url="catalogsearch/result/?q={R:3}" appendQueryString="true" redirectType="Permanent" /> <conditions trackAllCaptures="true"> </conditions> </rule> </rules> </rewrite> <httpProtocol allowKeepAlive="false" /> <caching enabled="false" /> <urlCompression doDynamicCompression="true" /> </system.webServer> </configuration> Right now it seems the redirect is completely ignored, even though in the IIS GUI the sample url passes the regex test. Is there a better way to redirect or is there something wrong with my web.config?


  • I need to remove Java Script tags using regular expressions and JRegex

    - by piotr
    I need to remove all the Java Script tags and the content in between and style tags from the HTML code of web pages.So far I've come up with this expression : "(<[ \r\n\t]script([ \r\n\t]|){1,}([ \r\n\t]|.)?)|(<[ \r\n\t]noscript([ \r\n\t]|){1,}([ \r\n\t]|.)?)|(<[ \r\n\t]style([ \r\n\t]|){1,}([ \r\n\t]|.)?)" I use JRegex library to work with regular expressions. When I test it in any regex tester it works just fine, but once I run my program - it all crashes down with this error report: Exception in thread "Thread-0" java.lang.StackOverflowError at java.util.regex.Pattern$BranchConn.match(Unknown Source) at java.util.regex.Pattern$BmpCharProperty.match(Unknown Source) at java.util.regex.Pattern$Branch.match(Unknown Source) at java.util.regex.Pattern$GroupHead.match(Unknown Source) at java.util.regex.Pattern$LazyLoop.match(Unknown Source) at java.util.regex.Pattern$GroupTail.match(Unknown Source) at java.util.regex.Pattern$BranchConn.match(Unknown Source) at java.util.regex.Pattern$CharProperty.match(Unknown Source) at java.util.regex.Pattern$Branch.match(Unknown Source) at java.util.regex.Pattern$GroupHead.match(Unknown Source) at java.util.regex.Pattern$LazyLoop.match(Unknown Source) .................................. And it keeps on going forever. If anyone can give me an advice on this one - I'll be very grateful.

  • SQL SERVER – Grouping by Multiple Columns to Single Column as A String

    - by pinaldave
    One of the most common questions I receive by email is how to combine the values of a column into a single comma-separated string per row, grouped by another column. I have previously blogged about this in the following two posts, but neither addresses this exact problem: Comma Separated Values (CSV) from Table Column, and Comma Separated Values (CSV) from Table Column – Part 2. The question comes in many different forms, but in the following image I demonstrate it in simple words. This is the most popular question on my Facebook page as well. (Example)

    Here is the sample script to build the sample dataset.

        CREATE TABLE TestTable (ID INT, Col VARCHAR(4))
        GO
        INSERT INTO TestTable (ID, Col)
        SELECT 1, 'A'
        UNION ALL
        SELECT 1, 'B'
        UNION ALL
        SELECT 1, 'C'
        UNION ALL
        SELECT 2, 'A'
        UNION ALL
        SELECT 2, 'B'
        UNION ALL
        SELECT 2, 'C'
        UNION ALL
        SELECT 2, 'D'
        UNION ALL
        SELECT 2, 'E'
        GO
        SELECT *
        FROM TestTable
        GO

    Here is the solution to the above question.

        -- Get CSV values
        SELECT t.ID,
               STUFF((SELECT ',' + s.Col
                      FROM TestTable s
                      WHERE s.ID = t.ID
                      FOR XML PATH('')), 1, 1, '') AS CSV
        FROM TestTable AS t
        GROUP BY t.ID
        GO

    I hope this is an easy solution. I am going to point to this blog post in the future for all similar questions.

    Final clean-up:

        -- Clean up
        DROP TABLE TestTable
        GO

    Here is a question back to you: is there a better way to write the above script? Please leave a comment and I will write a separate blog post with due credit.

    Reference: Pinal Dave (http://blog.sqlauthority.com)
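
    To make explicit what the query computes, here is a rough JavaScript model of the same grouping, used only as an illustration and not as a replacement for the T-SQL. The rows mirror the sample TestTable data; `groupToCsv` is a hypothetical helper that is not part of the original post.

```javascript
// Rows mirror the sample TestTable data from the post.
const rows = [
  { ID: 1, Col: "A" }, { ID: 1, Col: "B" }, { ID: 1, Col: "C" },
  { ID: 2, Col: "A" }, { ID: 2, Col: "B" }, { ID: 2, Col: "C" },
  { ID: 2, Col: "D" }, { ID: 2, Col: "E" },
];

// Hypothetical helper: one output row per ID with its values joined by commas.
function groupToCsv(data) {
  const groups = new Map();
  for (const { ID, Col } of data) {
    // Like the correlated subquery: build ",A,B,C" for each ID ...
    groups.set(ID, (groups.get(ID) ?? "") + "," + Col);
  }
  // ... then drop the leading comma, which is the job of STUFF(..., 1, 1, '').
  return [...groups].map(([ID, csv]) => ({ ID, CSV: csv.slice(1) }));
}

console.log(groupToCsv(rows));
// [ { ID: 1, CSV: 'A,B,C' }, { ID: 2, CSV: 'A,B,C,D,E' } ]
```

    The model keeps insertion order, whereas the SQL version has no ORDER BY in the subquery, so the order of values inside each string is not guaranteed there.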

  • Simplify JavaScript code using regex

    - by Pradyut Bhattacharya
    Hi, I have some code that shows YouTube videos if the text contains any links to YouTube. For example, the text at pradyut.dyndns.org:

        http://www.youtube.com/watch?v=-LiPMxFBLZY testing http://www.youtube.com/watch?v=Q3-l22b_Qg8&feature=related

    I pass this text to the following function:

        function to_youtubelink(text) {
            if ( text.indexOf('<') > 0 || text.indexOf('"') > 0 || text.indexOf('>') > 0 )
                return text;
            else {
                var obj_text = new Array();
                var oi = 0;
                while (text.indexOf('http://') >= 0) {
                    //getting the paths
                    var si = text.indexOf('http://');
                    var gr = text.indexOf('\n', si);
                    var sp = text.indexOf(' ', si);
                    var ei;
                    if ( gr > 0 || sp > 0 ) {
                        if ( gr > 0 && sp > 0 ) {
                            if ( gr < sp ) { ei = gr; }
                            else { ei = sp; }
                        }
                        else if ( gr > 0 ) { ei = gr; }
                        else { ei = sp; }
                    }
                    else { ei = text.length; }
                    var it = text.substring(si, ei);
                    if ( it.indexOf('"') > 0 ) {
                        // keep only the part before the quote
                        it = it.substring(0, it.indexOf('"'));
                    }
                    if (ei < 0) ei = text.length;
                    else ei = text.indexOf(' ', si);
                    obj_text[oi] = it;
                    text = text.replace(it, '[link_service]');
                    oi++;
                }
                var ob_text = new Array();
                var ob = 0;
                for (oi = 0; oi < obj_text.length; oi++) {
                    if ( is_youtubelink(obj_text[oi]) ) {
                        ob_text[ob] = to_utubelink(obj_text[oi]);
                        ob++;
                    }
                }
                oi = 0;
                while ( text.indexOf('[link_service]') >= 0 ) {
                    text = text.replace('[link_service]', obj_text[oi]);
                    oi++;
                }
                for (ob = 0; ob < ob_text.length; ob++) {
                    text = text + "\n\n" + ob_text[ob];
                }
                return text;
            }
        }

        function is_youtubelink(text) {
            var matches = text.match(/http:\/\/(?:www\.)?youtube.*watch\?v=([a-zA-Z0-9\-_]+)/);
            if (matches) { return true; }
            else { return false; }
        }

        function to_utubelink(text) {
            var video_id = text.split('v=')[1];
            var ampersandPosition = video_id.indexOf('&');
            if (ampersandPosition != -1) {
                video_id = video_id.substring(0, ampersandPosition);
            }
            text = "<iframe title=\"YouTube video player\" class=\"youtube-player\" type=\"text/html\" width=\"425\" height=\"350\" src=\"http://www.youtube.com/embed/" + video_id + "\" frameborder=\"0\"></iframe>";
            return text;
        }

    Now I'm getting the output properly, but I was wondering whether the code could be improved and simplified using regex, especially the part that extracts the URLs. Thanks.
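
    A minimal sketch of how the link extraction might collapse into a single regex pass, assuming the goal is unchanged: leave the text as it is and append one embedded player per YouTube link. The names `YOUTUBE_LINK`, `toEmbed` and `embedYouTubeLinks` are hypothetical. The pattern is adapted from `is_youtubelink`, with `.*` narrowed to `\S*?` so that a match cannot run across whitespace into the next URL.

```javascript
// Adapted from is_youtubelink; the capture group is the video id.
const YOUTUBE_LINK =
  /http:\/\/(?:www\.)?youtube\S*?watch\?v=([a-zA-Z0-9\-_]+)/g;

// Builds the same iframe markup as to_utubelink for a given video id.
function toEmbed(videoId) {
  return '<iframe title="YouTube video player" class="youtube-player" ' +
    'type="text/html" width="425" height="350" ' +
    'src="http://www.youtube.com/embed/' + videoId +
    '" frameborder="0"></iframe>';
}

function embedYouTubeLinks(text) {
  // Same guard as the original, except that it also catches a
  // '<', '>' or '"' at index 0 (the original tests indexOf(...) > 0).
  if (/[<>"]/.test(text)) return text;

  // matchAll yields one match per link; m[1] is the captured video id.
  const embeds = Array.from(text.matchAll(YOUTUBE_LINK), (m) => toEmbed(m[1]));
  return embeds.length ? text + "\n\n" + embeds.join("\n\n") : text;
}

const input =
  "http://www.youtube.com/watch?v=-LiPMxFBLZY testing " +
  "http://www.youtube.com/watch?v=Q3-l22b_Qg8&feature=related";

console.log(embedYouTubeLinks(input));
```

    Because the capture group stops at the first character outside the id alphabet, the `&feature=related` suffix is dropped without the separate ampersand handling, and the `[link_service]` placeholder round trip is no longer needed.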
