regex cookbook - Page 99

Use matching value of a RegExp to name the output file.

- by fx42

I have this file "file.txt" which I want to split into many smaller ones. Each line of the file has an id field which looks like "id:1" for a line belonging to id 1. For each id in the file, I like to create a file named idid.txt and put all lines that belong to this id in that file. My brute force bash script solution reads as follows. count=1 while [ $count -lt 19945 ] do cat file.txt | grep "id:$count " >> ./sets/id$count.txt count='expr $count + 1' done Now this is very inefficient as I have do read through the file about 20.000 times. Is there a way to do the same operation with only one pass through the file? - What I'm probably asking for is a way to use the value that matches for a regular expression to name the associated output file.

Read the article

How do I write this URL in Django?

- by alex

(r'^/(?P<the_param>[a-zA-z0-9_-]+)/$','myproject.myapp.views.myview'), How can I change this so that "the_param" accepts a URL(encoded) as a parameter? So, I want to pass a URL to it. mydomain.com/http%3A//google.com

Read the article

Is there an easy method to combine two relative paths in C# ?

- by Ioannis

I want to combine two relative paths in C#. For example: string path1 = "/System/Configuration/Panels/Alpha"; string path2 = "Panels/Alpha/Data"; I want to return string result = "/System/Configuration/Panels/Alpha/Data"; I can implement this by splitting the second array and compare it in a for loop but I was wondering if there is something similar to Path.Combine available or if this can be accomplished with regular expressions or Linq? Thanks

Read the article

How Do I Remove The First 4 Characters From A String If It Matches A Pattern In Ruby

- by James

I have the following string: "h3. My Title Goes Here" I basically want to remove the first 4 characters from the string so that I just get back: "My Title Goes Here". The thing is I am iterating over an array of strings and not all have the h3. part in front so I can't just ditch the first 4 characters blindly. I have checked the docs and the closest think I could find was chomp, but that only works for the end of a string. Right now I am doing this: "h3. My Title Goes Here".reverse.chomp(" .3h").reverse This gives me my desired output, but there has to be a better way right? I mean I don't want to reverse a string twice for no reason. I am new to programming so I might have missed something obvious, but I didn't see the opposite of chomp anywhere in the docs. Is there another method that will work? Thanks!

Read the article

Mod rewrite with multiple query strings

- by Boris

Hi, I'm a complete n00b when it comes to regular expressions. I need these redirects: (1) www.mysite.com/products.php?id=001&product=Product-Name&source=Source-Name should become -> www.mysite.com/Source-Name/001-Product-Name (2) www.mysite.com/stores.php?id=002&name=Store-Name should become -> www.mysite.com/002-Store-Name Any help much appreciated :)

Read the article

How to replace plain URLs with links?

- by Sergio del Amo

I am using the function below to match URLs inside a given text and replace them for HTML links. The regular expression is working great, but currently I am only replacing the first match. How I can replace all the URL? I guess I should be using the exec command, but I did not really figure how to do it. function replaceURLWithHTMLLinks(text) { var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/i; return text.replace(exp,"<a href='$1'>$1</a>"); }

Read the article

string abbreviation in matlab

- by Ali

is there an easy way to abbreviate strings in matlab? ex. 'Superior Temporal Gyrus' = 'STG'

Read the article

Classic asp comparison of comma separated lists

- by Reiwoldt

Hello, I have two comma separated lists:- 36,189,47,183,65,50 65,50,189,47 The question is how to compare the two in classic ASP in order to identify and return any values that exist in list 1 but that don't exist in list 2 bearing in mind that associative arrays aren't available. E.g., in the above example I would need the return value to be 36,183 Thanks

Read the article

Find last match with python regular expression

- by SDD

I wanto to match the last occurence of a simple pattern in a string, e.g. list = re.findall(r"\w+ AAAA \w+", "foo bar AAAA foo2 AAAA bar2) print "last match: ", list[len(list)-1] however, if the string is very long, a huge list of matches is generated. Is there a more direct way to match the second occurence of "AAAA" or should I use this workaround?

Read the article

Delete all characters in a multline string up to a given pattern

- by biffabacon

Using Python I need to delete all charaters in a multiline string up to the first occurrence of a given pattern. In Perl this can be done using regular expressions with something like: #remove all chars up to first occurrence of cat or dog or rat $pattern = 'cat|dog|rat' $pagetext =~ s/(.*?)($pattern)/$2/xms; What's the best way to do it in Python?

Read the article

Can a repeated piece of regular expression create multiple groups? Such as this example...

- by Yousui

Hi guys, I'm using RUBY 's regular expression to deal with text such as ${1:aaa|bbbb} ${233:aaa | bbbb | ccc ccccc } ${34: aaa | bbbb | cccccccc |d} ${343: aaa | bbbb | cccccccc |dddddd ddddddddd} ${3443:a aa|bbbb|cccccccc|d} ${353:aa a| b b b b | c c c c c c c c | dddddd} I want to get the trimed text between each pipe line. For example, for the first line of my upper example, I want to get the result aaa and bbbb, for the second line, I want aaa, bbbb and ccc ccccc. Now I have wrote a piece of regular expression and a piece of ruby code to test it: array = "${33:aaa|bbbb|cccccccc}".scan(/\$\{\s*(\d+)\s*:(\s*[^\|]+\s*)(?:\|(\s*[^\|]+\s*))+\}/) puts array Now my problem is the (?:\|(\s*[^\|]+\s*))+ part can't create multiple groups. I don't know how to solve this problem, because the number of text I need in each line is variable. Anyone can help? Great thanks.

Read the article

Flex 3 Regular Expression Problem

- by Tommy

I've written a url validator for a project I am working on. For my requirements it works great, except when the last part for the url goes longer than 22 characters it breaks. My expression: /((https?):\/\/)([^\s.]+.)+([^\s.]+)(:\d+\/\S+)/i It expects input that looks like "http(s)://hostname:port/location". When I give it the input: https://demo10:443/111112222233333444445 it works, but if I pass the input https://demo10:443/1111122222333334444455 it breaks. You can test it out easily at http://ryanswanson.com/regexp/#start. Oddly, I can't reproduce the problem with just the relevant (I would think) part /(:\d+\/\S+)/i. I can have as many characters after the required / and it works great. Any ideas or known bugs?

Read the article

Django - urls.py - Filenames with a hash/pound (#) sign?

- by miya

I'm using django and realized that when the filename that the user wants to access (let's say a photo) has the pound sign, the entry in the url.py does not match. Any ideas? url(r'^static/(?P<path>.*)$', 'django.views.static.serve', {'document_root': MEDIA_ROOT}, it just says: "/home/user/project/static/upload/images/hello" does not exist when actually the name of the file is: hello#world.jpg Thanks, Nico

Read the article

parse youtube video id using preg_match

- by Webbo

Hi, I am attempting to parse the video ID of a youtube URL using preg_match. I found a regular expression on this site that appears to work; (?<=v=)[a-zA-Z0-9-]+(?=&)|(?<=[0-9]/)[^&\n]+|(?<=v=)[^&\n]+ As shown in this pic; http://i.imgur.com/SQJW2.jpg My PHP is as follows, but it doesn't work (gives Unknown modifier '[' error)... <? $subject = "http://www.youtube.com/watch?v=z_AbfPXTKms&NR=1"; preg_match("(?<=v=)[a-zA-Z0-9-]+(?=&)|(?<=[0-9]/)[^&\n]+|(?<=v=)[^&\n]+", $subject, $matches); print "<pre>"; print_r($matches); print "</pre>"; ?> Cheers

Read the article

Nullability (Regular Expressions)

- by danportin

In Brzozowski's "Derivatives of Regular Expressions" and elsewhere, the function d(R) returning ? if a R is nullable, and Ø otherwise, includes clauses such as the following: d(R1 + R2) = d(R1) + d(R2) d(R1 · R2) = d(R1) ? d(R2) Clearly, if both R1 and R2 are nullable then (R1 · R2) is nullable, and if either R1 or R2 is nullable then (R1 + R2) is nullable. It is unclear to me what the above clauses are supposed to mean, however. My first thought, mapping (+), (·), or the Boolean operations to regular sets is nonsensical, since in the base case, d(a) = Ø (for all a ? S) d(?) = ? d(Ø) = Ø and ? is not a set (nor is the return type of d, which is a regular expression). Furthermore, this mapping isn't indicated, and there is a separate notation for it. I understand nullability, but I'm lost on the definition of the sum, product, and Boolean operations in the definition of d: how are ? or Ø returned from d(R1) ? d(R2), for instance, in the definition off d(R1 · R2)?

Read the article

What regular expression(s) would I use to remove escaped html from large sets of data.

- by Elizabeth Buckwalter

Our database is filled with articles retrieved from RSS feeds. I was unsure of what data I would be getting, and how much filtering was already setup (WP-O-Matic Wordpress plugin using the SimplePie library). This plugin does some basic encoding before insertion using Wordpress's built in post insert function which also does some filtering. I've figured out most of the filters before insertion, but now I have whacko data that I need to remove. This is an example of whacko data that I have data in one field which the content I want in the front, but this part removed which is at the end: <img src="http://feeds.feedburner.com/~ff/SoundOnTheSound?i=xFxEpT2Add0:xFbIkwGc-fk:V_sGLiPBpWU" border="0"></img> <img src="http://feeds.feedburner.com/~ff/SoundOnTheSound?d=qj6IDK7rITs" border="0"></img> <img src="http://feeds.feedburner.com/~ff/SoundOnTheSound?i=xFxEpT2Add0:xFbIkwGc-fk:D7DqB2pKExk" Notice how some of the images are escape and some aren't. I believe this has to do with the last part being cut off so as to be unrecognizable as an html tag, which then caused it to be html endcoded. Another field has only this which is now filtered before insertion, but I have to get rid of the others: <img src="http://farm3.static.flickr.com/2183/2289902369_1d95bcdb85.jpg" alt="post_img" width="80" (all examples are on one line, but broken up for readability) Question: What is the best way to work with the above escaped html (or portion of an html tag)? I can do it in Perl, PHP, SQL, Ruby, and even Python. I believe Perl to be the best at text parsing, so that's why I used the Perl tag. And PHP times out on large database operations, so that's pretty much out unless I wanted to do batch processing and what not. PS One of the nice things about using Wordpress's insert post function, is that if you use php's strip_tags function to strip out all html, insert post function will insert <p> at the paragraph points. Let me know if there's anything more that I can answer. Some article that didn't quite answer my questions. (http://stackoverflow.com/questions/2016751/remove-text-from-within-a-database-text-field) (http://stackoverflow.com/questions/462831/regular-expression-to-escape-html-ampersands-while-respecting-cdata)

Read the article

Weird error using preg_match and unicode

- by Thorpe Obazee

if (preg_match('(\p{Nd}{4}/\p{Nd}{2}/\p{Nd}{2}/\p{L}+)', '2010/02/14/this-is-something')) { // do stuff } The above code works. However this one doesn't. if (preg_match('/\p{Nd}{4}/\p{Nd}{2}/\p{Nd}{2}/\p{L}+/u', '2010/02/14/this-is-something')) { // do stuff } Maybe someone could shed some light as to why the one below doesn't work. This is the error that is being produced: A PHP Error was encountered Severity: Warning Message: preg_match() [function.preg-match]: Unknown modifier '\'

Read the article

Confusion in RegExp Reluctant quantifier? Java

- by Dusk

Hi, Could anyone please tell me the reason of getting an output as: ab for the following RegExp code using Relcutant quantifier? Pattern p = Pattern.compile("abc*?"); Matcher m = p.matcher("abcfoo"); while(m.find()) System.out.println(m.group()); // ab and getting empty indices for the following code? Pattern p = Pattern.compile(".*?"); Matcher m = p.matcher("abcfoo"); while(m.find()) System.out.println(m.group());

Read the article

Switch statement for string matching in JavaScript

- by yaya3

How do I write a swtich for the following conditional? If the url contains "foo", then settings.base_url is "bar". The following is achieving the effect required but I've a feeling this would be more manageable in a switch: var doc_location = document.location.href; var url_strip = new RegExp("http:\/\/.*\/"); var base_url = url_strip.exec(doc_location) var base_url_string = base_url[0]; //BASE URL CASES // LOCAL if (base_url_string.indexOf('xxx.local') > -1) { settings = { "base_url" : "http://xxx.local/" }; } // DEV if (base_url_string.indexOf('xxx.dev.yyy.com') > -1) { settings = { "base_url" : "http://xxx.dev.yyy.com/xxx/" }; } Thanks

Read the article

Regular expression for matching words between <blockquote> & </blockquote>

- by Senthil

Basically I want to strip the document of words between blockquotes. I'm a regular expression newb and even after using rubular, I'm no closer to the answer. Any help is appreciated.

Read the article

How can I strip all html using a preg_replace ?

- by Webnet

strip_tags only catches tags that have a beginning and end tag. With the strings I'm working with it's causing issues and I need to removed all HTML tags.

Read the article

Using Regular Expression in VC++

- by Benit

Hi , I am finding Email ids in mu project, where I am preprocessing the input using some Regular Expression. RegExpPhone6.RegComp("[\[\{$][ -]?[s][h][i][f][t][ -]?[+-][2][ -]?[\]\}$]"); Here while I am compiling i am getting a warning msg like Warning 39 warning C4129: ')' : unrecognized character escape sequence How can i resolve this ? Why this is occuring and Where will it affect? Kindly help me...

Read the article

.net Regular Expression to match any kind of letter from any language

- by pedro

Which regular expression can I use to match (allow) any kind of letter from any language I need to match any letter including any diacritics (e.g. á, ü, ñ, etc.) and exlude any kind of symbol (math symbols, currency signs, dingbats, box-drawing characters, etc.) and punctuation characters. I've tried using \p{L}\p{M}* but it doesn't work.

Read the article

string substitution regular expression not working in tcl

- by Puneet Mittal

i am trying to replace all the special characters including white space, hyphen, etc, to underscore, from a string variable in tcl. I wrote the code below but it doesn't seem to be working. set varname $origVar puts "Variable Name :>> $varname" if {$varname != ""} { regsub -all {[\s-\]\[$^?+*()|\\%&#]} $varname "_" $newVar } puts "New Variable :>> $newVar" one issue is that, instead of replacing the string in $varname, it is replacing the data inside $origVar. No idea why, and also i read the example code (for proper syntax) in my tcl book and according to that it should be something like this regsub -all {[\s-][$^?+*()|\\%&#]} $varname "_" newVar so i used the same syntax but it didn't work and gave the same result as modifying the $origVar instead of required $varname value.

Read the article

javascript split() array contains

- by Mahesha999

While learning JavaScript, I did not get why the output when we print the array returned of the Sting.split() method (with regular expression as an argument) is as explained below. var colorString = "red,blue,green,yellow"; var colors = colorString.split(/[^\,]+/); document.write(colors); //this print 7 times comma: ,,,,,,, However when I print individual element of the array colors, it prints an empty string, three commas and an empty string: document.write(colors[0]); //empty string document.write(colors[1]); //, document.write(colors[2]); //, document.write(colors[3]); //, document.write(colors[4]); //empty string document.write(colors[5]); //undefined document.write(colors[6]); //undefined Then, why printing the array directly gives seven commas. Though I think its correct to have three commas in the second output, I did not get why there is a starting (at index 0) and ending empty string (at index 4). Please explain I am screwed up here.

Search Results

Search found 3956 results on 159 pages for 'regex cookbook'.

Page 99/159 | < Previous Page | 95 96 97 98 99 100 101 102 103 104 105 106 | Next Page >

- by fx42

- by alex

- by Ioannis

- by James

- by Boris

- by Sergio del Amo

- by Ali

- by Reiwoldt

- by SDD

- by biffabacon

- by Yousui

- by Tommy

- by miya

- by Webbo

- by danportin

- by Elizabeth Buckwalter

- by Thorpe Obazee

- by Dusk

- by yaya3

- by Senthil

- by Webnet

- by Benit

- by pedro

- by Puneet Mittal

- by Mahesha999

< Previous Page | 95 96 97 98 99 100 101 102 103 104 105 106 | Next Page >