Search Results

Search found 9016 results on 361 pages for 'regex libraries'.


How to use regex to match ASTERISK in awk

- by Ken Chen

I'm still pretty new to regular expressions and have just started learning awk. I'm trying to write a ksh script that reads lines from a text file and, for every line that matches the following:

    *RECORD 0000001 [some_serial_#]

replaces $2 (i.e. 0000001) with a different number. Essentially, the script reads in a batch record dump, replaces each record number with date+record#, and writes the result to a separate file. This is the format I have in mind:

    awk 'match($0,"/*FTR")!=0{$2="$DATE-n++"; print $0} match($0,"/*FTR")==0{print $0}' $BATCH > $OUTPUT

but obviously "/*FTR" is not going to work, and I'm not sure whether changing $2 and then writing the whole line is the correct way to do this. So I am in need of some serious enlightenment.
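For comparison, here is a minimal Python sketch of the same idea (not the awk solution itself). It assumes the marker is the literal text *RECORD at the start of the line and that the new number has the form date-counter; the function name `renumber` and the date format are made up for illustration. The key point is that the asterisk is escaped so it is matched literally.

```python
import re

# The asterisk is a regex metacharacter, so it is escaped to match literally.
RECORD_LINE = re.compile(r"^\*RECORD\s")

def renumber(lines, date):
    """Replace field 2 of every *RECORD line with '<date>-<n>' (hypothetical format)."""
    out = []
    n = 0
    for line in lines:
        if RECORD_LINE.match(line):
            n += 1
            fields = line.split()
            fields[1] = f"{date}-{n}"
            # Joining with single spaces collapses runs of whitespace,
            # much as awk does when a field is reassigned.
            line = " ".join(fields)
        out.append(line)
    return out
```

Lines that do not start with *RECORD pass through unchanged, which mirrors the two-branch structure of the awk attempt.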

javascript RegEx hashtag matching #foo and #foo-fåäö but not http://this.is/no#hashtag

- by Simon B.

Currently we're using JavaScript new RegExp('#[^,#=!\s][^,#=!\s]*') (see [1]) and it mostly works, except that it also matches URLs with anchors like http://this.is/no#hashtag, and we'd also rather avoid matching foo#bar. Some attempts have been made with look-ahead, but either it doesn't work or I just don't get it.

With the source text below:

    #public #writable #kommentarer-till-beta -- all these should be matched
    Verkligen #bra jobbat! T ex #kommentarer till #artiklar och #blogginlägg, kool. -- mixed within text
    http://this.is/no#hashtag -- problem
    xxy#bar -- We'd prefer not matching this one, and...
    #foo=bar =foo#bar -- we probably shouldn't match any of those either.
    #foo,bar #foo;bar #foo-bar #foo:bar -- We're flexible on whether these get matched in part or in full.

we'd like to get the output below (showing $ instead of <a class=tag href=.....>...</a> for readability reasons):

    $ $ $ -- all these should be matched
    Verkligen $ jobbat! T ex $ till $ och $, kool. -- mixed within text
    http://this.is/no$ -- problem
    xxy$ -- We'd prefer not matching this one, and...
    $=bar =foo$ -- we probably shouldn't match any of those either.
    $,bar $ $ $ -- We're flexible on whether these get matched in part or in full

[1] http://github.com/ether/pad/blob/master/etherpad/src/plugins/twitterStyleTags/hooks.js
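One possible approach, sketched in Python rather than JavaScript: require that the # is not preceded by a non-space character, and that the tag is not followed by more tag-like characters or an equals sign. The lookbehind is the assumption here; JavaScript engines of that era did not support it, so a port would probably need to capture the preceding character instead.

```python
import re

# (?<!\S)        the '#' must sit at the start of the text or after whitespace
# [^\s,#=!]+     the tag body, using the character class from the question
# (?![^\s,#!])   the tag must end at whitespace, ',', '#', '!' or end of text,
#                which also rejects '#foo=bar'
HASHTAG = re.compile(r"(?<!\S)#[^\s,#=!]+(?![^\s,#!])")

sample = (
    "#public #writable #kommentarer-till-beta\n"
    "Verkligen #bra jobbat! T ex #kommentarer till #artiklar och #blogginlägg, kool.\n"
    "http://this.is/no#hashtag\n"
    "xxy#bar\n"
    "#foo=bar =foo#bar\n"
)

tags = HASHTAG.findall(sample)
```

With this sample, the URL anchor, xxy#bar and both #foo=bar forms are skipped, while the tags in the first two lines are found.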

Simple regex question?

- by Joan Venge

In the streams I am parsing I need to parse something in this pattern:

    <b>PaintTitle</b></td><td class=detail valign="top" align=left><div align=left><font size=small><b>The new great album by Pet Shop Boys</b>

How would I get the string "The new great album by Pet Shop Boys", where <b>PaintTitle</b> is guaranteed to appear once per album?
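A small Python sketch of one way to do this, assuming the title is always the first <b>...</b> element after the PaintTitle label. For anything less regular than this fragment, an HTML parser would likely be more robust than a regex.

```python
import re

html = ('<b>PaintTitle</b></td><td class=detail valign="top" align=left>'
        '<div align=left><font size=small>'
        '<b>The new great album by Pet Shop Boys</b>')

# Anchor on the PaintTitle label, then lazily skip ahead to the next <b>...</b> pair.
TITLE = re.compile(r"<b>PaintTitle</b>.*?<b>(.*?)</b>", re.DOTALL)

def paint_title(fragment):
    """Return the text of the first <b> element after the label, or None."""
    m = TITLE.search(fragment)
    return m.group(1) if m else None
```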

Fuzzy Regex, Text Processing, Lexical Analysis?

- by justinzane

I'm not quite sure what terminology to search for, so my title is funky... Here is the workflow I've got:

1. Semi-structured documents are scanned to file.
2. The files are OCR'd to text.
3. The text is parsed into Python objects.
4. The objects are serialized (to SQL, JSON, whatever) for use.

The documents are structured like this:

    HEADER blah blah, Page ### blah
    Garbage text...
    1. Question Text... continued until now.
    A. Choice text... adsadsf.
    B. Another Choice...
    2. Another Question...

I need to extract the questions and choices. The problem is that, because the text is OCR output, there are occasional strange substitutions, such as 'Z' for '2', which make ordinary regular expressions useless. I've tried the Levenshtein module and it helps, but it requires prior knowledge of what edit distance to expect. I don't know whether I'm looking to create a parser, a lexer, or something else. This has led me down all kinds of interesting but irrelevant paths. Guidance would be greatly appreciated.

Oh, also, the text is generally from specific technical domains, so general spelling tools are not so helpful.

Regarding the structure of the documents, there is no clear visual pattern, like line breaks or indentation, except that "questions" usually begin a line. Crap on the document can cause characters to appear before the actual beginning of the line, which means that something along the lines of r'^[0-9]+' does not work reliably. Though the "questions" always begin with an int, a period and a space, the OCR can substitute other characters or skip characters. This is not so much a problem with Tesseract or Cuneiform as with the poor quality of the paper documents.

# Note: for the project in question, it was decided that having a human prep the OCR'd text was better than spending the time coding a solution. I'd still love good pointers, however.
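One cheap thing to try before a full parser is a regex that tolerates known OCR confusions and a little leading noise. The sketch below is only an illustration: the confusion table, the allowance of up to three junk characters, and the function name `parse_question` are all assumptions, and the approach will produce false positives on real data.

```python
import re

# Assumed confusion table: letters that OCR may substitute for digits.
# 'B' and 'S' are deliberately left out so that choice labels such as
# "B." are not mistaken for question numbers.
OCR_DIGITS = str.maketrans({"O": "0", "o": "0", "I": "1", "l": "1", "Z": "2"})

# Up to three non-word junk characters, a 1-3 character "number",
# an optional stray space, a punctuation mark, then the question text.
QUESTION_START = re.compile(r"^\W{0,3}([0-9OoIlZ]{1,3})\s?[.,:;]\s*(.+)$")

def parse_question(line):
    """Return (number, text) if the line looks like a question start, else None."""
    m = QUESTION_START.match(line)
    if m is None:
        return None
    number = int(m.group(1).translate(OCR_DIGITS))
    return number, m.group(2).strip()
```

A check that the recovered numbers increase by one from question to question would be a natural follow-up filter for the false positives.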

Modify jQuery Highlight? Javascript Regex

- by Matrym

How can I modify jQuery highlight so that it doesn't find matches that appear directly before or after an alpha character? In other words, how do I prevent a match mid-word?

    /*
    highlight v3
    Highlights arbitrary terms.
    <http://johannburkard.de/blog/programming/javascript/highlight-javascript-text-higlighting-jquery-plugin.html>
    MIT license.
    Johann Burkard <http://johannburkard.de> <mailto:[email protected]>
    */
    jQuery.fn.highlight = function(pat) {
      function innerHighlight(node, pat) {
        var skip = 0;
        if (node.nodeType == 3) {
          var pos = node.data.toUpperCase().indexOf(pat);
          if (pos >= 0) {
            var spannode = document.createElement('span');
            spannode.className = 'highlight';
            var middlebit = node.splitText(pos);
            var endbit = middlebit.splitText(pat.length);
            var middleclone = middlebit.cloneNode(true);
            spannode.appendChild(middleclone);
            middlebit.parentNode.replaceChild(spannode, middlebit);
            skip = 1;
          }
        }
        else if (node.nodeType == 1 && node.childNodes && !/(script|style)/i.test(node.tagName)) {
          for (var i = 0; i < node.childNodes.length; ++i) {
            i += innerHighlight(node.childNodes[i], pat);
          }
        }
        return skip;
      }
      return this.each(function() {
        innerHighlight(this, pat.toUpperCase());
      });
    };

    jQuery.fn.removeHighlight = function() {
      return this.find("span.highlight").each(function() {
        this.parentNode.firstChild.nodeName;
        with (this.parentNode) {
          replaceChild(this.firstChild, this);
          normalize();
        }
      }).end();
    };

RegEx - How To Insert String Before File Extension

- by st4ck0v3rfl0w

Hi All, how would I insert "_thumb" into the names of files that are being dynamically generated? For example, I have a site that allows users to upload an image. The script takes the image, optimizes it and saves it to file. How would I make it insert the string "_thumb" for the optimized image? I'm currently saving one version of the optimized file:

    ch-1268312613-photo.jpg

I want to save the original under the name above, but append "_thumb" for the optimized copy, like this:

    ch-1268312613-photo_thumb.jpg
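The question does not say which language the upload script uses, so here is a Python sketch of two equivalent approaches: splitting off the extension with the standard library, and a regex that captures the final extension. The function names are illustrative.

```python
import os.path
import re

def thumb_name(filename, suffix="_thumb"):
    """Insert the suffix before the extension using os.path.splitext."""
    root, ext = os.path.splitext(filename)
    return f"{root}{suffix}{ext}"

def thumb_name_re(filename, suffix="_thumb"):
    """Same result with a regex: capture the last '.ext' and put the suffix in front of it."""
    return re.sub(r"(\.[^./\\]+)$", lambda m: suffix + m.group(1), filename, count=1)
```

Both leave a name without an extension alone in different ways: `thumb_name` appends the suffix at the end, while `thumb_name_re` returns the name unchanged.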

Regex for finding an unterminated string

- by Austin Hyde

I need to search for lines in a CSV file that end in an unterminated, double-quoted string. For example:

    1,2,a,b,"dog","rabbit

would match, whereas

    1,2,a,b,"dog","rabbit","cat bird"
    1,2,a,b,"dog",rabbit

would not. I have very limited experience with regular expressions, and the only thing I could think of is something like

    "[^"]*$

However, that matches from the last quote to the end of the line. How would this be done?
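One way to read the requirement is "the line contains an odd number of double quotes". A Python sketch under that assumption (it ignores escaped or doubled quotes inside fields):

```python
import re

# Consume balanced pairs of quotes, then require exactly one more quote
# with no further quote before the end of the line.
UNTERMINATED = re.compile(r'^(?:[^"]*"[^"]*")*[^"]*"[^"]*$')

def has_unterminated_quote(line):
    return UNTERMINATED.match(line) is not None
```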

Naming convention of regex lookahead and lookbehind

- by user198729

Why is it counterintuitive? In /(?<!\d)\d{8}(?!\d)/, (?<!\d) comes first but is called lookbehind, and (?!\d) comes next but is called lookahead. It all seems counterintuitive. What's the reason for naming them this way?
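The names describe direction relative to the current position in the subject string, not order inside the pattern: a lookbehind inspects the text just behind the current position (to its left), and a lookahead inspects the text just ahead of it. A short Python demonstration using the same pattern on an invented sample string:

```python
import re

# (?<!\d)  look behind the current position: the previous character is not a digit
# \d{8}    exactly eight digits
# (?!\d)   look ahead of the current position: the next character is not a digit
EIGHT_DIGITS = re.compile(r"(?<!\d)\d{8}(?!\d)")

text = "order 12345678, serial 123456789, code A87654321B"
matches = EIGHT_DIGITS.findall(text)
```

The nine-digit run is rejected from both sides: starting at its first digit the lookahead sees a ninth digit, and starting at its second digit the lookbehind sees the first.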

Powershell Regex help in extracting text between strings

- by vivekeviv

I have arguments like the ones below, which I pass to a PowerShell script:

    -arg1 -abc -def -arg2 -ghi -jkl -arg3 -123 -234

Now I need to extract three strings without any surrounding whitespace:

    string 1: "-abc -def"
    string 2: "-ghi -jkl"
    string 3: "-123 -234"

I figured this expression could do it, but it doesn't seem to work:

    $args -match '-arg1(?'arg1'.*?) -arg3(?'arg3'.*?) -arg3(?'arg3'.*)'

This should return $matches['arg1'] etc. So what's wrong with the expression above? Why do I get the error shown below?

    runScript.ps1 -arg1 -abc -def -arg2 -ghi -jkl -arg3 -123 -234
    Unexpected token 'arg1'.?) -arg2 (?'arg2'.?) -arg3 (?'arg3'.)'' in expression or statement.
    At G:\powershell\tools\powershell\runTest.ps1:1 char:71
    + $args -match '-arg1 (?'arg1'.?) -arg2 (?'arg2'.?) -arg3 (?'arg3'.)' <<<<
        + CategoryInfo : ParserError: (arg1'.?) -arg2...g3 (?'arg3'.)':String) [], ParseException
        + FullyQualifiedErrorId : UnexpectedToken

My second question is: how do I make arg1, arg2 or arg3 optional? The arguments to the script can be -arg2 -def -ghi. I'll take some default values for any arg(1|2|3) that is not mentioned. Thanks
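The parse error looks consistent with the nested single quotes ending the string literal early; in .NET regexes the (?<name>...) form of a named group avoids quotes altogether. For the optional-argument part, here is a Python sketch of one approach (not PowerShell): find each -argN marker independently and merge the results over a table of defaults. The default values and the function name `parse_args` are placeholders.

```python
import re

# Match one "-argN value..." section at a time; the value runs lazily until
# the next -argN marker or the end of the string.
ARG = re.compile(r"-(arg[123])\s+(.*?)(?=\s+-arg[123]\b|\s*$)")

DEFAULTS = {"arg1": "", "arg2": "", "arg3": ""}  # hypothetical defaults

def parse_args(command_line):
    """Return a dict with a value for every arg, using defaults for missing ones."""
    found = dict(ARG.findall(command_line))
    return {**DEFAULTS, **found}
```

Because each section is matched on its own, the arguments may appear in any order and any of them may be missing.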

Weird Javascript Regex Replace Backreference Behavior

- by arshaw

Why does the following JS expression:

    "test1 foo bar test2".replace(/foo.bar/, "$'")

result in the following string?

    "test1 test2 test2"

Is the $' in the replacement string some sort of control code for including everything after the match? This behavior was screwing with me most of the day. Can anyone explain this? Thanks a lot.

PS: this is the case in all browsers I've tested.
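In JavaScript replacement strings, $' stands for the portion of the subject that follows the match, so the matched text is replaced by a copy of everything after it. Python has no such token, but the effect can be imitated by hand, which may make the result easier to see. The helper name is invented for this sketch.

```python
import re

def replace_with_right_context(pattern, subject):
    """Imitate JavaScript's "$'" replacement for the first match only."""
    m = re.search(pattern, subject)
    if m is None:
        return subject
    right_context = subject[m.end():]          # everything after the match
    # text before the match + replacement (the right context) + the untouched rest
    return subject[:m.start()] + right_context + right_context

result = replace_with_right_context(r"foo.bar", "test1 foo bar test2")
```

Here the text before the match is "test1 " and the right context is " test2", so the right context appears twice in the result.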

sed regex to match ['', 'WR' or 'RN'] + 2-4 digits

- by Karl

Hi, I'm trying to do some conditional text processing on Unix and am struggling with the syntax. I want to achieve:

    Find the first 2, 3 or 4 digits in the string
    if 2 characters before the found digits are 'WR' (could also be lower case)
        Variable = the string we've found (e.g. WR1234)
        Type = "work request"
    else
        if 2 characters before the found digits are 'RN' (could also be lower case)
            Variable = the string we've found (e.g. RN1234)
            Type = "release note"
        else
            Variable = "WR" + the string we've found (Prepend 'WR' to the digits)
            Type = "Work request"
        fi
    fi

I'm doing this in a Bash shell on Red Hat Enterprise Linux Server release 5.5 (Tikanga).

Thanks in advance, Karl
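The pseudocode above can be sketched in Python as a single regex with an optional, case-insensitive prefix group. This is an illustration of the logic rather than a sed answer; the function name `classify` is made up, and the type strings are copied from the pseudocode.

```python
import re

# Optional WR/RN prefix (any case) followed by the first run of 2-4 digits.
REFERENCE = re.compile(r"(WR|RN)?(\d{2,4})", re.IGNORECASE)

def classify(text):
    """Return (variable, type) for the first reference found, or None."""
    m = REFERENCE.search(text)
    if m is None:
        return None
    prefix, digits = m.group(1), m.group(2)
    if prefix and prefix.upper() == "WR":
        return m.group(0), "work request"
    if prefix and prefix.upper() == "RN":
        return m.group(0), "release note"
    # No recognised prefix: prepend 'WR' to the digits.
    return "WR" + digits, "Work request"
```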

Perl regex matching output from `w -hs` command

- by Bushman

I'm trying to write a Perl script that will work better with KDE's kwrited, which, as far as I can tell, is connected to a pts and puts every line it receives through the KDE system tray notifications, with the title "KDE write daemon". Unfortunately, it makes a separate notification for each and every line, so it spams up the system tray with multiline messages on regular old write, and for some reason it cuts off the entire last line of the message when using wall (one-line messages are also goners). I was also hoping to make it so that it could broadcast across a LAN with thick clients. Before starting on that (which would require ssh, of course), I tried to make an ssh-less version to make sure it works. Unfortunately, it doesn't. Running

    perl ./write.pl "Testing 1 2 3"

where the following is the contents of ./write.pl:

    #!/usr/bin/perl
    use strict;
    use warnings;
    my $message = "";
    my $device = "";
    my $possibledevice = '`w -hs | grep "/usr/bin/kwrited"`'; #Where is kwrited?
    $possibledevice =~ s/^[^\t][\t]//;
    $possibledevice =~ s/[\t][^\t][\t ]\/usr\/bin\/kwrited$//;
    $possibledevice = '/dev/'.$possibledevice;
    unless ($possibledevice eq "") {
        $device = $possibledevice;
    }
    if ($ARGV[0] ne "") {
        $message = $ARGV[0];
        $device = $ARGV[1];
    }
    else {
        $device = $ARGV[0] unless $ARGV[0] eq "";
        while (<STDIN>) {
            chomp;
            $message .= <STDIN>;
        }
    }
    if ($message ne "") {
        system "echo \'$message\' > $device";
    }
    else {
        print "Error: empty message"
    }

produces the following error:

    $ perl write.pl "Testing 1 2 3"
    Use of uninitialized value $device in concatenation (.) or string at write.pl line 29.
    sh: -c: line 0: syntax error near unexpected token `newline'
    sh: -c: line 0: `echo 'foo' > '

Somehow, the regular expressions and/or the backtick escape in processing $possibledevice are not working properly, because where kwrited is connected to /dev/pts/0, the following works perfectly:

    $ perl write.pl "Testing 1 2 3" /dev/pts/0

regex , php, preg_match

- by Michael

I'm trying to extract the mileage value from different eBay pages, but I'm stuck because the pages differ slightly and there seem to be too many patterns. I would therefore like to know if you can help me with a better pattern. Some example items are the following:

    http://cgi.ebay.com/ebaymotors/1971-Chevy-C10-Shortbed-Truck-/250647101696?cmd=ViewItem&pt=US_Cars_Trucks&hash=item3a5bbb4100
    http://cgi.ebay.com/ebaymotors/1987-HANDICAP-LEISURE-VAN-W-WHEEL-CHAIR-LIFT-/250647101712?cmd=ViewItem&pt=US_Cars_Trucks&hash=item3a5bbb4110
    http://cgi.ebay.com/ebaymotors/ws/eBayISAPI.dll?ViewItemNext&item=250647101696

Please see the patterns at the following link (I still cannot figure out how to escape the HTML here): http://pastebin.com/zk4HAY3T However, they are not enough, as there still seem to be new patterns....

Scite Lua - escaping right bracket in regex?

- by ~sd-imi

Hi all, I bumped into a somewhat weird problem... I want to turn the string

    a\left(b_{d}\right)

into

    a \left( b_{d} \right)

in Scite using a Lua script. So, I made the following Lua script for Scite:

    function SpaceTexEquations()
      editor:BeginUndoAction()
      local sel = editor:GetSelText()
      local cln3 = string.gsub(sel, "\\left(", " \\left( ")
      local cln4 = string.gsub(cln3, "\\right)", " \\right) ")
      editor:ReplaceSel(cln4)
      editor:EndUndoAction()
    end

The cln3 line works fine; however, cln4 crashes with:

    /home/user/sciteLuaFunctions.lua:49: invalid pattern capture
    >Lua: error occurred while processing command

I think this is because bracket characters () are reserved characters in Lua; but then, how come the cln3 line works without escaping? By the way, I also tried:

    -- using backslash \ as escape char:
    local cln4 = string.gsub(cln3, "\\right\)", " \\right) ") -- crashes all the same
    -- using percentage sign % as escape char
    local cln4 = string.gsub(cln3, "\\right%)", " \\right) ") -- does not crash, but does not match either

Could anyone tell me what would be the correct way to do this? Thanks, Cheers!
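In Lua patterns, parentheses delimit captures and % is the escape character, so a literal parenthesis is written %( or %). The general idea of escaping every pattern metacharacter in a literal search string can be sketched in Python, where re.escape does the escaping automatically. This is only an analogy to the Lua script, not a fix for it.

```python
import re

def space_tex_delimiters(expr):
    """Put a space on each side of \\left( and \\right) in a TeX expression."""
    for token in (r"\left(", r"\right)"):
        # re.escape neutralises both the backslash and the parenthesis,
        # so the token is searched for literally.
        expr = re.sub(re.escape(token), lambda m: f" {m.group(0)} ", expr)
    return expr.strip()
```

A function is used as the replacement so that the backslash in the token is not reinterpreted as an escape in the replacement text.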

how to prevent white spaces in a regular expression regex validation

- by Rees

I am completely new to regular expressions and am trying to create a regular expression in Flex for validation. Using a regular expression, I want to validate that the user input does NOT contain any whitespace and consists only of characters and digits... starting with digit. So far I have:

    expression="[A-Za-z][A-Za-z0-9]*"

This correctly checks that the user input starts with a character followed by possible digits, but it does not check for whitespace (in my tests, user input containing a space passes validation, which is not desired). Can someone tell me how to modify this expression so that user input with whitespace is flagged as invalid?
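A likely cause is that the pattern is only required to match somewhere in the input rather than the whole input, in which case anchoring it as ^[A-Za-z][A-Za-z0-9]*$ is the usual remedy. Whether the Flex validator behaves this way is an assumption here; the Python sketch below just shows the difference between a partial match and a full match.

```python
import re

IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9]*")

def is_valid(value):
    """True only if the entire input matches, so embedded spaces are rejected."""
    return IDENTIFIER.fullmatch(value) is not None

def partial_match(value):
    """What an unanchored search does: succeeds if any part of the input matches."""
    return IDENTIFIER.search(value) is not None
```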

Simple PHP Regex question

- by Dave Kiss

Hi all, I'd like to validate a field in a form to make sure it contains the proper formatting for a URL linking to a Vimeo video. Below is what I have in JavaScript, but I need to convert this over to PHP (not my forte). Basically, I need to check the field and, if it is incorrectly formatted, store an error message in a variable; if it is correct, I store an empty variable.

    // Parse the URL
    var PreviewID = jQuery("#customfields-tf-1-tf").val().match(/http:\/\/(www.vimeo|vimeo)\.com(\/|\/clip:)(\d+)(.*?)/);
    if ( !PreviewID ) {
        jQuery("#cleaner").html('<div id="vvqvideopreview"><?php echo $this->js_escape( __("Unable to parse preview URL. Please make sure it's the <strong>full</strong> URL and a valid one at that.", 'vipers-video-quicktags') ); ?></div>');
        return;
    }

The traditional Vimeo URL looks like this: http://www.vimeo.com/10793773 Thanks!

Regex validate dates like "Sun, 20 Jun 10"

- by Trindaz

Hi, I'm working on a regular expression that will only return true when a date string is in a format something like 'ddd, dd mmm yy'. Valid matches would be values like "Sun, 20 Jun 10" or "Mon, 21 Jun 10" but not "Sunday, 20 Jun 10" or "20 Jun 10". This will be used with mb_ereg in PHP. My attempts so far have only got me half way there. Any help appreciated! Thanks, Dave
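A Python sketch of a pattern with the requested shape, assuming English three-letter day and month abbreviations and two-digit day and year. The same pattern, wrapped in ^ and $, should carry over to most regex flavours, though that has not been checked against mb_ereg here. It validates the format only, not whether the date exists.

```python
import re

DATE_SHAPE = re.compile(
    r"(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun), "                    # ddd,
    r"\d{2} "                                               # dd
    r"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) " # mmm
    r"\d{2}"                                                # yy
)

def looks_like_date(value):
    """True if the whole string has the 'ddd, dd mmm yy' shape."""
    return DATE_SHAPE.fullmatch(value) is not None
```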

Regex - Modifying strings except in specified enclosures - PHP

- by Kovo

I have found similar answers to my current needs on SO; however, I am still trying to grasp modifying strings based on a rule, except in certain enclosures within those strings. An example of what I'm trying to accomplish now:

    preg_replace("/\s*,\s*/", ",", $text)

I found the above in many places. It removes spaces before and after all commas in a string. That works great. However, if I want to exclude commas found within " ", I am not sure how that rule has to be modified. Any help? Thanks!

EDIT: I want to clarify my question: I would like all whitespace before and after the commas in the following sentence removed, except for commas found in double or single quotes:

    a, b , c "d, e f g , " , h i j ,k lm,nop

Expected result:

    a,b,c "d, e f g , ",h i j,k lm,nop
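A common technique for "replace everywhere except inside quotes" is to match the quoted strings first and hand them back unchanged. A Python sketch of that idea follows; PHP's preg_replace_callback could be used in the same way. It assumes quotes are balanced and not escaped, and an apostrophe in ordinary text would be treated as the start of a single-quoted string.

```python
import re

# Alternative 1 captures a whole quoted string; alternative 2 is the comma rule.
TOKEN = re.compile(r"""("[^"]*"|'[^']*')|\s*,\s*""")

def tighten_commas(text):
    """Strip whitespace around commas, leaving quoted strings untouched."""
    return TOKEN.sub(
        lambda m: m.group(1) if m.group(1) is not None else ",",
        text,
    )
```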

Regex negative look-behind in hgignore file

- by jco

I'm looking for a way to modify my .hgignore file to ignore all "Properties/AssemblyInfo.cs" files except those in either the "Test/" or the "Tests/" subfolders. I tried using the negative look-behind expression (?<!Test)/Properties/AssemblyInfo\.cs$, but I didn't find a way to "un-ignore" in both folders "Test/" and "Tests/".
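As far as I know, Mercurial's regexp ignore patterns are Python regular expressions, where a lookbehind must have a fixed width but several lookbehinds can be placed side by side. A Python sketch of that idea, to be tested against a real repository before relying on it:

```python
import re

# Two fixed-width negative lookbehinds, one per folder name.
IGNORE = re.compile(r"(?<!Test)(?<!Tests)/Properties/AssemblyInfo\.cs$")

def is_ignored(path):
    return IGNORE.search(path) is not None
```

Note that this also leaves un-ignored any folder whose name merely ends in Test or Tests (for example MyTests/); a stricter variant would need to look at the preceding path separator too.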

jQuery Validate plugin - password check - minimum requirements - Regex

- by QviXx

I've got a little problem with my password checker. There's a registration form with some fields, and I use the jQuery Validate plugin to validate user input. It all works except the password validation. The password should meet some minimum requirements:

- minimum length: 8 (I just use 'minlength: 8')
- at least one lower-case character
- at least one digit
- allowed characters: A-Z a-z 0-9 @ * _ - . !

At the moment I use this code to validate the password:

    $.validator.addMethod("pwcheck", function(value, element) {
        return /^[A-Za-z0-9\d=!\-@._*]+$/.test(value);
    });

This code works for the allowed characters but not for the minimum requirements. I know that you can use, for example, (?=.*[a-z]) for a lower-case requirement, but I just can't get it to work. If I add (?=.*[a-z]), the whole code doesn't work anymore. I need to know how to properly add it to the existing expression. Thank you for your answers!

This is the complete code:

    <script>
    $(function() {
        $("#regform").validate({
            rules: {
                forename: { required: true },
                surname: { required: true },
                username: { required: true },
                password: { required: true, pwcheck: true, minlength: 8 },
                password2: { required: true, equalTo: "#password" },
                mail1: { required: true, email: true },
                mail2: { required: true, equalTo: "#mail1" }
            },
            messages: {
                forename: { required: "Vornamen angeben" },
                surname: { required: "Nachnamen angeben" },
                username: { required: "Usernamen angeben" },
                password: {
                    required: "Passwort angeben",
                    pwcheck: "Das Passwort entspricht nicht den Kriterien!",
                    minlength: "Das Passwort entspricht nicht den Kriterien!"
                },
                password2: { required: "Passwort wiederholen", equalTo: "Die Passwörter stimmen nicht überein" },
                mail1: { required: "Mail-Adresse angeben", email: "ungültiges Mail-Format" },
                mail2: { required: "Mail-Adresse wiederholen", equalTo: "Die Mail-Adressen stimmen nicht überein" }
            }
        });
        $.validator.addMethod("pwcheck", function(value, element) {
            return /^[A-Za-z0-9\d=!\-@._*]+$/.test(value);
        });
    });
    </script>
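Lookaheads of this kind normally go directly after the opening anchor, before the character class that consumes the input. A Python sketch of the combined rule, using the allowed characters from the list in the question (which, unlike the existing expression, does not include =); the same pattern syntax should be usable in a JavaScript regex literal between ^ and $.

```python
import re

PASSWORD = re.compile(
    r"(?=.*[a-z])"            # at least one lower-case letter somewhere
    r"(?=.*\d)"               # at least one digit somewhere
    r"[A-Za-z0-9@*_.!-]{8,}"  # only allowed characters, at least 8 of them
)

def password_ok(value):
    """True if the whole value satisfies all three requirements."""
    return PASSWORD.fullmatch(value) is not None
```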

Need to add $_GET args to my regex

- by TaMeR

    url.rewrite-once = (
        ".*\.(js|ico|gif|jpg|png|css|html)$" => "$0",
        "^/([^?]*)(\?.*)?$" => "/$1.php/$2",
    )

This is what I've got, but the args don't work. I'd like the following URL:

    http://www.example.com/index.php/?r=something

to look like this:

    http://www.example.com/index/?r=something

Thanks

Detect WebKit Version 525 and Below With RegEx

- by Jay

I'm no good at regular expressions, really! I would like to specifically detect WebKit browsers below version 525. I have a regular expression, [/WebKit\/[\d.]+/.exec(navigator.appVersion)], that correctly returns WebKit/5….…, but really I'd like it to return only the version number and, if the browser isn't WebKit, return null or, better still, 0. For example, if the browser is Trident, Presto or Gecko, return null, whereas if the browser is WebKit, return its version number. To clarify, I would like the regular expression to check whether navigator.appVersion contains WebKit; if it does not, return null, and if it does, return the version number. I appreciate all your help! Please let's keep this focused and not flirt with jQuery or the like; it's overkill in this scenario.
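A capture group around the digits is the usual way to get just the number. A Python sketch of that logic, with an invented sample string standing in for navigator.appVersion; in JavaScript the same pattern with exec would expose the group as element 1 of the result.

```python
import re

WEBKIT = re.compile(r"WebKit/(\d+)")

def webkit_major_version(app_version):
    """Return the major WebKit version as an int, or 0 if the string is not WebKit."""
    m = WEBKIT.search(app_version)
    return int(m.group(1)) if m else 0

# Hypothetical appVersion value, for illustration only.
sample = "5.0 (Macintosh; U; Intel Mac OS X 10_5_8; en-us) AppleWebKit/525.13 (KHTML, like Gecko) Version/3.1 Safari/525.13"
```

Returning 0 for non-WebKit browsers means the "old WebKit" check has to exclude 0 explicitly, for example `0 < version < 525`.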

Clean up domain list in Excel - regex / macros?

- by Tim

I have a huge spreadsheet of domains that I need to clean up as follows:

1. Remove all http:// (simple replace all: "http://" with "")
2. Remove any www. (simple replace all: "www." with "")
3. Delete any sub-domains (delete the actual row completely, not just the subdomain from the URL)
4. Remove anything after the domain extension, i.e. website.com/blah/blahbah/ becomes just website.com (simple replace all: "/*" with "", then replace all "/" with "")

So what I'm left with is just a spreadsheet of clean domains like "website.com". I think I've got 1, 2 and 4 sorted (as above), but I'm really struggling with 3. Any ideas? Can I do this with regexp / VBA, and actually delete the row completely?

Sample data:

    http://www.scholastic.com/kids/stacks/games/
    http://imgworld.teamworkonline.com/
    http://topfreegraphics.com/
    http://www.workcircle.co.uk/
    http://www.healthycanadians.gc.ca/index-eng.php
    http://gsociology.icaap.org/methods/soft.html

After steps 1, 2 and 4 I would be left with:

    scholastic.com
    imgworld.teamworkonline.com
    topfreegraphics.com
    workcircle.co.uk
    healthycanadians.gc.ca
    gsociology.icaap.org

It's those pesky sub-domains that I need to delete completely, row and all. I've realised I can't just search for two "." characters, because obviously plenty of domain extensions (e.g. .co.uk) include that. Any help appreciated.
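The four steps above can be sketched in Python outside Excel. Telling a sub-domain apart from a two-part extension such as .co.uk needs a list of such extensions; the tiny set below is an assumption that covers only the sample data, and a real run would need a much fuller list.

```python
from urllib.parse import urlparse

# Assumed list of two-part extensions; far from complete.
TWO_PART_SUFFIXES = {"co.uk", "gc.ca"}

def clean_domain(url):
    """Return the bare domain, or None if the URL points at a sub-domain."""
    if "://" not in url:
        url = "http://" + url
    host = urlparse(url).netloc.lower()       # drops the scheme and the path
    if host.startswith("www."):
        host = host[4:]
    labels = host.split(".")
    suffix_len = 2 if ".".join(labels[-2:]) in TWO_PART_SUFFIXES else 1
    # A clean domain is exactly one label plus its extension.
    if len(labels) != suffix_len + 1:
        return None
    return host

def clean_rows(urls):
    """Keep only rows that survive cleaning, which drops sub-domain rows entirely."""
    return [d for d in (clean_domain(u) for u in urls) if d is not None]
```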

Java String.replaceAll regex

- by atomsfat

I want to replace the first context of web/style/clients.html with the Java String.replaceFirst method so I can get:

    ${pageContext.request.contextPath}/style/clients.html

I tried

    String test = "web/style/clients.html".replaceFirst("^.*?/", "hello/");

and this gives me:

    hello/style/clients.html

but when I do

    String test = "web/style/clients.html".replaceFirst("^.*?/", "${pageContext.request.contextPath}/");

it gives me

    java.lang.IllegalArgumentException: Illegal group reference
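In Java replacement strings, $ introduces a group reference, which is what the exception points at; Java provides Matcher.quoteReplacement for making a replacement literal. The same general idea, keeping the replacement text out of the regex engine's hands, can be shown in Python by passing a function as the replacement. This is a sketch of the principle, not Java code.

```python
import re

def replace_first_segment(path, literal_replacement):
    """Replace everything up to and including the first '/' with literal text."""
    # A function replacement is inserted as-is, so '$', '\\' and similar
    # characters in the literal are never interpreted.
    return re.sub(r"^.*?/", lambda m: literal_replacement, path, count=1)

result = replace_first_segment(
    "web/style/clients.html",
    "${pageContext.request.contextPath}/",
)
```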

What is proper RegEx expression for SWIFT codes?

- by abatishchev

I have to filter user input on my ASP.NET web page:

    <asp:TextBox runat="server" ID="recipientBankIDTextBox" MaxLength="11" />
    <asp:RegularExpressionValidator runat="server" ValidationExpression="?" ControlToValidate="recipientBankIDTextBox" ErrorMessage="*" />

As far as I know, a SWIFT code must contain 5 or 6 letters, and the remaining symbols, up to a total length of 11, are alphanumeric. How do I implement such a rule properly? TIO
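Two candidate patterns, sketched in Python. The first implements the rule exactly as stated in the question. The second follows the layout commonly described for BIC codes (4 letters, 2-letter country code, 2 alphanumerics, optional 3-character branch); treat that layout as an assumption to check against the current standard. Both patterns use only basic syntax that should be acceptable in a ValidationExpression.

```python
import re

# Rule from the question: at least 5 leading letters, then alphanumerics,
# at most 11 characters in total.
ASKER_RULE = re.compile(r"[A-Za-z]{5}[A-Za-z0-9]{0,6}")

# Commonly described BIC layout (assumption): 8 or 11 characters.
COMMON_BIC = re.compile(r"[A-Z]{4}[A-Z]{2}[A-Z0-9]{2}(?:[A-Z0-9]{3})?")

def matches_asker_rule(code):
    return ASKER_RULE.fullmatch(code) is not None

def matches_common_bic(code):
    return COMMON_BIC.fullmatch(code) is not None
```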

