Search Results

Search found 3956 results on 159 pages for 'regex cookbook'.

  • Regex to identify rows that do not contain an exact number of occurrences of the quotemark character using Notepad++

    - by SamAspin
    I would like to jump to rows that don't contain exactly 6 quotemarks in a quoted-CSV file, as that feels like a good way to identify broken rows. I think a regular expression with Notepad++'s find features would be a sensible approach, but I'm not sure how to pick the rows up. 6 quotemarks (") suggest a complete row, so I want to skip to any row that does not contain 6. Here is some sample data to play with; in this example it's the 4th line I'd like to jump to: "sam","mark","dave" "sam","mark","dave" "sam","mark","dave" "sam","mark"," dave" "sam","mark","dave" "sam","mark","dave"
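
    A minimal Python sketch of the row check described above, assuming the file is read into a string and that "broken" simply means "not exactly six double quotes on the line". The names SIX_QUOTES and broken_rows are illustrative. The same idea written as a negative lookahead, `^(?!(?:[^"\r\n]*"){6}[^"\r\n]*$).*$`, may also work in Notepad++'s regular-expression search mode, though that is untested here.

```python
import re

# A line is "complete" when it consists of exactly six double quotes,
# each optionally preceded by non-quote characters, plus a non-quote tail.
SIX_QUOTES = re.compile(r'(?:[^"]*"){6}[^"]*')

def broken_rows(text):
    """Return (line_number, line) pairs for lines without exactly six quotes."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if not SIX_QUOTES.fullmatch(line)
    ]
```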

  • How to regex match a string of alnums and hyphens, but which doesn't begin or end with a hyphen?

    - by Shahar Evron
    I have some code validating a string of 1 to 32 characters, which may contain only alphanumerics and hyphens ('-') but may not begin or end with a hyphen. I'm using PCRE regular expressions and PHP (although the PHP part is not really important in this case). Right now the pseudo-code looks like this: if (match("/^[\p{L}0-9][\p{L}0-9-]{0,31}$/u", string) and not match("/-$/", string)) print "success!" That is, I first check that the string has the right contents, doesn't begin with a '-' and is of the right length, and then I run another test to see that it doesn't end with a '-'. Any suggestions on merging this into a single PCRE regular expression? I've tried using look-ahead / look-behind assertions but couldn't get it to work.
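
    One way the two checks might be merged, sketched in Python rather than PCRE. Python's re module has no \p{L}, so this sketch assumes [^\W_] (any Unicode letter or digit) is an acceptable stand-in, which is slightly broader than the original [\p{L}0-9]. A negative lookahead guards the first character and a negative lookbehind guards the last.

```python
import re

# 1 to 32 characters drawn from alphanumerics and '-',
# not starting with '-' (lookahead) and not ending with '-' (lookbehind).
NAME = re.compile(r'(?!-)(?:[^\W_]|-){1,32}(?<!-)')

def is_valid_name(value):
    """True when the whole string satisfies the length and hyphen rules."""
    return NAME.fullmatch(value) is not None
```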

  • How to do regex HTML tag replace in MS SQL?

    - by timmerk
    I have a table in SQL Server 2005 with hundreds of rows with HTML content. Some of the content has HTML like: <span class=heading-2>Directions</span> where "Directions" changes depending on page name. I need to change all the <span class=heading-2> and </span> tags to <h2> and </h2> tags. I wrote this query to do content changes in the past, but it doesn't work for my current problem because of the ending HTML tag: Update ContentManager Set ContentManager.Content = replace(Cast(ContentManager.Content AS NVARCHAR(Max)), 'old text', 'new text') Does anyone know how I could accomplish the span to h2 replacing purely in T-SQL? Everything I found showed I would have to do CLR integration. Thanks!

  • Regex for Matching First Alphanumeric Character skipping (The |An? )

    - by TheLizardKing
    I have a list of artists, albums and tracks that I want to sort by the first letter of their respective names. The issue arises when I want to ignore "The ", "A ", "An " and various non-alphanumeric characters (talking to you, "Weird Al" Yankovic and [dialog]). Django has a nice start, '^(An?|The) +', but I want to ignore those and a few others of my choice. I am doing this in Django, using a MySQL db with utf8_bin collation.
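
    A small sketch of the sorting idea, assuming the names are sorted in Python after being fetched rather than inside MySQL. LEADING_NOISE and sort_key are hypothetical names; the pattern strips any run of leading articles and non-alphanumeric characters before comparing.

```python
import re

# Leading "A ", "An ", "The " (any case) or runs of non-alphanumeric
# characters, repeated as often as they occur at the start of the name.
LEADING_NOISE = re.compile(r'^(?:(?:an?|the)\s+|[\W_]+)+', re.IGNORECASE)

def sort_key(name):
    """Key that ignores leading articles and punctuation, case-insensitively."""
    return LEADING_NOISE.sub('', name).casefold()

names = ['The Beatles', '"Weird Al" Yankovic', '[dialog]', 'Anthrax', 'A Perfect Circle']
ordered = sorted(names, key=sort_key)
```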

  • Best way to get back to using the power of lxml after having to use a regex to find something in an

    - by PyNEwbie
    I am trying to rip some text out of a large number of HTML documents (hundreds of thousands). The documents are really forms, but they are prepared by a very large group of different organizations, so there is significant variation in how they are created. For example, the documents are divided into chapters. I might want to extract the contents of Chapter 5 from every document so I can analyze the content of that chapter. Initially I thought this would be easy, but it turns out that the authors might use a set of non-nested tables throughout the document to hold the content, so that Chapter n could be displayed using td tags inside a table. Or they might use other elements such as p tags, H tags, div tags or any other block-level element. After trying repeatedly to use lxml to identify the beginning and end of each chapter, I have determined that it is a lot cleaner to use a regular expression because in every case, no matter what the enclosing HTML element is, the chapter label is always in the form >Chapter #. It is a little more complicated in that there might be some white space or non-breaking space represented in different ways (as a non-breaking-space entity in one form or another, or just spaces). Nonetheless, it was trivial to write a regular expression to identify the beginning of each section. (The beginning of one section is the end of the previous section.) But now I want to use lxml to get the text out. My thought is that I really have no choice but to walk along my string to find the close tag for the element that encloses the text I am using to find the relevant section.
    That is, here is one example where the element holding the chapter name is a div: <div style="DISPLAY: block; MARGIN-LEFT: 0pt; TEXT-INDENT: 0pt; MARGIN-RIGHT: 0pt" align="left"><font style="DISPLAY: inline; FONT-WEIGHT: bold; FONT-SIZE: 10pt; FONT-FAMILY: Times New Roman">Chapter 1.&#160;&#160;&#160;Our Beginnings.</font></div> So I am imagining that I would begin at the location where I found the match for Chapter 1 and set up a regular expression to find the next </div|</td|</p|</h1 . . . At this point I have identified the type of element holding my chapter heading, and I can use the same logic to find all of the text within that element, that is, set up a regular expression to mark from >Chapter 1.&#160;&#160;&#160;Our Beginnings.< So I have identified where my Chapter 1 begins, and I can do the same for Chapter 2 (which is where Chapter 1 ends). Now I am imagining that I am going to snip the document beginning at the opening of the element that I identified as indicating where Chapter 1 begins and ending just before the opening of the element that indicates where Chapter 2 begins. The string that I have identified will then be fed to lxml to use its power to get the content. I am going to all of this trouble because I have read over and over: never use a regular expression to extract content from HTML documents, and I have not hit on a way to be as accurate with lxml in identifying the starting and ending locations for the text I want to extract. For example, I can never be certain that the subtitle of Chapter 1 is Our Beginnings; it could be Our Red Canary. Let me say that I spent two solid days trying with lxml to be confident that I had the beginning and ending elements, and I could only be accurate <60% of the time, but a very short regular expression has given me better than 95% success.
    I have a tendency to make things more complicated than necessary, so I am wondering if anyone has seen or solved a similar problem and has an approach (not the details, mind you) that they would like to offer.
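
    The "find each chapter label, then slice between consecutive labels" step can be sketched with the standard re module alone. This is a simplification under stated assumptions: it slices from the ">" in front of the label rather than from the opening of the enclosing element, it handles only the two entity spellings written in the pattern, and the names CHAPTER_LABEL and chapter_spans are made up for illustration. Each slice could then be handed to an HTML parser such as lxml.

```python
import re

# ">" followed by optional whitespace or non-breaking-space entities,
# the word "Chapter", at least one separator, and the chapter number.
CHAPTER_LABEL = re.compile(r'>(?:\s|&nbsp;|&#160;)*Chapter(?:\s|&nbsp;|&#160;)+(\d+)')

def chapter_spans(html):
    """Map chapter number -> (start, end) offsets into html.

    Each chapter ends where the next label begins; the last one
    runs to the end of the document.
    """
    matches = list(CHAPTER_LABEL.finditer(html))
    spans = {}
    for index, match in enumerate(matches):
        end = matches[index + 1].start() if index + 1 < len(matches) else len(html)
        spans[int(match.group(1))] = (match.start(), end)
    return spans
```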

  • How can I match at the beginning of any line, including the first, with a Perl regex?

    - by JoelFan
    According to the Perl documentation on regexes: By default, the "^" character is guaranteed to match only the beginning of the string ... Embedded newlines will not be matched by "^" ... You may, however, wish to treat a string as a multi-line buffer, such that the "^" will match after any newline within the string ... you can do this by using the /m modifier on the pattern match operator. The "after any newline" part seems to mean that it will only match at the beginning of the 2nd and subsequent lines. What if I want to match at the beginning of any line (1st, 2nd, etc.)? EDIT: OK, it seems that the file has BOM information (3 chars) at the beginning and that's what's messing me up. Any way to get ^ to match anyway? EDIT: So in the end it works (as long as there's no BOM), but now it seems that the Perl documentation is wrong, since it says "after any newline"
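
    The behaviour the question ends up observing can be illustrated with Python's re module, whose MULTILINE flag plays the role of Perl's /m here: "^" matches at the start of the string as well as after each newline, and a leading byte order mark prevents a match on the first line until it is stripped. This is an analogy in another language, not a statement about Perl internals; note that in a decoded Python string the BOM is the single character U+FEFF rather than three bytes.

```python
import re

# With MULTILINE, "^" anchors at the start of the string and after every newline.
LINE_START_WORD = re.compile(r'^\w+', re.MULTILINE)

def first_words(text):
    """Return the word found at the start of each line, if any."""
    return LINE_START_WORD.findall(text)

def strip_bom(text):
    """Drop a single leading U+FEFF byte order mark when present."""
    return text[1:] if text.startswith('\ufeff') else text
```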

  • Help with Perl Regex Recursive Replace One Liner? Replace MySQL comments '--' with '#'

    - by NJTechie
    I have various SQL files with '--' comments; we migrated to the latest version of MySQL, and it hates these comments. I want to replace -- with #. I am looking for a recursive, in-place replace one-liner. This is what I have: perl -p -i -e 's/--/# /g' `fgrep -- -- * ` A sample .sql file: use myDB; --did you get an error I get the following error: Unrecognized switch: --did (-h will show valid options). P.S.: fgrep skipping 2 dashes was just discussed here if you are interested. Any help is appreciated.
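
    The error appears to come from the backticks: without -l, fgrep prints the matching lines rather than file names, so text such as "--did" ends up on perl's command line. As an alternative to the one-liner, here is a hedged Python sketch of a recursive in-place rewrite. It assumes the files are UTF-8, that only files ending in .sql matter, and that only "--" at the start of a line (after optional indentation) should change, which is narrower than the original s/--/# /g. convert_comments is a hypothetical helper name.

```python
import re
from pathlib import Path

# "--" at the start of a line (after optional indentation),
# together with any spaces or tabs that follow it.
LINE_COMMENT = re.compile(r'^([ \t]*)--[ \t]*', re.MULTILINE)

def convert_comments(root):
    """Rewrite leading '--' markers to '# ' in every .sql file under root.

    Returns the list of files that were actually modified.
    """
    changed = []
    for path in sorted(Path(root).rglob('*.sql')):
        original = path.read_text(encoding='utf-8')
        updated = LINE_COMMENT.sub(r'\1# ', original)
        if updated != original:
            path.write_text(updated, encoding='utf-8')
            changed.append(path)
    return changed
```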

  • How to validate a domain name using Regex & PHP?

    - by David
    Hi, I want a solution that validates only domain names, not full URLs. The following examples show what I'm looking for: domain.com -> true; domain.net/org/biz... -> true; domain.co.uk -> true; sub.domain.com -> true; domain.com/folder -> false; domµ*$ain.com -> false. Thank you
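
    A hedged sketch of such a check in Python (the pattern itself should carry over to PHP's preg functions with little change). It assumes ASCII-only host names, labels of 1 to 63 characters that neither start nor end with a hyphen, and a purely alphabetic final label; it does not consult any list of real TLDs. LABEL, DOMAIN and is_domain are illustrative names.

```python
import re

# One DNS label: alphanumeric at both ends, hyphens allowed inside, 63 chars max.
LABEL = r'[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?'

# One or more labels followed by an alphabetic top-level label.
DOMAIN = re.compile(rf'(?:{LABEL}\.)+[a-z]{{2,63}}', re.IGNORECASE | re.ASCII)

def is_domain(name):
    """True for a bare domain name; paths, schemes and odd characters fail."""
    return len(name) <= 253 and DOMAIN.fullmatch(name) is not None
```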

  • Can I improve this regex check for valid domain names?

    - by Josh
    So, I have been working on this domain name regular expression. So far, it seems to pick up domain names with SLDs and TLDs (with the optional ccTLD), but there is duplication of the TLD listing. Can this be refactored any further? params[:domain_name].downcase.strip.match(/^[a-z0-9\-]{2,63} \.((a[cdefgilmnoqrstuwxz]|aero|arpa)|(b[abdefghijmnorstvwyz]|biz)| (c[acdfghiklmnorsuvxyz]|cat|com|coop)|d[ejkmoz]|(e[ceghrstu]|edu)|f[ijkmor]| (g[abdefghilmnpqrstuwy]|gov)|h[kmnrtu]|(i[delmnoqrst]|info|int)| (j[emop]|jobs)|k[eghimnprwyz]|l[abcikrstuvy]| (m[acdghklmnopqrstuvwxyz]|me|mil|mobi|museum)|(n[acefgilopruz]|name|net)|(om|org)| (p[aefghklmnrstwy]|pro)|qa|r[eouw]|s[abcdeghijklmnortvyz]| (t[cdfghjklmnoprtvwz]|travel)|u[agkmsyz]|v[aceginu]|w[fs]|y[etu]|z[amw]) (\.((a[cdefgilmnoqrstuwxz]|aero|arpa)|(b[abdefghijmnorstvwyz]|biz)| (c[acdfghiklmnorsuvxyz]|cat|com|coop)|d[ejkmoz]|(e[ceghrstu]|edu)|f[ijkmor]| (g[abdefghilmnpqrstuwy]|gov)|h[kmnrtu]|(i[delmnoqrst]|info|int)| (j[emop]|jobs)|k[eghimnprwyz]|l[abcikrstuvy]| m[acdghklmnopqrstuvwxyz]|mil|mobi|museum)| (n[acefgilopruz]|name|net)|(om|org)| (p[aefghklmnrstwy]|pro)|qa|r[eouw]|s[abcdeghijklmnortvyz]| (t[cdfghjklmnoprtvwz]|travel)|u[agkmsyz]|v[aceginu]|w[fs]|y[etu]|z[amw]))?$/)
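
    One possible refactoring is to build the TLD alternation once and interpolate it wherever it is needed, so the list cannot drift between the two copies. The sketch below shows the idea in Python; TLDS is a deliberately tiny, illustrative subset, not the full list from the question, and the helper names are hypothetical.

```python
import re

# Illustrative subset only; the real list would be maintained in one place.
TLDS = ('com', 'net', 'org', 'co', 'uk', 'de')

# Single alternation, reused for the TLD and the optional ccTLD.
TLD = '(?:' + '|'.join(TLDS) + ')'
DOMAIN = re.compile(rf'[a-z0-9-]{{2,63}}\.{TLD}(?:\.{TLD})?')

def is_known_domain(value):
    """Normalise like the original (strip, lowercase) and match the whole string."""
    return DOMAIN.fullmatch(value.strip().lower()) is not None
```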

  • .NET Regex - Replace multiple characters at once without overwriting?

    - by Everaldo Aguiar
    I'm implementing a C# program that should automate a mono-alphabetic substitution cipher. The functionality I'm working on at the moment is the simplest one: the user provides a plain text and a cipher alphabet, for example: Plain text (input): THIS IS A TEST Cipher alphabet: A - Y, H - Z, I - K, S - L, E - J, T - Q Cipher text (output): QZKL KL Y QJLQ I thought of using regular expressions since I've been programming in Perl for a while, but I'm encountering some problems in C#. First, I would like to know if someone has a suggestion for a regular expression that would replace all occurrences of each letter by its corresponding cipher letter (provided by the user) at once and without overwriting anything. Example: the user provides the plaintext "TEST", and in his cipher alphabet he wishes to have all T's replaced with E's, E's replaced with Y and S's replaced with J. My first thought was to substitute each occurrence of a letter with a placeholder character and then replace that placeholder by the cipher letter corresponding to the plaintext letter. Using the same example word "TEST", the steps taken by the program would be: 1 - replace T's with (let's say) @ 2 - replace E's with # 3 - replace S's with & 4 - replace @ with E, # with Y, & with J 5 - Output = EYJE This solution doesn't seem to work for large texts. I would like to know if anyone can think of a single regular expression that would allow me to replace each letter in a given text by its corresponding letter in a 26-letter cipher alphabet without splitting the task into an intermediate step as described above.
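
    The overwriting problem goes away when every letter is replaced in a single pass, with the replacement looked up per match. A minimal Python sketch of that idea follows; C#'s Regex.Replace accepts a match evaluator that can play the same role. The function name encipher is illustrative, and letters missing from the cipher alphabet are assumed to pass through unchanged.

```python
import re

def encipher(plaintext, cipher_alphabet):
    """Replace each letter once, using a dict lookup per match.

    Because the text is scanned a single time, a letter produced by one
    substitution is never substituted again.
    """
    return re.sub(
        r'[A-Za-z]',
        lambda match: cipher_alphabet.get(match.group(), match.group()),
        plaintext,
    )
```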

  • How to extract part of the path and the ending file name with Regex?

    - by brasofilo
    I need to build an associative array with the plugin name and the language file it uses from the following sequence: /whatever/path/length/public_html/wp-content/plugins/adminimize/languages/adminimize-en_US.mo /whatever/path/length/public_html/wp-content/plugins/audio-tube/lang/atp-en_US.mo /whatever/path/length/public_html/wp-content/languages/en_US.mo /whatever/path/length/public_html/wp-content/themes/twentyeleven/languages/en_US.mo Those are the language files WordPress is loading. They are all inside /wp-content/, but with variable server paths. I'm looking only for those inside the plugins folder, and I want to grab the plugin folder name and the filename. Hypothetical case in PHP, where the reg_extract_* functions are the parts I'm missing: $plugins = array(); foreach( $big_array as $item ) { $folder = reg_extract_folder( $item ); if( 'plugin' == $folder ) { // "folder-name-after-plugins-folder" $plugin_name = reg_extract_pname( $item ); // "ending-mo-file.mo" $file_name = reg_extract_fname( $item ); $plugins[] = array( 'name' => $plugin_name, 'file' => $file_name ); } } [update] Ok, so I was missing quite a basic function, pathinfo... :/ No problem to detect if /plugins/ is contained in the array. But what about the plugin folder name?
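
    A single pattern with two capture groups can pull out both pieces. The sketch below is in Python, but the pattern should translate directly to PHP's preg_match. It assumes forward-slash paths and that only files ending in .mo are of interest; PLUGIN_FILE and plugin_info are hypothetical names.

```python
import re

# Group 1: the folder directly under /wp-content/plugins/.
# Group 2: the final path component, which must end in ".mo".
PLUGIN_FILE = re.compile(r'/wp-content/plugins/([^/]+)/(?:.*/)?([^/]+\.mo)$')

def plugin_info(path):
    """Return {'name': ..., 'file': ...} for plugin language files, else None."""
    match = PLUGIN_FILE.search(path)
    if match is None:
        return None
    return {'name': match.group(1), 'file': match.group(2)}
```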

  • How does this RegEx for parsing emails work in PHP?

    - by George Edison
    Okay, I have the following PHP code to extract an email address of the following two forms: Random Stranger <[email protected]> [email protected] Here is the PHP code: // The first example $sender = "Random Stranger <[email protected]>"; $pattern = '/([\w_-]*@[\w-\.]*)|.*<([\w_-]*@[\w-\.]*)>/'; preg_match($pattern,$sender,$matches,PREG_OFFSET_CAPTURE); echo "<pre>"; print_r($matches); echo "</pre><hr>"; // The second example $sender = "[email protected]"; preg_match($pattern,$sender,$matches,PREG_OFFSET_CAPTURE); echo "<pre>"; print_r($matches); echo "</pre>"; My question is... what is in $matches? It seems to be a strange collection of arrays. Which index holds the match from the parenthesis? How can I be sure I'm getting the email address and only the email address? Update: Here is the output: Array ( [0] => Array ( [0] => Random Stranger [1] => 0 ) [1] => Array ( [0] => [1] => -1 ) [2] => Array ( [0] => [email protected] [1] => 5 ) ) Array ( [0] => Array ( [0] => [email protected] [1] => 0 ) [1] => Array ( [0] => [email protected] [1] => 0 ) )
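
    For context: with PREG_OFFSET_CAPTURE, each entry of $matches is a pair of matched text and offset; index 0 is the whole match, and indexes 1 and 2 are the two parenthesised groups, only one of which participates for a given input. A simpler route is one pattern that targets the address itself and ignores the surrounding name and angle brackets. The Python sketch below shows that idea; the addresses in the usage note are hypothetical stand-ins, since the originals are obscured in the excerpt, and the pattern is a loose approximation of address syntax, not a full validator.

```python
import re

# Local part, "@", then a dotted host name. Deliberately permissive.
EMAIL = re.compile(r'[\w.+-]+@[\w-]+(?:\.[\w-]+)+')

def extract_email(sender):
    """Return the first address-like substring, or None when there is none."""
    match = EMAIL.search(sender)
    return match.group(0) if match else None
```

    Called as extract_email('Random Stranger <stranger@example.com>') or extract_email('stranger@example.com'), both forms yield the bare address.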

  • How to count the Chinese words in a file using regex in Perl?

    - by Ivan
    I tried following perl code to count the Chinese word of a file, it seems working but not get the right thing. Any help is greatly appreciated. The Error message is Use of uninitialized value $valid in concatenation (.) or string at word_counting.pl line 21, <FILE> line 21. Total things = 125, valid words = which seems to me the problem is the file format. The "total thing" is 125 that is the string number (125 lines). The strangest part is my console displayed all the individual Chinese words correctly without any problem. The utf-8 pragma is installed. #!/usr/bin/perl -w use strict; use utf8; use Encode qw(encode); use Encode::HanExtra; my $input_file = "sample_file.txt"; my ($total, $valid); my %count; open (FILE, "< $input_file") or die "Can't open $input_file: $!"; while (<FILE>) { foreach (split) { #break $_ into words, assign each to $_ in turn $total++; next if /\W|^\d+/; #strange words skip the remainder of the loop $valid++; $count{$_}++; # count each separate word stored in a hash ## next comes here ## } } print "Total things = $total, valid words = $valid\n"; foreach my $word (sort keys %count) { print "$word \t was seen \t $count{$word} \t times.\n"; } ##---Data---- sample_file.txt ??????,???????,????.??????.????:"?????????????,??????,????????.????????,?????????, ???????????.????????,???????????,??????,??????.???:`??,???????????.'?????, ??????????."??????,??????.????.???, ????????????,????,??????,?????????,??????????????. ????????,??????,???????????,????????,????????.????,????,???????, ??????????,??????,????????.??????.
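
    A hedged Python sketch of the counting step. It assumes the text has already been decoded to Unicode, and it counts individual Han characters rather than segmented words, since written Chinese does not separate words with spaces. The range U+4E00 to U+9FFF covers only the basic CJK Unified Ideographs block; extension blocks are ignored here. HAN and count_han are illustrative names.

```python
import re
from collections import Counter

# Basic CJK Unified Ideographs block only.
HAN = re.compile(r'[\u4e00-\u9fff]')

def count_han(text):
    """Return a Counter mapping each Han character to its frequency."""
    return Counter(HAN.findall(text))
```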

  • How do you replace many characters in a regex?

    - by macca1
    I am sanitizing an input field and manually getting and setting the caret position in the process. With some abstraction, here's the basic idea: <input type="text" onkeyup="check(this)"> And JavaScript... function check(element) { var charPosition = getCaretPosition(element); $(element).val( sanitize( $(element).val() ) ); setCaretPosition(element, charPosition); } function sanitize(s) { return s.replace(/[^a-zA-Z0-9\s]/g, ''); } This is working fine, except that when a character actually gets sanitized, my caret position is off by one. Basically I'd like a way to see whether the sanitize function has actually replaced a character (and at what index) so that I can adjust the charPosition if necessary. Any ideas?
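
    One way to get the adjustment is to count how many disallowed characters sit before the caret and shift the caret left by that amount. The logic is sketched in Python below; it ports to JavaScript with String.prototype.match and slice. sanitize_with_caret is a hypothetical name, and the caret is assumed to be a zero-based index into the unsanitized string.

```python
import re

# Same character class as the question's sanitize(): anything that is
# not a letter, digit or whitespace gets removed.
DISALLOWED = re.compile(r'[^a-zA-Z0-9\s]')

def sanitize_with_caret(value, caret):
    """Return (sanitized_value, adjusted_caret).

    Only characters removed before the caret move it; removals after
    the caret leave it where it was.
    """
    removed_before = len(DISALLOWED.findall(value[:caret]))
    return DISALLOWED.sub('', value), caret - removed_before
```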

  • jQuery and regex for adding icons to specific links?!

    - by rayne
    I'm using jQuery to add icons to specific links, e.g. a myspace icon for sites starting with http://myspace.com etc. However, I can't figure out how to use regular expressions (if that's even possible here) to make jQuery recognize the link either with or without "www." (I'm very bad at regular expressions in general). Here are two examples: $("a[href^='http://www.last.fm']").addClass("lastfm").attr("target", "_blank"); $("a[href^='http://livejournal.com']").addClass("livejournal").attr("target", "_blank"); They work fine, but now I want the last.fm link to work with http://last.fm, http://www.last.fm and http://www.lastfm.de. Currently it only works for www.last.fm. I would also like to make the livejournal link work with subdomain links like http://username.livejournal.com. How can I do that? Thanks in advance!
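
    Attribute selectors cannot express these alternatives directly, so one option is to test each href against a regular expression (in jQuery, for instance, inside a .filter() callback). The patterns are sketched in Python below so they can be checked in isolation; they assume plain http:// links as in the question, and the names LASTFM and LIVEJOURNAL are illustrative.

```python
import re

# last.fm with or without "www.", plus the lastfm.de variant.
# The trailing group stops look-alike hosts such as last.fm.example.com.
LASTFM = re.compile(r'^http://(?:www\.)?(?:last\.fm|lastfm\.de)(?:[/:?#]|$)', re.IGNORECASE)

# livejournal.com itself or any subdomain of it.
LIVEJOURNAL = re.compile(r'^http://(?:[a-z0-9-]+\.)*livejournal\.com(?:[/:?#]|$)', re.IGNORECASE)

def link_class(href):
    """Return the CSS class to add for a link, or None when nothing applies."""
    if LASTFM.match(href):
        return 'lastfm'
    if LIVEJOURNAL.match(href):
        return 'livejournal'
    return None
```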

  • Regex for Eclipse/Flash Builder File Search for comments?

    - by Brian Bishop
    In Eclipse (and Flash/Flex Builder) you get the option with Ctrl+Shift+F to do a file search and look for a regular expression. Would be a real handy thing to know. I want to find the word negate if it appears in a Flex/java comment like the following: // It was negated because or /* The negate option was.... */ or /** * We have to negate the value */ Any ideas? Will test them out at http://www.regexplanet.com/simple/index.html
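
    A pattern along the following lines may do the job: it accepts either a // comment up to the word, or a /* block in which no */ has occurred before the word. It is checked here with Python's re module; the constructs used (lookahead, lazy quantifiers, [\s\S]) also exist in the Java regex flavour that Eclipse uses, but that combination is untested. One known limitation, stated as an assumption: a // or /* inside a string literal is treated as a comment start.

```python
import re

# Either a line comment up to "negate...", or a block comment that has
# not yet been closed by "*/" when "negate..." appears.
NEGATE_IN_COMMENT = re.compile(r'(?://[^\n]*?|/\*(?:(?!\*/)[\s\S])*?)\bnegate\w*')

def comment_hits(source):
    """Return the matched text for each comment occurrence of negate*."""
    return [match.group(0) for match in NEGATE_IN_COMMENT.finditer(source)]
```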

  • How can I check with a regex that a string contains only certain allowed characters?

    - by Camran
    I need a special regular expression and have no experience with them whatsoever, so I am turning to you guys on this one. I need to validate a classifieds title field so that it contains almost no special characters. Only letters and numbers should be allowed, plus the three Swedish letters å, ä, ö, and matching should not be case sensitive. Besides the above, these should also be allowed: the "&" sign; parentheses "()"; the mathematical signs "-", "+", "%", "/", "*"; dollar and euro signs; one accented letter, "é" (only this one is required); double and single quote signs; the comma "," and point "." signs. Thanks
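
    The requirement maps onto a single character class applied to the whole string. A Python sketch follows; it assumes that spaces are also allowed (a title without spaces seems unlikely) and that an empty title is invalid, neither of which the question states. Upper- and lowercase forms are listed explicitly instead of relying on a case-insensitive flag.

```python
import re

# Letters, digits, Swedish å ä ö, é, space, and the listed punctuation.
# The hyphen sits last in the class so it is taken literally.
TITLE = re.compile("[A-Za-z0-9åäöÅÄÖéÉ &()+%/*$€\"',.-]+")

def is_valid_title(title):
    """True when every character of a non-empty title is on the allow list."""
    return TITLE.fullmatch(title) is not None
```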

  • Regex for a password that must contain at least 8 characters, at least 1 number, both lowercase and uppercase letters, and special characters

    - by user2442653
    I want a regular expression to check that a password contains at least 8 characters, including at least 1 number, both lowercase and uppercase letters, and special characters (e.g., #, ?, !). It cannot be your old password or contain your username, "password", or "websitename". Here is my validation expression, which is for 8 characters including 1 uppercase letter, 1 lowercase letter, and 1 number or special character: (?=^.{8,}$)((?=.*\d)|(?=.*\W+))(?![.\n])(?=.*[A-Z])(?=.*[a-z]).*$" How can I write it so that the password must be 8 characters including 1 uppercase letter, 1 special character and alphanumeric characters?
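
    One lookahead per required character class keeps the pattern readable, while the "not the old password" and "must not contain the username" rules are easier to express in ordinary code than in the regex. A hedged Python sketch follows; it assumes "special character" means anything that is not an ASCII letter, digit or whitespace, and the names STRONG and is_acceptable are illustrative.

```python
import re

# Lowercase, uppercase, digit and special character must each appear
# somewhere; the final .{8,} enforces the minimum length.
STRONG = re.compile(r'(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[^A-Za-z0-9\s]).{8,}')

def is_acceptable(password, username='', old_password='',
                  banned=('password', 'websitename')):
    """Apply the regex first, then the checks a regex handles poorly."""
    if not STRONG.fullmatch(password):
        return False
    if old_password and password == old_password:
        return False
    lowered = password.lower()
    forbidden = [word for word in (username, *banned) if word]
    return not any(word.lower() in lowered for word in forbidden)
```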

