Search Results

Search found 3804 results on 153 pages for 'regex'.

Page 47/153 | < Previous Page | 43 44 45 46 47 48 49 50 51 52 53 54  | Next Page >

  • New TPerlRegEx Compatible with Delphi XE

    - by Jan Goyvaerts
    The new RegularExpressionsCore unit in Delphi XE is based on the PerlRegEx unit that I wrote many years ago. Since I donated full rights to a copy rather than full rights to the original, I can continue to make my version of TPerlRegEx available to people using older versions of Delphi. I did make a few changes to the code to modernize it a bit prior to donating a copy to Embarcadero. The latest TPerlRegEx includes those changes. This allows you to use the same regex-based code using the RegularExpressionsCore unit in Delphi XE, and the PerlRegEx unit in Delphi 2010 and earlier. If you’re writing new code using regular expressions in Delphi 2010 or earlier, I strongly recomment you use the new version of my PerlRegEx unit. If you later migrate your code to Delphi XE, all you have to do is replace PerlRegEx with RegularExrpessionsCore in the uses clause of your units. If you have code written using an older version of TPerlRegEx that you want to migrate to the latest TPerlRegEx, you’ll need to take a few changes into account. The original TPerlRegEx was developed when Borland’s goal was to have a component for everything on the component palette. So the old TPerlRegEx derives from TComponent, allowing you to put it on the component palette and drop it on a form. The new TPerlRegEx derives from TObject. It can only be instantiated at runtime. If you want to migrate from an older version of TPerlRegEx to the latest TPerlRegEx, start with removing any TPerlRegEx components you may have placed on forms or data modules and instantiate the objects at runtime instead. When instantiating at runtime, you no longer need to pass an owner component to the Create() constructor. Simply remove the parameter. Some of the property and method names in the original TPerlRegEx were a bit unwieldy. These have been renamed in the latest TPerlRegEx. Essentially, in all identifiers SubExpression was replaced with Group and MatchedExpression was replaced with Matched. Here is a complete list of the changed identifiers: Old Identifier New Identifier StoreSubExpression StoreGroups NamedSubExpression NamedGroup MatchedExpression MatchedText MatchedExpressionLength MatchedLength MatchedExpressionOffset MatchedOffset SubExpressionCount GroupCount SubExpressions Groups SubExpressionLengths GroupLengths SubExpressionOffsets GroupOffsets Download TPerlRegEx. Source is included under the MPL 1.1 license.

    Read the article

  • How do I extract a postcode from one column in SSIS using regular expression

    - by Aphillippe
    I'm trying to use a custom regex clean transformation (information found here ) to extract a post code from a mixed address column (Address3) and move it to a new column (Post Code) Example of incoming data: Address3: "London W12 9LZ" Incoming data could be any combination of place names with a post code at the start, middle or end (or not at all). Desired outcome: Address3: "London" Post Code: "W12 9LZ" Essentially, in plain english, "move (not copy) any post code found from address3 into Post Code". My regex skills aren't brilliant but I've managed to get as far as extracting the post code and getting it into its own column using the following regex, matching from Address3 and replacing into Post Code: Match Expression: (?<stringOUT>([A-PR-UWYZa-pr-uwyz]([0-9]{1,2}|([A-HK-Ya-hk-y][0-9]|[A-HK-Ya-hk-y][0-9] ([0-9]|[ABEHMNPRV-Yabehmnprv-y]))|[0-9][A-HJKS-UWa-hjks-uw])\ {0,1}[0-9][ABD-HJLNP-UW-Zabd-hjlnp-uw-z]{2}|([Gg][Ii][Rr]\ 0[Aa][Aa])|([Ss][Aa][Nn]\ {0,1}[Tt][Aa]1)|([Bb][Ff][Pp][Oo]\ {0,1}([Cc]\/[Oo]\ )?[0-9]{1,4})|(([Aa][Ss][Cc][Nn]|[Bb][Bb][Nn][Dd]|[BFSbfs][Ii][Qq][Qq]|[Pp][Cc][Rr][Nn]|[Ss][Tt][Hh][Ll]|[Tt][Dd][Cc][Uu]|[Tt][Kk][Cc][Aa])\ {0,1}1[Zz][Zz]))) Replace Expression: ${stringOUT} So this leaves me with: Address3: "London W12 9LZ" Post Code: "W12 9LZ" My next thought is to keep the above match/replace, then add another to match anything that doesn't match the above regex. I think it might be a negative lookahead but I can't seem to make it work. I'm using SSIS 2008 R2 and I think the regex clean transformation uses .net regex implementation. Thanks.

    Read the article

  • Named captured substring in pcre++

    - by VDVLeon
    Hello, I want to capture named substring with the pcre++ library. I know the pcre library has the functionality for this, but pcre++ has not implemented this. This is was I have now (just a simple example): pcrepp::Pcre regex("test (?P<groupName>bla)"); if (regex.search("test bla")) { // Get matched group by name int pos = pcre_get_stringnumber( regex.get_pcre(), "groupName" ); if (pos == PCRE_ERROR_NOSUBSTRING) return; // Get match std::string temp = regex[pos - 1]; std::cout << "temp: " << temp << "\n"; } If I debug, pos return 1, and that is right, (?Pbla) is the 1th submatch (0 is the whole match). It should be ok. But... regex.matches() return 0. Why is that :S ? Btw. I do regex[pos - 1] because pcre++ reindexes the result with 0 pointing to the first submatch, so 1. So 1 becomes 0, 2 becomes 1, 3 becomes 2, etc. Does anybody know how to fix this?

    Read the article

  • rsyslog - regex trouble

    - by benmccann
    I'm trying to setup the logentries service. If a log entry has a token in it then I would like to send it to api.logentries.com:10000. The token is a guid in the format aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee. Right now I'm doing: # If there's a logentries token then send it directly to logentries :msg, regex, ".*[a-z0-9]{8}-[a-z0-9]{4}-[a-z0-9]{4}-[a-z0-9]{4}-[a-z0-9]{12}.*" & @@api.logentries.com:10000 I checked the rsyslog debug logs and my regex is not matching, but I can't figure out why or how to fix it: 5245.961161378:7fb79b514700: Filter: check for property 'msg' (value ' fb1c507f-2ede-4d7f-a140-2bd8d56e133 - application - [play-akka.actor.default-dispatcher-1] - Found user: 4fb11ea5e4b00a1aeebe2800') regex '.*[a-z0-9]{8}-[a-z0-9]{4}-[a-z0-9]{4}-[a-z0-9]{4}-[a-z0-9]{12}.*': FALSE

    Read the article

  • Quick Replace in Visual Studio 2010 fails to use Tagged Expression n

    - by slomojo
    I'm trying to do some basic regex Quick Replace operations in Visual Studio 2010, but when I use regex grouping I don't get Tagged Expressions (ie. \1 \2 etc.) returning their values, instead they are blank. For example: Text int a = int.Parse("10"); int b = int.Parse("20"); int c = int.Parse("30"); Search Pattern (regex enabled) int\.Parse\("([0-9]*)"\); Replace \1; Replaced Text int a = ; int b = ; int c = ;

    Read the article

  • Regex query: how can I search PDFs for a phrase where words in that phrase appear on more than one l

    - by Alison
    I am trying to set up an index page for the weekly magazine I work on. It is to show readers the names of companies mentioned in that weeks' issue, plus the page numbers they are appear on. I want to search all the PDF files for the week, where one PDF = one magazine page (originally made in Adobe InDesign CS3 and Adobe InCopy CS3). I have set up a list of companies I want to search for and, using PowerGREP and using delimited regular expressions, I am able to find most page numbers where a company is mentioned. However, where a company name contains two or more words, the search I am running will not pick up instances where the name appears over more than one line. For example, when looking for "CB Richard Ellis" and "Cushman & Wakefield", I got no result when the text appeared like this: DTZ beat BNP PRE, CB [line break here] Richard Ellis and Cushman & [line break here] Wakefield to secure the contract. [line end here] Could someone advise me on how to write a regular expression that will ignore white space between words and ignore line endings OR one that will look for the words including all types of white space (ie uneven spaces between words; spaces at the end of lines or line endings; and tabs (I am guessing that this info is imbedded somehow in PDF files). Here is a sample of the set of terms I have asked PowerGREP to search for: \bCB Richard Ellis\b \bCB Richard Ellis Hotels\b \bCentaur Services\b \bChapman Herbert\b \bCharities Property Fund\b \bChetwoods Architects\b \bChurch Commissioners\b \bClive Emson\b \bClothworkers’ Company\b \bColliers CRE\b \bCombined English Stores Group\b \bCommercial Estates Group\b \bConnells\b \bCooke & Powell\b \bCordea Savills\b \bCrown Estate\b \bCushman & Wakefield\b \bCWM Retail Property Advisors\b [Note that there is a delimited hard return between each \b at the end of each phrase and beginnong of the next phrase.] By the way, I am a production journalist and not usually involved in finding IT-type solutions and am finding it difficult to get to grips with the technical language on the PowerGREP site. Thanks for assistance Alison

    Read the article

  • Regex for [a-zA-Z0-9\-] with dashes allowed in between but not at the start or end

    - by orokusaki
    I'm using Python and I'm not trying to extract the value, but rather test to make sure it fits the pattern. allowed values: spam123-spam-eggs-eggs1 spam123-eggs123 spam 123 eggs123 I just can't have a dash at the starting or the end. There is a question on here that works in the opposite direction by getting the string value after the fact, but I simply need to test for the value so that I can disallow it. Also, it can be a maximum of 25 chars long, but a minimum of 4 chars long. Here's what I've come up with after some experimentation with lookbehind, etc: # Nothing here

    Read the article

  • C# Regex - Replace multiple characters at once without overwriting?

    - by Everaldo Aguiar
    Hello guys, I'm implementing a c# program that should automatize a Mono-alphabetic substitution cipher. The functionality i'm working on at the moment is the simplest one: The user will provide a plain text and a cipher alphabet, for example: Plain text(input): THIS IS A TEST Cipher alphabet: A - Y, H - Z, I - K, S - L, E - J, T - Q Cipher Text(output): QZKL KL QJLQ I thought of using regular expressions since I've been programming in perl for a while, but I'm encountering some problems on c#. First I would like to know if someone would have a suggestion for a regular expression that would replace all occurrence of each letter by its corresponding cipher letter (provided by user) at once and without overwriting anything. Example: In this case, user provides plaintext "TEST", and on his cipher alphabet, he wishes to have all his T's replaced with E's, E's replaced with Y and S replaced with J. My first thought was to substitute each occurrence of a letter with an individual character and then replace that character by the cipherletter corresponding to the plaintext letter provided. Using the same example word "TEST", the steps taken by the program to provide an answer would be: 1 - replace T's with (lets say) @ 2 - replace E's with # 3 - replace S's with & 4 - Replace @ with E, # with Y, & with j 5 - Output = EYJE This solution doesn't seem to work for large texts. I would like to know if anyone can think of a single regular expression that would allow me to replace each letter in a given text by its corresponding letter in a 26-letter cipher alphabet without the need of splitting the task in an intermediate step as I mentioned. If it helps visualize the process, this is a print screen of my GUI for the program: http://img43.imageshack.us/img43/2118/11618743.jpg

    Read the article

  • Using regex to extract variables from a plain-text form letter?

    - by Yaaqov
    Hi - I'm looking for a good example of using Regular Expressions in PHP to "reverse engineer" a form letter (with a known format, of course) that has been pasted into a multiline textbox and sent to a script for processing. So, for example, let's assume this is the original plain-text input (taken from a USDA press release): WASHINGTON, April 5, 2010 - North American Bison Co-Op, a New Rockford, N.D., establishment is recalling approximately 25,000 pounds of whole beef heads containing tongues that may not have had the tonsils completely removed, which is not compliant with regulations that require the removal of tonsils from cattle of all ages, the U.S. Department of Agriculture's Food Safety and Inspection Service (FSIS) announced today. For clarity, the fields that are variables are highlighted below: [pr_city=]WASHINGTON, [pr_date=]April 5, 2010 - [corp_name=]North American Bison Co-Op, a [corp_city=]New Rockford, [corp_state=]N.D., establishment is recalling approximately [amount=]25,000 pounds of [product=]whole beef heads containing tongues that may not have had the tonsils completely removed, which is not compliant with regulations that require [reason=]the removal of tonsils from cattle of all ages, the U.S. Department of Agriculture's Food Safety and Inspection Service (FSIS) announced today. How could I efficiently extract the contents of the pr_city pr_date corp_name corp_city corp_state amount product reason fields from my example? Any help would be appreciated, thanks.

    Read the article

  • How can I extract the nth occurrence of a match in a Perl regex?

    - by Zaid
    Is it possible to extract the n'th match in a string of single-quoted words? use strict; use warnings; my $string1 = '\'I want to\' \'extract the word\' \'Perl\',\'from this string\''; my $string2 = '\'What about\',\'getting\',\'Perl\',\'from\',\'here\',\'?\''; sub extract_quoted { my ($string, $index) = @_; my ($wanted) = $string =~ /some_regex_using _$index/; return $wanted; } extract_wanted ($string1, 3); # Should return 'Perl', with quotes extract_wanted ($string2, 3); # Should return 'Perl', with quotes

    Read the article

  • Regex for retrieving the parameter of the css url function...

    - by Kieron
    Hi, I'm trying to get the url portion of the following string: url(images/ui-bg_highlight-soft_75_cccccc_1x100.png) So the required part if images/ui-bg_highlight-soft_75_cccccc_1x100.png. Currently I've got this: url\((?<url>.*)\) But it seems to be choking on the following example: url(images/ui-bg_flat_0_aaaaaa_40x100.png) 50% 50% repeat-x; opacity: .30;filter:Alpha(Opacity=30) Which results in images/ui-bg_flat_0_aaaaaa_40x100.png) 50% 50% repeat-x; opacity: .30;filter:Alpha(Opacity=30... I'd like to make sure that it supports as many variations as possible (additional whitespace etc). Thanks! Kieron

    Read the article

  • PHP-REGEX: accented letters matches non-accented ones, and visceversa. How to achive it?

    - by Lightworker
    I want to do the typical higlight code. So I have something like: $valor = preg_replace("/(".$_REQUEST['txt_search'].")/iu", "<span style='background-color:yellow; font-weight:bold;'>\\1</span>", $valor); Now, the request word could be something like "josé". And with it, I want "jose" or "JOSÉ" or "José" or ... highlighted too. With this expression, if I write "josé", it matches "josé" and "JOSÉ" (and all the case variants). It always matches the accented variants only. If I search "jose", it matches "JOSE", "jose", "Jose"... but not the accented ones. So I've partially what I want, cause I have case insensitive on accented and non-accented separately. I need it fully combined, wich means accent (unicode) insensitive, so I can search "jose", and highlight "josé", "josÉ", "José", "JOSE", "JOSÉ", "JoSé", ... I don't want to do a replace of accents on the word, cause when I print it on screen I need to see the real word as it comes. Any ideas? Thanks!

    Read the article

  • What is wrong with this really really simple RegEx expression?

    - by Pure.Krome
    Hi folks, this one is really easy. I'm trying to create a Regular Expression that will result in a Successful Match when against the following text /default.aspx? So i tried the following... ^/default.aspx$ and it's failing to match it. Can someone help, please? (i'm guessing i'm screwing up becuase of the \ and the ? in the input expression).

    Read the article

  • Regex to identify rows that do not contain exact number of occurences of quotemark character using Notepad++

    - by SamAspin
    I would like to be able to jump to rows that dont contain 6 quotemarks in a quoted-CSV file as it feels like a good way to identify broken rows. I think using a regular expression with Notepad++'s find features would be a sensible approach but I'm not sure how to pick the rows up. 6 quotemarks (") would suggest a complete row so I want to skip to any row that does not contain 6. Here is some sample data to play with, in this example its the 4th line I'd like to jump to "sam","mark","dave" "sam","mark","dave" "sam","mark","dave" "sam","mark"," dave" "sam","mark","dave" "sam","mark","dave"

    Read the article

  • How to do regex HTML tag replace in MS SQL?

    - by timmerk
    I have a table in SQL Server 2005 with hundreds of rows with HTML content. Some of the content has HTML like: <span class=heading-2>Directions</span> where "Directions" changes depending on page name. I need to change all the <span class=heading-2> and </span> tags to <h2> and </h2> tags. I wrote this query to do content changes in the past, but it doesn't work for my current problem because of the ending HTML tag: Update ContentManager Set ContentManager.Content = replace(Cast(ContentManager.Content AS NVARCHAR(Max)), 'old text', 'new text') Does anyone know how I could accomplish the span to h2 replacing purely in T-SQL? Everything I found showed I would have to do CLR integration. Thanks!

    Read the article

  • Regex for Matching First Alphanumeric Character skipping (The |An? )

    - by TheLizardKing
    I have a list of artists, albums and tracks that I want to sort using the first letter of their respective name. The issue arrives when I want to ignore "The ", "A ", "An " and other various non-alphanumeric characters (Talking to you "Weird Al" Yankovic and [dialog]). Django has a nice start '^(An?|The) +' but I want to ignore those and a few others of my choice. I am doing this in Django, using a MySQL db with utf8_bin collation.

    Read the article

  • How to regex match a string of alnums and hyphens, but which doesn't begin or end with a hyphen?

    - by Shahar Evron
    I have some code validating a string of 1 to 32 characters, which may contain only alpha-numerics and hyphens ('-') but may not begin or end with a hyphen. I'm using PCRE regular expressions & PHP (albeit the PHP part is not really important in this case). Right now the pseudo-code looks like this: if (match("/^[\p{L}0-9][\p{L}0-9-]{0,31}$/u", string) and not match("/-$/", string)) print "success!" That is, I'm checking first that the string is of right contents, doesn't being with a '-' and is of the right length, and then I'm running another test to see that it doesn't end with a '-'. Any suggestions on merging this into a single PCRE regular expression? I've tried using look-ahead / look-behind assertions but couldn't get it to work.

    Read the article

  • Best way to get back to using the power of lxml after having to use a regex to find something in an

    - by PyNEwbie
    I am trying to rip some text out of a large number of html documents (numbers in the hundreds of thousands). The documents are really forms but they are prepared by a very large group of different organizations so there is significant variation in how they create the document. For example, the documents are divided into chapters. I might want to extract the contents of Chapter 5 from every document so I can analyze the content of the chapter. Initially I thought this would be easy but it turns out that the authors might use a set of non-nested tables throughout the document to hold the content so that Chapter n could be displayed using td tags inside a table. Or they might use other elements such as p tags H tags, div tags or any other block level element. After trying repeatedly to use lxml to help me identify the beginning and end of each chapter I have determined that it is a lot cleaner to use a regular expression because in every case, no matter what the enclosing html element is the chapter label is always in the form of >Chapter # It is a little more complicated in that there might be some white space or non-breaking space represented in different ways (  or   or just spaces). Nonetheless it was trivial to write a regular expression to identify the beginning of each section. (The beginning of one section is the end of the previous section.) But now I want to use lxml to get the text out. My thought is that I have really no choice but to walk along my string to find the close tag for the element that encloses the text I am using to find the relevant section. That is here is one example where the element holding the Chapter name is a div <div style="DISPLAY: block; MARGIN-LEFT: 0pt; TEXT-INDENT: 0pt; MARGIN-RIGHT: 0pt" align="left"><font style="DISPLAY: inline; FONT-WEIGHT: bold; FONT-SIZE: 10pt; FONT-FAMILY: Times New Roman">Chapter 1.&#160;&#160;&#160;Our Beginnings.</font></div> So I am imagining that I would begin at the location where I found the match for chapter 1 and set up a regular expressions to find the next </div|</td|</p|</h1 . . . So at this point I have identified the type of element holding my chapter heading I can use the same logic to find all of the text that is within that element that is set up a regular expression to help me mark from >Chapter 1.&#160;&#160;&#160;Our Beginnings.< So I have identified where my Chapter 1 begins I can do the same for chapter 2 (which is where Chapter 1 ends) Now I am imagining that I am going to snip the document beginning at the opening of the element that I identified as the element the indicates where chapter 1 begins and ending just before the opening of the element that I identified as the element that indicates where Chapter 2 begins. The string that I have identified will then be fed to lxml to use its power to get the content. I am going to all of this trouble because I have read over and over - never use a regular expression to extract content from html documents and I have not hit on a way to be as accurate with lxml to identify the starting and ending locations for the text I want to extract. For example, I can never be certain that the subtitle of Chapter 1 is Our Beginnings it could be Our Red Canary. Let me say that I spent two solid days trying with lxml to be confident that I had the beginning and ending elements and I could only be accurate <60% of the time but a very short regular expression has given me better than 95% success. I have a tendency to make things more complicated than necessary so I am wondering if anyone has seen or solved a similar problems and if they had an approach (not the details mind you) that they would like to offer.

    Read the article

  • How can I match at the beginning of any line, including the first, with a Perl regex?

    - by JoelFan
    According the the Perl documentation on regexes: By default, the "^" character is guaranteed to match only the beginning of the string ... Embedded newlines will not be matched by "^" ... You may, however, wish to treat a string as a multi-line buffer, such that the "^" will match after any newline within the string ... you can do this by using the /m modifier on the pattern match operator. The "after any newline" part means that it will only match at the beginning of the 2nd and subsequent lines. What if I want to match at the beginning of any line (1st, 2nd, etc.)? EDIT: OK, it seems that the file has BOM information (3 chars) at the beginning and that's what's messing me up. Any way to get ^ to match anyway? EDIT: So in the end it works (as long as there's no BOM), but now it seems that the Perl documentation is wrong, since it says "after any newline"

    Read the article

< Previous Page | 43 44 45 46 47 48 49 50 51 52 53 54  | Next Page >