Search Results

Search found 3804 results on 153 pages for 'regex'.

Page 116/153

  • mine phrases (up to 3 words) from a given text

    - by DS_web_developer
    I asked before for a simple solution to my problem (using the Sphinx search service) but got nowhere. Someone has kindly provided me with this code:

        <?php
        /**
         * $Project: GeoGraph $
         * $Id$
         *
         * GeoGraph geographic photo archive project
         * This file copyright (C) 2005 Barry Hunter ([email protected])
         *
         * This program is free software; you can redistribute it and/or
         * modify it under the terms of the GNU General Public License
         * as published by the Free Software Foundation; either version 2
         * of the License, or (at your option) any later version.
         *
         * This program is distributed in the hope that it will be useful,
         * but WITHOUT ANY WARRANTY; without even the implied warranty of
         * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
         * GNU General Public License for more details.
         *
         * You should have received a copy of the GNU General Public License
         * along with this program; if not, write to the Free Software
         * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
         */

        /**
         * Provides the methods for updating the worknet tables
         *
         * @package Geograph
         * @author Barry Hunter <[email protected]>
         * @version $Revision$
         */

        function addTwoLetterPhrase($phrase) {
            global $w2;
            $w2[$phrase] = (isset($w2[$phrase]))?($w2[$phrase]+1):1;
        }

        function addThreeLetterPhrase($phrase) {
            global $w3;
            $w3[$phrase] = (isset($w3[$phrase]))?($w3[$phrase]+1):1;
        }

        function updateWordnet(&$db,$text,$field,$id) {
            global $w1,$w2,$w3;

            $alltext = strtolower(preg_replace('/\W+/',' ',str_replace("'",'',$text)));

            if (strlen($text)< 1)
                return;

            $words = preg_split('/ /',$alltext);

            $w1 = array();
            $w2 = array();
            $w3 = array();

            //build a list of one word phrases
            foreach ($words as $word) {
                $w1[$word] = (isset($w1[$word]))?($w1[$word]+1):1;
            }

            //build a list of two word phrases
            $text = $alltext;
            $text = preg_replace('/(\w+) (\w+)/e','addTwoLetterPhrase("$1 $2")',$text);
            $text = $alltext;
            $text = preg_replace('/(\w+)/','',$text,1);
            $text = preg_replace('/(\w+) (\w+)/e','addTwoLetterPhrase("$1 $2")',$text);

            //build a list of three word phrases
            $text = $alltext;
            $text = preg_replace('/(\w+) (\w+) (\w+)/e','addThreeLetterPhrase("$1 $2 $3")',$text);
            $text = $alltext;
            $text = preg_replace('/(\w+)/','',$text,1);
            $text = preg_replace('/(\w+) (\w+) (\w+)/e','addThreeLetterPhrase("$1 $2 $3")',$text);
            $text = $alltext;
            $text = preg_replace('/(\w+) (\w+)/','',$text,1);
            $text = preg_replace('/(\w+) (\w+) (\w+)/e','addThreeLetterPhrase("$1 $2 $3")',$text);

            foreach ($w1 as $word=>$count) {
                $db->Execute("insert into wordnet1 set gid = $id,words = '$word',$field = $count");// ON DUPLICATE KEY UPDATE $field=$field+$count");
            }
            foreach ($w2 as $word=>$count) {
                $db->Execute("insert into wordnet2 set gid = $id,words = '$word',$field = $count");
            }
            foreach ($w3 as $word=>$count) {
                $db->Execute("insert into wordnet3 set gid = $id,words = '$word',$field = $count");
            }
        }
        ?>

    It works fine and does almost exactly what I need, except that it is not UTF-8 friendly: it splits whole words into parts (on special characters) where it shouldn't. My guess is that I should use multibyte functions instead of the regular preg_replace. I tried to replace preg_replace with mb_ereg_replace, but it is not working as it should, at least not for the two- and three-word phrases. Any ideas?
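
    One possible way to look at the problem, sketched in Python rather than PHP: tokenise with a Unicode-aware `\w` and count phrases with a sliding window instead of regex callbacks. This swaps in a different technique from the code above; the function name `mine_phrases` and the `max_words` parameter are illustrative assumptions, and the database writes are left out.

```python
import re
from collections import Counter


def mine_phrases(text, max_words=3):
    """Count phrases of 1..max_words consecutive words (hypothetical helper)."""
    # In Python 3, \w on a str is Unicode-aware, so accented letters stay
    # inside their word instead of splitting it.
    words = re.findall(r"\w+", text.replace("'", "").lower())
    counts = {n: Counter() for n in range(1, max_words + 1)}
    for n in range(1, max_words + 1):
        # Slide a window of n words across the token list.
        for i in range(len(words) - n + 1):
            counts[n][" ".join(words[i:i + n])] += 1
    return counts


phrases = mine_phrases("Žuta čaša, žuta čaša!")
```

    Here `phrases[2]` holds the two-word counts and `phrases[3]` the three-word counts, each keyed by the phrase text.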


  • Splitting Nucleotide Sequences in JS with Regexp

    - by TEmerson
    I'm trying to split up a nucleotide sequence into amino acid strings using a regular expression. I have to start a new string at each occurrence of the string "ATG", but I don't want to actually stop the first match at the "ATG". Valid input is any ordering of a string of As, Cs, Gs, and Ts. For example, given the input string: ATGAACATAGGACATGAGGAGTCA I should get two strings: ATGAACATAGGACATGAGGAGTCA (the whole thing) and ATGAGGAGTCA (the first match of "ATG" onward). A string that contains "ATG" n times should result in n results. I thought the expression /(?:[ACGT]*)(ATG)[ACGT]*/g would work, but it doesn't. If this can't be done with a regexp it's easy enough to just write out the code for, but I always prefer an elegant solution if one is available.
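
    A minimal sketch of one regex-only approach, written in Python: wrapping the whole pattern in a lookahead makes each match zero-width, so the scanner can find overlapping results, and the capture group inside the lookahead carries the text. The function name `frames_from_atg` is an assumption for illustration.

```python
import re


def frames_from_atg(sequence):
    """Return the remainder of the sequence from every 'ATG' onward."""
    # The lookahead consumes nothing, so the next search starts one
    # character later and overlapping results are all reported.
    return [m.group(1) for m in re.finditer(r"(?=(ATG[ACGT]*))", sequence)]


results = frames_from_atg("ATGAACATAGGACATGAGGAGTCA")
```

    The same lookahead-with-capture idea should carry over to JavaScript regular expressions, though that is not tested here.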


  • How to modify complex argument strings in Perl

    - by mmccoo
    I have a cmdline that I'm trying to modify to remove some of the arguments. What makes this complex is that I can have nested arguments. Say that I have this: $cmdline = "-a -xyz -a- -b -xyz -b- -a -xyz -a-" I have three different -xyz flags that are to be interpreted in two different contexts. One is the -a context and the other is the -b context. I want to remove the "a" -xyz's but leave the ones in the "b" -xyz. How can I most effectively do this in Perl?
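
    Because the contexts can nest, a small token scanner with a stack may be easier to reason about than a single substitution. The sketch below is in Python, not Perl, and it assumes that `-a` opens a context, `-a-` closes it, and the set of context names is known in advance; `drop_flag_in_context` and its parameters are hypothetical names.

```python
def drop_flag_in_context(cmdline, flag="-xyz", context="a", contexts=("a", "b")):
    """Remove `flag` wherever the innermost open context is `context`."""
    openers = {f"-{name}": name for name in contexts}
    closers = {f"-{name}-": name for name in contexts}
    stack = []  # currently open contexts, innermost last
    kept = []
    for token in cmdline.split():
        if token in openers:
            stack.append(openers[token])
        elif token in closers:
            if stack and stack[-1] == closers[token]:
                stack.pop()
        elif token == flag and stack and stack[-1] == context:
            continue  # drop the flag inside the unwanted context
        kept.append(token)
    return " ".join(kept)


cleaned = drop_flag_in_context("-a -xyz -a- -b -xyz -b- -a -xyz -a-")
```

    The same loop translates directly to Perl with `split`, a hash for the openers and closers, and an array used as the stack.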


  • [Qt] Check octal number

    - by sterh
    Hello, I am writing a simple application in C++/Qt. I have a text with some octal numbers in it. My app splits this text by spaces, and I need to check the octal numbers in the text. How can I select octal numbers from this text with regular expressions? Thank you.
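
    The question does not say what counts as an octal number, so the sketch below assumes the simplest definition: a token made up only of the digits 0 to 7. It is written in Python rather than C++/Qt, and `find_octal_tokens` is a hypothetical name; the pattern itself is plain enough that it should work unchanged in Qt's regular expression classes.

```python
import re

# Assumption: an octal number is a whole token containing only digits 0-7.
OCTAL = re.compile(r"[0-7]+")


def find_octal_tokens(text):
    """Split on whitespace and keep the tokens that are entirely octal."""
    return [token for token in text.split() if OCTAL.fullmatch(token)]


tokens = find_octal_tokens("mode 755 and 089 or 17")
```

    If a leading zero is required (C-style literals), the pattern could be tightened to `0[0-7]*`.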


  • java - check if string ends with certain pattern

    - by The Learner
    I have a string like: This.is.a.great.place.too.work. (or) This/is/a/great/place/too/work/ Then my Java program should tell me that the sentence is valid and that it has "work". If I have: This.is.a.great.place.too.work.hahahha (or) This/is/a/great/place/too/work/hahahah it should not tell me that there is a "work" in the sentence. So I am looking at Java strings to find a word at the end of the sentence that has "." or "," or "/" before it. How can I achieve that?
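
    A rough sketch of the check in Python (the pattern syntax used here is also valid in `java.util.regex`): require a separator before the word, allow one optional trailing separator, and anchor at the end of the string. The helper name `ends_with_word` is an assumption.

```python
import re


def ends_with_word(sentence, word="work"):
    """True if `word` is the last segment, preceded by '.', ',' or '/'."""
    pattern = r"(?:^|[./,])" + re.escape(word) + r"[./,]?$"
    return re.search(pattern, sentence) is not None


ok = ends_with_word("This.is.a.great.place.too.work.")
```

    The leading separator keeps words such as "homework" from matching, and the `$` anchor rejects anything that follows the final separator.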


  • python and regular expression with unicode

    - by bsn
    I need to delete some Unicode symbols from the string '?????? ??????? ???????????? ??????????' I know they exist here for sure. I try: re.sub('([\u064B-\u0652\u06D4\u0670\u0674\u06D5-\u06ED]+)', '', '?????? ??????? ???????????? ??????????') but it doesn't work. The string stays the same. Any suggestion as to what I am doing wrong?
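
    A likely cause, assuming the code ran on Python 2: a plain '...' literal there is a byte string, so `\u064B` is not read as a code point and both the pattern and the text would need to be `u'...'` literals. In Python 3 every str is Unicode and the same character class works as written. The sketch below uses a made-up sample word, since the original text did not survive; `strip_marks` is a hypothetical name.

```python
import re

# Same character class as in the question; Python 3's re module
# understands \uXXXX escapes inside a pattern.
MARKS = re.compile(r"[\u064B-\u0652\u06D4\u0670\u0674\u06D5-\u06ED]+")


def strip_marks(text):
    """Remove the listed Arabic marks from the text."""
    return MARKS.sub("", text)


# Hypothetical sample: five Arabic letters carrying four vowel marks.
sample = "\u0645\u064E\u0631\u0652\u062D\u064E\u0628\u064B\u0627"
bare = strip_marks(sample)
```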


  • Match Phrases (in array) in text string

    - by Tim Hanssen
    I'm using the Twitter streaming API to collect thousands of tweets every minute. They need to be matched against a list of keywords (which can contain spaces). This is my current method: $text = preg_replace( '/[^a-z0-9]+/i', ' ', strtolower( $data['text'] ) ); $breakout = explode( " ", $text ); $result = array_intersect( $this->_currentTracks, $breakout ); I chop the tweet into words and then match them against my current keywords. This works well for all the keywords without a space, of course. If I wanted to find, for example, "Den Haag", it won't show up, because the string is exploded into words (based on the spaces). Any ideas about how I can do this in a quick way? Kind regards, Tim
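
    One way to keep multi-word keywords intact, sketched in Python rather than PHP: normalise the tweet the same way as before, but instead of exploding it into words, pad it with spaces and test each keyword as a space-delimited substring. The name `match_keywords` is an assumption, and the keywords are assumed to be stored in the same normalised form.

```python
import re


def match_keywords(text, keywords):
    """Return the keywords (single- or multi-word) that occur in the text."""
    # Same normalisation as the question: lowercase, non-alphanumerics to spaces.
    normalised = " " + re.sub(r"[^a-z0-9]+", " ", text.lower()).strip() + " "
    # Padding with spaces makes the substring test respect word boundaries.
    return [kw for kw in keywords if f" {kw.lower()} " in normalised]


hits = match_keywords("Ik woon in Den Haag!", ["den haag", "rotterdam", "haag"])
```

    For a large keyword list, a single precompiled alternation or a trie would probably scale better than one substring test per keyword.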


  • preg_replace replacing with array

    - by Scott
    What I want to do is replace each "[replace]" in the input string with the corresponding value in the replace array. The total number of values will change, but there will always be the same number in the replace array as in the input string. I have tried doing this with preg_replace and preg_replace_callback but I can't get the pattern right for [replace]. I also tried using vsprintf, but the % in <table width="100%"> was messing it up. All help is greatly appreciated!

    Replace array:

        $array = array('value 1','value 2','value 3');

    Input string:

        $string = '
        <table width="100%">
        <tr> <td>Name:</td> <td>[replace]</td> </tr>
        <tr> <td>Date:</td> <td>[replace]</td> </tr>
        <tr> <td>Info:</td> <td>[replace]</td> </tr>
        </table>
        ';

    Desired result:

        <table width="100%">
        <tr> <td>Name:</td> <td>value 1</td> </tr>
        <tr> <td>Date:</td> <td>value 2</td> </tr>
        <tr> <td>Info:</td> <td>value 3</td> </tr>
        </table>
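
    The pattern most likely failed because unescaped square brackets form a character class; the literal text needs `\[replace\]`. A minimal Python sketch of the callback approach follows, where each match takes the next value from an iterator. The name `fill_placeholders` is an assumption, and the sketch relies on the stated guarantee that there are as many values as placeholders.

```python
import re


def fill_placeholders(template, values):
    """Replace each literal "[replace]" with the next value, in order."""
    remaining = iter(values)
    # The brackets are escaped so they match literally instead of
    # defining a character class.
    return re.sub(r"\[replace\]", lambda match: next(remaining), template)


row = fill_placeholders(
    '<table width="100%"><td>[replace]</td><td>[replace]</td></table>',
    ["value 1", "value 2"],
)
```

    The PHP equivalent would be preg_replace_callback with the same escaped pattern and a callback that shifts values off the array.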


  • Normalizing Strings using Regexes

    - by RasputinJones
    How do I match this string "1 & 2" from this string "Foo Bar 1 & 2"? How do I match this string "1, 2 & 3" from this string "Foo Baz 1, 2 & 3"? Trying to split out "Foo Bar" from the string using regexes while using the presence of "1 & 2" or "1, 2 & 3" as conditionals to normalize these strings into "Foo Bar 1" and "Foo Bar 2" or "Foo Baz 1", "Foo Baz 2" and "Foo Baz 3" respectively.
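
    A possible sketch in Python: capture the prefix lazily, capture the trailing list of numbers joined by "," or "&", then expand the list. The name `expand_numbered` is hypothetical, and titles without a trailing number list are assumed to pass through unchanged.

```python
import re

# Lazy prefix, whitespace, then digits joined by "," or "&".
NUMBERED = re.compile(r"(.*?)\s+(\d+(?:\s*[,&]\s*\d+)*)")


def expand_numbered(title):
    """Turn 'Foo Bar 1 & 2' into ['Foo Bar 1', 'Foo Bar 2']."""
    match = NUMBERED.fullmatch(title)
    if match is None:
        return [title]  # no trailing number list: leave unchanged
    prefix, numbers = match.groups()
    return [f"{prefix} {number}" for number in re.findall(r"\d+", numbers)]


titles = expand_numbered("Foo Baz 1, 2 & 3")
```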


  • Form validation in JavaScript with Regexp

    - by Nikita Barsukov
    I have a webpage with an input field where only digits are allowed. The input field has an onkeyup event that starts this validating function:

        function validate() {
            var uah_amount = document.getElementById("UAH").value;
            var allowed = /^\d+$/;
            document.getElementById("error").innerHTML = document.getElementById("UAH").value;
            if (!allowed.test(uah_amount)) {
                document.getElementById("error").style.backgroundColor = "red";
            }
        }

    Everything works as I expect until I hit the Backspace button to remove some characters. In this case the function always behaves as if I had entered letters. How can I correct this?


  • dropping characters from regular expression groups

    - by tcurdt
    The goal: I want to convert a number from the format "10.234,56" to "10234.56" Using this simple approach almost gets us there /([\d\.]+),(\d\d)/ => '\1.\2' The problem is that the first group of the match (of course) still contains the '.' character. So questions are: Is it possible to exclude a character from the group somehow? How would you solve this with a single regexp (I know this is a trivial problem when not using a single regexp)
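
    A group cannot skip characters inside the span it matched, but a single pattern can still do the job if the replacement is computed per match. The Python sketch below matches either separator and uses a callback to drop the dots and turn the comma into a dot; `to_plain_decimal` is a hypothetical name, and the input is assumed to be a well-formed number in the "10.234,56" style.

```python
import re


def to_plain_decimal(number):
    """Convert '10.234,56' to '10234.56' with one substitution pass."""
    # Thousands separators disappear; the decimal comma becomes a dot.
    return re.sub(r"[.,]", lambda m: "" if m.group() == "." else ".", number)


converted = to_plain_decimal("10.234,56")
```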


  • Find multiple patterns with a single preg_match_all in PHP

    - by Mark
    Using PHP and preg_match_all I'm trying to get all the HTML content between the following tags (and the tags also): <p>paragraph text</p> don't take this <ul><li>item 1</li><li>item 2</li></ul> don't take this <table><tr><td>table content</td></tr></table> I can get one of them just fine: preg_match_all("(<p>(.*)</p>)siU", $content, $matches, PREG_SET_ORDER); Is there a way to get all the <p></p> <ul></ul> <table></table> content with a single preg_match_all? I need them to come out in the order they were found so I can echo the content and it will make sense. So if I did a preg_match_all on the above content then iterated through the $matches array it would echo: <p>paragraph text</p> <ul><li>item 1</li><li>item 2</li></ul> <table><tr><td>table content</td></tr></table>
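
    One pattern with an alternation and a backreference can cover all three tags while preserving document order. The sketch is in Python, and the pattern should port to preg_match_all with the same flags. It assumes the tags carry no attributes and that a tag is never nested inside another tag of the same name; `extract_blocks` is a hypothetical name, and an HTML parser is the sturdier choice for real pages.

```python
import re

# \1 forces the closing tag to match whichever opening tag was found.
BLOCK = re.compile(r"<(p|ul|table)>.*?</\1>", re.S | re.I)


def extract_blocks(html):
    """Return the <p>, <ul> and <table> blocks in the order they appear."""
    return [match.group(0) for match in BLOCK.finditer(html)]


content = (
    "<p>paragraph text</p> don't take this "
    "<ul><li>item 1</li><li>item 2</li></ul> don't take this "
    "<table><tr><td>table content</td></tr></table>"
)
blocks = extract_blocks(content)
```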


  • How can I match everything in a string until the second occurrence of a delimiter with a regular expression?

    - by Steve
    I am trying to refine a preg_match_all by finding the second occurrence of a period then a space: <?php $str = "East Winds 20 knots. Gusts to 25 knots. Waters a moderate chop. Slight chance of showers."; preg_match_all ('/(^)((.|\n)+?)(\.\s{2})/',$str, $matches); $dataarray=$matches[2]; foreach ($dataarray as $value) { echo $value; } ?> But it does not work: the {2} occurrence is incorrect. I have to use preg_match_all because I am scraping dynamic HTML. I want to capture this from the string: East Winds 20 knots. Gusts to 25 knots.
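
    In the pattern above, `{2}` applies only to `\s`, so it asks for a period followed by two whitespace characters. To count sentences, the quantifier has to wrap the whole "text up to a period" unit. A Python sketch of that idea follows; `first_sentences` is a hypothetical name, and a sentence is assumed to end at a period followed by whitespace or the end of the string.

```python
import re


def first_sentences(text, count=2):
    """Return the text up to and including the count-th 'period + space'."""
    # The group "lazy text, period, whitespace or end" is repeated as a whole.
    pattern = r"(?:.*?\.(?:\s|$)){%d}" % count
    match = re.match(pattern, text, re.S)
    return match.group(0).strip() if match else text


forecast = ("East Winds 20 knots. Gusts to 25 knots. "
            "Waters a moderate chop. Slight chance of showers.")
summary = first_sentences(forecast)
```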


  • Some pro regular expressions help needed here

    - by Camran
    I need a special regular expression and have no experience with them whatsoever, so I am turning to you guys on this one. I need to validate a classifieds title field so that it has almost no special characters in it. Only letters and numbers should be allowed, plus the three Swedish letters å, ä, ö, and the check should not be case sensitive. Besides the above, these should also be allowed: the "&" sign; parentheses "()"; the mathematical signs "-", "+", "%", "/", "*"; the dollar and euro signs; the accent sign (or whatever it's called), for example the mark above the "e" in "coupé"; double and single quote signs; the comma "," and point "." signs. Thanks
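
    A whitelist character class is probably the simplest fit. The Python sketch below makes two assumptions that the question does not state: spaces are allowed between words, and "é" is the only accented letter needed beyond å, ä, ö. The name `is_valid_title` is hypothetical.

```python
import re

# Letters, digits, Swedish letters, é, the listed signs, and (assumed) spaces.
# The hyphen sits last in the class so it is read literally.
TITLE = re.compile(r"""[A-Za-z0-9åäöÅÄÖéÉ&()+%/*$€"',. -]+""")


def is_valid_title(title):
    """True if the title is non-empty and uses only whitelisted characters."""
    return TITLE.fullmatch(title) is not None


valid = is_valid_title("Volvo coupé 2010 (50% off) & more")
```

    Listing both cases explicitly avoids relying on case-insensitive matching for the non-ASCII letters.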


  • Using variable in re.match in python

    - by screwuphead
    I am trying to create an array of things to match in a description line, so I can ignore them later on in my script. Below is a sample script that I have been working on, on the side. Basically I am trying to take a bunch of strings and match them against a bunch of other strings. AKA: asdf or asfs or wrtw in string = true, continue with script; if not, print this. import re ignorelist = ['^test', '(.*)set'] def guess(a): for ignore in ignorelist: if re.match(ignore, a): return('LOSE!') else: return('WIN!') a = raw_input('Take a guess: ') print guess(a) Thanks
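
    The indentation of the script was lost, so this is a guess at the problem: if the `else` belongs to the `if`, the function returns after testing only the first pattern, and anything that matches a later pattern is reported as 'WIN!'. A Python 3 sketch that checks every pattern before deciding is shown below; the constant name `IGNORE_PATTERNS` is an assumption.

```python
import re

IGNORE_PATTERNS = [r"^test", r"(.*)set"]


def guess(text):
    """Return 'LOSE!' if any ignore pattern matches, otherwise 'WIN!'."""
    # any() stops at the first match but only gives up after
    # every pattern has been tried.
    if any(re.match(pattern, text) for pattern in IGNORE_PATTERNS):
        return "LOSE!"
    return "WIN!"


outcome = guess("reset")
```

    With this version "reset" is caught by the second pattern even though the first one does not match.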


  • Extracting numbers from a url using javascript?

    - by stormist
    var exampleURL = '/example/url/345234/test/'; var numbersOnly = [?] The /url/ and /test portions of the path will always be the same. Note that I need the numbers between /url/ and /test. In the example URL above, the placeholder word example might be numbers too from time to time but in that case it shouldn't be matched. Only the numbers between /url/ and /test. Thanks!
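
    Anchoring the digits between the two fixed path segments avoids picking up numbers elsewhere in the URL. The sketch is in Python; the same pattern should work in JavaScript as `/\/url\/(\d+)\/test/` with `exampleURL.match(...)`. The name `number_between` is hypothetical.

```python
import re


def number_between(url):
    """Return the digits between '/url/' and '/test', or None if absent."""
    match = re.search(r"/url/(\d+)/test", url)
    return match.group(1) if match else None


numbers_only = number_between("/example/url/345234/test/")
```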


  • regular expression - function body extracting

    - by Altariste
    Hi, in a Python script, for every method definition in some C++ code of the form: return_value ClassName::MethodName(args) {MethodBody}, I need to extract three parts: the class name, the method name and the method body for further processing. Finding and extracting the ClassName and MethodName is easy, but is there any simple way to extract the body of the method, with all possible '{' and '}' inside it? Or are regexes unsuitable for such a task?
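
    Python's `re` module cannot match arbitrarily nested braces, so a common compromise is to let a regex find the method header and then count braces by hand. The sketch below does that; `extract_methods` and `HEADER` are hypothetical names, and it assumes no braces appear inside string literals or comments and no parentheses appear inside the argument list.

```python
import re

# Matches "ClassName::MethodName(args) {" and captures both names.
HEADER = re.compile(r"(\w+)::(\w+)\s*\([^)]*\)\s*\{")


def extract_methods(source):
    """Return (class_name, method_name, body) for each method definition."""
    methods = []
    position = 0
    while True:
        header = HEADER.search(source, position)
        if header is None:
            break
        depth = 1  # the header already consumed the opening brace
        index = header.end()
        while index < len(source) and depth:
            if source[index] == "{":
                depth += 1
            elif source[index] == "}":
                depth -= 1
            index += 1
        if depth:
            break  # unbalanced braces: stop rather than guess
        methods.append((header.group(1), header.group(2),
                        source[header.end():index - 1]))
        position = index
    return methods


code = "int Foo::Bar(int x) { if (x) { return 1; } return 0; }\nvoid Foo::Baz() { }"
found = extract_methods(code)
```

    For production use, a real C++ parser is the safer route, since macros, templates and string literals can all defeat this kind of scan.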

