Search Results

Search found 5493 results on 220 pages for 'boost regex'.

Page 113/220 | < Previous Page | 109 110 111 112 113 114 115 116 117 118 119 120  | Next Page >

  • How do you implement a good profanity filter?

    - by Ben Throop
    Many of us need to deal with user input, search queries, and situations where the input text can potentially contain profanity or undesirable language. Oftentimes this needs to be filtered out. Where can one find a good list of swear words in various languages and dialects? Are there APIs available to sources that contain good lists? Or maybe an API that simply says "yes this is clean" or "no this is dirty" with some parameters? What are some good methods for catching folks trying to trick the system, like a$$, azz, or a55? Bonus points if you offer solutions for PHP. :) Edit: Response to answers that say simply avoid the programmatic issue: I think there is a place for this kind of filter when, for instance, a user can use public image search to find pictures that get added to a sensitive community pool. If they can search for "penis", then they will likely get many pictures of, yep. If we don't want pictures of that, then preventing the word as a search term is a good gatekeeper, though admittedly not a foolproof method. Getting the list of words in the first place is the real question. So I'm really referring to a way to figure out of a single token is dirty or not and then simply disallow it. I'd not bother preventing a sentiment like the totally hilarious "long necked giraffe" reference. Nothing you can do there. :)

    Read the article

  • Beginner questions on Java Regular Expression

    - by Robert
    Hello everyone. I began studying Java Regular Expression recently and I found some really intersting task.For example,I now need to dig out "Product Name","Product Description" and "Sellers for this product" out of the following HTML code.(I am sorry for the big chunck of code,but it is very straightforward) <td class="sr-check"> <input type="checkbox" name="cptitle" value="678560038" /></td> <td class="sr-image" style="width: 80px;"><a href="/Nikon-D300S-12-3-678560038/prices-html" class="strictRule" rel="nofollow"><img src="http://img01.static-nextag.com/image/Nikon-D300S-12-3-MP-Digital-SLR-Camera-Body-Black/0/000/006/789/461/678946110.jpg" alt="Nikon D300S 12.3 MP Digital SLR Camera Body - Black" class="imageLink strictRule" height="75" width="75" id="opILink_0" title="Nikon Digital Cameras - Nikon D300S 12.3 MP Digital SLR Camera Body - Black" /></a><div class="breaker">&nbsp;</div></td> <td class="sr-info"> <div class="sr-info"> <a id="opPNLink_0" class="underline" style="font-size:16px" href="/Nikon-D300S-12-3-678560038 /prices-html" >Nikon D300S 12.3 MP <b>Digital</b> SLR <b>Camera</b> Body - Black</a> <div class="sr-subinfo"> <div class="sr-info-description">SLR - 13.1MP, 12.3MP - 1x Optical Zoom - CompactFlash, SD/MMC Memory Card - 3in.</div> <div class="rating"> <img src="http://img01.static-nextag.com/imagefiles/stars/stars4_10px.gif" alt="4/5 stars" title="4/5 stars" /> (92 user ratings)</div> <div style="clear: both;"> <!-- nxtginc=nextag.api.ServerInclude$JSPIncludeWriter(/buyer/ATLSSI.jsp?ptid=678560038&dts=y) --> <a id="_atl_0" style="" href="http://www.nextag.com/serv/main/buyer/MyPDir.jsp?list=_transCookieList&amp;cmd=add&amp;ptitle=678560038" rel="nofollow">+ Add to Shopping List</a> &nbsp;|&nbsp; <!-- endnxtginc --> <a rel="nofollow" id="mltLink_0" class="mlt-link" href="/Digital-Cameras--zz500001z2z678560038zB2dgz5---html">See More Like This</a> </div> <div id="fsLink_0" class="featuredSeller"> <a rel="nofollow" class="featuredSeller" id="opFSLink_0_0" href="/norob/PtitleSeller.jsp?chnl=main&amp;tag=785646073amp;ctx=x%2BN%2Fs9zy56l4u8RXCzALE1jeLesDMzeK09rPQEdK3Yjx395ZzX9cMh9N5JAxjk7xPqF9hjk2ztM5IRXU5nspLubIXYaVzI%2B%2Fg7h1Qz58TzgvrWuNawV8qEIqqSmClArWMq6mpzNRuSlgg2xCXYObNnaIH00iKSUmBawDRvecwbCpAxhXgXoLEiEinTwr3EipComdzxL9UHFYTLoWUToUB5SRSsolQmEJ3mgnnvu83%2FC8W34TGpN9mJo%2BnyAeTkt4&amp;ptitle=678560038" target="_blank" >Thundercameras</a>:$1,289 &nbsp; <a rel="nofollow" class="featuredSeller" id="opFSLink_0_1" href="/norob/PtitleSeller.jsp?chnl=main&amp;tag=797076595&amp;ctx=x%2BN%2Fs9zy56l4u8RXCzALE1jeLesDMzeK09rPQEdK3Yjx395ZzX9cMh9N5JAxjk7xPqF9hjk2ztM5IRXU5nspLubIXYaVzI%2B%2Fg7h1Qz58TzgvrWuNawV8qEIqqSmClArWMq6mpzNRuSlgg2xCXYObNrcWLhL%2BhryuAGhXNhYSPE%2BpAxhXgXoLEiEinTwr3EipComdzxL9UHFYTLoWUToUB5SRSsolQmEJ3mgnnvu83%2FC8W34TGpN9mJo%2BnyAeTkt4&amp;ptitle=678560038" target="_blank" >PhotoVideoSuperStore</a>:$1,269 &nbsp; <a rel="nofollow" class="featuredSeller" id="opFSLink_0_2" href="/norob/PtitleSeller.jsp?chnl=main&amp;tag=803555293&amp;ctx=x%2BN%2Fs9zy56l4u8RXCzALE1jeLesDMzeK09rPQEdK3Yjx395ZzX9cMh9N5JAxjk7xPqF9hjk2ztM5IRXU5nspLubIXYaVzI%2B%2Fg7h1Qz58TzgvrWuNawV8qEIqqSmClArWMq6mpzNRuSlgg2xCXYObNt06qcvLJ5UQz7S3zKd4urWpAxhXgXoLEiEinTwr3EipComdzxL9UHFYTLoWUToUB5SRSsolQmEJ3mgnnvu83%2FC8W34TGpN9mJo%2BnyAeTkt4&amp;ptitle=678560038" target="_blank" >Digitalelect</a>:$1,279 &nbsp;</div> I would think of : (1) digging out the product name from <td class="sr-image >tag,and using regular expression exp ="<td><span\\s+class=\"sr-image\"[^>]*>" + ".*?</span><a href=\"" + "([^\"]+)" + "\"[^>]*>" + "([^<]+)" + "</a>.*?</td>"; (2) digging out the product info from the <div class="sr-info-description"> tag. exp = "<div class="sr-info-description"> [^>]*>" (3) digging out the Sellers' names from <div id="fsLink_0" class="featuredSeller"> tag. exp = "<div id="fslink_0" class="featuredSeller[^>]*>" + ".*?</span><a rel=\"" + "([^\"]+)" + "\"[^>]*>" + "([^<]+)" + "</a>.*?</td>"; I am just beginning learing using Java Regular Expression,I would be grateful if you could correct me if I am in the wrong track or my regular expressiona are wrong. Thanks a lot,guys.

    Read the article

  • regular expression of 0's and 1's

    - by Lopa
    Hello all I got this question which asks me to figure out why is it foolish to write a regular expression for the language that consists of strings of 0's and 1's that are palindromes( they read the same backwards and forwards). part 2 of the question says using any formal mechanism of your choice, show how it is possible to express the language that consists of strings of 0's and 1's that are palindromes?

    Read the article

  • Perl: Edit hyperlinks in nested tags that aren't on seperate lines

    - by user305801
    I have an interesting problem. I wrote the following perl script to recursively loop through a directory and in all html files for img/script/a tags do the following: Convert the entire url to lowercase Replace spaces and %20 with underscores The script works great except when an image tag in wrapped with an anchor tag. Is there a way to modify the current script to also be able to manipulate the links for nested tags that are not on separate lines? Basically if I have <a href="..."><img src="..."></a> the script will only change the link in the anchor tag but skip the img tag. #!/usr/bin/perl use File::Find; $input="/var/www/tecnew/"; sub process { if (-T and m/.+\.(htm|html)/i) { #print "htm/html: $_\n"; open(FILE,"+<$_") or die "couldn't open file $!\n"; $out = ''; while(<FILE>) { $cur_line = $_; if($cur_line =~ m/<a.*>/i) { print "cur_line (unaltered) $cur_line\n"; $cur_line =~ /(^.* href=\")(.+?)(\".*$)/i; $beg = $1; $link = html_clean($2); $end = $3; $cur_line = $beg.$link.$end; print "cur_line (altered) $cur_line\n"; } if($cur_line =~ m/(<img.*>|<script.*>)/i) { print "cur_line (unaltered) $cur_line\n"; $cur_line =~ /(^.* src=\")(.+?)(\".*$)/i; $beg = $1; $link = html_clean($2); $end = $3; $cur_line = $beg.$link.$end; print "cur_line (altered) $cur_line\n"; } $out .= $cur_line; } seek(FILE, 0, 0) or die "can't seek to start of file: $!"; print FILE $out or die "can't print to file: $1"; truncate(FILE, tell(FILE)) or die "can't truncate file: $!"; close(FILE) or die "can't close file: $!"; } } find(\&process, $input); sub html_clean { my($input_string) = @_; $input_string = lc($input_string); $input_string =~ s/%20|\s/_/g; return $input_string; }

    Read the article

  • Rewrite Query String

    - by Virgil
    Hello, I am trying to write some mod_rewrite rules to generate thumbnails on the fly. So when this url example.com/media/myphoto.jpg?width=100&height=100 the script should rewrite it to example.com/media/myphoto-100x100.jpg and if the file exists on the disk it gets served by Apache and if it doesn't exist it is called a script to generate the file. I wrote this RewriteCond %{QUERY_STRING} ^width=(\d+)&height=(\d+) RewriteRule ^media/([a-zA-Z0-9_\-]+)\.([a-zA-Z0-9]+)$ media/$1-%1x%2.$2 [L] RewriteCond %{QUERY_STRING} ^(.+)? RewriteRule ^media/([a-zA-Z0-9_\-\._]+)$ media/index.php?file=$1&%1 [L] and I get infinite internal redirects. The first condition is matched and the rule is executed and right after that I get an internal redirect. I need advice to finish this script. Thank you.

    Read the article

  • preg_match , regexp , php , ignore white spaces and new lines

    - by Michael
    I'm trying to extract richard123 using php preg_replace but there are a lot of white spaces and new lines and I think because of that my regexp doesn't work . The html can be seen here : http://pastebin.com/embed_iframe.php?i=vuD3z9ij My current preg_match is : $find = "/< tr bgcolor=\"F0F0F0\" valign=\"middle\">< td align=\"left\">< font size=\"-1\">(.*)<\/font><\/td>/"; preg_match_all($find, $res, $matches2); print_r($matches2); I also tried <\/td/s"; <\/td/m"; <\/td/x"; but doesn't work either .

    Read the article

  • Python - Strange Behavior in re.sub

    - by Greg
    Here's the code I'm running: import re FIND_TERM = r'C:\\Program Files\\Microsoft SQL Server\\90\\DTS\\Binn\\DTExec\.exe' rfind_term = re.compile(FIND_TERM,re.I) REPLACE_TERM = 'C:\\Program Files\\Microsoft SQL Server\\100\\DTS\\Binn\\DTExec.exe' test = r'something C:\Program Files\Microsoft SQL Server\90\DTS\Binn\DTExec.exe something' print rfind_term.sub(REPLACE_TERM,test) And the result I get is: something C:\Program Files\Microsoft SQL Server@\DTS\Binn\DTExec.exe something Why is there an @ sign?

    Read the article

  • Non greedy grep

    - by syker
    I want to grep the shortest match and the pattern should be something like: <car ... model=BMW ...> ... ... ... </car> ... means any character and the input is multiple lines.

    Read the article

  • Codeigniter Routes for filename with extension

    - by thehuby
    I am using codeigniter and its routes system successfully with some lovely regexp, however I have come unstuck on what should be an easy peasy thing in the system. I want to include a bunch of search engine related files (for Google webmaster etc.) plus the robots.txt file, all in a controller. So, I have create the controller and updated the routes file and don't seem to be able to get it working with these files. Here's a snip from my routes file: $route['robots\.txt|LiveSearchSiteAuth\.xml'] = 'search_controller/files'; Within the function I use the URI helper to figure out which content to show. Now I can't get this to match, which points to my regexp being wrong. I'm sure this is a really obvious one but its late and my caffeine tank is empty :)

    Read the article

  • Building a regexp to split a string

    - by Kivin
    I'm seeking a solution to splitting a string which contains text in the following format: "abcd efgh 'ijklm no pqrs' tuv" which will produce the following results: ['abcd', 'efgh', 'ijklm no pqrs', 'tuv'] In otherwords, it splits by whitespace unless inside of a single quoted string. I think it could be done with .NET regexps using "Lookaround" operators, particularly balancing operators. I'm not so sure about perl.

    Read the article

  • RegularExpression-esque search matching Objects in List

    - by Pindatjuh
    I'm currently working on an implementation of the following idea, and I was wondering if there is any literature on this subject. Working with Java, but the principle applies on any language with a decent type-system, I like to implement: matching Objects from a List using a RegularExpression-esque search: So let's say I have a List containing List<Object> x = new ArrayList<Object>(); x.add(new Object()); x.add("Hello World"); x.add("Second String"); x.add(5); // Integer (auto-boxing) x.add(6); // Integer Then I create a "Regular Expression" (not working with a stream of characters, but working with a stream of Objects), and instead of character-classes, I use type-system properties: [String][Integer] And this would match one sublist: {Match["Second String", 5]}. The expression: [String:length()<15] Will match two sublist (each of length 1) containing a String which instance is passing the expression instance.length() < 5: {Match["Hello World"],Match["Second String"]}. [Object][Object] Matches any pair in the List: {Match[Object,"Hello World"],Match["Second String", 5]}, in a streamed manner (no overlapping matches). Ofcourse, my implementation will have grouping, lookahead/lookbehinds and is hierarchical (i.e. matching n elements from Lists in Lists), etc. The above merely illustrates the concept. Is there a name for this principle, and is there literature available on it?

    Read the article

  • How to remove a tab attribute in ASP .NET AJAX Toolkit using Regular Expression

    - by Nassign
    I have tried to remove the following tag generated by the AJAX Control toolkit. The scenario is our GUI team used the AJAX control toolkit to make the GUI but I need to move them to normal ASP .NET view tag using MultiView. I want to remove all the __designer: attributes Here is the code <asp:TextBox ID="a" runat="server" __designer:wfdid="w540" /> <asp:DropdownList ID="a" runat="server" __designer:wfdid="w541" /> ..... <asp:DropdownList ID="a" runat="server" __designer:wfdid="w786" /> I tried to use the regular expression find replace in Visual Studio using: Find: :__designer\:wfdid="w{([0-9]+)}" Replace with empty space Can any regular expression expert help?

    Read the article

  • Backreferences syntax in replacement strings (why dollar sign?)

    - by polygenelubricants
    In Java, and it seems in a few other languages, backreferences in the pattern is preceded by a slash (e.g. \1, \2, \3, etc), but in a replacement string it's preceded by a dollar sign (e.g. $1, $2, $3, and also $0). Here's a snippet to illustrate: System.out.println( "left-right".replaceAll("(.*)-(.*)", "\\2-\\1") // WRONG!!! ); // prints "2-1" System.out.println( "left-right".replaceAll("(.*)-(.*)", "$2-$1") // CORRECT! ); // prints "right-left" System.out.println( "You want million dollar?!?".replaceAll("(\\w*) dollar", "US\\$ $1") ); // prints "You want US$ million?!?" System.out.println( "You want million dollar?!?".replaceAll("(\\w*) dollar", "US$ \\1") ); // throws IllegalArgumentException: Illegal group reference Questions: Is the use of $ for backreferences in replacement strings unique to Java? If not, what language started it? What flavors use it and what don't? Why is this a good idea? Why not stick to the same pattern syntax? Wouldn't that lead to a more cohesive and an easier to learn language? Wouldn't the syntax be more streamlined if statements 1 and 4 in the above were the "correct" ones instead of 2 and 3?

    Read the article

  • regular expression to extract @name symbols from tweet

    - by Joey
    Hello All, I would like to use regular expression to extract only @patrick @michelle from the following sentence: @patrick @michelle we having diner @home tonight do you want to join? Note: @home should not be include in the result because, it is not at beginning of the sentence nor is followed by another @name. Any solution, tip, comments will be really appreciated.

    Read the article

  • Extract number from string in MSBuild

    - by Ole Lynge
    I would like to extract the number from a string in MSBuild. How can I do that using the built in tasks or the MSBuild.Community.Tasks? (RegexMatch might do, but how?) Example: I have the string agent0076 and I would like to get out the number, without the leading zeros: 76

    Read the article

  • preg_replace Pattern

    - by codeworxx
    Hey Guys, i'm not very firm with preg_replace - in other Words i do not really understand - so i hope you can help me. I have a string in a Text like this one: [demo category=1] and want to replace with the Content of Category (id=1) e.g. "This is the Content of my first Category" This is my startpoint Pattern - that's all i have: '/[demo\s*.*?]/i'; Hope you can help? Thanks, Sascha

    Read the article

  • How can I split a string by whitespace unless inside of a single quoted string?

    - by Kivin
    I'm seeking a solution to splitting a string which contains text in the following format: "abcd efgh 'ijklm no pqrs' tuv" which will produce the following results: ['abcd', 'efgh', 'ijklm no pqrs', 'tuv'] In other words, it splits by whitespace unless inside of a single quoted string. I think it could be done with .NET regexps using "Lookaround" operators, particularly balancing operators. I'm not so sure about Perl.

    Read the article

  • Python re.IGNORECASE being dynamic

    - by Adam Nelson
    I'd like to do something like this: re.findall(r"(?:(?:\A|\W)" + 'Hello' + r"(?:\Z|\W))", 'hello world',re.I) And have re.I be dynamic, so I can do case-sensitive or insensitive comparisons on the fly. This works but is undocumented: re.findall(r"(?:(?:\A|\W)" + 'Hello' + r"(?:\Z|\W))", 'hello world',1) To set it to sensitive. Is there a Pythonic way to do this? My best thought so far is: if case_sensitive: regex_senstive = 1 else: regex_sensitive = re.I re.findall(r"(?:(?:\A|\W)" + 'Hello' + r"(?:\Z|\W))", 'hello world',regex_sensitive)

    Read the article

  • regular expression help

    - by JPro
    I always get confused using regular expressions. Can anyone please suggest me a tutorial? I need help with checking for a string which, cannot contain any wild characters except colon, comma, full stop. It will be better to replace these if found. Any help? Thanks.

    Read the article

  • Regular Expression help

    - by user104628
    hi everyone. I can't seem to make my regular expression work. I'd like to have some alpha text, no numbers, an underscore and then some more aplha text. for example: blah_blah I have an non-working example here ^[a-z][_][a-z]$ Thanks in advance people. EDIT: I applogize, I'd like to enforce the use of all lower case.

    Read the article

< Previous Page | 109 110 111 112 113 114 115 116 117 118 119 120  | Next Page >