How can I parse a string like:
name1="val1" name2="val2" name3="val3"
I cannot use split(\s+) because a pair can also look like name = "val 1".
I am using Java, but any language is okay.
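One possible approach, sketched in Python since any language is acceptable: match each name/value pair directly instead of splitting on whitespace. The function name parse_pairs and the sample input are illustrative assumptions, and the sketch assumes values never contain an escaped double quote.

```python
import re

# A name, optional spaces around '=', then a double-quoted value
PAIR_RE = re.compile(r'(\w+)\s*=\s*"([^"]*)"')

def parse_pairs(text):
    """Return a dict of name -> value for every name="value" pair found."""
    return dict(PAIR_RE.findall(text))

print(parse_pairs('name1="val1" name2 = "val 2" name3="val3"'))
```

The same pattern should carry over to java.util.regex with a Matcher.find() loop, with the backslashes doubled inside the Java string literal.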
x = "abcdefg"
x = x.match(/ab(?:cd)ef/)
Shouldn't x be abef? It is not; it is actually abcdef.
Why is my ?: not having any effect? (Of course, my understanding could very well be wrong.)
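A non-capturing group still has to match its text, so that text stays in the overall match; the group is only left out of the numbered captures. Here is a minimal sketch in Python, whose re module treats (?:...) the same way in this respect. The variable names whole and kept are illustrative. To end up with "abef", one option is to capture the parts to keep and join them.

```python
import re

x = "abcdefg"

# (?:cd) still has to match "cd", so "cd" is part of the overall match
whole = re.match(r"ab(?:cd)ef", x).group(0)

# Capture only the pieces to keep, then join them
m = re.match(r"(ab)(?:cd)(ef)", x)
kept = "".join(m.groups())

print(whole)
print(kept)
```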
I know JavaScript regular expressions can ignore case for the entire match, but what about just the first character? Then tuesday would match Tuesday but not TUESDAY.
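One way to get this without a case-insensitive flag is a character class for the first letter only. A small sketch in Python follows; the character class syntax is the same in JavaScript, and the helper name is_tuesday is an assumption made for illustration.

```python
import re

# Only the first letter may be either case; the rest must be lowercase
DAY_RE = re.compile(r"[Tt]uesday")

def is_tuesday(word):
    """True if word is 'tuesday' or 'Tuesday', and nothing else."""
    return DAY_RE.fullmatch(word) is not None

for candidate in ("tuesday", "Tuesday", "TUESDAY"):
    print(candidate, is_tuesday(candidate))
```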
The text follows this pattern
<tr class="text" (any sequence of characters here, except ABC)ABC(any sequence of characters here, except ABC)
<tr class="text" (any sequence of characters here, except ABC)ABC(any sequence of characters here, except ABC)
<tr class="text" (any sequence of characters here, except ABC)ABC(any sequence of characters here, except ABC)
<tr class="text" (any sequence of characters here, except ABC)ABC(any sequence of characters here, except ABC)
So basically the above line might repeat multiple times, and the idea is to retrieve the first 3 characters immediately after ABC.
I have tried regular expressions along the lines of
\<tr class="text" [.]+ABC(?<capture>[.]{3})
but they all fail. Can someone give me a hint?
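A hint, as a hedged sketch in Python: inside a character class the dot is literal, so [.]+ only matches actual dots. One common way to express "any characters except the sequence ABC" is a negative lookahead checked before each character. The sample text below is invented for illustration.

```python
import re

# (?:(?!ABC).)* consumes characters one at a time, stopping before "ABC"
ROW_RE = re.compile(r'<tr class="text" (?:(?!ABC).)*ABC(?P<capture>.{3})')

sample = (
    '<tr class="text" foo ABCxyz bar\n'
    '<tr class="text" 123ABC789 tail\n'
)

captures = [m.group("capture") for m in ROW_RE.finditer(sample)]
print(captures)
```

Named groups are written (?P<name>...) in Python; other flavors use (?<name>...) as in the attempt above.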
Hi
I have an array whose first element might contain something like [some text, here. That's some text]
I'm trying to figure out a pattern to check whether such a string exists and, if not, create it, but I'm having a problem with the pattern. Here's what I've done so far:
$pattern = '/^\[*\]$/';
if(preg_match($pattern,$exploded[0])){
$name = array_shift($exploded);
}else{
$name = "[Unnamed import] - " .gmdate("His");
}
But I always get [Unnamed import] - 032758, even when I'm sure the pattern should match.
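In regex syntax, \[* means "zero or more [ characters" rather than "[ followed by anything", so the pattern above only matches strings such as "]" or "[[]". A minimal sketch of the likely intent, written in Python instead of PHP; the function name pick_name is an assumption for illustration, and the same pattern text should work with preg_match.

```python
import re
import time

# '[' at the start, any characters, ']' at the end
NAME_RE = re.compile(r"^\[.*\]$")

def pick_name(exploded):
    """Pop and return a leading [bracketed] element, else build a default name."""
    if exploded and NAME_RE.match(exploded[0]):
        return exploded.pop(0)
    # Same shape as gmdate("His"): hours, minutes, seconds in UTC
    return "[Unnamed import] - " + time.strftime("%H%M%S", time.gmtime())

items = ["[some text, here. That's some text]", "rest"]
print(pick_name(items), items)
```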
Hello, I have a string:
CriteriaCondition={FieldName={EPS}$MinValue=(-201)$MaxValue=(304)$TradingPeriod=(-1)}
Help me get the first word, which ends just before the first "={", and then the text that follows it, which ends with the last "}".
The result must be:
Word1 = "CriteriaCondition"
Word2 = "FieldName={EPS}$MinValue=(-201)$MaxValue=(304)$TradingPeriod=(-1)"
And for the string "FieldName=(EPS)$MinValue=(-201)$MaxValue=(304)$TradingPeriod=(-1)", help me split it into pairs:
FieldName EPS
MinValue -201
MaxValue 304
TradingPeriod -1
Thanks.
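A possible two-step sketch in Python, assuming the input has the shape shown in the expected Word2 (braces around EPS) and that values never contain a closing bracket. The variable names are illustrative.

```python
import re

text = "CriteriaCondition={FieldName={EPS}$MinValue=(-201)$MaxValue=(304)$TradingPeriod=(-1)}"

# Word1 is everything before the first "={"; Word2 runs up to the final "}"
outer = re.fullmatch(r"(\w+)=\{(.*)\}", text)
word1, word2 = outer.group(1), outer.group(2)

# Each pair is name=(value) or name={value}
pairs = re.findall(r"(\w+)=[({]([^)}]*)[)}]", word2)

print(word1)
print(word2)
for name, value in pairs:
    print(name, value)
```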
Any idea how to find and remove the font-size declaration in an HTML style attribute?
e.g. <span style="font-size:12px">hello world</span>
I would like to remove all font-size declarations using JavaScript.
Thank you
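As a rough sketch of the regex side of this, here is a Python version; the pattern itself should carry over to a JavaScript replace call with the g and i flags. The function name strip_font_size is an assumption. The sketch also removes any font-size text that appears outside a style attribute, so it is only suitable where that cannot happen.

```python
import re

# "font-size", a colon, the value up to ';' or a quote, and an optional ';'
FONT_SIZE_RE = re.compile(r"font-size\s*:\s*[^;\"']+;?\s*", re.IGNORECASE)

def strip_font_size(html):
    """Remove every font-size declaration from the given markup string."""
    return FONT_SIZE_RE.sub("", html)

print(strip_font_size('<span style="font-size:12px">hello world</span>'))
```

In a browser, clearing element.style.fontSize on each element is probably more robust than editing the markup as text.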
I have the following input string:
key1 = "test string1" ; key2 = "test string 2"
I need to convert it to the following without tokenizing
key1="test string1";key2="test string 2"
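One way to do this with a single substitution, sketched in Python: match either a quoted string (kept as-is) or a run of whitespace (dropped). It assumes quoted values never contain an escaped double quote, and the function name compact is illustrative.

```python
import re

# Alternative 1 captures a quoted string; alternative 2 is plain whitespace
TOKEN_RE = re.compile(r'("[^"]*")|\s+')

def compact(text):
    """Remove whitespace outside double-quoted strings."""
    return TOKEN_RE.sub(lambda m: m.group(1) or "", text)

print(compact('key1 = "test string1" ; key2 = "test string 2"'))
```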
Hi,
I'm trying to extract a number from a string.
And do something like this [0-9]+ on this string "aaaa12xxxx" and get "12".
I thought it would be something like:
> grep("[0-9]+","aaa12xxx", value=TRUE)
[1] "aaa12xxx"
And then I figured...
> sub("[0-9]+", "\\1", "aaa12xxxx")
[1] "aaa12xxx"
But I got some form of response doing:
> sub("[0-9]+", "ARGH!", "aaa12xxxx")
[1] "aaaARGH!xxx"
There's a small detail I'm missing. Please advise :-)
I'm using R version 2.10.1 (2009-12-14)
Thanks !
Comments on the solution
The best solution is to ignore the standard functions and install Hadley Wickham's stringr package to get something that actually makes sense.
Kudos to Marek for figuring out how the standard library worked.
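For comparison, here is the same "search and return only the matched part" operation sketched in Python, where the search function returns the match itself instead of the whole input string. The helper name first_number is an assumption. In the R attempts above, "\\1" refers to a capture group, and the pattern "[0-9]+" has none.

```python
import re

def first_number(text):
    """Return the first run of digits in text, or None if there is none."""
    m = re.search(r"[0-9]+", text)
    return m.group(0) if m else None

print(first_number("aaaa12xxxx"))
```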
I am working with a generic view in Django. I want to capture a named group parameter in the URL and pass the value to the URL pattern dictionary. For example, in the URLConf below, I want to capture the parent_slug value in the URL and pass it to the queryset dictionary value like so:
urlpatterns = patterns('django.views.generic.list_detail',
(r'^(?P<parent_slug>[-\w]+)$', 'object_list', {'queryset':Promotion.objects.filter(category=parent_slug)}, 'promo_promotion_list'),
)
Is this possible to do in one URLConf entry, or would it be wiser if I create a custom view to capture the value and pass the queryset directly to the generic view from my overridden view?
I'm trying to match a SEDOL (exactly 7 chars: 6 alpha-numeric chars followed by 1 numeric char)
My regex
([A-Z 0-9]{6})[0-9]{1}
matches correctly but strings greater than 7 chars that begin with a valid match also match (if you see what I mean :)). For example:
B3KMJP4
matches correctly but so does:
B3KMJP4x
which shouldn't match.
Can anyone show me how to avoid this?
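The usual fix is to anchor the pattern so that the whole string has to match, either with ^ and $ or with a full-match call. A minimal sketch in Python follows. The space inside the original character class is dropped here on the assumption that it was unintended, the SEDOL check digit is not validated, and the name is_sedol is illustrative.

```python
import re

# Six alphanumeric characters followed by one digit, and nothing else
SEDOL_RE = re.compile(r"[A-Z0-9]{6}[0-9]")

def is_sedol(code):
    """True only if the entire string has the 7-character SEDOL shape."""
    return SEDOL_RE.fullmatch(code) is not None

for code in ("B3KMJP4", "B3KMJP4x"):
    print(code, is_sedol(code))
```

In flavors without a full-match call, ^[A-Z0-9]{6}[0-9]$ expresses the same constraint.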
When is it a good design decision to use the ternary operator over an if-then-else clause? Is there any efficiency hit? Do they get compiled to the same code? Do you think one is more readable than the other in some cases?
In my snippet below, the non-capturing group "(?:aaa)" should be ignored in matching result,
so the result should be "_bbb" only.
However, I get "aaa_bbb" in matching result; only when I specify group(2) does it show "_bbb".
import re
string1 = "aaa_bbb"
print(re.match(r"(?:aaa)(_bbb)", string1).group())
>>> aaa_bbb
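A short sketch of the difference: group() with no argument returns the whole match, which includes text consumed by non-capturing groups, while numbered groups count only capturing parentheses. If "aaa" must precede the match without being part of it, a lookbehind is one option. The variable names are illustrative.

```python
import re

string1 = "aaa_bbb"
m = re.match(r"(?:aaa)(_bbb)", string1)

whole = m.group()      # same as group(0): everything the pattern matched
captured = m.group(1)  # the only capturing group in the pattern

# A lookbehind requires "aaa" before the match without including it
lookbehind = re.search(r"(?<=aaa)_bbb", string1).group()

print(whole, captured, lookbehind)
```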
I'm trying to match on an option group in Scala 2.8 (beta 1) with the following code:
import scala.xml._
val StatementPattern = """([\w\.]+)\s*:\s*([+-])?(\d+)""".r
def buildProperty(input: String): Node = input match {
case StatementPattern(name, value) => <propertyWithoutSign />
case StatementPattern(name, sign, value) => <propertyWithSign />
}
val withSign = "property.name: +10"
val withoutSign = "property.name: 10"
buildProperty(withSign) // <propertyWithSign></propertyWithSign>
buildProperty(withoutSign) // <propertyWithSign></propertyWithSign>
But this is not working. What is the correct way to match optional regex groups?
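The pattern always defines three groups, whether or not the optional one takes part in the match, so a two-variable case can never line up with it. Below is an analogous sketch in Python, where a group that did not participate comes back as None. As far as I know, Scala's extractor binds null in that situation, so a single three-variable case with a null check would be the matching shape there. The function build_property and its string return values are stand-ins for the XML nodes.

```python
import re

STATEMENT_RE = re.compile(r"([\w.]+)\s*:\s*([+-])?(\d+)")

def build_property(text):
    """Pick a result based on whether the optional sign group matched."""
    m = STATEMENT_RE.fullmatch(text)
    if m is None:
        raise ValueError("not a statement: %r" % text)
    name, sign, value = m.groups()  # always three entries
    return "propertyWithoutSign" if sign is None else "propertyWithSign"

print(build_property("property.name: +10"))
print(build_property("property.name: 10"))
```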
Hi, there
I'm writing a program that has to get values from a file. In the file, each line represents an entity. Each entity has three values. For example:
Value1 Value2 value3
I have a regular expression to match them:
m/(.*?) (.*?) (.*?)/m;
But it seems that the third value is never matched! The only way to match the third value is to add another value to the file and another pair of capturing parentheses to the expression. But this does not satisfy me.
Thanks in Advance!
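A lazy quantifier matches as little as it can, and nothing after the last (.*?) forces it to grow, so it settles for the empty string. A small sketch in Python (the Perl behavior should be the same) shows the effect and two possible fixes: an end anchor, or \S+ for each field. The variable names are illustrative.

```python
import re

line = "Value1 Value2 value3"

# The last lazy group is satisfied by matching nothing at all
lazy = re.search(r"(.*?) (.*?) (.*?)", line).groups()

# Anchoring the end forces the last group to extend to the end of the line
anchored = re.search(r"^(.*?) (.*?) (.*?)$", line).groups()

# Matching runs of non-space characters avoids the problem altogether
fields = re.search(r"(\S+) (\S+) (\S+)", line).groups()

print(lazy)
print(anchored)
print(fields)
```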
I want to match the last occurrence of a simple pattern in a string, e.g.
list = re.findall(r"\w+ AAAA \w+", "foo bar AAAA foo2 AAAA bar2")
print "last match: ", list[len(list)-1]
However, if the string is very long, a huge list of matches is generated. Is there a more direct way to match the second occurrence of "AAAA", or should I use this workaround?
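One possible alternative is a greedy prefix: .* first consumes the whole string and then backtracks, so the first position at which the rest of the pattern fits is the rightmost one. This is a sketch; the name last_match and the added \b (which keeps the left-hand word whole) are assumptions.

```python
import re

# Greedy .* pushes the capturing group as far right as possible
LAST_RE = re.compile(r".*\b(\w+ AAAA \w+)", re.DOTALL)

def last_match(text):
    """Return the rightmost 'word AAAA word' in text, or None."""
    m = LAST_RE.match(text)
    return m.group(1) if m else None

print(last_match("foo bar AAAA foo2 AAAA bar2"))
```

The result can differ from the last element returned by findall, because findall skips matches that overlap an earlier one: for this sample string it only finds "bar AAAA foo2".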
I'm having trouble checking in PHP whether a value consists only of the following:
letters (upper or lowercase)
numbers (0-9)
underscore (_)
dash (-)
point (.)
no spaces! or other characters
a few examples:
OK: "screen123.css"
OK: "screen-new-file.css"
OK: "screen_new.js"
NOT OK: "screen new file.css"
I guess I need a regex for this, since I need to throw an error when a given string has characters in it other than the ones mentioned above.
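A whitelist character class that has to cover the whole string is probably enough here. The sketch below is in Python rather than PHP; the same class should work with preg_match when wrapped in ^ and $. The name is_valid_name is an assumption.

```python
import re

# Letters, digits, underscore, dot and dash; the dash goes last in the class
ALLOWED_RE = re.compile(r"[A-Za-z0-9_.-]+")

def is_valid_name(value):
    """True if value is non-empty and uses only the allowed characters."""
    return ALLOWED_RE.fullmatch(value) is not None

for name in ("screen123.css", "screen-new-file.css", "screen_new.js", "screen new file.css"):
    print(name, is_valid_name(name))
```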
I'd like to "grab" a few hundred urls from a few hundred html pages.
Pattern:
<h2><a href="http://www.the.url.might.be.long/urls.asp?urlid=1" target="_blank">The Website</a></h2>
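If every page follows exactly the markup shown in the pattern, a capture on the href value may be enough. A hedged sketch in Python follows; the function name extract_urls is an assumption, and an HTML parser would be the safer choice if attribute order or quoting varies between pages.

```python
import re

# The href value of a link that sits directly inside an <h2>
LINK_RE = re.compile(r'<h2><a href="([^"]+)"[^>]*>')

def extract_urls(html):
    """Return all href values of links that directly follow an <h2> tag."""
    return LINK_RE.findall(html)

page = '<h2><a href="http://www.the.url.might.be.long/urls.asp?urlid=1" target="_blank">The Website</a></h2>'
print(extract_urls(page))
```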
This program is intended to read a .txt file containing a set of numbers and write to two other .txt files, called even and odd, as follows:
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    int number;

    // check to make sure that all the file names are entered
    if (argc != 4) {
        printf("Usage: executable in_file even_file odd_file\n");
        exit(1);
    }

    FILE *dog = fopen(argv[1], "r");
    // check whether the input file has been opened successfully
    if (dog == NULL) {
        printf("File %s cannot open!\n", argv[1]);
        exit(1);
    }

    FILE *feven = fopen(argv[2], "w");
    FILE *fodd = fopen(argv[3], "w");
    // check whether both output files have been opened successfully
    if (feven == NULL || fodd == NULL) {
        printf("Output files cannot open!\n");
        exit(1);
    }

    // read one number at a time and write it to the matching file
    while (fscanf(dog, "%d", &number) == 1) {
        if (number % 2 == 0)
            fprintf(feven, "%d\n", number);
        else
            fprintf(fodd, "%d\n", number);
    }

    fclose(dog);
    fclose(feven);
    fclose(fodd);
    return 0;
}
I'm currently modifying my regex for this:
http://stackoverflow.com/questions/2782031/extracting-email-addresses-in-an-html-block-in-ruby-rails
Basically, I'm making another obfuscator that uses ROT13 by parsing a block of text for all links that contain a mailto reference (using hpricot). One use case this doesn't catch is when the user just typed in an email address (without turning it into a link via tinymce).
So here's the basic flow of my method:
1. parse a block of text for all tags with href="mailto:..."
2. replace each tag with a javascript function that changes this into ROT13 (using this script: http://unixmonkey.net/?p=20)
3. once all links are obfuscated, pass the resulting block of text into another function that parses for all emails (this one has an email regex that reverses the email address and then adds a span to that email, to reverse it back)
Step 3 is supposed to clean the block of text of the remaining emails that AREN'T in href attributes (meaning they weren't parsed by hpricot). The problem is that the emails that were converted to ROT13 are still found by my regex. What I want to catch are just the emails that WEREN'T CONVERTED to ROT13.
How do I do this? Well, all emails that WERE CONVERTED have a trailing "'.replace" after them, meaning I need to get all emails WITHOUT that string. So far I have this regex:
/\b([A-Z0-9._%+-]+@[A-Z0-9.-]+.[A-Z]{2,4}('.replace))\b/i
But this gets all the emails with the trailing '.replace. I want the opposite, and I'm currently stumped. Any help from the regex gurus out there?
MORE INFO:
Here's the regex + the block of text I'm parsing:
http://www.rubular.com/r/NqXIHrNqjI
As you can see, the first two 'email addresses' are already obfuscated using ROT13. I need a regex that gets the emails [email protected] and [email protected]
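The "opposite" can be expressed with a negative lookahead placed after the address. Here is a sketch in Python; Ruby's regex engine supports the same (?!...) syntax. The sample text and addresses below are invented for illustration and are not the ones from the linked page. The lookahead also skips over any remaining domain characters, so that a shortened match such as "a@b.co" taken out of "a@b.co.uk'.replace" is rejected as well.

```python
import re

# An address counts only if '.replace does not follow it, even after
# further domain characters
PLAIN_EMAIL_RE = re.compile(
    r"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b(?![A-Z0-9.-]*'\.replace)",
    re.IGNORECASE,
)

def plain_emails(text):
    """Return the addresses that are not followed by '.replace."""
    return PLAIN_EMAIL_RE.findall(text)

# Hypothetical input: one ROT13-obfuscated address and two plain ones
sample = (
    "var a = 'wbua@rknzcyr.pbz'.replace(/[a-z]/g, rot); "
    "write to mary@example.org or bob@example.com"
)
print(plain_emails(sample))
```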
Hello.
I have a text file with several thousand lines. I want to parse this file into a database and decided to write a regexp. Here's part of the file:
blablabla checked=12 unchecked=1
blablabla unchecked=13
blablabla checked=14
As a result, I would like to get something like
(12,1)
(0,13)
(14,0)
Is it possible?
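It should be possible. One sketch, in Python, searches for each key separately and falls back to 0 when it is absent. The word boundary \b matters because "unchecked" contains "checked". The function name parse_line is an assumption.

```python
import re

CHECKED_RE = re.compile(r"\bchecked=(\d+)")
UNCHECKED_RE = re.compile(r"\bunchecked=(\d+)")

def parse_line(line):
    """Return (checked, unchecked) for one line, using 0 for a missing key."""
    checked = CHECKED_RE.search(line)
    unchecked = UNCHECKED_RE.search(line)
    return (
        int(checked.group(1)) if checked else 0,
        int(unchecked.group(1)) if unchecked else 0,
    )

lines = [
    "blablabla checked=12 unchecked=1",
    "blablabla unchecked=13",
    "blablabla checked=14",
]
results = [parse_line(line) for line in lines]
print(results)
```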
On Twitter, when you write @moustafa
it changes to <a href='user/moustafa'>@moustafa</a>.
Now I want to do the same thing:
when I write @moustafa followed by a space, only @moustafa should be changed.
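A rough sketch of such a replacement in Python: \w+ stops at the first non-word character, so a trailing space is never part of the match. The lookbehind is an added assumption that keeps email-like text such as a@b.com untouched, and the name link_mentions is illustrative.

```python
import re

# '@' not preceded by a word character, followed by the user name
MENTION_RE = re.compile(r"(?<!\w)@(\w+)")

def link_mentions(text):
    """Wrap every @name in a link to user/name."""
    return MENTION_RE.sub(r"<a href='user/\1'>@\1</a>", text)

print(link_mentions("hi @moustafa there"))
```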
I am a real beginner in csh/tcsh scripting and that's why I need your help. The problem is that I have to go through some regular files in directories and find those files that contain their own name in their content. The following piece of script contains the loop in which I go through the paths and use grep to find the file's name in its content.
What is surely correct is $something:q, the array of paths in which I have to find the files.
The next variable is name, which holds only the name of the current file.
For example: /home/computer/text.txt (paths)
and: text.txt (name)
My biggest problem is finding the names of the files in their content. It's quite difficult for me to write a correct grep for this, because the names of the files and directories that I am going through are mad. Here are some of them:
/home/OS/pocitacove/testovaci_adresar/z/test4.pre_expertov/!_1
/home/OS/pocitacove/testovaci_adresar/z/test4.pre_expertov/dam/$user/:e/'/-r
/home/OS/pocitacove/testovaci_adresar/z/test3/skusime/ taketo/ taketo
/home/OS/pocitacove/testovaci_adresar/z/test4.pre_expertov/.-bla/.-bla/.a=b
/home/OS/pocitacove/testovaci_adresar/z/test4.pre_expertov/.-bla/.-bla/@
/home/OS/pocitacove/testovaci_adresar/z/test4.pre_expertov/.-bla/.-bla/:
/home/OS/pocitacove/testovaci_adresar/z/test4.pre_expertov/.-bla/.-bla/'ano'
foreach paths ($something:q)
set name = "$paths:t"
@ number = (`grep -Ec "$name" "$paths"`)
if ($number != 0) then
echo -n "$paths "
echo $number
endif
@ number = 0
end
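With -E, grep reads the name as a regular expression, so characters such as ., $ and : in these names stop being literal. grep has a -F option for fixed-string matching, and -- stops a name such as -r from being read as an option. The same idea is sketched below in Python as a plain substring test, which needs no escaping at all. The function name count_name_lines is an assumption.

```python
import os

def count_name_lines(path):
    """Count the lines of a file that contain the file's own name literally."""
    name = os.fsencode(os.path.basename(path))
    count = 0
    # Read bytes so that unusual file names and contents need no decoding
    with open(path, "rb") as handle:
        for line in handle:
            if name in line:
                count += 1
    return count
```

Called on each path from the list, this should give the same count as a fixed-string grep -c on that file.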