I am trying to parse this string:
$right = '34601)S(1,6)[2] - 34601)(11)[2] + 34601)(3)[2,4]';
with the following regexp:
const word = '(\d{3}\d{2}\)S{0,1}\([^\)]*\)S{0,1}\[[^\]]*\])';
preg_match('/'.word.'{1}(?:\s{1}([+-]{1})\s{1}'.word.'){0,}/', $right, $matches);
print_r($matches);
I want to get an array like this:
Array
(
[0] => 34601)S(1,6)[2] - 34601)(11)[2] + 34601)(3)[2,4]
[1] => 34601)S(1,6)[2]
[2] => -
[3] => 34601)(11)[2]
[4] => +
[5] => 34601)(3)[2,4]
)
but I only get the following:
Array
(
[0] => 34601)S(1,6)[2] - 34601)(11)[2] + 34601)(3)[2,4]
[1] => 34601)S(1,6)[2]
[2] => +
[3] => 34601)(3)[2,4]
)
I think it's because of [^)]* or [^]]* in the word,
but how should I correct the regexp to match this another way?
I tried to make it more specific:
\d+(?:[,#]\d+){0,}
so word becomes
const word = '(\d{3}\d{2}\)S{0,1}\(\d+(?:[,#]\d+){0,}\)S{0,1}\[\d+(?:[,#]\d+){0,}\])';
but that matches nothing.
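One likely cause, for what it's worth: a capturing group inside a repeated group keeps only its last iteration, so the middle operator and term are overwritten. Below is a minimal Python sketch of an alternative that collects terms and operators one at a time. The names WORD, TOKEN and tokenize are made up for illustration, and the same idea should carry over to preg_match_all in PHP.

```python
import re

# One term: five digits, ")", optional "S", "(...)", optional "S", "[...]"
WORD = r"\d{5}\)S?\([^)]*\)S?\[[^\]]*\]"
# A token is either a whole term or a single + / - operator
TOKEN = re.compile("(" + WORD + ")|([+-])")


def tokenize(expr):
    """Return the terms and operators of expr in order of appearance."""
    return [m.group(0) for m in TOKEN.finditer(expr)]


right = "34601)S(1,6)[2] - 34601)(11)[2] + 34601)(3)[2,4]"
print(tokenize(right))
```

This does not check that the whole string is well formed; that check could be kept as a separate match with the original pattern.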
How can I parse text and find all instances of hyperlinks within a string? The hyperlink will not be in the HTML format of <a href="http://test.com">test</a> but just http://test.com
Secondly, I would like to then convert the original string and replace all instances of hyperlinks into clickable html hyperlinks.
I found an example in this thread:
Easiest way to convert a URL to a hyperlink in a C# string?
but was unable to reproduce it in Python :(
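A rough Python sketch of the idea, assuming that "hyperlink" means a bare http:// or https:// URL ending at the next whitespace, quote or angle bracket. The names URL_RE and linkify are hypothetical, and the pattern is deliberately simple rather than a full URL grammar.

```python
import re

# Assumption: a URL runs from http(s):// up to the next whitespace, quote or angle bracket
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def find_links(text):
    """Return every bare URL found in text."""
    return URL_RE.findall(text)


def linkify(text):
    """Wrap every bare URL in text in an <a href="..."> element."""
    return URL_RE.sub(lambda m: '<a href="{0}">{0}</a>'.format(m.group(0)), text)


print(linkify("see http://test.com for details"))
```

Text that already contains <a href="..."> markup would need extra care, since the URL inside the attribute would be wrapped again.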
I have the following asp.net custom validator:
<asp:CustomValidator runat="server"
ClientValidationFunction="valUCRRequired" ID="valUCRRequired"
ErrorMessage="Field 7-Date/Time Between is Required"
ControlToValidate="DTE_FROM" />
Notice that the ID and ClientValidationFunction have the same value. I want to do a regular expression search where they are the same. Right now, I am just searching for all CustomValidators.
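A backreference is one way to express "the same value twice". The Python sketch below assumes ClientValidationFunction appears before ID inside the tag, as in the markup above; SAME_ID_RE and find_matching_validators are illustrative names, and most regex-capable search tools accept the same \1 syntax.

```python
import re

# \1 must repeat exactly what group 1 captured, so both attributes share a value.
# [^>] also matches newlines, so the attributes may be spread over several lines.
SAME_ID_RE = re.compile(
    r'<asp:CustomValidator\b[^>]*?\bClientValidationFunction="([^"]+)"'
    r'[^>]*?\bID="\1"'
)


def find_matching_validators(markup):
    """Return the shared ID/function name of each matching validator."""
    return SAME_ID_RE.findall(markup)


sample = '''<asp:CustomValidator runat="server"
ClientValidationFunction="valUCRRequired" ID="valUCRRequired"
ControlToValidate="DTE_FROM" />'''
print(find_matching_validators(sample))
```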
I want a validator for a password text input. The password must contain:
At least one upper case letter
At least one numeric character
At least one special character such as @, #, $, etc.
How can I do this in ActionScript or MXML? Please help me.
Thanks.
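The three rules can be written as three lookaheads in a single pattern. The sketch below is in Python only to show the pattern; it assumes that "special character" means anything other than a letter, digit or whitespace, and that the target regex engine supports lookahead. PASSWORD_RE and is_valid_password are made-up names.

```python
import re

# Each (?=...) checks one rule without consuming any input
PASSWORD_RE = re.compile(
    r"(?=.*[A-Z])"            # at least one upper case letter
    r"(?=.*\d)"               # at least one digit
    r"(?=.*[^A-Za-z0-9\s])"   # at least one special character (assumed definition)
    r".+"
)


def is_valid_password(password):
    """Return True if password satisfies all three rules."""
    return PASSWORD_RE.fullmatch(password) is not None


print(is_valid_password("Passw0rd!"))
```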
I need to identify comments in different kinds of source files in a given directory (for example Java, XML, JavaScript, bash). I have decided to do this using Python (as an attempt to learn Python). The questions I have are
1) What should I know about Python to get this done? (I have an idea that regular expressions will be useful, but are there alternatives or other modules that will be useful? Libraries that I can use to get this done?)
2) Is Python a good choice for such a task? Will some other language make this easier to accomplish?
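As a starting point, here is a deliberately naive sketch that maps a file extension to a comment pattern. COMMENT_PATTERNS and find_comments are hypothetical names, and the approach knowingly ignores comment markers that appear inside string literals, which a real tokenizer would have to handle.

```python
import re

# // line comments and /* block */ comments, as used by Java and JavaScript
C_STYLE = re.compile(r"//[^\n]*|/\*.*?\*/", re.S)

# Assumption: the file extension is enough to pick the comment syntax
COMMENT_PATTERNS = {
    ".java": C_STYLE,
    ".js": C_STYLE,
    ".xml": re.compile(r"<!--.*?-->", re.S),
    ".sh": re.compile(r"#[^\n]*"),
}


def find_comments(source, extension):
    """Return all comments found in source for the given file extension."""
    pattern = COMMENT_PATTERNS.get(extension)
    return pattern.findall(source) if pattern else []


print(find_comments("int x; // count\n/* block\n comment */", ".java"))
```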
Is there a way to use ack to search one file for a list of patterns stored in another file, like the -f option in grep? I see there is an -f option in ack, but it behaves differently from the -f in grep.
Perhaps an example will give you a better idea. Suppose I have file1:
file1:
a
c
e
And file2:
file2:
a 1
b 2
c 3
d 4
e 5
And I want to obtain all the patterns in file1 from file2 to give:
a 1
c 3
e 5
Can ack do this? Otherwise, is there a better way to handle the job (such as awk or a hash)? I have millions of records in both files and really need an efficient way to do this. Thanks!
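The hash idea can be sketched in Python with a set. It assumes each pattern in file1 is meant to equal the first whitespace-separated field of a line in file2, which is narrower than grep -f, since grep matches a pattern anywhere in the line. filter_by_keys is a made-up name.

```python
def filter_by_keys(key_lines, data_lines):
    """Keep the data lines whose first field appears among the keys."""
    # Set membership is a constant-time lookup on average, so the work
    # grows with the total number of lines, not with keys times lines
    keys = {k.strip() for k in key_lines if k.strip()}
    matched = []
    for line in data_lines:
        fields = line.split()
        if fields and fields[0] in keys:
            matched.append(line.rstrip("\n"))
    return matched


file1 = ["a\n", "c\n", "e\n"]
file2 = ["a 1\n", "b 2\n", "c 3\n", "d 4\n", "e 5\n"]
print(filter_by_keys(file1, file2))
```

With real files, open file handles can be passed in directly, so file2 is streamed rather than loaded into memory.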
I have a string. That string is HTML code that serves as a teaser for the blog posts I am creating. The whole HTML code (teaser) is stored in a field in the database.
My goal: when a user likes a certain blog post (with the Facebook Like social button), I'd like the right data to be displayed in his news feed. To do that I need to extract the image path inside src="i-m-a-g-e--p-a-t-h" from the first image in the teaser. This works when a user puts only one image in the teaser, but if he accidentally puts in two or more images the whole thing crashes.
Furthermore, for the description field I need to extract the text inside the first <p> tag. The problem is that a user can also put an image inside the first tag.
I would very much appreciate it if an expert could help me resolve this; it's been bugging me for days.
Text string with a regular expression for extracting src can be found here: http://rubular.com/r/gajzivoBSf
Thanks!
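An HTML parser tends to be more robust here than a single regular expression. The sketch below uses Python's standard html.parser module, so it swaps the regex approach for a parser. TeaserParser and extract_teaser are illustrative names, and it assumes the wanted description is the first paragraph that actually contains text.

```python
from html.parser import HTMLParser


class TeaserParser(HTMLParser):
    """Collect the first <img> src and the first non-empty <p> text."""

    def __init__(self):
        super().__init__()
        self.first_src = None
        self.first_p_text = None
        self._in_p = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag == "img" and self.first_src is None:
            self.first_src = dict(attrs).get("src")
        elif tag == "p" and self.first_p_text is None:
            self._in_p = True

    def handle_data(self, data):
        if self._in_p:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            text = "".join(self._buf).strip()
            self._buf = []
            self._in_p = False
            if text:  # skip paragraphs that hold only an image
                self.first_p_text = text


def extract_teaser(html):
    """Return (first image src, first paragraph text) from html."""
    parser = TeaserParser()
    parser.feed(html)
    parser.close()
    return parser.first_src, parser.first_p_text


print(extract_teaser('<p><img src="a.png"></p><p>Hello <b>world</b></p><img src="b.png">'))
```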
Scenario:
I have a text file that has pipe (as in the "|" character) delimited data.
Each field of data in the pipe delimited fields can be of variable length, so counting characters won't work (or using some sort of substring function... if that even exists in VIM).
Is it possible, using VIM / Vi to delete all data from the second pipe to the end of the line for the entire file? There are approx 150,000 lines, so doing this manually would only be appealing to a masochist...
e.g.
Change the following lines from:
1111|random sized text 12345|more random data la la la|1111|abcde
2222|random sized text abcdefghijk|la la la la|2222|defgh
3333|random sized text|more random data|33333|ijklmnop
to:
1111|random sized text 12345
2222|random sized text abcdefghijk
3333|random sized text
I'm sure this can be done somehow... I hope.
TIA
UPDATE: I should have mentioned that I'm running this on Windows XP, so I don't have access to some of the mentioned *nix commands (CUT is not recognized on Windows).
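The substitution itself is small: capture everything up to, but not including, the second pipe, and replace the whole line with that capture. The Python sketch below only illustrates the pattern, and keep_first_two_fields is a made-up name. A Vim :%s command built on the same capture group should work, but note that Vim's default regex syntax expects the group parentheses to be backslash-escaped.

```python
import re

# Group 1: first field, a pipe, second field. Everything after is dropped.
SECOND_PIPE_RE = re.compile(r"^([^|]*\|[^|]*)\|.*$")


def keep_first_two_fields(line):
    """Delete everything from the second pipe to the end of the line."""
    return SECOND_PIPE_RE.sub(r"\1", line)


print(keep_first_two_fields("1111|random sized text 12345|more random data la la la|1111|abcde"))
```

Lines with fewer than two pipes are left unchanged.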
I have a string in my code that I receive that contains some html tags. It is not part of the HTML page being displayed so I cannot grab the html tag contents using the DOM (i.e. document.getElementById('tag id').firstChild.data);
So, for example within the string of text would appear a tag like this:
<span id="myQty">12</span>
My question is how would I use a regular expression to access the '12' numeric digit in this example? This quantity could be any number of digits (i.e. it is not always a double digit).
I have tried some regular expressions, but always end up getting the full span tag returned along with the contents. I only want the '12' in the example above, not the surrounding tag. The id of the tags will always be 'myQty' in the string of text I receive.
Thanks in advance for any help!
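The usual trick is a capturing group: match the whole tag, but read only the group that holds the digits. The sketch below is in Python for illustration and assumes the tag is a span whose id attribute is myQty; QTY_RE and extract_qty are hypothetical names. In JavaScript the same pattern can be used, reading index 1 of the match result instead of index 0.

```python
import re

# Group 1 holds only the digits between the opening and closing tag
QTY_RE = re.compile(r'<span[^>]*\bid=["\']myQty["\'][^>]*>\s*(\d+)\s*</span>')


def extract_qty(text):
    """Return the quantity as an int, or None if the tag is not found."""
    match = QTY_RE.search(text)
    return int(match.group(1)) if match else None


print(extract_qty('Total: <span id="myQty">12</span> items'))
```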
Edit: OK, I can't read, thanks to Col. Shrapnel for the help. If anyone comes here looking for the same thing to be answered...
print_r(preg_split('/([\!|\?|\.|\!\?])/', $string, null, PREG_SPLIT_DELIM_CAPTURE));
Is there any way to split a string on a set of delimiters, and retain the position and character(s) of the delimiter after the split?
For example, using delimiters of ! ? . !? turning this:
$string = 'Hello. A question? How strange! Maybe even surreal!? Who knows.';
into this
array('Hello', '.', 'A question', '?', 'How strange', '!', 'Maybe even surreal', '!?', 'Who knows', '.');
Currently I'm trying to use print_r(preg_split('/([\!|\?|\.|\!\?])/', $string)); to capture the delimiters as a subpattern, but I'm not having much luck.
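One detail worth noting: inside [...] the | is a literal character and !? is read as two separate characters, so a character class cannot treat "!?" as one delimiter. An alternation that tries the two-character delimiter first avoids that. The sketch below shows the idea in Python, where re.split keeps captured delimiters in the result; split_keep_delims is a made-up name.

```python
import re

# Try the two-character delimiter "!?" before the single characters
DELIM_RE = re.compile(r"(!\?|[!?.])")


def split_keep_delims(text):
    """Split text on . ! ? and !?, keeping each delimiter as its own item."""
    parts = DELIM_RE.split(text)
    # Trim surrounding spaces and drop the empty pieces split() leaves behind
    return [p.strip() for p in parts if p.strip()]


print(split_keep_delims("Hello. A question? How strange! Maybe even surreal!? Who knows."))
```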
Hello,
please help me. I have HTML like this:
<html>
<body>
http://domainname.com/abc/xyz.zip
http://domainname2.com/abc/xyz.zip
</body>
</html>
I want to replace each URL with a link, so that the output looks like this:
<html>
<body>
<a href="http://domainname.com/abc/xyz.zip">http://domainname.com/abc/xyz.zip</a>
<a href="http://domainname2.com/abc/xyz.zip">http://domainname2.com/abc/xyz.zip</a>
</body>
</html>
Many thanks
In my web app I've got a form field where the user can enter a URL. I'm already doing some preliminary client-side validation and I was wondering if I could use a regexp to validate if the entered string is a valid URL. So, two questions:
Is it safe to do this with a regexp? A URL is a complex beast, and just like you shouldn't use a regexp for parsing HTML, I'm worried that it might be unsuitable for a URL as well.
If it can be done, what would be a good regexp for the task? (I know that Google turns up countless regexps, but I'm worried about their quality).
My goal is to prevent a situation where the URL appears in the web page and is unusable by the browser.
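One alternative to a hand-written regexp is to let a URL parser split the string and then check the parts. The sketch below shows that idea with Python's urllib.parse, so it is a server-side illustration rather than the client-side check asked about. It assumes only http and https links are acceptable, and looks_like_http_url is a hypothetical name.

```python
from urllib.parse import urlparse


def looks_like_http_url(value):
    """Return True if value parses as an http(s) URL with a host part."""
    try:
        parts = urlparse(value.strip())
    except ValueError:
        # urlparse can raise on malformed input such as a broken IPv6 host
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


print(looks_like_http_url("http://test.com/a?b=1"))
```

This is a plausibility check only; it says nothing about whether the URL actually resolves.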
What would be the regular expression in JavaScript to match a name field
which allows only letters, apostrophes and hyphens,
so that jhon's avat-ar or Josh is valid?
Thanks
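A possible pattern, sketched in Python. Since the example "jhon's avat-ar" contains a space, the sketch assumes single spaces are allowed as well, and that every apostrophe, hyphen or space must sit between letters. NAME_RE and is_valid_name are made-up names. The pattern itself uses no Python-specific syntax, so it should also work in JavaScript once wrapped in ^ and $.

```python
import re

# Runs of ASCII letters joined by a single space, apostrophe or hyphen
NAME_RE = re.compile(r"[A-Za-z]+(?:[ '-][A-Za-z]+)*")


def is_valid_name(name):
    """Return True if name contains only the allowed characters."""
    return NAME_RE.fullmatch(name) is not None


print(is_valid_name("jhon's avat-ar"))
```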
Hi,
I'm using a regular expression to count the spaces at the start of a line (the first run of whitespace).
match(/^\s*/)[0].length;
However, this reads the line from start to end. How can I read it from end to start?
Thanks
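If "from end to start" means counting the whitespace at the end of the line, anchoring the pattern with $ instead of ^ does that. The sketch below is in Python, with count_trailing_whitespace as a made-up name. The pattern \s*$ is written the same way in JavaScript.

```python
import re


def count_trailing_whitespace(line):
    """Return the number of whitespace characters at the end of line."""
    # \s*$ always matches, possibly as an empty string at the very end
    return len(re.search(r"\s*$", line).group(0))


print(count_trailing_whitespace("  foo   "))
```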
If I have a list of regular expressions, is there an easy way to determine that no two of them will both return a match for the same string?
That is, the list is valid if and only if for all strings a maximum of one item in the list will match the entire string.
It seems like this will be very hard (maybe impossible?) to prove definitively, but I can't seem to find any work on the subject.
The reason I ask is that I am working on a tokenizer that accepts regexes, and I would like to ensure only one token at a time can match the head of the input.
I'm trying to put together a regular expression for a JavaScript command that accurately counts the number of words in a textarea.
One solution I had found is as follows:
document.querySelector("#wordcount").innerHTML = document.querySelector("#editor").value.split(/\b\w+\b/).length -1;
But this doesn't count any non-Latin characters (eg: Cyrillic, Hangul, etc); it skips over them completely.
Another one I put together:
document.querySelector("#wordcount").innerHTML = document.querySelector("#editor").value.split(/\s+/g).length -1;
But this doesn't count accurately unless the document ends in a space character. If a space character is appended to the value being counted it counts 1 word even with an empty document. Furthermore, if the document begins with a space character an extraneous word is counted.
Is there a regular expression I can put into this command that counts the words accurately, regardless of input method?
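Counting matches of "a run of non-whitespace" avoids both problems, because there is nothing to split and so no empty pieces at either end. The sketch below is in Python, assuming a word is any maximal run of non-whitespace characters in any script; count_words is a made-up name. In JavaScript the same /\S+/g pattern can be used with match(), keeping in mind that match() returns null when nothing matches.

```python
import re


def count_words(text):
    """Count maximal runs of non-whitespace characters."""
    return len(re.findall(r"\S+", text))


print(count_words("  привет мир "))
```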
I'm getting an output data file of a program which looks like this, with more than one line for each time step:
0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00
7.9819E-06 1.7724E-02 2.3383E-02 3.0048E-02 3.8603E-02 4.9581E-02 5.6635E-02 4.9991E-02 3.9052E-02 3.0399E-02
....
I want to arrange it in ten columns.
I have written a Python script that uses regular expressions to delete \n at the end of the proper lines, but I think there should be a simpler, more elegant way to do it. Here is my script:
import re
with open('inputfile', encoding='utf-8') as file1:
    datai = file1.read()
# Remove the newline that follows a run of eight values
dataf = re.sub(r'(?P<nomb>( \d\.\d\d\d\dE.\d\d){8})\n', r'\g<nomb>', datai)
with open('result.txt', mode='w', encoding='utf-8') as resultfile:
    resultfile.write(dataf)
Thanks in advance
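A simpler route may be to ignore the original line breaks entirely: split the file on whitespace and regroup the values ten at a time. The sketch assumes the file contains nothing but numbers and that each time step has exactly ten values; reshape is a made-up name.

```python
def reshape(text, columns=10):
    """Re-flow whitespace-separated values into rows of `columns` values."""
    values = text.split()
    rows = [values[i:i + columns] for i in range(0, len(values), columns)]
    return "\n".join(" ".join(row) for row in rows)


print(reshape("1 2 3\n4 5\n6", columns=3))
```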
Currently it takes about 3 minutes to run through a single 53 page word document. Hopefully you all have some advice about speeding up the process.
Code:
import win32com.client as win32
from glob import glob
import io
import re
from collections import namedtuple
from collections import defaultdict
import pprint
raw_files = glob('*.docx')
word = win32.gencache.EnsureDispatch('Word.Application')
word.Visible = False
oFile = io.open("rawsort.txt", "w+", encoding="utf-8")  # text dump
doccat = list()
for f in raw_files:
    word.Documents.Open(f)
    doc = word.ActiveDocument  # whichever document is active at the time
    doc.ConvertNumbersToText()
    print doc.Paragraphs.Count
    for x in xrange(1, doc.Paragraphs.Count + 1):  # loop through the paragraphs
        oText = doc.Paragraphs(x)
        if not oText.Range.Tables.Count > 0:
            results = re.match(r'(?P<number>(([1-3]*[A-D]*[0-9]*)(.[1-3]*[0-9])+))', oText.Range.Text)
            stylematch = re.match(r'Heading \d', oText.Style.NameLocal)
            if results != None and oText.Style != None and stylematch != None:
                doccat.append((oText.Style.NameLocal, oText.Range.Text[:len(results.group('number'))], oText.Range.Text[len(results.group('number')):]))
                style = oText.Style.NameLocal
            else:
                if oText.Range.Font.Bold == True:
                    # append() takes a single argument, so pass a tuple
                    doccat.append((style, oText))
oFile.write(unicode(doccat))
oFile.close()
The loop over the paragraphs obviously takes most of the time. Is there some way of identifying and appending the matching paragraphs without going through every paragraph?
What is the best way to extract the last 2 characters of a string using a regular expression?
For example, I want to extract the state code from the following:
"A_IL"
I want to extract IL as a string.
Please provide C# code showing how to get it.
string fullexpression = "A_IL";
string StateCode = some regular expression code....
thanks
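The pattern only needs an end-of-string anchor. The sketch below shows it in Python rather than C#, assuming the state code is always two upper case letters; state_code is a made-up name. The pattern should work unchanged with Regex.Match in .NET.

```python
import re


def state_code(value):
    """Return the two upper case letters at the end of value, or None."""
    match = re.search(r"[A-Z]{2}$", value)
    return match.group(0) if match else None


print(state_code("A_IL"))
```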
What would be a proper regular expression for git repository URLs?
Example link:
[email protected]:someone/someproject.git
So the rules would be:
the server can be a URL or an IP address
the project can contain characters other than alphanumerics, such as '-'
I'm not sure what the role of '/' is
Any suggestions?
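A starting point, sketched in Python, for the scp-like form user@host:path.git. The pattern and the names GIT_URL_RE and parse_git_url are assumptions for illustration, the sample address is a made-up placeholder, and other git URL forms (ssh://, https://) are not covered.

```python
import re

# user@host:path.git, where host may be a domain name or an IP address
# and the path may contain letters, digits, '_', '.', '-' and '/'
GIT_URL_RE = re.compile(
    r"^(?P<user>[\w.-]+)@(?P<host>[\w.-]+):(?P<path>[\w./-]+?)\.git$"
)


def parse_git_url(url):
    """Return a dict with user, host and path, or None if url does not match."""
    match = GIT_URL_RE.match(url)
    return match.groupdict() if match else None


# Placeholder address, not taken from the question
print(parse_git_url("git@example.com:someone/some-project.git"))
```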
I have a list of phone numbers in different formats. I need to grab only the numbers that start with one of the prefixes below, using PHP:
020 8
07974
+44 (0) 20
+44 0
440203
Any help will be appreciated.
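One way is to escape each prefix and join them into a single alternation anchored at the start of the number. The sketch below shows this in Python, assuming the prefixes must match literally, including spaces and brackets; PREFIXES, PREFIX_RE and starts_with_known_prefix are made-up names. In PHP, preg_quote plays the role of re.escape.

```python
import re

PREFIXES = ["020 8", "07974", "+44 (0) 20", "+44 0", "440203"]

# re.escape makes "+", "(" and ")" literal; match() anchors at the start
PREFIX_RE = re.compile("|".join(re.escape(p) for p in PREFIXES))


def starts_with_known_prefix(number):
    """Return True if number begins with one of the listed prefixes."""
    return PREFIX_RE.match(number.strip()) is not None


print(starts_with_known_prefix("+44 (0) 20 7946 0000"))
```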
For example, here is a string representing an expression:
var str = 'total = sum(price * qty) * 1.09875';
I want to extract the variables (i.e., 'total', 'price' and 'qty', but not 'sum', since 'sum' is a function name) from this expression. What is the regexp pattern in JavaScript? A variable name consists of letters, digits, or underscores, and begins with a letter or an underscore.
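A negative lookahead can exclude identifiers that are followed by an opening parenthesis. The sketch below is in Python, with IDENT_RE and extract_variables as made-up names, and it assumes a function name is always followed (after optional spaces) by "(". The pattern uses no Python-specific syntax, so it should carry over to a JavaScript regex with the g flag.

```python
import re

# An identifier that is not immediately followed by "(" (i.e. not a call)
IDENT_RE = re.compile(r"\b[A-Za-z_]\w*\b(?!\s*\()")


def extract_variables(expression):
    """Return the variable names in expression, in order of appearance."""
    return IDENT_RE.findall(expression)


print(extract_variables("total = sum(price * qty) * 1.09875"))
```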
I have some text files with some useful data wrapped in HTML tags like <td>, <span>, etc. I want to write a program which extracts the data between the tags.
The text files contain other junk data too. I would also like to store the extracted data in an SQL table. Can anyone guide me in the right direction?
I want to validate a login name against the special characters !@#S%^*()+_-?/<:"';. and space, using a regular expression in Ruby on Rails. These special characters should not be accepted. What is the code for that?
Thanks,
Pallavi