Search Results

Search found 3804 results on 153 pages for 'regex'.

Page 101/153 | < Previous Page | 97 98 99 100 101 102 103 104 105 106 107 108  | Next Page >

  • sed - trying to replace first occurrence after a match

    - by wakkaluba
    I am facing a situation that drives me nuts. I am setting up an update server which uses a json file. Don't ask why or how, it sucks and is my only possibility to achieve it. I have been trying and researching for HOURS (many) because I went ballistic and wanted to crack this on my own. But I have to realize I got stuck and need help. So sorry for this chunk but I think it is somewhat important to see... The file is a one liner and repeating the following sequence with changing values (of course). "plugin_name_foo_bar": {"buildDate": "bla", "dependencies": [{"name": "bla", "optional": true, "version": "1.00"}], "developers": [{"developerId": "bla", "email": "[email protected]", "name": "Bla bla2nd"}], "excerpt": "some text {excerpt} !bla.png|thumbnail,border=1! ", "gav": "bla", "labels": ["report", "scm-related"], "name": "plugin_name_foo_bar", "previousTimestamp": "bla", "previousVersion": "1.0", "releaseTimestamp": "bla", "requiredCore": "1", "scm": "github.com", "sha1": "ynnBM2jWo25ZLDdP3ybBOnV/Pio=", "title": "bla", "url": "http://bla.org", "version": "1.0", "wiki": "https://bla.org"}, "Exclusion": {"buildDate": "bla", "dependencies": [], and the next plugin block is glued straight afterwards. What I now want to do is to search for "plugin_foo_bar": {" as this is the unique identifier for a new plugin description block. I want to replace the first sha1 value occuring afterwards. That's where I keep failing. I always grab the first,last or any occurrence in the entire file and not the block :( "title" is the unique identifier after the sha1 value. So I tried to make the .* less greedy but it ain't working out. last attempt was heading towards: sed -i 's/("name": "plugin_name_foo_bar.*sha1": ")([a-zA-Z0-9!@#\$%^&*()\[\]]*)(", "title"\)/\1blablabla\2/1' default.json to find the sha1 value of that plugin but still no joy. I hope someone knows - preferably a simpler approach - before I now continue with trial and error until I have to puke and freakout. 
I am working with SED on Windows, so Unix approach might help me to figure out how to achieve this in batch but please make it as one-liner if possible. Scripts are a real pain to convert. And I just need SED and no other solution with other tools like AWK. That is absolutely out of discussion. Any help is appreciated :) Cheers Jan


  • Please help on multiple match replacement

    - by duenguyen
    I have this Perl code: my $s = "The+quick+brown+fox+jumps+over+the+lazy+dog+that+is+my+dog"; What I want is to replace every + with a space and every dog with cat. I have this regular expression: $s =~ s/+(.*)dog/ ${1}cat/g; But it only matches the first occurrence of + and the last dog. Please help.


  • Java: calculate linenumber from charwise position according to the number of "\n"

    - by HH
    I know charwise positions of matches like 1 3 7 8. I need to know their corresponding line numbers. Example: file.txt Match: X Matches: 1 3 7 8. Want: 1 2 4 4 $ cat file.txt X2 X 4 56XX [Added: does not notice many linewise matches, there is probably an easier way to do it with stacks] $ java testt 1 2 4 $ cat testt.java import java.io.*; import java.util.*; public class testt { public static String data ="X2\nX\n4\n56XX"; public static String[] ar = data.split("\n"); public static void main(String[] args){ HashSet<Integer> hs = new HashSet<Integer>(); Integer numb = 1; for(String s : ar){ if(s.contains("X")){ hs.add(numb); numb++; }else{ numb++; } } for (Integer i : hs){ System.out.println(i); } } }


  • Python program to search for specific strings in hash values (coding help)

    - by Diego
    Trying to write code that searches hash values for specific strings (input by the user) and returns the hash if the search query is present in that line. Doing this mostly to learn Python a bit more, but it could be a real-world application used by an HR department to search a .csv resume database for specific words in each resume. I'd like this program to look through a .csv file that has three entries per line (id#;applicant name;resume text). I set it up so that it creates a hash, then created a string for the resume text hash entry, and am trying to use the .find() function to return the entire hash for each instance. What I'd like is this: if the word "gpa" is used as a search query and it is found in s['resumetext'] for three applicants (rows in the .csv file), it prints the id, name, and resume for every row that has it (all three applicants). As it is right now, my program prints the first row in the .csv file (print resume['id'], resume['name'], resume['resumetext']) no matter what the search query is, whether it's in the resume text or not. Lastly, are there better ways of doing this, by searching Word documents, PDFs and .txt files in a folder for specific words using Python? (I've just started reading about the re module and am wondering if this may be the route, rather than putting everything in a .csv file.) def find_details(id2find): resumes_f=open("resume_data.csv") for each_line in resumes_f: s={} (s['id'], s['name'], s['resumetext']) = each_line.split(";") resumetext = str(s['resumetext']) if resumetext.find(id2find): return(s) else: print "No data matches your search query. Please try again" searchquery = raw_input("please enter your search term") resume = find_details(searchquery) if resume: print resume['id'], resume['name'], resume['resumetext']
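
A probable cause of the "always prints the first row" behaviour: str.find() returns -1 when the text is absent, and -1 is truthy, so the function returns on the first row it inspects. Below is a minimal Python 3 sketch of one possible fix (the post itself uses Python 2). It assumes the three-field, semicolon-separated layout described above; the name find_resumes, the sample rows, and the case-insensitive comparison are illustrative assumptions, not part of the original program.

```python
def find_resumes(lines, query):
    """Return every row whose resume text contains the query."""
    matches = []
    needle = query.lower()  # assumption: the search ignores case
    for line in lines:
        parts = line.rstrip("\n").split(";", 2)
        if len(parts) != 3:
            continue  # skip rows that do not have three fields
        row_id, name, text = parts
        # "in" yields a real boolean, unlike str.find(), which returns -1
        if needle in text.lower():
            matches.append({"id": row_id, "name": name, "resumetext": text})
    return matches


# Hypothetical sample rows standing in for resume_data.csv.
sample_rows = [
    "1;Ann Lee;GPA 3.9, Python and SQL\n",
    "2;Bob Ray;Ten years of welding experience\n",
    "3;Cy Diaz;Graduated with a gpa of 3.5\n",
]

for row in find_resumes(sample_rows, "gpa"):
    print(row["id"], row["name"], row["resumetext"])
```

Because the function accepts any iterable of lines, an open file object can be passed in place of the sample list.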


  • How Do You Parse Column Data?

    - by discwiz
    I am trying to parse a file generated by LGA Tracon that lists the position data for aircraft over a given time frame. The data of interest starts with TRACKING DATA and ends with SST, and there are thousands of entries per file. The system generating the file, Common ARTS, is very rigid in its formatting, so we can expect the column spacing to be consistent. Any help would be greatly appreciated. Thanks. Here is an image to preserve the exact formatting. Here is a reduced text file.


  • Regular Expression to return the contents of a HTML tag received as a string of text

    - by Nathan Hernandez
    I have a string in my code that I receive that contains some HTML tags. It is not part of the HTML page being displayed, so I cannot grab the tag contents using the DOM (i.e. document.getElementById('tag id').firstChild.data). So, for example, within the string of text would appear a tag like this: <span id="myQty">12</span> My question is: how would I use a regular expression to access the '12' numeric digit in this example? This quantity could be any number of digits (i.e. it is not always a double digit). I have tried some regular expressions, but always end up getting the full span tag returned along with the contents. I only want the '12' in the example above, not the surrounding tag. The id of the tags will always be 'myQty' in the string of text I receive. Thanks in advance for any help!


  • How to capture strings using * or ? with groups in python regular expressions

    - by user1334085
    When the regular expression has a capturing group followed by "*" or "?", there is no value captured. Instead if you use "+" for the same string, you can see the capture. I need to be able to capture the same value using "?" >>> str1='This string has 29 characters' >>> re.search(r'(\d+)*', str1).group(0) '' >>> re.search(r'(\d+)*', str1).group(1) >>> >>> re.search(r'(\d+)+', str1).group(0) '29' >>> re.search(r'(\d+)+', str1).group(1) '29' More specific question is added below for clarity: I have str1 and str2 below, and I want to use just one regexp which will match both. In case of str1, I also want to be able to capture the number of QSFP ports >>> str1='''4 48 48-port and 6 QSFP 10GigE Linecard 7548S-LC''' >>> str2='''4 48 48-port 10GigE Linecard 7548S-LC''' >>> When I do not use a metacharacter, the capture works: >>> re.search(r'^4\s+48\s+.*(?:(\d+)\s+QSFP).*-LC', str1, re.I|re.M).group(1) '6' >>> It works even when I use the "+" to indicate one occurrence: >>> re.search(r'^4\s+48\s+.*(?:(\d+)\s+QSFP)+.*-LC', str1, re.I|re.M).group(1) '6' >>> But when I use "?" to match for 0 or 1 occurrence, the capture fails even for str1: >>> re.search(r'^4\s+48\s+.*(?:(\d+)\s+QSFP)?.*-LC', str1, re.I|re.M).group(1) >>>
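
One likely explanation for the behaviour above: the greedy .* in front of the optional group consumes the "6 QSFP" text itself, so the optional group is satisfied by matching nothing. A minimal sketch of one workaround is to move a lazy .*? inside the optional group, so the group is attempted before the rest of the line is consumed. The helper name qsfp_ports is illustrative, and the sample strings use single spaces where the original post may have had wider column spacing.

```python
import re

# The lazy prefix sits inside the optional group, so the whole group is tried
# first and is skipped only when no "<digits> QSFP" can be found on the line.
PATTERN = re.compile(r'^4\s+48\s+(?:.*?(\d+)\s+QSFP)?.*-LC', re.I | re.M)

str1 = '4 48 48-port and 6 QSFP 10GigE Linecard 7548S-LC'
str2 = '4 48 48-port 10GigE Linecard 7548S-LC'


def qsfp_ports(text):
    """Return the QSFP port count as a string, or None when it is absent."""
    match = PATTERN.search(text)
    return match.group(1) if match else None


print(qsfp_ports(str1))
print(qsfp_ports(str2))
```

Both strings match the pattern; only the first one fills the capture group.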


  • re.sub emptying list

    - by jmau5
    def process_dialect_translation_rules(): # Read in lines from the text file specified in sys.argv[1], stripping away # excess whitespace and discarding comments (lines that start with '##'). f_lines = [line.strip() for line in open(sys.argv[1], 'r').readlines()] f_lines = filter(lambda line: not re.match(r'##', line), f_lines) # Remove any occurrences of the pattern '\s*<=>\s*'. This leaves us with a # list of lists. Each 2nd level list has two elements: the value to be # translated from and the value to be translated to. Use the sub function # from the re module to get rid of those pesky asterisks. f_lines = [re.split(r'\s*<=>\s*', line) for line in f_lines] f_lines = [re.sub(r'"', '', elem) for elem in line for line in f_lines] This function should take the lines from a file and perform some operations on them, such as removing any lines that begin with ##. Another operation that I wish to perform is to remove the quotation marks around the words in each line. However, when the final line of this script runs, f_lines becomes an empty list. What happened? Requested lines of original file: ## English-Geek Reversible Translation File #1 ## (Moderate Geek) ## Created by Todd WAreham, October 2009 "TV show" <=> "STAR TREK" "food" <=> "pizza" "drink" <=> "Red Bull" "computer" <=> "TRS 80" "girlfriend" <=> "significant other"
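
A likely cause: the for clauses of a comprehension are read left to right, so "for elem in line" runs before "for line in f_lines" and uses whatever value line kept from the previous comprehension (Python 2 leaks that variable). If the file ends with a blank line, that leftover value is an empty string and the result is empty. Below is a minimal Python 3 sketch with the loops nested in the intended order; parse_rules and the sample list are illustrative names, and skipping blank lines is an added assumption.

```python
import re


def parse_rules(lines):
    """Turn '"a" <=> "b"' lines into [a, b] pairs, skipping comments and blanks."""
    stripped = (line.strip() for line in lines)
    kept = [line for line in stripped if line and not line.startswith('##')]
    pairs = [re.split(r'\s*<=>\s*', line) for line in kept]
    # Outer comprehension walks the pairs, inner one walks each pair's elements.
    return [[elem.replace('"', '') for elem in pair] for pair in pairs]


sample = [
    '## English-Geek Reversible Translation File #1\n',
    '"TV show" <=> "STAR TREK"\n',
    '"food" <=> "pizza"\n',
    '\n',
]

print(parse_rules(sample))
```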


  • Using awk to return only certain chunks of data

    - by Koriar
    I'm not 100% certain how to phrase my question simply, so I apologize if this has been answered somewhere and I was just unable to find it. What I have are debug logs with authentication packets in them along with a bunch of other output. I need to search through about 2 million lines of logs to find every packet that contains a certain mac address. The packets look something like this (slightly censored): -----------------[ header ]----------------- Event: Authd-Response (1900) Sequence: -54 Timestamp: 1969-12-31 19:30:00 (0) ---------------[ attributes ]--------------- Auth-Result = Auth-Accept Service-Profile-SID = 53 Service-Profile-SID = 49 RADIUS-Access-Accept-Attr/WiMAX-Capability = 0x(numbers) Session-Timeout = 3600 Service-Profile-SID = 4 Service-Profile-SID = 29 Chargeable-User-Identity = "(Numbers)" User-Password = "(the MAC address I'm looking for)" -------------------------------------------- However there are about 10 different possible types with different possible lengths. They all start with the header line and end with the all-dashes line. I've had success using awk to get the code blocks themselves using this: awk '/-----------------\[ header \]-----------------/,/--------------------------------------------/' filename.txt But I was hoping to be able to use it to return only the packets which contain the MAC address that I need. I've been trying to figure this out for a few days now and I'm pretty stuck. I could try and write a bash script, but I could swear that I've used awk to do something like this before...


  • Need help parsing HTML with a regex in python

    - by laspal
    Hi, My string is mystring = "<tr><td><span class='para'><b>Total Amount : </b>INR (Indian Rupees) 100.00</span></td></tr>" My problem is that I have to search for and extract the total amount: test = re.search("(Indian Rupees)(\d{2})(?:\D|$)", mystring) but test is None. How can I get the value? The values can be 10.00, 100.00, 1000.00. Thanks
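
In the pattern above, the unescaped parentheses form a capture group instead of matching the literal "(" and ")", and \d{2} neither allows for the space after the closing parenthesis nor for the fractional part. A minimal sketch of one way to capture the amount follows; the name total_amount is illustrative, and the two-digit fraction is an assumption based on the sample values.

```python
import re

mystring = ("<tr><td><span class='para'><b>Total Amount : </b>"
            "INR (Indian Rupees) 100.00</span></td></tr>")

# Escaped parentheses match literally; the capture group takes the digits
# plus an optional two-digit fraction.
AMOUNT = re.compile(r'\(Indian Rupees\)\s*(\d+(?:\.\d{2})?)')


def total_amount(html):
    """Return the amount as a string, or None when the label is missing."""
    match = AMOUNT.search(html)
    return match.group(1) if match else None


print(total_amount(mystring))
```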


  • Is it possible to parse a web page from the client side for a large number of words and if so, how?

    - by Technoh
    I have a list of keywords, about 25,000 of them. I would like people who add a certain <script> tag to their web page to have these keywords transformed into links. What would be the best way to achieve this? I have tried the simple JavaScript approach (an array with lots of elements and regexping/replacing each) and it obviously slows down the browser. I could always process the content server-side if there was a way, from the client, to send the page's content to a cross-domain server script (I'm partial to PHP but it could be anything), but I don't know of any way to do this. Any other working solution is also welcome.


  • Simple regular expression for decimal numbers?

    - by finch
    I know this may be the simplest question ever asked on Stack Overflow, but what is the regular expression for a decimal with a precision of 2? Valid examples: 123.12 2 56754 92929292929292.12 0.21 3.1 Invalid examples: 12.1232 2.23332 e666.76 Sorry for the lame question, but for the life of me I haven't been able to find anyone that can help! The decimal part may be optional, and integers may also be included.
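
One pattern that fits the examples listed above is "one or more digits, optionally followed by a dot and one or two digits". Here is a minimal Python sketch; the name is_valid is illustrative, and rejecting forms such as "2." or ".5" is an assumption, since the question does not list them either way.

```python
import re

# Digits, then an optional fraction of one or two digits.
DECIMAL = re.compile(r'\d+(?:\.\d{1,2})?')


def is_valid(text):
    """True when the whole string is a number with at most two decimals."""
    return DECIMAL.fullmatch(text) is not None


valid = ['123.12', '2', '56754', '92929292929292.12', '0.21', '3.1']
invalid = ['12.1232', '2.23332', 'e666.76']

print(all(is_valid(s) for s in valid))
print(any(is_valid(s) for s in invalid))
```

In flavors without a full-match call, the same pattern can be anchored as ^\d+(\.\d{1,2})?$.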


  • Regular Expression with Names and Emails

    - by Nina
    I am having a problem with regular expressions at the moment. What I'm trying to do is that for each line through the iteration, it checks for this type of pattern: Lastname, Firstname If it finds the name, then it will take the first letter of the first name, and the first six letters of the lastname and form it as an email. I have the following: $checklast = "[A-z],"; $checkfirst = "[A-z]"; if (ereg($checklast, $parts[1])||ereg($checkfirst, $parts[2])){ $first = preg_replace($checkfirst, $checkfirst{1,1}, $parts[2]); print "<a href='mailto:[email protected];'> $parts[$i] </a>"; } This one obviously broke the code. But I was initially attempting to find only the first letter of the firstname and then after that the first six letters of the lastname followed by the @email.com This didn't work out too well. I'm not sure what to do at this point. Any help is much appreciated.


  • Regular expression match, extracting only wanted segments of string

    - by Ben Carey
    I am trying to extract three segments from a string. As I am not particularly good with regular expressions, I think what I have done could probably be done better... I would like to extract the bold parts of the following string: SOMETEXT: ANYTHING_HERE (Old=ANYTHING_HERE, New=ANYTHING_HERE) Some examples could be: ABC: Some_Field (Old=,New=123) ABC: Some_Field (Old=ABCde,New=1234) ABC: Some_Field (Old=Hello World,New=Bye Bye World) So the above would return the following matches: $matches[0] = 'Some_Field'; $matches[1] = ''; $matches[2] = '123'; So far I have the following code: preg_match_all('/^([a-z]*\:(\s?)+)(.+)(\s?)+\(old=(.+)\,(\s?)+new=(.+)\)/i',$string,$matches); The issue with the above is that it returns a match for each separate segment of the string. I do not know how to ensure the string is the correct format using a regular expression without catching and storing the match if that makes sense? So, my question, if not already clear, how I can retrieve just the segments that I want from the above string?


  • Regular expressions in python unicode

    - by Remy
    I need to remove all the HTML tags from a given web page's data. I tried this using regular expressions: import urllib2 import re page = urllib2.urlopen("http://www.frugalrules.com") from bs4 import BeautifulSoup, NavigableString, Comment soup = BeautifulSoup(page) link = soup.find('link', type='application/rss+xml') print link['href'] rss = urllib2.urlopen(link['href']).read() souprss = BeautifulSoup(rss) description_tag = souprss.find_all('description') content_tag = souprss.find_all('content:encoded') print re.sub('<[^>]*>', '', content_tag) But the syntax of re.sub is: re.sub(pattern, repl, string, count=0) So, I modified the code as follows (instead of the print statement above): for row in content_tag: print re.sub(ur"<[^>]*>",'',row,re.UNICODE) But it gives the following error: Traceback (most recent call last): File "C:\beautifulsoup4-4.3.2\collocation.py", line 20, in <module> print re.sub(ur"<[^>]*>",'',row,re.UNICODE) File "C:\Python27\lib\re.py", line 151, in sub return _compile(pattern, flags).sub(repl, string, count) TypeError: expected string or buffer What am I doing wrong?
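
Two things probably contribute to the error above: the items returned by find_all are tag objects rather than strings, and the fourth positional argument of re.sub is count, so re.UNICODE is interpreted as a replacement limit instead of a flag. Below is a minimal Python 3 sketch that converts the value to text first and supplies the flag when compiling; strip_tags is an illustrative name and a plain string stands in for the parsed feed.

```python
import re

# Compiling with flags avoids passing re.UNICODE in the count position.
TAG = re.compile(r'<[^>]*>', flags=re.UNICODE)


def strip_tags(markup):
    """Remove anything that looks like <...> from the given markup."""
    # str() turns non-string objects (such as a parsed tag) into text first.
    return TAG.sub('', str(markup))


print(strip_tags('<p>Hello <b>world</b></p>'))
```

Since the feed is already parsed with BeautifulSoup, calling get_text() on each tag may be a simpler route than a regular expression.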


  • Regular Expression to isolate an html tag

    - by orit cohen
    I'm looking for a regular expression to isolate an HTML tag. This includes the TAG, the ATTRIBUTES and the CONTENT inside. Let's say I have this: <html> <body> aajsdfkjaskd <TAGNAME name="bla" context="non">hfdfhdj </TAGNAME> </body> </html> I need a regular expression that would return: <TAGNAME name="bla" context="non">hfdfhdj </TAGNAME> Thanks, Joe


  • How Do I grep For non-ASCII Characters in UNIX

    - by Peter Conrey
    I have several very large XML files and I'm trying to find the lines that contain non-ASCII characters. I've tried the following: grep -e "[\x{00FF}-\x{FFFF}]" file.xml But this returns every line in the file, regardless of whether the line contains a character in the range specified. Do I have the syntax wrong or am I doing something else wrong? I've also tried: egrep "[\x{00FF}-\x{FFFF}]" file.xml (with both single and double quotes surrounding the pattern).


  • Perl Strip Comments with Regex Unique Request

    - by YoDar
    Hello, I'm running code that reads files and does some parsing, but it needs to ignore all comments. There are good explanations of how to do this, like this one: $/ = undef; $_ = <>; s#/\*[^*]*\*+([^/*][^*]*\*+)*/|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $2 ? $2 : ""#gse; print; My first problem is that after running the line $/ = undef; my code doesn't work properly. Actually, I don't know what it does. But if I could set it back after stripping the comments, that would be helpful. In general, what is a useful way to ignore all comments without changing the rest of the code? Thanks, YoDar


  • Regular expression: who's greedier?

    - by polygenelubricants
    My primary concern is with the Java flavor, but I'd also appreciate information regarding others. Let's say you have a subpattern like this: (.*)(.*) Not very useful as is, but let's say these two capture groups (say, \1 and \2) are part of a bigger pattern that matches with backreferences to these groups, etc. So both are greedy, in that they try to capture as much as possible, only taking less when they have to. My question is: who's greedier? Does \1 get first priority, giving \2 its share only if it has to? What about: (.*)(.*)(.*) Let's assume that \1 does get first priority. Let's say it got too greedy, and then spit out a character. Who gets it first? Is it always \2 or can it be \3? Let's assume it's \2 that gets \1's rejection. If this still doesn't work, who spits out now? Does \2 spit to \3, or does \1 spit out another to \2 first?
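
The question targets Java, but the same experiment can be sketched in Python, whose re module is also a backtracking engine; behaviour in other flavors should be checked separately. The second pattern uses + instead of * so that later groups are forced to take something, which makes the order of giving back visible. The helper name groups is illustrative.

```python
import re


def groups(pattern, text):
    """Return the captured groups when the pattern matches the whole text."""
    match = re.fullmatch(pattern, text)
    return match.groups() if match else None


# The first group takes everything; the second is left with the empty string.
print(groups(r'(.*)(.*)', 'abc'))

# Each group needs at least one character, so earlier groups must give some back.
print(groups(r'(.+)(.+)(.+)', 'abcde'))
```

In this engine the leftmost group gets first priority. When \1 releases a character, \2 picks it up, and \2 hands characters on to \3 before \1 is asked to release another one, because backtracking revisits the most recent choice first.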


  • How to replace only part of the match with python re.sub

    - by Arty
    I need to match two cases with one regular expression and do a replacement: 'long.file.name.jpg' - 'long.file.name_suff.jpg' 'long.file.name_a.jpg' - 'long.file.name_suff.jpg' I'm trying to do the following: re.sub('(\_a)?\.[^\.]*$' , '_suff.',"long.file.name.jpg") But this cuts off the extension '.jpg', and I'm getting long.file.name_suff. instead of long.file.name_suff.jpg I understand that this is because of the [^.]*$ part, but I can't exclude it, because I have to find the last occurrence of '_a' to replace, or the last '.' Is there a way to replace only part of the match?
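
One common way to keep part of a match is to capture it and refer to it in the replacement string. The sketch below captures the extension in group 1 and writes it back with \1; the function name add_suffix is illustrative, and the fixed suffix _suff is taken from the examples in the question.

```python
import re


def add_suffix(filename):
    """Replace an optional trailing '_a' with '_suff', keeping the extension."""
    # Group 1 holds the extension so the replacement can put it back.
    return re.sub(r'(?:_a)?\.([^.]*)$', r'_suff.\1', filename)


print(add_suffix('long.file.name.jpg'))
print(add_suffix('long.file.name_a.jpg'))
```

A lookahead such as (?=\.[^.]*$) would be an alternative that leaves the extension out of the match altogether.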

