Does anyone have suggestions for detecting URLs in a set of elements and converting them to links?
$$('#pad dl dd').each(function(s){
//detect urls and convert to a elements.
});
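One possible approach, sketched here in Python rather than JavaScript: match bare http(s) URLs with a deliberately simple pattern and wrap each one in an anchor. The name `linkify` and the pattern are assumptions for illustration. The pattern does not handle trailing punctuation or text that is already inside a link, and it should carry over to a JavaScript `replace` call largely unchanged.

```python
import re

# Simple, permissive URL pattern: scheme followed by anything that is not
# whitespace, an angle bracket, or a double quote.
URL_RE = re.compile(r'https?://[^\s<>"]+')


def linkify(text):
    """Wrap each bare http(s) URL in text in an <a> element."""
    def to_anchor(match):
        url = match.group(0)
        return '<a href="{0}">{0}</a>'.format(url)

    return URL_RE.sub(to_anchor, text)
```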
I have an interesting problem. I wrote the following Perl script to recursively loop through a directory and, for the img/script/a tags in every HTML file, do the following:
Convert the entire url to lowercase
Replace spaces and %20 with underscores
The script works great except when an image tag is wrapped in an anchor tag. Is there a way to modify the current script so it can also manipulate the links of nested tags that are not on separate lines? Basically, if I have <a href="..."><img src="..."></a>, the script will only change the link in the anchor tag but skip the img tag.
#!/usr/bin/perl
use File::Find;
$input="/var/www/tecnew/";
sub process {
if (-T and m/.+\.(htm|html)/i) {
#print "htm/html: $_\n";
open(FILE,"+<$_") or die "couldn't open file $!\n";
$out = '';
while(<FILE>) {
$cur_line = $_;
if($cur_line =~ m/<a.*>/i) {
print "cur_line (unaltered) $cur_line\n";
$cur_line =~ /(^.* href=\")(.+?)(\".*$)/i;
$beg = $1;
$link = html_clean($2);
$end = $3;
$cur_line = $beg.$link.$end;
print "cur_line (altered) $cur_line\n";
}
if($cur_line =~ m/(<img.*>|<script.*>)/i) {
print "cur_line (unaltered) $cur_line\n";
$cur_line =~ /(^.* src=\")(.+?)(\".*$)/i;
$beg = $1;
$link = html_clean($2);
$end = $3;
$cur_line = $beg.$link.$end;
print "cur_line (altered) $cur_line\n";
}
$out .= $cur_line;
}
seek(FILE, 0, 0) or die "can't seek to start of file: $!";
print FILE $out or die "can't print to file: $!";
truncate(FILE, tell(FILE)) or die "can't truncate file: $!";
close(FILE) or die "can't close file: $!";
}
}
find(\&process, $input);
sub html_clean {
my($input_string) = @_;
$input_string = lc($input_string);
$input_string =~ s/%20|\s/_/g;
return $input_string;
}
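The nested case fails because each line gets only one substitution per tag type, and the greedy `^.*` skips to the last `src=` or `href=` on the line. A minimal sketch of a different approach, in Python rather than Perl: run one global substitution over every `href`/`src` attribute of `a`, `img`, and `script` tags, so it no longer matters how many tags share a line. The names `ATTR_RE` and `clean_links` are made up for this sketch, and it assumes double-quoted attribute values, as the original script does.

```python
import re

# Matches the opening part of an a/img/script tag up to and including
# href=" or src=", then the attribute value, then the closing quote.
ATTR_RE = re.compile(
    r'(<(?:a|img|script)\b[^>]*?\s(?:href|src)=")([^"]*)(")',
    re.IGNORECASE,
)


def html_clean(url):
    """Lowercase the URL and replace spaces and %20 with underscores."""
    return re.sub(r'%20|\s', '_', url.lower())


def clean_links(html):
    """Clean every matching href/src value, however the tags are nested."""
    return ATTR_RE.sub(
        lambda m: m.group(1) + html_clean(m.group(2)) + m.group(3),
        html,
    )
```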
Is it possible to write a regular expression that matches a nested pattern that occurs an unknown number of times? For example, can a regular expression match an opening and closing brace when there are an unknown number of opening and closing braces nested within the outer braces?
For example:
public MyMethod()
{
if (test)
{
// More { }
}
// More { }
} // End
Should match:
{
if (test)
{
// More { }
}
// More { }
}
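Regular expressions in the classical sense cannot match arbitrarily deep nesting, although some engines offer extensions for it (recursion in PCRE, balancing groups in .NET). Python's built-in `re` module has no such feature, so a hedged alternative is a small depth counter. The function name `outer_brace_block` is hypothetical, and the sketch counts braces inside comments and strings like any others.

```python
def outer_brace_block(text):
    """Return the first balanced {...} block in text, or None if there is none."""
    depth = 0
    start = None
    for i, ch in enumerate(text):
        if ch == '{':
            if depth == 0:
                start = i          # remember where the outermost block opens
            depth += 1
        elif ch == '}' and depth > 0:
            depth -= 1
            if depth == 0:         # the outermost block just closed
                return text[start:i + 1]
    return None
```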
I want to change
<lang class='brush:xhtml'>test</lang>
to
<pre class='brush:xhtml'>test</pre>
My code looks like this:
<?php
$content="<lang class='brush:xhtml'>test</lang>";
$pattern=array();
$replace=array();
$pattern[0]="/<lang class=([A-Za-z='\":])* </";
$replace[0]="<pre $1>";
$pattern[1]="/<lang>/";
$replace[1]="</pre>";
echo preg_replace($pattern, $replace,$content);
?>
But it's not working. How should I change my code, or what is wrong with it?
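For comparison, here is a sketch of the same substitution in Python rather than PHP. It captures the attributes and the body in a single pattern, so the opening and closing tags are rewritten together. The names `LANG_RE` and `lang_to_pre` are made up, and the sketch assumes `<lang>` elements are not nested.

```python
import re

# Group 1: optional attributes (with leading whitespace); group 2: the body.
LANG_RE = re.compile(r"<lang(\s[^>]*)?>(.*?)</lang>", re.DOTALL)


def lang_to_pre(html):
    """Rewrite <lang ...>body</lang> as <pre ...>body</pre>, keeping attributes."""
    return LANG_RE.sub(
        lambda m: "<pre{}>{}</pre>".format(m.group(1) or "", m.group(2)),
        html,
    )
```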
I'm currently working on an implementation of the following idea, and I was wondering if there is any literature on this subject.
I'm working with Java, but the principle applies to any language with a decent type system. I would like to implement matching Objects from a List using a regular-expression-like search.
So let's say I have a List containing
List<Object> x = new ArrayList<Object>();
x.add(new Object());
x.add("Hello World");
x.add("Second String");
x.add(5); // Integer (auto-boxing)
x.add(6); // Integer
Then I create a "Regular Expression" (not working with a stream of characters, but working with a stream of Objects), and instead of character-classes, I use type-system properties:
[String][Integer]
And this would match one sublist: {Match["Second String", 5]}. The expression:
[String:length()<15]
This will match two sublists (each of length 1), each containing a String whose instance satisfies the expression instance.length() < 15: {Match["Hello World"],Match["Second String"]}.
[Object][Object]
Matches any pair in the List: {Match[Object,"Hello World"],Match["Second String", 5]}, in a streamed manner (no overlapping matches).
Of course, my implementation will have grouping and lookahead/lookbehind, and it will be hierarchical (i.e. matching n elements from Lists in Lists), etc. The above merely illustrates the concept.
Is there a name for this principle, and is there literature available on it?
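To make the concept concrete, here is a deliberately tiny sketch in Python rather than Java. A "pattern" is a fixed-length list of predicates, and the scan returns non-overlapping matches from left to right. The name `find_matches` is hypothetical, and there are no quantifiers, groups, or lookarounds, so this only covers the examples above.

```python
def find_matches(items, pattern):
    """Return non-overlapping sublists whose elements satisfy pattern in order.

    pattern is a list of one-argument predicates, one per position.
    """
    matches = []
    i, n = 0, len(pattern)
    while i + n <= len(items):
        window = items[i:i + n]
        if all(pred(x) for pred, x in zip(pattern, window)):
            matches.append(window)
            i += n                 # consume the match, so matches never overlap
        else:
            i += 1
    return matches


def is_str(x):
    return isinstance(x, str)


def is_int(x):
    return isinstance(x, int)


def short_str(x):
    return isinstance(x, str) and len(x) < 15
```

With the list from the question, `find_matches(items, [is_str, is_int])` plays the role of `[String][Integer]`.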
This is a simple one. I want to replace a sub-string with another sub-string on client-side using Javascript.
Original string is 'original READ ONLY'
I want to replace the 'READ ONLY' with 'READ WRITE'
Any quick answer please? Possibly with a javascript code snippet...
I am writing a small Windows script in JavaScript/JScript that matches a regexp against a string I get by manipulating a file.
The file path can be provided relative or absolute. How to find whether a given path is absolute/relative and convert it to absolute for file manipulation?
I have string like this:
command "http://www.mysite.com" some_param="string param" some_param2=50
I want to tokenize this string into:
command
"http://www.mysite.com"
some_param="string param"
some_param2=50
I know it's possible to split on spaces, but these parameters can also be separated by commas, like:
command "http://www.mysite.com", some_param="string param", some_param2=50
I tried to do it like this:
\w+\=?\"?.+\"?
but it didn't work.
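One way to approach this is to describe what a token looks like instead of what separates tokens. A minimal sketch in Python (the names `TOKEN_RE` and `tokenize` are made up), assuming quoted values never contain an escaped double quote:

```python
import re

# Alternatives are tried in order:
#   key="quoted value" | key=bare_value | "quoted string" | bare word
TOKEN_RE = re.compile(r'\w+="[^"]*"|\w+=[^\s,]+|"[^"]*"|[^\s,]+')


def tokenize(line):
    """Split a command line on spaces and/or commas, keeping quoted parts whole."""
    return TOKEN_RE.findall(line)
```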
I'm using Amazon Web Services to get product descriptions of various items. The problem is that Amazon's content contains markup that is sometimes destructive to the layout of my web page (e.g. unclosed DIVs, etc.).
I want to sanitize the content I get from Amazon. My solution would be to do the following (my initial list so far):
Remove unnecessary tags such as div, span, etc. while keeping tags like p, ul, ol, etc.
Remove all attributes from all the tags (e.g. seems like there are style attributes in some of the tags)
Remove excess white space (e.g. multiple spaces, carriage returns, new lines, tabs, etc.)
Etc.
Before I go off trying to build my solution, I'm wondering if anyone has a better idea (or an already existing solution). Thanks.
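The first three items on the list could be sketched with Python's standard `html.parser` module, as below. The allow-list in `ALLOWED` and the names `Sanitizer` and `sanitize` are assumptions for illustration. The sketch drops disallowed tags but keeps their text, strips all attributes, and collapses whitespace. It does not rebalance unclosed allowed tags, so a dedicated sanitizing library may still be the safer choice.

```python
import re
from html import escape
from html.parser import HTMLParser

# Hypothetical allow-list; adjust to taste.
ALLOWED = {"p", "ul", "ol", "li", "b", "i", "em", "strong"}


class Sanitizer(HTMLParser):
    """Keep allowed tags (without attributes) and all text; drop other tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED:
            self.out.append("<%s>" % tag)      # attributes are discarded

    def handle_endtag(self, tag):
        if tag in ALLOWED:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def sanitize(html):
    parser = Sanitizer()
    parser.feed(html)
    parser.close()
    # Collapse runs of spaces, tabs, and newlines into single spaces.
    return re.sub(r"\s+", " ", "".join(parser.out)).strip()
```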
I want to find files that have "abc" AND "efg" in that order, and those two strings are on different lines in that file. Eg: a file with content:
blah blah..
blah blah..
blah abc blah
blah blah..
blah blah..
blah blah..
blah efg blah blah
blah blah..
blah blah..
Should be matched.
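Because the two strings sit on different lines, a line-by-line search is not enough; the file has to be searched as a whole. A sketch in Python, with made-up names `has_abc_then_efg` and `matching_files`, assuming each file is small enough to read into memory:

```python
import re

# "abc", the rest of that line, a newline, then "efg" anywhere later.
PATTERN = re.compile(r"abc[^\n]*\n.*efg", re.DOTALL)


def has_abc_then_efg(text):
    """True if "abc" appears and "efg" appears on a later line."""
    return PATTERN.search(text) is not None


def matching_files(paths):
    """Yield the paths whose contents satisfy has_abc_then_efg."""
    for path in paths:
        with open(path, encoding="utf-8", errors="replace") as fh:
            if has_abc_then_efg(fh.read()):
                yield path
```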
I need to grab the video ID from a Google Video URL. There are two different types of URLs that I need to be able to match:
http://video.google.com/videoplay?docid=-3498228245415745977#
where I need to be able to match -3498228245415745977 (note the dash; -), and
video.google.com/videoplay?docid=-3498228245415745977#docid=2728972720932273543
where I need to match 2728972720932273543. Is there any good regular expression that can match this?
This is what I've got so far: @"docid=(-?\d{19}+)" since the video ID seems to be 19 characters except when it's prefixed with the dash.
I'm using C# (of which I have very little experience) if that changes anything.
P.S. I would also appreciate it if you could review my regular expressions for YouTube (@"[\?&]v=([^&#])";), RedTube (@"/(\d{1,6})") and Vimeo (@"/(\d*)").
I do not expect users to enter the full URL and thus do not match the ^http://\\.?sitename+\\.\\w{2,3}.
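A sketch of one way to cover both URL shapes, in Python rather than C#: collect every `docid=` value and keep the last one, on the assumption (taken from the second example) that a `docid` in the fragment should win. The name `video_id` is made up, and the digit count is left open rather than fixed at 19.

```python
import re

DOCID_RE = re.compile(r"docid=(-?\d+)")


def video_id(url):
    """Return the last docid value in url, or None if there is none."""
    ids = DOCID_RE.findall(url)
    return ids[-1] if ids else None
```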
I want to grep the shortest match and the pattern should be something like:
<car ... model=BMW ...>
...
...
...
</car>
... means any character and the input is multiple lines.
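Plain grep works line by line, so a multi-line match usually needs a tool that sees the whole input. As a sketch in Python (the name `CAR_RE` is made up): require `model=BMW` inside the opening tag, then match lazily up to the nearest `</car>` while refusing to cross another `<car`. This assumes `car` elements are not nested.

```python
import re

CAR_RE = re.compile(
    r"<car\b[^>]*\bmodel=BMW\b[^>]*>"   # opening tag that contains model=BMW
    r"(?:(?!<car\b).)*?"                # shortest body, never crossing another <car
    r"</car>",
    re.DOTALL,
)
```

`CAR_RE.findall(text)` then returns each matching element as one string.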
Hi,
I am trying to match patterns like '@(a-zA-Z0-9)+' but not ones like 'abc@test'.
So this is what I tried:
Pattern MY_PATTERN
= Pattern.compile("\\s@(\\w)+\\s?");
String data = "[email protected] #gogasig @jytaz @tibuage";
Matcher m = MY_PATTERN.matcher(data);
StringBuffer sb = new StringBuffer();
boolean result = m.find();
while(result) {
System.out.println (" group " + m.group());
result = m.find();
}
But I can only see '@jytaz', but not @tibuage.
How can I fix my problem? Thank you.
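A likely cause is that the trailing `\s?` consumes the space after '@jytaz', so the leading `\s` has nothing left to match before '@tibuage'. A lookbehind checks the preceding character without consuming it. Here is a sketch in Python with a made-up test string; the same `(?<!\S)` construct is also available in Java's `Pattern`.

```python
import re

# "@word" that is not preceded by a non-whitespace character,
# i.e. at the start of the text or right after whitespace.
MENTION_RE = re.compile(r"(?<!\S)@\w+")


def mentions(text):
    return MENTION_RE.findall(text)
```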
Hello,
I am trying to write some mod_rewrite rules to generate thumbnails on the fly.
So when this URL is requested:
example.com/media/myphoto.jpg?width=100&height=100
the script should rewrite it to
example.com/media/myphoto-100x100.jpg
If the file exists on disk, it gets served by Apache; if it doesn't exist, a script is called to generate the file.
I wrote this
RewriteCond %{QUERY_STRING} ^width=(\d+)&height=(\d+)
RewriteRule ^media/([a-zA-Z0-9_\-]+)\.([a-zA-Z0-9]+)$ media/$1-%1x%2.$2 [L]
RewriteCond %{QUERY_STRING} ^(.+)?
RewriteRule ^media/([a-zA-Z0-9_\-\._]+)$ media/index.php?file=$1&%1 [L]
and I get infinite internal redirects.
The first condition is matched and the rule is executed and right after that I get an internal redirect.
I need advice to finish this script.
Thank you.
I would like to extract the number from a string in MSBuild.
How can I do that using the built in tasks or the MSBuild.Community.Tasks? (RegexMatch might do, but how?)
Example: I have the string
agent0076
and I would like to get out the number, without the leading zeros:
76
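Whatever task ends up running it, the regular expression itself can be small. A sketch in Python, only to illustrate the pattern (the name `trailing_number` is made up): skip the leading zeros, then capture the remaining digits at the end of the string.

```python
import re


def trailing_number(name):
    """Return the trailing number of name without leading zeros, or None."""
    match = re.search(r"0*(\d+)$", name)
    return match.group(1) if match else None
```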
hello!
I need to extract the zipcode from each line of a file.
Each line contains an address, and the lines are formatted in different ways.
eg.
"Großen Haag 5c, DE-47559 Kranenburg"
or
"Lange Ruthe 7b, 55294 Bodenheim"
the zipcode is always a five digit number and sometimes follows "DE-".
I use Java.
Thanks a lot!
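A sketch of a pattern for this, in Python rather than Java (the names `ZIP_RE` and `zipcode` are made up): an optional "DE-" prefix followed by exactly five digits that are not part of a longer number. The same pattern syntax should work with `java.util.regex`.

```python
import re

# Five digits, optionally prefixed by "DE-", not adjacent to other digits.
ZIP_RE = re.compile(r"(?<!\d)(?:DE-)?(\d{5})(?!\d)")


def zipcode(line):
    """Return the first five-digit zipcode in line, or None."""
    match = ZIP_RE.search(line)
    return match.group(1) if match else None
```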
While I can see the value and usefulness of regular expressions, I also find that they are extremely complicated and difficult to create and debug. I am often at the point where I find their usefulness is offset by the difficulty in creating expressions.
I am a bit astonished that there is nothing quite like them and that there hasn't been an effort to recreate them using a more verbose or less arcane syntax.
So, are regular expressions here to stay? Are there alternatives that are gaining traction? Do other people just ignore them and write hundreds of lines of string comparison functions?
I'm trying to create a standardized show/hide element system, like so:
<div class="opener popup_1">Click Me</div>
<div class="popup popup_1">I'm usually hidden</div>
Clicking on the div with the opener class should show() the div with the popup class. I don't know how many opener/popup combinations I'm going to have on any given page, I don't know where on any given page the opener and the popup are going to be displayed, and I don't know how many popups a given opener should call show() for. Both the opener and the popup have to be able to have more classes than just what's used by jQuery.
What I'd like to do is something like this:
$(".opener").click(function() {
var openerTarget = $(this).attr("class").filter(function() {
return this.class.match(/^popup_([a-zA-Z0-9-_\+]*) ?$/);
});
$(".popup." + openerTarget).show();
});
The idea is that when you click on an opener, it filters out "popup_whatever" from opener's classes and stores that as openerTarget. Then anything with class=popup and openerTarget will be shown.
I am using CodeIgniter and its routes system successfully with some lovely regexps; however, I have come unstuck on what should be an easy-peasy thing in the system.
I want to include a bunch of search engine related files (for Google webmaster etc.) plus the robots.txt file, all in a controller.
So, I have created the controller and updated the routes file, but I don't seem to be able to get it working with these files.
Here's a snip from my routes file:
$route['robots\.txt|LiveSearchSiteAuth\.xml'] = 'search_controller/files';
Within the function I use the URI helper to figure out which content to show.
Now I can't get this to match, which points to my regexp being wrong. I'm sure this is a really obvious one, but it's late and my caffeine tank is empty :)
I have the following code that strips all tags. Now I want to strip only anchor tags.
x = re.compile(r'<[^<]*?/?>')
How can I modify it so that only anchor tags are stripped?
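A minimal sketch of one possible pattern: match only opening and closing `a` tags and leave everything else alone. The names `ANCHOR_TAG_RE` and `strip_anchor_tags` are made up. The pattern assumes no `>` appears inside an attribute value, and the link text is kept.

```python
import re

# "<a>", "<a ...>" or "</a>", case-insensitively; "<abbr>" and others are left alone.
ANCHOR_TAG_RE = re.compile(r"</?a(?:\s[^>]*)?>", re.IGNORECASE)


def strip_anchor_tags(html):
    """Remove anchor tags but keep their contents and all other tags."""
    return ANCHOR_TAG_RE.sub("", html)
```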
I have a variable that has this string:
<DIV><SPAN style="FONT-FAMILY: Tahoma; FONT-SIZE: 10pt">[If the confirmation is active the subscriber will receive this email after succesfully confirming. If not, this will be the first and only email he will receive.]</SPAN></DIV>
<p align=center>
<input class=fieldbox10 type = 'button' name = 'button' value = 'Close' onclick = "window.close()">
</p>
How do I remove the below string without worrying about spaces via Javascript (or jQuery)?
<p align=center>
<input class=fieldbox10 type = 'button' name = 'button' value = 'Close' onclick = "window.close()">
</p>
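One way to ignore the spacing is to allow optional whitespace at every point where it may vary. A sketch in Python rather than JavaScript (the names `CLOSE_BLOCK_RE` and `remove_close_button` are made up); the same pattern should work in a JavaScript regex literal with the `i` flag once the `/` in `</p>` is escaped. It assumes the paragraph contains nothing but the single input element.

```python
import re

CLOSE_BLOCK_RE = re.compile(
    r"<p\s+align\s*=\s*center\s*>\s*<input\b[^>]*>\s*</p>",
    re.IGNORECASE,
)


def remove_close_button(html):
    """Remove the centered paragraph that wraps the Close button."""
    return CLOSE_BLOCK_RE.sub("", html)
```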