Search Results

Search found 32731 results on 1310 pages for 'regex for html'.

Page 2/1310 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >

  • Removing .html and index.html from URL

    - by James Turner
    I'm having some problems trying to (1) remove the .html extension from URLs and (2) remove 'index.html' from a URL. 1) To remove the extension I have tried using this in my htaccess file: RewriteEngine on RewriteCond %{REQUEST_FILENAME} !-d RewriteCond %{REQUEST_FILENAME}\.html -f RewriteRule ^(.*)$ $1.html However, when I click links in my HTML such as <a href="abcde.html"></a> it doesn't remove the .html from the URL and I am left with www.website.com/abcde.html 2) I tried using this to remove the index.html: RewriteCond %{THE_REQUEST} \/index\.(php|html)\ HTTP [NC] RewriteRule (.*)index\.(php|html)$ /$1 [R=301,L] But when I load an index.html file on my server, my URL looks something like www.website.com/folder// and I am left with an extra / at the end. Can anyone help me out?

  • .htaccess 301 redirect with regex?

    - by Eddie ZA
    How to do this with regular expression? Old -> New http://www.example.com/1.html -> http://www.example.com/dir/1.html http://www.example.com/2.html -> http://www.example.com/dir/2.html http://www.example.com/3.asp -> http://www.example.com/dir/3.html http://www.example.com/4.asp -> http://www.example.com/dir/4.html http://www.example.com/4_a.html -> http://www.example.com/dir/sub/4-a.html http://www.example.com/4_b.html -> http://www.example.com/dir/sub/4-b.html I've tried this: Redirect 301 /1.html http://www.example.com/dir/1.html Redirect 301 /2.html http://www.example.com/dir/2.html Redirect 301 /3.asp http://www.example.com/dir/3.html Redirect 301 /4.asp http://www.example.com/dir/4.html Redirect 301 /4_a.html http://www.example.com/dir/sub/4-a.html Redirect 301 /4_b.html http://www.example.com/dir/sub/4-b.html

  • Get the rendered text from HTML (Delphi)

    - by Daisetsu
    I have some HTML and I need to extract the actual written text from the page. So far I have tried using a web browser and rendering the page, then going to the document property and grabbing the text. This works, but only where the browser is supported (IE COM object). The problem is that I want this to run under Wine as well, so I need a solution that doesn't use IE COM. There must be a reasonable programmatic way to do this.

  • .NET regex: Match.nextMatch() never returns

    - by Jimmy
    I have a regex that seems to have worked fine for the past year or so, and all of a sudden today with a new slightly different text to match against, Match.nextMatch() never returns. I'm no regex expert and I'm sure the regex can be optimized, but previous data sets weren't much more complex than what I've tried today. Furthermore, the regex works fine against the offending data set in a tool like RegexBuddy; it's only in .net (running in debug in Visual Studio) that it seems to hang. Nevertheless, if anyone can figure out how to tweak the regex to make it work, I'd really appreciate it. This is the regex: <tr>(<td[^>]*><a[^>]*>(?<callOptionTicker>[A-Z]{1,5}\d{6}C\d{8})</a></td>)(<td[^>]*>.*?</td>){6}(<td[^>]*><b><a[^>]*>(?<strikePrice>\d*\.\d*)</a></b></td>)(<td[^>]*><a[^>]*>(?<putOptionTicker>[A-Z]{1,5}\d{6}P\d{8})</a></td>) It's meant to extract put and call option tickers from a Yahoo option chain page (i.e., raw HTML). It works fine for IBM http://finance.yahoo.com/q/os?s=IBM&m=2010-05-21 It doesn't work for SPX options (this is the offending data set) http://finance.yahoo.com/q/os?s=I:SPX.W&m=2010-05

  • Regex to leave desired string remaining and others removed

    - by m7d
    In Ruby, what regex will strip out all but a desired string if present in the containing string? I know about /[^abc]/ for characters, but what about strings? Say I have the string "group=4&type_ids[]=2&type_ids[]=7&saved=1" and want to retain the pattern group=\d, if it is present in the string using only a regex? Currently, I am splitting on & and then doing a select with matching condition =~ /group=\d/ on the resulting enumerable collection. It works fine, but I'd like to know the regex to do this more directly.
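
    One way to read this question is that extracting the wanted match is simpler than deleting everything around it. The question is about Ruby; the sketch below uses Python's re module only to illustrate the idea, and the helper name keep_group is hypothetical.

```python
import re

# Sample query string taken from the question.
query = "group=4&type_ids[]=2&type_ids[]=7&saved=1"


def keep_group(s):
    """Return the first 'group=<digit>' substring, or '' if it is absent."""
    m = re.search(r"group=\d", s)
    return m.group(0) if m else ""


kept = keep_group(query)
print(kept)
```

    The design choice here is to search for what should remain instead of building a pattern for everything that should go, which avoids the split-and-select step.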

  • Trying to parse links in an HTML directory listing using Java regex

    - by DiskCrasher
    Ok I know everyone is going to tell me not to use RegEx for parsing HTML, but I'm programming on Android and don't have ready access to an HTML parser (that I'm aware of). Besides, this is server generated HTML which should be more consistent than user-generated HTML. The regex looks like this: Pattern patternMP3 = Pattern.compile( "<A HREF=\"[^\"]+.+\\.mp3</A>", Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE); Matcher matcherMP3 = patternMP3.matcher(HTML); while (matcherMP3.find()) { ... } The input HTML is all on one line, which is causing the problem. When the HTML is on separate lines this pattern works. Any suggestions?
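
    A likely cause, assuming the listing really is one long line, is that the greedy .+ runs from the first link to the last .mp3</A> on that line. The question uses Java; the sketch below shows the same idea in Python with negated character classes so that a match cannot cross a quote or a tag boundary. The sample HTML is made up for illustration.

```python
import re

# Hypothetical single-line directory listing (not from the original post).
html = (
    '<A HREF="a.mp3">a.mp3</A> '
    '<A HREF="notes.txt">notes.txt</A> '
    '<A HREF="b.mp3">b.mp3</A>'
)

# [^"]+ cannot run past the closing quote and [^<]* cannot run past the
# closing tag, so each match stays inside a single link.
link_re = re.compile(r'<A HREF="([^"]+\.mp3)">[^<]*</A>', re.IGNORECASE)

mp3_links = link_re.findall(html)
print(mp3_links)
```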

  • Haskell: 'No instance for' arising from a trivial usage of Regex library

    - by artemave
    Following the (accepted) answer from this question, I am expecting the following to work: Prelude Text.Regex.Posix Text.Regex.Base.RegexLike Text.Regex.Posix.String> makeRegex ".*" (makeRegex is a shortcut for makeRegexOpts with predefined options) However, it doesn't: <interactive>:1:0: No instance for (RegexMaker regex compOpt execOpt [Char]) arising from a use of `makeRegex' at <interactive>:1:0-13 Possible fix: add an instance declaration for (RegexMaker regex compOpt execOpt [Char]) In the expression: makeRegex ".*" In the definition of `it': it = makeRegex ".*" Prelude Text.Regex.Posix Text.Regex.Base.RegexLike Text.Regex.Posix.String> makeRegex ".*"::Regex <interactive>:1:0: No instance for (RegexMaker Regex compOpt execOpt [Char]) arising from a use of `makeRegex' at <interactive>:1:0-13 Possible fix: add an instance declaration for (RegexMaker Regex compOpt execOpt [Char]) In the expression: makeRegex ".*" :: Regex In the definition of `it': it = makeRegex ".*" :: Regex And I really don't understand why. EDIT: Haskell Platform 2009.02.02 (GHC 6.10.4) on Windows. EDIT2: Prelude Text.Regex.Base.RegexLike Text.Regex.Posix.String> :i RegexMaker class (RegexOptions regex compOpt execOpt) => RegexMaker regex compOpt execOpt source | regex -> compOpt execOpt, compOpt -> regex execOpt, execOpt -> regex compOpt where makeRegex :: source -> regex makeRegexOpts :: compOpt -> execOpt -> source -> regex makeRegexM :: (Monad m) => source -> m regex makeRegexOptsM :: (Monad m) => compOpt -> execOpt -> source -> m regex -- Defined in Text.Regex.Base.RegexLike

  • How to extract terms from an HTML document

    - by bookcasey
    I have a HTML document filled with terms that I need to put into a spreadsheet. They follow this basic pattern: <ul> <li class="name"><a href="spot.html">Spot</a></li> <li class="type">Dog</li> <li class="color">Red</li> </ul> <ul> <li class="name"><a href="mittens.html">Mittens</a></li> <li class="type">Cat</li> <li class="color">Brown</li> </ul> <ul> <li class="name"><a href="squakers.html">Squakers</a></li> <li class="type">Little Parrot</li> <li class="color">Rainbow</li> </ul> It's very consistent. I need to extract the string within the li.name a (so, "Spot") but only if the type is "Dog" or "Parrot", and put them in a spreadsheet. I've been trying to use Sublime Text's ability to Find with regex, but I'm really struggling, and since regex and HTML usually don't play nice, I was wondering if there is a better and easier way to accomplish this. Thanks.
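
    Since the markup is this regular, a small parser may be easier than an editor regex. The following is a minimal sketch using Python's standard html.parser module; the class name PetParser and the way records are grouped per <ul> are assumptions made for this example, not something from the original post.

```python
from html.parser import HTMLParser

# Sample markup copied from the question.
SAMPLE = """
<ul> <li class="name"><a href="spot.html">Spot</a></li>
<li class="type">Dog</li> <li class="color">Red</li> </ul>
<ul> <li class="name"><a href="mittens.html">Mittens</a></li>
<li class="type">Cat</li> <li class="color">Brown</li> </ul>
<ul> <li class="name"><a href="squakers.html">Squakers</a></li>
<li class="type">Little Parrot</li> <li class="color">Rainbow</li> </ul>
"""


class PetParser(HTMLParser):
    """Collect one dict per <ul>, keyed by the class of each <li>."""

    def __init__(self):
        super().__init__()
        self.current_class = None  # class of the <li> being read, if any
        self.record = {}
        self.records = []

    def handle_starttag(self, tag, attrs):
        if tag == "ul":
            self.record = {}
        elif tag == "li":
            self.current_class = dict(attrs).get("class")

    def handle_endtag(self, tag):
        if tag == "li":
            self.current_class = None
        elif tag == "ul":
            self.records.append(self.record)

    def handle_data(self, data):
        text = data.strip()
        if self.current_class and text:
            previous = self.record.get(self.current_class, "")
            self.record[self.current_class] = previous + text


parser = PetParser()
parser.feed(SAMPLE)

# Keep the name only when the type mentions "Dog" or "Parrot".
names = [
    r["name"]
    for r in parser.records
    if "Dog" in r.get("type", "") or "Parrot" in r.get("type", "")
]
print(names)
```

    The resulting list could then be written out with the csv module and opened in a spreadsheet.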

  • OWASP Regex Repository: Is this regex correct?

    - by Jacco
    I was looking at the regular expressions for validating various data types in the OWASP Regex Repository. One of the regular expressions in there is called safetext and looks like: ^[a-zA-Z0-9\s.\-]+$ My first question is: is this regular expression correct? Complementary question: is this Regex Repository any good at all?
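
    Whether the pattern is "correct" depends on what safetext is meant to allow, which the snippet does not say. As a small illustration only, the sketch below runs the pattern through Python's re module to show what it accepts; note in particular that \s also admits tabs and newlines.

```python
import re

# The safetext pattern exactly as quoted in the question.
SAFETEXT = re.compile(r"^[a-zA-Z0-9\s.\-]+$")


def is_safetext(value):
    """Return True when the whole value consists of allowed characters."""
    return SAFETEXT.match(value) is not None


print(is_safetext("Hello world 1.0"))
print(is_safetext("<script>"))
print(is_safetext("line one\nline two"))
```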

  • regex return conditional group

    - by priyank.mp
    Hi, I spent a lot of time trying to figure out a simple regex that returns a single group (only the 1st group). The string can be "No purchase required" or "Purchase of $50.00 worth groceries is required." I am trying to write a regex which can parse "No" or "50" based on the given string. This is what I have written: (?:(No) monthly maintenance|Purchase of \$([\d\.]+ worth groceries) This works fine, but I want my output in the 1st group/group 1 only.
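
    One possible approach, sketched here in Python because the question does not name a regex flavor, is to put both alternatives inside a single capturing group and move the surrounding text into lookarounds. The sketch assumes the two sentence shapes quoted in the question, and it returns "50.00" rather than "50" because the original character class [\d\.]+ includes the decimal part.

```python
import re

# Both alternatives live inside group 1; the context is matched by a
# lookahead ("No" case) or a fixed-width lookbehind (amount case).
pattern = re.compile(r"(No(?= purchase required)|(?<=Purchase of \$)[\d.]+)")


def first_group(text):
    """Return group 1 of the first match, or None when nothing matches."""
    m = pattern.search(text)
    return m.group(1) if m else None


print(first_group("No purchase required"))
print(first_group("Purchase of $50.00 worth groceries is required."))
```

    Flavors that support branch reset groups offer another route, but Python's re module does not have them, so lookarounds are used instead.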

  • Number nested ordered lists in HTML

    - by John
    Hi, I have a nested ordered list: <ol> <li>first</li> <li>second <ol> <li>second nested first element</li> <li>second nested second element</li> <li>second nested third element</li> </ol> </li> <li>third</li> <li>fourth</li> </ol> Currently the nested elements start back from 1 again, e.g. first second second nested first element second nested second element second nested third element third fourth What I want is for the elements nested under the second item to be numbered like this: first second 2.1. second nested first element 2.2. second nested second element 2.3. second nested third element third fourth Is there a way of doing this? Thanks

  • Replacing specific HTML tags using Regex

    - by matthewpe
    Alright, an easy one for you guys. We are using ActiveReport's RichTextBox to display some random bits of HTML code. The HTML tags supported by ActiveReport can be found here : http://www.datadynamics.com/Help/ARNET3/ar3conSupportedHtmlTagsInRichText.html An example of what I want to do is replace any match of <div style="text-align:*</div> by <p style=\"text-align:*</p> in order to use a supported tag for text-alignment. I have found the following regex expression to find the correct match in my html input: <div style=\"text-align:(.*?)</div> However, I can't find a way to keep the previous text contained in the tags after my replacement. Any clue? Is it me or Regex are generally a PITA? :) private static readonly IDictionary<string, string> _replaceMap = new Dictionary<string, string> { {"<div style=\"text-align:(.*?)</div>", "<p style=\"text-align:(.*?)</p>"} }; public static string FormatHtml(string html) { foreach(var pair in _replaceMap) { html = Regex.Replace(html, pair.Key, pair.Value); } return html; } Thanks!
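
    The usual way to keep the matched text is to refer to the capture group from the replacement string instead of repeating the pattern there. The question is about .NET, where the replacement syntax for the first group is $1; the sketch below shows the same idea with Python's re.sub, where it is written \1. The function name format_html mirrors the question's FormatHtml and is otherwise an assumption.

```python
import re


def format_html(html):
    """Swap <div style="text-align:..."> blocks for <p> while keeping content."""
    return re.sub(
        r'<div style="text-align:(.*?)</div>',
        r'<p style="text-align:\1</p>',  # \1 re-inserts what (.*?) captured
        html,
        flags=re.DOTALL,
    )


result = format_html('<div style="text-align:center">Hello</div>')
print(result)
```

    This sketch does not handle nested <div> elements, since the non-greedy group stops at the first </div> it finds.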

  • Find multiple regex in each line and skip result if one of the regex doesn't match

    - by williamx
    I have a list of variables: variables = ['VariableA', 'VariableB','VariableC'] which I'm going to search for, line by line ifile = open("temp.txt",'r') d = {} match = zeros(len(variables)) for line in ifile: emptyCells=0 for i in range(len(variables)): regex = r'('+variables[i]+r')[:|=|\(](-?\d+(?:\.\d+)?)(?:\))?' pattern_variable = re.compile(regex) match[i] = re.findall(pattern_variable, line) if match[j] == []: emptyCells = emptyCells+1 if emptyCells == 0: for k, v in match[j]: d.setdefault(k, []).append(v) The requirement is that I will only keep the lines where all the regex'es matches! I want to collect all results for each variable in a dictionary where the variable name is the key, and the value becomes a list of all matches. The code provided is only what I've found out so far, and is not working perfectly yet...
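
    A minimal sketch of the stated requirement (keep a line only when every variable matches, then collect values per variable) might look like the following. It assumes the separators are ":", "=" or "(" and drops the literal "|" that the original character class also allowed; the sample lines are invented for the example.

```python
import re
from collections import defaultdict

variables = ["VariableA", "VariableB", "VariableC"]

# One compiled pattern per variable; group 1 is the numeric value.
patterns = {
    name: re.compile(re.escape(name) + r"[:=(](-?\d+(?:\.\d+)?)\)?")
    for name in variables
}


def collect(lines):
    """Gather values per variable, using only lines where all variables match."""
    results = defaultdict(list)
    for line in lines:
        found = {name: pat.findall(line) for name, pat in patterns.items()}
        if all(found.values()):  # an empty list means that variable is missing
            for name, values in found.items():
                results[name].extend(values)
    return dict(results)


# Hypothetical input: the second line lacks VariableC and is skipped.
sample = [
    "VariableA=1 VariableB:2.5 VariableC(-3)",
    "VariableA=9 VariableB=8",
]
collected = collect(sample)
print(collected)
```

    To read from a file as in the question, the list could be replaced by an open file object, since iterating over it yields lines.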

  • Force page reload with html anchors (#) - HTML & JS

    - by yuval
    Say I'm on a page called /example#myanchor1 where myanchor1 is an anchor in the page. I'd like to link to /example#myanchor2, but force the page to reload while doing so. The reason is that I run JS to detect the anchor from the URL at page load. The problem (normally expected behavior) here, though, is that the browser just sends me to that specific anchor on the page without reloading the page. How would I go about doing so? (JS is OK.) Thanks!

  • Using Regex, how can I remove certain characters from inside angle-brackets, leaving the characters

    - by Iain Fraser
    Edit: To be clear, please understand that I am not using Regex to parse the html, that's crazy talk! I'm simply wanting to clean up a messy string of html so it will parse Edit #2: I should also point out that the control character I'm using is a special unicode character - it's not something that would ever be used in a proper tag under any normal circumstances Suppose I have a string of html that contains a bunch of control characters and I want to remove the control characters from inside tags only, leaving the characters outside the tags alone. For example Here the control character is the numeral "1". Input The quick 1<strong>orange</strong> lemming <sp11a1n 1class1='jumpe111r'11>jumps over</span> 1the idle 1frog Desired Output The quick 1<strong>orange</strong> lemming <span class='jumper'>jumps over</span> 1the idle 1frog So far I can match tags which contain the control character but I can't remove them in one regex. I guess I could perform another regex on my matches, but I'd really like to know if there's a better way. My regex Bear in mind this one only matches tags which contain the control character. <(([^>])*?`([^>])*?)*?> Thanks very much for your time and consideration. Iain Fraser
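
    One common way to do this in a single pass is to match each whole tag and clean it in a replacement callback, rather than trying to express "inside a tag" in the pattern itself. The sketch below uses Python's re.sub with a function as the replacement and follows the question's example, where the control character is the numeral "1". It assumes that no attribute value contains a literal ">".

```python
import re

CONTROL = "1"  # stand-in control character, as in the question's example


def clean_tags(html, control=CONTROL):
    """Remove the control character from inside <...> spans only."""
    return re.sub(
        r"<[^>]*>",
        lambda m: m.group(0).replace(control, ""),
        html,
    )


dirty = (
    "The quick 1<strong>orange</strong> lemming "
    "<sp11a1n 1class1='jumpe111r'11>jumps over</span> 1the idle 1frog"
)
cleaned = clean_tags(dirty)
print(cleaned)
```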

  • PHP regex help -- reverse search?

    - by Ian Silber
    So, I have a regex that searches for HTML tags and modifies them slightly. It's working great, but I need to do something special with the last closing HTML tag I find. Not sure of the best way to do this. I'm thinking some sort of reverse regex, but haven't found a way to do that. Here's my code so far: $html = '<div id="test"><p style="hello_world">This is a test.</p></div>'; $pattern = array('/<([A-Z][A-Z0-9]*)(\b[^>]*)>/i'); $replace = array('<tag>'); $html = preg_replace($pattern,$replace,$html); // Outputs: <tag><tag>This is a test.</p></div> I'd like to replace the last occurrence of "" with something special, say for example, "". Any ideas?
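
    A "reverse" search is not strictly needed: a negative lookahead can assert that no other closing tag follows the one being matched. The question is about PHP; the sketch below shows the idea with Python's re module, and the replacement <last/> is a hypothetical placeholder, since the tags in the original post were lost.

```python
import re

html = '<div id="test"><p style="hello_world">This is a test.</p></div>'

# Match a closing tag only if no further closing tag appears after it.
last_close = re.compile(r"</[A-Za-z][A-Za-z0-9]*>(?!.*</[A-Za-z])", re.DOTALL)

replaced = last_close.sub("<last/>", html)
print(replaced)
```

    The same lookahead should carry over to preg_replace with the s modifier, though that variant is untested here.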

  • Vim Regex to replace tags

    - by Rudiger Wolf
    I am looking for a regex expression to remove the email addresses from a text file. Input file: Hannah Churchman <[email protected]>; Julie Drew <[email protected]>; Output file: Hannah Churchman; Julie Drew; I thought a generic regex such as s/<(.*?)//g would be a good starting point, but I am unable to find the right expression for use in Vim. Something like :%s/ <\(.*?\)>//g does not work; the error is "E486: Pattern not found:". :%s#[^ <]*>##g almost works but it leaves the space and < behind, so I follow it with :%s# <##g to remove the remaining " <". Any tips on how to better craft this command?

  • Parsing HTML Documents with the Html Agility Pack

    Screen scraping is the process of programmatically accessing and processing information from an external website. For example, a price comparison website might screen scrape a variety of online retailers to build a database of products and what various retailers are selling them for. Typically, screen scraping is performed by mimicking the behavior of a browser: making an HTTP request from code and then parsing and analyzing the returned HTML. The .NET Framework offers a variety of classes for accessing data from a remote website, namely the WebClient class and the HttpWebRequest class. These classes are useful for making an HTTP request to a remote website and pulling down the markup from a particular URL, but they offer no assistance in parsing the returned HTML. Instead, developers commonly rely on string parsing methods like String.IndexOf, String.Substring, and the like, or on regular expressions. Another option for parsing HTML documents is to use the Html Agility Pack, a free, open-source library designed to simplify reading from and writing to HTML documents. The Html Agility Pack constructs a Document Object Model (DOM) view of the HTML document being parsed. With a few lines of code, developers can walk through the DOM, moving from a node to its children, or vice versa. Also, the Html Agility Pack can return specific nodes in the DOM through the use of XPath expressions. (The Html Agility Pack also includes a class for downloading an HTML document from a remote website; this means you can both download and parse an external web page using the Html Agility Pack.) This article shows how to get started using the Html Agility Pack and includes a number of real-world examples that illustrate this library's utility. A complete, working demo is available for download at the end of this article.

  • Wishful Thinking: Why can't HTML fix Script Attacks at the Source?

    - by Rick Strahl
    The Web can be an evil place, especially if you're a Web Developer blissfully unaware of Cross Site Script Attacks (XSS). Even if you are aware of XSS in all of its insidious forms, it's extremely complex to deal with all the issues if you're taking user input and you're actually allowing users to post raw HTML into an application. I'm dealing with this again today in a Web application where legacy data contains raw HTML that has to be displayed and users ask for the ability to use raw HTML as input for listings. The first line of defense of course is: Just say no to HTML input from users. If you don't allow HTML input directly and use HTML Encoding (HttpUtility.HtmlEncode() in .NET or using standard ASP.NET MVC output @Model.Content) you're fairly safe at least from the HTML input provided. Both WebForms and Razor support HtmlEncoded content, although Razor makes it the default. In Razor the default @ expression syntax: @Model.UserContent automatically produces HTML encoded content - you actually have to go out of your way to create raw HTML content (safe by default) using @Html.Raw() or the HtmlString class. In Web Forms (V4) you can use: <%: Model.UserContent %> or if you're using a version prior to 4.0: <%= HttpUtility.HtmlEncode(Model.UserContent) %> This works great as a hedge against embedded <script> tags and HTML markup, as any HTML is turned into text that displays as HTML but doesn't render the HTML. But it turns any embedded HTML markup tags into plain text. If you need to display HTML in raw form, with the markup tags rendering based on user input, this approach is worthless. If you do accept HTML input and need to echo the rendered HTML input back, cleaning up that HTML is a complex task. In the projects I work on, customers frequently ask for the ability to post raw HTML.
Almost every app that I've built where there's document content from users starts out with text only input - possibly using something like MarkDown - but inevitably users want to just post plain old HTML they created in some other rich editing application. I see this a lot with realtors especially, who often want to reuse their postings easily in multiple places. In my work this is a common problem I need to deal with and I've tried dozens of different methods, from sanitizing and simple rejection of input to custom markup schemes, none of which have ever felt comfortable to me. They work in a half assed, hacked together sort of way but I always live in fear of missing something vital, which is *really easy to do*. My Wishlist Item: A <restricted> tag in HTML Let me dream here for a second on how to address this problem. It seems to me the easiest place where this can be fixed is: In the browser. Browsers are actually executing script code so they have a lot of control over the script code that resides in a page. What if there was a way to specify that you want to turn off script code for a block of HTML? The main issue when dealing with HTML raw input isn't that we as developers are unaware of the implications of user input, but the fact that we sometimes have to display raw HTML input the user provides. So the problem markup is usually isolated in only a very specific part of the document. So, what if we had a way to specify that in any given HTML block, no script code could execute by wrapping it into a tag that disables all script functionality in the browser? This would include <script> tags and any document script attributes like onclick, onfocus etc. and potentially also disallow things like iFrames that can potentially be scripted from within the iFrame's target.
I'd like to see something along these lines: <article> <restricted allowscripts="no" allowiframes="no"> <div>Some content</div> <script>alert('go ahead make my day, punk!');</script> <div onfocus="$.getJSON('http://evilsite.com/')">more content</div> </restricted> </article> A tag like this would basically disallow all script code from firing from any HTML that's rendered within it. You'd use this only on content that you actually render from your data and only if you are dealing with custom data. So something like this: <article> <restricted> @Html.Raw(Model.UserContent) </restricted> </article> For browsers this would actually be easy to intercept. They render the DOM and control loading and execution of scripts that are loaded through it. All the browser would have to do is suspend execution of <script> tags and not hook up any event handlers defined via markup in this block. Given all the crazy XSS attacks that exist and the prevalence of this problem this would go a long way towards preventing at least coded script attacks in the DOM. And it seems like a totally doable solution that wouldn't be very difficult to implement by vendors. There would also need to be some logic in the parser to not allow an </restricted> or <restricted> tag into the content, so that user content can't short-circuit the restricted section (per James Hart's comment). I'm sure there are other issues to consider as well that I didn't think of in my off-the-back-of-a-napkin concept here but the idea overall seems worth consideration I think. Without code running in a user supplied HTML block it'd be pretty hard to compromise a local HTML document and pass information like Cookies to a server. Or even send data to a server period. Short of an iFrame that can access the parent frame (which is another restriction that should be available on this <restricted> tag) that could potentially communicate back, there's not a lot a malicious site could do.
The HTML could still 'phone home' via image links and href links potentially and basically say this site was accessed, but without the ability to run script code it would be pretty tough to pass along critical information to the server beyond that. Ahhhh… one can dream… Not holding my breath of course. The design by committee that is the W3C can't agree on anything in timeframes measured less than decades, but maybe this is one place where browser vendors can actually step up the pressure. This is something in their best interest to reduce the attack surface for vulnerabilities on their browser platforms significantly. Several people commented on Twitter today that there isn't enough discussion on issues like this that address serious needs in the web browser space. Realistically security has to be a number one concern with Web applications in general - there isn't a Web app out there that is not vulnerable. And yet nothing has been done to address these security issues even though there might be relatively easy solutions to make this happen. It'll take time, and it's probably not going to happen in our lifetime, but maybe this rambling thought sparks some ideas on how this sort of restriction can get into browsers in some way in the future. © Rick Strahl, West Wind Technologies, 2005-2012. Posted in ASP.NET, HTML5, HTML, Security

  • Complex(?) regex: Is expression, but not another

    - by Kieron
    (If you can make a better title, please do) Hi, I need to make sure a string matches the following regex: ^[0-9a-zA-Z]{1}[0-9a-zA-Z\.\-_]*$ (Starts with a letter or number, then any number of letters, numbers, dots, dashes or underscores) But given that, I need to make sure it doesn't match a Guid, my Guid matching reg-ex looks like this (obviously, this needs to be negated in the merged result): ^([0-9a-fA-F]){8}-([0-9a-fA-F]){4}-([0-9a-fA-F]){4}-([0-9a-fA-F]){4}-([0-9a-fA-F]){12}$ The last requirement here is that they must (if it's possible) be merged into a single expression.
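
    The two expressions can usually be merged by placing the Guid pattern inside a negative lookahead at the start of the name pattern. The following is a minimal sketch in Python; the question does not name a regex flavor, so the assumption is one that supports lookahead, and the names NAME_NOT_GUID and is_valid are hypothetical.

```python
import re

NAME_NOT_GUID = re.compile(
    # Reject the input up front when the whole string is a Guid.
    r"^(?![0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$)"
    # Otherwise apply the original name rule.
    r"[0-9a-zA-Z][0-9a-zA-Z._-]*$"
)


def is_valid(s):
    """True for names that satisfy the name rule and are not a Guid."""
    return NAME_NOT_GUID.match(s) is not None


print(is_valid("report-2010.txt"))
print(is_valid("3f2504e0-4f89-11d3-9a0c-0305e82c3301"))
```

    Because the lookahead ends in $, a string that merely starts with a Guid but continues with more characters is still accepted.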

  • A pattern matching an expression that doesn't end with specific sequence

    - by patryk
    I need a regex pattern which matches such strings that DO NOT end with such a sequence: \.[A-z0-9]{2,} by which I mean the examined string must not have at its end a sequence of a dot and then two or more alphanumeric characters. For example, a string /home/patryk/www and also /home/patryk/www/ should match desired pattern and /home/patryk/images/DSC002.jpg should not. I suppose this has something to do with lookarounds (look aheads) but still I have no idea how to make it. Any help appreciated.
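
    A negative lookahead anchored at the start of the string is one way to express "does not end with". The sketch below is in Python and uses [A-Za-z0-9] instead of the question's [A-z0-9], because the range A-z also covers a few punctuation characters that sit between Z and a in ASCII; that substitution is an assumption about the intent.

```python
import re

# The lookahead scans the whole string for ".<2+ alphanumerics>" at the end
# and rejects the string if it finds one.
NO_EXTENSION = re.compile(r"^(?!.*\.[A-Za-z0-9]{2,}$).*$")


def lacks_extension(path):
    """True when the path does not end in a dot plus 2+ alphanumerics."""
    return NO_EXTENSION.match(path) is not None


for p in ("/home/patryk/www", "/home/patryk/www/", "/home/patryk/images/DSC002.jpg"):
    print(p, lacks_extension(p))
```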
