Search Results

Search found 33668 results on 1347 pages for 'html to txt'.

Page 3/1347 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >

  • Parsing HTML Documents with the Html Agility Pack

    Screen scraping is the process of programmatically accessing and processing information from an external website. For example, a price comparison website might screen scrape a variety of online retailers to build a database of products and what various retailers are selling them for. Typically, screen scraping is performed by mimicking the behavior of a browser - namely, by making an HTTP request from code and then parsing and analyzing the returned HTML. The .NET Framework offers a variety of classes for accessing data from a remote website, namely the WebClient class and the HttpWebRequest class. These classes are useful for making an HTTP request to a remote website and pulling down the markup from a particular URL, but they offer no assistance in parsing the returned HTML. Instead, developers commonly rely on string parsing methods like String.IndexOf, String.Substring, and the like, or through the use of regular expressions. Another option for parsing HTML documents is to use the Html Agility Pack, a free, open-source library designed to simplify reading from and writing to HTML documents. The Html Agility Pack constructs a Document Object Model (DOM) view of the HTML document being parsed. With a few lines of code, developers can walk through the DOM, moving from a node to its children, or vice versa. Also, the Html Agility Pack can return specific nodes in the DOM through the use of XPath expressions. (The Html Agility Pack also includes a class for downloading an HTML document from a remote website; this means you can both download and parse an external web page using the Html Agility Pack.) This article shows how to get started using the Html Agility Pack and includes a number of real-world examples that illustrate this library's utility. A complete, working demo is available for download at the end of this article. Read on to learn more! Read More >

    Read the article

  • Googlebot cant access my site webmaster tools reply Unreachable robots.txt

    - by Ahmad Ahmadi
    When I try to fetch my site as a googlebot in webmaster tools it return Unreachable robots.txt, after investigate I understood google bot can see my server: tcpdump | grep google it return that google can access my server with IP 66.249.81.172 or 66.249.75.111. but there is not any think in access log or error log or other apache logs. cat access_log | grep google or cat error_log | grep 66.249.81.172 Other bot (bing,...) can access apache but google cant. there is not any problem in my robots.txt or its permissions because as you know robots.txt is not necessary so I delete it but again webmaster tools returned Unreachable robots.txt not 404 not found! information about server: Server OS : CentOS 6 Web Server : Apache 2.x Firewall : IPTables is stoped SELinux is Disabled There is not any think else for security on my server. how can I investigate the problem and is there any other command that can help me to find the problem.

    Read the article

  • Amazon SES domain verification TXT DNS record

    - by Skittles
    I currently am trying to get my domain verified on Amazon's SES and running int a problem that google searches are not helping me get any closer to solving. According to SES, I have to create a TXT record in my DNS for the domain I'm trying to verify. Amazon gives you the following (value changed for security purposes); TYPE: TXT NAME: _amazonses.somedomain.com VALUE: M2sXTycXkgZXXuMuWI8TczngaPIDDMToPefzGhZ3uYA= I have tried numerous entries in my registrar's DNS manager, but SES still fails to find what it's looking for. I am not a DNS guru, so, I have tried to construct the TXT record from very sparse examples, at best, to try to get this right. My present TXT record is this; "v=DKIM1 s=_domainkey d=_amazonses.somedomain.com p=M2sXTycXkgZXXuMuWI8TczngaPIDDMToPefzGhZ3uYA=" Am I doing something incorrect? Thanks

    Read the article

  • apache robots.txt with SSL

    - by user224013
    I have an .htaccess file with a rewrite rule to get a redirect of every HTTP request to HTTPS. But now I have a problem that my robots.txt is not recognized by some online checker. If I remove the redirect from the .htaccess file the robots.txt is recognized correctly. Maybe I should exclude that the robots.txt is redirects to an HTTPS connection? This is the part of .htaccess for redirecting to HTTPS RewriteCond %{SERVER_PORT} !^443$ RewriteRule (.*) https://%{HTTP_HOST}/$1 [L]

    Read the article

  • Wishful Thinking: Why can't HTML fix Script Attacks at the Source?

    - by Rick Strahl
    The Web can be an evil place, especially if you're a Web Developer blissfully unaware of Cross Site Script Attacks (XSS). Even if you are aware of XSS in all of its insidious forms, it's extremely complex to deal with all the issues if you're taking user input and you're actually allowing users to post raw HTML into an application. I'm dealing with this again today in a Web application where legacy data contains raw HTML that has to be displayed and users ask for the ability to use raw HTML as input for listings. The first line of defense of course is: Just say no to HTML input from users. If you don't allow HTML input directly and use HTML Encoding (HttyUtility.HtmlEncode() in .NET or using standard ASP.NET MVC output @Model.Content) you're fairly safe at least from the HTML input provided. Both WebForms and Razor support HtmlEncoded content, although Razor makes it the default. In Razor the default @ expression syntax:@Model.UserContent automatically produces HTML encoded content - you actually have to go out of your way to create raw HTML content (safe by default) using @Html.Raw() or the HtmlString class. In Web Forms (V4) you can use:<%: Model.UserContent %> or if you're using a version prior to 4.0:<%= HttpUtility.HtmlEncode(Model.UserContent) %> This works great as a hedge against embedded <script> tags and HTML markup as any HTML is turned into text that displays as HTML but doesn't render the HTML. But it turns any embedded HTML markup tags into plain text. If you need to display HTML in raw form with the markup tags rendering based on user input this approach is worthless. If you do accept HTML input and need to echo the rendered HTML input back, the task of cleaning up that HTML is a complex task. In the projects I work on, customers are frequently asking for the ability to post raw HTML quite frequently.  Almost every app that I've built where there's document content from users we start out with text only input - possibly using something like MarkDown - but inevitably users want to just post plain old HTML they created in some other rich editing application. See this a lot with realtors especially who often want to reuse their postings easily in multiple places. In my work this is a common problem I need to deal with and I've tried dozens of different methods from sanitizing, simple rejection of input to custom markup schemes none of which have ever felt comfortable to me. They work in a half assed, hacked together sort of way but I always live in fear of missing something vital which is *really easy to do*. My Wishlist Item: A <restricted> tag in HTML Let me dream here for a second on how to address this problem. It seems to me the easiest place where this can be fixed is: In the browser. Browsers are actually executing script code so they have a lot of control over the script code that resides in a page. What if there was a way to specify that you want to turn off script code for a block of HTML? The main issue when dealing with HTML raw input isn't that we as developers are unaware of the implications of user input, but the fact that we sometimes have to display raw HTML input the user provides. So the problem markup is usually isolated in only a very specific part of the document. So, what if we had a way to specify that in any given HTML block, no script code could execute by wrapping it into a tag that disables all script functionality in the browser? This would include <script> tags and any document script attributes like onclick, onfocus etc. and potentially also disallow things like iFrames that can potentially be scripted from the within the iFrame's target. I'd like to see something along these lines:<article> <restricted allowscripts="no" allowiframes="no"> <div>Some content</div> <script>alert('go ahead make my day, punk!");</script> <div onfocus="$.getJson('http://evilsite.com/')">more content</div> </restricted> </article> A tag like this would basically disallow all script code from firing from any HTML that's rendered within it. You'd use this only on code that you actually render from your data only and only if you are dealing with custom data. So something like this:<article> <restricted> @Html.Raw(Model.UserContent) </restricted> </article> For browsers this would actually be easy to intercept. They render the DOM and control loading and execution of scripts that are loaded through it. All the browser would have to do is suspend execution of <script> tags and not hookup any event handlers defined via markup in this block. Given all the crazy XSS attacks that exist and the prevalence of this problem this would go a long way towards preventing at least coded script attacks in the DOM. And it seems like a totally doable solution that wouldn't be very difficult to implement by vendors. There would also need to be some logic in the parser to not allow an </restricted> or <restricted> tag into the content as to short-circuit the rstricted section (per James Hart's comment). I'm sure there are other issues to consider as well that I didn't think of in my off-the-back-of-a-napkin concept here but the idea overall seems worth consideration I think. Without code running in a user supplied HTML block it'd be pretty hard to compromise a local HTML document and pass information like Cookies to a server. Or even send data to a server period. Short of an iFrame that can access the parent frame (which is another restriction that should be available on this <restricted> tag) that could potentially communicate back, there's not a lot a malicious site could do. The HTML could still 'phone home' via image links and href links potentially and basically say this site was accessed, but without the ability to run script code it would be pretty tough to pass along critical information to the server beyond that. Ahhhh… one can dream… Not holding my breath of course. The design by committee that is the W3C can't agree on anything in timeframes measured less than decades, but maybe this is one place where browser vendors can actually step up the pressure. This is something in their best interest to reduce the attack surface for vulnerabilities on their browser platforms significantly. Several people commented on Twitter today that there isn't enough discussion on issues like this that address serious needs in the web browser space. Realistically security has to be a number one concern with Web applications in general - there isn't a Web app out there that is not vulnerable. And yet nothing has been done to address these security issues even though there might be relatively easy solutions to make this happen. It'll take time, and it's probably not going to happen in our lifetime, but maybe this rambling thought sparks some ideas on how this sort of restriction can get into browsers in some way in the future.© Rick Strahl, West Wind Technologies, 2005-2012Posted in ASP.NET  HTML5  HTML  Security   Tweet !function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src="//platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs"); (function() { var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true; po.src = 'https://apis.google.com/js/plusone.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s); })();

    Read the article

  • Is there a way to disallow only crawling in https in robots.txt?

    - by David Wilkins
    I just realized that Bingbot is crawling my company's website's pages over https. Bing already crawls the site over http, so this seems frivolous. Is there a way to specify Disallow: / for https only? According to Wikipedia, each protocol has its own robots.txt And according to Google's Robots.txt Specification, the robots.txt applies to http AND https I don't want to Disallow: / for Bing totally, just over https.

    Read the article

  • Disallow robots.txt from being accessed in a browser but still accessible by spiders?

    - by Michael Irigoyen
    We make use of the robots.txt file to prevent Google (and other search spiders) from crawling certain pages/directories in our domain. Some of these directories/files are secret, meaning they aren't linked (except perhaps on other pages encompassed by the robots.txt file). Some of these directories/files aren't secret, we just don't want them indexed. If somebody browses directly to www.mydomain.com/robots.txt, they can see the contents of the robots.txt file. From a security standpoint, this is not something we want publicly available to anybody. Any directories that contain secure information are set behind authentication, but we still don't want them to be discoverable unless the user specifically knows about them. Is there a way to provide a robots.txt file but to have it's presence masked by John Doe accessing it from his browser? Perhaps by using PHP to generate the document based on certain criteria? Perhaps something I'm not thinking of? We'd prefer a way to centrally do it (meaning a <meta> tag solution is less than ideal).

    Read the article

  • Is there a way to disallow crawling of only HTTPS in robots.txt?

    - by David Wilkins
    I just realized that Bingbot is crawling my company's website's pages over https. Bing already crawls the site over http, so this seems frivolous. Is there a way to specify Disallow: / for https only? According to Wikipedia, each protocol has its own robots.txt And according to Google's Robots.txt Specification, the robots.txt applies to http AND https I don't want to Disallow: / for Bing totally, just over https.

    Read the article

  • How do I work around an HTML Table rendering bug in IE 7?

    - by osmaniac
    I have a table. Some cells span multiple columns. Some cells span multiple rows and columns. But one row (which spans all columns but the rightmost one) creates an artifact. Part of the text in the cell is erroneously repeated left justified on a new row just below the table. I'm baffled. I tried rendering with and without "table-layout: fixed;". Same result. When I originally composed the design using just HTML and CSS, it looked great. But then I worked it into a page and had to add more columns to my master table the right to hold buttons. These buttons are in three groups, each having their own div to control floating and rewrapping when the window gets narrower. One div has another table inside it that groups a single row of buttons. Thus I have table inside div inside td inside outer table. I would prefer a simpler design, but how? This is what I want to have: ................................................................................... . . . . Four rows of data . Three groups of buttons that can reflow . . With several columns . if window gets narrower . . meticulously layed out, . . . That should not resize . . . when window gets narrower . . ................................................................................... . One more row of data spanning the whole screen which stays below the buttons . ................................................................................... What I was doing was putting the three divs with the buttons in the upper right in a single cell that spanned four rows. What other opportunities does CSS offer? The buttons are not allowed to overlap the data on the left or go past the data line below. The original design had the divs with the buttons NOT in a table with the data, but when the window gets narrow, some of the buttons flow such that they go underneath the data, which looks bad. I would post actual HTML, except it is generated by ASP, huge, with lots of CSS styling, and the feature that lets me view the final HTML is not working at the moment. (Built in security in the application.)

    Read the article

  • Bitnami redmine error SVN

    - by Evgeniy
    I'm installing the Bitnami Redmine stack (redmine + subversion). Firstly I install configure and test it locally (Ubuntu 14.04 LTS). And everything is OK. I install Bitnami stack on server (Red Hat 4.4.7-4) and configure SVN. I commit files into SVN and connect project into Redmine with SVN repository, but when I try see it Rredmine displays 404 error. In the Redmine log file I see the following errors: Started GET "/redmine/projects/web-user-panel/repository" for 127.0.0.1 at 2014-04-24 11:34:20 +0300 Processing by RepositoriesController#show as HTML Parameters: {"id"=>"web-user-panel"} Current user: user (id=13) Error parsing svn output: #<REXML::ParseException: No close tag for /lists/list> /var/www/html/redmine/ruby/lib/ruby/1.9.1/rexml/parsers/treeparser.rb:28:in `parse' /var/www/html/redmine/ruby/lib/ruby/1.9.1/rexml/document.rb:245:in `build' /var/www/html/redmine/ruby/lib/ruby/1.9.1/rexml/document.rb:43:in `initialize' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/activesupport-3.2.17/lib/active_support/xml_mini/rexml.rb:30:in `new' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/activesupport-3.2.17/lib/active_support/xml_mini/rexml.rb:30:in `parse' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/activesupport-3.2.17/lib/active_support/xml_mini.rb:80:in `parse' /var/www/html/redmine/apps/redmine/htdocs/lib/redmine/scm/adapters/abstract_adapter.rb:313:in `parse_xml' /var/www/html/redmine/apps/redmine/htdocs/lib/redmine/scm/adapters/subversion_adapter.rb:106:in `block in entries' /var/www/html/redmine/apps/redmine/htdocs/lib/redmine/scm/adapters/abstract_adapter.rb:258:in `call' /var/www/html/redmine/apps/redmine/htdocs/lib/redmine/scm/adapters/abstract_adapter.rb:258:in `block in shellout' /var/www/html/redmine/apps/redmine/htdocs/lib/redmine/scm/adapters/abstract_adapter.rb:255:in `popen' /var/www/html/redmine/apps/redmine/htdocs/lib/redmine/scm/adapters/abstract_adapter.rb:255:in `shellout' /var/www/html/redmine/apps/redmine/htdocs/lib/redmine/scm/adapters/abstract_adapter.rb:212:in `shellout' /var/www/html/redmine/apps/redmine/htdocs/lib/redmine/scm/adapters/subversion_adapter.rb:100:in `entries' /var/www/html/redmine/apps/redmine/htdocs/app/models/repository.rb:198:in `scm_entries' /var/www/html/redmine/apps/redmine/htdocs/app/models/repository.rb:203:in `entries' /var/www/html/redmine/apps/redmine/htdocs/app/controllers/repositories_controller.rb:116:in `show' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/action_controller/metal/implicit_render.rb:4:in `send_action' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/abstract_controller/base.rb:167:in `process_action' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/action_controller/metal/rendering.rb:10:in `process_action' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/abstract_controller/callbacks.rb:18:in `block in process_action' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/activesupport-3.2.17/lib/active_support/callbacks.rb:491:in `_run__2883861927089110970__process_action__2542827355008294621__callbacks' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/activesupport-3.2.17/lib/active_support/callbacks.rb:405:in `__run_callback' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/activesupport-3.2.17/lib/active_support/callbacks.rb:385:in `_run_process_action_callbacks' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/activesupport-3.2.17/lib/active_support/callbacks.rb:81:in `run_callbacks' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/abstract_controller/callbacks.rb:17:in `process_action' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/action_controller/metal/rescue.rb:29:in `process_action' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/action_controller/metal/instrumentation.rb:30:in `block in process_action' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/activesupport-3.2.17/lib/active_support/notifications.rb:123:in `block in instrument' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/activesupport-3.2.17/lib/active_support/notifications/instrumenter.rb:20:in `instrument' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/activesupport-3.2.17/lib/active_support/notifications.rb:123:in `instrument' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/action_controller/metal/instrumentation.rb:29:in `process_action' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/action_controller/metal/params_wrapper.rb:207:in `process_action' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/activerecord-3.2.17/lib/active_record/railties/controller_runtime.rb:18:in `process_action' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/abstract_controller/base.rb:121:in `process' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/abstract_controller/rendering.rb:45:in `process' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/action_controller/metal.rb:203:in `dispatch' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/action_controller/metal/rack_delegation.rb:14:in `dispatch' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/action_controller/metal.rb:246:in `block in action' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/action_dispatch/routing/route_set.rb:73:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/action_dispatch/routing/route_set.rb:73:in `dispatch' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/action_dispatch/routing/route_set.rb:36:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/journey-1.0.4/lib/journey/router.rb:68:in `block in call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/journey-1.0.4/lib/journey/router.rb:56:in `each' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/journey-1.0.4/lib/journey/router.rb:56:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/action_dispatch/routing/route_set.rb:608:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/rack-openid-1.3.1/lib/rack/openid.rb:98:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/action_dispatch/middleware/best_standards_support.rb:17:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/rack-1.4.5/lib/rack/etag.rb:23:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/rack-1.4.5/lib/rack/conditionalget.rb:25:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/action_dispatch/middleware/head.rb:14:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/action_dispatch/middleware/params_parser.rb:21:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/action_dispatch/middleware/flash.rb:242:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/rack-1.4.5/lib/rack/session/abstract/id.rb:210:in `context' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/rack-1.4.5/lib/rack/session/abstract/id.rb:205:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/action_dispatch/middleware/cookies.rb:341:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/activerecord-3.2.17/lib/active_record/query_cache.rb:64:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/activerecord-3.2.17/lib/active_record/connection_adapters/abstract/connection_pool.rb:479:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/action_dispatch/middleware/callbacks.rb:28:in `block in call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/activesupport-3.2.17/lib/active_support/callbacks.rb:405:in `_run__1805290955544829105__call__1486932417638469082__callbacks' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/activesupport-3.2.17/lib/active_support/callbacks.rb:405:in `__run_callback' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/activesupport-3.2.17/lib/active_support/callbacks.rb:385:in `_run_call_callbacks' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/activesupport-3.2.17/lib/active_support/callbacks.rb:81:in `run_callbacks' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/action_dispatch/middleware/callbacks.rb:27:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/action_dispatch/middleware/remote_ip.rb:31:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/action_dispatch/middleware/debug_exceptions.rb:16:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/action_dispatch/middleware/show_exceptions.rb:56:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/railties-3.2.17/lib/rails/rack/logger.rb:32:in `call_app' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/railties-3.2.17/lib/rails/rack/logger.rb:16:in `block in call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/activesupport-3.2.17/lib/active_support/tagged_logging.rb:22:in `tagged' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/railties-3.2.17/lib/rails/rack/logger.rb:16:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/action_dispatch/middleware/request_id.rb:22:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/rack-1.4.5/lib/rack/methodoverride.rb:21:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/rack-1.4.5/lib/rack/runtime.rb:17:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/activesupport-3.2.17/lib/active_support/cache/strategy/local_cache.rb:72:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/rack-1.4.5/lib/rack/lock.rb:15:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.17/lib/action_dispatch/middleware/static.rb:63:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/rack-cache-1.2/lib/rack/cache/context.rb:136:in `forward' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/rack-cache-1.2/lib/rack/cache/context.rb:245:in `fetch' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/rack-cache-1.2/lib/rack/cache/context.rb:185:in `lookup' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/rack-cache-1.2/lib/rack/cache/context.rb:66:in `call!' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/rack-cache-1.2/lib/rack/cache/context.rb:51:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/railties-3.2.17/lib/rails/engine.rb:484:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/railties-3.2.17/lib/rails/application.rb:231:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/railties-3.2.17/lib/rails/railtie/configurable.rb:30:in `method_missing' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/rack-1.4.5/lib/rack/builder.rb:134:in `call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/rack-1.4.5/lib/rack/urlmap.rb:64:in `block in call' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/rack-1.4.5/lib/rack/urlmap.rb:49:in `each' /var/www/html/redmine/apps/redmine/htdocs/vendor/bundle/ruby/1.9.1/gems/rack-1.4.5/lib/rack/urlmap.rb:49:in `call' /var/www/html/redmine/ruby/lib/ruby/gems/1.9.1/gems/passenger-4.0.40/lib/phusion_passenger/rack/thread_handler_extension.rb:74:in `process_request' /var/www/html/redmine/ruby/lib/ruby/gems/1.9.1/gems/passenger-4.0.40/lib/phusion_passenger/request_handler/thread_handler.rb:141:in `accept_and_process_next_request' /var/www/html/redmine/ruby/lib/ruby/gems/1.9.1/gems/passenger-4.0.40/lib/phusion_passenger/request_handler/thread_handler.rb:109:in `main_loop' /var/www/html/redmine/ruby/lib/ruby/gems/1.9.1/gems/passenger-4.0.40/lib/phusion_passenger/request_handler.rb:448:in `block (3 levels) in start_threads' ... No close tag for /lists/list Line: 4 Position: 93 Last 80 unconsumed characters: Output was: <?xml version="1.0" encoding="UTF-8"?> <lists> <list path="svn://127.0.0.1/voxysuser"> Rendered common/error.html.erb within layouts/base (0.1ms) Completed 404 Not Found in 69.1ms (Views: 15.1ms | ActiveRecord: 3.0ms) How can I resolve this problem? I googled it, but similar problem fixed should be fixed 3 years ago. I'm installing the latest Bitnami Redmine 2.5.1-1 stack. UPDATE Well, I found next way. If I use the http protocol it works fine, but I should remove access for svn by web. That's why I create virtual host on localhost and get info from svn use 127.0.0.1 IP. <VirtualHost 127.0.0.1:8000> <Location /repo> DAV svn SVNPath "PATH_TO_MY_REPOSITORY" </Location> And this it work good.

    Read the article

  • No description for any page on the website is available in Google despite robots.txt allowing crawling

    - by Abhijit
    I seem to have the weirdest issue with Search Engine Optimization, and I asked the IT folks at my university, I asked people on Joomla forums and I am trying to sort this issue out using Google Webmaster Tools for more than 2 months to little avail. I want to know if I have some blatantly wrong configuration somewhere that is causing search engines to be unable to index this site. I noticed a similar issue with another website I searched for online (ECEGSA - The University of British Columbia at gsa.ece.ubc.ca), making me believe this might be a concern that people might be looking an answer for. Here are the details: The website in question is: http://gsa.ece.umd.edu/. It runs using Joomla 2.5.x (latest). The site was up since around mid December of 2013, and I noticed right from the get go that the site was not being indexed correctly on Google. Specifically I see the following message when I search for the website on Google: A description for this result is not available because of this site's robots.txt – learn more. The thing is in December till around March I used the default Joomla robots.txt file which is: User-agent: * Disallow: /administrator/ Disallow: /cache/ Disallow: /cli/ Disallow: /components/ Disallow: /images/ Disallow: /includes/ Disallow: /installation/ Disallow: /language/ Disallow: /libraries/ Disallow: /logs/ Disallow: /media/ Disallow: /modules/ Disallow: /plugins/ Disallow: /templates/ Disallow: /tmp/ Nothing there should stop Google from searching my website. And even more confusingly, when I go to Google Webmaster tools, under "Blocked URLs" tab, when I try many of the links on the site, they are all shown up as "Allowed". I then tried adding a sitemap, putting it in the robots.txt file. That did not help. Same exact search result, same behavior in the "Blocked URLs" tab on the webmaster tools. Now additionally, the "sitemaps" tab says for several links an error saying "URL is robotted out". I tried those exact links in the "Blocked URLs" and they are allowed! I then tried deleting the robots.txt file. No use. Same exact problem. Here is an example screenshot from Google's Webmaster Tools: At this point I cannot give a rational explanation to why this is happening and neither can anyone in the IT department here. No one on Joomla forums can seem to understand what is going on. Based on what I explained, does it seem that I have somehow set a setting in the robots.txt or in .htaccess or somewhere else, incorrectly?

    Read the article

  • How to parse invalid HTML with Perl?

    - by bodacydo
    I maintain a database of articles with HTML formatting. Unfortunately the editors who wrote articles didn't know proper HTML, so they often have written stuff like: <div class="highlight"><html><head></head><body><p>Note that ...</p></html></div> I tried using HTML::TreeBuilder to parse this HTML but after parsing it and dumping the resulting tree, all the elements between <div class="highlight">...</div> are gone. I'm left with just <div class="highlight"></div>. The editors often have also done things like: <div class="article"><style>@font-face { font-family: "Cambria"; }</style>Article starts here</div> Parsing this with HTML::TreeBuilder results in empty <div class="article"></div> again. Any ideas how to approach this broken HTML and actually make sense out of it?

    Read the article

  • Copy HTML code but without javascript changes [closed]

    - by PaulP
    In Firebug there is very useful "Copy HTML" option in HTML Tab. But that copied HTML code also includes javascript changes like for example added new classes on document.ready (jQuery) event. I would like to copy raw HTML code like in "View source" option (it is every browser) without and javascript changes. Yes, I can use "View source" option but code in there is very scattered and it is very hard to copy one big HTML node not losing closing tag and in firebug with fold blessing I can match folded HTML node, right click and select "Copy HTML".

    Read the article

  • Why do Google search results include pages disallowed in robots.txt?

    - by Ilmari Karonen
    I have some pages on my site that I want to keep search engines away from, so I disallowed them in my robots.txt file like this: User-Agent: * Disallow: /email Yet I recently noticed that Google still sometimes returns links to those pages in their search results. Why does this happen, and how can I stop it? Background: Several years ago, I made a simple web site for a club a relative of mine was involved in. They wanted to have e-mail links on their pages, so, to try and keep those e-mail addresses from ending up on too many spam lists, instead of using direct mailto: links I made those links point to a simple redirector / address harvester trap script running on my own site. This script would return either a 301 redirect to the actual mailto: URL, or, if it detected a suspicious access pattern, a page containing lots of random fake e-mail addresses and links to more such pages. To keep legitimate search bots away from the trap, I set up the robots.txt rule shown above, disallowing the entire space of both legit redirector links and trap pages. Just recently, however, one of the people in the club searched Google for their own name and was quite surprised when one of the results on the first page was a link to the redirector script, with a title consisting of their e-mail address followed by my name. Of course, they immediately e-mailed me and wanted to know how to get their address out of Google's index. I was quite surprised too, since I had no idea that Google would index such URLs at all, seemingly in violation of my robots.txt rule. I did manage to submit a removal request to Google, and it seems to have worked, but I'd like to know why and how Google is circumventing my robots.txt like that and how to make sure that none of the disallowed pages will show up in their search results. Ps. I actually found out a possible explanation and solution, which I'll post below, while preparing this question, but I thought I'd ask it anyway in case someone else might have the same problem. Please do feel free to post your own answers. I'd also be interested in knowing if other search engines do this too, and whether the same solutions work for them also.

    Read the article

  • Can the .htaccess file slow down a website to a crawl? If so, are there better ways to solve these problems with different rewrite rules and such?

    - by Parimal
    here is my htaccess file...... RewriteCond %{REQUEST_URI} ^/patients/billing/FAQ_billing\.html$ [OR] RewriteCond %{REQUEST_URI} ^/patients/billing/getintouch\.html$ RewriteRule ^patients/billing/(.*)\.html$ $1.php [L,NC] RewriteCond %{REQUEST_URI} ^/patients/findadoctor/a\.html$ [OR] RewriteCond %{REQUEST_URI} ^/patients/findadoctor/b\.html$ [OR] RewriteCond %{REQUEST_URI} ^/patients/findadoctor/c\.html$ [OR] RewriteCond %{REQUEST_URI} ^/patients/findadoctor/d\.html$ [OR] RewriteCond %{REQUEST_URI} ^/patients/findadoctor/e\.html$ [OR] RewriteCond %{REQUEST_URI} ^/patients/findadoctor/f\.html$ [OR] RewriteCond %{REQUEST_URI} ^/patients/findadoctor/g\.html$ [OR] RewriteCond %{REQUEST_URI} ^/patients/findadoctor/h\.html$ [OR] RewriteCond %{REQUEST_URI} ^/patients/findadoctor/i\.html$ [OR] RewriteCond %{REQUEST_URI} ^/patients/findadoctor/j\.html$ [OR] RewriteCond %{REQUEST_URI} ^/patients/findadoctor/k\.html$ [OR] RewriteCond %{REQUEST_URI} ^/patients/findadoctor/l\.html$ [OR] RewriteCond %{REQUEST_URI} ^/patients/findadoctor/m\.html$ [OR] RewriteCond %{REQUEST_URI} ^/patients/findadoctor/n\.html$ [OR] RewriteCond %{REQUEST_URI} ^/patients/findadoctor/o\.html$ [OR] RewriteCond %{REQUEST_URI} ^/patients/findadoctor/p\.html$ [OR] RewriteCond %{REQUEST_URI} ^/patients/findadoctor/q\.html$ [OR] RewriteCond %{REQUEST_URI} ^/patients/findadoctor/r\.html$ [OR] RewriteCond %{REQUEST_URI} ^/patients/findadoctor/s\.html$ [OR] RewriteCond %{REQUEST_URI} ^/patients/findadoctor/t\.html$ [OR] RewriteCond %{REQUEST_URI} ^/patients/findadoctor/u\.html$ [OR] RewriteCond %{REQUEST_URI} ^/patients/findadoctor/v\.html$ [OR] RewriteCond %{REQUEST_URI} ^/patients/findadoctor/w\.html$ [OR] RewriteCond %{REQUEST_URI} ^/patients/findadoctor/x\.html$ [OR] RewriteCond %{REQUEST_URI} ^/patients/findadoctor/y\.html$ [OR] RewriteCond %{REQUEST_URI} ^/patients/findadoctor/z\.html$ RewriteRule ^patients/findadoctor/(.*)\.html$ findadoctor.php?id=$1 [L,NC] like that there is lots of rules around 250 line please help me...

    Read the article

  • How to resolve "Google can't find your site's robots.txt" error?

    - by Manivasagam
    I've recently found that "Google can't find your site's robots.txt" in crawl errors. When I tried Fetching as Google, I got result "SUCCESS", then I tried looking at crawl errors and it still shows "Google can't find your site's robots.txt". What can I do to resolve this issue? Before this issue arose, my site was indexed within a few mintues, but now I find that it took time to be indexed in Google's search. When I access http://mydomain.com/robots.txt, it shows the data below: User-agent: *Disallow: /wp-admin/ Disallow: /wp-includes/ I found Blocked URLs = 0, also no any other errors. Is there any other thing I need to change? Or what could be the solution for this? Any help would be appreciated.

    Read the article

  • Cross-submission robots.txt for multiple domains on single host

    - by sidd.darko
    We are running a site with multiple languages hosted in a single environment on IIS7. For example, oursite.com - english oursite.de - german oursite.es - spanish This is a single-host environment. All of these sites are in the same application space on the same physical machine. I need to do cross-submission of sitemaps via robots.txt. Looking at the sitemap.org guidelines for this suggest this is possible, but the example indicates different physical machines. Will the following entries in oursite.com/robots.txt work? http://www.oursite.com/sitemap-oursite-de.xml http://www.oursite.com/sitemap-oursite-es.xml

    Read the article

  • Parse html and find data in the html

    - by Dan.StackOverflow
    Hi all. I am trying to use html5lib to parse an html page in to something I can query with xpath. html5lib has close to zero documentation and I've spent too much time trying to figure this problem out. Ultimate goal is to pull out the second row of a table: <html> <table> <tr><td>Header</td></tr> <tr><td>Want This</td></tr> </table> </html> so lets try it: >>> doc = html5lib.parse('<html><table><tr><td>Header</td></tr><tr><td>Want This</td> </tr></table></html>', treebuilder='lxml') >>> doc <lxml.etree._ElementTree object at 0x1a1c290> that looks good, lets see what else we have: >>> root = doc.getroot() >>> print(lxml.etree.tostring(root)) <html:html xmlns:html="http://www.w3.org/1999/xhtml"><html:head/><html:body><html:table><html:tbody><html:tr><html:td>Header</html:td></html:tr><html:tr><html:td>Want This</html:td></html:tr></html:tbody></html:table></html:body></html:html> LOL WUT? seriously. I was planning on using some xpath to get at the data I want, but that doesn't seem to work. So what can I do? I am willing to try different libraries and approaches.

    Read the article

  • Managing JS and CSS for a static HTML web application

    - by Josh Kelley
    I'm working on a smallish web application that uses a little bit of static HTML and relies on JavaScript to load the application data as JSON and dynamically create the web page elements from that. First question: Is this a fundamentally bad idea? I'm unclear on how many web sites and web applications completely dispense with server-side generation of HTML. (There are obvious disadvantages of JS-only web apps in the areas of graceful degradation / progressive enhancement and being search engine friendly, but I don't believe that these are an issue for this particular app.) Second question: What's the best way to manage the static HTML, JS, and CSS? For my "development build," I'd like non-minified third-party code, multiple JS and CSS files for easier organization, etc. For the "release build," everything should be minified, concatenated together, etc. If I was doing server-side generation of HTML, it'd be easy to have my web framework generate different development versus release HTML that includes multiple verbose versus concatenated minified code. But given that I'm only doing any static HTML, what's the best way to manage this? (I realize I could hack something together with ERB or Perl, but I'm wondering if there are any standard solutions.) In particular, since I'm not doing any server-side HTML generation, is there an easy, semi-standard way of setting up my static HTML so that it contains code like <script src="js/vendors/jquery.js"></script> <script src="js/class_a.js"></script> <script src="js/class_b.js"></script> <script src="js/main.js"></script> at development time and <script src="http://ajax.googleapis.com/ajax/libs/jquery/1.8.2/jquery.min.js"></script> <script src="js/entire_app.min.js"></script> for release?

    Read the article

  • parsing simple html for iphone

    - by sitara
    I have a very simple html page to parse. The html page will remain simple always. as simple as this <html> <head><title>title</title></head> <body>some data here</body> </html> I have fetched the html content of such an html page and have it in an NSString. I want to get what ever data is there in the body of the html page. Please tell me how can this be done and let me know if there are more than one possible ways. I would prefer doing it using basic obj-c if it is possible. Thanks

    Read the article

  • Disallowed images in the robots.txt of my Joomla site can't be displayed when shared in Facebook

    - by opk
    I have noticed that since I have disallowed images using the robots.txt in my Joomla site, when sharing an article in Facebook, the image will not be displayed. Why is that? Is it indeed related? My robots.txt file: User-agent: * Disallow: /administrator/ Disallow: /cache/ Disallow: /cli/ Disallow: /components/ Disallow: /images/ Disallow: /includes/ Disallow: /installation/ Disallow: /language/ Disallow: /libraries/ Disallow: /logs/ Disallow: /media/ Disallow: /modules/ Disallow: /plugins/ Disallow: /templates/ Disallow: /tmp/

    Read the article

  • iphone read txt file from UIWebView

    - by Ni
    I can read the data in file.txt file located in local disk. NSString *filePath = [[NSBundle mainBundle] pathForResource:@"file" ofType:@"txt"]; NSString* Data = [NSString stringWithContentsOfFile:filePath encoding:NSUTF8StringEncoding error:&error ]; now, I upload the file.txt file into a website. how can i read the data from the txt file now from UIWebView? Please help!!

    Read the article

  • Ajax Control Toolkit July 2011 Release and the New HTML Editor Extender

    - by Stephen Walther
    I’m happy to announce the July 2011 release of the Ajax Control Toolkit which includes important bug fixes and a completely new HTML Editor Extender control. You can download the July 2011 Release by visiting the Ajax Control Toolkit CodePlex site at: http://AjaxControlToolkit.CodePlex.com Using the New HTML Editor Extender Control You can use the new HTML Editor Extender to extend any standard ASP.NET TextBox control so that it supports rich formatting such as bold, italics, bulleted lists, numbered lists, typefaces and different foreground and background colors. The following code illustrates how you can extend a standard ASP.NET TextBox control with the HtmlEditorExtender: <%@ Page Language="C#" AutoEventWireup="true" CodeBehind="Simple.aspx.cs" Inherits="WebApplication1.Simple" %> <%@ Register TagPrefix="asp" Namespace="AjaxControlToolkit" Assembly="AjaxControlToolkit" %> <html xmlns="http://www.w3.org/1999/xhtml"> <head runat="server"> <title>Simple</title> </head> <body> <form id="form1" runat="server"> <asp:ToolkitScriptManager runat="Server" /> <asp:TextBox ID="txtComments" TextMode="MultiLine" Columns="60" Rows="8" runat="server" /> <asp:HtmlEditorExtender TargetControlID="txtComments" runat="server" /> </form> </body> </html> This page has the following three controls: ToolkitScriptManager – The ToolkitScriptManager renders all of the scripts required by the Ajax Control Toolkit. TextBox – The TextBox control is a standard ASP.NET TextBox which is set to display multiple lines (a TextArea instead of an Input element). HtmlEditorExtender – The HtmlEditorExtender is set to extend the TextBox control. You can use the standard TextBox Text property to read the rich text entered into the TextBox control on the server. Lightweight and HTML5 The HTML Editor Extender works on all modern browsers including the most recent versions of Mozilla Firefox (Firefox 5), Google Chrome (Chrome 12), and Apple Safari (Safari 5). Furthermore, the HTML Editor Extender is compatible with Microsoft Internet Explorer 6 and newer. The HTML Editor Extender is very lightweight. It takes advantage of the HTML5 ContentEditable attribute so it does not require an iframe or complex browser workarounds. If you select View Source in your browser while using the HTML Editor Extender, we hope that you will be pleasantly surprised by how little markup and script is generated by the HTML Editor Extender. Customizable Toolbar Buttons Depending on the web application that you are building, you will want to display different toolbar buttons with the HTML Editor Extender. One of the design goals of the HTML Editor Extender was to make it very easy for you to customize the toolbar buttons. Imagine, for example, that you want to use the HTML Editor Extender when accepting comments on blog posts. In that case, you might want to restrict the type of formatting that a user can display. You might want to enable a user to format text as bold or italic but you do not want the user to make any other formatting changes. The following page illustrates how you can customize the HTML Editor Extender toolbar: <%@ Page Language="C#" AutoEventWireup="true" CodeBehind="CustomToolbar.aspx.cs" Inherits="WebApplication1.CustomToolbar" %> <%@ Register TagPrefix="asp" Namespace="AjaxControlToolkit" Assembly="AjaxControlToolkit" %> <html> <head runat="server"> <title>Custom Toolbar</title> </head> <body> <form id="form1" runat="server"> <asp:ToolkitScriptManager Runat="server" /> <asp:TextBox ID="txtComments" TextMode="MultiLine" Columns="50" Rows="10" Text="Hello <b>world!</b>" Runat="server" /> <asp:HtmlEditorExtender TargetControlID="txtComments" runat="server"> <Toolbar> <asp:Bold /> <asp:Italic /> </Toolbar> </asp:HtmlEditorExtender> </form> </body> </html> Notice that the HTML Editor Extender in the page above has a Toolbar subtag. You can list the toolbar buttons which you want to appear within the subtag. In the case above, only Bold and Italic buttons are displayed. Here is a complete list of the Toolbar buttons currently supported by the HTML Editor Extender: Undo Redo Bold Italic Underline StrikeThrough Subscript Superscript JustifyLeft JustifyCenter JustifyRight JustifyFull InsertOrderedList InsertUnorderedList CreateLink UnLink RemoveFormat SelectAll UnSelect Delete Cut Copy Paste BackgroundColorSelector ForeColorSelector FontNameSelector FontSizeSelector Indent Outdent InsertHorizontalRule HorizontalSeparator Of course the HTML Editor Extender was designed to be extensible. You can create your own buttons and add them to the control. Compatible with the AntiXSS Library When using the HTML Editor Extender on a public facing website, we strongly recommend that you use the HTML Editor Extender with the AntiXSS Library. If you allow users to submit arbitrary HTML, and you don’t take any action to strip out malicious markup, then you are opening your website to Cross-Site Scripting Attacks (XSS attacks). The HTML Editor Extender uses the Provider Model to support different Sanitizer Providers. The July 2011 release of the Ajax Control Toolkit ships with a single Sanitizer Provider which uses the AntiXSS library (see http://AntiXss.CodePlex.com ). A Sanitizer Provider is responsible for sanitizing HTML markup by removing any malicious elements, attributes, and attribute values. For example, the AntiXss Sanitizer Provider will take the following block of HTML: <b><a href=""javascript:doEvil()"">Visit Grandma</a></b> <script>doEvil()</script> And return the following sanitized block of HTML: <b><a href="">Visit Grandma</a></b> Notice that the JavaScript href and <SCRIPT> tag are both stripped out. Be aware that there are a depressingly large number of ways to sneak evil markup into your HTML. You definitely want a Sanitizer as a safety net. Before you can use the AntiXSS Sanitizer Provider, you must add three assemblies to your web application: AntiXSSLibrary.dll, HtmlSanitizationLibrary.dll, and SanitizerProviders.dll. All three assemblies are included with the CodePlex download of the Ajax Control Toolkit in the SanitizerProviders folder. Here’s how you modify your web.config file to use the AntiXSS Sanitizer Provider: <configuration> <configSections> <sectionGroup name="system.web"> <section name="sanitizer" requirePermission="false" type="AjaxControlToolkit.Sanitizer.ProviderSanitizerSection, AjaxControlToolkit"/> </sectionGroup> </configSections> <system.web> <compilation targetFramework="4.0" debug="true"/> <sanitizer defaultProvider="AntiXssSanitizerProvider"> <providers> <add name="AntiXssSanitizerProvider" type="AjaxControlToolkit.Sanitizer.AntiXssSanitizerProvider"></add> </providers> </sanitizer> </system.web> </configuration> You can detect whether the HTML Editor Extender is using the AntiXSS Sanitizer Provider by checking the HtmlEditorExtender SanitizerProvider property like this: if (MyHtmlEditorExtender.SanitizerProvider == null) { throw new Exception("Please enable the AntiXss Sanitizer!"); } When the SanitizerProvider property has the value null, you know that a Sanitizer Provider has not been configured in the web.config file. Because the AntiXSS library requires Full Trust, you cannot use the AntiXSS Sanitizer Provider with most shared website hosting providers. Because most shared hosting providers only support Medium Trust and not Full Trust, we do not recommend using the HTML Editor Extender with a public website hosted with a shared hosting provider. Why a New HTML Editor Control? The Ajax Control Toolkit now includes two HTML Editor controls. Why did we introduce a new HTML Editor control when there was already an existing HTML Editor? We think you will like the new HTML Editor much more than the previous one. We had several goals with the new HTML Editor Extender: Lightweight – We wanted to leverage HTML5 to create a lightweight HTML Editor. The new HTML Editor generates much less markup and script than the previous HTML Editor. Secure – We wanted to make it easy to integrate the AntiXSS library with the HTML Editor. If you are creating a public facing website, we strongly recommend that you use the AntiXSS Provider. Customizable – We wanted to make it easy for users to customize the toolbar buttons displayed by the HTML Editor. Compatibility – We wanted to ensure that the HTML Editor will work with the latest versions of the most popular browsers (including Internet Explorer 6 and higher). The old HTML Editor control is still included in the Ajax Control Toolkit and continues to live in the AjaxControlToolkit.HTMLEditor namespace. We have not modified the control and you can continue to use the control in the same way as you have used it in the past. However, we hope that you will consider migrating to the new HTML Editor Extender for the reasons listed above. Summary We’ve introduced a new Ajax Control Toolkit control with this release. I want to thank the developers and testers on the Superexpert team for the huge amount of work which they put into this control. It was a non-trivial task to build an entirely new control which has the complexity of the HTML Editor in less than 6 weeks. Please let us know what you think! We want to hear your feedback. If you discover issues with the new HTML Editor Extender control, or you have questions about the control, or you have ideas for how it can be improved, then please post them to this blog. Tomorrow starts a new sprint

    Read the article

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >