Search Results

Search found 33668 results on 1347 pages for 'html to txt'.

Page 4/1347 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

301 redirect from "/index.html" to root if index.html not exist

- by Andrij Muzychka

Can I create 301 redirect from "index.html" to root directory if file "index.html" not exist? For example: link "http://example.com/index.html" show "404 Error" page. I need 301 redirect to root directory: "http://example.com/" in .htaccess I add rule: Options +FollowSymLinks RewriteCond %{THE_REQUEST} ^.*/index.html RewriteRule ^(.*)index.html$ http://example.com/$1 [R=301,L] but it doesn't work. Can you help me solve this problem?

Read the article
Meaning of Disallow: /*? in robots.txt

- by hussain

Yahoo's robots.txt contains: User-agent: * Disallow: /p/ Disallow: /r/ Disallow: /*? What does the last line mean? ("Disallow: /*?")

Read the article
What are DNS TXT records and why are they needed

- by Saif Bechan

I am creating a server with a primary nameserver and mutliple domains. Do i need to make TXT records for all the domain, they all have there own email, but use the mailserver from the primary.

Read the article
how to read the txt file from the database(line by line)

- by Ranjana

i have stored the txt file to sql server database . i need to read the txt file line by line to get the content in it. my code : DataTable dtDeleteFolderFile = new DataTable(); dtDeleteFolderFile = objutility.GetData("GetTxtFileonFileName", new object[] { ddlSelectFile.SelectedItem.Text }).Tables[0]; foreach (DataRow dr in dtDeleteFolderFile.Rows) { name = dr["FileName"].ToString(); records = Convert.ToInt32(dr["NoOfRecords"].ToString()); bytes = (Byte[])dr["Data"]; } FileStream readfile = new FileStream(Server.MapPath("txtfiles/" + name), FileMode.Open); StreamReader streamreader = new StreamReader(readfile); string line = ""; line = streamreader.ReadLine(); but here i have used the FileStream to read from the Particular path. but i have saved the txt file in byt format into my Database. how to read the txt file using the byte[] value to get the txt file content, instead of using the Path value.

Read the article
how to read the txt file from database(byte[] to filestream)

- by Ranjana

i have stored the txt file to sql server database . i need to read the txt file line by line to get the content in it. my code : DataTable dtDeleteFolderFile = new DataTable(); dtDeleteFolderFile = objutility.GetData("GetTxtFileonFileName", new object[] { ddlSelectFile.SelectedItem.Text }).Tables[0]; foreach (DataRow dr in dtDeleteFolderFile.Rows) { name = dr["FileName"].ToString(); records = Convert.ToInt32(dr["NoOfRecords"].ToString()); bytes = (Byte[])dr["Data"]; } FileStream readfile = new FileStream(Server.MapPath("txtfiles/" + name), FileMode.Open); StreamReader streamreader = new StreamReader(readfile); string line = ""; line = streamreader.ReadLine(); but here i have used the FileStream to read from the Particular path. but i have saved the txt file in byt format into my Database. how to read the txt file using the byte[] value to get the txt file content, instead of using the Path value.

Read the article
Strange robots.txt - how and why did it get there?

- by Mick

I recently created a very simple, pure HTML website which I have hosted with "hostmonster". Hostmonster had very good reviews on some comparison website and in general so far they appear to be perfectly good in every way... At least I thought so until just now... I have been making lots of edits to my site on an almost daily basis. My site now appears on the first page (7th on the list) for my most important keyphrase when doing a google search. But I did notice some problem with the snippet chosen by google. I asked a question on this site about snippets and got some great answers. I then made some modifications to my meta data and within 48hrs the google snippet for my search was perfect. The odd thing though was that looking at the "cached" version google had, it appeared that the cache was still very odl- like three weeks previous. This seemed very odd - how could it be that the google robots had read my new metadata without updating the cache? This puzzled me greatly. Just now it occurred to me that maybe I had some goofey setting in my robots.txt file. I didn't actually remember even making one - but I thought I'd have a look just in case. Much to my horror, I saw that there was a robots.txt and it contained the disturbing text below: sitemap: http://cdn.attracta.com/sitemap/728687.xml.gz Intuitively this looks like some kind of junk, spam trick, and I had indeed been getting some spam from "attracta". So my questions are: 1. Should I simply delete this robots.txt? 2. Was the file there all along - placed there because of some commercial tie-in between attracta and hostmonster. 3. Does the attracta robots file explain the lack of re-caching?

Read the article
How to get search engines to properly index an ajax driven search page

- by Redtopia

I have an ajax-driven search page that will allow users to search through a large collection of records. Each search result points to index.php?id=xyz (where xyz is the id of the record). The initial view does not have any records listed, and there is no interface that allows you to browse through all records. You can only conduct a search. How do I build the page so that spiders can crawl each record? Or is there another way (outside of this specific search page) that will allow me to point spiders to a list of all records. FYI, the collection is rather large, so dumping links to every record in a single request is not a workable solution. Outputting the records must be done in multiple requests. Each record can be viewed via a single page (eg "record.php?id=xyz"). I would like all the records indexed without anything indexed from the sitemap that shows where the records exist, for example: <a href="/result.php?id=record1">Record 1</a> <a href="/result.php?id=record2">Record 2</a> <a href="/result.php?id=record3">Record 3</a> <a href="/seo.php?page=2">next</a> Assuming this is the correct approach, I have these questions: How would the search engines find the crawl page? Is it possible to prevent the search engines from indexing the words "Record 1", etc. and "next"? Can I output only the links? Or maybe something like:

Read the article
robots.txt file with more restrictive rules for certain user agents

- by Carson63000

Hi, I'm a bit vague on the precise syntax of robots.txt, but what I'm trying to achieve is: Tell all user agents not to crawl certain pages Tell certain user agents not to crawl anything (basically, some pages with enormous amounts of data should never be crawled; and some voracious but useless search engines, e.g. Cuil, should never crawl anything) If I do something like this: User-agent: * Disallow: /path/page1.aspx Disallow: /path/page2.aspx Disallow: /path/page3.aspx User-agent: twiceler Disallow: / ..will it flow through as expected, with all user agents matching the first rule and skipping page1, page2 and page3; and twiceler matching the second rule and skipping everything?

Read the article
Is it safe to block redirected (but still linked) URLs with robots.txt?

- by Edgar Quintero

I have a website that has all URLs optimized and 301 redirected from nasty URLs to clean ones. However, everywhere throughout the site the unclean URLs are linked in menus, content, products, etc. Google currently has all clean URLs indexed, along with a few unclean URLs too. So the site still has linked everywhere the old URLs (ideally this wouldn't be the case but this is how it is ATM). I would like to block the unclean URLs with robots.txt. The question: if I block these unclean URLs with the robots.txt, when the entire website is linked with them (but they all redirect to the clean version), will this affect the indexing status at all?

Read the article
Which token from a long User-Agent should I use in robots.txt?

- by Gaia

The definition of User-Agent states that several tokens can be included, as deemed necessary by the client. I want to block certain bots via robots.txt and I am confused as to which part of the User-Agent string to use, especially for more obscure bots. For example: Mozilla/5.0 (compatible; uMBot-LN/1.0; mailto: [email protected])" JS-Kit URL Resolver, http://js-kit.com/ Mozilla/5.0 (compatible; SEOkicks-Robot +http://www.seokicks.de/robot.html Do I use the second token? Can tokens contain spaces, or did the SEOkicks folks forget a semicolon after SEOkicks-Robot? I don't actually intend on making my question specific to a couple bots - I want to know the guideline: which part of UA do I place in robots.txt for these exotic bots with UA as long as a haiku? User-agent: uMBot-LN/1.0 Disallow: / PS: Thank you but I do not need to hear that undesirable bots are better blocked with mod_security. I already have commercial mod_sec rules in place.

Read the article
Is it safe to Block These URLs with Robots.txt?

- by Edgar Quintero

I have a website that has all URLs optimized and 301 redirected from nasty URLs to clean ones. However, everywhere throughout the site the unclean URLs are linked in menus, content, products, etc. Google currently has all clean URLs indexed, along with a few unclean URLs too. So the site still has linked everywhere the old URLs (ideally this wouldn't be the case but this is how it is ATM). I would like to block the unclean URLs with robots.txt. The question: If I block these unclean URLs with the robots.txt, when the entire website is linked with them (but they all redirect to the clean version), will this affect the indexing status at all?

Read the article
When should JavaScript generate HTML?

- by VirtuosiMedia

I try to generate as little HTML from JavaScript as possible. Instead, I prefer to manipulate existing markup whenever I can and only generate HTML when I need to dynamically insert an element that isn't a good candidate for using Ajax. This, I believe, makes it far easier to maintain the code and quickly make changes to it because the markup is easier to read and trace. My rule of thumb is: HTML is for document structure, CSS is for presentation, JavaScript is for behavior. However, I've seen a lot of JS code that generates mounds of HTML, including entire forms and content-heavy modal dialogs. In general, which method is considered best practice? In what circumstances should JavaScript be used to generate HTML and when should it not?

Read the article
Java library for HTML analysis

- by Raj

Hi, (I've seen similar questions, but I think none of them cater to my specific needs, hence...) I would like to know if there is a Java library for analysis of real-world (read: incomplete, ill-formed) HTML. By analysis, I mean things like: figuring out the most prominent color in an HTML chunk changing that color to some other color (hence, has to support modification of the HTML as well) pruning out unwanted tags fixing up the HTML to result in a well formed HTML snippet Parts of the last two are done by libraries such as Jericho, and jTidy. 'Plugins' on top of these would be great. Thanks in advance!

Read the article
Pure JSP without mixing HTML, by writing html as Java-like code

- by ADTC

Please read before answering. This is a fantasy programming technique I'm dreaming up. I want to know if there's anything close in real life. The following JSP page: <% html { head { title {"Pure fantasy";} } body { h1 {"A heading with double quote (\") character";} p {"a paragraph";} String s = "a paragraph in string. the date is "; p { s; new Date().toString(); } table (Border.ZERO, new Padding(27)) { tr { for (int i = 0; i < 10; i++) { td {i;} } } } } } %> could generate the following HTML page: <html> <head> <title>Pure fantasy</title> </head> <body> <h1>A heading with double quote (") character</h1> <p>a paragraph</p> <p>a paragraph in string. the date is 11 December 2012</p> <table border="0" padding="27"> <tr> <td>0</td> <td>1</td> <td>2</td> <td>3</td> <td>4</td> <td>5</td> <td>6</td> <td>7</td> <td>8</td> <td>9</td> </tr> </table> </body> </html> The thing about this fantasy is it reuses the same old Java programming language technique that enable customized keywords used in a way similar to if-else-then, while, try-catch etc to represent html tags in a non-html way that can easily be checked for syntactic correctness, and most importantly can easily be mixed up with regular Java code without being lost in a sea of <%, %>, <%=, out.write(), etc. An added feature is that strings can directly be placed as commands to print out into generated HTML, something Java doesn't support (where pure strings have to be assigned to variables before use). Is there anything in real life that comes close? If not, is it possible to define customized keywords in Java or JSP? Or do I have to create an entirely new programming language for that? What problems do you see with this kind of setup? PS: I know you can use HTML libraries to construct HTML using Java code, but the problem with such libraries is, the source code itself doesn't have a readable HTML representation like the code above does - if you get what I mean.

Read the article
Sending HTML to Gmail always lands in Spam

- by cartaysm

I am having an issue with sending HTML emails to Gmail. I can send them to Yahoo, Hotmail, RR, AOL, etc. with no problem at all, but when I send them to Gmail I get kicked to spam. I have checked my IP with a lot of different list to make sure it is not listed anywhere, which it is not. spamhaus = is not listed in the DBL abuse.net = is not listed in the SBL abuse.net = is not listed in the PBL abuse.net = is not listed in the XBL spamcop = not listed in bl.spamcop.net host 24.172.204.xxx xxx.204.172.24.in-addr.arpa domain name pointer xxxevents.com. host xxxevents.com xxxevents.com has address 24.172.204.xxx xxxevents.com mail is handled by 10 mail.xxxevents.com. I am just trying to send a very VERY basic HTML message (listed below). I use an Ubuntu server, swiftmailer, multipart/alternative (HTML & plain), SPF = pass, and I am going to setup DKIM today to see if that fixes it (but I doubt it will)... For now I will only post the message I sent that gets kicked to spam and can provide any details needed. <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head><title>Triathlon</title></head> <body> <table cellpadding="0" cellspacing="0"> <tr> <td> <p>Thank you for attending our 4th annual Triathlon/Duathlon/5k at Hueston Woods State Park on August 12th. This event is held annually to raise research funding for Crohn's Disease, Ulcerative Colitis, and Muscular Dystrophy diseases.</p> </td> </tr> <tr> <td> <p>As you know the results and pictures have been posted on our home page at since Sunday 8/13/2012. Now we also have updated our Facebook page with those photos and you can start tagging yourself or downloading the pictures now! <br /> our page and tag yourself at </p> <p> test test </p> <p>Race day events is professionally managed by Speedy-Feet</p> </td> </tr> </table> </body> </html> Just plain text works great, I thought maybe wording was messing me up but not the case... I am almost done install opendkim so I will be able to rule that out very soon. Edit: Okay installed opendkim and I am getting passing results so I sent the html I posted above it went through just fine. So now when I start to add a few more lines I am getting kicked back to spam again. Here is updated html code: ` <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head><title>Triathlon</title></head> <body> <table cellpadding="0" cellspacing="0"> <tr> <td> <center><a href='http://xxxevents.com' target="_blank"> <font face="Verdana, Arial, Helvetica, sans-serif" color="#666666" size="2"> <img src="http://xxxevents.com/marketemailimages/xxxlogo.png" alt="xxx It Events | Raising funds for Crohns, Colitis, and Muscular Dystrophy" border="0" /> </font></a></center> </td> <tr> <td> <p>Thank you for attending our 4th annual Triathlon/Duathlon/5k at Hueston Woods State Park on August 12th. This event is held annually to raise research funding for Crohn's Disease, Ulcerative Colitis, and Muscular Dystrophy diseases.</p> </td> </tr> <tr> <td> <p>As you know the results and pictures have been posted on our home page at since Sunday 8/13/2012. Now we also have updated our Facebook page with those photos and you can start tagging yourself or downloading the pictures now! <br /> our page and tag yourself at </p> <p> test test </p> <p>Race day events is professionally managed by Speedy-Feet</p> </td> </tr> </table> <table width="100%" border="0" cellspacing="0" cellpadding="0"> <tr> <td valign="top"> <div align="center" style="font-family:Verdana, Arial, Helvetica, sans-serif; font-size:10px;"><br />PO Box xxx Maineville, OH 45039<br /> <a href="mailto:[email protected]">[email protected]</a> | <a href='http://xxxevents.com' target="_blank">xxxevents.com</a><br /> <br /> </div> </td> </tr> </table> </body> </html>`

Read the article
Html Agility Pack for Reading “Real World” HTML

- by WeigeltRo

In an ideal world, all data you need from the web would be available via well-designed services. In the real world you sometimes have to scrape the data off a web page. Ugly, dirty – but if you really want that data, you have no choice. Just don’t write (yet another) HTML parser. I stumbled across the Html Agility Pack (HAP) a long time ago, but just now had the need for a robust way to read HTML. A quote from the website: This is an agile HTML parser that builds a read/write DOM and supports plain XPATH or XSLT (you actually don't HAVE to understand XPATH nor XSLT to use it, don't worry...). It is a .NET code library that allows you to parse "out of the web" HTML files. The parser is very tolerant with "real world" malformed HTML. The object model is very similar to what proposes System.Xml, but for HTML documents (or streams). Using the HAP was a simple matter of getting the Nuget package, taking a look at the example and dusting off some of my XPath knowledge from years ago. The documentation on the Codeplex site is non-existing, but if you’ve queried a DOM or used XPath or XSLT before you shouldn’t have problems finding your way around using Intellisense (ReSharper tip: Press Ctrl+Shift+F1 on class members for reading the full doc comments).

Read the article
Google Sitemap and Robots.txt Issue

- by Sarfaraz Soomro

Hi, We have a sitemap at our site, http://www.gamezebo.com/sitemap.xml Some of the urls in the sitemap, are being reported in the webmaster central as being blocked by our robots.txt, see, gamezebo.com/robots.txt ! Although these urls are not Disallowed in Robots.txt. There are other such urls aswell, for example, gamezebo.com/gamelinks is present in our sitemap, but it's being reported as "URL restricted by robots.txt". Also I have this parse result in the Webmaster Central that says, "Line 21: Crawl-delay: 10 Rule ignored by Googlebot". What does it mean? I appreciate your help, Thanks.

Read the article
read out a txt and send the last line to email adress - vbscript

- by matthias

Hallo, I'm back haha :-) so i have the next question and i hope someone can help me... I know i have a lot of questions but i will try to learn vbscript :-) Situation: I try to make a program that check all 5 min a txt and if there a new line in the txt, i'll try to send it to my eMail Address Option Explicit Dim fso, WshShell, Text, Last, objEmail Const folder = "C:\test.txt" Set fso=CreateObject("Scripting.FileSystemObject") Set WshShell = WScript.CreateObject("WScript.Shell") Do Text = Split(fso.OpenTextFile(Datei, 1).ReadAll, vbCrLF) Letzte = Text(UBound(Text)) Set objEmail = CreateObject("CDO.Message") objEmail.From = "[email protected]" objEmail.To = "[email protected]" objEmail.Subject = "Control" objEmail.Textbody = Last objEmail.Configuration.Fields.Item _ ("http://schemas.microsoft.com/cdo/configuration/sendusing") = 2 objEmail.Configuration.Fields.Item _ ("http://schemas.microsoft.com/cdo/configuration/smtpserver") = _ "smtpip" objEmail.Configuration.Fields.Item _ ("http://schemas.microsoft.com/cdo/configuration/smtpserverport") = 25 objEmail.Configuration.Fields.Update objEmail.Send WScript.Sleep 300000 Loop This program works, but this program send me all 5 mins a mail...but i will only have a new mail when there is a new line in the txt. Can someone help me?

Read the article
how to upload .txt files to sql database using asp.net(C#)

- by Ranjana

how to upload .txt files to sql database using asp.net(C#)

Read the article
From escaped html -> to regular html? - Python

- by RadiantHex

Hi folks, I used BeautifulSoup to handle XML files that I have collected through a REST API. The responses contain HTML code, but BeautifulSoup escapes all the HTML tags so it can be displayed nicely. Unfortunately I need the HTML code. How would I go on about transforming the escaped HTML into proper markup? Help would be very much appreciated!

Read the article
Showing HTML comment strings () in HTML files

- by Andrei

Hello all. I'm building a source code search engine, and I'm returning the results on a HTML page (aspx to be exact, but the view logic is in HTML). When someone searches a string, I also return the whole line of code where this string can be found in a file. However, some lines of code come from HTML/aspx files and these lines contain HTML specific comments (). When I try to print this line on the HTML page, it interprets it as a comment and does not show it on the screen....how should I go about solving this so that it actually shows up? Any help would be welcomed. Thanks. edit: err...i see now that firebug could help me with this: 

Read the article
Should I add a "nofollow" attribute to download links, or disallow the URLs in robots.txt?

- by Laurent

I have a download link very similar to Opera's one - it's just a script that sends the file. It doesn't have an extension and there's no obvious way to tell that it's actually a download link. So since I don't want robots to crawl this link, do I need to add it to robots.txt or maybe add a "nofollow" attribute to it? I see that on Opera's website they didn't do either of this, so perhaps it's not necessary?

Read the article
Should I disallow(robots.txt) archive/author pages with links already available on the front page? [on hold]

- by WPRookie82

I am working on a simple Wordpress blog where when an article is published, it appears on ALL these pages: Homepage - Headline(clickable) + 3-line summary Parent category page - Headline(clickable) + 3-line summary Child category page - Headline(clickable) + 3-line summary Author page - Headline(clickable) sitemap.xml I've been told that I should add all author pages to my robots.txt, under disallow, so as search engine bots do not spider /author/* since all links on these pages are available elsewhere. Is this a good approach or maybe rel=nofollow is better, or maybe I shouldn't worry about this at all?

Read the article
How do you parse an HTML in vb.net

- by tooleb

I would like to know if there is a simple way to parse HTML in vb.net. I know that HTML is not sctrict subset of XML, but it would be nice if it could be treated that way. Is there anything out there that would let me parse HTML in an XML-like way in VB.net?

Read the article
Fast, lightweight HTML parser for C++

- by Jen

I'm looking for a fast, lightweight open-source HTML parser -- something along the lines of a non-validating SAX parser (except, of course, for HTML). The answers to this question cover a parser that generates a DOM (don't want that), and these answers suggest conforming the HTML to XML before sending it to Xerxes (can't do that in my case). Any suggestions?

Read the article

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >