Search Results

Search found 4984 results on 200 pages for 'robots txt'.


  • Does GoogleBot respect User-agent: *

    - by rkulla
    I blocked a page in robots.txt under User-agent: *, and tried to do a manual removal of that URL from Google's cache in Webmaster Tools. Google said it wasn't being blocked by my robots.txt, so I then blocked it specifically under User-agent: GoogleBot and tried removing it again, and this time it worked. Does that mean Google doesn't respect User-agent: *, or what?
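
    For reference, a minimal robots.txt sketch of the two variants described above; the page path is a placeholder, not taken from the question:

      # blocked for all crawlers
      User-agent: *
      Disallow: /private-page.html

      # blocked for Googlebot specifically
      User-agent: Googlebot
      Disallow: /private-page.html

    Note that under the robots.txt convention a crawler obeys only the most specific group that matches its user agent, so a Googlebot group, when present, is read instead of the * group rather than in addition to it.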

    Read the article

  • [Disallow: /index.php] seems to block /my-beautiful-sef-url-123

    - by Jaroslav Záruba
    Hello, I have a robots.txt that looks like this:

      User-agent: *
      Disallow: /system/
      Disallow: /admin/
      Disallow: /index.php

    The obvious goal has been to prevent all the ugly URLs from being indexed, as they all begin with "/index.php". But for some reason all URLs like /my-beautiful-sef-url-123 are listed under Crawl errors in Google Webmaster Tools with "URL restricted by robots.txt". (When I test such a URL it yields Allowed for both Googlebot and Googlebot-Mobile.) Can anyone help, please?

    Read the article

  • Where to put the SPF TXT record?

    - by YellowSquirrel
    I've set up Google Apps for my domain: I registered the domain with Google by adding the CNAME record Google asked for, and I've apparently successfully set up the Google mail servers via MX records. I don't have a dedicated server yet; I'm just holding the domain at a registrar. Now I want to activate SPF and I'm confused. On the following short webpage: http://www.google.com/support/a/bin/answer.py?answer=178723 it is written that I must add a TXT record containing:

      v=spf1 include:_spf.google.com ~all

    Where should I enter this? Should this go in the zone (?) file, like I did for the CNAME and the MX records? So far I have something like this:

      @ 10800 IN A 217.42.42.42
      @ 10800 IN MX 5 ASPMX3.GOOGLEMAIL.COM.
      @ 10800 IN MX 5 ASPMX2.GOOGLEMAIL.COM.
      @ 10800 IN MX 3 ALT2.ASPMX.L.GOOGLE.COM.
      @ 10800 IN MX 3 ALT1.ASPMX.L.GOOGLE.COM.
      @ 10800 IN MX 1 ASPMX.L.GOOGLE.COM.
      google8a70835987f31e34 10800 IN CNAME google.com.

    Does adding the SPF TXT record mean I should literally have something like this:

      @ 10800 IN A 217.42.42.42
      @ 10800 IN MX 5 ASPMX3.GOOGLEMAIL.COM.
      @ 10800 IN MX 5 ASPMX2.GOOGLEMAIL.COM.
      @ 3600 IN TXT "v=spf1 include:_spf.google.com ~all"
      @ 10800 IN MX 3 ALT2.ASPMX.L.GOOGLE.COM.
      @ 10800 IN MX 3 ALT1.ASPMX.L.GOOGLE.COM.
      @ 10800 IN MX 1 ASPMX.L.GOOGLE.COM.
      google8a70835987f31e34 10800 IN CNAME google.com.

    I made that one up, and put the TXT record right in the middle, to show how confused I am. What I'd like to know is the exact syntax and where/how I should put this TXT record.

    Read the article

  • Loading levels from .txt or .XML for XNA

    - by Dave Voyles
    I'm attempting to add multiple levels to my pong game. I'd like to simply exchange a few elements with each level, nothing crazy: just the background texture, the color of the AI paddle (the one on the right side), and the music. It seems that the best way to go about this is by utilizing StreamReader to read and write the files, or loading them from XML. If there is a better or more efficient alternative, I'm all for it. Looking over the XNA Starter Platformer Kit provided by MS, it seems they've done it in this manner as well. I'm perplexed by a few things, however, namely parts within the Level class which aren't commented:

      /// <summary>
      /// Iterates over every tile in the structure file and loads its
      /// appearance and behavior. This method also validates that the
      /// file is well-formed with a player start point, exit, etc.
      /// </summary>
      /// <param name="fileStream">
      /// A stream containing the tile data.
      /// </param>
      private void LoadTiles(Stream fileStream)
      {
          // Load the level and ensure all of the lines are the same length.
          int width;
          List<string> lines = new List<string>();
          using (StreamReader reader = new StreamReader(fileStream))
          {
              string line = reader.ReadLine();
              width = line.Length;
              while (line != null)
              {
                  lines.Add(line);
                  if (line.Length != width)
                      throw new Exception(String.Format(
                          "The length of line {0} is different from all preceding lines.",
                          lines.Count));
                  line = reader.ReadLine();
              }
          }

    What does width = line.Length mean exactly? I mean, I know how it reads the line, but what difference does it make if one line is longer than any of the others? Finally, their levels are simply text files that look like this:

      ....................
      ....................
      ....................
      ....................
      ....................
      ....................
      ....................
      .........GGG........
      .........###........
      ....................
      ....GGG.......GGG...
      ....###.......###...
      ....................
      .1................X.
      ####################

    It can't be that easy..... Can it?
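
    On the width question: the loader shown treats a level as a rectangular grid of tile characters, so the width of the first line becomes the expected width of every line, and the check throws if a later row differs, presumably because a ragged grid would break the column-by-column tile lookup later on. For the level-specific settings the question actually needs (background, paddle color, music), one lightweight option is a small XML file per level read with LINQ to XML. This is only a sketch under assumed element names (Background, PaddleColor and Music are made up for illustration), not the Platformer kit's format:

      using System;
      using System.Xml.Linq;
      using Microsoft.Xna.Framework;

      public class LevelSettings
      {
          public string BackgroundAsset;   // content name of the background texture
          public Color PaddleColor;        // tint for the AI paddle
          public string MusicAsset;        // content name of the background song

          // Hypothetical level file, e.g. Content/Levels/level1.xml:
          // <Level>
          //   <Background>Backgrounds/space</Background>
          //   <PaddleColor>255,0,0</PaddleColor>
          //   <Music>Music/theme1</Music>
          // </Level>
          public static LevelSettings Load(string path)
          {
              XElement root = XElement.Load(path);
              string[] rgb = ((string)root.Element("PaddleColor")).Split(',');

              return new LevelSettings
              {
                  BackgroundAsset = (string)root.Element("Background"),
                  PaddleColor = new Color(byte.Parse(rgb[0]), byte.Parse(rgb[1]), byte.Parse(rgb[2])),
                  MusicAsset = (string)root.Element("Music")
              };
          }
      }

    The tile grid itself can stay in the plain .txt format shown above; only the per-level cosmetics need to move to XML.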

    Read the article

  • Write data into .txt file created by CFileDialog, in C++

    - by younevertell
    I want to write data into a .txt file created by CFileDialog, in C++. The problem I am facing is that the code below doesn't work, although there is no build error: the .txt file created by CFileDialog cannot be found afterwards, for some reason. What's wrong with the code? What's the efficient way to write data into a .txt file created by CFileDialog in C++? Thanks.

      CFileDialog dlg(FALSE, NULL, NULL, OFN_OVERWRITEPROMPT, _T("My Data File (*.txt)|*.txt||"));
      if (dlg.DoModal() != IDOK)
          return;
      CString filename = dlg.GetPathName();

      ofstream outfile (filename);
      int mydata = 10;
      outfile << "my data:" << mydata << endl;
      outfile.close();

    Read the article

  • I'm getting a "Does not implement IController" error on images and robots.txt in MVC2

    - by blesh
    I'm getting a strange error on my web server for seemingly every file except the .aspx files. Here is an example; just replace '/robots.txt' with any .jpg or .gif name or whatever and you'll get the idea:

      The controller for path '/robots.txt' was not found or does not implement IController.

    I'm sure it's something to do with how I've set up routing, but I'm not sure what exactly I need to do about it. Also, this is a mixed MVC and WebForms site, if that makes a difference.
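
    Not the poster's code, but a minimal sketch of the route registration (normally in Global.asax) that this kind of error usually points at: static files such as robots.txt and images end up in the MVC pipeline unless routing is told to leave them alone, either by not routing existing files or by an explicit IgnoreRoute. The extension list in the pattern is illustrative, not exhaustive:

      using System.Web.Mvc;
      using System.Web.Routing;

      public static class MvcRoutes
      {
          public static void RegisterRoutes(RouteCollection routes)
          {
              // When false (the default), requests that match a physical file
              // on disk (robots.txt, images, ...) bypass routing entirely.
              routes.RouteExistingFiles = false;

              // Standard WebResource/ScriptResource exclusion.
              routes.IgnoreRoute("{resource}.axd/{*pathInfo}");

              // Explicitly ignore common static extensions as well.
              routes.IgnoreRoute("{*staticfile}",
                  new { staticfile = @".*\.(txt|xml|jpg|jpeg|gif|png|css|js)(/.*)?" });

              routes.MapRoute(
                  "Default",
                  "{controller}/{action}/{id}",
                  new { controller = "Home", action = "Index", id = UrlParameter.Optional });
          }
      }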

    Read the article

  • Unit testing Google wave robots in Java

    - by Paul
    Any tips or best practices for unit testing Google Wave robots written in Java? I'm expecting to deploy on AppEngine, if that helps. I'm a fan of TDD but new to both Wave Robots and AppEngine, so I'm hoping to use TDD to help me explore the design space.

    Read the article

  • Convert file from XLS to TXT format

    - by meenakshi
    I renamed a .xls file to a .txt file, but the .txt file then shows content like this:

      öRn÷åËþEõZ;zûÕ÷IÉD;Üÿêá¾ú‡×_*4u\çBq&?!€¨(ˆô‰/t/ª§‹'Oý?WïçþJ®½ãÁÁ|DiæEÇ’IŠ›ðä/\$r'Ù¡îê?|ÕÝꮲ¸ó—QG¦çåΊ/–×ÒÏpQP~Q|å?‘ÈpÖ‘»ëËß62 Š/zaqçÎÖ©ú•ênÆ›WÚ·«àÙ}S«ˆº/~Ø×:U”¥‡7îTísvçùòl!ý0ýá»êéqx[T¯ß¿ø1 ¼ð…PïXáÃñ}ý«Å'gK)á*ÐÒm¦ŽØ™¯Püä©ïªÇË‹—‹U¡Jq5ÿ‚Šÿò6qúߟX½ûuqi&×6êÞç~°ðÄÜ‹Òý¥Ÿø…cŸEão(ýNÖáÛ«îâW·‰ôîÏn¥zÚ¨—ÞŸqwø6XuÁËóñ¤;¢òhz¾&®û\â:çýÎxÒL\ïùJn2ì4ÓÖ¾‚{9˜Œ›‰<Ÿ¸qgØ0ä†Ï'®Ûë5¹Ñó‰ë›ò0~FšÜäù

    How can I get the same data as in the .xls file?

    Read the article

  • Cannot properly save the source of an .html file containing Russian letters as .txt

    - by brilliant
    When I save the source of this page of a Russian website, http://www.mail.ru/, as a .txt file, all Russian letters turn into Chinese characters (I am working on a Chinese computer at the moment). But when I save another page of another Russian website (http://starling.rinet.ru/cgi-bin/response.cgi?root=/usr/local/share/starling/morpho&morpho=0&basename=\usr\local\share\starling\morpho\ozhegov\ozhegov&first=4001), also as a .txt file, all Russian letters are saved in it as they are. Why is that?

    Read the article

  • Preventing Thunderbird from adding .txt to attachments on open

    - by Horcrux7
    How can I prevent Thunderbird from adding the extension .txt to a file when opening the attachment? I have the problem with .patch files, which I want to look at in Notepad++. The problem is that Notepad++ does not detect the right formatting for the file because the extension is .txt. If I drag the file onto the desktop and double-click it, everything works. Why does Thunderbird change the file name on opening? I am working on Windows 7.

    Read the article

  • Make Programmer's Notepad default new files to .txt?

    - by acidzombie24
    I am using http://www.pnotepad.org/ (I wouldn't mind switching to something else if it's lightweight and has most/all of the features I like, which I'll check on an app-by-app basis). When I create a new tab/file and save it, unless I type .txt I get a file with no extension, which makes it hard to open since I can't double-click it (I don't think I can tell Windows 7 to set a default app for files with no extension). How do I make pnotepad save with a .txt extension when none is specified?

    Read the article

  • Why deny access to website for msnbot/bingbot?

    - by Quandary
    I've seen quite a lot of tutorials that recommend banning user agents containing the strings libwww-perl and msnbot. I understand why one would ban libwww-perl; it's mainly if not only used for hacking and spamming. But why are there so many sites recommending banning msnbot/bingbot? Since it's a search engine, even if only one with a marginal market share, I would expect one would want this bot to crawl one's sites. What is it that msnbot does that makes people ban it?

    Read the article

  • What bots are really worth letting onto a site?

    - by blunders
    Having written a number of bots, and having seen the massive amount of random bots that happen to crawl a site, I am wondering: if the goal of allowing bots onto a site is the potential for the bot to send real traffic back, is there any reason to allow bots that are not known to send real traffic back? And how do you spot these "good" bots, based on how they ID themselves, the IPs they come from, their behavior, etc.?
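
    One common way to separate the major search-engine crawlers from impostors is a reverse-then-forward DNS check on the requesting IP, which Google and Bing both document for their crawlers. A rough C# sketch of that check, using the well-known hostname suffixes (illustrative only; results would normally be cached since the lookups are slow):

      using System;
      using System.Linq;
      using System.Net;

      static class BotVerifier
      {
          // Hostname suffixes used by the big crawlers' reverse-DNS entries.
          static readonly string[] TrustedSuffixes =
              { ".googlebot.com", ".google.com", ".search.msn.com", ".crawl.yahoo.net" };

          // True if the IP reverse-resolves to a trusted crawler host
          // and that host resolves back to the same IP (forward-confirmed).
          public static bool IsVerifiedCrawler(string ip)
          {
              try
              {
                  string host = Dns.GetHostEntry(IPAddress.Parse(ip)).HostName;
                  if (!TrustedSuffixes.Any(s => host.EndsWith(s, StringComparison.OrdinalIgnoreCase)))
                      return false;

                  return Dns.GetHostAddresses(host).Any(a => a.ToString() == ip);
              }
              catch (Exception)
              {
                  return false;   // lookup failed: treat as unverified
              }
          }
      }

    Anything that claims a big-name user agent but fails this check is lying about its identity; the rest comes down to behavior (crawl rate, whether it honors robots.txt) and whether its referrals ever show up in analytics.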

    Read the article

  • How to get search engines to properly index an ajax driven search page

    - by Redtopia
    I have an ajax-driven search page that will allow users to search through a large collection of records. Each search result points to index.php?id=xyz (where xyz is the id of the record). The initial view does not have any records listed, and there is no interface that allows you to browse through all records; you can only conduct a search. How do I build the page so that spiders can crawl each record? Or is there another way (outside of this specific search page) that will allow me to point spiders to a list of all records? FYI, the collection is rather large, so dumping links to every record in a single request is not a workable solution; outputting the records must be done in multiple requests. Each record can be viewed via a single page (e.g. "record.php?id=xyz"). I would like all the records indexed without anything being indexed from the sitemap page that shows where the records exist, for example:

      <a href="/result.php?id=record1">Record 1</a>
      <a href="/result.php?id=record2">Record 2</a>
      <a href="/result.php?id=record3">Record 3</a>
      <a href="/seo.php?page=2">next</a>

    Assuming this is the correct approach, I have these questions: How would the search engines find the crawl page? Is it possible to prevent the search engines from indexing the words "Record 1", etc. and "next"? Can I output only the links? Or maybe something like:
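
    Not from the question, but a sketch of one arrangement commonly used for this: a paginated crawl page marked noindex, follow, so its own text stays out of the index while its links are still followed, with the first crawl page advertised through an XML sitemap or a Sitemap: line in robots.txt so spiders can find it. The file names below simply follow the question's own examples:

      <!-- /seo.php?page=1 : link page for crawlers, not meant to rank itself -->
      <!DOCTYPE html>
      <html>
      <head>
        <meta name="robots" content="noindex, follow">
        <title>Record index, page 1</title>
      </head>
      <body>
        <a href="/record.php?id=record1">Record 1</a>
        <a href="/record.php?id=record2">Record 2</a>
        <a href="/seo.php?page=2">next</a>
      </body>
      </html>

    Whether the individual record pages then rank well depends on their own content; the crawl page only solves discovery.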

    Read the article

  • Using rel=canonical and noindex in a 1-n partners environment

    - by Telemako Mako
    We sell a whole site (domain, etc.) to partners that create content which is shown together on the main site. What we want to achieve is that the main site's copy is the original, but the one that gets indexed is the partner's copy. We want it this way so that the search results point to the partner sites and never to the main site, while the main site gets all the credit for the links. We are trying to set the main-site article to noindex, follow with a link to the partner article, and in the partner article we have a rel=canonical pointing to the main-site article. Are we correct, or will the noindex at the main site break the canonical reference?
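
    For clarity, this is roughly what the arrangement described above looks like in markup; the URLs are placeholders, and this only restates the setup in the question rather than endorsing it:

      <!-- main-site article: kept out of the index, links still followed -->
      <head>
        <meta name="robots" content="noindex, follow">
      </head>

      <!-- partner article: declares the main-site copy as canonical -->
      <head>
        <link rel="canonical" href="http://main-site.example/article-123">
      </head>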

    Read the article

  • Adsense click bot is click bombing my site

    - by Graham
    I have a site that gets roughly 7,000 - 10,000 page views per day right now. Starting around 1 AM on 7/1/12 I noticed the CTR was rising dramatically. These clicks would be credited and then de-credited soon after, so they were obviously fraudulent clicks. The next day I had about 200 clicks in the account, with about 100 of them being fraudulent. It's about 3 - 8 per hour, evenly dispersed across each of the three ads, 24 hours a day. This leads me to believe that it's some sort of AdSense click bot. Also, I removed the ads last evening, then put them back up around 3 AM, and the invalid clicks started within 10 minutes. I signed up for statcounter.com to analyze the exit links on the AdSense ads. Then I conditionally blocked ads for the IP address of the person/bot I suspected of doing this. But I think the bot has several proxies to choose from and can refresh IP addresses. I've notified Google through the invalid-click form/email four times over the past two days to let them know I'm aware of the situation and am working on a solution. I've also temporarily removed all ads on that site. How can I block a bot like this? Thank you.

    Read the article

  • Receiving requests where absolute URL on page are morphed to relative URLs

    - by Jacob
    In our web pages, we have a hyperlink with an href to an absolute URL:

      https://some.other.host.com/blah.aspx?var1=val1&var2=val2

    For some reason, in our logs, we see a lot of requests to URLs of this format:

      http://our.site.com/https:/some.other.host.com/blah.aspx?var1=val1&var2=val2

    We don't have any JavaScript that would request that URL; it only appears inside of a hyperlink. Is there some sort of known bot, browser plugin, bug, etc. that could be responsible for these requests being made?

    Read the article

  • Should I block bots from my site and why?

    - by Frank E
    My logs are full of bot visitors, often from Eastern Europe and China. The bots identify themselves as Ahrefs, Seznam, LSSRocketCrawler, Yandex, Sogou and so on. Should I block these bots from my site, and why? Which ones have a legitimate purpose in increasing traffic to my site? Many of them are SEO crawlers. I have to say I see less traffic, if anything, since the bots arrived in large numbers. It would not be too hard to block them, since they all admit in their User-Agent that they are bots.
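
    Since the crawlers named above identify themselves, the well-behaved ones can be excluded in robots.txt by their user-agent tokens; anything that ignores robots.txt needs server-side blocking instead. A sketch using two of the commonly documented tokens (verify the exact token in each crawler's documentation before relying on it):

      User-agent: AhrefsBot
      Disallow: /

      User-agent: SeznamBot
      Disallow: /

    The same pattern extends to the other bots in the list.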

    Read the article

  • Uploading a non-finished website

    - by Daniel
    I have a pretty basic question. I developed a neat little website which I'm ready to upload, but it still needs a bit of work. The designer needs the HTML to do his work, so the website needs to be uploaded. Besides that, I have to correct a couple of details, do the friendly URLs, etc. What's the best way to set up the webpage on the definitive hosting with the definitive domain, blocking it to any unknown users and without affecting SEO and that kind of thing? If I were to just upload it, the non-definitive website might be crawled by a search-engine bot. Thanks!
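
    One common combination for a not-yet-launched site is HTTP basic authentication to keep unknown visitors out, plus a temporary robots.txt that disallows crawling, with both removed at launch. A sketch, assuming an Apache host (the paths are made up; other servers have equivalent settings):

      # .htaccess - password-protect the whole site while it is unfinished
      AuthType Basic
      AuthName "Work in progress"
      AuthUserFile /home/example/.htpasswd
      Require valid-user

      # robots.txt - ask crawlers to stay away until launch
      User-agent: *
      Disallow: /

    With the authentication in place the robots.txt is mostly belt and braces, since crawlers cannot fetch the pages anyway.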

    Read the article

  • Yandex frequently replaces page names with ampersands

    - by Guy
    The Yandex spider is a frequent visitor to one of the sites I manage. On occasion it replaces the page name with two ampersands and a space. So if the page is /mypage.aspx?param=value then it will try to crawl it as /&& ?param=value. Any idea why it is doing this? [EDIT] If I remember correctly, the IP that this "mistake" is coming from is based in California and not Russia. I believe that they crawl US sites from a US-based IP address. Not sure if that helps. More info about the request:

      IP: 199.21.99.82
      City: Palo Alto
      State: California
      Country: United States
      ISP: Yandex Inc.
      User-Agent: Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)

    Read the article

  • How to hide download file from bots? [closed]

    - by CJ7
    Possible Duplicate: How to restrict the download of all files in a folder? I want to make a private file available for download, but not use username/password protection. I want to put the file into a directory called something like download. How can I ensure that the file does not become part of search-engine results, and that the file cannot be accessed by bots that might guess the directory name?
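
    For the first half of that (keeping the file out of search results), one option is an X-Robots-Tag response header on the file, which avoids publishing the directory name the way a robots.txt Disallow line would; note that neither mechanism actually prevents a bot from fetching the file if it discovers the URL. A sketch of the header as it would appear on the response:

      HTTP/1.1 200 OK
      Content-Type: application/octet-stream
      X-Robots-Tag: noindex, nofollow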

    Read the article

  • Should I set NOINDEX header for my JS, CSS and image files?

    - by Yoga
    Are there any harms if my site sends NOINDEX headers for all my static assets? For image files, I mean the ones with no search value, e.g. background images, button images, etc. Update: more background information. The reason I have this concern is that Google has said they also execute JS and might fetch content via Ajax. So, for example, if I send noindex for my jQuery script, would Google no longer be able to use it to load Ajax content? I suppose that would not be good for my site's SEO, right?
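
    For reference, the NOINDEX signal for static files is normally delivered as an X-Robots-Tag response header. A sketch of one way to attach it to a folder of assets, assuming an IIS host (any server can add the same header):

      <!-- web.config placed in the static-assets folder -->
      <configuration>
        <system.webServer>
          <httpProtocol>
            <customHeaders>
              <add name="X-Robots-Tag" value="noindex" />
            </customHeaders>
          </httpProtocol>
        </system.webServer>
      </configuration>

    A noindex header only asks engines not to list the file in search results; unlike a robots.txt Disallow, it does not stop the crawler from fetching the script, so Ajax-loaded content remains retrievable.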

    Read the article

  • Is browser and bot whitelisting a practical approach?

    - by Sn3akyP3t3
    With blacklisting it takes plenty of time to monitor events to uncover undesirable behavior and then take corrective action. I would like to avoid that daily drudgery if possible. I'm thinking whitelisting would be the answer, but I'm unsure whether that is a wise approach, given its deny-all, allow-only-a-few nature. My fear is that eventually someone out there will be blocked unintentionally. Even so, whitelisting would also block plenty of undesired traffic to pay-per-use items such as the Google Custom Search API, as well as preserve bandwidth and my sanity. I'm not running Apache, but the idea would be the same, I'm assuming. I would essentially be depending on the User-Agent identifier to determine who is allowed to visit. I've tried to take accessibility into account, because some web browsers are more geared toward those with disabilities, although I'm not aware of any specific ones at the moment. I fully understand the need not to depend on whitelisting alone to keep the site away from harm; other means to protect the site still need to be in place. I intend to have a honeypot, a checkbox CAPTCHA, use of OWASP ESAPI, and blacklisting of previously known bad IP addresses.
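
    As a rough illustration of the idea being weighed here (not a recommendation), a user-agent whitelist in an ASP.NET module might look like the sketch below; the allow-list fragments are entirely hypothetical, and getting that list right is exactly where the accessibility concern above comes in:

      using System;
      using System.Linq;
      using System.Web;

      public class UserAgentWhitelistModule : IHttpModule
      {
          // Hypothetical allow-list of user-agent fragments.
          private static readonly string[] Allowed =
              { "Firefox", "Chrome", "Safari", "MSIE", "Trident", "Opera", "Googlebot", "bingbot" };

          public void Init(HttpApplication app)
          {
              app.BeginRequest += (sender, e) =>
              {
                  string ua = app.Context.Request.UserAgent ?? "";
                  bool allowed = Allowed.Any(token =>
                      ua.IndexOf(token, StringComparison.OrdinalIgnoreCase) >= 0);

                  if (!allowed)
                  {
                      app.Context.Response.StatusCode = 403;   // deny anything not on the list
                      app.Context.Response.End();
                  }
              };
          }

          public void Dispose() { }
      }

    Requests with empty or unusual User-Agent strings would be rejected by this check, which is exactly the unintentional-blocking risk described above.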

    Read the article
