Search Results

Search found 12829 results on 514 pages for 'crawl errors'.

Page 101/514 | < Previous Page | 97 98 99 100 101 102 103 104 105 106 107 108  | Next Page >

  • How to remove duplicate content, which is still indexed, but not linked to anymore?

    - by David
    A bug in the tool, which we use to create search-engine-friendly URLs changed our whole URL-structure overnight, and we only noticed after Google already indexed the page. Now, we have a massive duplicate content issue, causing a harsh drop in rankings. Webmaster Tools shows over 1,000 duplicate title tags, so I don't think, Google understands what is going on. Right URL: abc.com/price/sharp-ah-l13-12000-btu.html Wrong URL: abc.com/item/sharp-l-series-ahl13-12000-btu.html (created by mistake) After that, we ... Changed back all URLs to the "Right URLs" Set up a 301-redirect for all "Wrong URLs" a few days later Now, still a massive amount of pages is in the index twice. As we do not link internally to the "Wrong URLs" anymore, I am not sure, if Google will re-crawl them very soon. What can we do to solve this issue and tell Google, that all the "Wrong URLs" now redirect to the "Right URLs"? Best, David

    Read the article

  • Does a "nofollow" attribute on a link prevent URL discovery by search engines?

    - by Stephen Ostermiller
    I know that nofollow prevents link juice from being passed across a link. But if search engine robots discover a link with a nofollow on it, will they add that link to their crawl queue? In other words, if I create a link to a brand new page and put a rel=nofollow attribute on that link, will it prevent search engine bots (particularly Googlebot) from crawling the page. (Assuming that this link remains the only link into that page.) I've read conflicting reports about this over the years and I'm looking for authoritative references about the current state of affairs. Official statements from Google or published results of independent testing would be ideal.

    Read the article

  • Google is not indexing my entire site despite having a sitemap

    - by Anusha
    I have an e-commerce website www.beyondtime.in. I have been constantly monitoring Googlebot crawling on my website and my webmaster account. Lately, I have found two issues that I have not been able to understand. 1.) The Google Bots have been only crawling www.beyondtime.in/telecom.php when the URL is not even valid. What needs to be done to let Google crawl other pages of the website as well? 2.) The second question is about the Google Webmaster account, where I've submitted my sitemap with 227 URLs. Out of that, only 156 have been indexed. None of the images of my website have been indexed by Google.

    Read the article

  • webmaster tools - Network Unreachable

    - by Jayapal Chandran
    Hi, webmaster tools for my site displays that robots.txt unreachable and for all links in sitemap it says network unreachable. sitemap.xml unreachable. These appear in crawl stats page. I discussed with the support team of my hosting and they said... Hi, I have verified apache logs, i cannot see any issues on your website/webserver/ Possible issues. There may the routing issue from the googles server to our server. When a google bots hits goes high the IP will be automatically blacklisted by our firewall to avoid server loads & downtimes. As we donot have access to their services, We cannot able to give details of their details/logs etc. The sitemaps link shows an exclamation mark which means the file was not reachable. What could be the problem and how to solve it?

    Read the article

  • Problem with homepage's SEO when using subfolders in a multi language website

    - by Antonio
    After watching a hundreds of threads about multilanguage website I haven't found an answer to my specific problem, so I think its not a common issue and I must have done something terribly wrong ;-) We have a brand.com website in DE main language and the following subfolders: /de/ = canonical of / + redirect to / /it/ /en/ When I crawl google.com for EN keywords or google.it for IT keywords then I get as results the homepage in German language (both title and description) as the top result with no trace of the /it/ or the /en/ homepage. Is this because /it/ and /en/ both needs a separate link building strategy? I've already configured Google webmaster tool into the following way: brand.com, no language preference brand.com/de/, de language brand.com/it/, it language brand.com/en/, en language Perhaps having "/" as DE main page is it wrong and I should use a different approach? i.e. like having "/" to be a 301 to /de/ instead ? Thanks in advance.

    Read the article

  • Sample size and statistical significance in Google Analytics

    - by colmcq
    I have been asked to compile a report into dropout rates during checkout for a global webstore I have used a sample size over one month as my sample because: google analytics slows to a crawl over larger sample sizes and makes much of the analysis agonisingly small I believe it to be statistically significant and a representative sample My client has asked me why I didn't use yearly figures and wants proof that one month of data is 'statistically significant'. Am I right in thinking that I need to compare the standard deviation of my monthly sample to the yearly sample and ensure that the deviation is under a certain %age? Question: how do I prove one month of Google Analytics data is representative to one year worth of data? Stats: 90k unique views/month ~1.1m per year.

    Read the article

  • My domain PageRank shows as unavailable, why is that?

    - by Emerson
    My domain, http://www.anovaordemmundial.com , has been snatched by some opportunist when I failed to renew the domain. I know, it's all my fault :/ . After I have being ripped off and bought my domain back, and everything is configured and working, the pagerank for that domain shows as unavailable. Also searches for "nova ordem mundial" (in portuguese), which used to show my domain as the first result in searches in any language, now don't show it anymore. Do you think this is something temporary and it will recover its pagerank after a full crawl by google? There exists hundreds of sites pointing to my domain, that is why I got the previous relevance in searches. The domain is back for more than 5 days already. In reality, bing already Is there anything I can do to help get my domain back to its pagerank??? Thanks for the help!

    Read the article

  • Server overhead caused by bots?

    - by giuseppe
    I have one customer website causing overhead (http://www.modacalcio.it/en/by-kind/football-boots.html). With htop opened, I am trying navigate the website and the much load of the website is done by the ajax link being placed on the left side of the website. The website is hosted by a VPS with 3 proc and 2GB RAM, with enough hard with disk space. The real problem is that this website is new and not visited much. From the http-status module I am seeing that the overhead is caused by bots (Google bots, Bing bots, hrefs checker and so on). So I thought that's probably due to those spiders trying to crawl all those links at once - could this be causing this overhead? I have also put rel="nofollow" in those links, but this doesn't keep the bots away. Is there any way through code or Plesk to disable those links to those bots?

    Read the article

  • Google Bot trying to access my web app's sitemap

    - by geekrutherford
    Interesting find today...   I was perusing the event log on our web server today for any unexpected ASP.NET exceptions/errors. Found the following:   Exception information: Exception type: HttpException Exception message: Path '/builder/builder.sitemap' is forbidden. Request information: Request URL: https://www.bondwave.com:443/builder/builder.sitemap Request path: /builder/builder.sitemap User host address: 66.249.71.247 User: Is authenticated: False Authentication Type: Thread account name: NT AUTHORITY\NETWORK SERVICE   At first I thought this was maybe an attempt by a hacker to mess with the sitemap. Using a handy web site (www.network-tools.com) I did a lookup on the IP address and found it was a Google bot trying to crawl the application. In this case, I would expect an exception or 403 since the site requires authentication anyway.

    Read the article

  • Google webmaster Verification failed.

    - by KMC
    I have a site created by Ruby on Rails. I had verified against Google Webmaster Tool some months ago, which was successful. One day webmaster starts giving me Re-verification fails. I tried again to verify my site using Meta tags and HTML files. But I kept having "Verification failed. The connection to your server timed out." Since then, Google stop crawling my site's content - though, somehow google still crawl my PDF contents on my site.

    Read the article

  • Google Authorship: can I display:none for link to profile?

    - by RubenGeert
    I'd like to have my 'mugshot' in Google's SERPs but I couldn't care less about Google+. I don't really want to link my website to Google+ either. Can I use CSS display:none; on the link leading to my profile and still have authorship, which looks like <a href='https://plus.google.com/111823012258578917399?rel=author' rel='nofollow'>Google</a>? Will the nofollow attribute here spoil things? I don't want to lose 'link juice' on Google+ if I don't have to. Now Google should crawl only the HTML but I'm sure they'll figure out the link is not visible (perhaps it's technically even cloaking. Does anybody have experience with this situation? And do I really have to become (reasonably) active on Google+ in order for authorship to show? This answer suggests I do but I didn't read anything on that in Google's guidelines.

    Read the article

  • Panda 4: Reducing #indexed pages. How much is enough?

    - by Noam
    I've been hit by panda 4 (40% decrease). I didn't see any change during panda 1-3. From what I've read it and when compared to my site, the change is probably due to the fact that I have over 30M pages indexed on Google, and they've starting seeing that as some sort of bad indication. Although I feel all of the pages have a unique value that Google should crawl, it seems I should make some tough calls and deduce the indexed pages according to some prioritization I will conduct. The question is what should be my target, or what factors should help me figure out a relevant target. How many pages should I try to reduce to? - 25M - 15M - 1M - 2000 Is it enough to add noindex to low priority pages or should I also remove all internal linking to them?

    Read the article

  • My First robots.txt

    - by Whitechapel
    I'm creating my first robots.txt and wanted to get a second opinion on it. Basically I have a FTP setup on my board for some special users to transfer files between each other and I do NOT want that included in the search by the bots. I also want to point to my sitemap which gets auto generated by a PHP page. So here is what I have, what else should I include, and if I need to fix anything with it? Also, it's linking to xmlsitemap.php because that generates the sitemap when called. My goal is to allow any search bot crawl the forums to grab meta data. User-agent: * Disallow: /admin/ Disallow: /ali/ Disallow: /benny/ Disallow: /cgi-bin/ Disallow: /ders/ Disallow: /empire/ Disallow: /komodo_117/ Disallow: /xanxan/ Disallow: /zeroordie/ Disallow: /tmp/ Sitemap: http://www.vivalanation.com/forums/xmlsitemap.php Edit, I'm not sure how to handle all the user's folders under /public_html/ since the robots.txt will be going in /public_html.

    Read the article

  • Games display across both monitors when opened

    - by Mitch
    I am not sure what setting to change for this but I open games like Dungeon Crawl or a Steam game and the game wants to take up both screens. Is there a way to have the game open just on one screen xrandr shows this. So they are both on Screen 0: Screen 0: minimum 320 x 200, current 2966 x 900, maximum 8192 x 8192 LVDS1 connected 1366x768+1600+75 (normal left inverted right x axis y axis) 344mm x 193mm VGA1 connected 1600x900+0+0 (normal left inverted right x axis y axis) 443mm x 249mm If you need any more info or can point me to a place you may have already found an answer please let me know.

    Read the article

  • Webmaster tools showing 404 for non existent folder pages

    - by Jody
    Google webmaster tools is reporting some/many 404 urls that don't exist on my site. The links are things such as domain.com/xyz/ However that doesn't exist, but domain.com/xyz/index.html does exist. The "linked from" pages all show proper links to the "/xyz/index.html". The page without index.html DOES 404, but why is google even trying these urls if they are not linked to? My real question, is there a way to have google stop attempting to load these pages, and ultimately remove these from the crawl errors report. Thanks.

    Read the article

  • Directing crawlers to content in language per language sub-domain

    - by Noam
    I have a site with multilingual website with many pages (40M). The site has UGC, and each translation is actually for the titles. Each sub-domain points to the same content with different titles per language. As far as I understand, each sub-domain should be indexed by search engines, meaning they will actually need to crawl 40M x supported-languages. So I thought it might be best to direct each subdomain crawler, to pages that are fully in that language (titles + UGC). Is there a way to do this? Should search engines understand this on their own?

    Read the article

  • Google doesn't index a subdomain. What can be the problem and what can be done?

    - by fudge
    Hi! I have a domain, let's call it example.com, which has a subdomain, games.example.com. I maintain a games forum using phpbbseo which is located at games.example.com/forum. The problem is that the forum is not being crawled. I used Google's webmaster tools and tested that the page is seen by google. P.S. There is a link from games.example.com to games.example.com/forum. What can I do? How can I make google crawl my forum?

    Read the article

  • Does SEO optimisation count on the responsive side of a site?

    - by Rick Donohoe
    I'm looking at making some SEO optimisation fixes, and at this point I'm sorting out the heading structure and keywords - H1's, H2's etc We have a site where there are a number of similar blocks, and one is always visible, and one is hidden depending on the screen size. This is our method of making a single site responsive. Firstly, how does this technique affect the SEO, and in general does the responsive side of a site matter at all to search engines? What I mean by this is if the site has different content depending on screen sizes, then which content would the search spider crawl?

    Read the article

  • Crawling for geotagged data

    - by abe3
    I have no experience with web crawlers -- but I know that Apache maintains an open source web crawler called "Lucene." How would I go about writing such a crawler to search the web for geo tagged data close to a particular location? What would a general road map look like? How do I pick which slice of the web to crawl? Do I use regular expressions to find things that look like longitudes and latitudes? What does a general sketch of that solution look like?

    Read the article

  • robots.txt, how effective is it and how long does it take?

    - by Stefan
    We recently updated the site to a single page site using jQuery to slide between "pages". So we now have only index.php. When you search the company on engines such as Google, you get the site and a listing of its sub pages which now lead to outdated pages. Our plan doesn't allow us to edit the .htaccess and the old pages are .html docs so I cannot use PHP redirects either. So if I put in place a robots.txt telling the engines to not crawl beyond index.php, how effective will this be in preventing/removing crawled sub pages. And rough guess, how long before the search engines would update?

    Read the article

  • Did Google Delete my index?

    - by Sei
    I have a website that I haven't had much time to take care of. I updated it 4 times since last October, the contents are all original and informative. I can see those cache from Way back Machine and the site was indexed in Yahoo, but not in google. Did I get my index on Google deleted because I did not update it often? Normally Google crawl often and index fast, I just got really worried if I did something wrong. Is it possible that I set something wrong in the hosting or something? Please help me

    Read the article

  • I've changed my URL schema. How do I tell Google to index the new schema and forget the old one?

    - by growse
    I had a site where the urls were constructed like this /index.php/Topic /index.php/AnotherTopic These were indexed in google, and search results returned that pointed to these. However, I've recently replatformed that site, and reconfigured it so the above urls would be: /index.php?title=Topic /index.php?title=AnotherTopic The original urls are returning 404s. The site is linking to the correct URL schema internally, but Google is retaining the original schema in its search results. I've updated and resubmitted the sitemap which only contains the new schema. Also, Google's webmasters tool is going slightly bananas at the fact there's now a spike in 404 errors in its crawl results. What would be the best approach to get Google to 'forget' about the old schema, and instead index the new schema? Should I try blocking /index.php/ in robots.txt? Should I be returning 301 codes instead of 404 for the original urls?

    Read the article

  • SEO: Joomla Category Page Optimization + Canonical Linking

    - by Huberis
    I'm wondering how best to optimize my Joomla site's SEO. I have pages with multiple articles on each page. Either via category-type pages, or via modules. In each case, I'm not wanting users to access the articles separately from the forward facing, menu-linked pages. I understand however that Joomla still generates a url for those articles, and Google can still crawl and display these articles separate from the pages. My question is what is the best way to control this so that my users get directed only to the front-facing pages? By using the canonical element for each article to point to the front-facing page it's on? Or is there a better method? Thanks for your help!

    Read the article

  • problem showing my website correctly in search engines

    - by dinbrca
    Hello guys, I have a website which i have indexed on google for example (like 15 days ago). some of my pages pass arguments like: http://www.bla.com/products.php?pro=bla&page=view suddently i saw that passing arguments like this isn't good for SEO purposes and started using htaccess rewrite. and changed the arguments to like this: http://www.bla.com/products/bla/*view*/ now my site on google still shows as i showed at link number 1 what should i do? i thought i should wait for the search engine to crawl my site again but nothing happened. thanks in advanced, Din

    Read the article

  • Data architecture for event log metrics?

    - by elliot42
    My service has a large ongoing number of user events, and we would like to do things like "count occurrence of event type T since date D." We are trying to make two basic decisions: What to store? Storing every event vs. only storing aggregates (Event log style) log every event and count them later, vs. (Time-series style) store a single aggregated "count of event E for date D" for every day Where to store the data In a relational database (particularly MySQL) In a non-relational (NoSQL) database In flat log files (collected centrally over the network via syslog-ng) What is standard practice / where can I read more about comparing the different types of systems? Additional details: The total event stream is large, potentially hundreds of thousands of entries per day But our current need is only to count certain types of events within it We don't necessarily need real-time access to the raw data or aggregation results IMHO, "log all events to files, crawl them at a later time to filter and aggregate the stream" is a pretty standard UNIX Way, but my Rails-y compatriots seem to think that nothing is real unless it's in MySQL.

    Read the article

< Previous Page | 97 98 99 100 101 102 103 104 105 106 107 108  | Next Page >