Search Results

Search found 287 results on 12 pages for 'crawling pasta hellion'.

Page 6/12 | < Previous Page | 2 3 4 5 6 7 8 9 10 11 12  | Next Page >

  • When Canonicalization is an Issue

    Although the word is extremely hard to pronounce, canonicalization is a hot topic right now. If many URLs lead to essentially the same page, the search engines have to work extra hard and spend far more time crawling all the different URLs. Oftentimes this means they'll miss the important pages of your website, because the time they spend crawling it is limited.

    Read the article
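
The canonicalization problem in the teaser above can be made concrete with a small sketch. It collapses several spellings of one address into a single form using the standard WHATWG `URL` class. The helper name `canonicalize` and the list of ignorable query parameters are assumptions made for illustration; a real site would choose its own rules.

```javascript
// Hypothetical list of parameters that do not change page content (an assumption).
const IGNORED_PARAMS = ["utm_source", "utm_medium", "utm_campaign", "sessionid"];

// Hypothetical helper: reduce URL variants of the same page to one canonical string.
function canonicalize(rawUrl) {
  // Parsing lowercases the scheme and host and drops a default port such as :80.
  const url = new URL(rawUrl);

  // Fragments are never sent to the server, so they cannot select a different page.
  url.hash = "";

  // Remove parameters assumed to be irrelevant to the content.
  for (const name of IGNORED_PARAMS) {
    url.searchParams.delete(name);
  }

  // Put the remaining parameters in a stable order.
  url.searchParams.sort();

  // Treat "/shoes/" and "/shoes" as the same page, but leave the root "/" alone.
  if (url.pathname.length > 1 && url.pathname.endsWith("/")) {
    url.pathname = url.pathname.slice(0, -1);
  }

  return url.toString();
}
```

Every variant that maps to the same string is a candidate for a single `rel="canonical"` target or a redirect, so a crawler spends its time on one URL instead of several.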

  • How Many Web Pages Should Be Indexed?

    Search engines crawl websites around the clock for unique web pages and content. Google has always been at the top in indexing the deep links of any website: it had indexed 26 million pages in 1998, and in the past 10 years it has indexed over 1 trillion pages. This gives a fair idea of how big this cyber world is.

    Read the article

  • How to Use SEO Services to Have a Successful Website

    Essentially, the web pages in a site need to be optimized because search engines are software programs that apply a specific algorithm when they crawl your website. Each website has numerous web pages, and it is practically difficult to crawl and index each and every one of them. No search engine can perform this function.

    Read the article

  • Force request to miss cache but still store the response

    - by Tom Marthenal
    I have a slow web app that I've placed Varnish in front of. All of the pages are static (they don't vary for a different user), but they need to be updated every 5 minutes so they contain recent data. I have a simple script (wget --mirror) that crawls the entire website every 15 minutes. Each crawl takes about 5 minutes. The point of the crawl is to update every page in the Varnish cache so that a user never has to wait for the page to generate (since all pages have been generated recently thanks to the spider). The timeline looks like this:

        00:00:00: Cache flushed
        00:00:00: Spider starts crawling to update cache with new pages
        00:05:00: Spider finishes crawling, all pages are updated until 1:15

    A request that comes in between 0:00:00 and 0:05:00 might hit a page that hasn't been updated yet, and will be forced to wait a few seconds for a response. This isn't acceptable. What I'd like to do is, perhaps using some VCL magic, always forward requests from the spider to the backend, but still store the response in the cache. This way, a user will never have to wait for a page to generate since there is no 5-minute window in which parts of the cache are empty (except perhaps at server startup). How can I do this?

    Read the article

  • Using jQuery to Dynamically Insert Into List Alphabetically

    - by Dex
    I have two ordered lists next to each other. When I take a node out of one list I want to insert it alphabetically into the other list. The catch is that I want to take just the one element out and place it back in the other list without refreshing the entire list. The strange thing is that when I insert into the list on the right, it works fine, but when I insert back into the list on the left, the order never comes out right. I have also tried reading everything into an array and sorting it there just in case the children() method isn't returning things in the order they are displayed, but I still get the same results. Here is my jQuery:

        function moveNode(node, to_list, order_by){
            rightful_index = 1;
            $(to_list)
                .children()
                .each(function(){
                    var ordering_field = (order_by == "A") ? "ingredient_display" : "local_counter";
                    var compA = $(node).attr(ordering_field).toUpperCase();
                    var compB = $(this).attr(ordering_field).toUpperCase();
                    var C = ((compA > compB) ? 1 : 0);
                    if( C == 1 ){
                        rightful_index++;
                    }
                });
            if(rightful_index > $(to_list).children().length){
                $(node).fadeOut("fast", function(){
                    $(to_list).append($(node));
                    $(node).fadeIn("fast");
                });
            }else{
                $(node).fadeOut("fast", function(){
                    $(to_list + " li:nth-child(" + rightful_index + ")").before($(node));
                    $(node).fadeIn("fast");
                });
            }
        }

    Here is what my html looks like:

        <ol>
          <li ingredient_display="Enriched Pasta" ingredient_id="101635" local_counter="1">
            <span class="rank">1</span>
            <span class="rounded-corners">
              <span class="plus_sign">&nbsp;&nbsp;+&nbsp;&nbsp;</span>
              <div class="ingredient">Enriched Pasta</div>
              <span class="minus_sign">&nbsp;&nbsp;-&nbsp;&nbsp;</span>
            </span>
          </li>
        </ol>

    Read the article
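
The ordering step in the question above can be separated from the DOM and checked on its own. The sketch below is not the asker's code and is not offered as the fix for their bug; it only shows one way to find where a new key belongs in a list that is already sorted, using a case-insensitive binary search. The names `insertionIndex` and `insertSorted` are hypothetical.

```javascript
// Return the position at which newKey should be inserted so that sortedKeys
// stays in ascending, case-insensitive order. Assumes sortedKeys is already sorted.
function insertionIndex(sortedKeys, newKey) {
  const target = newKey.toUpperCase();
  let lo = 0;
  let hi = sortedKeys.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (sortedKeys[mid].toUpperCase() < target) {
      lo = mid + 1; // newKey belongs somewhere after mid
    } else {
      hi = mid;     // newKey belongs at mid or before it
    }
  }
  return lo;
}

// Return a new array with newKey inserted in order; the input is not modified.
function insertSorted(sortedKeys, newKey) {
  const copy = sortedKeys.slice();
  copy.splice(insertionIndex(copy, newKey), 0, newKey);
  return copy;
}
```

With the index in hand, a single element can be placed before the child at that position (or appended when the index equals the list length) without rebuilding the list. Note that this compares strings: a numeric field such as `local_counter` would need to be converted to a number first, because as strings "10" sorts before "2".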

  • Metaprogramming ActiveRecord Rails

    - by Dimitar Vouldjeff
    Hi, I have the following code in my project's lib directory:

        module Pasta
          module ClassMethods
            def self.has_coordinates
              self.send :include, InstanceMethods
            end
          end

          module InstanceMethods
            def coordinates
              [longitude ||= 43.0, latitude ||= 25.0]
            end
          end

          ActiveRecord::Base.extend ClassMethods
        end

    And it should create a class method for ActiveRecord::Base - has_coordinates - which I can "assign" to models... But I receive the error undefined local variable or method 'has_coordinates'. Thanks in advance!

    Read the article

  • google search engine

    - by kourosh
    I am working on a Google box, something like this: http://mytwentyfive.com/blog/wp-content/uploads/byme/Google%20Search%20Appliances.jpg I am pointing the crawler to a folder that contains HTML files. Previously the crawler was crawling the files and indexing them, but right now it finds the pattern or the folder and does not follow any HTML files within the folder. I have tried everything I know and can't think of anything else. Can someone help? Thanks.

    Read the article

  • Innotop and Monit to kill thread using too much resources

    - by pocesar
    Instead of restarting the whole MySQL process, sometimes I just want to kill the offending thread instead of making everything go down. Usually the spike in CPU happens when a bot is crawling the first pages of pagination of my site (over 70.000 paginated results, 45 items per page). Is there a way I could do this automatically using monit and innotop? I couldn't find relevant information on Google, that's why I'm asking here. If these two tools aren't up to the task, which ones should I use?

    Read the article

  • Can I increase Windows 7 start menu vertical size to let more items fit in it?

    - by Ivan
    I hate putting shortcuts/files on the desktop as well as frequently crawling through the "All Programs" menu (and I only pin some essential every-day applications to the task bar). So, I put all the programs I occasionally use in the start menu itself (above the automatic recently used programs section). But even though I've switched it to use small icons, I run out of vertical space in it (only about 16 shortcuts fit there at most).

    Read the article

  • I have a collection of dead consumer grade routers, should I buy a real one?

    - by Ex Networking Guy
    Am I crazy for considering purchasing a Cisco 2621 for the house? I am familiar enough with IOS to set up a simple gateway router, I don't really need the experience. At this point, I'm a developer so my days of crawling through CO's and under desks are long past me. But I am really sick of crappy consumer grade networking gear. Maybe I have lousy luck and this stack of WRTG54s is just because I have lousy power, or whatever.

    Read the article

  • Fixing folder permissions [closed]

    - by Cezar Luiz
    Hello everyone, good afternoon. I made a mess here and one of my folders ended up like this:

        ls -lha
        total 20K
        ?--------- ? ? ? ? ? brsdinfra001

    where it should show something like:

        drwxrwxr-x 2 nobody nobody 4.0K Oct 4 09:45

    Does anyone know how to fix this?

    Read the article

  • Thousands of 404 errors in Google Webmaster Tools

    - by atticae
    Because of a former error in our ASP.Net application, created by my predecessor and undiscovered for a long time, thousands of wrong URLs were created dynamically. The normal user did not notice it, but Google followed these links and crawled its way through these incorrect URLs, creating more and more wrong links. To make it clearer, consider the URL example.com/folder, which should create the link example.com/folder/subfolder but was creating example.com/subfolder instead. Because of bad URL rewriting, this was accepted and by default showed the index page for any unknown URL, creating more and more links like this: example.com/subfolder/subfolder/.... The problem is resolved by now, but I have thousands of 404 errors listed in Google Webmaster Tools, which were discovered 1 or 2 years ago, and more keep coming up. Unfortunately the links do not follow a common pattern that I could deny for crawling in robots.txt. Is there anything I can do to stop Google from trying out those very old links and remove the already listed 404s from Webmaster Tools?

    Read the article

  • Can preventing directory listings in WordPress upload folders cause Google ranking drops when they cause 403 errors in Webmaster Tools?

    - by Kelly
    I recently moved to a new host that blocks crawling of my uploads folders but (hopefully) allows the files in those folders to be crawled. Webmaster Tools now shows many 403 errors, one for each folder in the uploads folder. For example, http://www.rewardcharts4kids.com/wp-content/uploads/2013/07/ shows a 403 error. I can access this file: http://www.rewardcharts4kids.com/wp-content/uploads/2013/07/lunch-box-notes.jpg but I cannot access the folder it is in. My rankings went down after I moved to this host and I am wondering:

        1. Could this be the reason?
        2. Is this how files/folders are supposed to be set up?

    Read the article

  • Where can I find an exhaustive list of meta tags and what they do?

    - by leeand00
    It seems to me that there are a ton of <meta> tags for all sorts of different purposes out there... Though they all follow a similar format of <meta name="" content="" />, they seem to serve a vast variety of purposes, from controlling the crawling of search engine bots and providing search engine bots with descriptions of pages to making sure a page displays correctly on a mobile device. These tags fall into so many different categories that I was wondering if anyone had a wiki or master list of possible meta tags and their content.

    Read the article

  • Preventing indexing duplicate content by search engines

    - by umesh awasthi
    I am in the process of migrating my old domain (www.oldurl.com) to a new domain (www.newurl.com). Almost all the content, the URL structure and the database are the same except for a few URLs; the only difference will be the domain name. I have made entries in Apache's .htaccess file to set up 301 redirects, and I have currently blocked all search engines from crawling my new domain through its robots.txt file. I am not sure how to handle the duplicate content issue when I make the new domain go live. Should I block search engines from indexing/crawling my old domain? I am new to this field and not sure whether this is actually a duplicate content issue or not.

    Read the article

  • Best way to prevent Google from indexing a directory [duplicate]

    - by Gkhan14
    I've researched many methods on how to prevent Google/other search engines from crawling a specific directory. The two most popular ones I've seen are:

        1. Adding it to the robots.txt file: Disallow: /directory/
        2. Adding a meta tag: <meta name="robots" content="noindex, nofollow">

    Which method would work the best? I want this directory to remain "invisible" to search engines so it does not affect any of my site's ranking. In other words, I want this directory to be neutral/invisible and "just there." I don't want it to affect any ranking. Which method would be the best to achieve this?

    Read the article

  • 410 Responses when your CMS host doesn't support them?

    - by leeand00
    Sending a 410 response for a page that no longer exists should make Google stop crawling for that page. The site I am working on has been recently migrated, and very little of the content was migrated. I've already turned the existing content into 301 redirects (the content that is on both the old and the new site), but now I would like to flush the old content from Google's memory by placing 410 responses in its path when it returns to crawl for them and finds a 404 response. However, I asked our CMS host about it, and they said that our CMS does not support 410 responses. Is there some other way to post a 410 response, like making a dead link 301 redirect to a page that carries a 410 response in the form of a meta tag?

    Read the article

  • Does a "nofollow" attribute on a link prevent URL discovery by search engines?

    - by Stephen Ostermiller
    I know that nofollow prevents link juice from being passed across a link. But if search engine robots discover a link with a nofollow on it, will they add that link to their crawl queue? In other words, if I create a link to a brand new page and put a rel=nofollow attribute on that link, will it prevent search engine bots (particularly Googlebot) from crawling the page? (Assuming that this link remains the only link into that page.) I've read conflicting reports about this over the years and I'm looking for authoritative references about the current state of affairs. Official statements from Google or published results of independent testing would be ideal.

    Read the article

  • Which token from a long User-Agent should I use in robots.txt?

    - by Gaia
    The definition of User-Agent states that several tokens can be included, as deemed necessary by the client. I want to block certain bots via robots.txt and I am confused as to which part of the User-Agent string to use, especially for more obscure bots. For example:

        Mozilla/5.0 (compatible; uMBot-LN/1.0; mailto: [email protected])"
        JS-Kit URL Resolver, http://js-kit.com/
        Mozilla/5.0 (compatible; SEOkicks-Robot +http://www.seokicks.de/robot.html

    Do I use the second token? Can tokens contain spaces, or did the SEOkicks folks forget a semicolon after SEOkicks-Robot? I don't actually intend on making my question specific to a couple bots - I want to know the guideline: which part of UA do I place in robots.txt for these exotic bots with UA as long as a haiku?

        User-agent: uMBot-LN/1.0
        Disallow: /

    PS: Thank you but I do not need to hear that undesirable bots are better blocked with mod_security. I already have commercial mod_sec rules in place.

    Read the article
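
For the robots.txt question above, one reference point is the Robots Exclusion Protocol as standardized in RFC 9309, whose grammar restricts a product token to letters, hyphens and underscores, and which asks crawlers to match it case-insensitively. The sketch below models only that grammar. The function names are hypothetical, and whether any particular bot follows the RFC is an assumption, so this illustrates the rule rather than predicting how a given crawler behaves.

```javascript
// RFC 9309 grammar: a product token is one or more of A-Z, a-z, "-" and "_".
const LEADING_TOKEN = /^[A-Za-z_-]+/;

// Hypothetical helper: keep only the leading part of a User-agent value that is
// a valid product token, dropping anything such as a "/1.0" version suffix.
function productToken(value) {
  const match = LEADING_TOKEN.exec(value.trim());
  return match ? match[0] : null;
}

// Hypothetical helper: decide whether a robots.txt "User-agent:" value applies
// to a crawler that identifies itself with crawlerToken.
function groupApplies(userAgentLineValue, crawlerToken) {
  const value = userAgentLineValue.trim();
  if (value === "*") {
    return true; // the wildcard group applies to every crawler
  }
  const token = productToken(value);
  return token !== null && token.toLowerCase() === crawlerToken.toLowerCase();
}
```

Under this reading, the version number is not part of the token, so a rule written for `uMBot-LN` is the form the grammar allows, and spaces cannot appear inside a token.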

  • Good Literature for "Object oriented programming in C"

    - by Dipan Mehta
    This is not a debate question about whether or not C is a good candidate for object-oriented programming. Quite often C is the primary platform where the development is happening. I have seen, and hopefully learnt by crawling through many open source and commercial projects, that while the language does nothing to stop you from writing "non-object" code, you can still think in the "object" way and write code that reasonably captures this design thinking. For those who have done this, the OO way is still the best way to write code even when you are programming in C. While I have learnt most of it the hard way, is there any in-depth literature that can help educate the relatively young guys about how to do OO programming in C?

    Read the article

  • I need to go from Linux to VS2012 fast. Anybody have a guide?

    - by Mikhail
    I need to parallelize a library through the use of a graphics accelerator. I have had no trouble doing similar work on Linux, but I am struggling with Visual Studio 2012. I can't figure out how to do the analogs of simple things, like specifying linkage, libraries, and include files. I need to move quickly from understanding the Linux build system to the Windows build system. Does anybody have a guide or some advice on moving from Linux to Visual Studio development? I feel like I am crawling through a labyrinth of menus, with frequent dead ends saying that a feature has moved to another place. Also, this code must build with VS2012.

    Read the article
