Search Results

Search found 4 results on 1 page for 'cuil'.

  • Google could separate results for homonyms into tabs by buying patents from its defunct competitor Cuil

    Google's search engine could separate results for homonyms into tabs now that it has bought patents from its defunct competitor Cuil. As the USPTO database attests, Google has acquired the patents of one of its rivals: Cuil, a search engine once dubbed a "potential Google killer". The engine launched on July 28, 2008 amid high hopes, fueled in part by the pedigree of its founders: former Googlers Anna Patterson and Russell Power, Tom Costello (formerly of IBM), and AltaVista founder Louis Monier.

    Read the article

  • Apache logging issues

    - by Dan
    I'm trying to parse Apache log files, but I'm finding some strange results and I'm not sure what they mean. Hopefully someone can provide some insight. (All of the IP addresses were altered; none actually start with 192. I didn't figure the search engine hostnames mattered, though.)

    In the first example, multiple IP addresses are showing up in the host field:

        192.249.71.25 - - [04/Aug/2009:04:21:44 -0500] "GET /publications/example.pdf HTTP/1.1" 200 2738
        192.0.100.93, 192.20.31.86 - - [04/Aug/2009:04:21:22 -0500] "GET /docs/another.pdf HTTP/1.0" 206 371469

    What causes this? Does it have to do with proxy servers? Is there a way to have Apache log only one?

    In the second example, a bunch of information is just completely missing! What would cause that?

        msnbot-65-55-207-50.search.msn.com - - [29/Dec/2009:15:45:16 -0600] "GET /publications/example.pdf HTTP/1.1" 200 3470073 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)" 266 3476792
        - - - - "-" - - "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; InfoPath.1)" 285 594
        - - - - "-" - - "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; InfoPath.1)" 285 4195
        - - - - "-" - - "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; InfoPath.1)" 299 109218
        crawl-17c.cuil.com - - [29/Dec/2009:15:45:46 -0600] "GET /publications/another.pdf HTTP/1.0" 200 101481 "-" "Mozilla/5.0 (Twiceler-0.9 http://www.cuil.com/twiceler/robot.html)" 253 101704

    My CustomLog configuration says:

        LogFormat "%h %l %u %t \"%r\" %s %b \"%{Referer}i\" \"%{User-agent}i\" %I %O" common
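    (A note on the first example: a comma-separated host field is the typical shape of an X-Forwarded-For chain inserted by a proxy or load balancer, where the leftmost address is usually the original client. Below is a minimal Python sketch for parsing log lines while tolerating the extra addresses; the regex and the client_ip helper are illustrative assumptions, not anything Apache provides.)

        import re

        # Common Log Format fields, with the host field allowed to be a
        # comma-separated list (as an X-Forwarded-For style chain would produce).
        LOG_LINE = re.compile(
            r'(?P<host>\S+(?:, \S+)*) (?P<ident>\S+) (?P<user>\S+) '
            r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
            r'(?P<status>\d{3}) (?P<size>\S+)'
        )

        def client_ip(host_field):
            """Return the leftmost entry of a possibly comma-separated host field."""
            return host_field.split(",")[0].strip()

        line = ('192.0.100.93, 192.20.31.86 - - [04/Aug/2009:04:21:22 -0500] '
                '"GET /docs/another.pdf HTTP/1.0" 206 371469')
        match = LOG_LINE.match(line)
        if match:
            print(client_ip(match.group("host")))  # -> 192.0.100.93

    (Whether Apache itself can be told to log only one address depends on where the extra ones are injected, so this sketch only covers the parsing side.)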

    Read the article

  • robots.txt file with more restrictive rules for certain user agents

    - by Carson63000
    Hi, I'm a bit vague on the precise syntax of robots.txt, but what I'm trying to achieve is:

    - Tell all user agents not to crawl certain pages
    - Tell certain user agents not to crawl anything

    (Basically, some pages with enormous amounts of data should never be crawled, and some voracious but useless search engines, e.g. Cuil, should never crawl anything.)

    If I do something like this:

        User-agent: *
        Disallow: /path/page1.aspx
        Disallow: /path/page2.aspx
        Disallow: /path/page3.aspx

        User-agent: twiceler
        Disallow: /

    ...will it flow through as expected, with all user agents matching the first rule and skipping page1, page2, and page3, and twiceler matching the second rule and skipping everything?
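    (A quick way to sanity-check the intended behavior is Python's standard urllib.robotparser; the rules below mirror the file above, and the user-agent strings are just examples:)

        from urllib.robotparser import RobotFileParser

        rules = [
            "User-agent: *",
            "Disallow: /path/page1.aspx",
            "Disallow: /path/page2.aspx",
            "Disallow: /path/page3.aspx",
            "",
            "User-agent: twiceler",
            "Disallow: /",
        ]

        rp = RobotFileParser()
        rp.parse(rules)

        # Crawlers with no group of their own fall through to the * group,
        # so only the three listed pages are blocked for them.
        print(rp.can_fetch("Googlebot", "/path/page1.aspx"))  # False
        print(rp.can_fetch("Googlebot", "/index.aspx"))       # True

        # twiceler matches its own, more specific group, so the * group no
        # longer applies to it at all and everything is blocked.
        print(rp.can_fetch("twiceler", "/index.aspx"))        # False

    (One caveat: a crawler honors only the most specific group that matches it, so if twiceler also needed the three page rules plus extras, they would have to be repeated under its own User-agent line. Here that doesn't matter, since Disallow: / already covers everything.)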

    Read the article
