crawl - Page 8 - Developer IT

Open a 500 MB, 4 million record CSV file

- by Giorgi

Hello, I have a CSV file with about 4 million rows, roughly 500 MB in size. Can you recommend an editor that can open the file without making the system crawl? I tried EmEditor, but it complains that there are too many characters in a single line. Thanks.
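
If the goal is to inspect the data rather than edit it in place, one option (not something the post itself mentions) is to stream the file instead of loading it whole. A minimal Python sketch, assuming the file is ordinary comma-separated text; the function names and the file name `big.csv` are hypothetical:

```python
import csv
from itertools import islice

def preview_rows(path, limit=5, encoding="utf-8"):
    """Return the first `limit` rows without reading the rest of the file."""
    with open(path, newline="", encoding=encoding) as handle:
        return list(islice(csv.reader(handle), limit))

def count_rows(path, encoding="utf-8"):
    """Count rows one at a time, so memory use does not grow with file size."""
    with open(path, newline="", encoding=encoding) as handle:
        return sum(1 for _ in csv.reader(handle))

# Example usage (hypothetical file name):
# for row in preview_rows("big.csv"):
#     print(row)
```

Python's `csv` module rejects fields longer than its configured limit; `csv.field_size_limit` can raise that limit if the file really does contain very long fields.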

How to fix “SearchAdministration.aspx webpage cannot be found. 404”

- by ybbest

Problem: One of my colleagues ran into a weird issue today with the Search Service Application in SharePoint 2010. After he created the Search Service Application, he could not browse to the Search Administration page (http://ybbest:5555/searchadministration.aspx?appid=6508b5cc-e19a-4bdc-89b3-05d984999e3c); he got a 404 "page not found" error every time he browsed to the page.

Analysis: After some basic troubleshooting, it turned out that we could browse to any other page in the search application, e.g. Manage Content Sources (/_admin/search/listcontentsources.aspx) or Manage Crawl Rules (/_admin/search/managecrawlrules.aspx). After some more research, we concluded that some of the web parts on the Search Administration page might be causing the problem.

Solution: You need to activate a hidden feature using:

    # Enable-SPFeature SearchAdminWebParts -url <central admin URL>
    Enable-SPFeature SearchAdminWebParts -url http://ybbest:5555

If the feature is already enabled, you need to disable it first and then enable it again:

    Disable-SPFeature SearchAdminWebParts -url http://ybbest:5555
    Enable-SPFeature SearchAdminWebParts -url http://ybbest:5555

References: MSDN Forum

How does Google handle site traffic in Google Analytics?

- by Hamidreza

I have a site at www.exam.com and I have put the Google Analytics JavaScript snippet on it. I have also made an app for my site, and every time a user uses the app, I want them to visit the site through the browser built into the application (I am using C# for the application and the .NET web browser). The app will load www.example.com/appvisit, and that page contains nothing but the Google Analytics scripts. I also want to disallow /appvisit in my robots.txt file. Is there any problem with doing this? Will Google crawl the /appvisit directory? Does Google disapprove of this practice? And will Google treat this traffic as genuine and normal? Thanks.

Inside NASA’s Shuttle Trainer

- by Jason Fitzpatrick

After more than 30 years of service, NASA has retired their full-scale shuttle training simulator. Take a photo tour and learn where you can visit the trainer and crawl around inside for a more hands-on experience. The trainer is currently on display at the Charles Simonyi Space Gallery at the Museum of Flight in Seattle, Washington. For those of us unable to visit the trainer in person, Wired Magazine has a full photo tour at the link below. Get Inside the Replica that Trained Every Shuttle Astronaut [Wired]

Modelling photo-realistic grass in realtime

- by sebf

Hello, I see a number of tutorials on how to create good-looking grass for 3D renders, but I can't think how to model it for realtime use in a game's scenery. Sure, simple models with alpha cutouts can be used to create plants and trees in really awesome scenery, but what about a lawn? Are there any good tricks to achieve this effect? I tried a simple 4-sided box with a small texture, and the number of objects needed for a decent appearance made Max crawl to a halt. (I am thinking it may be possible with a shader, but that is a whole other area, so I thought I would just ask about anyone's experience with modelling it here.) Thanks!

https & ajax crawling

- by Christoph Gassauer

We moved our website https://www.1point618.com to SSL, and we now load nearly all of the content via AJAX. As a result, all URLs of existing pages have changed. We used 301 redirects as recommended, and we have also implemented Google's specification for keeping the site crawlable. We expected that it might take about a month to regain the same ranking in Google's search results, but the results are still much worse than before these changes. Most of the content (artist profiles) isn't indexed anymore. For example, of the submitted sitemap, only 3 of around 450 URLs are indexed. Before, almost all URLs were indexed. My question now is: does Google's AJAX crawling work together with SSL? (Judging by the access log file, it looks like it does.)

Weird .ASP pages from my non-ASP site generating 404s

- by Amanda

In Google Webmaster Tools, I have a huge list of "Not Found" crawl errors (404) with URLs that look like this: http://www.exclusivevillas.co.za/villa_view.asp?vSeq=82&activitySeq=3&page=3, seemingly originating from URLs very similar to that (e.g. http://www.exclusivevillas.co.za/villa_view.asp?vSeq=82&activitySeq=3&page=4). The thing is, the site is WordPress and has been for almost a year now. It was plain HTML before that. I don't know where these ASP requests are coming from. Furthermore, the dates on which these supposed ASP pages requested the other ASP pages, resulting in 404s, are very recent. What's going on?

What do you think of the following job specification?

- by m.edmondson

Just received this out of the blue from a recruiter. A number of things stand out to me:

PERSON PROFILE

- Hard working, with a stay-until-the-job-is-done mentality
- Thrive on the pressure of tight weekly development deadlines
- Good attention to detail to ensure bug-free development
- Ability to test all development work from the user's perspective
- Ability to think like a user as well as a developer
- Good communication skills to understand new functionality and bugs
- Flexibility to contribute outside main responsibilities when needed

BENEFITS

- Salary dependent on skills
- Contributory pension with 4% contribution from employer (after 1 year of service)
- Private healthcare (after 1 year of service)
- 20 days holiday + 3-4 days holiday between Christmas and New Year
- 1 day extra holiday available each quarter you don't have a day off sick (and an additional day if you are not off sick for the whole year)

Would you want to work here? From what I can see, they want a workaholic who will crawl out of his deathbed in order not to lose holiday entitlement.

.XML Sitemaps and HTML Sitemaps Clarification

- by MSchumacher

I've got a website with about 170 pages and I want to create an effective sitemap for it, as it is long overdue. The website is internally linked very well, but I still want to take advantage of a sitemap to let search engines crawl my site more easily and, hopefully, to increase my website's PR. I am slightly confused about what I must do, though. Is it necessary to create both an .xml sitemap AND an HTML sitemap? I've never worked with .xml, so where do I put this file once it's created? In the root folder? I assume that sitemap.xml is ONLY meant to be read by spiders and NOT by website visitors, i.e. no visitor on my website is going to visit the page sitemap.xml. Am I correct? Is that why I should also create an HTML sitemap (sitemap.htm)?

Google Webmaster Tools, DNS Errors & HostPapa

- by Gravy

I received a message from Google Webmaster Tools: "Over the last 24 hours, Googlebot encountered 2 errors while attempting to retrieve DNS information for your site. The overall error rate for DNS queries for your site is 40.0%. You can see more details about these errors in Webmaster Tools." I contacted HostPapa and they deny that there is any issue with the site or server! Support in terms of what I can do to actually resolve this issue is non-existent! The site is currently online. I don't know much about DNS, so any advice about what I can do to resolve this problem would be much appreciated. Basically, the message from Google says that it is my web host's fault, and the message from my web host (HostPapa) is: "Just tell google to crawl your site as there are no errors."

Crawling an ajax based page with both a hash fragment and a meta tag

- by Christofian

According to Google's documentation on crawling AJAX-based web pages, if a URL contains a hash fragment (something at the end of the URL that looks like #helloworld), and if there is an ! after the #, as in #!helloworld, Google will then request the URL url?_escaped_fragment_=helloworld. I currently have an AJAX-based web page that I want Google to be able to crawl. Sometimes the page uses hash fragments, and for those situations I set up the server so that it returns an HTML snapshot for that page using _escaped_fragment_. However, the page is often loaded without a hash fragment, and when that happens it still loads content using AJAX. I couldn't find a good solution to enable AJAX crawling for pages that sometimes have a hash fragment and sometimes don't. How can I tell Google to use _escaped_fragment_ when there is a hash fragment, and to use something else to get an HTML snapshot of the page when there isn't one?
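
The URL mapping described above can be sketched in Python. As far as I understand the same AJAX crawling scheme, a page without a hash fragment could opt in with `<meta name="fragment" content="!">`, in which case the crawler requested the URL with an empty `_escaped_fragment_=` parameter. The function name `snapshot_url` and the `has_fragment_meta` flag are hypothetical, and the percent-encoding here is stricter than the scheme may have required:

```python
from urllib.parse import quote, urlsplit, urlunsplit

def snapshot_url(url, has_fragment_meta=False):
    """Return the URL a crawler following the scheme would request, or None."""
    parts = urlsplit(url)
    if parts.fragment.startswith("!"):
        # '#!helloworld' -> '_escaped_fragment_=helloworld'
        value = quote(parts.fragment[1:], safe="")
    elif not parts.fragment and has_fragment_meta:
        # No fragment, but the page carries the fragment meta tag (assumed flag).
        value = ""
    else:
        # Plain '#fragment', or no opt-in at all: no mapping applies.
        return None
    extra = "_escaped_fragment_=" + value
    query = f"{parts.query}&{extra}" if parts.query else extra
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))

print(snapshot_url("http://example.com/page#!helloworld"))
print(snapshot_url("http://example.com/page", has_fragment_meta=True))
```

On the server side, both cases then arrive as ordinary requests carrying an `_escaped_fragment_` query parameter, so one snapshot handler can serve them.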

How to solve "Login Only" rejection?

- by Renan

Recently, a site of mine was rejected due to "Login Only": "Login Only: During our review of your website, we found that the majority of pages on your site are behind a login, or there is restricted access. Please note that we will not approve applications for login-protected pages, as we are not able to review their content for acceptance into the program." Although the site does require a login to submit content, it doesn't require one to view any page. How do I tell Googlebot, or whatever is used to crawl pages for AdSense, that all the content is publicly available and registration is only needed to post?

Does iframe affect SEO of its parent page?

- by Xu Jiawan

I would like to know whether an iframe affects the SEO of its parent page (the page that contains the iframe). I've done some searching and found questions such as "Do we still need to avoid using frame/iframe for good SEO?" and "Using iFrame: SEO and Accessibility Points", which tell me that: The content in an iframe is not considered part of the parent page. The page within an iframe may be spidered and indexed (or it may not), but definitely no PR is passed. But these points are about the content in the iframe; what about the parent page? Will the PageRank of the parent page decrease because of the iframe? Or might Googlebot not crawl the parent page? Or is the parent page not affected at all?

Preventing indexing duplicate content by search engines

- by umesh awasthi

I am in the process of migrating my old domain (www.oldurl.com) to a new domain (www.newurl.com). Almost all the content, the URL structure and the database are the same except for a few URLs; the only difference will be the domain name. I have added entries to Apache's .htaccess file to set up 301 redirects, and I have currently blocked all search engines from crawling my new domain via its robots.txt file. I am not sure how to handle the duplicate content issue when I make the new domain go live. Should I block search engines from indexing/crawling my old domain? I am new to this field and not sure whether this is actually a duplicate content issue or not.

Robots.txt and "Bad" Robots

- by Lynda

I understand robots.txt and its purpose. I have read some people saying that using a robots.txt gives "bad" robots, or robots that do not obey a robots.txt, a way to access pages on your site that you do not want accessed. While I am not looking to get into a debate about that, I do have a question. If I have a structure like this:

    /Folder/
        /Sub-Folder 1/
        /Sub-Folder 2/

(Note: There are no pages within /Folder/, only other folders.)

If I Disallow: /Folder/, it will prevent "good" robots from accessing the directory and any contents within the sub-folders. While we know that bad robots will see the /Folder/ entry, will they be able to see and access the sub-folders and the pages within them if they are not listed in the robots.txt?

(Note: I do not fully understand how robots, good or bad, crawl a site beyond using a robots.txt and links within the site.)
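
The "good robot" side of this can be checked with Python's standard `urllib.robotparser`. This is only a sketch: the host `www.example.com`, the bot name `ExampleBot` and the page paths are hypothetical. It shows that a compliant parser treats `Disallow: /Folder/` as a path prefix, so the sub-folders are covered without being listed (and therefore without being named in the file):

```python
from urllib.robotparser import RobotFileParser

# The rule from the question, applied to every user agent.
ROBOTS_TXT = """\
User-agent: *
Disallow: /Folder/
"""

def allowed(path, agent="ExampleBot"):
    """Ask a compliant parser whether `agent` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, "http://www.example.com" + path)

print(allowed("/Folder/Sub-Folder-1/page.html"))
print(allowed("/Other/page.html"))
```

A robot that ignores robots.txt never performs this check at all, so the file neither blocks it nor tells it anything beyond the paths that are written in it.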

410 Responses when your CMS host doesn't support them?

- by leeand00

Sending a 410 response for a page that no longer exists should make Google stop crawling that page. The site I am working on was recently migrated, and very little of the content was migrated with it. I've already turned the existing content (the content that is on both the old and the new site) into 301 redirects, but now I would like to flush the old content from Google's memory by placing 410 responses in its path when it returns to crawl those pages; at the moment it finds a 404 response. However, I asked our CMS host about it, and they said that our CMS does not support 410 responses. Is there some other way to send a 410 response, like making a dead link 301 redirect to a page that has a 410 response in the form of a meta tag?
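
An HTTP status code is part of the response's status line, so a meta tag in the page body cannot turn a response into a 410. If some layer in front of the CMS can run code (an assumption; the post does not say whether one exists), it could answer the removed paths itself. A minimal WSGI sketch in Python, with hypothetical paths:

```python
# Hypothetical list of URLs that were dropped in the migration.
GONE_PATHS = {"/old-page.html", "/old-section/item.html"}

def app(environ, start_response):
    """WSGI app: answer removed paths with 410 Gone, everything else with 404."""
    path = environ.get("PATH_INFO", "/")
    headers = [("Content-Type", "text/plain; charset=utf-8")]
    if path in GONE_PATHS:
        start_response("410 Gone", headers)
        return [b"This page has been permanently removed.\n"]
    start_response("404 Not Found", headers)
    return [b"Not found.\n"]

# To try it locally with the standard library:
#   from wsgiref.simple_server import make_server
#   make_server("localhost", 8000, app).serve_forever()
```

In a real setup the fallback branch would pass the request on to the CMS rather than return a 404; that part depends on the hosting and is left out here.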

Should I post my PDF library for SEO? [closed]

- by Iunknown

Possible Duplicate: Do search engines crawl PDFs and if so are there any rules to follow when making them

When a sales call comes in, the caller often says something like: 'I searched for 3 days before finding your product and it's exactly what I need!' That's telling me that I need some SEO work. We redid our website and streamlined it, which removed many of our 'How-To' documents. Since those PDF documents contain words that people might search for, I was wondering if I could add a 'Complete library' link to the bottom of a page that will load up the entire PDF library. Would that help my ranking?

How to remove duplicate content, which is still indexed, but not linked to anymore?

- by David

A bug in the tool that we use to create search-engine-friendly URLs changed our whole URL structure overnight, and we only noticed after Google had already indexed the pages. Now we have a massive duplicate content issue, causing a harsh drop in rankings. Webmaster Tools shows over 1,000 duplicate title tags, so I don't think Google understands what is going on.

Right URL: abc.com/price/sharp-ah-l13-12000-btu.html
Wrong URL: abc.com/item/sharp-l-series-ahl13-12000-btu.html (created by mistake)

After that, we ...

- Changed all URLs back to the "Right URLs"
- Set up a 301 redirect for all "Wrong URLs" a few days later

Now, a massive number of pages is still in the index twice. As we do not link internally to the "Wrong URLs" anymore, I am not sure whether Google will re-crawl them very soon. What can we do to solve this issue and tell Google that all the "Wrong URLs" now redirect to the "Right URLs"?

Best, David

Creating Google sitemap.xml , is it okay for the images to be wrapped in url tags?

- by AzizAG

I'm using a tool to generate the sitemap.xml file for me. It crawled my website and found the pages and all the images, but when I reviewed the exported XML (to make sure nothing was wrong), I noticed that the images on my website are wrapped in url tags (I think they should be in image tags). See this: <url><loc>http://mywebsite.com/images/12.jpg</loc><lastmod>2012-05-23T13:39:02+00:00</lastmod><changefreq>weekly</changefreq><priority>0.50</priority></url> Shouldn't it be wrapped in an image tag (just like videos are wrapped in a video tag)? Thanks.
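
For comparison, Google's image sitemap extension nests images inside the `<url>` entry of the page that displays them, using `<image:image>` and `<image:loc>` elements. A short Python sketch of that layout, assuming a hypothetical page `http://mywebsite.com/gallery.html` that shows the image from the question; `build_sitemap` is an illustrative helper, not part of any sitemap tool:

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"

def build_sitemap(pages):
    """Build sitemap XML; `pages` maps a page URL to the image URLs on that page."""
    ET.register_namespace("", SITEMAP_NS)       # plain <url>, <loc> elements
    ET.register_namespace("image", IMAGE_NS)    # <image:image>, <image:loc>
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for image_url in image_urls:
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="unicode")

print(build_sitemap({
    "http://mywebsite.com/gallery.html": ["http://mywebsite.com/images/12.jpg"],
}))
```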

Does a "nofollow" attribute on a link prevent URL discovery by search engines?

- by Stephen Ostermiller

I know that nofollow prevents link juice from being passed across a link. But if search engine robots discover a link with a nofollow on it, will they add that link to their crawl queue? In other words, if I create a link to a brand new page and put a rel=nofollow attribute on that link, will it prevent search engine bots (particularly Googlebot) from crawling the page? (Assuming that this link remains the only link into that page.) I've read conflicting reports about this over the years and I'm looking for authoritative references about the current state of affairs. Official statements from Google or published results of independent testing would be ideal.

Problem with homepage's SEO when using subfolders in a multi language website

- by Antonio

After reading hundreds of threads about multi-language websites I haven't found an answer to my specific problem, so I think it's not a common issue and I must have done something terribly wrong ;-)

We have a brand.com website with DE as the main language and the following subfolders:

- /de/ = canonical of / + redirect to /
- /it/
- /en/

When I crawl google.com for EN keywords or google.it for IT keywords, I get the homepage in German (both title and description) as the top result, with no trace of the /it/ or the /en/ homepage. Is this because /it/ and /en/ each need a separate link building strategy?

I've already configured Google Webmaster Tools in the following way:

- brand.com, no language preference
- brand.com/de/, de language
- brand.com/it/, it language
- brand.com/en/, en language

Perhaps having "/" as the DE main page is wrong and I should use a different approach, e.g. making "/" a 301 to /de/ instead? Thanks in advance.

Google is not indexing my entire site despite having a sitemap

- by Anusha

I have an e-commerce website, www.beyondtime.in. I have been constantly monitoring Googlebot's crawling of my website and my webmaster account. Lately, I have found two issues that I have not been able to understand.

1.) Googlebot has only been crawling www.beyondtime.in/telecom.php, even though that URL is not valid. What needs to be done to let Google crawl the other pages of the website as well?

2.) The second question is about the Google Webmaster account, where I've submitted my sitemap with 227 URLs. Of those, only 156 have been indexed. None of the images on my website have been indexed by Google.

webmaster tools - Network Unreachable

- by Jayapal Chandran

Hi, Webmaster Tools for my site reports that robots.txt is unreachable, and for all links in the sitemap it says "network unreachable". It also reports sitemap.xml as unreachable. These errors appear on the crawl stats page. I discussed this with my hosting provider's support team and they said:

"Hi, I have verified the Apache logs and cannot see any issues with your website/web server. Possible issues: There may be a routing issue from Google's servers to our server. When the number of hits from Google's bots gets high, the IP is automatically blacklisted by our firewall to avoid server load and downtime. As we do not have access to their services, we are not able to give details of their logs etc."

The sitemaps link shows an exclamation mark, which means the file was not reachable. What could be the problem and how do I solve it?

Sample size and statistical significance in Google Analytics

- by colmcq

I have been asked to compile a report on dropout rates during checkout for a global webstore. I have used one month of data as my sample because:

- Google Analytics slows to a crawl over larger sample sizes and makes much of the analysis agonisingly slow
- I believe it to be statistically significant and a representative sample

My client has asked me why I didn't use yearly figures and wants proof that one month of data is 'statistically significant'. Am I right in thinking that I need to compare the standard deviation of my monthly sample to that of the yearly sample and ensure that the deviation is under a certain percentage?

Question: how do I prove that one month of Google Analytics data is representative of one year's worth of data?

Stats: 90k unique views/month, ~1.1m per year.
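
One common way to put a number on the precision of a rate measured from one month of data is a confidence interval for a proportion. The Python sketch below uses the normal approximation; the 60% dropout rate is a made-up figure for illustration, and using the 90k monthly unique views as the number of checkout sessions is also an assumption, since the post does not give that count:

```python
import math

def margin_of_error(rate, n, z=1.96):
    """Half-width of the normal-approximation confidence interval for a proportion.

    z=1.96 corresponds to a 95% confidence level.
    """
    return z * math.sqrt(rate * (1.0 - rate) / n)

monthly_n = 90_000        # unique views per month, from the post
assumed_dropout = 0.60    # hypothetical rate, for illustration only

moe = margin_of_error(assumed_dropout, monthly_n)
print(f"dropout rate: {assumed_dropout:.0%} +/- {moe:.2%}")
```

A narrow interval only says the monthly rate is measured precisely. Whether that month is representative of the year is a separate question (seasonality, promotions), which would need a comparison of the same metric across several months.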

Search Results

Search found 446 results on 18 pages for 'crawl'.
