Search Results

Search found 42868 results on 1715 pages for 'web crawlers'.

Page 40/1715 | < Previous Page | 36 37 38 39 40 41 42 43 44 45 46 47 | Next Page >

Junior software developer - How to understand web applications in depth?

- by nat_gr

I am currently a junior developer in web applications and specifically in ASP.NET MVC technology. My problem is that the C# senior developer in the company has no experience with this technology and I try to learn without any guidance. I went through all tutorials (e.g music store), codeplex projects and also read Pro ASP.NET MVC 4. However, most of the examples are about CRUD and e-commerce applications. What I don't understand is how dependency injection fits in web applications (I have realized that is not only used for facilitating unit testing) or when I should use a custom model binder or how to model the business logic when there is already a database schema in place. I read the forum quite often and it would very helpful if some experienced developer could give me an insight about how to proceed. Do I need to read some books to understand the overall idea behind web applications? And what kind of application should I start building myself - I don't think it would be useful to create similar examples with the tutorials.

Read the article
Why deny access to website for msnbot/bingbot?

- by Quandary

I've seen quite a lot of tutorials that recommend you to ban user agents containing the strings libwww-perl and msnbot. I understand why one would ban libwww-perl, it's mainly if not only used for hacking and spamming. But why are there so many sites recommending to ban msnbot/bingbot? Since it's a search engine, even if only with a marginal market share, I would except one would want this bot to crawl one's sites. What is it that msnbot does that makes people ban it?

Read the article
What bots are really worth letting onto a site?

- by blunders

Having written a number of bots, and seen the massive amounts of random bots that happen to crawl a site, I am wondering if the goal of the site allowing bots is for the potential for the bot to send real traffic back to the site if there is any reason to allow bots that are not known to be sending real traffic back, and how to spot these "good" bots; based on how they ID themselves, IPs they come from, behaviors, etc.

Read the article
java - call web service operation - wrong return type

- by user1639680

i have a simple web service with one method that returns List<org.company.data.mp> i've created a simple web service client and specified a web service with wsdl. in netbeans i try to call a web service operation: right click, insert code, ... and i pick my web service operation. the code gets inserted but the method's return type is not List<org.company.data.mp> but it is List<org.company.server.mp>! i don't get it.. in the package "server" there is no class called mp! i check the implementation class of my web service - it says the return type is ...data.mp not ...server.mp

Read the article
Do I need to know servlets and JSP to learn spring or hibernate or any other java web frameworks?

- by KyelJmD

I've been asking a lot of people where to start learning java web development, I already know core java (Threading,Generics,Collections, a little experience with (JDBC)) but I do not know JSPs and servlets. I did my fair share of development with several web based applications using PHP for server-side and HTML,CSS,Javascript,HTML5 for client side. Most people that I asked told me to jump right ahead to Hibernate while some told me that I do not need to learn servlets and jsps and I should immediately study the Spring framework. Is this true? do I not need to learn servlets and JSPs to learn hibernate or Spring? All of their answers confused me and now I am completely lost what to learn or study. I feel that if I skipped learning JSP and servlets I would missed a lot of important concepts that will surely help me in the future. So the question, do I need to have foundation/know servlets and JSP to learn spring or hibernate or any other java web frameworks.?

Read the article
GWT: reporting crawling errors for non existing links

- by pixeline

Google Webmaster Tools is reporting crawl errors for links that never existed, and if i check the "Linked from" tab for a given error link, it shows another that never existed. They all mention joomla/ which is not the cms used on this domain (it's wordpress fyi). Exampled: http://example.com/joomla/index.php/component/user/register Linked from: http://example.com/joomla/component/user/login?return=L2###### What is going on? UPDATE 1 I tried something: I provided one of the faulty urls to the "Fetch as Google" functionality. Instead of returning a 404, it returns a 301 to another Joomla page. HTTP/1.1 301 Moved Permanently Server: Apache/2.4.3 X-Powered-By: PHP/5.4.4-10 X-Pingback: http://example.com/xmlrpc.php Expires: Wed, 11 Jan 1984 05:00:00 GMT Cache-Control: no-cache, must-revalidate, max-age=0 Pragma: no-cache Set-Cookie: PHPSESSID=1fgr5v2oip39miibuptd51s8h0; path=/ Set-Cookie: woocommerce_items_in_cart=0; expires=Sat, 12-Jan-2013 11:44:01 GMT; path=/ Location: http://example.com/joomla/component/user/register Content-Type: text/html; charset=iso-8859-1 Content-Length: 387 Date: Sat, 12 Jan 2013 12:44:01 GMT Via: 1.1 varnish Connection: keep-alive Accept-Ranges: bytes Age: 0 <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> <html><head> <title>301 Moved Permanently</title> </head><body> <h1>Moved Permanently</h1> <p>The document has moved <a href="http://example.com/joomla/component/user/register">here</a>.</p> <p>Additionally, a 301 Moved Permanently error was encountered while trying to use an ErrorDocument to handle the request.</p> </body></html>

Read the article
Managing web.config for teams in VS2010 & TFS

- by Jarrett

With VS2010's mandate that web.config be included in the project, how do we allow everyone to keep their own custom config file without getting into source control problems? Previously, we would simply leave web.config out of our project, allowing everyone to keep their own local version of web.config on their machine. We moved to VS2010, and it is now forcing me to add web.config to my project in order to run debug mode. Because our project is linked to TFS, it automatically adds web.config to source control and tries to maintain it that way. Is there a way to run in debug mode without including web.config in your project? Or is there a better way to manage config files?

Read the article
How to test the render speed of my solution in a web browser?

- by Cuartico

Ok, I need to test the speed of my solution in a web browser, but I have some problems, there are 2 versions of the web solution, the original one that is on server A and the "fixed" version that is on server B. I have VS2010 Ultimate, so I can make a web and load test on solution B, but I can't load the A solution on my IDE. I was trying to use fiddle2 and jmeter, but they only gave me the times of the request and response of the browsers with the server, I also want the time it takes to the browser to render the whole page. Maybe I'm misusing some of this tools... I don't know if this could be usefull but: Solution A is on VB 6.0 Solution B is on VB.Net Thanks in advance!

Read the article
Is it possible to use AWS as a web host?

- by Matrym

Is it possible to load / host an entire website using AWS? Or is it only a service that can load specific pieces of a website - such as images, etc. Obviously, I'd want to use my own domain. If you can use it, are there any limitations? Here's the AWS link, for context: http://aws.amazon.com/s3/

Read the article
Unpacking Argument Lists and Instantiating WTForms objects from web.py

- by Morris Cornell-Morgan

After a bit of searching, I've found that it's possible to instantiate a WTForms object in web.py using the following code: form = my_form(**web.input()) web.input() returns a "dictionary-like" web.storage object, but without the double asterisks WTForms will raise an exception: TypeError: formdata should be a multidict-type wrapper that supports the 'getlist' method From the Python documentation I understand that the two asterisks are used to unpack a dictionary of named arguments. That said, I'm still a bit confused about exactly what is going on. What makes the web.storage object returned by web.input() "dictionary-like" enough that it can be unpacked by ** but not "dictionary-like" enough that it can be passed as-is to the WTForms constructor? I know that this is an extremely basic question, but any advice to help a novice programmer would be greatly appreciated!

Read the article
how to get IP adress of google search/crawler bot to add to our white list of ip address

- by Jayapal Chandran

Hi, Google webmaster tools says network unreachable. When i contacted my hosting provider they said that they have installed firewall which could block frequent incoming ip addresses and they dont know the google's ip adress to unblock. so they requested me to find google search/crawler bot's ip adress so that they can add it to their whitelist. How to find the ip address of google search bot or crawler bot? My site stopped appearing in google search. My hits had gone too low. What should i do? any kind of reply would he helpful.

Read the article
Junior software developer - How to understand web aplications in depth?

- by nat_gr

I am currently a junior developer in web applications and specifically in asp.net mvc technology. My problem is that the c# senior developer in the company has no experience with this technology and I try to learn without any guidance. I went through all tutorials (e.g music store), codeplex projects and also read pro asp.net mvc 4. However, most of the examples are about crud and e-commerce applications. What I don't understand is how dependency injection fits in web applications (I have realized that is not only used for facilitating unit testing) or when i should use a custom model binder or how to model the business logic when there is already a database schema in place. I read the forum quite often and it would very helpful if some experienced developers could give me an insight about how to proceed. Do I need to read some books to understand the overall idea behind web applications? And what kind of application should I start building myself - I don't think it would be useful to create similar examples with the tutorials.

Read the article
Does a "nofollow" attribute on a link prevent URL discovery by search engines?

- by Stephen Ostermiller

I know that nofollow prevents link juice from being passed across a link. But if search engine robots discover a link with a nofollow on it, will they add that link to their crawl queue? In other words, if I create a link to a brand new page and put a rel=nofollow attribute on that link, will it prevent search engine bots (particularly Googlebot) from crawling the page. (Assuming that this link remains the only link into that page.) I've read conflicting reports about this over the years and I'm looking for authoritative references about the current state of affairs. Official statements from Google or published results of independent testing would be ideal.

Read the article
What's the best way to move to linux from windows for web development ?

- by rajesh pillai

I am primarily a programmer developing on windows based OS using c# as my primary language. I am evaluating Ubuntu Linux as an alternate platform and would like to know the best stack for doing web development on this. I had gone through the following thread Moving development from Windows to Linux but it doesn't answer my questions fully. Some of the points I am interested are outlined below PHP/Ruby/Python (What would you recommend?) Is Mono mature enough for any large scale development? Has anyone any real experience using Mono. IDE (including debugging support, intellisense, source control integration,Unit testing) Unit testing framework based on the language recommended Web framework if any. Load Testing tools Web server (I know there are many webservers, but would like to know which one is primarily used by most people) Your inputs is greatly appreciated. Thanks.

Read the article
How do I remove a LOT of indexed pages from Google?

- by Thierry

A few weeks ago we have figured out that Google has indexed some information we would rather keep in some confidentiality, in the format of individual PDF files. Our assumption was that this was a problem with our robots.txt we had overlooked. Even though we are not sure whether or not this is the case, we are certain that the robots.txt file is in a valid format and is, according to Google's webmaster tools, blocking the files. However, even after this adjustment that has been made weeks ago, Google still has the PDF files indexed, but does tell us further information cannot be provided due to the robots.txt file being present. As you can hopefully understand, this is unwanted behaviour due to the nature of the documents. I am aware that there is a request page being provided by Google for this purpose, but there are a lot of files. Is there an easier way to get Google to remove all of the files from its search engine? If not, is there anything else you could advise us to do besides manually requesting Google to remove every single page? Thanks in advance.

Read the article
Which web crawler to use to save news articles from a website into .txt files?

- by brokencoding

Hi, i am currently in dire need of news articles to test a LSI implementation (it's in a foreign language, so there isnt the usual packs of files ready to use). So i need a crawler that given a starting url, let's say http://news.bbc.co.uk/ follows all the contained links and saves their content into .txt files, if we could specify the format to be UTF8 i would be in heaven. I have 0 expertise in this area, so i beg you for some sugestions in which crawler to use for this task.

Read the article
Grapeshot crawler ignoring robots.txt

- by QF_Developer

Has anyone come across a crawler called Grapeshot? They are hammering the same page repeatedly on our website. I believe they are looking for ad related keywords, based on previous content ad campaigns. The odd thing is we never ran any such campaigns on the page they are so interested in. We do have only a few pages running AdSense, is this what has attracted Grapeshot? I've added the following declaration to my robots.txt, but they don't seem to be honouring it? User-agent: grapeshot Disallow: / Any ideas on how to block this nuisance crawler? I'm starting to think the best way is by setting up IP rules in IIS?

Read the article
Page load speeds effect on crawl rate

- by Sam Pegler

We've noticed a big drop in the total pages crawled per day on our site, we have no control over the crawl rate in google webmaster tools so it's possible this has been changed by google. However it's a fairly large site and I wouldn't of thought that the crawl rate would've been decreased. What we have noticed though is a sizeable increase in page load times, in my mind this would be the cause. Can anyone else confirm if the crawl rate is directly correlated to page load time? Seems logical, longer page load time, less pages crawled. Any decent documentation on this would be appreciated, I don't normally have any input on SEO so this is new to me.

Read the article
Webmaster Tools word count

- by Henrik Erlandsson

Is there a way to somehow verify that the googlebot finds the headings and the content, for example by word count? I'm asking this because I tried a program called Screaming Frog, which fails to even fetch the first h1 on a validated page - for about 1/3 of all the pages(!) - and got insecure. Even though the site looks hunky dory in Webmaster Tools, I'd like to know what a googlebot-like content crawler finds on my page and in what order. Any tips on such tools is appreciated. This is not about keyword count.

Read the article
What skills does a web developer need to have/learn?

- by Victor

I've been I've asked around, and here's what I gathered so far in no particular order: Knowledge Web server management (IIS, Apache, etc.) Shell scripting Security (E.g. ethical hacking knowledge?) Regular Expression HTML and CSS HTTP Web programming language (PHP, Ruby, etc.) SQL (command based, not GUI, since most server environment uses terminal only) Javascript and library (jQuery) Versioning (SVN, Git) Unit and functional test Tools Build tools (Ant, NAnt, Maven) Debugging tools (Firebug, Fiddler) Mastering the above makes you a good web developer. Any comments?

Read the article
Are these jobs for developer or designers or for client itself? for a web-site projects

- by jitendra

Spell checking grammar checking Descriptive alt text for big chart , graph images, technical images To write Table summary and caption Descriptive Link text Color Contrast checking Deciding in content what should be H2 ,H3, H4... and what should be <strong> or <span class="boldtext"> Meta Description and keywords for each pages Image compression To decide Filenames for images,PDf etc To decide Page's <title> for each page

Read the article
Will Google crawl session based website

- by DonShwep

I have a website, it is split into 3 categories but using PHP its an all-in one kind of style. When a user chooses a category on the home page a session is set, this is then used to set the style and contents of the website. Would Googlebot and other bots be able to still scan my website? If a page is accessed and no session is set then the user is sent back to the home page. I have created special links, that set a session but go straight to the contact page. Even this page doesn't seem to be showing up. Any ideas if a sitemap with specially crafted links (to set the session) will help Google?

Read the article
Web development for people who mainly do client side..

- by kamziro

Okay, I'm sure there are a lot of us that has plenty of experience developing c++/opengl/objective C on the iPhone, java development on android, python games, etc (any client side stuff) while having little to no experience on web-based development. So what skillset should one learn in order to be able to work on web projects, say, to make a facebook clone (I kid), or maybe a startup that specializes on connecting random fashionistas with pics etc. I actualy do have some experience with C#/VB.net back-end development a while back, but as part of a team, I had a lot of support from the senior devs. Is C# considered a decent web development language?

Read the article
Fake links cause crawl error in Google Webmaster Tools

- by Itai

Google reported Crawl Errors last week on my largest site though Webmaster Tools. Here is the message: Google detected a significant increase in the number of URLs that return a 404 (Page Not Found) error. Investigating these errors and fixing them where appropriate ensures that Google can successfully crawl your site's pages. The Crawl Errors list is now full of hundreds of fake links like these causing 16,519 errors so far: Note that my site does not even have a search.html and is not related to any of the terms shown in the above image. Inspecting sources for one of those links, I can see this is not simply an isolated source but a concerted effort: Each of the links has a few to a dozen sources all from different, seemingly unrelated sites. It is completely baffling as to why would someone to spending effort doing this. What are they hoping to achieve? Is this an attack? Most importantly: Does this have a negative effect on my side? Could it negatively impact my ranking? If so, what to do about it? The few linking pages I looked at are full of thousands of links to tons of sites and have no contact information and do not seem like the kind of people who would simply stop if asked nicely! According to Google Webmaster Tools, these errors have appeared in a span of 11 days. No crawl errors were being reported previously.

Read the article
Is there a weight for html link?

- by Questions

I would like to know if there is any weight associated with a html link (its backlink) when google does crawling/indexing. Will 1,2 & 3 be ever considered as a backlink by Google? 1. <a href="xyz.com">1</a> // one character link 2. <a href="xyz.com"> </a> //blank space link 3. <a href="xyz.com">!</a> //special character link 4. <a href="xyz.com">keyword</a> //meaningful word link I hope my question is understandable and guess this is right forum. I don't know to put it in other words. Thanks in advance.

Read the article

< Previous Page | 36 37 38 39 40 41 42 43 44 45 46 47 | Next Page >