Search Results

Search found 7625 results on 305 pages for 'scraper sites'.


  • How do sites look up addresses from UK postcodes?

    - by ctford
    Sites in the U.K. that require addresses often ask the user to provide a postcode. The site then offers the user a choice among the addresses that match that postcode. Where do these sites get the data to do this? Are there web services that match postcodes to addresses? Do sites buy a database of addresses that they then query locally?
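
    For context, a minimal sketch of what such a lookup looks like from the client side. The endpoint and response shape are invented for illustration (real UK address data generally comes from licensed copies of Royal Mail's Postcode Address File, either queried through a third-party web service or bought and hosted locally):

      // Hypothetical example: the URL and JSON shape are stand-ins, not a real API.
      using System;
      using System.Net;

      class PostcodeLookupDemo
      {
          static void Main()
          {
              string postcode = "SW1A 1AA";
              string url = "https://postcode-lookup.example.co.uk/addresses?postcode="
                           + Uri.EscapeDataString(postcode);

              using (var client = new WebClient())
              {
                  // Typically returns a JSON array of candidate addresses
                  // for the postcode, which the site shows as a pick list.
                  string json = client.DownloadString(url);
                  Console.WriteLine(json);
              }
          }
      }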

    Read the article

  • I want to consolidate two sites into a third. Will my search engine rankings be penalized if I rewrite and redirect pages one by one?

    - by Patrick Kenny
    I have two Drupal sites with different content, let's call them Apple and Orange. I recently developed a much more sophisticated third Drupal site, let's call it Tree. For a large number of reasons, the content on Apple and Orange is useful to the users of Tree, so I want to move the content to Tree. However, much of the content is out of date. (This whole process took about five years.) To update the content, I will rewrite it one article at a time myself. Now here's my question: if I move the articles one by one (as I rewrite them) and then redirect the old articles (using a 301 redirect) on Apple/Orange to the new site on Tree, will this have a huge negative effect on my search engine rankings? Is there a good way to redirect among sites when they merge like this, or would I be better off keeping the old articles on Apple/Orange and simply linking them to the new, rewritten articles on Tree?
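
    For what it's worth, a per-article 301 on Apache looks like the sketch below (assuming the old sites run on Apache with mod_alias; the path and domain are hypothetical). One line would be added to Apple/Orange as each article is rewritten:

      # In the old site's .htaccess or vhost config, one line per migrated article:
      Redirect permanent /node/123 http://tree.example.com/rewritten-article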

    Read the article

  • What's with all the mailing archives?

    - by Yuval
    When I google certain questions or problems I run into, I sometimes land on 'archive' sites. These sites contain forum questions or information from other sites, extremely poorly formatted, and are copies of the well-formatted original posts from various forums. An example of one of those sites is mail-archive.com, and there are various similar sites. Can anybody explain to me why those sites exist, and how come they don't get banned from Google (since all they have is copied content that is really poorly formatted by bots)? Thanks!

    Read the article

  • What sites/publications are good for staying current on security and malware trends?

    - by Holocryptic
    In my ever-expanding quest for knowledge, I'm at the point where I feel I need to be more up to date with current security trends, as well as the malware that is out in the wild. I'd like to be able to say, "I've heard of that, and the fix is..." rather than, "Oh, yeah, I had that eat up half my network before I contained it...." What sites and publications are good for keeping up with these things?

    Read the article

  • Is the guideline "don't open email attachments, execute downloads, or run plug-ins (Flash, Java) from untrusted sites" enough to avert infection?

    - by therobyouknow
    I'd like to know if the following is enough to avert malware, as I feel that the press and other advisory resources aren't always precise about all the methods by which PCs get infected. To my mind, the key step to getting infected is a conscious choice by the user to run an executable attachment from an email or download, or to view content that requires a plug-in (Flash, Java or something else). This conscious step breaks down into the following possibilities:

    - Don't open email attachments: I certainly agree with this. But let's be clear: email comes in two parts, the text and the attachment. Just reading the email should not be risky, right? But opening (i.e. running) email attachments IS risky (malware can be present in the attachment).

    - Don't execute downloads (e.g. from sites linked in suspect emails or otherwise): again, I certainly agree (malware can be present in the executable). Usually the user has to voluntarily click to download, or at least click to run the executable. Question: has there ever been a case where a user has visited a site and a download has completed and run entirely on its own?

    - Don't run content requiring plug-ins: certainly agree; malware can be present in the executable. I vaguely recall cases involving Flash, but know the Java-based vulnerabilities much better.

    Now, is the above enough? Note that I'm much more cautious than this. What concerns me is that the media is not always very clear about how the malware infection occurs. They talk of "booby-trapped sites" and "browser attacks" - HOW exactly? I'd presume the other threat would be malevolent use of JavaScript to make an executable run on the user's machine. Would I be right, and are there details I can read up on about this? (Generally I like JavaScript as a developer, please note.) An accepted answer would fill in any holes I've missed here, so we have a complete general view of what the threats are (the specific details of new threats vary, but the general vectors are known).

    Read the article

  • What is the BCSI-CS-**** cookie for?

    - by Joanne Wellings
    I'm undertaking an audit of the cookies we use on our external sites. There's one cookie that's used by all the sites, and by different domains within the sites. Its name starts with BCSI-CS- followed by random numbers and letters, and it's the same cookie on different PCs on our network. Our own sites use it, and Bing Maps, Google Analytics and Google Maps on our sites use it. This cookie does not seem to appear on PCs outside our network. We've figured that it's a cookie our proxy server uses, and therefore only an internal cookie, not one that our external site users will encounter. However, googling that cookie shows that a lot of sites have listed a similar cookie, with the same BCSI-CS prefix, on their "About our cookies" page. Would we be right in thinking that these sites have got it wrong, and that they don't have to list this cookie? After all, when I visit those sites, the cookie they have listed does not appear on my PC. Can anyone confirm this, or explain what the BCSI-CS cookie actually is?

    Read the article

  • How to extend an existing Ruby on Rails CMS to host multiple sites?

    - by Andrew
    I am trying to build a CMS I can use to host multiple sites. I know I'm going to end up reinventing the wheel a million times with this project, so I'm thinking about extending an existing open-source Ruby on Rails CMS to meet my needs. One of those needs is to be able to run multiple sites from a single codebase. That way, when there's an update I want to make, I can update it in one place and the change is reflected on all of the sites. I think this will scale by running multiple instances of the application. I think I can use the domain/subdomain to determine which data to display: for example, someone goes to subdomain1.mysite.com and the application looks in the database for the content for subdomain1 (see the sketch below). The problem I see is that most pre-built CMS solutions, including the one I want to use, are designed to host one site, so the database is structured to work with one site. However, I had the idea that I could overcome this by "creating a new database" for each site, then specifying which database to connect to based on the domain/subdomain as I mentioned above. I'm thinking of hosting this on Heroku, so I'm wondering what my options might be. I'm not very familiar with Amazon S3 or Amazon SimpleDB, but I feel like there's some sort of "cloud database" that would make this solution a lot more realistic than creating a new MySQL database for each site. What do you think? Am I thinking about this the wrong way? What advice do you have to offer in this area?
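
    The domain-to-database idea described above is essentially a lookup keyed on the request's Host header. A minimal sketch, in C# to match the code elsewhere on this page (the asker's stack is Rails, where the same logic would hang off the request host; all names here are hypothetical):

      // Hypothetical sketch: map a request's subdomain to a per-site database.
      using System;
      using System.Collections.Generic;

      static class TenantResolver
      {
          // One connection string per hosted site; in practice this table
          // could itself live in a shared "master" database.
          static readonly Dictionary<string, string> Tenants =
              new Dictionary<string, string>
              {
                  { "subdomain1", "Server=db;Database=site_subdomain1" },
                  { "subdomain2", "Server=db;Database=site_subdomain2" },
              };

          public static string ConnectionFor(string host)
          {
              // "subdomain1.mysite.com" -> "subdomain1"
              string sub = host.Split('.')[0];
              string conn;
              if (!Tenants.TryGetValue(sub, out conn))
                  throw new ArgumentException("Unknown site: " + host);
              return conn;
          }
      }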

    Read the article

  • How to search multiple sites using the Lucene search engine API?

    - by Wael Salman
    Hope that someone can help me as soon as possible :-) I would like to know how we can search multiple sites using Lucene (all sites are in one index). I have succeeded in searching one website, and in indexing multiple sites; however, I am not able to search all websites. Consider this method that I have:

      private void PerformSearch()
      {
          DateTime start = DateTime.Now;

          //Create the Searcher object
          string strIndexDir = Server.MapPath("index") + @"\" + mstrURL;
          IndexSearcher objSearcher = new IndexSearcher(strIndexDir);

          //Parse the query, "text" is the default field to search
          Query objQuery = QueryParser.Parse(mstrQuery, "text", new StandardAnalyzer());

          //Create the result DataTable
          mobjDTResults.Columns.Add("title", typeof(string));
          mobjDTResults.Columns.Add("path", typeof(string));
          mobjDTResults.Columns.Add("score", typeof(string));
          mobjDTResults.Columns.Add("sample", typeof(string));
          mobjDTResults.Columns.Add("explain", typeof(string));

          //Perform search and get hit count
          Hits objHits = objSearcher.Search(objQuery);
          mintTotal = objHits.Length();

          //Create Highlighter
          QueryHighlightExtractor highlighter = new QueryHighlightExtractor(objQuery, new StandardAnalyzer(), "<B>", "</B>");

          //Initialize "Start At" variable
          mintStartAt = GetStartAt();

          //How many items should we show?
          int intResultsCt = GetSmallerOf(mintTotal, mintMaxResults + mintStartAt);

          //Loop through results and display
          for (int intCt = mintStartAt; intCt < intResultsCt; intCt++)
          {
              //Get the document from the results index
              Document doc = objHits.Doc(intCt);

              //Get the document's ID and set the cache location
              string strID = doc.Get("id");
              string strLocation = "";
              if (mstrURL.Substring(0,3) == "www")
                  strLocation = Server.MapPath("cache") + @"\" + mstrURL + @"\" + strID + ".htm";
              else
                  strLocation = doc.Get("path") + doc.Get("filename");

              //Load the HTML page from cache
              string strPlainText;
              using (StreamReader sr = new StreamReader(strLocation, System.Text.Encoding.Default))
              {
                  strPlainText = ParseHTML(sr.ReadToEnd());
              }

              //Add result to results datagrid
              DataRow row = mobjDTResults.NewRow();
              if (mstrURL.Substring(0,3) == "www")
                  row["title"] = doc.Get("title");
              else
                  row["title"] = doc.Get("filename");
              row["path"] = doc.Get("path");
              row["score"] = String.Format("{0:f}", (objHits.Score(intCt) * 100)) + "%";
              row["sample"] = highlighter.GetBestFragments(strPlainText, 200, 2, "...");
              Explanation objExplain = objSearcher.Explain(objQuery, intCt);
              row["explain"] = objExplain.ToHtml();
              mobjDTResults.Rows.Add(row);
          }
          objSearcher.Close();

          //Finalize results information
          mTsDuration = DateTime.Now - start;
          mintFromItem = mintStartAt + 1;
          mintToItem = GetSmallerOf(mintStartAt + mintMaxResults, mintTotal);
      }

    As you can see, I use the site URL 'mstrURL' when I create the searcher:

      string strIndexDir = Server.MapPath("index") + @"\" + mstrURL;

    How can I do the same when I want to search multiple sites? (I am using the code from http://www.keylimetie.com/blog/2005/8/4/lucenenet/.)
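
    One common approach (a sketch, not from the original post) is to open one IndexSearcher per site's index and combine them with Lucene's MultiSearcher, which merges hits and scores across all of the underlying indexes. This assumes the same Lucene.NET version as the code above and reuses its variables; the site list is a hypothetical stand-in:

      //Open one searcher per site's index folder
      string[] siteUrls = { "www.site1.com", "www.site2.com" };
      Searchable[] searchables = new Searchable[siteUrls.Length];
      for (int i = 0; i < siteUrls.Length; i++)
      {
          string dir = Server.MapPath("index") + @"\" + siteUrls[i];
          searchables[i] = new IndexSearcher(dir);
      }

      //Search all indexes at once; the Hits object can be iterated
      //exactly as in the single-index PerformSearch above
      MultiSearcher objMulti = new MultiSearcher(searchables);
      Hits objHits = objMulti.Search(QueryParser.Parse(mstrQuery, "text", new StandardAnalyzer()));
      //... build the result DataTable from objHits as before ...
      objMulti.Close();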

    Read the article

  • Web browser being selective about the sites that it will visit.

    - by Andrew Doran
    I've been trying to help my father-in-law with this problem but haven't been able to get anywhere. Since the weekend, the web browsers on his computer (Chrome and Internet Explorer on Windows XP) will only let him visit certain sites: for example, he is able to conduct his online banking, but he cannot visit www.bbc.co.uk, www.amazon.co.uk or www.ancestry.com. There is another computer in the house that goes via the same router, and it can connect to all of these sites, which suggests the problem is with his machine. I tried running a tracert to www.bbc.co.uk and it got through, but the web browser hangs with a message that it is waiting for a response. I tried the WinSockFix tool in case a recent registry change was to blame, but that didn't work either. He can't think of anything he recently did on his machine that could have caused the problem. Can anyone help?

    Read the article

  • What are the best code-less website design sites available?

    - by Ken Pespisa
    I'm looking for an alternative to www.squarespace.com. Squarespace is great, but leans slightly toward the social networking industry. I'm looking for a simple way to build Web sites for small businesses. The criteria are the following:

    - Requires no programming knowledge
    - No software to download: all design and maintenance can be done via the browser
    - Has a good selection of templates and layouts

    Essentially I'm looking for the features of Typepad.com or WordPress.com, but helping to build a more traditional Web site instead of centering on a blog.

    Read the article

  • Why doesn't SuperGenPass work on some sites when I use Chrome?

    - by Lunatik
    The SuperGenPass bookmarklet sometimes fails to pop up when I click the bookmark in Chrome. It does, however, work on the same page in Firefox; an example is http://www.engadget.com/login. This behaviour is also replicated on a new Chrome tab (understandably, as there is no domain), but some sites just fail to launch it, meaning you have to go to another site, open it there, enter [something] to get the 'Regenerate password' link, enter the domain manually, then finally enter your master password to get the generated password! Something about the makeup of the page seems to make SuperGenPass think it isn't able, or isn't required, to pop up. The FAQ doesn't mention this, and a quick Google doesn't turn up anything that looks relevant. Does anyone else have the same issue? How can it be fixed? I'm on Windows using the current release of Chrome (5.x at the moment, but probably 18.x by the time you read this next week, based on Google's seemingly logarithmic release numbering).

    Read the article

  • wildcard host name bindings for multiple subdomains in multiple sites on IIS7 with a single IP address

    - by orca
    Situation: I have a single Windows 2008 server with a single public IP address, and multiple domains with wildcard A records pointing to that single IP address.

    - I need each domain to be hosted by a different web site (i.e. www.domain1.com by the site domain1site).
    - I need domain1.com to act like www.domain1.com.
    - I need each site to be able to have multiple subdomains (i.e. www.domain1.com, abc.domain1.com, xyz.domain1.com).
    - Not relevant yet, but for context: I plan to handle each subdomain with a different application hosted in the same site (i.e. application /xyz in domain1site).

    However, I found out that IIS7 does not support creating web sites with wildcard host name bindings, and binding a site without any subdomain (i.e. domain1.com) does not work, even for www.domain1.com. Is there a simple solution? Does any IIS extension, like Application Request Routing, provide such a capability?
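
    IIS7 indeed has no true wildcard host-name binding, so a common workaround (an assumption on my part, not from the question) is to add one explicit binding per host name the site must answer to, for example with appcmd. The host names below come from the question; the site name is hypothetical:

      %windir%\system32\inetsrv\appcmd set site /site.name:"domain1site" /+bindings.[protocol='http',bindingInformation='*:80:domain1.com']
      %windir%\system32\inetsrv\appcmd set site /site.name:"domain1site" /+bindings.[protocol='http',bindingInformation='*:80:www.domain1.com']
      %windir%\system32\inetsrv\appcmd set site /site.name:"domain1site" /+bindings.[protocol='http',bindingInformation='*:80:xyz.domain1.com']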

    Read the article

  • How to use mod_proxy to send Apache's index to the Tomcat ROOT application while still being able to browse my other Apache sites

    - by Dagvadorj
    I am trying to make my Tomcat application (deployed at ROOT) viewable through Apache on port 80. To do this I used mod_proxy, since mod_jk made me try harder. I used something like this in httpd.conf:

      <Location />
          Order deny,allow
          Allow from all
          ProxyPass http://localhost:8080/
          ProxyPassReverse http://localhost:8080/
      </Location>

      <Proxy *>
          Order deny,allow
          Allow from all
      </Proxy>

    And now I cannot retrieve my previous sites on Apache, which were running prior to this configuration. How can I have both running?
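
    One way to keep both running (a sketch, assuming name-based virtual hosts fit the setup; host names are hypothetical) is to scope the proxy to its own VirtualHost instead of a global <Location>, so the other sites' VirtualHosts are left untouched:

      NameVirtualHost *:80

      <VirtualHost *:80>
          ServerName tomcat.example.com
          ProxyPass / http://localhost:8080/
          ProxyPassReverse / http://localhost:8080/
      </VirtualHost>

      <VirtualHost *:80>
          ServerName other.example.com
          DocumentRoot /var/www/html/other
      </VirtualHost>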

    Read the article

  • How can I make SharePoint use a short URL (e.g. http://internal.com instead of http://internal.com/sites/osfc/Pages/Default.aspx)?

    - by StevenB
    Hi all, I'm new to SharePoint 2007. Currently the home page is htp://internal.com/sites/osfc/Pages/Default.aspx, but I would like to use htp://internal.com, or have htp://internal.com redirect to the long URL. How can I do this? I thought of using a 301 redirect, but the permissions on the site in IIS don't allow users to view files placed in the root, and I don't want to mess with the permissions. Currently if I visit http://internal.com I see a SharePoint Access Denied page (htp://internal.com/_layouts/AccessDenied.aspx?Source=%2f). Note: I've used htp:// above as serverfault doesn't allow more than 1 http:// link. Many thanks, Steven

    Read the article

  • Apache VirtualHost, multiple sites: one SSL with redirect and one regular HTTP

    - by pedalpete
    I've got a server with one site which I am redirecting to https via:

      <VirtualHost *:80>
          DocumentRoot /var/www/html/secure
          ServerName secure.com
          Redirect / https://secure.com
      </VirtualHost>

    That works, no problem. Now I'm trying to add another, non-secure site:

      <VirtualHost *:80>
          DocumentRoot /var/www/html/notsecure
          ServerName notsecure.com
      </VirtualHost>

    Of course, because the redirect is on '/', all sites are getting redirected. I've tried changing the Redirect to the full document root, but no luck.
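
    A guess at the likely cause (not stated in the question): without NameVirtualHost, Apache treats the first <VirtualHost> as a catch-all, so every host name hits the Redirect. Enabling name-based virtual hosting keeps the redirect scoped to secure.com:

      NameVirtualHost *:80

      <VirtualHost *:80>
          ServerName secure.com
          DocumentRoot /var/www/html/secure
          Redirect / https://secure.com/
      </VirtualHost>

      <VirtualHost *:80>
          ServerName notsecure.com
          DocumentRoot /var/www/html/notsecure
      </VirtualHost>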

    Read the article
