Search Results

Search found 7625 results on 305 pages for 'scraper sites'.

Page 75/305 | < Previous Page | 71 72 73 74 75 76 77 78 79 80 81 82  | Next Page >

  • One site being on a subdirectory of another. Does Google count this against you?

    - by Mick
    I have created two similar websites (relating to monetary systems). So far, one appears to be loved by Google and the other hated, and I'm struggling to work out why. This is a mystery to me because both sites were created by me with the same design philosophy, both in pure HTML. Both are packed to the rafters with references to, and information about, their respective subjects. One issue I'm worried may be the cause is the location of the sites. I got a web hosting package from hostmonster.com for the successful one, but the less liked one is just an "add-on" which sits in a subdirectory of the successful one. I wonder if Google somehow detects this and treats it as a less significant website? EDIT: Just to clarify, even though one site is an add-on that sits in a subdirectory of the other, the URL is arranged to look like it is a root. I.e. the unpopular site can be accessed directly with a simple www.myunpopularsite.com name, without specifying any subdirectory.

  • How can I fix the #c3284d# malvertising hack on my website?

    - by crm
    For the past couple of weeks, at semi-regular intervals, this website has had the #c3284d# malware code inserted into some of its .php files. The .htaccess file also had its equivalent code inserted. On many occasions I have removed the malicious code, replaced files, changed the FTP password in my FTP client (CoreFTP), and changed the connection method to FTPS for more secure storage of the password (instead of plain text). I have also scanned my computer several times using AVG and Windows Defender, which found no malware that might have been storing my FTP passwords. I used Sucuri SiteCheck to check my website, and it says the site is clean of malware. That is bizarre, because I clicked one of the links on the site a minute ago and it took me to another one of these random stats.php sites, even though it appears I have gotten rid of the #c3284d# code again (it will no doubt be re-inserted somehow in an hour or so). Has anyone found an actual viable solution for this malware hack? I have done just about all of the things suggested here and here and the problem still persists. Currently, when I click a link in the site's navigation menu in Google Chrome, I get Google's malware warning page: Warning: Something's Not Right Here! oxsanasiberians.com contains malware. Your computer might catch a virus if you visit this site. Google has found that malicious software may be installed onto your computer if you proceed. If you've visited this site in the past or you trust this site, it's possible that it has just recently been compromised by a hacker. You should not proceed. Why not try again tomorrow or go somewhere else? We have already notified oxsanasiberians.com that we found malware on the site. For more about the problems found on oxsanasiberians.com, visit the Google Safe Browsing diagnostic page. I'm wondering if it is possible that the Google Chrome browser I am using has itself been hacked? Does anyone else get redirected when clicking links on the website?

  • Installing a new ASP.NET 4.0 site on a Windows 2008 server.

    - by TATWORTH
    I have been specifically requested to blog about getting an ASP.NET 4.0 site working on a Windows 2008 server that has never run a 4.0 web site before.
    Make sure the 4.0 framework is installed on the server! Make sure ALL the security patches have been applied. (For a live server, make sure that you tested the patches on your development server first.) You will find the HTTP log status codes at http://support.microsoft.com/kb/943891 - they are very important in understanding the IIS logs.
    After installing, turn on 4.0 by doing the following:
      * Start Internet Information Services (IIS) Manager.
      * Select the server node in the Connections pane. (This is the node above Application Pools, FTP Sites and Server Farms.)
      * Double-click the ISAPI and CGI Restrictions item in the centre pane.
      * You should see 1 or 2 ASP.NET v4.0.30319 entries; select Enable in the Actions pane for all of them.
    ASP.NET 4.0 should now run!
    Remember, after creating your new ASP.NET 4.0 site, to select the Sites node and find out its Id. By default, the IIS logs are at C:\inetpub\logs\LogFiles, and if your site Id is, say, 21, then the logs will be created in the W3SVC21 sub-directory. The key point about using these logs is that in the event of an error when trying to start the site for the first time, the log will contain the status code and the sub-code. By having the full code and sub-code, setup issues can be resolved in minutes instead of hours.
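
The log location rule described above (default root plus a W3SVC sub-directory named after the site Id) can be written down as a one-line helper. This is only a sketch in Python for illustration; the function name `iis_log_dir` is hypothetical, and the default root is the one quoted in the post.

```python
from pathlib import PureWindowsPath

def iis_log_dir(site_id, root=r"C:\inetpub\logs\LogFiles"):
    """Build the per-site log directory: <root>\\W3SVC<site id>."""
    # PureWindowsPath only manipulates the path text, so this also runs
    # on a non-Windows machine.
    return PureWindowsPath(root) / f"W3SVC{site_id}"

print(iis_log_dir(21))
```

If the log root has been moved from its default, pass the new location as `root`.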

  • MySQL Connector/Net 6.6 GA has been released

    - by fernando
    MySQL Connector/Net 6.6, a new version of the all-managed .NET driver for MySQL, has been released. This is the GA intended to introduce users to the new features in the release. This release is feature complete. It is recommended for use in production environments. It is appropriate for use with MySQL server versions 5.0-5.6. It is now available in source and binary form from http://dev.mysql.com/downloads/connector/net/#downloads and mirror sites (note that not all mirror sites may be up to date at this point; if you can't find this version on some mirror, please try again later or choose another download site).
    The 6.6 version of MySQL Connector/Net brings the following new features:
      * Stored routine debugging
      * Entity Framework 4.3 Code First support
      * Pluggable authentication (now third parties can plug new authentication mechanisms into the driver)
      * Full Visual Studio 2012 support: everything from Server Explorer to IntelliSense and the Stored Routine debugger
    The release is available to download at http://dev.mysql.com/downloads/connector/net/6.6.html
    Documentation
    -------------------------------------
    You can view current Connector/Net documentation at http://dev.mysql.com/doc/refman/5.5/en/connector-net.html
    For specific topics:
      * Stored Routine Debugger: http://dev.mysql.com/doc/refman/5.5/en/connector-net-visual-studio-debugger.html
      * Authentication plugin: http://dev.mysql.com/doc/refman/5.5/en/connector-net-programming-authentication-user-plugin.html
    You can find our team blog at http://blogs.oracle.com/MySQLOnWindows. You can also post questions on our forums at http://forums.mysql.com/. Enjoy and thanks for the support!

  • Help with URL Rewrite

    - by bodesam
    This is the first time I'm doing this and I have been doing some research on it. I have a page that selects some info from a database and displays it with a link to a second page that uses the result to query the database, something like this:
        $sel = mysql_query("select id, title from thetable ");
        while ($row = mysql_fetch_array($sel)) {
            $id = $row['id'];
            $title = $row['title'];
            echo "<a href='more.php?id=$id'>$title</a>";
        }
    The issue is that on the more.php page, instead of more.php?id=5 showing in the address bar, I want something like more/title. Secondly, as is common on most sites, I want the link on the referring page to show this friendly URL on mouse hover, not more.php?id=5. I also notice that on most sites some words like 'a', 'and', 'the' etc. are usually removed from the URL title (even if they were there originally). Moreover, how does one handle the situation where more than one record has the same title? How does one go about achieving this URL rewrite with htaccess or whatever method is used? Thanks.
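
The two sub-questions about the URL title (dropping words such as 'a', 'and', 'the', and coping with duplicate titles) come down to how the slug is generated before it is stored next to each record. A minimal sketch follows, in Python for illustration rather than the poster's PHP. The stop-word list, the `-2`, `-3` suffix scheme and the names `slugify` and `unique_slug` are assumptions, not something the post specifies.

```python
import re

# Assumed stop-word list; the post only names 'a', 'and', 'the' as examples.
STOP_WORDS = {"a", "an", "and", "the", "of", "or"}

def slugify(title):
    """Lower-case the title, drop stop words, join the rest with hyphens."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    # Fall back to the full word list if the title is nothing but stop words.
    kept = [w for w in words if w not in STOP_WORDS] or words
    return "-".join(kept)

def unique_slug(title, taken):
    """Append -2, -3, ... until the slug is not already in `taken`."""
    base = slugify(title)
    slug, n = base, 2
    while slug in taken:
        slug = f"{base}-{n}"
        n += 1
    taken.add(slug)
    return slug

taken = set()
print(unique_slug("The Cat and the Hat", taken))
print(unique_slug("A Cat, a Hat", taken))
```

With a unique slug stored per row, the link is emitted as more/<slug>, and an Apache mod_rewrite rule along the lines of `RewriteRule ^more/([a-z0-9-]+)/?$ more.php?slug=$1 [L,QSA]` could map it back internally. The `slug` query parameter is likewise hypothetical; more.php would then look the record up by slug instead of id.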

  • Hosting woes

    Unfortunately quite a few people have noticed our recent hosting problems, but if you are reading this they should all be over, so please accept our apologies. Our former web host decided to migrate to a new platform. It had all sorts of great features, but on reflection hosting wasn’t one of them. We knew it was coming, and had even been proactive and requested several dates on their migration control panel so I could be around to check it afterwards. The dates came and went without anything happening, so we sat back and carried on for a couple of months, thinking they’d get back to us when they were ready. Then out of the blue I got an email saying it had happened! Now this is what I call timing: I had client work to complete, a 50-minute presentation to write, and there was a little conference called SQLBits that I help organise at the end of the week, and then our hosting provider decided to migrate our sites. Unfortunately they only migrated parts of the sites; they forgot things like the database for SQLDTS. The database eventually appeared, but the data didn’t. Then the data pitched up, but without the stored procedures. I was even asked if I could perform a backup and send it to them, as they were getting timeout errors. Never mind the issues of performing a native backup on a hosted server; whilst I could have done something, the question actually left me speechless. So you cannot access your own SQL server and you expect me to be able to help? This site was there, but hadn’t been set as an IIS application, so all path references were wrong, which meant no CSS, and all the internal navigation and links were wrong. The new improved hosting platform Control Panel didn't appear to like setting applications. It said it would, you’d have to wait 2 hours of course, then it just decided not to bother after all. So needless to say, after a very successful SQLBits I focused my attention on finding a new web host, and here we are again. Sorry it took so long.

  • With Google DFP (Small Business) is it possible to disable AdSense in an Ad Slot on a per-request basis?

    - by Daniel Pehrson
    Setup: I run a network of websites that target different hobby niches and have a section dedicated to community classifieds. I serve advertising on these sites through Google DFP for Small Business with AdSense enabled on the slots.
    Problem: One of the next sites in my network will be targeting the firearms/shooting industry, and as such the classifieds section will not comply with the prohibited content guidelines of AdSense regarding the sale (or coordination of sale) of weapons. I work very hard to comply with the guidelines of my partners even if I don't understand or agree with them, and after talking with many people I have decided that the best option is to disable AdSense serving on that section of that website, while leaving it on for the rest of the network.
    Solution: Right now my only idea for this is to duplicate all my site's ad slots and tack a "_sensitive" onto the end of each one (e.g. header and header_sensitive), conditionally registering ad slots based on whether or not I am in the sensitive section of the sensitive site. My hope, however, is that there may be a way to accomplish this without duplicating all my ad slots, possibly with some sort of option to the GA_googleFillSlot() call that allows me to say "load ads from this slot but do not serve AdSense no matter what."

  • WiFi problems on several Ubuntu installations

    - by Rickyfresh
    Okay, this is the first time I have ever had to ask a question, as usually the Ubuntu community has answered everything already. On this occasion there are many people asking for the answer, but not one good solution has become available so far. So someone please help, or I will have to install Windows on my son's and my girlfriend's PCs, and that would be a disaster as I am trying to help convince people to move from Windows.
    I installed 12.04 on three computers on the same day:
      * Dell Inspiron (works perfectly)
      * Toshiba Satellite
      * Home-built desktop
    The Dell works perfectly, but the other two keep losing their connection to the wireless Internet, and even when they are connected they stop connecting to web sites. For some reason it searches Google fine but will not connect to web sites when a link is clicked.
    So far people have recommended in other forums:
      * Removing network manager and installing wicd (didn't solve it)
      * Changing the MTU in the wireless settings (didn't solve it)
      * All sorts of messing about with Firefox settings (this doesn't solve it, and even if it did, it would leave most average PC users scratching their heads and wishing they had stuck to Windows)
    The problem exists on two very different machines with different wireless cards, so I doubt it's a driver or hardware issue. Many other Ubuntu users are also having the same problem with a vast array of different machines and wireless cards. Can someone please give a good solution to this, as it's going to turn a lot of people away from Ubuntu if they cannot get this sorted. I would give some PC specs, but the two machines are vastly different, and the other people complaining of this problem also have very different systems all showing the same problem.

  • Can't make my site available to the internet

    - by user1683645
    Hi, I'm using Ubuntu as the server OS for my web hosting, but I'm having a problem redirecting my domain name to my server. Here are my /etc/hosts file and /etc/apache2/sites-available/mysite file.
    hosts file:
        127.0.0.1 www.lowkey.se
        # The following lines are desirable for IPv6 capable hosts
        ::1 ip6-localhost ip6-loopback
        fe00::0 ip6-localnet
        ff00::0 ip6-mcastprefix
        ff02::1 ip6-allnodes
        ff02::2 ip6-allrouters
    sites-available/file:
        ServerAdmin webmaster@localhost
        ServerName www.lowkey.se
        DocumentRoot /var/www/doost/
        <Directory />
            Options FollowSymLinks
            AllowOverride None
        </Directory>
        <Directory /var/www/doost/>
            Options Indexes FollowSymLinks MultiViews
            AllowOverride None
            Order allow,deny
            allow from all
        </Directory>
        ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/
        <Directory "/usr/lib/cgi-bin">
            AllowOverride None
            Options +ExecCGI -MultiViews +SymLinksIfOwnerMatch
            Order allow,deny
            Allow from all
        </Directory>
        ErrorLog ${APACHE_LOG_DIR}/error.log
        # Possible values include: debug, info, notice, warn, error, crit,
        # alert, emerg.
        LogLevel warn
        CustomLog ${APACHE_LOG_DIR}/access.log combined
    And a screenshot from my domain name provider: http://imgur.com/VyqBR
    The site has been enabled in Ubuntu, I've restarted apache2, and the folder /var/www/doost/ is there. What am I doing wrong?

  • SEO Blog Indexing : Dot Wordpress Versus a Registered Domain?

    - by rumspringa00
    I've used Wordpress for a few of my clients' sites, mostly small businesses and ecommerce sites. I have found through Google Analytics as well as the All in One Webmaster plugin that when it comes to social media, using Wordpress is a surefire way of getting your site indexed by Google and occasionally Bing and Yahoo. Since I am a heavy WP user, I'd like to contribute by registering a dot Wordpress domain for my portfolio. When using a WP installation concurrently with a WP domain, e.g. myportfolio.wordpress.com, will the site be more or less likely to be indexed than with a generic myportfolio.com domain? I've seen mixed opinions: some people seem to favor a WP domain for URL output, while others say that it's a moot point, and that Google will not favor a WP domain over a dot com domain as long as your meta tags are updated and content is keyword optimized. I tend to disagree and believe a WP domain would more likely be indexed and output more URLs than an individual, laconic domain like myportfolio.com. Am I wrong? Thanks in advance!

  • SunSpace - a sentimental moment

    - by me
    I just came back from California, where I had a little sentimental moment. With great help from some former Sun colleagues we moved the old SunSpace gear into a new data center in Santa Clara. We will re-purpose the hardware as a new development infrastructure to build integrated demos around Oracle WebCenter products, Business Applications and Social Services. Now, I could not resist restarting the SunSpace applications to see if they still work. And hey, even though we had to re-IP the entire stack (the sun.com domain is gone), with some little hacking (thanks to Apache reverse proxy) we got it back! Hey Max, now I just need to change your SSO hack to get login working again. Hmm, I won't, but it is really nice to see it working again .. and it's time to switch it off and to work on the next cool things .. Do you know Oracle WebCenter Sites (formerly Fatwire)? It's Oracle's Web Experience Management Solution, a pretty cool technology with a very slick user interface. I especially like the drag&drop functionality, which allows non-technical users to easily publish content. Why do I mention it here? Because we will use the SunSpace gear to build cool Oracle WebCenter Sites demos and proof-of-concept integrations into Business Applications and Social Services. This is a sneak preview of what we are working on. Stay tuned.....

  • How would I broadcast a subdomain/virtual name on a local server with people connected to the same network

    - by Sarmen B.
    I have a server connected to the router which has Ubuntu 12.04. It has apache/mysql/php all installed and ready to go. The folder structure is like this:
        /var/www        -- this isn't the root
          -/libs
          -/logs
          -/public      - this is the root
          -/vhosts      - all subdomains go here
    I have a folder in vhosts named mysite. I went into /etc/apache2/sites-available and created a file, and here are the contents - (vhost file). I also added an entry in the /etc/hosts file containing:
        127.0.1.1 mysite.dev
    and I also did
        sudo a2ensite mysite
    I tried accessing the site from a computer via mysite.dev and via our public IP into the server, but I was not able to view it. The public directory in the structure above does display on all computers when I try our public IP, but for anything added in vhosts the site won't show. There is no domain attached; it's just our IP. I tried changing the port from 80 to, say, 9999 in the mysite file in sites-available and tried myip:9999, but that didn't work either. What am I doing wrong? Edit: I forgot to mention that the server is DMZed on the router.

  • Getting BeautifulSoup to find a specific <p>

    - by Ryan
    I'm trying to put together a basic HTML scraper for a variety of scientific journal websites, specifically trying to get the abstract or introductory paragraph. The current journal I'm working on is Nature, and the article I've been using as my sample can be seen at http://www.nature.com/nature/journal/v463/n7284/abs/nature08715.html. I can't get the abstract out of that page, however. I'm searching for everything between the <p class="lead">...</p> tags, but I can't seem to figure out how to isolate them. I thought it would be something simple like
        from BeautifulSoup import BeautifulSoup
        import re
        import urllib2

        address="http://www.nature.com/nature/journal/v463/n7284/full/nature08715.html"
        html = urllib2.urlopen(address).read()
        soup = BeautifulSoup(html)
        abstract = soup.find('p', attrs={'class' : 'lead'})
        print abstract
    Using Python 2.5, BeautifulSoup 3.0.8, running this returns 'None'. I have no option of using anything else that needs to be compiled/installed (like lxml). Is BeautifulSoup confused, or am I?
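
One way to narrow this down without installing anything is to run the same lookup with the standard library's HTML parser and see whether a `<p class="lead">` is found at all in the HTML that was actually downloaded. The sketch below is a swapped-in technique, not BeautifulSoup, and it is written for current Python 3 rather than the poster's Python 2.5; the class name `LeadParagraphParser` and the function `extract_lead` are made up for the example. Note also that the code in the post fetches the `/full/` URL while the prose links the `/abs/` one, so it may be worth checking which of the two pages contains the paragraph.

```python
from html.parser import HTMLParser

class LeadParagraphParser(HTMLParser):
    """Collects the text of the first <p> whose class list contains 'lead'."""

    def __init__(self):
        super().__init__()
        self.inside = False   # True while between <p class="lead"> and </p>
        self.done = False     # True once the first matching <p> has closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "p" and "lead" in classes and not self.done:
            self.inside = True

    def handle_endtag(self, tag):
        if tag == "p" and self.inside:
            self.inside = False
            self.done = True

    def handle_data(self, data):
        if self.inside:
            self.chunks.append(data)

def extract_lead(html):
    """Return the text of the first <p class="lead">, or None if absent."""
    parser = LeadParagraphParser()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip() or None

sample = '<div><p class="lead">An <b>example</b> abstract.</p></div>'
print(extract_lead(sample))
```

If this also returns None for the downloaded page, the paragraph is probably not in the served markup (or carries a different class), which would point away from the parser as the culprit.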

  • problem in loading class from 'me.prettyprint.hector.api.Serializer'

    - by dhananjay patil
    I have created an executable jar but am having a problem with a ClassNotFoundException. When I type the command
        java -jar JarFileName.jar arguments..
    I get this error message:
        Exception in thread "main" java.lang.NoClassDefFoundError: me/prettyprint/hector/api/Serializer
            at com.ensarm.niidle.web.scraper.NiidleScrapeManager.main(NiidleScrapeManager.java:21)
        Caused by: java.lang.ClassNotFoundException: me.prettyprint.hector.api.Serializer
            at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
            at java.security.AccessController.doPrivileged(Native Method)
            at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
            at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
            at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
            at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
            at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
            ... 1 more
    Please tell me the solution for this; the class is not getting loaded from the external jar.

  • a question on webpage data scraping using Java

    - by Gemma
    Hi there. I am trying to implement a simple HTML webpage scraper using Java, and I have a small problem. Suppose I have the following HTML fragment:
        <div id="sr-h-left" class="sr-comp">
        <a class="link-gray-underline" id="compare_header" rel="nofollow" href="javascript:i18nCompareProd('/serv/main/buyer/ProductCompare.jsp?nxtg=41980a1c051f-0942A6ADCF43B802'); "
        Compare
        Showing 1 - 30 of 1,439 matches,
    The data I am interested in is the integer 1,439 shown at the bottom. I am wondering how I can get that integer out of the HTML. I am considering using a regular expression with java.util.regex.Pattern to get the data out, but I am still not very clear about the process. I would be grateful if you could give me some hints or ideas on this data scraping. Thanks a lot.
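
A regular expression is enough for this particular string, provided the "Showing X - Y of N matches" wording is stable. The sketch below uses Python for illustration; the pattern itself uses only constructs that `java.util.regex.Pattern` also understands (with the backslashes doubled inside a Java string literal). The function name `total_matches` is hypothetical.

```python
import re

# Capture the total in "Showing 1 - 30 of 1,439 matches".
TOTAL_RE = re.compile(r"Showing\s+\d+\s*-\s*\d+\s+of\s+([\d,]+)\s+matches")

def total_matches(html):
    """Return the total as an int, or None if the phrase is not present."""
    m = TOTAL_RE.search(html)
    if m is None:
        return None
    # Strip the thousands separator before converting.
    return int(m.group(1).replace(",", ""))

print(total_matches("Compare Showing 1 - 30 of 1,439 matches,"))
```

The capture group keeps the comma so that totals of any size match; it is removed only at conversion time.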

  • Scraping paginated items from a website using scrapy

    - by Mridang Agarwalla
    I'm using scrapy to scrape items from a site, and I'm unable to implement this scraping pattern. The site I'm trying to scrape is a forum, and I scrape the site once a day. Each page has a table containing posts. New posts are added to the top of the table, and as more and more posts are added to the site, the older posts go further into the pages due to pagination. This is a very simple scenario, and we will assume that the order of the posts never changes. I would like to scrape this site and collect all the "new" records until the last scraped post from yesterday is encountered. I have configured my spider to paginate endlessly, and when it encounters yesterday's last scraped post, it should stop. How can I implement this? (My Scrapy installation works with my Django installation using django-dynamic-scraper.)
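
Independently of Scrapy, the stopping rule described here is "walk pages newest-first and stop at the first post already seen". A minimal plain-Python sketch of that rule is below; the post dictionaries with an `id` key, the `pages` iterable and the name `new_posts` are assumptions for illustration, not part of Scrapy or django-dynamic-scraper.

```python
def new_posts(pages, last_seen_id):
    """Yield posts newest-first until the previously scraped post is reached.

    `pages` is any iterable of pages, each page being a list of posts in
    site order. Because it is consumed lazily, later pages are never
    requested once the known post has been found.
    """
    for page in pages:
        for post in page:
            if post["id"] == last_seen_id:
                return          # everything from here on was scraped yesterday
            yield post

pages = [[{"id": 9}, {"id": 8}], [{"id": 7}, {"id": 6}], [{"id": 5}]]
print([p["id"] for p in new_posts(pages, last_seen_id=7)])
```

In a spider, the same idea would presumably mean remembering the newest post id from the previous run and simply not issuing the next-page request once that id turns up in a page.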

  • What's the fastest way to scrape a lot of pages in php?

    - by Yegor
    I have a data aggregator that relies on scraping several sites and indexing their information in a way that is searchable to the user. I need to be able to scrape a vast number of pages daily, and I have run into problems using simple curl requests, which are fairly slow when executed in rapid sequence for a long time (the scraper runs 24/7, basically). Running a multi curl request in a simple while loop is fairly slow. I sped it up by doing individual curl requests in a background process, which works faster, but sooner or later the slower requests start piling up, which ends up crashing the server. Are there more efficient ways of scraping data? Perhaps command line curl?
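
The pile-up described here is typical of launching background requests without a cap. One common remedy is a fixed-size worker pool combined with a per-request timeout, so that slow responses occupy at most a known number of workers. The sketch below shows the pattern in Python for illustration rather than PHP; `fetch_all` and the `fetch(url, timeout)` callable are hypothetical names, and the caller supplies the function that performs the real HTTP request.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def fetch_all(urls, fetch, max_workers=8, timeout=10):
    """Run fetch(url, timeout) for every URL with at most max_workers in flight.

    Returns (results, errors): two dicts keyed by URL. A failing or
    timed-out request is recorded in errors instead of stopping the run.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch, url, timeout): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:
                errors[url] = exc
    return results, errors
```

The pool size bounds memory and open connections regardless of how many URLs are queued, and the timeout has to be enforced inside `fetch` itself so that a stalled request releases its worker.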

  • Easy to use/learn PHP framework?

    - by Meredith
    I need to build a PHP app, and I was thinking about using a framework (I have never used one before). I've been browsing around some, but most of them seem kind of complicated. I really liked what I saw about Symfony, but it looks like I will have to spend about a month until I really understand how to use it, and in one month I could code the app I have in mind 5 times without a framework. But I want to use one to "standardize" my code and prevent bugs. So I was wondering if someone could share which PHP frameworks you think are easier to learn. My application will use MySQL, and it will have some sort of "search engine" to search data that will be populated in the database using a few "scraper scripts" (that I also want to code using the framework).

  • Rails architecture questions

    - by justinbach
    I'm building a Rails site that, among other things, allows users to build their own recipe repository. Recipes are entered either manually or via a link to another site (think epicurious, cooks.com, etc). I'm writing scripts that will scrape a recipe from these sites given a link from a user, and so far (legal issues notwithstanding) that part isn't giving me any trouble. However, I'm not sure where to put the code that I'm writing for these scraper scripts. My first thought was to put it in the recipes model, but it seems a bit too involved to go there; would a library or a helper be more appropriate? Also, as I mentioned, I'm building several different scrapers for different food websites. It seems to me that the elegant way to do this would be to define an interface (or abstract base class) that determines a set of methods for constructing a recipe object given a link, but I'm not sure what the best approach would be here, either. How might I build out these OO relationships, and where should the code go?
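
The "interface plus one scraper per site" idea from the post can be sketched as an abstract base class and a registry keyed by host name. The example is in Python purely for illustration (in Ruby the same shape would be a base class or module with subclasses); every name in it, including `RecipeScraper`, `register`, `scraper_for` and the `recipes.example.com` host, is hypothetical.

```python
from abc import ABC, abstractmethod
from urllib.parse import urlparse

class RecipeScraper(ABC):
    """Common interface: each site-specific scraper turns HTML into a recipe."""
    domain = ""  # host name this scraper handles

    @abstractmethod
    def parse(self, html):
        """Return a dict describing the recipe found in `html`."""

SCRAPERS = {}  # host name -> scraper class

def register(cls):
    """Class decorator that files a scraper under its domain."""
    SCRAPERS[cls.domain] = cls
    return cls

@register
class ExampleSiteScraper(RecipeScraper):
    domain = "recipes.example.com"

    def parse(self, html):
        # Placeholder parsing; a real scraper would walk the site's markup.
        return {"title": html.strip(), "source": self.domain}

def scraper_for(url):
    """Pick the scraper registered for the URL's host."""
    host = urlparse(url).hostname or ""
    if host not in SCRAPERS:
        raise LookupError(f"no scraper registered for {host!r}")
    return SCRAPERS[host]()

scraper = scraper_for("http://recipes.example.com/pie/42")
print(scraper.parse(" Apple pie "))
```

Keeping the scrapers in their own classes, outside the recipe model, means the model only ever receives the finished dictionary, and supporting a new site is a matter of adding one subclass.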

  • How do I prevent an https response from throwing an AuthenticationException with Fiddler running?

    - by Ichabod Clay
    Relative newbie to C# here :) I'm currently creating a web link scraper and having issues with the responses I'm getting when trying to log in to the website via my program. I'm trying to use Fiddler to see if my program is sending the proper data, but my program throws an AuthenticationException when trying to get a response from the site with Fiddler running. The requests are being sent over HTTPS, and Fiddler's certificate is the cause of the exception being thrown. My question is, what can I implement in my program to have it disregard the certificate authentication? As far as my program goes, the requests and responses are being handled by the HttpWebRequest and HttpWebResponse classes.

  • Using Nokogiri to scrape groupon deal

    - by hyngyn
    I'm following the Nokogiri railscast to write a scraper for Groupon. I keep on getting the following error when I run my rb file.
        traveldeal_scrape.rb:10: warning: regular expression has ']' without escape: /\[0-9 \.]+/
        Flamingo Conference Resort and Spa Deal of the Day | Groupon Napa / Sonoma
        traveldeal_scrape.rb:9:in `block in <main>': undefined local variable or method `item' for main:Object (NameError)
    Here is my scrape file.
        require 'rubygems'
        require 'nokogiri'
        require 'open-uri'

        url = "http://www.groupon.com/deals/ga-flamingo-conferences-resort-spa?c=all&p=0"
        doc = Nokogiri::HTML(open(url))
        puts doc.at_css("title").text
        doc.css(".deal").each do |deal|
          title = deal.at_css("#content//a").text
          price = deal.at_css("#amount").text[/\[0-9\.]+/]
          puts "#{title} - #{price}"
          puts deal.at_css(".deal")[:href]
        end
    I used the exact same rubular expression as the tutorial. I am also unsure of whether or not my CSS tags are correct. Thanks!
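
The warning in the output points at the backslash before the opening bracket: `\[` makes the bracket a literal character, so the pattern no longer contains a character class and the closing `]` is left dangling. The effect can be reproduced in Python, used here for illustration because the bracket-escaping rule is the same as in Ruby; the sample price string is made up.

```python
import re

text = "$129.00"  # made-up sample; the real page text is not shown in the post

# As posted: "\[" matches a literal "[", so this searches for the
# literal characters "[0-9." followed by one or more "]".
escaped = re.search(r"\[0-9\.]+", text)

# Without the backslash, "[0-9.]" is a character class: digits and dots.
char_class = re.search(r"[0-9.]+", text)

price = char_class.group(0)
print(escaped, price)
```

Inside a character class the dot needs no escape, so `[0-9.]+` and `[0-9\.]+` behave identically.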

  • How to work around a site forbidding me to scrape their images with PHP

    - by Petruza
    I'm scraping a site, searching for JPGs to download. Scraping the site's HTML pages works fine, but when I try getting the JPGs with CURL, copy(), fopen(), etc., I get a 403 Forbidden status. I know that's because the site owners don't want their images scraped, so I understand a good answer would be "just don't do it, because they don't want you to". OK, but let's say it's fine and I try to work around this; how could this be achieved? If I get the same URL with a browser, I can open the image perfectly. It's not that my IP is banned or anything, and I'm testing the scraper one file at a time, so it's not blocking me because I make too many requests too often. From my understanding, it could be that the site is checking for some cookies that confirm that I'm using a browser and browsing their site before I download a JPG, or that PHP is using some user agent for the requests that the server can detect and filter out. Anyway, any ideas?

  • wxPython formatting questions

    - by Kevin
    I have an app I was working on to learn more about wxPython (I have primarily been a scripter). I forgot about it, and now I am opening it back up. It's a screen scraper, and I have it working almost the way I want it; I'm going to build a regex parser to strip out the links I don't need from every scrape. The questions I have are these. In its current state, if I check more than one site, it goes out and scrapes, and returns the results in separate windows (the for:each section in the Clicked function). I want to put them all together in a frame in the window. I also want to know if I can take the list they are read into and send it to a checklist, so someone could check off separate items. I want to build a save function and keep certain ones. As for the save function, I want to keep saved checks; are there calls to the widgets to save their states? I know it's a lot, but thanks for the help.

  • Getting content of a Facebook page in Adobe Flex

    - by cuneyt
    Hi guys, I wrote a Flex application that sends a UrlRequest to Facebook and gets the content of page as a string. The application user clicks a button, and the application connects to Facebook. And no I do not mean using Facebook API. It is like a screen scraper. This application worked locally, but when deployed to server it gives a sandbox security error. I have my crossdomain.xml on the root, but I think the problem is not that. Not only Facebook, but I cannot get any web site when the application is deployed on server. What should I do to get the content of a remote web page?

  • Scaling a ruby script by launching multiple processes instead of using threads.

    - by Zombies
    I want to increase the throughput of a script which does net I/O (a scraper). Instead of making it multithreaded in Ruby (I use the default 1.9.1 interpreter), I want to launch multiple processes. So, is there a system for doing this where I can track when one finishes and re-launch it, so that I have X number running at any time? Also, some will run with different command args. I was thinking of writing a bash script, but it sounds like a potentially bad idea if there already exists a method for doing something like this on Linux.
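
The "keep X processes running and relaunch each one when it exits" loop can be sketched with a small supervisor. The version below is in Python for illustration and uses only the standard library; the function name `supervise` and the `rounds` limit are assumptions added so the example terminates (a real supervisor would typically relaunch indefinitely).

```python
import subprocess
import sys
import time

def supervise(commands, rounds=1, poll_interval=0.05):
    """Keep one process per command running; relaunch each one until it has
    completed `rounds` runs. Returns the exit codes per command index."""
    procs = {i: subprocess.Popen(cmd) for i, cmd in enumerate(commands)}
    exit_codes = {i: [] for i in procs}
    while procs:
        for i, proc in list(procs.items()):
            code = proc.poll()              # None while still running
            if code is None:
                continue
            exit_codes[i].append(code)
            if len(exit_codes[i]) < rounds:
                procs[i] = subprocess.Popen(commands[i])   # relaunch
            else:
                del procs[i]
        time.sleep(poll_interval)
    return exit_codes

if __name__ == "__main__":
    # Each entry is one worker with its own argument list.
    workers = [[sys.executable, "-c", "print('worker', %d)" % n] for n in range(3)]
    print(supervise(workers, rounds=2))
```

Each entry in `commands` carries its own arguments, which covers the case where the workers are started with different command args.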
