Search Results

Search found 446 results on 18 pages for 'crawl'.

Page 16 of 18

  • Need help diagnosing my machine

    - by Tom Collins
    I have something that just slows my computer to a crawl sometimes, even when I'm not running anything big. Yesterday all I had running (besides background apps) were Firefox & Windows Explorer and I could barely even switch screens. Nothing shows up in Task Manager as hogging the CPU. I keep all non-essential services (MySQL & MSSQL) stopped unless I need them. I made some restore points not long ago, but they disappeared. This is a development machine with a LOT of apps installed, so I really, really do not want to reinstall Windows. What I'm looking for are ideas or tools I can use to help diagnose this problem. The only clue I have is that this started right after I installed Office 2013 (with Office 2010 still installed as well), installed Visual Studio 2012 (also keeping 2010 as a co-install), and installed MSSQL 2012 (an upgrade from 2008, no co-install). Also, the computer runs fine in Safe Mode. I've just run out of ideas of what to check. Any help / suggestions would be much appreciated. Thanks. P.S. I'm running Win 7 Pro (x64). Office is also 64-bit. Visual Studio & MSSQL are 64-bit if that option was available (not sure).

    Read the article

  • ESX 4.0 space: DASD, NAS, or ?

    - by thormj
    I put together an ESX box for better management, but its performance is a WTF item; I'm a noob at dealing with ESX, so I'm looking for a laundry list of reading material to help me straighten this out so I can go back to .NET programming.
    Current storage system: We're running RAID5 + hotspare (8x500 GB spindles) on a PERC6i in a Dell 2910. Due to ESX limitations, the PERC is showing the storage as 1x2TB + 1x800GB "partitions." I'm not sure of the setup's configuration (stride / stripe / ???) at all.
    Our applications: We have an SBS server as well as a minor (2x50 GB, but growing at 10GB/month) database server. The application that lives on the database VM is CPU- and I/O-intensive; it's a database-churning exercise mixed in with a lot of computation on the data (fixing that performance is what I'm supposed to be working on).
    Performance issue: When I do a backup, restore, or worse (copy a backup from one VM to another to move it to the QA VM), the entire system slows to a crawl (even "unrelated" VMs). I originally thought a DASD setup would be quite good since you get PCI-X bandwidth, but the system-wide slowdown is killing productivity.
    Questions: What should I do to make an intelligent decision about NAS vs RAID vs SAN vs DASD? Are there sweet spots/ugly spots in the storage setup? Can you use an SSD PCI-X card in ESX for the tempdb? Good/bad idea? Is there any way to "share" an image in a copy-on-write fashion? Most of the "backup-copy-restore" is to "put a clean image on the dev boxes"; if I could have them "share" the master image, the "big copy" (2x50 GB) would only need to be done once per week instead of once per dev per week. [Runtime performance isn't a concern with the dev boxes, but the backup/copy/restore kills production, SBS, and everything else on the box.]

    Read the article

  • Mac OS Leopard: SyncServer process constantly using 100% CPU

    - by macca1
    I am running Leopard, upgraded from Tiger. Every once in a while the SyncServer process starts up and eats up all the CPU; the fans go to full blast and the laptop slows down to a crawl. I need to force quit the process from Activity Monitor to get it under control. It disappears for a while, but eventually gets started again. I do have an iPhone that I sync, so I'm wondering if SyncServer might be an Apple process checking for my phone being plugged in.
    Edit: Tried iSync and the manual resetsync as suggested, but got this output:
        Vince-2:~ vince$ /System/Library/Frameworks/SyncServices.framework/Versions/A/Resources/resetsync.pl full
        2010-03-12 08:03:50.230 perl[176:10b] SyncServer is unavailable: exception when connecting: connection timeout: did not receive reply
        PerlObjCBridge: NSException raised while sending reallyResetSyncData to NSObject object
            name: "ISyncServerUnavailableException"
            reason: "Can't connect to the sync server: NSPortTimeoutException: connection timeout: did not receive reply ((null))"
            userInfo: ""
            location: "/System/Library/Frameworks/SyncServices.framework/Versions/A/Resources/resetsync.pl line 16"
        ** PerlObjCBridge: dying due to NSException
        Vince-2:~ vince$
    And while that ran, SyncServer spun up to 95-100% CPU just like it always does.

    Read the article

  • Copy a website and preserve the file & folder structure

    - by DrStalker
    I have an old web site running on an ancient version of Oracle Portal that we need to convert to a flat HTML structure. Due to damage to the server we are not able to access the administrative interface, and even if we could there is no export functionality that can work with modern software versions. It would be enough to crawl the website and have all the pages & images saved to a folder, but the file structure needs to be preserved; that is, if a page is located at http://www.oldserver.com/foo/bar/baz/mypage.html then it needs to be saved to /foo/bar/baz/mypage.html so that the various JavaScript bits will continue to function. None of the web crawlers I've found have been able to do this; they all want to rename the pages (page01.html, page02.html etc) and break the folder structure. Is there any crawler out there that will recreate the site structure as it appears to a user accessing the site? It doesn't need to redo any of the content of the pages; once rehosted the pages will all have the same names they did originally, so links will continue to work.
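    For illustration only, here is a minimal sketch of the "crawl and keep the URL paths" idea in Python. It is not a recommendation for a specific crawler tool; the requests/BeautifulSoup libraries, the start URL, and the decision to follow only same-host <a> and <img> references are assumptions, not details from the question.
        # Hypothetical sketch: mirror a site while keeping each page's URL path on disk.
        import os
        from urllib.parse import urljoin, urlparse
        import requests
        from bs4 import BeautifulSoup

        START = "http://www.oldserver.com/foo/bar/baz/mypage.html"   # assumed entry point
        HOST = urlparse(START).netloc
        seen, queue = set(), [START]

        while queue:
            url = queue.pop()
            if url in seen or urlparse(url).netloc != HOST:
                continue
            seen.add(url)
            resp = requests.get(url)
            # Save under a local path that mirrors the URL path, e.g. ./foo/bar/baz/mypage.html
            path = urlparse(url).path.lstrip("/")
            if not path or path.endswith("/"):
                path += "index.html"
            os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
            with open(path, "wb") as f:
                f.write(resp.content)
            # Only parse HTML responses for further links.
            if "html" in resp.headers.get("Content-Type", ""):
                soup = BeautifulSoup(resp.text, "html.parser")
                for tag, attr in (("a", "href"), ("img", "src")):
                    for node in soup.find_all(tag):
                        if node.get(attr):
                            queue.append(urljoin(url, node[attr]))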

    Read the article

  • Site hanging in iis7 - how do I troubleshoot?

    - by Chris Foot
    I am currently having a problem with a Windows 2008 server running IIS 7. The server runs several sites but only seems to have the issue with one particular site. Every so often, the whole server slows to a crawl with nearly all requests timing out! Invariably, when we log in to take a look there is always an IIS process using up around 90% CPU. Looking into the worker processes in IIS there are usually one or two requests that have been running for a long time. They are always in the ExecuteRequestHandler state with ManagedPipeline as the module name, and the current ones I'm looking at have been running for 7686248 (what units is this in? it doesn't say). It is also not always the same page; in fact we have seen at least 3 different pages listed under URL when this has happened. It seems that the only way to bring the server back to life is to kill the 90% process! The site is running under .NET 4.0 and the code on it is very similar to other sites on the server which do not have the problem! How do I start troubleshooting this?

    Read the article

  • wget crawling search results of news website

    - by kiltek
    I am trying to crawl the search results of a news website using wget. The name of the website is www.voanews.com. After typing in my search keyword and clicking search, it proceeds to the results. Then I can specify a "to" and a "from" date and hit search again. After this the URL becomes:
        http://www.voanews.com/search/?st=article&k=mykeyword&df=10%2F01%2F2013&dt=09%2F20%2F2013&ob=dt#article
    and the actual content of the results is what I want to download. To achieve this I created the following wget command:
        wget --reject=js,txt,gif,jpeg,jpg \
             --accept=html \
             --user-agent=My-Browser \
             --recursive --level=2 \
             www.voanews.com/search/?st=article&k=germany&df=08%2F21%2F2013&dt=09%2F20%2F2013&ob=dt#article
    Unfortunately, the crawler doesn't download the search results. It only gets into the upper link bar, which contains the "Home, USA, Africa, Asia, ..." links, and saves the articles they link to. It seems like the crawler doesn't check the search result links at all. What am I doing wrong, and how can I modify the wget command to download only the links in the search result list (and of course the pages they link to)?
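    As a point of comparison, here is a rough sketch in Python of fetching the same search URL and pulling out the result links explicitly, instead of relying on recursive wget. The query parameter names are copied from the URL in the question; the requests/BeautifulSoup libraries and the guess that article links can be told apart by their URL path are assumptions, not the site's documented structure.
        # Hypothetical sketch: fetch the search page, then download only the linked articles.
        from urllib.parse import urljoin
        import requests
        from bs4 import BeautifulSoup

        params = {              # parameter names taken from the URL in the question
            "st": "article",
            "k": "germany",
            "df": "08/21/2013",
            "dt": "09/20/2013",
            "ob": "dt",
        }
        search = requests.get("http://www.voanews.com/search/", params=params,
                              headers={"User-Agent": "My-Browser"})
        soup = BeautifulSoup(search.text, "html.parser")

        for a in soup.find_all("a", href=True):
            href = urljoin(search.url, a["href"])
            # Assumption: article results live under /content/ -- adjust to the real markup.
            if "/content/" in href:
                page = requests.get(href, headers={"User-Agent": "My-Browser"})
                name = href.rstrip("/").rsplit("/", 1)[-1] + ".html"
                with open(name, "w", encoding="utf-8") as f:
                    f.write(page.text)
    One side note on the original command: in a shell, the unquoted & characters in the URL terminate the command and background it, so the query string after the first & never reaches wget unless the whole URL is quoted, whichever crawling approach is used.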

    Read the article

  • IIS7 ASP.NET application - 2 identical apps in 2 identical app pools, 1 is responsive and 1 is not

    - by Ben
    I have an ASP.NET (v4.0) web app that is installed in a virtual directory (as an application) and is hosted in its own app pool. This is repeated for each instance of the app (i.e. per customer). The app pools are integrated (not classic) mode and LoadUserProfile is set to true; otherwise, default settings. Each instance currently has its own copy of the code/config, and its own data folder (basic file reads/writes). One instance of this app runs well (the operation used for comparison takes ~4 seconds). Every other instance runs slowly (from 10-25 seconds for the same operation). If I move a slower instance to the "fastest" app pool, that instance springs to life. If I move the faster instance into a slower app pool, that instance slows to a crawl. The app pools were created in the same way initially - manually. I later used the PowerShell copy routine to ensure an exact copy of the faster app pool, and still the same behaviour. Comparing the apppool.config files shows they are identical barring the virtual directory assignments. There are no shared resources that are being blocked, so far as I can tell, and I tested that by shutting down the performant app pool and restarting... slow is still slow, and then when I restart that app pool (so it's loaded last) it's still faster...

    Read the article

  • Orphaned SQL Recordsets/Connections with IIS

    - by Damian
    I have an IIS 6 site running on Windows 2003 Server x86 with MS SQL 2005 Enterprise edition, running Classic ASP (no choice). The site runs very fast with about 8,000 page views per hour. All of my SQL tables are indexed and I have used the profiler to check my queries, the slowest of which is only about 10-15ms. I have autoshrink disabled, autogrow is set to 250mb, and the database is 2gb with 800mb of free space. My problem is that every now and then the site will slow to a crawl for no reason. Pages that just have a simple 'connect to database and increment a hit counter' work OK, but more SQL-intensive pages that normally execute in about 60ms take 25,000ms to run. This happens for about 30 seconds and then goes away. I was having an issue with orphan recordsets and connections due to the way I was releasing them. I have fixed this up and the issue is much better, but I am still getting them. Is there a way with perfmon, etc. to track when SQL Server or Windows closes these orphan connections? At least if I can monitor the issue I will know if I am making progress or if I am even looking at the right things. Is there anything else I might be missing? Thank you!

    Read the article

  • Routing / binding 128 IPs to one server

    - by Andrew
    I have an Ubuntu server with 128 IPs (static external IPs, 86.xx.xx.16), and I want to crawl pages through different IPs. The gateway is xx.xxx.xxx.1, the main IP is xx.xxx.xxx.16, and the other 128 IPs are xx.xxx.xxx.129/255. I tried this configuration in /etc/network/interfaces but it doesn't work. It works if I remove the gateway for the aliases eth0:0 and eth0:1, so I think this is a routing problem.
        auto lo
        iface lo inet loopback

        auto eth0
        auto eth0:0
        auto eth0:1

        iface eth0 inet static
            address xx.xxx.xxx.16
            netmask 255.255.255.128
            gateway xx.xxx.xxx.1

        iface eth0:0 inet static
            address xx.xxx.xxx.129
            netmask 255.255.255.128
            gateway xx.xxx.xxx.1

        iface eth0:1 inet static
            address xx.xxx.xxx.130
            netmask 255.255.255.128
            gateway xx.xxx.xxx.1
    Also, please tell me how to "reset" every change that I made in networking and routing.
    Update: I removed the gateway and now it works. I can reach the website through all 128 IPs. But when I try to bind a socket connection in PHP to a specific IP I get no answer:
        socket_bind($sock, "xx.xxx.xx.xxx");
        socket_connect($sock, 'google.com', 80);
    I tried using a sniffer to see the packets, and I see the packet sent from the bound IP to google.com but the connection can't be established. I don't know anything about the "route" command, but I have a feeling that this is the solution.
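    The PHP snippet above binds a source address before connecting; the same idea in Python, as a quick way to test whether a given alias IP can actually make outbound connections independently of PHP, might look like the sketch below. The source IP is a placeholder, and whether the connection succeeds still depends on the routing setup discussed in the question.
        # Hypothetical test: open an outbound connection from one specific local IP.
        import socket

        SRC_IP = "xx.xxx.xxx.130"          # placeholder: one of the alias addresses

        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.settimeout(10)
        s.bind((SRC_IP, 0))                # port 0 = let the OS pick an ephemeral source port
        s.connect(("google.com", 80))      # raises socket.timeout/OSError if routing is wrong
        s.sendall(b"HEAD / HTTP/1.0\r\nHost: google.com\r\n\r\n")
        print(s.recv(200))
        s.close()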

    Read the article

  • mod_fcgi in virtualmin: graceful kill fail, sending SIGKILL?

    - by mgjk
    Yesterday around 1am, our server ground to a crawl. This doesn't happen often, but I'm trying to get to the bottom of it. There was no unusual traffic volume and no unusual processes running; all of a sudden the server just started killing fcgid processes:
        [Thu Aug 02 01:17:32 2012] [warn] mod_fcgid: process 26460 graceful kill fail, sending SIGKILL
    ... for as many fcgid processes as we have. CPU idle fell to 0% and I/O seemed to take up most of the load. The issue lasted about 5 minutes. I suspect there was some swap activity, although I'm not sure if it was due to killed processes being swapped in to die, or if it was because some process ramped up memory usage faster than my process-watching scripts can see. The oom-killer wasn't triggered (at least it's not logged), so I think this was Apache restarting the processes for some reason. This is not regular, and nothing obvious appears in cron. Is there a normal Apache process which might cause this? We run dozens of different sites, and it was late at night, so volume was very, very low (maybe 200 requests in a 10-minute period).

    Read the article

  • Puzzling TCP performance over 3G / UMTS

    - by lemonsqueeze
    I'm using 3G as my primary internet connection, and TCP over this thing is getting more puzzling every day. For example: downloading from kernel.org is crazy fast:
        $ wget http://www.kernel.org/pub/linux/kernel/v3.0/linux-3.6.8.tar.bz2
    increases to ~500kB/s after a few secs! Some servers are incredibly slow, for instance www.graphic-pc.com: same thing, downloading a big file with wget, it starts at ~30kB/s for a split second, then collapses to 5-10k or even worse. Web browsing is decent but somewhat unreliable. Randomly, a page will take really long to load or even fail to load, but a reload can succeed almost immediately. Now, by chance I started playing with OpenVPN over UDP on top of the 3G connection, and OMG suddenly everything's extremely fast! The same www.graphic-pc.com now shoots at 100-200kB/s! What's going on here? How come it is so much better with the VPN than without? And why does graphic-pc.com crawl when kernel.org flies? Something to do with my TCP stack (or the server), or some buggy router in between?
    Notes: The setup is a laptop running Ubuntu Lucid and a Huawei 3G dongle (so a direct pppd connection). I can reproduce this pretty much any time during the day and I'm not moving, so it's clearly not cell environment or internet congestion (although kernel.org without VPN sometimes does worse in the evening, 60kB or so - but still 500kB with VPN!). For 2), wireshark shows retransmitted packets, dup ACKs, even out-of-order sometimes. I've tried playing with different /proc/sys/net/ipv4 parameters (tcp_rmem, window_scaling, tcp_congestion...); it doesn't seem to make a difference.
    Update: Tried under Windows 7 (no VPN) with some interesting results:
        tcp settings        default     tcp_optimizer
        kernel.org          10 kB/s     20 kB/s
        graphic-pc.com       8 kB/s     70 kB/s !
    tcp_optimizer turned on CTCP among other things. Have to check what OS graphic-pc.com is running; my bet is Linux's tcp_westwood and MS CTCP don't mix well here...

    Read the article

  • Raspberry pi slows down my entire network

    - by gnusouth
    Whenever my Raspberry Pi is connected to the network (via ethernet) the entire network slows to a crawl. On my main computer, ping times for google.com go from ~10ms to ~200ms and it takes forever to load web pages. Connections are also slow on the Pi, with an apt-get update showing pathetic speeds on the order of 1KB/s. Turning off the Pi completely removes the drag from the network. I've tried static and dynamic IP addresses for the Pi, but both have the same problems. I'm currently using Raspbian (downloaded today), but I also had this problem with Arch Linux. I've checked the connection's duplex with dmesg | grep -i duplex, which shows that the Pi's connection is running at 100Mbps, full-duplex, as expected. My modem/router is a Billion 7404VNPX (an Australian thing); relatively high-end, albeit a bit buggy at times (it will occasionally delete all its firewall settings). It assigns IPs in the range 192.168.1.1 to 192.168.1.20 and has 192.168.1.254 as its own IP. When I assign static IPs I tend to use the 192.168.1.200 area. Does anyone have any idea as to what could be causing this weird slowdown? Or any tests I could try? Thanks

    Read the article

  • Active Directory: Determining DN or OU from log in credentials [closed]

    - by Christopher Broome
    I'm updating a PHP login process to leverage Active Directory on a Windows server. The login itself seems pretty straightforward via ldap_bind, but I also want to pull some profile information from the AD server (first name, last name, etc...), which seems to require a fully qualified distinguished name (DN). On the Windows server I can grab this via 'dsquery user' at the command prompt, but is there a way to get the same value from just the user's login credentials in PHP? I want to avoid getting a list of hundreds of DNs when on-boarding clients and associating each with one of our users, so any means to determine this programmatically would be preferable. Otherwise, I'll know the domain and host for the request, so I can at least set the DC portions of the DN, but the organizational units (OU) seem to be pretty important for querying data. If I can find some of the root-level OU values associated with the user I can do an ldap_search and crawl. I browsed through the existing questions and found some similar ones, but nothing that really addressed this, so my apologies if the obvious answer is out there. Thanks for the help.
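    For what it's worth, the usual trick is to bind with the user's own credentials and then search by account name to read back the DN and profile attributes. The question is PHP (ldap_bind/ldap_search), but here is the same idea sketched in Python with the ldap3 package; the server name, base DN, account name, and attribute list are placeholders, not values taken from the question.
        # Hypothetical sketch: bind as the user, then look up their DN and profile attributes.
        from ldap3 import Server, Connection, ALL, SUBTREE

        server = Server("ad.example.com", get_info=ALL)        # placeholder host
        conn = Connection(server, user="EXAMPLE\\jdoe", password="secret", auto_bind=True)

        # Search the domain base by sAMAccountName; the returned entry's DN carries the OU path.
        conn.search(search_base="dc=example,dc=com",
                    search_filter="(sAMAccountName=jdoe)",
                    search_scope=SUBTREE,
                    attributes=["distinguishedName", "givenName", "sn"])

        if conn.entries:
            entry = conn.entries[0]
            print(entry.entry_dn)            # e.g. CN=John Doe,OU=Staff,DC=example,DC=com
            print(entry.givenName, entry.sn)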

    Read the article

  • nVidia performance with newer X and newer driver abysmal with Compiz

    - by Nakedible
    I recently upgraded Debian to Xorg 2.9.4 and installed nvidia-glx from experimental, version 260.19.21. This was somewhat of an uphill battle as the dependencies for the experimental nvidia-glx package are still somewhat broken. I got it to work without forcing the installation of any packages and without modifying the packages. However, after the upgrade compiz performance has been abysmal. I am using the desktop wall plugin and switching viewports is really slow - takes a few seconds for each switch. In addition to this, every effect that compiz does, such as zoom animations for icons when launching applications, takes seconds. The viewport switching speed changes relative to the amount of windows on that virtual screen - empty screens switch almost at normal speed, single browser windows work almost decently, but just 4 rxvt terminals slows the switches down to a crawl. My compiz configuration should be pretty basic. Xorg is likewise configured without anything special - the only "custom" configuration is forcing the driver name to be "nvidia". I've fiddled around with the nvidia-settings and compizconfig trying different VSync settings, but none of those helped. My graphics card is: NVIDIA GPU NVS 3100M (GT218) at PCI:1:0:0 (GPU-0). This is laptop GPU that is from the Geforce GTX 200 series. Graphics card performance should naturally be no problem. EDIT: In the end, nothing really worked, and I got really annoyed with the state of compiz and its support in Debian. Many nVidia driver revisions have passed and I am using Gnome 3 now, so I am accepting the best answers to this question even though the issue was not resolved.

    Read the article

  • Using JavaScript/jQuery to return a list of CSS selectors based on highlighted text

    - by Bungle
    I've been given some project requirements that involve (ideally) returning a list of CSS selectors based on highlighted text. In other words, a user could do something like this on a page: Click a button to indicate that their next text selection should be recorded. Highlight some text on the page. See a generated list of CSS selectors that correspond to all the elements that contain the highlighted text. Firstly, does this seem like a feasible goal? jQuery makes it easy to use a selector to access a particular element, but I'm not sure if the reverse holds true. If an element lacks an id attribute, I also don't know how you'd return an "optimized" selector - i.e., one that identifies an element uniquely. Maybe crawl up the DOM until you find an ID, then stem the selector from there? Secondly, from a high-level perspective, any ideas on how to go about this? Any tips or tricks that could speed development? I very much appreciate any help. Thanks!

    Read the article

  • Improving long-polling Ajax performance

    - by Bears will eat you
    I'm writing a webapp (Firefox-compatible only) which uses long polling (via jQuery's ajax abilities) to send more-or-less constant updates from the server to the client. I'm concerned about the effects of leaving this running for long periods of time, say, all day or overnight. The basic code skeleton is this:
        function processResults(xml) {
            // do stuff with the xml from the server
        }

        function fetch() {
            setTimeout(function () {
                $.ajax({
                    type: 'GET',
                    url: 'foo/bar/baz',
                    dataType: 'xml',
                    success: function (xml) {
                        processResults(xml);
                        fetch();
                    },
                    error: function (xhr, type, exception) {
                        if (xhr.status === 0) {
                            console.log('XMLHttpRequest cancelled');
                        } else {
                            console.debug(xhr);
                            fetch();
                        }
                    }
                });
            }, 500);
        }
    (The half-second "sleep" is so that the client doesn't hammer the server if the updates are coming back to the client quickly - which they usually are.) After leaving this running overnight, it tends to make Firefox crawl. I'd been thinking that this could be partially caused by a large stack depth since I've basically written an infinitely recursive function. However, if I use Firebug and throw a breakpoint into fetch, it looks like this is not the case. The stack that Firebug shows me is only about 4 or 5 frames deep, even after an hour. One of the solutions I'm considering is changing my recursive function to an iterative one, but I can't figure out how I would insert the delay in between Ajax requests without spinning. I've looked at the JS 1.7 "yield" keyword but I can't quite wrap my head around it, to figure out if it's what I need here. Is the best solution just to do a hard refresh on the page periodically, say, once every hour? Is there a better/leaner long-polling design pattern that won't put a hurt on the browser even after running for 8 or 12 hours? Or should I just skip the long polling altogether and use a different "constant update" pattern since I usually know how frequently the server will have a response for me?

    Read the article

  • posting nutch data into a BASIC auth secured Solr instance

    - by mlathe
    Hi. I've secured a Solr instance using BASIC auth, roughly as shown here: http://blog.comtaste.com/2009/02/securing_your_solr_server_on_t.html Now I'm trying to update my batch processes to push data into the authenticated instance. The ones using "curl" are easy, but I also have a Nutch crawl that uses the "solrindex" command to push data into Solr. When I do that I get this error:
        2010-02-22 12:09:28,226 INFO  auth.AuthChallengeProcessor - basic authentication scheme selected
        2010-02-22 12:09:28,229 INFO  httpclient.HttpMethodDirector - No credentials available for BASIC 'Tomcat Manager Application'@ninja:5500
        2010-02-22 12:09:28,236 WARN  mapred.LocalJobRunner - job_local_0001
        org.apache.solr.common.SolrException: Unauthorized
        Unauthorized request: http://ninja:5500/solr/foo/update?wt=javabin&version=2.2
            at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:343)
            at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:183)
            at org.apache.solr.client.solrj.request.UpdateRequest.process(UpdateRequest.java:217)
            at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:48)
            at org.apache.nutch.indexer.solr.SolrWriter.close(SolrWriter.java:69)
            at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:48)
            at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
            at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:170)
        2010-02-22 12:09:29,134 FATAL solr.SolrIndexer - SolrIndexer: java.io.IOException: Job failed!
            at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
            at org.apache.nutch.indexer.solr.SolrIndexer.indexSolr(SolrIndexer.java:73)
            at org.apache.nutch.indexer.solr.SolrIndexer.run(SolrIndexer.java:95)
            at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
            at org.apache.nutch.indexer.solr.SolrIndexer.main(SolrIndexer.java:104)
    Apparently Nutch uses SolrJ to push the content, and after going through the SolrJ code, it's clear that it uses commons-httpclient without providing a way to set the credentials. Here are my questions: Is this possible to do, i.e. push from Nutch into a BASIC-auth-secured Solr instance? Is it possible to tell commons-httpclient about a credential without explicitly doing an _httpclient.getState().setCredentials(...)? Any other ideas? One idea I had was to use an IP-filtering Valve for just the "update" Solr web services. That would mean you could only make an update call from certain nodes. Thanks
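    Since the curl-based jobs already work against the secured instance, one hedged workaround sketch (it does not fix the SolrJ/commons-httpclient side in Nutch) is to post documents to the update handler with explicit BASIC credentials, shown here in Python with requests. The host and core come from the error above; the credentials and document fields are placeholders, and the XML <add> payload is the classic Solr update syntax that the /update handler accepts.
        # Hypothetical sketch: push a document to a BASIC-auth Solr update handler directly.
        import requests

        SOLR_UPDATE = "http://ninja:5500/solr/foo/update"      # endpoint seen in the error log
        AUTH = ("solradmin", "secret")                         # placeholder credentials

        doc_xml = """<add>
          <doc>
            <field name="id">http://example.com/page1</field>
            <field name="title">Example page</field>
          </doc>
        </add>"""

        r = requests.post(SOLR_UPDATE, data=doc_xml, auth=AUTH,
                          headers={"Content-Type": "text/xml"})
        r.raise_for_status()

        # Commit so the documents become searchable.
        requests.post(SOLR_UPDATE, data="<commit/>", auth=AUTH,
                      headers={"Content-Type": "text/xml"}).raise_for_status()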

    Read the article

  • Manual drag-drop operations in Flex

    - by Yarin
    This is a two-part problem: A) I'm implementing several irregular drag-drop operations in Flex (e.g. DataGrid ItemRenderer into Tree). My preference was modifying DragManager operations to meet my needs, and in fact using DragManager allows me to do everything I need, but I'm having serious issues with performance. For example, dragging anything over a many-columned DataGrid, whether the drag was initiated with DragManager.doDrag or just using native ListBase drag-drop functionality, slows the drag movement to a crawl. This happens even if the DataGrid is disabled and not listening for any move/drag events. On the other hand, if the drag is initiated by calling .startDrag() on the Sprite, the drag is smooth and performs great over DataGrids and everything else. So part A would be: Is there a reason why .startDrag() operations work so well, while drags initiated through DragManager.doDrag suffer so badly when over certain components? B) If indeed the solution is to handle drag-drops using .startDrag(), how would I go about determining what component the mouse is over when the drag is released? In my example, my dragged object is brought up to the top level of the display list, and so is being moved around in stage coordinates. mouseMove and mouseOver events don't fire on the components I'm dragging over because the mouse is constantly over the dragged component, so I would need some sort of stage-coordinate to visible-component-at-that-coordinate conversion. Any thoughts on this? Thanks a lot! -- Yarin

    Read the article

  • Best full text search for mysql?

    - by ConroyP
    We're currently running MySQL on a LAMP stack and have been looking at implementing a more thorough, full-text search on our site. We've looked at MySQL's own full-text search, but it doesn't seem to cope well with large databases, which makes it far too slow for our needs. Our main requirements are: speed in returning results, and simple updating of the index. In addition, our "nice to have"s are: ideally not something that requires adding a module to MySQL, and something that plays nicely with PHP (the majority of our dev work is done in PHP). There seem to be quite a few healthy open-source projects that add fast, reliable full-text search to MySQL, so I'm basically looking for recommendations/suggestions on what you've found to be the most useful product out there, easiest to set up, etc. So far, the ones we've started to play around with are: Sphinx (C++ based, used by craigslist and thepiratebay); Lucene (a Java-based Apache project, powers zeoh.com and zoomf.com); and Solr (a Java-based offshoot of Lucene, used to power searches on Digg, CNet & AOL Channels). Are there any better ones out there that we haven't come across yet? Can you recommend, or advise against, any of the options we've gathered so far? Thanks for your help!
    Update: @Cletus suggested Google's Custom Search Engine. We recently trialled this on a couple of projects, and it's an almost-perfect fit for our needs. The problem is that entries on our site are updated quite regularly, and unfortunately the speed at which entries go in/get updated in Google's index was just too slow and erratic for us to rely on, even with the addition of sitemaps and requested crawl-rate changes.

    Read the article

  • TypeError: coercing to Unicode: need string or buffer, User found

    - by Clemens
    Hi, I have to crawl last.fm for users (a university exercise). I'm new to Python and get the following error:
        Traceback (most recent call last):
          File "crawler.py", line 23, in <module>
            for f in user_.get_friends(limit='200'):
          File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/pylast.py", line 2717, in get_friends
            for node in _collect_nodes(limit, self, "user.getFriends", False):
          File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/pylast.py", line 3409, in _collect_nodes
            doc = sender._request(method_name, cacheable, params)
          File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/pylast.py", line 969, in _request
            return _Request(self.network, method_name, params).execute(cacheable)
          File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/pylast.py", line 721, in __init__
            self.sign_it()
          File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/pylast.py", line 727, in sign_it
            self.params['api_sig'] = self._get_signature()
          File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/pylast.py", line 740, in _get_signature
            string += self.params[name]
        TypeError: coercing to Unicode: need string or buffer, User found
    I use the pylast lib for crawling. What I want to do: get a user's friends and the friends of those friends. The error occurs when I have a for loop inside another for loop. Here's the code:
        network = pylast.get_lastfm_network(api_key = API_KEY, api_secret = API_SECRET,
                                            username = username, password_hash = password_hash)

        user = network.get_user("vidarnelson")
        friends = user.get_friends(limit='200')
        i = 1
        for friend in friends:
            user_ = network.get_user(friend)
            print '#%d %s' % (i, friend)
            i = i + 1
            for f in user_.get_friends(limit='200'):
                print f
    Any advice? Thanks in advance. Regards!
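    Judging from the traceback, get_friends() already returns pylast User objects, and passing one of those back into network.get_user() (which expects a username string) is what produces the "User found" TypeError when the request is signed. A minimal sketch of the nested loop that simply reuses the User objects directly might look like this (same pylast setup as above assumed; this is an illustration, not a tested fix):
        # Hypothetical fix sketch: friends are User objects, so call get_friends() on them
        # directly instead of passing them back into network.get_user().
        user = network.get_user("vidarnelson")

        for i, friend in enumerate(user.get_friends(limit='200'), start=1):
            print '#%d %s' % (i, friend)
            for friend_of_friend in friend.get_friends(limit='200'):
                print '    %s' % friend_of_friend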

    Read the article

  • Why does Raphael's framerate slow down on this code?

    - by Bob
    So I'm just doing a basic orbit simulator using Raphael JS, where I draw one circle as the "star" and another circle as the "planet". It seems to be working just fine, with the one snag that as the simulation continues, its framerate progressively slows down until the orbital motion no longer appears fluid. Here's the code (note: uses jQuery only to initialize the page):
        $(function() {
            var paper = Raphael(document.getElementById('canvas'), 640, 480);
            var star = paper.circle(320, 240, 10);
            var planet = paper.circle(320, 150, 5);
            var starVelocity = [0,0];
            var planetVelocity = [20.42,0];
            var starMass = 3.08e22;
            var planetMass = 3.303e26;
            var gravConstant = 1.034e-18;

            function calculateOrbit() {
                var accx = 0;
                var accy = 0;
                accx = (gravConstant * starMass * ((star.attr('cx') - planet.attr('cx')))) / (Math.pow(circleDistance(), 3));
                accy = (gravConstant * starMass * ((star.attr('cy') - planet.attr('cy')))) / (Math.pow(circleDistance(), 3));
                planetVelocity[0] += accx;
                planetVelocity[1] += accy;
                planet.animate({cx: planet.attr('cx') + planetVelocity[0], cy: planet.attr('cy') + planetVelocity[1]}, 150, calculateOrbit);
                paper.circle(planet.attr('cx'), planet.attr('cy'), 1); // added to 'trace' orbit
            }

            function circleDistance() {
                return (Math.sqrt(Math.pow(star.attr('cx') - planet.attr('cx'), 2) + Math.pow(star.attr('cy') - planet.attr('cy'), 2)));
            }

            calculateOrbit();
        });
    It doesn't appear, to me anyway, that any part of that code would cause the animation to gradually slow down to a crawl, so any help solving the problem will be appreciated!

    Read the article

  • Silverlight performance with a large number of Objects Org Chart.

    - by KC
    Hello, I'm working on an org chart project (SL 3) and I'm seeing the UI thread hang when the chart is building around 2,000 nodes; when it renders, it takes about a minute and then FPS drops to a crawl. Here is the code flow: Page.xaml.cs calls a WCF service that returns a list of AD users. Then we use LINQ to build a collection of nodes to bind to the OrgChart.cs. OrgChart.cs is a canvas that displays a collection of nodes and connecting lines. Node.cs is a canvas that holds user data and can contain child nodes. NodeContent.xaml is a user control that has borders so I can set the background, textblocks to display the user's data, events that handle the selected and expanded nodes, and storyboards that resize the nodes when they are selected or expanded. I noticed during hours of debugging that the performance hit seems to happen in InitializeComponent(), where it loads the XAML:
        System.Windows.Application.LoadComponent(this, new System.Uri("/Silverlight.Custom;component/NodeContent.xaml", System.UriKind.Relative));
    So I guess I have two questions. Can threading help in any way with the UI thread hanging while drawing the nodes? And how can I avoid the hit when calling this user control? Any advice or direction anyone can lend would be greatly appreciated. Thanks, KC

    Read the article

  • Read a buffer of unknown size (Console input)

    - by Sanarothe
    Hi. I'm a little behind in my x86 ASM class, and the book is making me want to shoot myself in the face. The examples in the book are insufficient and, honestly, very frustrating because of their massive dependencies upon the author's link library, which I hate. I wanted to learn ASM, not how to call his freaking library, which calls more of his library. Anyway, I'm stuck on a lab that requires console input and output. So far, I've got this for my input:
        input PROC
            INVOKE ReadConsole, inputHandle, ADDR buffer, Buf - 2, ADDR bytesRead, 0
            mov eax,OFFSET buffer
            Ret
        input EndP
    I need to use the input and output procedures multiple times, so I'm trying to make them abstract. I'm just not sure how to use the data that is set in eax here. My initial idea was to take that string array and manually crawl through it by adding 8 to the offset for each possible digit (the input is an integer, and there's a little bit of processing), but this doesn't work out because I don't know how big the input actually is. So, how would you turn the string array into an integer that could be used? Full code (I haven't done the integer logic or the instruction string output because I'm stuck here):
        include c:/irvine/irvine32.inc

        .data
        inputHandle  HANDLE ?
        outputHandle HANDLE ?
        buffer       BYTE BufSize DUP(?),0,0
        bytesRead    DWORD ?
        str1         BYTE "Enter an integer:",0Dh, 0Ah
        str2         BYTE "Enter another integer:",0Dh, 0Ah
        str3         BYTE "The higher of the two integers is: "
        int1         WORD ?
        int2         WORD ?
        int3         WORD ?
        Buf = 80

        .code
        main PROC
            call handle
            push str1
            call output
            call input
            push str2
            call output
            call input
            push str3
            call output
            call input
        main EndP

        larger PROC
            Ret
        larger EndP

        output PROC
            INVOKE WriteConsole
            Ret
        output EndP

        handle PROC USES eax
            INVOKE GetStdHandle, STD_INPUT_HANDLE
            mov inputHandle,eax
            INVOKE GetStdHandle, STD_INPUT_HANDLE
            mov outputHandle,eax
            Ret
        handle EndP

        input PROC
            INVOKE ReadConsole, inputHandle, ADDR buffer, Buf - 2, ADDR bytesRead, 0
            mov eax,OFFSET buffer
            Ret
        input EndP

        END main

    Read the article

  • Locating memory leak in Apache httpd process, PHP/Doctrine-based application

    - by Sam
    I have a PHP application using these components: Apache 2.2.3-31 on CentOS 5.4; PHP 5.2.10; Xdebug 2.0.5 with remote debugging enabled; APC 3.0.19; Doctrine ORM for PHP 1.2.1 using query caching and result caching via APC; MySQL 5.0.77 using query caching. I've noticed that when I start up Apache, I eventually end up with 10 child processes. As time goes on, each process grows in memory until each one approaches 10% of available memory, which begins to slow the server to a crawl since together they grow to take up 100% of memory. Here is a snapshot of my top output:
        PID   USER     PR  NI  VIRT   RES   SHR  S %CPU %MEM   TIME+   COMMAND
        1471  apache   16   0  626m   201m  18m  S  0.0 10.2  1:11.02  httpd
        1470  apache   16   0  622m   198m  18m  S  0.0 10.1  1:14.49  httpd
        1469  apache   16   0  619m   197m  18m  S  0.0 10.0  1:11.98  httpd
        1462  apache   18   0  622m   197m  18m  S  0.0 10.0  1:11.27  httpd
        1460  apache   15   0  622m   195m  18m  S  0.0 10.0  1:12.73  httpd
        1459  apache   16   0  618m   191m  18m  S  0.0  9.7  1:13.00  httpd
        1461  apache   18   0  616m   190m  18m  S  0.0  9.7  1:14.09  httpd
        1468  apache   18   0  613m   190m  18m  S  0.0  9.7  1:12.67  httpd
        7919  apache   18   0  116m   75m   15m  S  0.0  3.8  0:19.86  httpd
        9486  apache   16   0  97.7m  56m   14m  S  0.0  2.9  0:13.51  httpd
    I have no long-running scripts (they all terminate eventually, the longest taking maybe 2 minutes), and I am working under the assumption that once each script terminates, the memory it uses gets deallocated (maybe someone can correct me on that). My hunch is that it could be APC, since it stores data between requests, but at the same time it seems weird that it would store data inside the httpd process. How can I track down which part of my app is causing the memory leak? What tools can I use to see how the memory usage is growing inside the httpd process and what is contributing to it?

    Read the article

  • Efficient alternative to merge() when building dataframe from json files with R?

    - by Bryan
    I have written the following code, which works but is painfully slow once I start executing it over thousands of records:
        require("RJSONIO")

        people_data <- data.frame(person_id=numeric(0))

        json_data <- fromJSON(json_file)
        n_people <- length(json_data)
        for(lender in 1:n_people) {
            person_dataframe <- as.data.frame(t(unlist(json_data[[person]])))
            people_data <- merge(people_data, person_dataframe, all=TRUE)
        }

        output_file <- paste("people_data",".csv")
        write.csv(people_data, file=output_file)
    I am attempting to build a unified data table from a series of JSON-formatted files. The fromJSON() function reads in the data as lists of lists. Each element of the list is a person, which then contains a list of the attributes for that person. For example:
        [[1]]
        person_id
        name
        gender
        hair_color

        [[2]]
        person_id
        name
        location
        gender
        height

        [[...]]

        structure(list(person_id = "Amy123", name = "Amy", gender = "F",
            hair_color = "brown"),
            .Names = c("person_id", "name", "gender", "hair_color"))

        structure(list(person_id = "matt53", name = "Matt",
            location = structure(c(47231, "IN"), .Names = c("zip_code", "state")),
            gender = "M", height = 172),
            .Names = c("person_id", "name", "location", "gender", "height"))
    The end result of the code above is a matrix where the columns are every person-attribute that appears in the structures above, and the rows are the relevant values for each person. As you can see, though, some data is missing for some of the people, so I need to ensure those show up as NA and make sure things end up in the right columns. Further, location itself is a vector with two components, state and zip_code, meaning it needs to be flattened to location.state and location.zip_code before it can be merged with another person record; this is what I use unlist() for. I then keep the running master table in people_data. The above code works, but do you know of a more efficient way to accomplish what I'm trying to do? It appears the merge() is slowing this to a crawl... I have hundreds of files with hundreds of people in each file. Thanks! Bryan
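    The bottleneck pattern here (an incremental merge() inside the loop) is language-independent. Purely as an illustration of the alternative - flatten every record first, then build the table once - here is the same idea in Python with pandas, since that is the language used for the other sketches on this page; the field names are taken from the example records above, and translating the idea back to R would mean collecting the flattened rows in a list and combining them with a single call after the loop instead of merging inside it.
        # Hypothetical sketch: flatten each nested record, collect them all, build one table at the end.
        import pandas as pd

        def flatten(record, parent_key=""):
            """Flatten nested dicts into dotted keys, e.g. location.state, location.zip_code."""
            flat = {}
            for key, value in record.items():
                name = parent_key + "." + key if parent_key else key
                if isinstance(value, dict):
                    flat.update(flatten(value, name))
                else:
                    flat[name] = value
            return flat

        records = [
            {"person_id": "Amy123", "name": "Amy", "gender": "F", "hair_color": "brown"},
            {"person_id": "matt53", "name": "Matt",
             "location": {"zip_code": 47231, "state": "IN"}, "gender": "M", "height": 172},
        ]

        # One DataFrame constructor call aligns the columns and fills the gaps with NaN.
        people = pd.DataFrame([flatten(r) for r in records])
        people.to_csv("people_data.csv", index=False)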

    Read the article
