garbage collecting - Page 14

What are best practices for collecting, maintaining and ensuring accuracy of a huge data set?

- by Kyle West

I am posing this question looking for practical advice on how to design a system. Sites like amazon.com and pandora have and maintain huge data sets to run their core business. For example, amazon (and every other major e-commerce site) has millions of products for sale, images of those products, pricing, specifications, etc. etc. etc. Ignoring the data coming in from 3rd party sellers and the user generated content all that "stuff" had to come from somewhere and is maintained by someone. It's also incredibly detailed and accurate. How? How do they do it? Is there just an army of data-entry clerks or have they devised systems to handle the grunt work? My company is in a similar situation. We maintain a huge (10-of-millions of records) catalog of automotive parts and the cars they fit. We've been at it for a while now and have come up with a number of programs and processes to keep our catalog growing and accurate; however, it seems like to grow the catalog to x items we need to grow the team to y. I need to figure some ways to increase the efficiency of the data team and hopefully I can learn from the work of others. Any suggestions are appreciated, more though would be links to content I could spend some serious time reading. THANKS! Kyle

Read the article

Advice on a DB that can be uploaded to a website by a smart client for collecting survey feedback

- by absfabs

Hello, I'm hoping you can help. I'm looking for a zero config multi-user datbase that my winforms application can easily upload to a webserver folder (together with 1 or 2 classic asp pages) and am looking for some suggestions/recommendations. The idea is that the database will be used to collect feedback entered by people filling in the asp pages. The pages will write to the database using javascript. The database will subsequently be downloaded again for processing once the responses are in. In Summary: It will mostly run in MS Windows environments. I have a modest budget for this and do not mind paying for such a database. No runtime licensing costs. Should be xcopy - Once uploaded to a website folder it should be operational. It should not have a dotnet CLR dependency. It should support a resonable level of concurrent access. Average respondent count would be around 20-30 but one never knows. Should be a reasonable size so that uploads/downloads to and from the site will be reasonably fast. Would appreciate your suggestions/comments Many thanks Abz To clarify - this is a desktop commercial application for feedback management in a vertical market. It uses SQL Server as the backing store. The application currently provides feedback management from email and paper feedback. I now want to add web feedback capability. Getting users to to make their SQL servers accessible to a website is not at option at this time as I am want to make getting up and running as painless as possible. I intend to release a web based implementation of the software in the near future but for now am looking at the above as a pragmatic way to provide web based feedback collection.

Read the article

Are there libraries or techniques for collecting and weighing keywords from a block of text?

- by Soviut

I have a field in my database that can contain large blocks of text. I need to make this searchable but don't have the ability to use full text searching. Instead, on update, I want my business layer to process the block of text and extract keywords from it which I can save as searchable metadata. Ideally, these keywords could then be weighed based on the number of times they appear in the block of text. Naturally, words like "the", "and", "of", etc. should be discarded as they just add a lot of noise to the search. Are there tools or libraries in Python that can do this filtering or should I roll my own?

Read the article

Tabbed Win 7 Explorer interface that doesn't look like garbage?

- by Redandwhite

I'm looking for an Explorer plug-in (not a standalone app) that gives me tabs and integrates very nicely into the Windows 7 look and feel. What I'm looking for is something along the lines of Total Finder for Mac. Any suggestions?

Read the article

Ubuntu's garbage collection cron job for PHP sessions takes 25 minutes to run, why?

- by Lamah

Ubuntu has a cron job set up which looks for and deletes old PHP sessions: # Look for and purge old sessions every 30 minutes 09,39 * * * * root [ -x /usr/lib/php5/maxlifetime ] \ && [ -d /var/lib/php5 ] && find /var/lib/php5/ -depth -mindepth 1 \ -maxdepth 1 -type f -cmin +$(/usr/lib/php5/maxlifetime) ! -execdir \ fuser -s {} 2> /dev/null \; -delete My problem is that this process is taking a very long time to run, with lots of disk IO. Here's my CPU usage graph: The cleanup running is represented by the teal spikes. At the beginning of the period, PHP's cleanup jobs were scheduled at the default 09 and 39 minutes times. At 15:00 I removed the 39 minute time from cron, so a cleanup job twice the size runs half as often (you can see the peaks get twice as wide and half as frequent). Here are the corresponding graphs for IO time: And disk operations: At the peak where there were about 14,000 sessions active, the cleanup can be seen to run for a full 25 minutes, apparently using 100% of one core of the CPU and what seems to be 100% of the disk IO for the entire period. Why is it so resource intensive? An ls of the session directory /var/lib/php5 takes just a fraction of a second. So why does it take a full 25 minutes to trim old sessions? Is there anything I can do to speed this up? The filesystem for this device is currently ext4, running on Ubuntu Precise 12.04 64-bit. EDIT: I suspect that the load is due to the unusual process "fuser" (since I expect a simple rm to be a damn sight faster than the performance I'm seeing). I'm going to remove the use of fuser and see what happens.

Read the article

Garbage Collector not doing its job. Memory Consumption = 1.5GB & OutOFMemory Exception.

- by imageWorker

I'm working with images (each of size = 5MB). The following code extract some information from each image that is present in the given directory. I'm getting out of memory exception. The size of the process is around (1.5GB). I don't know why garbage collector is not freeing memory. I even tried adding GC.Collect() as last line of foreach loop. Still I'm getting 'OutOFMemory' using System; using System.Collections.Generic; using System.Linq; using System.Text; using System.Threading; using System.IO; using System.Drawing; using System.Drawing.Imaging; namespace TrainSVM { class Program { static void Main(string[] args) { FileStream fs = new FileStream("dg.train",FileMode.OpenOrCreate,FileAccess.Write); StreamWriter sw = new StreamWriter(fs); String[] filePathArr = Directory.GetFiles("E:\\images\\"); foreach (string filePath in filePathArr) { if (filePath.Contains("lmn")) { sw.Write("1 "); Console.Write("1 "); } else { sw.Write("1 "); Console.Write("1 "); } Bitmap originalBMP = new Bitmap(filePath); /***********************/ Bitmap imageBody; ImageBody.ImageBody im = new ImageBody.ImageBody(originalBMP); imageBody = im.GetImageBody(-1); /* white coat */ Bitmap whiteCoatBitmap = Rgb2Hsi.Rgb2Hsi.GetHuePlane(imageBody); float WhiteCoatPixelPercentage = Rgb2Hsi.Rgb2Hsi.GetWhiteCoatPixelPercentage(whiteCoatBitmap); //Console.Write("whiteDone\t"); sw.Write("1:" + WhiteCoatPixelPercentage + " "); Console.Write("1:" + WhiteCoatPixelPercentage + " "); /******************/ Quaternion.Quaternion qtr = new Quaternion.Quaternion(-15); Bitmap yellowCoatBMP = qtr.processImage(imageBody); //yellowCoatBMP.Save("yellowCoat.bmp"); float yellowCoatPixelPercentage = qtr.GetYellowCoatPixelPercentage(yellowCoatBMP); //Console.Write("yellowCoatDone\t"); sw.Write("2:" + yellowCoatPixelPercentage + " "); Console.Write("2:" + yellowCoatPixelPercentage + " "); /**********************/ Bitmap balckPatchBitmap = BlackPatchDetection.BlackPatchDetector.MarkBlackPatches(imageBody); float BlackPatchPixelPercentage = BlackPatchDetection.BlackPatchDetector.BlackPatchPercentage; //Console.Write("balckPatchDone\n"); sw.Write("3:" + BlackPatchPixelPercentage + "\n"); Console.Write("3:" + BlackPatchPixelPercentage + "\n"); balckPatchBitmap.Dispose(); yellowCoatBMP.Dispose(); whiteCoatBitmap.Dispose(); originalBMP.Dispose(); sw.Flush(); } sw.Dispose(); fs.Dispose(); } } }

Read the article

GetAcceptExSockaddrs returns garbage! Does anyone know why?

- by David

Hello, I'm trying to write a quick/dirty echoserver in Delphi, but I notice that GetAcceptExSockaddrs seems to be writing to only the first 4 bytes of the structure I pass it. USES SysUtils; TYPE BOOL = LongBool; DWORD = Cardinal; LPDWORD = ^DWORD; short = SmallInt; ushort = Word; uint16 = Word; uint = Cardinal; ulong = Cardinal; SOCKET = uint; PVOID = Pointer; _HANDLE = DWORD; _in_addr = packed record s_addr : ulong; end; _sockaddr_in = packed record sin_family : short; sin_port : uint16; sin_addr : _in_addr; sin_zero : array[0..7] of Char; end; P_sockaddr_in = ^_sockaddr_in; _Overlapped = packed record Internal : Int64; Offset : Int64; hEvent : _HANDLE; end; LP_Overlapped = ^_Overlapped; IMPORTS function _AcceptEx (sListenSocket, sAcceptSocket : SOCKET; lpOutputBuffer : PVOID; dwReceiveDataLength, dwLocalAddressLength, dwRemoteAddressLength : DWORD; lpdwBytesReceived : LPDWORD; lpOverlapped : LP_OVERLAPPED) : BOOL; stdcall; external MSWinsock name 'AcceptEx'; procedure _GetAcceptExSockaddrs (lpOutputBuffer : PVOID; dwReceiveDataLength, dwLocalAddressLength, dwRemoteAddressLength : DWORD; LocalSockaddr : P_Sockaddr_in; LocalSockaddrLength : LPINT; RemoteSockaddr : P_Sockaddr_in; RemoteSockaddrLength : LPINT); stdcall; external MSWinsock name 'GetAcceptExSockaddrs'; CONST BufDataSize = 8192; BufAddrSize = SizeOf (_sockaddr_in) + 16; VAR ListenSock, AcceptSock : SOCKET; Addr, LocalAddr, RemoteAddr : _sockaddr_in; LocalAddrSize, RemoteAddrSize : INT; Buf : array[1..BufDataSize + BufAddrSize * 2] of Byte; BytesReceived : DWORD; Ov : _Overlapped; BEGIN //WSAStartup, create listen socket, bind to port 1066 on any interface, listen //Create event for overlapped (autoreset, initally not signalled) //Create accept socket if _AcceptEx (ListenSock, AcceptSock, @Buf, BufDataSize, BufAddrSize, BufAddrSize, @BytesReceived, @Ov) then WinCheck ('SetEvent', _SetEvent (Ov.hEvent)) else if GetLastError <> ERROR_IO_PENDING then WinCheck ('AcceptEx', GetLastError); {do WaitForMultipleObjects} _GetAcceptExSockaddrs (@Buf, BufDataSize, BufAddrSize, BufAddrSize, @LocalAddr, @LocalAddrSize, @RemoteAddr, @RemoteAddrSize); So if I run this, connect to it with Telnet (on same computer, connecting to localhost) and then type a key, WaitForMultipleObjects will unblock and GetAcceptExSockaddrs will run. But the result is garbage! RemoteAddr.sin_family = -13894 RemoteAddr.sin_port = 64 and the rest is zeroes. What gives? Thanks in advance!

Read the article

How to register a domain for a beginner?

- by garbage collection

I've never registered a .com , .net like domain before, and I would like to do some research before doing so. I currently have a ruby on rails app running Heroku. Is there anything special I have to do prior to registering domain on my ruby on rails app at all? Or is it as easy as just inserting my current Heroku address to mask it with another .com or .net name? Is there some special features I should look for registering domain? Or is it typical for domain seller to just sell domain names only? Any recommendations on sellers? Thank you.

Read the article

How to register a domain for a beginner?

- by garbage collection

I've never registered a .com , .net like domain before, and I would like to do some research before doing so. I currently have a ruby on rails app running Heroku. Is there anything special I have to do prior to registering domain on my ruby on rails app at all? Or is it as easy as just inserting my current Heroku address to mask it with another .com or .net name? Is there some special features I should look for registering domain? Or is it typical for domain seller to just sell domain names only? Any recommendations on sellers? Thank you.

Read the article

Google Dart vs CoffeeScript? Which one should one learn?

- by garbage collection

I was thinking about learning CoffeeScript some time in the future. In the mean time, Google came out with Dart that seems to do what CoffeeScript does. Google says: Dart code can be executed in two different ways: either on a native virtual machine or on top of a JavaScript engine by using a compiler that translates Dart code to JavaScript. This means you can write a web application in Dart and have it compiled and run on any modern browser. Does anyone know advantages and disadvantages of learning Dart or CoffeeScript?

Read the article

Why does Scala require functions to have explicit return type?

- by garbage collection

I recently began learning to program in Scala, and it's been fun so far. I really like the ability to declare functions within another function which just seems to intuitive thing to do. One pet peeve I have about Scala is the fact that Scala requires explicit return type in its functions. And I feel like this hinders on expressiveness of the language. Also it's just difficult to program with that requirement. Maybe it's because I come from Javascript and Ruby comfort zone. But for a language like Scala which will have tons of connected functions in an application, I cannot conceive how I brainstorm in my head exactly what type the particular function I am writing should return with recursions after recursions. This requirement of explicit return type declaration on functions, do not bother me for languages like Java and C++. Recursions in Java and C++, when they did happen, often were dealt with 2 to 3 functions max. Never several functions chained up together like Scala. So I guess I'm wondering if there is a good reason why Scala should have the requirement of functions having explicit return type?

Read the article

How can I force Javascript garbage collection in IE? IE is acting very slow after AJAX calls & DOM

- by RenderIn

I have a page with chained drop-downs. Choosing an option from the first select populates the second, and choosing an option from the second select returns a table of matching results using the innerHtml function on an empty div on the page. The problem is, once I've made my selections and a considerable amount of data is brought onto the page, all subsequent Javascript on the page runs exceptionally slowly. It seems as if all the data I pulled back via AJAX to populate the div is still hogging a lot of memory. I tried setting the return object which contains the AJAX results to null after calling innerHtml but with no luck. Firefox, Safari, Chrome and Opera all show no performance degradation when I use Javascript to insert a lot of data into the DOM, but in IE it is very apparent. To test that it's a Javascript/DOM issue rather than a plain old IE issue, I created a version of the page that returns all the results on the initial load, rather than via AJAX/Javascript, and found IE had no performance problems. FYI, I'm using jQuery's jQuery.get method to execute the AJAX call. EDIT This is what I'm doing: <script type="text/javascript"> function onFinalSelection() { var searchParameter = jQuery("#second-select").val(); jQuery.get("pageReturningAjax.php", {SEARCH_PARAMETER: searchParameter}, function(data) { jQuery("#result-div").get(0).innerHtml = data; //jQuery("#result-div").html(data); //Tried this, same problem data = null; }, "html"); } </script>

Read the article

Why do I get garbage output when printing an int[]?

- by Kat

My program is suppose to count the occurrence of each character in a file ignoring upper and lower case. The method I wrote is: public int[] getCharTimes(File textFile) throws FileNotFoundException { Scanner inFile = new Scanner(textFile); int[] lower = new int[26]; char current; int other = 0; while(inFile.hasNext()){ String line = inFile.nextLine(); String line2 = line.toLowerCase(); for (int ch = 0; ch < line2.length(); ch++) { current = line2.charAt(ch); if(current >= 'a' && current <= 'z') lower[current-'a']++; else other++; } } return lower; } And is printed out using: for(int letter = 0; letter < 26; letter++) { System.out.print((char) (letter + 'a')); System.out.println(": " + ts.getCharTimes(file)); } Where ts is a TextStatistic object created earlier in my main method. However when I run my program, instead of printing out the number of how often the character occurs it prints: a: [I@f84386 b: [I@1194a4e c: [I@15d56d5 d: [I@efd552 e: [I@19dfbff f: [I@10b4b2f And I don't know what I'm doing wrong.

Read the article

Urllib's urlopen breaking on some sites (e.g. StackApps api): returns garbage results

- by Edan Maor

I'm using urllib2's urlopen function to try and get a JSON result from the StackOverflow api. The code I'm using: >>> import urllib2 >>> conn = urllib2.urlopen("http://api.stackoverflow.com/0.8/users/") >>> conn.readline() The result I'm getting: '\x1f\x8b\x08\x00\x00\x00\x00\x00\x04\x00\xed\xbd\x07`\x1cI\x96%&/m\xca{\x7fJ\... I'm fairly new to urllib, but this doesn't seem like the result I should be getting. I've tried it in other places and I get what I expect (the same as visiting the address with a browser gives me: a JSON object). Using urlopen on other sites (e.g. "http://google.com") works fine, and gives me actual html. I've also tried using urllib and it gives the same result. I'm pretty stuck, not even knowing where to look to solve this problem. Any ideas?

Read the article

Why did mislav-will_paginate start adding so much garbage to urls between rails 2.3.2 and 2.3.5?

- by user30997

I've used will_paginate in a number of projects now, but when I moved one of them to Rails 2.3.5, clicking on any of the pagination links (page number, next, prev, etc.,) went from getting nice URLs like this: http://foo.com/user/1/date/2005_01_31/phone/555-6161 to this: http://foo.com/?options[]=user&options[]=date&options[]=2005_01_31&options[]=phone&options[]=555-6161 I have a route that looks like this that is probably the source of the 'options' keyword: map.connect '/browse/*options', :controller=>'assets', :action=>'browse' It's enough of an annoyance that I'm willing to roll a paginator to get around this if there isn't a way to get back to where I was before. Is there a way to get will_paginate to turn array-style routes into sane urls again? Thanks.

Read the article

What is the purpose of the garbage (files) that Qt Creator auto-generates and how can I tame them?

- by Venemo

Hello Everyone, I'm fairly new to Qt, and I'm using the new Nokia Qt SDK beta and I'm working to develop a small application for my Nokia N900 in my free time. Fortunately, I was able to set up everything correctly, and also to run my app on the device. I've learned C++ in school, so I thought it won't be so difficult. I use Qt Creator as my IDE, because it doesn't work with Visual Studio. I also wish to port my app to Symbian, so I have run the emulator a few times, and I also compile for Windows to debug the most evil bugs. (The debugger doesn't work correctly on the device.) I come from a .NET background, so there are some things that I don't understand. When I hit the build button, Qt Creator generates a bunch of files to my project directory: moc_*.cpp files - I don't know their purpose. Could someone tell me? *.o files - I assume these are the object code *.rss files - I don't know their purpose, but they definitely don't have anything to do with RSS Makefile and Makefile.Debug - I have no idea AppName (without extension) - the executable for Maemo, and AppName.sis - the executable for Symbian, I guess? AppName.loc - I have no idea AppName_installer.pkg and AppName_template.pkg - I have no idea qrc_Resources.cpp - I guess this is for my Qt resources (where AppName is the name of the application in question) I noticed that these files can be safely deleted, Qt Creator simply regenerates them. The problem is that they pollute my source directory. Especially because I use version control, and if they can be regenerated, there is no point in uploading them to SVN. So, could someone please tell me what the exact purpose of these files is, and how can I ask Qt Creator to place them into another directory? Thank you in advance!

Read the article

Why does a change of Session State provider lead to an ASPx page yielding garbage?

- by Rory Becker

I have an aspnet webapp which has worked very well up until now. I was recently asked to explore ways of making it scale better. I found that seperation of database and Webapp would help. Further I was told that if I changed my session providing mechanism to SQLServer, I would be able to duplicate the Web Stack to several machines which could each call back to the state server allowing the load to be distirbuted better. This sounds logical. So I created an ASPState database using ASPNet_RegSQL.exe as detailed in many locations across the web and changed the web.config on my app from: <sessionState mode="InProc" cookieless="false" timeout="20" /> To: <sessionState mode="SQLServer" sqlConnectionString="Server=SomeSQLServer;user=SomeUser;password=SomePassword" cookieless="false" timeout="20" /> Then I addressed my app, which presented me with its logon screen and I duly logged in. Once in I was presented, not with the page I was expecting, but with: I can change the sessionstate back and forth. This problem goes away and then comes back based on which set of configuration I use. Why is this happening?

Read the article

To remove garbage characters from a string using regex...

- by Harjit Singh

Hi I want to remove characters from a string other then a-z, and A-Z. Created following function for the same and it works fine. public String stripGarbage(String s) { String good = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789abcdefghijklmnopqrstuvwxyz"; String result = ""; for (int i = 0; i < s.length(); i++) { if (good.indexOf(s.charAt(i)) >= 0) { result += s.charAt(i); } } return result; } Can anyone tell me a better way to achieve the same. Probably regex may be better option. Regards Harry

Read the article

What is the purpose of Finalization in java?

- by Karthik

Different websites are giving different opinions. My understanding is this: To clean up or reclaim the memory that an object occupies, the Garbage collector comes into action. (automatically is invoked???) The garbage collector then dereferences the object. Sometimes, there is no way for the garbage collector to access the object. Then finalize is invoked to do a final clean up processing after which the garbage collector can be invoked. is this right?

Read the article

How is garbage collection done in Java and how does it compare to .net?

- by CodeToGlory

I am wondering what the differences are or if it is the same across both.

Read the article

Installed Ubuntu 14.04. How do I get it working with my new AMD R9 280X. When I run it I get garbage on the display

- by user289455

I installed Ubuntu 14.04 on my Gigabyte A88X FM2+ motherboard and A10 7700K processor. Everything worked perfectly. This system is dual booted with Windows 8.1 for gaming and I installed subsequently I installed an AMD r9 280X gpu. Windows was easy. It kept working with the new card installed and I just needed to download and instal the driver. Reboot and it's done. Ubuntu boots into a blank display. I enter the password and can log in. The menu on the left and bar at the top are visible but when I try to open an app, I get blocks and rubbish on the display. My preference is not to remove the card. So, my question is: What would be the procedure I could follow where I can log in using a text based terminal and download and install the AMD drivers.

Read the article

A friend told me Python is garbage, I'm taking web design classes in the Spring and I have a textbook on C++. What should I do? [on hold]

- by user107165

I dont know if I should start digging into Python beforehand just to get acquanited with programming and "whet my appetite" or if I should work on the C++ book... Python definitely has more resources around town and I like the beginner friendly approach that seems to go along with every site that appeals to it. Or should I just wait for my assignments that start in 4 months? Any tips for an aspiring programmer?

Read the article

Some new free tools enter the SQL marketplace

- by AaronBertrand

A while back, I started collecting links for free SQL Server resources available to everyone in the community. I created a blog post called " Useful, free resources for SQL Server " to serve as a launching point for the links I'd been collecting. I'm in the process of going back and updating that post, but in the meantime, I wanted to highlight a couple of big events that happened in the past week. Atlantis Interactive Last week Matt Whitfield ( blog | twitter ) announced that his company's commercial...(read more)

Read the article

Some new free tools enter the SQL marketplace

- by AaronBertrand

A while back, I started collecting links for free SQL Server resources available to everyone in the community. I created a blog post called " Useful, free resources for SQL Server " to serve as a launching point for the links I'd been collecting. I'm in the process of going back and updating that post, but in the meantime, I wanted to highlight a couple of big events that happened in the past week. Atlantis Interactive Last week Matt Whitfield ( blog | twitter ) announced that his company's commercial...(read more)

Read the article

Requiring multithreading/concurrency for implementation of scripting language

- by Ricky Stewart

Here's the deal: I'm looking at designing my own scripting/interpreted language for fun. I'm only in the planning stages right now; I want to make sure I have a very strong hold on exactly how I will implement everything before I start coding. What I'm currently struggling with is concurrency. It seems to me like an easy way to avoid the unpredictable performance that comes with garbage collection would be to put the garbage collector in its own thread, and have it run concurrently with the interpreter itself. (To be clear, I don't plan to allow the scripts to be multithreaded themselves; I would simply put a garbage collector to work in a different thread than the interpreter.) This doesn't seem to be a common strategy for many popular scripting languages, probably for portability reasons; I would probably write the interpreter in the UNIX/POSIX threading framework initially and then port it to other platforms (Windows, etc.) if need be. Does anyone have any thoughts in this issue? Would whatever gains I receive by exploiting concurrency be nullified by the portability issues that will inevitably arise? (On that note, am I really correct in my assumption that I would experience great performance gains with a concurrent garbage collector?) Should I move forward with this strategy or step away from it?

Search Results

Search found 1329 results on 54 pages for 'garbage collecting'.

Page 14/54 | < Previous Page | 10 11 12 13 14 15 16 17 18 19 20 21 | Next Page >

- by Kyle West

- by absfabs

- by Soviut

- by Redandwhite

- by Lamah

- by imageWorker

- by David

- by garbage collection

- by garbage collection

- by garbage collection

- by garbage collection

- by RenderIn

- by Kat

- by Edan Maor

- by user30997

- by Venemo

- by Rory Becker

- by Harjit Singh

- by Karthik

- by CodeToGlory

- by user289455

- by user107165

- by AaronBertrand

- by AaronBertrand

- by Ricky Stewart

< Previous Page | 10 11 12 13 14 15 16 17 18 19 20 21 | Next Page >