Search Results

Search found 45804 results on 1833 pages for 'large files'.

Page 74/1833 | < Previous Page | 70 71 72 73 74 75 76 77 78 79 80 81 | Next Page >

High Runtime for Dictionary.Add for a large amount of items

- by aaginor

Hi folks, I have a C#-Application that stores data from a TextFile in a Dictionary-Object. The amount of data to be stored can be rather large, so it takes a lot of time inserting the entries. With many items in the Dictionary it gets even worse, because of the resizing of internal array, that stores the data for the Dictionary. So I initialized the Dictionary with the amount of items that will be added, but this has no impact on speed. Here is my function: private Dictionary<IdPair, Edge> AddEdgesToExistingNodes(HashSet<NodeConnection> connections) { Dictionary<IdPair, Edge> resultSet = new Dictionary<IdPair, Edge>(connections.Count); foreach (NodeConnection con in connections) { ... resultSet.Add(nodeIdPair, newEdge); } return resultSet; } In my tests, I insert ~300k items. I checked the running time with ANTS Performance Profiler and found, that the Average time for resultSet.Add(...) doesn't change when I initialize the Dictionary with the needed size. It is the same as when I initialize the Dictionary with new Dictionary(); (about 0.256 ms on average for each Add). This is definitely caused by the amount of data in the Dictionary (ALTHOUGH I initialized it with the desired size). For the first 20k items, the average time for Add is 0.03 ms for each item. Any idea, how to make the add-operation faster? Thanks in advance, Frank

Read the article
Large flags enumerations in C#

- by LorenVS

Hey everyone, got a quick question that I can't seem to find anything about... I'm working on a project that requires flag enumerations with a large number of flags (up to 40-ish), and I don't really feel like typing in the exact mask for each enumeration value: public enum MyEnumeration : ulong { Flag1 = 1, Flag2 = 2, Flag3 = 4, Flag4 = 8, Flag5 = 16, // ... Flag16 = 65536, Flag17 = 65536 * 2, Flag18 = 65536 * 4, Flag19 = 65536 * 8, // ... Flag32 = 65536 * 65536, Flag33 = 65536 * 65536 * 2 // right about here I start to get really pissed off } Moreover, I'm also hoping that there is an easy(ier) way for me to control the actual arrangement of bits on different endian machines, since these values will eventually be serialized over a network: public enum MyEnumeration : uint { Flag1 = 1, // BIG: 0x00000001, LITTLE:0x01000000 Flag2 = 2, // BIG: 0x00000002, LITTLE:0x02000000 Flag3 = 4, // BIG: 0x00000004, LITTLE:0x03000000 // ... Flag9 = 256, // BIG: 0x00000010, LITTLE:0x10000000 Flag10 = 512, // BIG: 0x00000011, LITTLE:0x11000000 Flag11 = 1024 // BIG: 0x00000012, LITTLE:0x12000000 } So, I'm kind of wondering if there is some cool way I can set my enumerations up like: public enum MyEnumeration : uint { Flag1 = flag(1), // BOTH: 0x80000000 Flag2 = flag(2), // BOTH: 0x40000000 Flag3 = flag(3), // BOTH: 0x20000000 // ... Flag9 = flag(9), // BOTH: 0x00800000 } What I've Tried: // this won't work because Math.Pow returns double // and because C# requires constants for enum values public enum MyEnumeration : uint { Flag1 = Math.Pow(2, 0), Flag2 = Math.Pow(2, 1) } // this won't work because C# requires constants for enum values public enum MyEnumeration : uint { Flag1 = Masks.MyCustomerBitmaskGeneratingFunction(0) } // this is my best solution so far, but is definitely // quite clunkie public struct EnumWrapper<TEnum> where TEnum { private BitVector32 vector; public bool this[TEnum index] { // returns whether the index-th bit is set in vector } // all sorts of overriding using TEnum as args } Just wondering if anyone has any cool ideas, thanks!

Read the article
Need a tool to search large structure text documents for words, phrases and related phrases

- by pitosalas

I have to keep up with structured documents containing things such as requests for proposals, government program reports, threat models and all kinds of things like that. They are in techno-legalese as I would call them: highly structured, with section numbering and 3, 4 and 5 levels of nesting. All in English I need a more efficient way to locate those paragraphs of nuggets that matter to me. So what I’d like is kind of a local document index/repository, that would allow me to have some standing queries and easily locate sections in documents that talk about my queries. Here’s an example: I’d like to load in 10 large PDF files, each of say 100 pages. Each PDF contains English text, formatted very nicely into paragraphs and sections. I’d like to specify that I am interested in “blogging platforms”, “weaknesses in Ruby”, “localization and internationalization” Ideally then look at a list that showed the section of text, the name of the document, and other information that seemed to be related to and/or include the words and phrases I specified. I am sure something like this exists. I would call it something like document indexing, document comprehension or structured searching.

Read the article
Hibernate design to speed up querying of large dataset

- by paddydub

I currently have the below tables representing a bus network mapped in hibernate, accessed from a Spring MVC based bus route planner I'm trying to make my route planner application perform faster, I load all the above tables into Lists to perform the route planner logic. I would appreciate if anyone has any ideas of how to speed my performace Or any suggestions of another method to approach this problem of handling a large set of data Coordinate Connections Table (INT,INT,INT, DOUBLE)( Containing 50,000 Coordinate Connections) ID, FROMCOORDID, TOCOORDID, DISTANCE 1 1 2 0.383657 2 1 17 0.173201 3 1 63 0.258781 4 1 64 0.013726 5 1 65 0.459829 6 1 95 0.458769 Coordinate Table (INT,DECIMAL, DECIMAL) (Containing 4700 Coordinates) ID , LAT, LNG 0 59.352669 -7.264341 1 59.352669 -7.264341 2 59.350012 -7.260653 3 59.337585 -7.189798 4 59.339221 -7.193582 5 59.341408 -7.205888 Bus Stop Table (INT, INT, INT)(Containing 15000 Stops) StopID RouteID COORDINATEID 1000100001 100 17 1000100002 100 18 1000100003 100 19 1000100004 100 20 1000100005 100 21 1000100006 100 22 1000100007 100 23 This is how long it takes to load all the data from each table: stop.findAll = 148ms, stops.size: 15670 Hibernate: select coordinate0_.COORDINATEID as COORDINA1_2_, coordinate0_.LAT as LAT2_, coordinate0_.LNG as LNG2_ from COORDINATES coordinate0_ coord.findAll = 51ms , coordinates.size: 4704 Hibernate: select coordconne0_.COORDCONNECTIONID as COORDCON1_3_, coordconne0_.DISTANCE as DISTANCE3_, coordconne0_.FROMCOORDID as FROMCOOR3_3_, coordconne0_.TOCOORDID as TOCOORDID3_ from COORDCONNECTIONS coordconne0_ coordinateConnectionDao.findAll = 238ms ; coordConnectioninates.size:48132 Hibernate Annotations @Entity @Table(name = "STOPS") public class Stop implements Serializable { @Id @GeneratedValue(strategy = GenerationType.AUTO) @Column(name = "STOPID") private int stopID; @Column(name = "ROUTEID", nullable = false) private int routeID; @ManyToOne(fetch = FetchType.LAZY) @JoinColumn(name = "COORDINATEID", nullable = false) private Coordinate coordinate; } @Table(name = "COORDINATES") public class Coordinate { @Id @GeneratedValue @Column(name = "COORDINATEID") private int CoordinateID; @Column(name = "LAT") private double latitude; @Column(name = "LNG") private double longitude; } @Entity @Table(name = "COORDCONNECTIONS") public class CoordConnection { @Id @GeneratedValue @Column(name = "COORDCONNECTIONID") private int CoordinateID; @ManyToOne(fetch = FetchType.LAZY) @JoinColumn(name = "FROMCOORDID", nullable = false) private Coordinate fromCoordID; @ManyToOne(fetch = FetchType.LAZY) @JoinColumn(name = "TOCOORDID", nullable = false) private Coordinate toCoordID; @Column(name = "DISTANCE", nullable = false) private double distance; }

Read the article
Filter large amounts of data in a table w/ jQuery

- by Bry4n

I work for a transit agency and I have large amounts of data (mostly times), and I need a way to filter the data using two textboxes (To and From). I found jQuery quick search, but it seems to only work with one textbox. If anyone has any ideas via jQuery or some other client side library, that would be fantastic. Ideal example: To: [Textbox] From:[Textbox] <table> <tr> <td>69th street</td><td>5:00pm</td><td>5:06pm</td><td>5:10pm</td><td>5:20pm</td> </tr> <tr> <td>Millbourne</td><td>5:09pm</td><td>5:15pm</td><td>5:20pm</td><td>5:25pm</td> </tr> <tr> <td>Spring Garden</td><td>6:00pm</td><td>6:15pm</td><td>6:20pm</td><td>6:25pm</td> </tr> </table> So If I start typing in one of the stations in the To: textbox it either displays dynamically like the quick search or i have to press a button (either or) and then in the from: textbox. Lastly it shows me to: station and all its times on the left and the from: station and all its times on the right.

Read the article
Good PHP / MYSQL hashing solution for large number of text values

- by Dave

Short descriptio: Need hashing algorithm solution in php for large number of text values. Long description. PRODUCT_OWNER_TABLE serial_number (auto_inc), product_name, owner_id OWNER_TABLE owner_id (auto_inc), owener_name I need to maintain a database of 200000 unique products and their owners (AND all subsequent changes to ownership). Each product has one owner, but an owner may have MANY different products. Owner names are "Adam Smith", "John Reeves", etc, just text values (quite likely to be unicode as well). I want to optimize the database design, so what i was thinking was, every week when i run this script, it fetchs the owner of a proudct, then checks against a table i suppose similar to PRODUCT_OWNER_TABLE, fetching the owner_id. It then looks up owner_id in OWNER_TABLE. If it matches, then its the same, so it moves on. The problem is when its different... To optimize the database, i think i should be checking against the other "owner_name" entries in OWNER_TABLE to see if that value exists there. If it does, then i should use that owner_id. If it doesnt, then i should add another entry. Note that there is nothing special about the "name". as long as i maintain the correct linkagaes AND make the OWNER_TABLE "read-only, append-new" type table - I should be able create a historical archive of ownership. I need to do this check for 200000 entries, with i dont know how many unique owner names (~50000?). I think i need a hashing solution - the OWNER_TABLE wont be sorted, so search algos wont be optimal. programming language is PHP. database is MYSQL.

Read the article
table design for storing large number of rows

- by hyperboreean

I am trying to store in a postgresql database some unique identifiers along with the site they have been seen on. I can't really decide which of the following 3 option to choose in order to be faster and easy maintainable. The table would have to provide the following information: the unique identifier which unfortunately it's text the sites on which that unique identifier has been seen The amount of data that would have to hold is rather large: there are around 22 millions unique identifiers that I know of. So I thought about the following designs of the table: id - integer identifier - text seen_on_site - an integer, foreign key to a sites table This approach would require around 22 mil multiplied by the number of sites. id - integer identifier - text seen_on_site_1 - boolean seen_on_site_2 - boolean ............ seen_on_site_n - boolean Hopefully the number of sites won't go past 10. This would require only the number of unique identifiers that I know of, that is around 20 millions, but it would make it hard to work with it from an ORM perspective. one table that would store only unique identifiers, like in: id - integer unique_identifier - text, one table that would store only sites, like in: id - integer site - text and one many to many relation, like: id - integer, unique_id - integer (fk to the table storing identifiers) site_id - integer (fk to sites table) another approach would be to have a table that stores unique identifiers for each site So, which one seems like a better approach to take on the long run?

Read the article
Filter large amounts of data from a HTML table w/ jQuery

- by Bry4n

I work for a transit agency and I have large amounts of data (mostly times), and I need a way to filter the data using two textboxes (To and From). I found jQuery quick search, but it seems to only work with one textbox. If anyone has any ideas via jQuery or some other client side library, that would be fantastic. Ideal example: To: [Textbox] From:[Textbox] <table> <tr> <td>69th street</td><td>5:00pm</td><td>5:06pm</td><td>5:10pm</td><td>5:20pm</td> </tr> <tr> <td>Millbourne</td><td>5:09pm</td><td>5:15pm</td><td>5:20pm</td><td>5:25pm</td> </tr> <tr> <td>Spring Garden</td><td>6:00pm</td><td>6:15pm</td><td>6:20pm</td><td>6:25pm</td> </tr> </table> I have an HTML page with a giant table on it listing the station names and each stations times. I want to be able to put my starting location in one box and my ending location in another box and have all the items in the table disappear that don't relate to either of the two locations typed in, leaving only two rows that match what was typed in (even if they don't spell it right or type it all the way) Similar to the jQuery quick search plugin

Read the article
How do you make life easier for yourself when developing a really large database

- by Hannes de Jager

I am busy developing 2 web based systems with MySql databases and the amount of tables/views/stored routines is really becoming a lot and it is more and more challenging to handle the complexity. Now in programming languages we have namespacing e.g. Java packages, C++ namespaces to partition the software, grouping it together to make things more understandable. Databases on the other hand have more of a flat structure (MySql at least) e.g. tables and stored procedures are on the same level. So one have to be more creative, creating naming conventions, perhaps use more than one database or using tools to visualize things. What methods do you use to ease the pain? To be effective while developing your databases? To not get lost in a sea of tables and fields and stored procs? Feel free to mention tools you use also, but try to restrict it to open source and preferably Linux solutions if thats OK. b.t.w How many tables would a database have to be considered large in terms of design?

Read the article
Write contents of custom View to large Image file on SD card

- by JFortney

I have a class that extends View. I override the onDraw method and allow the user to draw on the screen. I am at the point where I want to save this view as an image. I Can use buildDrawingCache and getDrawingCache to create a bitmap that I can write to the SD card. However, the image is not good quality at a large size, it has jagged edges. Since I have a View and I use Paths I can transform all by drawing to a bigger size. I just don't know how to make the Canvas bigger so when I call getDrawingCache it doesn't crop all the paths I am just transformed. What is happening is I transform all my paths but when I write the Bitmap to file I am only getting the "viewport" of the actual screen size. I want something much bigger. Any help in the right direction would be greatly appreciated. I have been reading the docs and books and am at a loss. Thanks Jon

Read the article
Delphi fast large bitmap creation (without clearing)

- by Ritsaert Hornstra

When using the TBitmap wrapper for a GDI bitmap from the unit Graphics I noticed it will always clear out the bitmap (using a PatBlt call) when setting up a bitmap with SetSize( w, h ). When I copy in the bits later on (see routine below) it seems ScanLine is the fastest possibility and not SetDIBits. function ToBitmap: TBitmap; var i, N, x: Integer; S, D: PAnsiChar; begin Result := TBitmap.Create(); Result.PixelFormat := pf32bit; Result.SetSize( width, height ); S := Src; D := Result.ScanLine[ 0 ]; x := Integer( Result.ScanLine[ 1 ] ) - Integer( D ); N := width * sizeof( longword ); for i := 0 to height - 1 do begin Move( S^, D^, N ); Inc( S, N ); Inc( D, x ); end; end; The bitmaps I need to work with are quite large (150MB of RGB memory). With these iomages it takes 150ms to simply create an empty bitmap and a further 140ms to overwrite it's contents. Is there a way of initializing a TBitmap with the correct size WITHOUT initializing the pixels itself and leaving the memory of the pixels uninitialized (eg dirty)? Or is there another way to do such a thing. I know we could work on the pixels in place but this still leaves the 150ms of unnessesary initializtion of the pixels.

Read the article
Swt file dialog too much files selected?

- by InsertNickHere

Hi there, the swt file dialog will give me an empty result array if I select too much files (approx. 2500files). The listing shows you how I use this dialog. If i select too many sound files, the syso will show 0. Debugging tells me, that the files array is empty in this case. Is there any way to get this work? FileDialog fileDialog = new FileDialog(mainView.getShell(), SWT.MULTI); fileDialog.setText("Choose sound files"); fileDialog.setFilterExtensions(new String[] { new String("*.wav") }); Vector<String> result = new Vector<String>(); fileDialog.open(); String[] files = fileDialog.getFileNames(); for (int i = 0, n = files.length; i < n; i++) { if( !files[i].contains(".wav")) { System.out.println(files[i]); } StringBuffer stringBuffer = new StringBuffer(); stringBuffer.append(fileDialog.getFilterPath()); if (stringBuffer.charAt(stringBuffer.length() - 1) != File.separatorChar) { stringBuffer.append(File.separatorChar); } stringBuffer.append(files[i]); stringBuffer.append(""); String finalName = stringBuffer.toString(); if( !finalName.contains(".wav")) { System.out.println(finalName); } result.add(finalName); } System.out.println(result.size()) ;

Read the article
handle large Parcelable ArrayList in Android

- by Gal Ben-Haim

I'm developing an Android app that is a client to a JSON webservice API. I have classes of resource objects (some are nested) and I pass results from an IntentService that access the webserive using the Parcelable interface for all the resource classes. the webservice returns arrays or results that can be potentially large (because of the nesting, for example, a post object also contains comments array, each comment also contains a user object). currently I'm either inserting the results into a SQlite database or displaying them in a ListView. (my relevant methods are accepting ArrayList<resourceClass> as arguments). (some data need to be persistent stored and some should not). since I don't know what size of lists I can handle this way without reaching the memory limits, is this a good practice ? is it a better idea to save the parsed JSON to a local file immediately and pass the file path to the ResultReceiver, then either insert to database from that file or display the data ? is there a better way to handle this ? btw - I'm parsing the JSON as a stream with Gson's Reader so there shouldn't be memory issues at that stage.

Read the article
Way to store a large dictionary with low memory footprint + fast lookups (on Android)

- by BobbyJim

I'm developing an android word game app that needs a large (~250,000 word dictionary) available. I need: reasonably fast look ups e.g. constant time preferable, need to do maybe 200 lookups a second on occasion to solve a word puzzle and maybe 20 lookups within 0.2 second more often to check words the user just spelled. EDIT: Lookups are typically asking "Is in the dictionary?". I'd like to support up to two wildcards in the word as well, but this is easy enough by just generating all possible letters the wildcards could have been and checking the generated words (i.e. 26 * 26 lookups for a word with two wildcards). as it's a mobile app, using as little memory as possible and requiring only a small initial download for the dictionary data is top priority. My first naive attempts used Java's HashMap class, which caused an out of memory exception. I've looked into using the SQL lite databases available on android, but this seems like overkill. What's a good way to do what I need?

Read the article
LuaEdit can't find module when Lua files all in the same folder

- by joverboard

I downloaded LuaEdit to use as an IDE and debug tool however I'm having trouble using it for even the simplest things. I've created a solution with 2 files in it, all of which are stored in the same folder. My files are as follows: --startup.lua require("foo") test("Testing", "testing", "one, two, three") --foo.lua foo = {} print("In foo.lua") function test(a,b,c) print(a,b,c) end This works fine when in my C++ compiler when accessed through some embed code, however when I attempt to use the same code in LuaEdit, it crashes on line 3 require("foo") with an error stating: module 'foo' not found: no field package.preload['foo'] no file 'C:\Program Files (x86)\LuaEdit 2010\lua\foo.lua' no file 'C:\Program Files (x86)\LuaEdit 2010\lua\foo\init.lua' no file 'C:\Program Files (x86)\LuaEdit 2010\foo.lua' no file 'C:\Program Files (x86)\LuaEdit 2010\foo\init.lua' no file '.\foo.lua' no file 'C:\Program Files (x86)\LuaEdit 2010\foo.dll' no file 'C:\Program Files (x86)\LuaEdit 2010\loadall.dll' no file '.\battle.dll' I have also tried creating these files prior to adding them to a solution and still get the same error. Is there some setting I'm missing? It would be great to have an IDE/debugger but it's useless to me if it can't run linked functions.

Read the article
How To Deal With Exceptions In Large Code Bases

- by peter

Hi All, I have a large C# code base. It seems quite buggy at times, and I was wondering if there is a quick way to improve finding and diagnosing issues that are occuring on client PCs. The most pressing issue is that exceptions occur in the software, are caught, and even reported through to me. The problem is that by the time they are caught the original cause of the exception is lost. I.e. If an exception was caught in a specific method, but that method calls 20 other methods, and those methods each call 20 other methods. You get the picture, a null reference exception is impossible to figure out, especially if it occured on a client machine. I have currently found some places where I think errors are more likely to occur and wrapped these directly in their own try catch blocks. Is that the only solution? I could be here a long time. I don't care that the exception will bring down the current process (it is just a thread anyway - not the main application), but I care that the exceptions come back and don't help with troubleshooting. Any ideas? I am well aware that I am probably asking a question which sounds silly, and may not have a straightforward answer. All the same some discussion would be good.

Read the article
JavaScript - Efficiently find all elements containing one of a large set of strings

- by noah

I have a set of strings and I need to find all all of the occurrences in an HTML document. Where the string occurs is important because I need to handle each case differently: String is all or part of an attribute. e.g., the string is foo: <input value="foo"> - Add class ATTR to the element. String is the full text of an element. e.g., <button>foo</button> - Add class TEXT to the element. String is inline in the text of an element. e.g., <p>I love foo</p> - Wrap the text in a span tag with class TEXT. Also, I need to match the longest string first. e.g., if I have foo and foobar, then <p>I love foobar</p> should become <p>I love <span class="TEXT">foobar</span></p>, not <p>I love <span class="TEXT">foo</span>bar</p>. The inline text is easy enough: Sort the strings descending by length and find and replace each in document.body.innerHTML with <span class="TEXT">$1</span>, although I'm not sure if that is the most efficient way to go. For the attributes, I can do something like this: sortedStrings.each(function(it) { document.body.innerHTML.replace(new RegExp('(\S+?)="[^"]*'+escapeRegExChars(it)+'[^"]*"','g'),function(s,attr) { $('[+attr+'*='+it+']').addClass('ATTR'); }); }); Again, that seems inefficient. Lastly, for the full text elements, a depth first search of the document that compares the innerHTML to each string will work, but for a large number of strings, it seems very inefficient. Any answer that offers performance improvements gets an upvote :)

Read the article
How can i fetch the large image from url

- by Kutbi

i used below code to fetch the image from url.but its not working for large image.. i missing something to add for that type of image to fetch. imgView = (ImageView)findViewById(R.id.ImageView01); imgView.setImageBitmap(loadBitmap("http://www.360technosoft.com/mx4.jpg")); //imgView.setImageBitmap(loadBitmap("http://sugardaddydiaries.com/wp-content/uploads/2010/12/how_do_i_get_sugar_daddy.jpg")); //setImageDrawable("http://sugardaddydiaries.com/wp-content/uploads/2010/12/holding-money-copy.jpg"); //Drawable drawable = LoadImageFromWebOperations("http://www.androidpeople.com/wp-content/uploads/2010/03/android.png"); //imgView.setImageDrawable(drawable); /* try { ImageView i = (ImageView)findViewById(R.id.ImageView01); Bitmap bitmap = BitmapFactory.decodeStream((InputStream)new URL("http://sugardaddydiaries.com/wp-content/uploads/2010/12/holding-money-copy.jpg").getContent()); i.setImageBitmap(bitmap); } catch (MalformedURLException e) { System.out.println("hello"); } catch (IOException e) { System.out.println("hello"); }*/ } protected Drawable ImageOperations(Context context, String string, String string2) { // TODO Auto-generated method stub try { InputStream is = (InputStream) this.fetch(string); Drawable d = Drawable.createFromStream(is, "src"); return d; } catch (MalformedURLException e) { e.printStackTrace(); return null; } catch (IOException e) { e.printStackTrace(); return null; } }

Read the article
What database strategy to choose for a large web application

- by Snoopy

I have to rewrite a large database application, running on 32 servers. The hardware is up to date, each machine has two quad core Xeon and 32 GByte RAM. The database is multi-tenant, each customer has his own file, around 5 to 10 GByte each. I run around 50 databases on this hardware. The app is open to the web, so I have no control on the load. There are no really complex queries, so SQL is not required if there is a better solution. The databases get updated via FTP every day at midnight. The database is read-only. C# is my favourite language and I want to use ASP.NET MVC. I thought about the following options: Use two big SQL servers running SQL Server 2012 to serve the 32 servers with data. On the 32 servers running IIS hosting providing REST services. Denormalize the database and use Redis on each webserver. Use booksleeve as a Redis client. Use a combination of SQL Server and Redis Use SQL Server 2012 together with Hadoop Use Hadoop without SQL Server What is the best way for a read-only database, to get the best performance without loosing maintainability? Does Map-Reduce make sense at all in such a scenario? The reason for the rewrite is, the old app written in C++ with ISAM technology is too slow, the interfaces are old fashioned and not nice to use from an website, especially when using ajax. The app uses a relational datamodel with many tables, but it is possible to write one accerlerator table where all queries can be performed on, and all other information from the other tables are possible by a simple key lookup.

Read the article
Large ListView containing images in Android

- by Marco W.

For various Android applications, I need large ListViews, i.e. such views with 100-300 entries. All entries must be loaded in bulk when the application is started, as some sorting and processing is necessary and the application cannot know which items to display first, otherwise. So far, I've been loading the images for all items in bulk as well, which are then saved in an ArrayList<CustomType> together with the rest of the data for each entry. But of course, this is not a good practice, as you're very likely to have an OutOfMemoryException then: The references to all images in the ArrayList prevent the garbage collector from working. So the best solution is, obviously, to load only the text data in bulk whereas the images are then loaded as needed, right? The Google Play application does this, for example: You can see that images are loaded as you scroll to them, i.e. they are probably loaded in the adapter's getView() method. But with Google Play, this is a different problem, anyway, as the images must be loaded from the Internet, which is not the case for me. My problem is not that loading the images takes too long, but storing them requires too much memory. So what should I do with the images? Load in getView(), when they are really needed? Would make scrolling sluggish. So calling an AsyncTask then? Or just a normal Thread? Parametrize it? I could save the images that are already loaded into a HashMap<String,Bitmap>, so that they don't need to be loaded again in getView(). But if this is done, you have the memory problem again: The HashMap stores references to all images, so in the end, you could have the OutOfMemoryException again. I know that there are already lots of questions here that discuss "Lazy loading" of images. But they mainly cover the problem of slow loading, not too much memory consumption.

Read the article
Fastest way to perform subset test operation on a large collection of sets with same domain

- by niktech

Assume we have trillions of sets stored somewhere. The domain for each of these sets is the same. It is also finite and discrete. So each set may be stored as a bit field (eg: 0000100111...) of a relatively short length (eg: 1024). That is, bit X in the bitfield indicates whether item X (of 1024 possible items) is included in the given set or not. Now, I want to devise a storage structure and an algorithm to efficiently answer the query: what sets in the data store have set Y as a subset. Set Y itself is not present in the data store and is specified at run time. Now the simplest way to solve this would be to AND the bitfield for set Y with bit fields of every set in the data store one by one, picking the ones whose AND result matches Y's bitfield. How can I speed this up? Is there a tree structure (index) or some smart algorithm that would allow me to perform this query without having to AND every stored set's bitfield? Are there databases that already support such operations on large collections of sets?

Read the article
Python Speeding Up Retrieving data from extremely large string

- by Burninghelix123

I have a list I converted to a very very long string as I am trying to edit it, as you can gather it's called tempString. It works as of now it just takes way to long to operate, probably because it is several different regex subs. They are as follow: tempString = ','.join(str(n) for n in coords) tempString = re.sub(',{2,6}', '_', tempString) tempString = re.sub("[^0-9\-\.\_]", ",", tempString) tempString = re.sub(',+', ',', tempString) clean1 = re.findall(('[-+]?[0-9]*\.?[0-9]+,[-+]?[0-9]*\.?[0-9]+,' '[-+]?[0-9]*\.?[0-9]+'), tempString) tempString = '_'.join(str(n) for n in clean1) tempString = re.sub(',', ' ', tempString) Basically it's a long string containing commas and about 1-5 million sets of 4 floats/ints (mixture of both possible),: -5.65500020981,6.88999986649,-0.454999923706,1,,,-5.65500020981,6.95499992371,-0.454999923706,1,,, The 4th number in each set I don't need/want, i'm essentially just trying to split the string into a list with 3 floats in each separated by a space. The above code works flawlessly but as you can imagine is quite time consuming on large strings. I have done a lot of research on here for a solution but they all seem geared towards words, i.e. swapping out one word for another. EDIT: Ok so this is the solution i'm currently using: def getValues(s): output = [] while s: # get the three values you want, discard the 3 commas, and the # remainder of the string v1, v2, v3, _, _, _, s = s.split(',', 6) output.append("%s %s %s" % (v1.strip(), v2.strip(), v3.strip())) return output coords = getValues(tempString) Anyone have any advice to speed this up even farther? After running some tests It still takes much longer than i'm hoping for. I've been glancing at numPy, but I honestly have absolutely no idea how to the above with it, I understand that after the above has been done and the values are cleaned up i could use them more efficiently with numPy, but not sure how NumPy could apply to the above. The above to clean through 50k sets takes around 20 minutes, I cant imagine how long it would be on my full string of 1 million sets. I'ts just surprising that the program that originally exported the data took only around 30 secs for the 1 million sets

Read the article
Out-of-memory algorithms for addressing large arrays

- by reve_etrange

I am trying to deal with a very large dataset. I have k = ~4200 matrices (varying sizes) which must be compared combinatorially, skipping non-unique and self comparisons. Each of k(k-1)/2 comparisons produces a matrix, which must be indexed against its parents (i.e. can find out where it came from). The convenient way to do this is to (triangularly) fill a k-by-k cell array with the result of each comparison. These are ~100 X ~100 matrices, on average. Using single precision floats, it works out to 400 GB overall. I need to 1) generate the cell array or pieces of it without trying to place the whole thing in memory and 2) access its elements (and their elements) in like fashion. My attempts have been inefficient due to reliance on MATLAB's eval() as well as save and clear occurring in loops. for i=1:k [~,m] = size(data{i}); cur_var = ['H' int2str(i)]; %# if i == 1; save('FileName'); end; %# If using a single MAT file and need to create it. eval([cur_var ' = cell(1,k-i);']); for j=i+1:k [~,n] = size(data{j}); eval([cur_var '{i,j} = zeros(m,n,''single'');']); eval([cur_var '{i,j} = compare(data{i},data{j});']); end save(cur_var,cur_var); %# Add '-append' when using a single MAT file. clear(cur_var); end The other thing I have done is to perform the split when mod((i+j-1)/2,max(factor(k(k-1)/2))) == 0. This divides the result into the largest number of same-size pieces, which seems logical. The indexing is a little more complicated, but not too bad because a linear index could be used. Does anyone know/see a better way?

Read the article
Compact data structure for storing a large set of integral values

- by Odrade

I'm working on an application that needs to pass around large sets of Int32 values. The sets are expected to contain ~1,000,000-50,000,000 items, where each item is a database key in the range 0-50,000,000. I expect distribution of ids in any given set to be effectively random over this range. The operations I need on the set are dirt simple: Add a new value Iterate over all of the values. There is a serious concern about the memory usage of these sets, so I'm looking for a data structure that can store the ids more efficiently than a simple List<int>or HashSet<int>. I've looked at BitArray, but that can be wasteful depending on how sparse the ids are. I've also considered a bitwise trie, but I'm unsure how to calculate the space efficiency of that solution for the expected data. A Bloom Filter would be great, if only I could tolerate the false negatives. I would appreciate any suggestions of data structures suitable for this purpose. I'm interested in both out-of-the-box and custom solutions. EDIT: To answer your questions: No, the items don't need to be sorted By "pass around" I mean both pass between methods and serialize and send over the wire. I clearly should have mentioned this. There could be a decent number of these sets in memory at once (~100).

Read the article
Handling large (object) datasets with PHP

- by Aron Rotteveel

I am currently working on a project that extensively relies on the EAV model. Both entities as their attributes are individually represented by a model, sometimes extending other models (or at least, base models). This has worked quite well so far since most areas of the application only rely on filtered sets of entities, and not the entire dataset. Now, however, I need to parse the entire dataset (IE: all entities and all their attributes) in order to provide a sorting/filtering algorithm based on the attributes. The application currently consists of aproximately 2200 entities, each with aproximately 100 attributes. Every entity is represented by a single model (for example Client_Model_Entity) and has a protected property called $_attributes, which is an array of Attribute objects. Each entity object is about 500KB, which results in an incredible load on the server. With 2000 entities, this means a single task would take 1GB of RAM (and a lot of CPU time) in order to work, which is unacceptable. Are there any patterns or common approaches to iterating over such large datasets? Paging is not really an option, since everything has to be taken into account in order to provide the sorting algorithm.

Read the article

< Previous Page | 70 71 72 73 74 75 76 77 78 79 80 81 | Next Page >