Search Results

Search found 10417 results on 417 pages for 'large'.

Page 32/417 | < Previous Page | 28 29 30 31 32 33 34 35 36 37 38 39  | Next Page >

  • Need a tool to search large structure text documents for words, phrases and related phrases

    - by pitosalas
    I have to keep up with structured documents containing things such as requests for proposals, government program reports, threat models and all kinds of things like that. They are in techno-legalese as I would call them: highly structured, with section numbering and 3, 4 and 5 levels of nesting. All in English I need a more efficient way to locate those paragraphs of nuggets that matter to me. So what I’d like is kind of a local document index/repository, that would allow me to have some standing queries and easily locate sections in documents that talk about my queries. Here’s an example: I’d like to load in 10 large PDF files, each of say 100 pages. Each PDF contains English text, formatted very nicely into paragraphs and sections. I’d like to specify that I am interested in “blogging platforms”, “weaknesses in Ruby”, “localization and internationalization” Ideally then look at a list that showed the section of text, the name of the document, and other information that seemed to be related to and/or include the words and phrases I specified. I am sure something like this exists. I would call it something like document indexing, document comprehension or structured searching.

    Read the article

  • Hibernate design to speed up querying of large dataset

    - by paddydub
    I currently have the below tables representing a bus network mapped in hibernate, accessed from a Spring MVC based bus route planner I'm trying to make my route planner application perform faster, I load all the above tables into Lists to perform the route planner logic. I would appreciate if anyone has any ideas of how to speed my performace Or any suggestions of another method to approach this problem of handling a large set of data Coordinate Connections Table (INT,INT,INT, DOUBLE)( Containing 50,000 Coordinate Connections) ID, FROMCOORDID, TOCOORDID, DISTANCE 1 1 2 0.383657 2 1 17 0.173201 3 1 63 0.258781 4 1 64 0.013726 5 1 65 0.459829 6 1 95 0.458769 Coordinate Table (INT,DECIMAL, DECIMAL) (Containing 4700 Coordinates) ID , LAT, LNG 0 59.352669 -7.264341 1 59.352669 -7.264341 2 59.350012 -7.260653 3 59.337585 -7.189798 4 59.339221 -7.193582 5 59.341408 -7.205888 Bus Stop Table (INT, INT, INT)(Containing 15000 Stops) StopID RouteID COORDINATEID 1000100001 100 17 1000100002 100 18 1000100003 100 19 1000100004 100 20 1000100005 100 21 1000100006 100 22 1000100007 100 23 This is how long it takes to load all the data from each table: stop.findAll = 148ms, stops.size: 15670 Hibernate: select coordinate0_.COORDINATEID as COORDINA1_2_, coordinate0_.LAT as LAT2_, coordinate0_.LNG as LNG2_ from COORDINATES coordinate0_ coord.findAll = 51ms , coordinates.size: 4704 Hibernate: select coordconne0_.COORDCONNECTIONID as COORDCON1_3_, coordconne0_.DISTANCE as DISTANCE3_, coordconne0_.FROMCOORDID as FROMCOOR3_3_, coordconne0_.TOCOORDID as TOCOORDID3_ from COORDCONNECTIONS coordconne0_ coordinateConnectionDao.findAll = 238ms ; coordConnectioninates.size:48132 Hibernate Annotations @Entity @Table(name = "STOPS") public class Stop implements Serializable { @Id @GeneratedValue(strategy = GenerationType.AUTO) @Column(name = "STOPID") private int stopID; @Column(name = "ROUTEID", nullable = false) private int routeID; @ManyToOne(fetch = FetchType.LAZY) @JoinColumn(name = "COORDINATEID", nullable = false) private Coordinate coordinate; } @Table(name = "COORDINATES") public class Coordinate { @Id @GeneratedValue @Column(name = "COORDINATEID") private int CoordinateID; @Column(name = "LAT") private double latitude; @Column(name = "LNG") private double longitude; } @Entity @Table(name = "COORDCONNECTIONS") public class CoordConnection { @Id @GeneratedValue @Column(name = "COORDCONNECTIONID") private int CoordinateID; @ManyToOne(fetch = FetchType.LAZY) @JoinColumn(name = "FROMCOORDID", nullable = false) private Coordinate fromCoordID; @ManyToOne(fetch = FetchType.LAZY) @JoinColumn(name = "TOCOORDID", nullable = false) private Coordinate toCoordID; @Column(name = "DISTANCE", nullable = false) private double distance; }

    Read the article

  • Filter large amounts of data in a table w/ jQuery

    - by Bry4n
    I work for a transit agency and I have large amounts of data (mostly times), and I need a way to filter the data using two textboxes (To and From). I found jQuery quick search, but it seems to only work with one textbox. If anyone has any ideas via jQuery or some other client side library, that would be fantastic. Ideal example: To: [Textbox] From:[Textbox] <table> <tr> <td>69th street</td><td>5:00pm</td><td>5:06pm</td><td>5:10pm</td><td>5:20pm</td> </tr> <tr> <td>Millbourne</td><td>5:09pm</td><td>5:15pm</td><td>5:20pm</td><td>5:25pm</td> </tr> <tr> <td>Spring Garden</td><td>6:00pm</td><td>6:15pm</td><td>6:20pm</td><td>6:25pm</td> </tr> </table> So If I start typing in one of the stations in the To: textbox it either displays dynamically like the quick search or i have to press a button (either or) and then in the from: textbox. Lastly it shows me to: station and all its times on the left and the from: station and all its times on the right.

    Read the article

  • Good PHP / MYSQL hashing solution for large number of text values

    - by Dave
    Short descriptio: Need hashing algorithm solution in php for large number of text values. Long description. PRODUCT_OWNER_TABLE serial_number (auto_inc), product_name, owner_id OWNER_TABLE owner_id (auto_inc), owener_name I need to maintain a database of 200000 unique products and their owners (AND all subsequent changes to ownership). Each product has one owner, but an owner may have MANY different products. Owner names are "Adam Smith", "John Reeves", etc, just text values (quite likely to be unicode as well). I want to optimize the database design, so what i was thinking was, every week when i run this script, it fetchs the owner of a proudct, then checks against a table i suppose similar to PRODUCT_OWNER_TABLE, fetching the owner_id. It then looks up owner_id in OWNER_TABLE. If it matches, then its the same, so it moves on. The problem is when its different... To optimize the database, i think i should be checking against the other "owner_name" entries in OWNER_TABLE to see if that value exists there. If it does, then i should use that owner_id. If it doesnt, then i should add another entry. Note that there is nothing special about the "name". as long as i maintain the correct linkagaes AND make the OWNER_TABLE "read-only, append-new" type table - I should be able create a historical archive of ownership. I need to do this check for 200000 entries, with i dont know how many unique owner names (~50000?). I think i need a hashing solution - the OWNER_TABLE wont be sorted, so search algos wont be optimal. programming language is PHP. database is MYSQL.

    Read the article

  • table design for storing large number of rows

    - by hyperboreean
    I am trying to store in a postgresql database some unique identifiers along with the site they have been seen on. I can't really decide which of the following 3 option to choose in order to be faster and easy maintainable. The table would have to provide the following information: the unique identifier which unfortunately it's text the sites on which that unique identifier has been seen The amount of data that would have to hold is rather large: there are around 22 millions unique identifiers that I know of. So I thought about the following designs of the table: id - integer identifier - text seen_on_site - an integer, foreign key to a sites table This approach would require around 22 mil multiplied by the number of sites. id - integer identifier - text seen_on_site_1 - boolean seen_on_site_2 - boolean ............ seen_on_site_n - boolean Hopefully the number of sites won't go past 10. This would require only the number of unique identifiers that I know of, that is around 20 millions, but it would make it hard to work with it from an ORM perspective. one table that would store only unique identifiers, like in: id - integer unique_identifier - text, one table that would store only sites, like in: id - integer site - text and one many to many relation, like: id - integer, unique_id - integer (fk to the table storing identifiers) site_id - integer (fk to sites table) another approach would be to have a table that stores unique identifiers for each site So, which one seems like a better approach to take on the long run?

    Read the article

  • Filter large amounts of data from a HTML table w/ jQuery

    - by Bry4n
    I work for a transit agency and I have large amounts of data (mostly times), and I need a way to filter the data using two textboxes (To and From). I found jQuery quick search, but it seems to only work with one textbox. If anyone has any ideas via jQuery or some other client side library, that would be fantastic. Ideal example: To: [Textbox] From:[Textbox] <table> <tr> <td>69th street</td><td>5:00pm</td><td>5:06pm</td><td>5:10pm</td><td>5:20pm</td> </tr> <tr> <td>Millbourne</td><td>5:09pm</td><td>5:15pm</td><td>5:20pm</td><td>5:25pm</td> </tr> <tr> <td>Spring Garden</td><td>6:00pm</td><td>6:15pm</td><td>6:20pm</td><td>6:25pm</td> </tr> </table> I have an HTML page with a giant table on it listing the station names and each stations times. I want to be able to put my starting location in one box and my ending location in another box and have all the items in the table disappear that don't relate to either of the two locations typed in, leaving only two rows that match what was typed in (even if they don't spell it right or type it all the way) Similar to the jQuery quick search plugin

    Read the article

  • How do you make life easier for yourself when developing a really large database

    - by Hannes de Jager
    I am busy developing 2 web based systems with MySql databases and the amount of tables/views/stored routines is really becoming a lot and it is more and more challenging to handle the complexity. Now in programming languages we have namespacing e.g. Java packages, C++ namespaces to partition the software, grouping it together to make things more understandable. Databases on the other hand have more of a flat structure (MySql at least) e.g. tables and stored procedures are on the same level. So one have to be more creative, creating naming conventions, perhaps use more than one database or using tools to visualize things. What methods do you use to ease the pain? To be effective while developing your databases? To not get lost in a sea of tables and fields and stored procs? Feel free to mention tools you use also, but try to restrict it to open source and preferably Linux solutions if thats OK. b.t.w How many tables would a database have to be considered large in terms of design?

    Read the article

  • Write contents of custom View to large Image file on SD card

    - by JFortney
    I have a class that extends View. I override the onDraw method and allow the user to draw on the screen. I am at the point where I want to save this view as an image. I Can use buildDrawingCache and getDrawingCache to create a bitmap that I can write to the SD card. However, the image is not good quality at a large size, it has jagged edges. Since I have a View and I use Paths I can transform all by drawing to a bigger size. I just don't know how to make the Canvas bigger so when I call getDrawingCache it doesn't crop all the paths I am just transformed. What is happening is I transform all my paths but when I write the Bitmap to file I am only getting the "viewport" of the actual screen size. I want something much bigger. Any help in the right direction would be greatly appreciated. I have been reading the docs and books and am at a loss. Thanks Jon

    Read the article

  • Delphi fast large bitmap creation (without clearing)

    - by Ritsaert Hornstra
    When using the TBitmap wrapper for a GDI bitmap from the unit Graphics I noticed it will always clear out the bitmap (using a PatBlt call) when setting up a bitmap with SetSize( w, h ). When I copy in the bits later on (see routine below) it seems ScanLine is the fastest possibility and not SetDIBits. function ToBitmap: TBitmap; var i, N, x: Integer; S, D: PAnsiChar; begin Result := TBitmap.Create(); Result.PixelFormat := pf32bit; Result.SetSize( width, height ); S := Src; D := Result.ScanLine[ 0 ]; x := Integer( Result.ScanLine[ 1 ] ) - Integer( D ); N := width * sizeof( longword ); for i := 0 to height - 1 do begin Move( S^, D^, N ); Inc( S, N ); Inc( D, x ); end; end; The bitmaps I need to work with are quite large (150MB of RGB memory). With these iomages it takes 150ms to simply create an empty bitmap and a further 140ms to overwrite it's contents. Is there a way of initializing a TBitmap with the correct size WITHOUT initializing the pixels itself and leaving the memory of the pixels uninitialized (eg dirty)? Or is there another way to do such a thing. I know we could work on the pixels in place but this still leaves the 150ms of unnessesary initializtion of the pixels.

    Read the article

  • handle large Parcelable ArrayList in Android

    - by Gal Ben-Haim
    I'm developing an Android app that is a client to a JSON webservice API. I have classes of resource objects (some are nested) and I pass results from an IntentService that access the webserive using the Parcelable interface for all the resource classes. the webservice returns arrays or results that can be potentially large (because of the nesting, for example, a post object also contains comments array, each comment also contains a user object). currently I'm either inserting the results into a SQlite database or displaying them in a ListView. (my relevant methods are accepting ArrayList<resourceClass> as arguments). (some data need to be persistent stored and some should not). since I don't know what size of lists I can handle this way without reaching the memory limits, is this a good practice ? is it a better idea to save the parsed JSON to a local file immediately and pass the file path to the ResultReceiver, then either insert to database from that file or display the data ? is there a better way to handle this ? btw - I'm parsing the JSON as a stream with Gson's Reader so there shouldn't be memory issues at that stage.

    Read the article

  • Way to store a large dictionary with low memory footprint + fast lookups (on Android)

    - by BobbyJim
    I'm developing an android word game app that needs a large (~250,000 word dictionary) available. I need: reasonably fast look ups e.g. constant time preferable, need to do maybe 200 lookups a second on occasion to solve a word puzzle and maybe 20 lookups within 0.2 second more often to check words the user just spelled. EDIT: Lookups are typically asking "Is in the dictionary?". I'd like to support up to two wildcards in the word as well, but this is easy enough by just generating all possible letters the wildcards could have been and checking the generated words (i.e. 26 * 26 lookups for a word with two wildcards). as it's a mobile app, using as little memory as possible and requiring only a small initial download for the dictionary data is top priority. My first naive attempts used Java's HashMap class, which caused an out of memory exception. I've looked into using the SQL lite databases available on android, but this seems like overkill. What's a good way to do what I need?

    Read the article

  • JavaScript - Efficiently find all elements containing one of a large set of strings

    - by noah
    I have a set of strings and I need to find all all of the occurrences in an HTML document. Where the string occurs is important because I need to handle each case differently: String is all or part of an attribute. e.g., the string is foo: <input value="foo"> - Add class ATTR to the element. String is the full text of an element. e.g., <button>foo</button> - Add class TEXT to the element. String is inline in the text of an element. e.g., <p>I love foo</p> - Wrap the text in a span tag with class TEXT. Also, I need to match the longest string first. e.g., if I have foo and foobar, then <p>I love foobar</p> should become <p>I love <span class="TEXT">foobar</span></p>, not <p>I love <span class="TEXT">foo</span>bar</p>. The inline text is easy enough: Sort the strings descending by length and find and replace each in document.body.innerHTML with <span class="TEXT">$1</span>, although I'm not sure if that is the most efficient way to go. For the attributes, I can do something like this: sortedStrings.each(function(it) { document.body.innerHTML.replace(new RegExp('(\S+?)="[^"]*'+escapeRegExChars(it)+'[^"]*"','g'),function(s,attr) { $('[+attr+'*='+it+']').addClass('ATTR'); }); }); Again, that seems inefficient. Lastly, for the full text elements, a depth first search of the document that compares the innerHTML to each string will work, but for a large number of strings, it seems very inefficient. Any answer that offers performance improvements gets an upvote :)

    Read the article

  • How To Deal With Exceptions In Large Code Bases

    - by peter
    Hi All, I have a large C# code base. It seems quite buggy at times, and I was wondering if there is a quick way to improve finding and diagnosing issues that are occuring on client PCs. The most pressing issue is that exceptions occur in the software, are caught, and even reported through to me. The problem is that by the time they are caught the original cause of the exception is lost. I.e. If an exception was caught in a specific method, but that method calls 20 other methods, and those methods each call 20 other methods. You get the picture, a null reference exception is impossible to figure out, especially if it occured on a client machine. I have currently found some places where I think errors are more likely to occur and wrapped these directly in their own try catch blocks. Is that the only solution? I could be here a long time. I don't care that the exception will bring down the current process (it is just a thread anyway - not the main application), but I care that the exceptions come back and don't help with troubleshooting. Any ideas? I am well aware that I am probably asking a question which sounds silly, and may not have a straightforward answer. All the same some discussion would be good.

    Read the article

  • How can i fetch the large image from url

    - by Kutbi
    i used below code to fetch the image from url.but its not working for large image.. i missing something to add for that type of image to fetch. imgView = (ImageView)findViewById(R.id.ImageView01); imgView.setImageBitmap(loadBitmap("http://www.360technosoft.com/mx4.jpg")); //imgView.setImageBitmap(loadBitmap("http://sugardaddydiaries.com/wp-content/uploads/2010/12/how_do_i_get_sugar_daddy.jpg")); //setImageDrawable("http://sugardaddydiaries.com/wp-content/uploads/2010/12/holding-money-copy.jpg"); //Drawable drawable = LoadImageFromWebOperations("http://www.androidpeople.com/wp-content/uploads/2010/03/android.png"); //imgView.setImageDrawable(drawable); /* try { ImageView i = (ImageView)findViewById(R.id.ImageView01); Bitmap bitmap = BitmapFactory.decodeStream((InputStream)new URL("http://sugardaddydiaries.com/wp-content/uploads/2010/12/holding-money-copy.jpg").getContent()); i.setImageBitmap(bitmap); } catch (MalformedURLException e) { System.out.println("hello"); } catch (IOException e) { System.out.println("hello"); }*/ } protected Drawable ImageOperations(Context context, String string, String string2) { // TODO Auto-generated method stub try { InputStream is = (InputStream) this.fetch(string); Drawable d = Drawable.createFromStream(is, "src"); return d; } catch (MalformedURLException e) { e.printStackTrace(); return null; } catch (IOException e) { e.printStackTrace(); return null; } }

    Read the article

  • What database strategy to choose for a large web application

    - by Snoopy
    I have to rewrite a large database application, running on 32 servers. The hardware is up to date, each machine has two quad core Xeon and 32 GByte RAM. The database is multi-tenant, each customer has his own file, around 5 to 10 GByte each. I run around 50 databases on this hardware. The app is open to the web, so I have no control on the load. There are no really complex queries, so SQL is not required if there is a better solution. The databases get updated via FTP every day at midnight. The database is read-only. C# is my favourite language and I want to use ASP.NET MVC. I thought about the following options: Use two big SQL servers running SQL Server 2012 to serve the 32 servers with data. On the 32 servers running IIS hosting providing REST services. Denormalize the database and use Redis on each webserver. Use booksleeve as a Redis client. Use a combination of SQL Server and Redis Use SQL Server 2012 together with Hadoop Use Hadoop without SQL Server What is the best way for a read-only database, to get the best performance without loosing maintainability? Does Map-Reduce make sense at all in such a scenario? The reason for the rewrite is, the old app written in C++ with ISAM technology is too slow, the interfaces are old fashioned and not nice to use from an website, especially when using ajax. The app uses a relational datamodel with many tables, but it is possible to write one accerlerator table where all queries can be performed on, and all other information from the other tables are possible by a simple key lookup.

    Read the article

  • Fastest way to perform subset test operation on a large collection of sets with same domain

    - by niktech
    Assume we have trillions of sets stored somewhere. The domain for each of these sets is the same. It is also finite and discrete. So each set may be stored as a bit field (eg: 0000100111...) of a relatively short length (eg: 1024). That is, bit X in the bitfield indicates whether item X (of 1024 possible items) is included in the given set or not. Now, I want to devise a storage structure and an algorithm to efficiently answer the query: what sets in the data store have set Y as a subset. Set Y itself is not present in the data store and is specified at run time. Now the simplest way to solve this would be to AND the bitfield for set Y with bit fields of every set in the data store one by one, picking the ones whose AND result matches Y's bitfield. How can I speed this up? Is there a tree structure (index) or some smart algorithm that would allow me to perform this query without having to AND every stored set's bitfield? Are there databases that already support such operations on large collections of sets?

    Read the article

  • Large ListView containing images in Android

    - by Marco W.
    For various Android applications, I need large ListViews, i.e. such views with 100-300 entries. All entries must be loaded in bulk when the application is started, as some sorting and processing is necessary and the application cannot know which items to display first, otherwise. So far, I've been loading the images for all items in bulk as well, which are then saved in an ArrayList<CustomType> together with the rest of the data for each entry. But of course, this is not a good practice, as you're very likely to have an OutOfMemoryException then: The references to all images in the ArrayList prevent the garbage collector from working. So the best solution is, obviously, to load only the text data in bulk whereas the images are then loaded as needed, right? The Google Play application does this, for example: You can see that images are loaded as you scroll to them, i.e. they are probably loaded in the adapter's getView() method. But with Google Play, this is a different problem, anyway, as the images must be loaded from the Internet, which is not the case for me. My problem is not that loading the images takes too long, but storing them requires too much memory. So what should I do with the images? Load in getView(), when they are really needed? Would make scrolling sluggish. So calling an AsyncTask then? Or just a normal Thread? Parametrize it? I could save the images that are already loaded into a HashMap<String,Bitmap>, so that they don't need to be loaded again in getView(). But if this is done, you have the memory problem again: The HashMap stores references to all images, so in the end, you could have the OutOfMemoryException again. I know that there are already lots of questions here that discuss "Lazy loading" of images. But they mainly cover the problem of slow loading, not too much memory consumption.

    Read the article

  • Statistical analysis on large data set to be published on the web

    - by dassouki
    I have a non-computer related data logger, that collects data from the field. This data is stored as text files, and I manually lump the files together and organize them. The current format is through a csv file per year per logger. Each file is around 4,000,000 lines x 7 loggers x 5 years = a lot of data. some of the data is organized as bins item_type, item_class, item_dimension_class, and other data is more unique, such as item_weight, item_color, date_collected, and so on ... Currently, I do statistical analysis on the data using a python/numpy/matplotlib program I wrote. It works fine, but the problem is, I'm the only one who can use it, since it and the data live on my computer. I'd like to publish the data on the web using a postgres db; however, I need to find or implement a statistical tool that'll take a large postgres table, and return statistical results within an adequate time frame. I'm not familiar with python for the web; however, I'm proficient with PHP on the web side, and python on the offline side. users should be allowed to create their own histograms, data analysis. For example, a user can search for all items that are blue shipped between week x and week y, while another user can search for sort the weight distribution of all items by hour for all year long. I was thinking of creating and indexing my own statistical tools, or automate the process somehow to emulate most queries. This seemed inefficient. I'm looking forward to hearing your ideas Thanks

    Read the article

  • Python Speeding Up Retrieving data from extremely large string

    - by Burninghelix123
    I have a list I converted to a very very long string as I am trying to edit it, as you can gather it's called tempString. It works as of now it just takes way to long to operate, probably because it is several different regex subs. They are as follow: tempString = ','.join(str(n) for n in coords) tempString = re.sub(',{2,6}', '_', tempString) tempString = re.sub("[^0-9\-\.\_]", ",", tempString) tempString = re.sub(',+', ',', tempString) clean1 = re.findall(('[-+]?[0-9]*\.?[0-9]+,[-+]?[0-9]*\.?[0-9]+,' '[-+]?[0-9]*\.?[0-9]+'), tempString) tempString = '_'.join(str(n) for n in clean1) tempString = re.sub(',', ' ', tempString) Basically it's a long string containing commas and about 1-5 million sets of 4 floats/ints (mixture of both possible),: -5.65500020981,6.88999986649,-0.454999923706,1,,,-5.65500020981,6.95499992371,-0.454999923706,1,,, The 4th number in each set I don't need/want, i'm essentially just trying to split the string into a list with 3 floats in each separated by a space. The above code works flawlessly but as you can imagine is quite time consuming on large strings. I have done a lot of research on here for a solution but they all seem geared towards words, i.e. swapping out one word for another. EDIT: Ok so this is the solution i'm currently using: def getValues(s): output = [] while s: # get the three values you want, discard the 3 commas, and the # remainder of the string v1, v2, v3, _, _, _, s = s.split(',', 6) output.append("%s %s %s" % (v1.strip(), v2.strip(), v3.strip())) return output coords = getValues(tempString) Anyone have any advice to speed this up even farther? After running some tests It still takes much longer than i'm hoping for. I've been glancing at numPy, but I honestly have absolutely no idea how to the above with it, I understand that after the above has been done and the values are cleaned up i could use them more efficiently with numPy, but not sure how NumPy could apply to the above. The above to clean through 50k sets takes around 20 minutes, I cant imagine how long it would be on my full string of 1 million sets. I'ts just surprising that the program that originally exported the data took only around 30 secs for the 1 million sets

    Read the article

  • Out-of-memory algorithms for addressing large arrays

    - by reve_etrange
    I am trying to deal with a very large dataset. I have k = ~4200 matrices (varying sizes) which must be compared combinatorially, skipping non-unique and self comparisons. Each of k(k-1)/2 comparisons produces a matrix, which must be indexed against its parents (i.e. can find out where it came from). The convenient way to do this is to (triangularly) fill a k-by-k cell array with the result of each comparison. These are ~100 X ~100 matrices, on average. Using single precision floats, it works out to 400 GB overall. I need to 1) generate the cell array or pieces of it without trying to place the whole thing in memory and 2) access its elements (and their elements) in like fashion. My attempts have been inefficient due to reliance on MATLAB's eval() as well as save and clear occurring in loops. for i=1:k [~,m] = size(data{i}); cur_var = ['H' int2str(i)]; %# if i == 1; save('FileName'); end; %# If using a single MAT file and need to create it. eval([cur_var ' = cell(1,k-i);']); for j=i+1:k [~,n] = size(data{j}); eval([cur_var '{i,j} = zeros(m,n,''single'');']); eval([cur_var '{i,j} = compare(data{i},data{j});']); end save(cur_var,cur_var); %# Add '-append' when using a single MAT file. clear(cur_var); end The other thing I have done is to perform the split when mod((i+j-1)/2,max(factor(k(k-1)/2))) == 0. This divides the result into the largest number of same-size pieces, which seems logical. The indexing is a little more complicated, but not too bad because a linear index could be used. Does anyone know/see a better way?

    Read the article

  • Handling large (object) datasets with PHP

    - by Aron Rotteveel
    I am currently working on a project that extensively relies on the EAV model. Both entities as their attributes are individually represented by a model, sometimes extending other models (or at least, base models). This has worked quite well so far since most areas of the application only rely on filtered sets of entities, and not the entire dataset. Now, however, I need to parse the entire dataset (IE: all entities and all their attributes) in order to provide a sorting/filtering algorithm based on the attributes. The application currently consists of aproximately 2200 entities, each with aproximately 100 attributes. Every entity is represented by a single model (for example Client_Model_Entity) and has a protected property called $_attributes, which is an array of Attribute objects. Each entity object is about 500KB, which results in an incredible load on the server. With 2000 entities, this means a single task would take 1GB of RAM (and a lot of CPU time) in order to work, which is unacceptable. Are there any patterns or common approaches to iterating over such large datasets? Paging is not really an option, since everything has to be taken into account in order to provide the sorting algorithm.

    Read the article

  • Compact data structure for storing a large set of integral values

    - by Odrade
    I'm working on an application that needs to pass around large sets of Int32 values. The sets are expected to contain ~1,000,000-50,000,000 items, where each item is a database key in the range 0-50,000,000. I expect distribution of ids in any given set to be effectively random over this range. The operations I need on the set are dirt simple: Add a new value Iterate over all of the values. There is a serious concern about the memory usage of these sets, so I'm looking for a data structure that can store the ids more efficiently than a simple List<int>or HashSet<int>. I've looked at BitArray, but that can be wasteful depending on how sparse the ids are. I've also considered a bitwise trie, but I'm unsure how to calculate the space efficiency of that solution for the expected data. A Bloom Filter would be great, if only I could tolerate the false negatives. I would appreciate any suggestions of data structures suitable for this purpose. I'm interested in both out-of-the-box and custom solutions. EDIT: To answer your questions: No, the items don't need to be sorted By "pass around" I mean both pass between methods and serialize and send over the wire. I clearly should have mentioned this. There could be a decent number of these sets in memory at once (~100).

    Read the article

  • How to optimize paging for large in memory database

    - by snakefoot
    I have an application where the entire database is implemented in memory using a stl-map for each table in the database. Each item in the stl-map is a complex object with references to other items in the other stl-maps. The application works with a large amount of data, so it uses more than 500 MByte RAM. Clients are able to contact the application and get a filtered version of the entire database. This is done by running through the entire database, and finding items relevant for the client. When the application have been running for an hour or so, then Windows 2003 SP2 starts to page out parts of the RAM for the application (Eventhough there is 16 GByte RAM on the machine). After the application have been partly paged out then a client logon takes a long time (10 mins) because it now generates a page fault for each pointer lookup in the stl-map. I can see it is possible to tell Windows to lock memory in RAM, but this is generally only recommended for device drivers, and only for "small" amounts of memory. I guess a poor mans solution could be to loop through the entire memory database, and thus tell Windows we are still interested in keeping the datamodel in RAM. I guess another poor mans solution could be to disable the pagefile completely on Windows. I guess the expensive solution would be a SQL database, and then rewrite the entire application to use a database layer. Then hopefully the database system will have implemented means to for fast access. Are there other more elegant solutions ?

    Read the article

  • download large files using servlet

    - by niks
    I am using Apache Tomcat Server 6 and Java 1.6 and am trying to write large mp3 files to the ServletOutputStream for a user to download. Files are ranging from a 50-750MB at the moment. The smaller files aren't causing too much of a problem but with the larger files it and getting socket exception broken pipe. File fileMp3 = new File(objDownloadSong.getStrSongFolder() + "/" + strSongIdName); FileInputStream fis = new FileInputStream(fileMp3); response.setContentType("audio/mpeg"); response.setHeader("Content-Disposition", "attachment; filename=\"" + strSongName + ".mp3\";"); response.setContentLength((int) fileMp3.length()); OutputStream os = response.getOutputStream(); try { int byteRead = 0; while ((byteRead = fis.read()) != -1) { os.write(byteRead); } os.flush(); } catch (Exception excp) { downloadComplete = "-1"; excp.printStackTrace(); } finally { os.close(); fis.close(); }

    Read the article

  • Large Switch statements: Bad OOP?

    - by Mystere Man
    I've always been of the opinion that large switch statements are a symptom of bad OOP design. In the past, I've read articles that discuss this topic and they have provided altnerative OOP based approaches, typically based on polymorphism to instantiate the right object to handle the case. I'm now in a situation that has a monsterous switch statement based on a stream of data from a TCP socket in which the protocol consists of basically newline terminated command, followed by lines of data, followed by an end marker. The command can be one of 100 different commands, so I'd like to find a way to reduce this monster switch statement to something more manageable. I've done some googling to find the solutions I recall, but sadly, Google has become a wasteland of irrelevant results for many kinds of queries these days. Are there any patterns for this sort of problem? Any suggestions on possible implementations? One thought I had was to use a dictionary lookup, matching the command text to the object type to instantiate. This has the nice advantage of merely creating a new object and inserting a new command/type in the table for any new commands. However, this also has the problem of type explosion. I now need 100 new classes, plus I have to find a way to interface them cleanly to the data model. Is the "one true switch statement" really the way to go? I'd appreciate your thoughts, opinions, or comments.

    Read the article

< Previous Page | 28 29 30 31 32 33 34 35 36 37 38 39  | Next Page >