Search Results

Search found 14643 results on 586 pages for 'performance comparison'.

  • Persistent (purely functional) Red-Black trees on disk performance

    - by Waneck
    I'm studying the best data structures for implementing a simple open-source temporal object database, and I'm currently very fond of using persistent red-black trees for it.

    My main reason for using persistent data structures is, first of all, to minimize the use of locks, so the database can be as parallel as possible. It will also be easier to implement ACID transactions, and even to abstract the database to work in parallel on a cluster of some kind. The great thing about this approach is that it makes implementing temporal databases almost free. That is quite nice to have, especially for the web and for data analysis (e.g. trends).

    All of this is very cool, but I'm a little suspicious about the overall performance of a persistent data structure on disk. Even though some very fast disks are available today, and all writes can be done asynchronously so a response is always immediate, I don't want to build the whole application on a false premise, only to realize it isn't really a good way to do it.

    Here's my line of thought:

      - Since all writes are done asynchronously, and a persistent data structure does not invalidate the previous (and currently valid) structure, write time isn't really a bottleneck.
      - There is some literature on structures like this designed specifically for disk usage. But it seems to me that these techniques add read overhead to achieve faster writes, and I think exactly the opposite is preferable. Also, many of these techniques do end up with multi-versioned trees, but they aren't strictly immutable, which is crucial to justify the persistence overhead.
      - I know there will still have to be some kind of locking when appending values to the database, and I also know there should be good garbage-collection logic if not all versions are to be kept (otherwise the file size will surely rise dramatically). A delta compression system could also be considered.
      - Of all search tree structures, I really think red-black trees are the closest to what I need, since they require the fewest rotations.

    But there are some possible pitfalls along the way:

      - Asynchronous writes could affect applications that need the data in real time. But I don't think that is the case with web applications, most of the time. When real-time data is needed, other solutions could be devised, such as a check-in/check-out system for specific data that needs to be worked on in a more real-time manner.
      - They could also lead to some commit conflicts, though I fail to think of a good example of when that could happen. Commit conflicts can also occur in a normal RDBMS if two threads are working with the same data, right?
      - The overhead of having an immutable interface like this will grow exponentially and everything is doomed to fail soon, so this is all a bad idea.

    Any thoughts? Thanks!

    Edit: There seems to be a misunderstanding of what a persistent data structure is: http://en.wikipedia.org/wiki/Persistent_data_structure
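As an illustration of the "persistent" property the question relies on, here is a minimal in-memory sketch of red-black insertion by path copying (the rebalancing scheme popularized by Okasaki). It is not the asker's design and says nothing about on-disk layout; the names `Node`, `balance`, `insert` and `to_list` are assumptions made for this example. Each insert returns a new root and leaves every older root readable and unchanged.

```python
from typing import NamedTuple, Optional

RED, BLACK = "R", "B"


class Node(NamedTuple):
    """Immutable tree node; an empty tree is represented by None."""
    color: str
    left: Optional["Node"]
    key: int
    right: Optional["Node"]


def _red(node):
    return node is not None and node.color == RED


def balance(color, left, key, right):
    """Rebuild a black node that has a red child with a red grandchild."""
    parts = None
    if color == BLACK:
        if _red(left) and _red(left.left):
            ll = left.left
            parts = (ll.left, ll.key, ll.right, left.key, left.right, key, right)
        elif _red(left) and _red(left.right):
            lr = left.right
            parts = (left.left, left.key, lr.left, lr.key, lr.right, key, right)
        elif _red(right) and _red(right.left):
            rl = right.left
            parts = (left, key, rl.left, rl.key, rl.right, right.key, right.right)
        elif _red(right) and _red(right.right):
            rr = right.right
            parts = (left, key, right.left, right.key, rr.left, rr.key, rr.right)
    if parts is None:
        return Node(color, left, key, right)
    a, x, b, y, c, z, d = parts
    return Node(RED, Node(BLACK, a, x, b), y, Node(BLACK, c, z, d))


def insert(root, key):
    """Return a NEW root containing key; the tree under `root` is untouched."""
    def ins(node):
        if node is None:
            return Node(RED, None, key, None)
        if key < node.key:
            return balance(node.color, ins(node.left), node.key, node.right)
        if key > node.key:
            return balance(node.color, node.left, node.key, ins(node.right))
        return node  # key already present: reuse the existing subtree
    r = ins(root)
    return Node(BLACK, r.left, r.key, r.right)  # the root is always black


def to_list(node):
    """In-order traversal, used here only to inspect a version."""
    if node is None:
        return []
    return to_list(node.left) + [node.key] + to_list(node.right)


# Keep every version: versions[i] is the tree after the first i inserts.
versions = [None]
for k in [5, 3, 8, 1, 4]:
    versions.append(insert(versions[-1], k))
```

Only the nodes on the path from the root to the insertion point are copied; untouched subtrees are shared between versions. That sharing is what would keep the cost of retaining old versions proportional to the tree height per write rather than to the tree size.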

  • Hadoop: Processing large serialized objects

    - by restrictedinfinity
    I am developing an application to process (and merge) several large Java serialized objects (on the order of gigabytes each) using the Hadoop framework. Hadoop distributes the blocks of a file across different hosts. But since deserialization requires all the blocks to be present on a single host, it is going to hurt performance drastically. How can I deal with this situation, where the blocks cannot be processed individually, unlike text files?

  • calendar.getInstance() or calendar.clone()

    - by Pangea
    I need to make a copy of a given date hundreds of times (I cannot pass by reference). I am wondering which of the two options below is better: newTime=Calendar.getInstance().setTime(originalDate); or newTime=originalDate.clone(); Performance is the main concern here. Thanks.

  • Is there a good .Net CSS aggregator that combines style sheets and minifies them?

    - by vfilby
    I am looking to see if there is an open source/free project that provides a CSS manager. I am looking for this mainly for performance tweaking, and I am hoping there is a ready-made project rather than building one from scratch. Features I am looking for include:

      - Combines multiple .css files into a single .css file
      - Optionally minifies the resulting .css file
      - Works well with .Net (a user control, custom handler, etc.)

    Is there a project out there that handles this?

  • Memcache vs MySQL in memory

    - by TimK
    I have a database that won't grow much in size. Its current size is about 1 GB. Achieving the fastest performance is desired. Question: When should I use Memcache versus simply using MySQL InnoDB's ability to store all my content in RAM (innodb_buffer_pool_size)?

  • If-else-if versus map

    - by perezvon
    Hi, suppose I have an if/else-if chain like this:

        if( x.GetId() == 1 ) { }
        else if( x.GetId() == 2 ) { }
        // ... 50 more else if statements

    What I wonder is: if I keep a map instead, will it be any better in terms of performance? (Assuming the keys are integers.)
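To make the trade-off concrete, here is a minimal sketch in Python (the question's snippet looks like C++, so this only illustrates the idea, not that language's costs). The chain is modelled as a linear scan over `(id, handler)` pairs, which performs the same sequence of comparisons as an if/else-if chain; the map version does a single hash lookup. All names (`CASES`, `TABLE`, `dispatch_chain`, `dispatch_map`, `compare`) are hypothetical.

```python
import timeit

N_CASES = 52  # roughly the number of branches described in the question


def make_handler(i):
    # Each branch body is a stand-in; a real handler would do actual work.
    return lambda: f"handled {i}"


CASES = [(i, make_handler(i)) for i in range(1, N_CASES + 1)]
TABLE = dict(CASES)


def dispatch_chain(x_id):
    """Compare against each id in order, like if / else if / else if ..."""
    for case_id, handler in CASES:
        if x_id == case_id:
            return handler()
    return None


def dispatch_map(x_id):
    """Look the handler up by key in one step."""
    handler = TABLE.get(x_id)
    return handler() if handler is not None else None


def compare(x_id, repeats=100_000):
    """Return (chain_seconds, map_seconds) for dispatching x_id repeatedly."""
    chain = timeit.timeit(lambda: dispatch_chain(x_id), number=repeats)
    table = timeit.timeit(lambda: dispatch_map(x_id), number=repeats)
    return chain, table
```

Calling `compare(N_CASES)` times the worst case for the chain (the last branch). In a compiled language the picture can differ: a `switch` over small integers may be turned into a jump table, and a short chain with the common cases first can beat a map lookup, so it is worth measuring with realistic ids before restructuring.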

  • How fast should an interpreted language be today?

    - by Tarbal
    Is the speed of the (main/only viable) implementation of an interpreted programming language a criterion today? What would be the optimal balance between speed and abstraction? Should scripting languages completely ignore all thoughts about performance and just follow the concepts of rapid development, readability, etc.? I'm asking this because I'm currently designing some experimental languages and interpreters.

  • Is Core Animation causing my subviews to call -drawRect for every single frame?

    - by mystify
    I made a nice UIView subclass which paints all its stuff in -drawRect:, because people said that's good. That view is a subview of another view, and that other view is being animated with Core Animation: it's scaled down, rotated and moved. However, I encountered this: -drawRect: seems to get called trillions of times during the animation, and performance sucks. Is that normal, or did I probably do something wrong?

  • SQL Timestamp Function

    - by harrison
    Is there any difference between these two queries?

        select * from tbl where ts < '9999-12-31-24.00.00.000000';

    and

        select * from tbl where ts < timestamp('9999-12-31-24.00.00.000000');

    When is the timestamp function required? Is there a difference in performance?

  • Postgresql count+sort performance

    - by invictus
    I have built a small inventory system using PostgreSQL and psycopg2. Everything works great, except that when I want to create aggregated summaries/reports of the content, I get really bad performance due to count()'ing and sorting. The DB schema is as follows:

        CREATE TABLE hosts (
            id SERIAL PRIMARY KEY,
            name VARCHAR(255)
        );
        CREATE TABLE items (
            id SERIAL PRIMARY KEY,
            description TEXT
        );
        CREATE TABLE host_item (
            id SERIAL PRIMARY KEY,
            host INTEGER REFERENCES hosts(id) ON DELETE CASCADE ON UPDATE CASCADE,
            item INTEGER REFERENCES items(id) ON DELETE CASCADE ON UPDATE CASCADE
        );

    There are some other fields as well, but those are not relevant. I want to extract two different reports:

      - A list of all hosts with the number of items per host, ordered from highest to lowest count
      - A list of all items with the number of hosts per item, ordered from highest to lowest count

    I have used two queries for the purpose.

    Items with host count:

        SELECT i.id, i.description, COUNT(hi.id) AS count
        FROM items AS i
        LEFT JOIN host_item AS hi ON (i.id=hi.item)
        GROUP BY i.id
        ORDER BY count DESC
        LIMIT 10;

    Hosts with item count:

        SELECT h.id, h.name, COUNT(hi.id) AS count
        FROM hosts AS h
        LEFT JOIN host_item AS hi ON (h.id=hi.host)
        GROUP BY h.id
        ORDER BY count DESC
        LIMIT 10;

    The problem is that the queries run for 5-6 seconds before returning any data. As this is a web-based application, 6 seconds is just not acceptable. The database is populated with approximately 50k hosts, 1000 items and 400 000 host/item relations, and these numbers will likely increase significantly when (or perhaps if) the application is used.

    After playing around, I found that by removing the "ORDER BY count DESC" part, both queries execute instantly without any delay whatsoever (less than 20 ms to finish the queries). Is there any way I can optimize these queries so that I can get the result sorted without the delay? I tried different indexes, but seeing as the count is computed, it is not possible to utilize an index for this. I have read that count()'ing in PostgreSQL is slow, but it's the sorting that is causing me problems.

    My current workaround is to run the queries above as an hourly job, putting the result into a new table with an index on the count column for quick lookup. I use PostgreSQL 9.2.
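One rewrite that is commonly tried for this shape of query is to aggregate and limit on host_item alone and only then join the ten surviving rows to items. The sketch below shows the query form on a tiny in-memory SQLite database, used here purely as a stand-in so the example is runnable; it is an assumption that the same shape helps on PostgreSQL 9.2, and the actual plan should be checked with EXPLAIN ANALYZE. The table contents and the names `build_demo_db` and `top_items` are made up for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE hosts (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE items (id INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE host_item (
    id INTEGER PRIMARY KEY,
    host INTEGER REFERENCES hosts(id),
    item INTEGER REFERENCES items(id)
);
"""

# Count, sort and limit on the narrow link table first, then join.
TOP_ITEMS = """
SELECT i.id, i.description, c.host_count
FROM (
    SELECT item, COUNT(*) AS host_count
    FROM host_item
    GROUP BY item
    ORDER BY host_count DESC, item
    LIMIT 10
) AS c
JOIN items AS i ON i.id = c.item
ORDER BY c.host_count DESC, i.id;
"""


def build_demo_db():
    """Create a small demo database: item 1 on 3 hosts, item 2 on 1, item 3 on none."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO hosts (id, name) VALUES (?, ?)",
        [(h, f"host{h}") for h in range(1, 5)],
    )
    conn.executemany(
        "INSERT INTO items (id, description) VALUES (?, ?)",
        [(1, "cpu"), (2, "disk"), (3, "gpu")],
    )
    conn.executemany(
        "INSERT INTO host_item (host, item) VALUES (?, ?)",
        [(1, 1), (2, 1), (3, 1), (4, 2)],
    )
    return conn


def top_items(conn):
    return conn.execute(TOP_ITEMS).fetchall()


conn = build_demo_db()
rows = top_items(conn)
```

Note one behavioural difference from the original LEFT JOIN: items with no hosts at all (item 3 in the demo data) do not appear, which only matters for a top-10 report if fewer than ten items have any hosts. The hosts report can be rewritten the same way by grouping on `host` instead of `item`.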

  • JavaScript objects - private methods: which way is better?

    - by Praveen Prasad
        (function () {
            function User() {
                // some properties
            }

            // private function, type 1
            User.prototype._aPrivateFn = function () {
                // private function defined just like a public function;
                // by convention, an underscore character is added
            };

            // private function, type 2: a closure
            function _anotherPrivateFunction() {
                // do something
            }

            // public function
            User.prototype.APublicFunction = function () {
                // call private fn 1
                this._aPrivateFn();
                // call private fn 2
                _anotherPrivateFunction();
            };

            window.UserX = User;
        })();

    Which of the two ways of defining private methods of a JavaScript object is the better way, especially in terms of memory management and performance?

  • What is the fastest findByName query with hibernate?

    - by Karussell
    I am sure I can improve the performance of the following Hibernate findByName query:

        public List<User> findByName(String name) {
            return session.createCriteria(User.class).add(Restrictions.eq("name", name)).list();
        }

    In which way should I improve it, or even more important: which improvements should I make first? I will need the full object with all the collections (lazy or not) and dependencies of this class.

  • Is using joins in select clause slow in Oracle?

    - by gniquil
    I would like to write a query like the following:

        select username,
               (select state from addresses where addresses.username = users.username) email
        from users

    This works in Oracle (assuming the inner query returns a unique result). However, is there a performance penalty associated with this style of query?
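For comparison, the scalar subquery in the select list can usually be expressed as an outer join, which gives two forms whose execution plans can be compared side by side. The sketch below checks on an in-memory SQLite database (a stand-in only; it says nothing about how Oracle's optimizer treats either form) that both return the same rows when each user has at most one address. The sample rows, the `state` alias and the function names are assumptions for the example.

```python
import sqlite3

SCALAR_SUBQUERY = """
SELECT username,
       (SELECT state FROM addresses
        WHERE addresses.username = users.username) AS state
FROM users
ORDER BY username;
"""

LEFT_JOIN = """
SELECT u.username, a.state
FROM users AS u
LEFT JOIN addresses AS a ON a.username = u.username
ORDER BY u.username;
"""


def build_demo_db():
    """Two users with one address each, and one user with no address."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (username TEXT PRIMARY KEY);
        CREATE TABLE addresses (username TEXT UNIQUE, state TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO users (username) VALUES (?)",
        [("alice",), ("bob",), ("carol",)],
    )
    conn.executemany(
        "INSERT INTO addresses (username, state) VALUES (?, ?)",
        [("alice", "CA"), ("bob", "NY")],
    )
    return conn


def run_both(conn):
    """Return the result sets of the subquery form and the join form."""
    return (
        conn.execute(SCALAR_SUBQUERY).fetchall(),
        conn.execute(LEFT_JOIN).fetchall(),
    )


subquery_rows, join_rows = run_both(build_demo_db())
```

The two forms stop being equivalent if a user can have several addresses: the join then returns one row per address, while a scalar subquery is expected to yield a single value per user.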

  • Can MySQL reasonably perform queries on billions of rows?

    - by haxney
    I am planning on storing scans from a mass spectrometer in a MySQL database and would like to know whether storing and analyzing this amount of data is remotely feasible. I know performance varies wildly depending on the environment, but I'm looking for the rough order of magnitude: will queries take 5 days or 5 milliseconds?

    Input format

    Each input file contains a single run of the spectrometer; each run is comprised of a set of scans, and each scan has an ordered array of datapoints. There is a bit of metadata, but the majority of the file is comprised of arrays of 32- or 64-bit ints or floats.

    Host system

        |----------------+-------------------------------|
        | OS             | Windows 2008 64-bit           |
        | MySQL version  | 5.5.24 (x86_64)               |
        | CPU            | 2x Xeon E5420 (8 cores total) |
        | RAM            | 8GB                           |
        | SSD filesystem | 500 GiB                       |
        | HDD RAID       | 12 TiB                        |
        |----------------+-------------------------------|

    There are some other services running on the server using negligible processor time.

    File statistics

        |------------------+--------------|
        | number of files  | ~16,000      |
        | total size       | 1.3 TiB      |
        | min size         | 0 bytes      |
        | max size         | 12 GiB       |
        | mean             | 800 MiB      |
        | median           | 500 MiB      |
        | total datapoints | ~200 billion |
        |------------------+--------------|

    The total number of datapoints is a very rough estimate.

    Proposed schema

    I'm planning on doing things "right" (i.e. normalizing the data like crazy) and so would have a runs table, a spectra table with a foreign key to runs, and a datapoints table with a foreign key to spectra.

    The 200 billion datapoint question

    I am going to be analyzing across multiple spectra and possibly even multiple runs, resulting in queries which could touch millions of rows. Assuming I index everything properly (which is a topic for another question) and am not trying to shuffle hundreds of MiB across the network, is it remotely plausible for MySQL to handle this?

    UPDATE: additional info

    The scan data will be coming from files in the XML-based mzML format. The meat of this format is in the <binaryDataArrayList> elements where the data is stored. Each scan produces >= 2 <binaryDataArray> elements which, taken together, form a 2-dimensional (or more) array of the form [[123.456, 234.567, ...], ...]. These data are write-once, so update performance and transaction safety are not concerns. My naïve plan for a database schema is:

    runs table

        | column name | type        |
        |-------------+-------------|
        | id          | PRIMARY KEY |
        | start_time  | TIMESTAMP   |
        | name        | VARCHAR     |
        |-------------+-------------|

    spectra table

        | column name    | type        |
        |----------------+-------------|
        | id             | PRIMARY KEY |
        | name           | VARCHAR     |
        | index          | INT         |
        | spectrum_type  | INT         |
        | representation | INT         |
        | run_id         | FOREIGN KEY |
        |----------------+-------------|

    datapoints table

        | column name | type        |
        |-------------+-------------|
        | id          | PRIMARY KEY |
        | spectrum_id | FOREIGN KEY |
        | mz          | DOUBLE      |
        | num_counts  | DOUBLE      |
        | index       | INT         |
        |-------------+-------------|

    Is this reasonable?
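A quick back-of-the-envelope estimate helps frame the question. The sketch below multiplies the estimated ~200 billion datapoints by the column widths of the proposed datapoints table. The integer types are assumptions, since the question only says PRIMARY KEY and FOREIGN KEY: `id` is taken as BIGINT (8 bytes) because 200 billion exceeds the range of a 4-byte INT, and `spectrum_id` is taken as a 4-byte INT. Per-row storage engine overhead and all indexes are deliberately left out, so the real footprint would be larger.

```python
# Estimated row count, taken from the question's file statistics.
DATAPOINTS = 200_000_000_000

# Bytes per column for the proposed datapoints table.
# MySQL widths: BIGINT = 8, INT = 4, DOUBLE = 8.
COLUMN_BYTES = {
    "id": 8,           # assumed BIGINT: 200 billion does not fit in INT
    "spectrum_id": 4,  # assumed INT
    "mz": 8,           # DOUBLE
    "num_counts": 8,   # DOUBLE
    "index": 4,        # INT
}

ROW_BYTES = sum(COLUMN_BYTES.values())
RAW_BYTES = DATAPOINTS * ROW_BYTES
RAW_TIB = RAW_BYTES / 2**40

# Share of each row spent on keys and position rather than measurements.
OVERHEAD_FRACTION = 1 - (COLUMN_BYTES["mz"] + COLUMN_BYTES["num_counts"]) / ROW_BYTES

print(f"{ROW_BYTES} bytes/row, about {RAW_TIB:.1f} TiB of column data")
print(f"{OVERHEAD_FRACTION:.0%} of each row is id, spectrum_id and index")
```

Under these assumptions the column data alone comes to several TiB, a multiple of the 1.3 TiB of source files, and it would not fit on the 500 GiB SSD. Half of every row is spent on keys and position rather than on the two measured values, which is one argument for storing each spectrum's arrays as a single binary value instead of one row per datapoint.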

  • How to handle very frequent updates to a Lucene index

    - by fsm
    I am trying to prototype an indexing/search application that uses very volatile indexing data sources (forums, social networks, etc.). Here are some of the performance requirements:

      1. Very fast turnaround time: any new data (such as a new message on a forum) should be available in the search results very soon (in less than a minute).
      2. I need to discard old documents on a fairly regular basis to ensure that the search results are not dated.
      3. Last but not least, the search application needs to be responsive (latency on the order of 100 milliseconds, and it should support at least 10 qps).

    All of the requirements I currently have can be met without using Lucene (and that would let me satisfy all of 1, 2 and 3), but I am anticipating other requirements in the future (like search relevance) which Lucene makes easier to implement. However, since Lucene is designed for use cases far more complex than the one I'm currently working on, I'm having a hard time satisfying my performance requirements. Here are some questions:

      a. I read that the optimize() method in the IndexWriter class is expensive and should not be used by applications that do frequent updates. What are the alternatives?
      b. In order to do incremental updates, I need to keep committing new data, and also keep refreshing the index reader to make sure it has the new data available. These are going to affect 1 and 3 above. Should I try duplicate indices? What are some common approaches to solving this problem?
      c. I know that Lucene provides a delete method which lets you delete all documents that match a certain query. In my case, I need to delete all documents older than a certain age. One option is to add a date field to every document and use that to delete documents later. Is it possible to do range queries on document IDs (I can create my own ID field, since I think the one created by Lucene keeps changing) to delete documents? Is it any faster than comparing dates represented as strings?

    I know these are very open questions, so I am not looking for a detailed answer. I will treat all of your answers as suggestions and use them to inform my design. Thanks! Please let me know if you need any other information.

  • Is jdbc or ldap faster for basic read operations?

    - by Brandon
    I have a set of user data which I am trying to access. Due to the way our company's employee data is set up, the information is available both through LDAP and through a table in our DB. I was curious: for standard read operations, which would generally be the higher-performance query?

  • Resources for Performance testing

    - by munna
    Our small company has been entrusted with creating an ASP.NET application with a client-server model. As we are almost done with development, we are putting together a small team for performance testing. I have searched the web on the topic, but without finding much help. If any of you can share the 'how, what and why' of performance testing, it would be a great help.
