Search Results

Search found 23103 results on 925 pages for 'performance issues and ha'.

Postgresql count+sort performance

- by invictus

I have built a small inventory system using PostgreSQL and psycopg2. Everything works great, except that when I create aggregated summaries/reports of the content, I get really bad performance due to count()'ing and sorting.

The DB schema is as follows:

    CREATE TABLE hosts (
        id SERIAL PRIMARY KEY,
        name VARCHAR(255)
    );

    CREATE TABLE items (
        id SERIAL PRIMARY KEY,
        description TEXT
    );

    CREATE TABLE host_item (
        id SERIAL PRIMARY KEY,
        host INTEGER REFERENCES hosts(id) ON DELETE CASCADE ON UPDATE CASCADE,
        item INTEGER REFERENCES items(id) ON DELETE CASCADE ON UPDATE CASCADE
    );

There are some other fields as well, but those are not relevant.

I want to extract 2 different reports:

1. A list of all hosts with the number of items per host, ordered from highest to lowest count
2. A list of all items with the number of hosts per item, ordered from highest to lowest count

I have used 2 queries for the purpose.

Items with host count:

    SELECT i.id, i.description, COUNT(hi.id) AS count
    FROM items AS i
    LEFT JOIN host_item AS hi ON (i.id=hi.item)
    GROUP BY i.id
    ORDER BY count DESC
    LIMIT 10;

Hosts with item count:

    SELECT h.id, h.name, COUNT(hi.id) AS count
    FROM hosts AS h
    LEFT JOIN host_item AS hi ON (h.id=hi.host)
    GROUP BY h.id
    ORDER BY count DESC
    LIMIT 10;

The problem is that the queries run for 5-6 seconds before returning any data. As this is a web-based application, 6 seconds is just not acceptable. The database is heavily populated, with approximately 50k hosts, 1000 items and 400 000 host/item relations, and will likely grow significantly when (or perhaps if) the application is used.

After playing around, I found that by removing the "ORDER BY count DESC" part, both queries execute almost instantly (less than 20ms to finish).

Is there any way I can optimize these queries so that I get the result sorted without the delay? I tried different indexes, but since the count is computed, it is not possible to use an index for this. I have read that count()'ing in PostgreSQL is slow, but it's the sorting that is causing me problems.

My current workaround is to run the queries above as an hourly job, putting the result into a new table with an index on the count column for quick lookup.

I use PostgreSQL 9.2.

javascript object's - private methods: which way is better.

- by Praveen Prasad

    (function () {
        function User() {
            // some properties
        }

        // private fn 1
        User.prototype._aPrivateFn = function () {
            // private function defined just like a public function;
            // by convention an underscore character is added
        };

        // private function type 2
        // a closure
        function _anotherPrivateFunction() {
            // do something
        }

        // public function
        User.prototype.APublicFunction = function () {
            // call private fn1
            this._aPrivateFn();
            // call private fn2
            _anotherPrivateFunction();
        };

        window.UserX = User;
    })();

Which of the two ways of defining private methods of a JavaScript object is better, especially in terms of memory management and performance?

What is the fastest findByName query with hibernate?

- by Karussell

I am sure I can improve the performance of the following Hibernate findByName query:

    public List<User> findByName(String name) {
        return session.createCriteria(User.class)
                .add(Restrictions.eq("name", name))
                .list();
    }

In which way should I improve it or, even more important, which improvements should I make first? I will need the full object with all the collections (lazy or not) and dependencies of this class.

.NET: Which strategy is better for populating a gridview? using a Data Table and bind it or using a

- by odiseh

Hi. In ASP.NET, which strategy is better for populating a GridView: filling a DataTable and binding it, or using a data reader? Which one performs better?

Is using joins in select clause slow in Oracle?

- by gniquil

I would like to write a query like the following:

    select username,
           (select state from addresses where addresses.username = users.username) email
    from users

This works in Oracle (assuming the result from the inner query is unique). However, is there a performance penalty associated with this style of query?

Is jdbc or ldap faster for basic read operations?

- by Brandon

I have a set of user data which I am trying to access. Due to the way our company's employee data is set up, the information is available both through LDAP and through a table in our DB. For standard read operations, which would generally be the higher-performance query?

Can MySQL reasonably perform queries on billions of rows?

- by haxney

I am planning on storing scans from a mass spectrometer in a MySQL database and would like to know whether storing and analyzing this amount of data is remotely feasible. I know performance varies wildly depending on the environment, but I'm looking for the rough order of magnitude: will queries take 5 days or 5 milliseconds?

Input format

Each input file contains a single run of the spectrometer; each run is comprised of a set of scans, and each scan has an ordered array of datapoints. There is a bit of metadata, but the majority of the file is comprised of arrays of 32- or 64-bit ints or floats.

Host system

|----------------+-------------------------------|
| OS             | Windows 2008 64-bit           |
| MySQL version  | 5.5.24 (x86_64)               |
| CPU            | 2x Xeon E5420 (8 cores total) |
| RAM            | 8GB                           |
| SSD filesystem | 500 GiB                       |
| HDD RAID       | 12 TiB                        |
|----------------+-------------------------------|

There are some other services running on the server using negligible processor time.

File statistics

|------------------+--------------|
| number of files  | ~16,000      |
| total size       | 1.3 TiB      |
| min size         | 0 bytes      |
| max size         | 12 GiB       |
| mean             | 800 MiB      |
| median           | 500 MiB      |
| total datapoints | ~200 billion |
|------------------+--------------|

The total number of datapoints is a very rough estimate.

Proposed schema

I'm planning on doing things "right" (i.e. normalizing the data like crazy) and so would have a runs table, a spectra table with a foreign key to runs, and a datapoints table with a foreign key to spectra.

The 200 Billion datapoint question

I am going to be analyzing across multiple spectra and possibly even multiple runs, resulting in queries which could touch millions of rows. Assuming I index everything properly (which is a topic for another question) and am not trying to shuffle hundreds of MiB across the network, is it remotely plausible for MySQL to handle this?

UPDATE: additional info

The scan data will be coming from files in the XML-based mzML format. The meat of this format is in the <binaryDataArrayList> elements where the data is stored. Each scan produces >= 2 <binaryDataArray> elements which, taken together, form a 2-dimensional (or more) array of the form [[123.456, 234.567, ...], ...].

These data are write-once, so update performance and transaction safety are not concerns.

My naïve plan for a database schema is:

runs table

| column name | type        |
|-------------+-------------|
| id          | PRIMARY KEY |
| start_time  | TIMESTAMP   |
| name        | VARCHAR     |
|-------------+-------------|

spectra table

| column name    | type        |
|----------------+-------------|
| id             | PRIMARY KEY |
| name           | VARCHAR     |
| index          | INT         |
| spectrum_type  | INT         |
| representation | INT         |
| run_id         | FOREIGN KEY |
|----------------+-------------|

datapoints table

| column name | type        |
|-------------+-------------|
| id          | PRIMARY KEY |
| spectrum_id | FOREIGN KEY |
| mz          | DOUBLE      |
| num_counts  | DOUBLE      |
| index       | INT         |
|-------------+-------------|

Is this reasonable?
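
A rough size estimate for the proposed datapoints table can be sketched as follows. This is only back-of-the-envelope arithmetic, not a measurement: the function and constant names are hypothetical, and it assumes both key columns are BIGINT (a 32-bit INT tops out near 4.29 billion, well below the ~200 billion rows estimated above). It counts raw column bytes only and ignores per-row storage overhead and indexes, so the real footprint would be larger.

```cpp
#include <cstdint>

// Column widths in bytes for MySQL numeric types.
constexpr std::uint64_t kBigIntBytes = 8;  // assumed key type (see note above)
constexpr std::uint64_t kDoubleBytes = 8;
constexpr std::uint64_t kIntBytes = 4;

// Raw column payload of one row of the proposed datapoints table.
constexpr std::uint64_t datapointRowBytes() {
    return kBigIntBytes   // id
         + kBigIntBytes   // spectrum_id (assumed BIGINT)
         + kDoubleBytes   // mz
         + kDoubleBytes   // num_counts
         + kIntBytes;     // index
}

// Raw payload of the whole table, excluding row overhead and indexes.
constexpr std::uint64_t rawTableBytes(std::uint64_t rows) {
    return rows * datapointRowBytes();
}
```

Under these assumptions each row carries 36 bytes of column data, so rawTableBytes(200000000000ULL) comes to 7.2 × 10^12 bytes, roughly 6.5 TiB. That is several times the 1.3 TiB of source files, far more than the 500 GiB SSD, and more than half of the 12 TiB array before a single index is built.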

How to handle very frequent updates to a Lucene index

- by fsm

I am trying to prototype an indexing/search application which uses very volatile indexing data sources (forums, social networks etc). Here are some of the performance requirements:

1. Very fast turn-around time (by this I mean that any new data (such as a new message on a forum) should be available in the search results very soon (less than a minute)).
2. I need to discard old documents on a fairly regular basis to ensure that the search results are not dated.
3. Last but not least, the search application needs to be responsive (latency on the order of 100 milliseconds, and should support at least 10 qps).

All of the requirements I have currently can be met w/o using Lucene (and that would let me satisfy all of 1, 2 and 3), but I am anticipating other requirements in the future (like search relevance etc) which Lucene makes easier to implement. However, since Lucene is designed for use cases far more complex than the one I'm currently working on, I'm having a hard time satisfying my performance requirements.

Here are some questions:

a. I read that the optimize() method in the IndexWriter class is expensive, and should not be used by applications that do frequent updates. What are the alternatives?

b. In order to do incremental updates, I need to keep committing new data, and also keep refreshing the index reader to make sure it has the new data available. These are going to affect 1 and 3 above. Should I try duplicate indices? What are some common approaches to solving this problem?

c. I know that Lucene provides a delete method, which lets you delete all documents that match a certain query. In my case, I need to delete all documents which are older than a certain age. One option is to add a date field to every document and use that to delete documents later. Is it possible to do range queries on document ids (I can create my own id field since I think that the one created by Lucene keeps changing) to delete documents? Is it any faster than comparing dates represented as strings?

I know these are very open questions, so I am not looking for a detailed answer; I will try to treat all of your answers as suggestions and use them to inform my design. Thanks! Please let me know if you need any other information.

Resources for Performance testing

- by munna

Our small company has been entrusted with creating an ASP.NET application with a client-server model. As we are almost done with development, we are putting together a small team for performance testing. I have searched the web on the topic, but without finding much help. If any of you can share the 'How, What and Why' of performance testing, it would be a great help.

Why should I use SQL Server's BETWEEN ... AND syntax?

- by Jeff Meatball Yang

These two statements are logically equivalent:

    select * from Table where someColumn between 1 and 100
    select * from Table where someColumn >= 1 and someColumn <= 100

Is there a potential performance benefit to one versus the other?

What native C++ profiling tool do you suggest?

- by glutz78

Can anyone suggest a performance analysis tool that runs on win32 on a native c++ app? How about one that runs on Windows Mobile? Thank you.

microsoft access query speed...

- by V.S.

Hello everyone! I am now writing a report about MS Access and I can't find any information about its performance in comparison to alternatives such as Microsoft SQL Server, MySQL, Oracle, etc. It seems obvious that MS Access is going to be the slowest of these, but there are no solid documents confirming this other than forum threads, and I don't have the time and resources to do the research myself :( Hoping for your help, V.S.

C# - Fast and simple multi dimensional data structures?

- by Jeremy Rudd

I need to store multi-dimensional data consisting of numbers in a manner that's easy to work with. I'm capturing data in real time, and once it is processed I would destroy and GC older data. This data structure must be fast so it won't hurt my overall app performance. The faster the better. What are my choices in terms of platform-supported data structures? I'm using VS 2010 and .NET 4.

Is mono fast enough for Mac OS X?

- by prosseek

I have to use .NET/C# for the next company project. As I develop on a Mac, I looked into Mono as a development environment/tool. Is Mono for Mac OS X fast enough? I mean, how does the performance of running an assembly compare to running the same code on .NET on a Windows machine? In practical terms, do I have to buy a PC laptop to develop C#/.NET?

What are the benefits of using properties internally?

- by cyclotis04

Encapsulation is obviously helpful and essential when accessing members from outside the class, but when referring to class variables internally, is it better to call their private members, or use their getters? If your getter simply returns the variable, is there any performance difference?

(Why) does Tomcat/Java perform better on Linux than on Windows?

- by ripper234

I just read this (one) study in which Tomcat under Linux outperformed Windows. From your experience, is this generally true? Any deep reason that could explain the performance difference?

web drop-down combo box with large list of records

- by AlejandroR

The number of records displayed in drop-down combo boxes affects the performance of internet applications. What are the current best practices for solving this problem? Are paginated drop-downs the only solution? What is considered a large list? 100 or 1000?

Mod_rewrite on all website images

- by Esteve Camps

I'm designing an image repository. I want to decouple the filename from the image HTML link. For instance, the image in the filesystem is called images/items/12543.jpg, while the HTML is <img src="images/car.jpg" />. Would anyone strongly discourage me from rewriting all image requests using PHP, so that when images/car.jpg is requested, Apache actually serves the content of images/items/12543.jpg? I don't know whether I might run into performance problems.

What's the fastest lookup algorithm for a key, pair data structure (i.e, a map)?

- by truncheon

In the following example a std::map and a std::vector are each filled with 26 entries, with keys a to z and values 0 to 25. The time taken (on my system) to look up the last entry (10000000 times) is roughly 250 ms for the vector, and 125 ms for the map. (I compiled using release mode, with the O3 option turned on for g++ 4.4.)

But if for some odd reason I wanted better performance than the std::map, what data structures and functions would I need to consider using? I apologize if the answer seems obvious to you, but I haven't had much experience in the performance-critical aspects of C++ programming.

    #include <ctime>
    #include <map>
    #include <vector>
    #include <iostream>

    struct mystruct {
        char key;
        int value;
        mystruct(char k = 0, int v = 0) : key(k), value(v) { }
    };

    // Linear scan: returns the value stored for key, or -1 if it is absent.
    int find(const std::vector<mystruct>& ref, char key) {
        for (std::vector<mystruct>::const_iterator i = ref.begin(); i != ref.end(); ++i)
            if (i->key == key) return i->value;
        return -1;
    }

    int main() {
        std::map<char, int> mymap;
        std::vector<mystruct> myvec;
        for (int i = 'a'; i < 'a' + 26; ++i) {
            mymap[i] = i - 'a';
            myvec.push_back(mystruct(i, i - 'a'));
        }

        // clock() counts ticks, so convert to milliseconds before printing.
        std::clock_t pre = std::clock();
        for (int i = 0; i < 10000000; ++i) {
            find(myvec, 'z');
        }
        std::cout << "linear scan: milli " << (std::clock() - pre) * 1000 / CLOCKS_PER_SEC << "\n";

        pre = std::clock();
        for (int i = 0; i < 10000000; ++i) {
            mymap['z'];
        }
        std::cout << "map scan: milli " << (std::clock() - pre) * 1000 / CLOCKS_PER_SEC << "\n";
        return 0;
    }
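
One candidate the benchmark's setup allows, sketched here under the assumption that keys are always single char values as in the question, is a table indexed directly by the key's byte value. CharTable and its member names are hypothetical, chosen for illustration. The lookup is a single array access with no scan and no tree traversal, but whether that is measurably faster on a given compiler and machine would still need to be confirmed by timing it.

```cpp
#include <array>
#include <climits>
#include <cstddef>

// Sentinel for "key not present", matching the -1 returned by the question's find().
constexpr int kMissing = -1;

// Lookup table with one slot per possible char value.
class CharTable {
public:
    CharTable() { values_.fill(kMissing); }

    void set(char key, int value) { values_[index(key)] = value; }

    // Returns the stored value, or kMissing if the key was never set.
    int get(char key) const { return values_[index(key)]; }

private:
    // Map the (possibly signed) char to a non-negative array index.
    static std::size_t index(char key) {
        return static_cast<unsigned char>(key);
    }

    std::array<int, UCHAR_MAX + 1> values_;
};
```

Filling it mirrors the loop in the question: call set(c, c - 'a') for each c from 'a' to 'z', then get('z') in the timed loop. The trade-off is that this only works because the key space is tiny; for larger or sparse key types a hash-based container would be the more usual thing to try.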

Maximum capabilities of MySQL

- by cdated

How do I know when a project is just too big for MySQL and I should use something with a better reputation for scalability? Is there a max database size for MySQL before degradation of performance occurs? What factors contribute to MySQL not being a viable option compared to a commercial DBMS like Oracle or SQL Server?

What would be better, (1 database + 4 tables) or (2 databases + 2 tables each) ?

- by griseldas

Hi there, I would like advice on which would be better with regard to performance: A) 1 database with 4 tables, or B) 2 databases (same server), each with 2 tables. The tables' size and usage are more or less similar, so the 2 tables in database 1 would have similar usage/size to the 2 tables in database 2. The tables could have 500,000+ records, and the 2 tables in each database are not related (no join queries etc. between them). Thanks in advance for your comments.

What does "performant" software actually mean?

- by Roddy

I see it used a lot, but haven't seen a definition that makes complete sense. Wiktionary says "characterized by an adequate or excellent level of performance or efficiency", which isn't much help. Initially I thought performant just meant "fast", but others seem to think it's also about stability, code quality, memory use/footprint, or some combination of all those. I think this is a "real" question - but if enough people reckon this is a subjective question, that's an answer in itself.

What is the Difference between onclick and href="javascript:function name" ?

- by Shyju

Is there any difference between 1 : <a href="javascript:MyFunction()">Link1</a> and 2 : <a href="#" onclick="MyFunction()">Link2</a> ? Would one affect the page performance by any means ?

Regex vs. string:find() for simple word boundary

- by user576267

Say I only need to find out whether a line read from a file contains a word from a finite set of words. One way of doing this is to use a regex like this:

    .*\y(good|better|best)\y.*

Another way of accomplishing this is using pseudo code like this:

    if ( (readLine.find("good") != string::npos) ||
         (readLine.find("better") != string::npos) ||
         (readLine.find("best") != string::npos) )
    {
        // line contains a word from a finite set of words.
    }

Which way will have better performance? (i.e. speed and CPU utilization)
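
Before comparing speed, it is worth noting that the two versions above do not test the same thing: the plain find() calls also match inside longer words such as "goodness", while the regex requires word boundaries. The sketch below shows one way to make them equivalent so a timing comparison is fair. All function names are hypothetical, and it assumes C++11 std::regex with its default ECMAScript grammar, where the word boundary is written \b (the \y in the question appears to be the spelling used by Tcl-style regex dialects).

```cpp
#include <cctype>
#include <cstddef>
#include <regex>
#include <string>

// True if c can be part of a word (letter, digit or underscore), like regex \w.
inline bool isWordChar(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
}

// True if word occurs in line with no word character directly before or after it.
inline bool containsWholeWord(const std::string& line, const std::string& word) {
    if (word.empty()) return false;
    for (std::size_t pos = line.find(word); pos != std::string::npos;
         pos = line.find(word, pos + 1)) {
        const std::size_t end = pos + word.size();
        const bool leftOk = (pos == 0) || !isWordChar(line[pos - 1]);
        const bool rightOk = (end == line.size()) || !isWordChar(line[end]);
        if (leftOk && rightOk) return true;
    }
    return false;
}

// find()-based check for the question's word set.
inline bool containsAnyWordFind(const std::string& line) {
    return containsWholeWord(line, "good") ||
           containsWholeWord(line, "better") ||
           containsWholeWord(line, "best");
}

// Regex-based check; the pattern is compiled once and reused across calls.
inline bool containsAnyWordRegex(const std::string& line) {
    static const std::regex pattern(R"(\b(good|better|best)\b)");
    return std::regex_search(line, pattern);
}
```

Using regex_search also removes the need for the leading and trailing .* in the original pattern. With both functions returning the same answers, each can be called in a loop over representative input lines and timed; which one wins is likely to depend on the standard library implementation and the data, so it is better measured than assumed.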

Improving performance for WRITE operation on Oracle DB in Java

- by Lucky

I have a typical scenario and need to understand the best way to handle it. I'm developing a solution that retrieves data from a remote SOAP-based web service and then pushes this data to an Oracle database on the network. This will be a scheduled task that executes every 15 minutes.

The remote service has event queues that contain the INSERT/UPDATE/DELETE operations performed since the last retrieval. Once I retrieve the events for the last 15 minutes, it starts adding events again for the next retrieval.

Since the job just pushes data to Oracle, all my interactions are INSERT & UPDATE statements. There are around 60 tables on Oracle, some of them with 100+ columns. For every 15-minute cycle there would be around 60-70 inserts, 100+ updates and 10-20 deletes.

This will be an executable jar file that terminates after the operation and starts again on the next 15-minute cycle. So, I need to understand how I should handle WRITE operations (best practices) to improve performance for this application as a whole.

Current test code (on every cycle):

1. Connects to the remote service to get events.
2. Creates a connection with the DB (single connection object).
3. Identifies the type of operation (INSERT/UPDATE/DELETE) and the table on which it is done.
4. Calls the respective method based on the type of operation and table.
5. Uses a PreparedStatement with positional parameters, retrieves each column value from the remote service and assigns it to the statement parameters.
6. Commits the statement and returns to the event-retrieval class to process the next event.

The above is repeated until all the retrieved events are processed, after which the program closes, then starts on the next cycle, and everything repeats again.

Thanks for the help!
