performance counters - Page 30

OpenGL performance on rendering "virtual gallery" (textures)

- by maticus

I have a considerable (120-240) amount of 640x480 images that will be displayed as textured flat surfaces (4 vertex polygons) in a 3D environment. About 30-50% of them will be visible in a given frame. It is possible for them to crossover. Nothing else will be present in the environment. The question is - will the modern and/or few-years-old (lets say Radeon 9550) GPU cope with that, and what frame rate can I expect? I aim for 20FPS, but 30-40 would be nice. Would changing the resolution to 320x240 make it more probable to happen? I do not have any previous experience with performance issues of 3D graphics on modern GPUs, and unfortunately I must make a design choice. I don't want to waste time on doing something that couldn't have worked :-)

Read the article

Columnstore Case Study #1: MSIT SONAR Aggregations

- by aspiringgeek

Preamble This is the first in a series of posts documenting big wins encountered using columnstore indexes in SQL Server 2012 & 2014. Many of these can be found in this deck along with details such as internals, best practices, caveats, etc. The purpose of sharing the case studies in this context is to provide an easy-to-consume quick-reference alternative. Why Columnstore? If we’re looking for a subset of columns from one or a few rows, given the right indexes, SQL Server can do a superlative job of providing an answer. If we’re asking a question which by design needs to hit lots of rows—DW, reporting, aggregations, grouping, scans, etc., SQL Server has never had a good mechanism—until columnstore. Columnstore indexes were introduced in SQL Server 2012. However, they're still largely unknown. Some adoption blockers existed; yet columnstore was nonetheless a game changer for many apps. In SQL Server 2014, potential blockers have been largely removed & they're going to profoundly change the way we interact with our data. The purpose of this series is to share the performance benefits of columnstore & documenting columnstore is a compelling reason to upgrade to SQL Server 2014. App: MSIT SONAR Aggregations At MSIT, performance & configuration data is captured by SCOM. We archive much of the data in a partitioned data warehouse table in SQL Server 2012 for reporting via an application called SONAR. By definition, this is a primary use case for columnstore—report queries requiring aggregation over large numbers of rows. New data is refreshed each night by an automated table partitioning mechanism—a best practices scenario for columnstore. The Win Compared to performance using classic indexing which resulted in the expected query plan selection including partition elimination vs. SQL Server 2012 nonclustered columnstore, query performance increased significantly. Logical reads were reduced by over a factor of 50; both CPU & duration improved by factors of 20 or more. Other than creating the columnstore index, no special modifications or tweaks to the app or databases schema were necessary to achieve the performance improvements. Existing nonclustered indexes were rendered superfluous & were deleted, thus mitigating maintenance challenges such as defragging as well as conserving disk capacity. Details The table provides the raw data & summarizes the performance deltas. Logical Reads (8K pages) CPU (ms) Durn (ms) Columnstore 160,323 20,360 9,786 Conventional Table & Indexes 9,053,423 549,608 193,903 ? x56 x27 x20 The charts provide additional perspective of this data. "Conventional vs. Columnstore Metrics" document the raw data. Note on this linear display the magnitude of the conventional index performance vs. columnstore. The “Metrics (?)” chart expresses these values as a ratio. Summary For DW, reports, & other BI workloads, columnstore often provides significant performance enhancements relative to conventional indexing. I have documented here, the first in a series of reports on columnstore implementations, results from an initial implementation at MSIT in which logical reads were reduced by over a factor of 50; both CPU & duration improved by factors of 20 or more. Subsequent features in this series document performance enhancements that are even more significant.

Read the article

Columnstore Case Study #1: MSIT SONAR Aggregations

- by aspiringgeek

Preamble This is the first in a series of posts documenting big wins encountered using columnstore indexes in SQL Server 2012 & 2014. Many of these can be found in this deck along with details such as internals, best practices, caveats, etc. The purpose of sharing the case studies in this context is to provide an easy-to-consume quick-reference alternative. Why Columnstore? If we’re looking for a subset of columns from one or a few rows, given the right indexes, SQL Server can do a superlative job of providing an answer. If we’re asking a question which by design needs to hit lots of rows—DW, reporting, aggregations, grouping, scans, etc., SQL Server has never had a good mechanism—until columnstore. Columnstore indexes were introduced in SQL Server 2012. However, they're still largely unknown. Some adoption blockers existed; yet columnstore was nonetheless a game changer for many apps. In SQL Server 2014, potential blockers have been largely removed & they're going to profoundly change the way we interact with our data. The purpose of this series is to share the performance benefits of columnstore & documenting columnstore is a compelling reason to upgrade to SQL Server 2014. App: MSIT SONAR Aggregations At MSIT, performance & configuration data is captured by SCOM. We archive much of the data in a partitioned data warehouse table in SQL Server 2012 for reporting via an application called SONAR. By definition, this is a primary use case for columnstore—report queries requiring aggregation over large numbers of rows. New data is refreshed each night by an automated table partitioning mechanism—a best practices scenario for columnstore. The Win Compared to performance using classic indexing which resulted in the expected query plan selection including partition elimination vs. SQL Server 2012 nonclustered columnstore, query performance increased significantly. Logical reads were reduced by over a factor of 50; both CPU & duration improved by factors of 20 or more. Other than creating the columnstore index, no special modifications or tweaks to the app or databases schema were necessary to achieve the performance improvements. Existing nonclustered indexes were rendered superfluous & were deleted, thus mitigating maintenance challenges such as defragging as well as conserving disk capacity. Details The table provides the raw data & summarizes the performance deltas. Logical Reads (8K pages) CPU (ms) Durn (ms) Columnstore 160,323 20,360 9,786 Conventional Table & Indexes 9,053,423 549,608 193,903 ? x56 x27 x20 The charts provide additional perspective of this data. "Conventional vs. Columnstore Metrics" document the raw data. Note on this linear display the magnitude of the conventional index performance vs. columnstore. The “Metrics (?)” chart expresses these values as a ratio. Summary For DW, reports, & other BI workloads, columnstore often provides significant performance enhancements relative to conventional indexing. I have documented here, the first in a series of reports on columnstore implementations, results from an initial implementation at MSIT in which logical reads were reduced by over a factor of 50; both CPU & duration improved by factors of 20 or more. Subsequent features in this series document performance enhancements that are even more significant.

Read the article

General monitoring for SQL Server Analysis Services using Performance Monitor

- by Testas

A recent customer engagement required a setup of a monitoring solution for SSAS, due to the time restrictions placed upon this, native Windows Performance Monitor (Perfmon) and SQL Server Profiler Monitoring Tools was used as using a third party tool would have meant the customer providing an additional monitoring server that was not available.I wanted to outline the performance monitoring counters that was used to monitor the system on which SSAS was running. Due to the slow query performance that was occurring during certain scenarios, perfmon was used to establish if any pressure was being placed on the Disk, CPU or Memory subsystem when concurrent connections access the same query, and Profiler to pinpoint how the query was being managed within SSAS, profiler I will leave for another blogThis guide is not designed to provide a definitive list of what should be used when monitoring SSAS, different situations may require the addition or removal of counters as presented by the situation. However I hope that it serves as a good basis for starting your monitoring of SSAS. I would also like to acknowledge Chris Webb’s awesome chapters from “Expert Cube Development” that also helped shape my monitoring strategy:http://cwebbbi.spaces.live.com/blog/cns!7B84B0F2C239489A!6657.entrySimulating ConnectionsTo simulate the additional connections to the SSAS server whilst monitoring, I used ascmd to simulate multiple connections to the typical and worse performing queries that were identified by the customer. A similar sript can be downloaded from codeplex at http://www.codeplex.com/SQLSrvAnalysisSrvcs. File name: ASCMD_StressTestingScripts.zip. Performance MonitorWithin performance monitor, a counter log was created that contained the list of counters below. The important point to note when running the counter log is that the RUN AS property within the counter log properties should be changed to an account that has rights to the SSAS instance when monitoring MSAS counters. Failure to do so means that the counter log runs under the system account, no errors or warning are given while running the counter log, and it is not until you need to view the MSAS counters that they will not be displayed if run under the default account that has no right to SSAS. If your connection simulation takes hours, this could prove quite frustrating if not done beforehand JThe counters used…… Object Counter Instance Justification System Processor Queue legnth N/A Indicates how many threads are waiting for execution against the processor. If this counter is consistently higher than around 5 when processor utilization approaches 100%, then this is a good indication that there is more work (active threads) available (ready for execution) than the machine's processors are able to handle. System Context Switches/sec N/A Measures how frequently the processor has to switch from user- to kernel-mode to handle a request from a thread running in user mode. The heavier the workload running on your machine, the higher this counter will generally be, but over long term the value of this counter should remain fairly constant. If this counter suddenly starts increasing however, it may be an indicating of a malfunctioning device, especially if the Processor\Interrupts/sec\(_Total) counter on your machine shows a similar unexplained increase Process % Processor Time sqlservr Definately should be used if Processor\% Processor Time\(_Total) is maxing at 100% to assess the effect of the SQL Server process on the processor Process % Processor Time msmdsrv Definately should be used if Processor\% Processor Time\(_Total) is maxing at 100% to assess the effect of the SQL Server process on the processor Process Working Set sqlservr If the Memory\Available bytes counter is decreaing this counter can be run to indicate if the process is consuming larger and larger amounts of RAM. Process(instance)\Working Set measures the size of the working set for each process, which indicates the number of allocated pages the process can address without generating a page fault. Process Working Set msmdsrv If the Memory\Available bytes counter is decreaing this counter can be run to indicate if the process is consuming larger and larger amounts of RAM. Process(instance)\Working Set measures the size of the working set for each process, which indicates the number of allocated pages the process can address without generating a page fault. Processor % Processor Time _Total and individual cores measures the total utilization of your processor by all running processes. If multi-proc then be mindful only an average is provided Processor % Privileged Time _Total To see how the OS is handling basic IO requests. If kernel mode utilization is high, your machine is likely underpowered as it's too busy handling basic OS housekeeping functions to be able to effectively run other applications. Processor % User Time _Total To see how the applications is interacting from a processor perspective, a high percentage utilisation determine that the server is dealing with too many apps and may require increasing thje hardware or scaling out Processor Interrupts/sec _Total The average rate, in incidents per second, at which the processor received and serviced hardware interrupts. Shoulr be consistant over time but a sudden unexplained increase could indicate a device malfunction which can be confirmed using the System\Context Switches/sec counter Memory Pages/sec N/A Indicates the rate at which pages are read from or written to disk to resolve hard page faults. This counter is a primary indicator of the kinds of faults that cause system-wide delays, this is the primary counter to watch for indication of possible insufficient RAM to meet your server's needs. A good idea here is to configure a perfmon alert that triggers when the number of pages per second exceeds 50 per paging disk on your system. May also want to see the configuration of the page file on the Server Memory Available Mbytes N/A is the amount of physical memory, in bytes, available to processes running on the computer. if this counter is greater than 10% of the actual RAM in your machine then you probably have more than enough RAM. monitor it regularly to see if any downward trend develops, and set an alert to trigger if it drops below 2% of the installed RAM. Physical Disk Disk Transfers/sec for each physical disk If it goes above 10 disk I/Os per second then you've got poor response time for your disk. Physical Disk Idle Time _total If Disk Transfers/sec is above 25 disk I/Os per second use this counter. which measures the percent time that your hard disk is idle during the measurement interval, and if you see this counter fall below 20% then you've likely got read/write requests queuing up for your disk which is unable to service these requests in a timely fashion. Physical Disk Disk queue legnth For the OLAP and SQL physical disk A value that is consistently less than 2 means that the disk system is handling the IO requests against the physical disk Network Interface Bytes Total/sec For the NIC Should be monitored over a period of time to see if there is anb increase/decrease in network utilisation Network Interface Current Bandwidth For the NIC is an estimate of the current bandwidth of the network interface in bits per second (BPS). MSAS 2005: Memory Memory Limit High KB N/A Shows (as a percentage) the high memory limit configured for SSAS in C:\Program Files\Microsoft SQL Server\MSAS10.MSSQLSERVER\OLAP\Config\msmdsrv.ini MSAS 2005: Memory Memory Limit Low KB N/A Shows (as a percentage) the low memory limit configured for SSAS in C:\Program Files\Microsoft SQL Server\MSAS10.MSSQLSERVER\OLAP\Config\msmdsrv.ini MSAS 2005: Memory Memory Usage KB N/A Displays the memory usage of the server process. MSAS 2005: Memory File Store KB N/A Displays the amount of memory that is reserved for the Cache. Note if total memory limit in the msmdsrv.ini is set to 0, no memory is reserved for the cache MSAS 2005: Storage Engine Query Queries from Cache Direct / sec N/A Displays the rate of queries answered from the cache directly MSAS 2005: Storage Engine Query Queries from Cache Filtered / Sec N/A Displays the Rate of queries answered by filtering existing cache entry. MSAS 2005: Storage Engine Query Queries from File / Sec N/A Displays the Rate of queries answered from files. MSAS 2005: Storage Engine Query Average time /query N/A Displays the average time of a query MSAS 2005: Connection Current connections N/A Displays the number of connections against the SSAS instance MSAS 2005: Connection Requests / sec N/A Displays the rate of query requests per second MSAS 2005: Locks Current Lock Waits N/A Displays thhe number of connections waiting on a lock MSAS 2005: Threads Query Pool job queue Length N/A The number of queries in the job queue MSAS 2005:Proc Aggregations Temp file bytes written/sec N/A Shows the number of bytes of data processed in a temporary file MSAS 2005:Proc Aggregations Temp file rows written/sec N/A Shows the number of bytes of data processed in a temporary file

Read the article

MySQL Cluster 7.2: Over 8x Higher Performance than Cluster 7.1

- by Mat Keep

0 0 1 893 5092 Homework 42 11 5974 14.0 Normal 0 false false false EN-US JA X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-parent:""; mso-padding-alt:0cm 5.4pt 0cm 5.4pt; mso-para-margin:0cm; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:12.0pt; font-family:Cambria; mso-ascii-font-family:Cambria; mso-ascii-theme-font:minor-latin; mso-hansi-font-family:Cambria; mso-hansi-theme-font:minor-latin; mso-ansi-language:EN-US;} Summary The scalability enhancements delivered by extensions to multi-threaded data nodes enables MySQL Cluster 7.2 to deliver over 8x higher performance than the previous MySQL Cluster 7.1 release on a recent benchmark What’s New in MySQL Cluster 7.2 MySQL Cluster 7.2 was released as GA (Generally Available) in February 2012, delivering many enhancements to performance on complex queries, new NoSQL Key / Value API, cross-data center replication and ease-of-use. These enhancements are summarized in the Figure below, and detailed in the MySQL Cluster New Features whitepaper Figure 1: Next Generation Web Services, Cross Data Center Replication and Ease-of-Use Once of the key enhancements delivered in MySQL Cluster 7.2 is extensions made to the multi-threading processes of the data nodes. Multi-Threaded Data Node Extensions The MySQL Cluster 7.2 data node is now functionally divided into seven thread types: 1) Local Data Manager threads (ldm). Note – these are sometimes also called LQH threads. 2) Transaction Coordinator threads (tc) 3) Asynchronous Replication threads (rep) 4) Schema Management threads (main) 5) Network receiver threads (recv) 6) Network send threads (send) 7) IO threads Each of these thread types are discussed in more detail below. MySQL Cluster 7.2 increases the maximum number of LDM threads from 4 to 16. The LDM contains the actual data, which means that when using 16 threads the data is more heavily partitioned (this is automatic in MySQL Cluster). Each LDM thread maintains its own set of data partitions, index partitions and REDO log. The number of LDM partitions per data node is not dynamically configurable, but it is possible, however, to map more than one partition onto each LDM thread, providing flexibility in modifying the number of LDM threads. The TC domain stores the state of in-flight transactions. This means that every new transaction can easily be assigned to a new TC thread. Testing has shown that in most cases 1 TC thread per 2 LDM threads is sufficient, and in many cases even 1 TC thread per 4 LDM threads is also acceptable. Testing also demonstrated that in some instances where the workload needed to sustain very high update loads it is necessary to configure 3 to 4 TC threads per 4 LDM threads. In the previous MySQL Cluster 7.1 release, only one TC thread was available. This limit has been increased to 16 TC threads in MySQL Cluster 7.2. The TC domain also manages the Adaptive Query Localization functionality introduced in MySQL Cluster 7.2 that significantly enhanced complex query performance by pushing JOIN operations down to the data nodes. Asynchronous Replication was separated into its own thread with the release of MySQL Cluster 7.1, and has not been modified in the latest 7.2 release. To scale the number of TC threads, it was necessary to separate the Schema Management domain from the TC domain. The schema management thread has little load, so is implemented with a single thread. The Network receiver domain was bound to 1 thread in MySQL Cluster 7.1. With the increase of threads in MySQL Cluster 7.2 it is also necessary to increase the number of recv threads to 8. This enables each receive thread to service one or more sockets used to communicate with other nodes the Cluster. The Network send thread is a new thread type introduced in MySQL Cluster 7.2. Previously other threads handled the sending operations themselves, which can provide for lower latency. To achieve highest throughput however, it has been necessary to create dedicated send threads, of which 8 can be configured. It is still possible to configure MySQL Cluster 7.2 to a legacy mode that does not use any of the send threads – useful for those workloads that are most sensitive to latency. The IO Thread is the final thread type and there have been no changes to this domain in MySQL Cluster 7.2. Multiple IO threads were already available, which could be configured to either one thread per open file, or to a fixed number of IO threads that handle the IO traffic. Except when using compression on disk, the IO threads typically have a very light load. Benchmarking the Scalability Enhancements The scalability enhancements discussed above have made it possible to scale CPU usage of each data node to more than 5x of that possible in MySQL Cluster 7.1. In addition, a number of bottlenecks have been removed, making it possible to scale data node performance by even more than 5x. Figure 2: MySQL Cluster 7.2 Delivers 8.4x Higher Performance than 7.1 The flexAsynch benchmark was used to compare MySQL Cluster 7.2 performance to 7.1 across an 8-node Intel Xeon x5670-based cluster of dual socket commodity servers (6 cores each). As the results demonstrate, MySQL Cluster 7.2 delivers over 8x higher performance per data nodes than MySQL Cluster 7.1. More details of this and other benchmarks will be published in a new whitepaper – coming soon, so stay tuned! In a following blog post, I’ll provide recommendations on optimum thread configurations for different types of server processor. You can also learn more from the Best Practices Guide to Optimizing Performance of MySQL Cluster Conclusion MySQL Cluster has achieved a range of impressive benchmark results, and set in context with the previous 7.1 release, is able to deliver over 8x higher performance per node. As a result, the multi-threaded data node extensions not only serve to increase performance of MySQL Cluster, they also enable users to achieve significantly improved levels of utilization from current and future generations of massively multi-core, multi-thread processor designs.

Read the article

Getting baseline and performance stats - the easy way.

- by fatherjack

OK, pretty much any DBA worth their salt has read Brent Ozar's (Blog | Twitter) blog about getting a baseline of your server's performance counters and then getting the same counters at regular intervals afterwards so that you can track performance trends and evidence how you are making your servers faster or cope with extra load without costing your boss any money for hardware upgrades. No? well, go read it now. I can wait a while as there is a great video there too...http://www.brentozar.com/archive/2006/12/dba-101-using-perfmon-for-sql-performance-tuning/,...(read more)

Read the article

How to improve WinForms MSChart performance?

- by Marcel

Hi all, I have created some simple charts (of type FastLine) with MSChart and update them with live data, like below: . To do so, I bind an observable collection of a custom type to the chart like so: // set chart data source this._Chart.DataSource = value; //is of type ObservableCollection<SpectrumLevels> //define x and y value members for each series this._Chart.Series[0].XValueMember = "Index"; this._Chart.Series[1].XValueMember = "Index"; this._Chart.Series[0].YValueMembers = "Channel0Level"; this._Chart.Series[1].YValueMembers = "Channel1Level"; // bind data to chart this._Chart.DataBind(); //lasts 1.5 seconds for 8000 points per series At each refresh, the dataset completely changes, it is not a scrolling update! With a profiler I have found that the DataBind() call takes about 1.5 seconds. The other calls are negligible. How can I make this faster? Should I use another type than ObservableCollection? An array probably? Should I use another form of data binding? Is there some tweak for the MSChart that I may have missed? Should I use a sparsed set of date, having one value per pixel only? Have I simply reached the performance limit of MSCharts? From the type of the application to keep it "fluent", we should have multiple refreshes per second. Thanks for any hints!

Read the article

Books and resources for Java Performance tuning - when working with databases, huge lists

- by Arvind

Hi All, I am relatively new to working on huge applications in Java. I am working on a Java web service which is pretty heavily used by various clients. The service basically queries the database (hibernate) and then works with a lot of Lists (there are adapters to convert list returned from DB to the interface which the service publishes) and I am seeing lot of issues with the service like high CPU usage or high heap space. While I can troubleshoot the performance issues using a profiler, I want to actually learn about what all I need to take care when I actually write code. Like what kind of List to use or things like using StringBuilder instead of String, etc... Is there any book or blogs which I can refer which will help me while I write new services? Also my application is multithreaded - each service call from a client is a new thread, and I want to know some best practices around that area as well. I did search the web but I found many tips which are not relevant in the latest Java 6 releases, so wanted to know what kind of resources would help a developer starting out now on Java for heavily used applications. Arvind

Read the article

Silverlight combobox performance issue

- by Vinzz

Hi, I'm facing a performance issue with a crowded combobox (5000 items). Rendering of the drop down list is really slow (as if it was computing all items before showing any). Do you have any trick to make this dropdown display lazy? Xaml code: <Grid x:Name="LayoutRoot"> <StackPanel Orientation="Horizontal" Width="200" Height="20"> <TextBlock>Test Combo </TextBlock> <ComboBox x:Name="fooCombo" Margin="5,0,0,0"></ComboBox> </StackPanel> </Grid> code behind: public MainPage() { InitializeComponent(); List<string> li = new List<string>(); int Max = 5000; for (int i = 0; i < Max; ++i) li.Add("Item - " + i); fooCombo.ItemsSource = li; }

Read the article

Increase performance on iphone at pdf rendering

- by burki

Hi! I have a UITableView, and in every cell there's displayed a UIImage created from a pdf. But now the performance is very bad. Here's my code I use to generate the UIImage from the PDF. Creating CGPDFDocumentRef and UIImageView (in cellForRowAtIndexPath method): ... CFURLRef pdfURL = CFBundleCopyResourceURL(CFBundleGetMainBundle(), (CFStringRef)formula.icon, NULL, NULL); CGPDFDocumentRef documentRef = CGPDFDocumentCreateWithURL((CFURLRef)pdfURL); CFRelease(pdfURL); UIImageView *imageView = [[UIImageView alloc] initWithImage:[self imageFromPDFWithDocumentRef:documentRef]]; ... Generate UIImage: - (UIImage *)imageFromPDFWithDocumentRef:(CGPDFDocumentRef)documentRef { CGPDFPageRef pageRef = CGPDFDocumentGetPage(documentRef, 1); CGRect pageRect = CGPDFPageGetBoxRect(pageRef, kCGPDFCropBox); UIGraphicsBeginImageContext(pageRect.size); CGContextRef context = UIGraphicsGetCurrentContext(); CGContextTranslateCTM(context, CGRectGetMinX(pageRect),CGRectGetMaxY(pageRect)); CGContextScaleCTM(context, 1, -1); CGContextTranslateCTM(context, -(pageRect.origin.x), -(pageRect.origin.y)); CGContextDrawPDFPage(context, pageRef); UIImage *finalImage = UIGraphicsGetImageFromCurrentImageContext(); UIGraphicsEndImageContext(); return finalImage; } What can I do to increas the speed and keep the memory low?

Read the article

Iterator performance contract (and use on non-collections)

- by polygenelubricants

If all that you're doing is a simple one-pass iteration (i.e. only hasNext() and next(), no remove()), are you guaranteed linear time performance and/or amortized constant cost per operation? Is this specified in the Iterator contract anywhere? Are there data structures/Java Collection which cannot be iterated in linear time? java.util.Scanner implements Iterator<String>. A Scanner is hardly a data structure (e.g. remove() makes absolutely no sense). Is this considered a design blunder? Is something like PrimeGenerator implements Iterator<Integer> considered bad design, or is this exactly what Iterator is for? (hasNext() always returns true, next() computes the next number on demand, remove() makes no sense). Similarly, would it have made sense for java.util.Random implements Iterator<Double>? Should a type really implement Iterator if it's effectively only using one-third of its API? (i.e. no remove(), always hasNext())

Read the article

MySQL MyISAM table performance... painfully, painfully slow

- by Salman A

I've got a table structure that can be summarized as follows: pagegroup * pagegroupid * name has 3600 rows page * pageid * pagegroupid * data references pagegroup; has 10000 rows; can have anything between 1-700 rows per pagegroup; the data column is of type mediumtext and the column contains 100k - 200kbytes data per row userdata * userdataid * pageid * column1 * column2 * column9 references page; has about 300,000 rows; can have about 1-50 rows per page The above structure is pretty straight forwad, the problem is that that a join from userdata to page group is terribly, terribly slow even though I have indexed all columns that should be indexed. The time needed to run a query for such a join (userdata inner_join page inner_join pagegroup) exceeds 3 minutes. This is terribly slow considering the fact that I am not selecting the data column at all. Example of the query that takes too long: SELECT userdata.column1, pagegroup.name FROM userdata INNER JOIN page USING( pageid ) INNER JOIN pagegroup USING( pagegroupid ) Please help by explaining why does it take so long and what can i do to make it faster. Edit #1 Explain returns following gibberish: id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE userdata ALL pageid 372420 1 SIMPLE page eq_ref PRIMARY,pagegroupid PRIMARY 4 topsecret.userdata.pageid 1 1 SIMPLE pagegroup eq_ref PRIMARY PRIMARY 4 topsecret.page.pagegroupid 1 Edit #2 SELECT u.field2, p.pageid FROM userdata u INNER JOIN page p ON u.pageid = p.pageid; /* 0.07 sec execution, 6.05 sec fecth */ id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE u ALL pageid 372420 1 SIMPLE p eq_ref PRIMARY PRIMARY 4 topsecret.u.pageid 1 Using index SELECT p.pageid, g.pagegroupid FROM page p INNER JOIN pagegroup g ON p.pagegroupid = g.pagegroupid; /* 9.37 sec execution, 60.0 sec fetch */ id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE g index PRIMARY PRIMARY 4 3646 Using index 1 SIMPLE p ref pagegroupid pagegroupid 5 topsecret.g.pagegroupid 3 Using where Moral of the story Keep medium/long text columns in a separate table if you run into performance problems such as this one.

Read the article

Varying performance of MSVC release exe

- by Andrew

Hello everyone, I am curious what could be the reason for highly varying performance of the same executable. Sometimes, I run it and it takes 20 seconds and sometimes it is 110. Source is compiled with MSVC in Release mode with standard options. The code is here: vector<double> Un; vector<double> Ucur; double *pUn, *pUcur; ... // time marching for (old_time=time-logfreq, time+=dt; time <= end_time; time+=dt) { for (i=1, j=Un.size()-1, pUn=&Un[1], pUcur=&Ucur[1]; i < j; ++i, ++pUn, ++pUcur) { *pUcur = (*pUn)*(1.0-0.5*alpha*( *(pUn+1) - *(pUn-1) )); } Ucur[0] = (Un[0])*(1.0-0.5*alpha*( Un[1] - Un[j] )); Ucur[j] = (Un[j])*(1.0-0.5*alpha*( Un[0] - Un[j-1] )); Un = Ucur; }

Read the article

Python performance improvement request for winkler

- by Martlark

I'm a python n00b and I'd like some suggestions on how to improve the algorithm to improve the performance of this method to compute the Jaro-Winkler distance of two names. def winklerCompareP(str1, str2): """Return approximate string comparator measure (between 0.0 and 1.0) USAGE: score = winkler(str1, str2) ARGUMENTS: str1 The first string str2 The second string DESCRIPTION: As described in 'An Application of the Fellegi-Sunter Model of Record Linkage to the 1990 U.S. Decennial Census' by William E. Winkler and Yves Thibaudeau. Based on the 'jaro' string comparator, but modifies it according to whether the first few characters are the same or not. """ # Quick check if the strings are the same - - - - - - - - - - - - - - - - - - # jaro_winkler_marker_char = chr(1) if (str1 == str2): return 1.0 len1 = len(str1) len2 = len(str2) halflen = max(len1,len2) / 2 - 1 ass1 = '' # Characters assigned in str1 ass2 = '' # Characters assigned in str2 #ass1 = '' #ass2 = '' workstr1 = str1 workstr2 = str2 common1 = 0 # Number of common characters common2 = 0 #print "'len1', str1[i], start, end, index, ass1, workstr2, common1" # Analyse the first string - - - - - - - - - - - - - - - - - - - - - - - - - # for i in range(len1): start = max(0,i-halflen) end = min(i+halflen+1,len2) index = workstr2.find(str1[i],start,end) #print 'len1', str1[i], start, end, index, ass1, workstr2, common1 if (index > -1): # Found common character common1 += 1 #ass1 += str1[i] ass1 = ass1 + str1[i] workstr2 = workstr2[:index]+jaro_winkler_marker_char+workstr2[index+1:] #print "str1 analyse result", ass1, common1 #print "str1 analyse result", ass1, common1 # Analyse the second string - - - - - - - - - - - - - - - - - - - - - - - - - # for i in range(len2): start = max(0,i-halflen) end = min(i+halflen+1,len1) index = workstr1.find(str2[i],start,end) #print 'len2', str2[i], start, end, index, ass1, workstr1, common2 if (index > -1): # Found common character common2 += 1 #ass2 += str2[i] ass2 = ass2 + str2[i] workstr1 = workstr1[:index]+jaro_winkler_marker_char+workstr1[index+1:] if (common1 != common2): print('Winkler: Wrong common values for strings "%s" and "%s"' % \ (str1, str2) + ', common1: %i, common2: %i' % (common1, common2) + \ ', common should be the same.') common1 = float(common1+common2) / 2.0 ##### This is just a fix ##### if (common1 == 0): return 0.0 # Compute number of transpositions - - - - - - - - - - - - - - - - - - - - - # transposition = 0 for i in range(len(ass1)): if (ass1[i] != ass2[i]): transposition += 1 transposition = transposition / 2.0 # Now compute how many characters are common at beginning - - - - - - - - - - # minlen = min(len1,len2) for same in range(minlen+1): if (str1[:same] != str2[:same]): break same -= 1 if (same > 4): same = 4 common1 = float(common1) w = 1./3.*(common1 / float(len1) + common1 / float(len2) + (common1-transposition) / common1) wn = w + same*0.1 * (1.0 - w) return wn

Read the article

Haskell math performance

- by Travis Brown

I'm in the middle of porting David Blei's original C implementation of Latent Dirichlet Allocation to Haskell, and I'm trying to decide whether to leave some of the low-level stuff in C. The following function is one example—it's an approximation of the second derivative of lgamma: double trigamma(double x) { double p; int i; x=x+6; p=1/(x*x); p=(((((0.075757575757576*p-0.033333333333333)*p+0.0238095238095238) *p-0.033333333333333)*p+0.166666666666667)*p+1)/x+0.5*p; for (i=0; i<6 ;i++) { x=x-1; p=1/(x*x)+p; } return(p); } I've translated this into more or less idiomatic Haskell as follows: trigamma :: Double -> Double trigamma x = snd $ last $ take 7 $ iterate next (x' - 1, p') where x' = x + 6 p = 1 / x' ^ 2 p' = p / 2 + c / x' c = foldr1 (\a b -> (a + b * p)) [1, 1/6, -1/30, 1/42, -1/30, 5/66] next (x, p) = (x - 1, 1 / x ^ 2 + p) The problem is that when I run both through Criterion, my Haskell version is six or seven times slower (I'm compiling with -O2 on GHC 6.12.1). Some similar functions are even worse. I know practically nothing about Haskell performance, and I'm not terribly interested in digging through Core or anything like that, since I can always just call the handful of math-intensive C functions through FFI. But I'm curious about whether there's low-hanging fruit that I'm missing—some kind of extension or library or annotation that I could use to speed up this numeric stuff without making it too ugly.

Read the article

Performance of Java matrix math libraries?

- by dfrankow

We are computing something whose runtime is bound by matrix operations. (Some details below if interested.) This experience prompted the following question: Do folk have experience with the performance of Java libraries for matrix math (e.g., multiply, inverse, etc.)? For example: JAMA: http://math.nist.gov/javanumerics/jama/ COLT: http://acs.lbl.gov/~hoschek/colt/ Apache commons math: http://commons.apache.org/math/ I searched and found nothing. Details of our speed comparison: We are using Intel FORTRAN (ifort (IFORT) 10.1 20070913). We have reimplemented it in Java (1.6) using Apache commons math 1.2 matrix ops, and it agrees to all of its digits of accuracy. (We have reasons for wanting it in Java.) (Java doubles, Fortran real*8). Fortran: 6 minutes, Java 33 minutes, same machine. jvisualm profiling shows much time spent in RealMatrixImpl.{getEntry,isValidCoordinate} (which appear to be gone in unreleased Apache commons math 2.0, but 2.0 is no faster). Fortran is using Atlas BLAS routines (dpotrf, etc.). Obviously this could depend on our code in each language, but we believe most of the time is in equivalent matrix operations. In several other computations that do not involve libraries, Java has not been much slower, and sometimes much faster.

Read the article

Mysql InnoDB performance optimization and indexing

- by Davide C

Hello everybody, I have 2 databases and I need to link information between two big tables (more than 3M entries each, continuously growing). The 1st database has a table 'pages' that stores various information about web pages, and includes the URL of each one. The column 'URL' is a varchar(512) and has no index. The 2nd database has a table 'urlHops' defined as: CREATE TABLE urlHops ( dest varchar(512) NOT NULL, src varchar(512) DEFAULT NULL, timestamp timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP, KEY dest_key (dest), KEY src_key (src) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 Now, I need basically to issue (efficiently) queries like this: select p.id,p.URL from db1.pages p, db2.urlHops u where u.src=p.URL and u.dest=? At first, I thought to add an index on pages(URL). But it's a very long column, and I already issue a lot of INSERTs and UPDATEs on the same table (way more than the number of SELECTs I would do using this index). Other possible solutions I thought are: -adding a column to pages, storing the md5 hash of the URL and indexing it; this way I could do queries using the md5 of the URL, with the advantage of an index on a smaller column. -adding another table that contains only page id and page URL, indexing both columns. But this is maybe a waste of space, having only the advantage of not slowing down the inserts and updates I execute on 'pages'. I don't want to slow down the inserts and updates, but at the same time I would be able to do the queries on the URL efficiently. Any advice? My primary concern is performance; if needed, wasting some disk space is not a problem. Thank you, regards Davide

Read the article

Performance of map overlay in conjunction with ItemizedOverlay is very poor

- by oviroa

I am trying to display one png (drawable) on a map in about 300 points. I am retrieving the coordinates from a Sqlite table, dumping them in a cursor. When I try to display them by parsing through the cursor, it takes for ever for the images to be drawn, about .5 second per image. I find that to be suspiciously slow, so some insight on how I can increase performance would help. Here is the snippet of my code that does the rendering: while (!mFlavorsCursor.isAfterLast()) { Log.d("cursor",""+(i++)); point = new GeoPoint( (int)(mFlavorsCursor.getFloat(mFlavorsCursor.getColumnIndex(DataBaseHelper.KEY_LATITUDE))*1000000), (int)(mFlavorsCursor.getFloat(mFlavorsCursor.getColumnIndex(DataBaseHelper.KEY_LONGITUDE))*1000000)); overlayitem = new OverlayItem(point, "", ""); itemizedoverlay.addOverlay(overlayitem); itemizedoverlay.doPopulate(); mFlavorsCursor.moveToNext(); } mapOverlays.add(itemizedoverlay); I tried to isolate all the steps and it looks like the slow one is this: itemizedoverlay.doPopulate(); This is a public method in my class that extends ItemizedOverlay that runs the private populate() method.

Read the article

C# vs C - Big performance difference

- by John

I'm finding massive performance differences between similar code in C anc C#. The C code is: #include <stdio.h> #include <time.h> #include <math.h> main() { int i; double root; clock_t start = clock(); for (i = 0 ; i <= 100000000; i++){ root = sqrt(i); } printf("Time elapsed: %f\n", ((double)clock() - start) / CLOCKS_PER_SEC); } And the C# (console app) is: using System; using System.Collections.Generic; using System.Text; namespace ConsoleApplication2 { class Program { static void Main(string[] args) { DateTime startTime = DateTime.Now; double root; for (int i = 0; i <= 100000000; i++) { root = Math.Sqrt(i); } TimeSpan runTime = DateTime.Now - startTime; Console.WriteLine("Time elapsed: " + Convert.ToString(runTime.TotalMilliseconds/1000)); } } } With the above code, the C# completes in 0.328125 seconds (release version) and the C takes 11.14 seconds to run. The c is being compiled to a windows executable using mingw. I've always been under the assumption that C/C++ were faster or at least comparable to C#.net. What exactly is causing the C to run over 30 times slower? EDIT: It does appear that the C# optimizer was removing the root as it wasn't being used. I changed the root assignment to root += and printed out the total at the end. I've also compiled the C using cl.exe with the /O2 flag set for max speed. The results are now: 3.75 seconds for the C 2.61 seconds for the C# The C is still taking longer, but this is acceptable

Read the article

Does SetFileBandwidthReservation affect memory-mapped file performance?

- by Ghostrider

Does this function affect Memory-mapped file performance? Here's the problem I need to solve: I have two applications competing for disk access: "reader" and "updater". Whole system runs on Windows Server 2008 R2 x64 "Updater" constantly accesses disk in a linear manner, updating data. They system is set up in such a way that updater always has infinite data to update. Consider that it is constantly approximating a solution of a huge set of equations that takes up entire 2TB disk drive. Updater uses ReadFile and WriteFile to process data in a linear fashion. "Reader" is occasionally invoked by user to get some pieces of data. Usually user would read several 4kb blocks from the drive and stop. Occasionally user needs to read up to 100mb sequentially. In exceptional cases up to several gigabytes. Reader maps files to memory to get data it needs. What I would like to achieve is for "reader" to have absolute priority so that "updater" would completely stop if needed so that "reader" could get the data user needs ASAP. Is this problem solvable by using SetPriorityClass and SetFileBandwidthReservation calls? I would really hate to put synchronization login in "reader" and "updater" and rather have the OS take care of priorities.

Read the article

sql-server performance optimization by removing print statements

- by AG

We're going through a round of sql-server stored procedure optimizations. The one recommendation we've found that clearly applies for us is 'SET NOCOUNT ON' at the top of each procedure. (Yes, I've seen the posts that point out issues with this depending on what client objects you run the stored procedures from but these are not issues for us.) So now I'm just trying to add in a bit of common sense. If the benefit of SET NOCOUNT ON is simply to reduce network traffic by some small amount every time, wouldn't it also make sense to turn off all the PRINT statements we have in the stored procedures that we only use for debugging? I can't see how it can hurt performance. OTOH, it's a bit of a hassle to implement due to the fact that some of the print statements are the only thing within else clauses, so you can't just always comment out the one line and be done. The change carries some amount of risk so I don't want to do it if it isn't going to actually help. But I don't see eliminating print statements mentioned anywhere in articles on optimization. Is that because it is so obvious no one bothers to mention it?

Read the article

ColdFusion 8 Slow Performance

- by JoeBob

We have started a new CF8 app and it is running dog slow. A test where we go around ColdFusion (queries within a database utility) show normal speed (80ms). CF8 returns the same query in something like 60 to 80 seconds! I have been looking online and seeing lots of posts about CF8 and performance problems, but don't get any overall sense of a solution; just lots of people trying things and saying that they didn't have the problem with CF7. We are also seeing instability on the server, and some errors relating to garbage collection and the memory heap. We have a number of other applications running on CF8 and they perform adequately...our programmer is not an expert or a guru, he just plugs away. We have isolated this down to a single query that takes forever to return, so it is not a complicated test. Are there any known CF8 problems or obvious tweaks that we should consider trying? If we have to start over and learn a new environment, I will never make deadline. JoeBob

Read the article

WPF drawing performance with large numbers of geometries

- by MyFaJoArCo

Hello, I have problems with WPF drawing performance. There are a lot of small EllipseGeometry objects (1024 ellipses, for example), which are added to three separate GeometryGroups with different foreground brushes. After, I render it all on simple Image control. Code: DrawingGroup tmpDrawing = new DrawingGroup(); GeometryGroup onGroup = new GeometryGroup(); GeometryGroup offGroup = new GeometryGroup(); GeometryGroup disabledGroup = new GeometryGroup(); for (int x = 0; x < DisplayWidth; ++x) { for (int y = 0; y < DisplayHeight; ++y) { if (States[x, y] == true) onGroup.Children.Add(new EllipseGeometry(new Rect((double)x * EDGE, (double)y * EDGE, EDGE, EDGE))); else if (States[x, y] == false) offGroup.Children.Add(new EllipseGeometry(new Rect((double)x * EDGE, (double)y * EDGE, EDGE, EDGE))); else disabledGroup.Children.Add(new EllipseGeometry(new Rect((double)x * EDGE, (double)y * EDGE, EDGE, EDGE))); } } tmpDrawing.Children.Add(new GeometryDrawing(OnBrush, null, onGroup)); tmpDrawing.Children.Add(new GeometryDrawing(OffBrush, null, offGroup)); tmpDrawing.Children.Add(new GeometryDrawing(DisabledBrush, null, disabledGroup)); DisplayImage.Source = new DrawingImage(tmpDrawing); It works fine, but takes too much time - 0.5s on Core 2 Quad, 2s on Pentium 4. I need <0.1s everywhere. All Ellipses, how you can see, are equal. Background of control, where is my DisplayImage, is solid (black, for example), so we can use this fact. I tried to use 1024 Ellipse elements instead of Image with EllipseGeometries, and it was working much faster (~0.5s), but not enough. How to speed up it? Regards, Oleg Eremeev P.S. Sorry for my English.

Read the article

MySQL Normalization stored procedure performance

- by srkiNZ84

Hi, I've written a stored procedure in MySQL to take values currently in a table and to "Normalize" them. This means that for each value passed to the stored procedure, it checks whether the value is already in the table. If it is, then it stores the id of that row in a variable. If the value is not in the table, it stores the newly inserted value's id. The stored procedure then takes the id's and inserts them into a table which is equivalent to the original de-normailized table, but this table is fully normalized and consists of mainly foreign keys. My problem with this design is that the stored procedure takes approximately 10ms or so to return, which is too long when you're trying to work through some 10million records. My suspicion is that the performance is to do with the way in which I'm doing the inserts. i.e. INSERT INTO TableA (first_value) VALUES (argument_from_sp) ON DUPLICATE KEY UPDATE id=LAST_INSERT_ID(id); SET @TableAId = LAST_INSERT_ID(); The "ON DUPLICATE KEY UPDATE" is a bit of a hack, due to the fact that on a duplicate key I don't want to update anything but rather just return the id value of the row. If you miss this step though, the LAST_INSERT_ID() function returns the wrong value when you're trying to run the "SET ..." statement. Does anyone know of a better way to do this in MySQL? Thank you

Read the article

visualising piano performance evaluation

- by Dolphin

I need to develop a performance evaluator for piano playing. Based on a midi generated from sheet music, I need to evaluate the midi of the actual playing (midi keyboard). I'm planning to evaluate the playing based on note pitch, duration and loudness. The evaluation is I suppose a comparison of the notes of the sheet music and playing in midi. But I have no idea how I can visualise (i.e. show where the person have gone wrong) this evaluation process. i.e. maybe show both the notation and highlight which note has gone wrong. But how can I show any of this in some graphical form? Or more precisely on a stave (a music score) itself. I have note details (pitch, duration) and score details (key and time signature) stored in a table, and I'm using Java. But I have no clue as in how I can put all this into graphical form. Any insight is most gratefully appreciated. Advance thanks

Search Results

Search found 13151 results on 527 pages for 'performance counters'.

Page 30/527 | < Previous Page | 26 27 28 29 30 31 32 33 34 35 36 37 | Next Page >

- by maticus

- by aspiringgeek

- by aspiringgeek

- by Testas

- by Mat Keep

- by fatherjack

- by Marcel

- by Arvind

- by Vinzz

- by burki

- by polygenelubricants

- by Salman A

- by Andrew

- by Martlark

- by Travis Brown

- by dfrankow

- by Davide C

- by oviroa

- by John

- by Ghostrider

- by AG

- by JoeBob

- by MyFaJoArCo

- by srkiNZ84

- by Dolphin

< Previous Page | 26 27 28 29 30 31 32 33 34 35 36 37 | Next Page >