Search Results

Search found 3512 results on 141 pages for 'premature optimization'.

Page 34/141 | < Previous Page | 30 31 32 33 34 35 36 37 38 39 40 41 | Next Page >

Pure Java open-source libraries for portfolio selection (= constrained, non-linear optimization)?

- by __roland__

Does anyone know or has experience with a pure Java library to select portfolios or do some similar kinds of quadratic programming with constraints? There seems to be a lot of tools, as already discussed elsewhere - but what I would like to use is a pure Java implementation. Since I want to call the library from within another open-source software with a BSD-ish license I would prefer LGPL over GPL. Any help is appreciated. If you don't know such libraries, which is the most simple algorithm you would suggest to implement? It has to cope with an inequality constraint (all x_i = 0) and an equality constraint (sum of all x_i = 1).

Read the article
Mysql optimization question - How to apply AND logic in search and limit on results in one query?

- by sandeepan-nath

This is a little long but I have provided all the database structures and queries so that you can run it immediately and help me. Run the following queries:- CREATE TABLE IF NOT EXISTS `Tutor_Details` ( `id_tutor` int(10) NOT NULL auto_increment, `firstname` varchar(100) NOT NULL default '', `surname` varchar(155) NOT NULL default '', PRIMARY KEY (`id_tutor`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=41 ; INSERT INTO `Tutor_Details` (`id_tutor`,`firstname`, `surname`) VALUES (1, 'Sandeepan', 'Nath'), (2, 'Bob', 'Cratchit'); CREATE TABLE IF NOT EXISTS `Classes` ( `id_class` int(10) unsigned NOT NULL auto_increment, `id_tutor` int(10) unsigned NOT NULL default '0', `class_name` varchar(255) default NULL, PRIMARY KEY (`id_class`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=229 ; INSERT INTO `Classes` (`id_class`,`class_name`, `id_tutor`) VALUES (1, 'My Class', 1), (2, 'Sandeepan Class', 2); CREATE TABLE IF NOT EXISTS `Tags` ( `id_tag` int(10) unsigned NOT NULL auto_increment, `tag` varchar(255) default NULL, PRIMARY KEY (`id_tag`), UNIQUE KEY `tag` (`tag`), KEY `id_tag` (`id_tag`), KEY `tag_2` (`tag`), KEY `tag_3` (`tag`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=18 ; INSERT INTO `Tags` (`id_tag`, `tag`) VALUES (1, 'Bob'), (6, 'Class'), (2, 'Cratchit'), (4, 'Nath'), (3, 'Sandeepan'), (5, 'My'); CREATE TABLE IF NOT EXISTS `Tutors_Tag_Relations` ( `id_tag` int(10) unsigned NOT NULL default '0', `id_tutor` int(10) default NULL, KEY `Tutors_Tag_Relations` (`id_tag`), KEY `id_tutor` (`id_tutor`), KEY `id_tag` (`id_tag`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1; INSERT INTO `Tutors_Tag_Relations` (`id_tag`, `id_tutor`) VALUES (3, 1), (4, 1), (1, 2), (2, 2); CREATE TABLE IF NOT EXISTS `Class_Tag_Relations` ( `id_tag` int(10) unsigned NOT NULL default '0', `id_class` int(10) default NULL, `id_tutor` int(10) NOT NULL, KEY `Class_Tag_Relations` (`id_tag`), KEY `id_class` (`id_class`), KEY `id_tag` (`id_tag`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1; INSERT INTO `Class_Tag_Relations` (`id_tag`, `id_class`, `id_tutor`) VALUES (5, 1, 1), (6, 1, 1), (3, 2, 2), (6, 2, 2); Following is about the tables:- There are tutors who create classes. Tutor_Details - Stores tutors Classes - Stores classes created by tutors And for searching we are using a tags based approach. All the keywords are stored in tags table (while classes/tutors are created) and tag relations are entered in Tutor_Tag_Relations and Class_Tag_Relations tables (for tutors and classes respectively)like this:- Tags - id_tag tag (this is a a unique field) Tutors_Tag_Relations - Stores tag relations while the tutors are created. Class_Tag_Relations - Stores tag relations while any tutor creates a class In the present data in database, tutor "Sandeepan Nath" has has created class "My Class" and "Bob Cratchit" has created "Sandeepan Class". 3.Requirement The requirement is to return tutor records from Tutor_Details table such that all the search terms (AND logic) are present in the union of these two sets - 1. Tutor_Details table 2. classes created by a tutor in Classes table) Example search and expected results:- Search Term Result "Sandeepan Class" Tutor Sandeepan Nath's record from Tutor Details table "Class" Both the tutors from ... Most importantly, there should be only one mysql query and a LIMIT applicable on the number of results. Following is a working query which I have so far written (It just applies OR logic of search key words instead of the desired AND logic). SELECT td . * FROM Tutor_Details AS td LEFT JOIN Tutors_Tag_Relations AS ttagrels ON td.id_tutor = ttagrels.id_tutor LEFT JOIN Classes AS wc ON td.id_tutor = wc.id_tutor INNER JOIN Class_Tag_Relations AS wtagrels ON td.id_tutor = wtagrels.id_tutor LEFT JOIN Tags AS t ON t.id_tag = ttagrels.id_tag OR t.id_tag = wtagrels.id_tag WHERE t.tag LIKE '%Sandeepan%' OR t.tag LIKE '%Nath%' GROUP BY td.id_tutor LIMIT 20 Please help me with anything you can. Thanks

Read the article
Some optimization about the code (computing ranks of a vector)?

- by user1748356

The following code is a function (performance-critical) to compute tied ranks of a vector: mergeSort(x,inds,ci); //a sort function to sort vector x of length ci, also returns keys (inds) of x. int tj=0; double xi=x[0]; for (int j = 1; j < ci; ++j) { if (x[j] > xi) { double rankvalue = 0.5 * (j - 1 + tj); for (int k = tj; k < j; ++k) { ranks[inds[k]]=rankvalue; }; tj = j; xi = x[j]; }; }; double rankvalue = 0.5 * (ci - 1 + tj); for (int k = tj; k < ci; ++k) { ranks[inds[k]]=rankvalue; }; The problem is, the supposed performance bottleneck mergeSort(), which is O(NlogN) is several times faster than the other part of codes (which is O(N)), which suggests there is room for huge improvment with the other part of the codes, any advices?

Read the article
R: Are there any alternatives to loops for subsetting from an optimization standpoint?

- by Adam

A recurring analysis paradigm I encounter in my research is the need to subset based on all different group id values, performing statistical analysis on each group in turn, and putting the results in an output matrix for further processing/summarizing. How I typically do this in R is something like the following: data.mat <- read.csv("...") groupids <- unique(data.mat$ID) #Assume there are then 100 unique groups results <- matrix(rep("NA",300),ncol=3,nrow=100) for(i in 1:100) { tempmat <- subset(data.mat,ID==groupids[i]) #Run various stats on tempmat (correlations, regressions, etc), checking to #make sure this specific group doesn't have NAs in the variables I'm using #and assign results to x, y, and z, for example. results[i,1] <- x results[i,2] <- y results[i,3] <- z } This ends up working for me, but depending on the size of the data and the number of groups I'm working with, this can take up to three days. Besides branching out into parallel processing, is there any "trick" for making something like this run faster? For instance, converting the loops into something else (something like an apply with a function containing the stats I want to run inside the loop), or eliminating the need to actually assign the subset of data to a variable?

Read the article
R optimization: How can I avoid a for loop in this situation?

- by chrisamiller

I'm trying to do a simple genomic track intersection in R, and running into major performance problems, probably related to my use of for loops. In this situation, I have pre-defined windows at intervals of 100bp and I'm trying to calculate how much of each window is covered by the annotations in mylist. Graphically, it looks something like this: 0 100 200 300 400 500 600 windows: |-----|-----|-----|-----|-----|-----| mylist: |-| |-----------| So I wrote some code to do just that, but it's fairly slow and has become a bottleneck in my code: ##window for each 100-bp segment windows <- numeric(6) ##second track mylist = vector("list") mylist[[1]] = c(1,20) mylist[[2]] = c(120,320) ##do the intersection for(i in 1:length(mylist)){ st <- floor(mylist[[i]][1]/100)+1 sp <- floor(mylist[[i]][2]/100)+1 for(j in st:sp){ b <- max((j-1)*100, mylist[[i]][1]) e <- min(j*100, mylist[[i]][2]) windows[j] <- windows[j] + e - b + 1 } } print(windows) [1] 20 81 101 21 0 0 Naturally, this is being used on data sets that are much larger than the example I provide here. Through some profiling, I can see that the bottleneck is in the for loops, but my clumsy attempt to vectorize it using *apply functions resulted in code that runs an order of magnitude more slowly. I suppose I could write something in C, but I'd like to avoid that if possible. Can anyone suggest another approach that will speed this calculation up?

Read the article
Wpf. Chart optimization. More than million points

- by Evgeny

I have custom control - chart with size, for example, 300x300 pixels and more than one million points (maybe less) in it. And its clear that now he works very slowly. I am searching for algoritm which will show only few points with minimal visual difference. I have link to component which have functionallity exactly what i need (2 million points demo): http://www.mindscape.co.nz/demo/SilverlightElements/demopage.html#/ChartOverviewPage I will be grateful for any matherials, links or thoughts how to realize such functionallity.

Read the article
Python2.7: How can I speed up this bit of code (loop/lists/tuple optimization)?

- by user89

I repeat the following idiom again and again. I read from a large file (sometimes, up to 1.2 million records!) and store the output into an SQLite databse. Putting stuff into the SQLite DB seems to be fairly fast. def readerFunction(recordSize, recordFormat, connection, outputDirectory, outputFile, numObjects): insertString = "insert into NODE_DISP_INFO(node, analysis, timeStep, H1_translation, H2_translation, V_translation, H1_rotation, H2_rotation, V_rotation) values (?, ?, ?, ?, ?, ?, ?, ?, ?)" analysisNumber = int(outputPath[-3:]) outputFileObject = open(os.path.join(outputDirectory, outputFile), "rb") outputFileObject, numberOfRecordsInFileObject = determineNumberOfRecordsInFileObjectGivenRecordSize(recordSize, outputFileObject) numberOfRecordsPerObject = (numberOfRecordsInFileObject//numberOfObjects) loop1StartTime = time.time() for i in range(numberOfRecordsPerObject ): processedRecords = [] loop2StartTime = time.time() for j in range(numberOfObjects): fout = outputFileObject .read(recordSize) processedRecords.append(tuple([j+1, analysisNumber, i] + [x for x in list(struct.unpack(recordFormat, fout))])) loop2EndTime = time.time() print "Time taken to finish loop2: {}".format(loop2EndTime-loop2StartTime) dbInsertStartTime = time.time() connection.executemany(insertString, processedRecords) dbInsertEndTime = time.time() loop1EndTime = time.time() print "Time taken to finish loop1: {}".format(loop1EndTime-loop1StartTime) outputFileObject.close() print "Finished reading output file for analysis {}...".format(analysisNumber) When I run the code, it seems that "loop 2" and "inserting into the database" is where most execution time is spent. Average "loop 2" time is 0.003s, but it is run up to 50,000 times, in some analyses. The time spent putting stuff into the database is about the same: 0.004s. Currently, I am inserting into the database every time after loop2 finishes so that I don't have to deal with running out RAM. What could I do to speed up "loop 2"?

Read the article
SQL SERVER – Out of the Box – Activty and Performance Reports from SSSMS

- by pinaldave

SQL Server management Studio 2008 is wonderful tool and has many different features. Many times, an average user does not use them as they are not aware about these features. Today, we will learn one such feature. SSMS comes with many inbuilt performance and activity reports, but we do not use it to the full potential. Let us see how we can access these standard reports. Connect to SQL Server Node >> Right Click on it >> Go to Reports >> Click on Standard Reports >> Pick Any Report. Click to Enlarge You can see there are many reports, which an average users needs right away, are available there. Let me list all the reports available. Server Dashboard Configuration Changes History Schema Changes History Scheduler Health Memory Consumption Activity – All Blocking Transactions Activity – All Cursors Activity – All Sessions Activity – Top Sessions Activity – Dormant Sessions Activity - Top Connections Top Transactions by Age Top Transactions by Blocked Transactions Count Top Transactions by Locks Count Performance – Batch Execution Statistics Performance – Object Execution Statistics Performance – Top Queries by Average CPU Time Performance – Top Queries by Average IO Performance – Top Queries by Total CPU Time Performance – Top Queries by Total IO Service Broker Statistics Transactions Log Shipping Status In fact, when you look at the above list, it is fairly clear that they are very thought out and commonly needed reports that are available in SQL Server 2008. Let us run a couple of reports and observe their result. Performance – Top Queries by Total CPU Time Click to Enlarge Memory Consumption Click to Enlarge There are options for custom reports as well, which we can configure. We will learn about them in some other post. Additionally, you can right click on the reports and export in Excel or PDF. I think this tool can really help those who are just looking for some quick details. Does any of you use this feature, or this feature has some limitations and You would like to see more features? Reference : Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL Optimization, SQL Performance, SQL Query, SQL Server, SQL Server Management Studio, SQL Tips and Tricks, T SQL, Technology

Read the article
SQL SERVER – Summary of Month – Wait Type – Day 28 of 28

- by pinaldave

I am glad to announce that the month of Wait Types and Queues very successful. I am glad that it was very well received and there was great amount of participation from community. I am fortunate to have some of the excellent comments throughout the series. I want to dedicate this series to all the guest blogger – Jonathan, Jacob, Glenn, and Feodor for their kindness to take a participation in this series. Here is the complete list of the blog posts in this series. I enjoyed writing the series and I plan to continue writing similar series. Please offer your opinion. SQL SERVER – Introduction to Wait Stats and Wait Types – Wait Type – Day 1 of 28 SQL SERVER – Signal Wait Time Introduction with Simple Example – Wait Type – Day 2 of 28 SQL SERVER – DMV – sys.dm_os_wait_stats Explanation – Wait Type – Day 3 of 28 SQL SERVER – DMV – sys.dm_os_waiting_tasks and sys.dm_exec_requests – Wait Type – Day 4 of 28 SQL SERVER – Capturing Wait Types and Wait Stats Information at Interval – Wait Type – Day 5 of 28 SQL SERVER – CXPACKET – Parallelism – Usual Solution – Wait Type – Day 6 of 28 SQL SERVER – CXPACKET – Parallelism – Advanced Solution – Wait Type – Day 7 of 28 SQL SERVER – SOS_SCHEDULER_YIELD – Wait Type – Day 8 of 28 SQL SERVER – PAGEIOLATCH_DT, PAGEIOLATCH_EX, PAGEIOLATCH_KP, PAGEIOLATCH_SH, PAGEIOLATCH_UP – Wait Type – Day 9 of 28 SQL SERVER – IO_COMPLETION – Wait Type – Day 10 of 28 SQL SERVER – ASYNC_IO_COMPLETION – Wait Type – Day 11 of 28 SQL SERVER – PAGELATCH_DT, PAGELATCH_EX, PAGELATCH_KP, PAGELATCH_SH, PAGELATCH_UP – Wait Type – Day 12 of 28 SQL SERVER – FT_IFTS_SCHEDULER_IDLE_WAIT – Full Text – Wait Type – Day 13 of 28 SQL SERVER – BACKUPIO, BACKUPBUFFER – Wait Type – Day 14 of 28 SQL SERVER – LCK_M_XXX – Wait Type – Day 15 of 28 SQL SERVER – Guest Post – Jonathan Kehayias – Wait Type – Day 16 of 28 SQL SERVER – WRITELOG – Wait Type – Day 17 of 28 SQL SERVER – LOGBUFFER – Wait Type – Day 18 of 28 SQL SERVER – PREEMPTIVE and Non-PREEMPTIVE – Wait Type – Day 19 of 28 SQL SERVER – MSQL_XP – Wait Type – Day 20 of 28 SQL SERVER – Guest Posts – Feodor Georgiev – The Context of Our Database Environment – Going Beyond the Internal SQL Server Waits – Wait Type – Day 21 of 28 SQL SERVER – Guest Post – Jacob Sebastian – Filestream – Wait Types – Wait Queues – Day 22 of 28 SQL SERVER – OLEDB – Link Server – Wait Type – Day 23 of 28 SQL SERVER – 2000 – DBCC SQLPERF(waitstats) – Wait Type – Day 24 of 28 SQL SERVER – 2011 – Wait Type – Day 25 of 28 SQL SERVER – Guest Post – Glenn Berry – Wait Type – Day 26 of 28 SQL SERVER – Best Reference – Wait Type – Day 27 of 28 SQL SERVER – Summary of Month – Wait Type – Day 28 of 28 Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Optimization, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQL Wait Stats, SQL Wait Types, SQLServer, T SQL, Technology

Read the article
SQL SERVER – Difference Between DATETIME and DATETIME2

- by pinaldave

Yesterday I have written a very quick blog post on SQL SERVER – Difference Between GETDATE and SYSDATETIME and I got tremendous response for the same. I suggest you read that blog post before continuing this blog post today. I had asked people to honestly take part and share their view about above two system function. There are few emails as well few comments on the blog post asking question how did I come to know the difference between the same. The answer is real world issues. I was called in for performance tuning consultancy where I was asked very strange question by one developer. Here is the situation he was facing. System had a single table with two different column of datetime. One column was datelastmodified and second column was datefirstmodified. One of the column was DATETIME and another was DATETIME2. Developer was populating them with SYSDATETIME respectively. He was always thinking that the value inserted in the table will be the same. This table was only accessed by INSERT statement and there was no updates done over it in application.One fine day he ran distinct on both of this column and was in for surprise. He always thought that both of the table will have same data, but in fact they had very different data. He presented this scenario to me. I said this can not be possible but when looked at the resultset, I had to agree with him. Here is the simple script generated to demonstrate the problem he was facing. This is just a sample of original table. DECLARE @Intveral INT SET @Intveral = 10000 CREATE TABLE #TimeTable (FirstDate DATETIME, LastDate DATETIME2) WHILE (@Intveral > 0) BEGIN INSERT #TimeTable (FirstDate, LastDate) VALUES (SYSDATETIME(), SYSDATETIME()) SET @Intveral = @Intveral - 1 END GO SELECT COUNT(DISTINCT FirstDate) D_GETDATE, COUNT(DISTINCT LastDate) D_SYSGETDATE FROM #TimeTable GO SELECT DISTINCT a.FirstDate, b.LastDate FROM #TimeTable a INNER JOIN #TimeTable b ON a.FirstDate = b.LastDate GO SELECT * FROM #TimeTable GO DROP TABLE #TimeTable GO Let us see the resultset. You can clearly see from result that SYSDATETIME() does not populate the same value in the both of the field. In fact the value is either rounded down or rounded up in the field which is DATETIME. Event though we are populating the same value, the values are totally different in both the column resulting the SELF JOIN fail and display different DISTINCT values. The best policy is if you are using DATETIME use GETDATE() and if you are suing DATETIME2 use SYSDATETIME() to populate them with current date and time to accurately address the precision. As DATETIME2 is introduced in SQL Server 2008, above script will only work with SQL SErver 2008 and later versions. I hope I have answered few questions asked yesterday. Reference: Pinal Dave (http://www.SQLAuthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL DateTime, SQL Optimization, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
SQL SERVER – Index Created on View not Used Often – Observation of the View – Part 2

- by pinaldave

Earlier, I have written an article about SQL SERVER – Index Created on View not Used Often – Observation of the View. I received an email from one of the readers, asking if there would no problems when we create the Index on the base table. Well, we need to discuss this situation in two different cases. Before proceeding to the discussion, I strongly suggest you read my earlier articles. To avoid the duplication, I am not going to repeat the code and explanation over here. In all the earlier cases, I have explained in detail how Index created on the View is not utilized. SQL SERVER – Index Created on View not Used Often – Limitation of the View 12 SQL SERVER – Index Created on View not Used Often – Observation of the View SQL SERVER – Indexed View always Use Index on Table As per earlier blog posts, so far we have done the following: Create a Table Create a View Create Index On View Write SELECT with ORDER BY on View However, the blog reader who emailed me suggests the extension of the said logic, which is as follows: Create a Table Create a View Create Index On View Write SELECT with ORDER BY on View Create Index on the Base Table Write SELECT with ORDER BY on View After doing the last two steps, the question is “Will the query on the View utilize the Index on the View, or will it still use the Index of the base table?“ Let us first run the Create example. USE tempdb GO IF EXISTS (SELECT * FROM sys.views WHERE OBJECT_ID = OBJECT_ID(N'[dbo].[SampleView]')) DROP VIEW [dbo].[SampleView] GO IF EXISTS (SELECT * FROM sys.objects WHERE OBJECT_ID = OBJECT_ID(N'[dbo].[mySampleTable]') AND TYPE IN (N'U')) DROP TABLE [dbo].[mySampleTable] GO -- Create SampleTable CREATE TABLE mySampleTable (ID1 INT, ID2 INT, SomeData VARCHAR(100)) INSERT INTO mySampleTable (ID1,ID2,SomeData) SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY o1.name), ROW_NUMBER() OVER (ORDER BY o2.name), o2.name FROM sys.all_objects o1 CROSS JOIN sys.all_objects o2 GO -- Create View CREATE VIEW SampleView WITH SCHEMABINDING AS SELECT ID1,ID2,SomeData FROM dbo.mySampleTable GO -- Create Index on View CREATE UNIQUE CLUSTERED INDEX [IX_ViewSample] ON [dbo].[SampleView] ( ID2 ASC ) GO -- Select from view SELECT ID1,ID2,SomeData FROM SampleView ORDER BY ID2 GO -- Create Index on Original Table -- On Column ID1 CREATE UNIQUE CLUSTERED INDEX [IX_OriginalTable] ON mySampleTable ( ID1 ASC ) GO -- On Column ID2 CREATE UNIQUE NONCLUSTERED INDEX [IX_OriginalTable_ID2] ON mySampleTable ( ID2 ) GO -- Select from view SELECT ID1,ID2,SomeData FROM SampleView ORDER BY ID2 GO Now let us see the execution plans for both of the SELECT statement. Before Index on Base Table (with Index on View): After Index on Base Table (with Index on View): Looking at both executions, it is very clear that with or without, the View is using Indexes. Alright, I have written 11 disadvantages of the Views. Now I have written one case where the View is using Indexes. Anybody who says that I am being harsh on Views can say now that I found one place where Index on View can be helpful. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL Optimization, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQL View, SQLServer, T SQL, Technology

Read the article
SQL SERVER – Index Created on View not Used Often – Observation of the View

- by pinaldave

I always enjoy writing about concepts on Views. Views are frequently used concepts, and so it’s not surprising that I have seen so many misconceptions about this subject. To clear such misconceptions, I have previously written the article SQL SERVER – The Limitations of the Views – Eleven and more…. I also wrote a follow up article wherein I demonstrated that without even creating index on the basic table, the query on the View will not use the View. You can read about this demonstration over here: SQL SERVER – Index Created on View not Used Often – Limitation of the View 12. I promised in that post that I would also write an article where I would demonstrate the condition where the Index will be used. I got many responses suggesting that I can do that with using NOEXPAND; I agree. I have already written about this in my original summary article. Here is a way for you to see how Index created on View can be utilized. We will do the following steps on this exercise: Create a Table Create a View Create Index On View Write SELECT with ORDER BY on View USE tempdb GO IF EXISTS (SELECT * FROM sys.views WHERE OBJECT_ID = OBJECT_ID(N'[dbo].[SampleView]')) DROP VIEW [dbo].[SampleView] GO IF EXISTS (SELECT * FROM sys.objects WHERE OBJECT_ID = OBJECT_ID(N'[dbo].[mySampleTable]') AND TYPE IN (N'U')) DROP TABLE [dbo].[mySampleTable] GO -- Create SampleTable CREATE TABLE mySampleTable (ID1 INT, ID2 INT, SomeData VARCHAR(100)) INSERT INTO mySampleTable (ID1,ID2,SomeData) SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY o1.name), ROW_NUMBER() OVER (ORDER BY o2.name), o2.name FROM sys.all_objects o1 CROSS JOIN sys.all_objects o2 GO -- Create View CREATE VIEW SampleView WITH SCHEMABINDING AS SELECT ID1,ID2,SomeData FROM dbo.mySampleTable GO -- Create Index on View CREATE UNIQUE CLUSTERED INDEX [IX_ViewSample] ON [dbo].[SampleView] ( ID2 ASC ) GO -- Select from view SELECT ID1,ID2,SomeData FROM SampleView ORDER BY ID2 GO When we check the execution plan for this , we find it clearly that the Index created on the View is utilized. ORDER BY clause uses the Index created on the View. I hope this makes the puzzle simpler on how the Index is used on the View. Again, I strongly recommend reading my earlier series about the limitations of the Views found here: SQL SERVER – The Limitations of the Views – Eleven and more…. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL Optimization, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQL View, T SQL, Technology

Read the article
SQL SERVER – DMV – sys.dm_exec_query_optimizer_info – Statistics of Optimizer

- by pinaldave

Incredibly, SQL Server has so much information to share with us. Every single day, I am amazed with this SQL Server technology. Sometimes I find several interesting information by just querying few of the DMV. And when I present this info in front of my client during performance tuning consultancy, they are surprised with my findings. Today, I am going to share one of the hidden gems of DMV with you, the one which I frequently use to understand what’s going on under the hood of SQL Server. SQL Server keeps the record of most of the operations of the Query Optimizer. We can learn many interesting details about the optimizer which can be utilized to improve the performance of server. SELECT * FROM sys.dm_exec_query_optimizer_info WHERE counter IN ('optimizations', 'elapsed time','final cost', 'insert stmt','delete stmt','update stmt', 'merge stmt','contains subquery','tables', 'hints','order hint','join hint', 'view reference','remote query','maximum DOP', 'maximum recursion level','indexed views loaded', 'indexed views matched','indexed views used', 'indexed views updated','dynamic cursor request', 'fast forward cursor request') All occurrence values are cumulative and are set to 0 at system restart. All values for value fields are set to NULL at system restart. I have removed a few of the internal counters from the script above, and kept only documented details. Let us check the result of the above query. As you can see, there is so much vital information that is revealed in above query. I can easily say so many things about how many times Optimizer was triggered and what the average time taken by it to optimize my queries was. Additionally, I can also determine how many times update, insert or delete statements were optimized. I was able to quickly figure out that my client was overusing the Query Hints using this dynamic management view. If you have been reading my blog, I am sure you are aware of my series related to SQL Server Views SQL SERVER – The Limitations of the Views – Eleven and more…. With this, I can take a quick look and figure out how many times Views were used in various solutions within the query. Moreover, you can easily know what fraction of the optimizations has been involved in tuning server. For example, the following query would tell me, in total optimizations, what the fraction of time View was “reference“. As this View also includes system Views and DMVs, the number is a bit higher on my machine. SELECT (SELECT CAST (occurrence AS FLOAT) FROM sys.dm_exec_query_optimizer_info WHERE counter = 'view reference') / (SELECT CAST (occurrence AS FLOAT) FROM sys.dm_exec_query_optimizer_info WHERE counter = 'optimizations') AS ViewReferencedFraction Reference : Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL DMV, SQL Optimization, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQLServer, T SQL, Technology

Read the article
SQL SERVER – Public Training and Private Training – Differences and Similarities

- by pinaldave

Earlier this year, I was on Road SQL Server Seminars. I did many SQL Server Performance Trainings and SQL Server Performance Consultations throughout the year but I feel the most rewarding exercise is always the one when instructor learns something from students, too. I was just talking to my wife, Nupur – she manages my logistics and administration related activities – and she pointed out that this year I have done 62% consultations and 38% trainings. I was bit surprised as I thought the numbers would be reversed. Every time I review the year, I think of training done at organizations. Well, I cannot argue with reality, I have done more consultations (some would call them projects) than training. I told my wife that I enjoy consultations more than training. She promptly asked me a question which was not directly related but made me think for long time, and in the end resulted in this blog post. Nupur asked me: what do I enjoy the most, public training or private training? I had a long conversation with her on this subject. I am not going to write long blog post which can change your life here. This is rather a small post condensing my one hour discussion into 200 words. Public Training is fun because… There are lots of different kinds of attendees There are always vivid questions Lots of questions on questions Less interest in theory and more interest in demos Good opportunity of future business Private Training is fun because… There is a focused interest One question is discussed deeply because of existing company issues More interest in “how it happened” concepts – under the hood operations Good connection with attendees This is also a good opportunity of future business Here I will stop my monologue and I want to open up this question to all of you: Question to Attendees - Which one do you enjoy the most – Public Training or Private Training? Question to Trainers - What do you enjoy the most – Public Training or Private Training? Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL Optimization, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, SQL Training, SQLAuthority News, T SQL, Technology

Read the article
Correct way to handle path-finding collision matrix

- by Xander Lamkins

Here is an example of me utilizing path finding. The red grid represents the grid utilized by my A* library to locate a distance. This picture is only an example, currently it is all calculated on the 1x1 pixel level (pretty darn laggy). I want to make it so that the farther I click, the less accurate it will be (split the map into larger grid pieces). Edit: as mentioned by Eric, this is not a required game mechanic. I am perfectly fine with any method that allows me to make this accurate while still fast. This isn't the really the topic of this question though. The problem I have is, my current library uses a two dimensional grid of integers. The higher the number in a cell, the more resistance for that grid tile. Currently I'm setting all unwalkable spots to Integer Max. Here is an example of what I want: I'm just not sure how I should set up the arrays of integers of the grid. Every time an element is added/removed to/from the game, it's collision details are updated in the table. Here is a picture of what the map looks like on my collision layer: I probably shouldn't be creating new arrays every time I have to do a path find because my game needs to support tons of PF at the same time. Should I have multiple arrays that are all updated when the dynamic elements are updated (a building is built/a building is destroyed). The problem I see with this is that it will probably make the creation and destruction of buildings a little more laggy than I would want because it would be setting the collision grid for each built in accuracy level. I would also have to add more/remove some arrays if I ever in the future changed the map size. Should I generate the new array based on an accuracy value every time I need to PF? The problem I see with this is that it will probably make any form of PF just as laggy because it will have to search through a MapWidth x MapHeight number of cells to shrink it all down. Or is there a better way? I'm certainly not the best at optimizing really anything. I've just started dealing with XNA so I'm not used to having optimization code really doing much of an affect until now... :( If you need code examples, please ask. I'll add it as an edit. EDIT: While this doesn't directly relate to the question, I figure the more information I provide, the better. To keep your units from moving as accurately to the players desired position, I've decided that once the unit PFs over to the less accurate grid piece, it will then PF on a more accurate level to the exact position requested.

Read the article
Does OO, TDD, and Refactoring to Smaller Functions affect Speed of Code?

- by Dennis

In Computer Science field, I have noticed a notable shift in thinking when it comes to programming. The advice as it stands now is write smaller, more testable code refactor existing code into smaller and smaller chunks of code until most of your methods/functions are just a few lines long write functions that only do one thing (which makes them smaller again) This is a change compared to the "old" or "bad" code practices where you have methods spanning 2500 lines, and big classes doing everything. My question is this: when it call comes down to machine code, to 1s and 0s, to assembly instructions, should I be at all concerned that my class-separated code with variety of small-to-tiny functions generates too much extra overhead? While I am not exactly familiar with how OO code and function calls are handled in ASM in the end, I do have some idea. I assume that each extra function call, object call, or include call (in some languages), generate an extra set of instructions, thereby increasing code's volume and adding various overhead, without adding actual "useful" code. I also imagine that good optimizations can be done to ASM before it is actually ran on the hardware, but that optimization can only do so much too. Hence, my question -- how much overhead (in space and speed) does well-separated code (split up across hundreds of files, classes, and methods) actually introduce compared to having "one big method that contains everything", due to this overhead? UPDATE for clarity: I am assuming that adding more and more functions and more and more objects and classes in a code will result in more and more parameter passing between smaller code pieces. It was said somewhere (quote TBD) that up to 70% of all code is made up of ASM's MOV instruction - loading CPU registers with proper variables, not the actual computation being done. In my case, you load up CPU's time with PUSH/POP instructions to provide linkage and parameter passing between various pieces of code. The smaller you make your pieces of code, the more overhead "linkage" is required. I am concerned that this linkage adds to software bloat and slow-down and I am wondering if I should be concerned about this, and how much, if any at all, because current and future generations of programmers who are building software for the next century, will have to live with and consume software built using these practices. UPDATE: Multiple files I am writing new code now that is slowly replacing old code. In particular I've noted that one of the old classes was a ~3000 line file (as mentioned earlier). Now it is becoming a set of 15-20 files located across various directories, including test files and not including PHP framework I am using to bind some things together. More files are coming as well. When it comes to disk I/O, loading multiple files is slower than loading one large file. Of course not all files are loaded, they are loaded as needed, and disk caching and memory caching options exist, and yet still I believe that loading multiple files takes more processing than loading a single file into memory. I am adding that to my concern.

Read the article
SQL SERVER – Server Side Paging in SQL Server 2011 Performance Comparison

- by pinaldave

Earlier, I have written about SQL SERVER – Server Side Paging in SQL Server 2011 – A Better Alternative. I got many emails asking for performance analysis of paging. Here is the quick analysis of it. The real challenge of paging is all the unnecessary IO reads from the database. Network traffic was one of the reasons why paging has become a very expensive operation. I have seen many legacy applications where a complete resultset is brought back to the application and paging has been done. As what you have read earlier, SQL Server 2011 offers a better alternative to an age-old solution. This article has been divided into two parts: Test 1: Performance Comparison of the Two Different Pages on SQL Server 2011 Method In this test, we will analyze the performance of the two different pages where one is at the beginning of the table and the other one is at its end. Test 2: Performance Comparison of the Two Different Pages Using CTE (Earlier Solution from SQL Server 2005/2008) and the New Method of SQL Server 2011 We will explore this in the next article. This article will tackle test 1 first. Test 1: Retrieving Page from two different locations of the table. Run the following T-SQL Script and compare the performance. SET STATISTICS IO ON; USE AdventureWorks2008R2 GO DECLARE @RowsPerPage INT = 10, @PageNumber INT = 5 SELECT * FROM Sales.SalesOrderDetail ORDER BY SalesOrderDetailID OFFSET @PageNumber*@RowsPerPage ROWS FETCH NEXT 10 ROWS ONLY GO USE AdventureWorks2008R2 GO DECLARE @RowsPerPage INT = 10, @PageNumber INT = 12100 SELECT * FROM Sales.SalesOrderDetail ORDER BY SalesOrderDetailID OFFSET @PageNumber*@RowsPerPage ROWS FETCH NEXT 10 ROWS ONLY GO You will notice that when we are reading the page from the beginning of the table, the database pages read are much lower than when the page is read from the end of the table. This is very interesting as when the the OFFSET changes, PAGE IO is increased or decreased. In the normal case of the search engine, people usually read it from the first few pages, which means that IO will be increased as we go further in the higher parts of navigation. I am really impressed because using the new method of SQL Server 2011, PAGE IO will be much lower when the first few pages are searched in the navigation. Test 2: Retrieving Page from two different locations of the table and comparing to earlier versions. In this test, we will compare the queries of the Test 1 with the earlier solution via Common Table Expression (CTE) which we utilized in SQL Server 2005 and SQL Server 2008. Test 2 A : Page early in the table -- Test with pages early in table USE AdventureWorks2008R2 GO DECLARE @RowsPerPage INT = 10, @PageNumber INT = 5 ;WITH CTE_SalesOrderDetail AS ( SELECT *, ROW_NUMBER() OVER( ORDER BY SalesOrderDetailID) AS RowNumber FROM Sales.SalesOrderDetail PC) SELECT * FROM CTE_SalesOrderDetail WHERE RowNumber >= @PageNumber*@RowsPerPage+1 AND RowNumber <= (@PageNumber+1)*@RowsPerPage ORDER BY SalesOrderDetailID GO SET STATISTICS IO ON; USE AdventureWorks2008R2 GO DECLARE @RowsPerPage INT = 10, @PageNumber INT = 5 SELECT * FROM Sales.SalesOrderDetail ORDER BY SalesOrderDetailID OFFSET @PageNumber*@RowsPerPage ROWS FETCH NEXT 10 ROWS ONLY GO Test 2 B : Page later in the table -- Test with pages later in table USE AdventureWorks2008R2 GO DECLARE @RowsPerPage INT = 10, @PageNumber INT = 12100 ;WITH CTE_SalesOrderDetail AS ( SELECT *, ROW_NUMBER() OVER( ORDER BY SalesOrderDetailID) AS RowNumber FROM Sales.SalesOrderDetail PC) SELECT * FROM CTE_SalesOrderDetail WHERE RowNumber >= @PageNumber*@RowsPerPage+1 AND RowNumber <= (@PageNumber+1)*@RowsPerPage ORDER BY SalesOrderDetailID GO SET STATISTICS IO ON; USE AdventureWorks2008R2 GO DECLARE @RowsPerPage INT = 10, @PageNumber INT = 12100 SELECT * FROM Sales.SalesOrderDetail ORDER BY SalesOrderDetailID OFFSET @PageNumber*@RowsPerPage ROWS FETCH NEXT 10 ROWS ONLY GO From the resultset, it is very clear that in the earlier case, the pages read in the solution are always much higher than the new technique introduced in SQL Server 2011 even if we don’t retrieve all the data to the screen. If you carefully look at both the comparisons, the PAGE IO is much lesser in the case of the new technique introduced in SQL Server 2011 when we read the page from the beginning of the table and when we read it from the end. I consider this as a big improvement as paging is one of the most used features for the most part of the application. The solution introduced in SQL Server 2011 is very elegant because it also improves the performance of the query and, at large, the database. Reference : Pinal Dave (http://blog.SQLAuthority.com) Filed under: SQL, SQL Authority, SQL Optimization, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
SQL SERVER – Update Statistics are Sampled By Default

- by pinaldave

After reading my earlier post SQL SERVER – Create Primary Key with Specific Name when Creating Table on Statistics, I have received another question by a blog reader. The question is as follows: Question: Are the statistics sampled by default? Answer: Yes. The sampling rate can be specified by the user and it can be anywhere between a very low value to 100%. Let us do a small experiment to verify if the auto update on statistics is left on. Also, let’s examine a very large table that is created and statistics by default- whether the statistics are sampled or not. USE [AdventureWorks] GO -- Create Table CREATE TABLE [dbo].[StatsTest]( [ID] [int] IDENTITY(1,1) NOT NULL, [FirstName] [varchar](100) NULL, [LastName] [varchar](100) NULL, [City] [varchar](100) NULL, CONSTRAINT [PK_StatsTest] PRIMARY KEY CLUSTERED ([ID] ASC) ) ON [PRIMARY] GO -- Insert 1 Million Rows INSERT INTO [dbo].[StatsTest] (FirstName,LastName,City) SELECT TOP 1000000 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Update the statistics UPDATE STATISTICS [dbo].[StatsTest] GO -- Shows the statistics DBCC SHOW_STATISTICS ("StatsTest"PK_StatsTest) GO -- Clean up DROP TABLE [dbo].[StatsTest] GO Now let us observe the result of the DBCC SHOW_STATISTICS. The result shows that Resultset is for sure sampling for a large dataset. The percentage of sampling is based on data distribution as well as the kind of data in the table. Before dropping the table, let us check first the size of the table. The size of the table is 35 MB. Now, let us run the above code with lesser number of the rows. USE [AdventureWorks] GO -- Create Table CREATE TABLE [dbo].[StatsTest]( [ID] [int] IDENTITY(1,1) NOT NULL, [FirstName] [varchar](100) NULL, [LastName] [varchar](100) NULL, [City] [varchar](100) NULL, CONSTRAINT [PK_StatsTest] PRIMARY KEY CLUSTERED ([ID] ASC) ) ON [PRIMARY] GO -- Insert 1 Hundred Thousand Rows INSERT INTO [dbo].[StatsTest] (FirstName,LastName,City) SELECT TOP 100000 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Update the statistics UPDATE STATISTICS [dbo].[StatsTest] GO -- Shows the statistics DBCC SHOW_STATISTICS ("StatsTest"PK_StatsTest) GO -- Clean up DROP TABLE [dbo].[StatsTest] GO You can see that Rows Sampled is just the same as Rows of the table. In this case, the sample rate is 100%. Before dropping the table, let us also check the size of the table. The size of the table is less than 4 MB. Let us compare the Result set just for a valid reference. Test 1: Total Rows: 1000000, Rows Sampled: 255420, Size of the Table: 35.516 MB Test 2: Total Rows: 100000, Rows Sampled: 100000, Size of the Table: 3.555 MB The reason behind the sample in the Test1 is that the data space is larger than 8 MB, and therefore it uses more than 1024 data pages. If the data space is smaller than 8 MB and uses less than 1024 data pages, then the sampling does not happen. Sampling aids in reducing excessive data scan; however, sometimes it reduces the accuracy of the data as well. Please note that this is just a sample test and there is no way it can be claimed as a benchmark test. The result can be dissimilar on different machines. There are lots of other information can be included when talking about this subject. I will write detail post covering all the subject very soon. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL Index, SQL Optimization, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: SQL Statistics

Read the article
SQL SERVER – Introduction to Wait Stats and Wait Types – Wait Type – Day 1 of 28

- by pinaldave

I have been working a lot on Wait Stats and Wait Types recently. Last Year, I requested blog readers to send me their respective server’s wait stats. I appreciate their kind response as I have received Wait stats from my readers. I took each of the results and carefully analyzed them. I provided necessary feedback to the person who sent me his wait stats and wait types. Based on the feedbacks I got, many of the readers have tuned their server. After a while I got further feedbacks on my recommendations and again, I collected wait stats. I recorded the wait stats and my recommendations and did further research. At some point at time, there were more than 10 different round trips of the recommendations and suggestions. Finally, after six month of working my hands on performance tuning, I have collected some real world wisdom because of this. Now I plan to share my findings with all of you over here. Before anything else, please note that all of these are based on my personal observations and opinions. They may or may not match the theory available at other places. Some of the suggestions may not match your situation. Remember, every server is different and consequently, there is more than one solution to a particular problem. However, this series is written with kept wait stats in mind. While I was working on various performance tuning consultations, I did many more things than just tuning wait stats. Today we will discuss how to capture the wait stats. I use the script diagnostic script created by my friend and SQL Server Expert Glenn Berry to collect wait stats. Here is the script to collect the wait stats: -- Isolate top waits for server instance since last restart or statistics clear WITH Waits AS (SELECT wait_type, wait_time_ms / 1000. AS wait_time_s, 100. * wait_time_ms / SUM(wait_time_ms) OVER() AS pct, ROW_NUMBER() OVER(ORDER BY wait_time_ms DESC) AS rn FROM sys.dm_os_wait_stats WHERE wait_type NOT IN ('CLR_SEMAPHORE','LAZYWRITER_SLEEP','RESOURCE_QUEUE','SLEEP_TASK' ,'SLEEP_SYSTEMTASK','SQLTRACE_BUFFER_FLUSH','WAITFOR', 'LOGMGR_QUEUE','CHECKPOINT_QUEUE' ,'REQUEST_FOR_DEADLOCK_SEARCH','XE_TIMER_EVENT','BROKER_TO_FLUSH','BROKER_TASK_STOP','CLR_MANUAL_EVENT' ,'CLR_AUTO_EVENT','DISPATCHER_QUEUE_SEMAPHORE', 'FT_IFTS_SCHEDULER_IDLE_WAIT' ,'XE_DISPATCHER_WAIT', 'XE_DISPATCHER_JOIN', 'SQLTRACE_INCREMENTAL_FLUSH_SLEEP')) SELECT W1.wait_type, CAST(W1.wait_time_s AS DECIMAL(12, 2)) AS wait_time_s, CAST(W1.pct AS DECIMAL(12, 2)) AS pct, CAST(SUM(W2.pct) AS DECIMAL(12, 2)) AS running_pct FROM Waits AS W1 INNER JOIN Waits AS W2 ON W2.rn <= W1.rn GROUP BY W1.rn, W1.wait_type, W1.wait_time_s, W1.pct HAVING SUM(W2.pct) - W1.pct < 99 OPTION (RECOMPILE); -- percentage threshold GO This script uses Dynamic Management View sys.dm_os_wait_stats to collect the wait stats. It omits the system-related wait stats which are not useful to diagnose performance-related bottleneck. Additionally, not OPTION (RECOMPILE) at the end of the DMV will ensure that every time the query runs, it retrieves new data and not the cached data. This dynamic management view collects all the information since the time when the SQL Server services have been restarted. You can also manually clear the wait stats using the following command: DBCC SQLPERF('sys.dm_os_wait_stats', CLEAR); Once the wait stats are collected, we can start analysis them and try to see what is causing any particular wait stats to achieve higher percentages than the others. Many waits stats are related to one another. When the CPU pressure is high, all the CPU-related wait stats show up on top. But when that is fixed, all the wait stats related to the CPU start showing reasonable percentages. It is difficult to have a sure solution, but there are good indications and good suggestions on how to solve this. I will keep this blog post updated as I will post more details about wait stats and how I reduce them. The reference to Book On Line is over here. Of course, I have selected February to run this Wait Stats series. I am already cheating by having the smallest month to run this series. :) Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: DMV, Pinal Dave, PostADay, SQL, SQL Authority, SQL Optimization, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQL Wait Stats, SQL Wait Types, T SQL, Technology

Read the article
SQL SERVER – Video – Beginning Performance Tuning with SQL Server Execution Plan

- by pinaldave

Traveling can be most interesting or most exhausting experience. However, traveling is always the most enlightening experience one can have. While going to long journey one has to prepare a lot of things. Pack necessary travel gears, clothes and medicines. However, the most essential part of travel is the journey to the destination. There are many variations one prefer but the ultimate goal is to have a delightful experience during the journey. Here is the video available which explains how to begin with SQL Server Execution plans. Performance Tuning is a Journey Performance tuning is just like a long journey. The goal of performance tuning is efficient and least resources consuming query execution with accurate results. Just as maps are the most essential aspect of performance tuning the same way, execution plans are essentially maps for SQL Server to reach to the resultset. The goal of the execution plan is to find the most efficient path which translates the least usage of the resources (CPU, memory, IO etc). Execution Plans are like Maps When online maps were invented (e.g. Bing, Google, Mapquests etc) initially it was not possible to customize them. They were given a single route to reach to the destination. As time evolved now it is possible to give various hints to the maps, for example ‘via public transport’, ‘walking’, ‘fastest route’, ‘shortest route’, ‘avoid highway’. There are places where we manually drag the route and make it appropriate to our needs. The same situation is with SQL Server Execution Plans, if we want to tune the queries, we need to understand the execution plans and execution plans internals. We need to understand the smallest details which relate to execution plan when we our destination is optimal queries. Understanding Execution Plans The biggest challenge with maps are figuring out the optimal path. The same way the most common challenge with execution plans is where to start from and which precise route to take. Here is a quick list of the frequently asked questions related to execution plans: Should I read the execution plans from bottoms up or top down? Is execution plans are left to right or right to left? What is the relational between actual execution plan and estimated execution plan? When I mouse over operator I see CPU and IO but not memory, why? Sometime I ran the query multiple times and I get different execution plan, why? How to cache the query execution plan and data? I created an optimal index but the query is not using it. What should I change – query, index or provide hints? What are the tools available which helps quickly to debug performance problems? Etc… Honestly the list is quite a big and humanly impossible to write everything in the words. SQL Server Performance: Introduction to Query Tuning My friend Vinod Kumar and I have created for the same a video learning course for beginning performance tuning. We have covered plethora of the subject in the course. Here is the quick list of the same: Execution Plan Basics Essential Indexing Techniques Query Design for Performance Performance Tuning Tools Tips and Tricks Checklist: Performance Tuning We believe we have covered a lot in this four hour course and we encourage you to go over the video course if you are interested in Beginning SQL Server Performance Tuning and Query Tuning. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Optimization, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology, Video Tagged: Execution Plan

Read the article
SQL SERVER – Reducing CXPACKET Wait Stats for High Transactional Database

- by pinaldave

While engaging in a performance tuning consultation for a client, a situation occurred where they were facing a lot of CXPACKET Waits Stats. The client asked me if I could help them reduce this huge number of wait stats. I usually receive this kind of request from other client as well, but the important thing to understand is whether this question has any merits or benefits, or not. Before we continue the resolution, let us understand what CXPACKET Wait Stats are. The official definition suggests that CXPACKET Wait Stats occurs when trying to synchronize the query processor exchange iterator. You may consider lowering the degree of parallelism if a conflict concerning this wait type develops into a problem. (from BOL) In simpler words, when a parallel operation is created for SQL Query, there are multiple threads for a single query. Each query deals with a different set of the data (or rows). Due to some reasons, one or more of the threads lag behind, creating the CXPACKET Wait Stat. Threads which came first have to wait for the slower thread to finish. The Wait by a specific completed thread is called CXPACKET Wait Stat. Note that CXPACKET Wait is done by completed thread and not the one which are unfinished. “Note that not all the CXPACKET wait types are bad. You might experience a case when it totally makes sense. There might also be cases when this is also unavoidable. If you remove this particular wait type for any query, then that query may run slower because the parallel operations are disabled for the query.” Now let us see what the best practices to reduce the CXPACKET Wait Stats are. The suggestions, with which you will find that if you search online through the browser, would play a major role as and might be asked about their jobs In addition, might tell you that you should set ‘maximum degree of parallelism’ to 1. I do agree with these suggestions, too; however, I think this is not the final resolutions. As soon as you set your entire query to run on single CPU, you will get a very bad performance from the queries which are actually performing okay when using parallelism. The best suggestion to this is that you set ‘the maximum degree of parallelism’ to a lower number or 1 (be very careful with this – it can create more problems) but tune the queries which can be benefited from multiple CPU’s. You can use query hint OPTION (MAXDOP 0) to run the server to use parallelism. Here is the two-quick script which helps to resolve these issues: Change MAXDOP at Server Level EXEC sys.sp_configure N'max degree of parallelism', N'1' GO RECONFIGURE WITH OVERRIDE GO Run Query with all the CPU (using parallelism) USE AdventureWorks GO SELECT * FROM Sales.SalesOrderDetail ORDER BY ProductID OPTION (MAXDOP 0) GO Below is the blog post which will help you to find all the parallel query in your server. SQL SERVER – Find Queries using Parallelism from Cached Plan Please note running Queries in single CPU may worsen your performance and it is not recommended at all. Infect this can be very bad advise. I strongly suggest that you identify the queries which are offending and tune them instead of following any other suggestions. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: SQL, SQL Authority, SQL Optimization, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, SQL White Papers, SQLAuthority News, T SQL, Technology

Read the article
SQL SERVER – Data Pages in Buffer Pool – Data Stored in Memory Cache

- by pinaldave

This will drop all the clean buffers so we will be able to start again from there. Now, run the following script and check the execution plan of the query. Have you ever wondered what types of data are there in your cache? During SQL Server Trainings, I am usually asked if there is any way one can know how much data in a table is stored in the memory cache? The more detailed question I usually get is if there are multiple indexes on table (and used in a query), were the data of the single table stored multiple times in the memory cache or only for a single time? Here is a query you can run to figure out what kind of data is stored in the cache. USE AdventureWorks GO SELECT COUNT(*) AS cached_pages_count, name AS BaseTableName, IndexName, IndexTypeDesc FROM sys.dm_os_buffer_descriptors AS bd INNER JOIN ( SELECT s_obj.name, s_obj.index_id, s_obj.allocation_unit_id, s_obj.OBJECT_ID, i.name IndexName, i.type_desc IndexTypeDesc FROM ( SELECT OBJECT_NAME(OBJECT_ID) AS name, index_id ,allocation_unit_id, OBJECT_ID FROM sys.allocation_units AS au INNER JOIN sys.partitions AS p ON au.container_id = p.hobt_id AND (au.type = 1 OR au.type = 3) UNION ALL SELECT OBJECT_NAME(OBJECT_ID) AS name, index_id, allocation_unit_id, OBJECT_ID FROM sys.allocation_units AS au INNER JOIN sys.partitions AS p ON au.container_id = p.partition_id AND au.type = 2 ) AS s_obj LEFT JOIN sys.indexes i ON i.index_id = s_obj.index_id AND i.OBJECT_ID = s_obj.OBJECT_ID ) AS obj ON bd.allocation_unit_id = obj.allocation_unit_id WHERE database_id = DB_ID() GROUP BY name, index_id, IndexName, IndexTypeDesc ORDER BY cached_pages_count DESC; GO Now let us run the query above and observe the output of the same. We can see in the above query that there are four columns. Cached_Pages_Count lists the pages cached in the memory. BaseTableName lists the original base table from which data pages are cached. IndexName lists the name of the index from which pages are cached. IndexTypeDesc lists the type of index. Now, let us do one more experience here. Please note that you should not run this test on a production server as it can extremely reduce the performance of the database. DBCC DROPCLEANBUFFERS This will drop all the clean buffers and we will be able to start again from there. Now run following script and check the execution plan for the same. USE AdventureWorks GO SELECT UnitPrice, ModifiedDate FROM Sales.SalesOrderDetail WHERE SalesOrderDetailID BETWEEN 1 AND 100 GO The execution plans contain the usage of two different indexes. Now, let us run the script that checks the pages cached in SQL Server. It will give us the following output. It is clear from the Resultset that when more than one index is used, datapages related to both or all of the indexes are stored in Memory Cache separately. Let me know what you think of this article. I had a great pleasure while writing this article because I was able to write on this subject, which I like the most. In the next article, we will exactly see what data are cached and those that are not cached, using a few undocumented commands. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: DMV, Pinal Dave, SQL, SQL Authority, SQL Optimization, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: SQL DMV

Read the article
SQL SERVER – How to Ignore Columnstore Index Usage in Query

- by pinaldave

Earlier I wrote about SQL SERVER – Fundamentals of Columnstore Index and very first question I received in email was as following. “We are using SQL Server 2012 CTP3 and so far so good. In our data warehouse solution we have created 1 non-clustered columnstore index on our large fact table. We have very unique situation but your article did not cover it. We are running few queries on our fact table which is working very efficiently but there is one query which earlier was running very fine but after creating this non-clustered columnstore index this query is running very slow. We dropped the columnstore index and suddenly this one query is running fast but other queries which were benefited by this columnstore index it is running slow. Any workaround in this situation?” In summary the question in simple words “How can we ignore using columnstore index in selective queries?” Very interesting question – you can use I can understand there may be the cases when columnstore index is not ideal and needs to be ignored the same. You can use the query hint IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX to ignore the columnstore index. SQL Server Engine will use any other index which is best after ignoring the columnstore index. Here is the quick script to prove the same. We will first create sample database and then create columnstore index on the same. Once columnstore index is created we will write simple query. This query will use columnstore index. We will then show the usage of the query hint. USE AdventureWorks GO -- Create New Table CREATE TABLE [dbo].[MySalesOrderDetail]( [SalesOrderID] [int] NOT NULL, [SalesOrderDetailID] [int] NOT NULL, [CarrierTrackingNumber] [nvarchar](25) NULL, [OrderQty] [smallint] NOT NULL, [ProductID] [int] NOT NULL, [SpecialOfferID] [int] NOT NULL, [UnitPrice] [money] NOT NULL, [UnitPriceDiscount] [money] NOT NULL, [LineTotal] [numeric](38, 6) NOT NULL, [rowguid] [uniqueidentifier] NOT NULL, [ModifiedDate] [datetime] NOT NULL ) ON [PRIMARY] GO -- Create clustered index CREATE CLUSTERED INDEX [CL_MySalesOrderDetail] ON [dbo].[MySalesOrderDetail] ( [SalesOrderDetailID]) GO -- Create Sample Data Table -- WARNING: This Query may run upto 2-10 minutes based on your systems resources INSERT INTO [dbo].[MySalesOrderDetail] SELECT S1.* FROM Sales.SalesOrderDetail S1 GO 100 -- Create ColumnStore Index CREATE NONCLUSTERED COLUMNSTORE INDEX [IX_MySalesOrderDetail_ColumnStore] ON [MySalesOrderDetail] (UnitPrice, OrderQty, ProductID) GO Now we have created columnstore index so if we run following query it will use for sure the same index. -- Select Table with regular Index SELECT ProductID, SUM(UnitPrice) SumUnitPrice, AVG(UnitPrice) AvgUnitPrice, SUM(OrderQty) SumOrderQty, AVG(OrderQty) AvgOrderQty FROM [dbo].[MySalesOrderDetail] GROUP BY ProductID ORDER BY ProductID GO We can specify Query Hint IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX as described in following query and it will not use columnstore index. -- Select Table with regular Index SELECT ProductID, SUM(UnitPrice) SumUnitPrice, AVG(UnitPrice) AvgUnitPrice, SUM(OrderQty) SumOrderQty, AVG(OrderQty) AvgOrderQty FROM [dbo].[MySalesOrderDetail] GROUP BY ProductID ORDER BY ProductID OPTION (IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX) GO Let us clean up the database. -- Cleanup DROP INDEX [IX_MySalesOrderDetail_ColumnStore] ON [dbo].[MySalesOrderDetail] GO TRUNCATE TABLE dbo.MySalesOrderDetail GO DROP TABLE dbo.MySalesOrderDetail GO Again, make sure that you use hint sparingly and understanding the proper implication of the same. Make sure that you test it with and without hint and select the best option after review of your administrator. Here is the question for you – have you started to use SQL Server 2012 for your validation and development (not on production)? It will be interesting to know the answer. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Index, SQL Optimization, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
SQL SERVER – Beginning SQL Server: One Step at a Time – SQL Server Magazine

- by pinaldave

I am glad to announce that along with SQLAuthority.com, I will be blogging on the prominent site of SQL Server Magazine. My very first blog post there is already live; read here: Beginning SQL Server: One Step at a Time. My association with SQL Server Magazine has been quite long, I have written nearly 7 to 8 SQL Server articles for the print magazine and it has been a great experience. I used to stay in the United States at that time. I moved back to India for good, and during this process, I had put everything on hold for a while. Just like many things, “temporary” things become “permanent” – coming back to SQLMag was on hold for long time. Well, this New Year, things have changed – once again, I am back with my online presence at SQLMag.com. Everybody is a beginner at every task or activity at some point of his/her life: spelling words for the first time; learning how to drive for the first time, etc. No one is perfect at the start of any task, but every human is different. As time passes, we all develop our interests and begin to study our subject of interest. Most of us dream to get a job in the area of our study – however things change as time passes. I recently read somewhere online (I could not find the link again while writing this one) that all the successful people in various areas have never studied in the area in which they are successful. After going through a formal learning process of what we love, we refuse to stop learning, and we finally stop changing career and focus areas. We move, we dare and we progress. IT field is similar to our life. New IT professionals come to this field every day. There are two types of beginners – a) those who are associated with IT field but not familiar with other technologies, and b) those who are absolutely new to the IT field. Learning a new technology is always exciting and overwhelming for enthusiasts. I am working with database (in particular) for SQL Server for more than 7 years but I am still overwhelmed with so many things to learn. I continue to learn and I do not think that I should ever stop doing so. Just like everybody, I want to be in the race and get ahead in learning the technology. For the same, I am always looking for good guidance. I always try to find a good article, blog or book chapter, which can teach me what I really want to learn at this stage in my career and can be immensely helpful. Quite often, I prefer to read the material where the author does not judge me or assume my understanding. I like to read new concepts like a child, who takes his/her first steps of learning without any prior knowledge. Keeping my personal philosophy and preference in mind, I will be blogging on SQL Server Magazine site. I will be blogging on the beginners stuff. I will be blogging for them, who really want to start and make a mark in this area. I will be blogging for all those who have an extreme passion for learning. I am happy that this is a good start for this year. One of my resolutions is to help every beginner. It is totally possible that in future they all will grow and find the same article quite ‘easy‘ – well when that happens, it indicates the success of the article and material! Well, I encourage everybody to read my SQL Server Magazine blog – I will be blogging there frequently on various topics. To begin, we will be talking about performance tuning, and I assure that I will not shy away from other multiple areas. Read my SQL Server Magazine Blog: Beginning SQL Server: One Step at a Time I think the title says it all. Do leave your comments and feedback to indicate your preference of subject and interest. I am going to continue writing on subject, and the aim is of course to help grow in this field. Reference : Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Optimization, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, SQLAuthority News, T SQL, Technology

Read the article
SQL SERVER – Disabled Index and Update Statistics

- by pinaldave

When we try to update the statistics, it throws an error as if the clustered index is disabled. Now let us enable the clustered index only and attempt to update the statistics of the table right after that. Have you ever come across the situation where a conversation never gets over and it continues even though original point of discussion has passed. I am facing the same situation in the case of Disabled Index. Here is the link to original conversations. SQL SERVER – Disable Clustered Index and Data Insert – Reader had a issue here with Disabled Index SQL SERVER – Understanding ALTER INDEX ALL REBUILD with Disabled Clustered Index – Reader asked the effect of Rebuilding Indexes The same reader asked me today – “I understood what the disabled indexes do; what is their effect on statistics. Is it true that even though indexes are disabled, they continue updating the statistics?“ The answer is very interesting: If you have disabled clustered index, you will be not able to update the statistics at all for any index. If you have enabled clustered index and disabled non clustered index when you update the statistics of the table, it automatically updates the statistics of the ALL (disabled and enabled – both) the indexes on the table. If you are not satisfied with the answer, let us go over a simple example. I have written necessary comments in the code itself to have a clear idea. USE tempdb GO -- Drop Table if Exists IF EXISTS (SELECT * FROM sys.objects WHERE OBJECT_ID = OBJECT_ID(N'[dbo].[TableName]') AND type IN (N'U')) DROP TABLE [dbo].[TableName] GO -- Create Table CREATE TABLE [dbo].[TableName]( [ID] [int] NOT NULL, [FirstCol] [varchar](50) NULL ) GO -- Insert Some data INSERT INTO TableName SELECT 1, 'First' UNION ALL SELECT 2, 'Second' UNION ALL SELECT 3, 'Third' UNION ALL SELECT 4, 'Fourth' UNION ALL SELECT 5, 'Five' GO -- Create Clustered Index ALTER TABLE [TableName] ADD CONSTRAINT [PK_TableName] PRIMARY KEY CLUSTERED ([ID] ASC) GO -- Create Nonclustered Index CREATE UNIQUE NONCLUSTERED INDEX [IX_NonClustered_TableName] ON [dbo].[TableName] ([FirstCol] ASC) GO -- Check that all the indexes are enabled SELECT OBJECT_NAME(OBJECT_ID), Name, type_desc, is_disabled FROM sys.indexes WHERE OBJECT_NAME(OBJECT_ID) = 'TableName' GO Now let us update the statistics of the table and check the statistics update date. -- Update the stats of table UPDATE STATISTICS TableName WITH FULLSCAN GO -- Check Statistics Last Updated Datetime SELECT name AS index_name, STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated FROM sys.indexes WHERE OBJECT_ID = OBJECT_ID('TableName') GO Now let us disable the indexes and check if they are disabled using sys.indexes. -- Disable Indexes -- Disable Nonclustered Index ALTER INDEX [IX_NonClustered_TableName] ON [dbo].[TableName] DISABLE GO -- Disable Clustered Index ALTER INDEX [PK_TableName] ON [dbo].[TableName] DISABLE GO -- Check that all the indexes are disabled SELECT OBJECT_NAME(OBJECT_ID), Name, type_desc, is_disabled FROM sys.indexes WHERE OBJECT_NAME(OBJECT_ID) = 'TableName' GO Let us try to update the statistics of the table. -- Update the stats of table UPDATE STATISTICS TableName WITH FULLSCAN GO /* -- Above operation should thrown following error Msg 1974, Level 16, State 1, Line 1 Cannot perform the specified operation on table 'TableName' because its clustered index 'PK_TableName' is disabled. */ When we try to update the statistics it throws an error as it clustered index is disabled. Now let us enable the clustered index only and attempt to update the statistics of the table right after that. -- Now let us rebuild clustered index only ALTER INDEX [PK_TableName] ON [dbo].[TableName] REBUILD GO -- Check that all the indexes status SELECT OBJECT_NAME(OBJECT_ID), Name, type_desc, is_disabled FROM sys.indexes WHERE OBJECT_NAME(OBJECT_ID) = 'TableName' GO -- Check Statistics Last Updated Datetime SELECT name AS index_name, STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated FROM sys.indexes WHERE OBJECT_ID = OBJECT_ID('TableName') GO -- Update the stats of table UPDATE STATISTICS TableName WITH FULLSCAN GO -- Check Statistics Last Updated Datetime SELECT name AS index_name, STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated FROM sys.indexes WHERE OBJECT_ID = OBJECT_ID('TableName') GO We can clearly see that even though the nonclustered index is disabled it is also updated. If you do not need a nonclustered index, I suggest you to drop it as keeping them disabled is an overhead on your system. This is because every time the statistics are updated for system all the statistics for disabled indexesare also updated. -- Clean up DROP TABLE [TableName] GO The complete script is given below for easy reference. USE tempdb GO -- Drop Table if Exists IF EXISTS (SELECT * FROM sys.objects WHERE OBJECT_ID = OBJECT_ID(N'[dbo].[TableName]') AND type IN (N'U')) DROP TABLE [dbo].[TableName] GO -- Create Table CREATE TABLE [dbo].[TableName]( [ID] [int] NOT NULL, [FirstCol] [varchar](50) NULL ) GO -- Insert Some data INSERT INTO TableName SELECT 1, 'First' UNION ALL SELECT 2, 'Second' UNION ALL SELECT 3, 'Third' UNION ALL SELECT 4, 'Fourth' UNION ALL SELECT 5, 'Five' GO -- Create Clustered Index ALTER TABLE [TableName] ADD CONSTRAINT [PK_TableName] PRIMARY KEY CLUSTERED ([ID] ASC) GO -- Create Nonclustered Index CREATE UNIQUE NONCLUSTERED INDEX [IX_NonClustered_TableName] ON [dbo].[TableName] ([FirstCol] ASC) GO -- Check that all the indexes are enabled SELECT OBJECT_NAME(OBJECT_ID), Name, type_desc, is_disabled FROM sys.indexes WHERE OBJECT_NAME(OBJECT_ID) = 'TableName' GO -- Update the stats of table UPDATE STATISTICS TableName WITH FULLSCAN GO -- Check Statistics Last Updated Datetime SELECT name AS index_name, STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated FROM sys.indexes WHERE OBJECT_ID = OBJECT_ID('TableName') GO -- Disable Indexes -- Disable Nonclustered Index ALTER INDEX [IX_NonClustered_TableName] ON [dbo].[TableName] DISABLE GO -- Disable Clustered Index ALTER INDEX [PK_TableName] ON [dbo].[TableName] DISABLE GO -- Check that all the indexes are disabled SELECT OBJECT_NAME(OBJECT_ID), Name, type_desc, is_disabled FROM sys.indexes WHERE OBJECT_NAME(OBJECT_ID) = 'TableName' GO -- Update the stats of table UPDATE STATISTICS TableName WITH FULLSCAN GO /* -- Above operation should thrown following error Msg 1974, Level 16, State 1, Line 1 Cannot perform the specified operation on table 'TableName' because its clustered index 'PK_TableName' is disabled. */ -- Now let us rebuild clustered index only ALTER INDEX [PK_TableName] ON [dbo].[TableName] REBUILD GO -- Check that all the indexes status SELECT OBJECT_NAME(OBJECT_ID), Name, type_desc, is_disabled FROM sys.indexes WHERE OBJECT_NAME(OBJECT_ID) = 'TableName' GO -- Check Statistics Last Updated Datetime SELECT name AS index_name, STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated FROM sys.indexes WHERE OBJECT_ID = OBJECT_ID('TableName') GO -- Update the stats of table UPDATE STATISTICS TableName WITH FULLSCAN GO -- Check Statistics Last Updated Datetime SELECT name AS index_name, STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated FROM sys.indexes WHERE OBJECT_ID = OBJECT_ID('TableName') GO -- Clean up DROP TABLE [TableName] GO Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL Index, SQL Optimization, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: SQL Statistics

Read the article

Search Results

Search found 3512 results on 141 pages for 'premature optimization'.

Page 34/141 | < Previous Page | 30 31 32 33 34 35 36 37 38 39 40 41 | Next Page >

- by __roland__

- by sandeepan-nath

- by user1748356

- by Adam

- by chrisamiller

- by Evgeny

- by user89

- by pinaldave

- by pinaldave

- by pinaldave

- by pinaldave

- by pinaldave

- by pinaldave

- by pinaldave

- by Xander Lamkins

- by Dennis

- by pinaldave

- by pinaldave

- by pinaldave

- by pinaldave

- by pinaldave

- by pinaldave

- by pinaldave

- by pinaldave

- by pinaldave

< Previous Page | 30 31 32 33 34 35 36 37 38 39 40 41 | Next Page >

- by roland