Search Results

Search found 8262 results on 331 pages for 'optimization algorithm'.

Page 71/331 | < Previous Page | 67 68 69 70 71 72 73 74 75 76 77 78 | Next Page >

SQL SERVER – Update Statistics are Sampled By Default

- by pinaldave

After reading my earlier post SQL SERVER – Create Primary Key with Specific Name when Creating Table on Statistics, I have received another question by a blog reader. The question is as follows: Question: Are the statistics sampled by default? Answer: Yes. The sampling rate can be specified by the user and it can be anywhere between a very low value to 100%. Let us do a small experiment to verify if the auto update on statistics is left on. Also, let’s examine a very large table that is created and statistics by default- whether the statistics are sampled or not. USE [AdventureWorks] GO -- Create Table CREATE TABLE [dbo].[StatsTest]( [ID] [int] IDENTITY(1,1) NOT NULL, [FirstName] [varchar](100) NULL, [LastName] [varchar](100) NULL, [City] [varchar](100) NULL, CONSTRAINT [PK_StatsTest] PRIMARY KEY CLUSTERED ([ID] ASC) ) ON [PRIMARY] GO -- Insert 1 Million Rows INSERT INTO [dbo].[StatsTest] (FirstName,LastName,City) SELECT TOP 1000000 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Update the statistics UPDATE STATISTICS [dbo].[StatsTest] GO -- Shows the statistics DBCC SHOW_STATISTICS ("StatsTest"PK_StatsTest) GO -- Clean up DROP TABLE [dbo].[StatsTest] GO Now let us observe the result of the DBCC SHOW_STATISTICS. The result shows that Resultset is for sure sampling for a large dataset. The percentage of sampling is based on data distribution as well as the kind of data in the table. Before dropping the table, let us check first the size of the table. The size of the table is 35 MB. Now, let us run the above code with lesser number of the rows. USE [AdventureWorks] GO -- Create Table CREATE TABLE [dbo].[StatsTest]( [ID] [int] IDENTITY(1,1) NOT NULL, [FirstName] [varchar](100) NULL, [LastName] [varchar](100) NULL, [City] [varchar](100) NULL, CONSTRAINT [PK_StatsTest] PRIMARY KEY CLUSTERED ([ID] ASC) ) ON [PRIMARY] GO -- Insert 1 Hundred Thousand Rows INSERT INTO [dbo].[StatsTest] (FirstName,LastName,City) SELECT TOP 100000 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Update the statistics UPDATE STATISTICS [dbo].[StatsTest] GO -- Shows the statistics DBCC SHOW_STATISTICS ("StatsTest"PK_StatsTest) GO -- Clean up DROP TABLE [dbo].[StatsTest] GO You can see that Rows Sampled is just the same as Rows of the table. In this case, the sample rate is 100%. Before dropping the table, let us also check the size of the table. The size of the table is less than 4 MB. Let us compare the Result set just for a valid reference. Test 1: Total Rows: 1000000, Rows Sampled: 255420, Size of the Table: 35.516 MB Test 2: Total Rows: 100000, Rows Sampled: 100000, Size of the Table: 3.555 MB The reason behind the sample in the Test1 is that the data space is larger than 8 MB, and therefore it uses more than 1024 data pages. If the data space is smaller than 8 MB and uses less than 1024 data pages, then the sampling does not happen. Sampling aids in reducing excessive data scan; however, sometimes it reduces the accuracy of the data as well. Please note that this is just a sample test and there is no way it can be claimed as a benchmark test. The result can be dissimilar on different machines. There are lots of other information can be included when talking about this subject. I will write detail post covering all the subject very soon. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL Index, SQL Optimization, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: SQL Statistics

Read the article
SQL SERVER – Introduction to Wait Stats and Wait Types – Wait Type – Day 1 of 28

- by pinaldave

I have been working a lot on Wait Stats and Wait Types recently. Last Year, I requested blog readers to send me their respective server’s wait stats. I appreciate their kind response as I have received Wait stats from my readers. I took each of the results and carefully analyzed them. I provided necessary feedback to the person who sent me his wait stats and wait types. Based on the feedbacks I got, many of the readers have tuned their server. After a while I got further feedbacks on my recommendations and again, I collected wait stats. I recorded the wait stats and my recommendations and did further research. At some point at time, there were more than 10 different round trips of the recommendations and suggestions. Finally, after six month of working my hands on performance tuning, I have collected some real world wisdom because of this. Now I plan to share my findings with all of you over here. Before anything else, please note that all of these are based on my personal observations and opinions. They may or may not match the theory available at other places. Some of the suggestions may not match your situation. Remember, every server is different and consequently, there is more than one solution to a particular problem. However, this series is written with kept wait stats in mind. While I was working on various performance tuning consultations, I did many more things than just tuning wait stats. Today we will discuss how to capture the wait stats. I use the script diagnostic script created by my friend and SQL Server Expert Glenn Berry to collect wait stats. Here is the script to collect the wait stats: -- Isolate top waits for server instance since last restart or statistics clear WITH Waits AS (SELECT wait_type, wait_time_ms / 1000. AS wait_time_s, 100. * wait_time_ms / SUM(wait_time_ms) OVER() AS pct, ROW_NUMBER() OVER(ORDER BY wait_time_ms DESC) AS rn FROM sys.dm_os_wait_stats WHERE wait_type NOT IN ('CLR_SEMAPHORE','LAZYWRITER_SLEEP','RESOURCE_QUEUE','SLEEP_TASK' ,'SLEEP_SYSTEMTASK','SQLTRACE_BUFFER_FLUSH','WAITFOR', 'LOGMGR_QUEUE','CHECKPOINT_QUEUE' ,'REQUEST_FOR_DEADLOCK_SEARCH','XE_TIMER_EVENT','BROKER_TO_FLUSH','BROKER_TASK_STOP','CLR_MANUAL_EVENT' ,'CLR_AUTO_EVENT','DISPATCHER_QUEUE_SEMAPHORE', 'FT_IFTS_SCHEDULER_IDLE_WAIT' ,'XE_DISPATCHER_WAIT', 'XE_DISPATCHER_JOIN', 'SQLTRACE_INCREMENTAL_FLUSH_SLEEP')) SELECT W1.wait_type, CAST(W1.wait_time_s AS DECIMAL(12, 2)) AS wait_time_s, CAST(W1.pct AS DECIMAL(12, 2)) AS pct, CAST(SUM(W2.pct) AS DECIMAL(12, 2)) AS running_pct FROM Waits AS W1 INNER JOIN Waits AS W2 ON W2.rn <= W1.rn GROUP BY W1.rn, W1.wait_type, W1.wait_time_s, W1.pct HAVING SUM(W2.pct) - W1.pct < 99 OPTION (RECOMPILE); -- percentage threshold GO This script uses Dynamic Management View sys.dm_os_wait_stats to collect the wait stats. It omits the system-related wait stats which are not useful to diagnose performance-related bottleneck. Additionally, not OPTION (RECOMPILE) at the end of the DMV will ensure that every time the query runs, it retrieves new data and not the cached data. This dynamic management view collects all the information since the time when the SQL Server services have been restarted. You can also manually clear the wait stats using the following command: DBCC SQLPERF('sys.dm_os_wait_stats', CLEAR); Once the wait stats are collected, we can start analysis them and try to see what is causing any particular wait stats to achieve higher percentages than the others. Many waits stats are related to one another. When the CPU pressure is high, all the CPU-related wait stats show up on top. But when that is fixed, all the wait stats related to the CPU start showing reasonable percentages. It is difficult to have a sure solution, but there are good indications and good suggestions on how to solve this. I will keep this blog post updated as I will post more details about wait stats and how I reduce them. The reference to Book On Line is over here. Of course, I have selected February to run this Wait Stats series. I am already cheating by having the smallest month to run this series. :) Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: DMV, Pinal Dave, PostADay, SQL, SQL Authority, SQL Optimization, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQL Wait Stats, SQL Wait Types, T SQL, Technology

Read the article
SQL SERVER – Video – Beginning Performance Tuning with SQL Server Execution Plan

- by pinaldave

Traveling can be most interesting or most exhausting experience. However, traveling is always the most enlightening experience one can have. While going to long journey one has to prepare a lot of things. Pack necessary travel gears, clothes and medicines. However, the most essential part of travel is the journey to the destination. There are many variations one prefer but the ultimate goal is to have a delightful experience during the journey. Here is the video available which explains how to begin with SQL Server Execution plans. Performance Tuning is a Journey Performance tuning is just like a long journey. The goal of performance tuning is efficient and least resources consuming query execution with accurate results. Just as maps are the most essential aspect of performance tuning the same way, execution plans are essentially maps for SQL Server to reach to the resultset. The goal of the execution plan is to find the most efficient path which translates the least usage of the resources (CPU, memory, IO etc). Execution Plans are like Maps When online maps were invented (e.g. Bing, Google, Mapquests etc) initially it was not possible to customize them. They were given a single route to reach to the destination. As time evolved now it is possible to give various hints to the maps, for example ‘via public transport’, ‘walking’, ‘fastest route’, ‘shortest route’, ‘avoid highway’. There are places where we manually drag the route and make it appropriate to our needs. The same situation is with SQL Server Execution Plans, if we want to tune the queries, we need to understand the execution plans and execution plans internals. We need to understand the smallest details which relate to execution plan when we our destination is optimal queries. Understanding Execution Plans The biggest challenge with maps are figuring out the optimal path. The same way the most common challenge with execution plans is where to start from and which precise route to take. Here is a quick list of the frequently asked questions related to execution plans: Should I read the execution plans from bottoms up or top down? Is execution plans are left to right or right to left? What is the relational between actual execution plan and estimated execution plan? When I mouse over operator I see CPU and IO but not memory, why? Sometime I ran the query multiple times and I get different execution plan, why? How to cache the query execution plan and data? I created an optimal index but the query is not using it. What should I change – query, index or provide hints? What are the tools available which helps quickly to debug performance problems? Etc… Honestly the list is quite a big and humanly impossible to write everything in the words. SQL Server Performance: Introduction to Query Tuning My friend Vinod Kumar and I have created for the same a video learning course for beginning performance tuning. We have covered plethora of the subject in the course. Here is the quick list of the same: Execution Plan Basics Essential Indexing Techniques Query Design for Performance Performance Tuning Tools Tips and Tricks Checklist: Performance Tuning We believe we have covered a lot in this four hour course and we encourage you to go over the video course if you are interested in Beginning SQL Server Performance Tuning and Query Tuning. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Optimization, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology, Video Tagged: Execution Plan

Read the article
SQL SERVER – Reducing CXPACKET Wait Stats for High Transactional Database

- by pinaldave

While engaging in a performance tuning consultation for a client, a situation occurred where they were facing a lot of CXPACKET Waits Stats. The client asked me if I could help them reduce this huge number of wait stats. I usually receive this kind of request from other client as well, but the important thing to understand is whether this question has any merits or benefits, or not. Before we continue the resolution, let us understand what CXPACKET Wait Stats are. The official definition suggests that CXPACKET Wait Stats occurs when trying to synchronize the query processor exchange iterator. You may consider lowering the degree of parallelism if a conflict concerning this wait type develops into a problem. (from BOL) In simpler words, when a parallel operation is created for SQL Query, there are multiple threads for a single query. Each query deals with a different set of the data (or rows). Due to some reasons, one or more of the threads lag behind, creating the CXPACKET Wait Stat. Threads which came first have to wait for the slower thread to finish. The Wait by a specific completed thread is called CXPACKET Wait Stat. Note that CXPACKET Wait is done by completed thread and not the one which are unfinished. “Note that not all the CXPACKET wait types are bad. You might experience a case when it totally makes sense. There might also be cases when this is also unavoidable. If you remove this particular wait type for any query, then that query may run slower because the parallel operations are disabled for the query.” Now let us see what the best practices to reduce the CXPACKET Wait Stats are. The suggestions, with which you will find that if you search online through the browser, would play a major role as and might be asked about their jobs In addition, might tell you that you should set ‘maximum degree of parallelism’ to 1. I do agree with these suggestions, too; however, I think this is not the final resolutions. As soon as you set your entire query to run on single CPU, you will get a very bad performance from the queries which are actually performing okay when using parallelism. The best suggestion to this is that you set ‘the maximum degree of parallelism’ to a lower number or 1 (be very careful with this – it can create more problems) but tune the queries which can be benefited from multiple CPU’s. You can use query hint OPTION (MAXDOP 0) to run the server to use parallelism. Here is the two-quick script which helps to resolve these issues: Change MAXDOP at Server Level EXEC sys.sp_configure N'max degree of parallelism', N'1' GO RECONFIGURE WITH OVERRIDE GO Run Query with all the CPU (using parallelism) USE AdventureWorks GO SELECT * FROM Sales.SalesOrderDetail ORDER BY ProductID OPTION (MAXDOP 0) GO Below is the blog post which will help you to find all the parallel query in your server. SQL SERVER – Find Queries using Parallelism from Cached Plan Please note running Queries in single CPU may worsen your performance and it is not recommended at all. Infect this can be very bad advise. I strongly suggest that you identify the queries which are offending and tune them instead of following any other suggestions. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: SQL, SQL Authority, SQL Optimization, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, SQL White Papers, SQLAuthority News, T SQL, Technology

Read the article
SQL SERVER – Data Pages in Buffer Pool – Data Stored in Memory Cache

- by pinaldave

This will drop all the clean buffers so we will be able to start again from there. Now, run the following script and check the execution plan of the query. Have you ever wondered what types of data are there in your cache? During SQL Server Trainings, I am usually asked if there is any way one can know how much data in a table is stored in the memory cache? The more detailed question I usually get is if there are multiple indexes on table (and used in a query), were the data of the single table stored multiple times in the memory cache or only for a single time? Here is a query you can run to figure out what kind of data is stored in the cache. USE AdventureWorks GO SELECT COUNT(*) AS cached_pages_count, name AS BaseTableName, IndexName, IndexTypeDesc FROM sys.dm_os_buffer_descriptors AS bd INNER JOIN ( SELECT s_obj.name, s_obj.index_id, s_obj.allocation_unit_id, s_obj.OBJECT_ID, i.name IndexName, i.type_desc IndexTypeDesc FROM ( SELECT OBJECT_NAME(OBJECT_ID) AS name, index_id ,allocation_unit_id, OBJECT_ID FROM sys.allocation_units AS au INNER JOIN sys.partitions AS p ON au.container_id = p.hobt_id AND (au.type = 1 OR au.type = 3) UNION ALL SELECT OBJECT_NAME(OBJECT_ID) AS name, index_id, allocation_unit_id, OBJECT_ID FROM sys.allocation_units AS au INNER JOIN sys.partitions AS p ON au.container_id = p.partition_id AND au.type = 2 ) AS s_obj LEFT JOIN sys.indexes i ON i.index_id = s_obj.index_id AND i.OBJECT_ID = s_obj.OBJECT_ID ) AS obj ON bd.allocation_unit_id = obj.allocation_unit_id WHERE database_id = DB_ID() GROUP BY name, index_id, IndexName, IndexTypeDesc ORDER BY cached_pages_count DESC; GO Now let us run the query above and observe the output of the same. We can see in the above query that there are four columns. Cached_Pages_Count lists the pages cached in the memory. BaseTableName lists the original base table from which data pages are cached. IndexName lists the name of the index from which pages are cached. IndexTypeDesc lists the type of index. Now, let us do one more experience here. Please note that you should not run this test on a production server as it can extremely reduce the performance of the database. DBCC DROPCLEANBUFFERS This will drop all the clean buffers and we will be able to start again from there. Now run following script and check the execution plan for the same. USE AdventureWorks GO SELECT UnitPrice, ModifiedDate FROM Sales.SalesOrderDetail WHERE SalesOrderDetailID BETWEEN 1 AND 100 GO The execution plans contain the usage of two different indexes. Now, let us run the script that checks the pages cached in SQL Server. It will give us the following output. It is clear from the Resultset that when more than one index is used, datapages related to both or all of the indexes are stored in Memory Cache separately. Let me know what you think of this article. I had a great pleasure while writing this article because I was able to write on this subject, which I like the most. In the next article, we will exactly see what data are cached and those that are not cached, using a few undocumented commands. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: DMV, Pinal Dave, SQL, SQL Authority, SQL Optimization, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: SQL DMV

Read the article
SQL SERVER – How to Ignore Columnstore Index Usage in Query

- by pinaldave

Earlier I wrote about SQL SERVER – Fundamentals of Columnstore Index and very first question I received in email was as following. “We are using SQL Server 2012 CTP3 and so far so good. In our data warehouse solution we have created 1 non-clustered columnstore index on our large fact table. We have very unique situation but your article did not cover it. We are running few queries on our fact table which is working very efficiently but there is one query which earlier was running very fine but after creating this non-clustered columnstore index this query is running very slow. We dropped the columnstore index and suddenly this one query is running fast but other queries which were benefited by this columnstore index it is running slow. Any workaround in this situation?” In summary the question in simple words “How can we ignore using columnstore index in selective queries?” Very interesting question – you can use I can understand there may be the cases when columnstore index is not ideal and needs to be ignored the same. You can use the query hint IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX to ignore the columnstore index. SQL Server Engine will use any other index which is best after ignoring the columnstore index. Here is the quick script to prove the same. We will first create sample database and then create columnstore index on the same. Once columnstore index is created we will write simple query. This query will use columnstore index. We will then show the usage of the query hint. USE AdventureWorks GO -- Create New Table CREATE TABLE [dbo].[MySalesOrderDetail]( [SalesOrderID] [int] NOT NULL, [SalesOrderDetailID] [int] NOT NULL, [CarrierTrackingNumber] [nvarchar](25) NULL, [OrderQty] [smallint] NOT NULL, [ProductID] [int] NOT NULL, [SpecialOfferID] [int] NOT NULL, [UnitPrice] [money] NOT NULL, [UnitPriceDiscount] [money] NOT NULL, [LineTotal] [numeric](38, 6) NOT NULL, [rowguid] [uniqueidentifier] NOT NULL, [ModifiedDate] [datetime] NOT NULL ) ON [PRIMARY] GO -- Create clustered index CREATE CLUSTERED INDEX [CL_MySalesOrderDetail] ON [dbo].[MySalesOrderDetail] ( [SalesOrderDetailID]) GO -- Create Sample Data Table -- WARNING: This Query may run upto 2-10 minutes based on your systems resources INSERT INTO [dbo].[MySalesOrderDetail] SELECT S1.* FROM Sales.SalesOrderDetail S1 GO 100 -- Create ColumnStore Index CREATE NONCLUSTERED COLUMNSTORE INDEX [IX_MySalesOrderDetail_ColumnStore] ON [MySalesOrderDetail] (UnitPrice, OrderQty, ProductID) GO Now we have created columnstore index so if we run following query it will use for sure the same index. -- Select Table with regular Index SELECT ProductID, SUM(UnitPrice) SumUnitPrice, AVG(UnitPrice) AvgUnitPrice, SUM(OrderQty) SumOrderQty, AVG(OrderQty) AvgOrderQty FROM [dbo].[MySalesOrderDetail] GROUP BY ProductID ORDER BY ProductID GO We can specify Query Hint IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX as described in following query and it will not use columnstore index. -- Select Table with regular Index SELECT ProductID, SUM(UnitPrice) SumUnitPrice, AVG(UnitPrice) AvgUnitPrice, SUM(OrderQty) SumOrderQty, AVG(OrderQty) AvgOrderQty FROM [dbo].[MySalesOrderDetail] GROUP BY ProductID ORDER BY ProductID OPTION (IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX) GO Let us clean up the database. -- Cleanup DROP INDEX [IX_MySalesOrderDetail_ColumnStore] ON [dbo].[MySalesOrderDetail] GO TRUNCATE TABLE dbo.MySalesOrderDetail GO DROP TABLE dbo.MySalesOrderDetail GO Again, make sure that you use hint sparingly and understanding the proper implication of the same. Make sure that you test it with and without hint and select the best option after review of your administrator. Here is the question for you – have you started to use SQL Server 2012 for your validation and development (not on production)? It will be interesting to know the answer. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Index, SQL Optimization, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
SQL SERVER – Beginning SQL Server: One Step at a Time – SQL Server Magazine

- by pinaldave

I am glad to announce that along with SQLAuthority.com, I will be blogging on the prominent site of SQL Server Magazine. My very first blog post there is already live; read here: Beginning SQL Server: One Step at a Time. My association with SQL Server Magazine has been quite long, I have written nearly 7 to 8 SQL Server articles for the print magazine and it has been a great experience. I used to stay in the United States at that time. I moved back to India for good, and during this process, I had put everything on hold for a while. Just like many things, “temporary” things become “permanent” – coming back to SQLMag was on hold for long time. Well, this New Year, things have changed – once again, I am back with my online presence at SQLMag.com. Everybody is a beginner at every task or activity at some point of his/her life: spelling words for the first time; learning how to drive for the first time, etc. No one is perfect at the start of any task, but every human is different. As time passes, we all develop our interests and begin to study our subject of interest. Most of us dream to get a job in the area of our study – however things change as time passes. I recently read somewhere online (I could not find the link again while writing this one) that all the successful people in various areas have never studied in the area in which they are successful. After going through a formal learning process of what we love, we refuse to stop learning, and we finally stop changing career and focus areas. We move, we dare and we progress. IT field is similar to our life. New IT professionals come to this field every day. There are two types of beginners – a) those who are associated with IT field but not familiar with other technologies, and b) those who are absolutely new to the IT field. Learning a new technology is always exciting and overwhelming for enthusiasts. I am working with database (in particular) for SQL Server for more than 7 years but I am still overwhelmed with so many things to learn. I continue to learn and I do not think that I should ever stop doing so. Just like everybody, I want to be in the race and get ahead in learning the technology. For the same, I am always looking for good guidance. I always try to find a good article, blog or book chapter, which can teach me what I really want to learn at this stage in my career and can be immensely helpful. Quite often, I prefer to read the material where the author does not judge me or assume my understanding. I like to read new concepts like a child, who takes his/her first steps of learning without any prior knowledge. Keeping my personal philosophy and preference in mind, I will be blogging on SQL Server Magazine site. I will be blogging on the beginners stuff. I will be blogging for them, who really want to start and make a mark in this area. I will be blogging for all those who have an extreme passion for learning. I am happy that this is a good start for this year. One of my resolutions is to help every beginner. It is totally possible that in future they all will grow and find the same article quite ‘easy‘ – well when that happens, it indicates the success of the article and material! Well, I encourage everybody to read my SQL Server Magazine blog – I will be blogging there frequently on various topics. To begin, we will be talking about performance tuning, and I assure that I will not shy away from other multiple areas. Read my SQL Server Magazine Blog: Beginning SQL Server: One Step at a Time I think the title says it all. Do leave your comments and feedback to indicate your preference of subject and interest. I am going to continue writing on subject, and the aim is of course to help grow in this field. Reference : Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Optimization, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, SQLAuthority News, T SQL, Technology

Read the article
SQL SERVER – Disabled Index and Update Statistics

- by pinaldave

When we try to update the statistics, it throws an error as if the clustered index is disabled. Now let us enable the clustered index only and attempt to update the statistics of the table right after that. Have you ever come across the situation where a conversation never gets over and it continues even though original point of discussion has passed. I am facing the same situation in the case of Disabled Index. Here is the link to original conversations. SQL SERVER – Disable Clustered Index and Data Insert – Reader had a issue here with Disabled Index SQL SERVER – Understanding ALTER INDEX ALL REBUILD with Disabled Clustered Index – Reader asked the effect of Rebuilding Indexes The same reader asked me today – “I understood what the disabled indexes do; what is their effect on statistics. Is it true that even though indexes are disabled, they continue updating the statistics?“ The answer is very interesting: If you have disabled clustered index, you will be not able to update the statistics at all for any index. If you have enabled clustered index and disabled non clustered index when you update the statistics of the table, it automatically updates the statistics of the ALL (disabled and enabled – both) the indexes on the table. If you are not satisfied with the answer, let us go over a simple example. I have written necessary comments in the code itself to have a clear idea. USE tempdb GO -- Drop Table if Exists IF EXISTS (SELECT * FROM sys.objects WHERE OBJECT_ID = OBJECT_ID(N'[dbo].[TableName]') AND type IN (N'U')) DROP TABLE [dbo].[TableName] GO -- Create Table CREATE TABLE [dbo].[TableName]( [ID] [int] NOT NULL, [FirstCol] [varchar](50) NULL ) GO -- Insert Some data INSERT INTO TableName SELECT 1, 'First' UNION ALL SELECT 2, 'Second' UNION ALL SELECT 3, 'Third' UNION ALL SELECT 4, 'Fourth' UNION ALL SELECT 5, 'Five' GO -- Create Clustered Index ALTER TABLE [TableName] ADD CONSTRAINT [PK_TableName] PRIMARY KEY CLUSTERED ([ID] ASC) GO -- Create Nonclustered Index CREATE UNIQUE NONCLUSTERED INDEX [IX_NonClustered_TableName] ON [dbo].[TableName] ([FirstCol] ASC) GO -- Check that all the indexes are enabled SELECT OBJECT_NAME(OBJECT_ID), Name, type_desc, is_disabled FROM sys.indexes WHERE OBJECT_NAME(OBJECT_ID) = 'TableName' GO Now let us update the statistics of the table and check the statistics update date. -- Update the stats of table UPDATE STATISTICS TableName WITH FULLSCAN GO -- Check Statistics Last Updated Datetime SELECT name AS index_name, STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated FROM sys.indexes WHERE OBJECT_ID = OBJECT_ID('TableName') GO Now let us disable the indexes and check if they are disabled using sys.indexes. -- Disable Indexes -- Disable Nonclustered Index ALTER INDEX [IX_NonClustered_TableName] ON [dbo].[TableName] DISABLE GO -- Disable Clustered Index ALTER INDEX [PK_TableName] ON [dbo].[TableName] DISABLE GO -- Check that all the indexes are disabled SELECT OBJECT_NAME(OBJECT_ID), Name, type_desc, is_disabled FROM sys.indexes WHERE OBJECT_NAME(OBJECT_ID) = 'TableName' GO Let us try to update the statistics of the table. -- Update the stats of table UPDATE STATISTICS TableName WITH FULLSCAN GO /* -- Above operation should thrown following error Msg 1974, Level 16, State 1, Line 1 Cannot perform the specified operation on table 'TableName' because its clustered index 'PK_TableName' is disabled. */ When we try to update the statistics it throws an error as it clustered index is disabled. Now let us enable the clustered index only and attempt to update the statistics of the table right after that. -- Now let us rebuild clustered index only ALTER INDEX [PK_TableName] ON [dbo].[TableName] REBUILD GO -- Check that all the indexes status SELECT OBJECT_NAME(OBJECT_ID), Name, type_desc, is_disabled FROM sys.indexes WHERE OBJECT_NAME(OBJECT_ID) = 'TableName' GO -- Check Statistics Last Updated Datetime SELECT name AS index_name, STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated FROM sys.indexes WHERE OBJECT_ID = OBJECT_ID('TableName') GO -- Update the stats of table UPDATE STATISTICS TableName WITH FULLSCAN GO -- Check Statistics Last Updated Datetime SELECT name AS index_name, STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated FROM sys.indexes WHERE OBJECT_ID = OBJECT_ID('TableName') GO We can clearly see that even though the nonclustered index is disabled it is also updated. If you do not need a nonclustered index, I suggest you to drop it as keeping them disabled is an overhead on your system. This is because every time the statistics are updated for system all the statistics for disabled indexesare also updated. -- Clean up DROP TABLE [TableName] GO The complete script is given below for easy reference. USE tempdb GO -- Drop Table if Exists IF EXISTS (SELECT * FROM sys.objects WHERE OBJECT_ID = OBJECT_ID(N'[dbo].[TableName]') AND type IN (N'U')) DROP TABLE [dbo].[TableName] GO -- Create Table CREATE TABLE [dbo].[TableName]( [ID] [int] NOT NULL, [FirstCol] [varchar](50) NULL ) GO -- Insert Some data INSERT INTO TableName SELECT 1, 'First' UNION ALL SELECT 2, 'Second' UNION ALL SELECT 3, 'Third' UNION ALL SELECT 4, 'Fourth' UNION ALL SELECT 5, 'Five' GO -- Create Clustered Index ALTER TABLE [TableName] ADD CONSTRAINT [PK_TableName] PRIMARY KEY CLUSTERED ([ID] ASC) GO -- Create Nonclustered Index CREATE UNIQUE NONCLUSTERED INDEX [IX_NonClustered_TableName] ON [dbo].[TableName] ([FirstCol] ASC) GO -- Check that all the indexes are enabled SELECT OBJECT_NAME(OBJECT_ID), Name, type_desc, is_disabled FROM sys.indexes WHERE OBJECT_NAME(OBJECT_ID) = 'TableName' GO -- Update the stats of table UPDATE STATISTICS TableName WITH FULLSCAN GO -- Check Statistics Last Updated Datetime SELECT name AS index_name, STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated FROM sys.indexes WHERE OBJECT_ID = OBJECT_ID('TableName') GO -- Disable Indexes -- Disable Nonclustered Index ALTER INDEX [IX_NonClustered_TableName] ON [dbo].[TableName] DISABLE GO -- Disable Clustered Index ALTER INDEX [PK_TableName] ON [dbo].[TableName] DISABLE GO -- Check that all the indexes are disabled SELECT OBJECT_NAME(OBJECT_ID), Name, type_desc, is_disabled FROM sys.indexes WHERE OBJECT_NAME(OBJECT_ID) = 'TableName' GO -- Update the stats of table UPDATE STATISTICS TableName WITH FULLSCAN GO /* -- Above operation should thrown following error Msg 1974, Level 16, State 1, Line 1 Cannot perform the specified operation on table 'TableName' because its clustered index 'PK_TableName' is disabled. */ -- Now let us rebuild clustered index only ALTER INDEX [PK_TableName] ON [dbo].[TableName] REBUILD GO -- Check that all the indexes status SELECT OBJECT_NAME(OBJECT_ID), Name, type_desc, is_disabled FROM sys.indexes WHERE OBJECT_NAME(OBJECT_ID) = 'TableName' GO -- Check Statistics Last Updated Datetime SELECT name AS index_name, STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated FROM sys.indexes WHERE OBJECT_ID = OBJECT_ID('TableName') GO -- Update the stats of table UPDATE STATISTICS TableName WITH FULLSCAN GO -- Check Statistics Last Updated Datetime SELECT name AS index_name, STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated FROM sys.indexes WHERE OBJECT_ID = OBJECT_ID('TableName') GO -- Clean up DROP TABLE [TableName] GO Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL Index, SQL Optimization, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: SQL Statistics

Read the article
SQL SERVER – Example of Performance Tuning for Advanced Users with DB Optimizer

- by Pinal Dave

Performance tuning is such a subject that everyone wants to master it. In beginning everybody is at a novice level and spend lots of time learning how to master the art of performance tuning. However, as we progress further the tuning of the system keeps on getting very difficult. I have understood in my early career there should be no need of ego in the technology field. There are always better solutions and better ideas out there and we should not resist them. Instead of resisting the change and new wave I personally adopt it. Here is a similar example, as I personally progress to the master level of performance tuning, I face that it is getting harder to come up with optimal solutions. In such scenarios I rely on various tools to teach me how I can do things better. Once I learn about tools, I am often able to come up with better solutions when I face the similar situation next time. A few days ago I had received a query where the user wanted to tune it further to get the maximum out of the performance. I have re-written the similar query with the help of AdventureWorks sample database. SELECT * FROM HumanResources.Employee e INNER JOIN HumanResources.EmployeeDepartmentHistory edh ON e.BusinessEntityID = edh.BusinessEntityID INNER JOIN HumanResources.Shift s ON edh.ShiftID = s.ShiftID; User had similar query to above query was used in very critical report and wanted to get best out of the query. When I looked at the query – here were my initial thoughts Use only column in the select statements as much as you want in the application Let us look at the query pattern and data workload and find out the optimal index for it Before I give further solutions I was told by the user that they need all the columns from all the tables and creating index was not allowed in their system. He can only re-write queries or use hints to further tune this query. Now I was in the constraint box – I believe * was not a great idea but if they wanted all the columns, I believe we can’t do much besides using *. Additionally, if I cannot create a further index, I must come up with some creative way to write this query. I personally do not like to use hints in my application but there are cases when hints work out magically and gives optimal solutions. Finally, I decided to use Embarcadero’s DB Optimizer. It is a fantastic tool and very helpful when it is about performance tuning. I have previously explained how it works over here. First open DBOptimizer and open Tuning Job from File >> New >> Tuning Job. Once you open DBOptimizer Tuning Job follow the various steps indicates in the following diagram. Essentially we will take our original script and will paste that into Step 1: New SQL Text and right after that we will enable Step 2 for Generating Various cases, Step 3 for Detailed Analysis and Step 4 for Executing each generated case. Finally we will click on Analysis in Step 5 which will generate the report detailed analysis in the result pan. The detailed pan looks like. It generates various cases of T-SQL based on the original query. It applies various hints and available hints to the query and generate various execution plans of the query and displays them in the resultant. You can clearly notice that original query had a cost of 0.0841 and logical reads about 607 pages. Whereas various options which are just following it has different execution cost as well logical read. There are few cases where we have higher logical read and there are few cases where as we have very low logical read. If we pay attention the very next row to original query have Merge_Join_Query in description and have lowest execution cost value of 0.044 and have lowest Logical Reads of 29. This row contains the query which is the most optimal re-write of the original query. Let us double click over it. Here is the query: SELECT * FROM HumanResources.Employee e INNER JOIN HumanResources.EmployeeDepartmentHistory edh ON e.BusinessEntityID = edh.BusinessEntityID INNER JOIN HumanResources.Shift s ON edh.ShiftID = s.ShiftID OPTION (MERGE JOIN) If you notice above query have additional hint of Merge Join. With the help of this Merge Join query hint this query is now performing much better than before. The entire process takes less than 60 seconds. Please note that it the join hint Merge Join was optimal for this query but it is not necessary that the same hint will be helpful in all the queries. Additionally, if the workload or data pattern changes the query hint of merge join may be no more optimal join. In that case, we will have to redo the entire exercise once again. This is the reason I do not like to use hints in my queries and I discourage all of my users to use the same. However, if you look at this example, this is a great case where hints are optimizing the performance of the query. It is humanly not possible to test out various query hints and index options with the query to figure out which is the most optimal solution. Sometimes, we need to depend on the efficiency tools like DB Optimizer to guide us the way and select the best option from the suggestion provided. Let me know what you think of this article as well your experience with DB Optimizer. Please leave a comment. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Joins, SQL Optimization, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
SQL SERVER – Fundamentals of Columnstore Index

- by pinaldave

There are two kind of storage in database. Row Store and Column Store. Row store does exactly as the name suggests – stores rows of data on a page – and column store stores all the data in a column on the same page. These columns are much easier to search – instead of a query searching all the data in an entire row whether the data is relevant or not, column store queries need only to search much lesser number of the columns. This means major increases in search speed and hard drive use. Additionally, the column store indexes are heavily compressed, which translates to even greater memory and faster searches. I am sure this looks very exciting and it does not mean that you convert every single index from row store to column store index. One has to understand the proper places where to use row store or column store indexes. Let us understand in this article what is the difference in Columnstore type of index. Column store indexes are run by Microsoft’s VertiPaq technology. However, all you really need to know is that this method of storing data is columns on a single page is much faster and more efficient. Creating a column store index is very easy, and you don’t have to learn new syntax to create them. You just need to specify the keyword “COLUMNSTORE” and enter the data as you normally would. Keep in mind that once you add a column store to a table, though, you cannot delete, insert or update the data – it is READ ONLY. However, since column store will be mainly used for data warehousing, this should not be a big problem. You can always use partitioning to avoid rebuilding the index. A columnstore index stores each column in a separate set of disk pages, rather than storing multiple rows per page as data traditionally has been stored. The difference between column store and row store approaches is illustrated below: In case of the row store indexes multiple pages will contain multiple rows of the columns spanning across multiple pages. In case of column store indexes multiple pages will contain multiple single columns. This will lead only the columns needed to solve a query will be fetched from disk. Additionally there is good chance that there will be redundant data in a single column which will further help to compress the data, this will have positive effect on buffer hit rate as most of the data will be in memory and due to same it will not need to be retrieved. Let us see small example of how columnstore index improves the performance of the query on a large table. As a first step let us create databaseset which is large enough to show performance impact of columnstore index. The time taken to create sample database may vary on different computer based on the resources. USE AdventureWorks GO -- Create New Table CREATE TABLE [dbo].[MySalesOrderDetail]( [SalesOrderID] [int] NOT NULL, [SalesOrderDetailID] [int] NOT NULL, [CarrierTrackingNumber] [nvarchar](25) NULL, [OrderQty] [smallint] NOT NULL, [ProductID] [int] NOT NULL, [SpecialOfferID] [int] NOT NULL, [UnitPrice] [money] NOT NULL, [UnitPriceDiscount] [money] NOT NULL, [LineTotal] [numeric](38, 6) NOT NULL, [rowguid] [uniqueidentifier] NOT NULL, [ModifiedDate] [datetime] NOT NULL ) ON [PRIMARY] GO -- Create clustered index CREATE CLUSTERED INDEX [CL_MySalesOrderDetail] ON [dbo].[MySalesOrderDetail] ( [SalesOrderDetailID]) GO -- Create Sample Data Table -- WARNING: This Query may run upto 2-10 minutes based on your systems resources INSERT INTO [dbo].[MySalesOrderDetail] SELECT S1.* FROM Sales.SalesOrderDetail S1 GO 100 Now let us do quick performance test. I have kept STATISTICS IO ON for measuring how much IO following queries take. In my test first I will run query which will use regular index. We will note the IO usage of the query. After that we will create columnstore index and will measure the IO of the same. -- Performance Test -- Comparing Regular Index with ColumnStore Index USE AdventureWorks GO SET STATISTICS IO ON GO -- Select Table with regular Index SELECT ProductID, SUM(UnitPrice) SumUnitPrice, AVG(UnitPrice) AvgUnitPrice, SUM(OrderQty) SumOrderQty, AVG(OrderQty) AvgOrderQty FROM [dbo].[MySalesOrderDetail] GROUP BY ProductID ORDER BY ProductID GO -- Table 'MySalesOrderDetail'. Scan count 1, logical reads 342261, physical reads 0, read-ahead reads 0. -- Create ColumnStore Index CREATE NONCLUSTERED COLUMNSTORE INDEX [IX_MySalesOrderDetail_ColumnStore] ON [MySalesOrderDetail] (UnitPrice, OrderQty, ProductID) GO -- Select Table with Columnstore Index SELECT ProductID, SUM(UnitPrice) SumUnitPrice, AVG(UnitPrice) AvgUnitPrice, SUM(OrderQty) SumOrderQty, AVG(OrderQty) AvgOrderQty FROM [dbo].[MySalesOrderDetail] GROUP BY ProductID ORDER BY ProductID GO It is very clear from the results that query is performance extremely fast after creating ColumnStore Index. The amount of the pages it has to read to run query is drastically reduced as the column which are needed in the query are stored in the same page and query does not have to go through every single page to read those columns. If we enable execution plan and compare we can see that column store index performance way better than regular index in this case. Let us clean up the database. -- Cleanup DROP INDEX [IX_MySalesOrderDetail_ColumnStore] ON [dbo].[MySalesOrderDetail] GO TRUNCATE TABLE dbo.MySalesOrderDetail GO DROP TABLE dbo.MySalesOrderDetail GO In future posts we will see cases where Columnstore index is not appropriate solution as well few other tricks and tips of the columnstore index. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Index, SQL Optimization, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
Optimizing collision engine bottleneck

- by Vittorio Romeo

Foreword: I'm aware that optimizing this bottleneck is not a necessity - the engine is already very fast. I, however, for fun and educational purposes, would love to find a way to make the engine even faster. I'm creating a general-purpose C++ 2D collision detection/response engine, with an emphasis on flexibility and speed. Here's a very basic diagram of its architecture: Basically, the main class is World, which owns (manages memory) of a ResolverBase*, a SpatialBase* and a vector<Body*>. SpatialBase is a pure virtual class which deals with broad-phase collision detection. ResolverBase is a pure virtual class which deals with collision resolution. The bodies communicate to the World::SpatialBase* with SpatialInfo objects, owned by the bodies themselves. There currenly is one spatial class: Grid : SpatialBase, which is a basic fixed 2D grid. It has it's own info class, GridInfo : SpatialInfo. Here's how its architecture looks: The Grid class owns a 2D array of Cell*. The Cell class contains two collection of (not owned) Body*: a vector<Body*> which contains all the bodies that are in the cell, and a map<int, vector<Body*>> which contains all the bodies that are in the cell, divided in groups. Bodies, in fact, have a groupId int that is used for collision groups. GridInfo objects also contain non-owning pointers to the cells the body is in. As I previously said, the engine is based on groups. Body::getGroups() returns a vector<int> of all the groups the body is part of. Body::getGroupsToCheck() returns a vector<int> of all the groups the body has to check collision against. Bodies can occupy more than a single cell. GridInfo always stores non-owning pointers to the occupied cells. After the bodies move, collision detection happens. We assume that all bodies are axis-aligned bounding boxes. How broad-phase collision detection works: Part 1: spatial info update For each Body body: Top-leftmost occupied cell and bottom-rightmost occupied cells are calculated. If they differ from the previous cells, body.gridInfo.cells is cleared, and filled with all the cells the body occupies (2D for loop from the top-leftmost cell to the bottom-rightmost cell). body is now guaranteed to know what cells it occupies. For a performance boost, it stores a pointer to every map<int, vector<Body*>> of every cell it occupies where the int is a group of body->getGroupsToCheck(). These pointers get stored in gridInfo->queries, which is simply a vector<map<int, vector<Body*>>*>. body is now guaranteed to have a pointer to every vector<Body*> of bodies of groups it needs to check collision against. These pointers are stored in gridInfo->queries. Part 2: actual collision checks For each Body body: body clears and fills a vector<Body*> bodiesToCheck, which contains all the bodies it needs to check against. Duplicates are avoided (bodies can belong to more than one group) by checking if bodiesToCheck already contains the body we're trying to add. const vector<Body*>& GridInfo::getBodiesToCheck() { bodiesToCheck.clear(); for(const auto& q : queries) for(const auto& b : *q) if(!contains(bodiesToCheck, b)) bodiesToCheck.push_back(b); return bodiesToCheck; } The GridInfo::getBodiesToCheck() method IS THE BOTTLENECK. The bodiesToCheck vector must be filled for every body update because bodies could have moved meanwhile. It also needs to prevent duplicate collision checks. The contains function simply checks if the vector already contains a body with std::find. Collision is checked and resolved for every body in bodiesToCheck. That's it. So, I've been trying to optimize this broad-phase collision detection for quite a while now. Every time I try something else than the current architecture/setup, something doesn't go as planned or I make assumption about the simulation that later are proven to be false. My question is: how can I optimize the broad-phase of my collision engine maintaining the grouped bodies approach? Is there some kind of magic C++ optimization that can be applied here? Can the architecture be redesigned in order to allow for more performance? Actual implementation: SSVSCollsion Body.h, Body.cpp World.h, World.cpp Grid.h, Grid.cpp Cell.h, Cell.cpp GridInfo.h, GridInfo.cpp

Read the article
Surprising results with .NET multi-theading algorithm

- by Myles J

Hi, I've recently wrote a C# console time tabling algorithm that is based on a combination of a genetic algorithm with a few brute force routines thrown in. The initial results were promising but I figured I could improve the performance by splitting the brute force routines up to run in parallel on multi processor architectures. To do this I used the well documented Producer/Consumer model (as documented in this fantastic article http://www.albahari.com/threading/part2.aspx#_ProducerConsumerQWaitHandle). I changed my code to create one thread per logical processor during the brute force routines. The performance gains on my work station were very pleasing. I am running Windows XP on the following hardware: Intel Core 2 Quad CPU 2.33 GHz 3.49 GB RAM Initial tests indicated average performance gains of approx 40% when using 4 threads. The next step was to deploy the new multi-threading version of the algorithm to our higher spec UAT server. Here is the spec of our UAT server: Windows 2003 Server R2 Enterprise x64 8 cpu (Quad-Core) AMD Opteron 2.70 GHz 255 GB RAM After running the first round of tests we were all extremely surprised to find that the algorithm actually runs slower on the high spec W2003 server than on my local XP work station! In fact the tests seem to indicate that it doesn't matter how many threads are generated (tests were ran with the app spawning between 2 to 32 threads). The algorithm always runs significantly slower on the UAT W2003 server? How could this be? Surely the app should run faster on a 8 cpu (Quad-Core) than my 2 Quad work station? Why are we seeing no performance gains with the multi-threading on the W2003 server whilst the XP workstation tests show gains of up to 40%? Any help or pointers would be appreciated. Regards Myles

Read the article
Most efficient method to query a Young Tableau

- by Matthieu M.

A Young Tableau is a 2D matrix A of dimensions M*N such that: i,j in [0,M)x[0,N): for each p in (i,M), A[i,j] <= A[p,j] for each q in (j,N), A[i,j] <= A[i,q] That is, it's sorted row-wise and column-wise. Since it may contain less than M*N numbers, the bottom-right values might be represented either as missing or using (in algorithm theory) infinity to denote their absence. Now the (elementary) question: how to check if a given number is contained in the Young Tableau ? Well, it's trivial to produce an algorithm in O(M*N) time of course, but what's interesting is that it is very easy to provide an algorithm in O(M+N) time: Bottom-Left search: Let x be the number we look for, initialize i,j as M-1, 0 (bottom left corner) If x == A[i,j], return true If x < A[i,j], then if i is 0, return false else decrement i and go to 2. Else, if j is N-1, return false else increment j This algorithm does not make more than M+N moves. The correctness is left as an exercise. It is possible though to obtain a better asymptotic runtime. Pivot Search: Let x be the number we look for, initialize i,j as floor(M/2), floor(N/2) If x == A[i,j], return true If x < A[i,j], search (recursively) in A[0:i-1, 0:j-1], A[i:M-1, 0:j-1] and A[0:i-1, j:N-1] Else search (recursively) in A[i+1:M-1, 0:j], A[i+1:M-1, j+1:N-1] and A[0:i, j+1:N-1] This algorithm proceed by discarding one of the 4 quadrants at each iteration and running recursively on the 3 left (divide and conquer), the master theorem yields a complexity of O((N+M)**(log 3 / log 4)) which is better asymptotically. However, this is only a big-O estimation... So, here are the questions: Do you know (or can think of) an algorithm with a better asymptotical runtime ? Like introsort prove, sometimes it's worth switching algorithms depending on the input size or input topology... do you think it would be possible here ? For 2., I am notably thinking that for small size inputs, the bottom-left search should be faster because of its O(1) space requirement / lower constant term.

Read the article
Minimum confidence and minimum support for Apriori

- by lmsasu

What are appropriate values for minimum confidence and minimum support values for the Apriori algorithm? How could you tweak them? Are they fixed values, or do they change during the running of the algorithm? If you have used this algorithm before, what values did you use?

Read the article
ASP.NET 2.0 RijndaelManaged encryption algorithm vs. FIPS

- by R Rush

I'm running into an issue with an ASP.NET 2.0 application. Our network folks just upped our security, and now I get the floowing error whenever I try to access the app: "This implementation is not part of the Windows Platform FIPS validated cryptographic algorithms." I've done a little research, and it sounds like ASP.NET uses the RijndaelManaged AES encryption algorithm to encrypt the ViewState of pages... and RijndaelManaged is on the list of algorithms that aren't FIPS compliant. We're certainly not explicitly calling any encryption algorithm... much less anything on the non-compliant list. This ViewState business makes sense to me, I guess. The thing I can't muddle out, though, is what to do about it. I've found a KB article that suggests using a web.config setting to specify a different algorithm... but either that didn't stick, or that algorithm isn't up to snuff, either. So: 1) Is the RijndaelManaged / ViewState thing actually the problem? Or am I barking up the wrong tree? 2) How to I specify what algorithm to use instead of RijndaelManaged? I've got a list of algorithms that are and aren't compliant; I'm just not sure where to plug that information in. Thanks! Richard

Read the article
Efficient way of calculating average difference of array elements from array average value

- by Saysmaster

Is there a way to calculate the average distance of array elements from array average value, by only "visiting" each array element once? (I search for an algorithm) Example: Array : [ 1 , 5 , 4 , 9 , 6 ] Average : ( 1 + 5 + 4 + 9 + 6 ) / 5 = 5 Distance Array : [|1-5|, |5-5|, |4-5|, |9-5|, |6-5|] = [4 , 0 , 1 , 4 , 1 ] Average Distance : ( 4 + 0 + 1 + 4 + 1 ) / 5 = 2 The simple algorithm needs 2 passes. 1st pass) Reads and accumulates values, then divides the result by array length to calculate average value of array elements. 2nd pass) Reads values, accumulates each one's distance from the previously calculated average value, and then divides the result by array length to find the average distance of the elements from the average value of the array. The two passes are identical. It is the classic algorithm of calculating the average of a set of values. The first one takes as input the elements of the array, the second one the distances of each element from the array's average value. Calculating the average can be modified to not accumulate the values, but caclulating the average "on the fly" as we sequentialy read the elements from the array. The formula is: Compute Running Average of Array's elements ------------------------------------------- RA[i] = E[i] {for i == 1} RA[i] = RA[i-1] - RA[i-1]/i + A[i]/i { for i > 1 } Where A[x] is the array's element at position x, RA[x] is the average of the array's elements between position 1 and x (running average). My question is: Is there a similar algorithm, to calculate "on the fly" (as we read the array's elements), the average distance of the elements from the array's mean value? The problem is that, as we read the array's elements, the final average value of the array is not known. Only the running average is known. So calculating differences from the running average will not yield the correct result. I suppose, if such algorithm exists, it probably should have the "ability" to compensate, in a way, on each new element read for the error calculated as far.

Read the article
scrabble algorithm....

- by teddy

I'm making algorithm like crossword, but I dont know how to design the algorithm. For example: there are words like 'car', 'apple' in the dictionary. the word 'app' is given on the board. there are letters like 'l' 'e' 'c' 'r'....for making words. So the algorithm's task is to make correct words which are stored in dictionary. app - lapp - leapp - lecapp - .... - lappe - eappc - ... - appl - apple (correct answer) What is the best solution for this algorithm?

Read the article
GO TO and DON'T GO TO ! proof of this:

- by Sorush Rabiee

is this proofed: every algorithm A designed using go to or something like, is equivalence to another algorithm B that does not use go to. in other words: every algorithm designed using go to, can be designed without using go to. how to proof?

Read the article
Are mathematical Algorithms protected by copyright?

- by analogy

I wish to implement an algorithm which i read in a journal paper in my software (commercial). I want to know if this is allowed or not. The algorithm in question is described in http://arxiv.org/abs/0709.2938 It is a very simple algorithm and a number of implementations exist in python (http://igraph.sourceforge.net/) and java. One of them is in gpl another which i got from a different researcher and had no license attached. There are significant differences in two implementations, e.g. second one uses threads and multiple cores. It is possible to rewrite/ (not translate) the algorithm. So can I use it in my software or on a server for commercial purpose. Thanks UPDATE: I am completely aware of copyright on the text of paper, it was published in phys rev E. I am concerned with use of the algorithm, in commercial software. Also the publication means that unless the patent has been already filed. The method has been disclosed publicly hence barring patent in future. Also the GPL implementation is not by authors themselves but comes from a third party. Finally i am not using the GPL implementation but creating my own using C++.

Read the article
Plotting an Arc in Discrete Steps

- by phobos51594

Good afternoon, Background My question relates to the plotting of an arbitrary arc in space using discrete steps. It is unique, however, in that I am not drawing to a canvas in the typical sense. The firmware I am designing is for a gcode interpreter for a CNC mill that will translate commands into stepper motor movements. Now, I have already found a similar question on this very site, but the methodology suggested (Bresenham's Algorithm) appears to be incompatable for moving an object in space, as it only relies on the calculation of one octant of a circle which is then mirrored about the remaining axes of symmetry. Furthermore, the prescribed method of calculation an arc between two arbitrary angles relies on trigonometry (I am implementing on a microcontroller and would like to avoid costly trig functions, if possible) and simply not taking the steps that are out of the range. Finally, the algorithm only is designed to work in one rotational direction (e.g. counterclockwise). Question So, on to the actual question: Does anyone know of a general-purpose algorithm that can be used to "draw" an arbitrary arc in discrete steps while still giving respect to angular direction (CW / CCW)? The final implementation will be done in C, but the language for the purpose of the question is irrelevant. Thank you in advance. References S.O post on drawing a simple circle using Bresenham's Algorithm: "Drawing" an arc in discrete x-y steps Wiki page describing Bresenham's Algorithm for a circle http://en.wikipedia.org/wiki/Midpoint_circle_algorithm Gcode instructions to be implemented (see. G2 and G3) http://linuxcnc.org/docs/html/gcode.html

Read the article
Packing differently sized chunks of data into multiple bins

- by knizz

EDIT: It seems like this problem is called "Cutting stock problem" I need an algorithm that gives me the (space-)optimal arrangement of chunks in bins. One way would be put the bigger chunks in first. But see how that algorithm fails in this example: Chunks Bins ----------------------------- AAA BBB CC DD ( ) ( ) Algorithm Result ----------------------------- biggest first (AAABBB ) (CC ) optimal (AAACCDD) (BBB) "Biggest first" can't fit in DD. Maybe it helps to build a table like this: Size 1: --- Size 2: CC, DD Size 3: AAA, BBB Size 4: CCDD Size 5: AAACC, AAADD, BBBCC, BBBDD Size 6: AAABBB Size 7: AAACCDD, BBBCCDD Size 8: AAABBBCC, AAABBBDD Size 10: AAABBBCCDD

Read the article
Openssl RAND_bytes algorithm

- by vasmt

What algorithm use RAND_bytes function in openssl?

Read the article
Are mathamatical Algorithms protected by copyright

- by analogy

I wish to implement an algorithm which i read in a journal paper in my software (commercial). I want to know if this is allowed or not. The algorithm in question is described in http://arxiv.org/abs/0709.2938 It is a very simple algorithm and a number of implementations exist in python (http://igraph.sourceforge.net/) and java. One of them is in gpl another which i got from a different researcher and had no license attached. There are significant differences in two implementations, e.g. second one uses threads and multiple cores. It is possible to rewrite/ (not translate) the algorithm. So can I use it in my software or on a server for commercial purpose. Thanks

Read the article
List modification in Python

- by user2945143

We are given an algorithm to modify a list of numbers from 1 to 28. There are 5 steps in the algorithm. We have written functions for each step (all correct). We need to write a function that combines all 5 steps. The algorithm modifies the list to get a value. Each time you get a new value, you use the list created by the algorithm from the previous step. This is what we have gotten so far for the code: get_card_at_top_index(insert_top_to_bottom(triple_cut((move_joker_2( move_joker_1(deck)))))) When we run the code to generate the get_card_at_top_index, the first answer is correct. However, the rest are not. Instead of using from the new list, python uses the value that it generated from the last step. What did we do wrong? UPDATE: The other 5 codes passed the tests, they are correct. Code 1 (List) = list1 Code 2 (list1) = list2 Code 3 (list2) = list3 Code 4 (list3) = list4 Code 5 (list4) = list5 we generate a number from 5. We need to run the algorithm again to generate 25 more numbers. We will use list 5 start from step 1

Read the article
Who is the best Hash algorithm?

- by harold-sota

Who is the best Hash algorithm?

Read the article

Search Results

Search found 8262 results on 331 pages for 'optimization algorithm'.

Page 71/331 | < Previous Page | 67 68 69 70 71 72 73 74 75 76 77 78 | Next Page >

- by pinaldave

- by pinaldave

- by pinaldave

- by pinaldave

- by pinaldave

- by pinaldave

- by pinaldave

- by pinaldave

- by Pinal Dave

- by pinaldave

- by Vittorio Romeo

- by Myles J

- by Matthieu M.

- by lmsasu

- by R Rush

- by Saysmaster

- by teddy

- by Sorush Rabiee

- by analogy

- by phobos51594

- by knizz

- by vasmt

- by analogy

- by user2945143

- by harold-sota

< Previous Page | 67 68 69 70 71 72 73 74 75 76 77 78 | Next Page >