Search Results

Search found 4193 results on 168 pages for 'dave aaron smith'.

Page 12/168 | < Previous Page | 8 9 10 11 12 13 14 15 16 17 18 19  | Next Page >

  • SQL SERVER – Solution – Puzzle – Statistics are not Updated but are Created Once

    - by pinaldave
    Earlier I asked puzzle why statistics are not updated. Read the complete details over here: Statistics are not Updated but are Created Once In the question I have demonstrated even though statistics should have been updated after lots of insert in the table are not updated.(Read the details SQL SERVER – When are Statistics Updated – What triggers Statistics to Update) In this example I have created following situation: Create Table Insert 1000 Records Check the Statistics Now insert 10 times more 10,000 indexes Check the Statistics – it will be NOT updated Auto Update Statistics and Auto Create Statistics for database is TRUE Now I have requested two things in the example 1) Why this is happening? 2) How to fix this issue? I have many answers – here is the how I fixed it which has resolved the issue for me. NOTE: There are multiple answers to this problem and I will do my best to list all. Solution: Create nonclustered Index on column City Here is the working example for the same. Let us understand this script and there is added explanation at the end. -- Execution Plans Difference -- Estimated Execution Plan Vs Actual Execution Plan -- Create Sample Database CREATE DATABASE SampleDB GO USE SampleDB GO -- Create Table CREATE TABLE ExecTable (ID INT, FirstName VARCHAR(100), LastName VARCHAR(100), City VARCHAR(100)) GO CREATE NONCLUSTERED INDEX IX_ExecTable1 ON ExecTable (City); GO -- Insert One Thousand Records -- INSERT 1 INSERT INTO ExecTable (ID,FirstName,LastName,City) SELECT TOP 1000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 1 THEN 'New York' WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 5 THEN 'San Marino' WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 3 THEN 'Los Angeles' WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 7 THEN 'La Cinega' WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 13 THEN 'San Diego' WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 17 THEN 'Las Vegas' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Display statistics of the table sp_helpstats N'ExecTable', 'ALL' GO -- Select Statement SELECT FirstName, LastName, City FROM ExecTable WHERE City  = 'New York' GO -- Display statistics of the table sp_helpstats N'ExecTable', 'ALL' GO -- Replace your Statistics over here DBCC SHOW_STATISTICS('ExecTable', IX_ExecTable1); GO -------------------------------------------------------------- -- Round 2 -- Insert One Thousand Records -- INSERT 2 INSERT INTO ExecTable (ID,FirstName,LastName,City) SELECT TOP 1000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%20 = 1 THEN 'New York' WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 5 THEN 'San Marino' WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 3 THEN 'Los Angeles' WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 7 THEN 'La Cinega' WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 13 THEN 'San Diego' WHEN  ROW_NUMBER() OVER (ORDER BY a.name)%20 = 17 THEN 'Las Vegas' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Select Statement SELECT FirstName, LastName, City FROM ExecTable WHERE City  = 'New York' GO -- Display statistics of the table sp_helpstats N'ExecTable', 'ALL' GO -- Replace your Statistics over here DBCC SHOW_STATISTICS('ExecTable', IX_ExecTable1); GO -- Clean up Database DROP TABLE ExecTable GO When I created non clustered index on the column city, it also created statistics on the same column with same name as index. When we populate the data in the column the index is update – resulting execution plan to be invalided – this leads to the statistics to be updated in next execution of SELECT. This behavior does not happen on Heap or column where index is auto created. If you explicitly update the index, often you can see the statistics are updated as well. You can see this is for sure happening if you follow the tell of John Sansom. John Sansom‘s suggestion: That was fun! Although the column statistics are invalidated by the time the second select statement is executed, the query is not compiled/recompiled but instead the existing query plan is reused. It is the “next” compiled query against the column statistics that will see that they are out of date and will then in turn instantiate the action of updating statistics. You can see this in action by forcing the second statement to recompile. SELECT FirstName, LastName, City FROM ExecTable WHERE City = ‘New York’ option(RECOMPILE) GO Kevin Cross also have another suggestion: I agree with John. It is reusing the Execution Plan. Aside from OPTION(RECOMPILE), clearing the Execution Plan Cache before the subsequent tests will also work. i.e., run this before round 2: ————————————————————– – Clear execution plan cache before next test DBCC FREEPROCCACHE WITH NO_INFOMSGS; ————————————————————– Nice puzzle! Kevin As this was puzzle John and Kevin both got the correct answer, there was no condition for answer to be part of best practices. I know John and he is finest DBA around – his tremendous knowledge has always impressed me. John and Kevin both will agree that clearing cache either using DBCC FREEPROCCACHE and recompiling each query every time is for sure not good advice on production server. It is correct answer but not best practice. By the way, if you have better solution or have better suggestion please advise. I am open to change my answer and publish further improvement to this solution. On very separate note, I like to have clustered index on my Primary Key, which I have not mentioned here as it is out of the scope of this puzzle. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, Readers Contribution, Readers Question, SQL, SQL Authority, SQL Index, SQL Puzzle, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: Statistics

    Read the article

  • Troubleshooting sudoers via ldap

    - by dafydd
    The good news is that I got sudoers via ldap working on Red Hat Directory Server. The package is sudo-1.7.2p1. I have some LDAP/Kerberos users in an LDAP group called wheel, and I have this entry in LDAP: # %wheel, SUDOers, example.com dn: cn=%wheel,ou=SUDOers,dc=example,dc=com cn: %wheel description: Members of group wheel have access to all privileges. objectClass: sudoRole objectClass: top sudoCommand: ALL sudoHost: ALL sudoUser: %wheel So, members of group wheel have administrative privileges via sudo. This has been tested and works fine. Now, I have this other sudo privilege set up to allow members of a group called Administrators to perform two commands as the non-root owner of those commands. # %Administrators, SUDOers, example.com dn: cn=%Administrators,ou=SUDOers,dc=example,dc=com sudoRunAsGroup: appGroup sudoRunAsUser: appOwner cn: %Administrators description: Allow members of the group Administrators to run various commands . objectClass: sudoRole objectClass: top sudoCommand: appStop sudoCommand: appStart sudoCommand: /path/to/appStop sudoCommand: /path/to/appStart sudoUser: %Administrators Unfortunately, members of Administrators are still refused permission to run appStart or appStop: -bash-3.2$ sudo /path/to/appStop [sudo] password for Aaron: Sorry, user Aaron is not allowed to execute '/path/to/appStop' as root on host.example.com. -bash-3.2$ sudo -u appOwner /path/to/appStop [sudo] password for Aaron: Sorry, user Aaron is not allowed to execute '/path/to/appStop' as appOwner on host.example.com. /var/log/secure shows me these two sets of messages for the two attempts: Oct 31 15:02:36 host sudo: pam_unix(sudo:auth): authentication failure; logname=Aaron uid=0 euid=0 tty=/dev/pts/3 ruser= rhost= user=Aaron Oct 31 15:02:37 host sudo: pam_krb5[1508]: TGT verified using key for 'host/[email protected]' Oct 31 15:02:37 host sudo: pam_krb5[1508]: authentication succeeds for 'Aaron' ([email protected]) Oct 31 15:02:37 host sudo: Aaron : command not allowed ; TTY=pts/3 ; PWD=/auto/home/Aaron ; USER=root ; COMMAND=/path/to/appStop Oct 31 15:02:52 host sudo: pam_unix(sudo:auth): authentication failure; logname=Aaron uid=0 euid=0 tty=/dev/pts/3 ruser= rhost= user=Aaron Oct 31 15:02:52 host sudo: pam_krb5[1547]: TGT verified using key for 'host/[email protected]' Oct 31 15:02:52 host sudo: pam_krb5[1547]: authentication succeeds for 'Aaron' ([email protected]) Oct 31 15:02:52 host sudo: Aaron : command not allowed ; TTY=pts/3 ; PWD=/auto/home/Aaron ; USER=appOwner; COMMAND=/path/to/appStop The questions: Does sudo have some sort of verbose or debug mode where I can actually watch it capture the sudoers privilege list and determine whether or not Aaron should have the privilege to run this command? (This question is probably independent of where the sudoers database is kept.) Does sudo work with some background mechanism that might have a log level I could turn up? Right now, I can't fix a problem I can't identify. Is this an LDAP search failure? Is this a group member matching failure? Identifying why the command fails will help me identify the fix... Next step: Recreate the privilege in /etc/sudoers, and see if it works locally... Cheers!

    Read the article

  • SQL SERVER – Retrieve and Explore Database Backup without Restoring Database – Idera virtual databas

    - by pinaldave
    I recently downloaded Idera’s SQL virtual database, and tested it. There are a few things about this tool which caught my attention. My Scenario It is quite common in real life that sometimes observing or retrieving older data is necessary; however, it had changed as time passed by. The full database backup was 40 GB in size, and, to restore it on our production server, it usually takes around 16 to 22 minutes, depending on the load server that is usually present. This range in time varies from one server to another as per the configuration of the computer. Some other issues we used to have are the following: When we try to restore a large 40-GB database, we needed at least that much space on our production server. Once in a while, we even had to make changes in the restored database, and use the said changed and restored database for our purpose, making it more time-consuming. My Solution I have heard a lot about the Idera’s SQL virtual database tool.. Well, right after we started to test this tool, we found out that it really delivers what it promises. Using this software was very easy and we were able to restore our database from backup in less than 2 minutes, sparing us from the usual longer time of 16–22 minutes. The needful was finished in a total of 10 minutes. Another interesting observation is that there is no need to have an additional space for restoring the database. For complete database restoration, the single additional MB on the drive is not required anymore. We can use the database in the same way as our regular database, and there is no need for any additional configuration and setup. Let us look at the most relevant points of this product based on my initial experience: Quick restoration of the database backup No additional space required for database restoration virtual database has no physical .MDF or .LDF The database which is restored is, in fact, the backup file converted in the virtual database. DDL and DML queries can be executed against this virtually restored database. Regular backup operation can be implemented against virtual database, creating a physical .bak file that can be used for future use. There was no observed degradation in performance on the original database as well the restored virtual database. Additional T-SQL queries can be let off on the virtual database. Well, this summarizes my quick review. And, as I was saying, I am very impressed with the product and I plan to explore it more. There are many features that I have noticed in this tool, which I think can be very useful if properly understood. I had taken a few screenshots using my demo database afterwards. Let us see what other things this tool can do besides the mentioned activities. I am surprised with its performance so I want to know how exactly this feature works, specifically in the matter of why it does not create any additional files and yet, it still allows update on the virtually restored database. I guess I will have to send an e-mail to the developers of Idera and try to figure this out from them. I think this tool is very useful, and it delivers a high level of performance way more than what I expected. Soon, I will write a review for additional uses of SQL virtual database.. If you are using SQL virtual database in your production environment, I am eager to learn more about it and your experience while using it. The ‘Virtual’ Part of virtual database When I set out to test this software, I thought virtual database had something to do with Hyper-V or visualization. In fact, the virtual database is a kind of database which shows up in your SQL Server Management Studio without actually restoring or even creating it. This tool creates a database in SSMS from the backup of the same database. The backup, however, works virtually the same way as original database. Potential Usage of virtual database: As soon as I described this tool to my teammate, I think his very first reaction was, “hey, if we have this then there is no need for log shipping.” I find his comment very interesting as log shipping is something where logs are moved to another server. In fact, there are no updates on the database from log; I would rather compare it with Snapshot Replication. In fact, whatever we use, snapshot replicated database can be similarly used and configured with virtual database. I totally believe that we can use it for reporting purpose. In fact, after this database was configured, I think the uses of this tool are unlimited. I will have to spend some more time studying it and will get back to you. Click on images to see larger images. virtual database Console Harddrive Space before virtual database Setup Attach Full Backup Screen Backup on Harddrive Attach Full Backup Screen with Settings virtual database Setup – less than 60 sec virtual database Setup – Online Harddrive Space after virtual database Setup Point in Time Recovery Option – Timeline View virtual database Summary No Performance Difference between Regular DB vs Virtual DB Please note that all SQL Server MVP gets free license of this software. Reference: Pinal Dave (http://blog.SQLAuthority.com), Idera (virtual database) Filed under: Database, Pinal Dave, SQL, SQL Add-On, SQL Authority, SQL Backup and Restore, SQL Data Storage, SQL Query, SQL Server, SQL Tips and Tricks, SQL Utility, SQLAuthority News, T SQL, Technology Tagged: Idera

    Read the article

  • SQL SERVER – Beginning SQL Server: One Step at a Time – SQL Server Magazine

    - by pinaldave
    I am glad to announce that along with SQLAuthority.com, I will be blogging on the prominent site of SQL Server Magazine. My very first blog post there is already live; read here: Beginning SQL Server: One Step at a Time. My association with SQL Server Magazine has been quite long, I have written nearly 7 to 8 SQL Server articles for the print magazine and it has been a great experience. I used to stay in the United States at that time. I moved back to India for good, and during this process, I had put everything on hold for a while. Just like many things, “temporary” things become “permanent” – coming back to SQLMag was on hold for long time. Well, this New Year, things have changed – once again, I am back with my online presence at SQLMag.com. Everybody is a beginner at every task or activity at some point of his/her life: spelling words for the first time; learning how to drive for the first time, etc. No one is perfect at the start of any task, but every human is different. As time passes, we all develop our interests and begin to study our subject of interest. Most of us dream to get a job in the area of our study – however things change as time passes. I recently read somewhere online (I could not find the link again while writing this one) that all the successful people in various areas have never studied in the area in which they are successful. After going through a formal learning process of what we love, we refuse to stop learning, and we finally stop changing career and focus areas. We move, we dare and we progress. IT field is similar to our life. New IT professionals come to this field every day. There are two types of beginners – a) those who are associated with IT field but not familiar with other technologies, and b) those who are absolutely new to the IT field. Learning a new technology is always exciting and overwhelming for enthusiasts. I am working with database (in particular) for SQL Server for more than 7 years but I am still overwhelmed with so many things to learn. I continue to learn and I do not think that I should ever stop doing so. Just like everybody, I want to be in the race and get ahead in learning the technology. For the same, I am always looking for good guidance. I always try to find a good article, blog or book chapter, which can teach me what I really want to learn at this stage in my career and can be immensely helpful. Quite often, I prefer to read the material where the author does not judge me or assume my understanding. I like to read new concepts like a child, who takes his/her first steps of learning without any prior knowledge. Keeping my personal philosophy and preference in mind, I will be blogging on SQL Server Magazine site. I will be blogging on the beginners stuff. I will be blogging for them, who really want to start and make a mark in this area. I will be blogging for all those who have an extreme passion for learning. I am happy that this is a good start for this year. One of my resolutions is to help every beginner. It is totally possible that in future they all will grow and find the same article quite ‘easy‘ – well when that happens, it indicates the success of the article and material! Well, I encourage everybody to read my SQL Server Magazine blog – I will be blogging there frequently on various topics. To begin, we will be talking about performance tuning, and I assure that I will not shy away from other multiple areas. Read my SQL Server Magazine Blog: Beginning SQL Server: One Step at a Time I think the title says it all. Do leave your comments and feedback to indicate your preference of subject and interest. I am going to continue writing on subject, and the aim is of course to help grow in this field. Reference : Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Optimization, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, SQLAuthority News, T SQL, Technology

    Read the article

  • SQL SERVER – Signal Wait Time Introduction with Simple Example – Wait Type – Day 2 of 28

    - by pinaldave
    In this post, let’s delve a bit more in depth regarding wait stats. The very first question: when do the wait stats occur? Here is the simple answer. When SQL Server is executing any task, and if for any reason it has to wait for resources to execute the task, this wait is recorded by SQL Server with the reason for the delay. Later on we can analyze these wait stats to understand the reason the task was delayed and maybe we can eliminate the wait for SQL Server. It is not always possible to remove the wait type 100%, but there are few suggestions that can help. Before we continue learning about wait types and wait stats, we need to understand three important milestones of the query life-cycle. Running - a query which is being executed on a CPU is called a running query. This query is responsible for CPU time. Runnable – a query which is ready to execute and waiting for its turn to run is called a runnable query. This query is responsible for Signal Wait time. (In other words, the query is ready to run but CPU is servicing another query). Suspended – a query which is waiting due to any reason (to know the reason, we are learning wait stats) to be converted to runnable is suspended query. This query is responsible for wait time. (In other words, this is the time we are trying to reduce). In simple words, query execution time is a summation of the query Executing CPU Time (Running) + Query Wait Time (Suspended) + Query Signal Wait Time (Runnable). Again, it may be possible a query goes to all these stats multiple times. Let us try to understand the whole thing with a simple analogy of a taxi and a passenger. Two friends, Tom and Danny, go to the mall together. When they leave the mall, they decide to take a taxi. Tom and Danny both stand in the line waiting for their turn to get into the taxi. This is the Signal Wait Time as they are ready to get into the taxi but the taxis are currently serving other customer and they have to wait for their turn. In other word they are in a runnable state. Now when it is their turn to get into the taxi, the taxi driver informs them he does not take credit cards and only cash is accepted. Neither Tom nor Danny have enough cash, they both cannot get into the vehicle. Tom waits outside in the queue and Danny goes to ATM to fetch the cash. During this time the taxi cannot wait, they have to let other passengers get into the taxi. As Tom and Danny both are outside in the queue, this is the Query Wait Time and they are in the suspended state. They cannot do anything till they get the cash. Once Danny gets the cash, they are both standing in the line again, creating one more Signal Wait Time. This time when their turn comes they can pay the taxi driver in cash and reach their destination. The time taken for the taxi to get from the mall to the destination is running time (CPU time) and the taxi is running. I hope this analogy is bit clear with the wait stats. You can check the Signalwait stats using following query of Glenn Berry. -- Signal Waits for instance SELECT CAST(100.0 * SUM(signal_wait_time_ms) / SUM (wait_time_ms) AS NUMERIC(20,2)) AS [%signal (cpu) waits], CAST(100.0 * SUM(wait_time_ms - signal_wait_time_ms) / SUM (wait_time_ms) AS NUMERIC(20,2)) AS [%resource waits] FROM sys.dm_os_wait_stats OPTION (RECOMPILE); Higher the Signal wait stats are not good for the system. Very high value indicates CPU pressure. In my experience, when systems are running smooth and without any glitch the Signal wait stat is lower than 20%. Again, this number can be debated (and it is from my experience and is not documented anywhere). In other words, lower is better and higher is not good for the system. In future articles we will discuss in detail the various wait types and wait stats and their resolution. Read all the post in the Wait Types and Queue series. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL DMV, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQL Wait Stats, SQL Wait Types, T SQL, Technology

    Read the article

  • SQL SERVER – Single Wait Time Introduction with Simple Example – Wait Type – Day 2 of 28

    - by pinaldave
    In this post, let’s delve a bit more in depth regarding wait stats. The very first question: when do the wait stats occur? Here is the simple answer. When SQL Server is executing any task, and if for any reason it has to wait for resources to execute the task, this wait is recorded by SQL Server with the reason for the delay. Later on we can analyze these wait stats to understand the reason the task was delayed and maybe we can eliminate the wait for SQL Server. It is not always possible to remove the wait type 100%, but there are few suggestions that can help. Before we continue learning about wait types and wait stats, we need to understand three important milestones of the query life-cycle. Running - a query which is being executed on a CPU is called a running query. This query is responsible for CPU time. Runnable – a query which is ready to execute and waiting for its turn to run is called a runnable query. This query is responsible for Single Wait time. (In other words, the query is ready to run but CPU is servicing another query). Suspended – a query which is waiting due to any reason (to know the reason, we are learning wait stats) to be converted to runnable is suspended query. This query is responsible for wait time. (In other words, this is the time we are trying to reduce). In simple words, query execution time is a summation of the query Executing CPU Time (Running) + Query Wait Time (Suspended) + Query Single Wait Time (Runnable). Again, it may be possible a query goes to all these stats multiple times. Let us try to understand the whole thing with a simple analogy of a taxi and a passenger. Two friends, Tom and Danny, go to the mall together. When they leave the mall, they decide to take a taxi. Tom and Danny both stand in the line waiting for their turn to get into the taxi. This is the Signal Wait Time as they are ready to get into the taxi but the taxis are currently serving other customer and they have to wait for their turn. In other word they are in a runnable state. Now when it is their turn to get into the taxi, the taxi driver informs them he does not take credit cards and only cash is accepted. Neither Tom nor Danny have enough cash, they both cannot get into the vehicle. Tom waits outside in the queue and Danny goes to ATM to fetch the cash. During this time the taxi cannot wait, they have to let other passengers get into the taxi. As Tom and Danny both are outside in the queue, this is the Query Wait Time and they are in the suspended state. They cannot do anything till they get the cash. Once Danny gets the cash, they are both standing in the line again, creating one more Single Wait Time. This time when their turn comes they can pay the taxi driver in cash and reach their destination. The time taken for the taxi to get from the mall to the destination is running time (CPU time) and the taxi is running. I hope this analogy is bit clear with the wait stats. You can check the single wait stats using following query of Glenn Berry. -- Signal Waits for instance SELECT CAST(100.0 * SUM(signal_wait_time_ms) / SUM (wait_time_ms) AS NUMERIC(20,2)) AS [%signal (cpu) waits], CAST(100.0 * SUM(wait_time_ms - signal_wait_time_ms) / SUM (wait_time_ms) AS NUMERIC(20,2)) AS [%resource waits] FROM sys.dm_os_wait_stats OPTION (RECOMPILE); Higher the single wait stats are not good for the system. Very high value indicates CPU pressure. In my experience, when systems are running smooth and without any glitch the single wait stat is lower than 20%. Again, this number can be debated (and it is from my experience and is not documented anywhere). In other words, lower is better and higher is not good for the system. In future articles we will discuss in detail the various wait types and wait stats and their resolution. Read all the post in the Wait Types and Queue series. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL DMV, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQL Wait Stats, SQL Wait Types, T SQL, Technology

    Read the article

  • SQL SERVER – Database Dynamic Caching by Automatic SQL Server Performance Acceleration

    - by pinaldave
    My second look at SafePeak’s new version (2.1) revealed to me few additional interesting features. For those of you who hadn’t read my previous reviews SafePeak and not familiar with it, here is a quick brief: SafePeak is in business of accelerating performance of SQL Server applications, as well as their scalability, without making code changes to the applications or to the databases. SafePeak performs database dynamic caching, by caching in memory result sets of queries and stored procedures while keeping all those cache correct and up to date. Cached queries are retrieved from the SafePeak RAM in microsecond speed and not send to the SQL Server. The application gets much faster results (100-500 micro seconds), the load on the SQL Server is reduced (less CPU and IO) and the application or the infrastructure gets better scalability. SafePeak solution is hosted either within your cloud servers, hosted servers or your enterprise servers, as part of the application architecture. Connection of the application is done via change of connection strings or adding reroute line in the c:\windows\system32\drivers\etc\hosts file on all application servers. For those who would like to learn more on SafePeak architecture and how it works, I suggest to read this vendor’s webpage: SafePeak Architecture. More interesting new features in SafePeak 2.1 In my previous review of SafePeak new I covered the first 4 things I noticed in the new SafePeak (check out my article “SQLAuthority News – SafePeak Releases a Major Update: SafePeak version 2.1 for SQL Server Performance Acceleration”): Cache setup and fine-tuning – a critical part for getting good caching results Database templates Choosing which database to cache Monitoring and analysis options by SafePeak Since then I had a chance to play with SafePeak some more and here is what I found. 5. Analysis of SQL Performance (present and history): In SafePeak v.2.1 the tools for understanding of performance became more comprehensive. Every 15 minutes SafePeak creates and updates various performance statistics. Each query (or a procedure execute) that arrives to SafePeak gets a SQL pattern, and after it is used again there are statistics for such pattern. An important part of this product is that it understands the dependencies of every pattern (list of tables, views, user defined functions and procs). From this understanding SafePeak creates important analysis information on performance of every object: response time from the database, response time from SafePeak cache, average response time, percent of traffic and break down of behavior. One of the interesting things this behavior column shows is how often the object is actually pdated. The break down analysis allows knowing the above information for: queries and procedures, tables, views, databases and even instances level. The data is show now on all arriving queries, both read queries (that can be cached), but also any types of updates like DMLs, DDLs, DCLs, and even session settings queries. The stats are being updated every 15 minutes and SafePeak dashboard allows going back in time and investigating what happened within any time frame. 6. Logon trigger, for making sure nothing corrupts SafePeak cache data If you have an application with many parts, many servers many possible locations that can actually update the database, or the SQL Server is accessible to many DBAs or software engineers, each can access some database directly and do some changes without going thru SafePeak – this can create a potential corruption of the data stored in SafePeak cache. To make sure SafePeak cache is correct it needs to get all updates to arrive to SafePeak, and if a DBA will access the database directly and do some changes, for example, then SafePeak will simply not know about it and will not clean SafePeak cache. In the new version, SafePeak brought a new feature called “Logon Trigger” to solve the above challenge. By special click of a button SafePeak can deploy a special server logon trigger (with a CLR object) on your SQL Server that actually monitors all connections and informs SafePeak on any connection that is coming not from SafePeak. In SafePeak dashboard there is an interface that allows to control which logins can be ignored based on login names and IPs, while the rest will invoke cache cleanup of SafePeak and actually locks SafePeak cache until this connection will not be closed. Important to note, that this does not interrupt any logins, only informs SafePeak on such connection. On the Dashboard screen in SafePeak you will be able to see those connections and then decide what to do with them. Configuration of this feature in SafePeak dashboard can be done here: Settings -> SQL instances management -> click on instance -> Logon Trigger tab. Other features: 7. User management ability to grant permissions to someone without changing its configuration and only use SafePeak as performance analysis tool. 8. Better reports for analysis of performance using 15 minute resolution charts. 9. Caching of client cursors 10. Support for IPv6 Summary SafePeak is a great SQL Server performance acceleration solution for users who want immediate results for sites with performance, scalability and peak spikes challenges. Especially if your apps are packaged or 3rd party, since no code changes are done. SafePeak can significantly increase response times, by reducing network roundtrip to the database, decreasing CPU resource usage, eliminating I/O and storage access. SafePeak team provides a free fully functional trial www.safepeak.com/download and actually provides a one-on-one assistance during such trial. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: About Me, Pinal Dave, PostADay, SQL, SQL Authority, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, SQL Utility, T SQL, Technology

    Read the article

  • SQL SERVER – Simple Example of Snapshot Isolation – Reduce the Blocking Transactions

    - by pinaldave
    To learn any technology and move to a more advanced level, it is very important to understand the fundamentals of the subject first. Today, we will be talking about something which has been quite introduced a long time ago but not properly explored when it comes to the isolation level. Snapshot Isolation was introduced in SQL Server in 2005. However, the reality is that there are still many software shops which are using the SQL Server 2000, and therefore cannot be able to maintain the Snapshot Isolation. Many software shops have upgraded to the later version of the SQL Server, but their respective developers have not spend enough time to upgrade themselves with the latest technology. “It works!” is a very common answer of many when they are asked about utilizing the new technology, instead of backward compatibility commands. In one of the recent consultation project, I had same experience when developers have “heard about it” but have no idea about snapshot isolation. They were thinking it is the same as Snapshot Replication – which is plain wrong. This is the same demo I am including here which I have created for them. In Snapshot Isolation, the updated row versions for each transaction are maintained in TempDB. Once a transaction has begun, it ignores all the newer rows inserted or updated in the table. Let us examine this example which shows the simple demonstration. This transaction works on optimistic concurrency model. Since reading a certain transaction does not block writing transaction, it also does not block the reading transaction, which reduced the blocking. First, enable database to work with Snapshot Isolation. Additionally, check the existing values in the table from HumanResources.Shift. ALTER DATABASE AdventureWorks SET ALLOW_SNAPSHOT_ISOLATION ON GO SELECT ModifiedDate FROM HumanResources.Shift GO Now, we will need two different sessions to prove this example. First Session: Set Transaction level isolation to snapshot and begin the transaction. Update the column “ModifiedDate” to today’s date. -- Session 1 SET TRANSACTION ISOLATION LEVEL SNAPSHOT BEGIN TRAN UPDATE HumanResources.Shift SET ModifiedDate = GETDATE() GO Please note that we have not yet been committed to the transaction. Now, open the second session and run the following “SELECT” statement. Then, check the values of the table. Please pay attention on setting the Isolation level for the second one as “Snapshot” at the same time when we already start the transaction using BEGIN TRAN. -- Session 2 SET TRANSACTION ISOLATION LEVEL SNAPSHOT BEGIN TRAN SELECT ModifiedDate FROM HumanResources.Shift GO You will notice that the values in the table are still original values. They have not been modified yet. Once again, go back to session 1 and begin the transaction. -- Session 1 COMMIT After that, go back to Session 2 and see the values of the table. -- Session 2 SELECT ModifiedDate FROM HumanResources.Shift GO You will notice that the values are yet not changed and they are still the same old values which were there right in the beginning of the session. Now, let us commit the transaction in the session 2. Once committed, run the same SELECT statement once more and see what the result is. -- Session 2 COMMIT SELECT ModifiedDate FROM HumanResources.Shift GO You will notice that it now reflects the new updated value. I hope that this example is clear enough as it would give you good idea how the Snapshot Isolation level works. There is much more to write about an extra level, READ_COMMITTED_SNAPSHOT, which we will be discussing in another post soon. If you wish to use this transaction’s Isolation level in your production database, I would appreciate your comments about their performance on your servers. I have included here the complete script used in this example for your quick reference. ALTER DATABASE AdventureWorks SET ALLOW_SNAPSHOT_ISOLATION ON GO SELECT ModifiedDate FROM HumanResources.Shift GO -- Session 1 SET TRANSACTION ISOLATION LEVEL SNAPSHOT BEGIN TRAN UPDATE HumanResources.Shift SET ModifiedDate = GETDATE() GO -- Session 2 SET TRANSACTION ISOLATION LEVEL SNAPSHOT BEGIN TRAN SELECT ModifiedDate FROM HumanResources.Shift GO -- Session 1 COMMIT -- Session 2 SELECT ModifiedDate FROM HumanResources.Shift GO -- Session 2 COMMIT SELECT ModifiedDate FROM HumanResources.Shift GO Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: Transaction Isolation

    Read the article

  • SQL SERVER – Spatial Database Queries – What About BLOB – T-SQL Tuesday #006

    - by pinaldave
    Michael Coles is one of the most interesting book authors I have ever met. He has a flair of writing complex stuff in a simple language. There are a very few people like that.  I really enjoyed reading his recent book, Expert SQL Server 2008 Encryption. I strongly suggest taking a look at it. This blog is written in response to T-SQL Tuesday #006: “What About BLOB? by Michael Coles. Spatial Database is my favorite subject. Since I did my TechEd India 2010 presentation, I have enjoyed this subject a lot. Before I continue this blog post, there are a few other blog posts, so I suggest you read them.  To help build the environment run the queries, I am going to present them in this single blog post. SQL SERVER – What is Spatial Database? – Developing with SQL Server Spatial and Deep Dive into Spatial Indexing This blog post explains the basics of Spatial Database and also provides a good introduction to Indexing concept. SQL SERVER – World Shapefile Download and Upload to Database – Spatial Database This blog post will enable you with how to load the shape file into database. SQL SERVER – Spatial Database Definition and Research Documents This blog post links to the white paper about Spatial Database written by Microsoft experts. SQL SERVER – Introduction to Spatial Coordinate Systems: Flat Maps for a Round Planet This blog post links to the white paper explaining coordinate system, as written by Microsoft experts. After reading the above listed blog posts, I am very confident that you are ready to run the following script. Once you create a database using the World Shapefile, as mentioned in the second link above,you can display the image of India just like the following. Please note that this is not an accurate political map. The boundary of this map has many errors and it is just a representation. You can run the following query to generate the map of India from the database spatial which you have created after following the instructions here. USE Spatial GO -- India Map SELECT [CountryName] ,[BorderAsGeometry] ,[Border] FROM [Spatial].[dbo].[Countries] WHERE Countryname = 'India' GO Now, let us find the longitude and latitude of the two major IT cities of India, Hyderabad and Bangalore. I find their values as the following: the values of longitude-latitude for Bangalore is 77.5833300000 13.0000000000; for Hyderabad, longitude-latitude is 78.4675900000 17.4531200000. Now, let us try to put these values on the India Map and see their location. -- Bangalore DECLARE @GeoLocation GEOGRAPHY SET @GeoLocation = GEOGRAPHY::STPointFromText('POINT(77.5833300000 13.0000000000)',4326).STBuffer(20000); -- Hyderabad DECLARE @GeoLocation1 GEOGRAPHY SET @GeoLocation1 = GEOGRAPHY::STPointFromText('POINT(78.4675900000 17.4531200000)',4326).STBuffer(20000); -- Bangalore and Hyderabad on Map of India SELECT name, [GeoLocation] FROM [IndiaGeoNames] I WHERE I.[GeoLocation].STDistance(@GeoLocation) <= 0 UNION ALL SELECT name, [GeoLocation] FROM [IndiaGeoNames] I WHERE I.[GeoLocation].STDistance(@GeoLocation1) <= 0 UNION ALL SELECT '',[Border] FROM [Spatial].[dbo].[Countries] WHERE Countryname = 'India' GO Now let us quickly draw a straight line between them. DECLARE @GeoLocation GEOGRAPHY SET @GeoLocation = GEOGRAPHY::STPointFromText('POINT(78.4675900000 17.4531200000)',4326).STBuffer(10000); DECLARE @GeoLocation1 GEOGRAPHY SET @GeoLocation1 = GEOGRAPHY::STPointFromText('POINT(77.5833300000 13.0000000000)',4326).STBuffer(10000); DECLARE @GeoLocation2 GEOGRAPHY SET @GeoLocation2 = GEOGRAPHY::STGeomFromText('LINESTRING(78.4675900000 17.4531200000, 77.5833300000 13.0000000000)',4326) SELECT name, [GeoLocation] FROM [IndiaGeoNames] I WHERE I.[GeoLocation].STDistance(@GeoLocation) <= 0 UNION ALL SELECT name, [GeoLocation] FROM [IndiaGeoNames] I1 WHERE I1.[GeoLocation].STDistance(@GeoLocation1) <= 0 UNION ALL SELECT '' name, @GeoLocation2 UNION ALL SELECT '',[Border] FROM [Spatial].[dbo].[Countries] WHERE Countryname = 'India' GO Let us use the distance function of the spatial database and find the straight line distance between this two cities. -- Distance Between Hyderabad and Bangalore DECLARE @GeoLocation GEOGRAPHY SET @GeoLocation = GEOGRAPHY::STPointFromText('POINT(78.4675900000 17.4531200000)',4326) DECLARE @GeoLocation1 GEOGRAPHY SET @GeoLocation1 = GEOGRAPHY::STPointFromText('POINT(77.5833300000 13.0000000000)',4326) SELECT @GeoLocation.STDistance(@GeoLocation1)/1000 'KM'; GO The result of above query is as displayed in following image. As per SQL Server, the distance between these two cities is 501 KM, but according to what I know, the distance between those two cities is around 562 KM by road. However, please note that roads are not straight and they have lots of turns, whereas this is a straight-line distance. What would be more accurate is the distance between these two cities by air travel. When we look at the air travel distance between Bangalore and Hyderabad, the total distance covered is 495 KM, which is very close to what SQL Server has estimated, which is 501 KM. Bravo! SQL Server has accurately provided the distance between two of the cities. SQL Server Spatial Database can be very useful simply because it is very easy to use, as demonstrated above. I appreciate your comments, so let me know what your thoughts and opinions about this are. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: Spatial Database

    Read the article

  • SQL SERVER – Disabled Index and Update Statistics

    - by pinaldave
    When we try to update the statistics, it throws an error as if the clustered index is disabled. Now let us enable the clustered index only and attempt to update the statistics of the table right after that. Have you ever come across the situation where a conversation never gets over and it continues even though original point of discussion has passed. I am facing the same situation in the case of Disabled Index. Here is the link to original conversations. SQL SERVER – Disable Clustered Index and Data Insert – Reader had a issue here with Disabled Index SQL SERVER – Understanding ALTER INDEX ALL REBUILD with Disabled Clustered Index – Reader asked the effect of Rebuilding Indexes The same reader asked me today – “I understood what the disabled indexes do; what is their effect on statistics. Is it true that even though indexes are disabled, they continue updating the statistics?“ The answer is very interesting: If you have disabled clustered index, you will be not able to update the statistics at all for any index. If you have enabled clustered index and disabled non clustered index when you update the statistics of the table, it automatically updates the statistics of the ALL (disabled and enabled – both) the indexes on the table. If you are not satisfied with the answer, let us go over a simple example. I have written necessary comments in the code itself to have a clear idea. USE tempdb GO -- Drop Table if Exists IF EXISTS (SELECT * FROM sys.objects WHERE OBJECT_ID = OBJECT_ID(N'[dbo].[TableName]') AND type IN (N'U')) DROP TABLE [dbo].[TableName] GO -- Create Table CREATE TABLE [dbo].[TableName]( [ID] [int] NOT NULL, [FirstCol] [varchar](50) NULL ) GO -- Insert Some data INSERT INTO TableName SELECT 1, 'First' UNION ALL SELECT 2, 'Second' UNION ALL SELECT 3, 'Third' UNION ALL SELECT 4, 'Fourth' UNION ALL SELECT 5, 'Five' GO -- Create Clustered Index ALTER TABLE [TableName] ADD CONSTRAINT [PK_TableName] PRIMARY KEY CLUSTERED ([ID] ASC) GO -- Create Nonclustered Index CREATE UNIQUE NONCLUSTERED INDEX [IX_NonClustered_TableName] ON [dbo].[TableName] ([FirstCol] ASC) GO -- Check that all the indexes are enabled SELECT OBJECT_NAME(OBJECT_ID), Name, type_desc, is_disabled FROM sys.indexes WHERE OBJECT_NAME(OBJECT_ID) = 'TableName' GO Now let us update the statistics of the table and check the statistics update date. -- Update the stats of table UPDATE STATISTICS TableName WITH FULLSCAN GO -- Check Statistics Last Updated Datetime SELECT name AS index_name, STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated FROM sys.indexes WHERE OBJECT_ID = OBJECT_ID('TableName') GO Now let us disable the indexes and check if they are disabled using sys.indexes. -- Disable Indexes -- Disable Nonclustered Index ALTER INDEX [IX_NonClustered_TableName] ON [dbo].[TableName] DISABLE GO -- Disable Clustered Index ALTER INDEX [PK_TableName] ON [dbo].[TableName] DISABLE GO -- Check that all the indexes are disabled SELECT OBJECT_NAME(OBJECT_ID), Name, type_desc, is_disabled FROM sys.indexes WHERE OBJECT_NAME(OBJECT_ID) = 'TableName' GO Let us try to update the statistics of the table. -- Update the stats of table UPDATE STATISTICS TableName WITH FULLSCAN GO /* -- Above operation should thrown following error Msg 1974, Level 16, State 1, Line 1 Cannot perform the specified operation on table 'TableName' because its clustered index 'PK_TableName' is disabled. */ When we try to update the statistics it throws an error as it clustered index is disabled. Now let us enable the clustered index only and attempt to update the statistics of the table right after that. -- Now let us rebuild clustered index only ALTER INDEX [PK_TableName] ON [dbo].[TableName] REBUILD GO -- Check that all the indexes status SELECT OBJECT_NAME(OBJECT_ID), Name, type_desc, is_disabled FROM sys.indexes WHERE OBJECT_NAME(OBJECT_ID) = 'TableName' GO -- Check Statistics Last Updated Datetime SELECT name AS index_name, STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated FROM sys.indexes WHERE OBJECT_ID = OBJECT_ID('TableName') GO -- Update the stats of table UPDATE STATISTICS TableName WITH FULLSCAN GO -- Check Statistics Last Updated Datetime SELECT name AS index_name, STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated FROM sys.indexes WHERE OBJECT_ID = OBJECT_ID('TableName') GO We can clearly see that even though the nonclustered index is disabled it is also updated. If you do not need a nonclustered index, I suggest you to drop it as keeping them disabled is an overhead on your system. This is because every time the statistics are updated for system all the statistics for disabled indexesare also updated. -- Clean up DROP TABLE [TableName] GO The complete script is given below for easy reference. USE tempdb GO -- Drop Table if Exists IF EXISTS (SELECT * FROM sys.objects WHERE OBJECT_ID = OBJECT_ID(N'[dbo].[TableName]') AND type IN (N'U')) DROP TABLE [dbo].[TableName] GO -- Create Table CREATE TABLE [dbo].[TableName]( [ID] [int] NOT NULL, [FirstCol] [varchar](50) NULL ) GO -- Insert Some data INSERT INTO TableName SELECT 1, 'First' UNION ALL SELECT 2, 'Second' UNION ALL SELECT 3, 'Third' UNION ALL SELECT 4, 'Fourth' UNION ALL SELECT 5, 'Five' GO -- Create Clustered Index ALTER TABLE [TableName] ADD CONSTRAINT [PK_TableName] PRIMARY KEY CLUSTERED ([ID] ASC) GO -- Create Nonclustered Index CREATE UNIQUE NONCLUSTERED INDEX [IX_NonClustered_TableName] ON [dbo].[TableName] ([FirstCol] ASC) GO -- Check that all the indexes are enabled SELECT OBJECT_NAME(OBJECT_ID), Name, type_desc, is_disabled FROM sys.indexes WHERE OBJECT_NAME(OBJECT_ID) = 'TableName' GO -- Update the stats of table UPDATE STATISTICS TableName WITH FULLSCAN GO -- Check Statistics Last Updated Datetime SELECT name AS index_name, STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated FROM sys.indexes WHERE OBJECT_ID = OBJECT_ID('TableName') GO -- Disable Indexes -- Disable Nonclustered Index ALTER INDEX [IX_NonClustered_TableName] ON [dbo].[TableName] DISABLE GO -- Disable Clustered Index ALTER INDEX [PK_TableName] ON [dbo].[TableName] DISABLE GO -- Check that all the indexes are disabled SELECT OBJECT_NAME(OBJECT_ID), Name, type_desc, is_disabled FROM sys.indexes WHERE OBJECT_NAME(OBJECT_ID) = 'TableName' GO -- Update the stats of table UPDATE STATISTICS TableName WITH FULLSCAN GO /* -- Above operation should thrown following error Msg 1974, Level 16, State 1, Line 1 Cannot perform the specified operation on table 'TableName' because its clustered index 'PK_TableName' is disabled. */ -- Now let us rebuild clustered index only ALTER INDEX [PK_TableName] ON [dbo].[TableName] REBUILD GO -- Check that all the indexes status SELECT OBJECT_NAME(OBJECT_ID), Name, type_desc, is_disabled FROM sys.indexes WHERE OBJECT_NAME(OBJECT_ID) = 'TableName' GO -- Check Statistics Last Updated Datetime SELECT name AS index_name, STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated FROM sys.indexes WHERE OBJECT_ID = OBJECT_ID('TableName') GO -- Update the stats of table UPDATE STATISTICS TableName WITH FULLSCAN GO -- Check Statistics Last Updated Datetime SELECT name AS index_name, STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated FROM sys.indexes WHERE OBJECT_ID = OBJECT_ID('TableName') GO -- Clean up DROP TABLE [TableName] GO Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL Index, SQL Optimization, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: SQL Statistics

    Read the article

  • SQL SERVER – Solution to Puzzle – Simulate LEAD() and LAG() without Using SQL Server 2012 Analytic Function

    - by pinaldave
    Earlier I wrote a series on SQL Server Analytic Functions of SQL Server 2012. During the series to keep the learning maximum and having fun, we had few puzzles. One of the puzzle was simulating LEAD() and LAG() without using SQL Server 2012 Analytic Function. Please read the puzzle here first before reading the solution : Write T-SQL Self Join Without Using LEAD and LAG. When I was originally wrote the puzzle I had done small blunder and the question was a bit confusing which I corrected later on but wrote a follow up blog post on over here where I describe the give-away. Quick Recap: Generate following results without using SQL Server 2012 analytic functions. I had received so many valid answers. Some answers were similar to other and some were very innovative. Some answers were very adaptive and some did not work when I changed where condition. After selecting all the valid answer, I put them in table and ran RANDOM function on the same and selected winners. Here are the valid answers. No Joins and No Analytic Functions Excellent Solution by Geri Reshef – Winner of SQL Server Interview Questions and Answers (India | USA) WITH T1 AS (SELECT Row_Number() OVER(ORDER BY SalesOrderDetailID) N, s.SalesOrderID, s.SalesOrderDetailID, s.OrderQty FROM Sales.SalesOrderDetail s WHERE SalesOrderID IN (43670, 43669, 43667, 43663)) SELECT SalesOrderID,SalesOrderDetailID,OrderQty, CASE WHEN N%2=1 THEN MAX(CASE WHEN N%2=0 THEN SalesOrderDetailID END) OVER (Partition BY (N+1)/2) ELSE MAX(CASE WHEN N%2=1 THEN SalesOrderDetailID END) OVER (Partition BY N/2) END LeadVal, CASE WHEN N%2=1 THEN MAX(CASE WHEN N%2=0 THEN SalesOrderDetailID END) OVER (Partition BY N/2) ELSE MAX(CASE WHEN N%2=1 THEN SalesOrderDetailID END) OVER (Partition BY (N+1)/2) END LagVal FROM T1 ORDER BY SalesOrderID, SalesOrderDetailID, OrderQty; GO No Analytic Function and Early Bird Excellent Solution by DHall – Winner of Pluralsight 30 days Subscription -- a query to emulate LEAD() and LAG() ;WITH s AS ( SELECT 1 AS ldOffset, -- equiv to 2nd param of LEAD 1 AS lgOffset, -- equiv to 2nd param of LAG NULL AS ldDefVal, -- equiv to 3rd param of LEAD NULL AS lgDefVal, -- equiv to 3rd param of LAG ROW_NUMBER() OVER (ORDER BY SalesOrderDetailID) AS row, SalesOrderID, SalesOrderDetailID, OrderQty FROM Sales.SalesOrderDetail WHERE SalesOrderID IN (43670, 43669, 43667, 43663) ) SELECT s.SalesOrderID, s.SalesOrderDetailID, s.OrderQty, ISNULL( sLd.SalesOrderDetailID, s.ldDefVal) AS LeadValue, ISNULL( sLg.SalesOrderDetailID, s.lgDefVal) AS LagValue FROM s LEFT OUTER JOIN s AS sLd ON s.row = sLd.row - s.ldOffset LEFT OUTER JOIN s AS sLg ON s.row = sLg.row + s.lgOffset ORDER BY s.SalesOrderID, s.SalesOrderDetailID, s.OrderQty No Analytic Function and Partition By Excellent Solution by DHall – Winner of Pluralsight 30 days Subscription /* a query to emulate LEAD() and LAG() */ ;WITH s AS ( SELECT 1 AS LeadOffset, /* equiv to 2nd param of LEAD */ 1 AS LagOffset, /* equiv to 2nd param of LAG */ NULL AS LeadDefVal, /* equiv to 3rd param of LEAD */ NULL AS LagDefVal, /* equiv to 3rd param of LAG */ /* Try changing the values of the 4 integer values above to see their effect on the results */ /* The values given above of 0, 0, null and null behave the same as the default 2nd and 3rd parameters to LEAD() and LAG() */ ROW_NUMBER() OVER (ORDER BY SalesOrderDetailID) AS row, SalesOrderID, SalesOrderDetailID, OrderQty FROM Sales.SalesOrderDetail WHERE SalesOrderID IN (43670, 43669, 43667, 43663) ) SELECT s.SalesOrderID, s.SalesOrderDetailID, s.OrderQty, ISNULL( sLead.SalesOrderDetailID, s.LeadDefVal) AS LeadValue, ISNULL( sLag.SalesOrderDetailID, s.LagDefVal) AS LagValue FROM s LEFT OUTER JOIN s AS sLead ON s.row = sLead.row - s.LeadOffset /* Try commenting out this next line when LeadOffset != 0 */ AND s.SalesOrderID = sLead.SalesOrderID /* The additional join criteria on SalesOrderID above is equivalent to PARTITION BY SalesOrderID in the OVER clause of the LEAD() function */ LEFT OUTER JOIN s AS sLag ON s.row = sLag.row + s.LagOffset /* Try commenting out this next line when LagOffset != 0 */ AND s.SalesOrderID = sLag.SalesOrderID /* The additional join criteria on SalesOrderID above is equivalent to PARTITION BY SalesOrderID in the OVER clause of the LAG() function */ ORDER BY s.SalesOrderID, s.SalesOrderDetailID, s.OrderQty No Analytic Function and CTE Usage Excellent Solution by Pravin Patel - Winner of SQL Server Interview Questions and Answers (India | USA) --CTE based solution ; WITH cteMain AS ( SELECT SalesOrderID, SalesOrderDetailID, OrderQty, ROW_NUMBER() OVER (ORDER BY SalesOrderDetailID) AS sn FROM Sales.SalesOrderDetail WHERE SalesOrderID IN (43670, 43669, 43667, 43663) ) SELECT m.SalesOrderID, m.SalesOrderDetailID, m.OrderQty, sLead.SalesOrderDetailID AS leadvalue, sLeg.SalesOrderDetailID AS leagvalue FROM cteMain AS m LEFT OUTER JOIN cteMain AS sLead ON sLead.sn = m.sn+1 LEFT OUTER JOIN cteMain AS sLeg ON sLeg.sn = m.sn-1 ORDER BY m.SalesOrderID, m.SalesOrderDetailID, m.OrderQty No Analytic Function and Co-Related Subquery Usage Excellent Solution by Pravin Patel – Winner of SQL Server Interview Questions and Answers (India | USA) -- Co-Related subquery SELECT m.SalesOrderID, m.SalesOrderDetailID, m.OrderQty, ( SELECT MIN(SalesOrderDetailID) FROM Sales.SalesOrderDetail AS l WHERE l.SalesOrderID IN (43670, 43669, 43667, 43663) AND l.SalesOrderID >= m.SalesOrderID AND l.SalesOrderDetailID > m.SalesOrderDetailID ) AS lead, ( SELECT MAX(SalesOrderDetailID) FROM Sales.SalesOrderDetail AS l WHERE l.SalesOrderID IN (43670, 43669, 43667, 43663) AND l.SalesOrderID <= m.SalesOrderID AND l.SalesOrderDetailID < m.SalesOrderDetailID ) AS leag FROM Sales.SalesOrderDetail AS m WHERE m.SalesOrderID IN (43670, 43669, 43667, 43663) ORDER BY m.SalesOrderID, m.SalesOrderDetailID, m.OrderQty This was one of the most interesting Puzzle on this blog. Giveaway Winners will get following giveaways. Geri Reshef and Pravin Patel SQL Server Interview Questions and Answers (India | USA) DHall Pluralsight 30 days Subscription Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, Readers Contribution, Readers Question, SQL, SQL Authority, SQL Function, SQL Puzzle, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • SQL SERVER – PAGEIOLATCH_DT, PAGEIOLATCH_EX, PAGEIOLATCH_KP, PAGEIOLATCH_SH, PAGEIOLATCH_UP – Wait Type – Day 9 of 28

    - by pinaldave
    It is very easy to say that you replace your hardware as that is not up to the mark. In reality, it is very difficult to implement. It is really hard to convince an infrastructure team to change any hardware because they are not performing at their best. I had a nightmare related to this issue in a deal with an infrastructure team as I suggested that they replace their faulty hardware. This is because they were initially not accepting the fact that it is the fault of their hardware. But it is really easy to say “Trust me, I am correct”, while it is equally important that you put some logical reasoning along with this statement. PAGEIOLATCH_XX is such a kind of those wait stats that we would directly like to blame on the underlying subsystem. Of course, most of the time, it is correct – the underlying subsystem is usually the problem. From Book On-Line: PAGEIOLATCH_DT Occurs when a task is waiting on a latch for a buffer that is in an I/O request. The latch request is in Destroy mode. Long waits may indicate problems with the disk subsystem. PAGEIOLATCH_EX Occurs when a task is waiting on a latch for a buffer that is in an I/O request. The latch request is in Exclusive mode. Long waits may indicate problems with the disk subsystem. PAGEIOLATCH_KP Occurs when a task is waiting on a latch for a buffer that is in an I/O request. The latch request is in Keep mode. Long waits may indicate problems with the disk subsystem. PAGEIOLATCH_SH Occurs when a task is waiting on a latch for a buffer that is in an I/O request. The latch request is in Shared mode. Long waits may indicate problems with the disk subsystem. PAGEIOLATCH_UP Occurs when a task is waiting on a latch for a buffer that is in an I/O request. The latch request is in Update mode. Long waits may indicate problems with the disk subsystem. PAGEIOLATCH_XX Explanation: Simply put, this particular wait type occurs when any of the tasks is waiting for data from the disk to move to the buffer cache. ReducingPAGEIOLATCH_XX wait: Just like any other wait type, this is again a very challenging and interesting subject to resolve. Here are a few things you can experiment on: Improve your IO subsystem speed (read the first paragraph of this article, if you have not read it, I repeat that it is easy to say a step like this than to actually implement or do it). This type of wait stats can also happen due to memory pressure or any other memory issues. Putting aside the issue of a faulty IO subsystem, this wait type warrants proper analysis of the memory counters. If due to any reasons, the memory is not optimal and unable to receive the IO data. This situation can create this kind of wait type. Proper placing of files is very important. We should check file system for the proper placement of files – LDF and MDF on separate drive, TempDB on separate drive, hot spot tables on separate filegroup (and on separate disk), etc. Check the File Statistics and see if there is higher IO Read and IO Write Stall SQL SERVER – Get File Statistics Using fn_virtualfilestats. It is very possible that there are no proper indexes on the system and there are lots of table scans and heap scans. Creating proper index can reduce the IO bandwidth considerably. If SQL Server can use appropriate cover index instead of clustered index, it can significantly reduce lots of CPU, Memory and IO (considering cover index has much lesser columns than cluster table and all other it depends conditions). You can refer to the two articles’ links below previously written by me that talk about how to optimize indexes. Create Missing Indexes Drop Unused Indexes Updating statistics can help the Query Optimizer to render optimal plan, which can only be either directly or indirectly. I have seen that updating statistics with full scan (again, if your database is huge and you cannot do this – never mind!) can provide optimal information to SQL Server optimizer leading to efficient plan. Checking Memory Related Perfmon Counters SQLServer: Memory Manager\Memory Grants Pending (Consistent higher value than 0-2) SQLServer: Memory Manager\Memory Grants Outstanding (Consistent higher value, Benchmark) SQLServer: Buffer Manager\Buffer Hit Cache Ratio (Higher is better, greater than 90% for usually smooth running system) SQLServer: Buffer Manager\Page Life Expectancy (Consistent lower value than 300 seconds) Memory: Available Mbytes (Information only) Memory: Page Faults/sec (Benchmark only) Memory: Pages/sec (Benchmark only) Checking Disk Related Perfmon Counters Average Disk sec/Read (Consistent higher value than 4-8 millisecond is not good) Average Disk sec/Write (Consistent higher value than 4-8 millisecond is not good) Average Disk Read/Write Queue Length (Consistent higher value than benchmark is not good) Note: The information presented here is from my experience and there is no way that I claim it to be accurate. I suggest reading Book OnLine for further clarification. All of the discussions of Wait Stats in this blog is generic and varies from system to system. It is recommended that you test this on a development server before implementing it to a production server. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQL Wait Stats, SQL Wait Types, T SQL, Technology

    Read the article

  • SQL SERVER – Simple Example of Snapshot Isolation – Reduce the Blocking Transactions

    - by pinaldave
    To learn any technology and move to a more advanced level, it is very important to understand the fundamentals of the subject first. Today, we will be talking about something which has been quite introduced a long time ago but not properly explored when it comes to the isolation level. Snapshot Isolation was introduced in SQL Server in 2005. However, the reality is that there are still many software shops which are using the SQL Server 2000, and therefore cannot be able to maintain the Snapshot Isolation. Many software shops have upgraded to the later version of the SQL Server, but their respective developers have not spend enough time to upgrade themselves with the latest technology. “It works!” is a very common answer of many when they are asked about utilizing the new technology, instead of backward compatibility commands. In one of the recent consultation project, I had same experience when developers have “heard about it” but have no idea about snapshot isolation. They were thinking it is the same as Snapshot Replication – which is plain wrong. This is the same demo I am including here which I have created for them. In Snapshot Isolation, the updated row versions for each transaction are maintained in TempDB. Once a transaction has begun, it ignores all the newer rows inserted or updated in the table. Let us examine this example which shows the simple demonstration. This transaction works on optimistic concurrency model. Since reading a certain transaction does not block writing transaction, it also does not block the reading transaction, which reduced the blocking. First, enable database to work with Snapshot Isolation. Additionally, check the existing values in the table from HumanResources.Shift. ALTER DATABASE AdventureWorks SET ALLOW_SNAPSHOT_ISOLATION ON GO SELECT ModifiedDate FROM HumanResources.Shift GO Now, we will need two different sessions to prove this example. First Session: Set Transaction level isolation to snapshot and begin the transaction. Update the column “ModifiedDate” to today’s date. -- Session 1 SET TRANSACTION ISOLATION LEVEL SNAPSHOT BEGIN TRAN UPDATE HumanResources.Shift SET ModifiedDate = GETDATE() GO Please note that we have not yet been committed to the transaction. Now, open the second session and run the following “SELECT” statement. Then, check the values of the table. Please pay attention on setting the Isolation level for the second one as “Snapshot” at the same time when we already start the transaction using BEGIN TRAN. -- Session 2 SET TRANSACTION ISOLATION LEVEL SNAPSHOT BEGIN TRAN SELECT ModifiedDate FROM HumanResources.Shift GO You will notice that the values in the table are still original values. They have not been modified yet. Once again, go back to session 1 and begin the transaction. -- Session 1 COMMIT After that, go back to Session 2 and see the values of the table. -- Session 2 SELECT ModifiedDate FROM HumanResources.Shift GO You will notice that the values are yet not changed and they are still the same old values which were there right in the beginning of the session. Now, let us commit the transaction in the session 2. Once committed, run the same SELECT statement once more and see what the result is. -- Session 2 COMMIT SELECT ModifiedDate FROM HumanResources.Shift GO You will notice that it now reflects the new updated value. I hope that this example is clear enough as it would give you good idea how the Snapshot Isolation level works. There is much more to write about an extra level, READ_COMMITTED_SNAPSHOT, which we will be discussing in another post soon. If you wish to use this transaction’s Isolation level in your production database, I would appreciate your comments about their performance on your servers. I have included here the complete script used in this example for your quick reference. ALTER DATABASE AdventureWorks SET ALLOW_SNAPSHOT_ISOLATION ON GO SELECT ModifiedDate FROM HumanResources.Shift GO -- Session 1 SET TRANSACTION ISOLATION LEVEL SNAPSHOT BEGIN TRAN UPDATE HumanResources.Shift SET ModifiedDate = GETDATE() GO -- Session 2 SET TRANSACTION ISOLATION LEVEL SNAPSHOT BEGIN TRAN SELECT ModifiedDate FROM HumanResources.Shift GO -- Session 1 COMMIT -- Session 2 SELECT ModifiedDate FROM HumanResources.Shift GO -- Session 2 COMMIT SELECT ModifiedDate FROM HumanResources.Shift GO Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: Transaction Isolation

    Read the article

  • SQL SERVER – LCK_M_XXX – Wait Type – Day 15 of 28

    - by pinaldave
    Locking is a mechanism used by the SQL Server Database Engine to synchronize access by multiple users to the same piece of data, at the same time. In simpler words, it maintains the integrity of data by protecting (or preventing) access to the database object. From Book On-Line: LCK_M_BU Occurs when a task is waiting to acquire a Bulk Update (BU) lock. LCK_M_IS Occurs when a task is waiting to acquire an Intent Shared (IS) lock. LCK_M_IU Occurs when a task is waiting to acquire an Intent Update (IU) lock. LCK_M_IX Occurs when a task is waiting to acquire an Intent Exclusive (IX) lock. LCK_M_S Occurs when a task is waiting to acquire a Shared lock. LCK_M_SCH_M Occurs when a task is waiting to acquire a Schema Modify lock. LCK_M_SCH_S Occurs when a task is waiting to acquire a Schema Share lock. LCK_M_SIU Occurs when a task is waiting to acquire a Shared With Intent Update lock. LCK_M_SIX Occurs when a task is waiting to acquire a Shared With Intent Exclusive lock. LCK_M_U Occurs when a task is waiting to acquire an Update lock. LCK_M_UIX Occurs when a task is waiting to acquire an Update With Intent Exclusive lock. LCK_M_X Occurs when a task is waiting to acquire an Exclusive lock. LCK_M_XXX Explanation: I think the explanation of this wait type is the simplest. When any task is waiting to acquire lock on any resource, this particular wait type occurs. The common reason for the task to be waiting to put lock on the resource is that the resource is already locked and some other operations may be going on within it. This wait also indicates that resources are not available or are occupied at the moment due to some reasons. There is a good chance that the waiting queries start to time out if this wait type is very high. Client application may degrade the performance as well. You can use various methods to find blocking queries: EXEC sp_who2 SQL SERVER – Quickest Way to Identify Blocking Query and Resolution – Dirty Solution DMV – sys.dm_tran_locks DMV – sys.dm_os_waiting_tasks Reducing LCK_M_XXX wait: Check the Explicit Transactions. If transactions are very long, this wait type can start building up because of other waiting transactions. Keep the transactions small. Serialization Isolation can build up this wait type. If that is an acceptable isolation for your business, this wait type may be natural. The default isolation of SQL Server is ‘Read Committed’. One of my clients has changed their isolation to “Read Uncommitted”. I strongly discourage the use of this because this will probably lead to having lots of dirty data in the database. Identify blocking queries mentioned using various methods described above, and then optimize them. Partition can be one of the options to consider because this will allow transactions to execute concurrently on different partitions. If there are runaway queries, use timeout. (Please discuss this solution with your database architect first as timeout can work against you). Check if there is no memory and IO-related issue using the following counters: Checking Memory Related Perfmon Counters SQLServer: Memory Manager\Memory Grants Pending (Consistent higher value than 0-2) SQLServer: Memory Manager\Memory Grants Outstanding (Consistent higher value, Benchmark) SQLServer: Buffer Manager\Buffer Hit Cache Ratio (Higher is better, greater than 90% for usually smooth running system) SQLServer: Buffer Manager\Page Life Expectancy (Consistent lower value than 300 seconds) Memory: Available Mbytes (Information only) Memory: Page Faults/sec (Benchmark only) Memory: Pages/sec (Benchmark only) Checking Disk Related Perfmon Counters Average Disk sec/Read (Consistent higher value than 4-8 millisecond is not good) Average Disk sec/Write (Consistent higher value than 4-8 millisecond is not good) Average Disk Read/Write Queue Length (Consistent higher value than benchmark is not good) Read all the post in the Wait Types and Queue series. Note: The information presented here is from my experience and there is no way that I claim it to be accurate. I suggest reading Book OnLine for further clarification. All the discussion of Wait Stats in this blog is generic and varies from system to system. It is recommended that you test this on a development server before implementing it to a production server. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQL Wait Stats, SQL Wait Types, T SQL, Technology

    Read the article

  • SQL SERVER – PAGELATCH_DT, PAGELATCH_EX, PAGELATCH_KP, PAGELATCH_SH, PAGELATCH_UP – Wait Type – Day 12 of 28

    - by pinaldave
    This is another common wait type. However, I still frequently see people getting confused with PAGEIOLATCH_X and PAGELATCH_X wait types. Actually, there is a big difference between the two. PAGEIOLATCH is related to IO issues, while PAGELATCH is not related to IO issues but is oftentimes linked to a buffer issue. Before we delve deeper in this interesting topic, first let us understand what Latch is. Latches are internal SQL Server locks which can be described as very lightweight and short-term synchronization objects. Latches are not primarily to protect pages being read from disk into memory. It’s a synchronization object for any in-memory access to any portion of a log or data file.[Updated based on comment of Paul Randal] The difference between locks and latches is that locks seal all the involved resources throughout the duration of the transactions (and other processes will have no access to the object), whereas latches locks the resources during the time when the data is changed. This way, a latch is able to maintain the integrity of the data between storage engine and data cache. A latch is a short-living lock that is put on resources on buffer cache and in the physical disk when data is moved in either directions. As soon as the data is moved, the latch is released. Now, let us understand the wait stat type  related to latches. From Book On-Line: PAGELATCH_DT Occurs when a task is waiting on a latch for a buffer that is not in an I/O request. The latch request is in Destroy mode. PAGELATCH_EX Occurs when a task is waiting on a latch for a buffer that is not in an I/O request. The latch request is in Exclusive mode. PAGELATCH_KP Occurs when a task is waiting on a latch for a buffer that is not in an I/O request. The latch request is in Keep mode. PAGELATCH_SH Occurs when a task is waiting on a latch for a buffer that is not in an I/O request. The latch request is in Shared mode. PAGELATCH_UP Occurs when a task is waiting on a latch for a buffer that is not in an I/O request. The latch request is in Update mode. PAGELATCH_X Explanation: When there is a contention of access of the in-memory pages, this wait type shows up. It is quite possible that some of the pages in the memory are of very high demand. For the SQL Server to access them and put a latch on the pages, it will have to wait. This wait type is usually created at the same time. Additionally, it is commonly visible when the TempDB has higher contention as well. If there are indexes that are heavily used, contention can be created as well, leading to this wait type. Reducing PAGELATCH_X wait: The following counters are useful to understand the status of the PAGELATCH: Average Latch Wait Time (ms): The wait time for latch requests that have to wait. Latch Waits/sec: This is the number of latch requests that could not be granted immediately. Total Latch Wait Time (ms): This is the total latch wait time for latch requests in the last second. If there is TempDB contention, I suggest that you read the blog post of Robert Davis right away. He has written an excellent blog post regarding how to find out TempDB contention. The same blog post explains the terms in the allocation of GAM, SGAM and PFS. If there was a TempDB contention, Paul Randal explains the optimal settings for the TempDB in his misconceptions series. Trace Flag 1118 can be useful but use it very carefully. I totally understand that this blog post is not as clear as my other blog posts. I suggest if this wait stats is on one of your higher wait type. Do leave a comment or send me an email and I will get back to you with my solution for your situation. May the looking at all other wait stats and types together become effective as this wait type can help suggest proper bottleneck in your system. Read all the post in the Wait Types and Queue series. Note: The information presented here is from my experience and there is no way that I claim it to be accurate. I suggest reading Book OnLine for further clarification. All the discussions of Wait Stats in this blog are generic and vary from system to system. It is recommended that you test this on a development server before implementing it to a production server. Reference: Pinal Dave (http://blog.SQLAuthority.com)   Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQL Wait Stats, SQL Wait Types, T SQL, Technology

    Read the article

  • SQLAuthority News – Training and Consultancy and Travel – Story of 30 Last 30 Days

    - by pinaldave
    Today’s blog post is not technical as usual. Here, I present a real story, and I also invite you all to share your thoughts or opinions on this post. I am a professional SQL Server Trainer; I also do consultation in the area of the Performance Tuning and Query Optimizations. In any month, I like the mix of both in my schedule. I prefer to do training for one week, and then commit the next week for some consultation work. Due to the advancement in technology, for most of the consultation works, there is no client location visit or first time visit for understanding the project. Usually, I conduct high-end training sessions or 400 level training, and these training sessions are very intensive most of the time. Always after completing the training for 5 days straight away at 400 level, I make sure to take out some time to cool down and relax. During this time, I prefer to work on optimization projects. Consultancy is great as it keeps me updated regarding what is going on in the real world. As we all know, all those trainers who have real world experience are always considered to be the best trainers. My learning is immense during my consultations with the real client and while resolving real problems. I share the same with my students the very next week when I go for training sessions. For the same reason, every class is different from the previous ones. An experience trainer would tell you that the class is best if it is driven by Students the way instructor wants! The best scenario is as described above; but you won’t get the best scenario all the time. I was on road for nearly 25 days out of the last 30 days and involved in doing various SQL Server-related trainings. Here is what I have done in the last 30 days. I have gathered the following details from my expenditure reports, which are maintained by my wife. There are few points related to my personal expenses and few other related to business. I maintain a separate list for each of these expenses, but here I have aggregated them. Last 30 days - Training 23 days - 4 – two days training classes – 8 days of training 3 – five days training classes – 15 days of training 1 – one day training classes – 2 days of training Flights 18 flights - 8 – Kingfisher 6 – Spicejet 2 – Jet light 2 – Jet connect Stay in different cities Hyderabad – 16 days Chennai – 6 days Bangalore – 2 days Ahmedabad – 6 days (Hometown) Meals – 54 (Averaging less than 2 per day) Room Services – 16 times Training Campuses – 20 times Restaurants – 6 times Home – 12 times Taxi/Cabs – 64 times (Averaging more than 2 per day) Hotel Cab – 34 times Meru Cab – 8 times Easy Cabs – 10 times Auto Rickshaw – 2 times Looking at the above statistics, I can see that I have eaten less than what I should have, which is not good, and traveled in taxi more than what I should have. Also the temperatures in different cities were very different, not to mention the humidity as well. I missed my family, especially my little girl (9 months). When I was at home, I used to have a proper healthy meal every single day; however, when I was traveling, the food was something I had to compromise on. I have previously written about my travel experience with different airlines, my opinion is still same about them. Well, I have question to all of you road warriors, how do you manage your health and enthusiasm during situations I am going through. I have couple of time stomach upset as well sour throat. I drink lots of water and do my best to keep up. Any idea? Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: About Me, Pinal Dave, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • SQLAuthority News – A Real Story of Book Getting ‘Out of Stock’ to A 25% Discount Story Available

    - by pinaldave
    As many of my readers may know, I have recently written a few books.  Right now I’d like to talk about SQL Server Interview Questions and Answers (http://bit.ly/sqlinterviewbook ), my newest release. What inspired me to write this book was similar to my motivations for my previous titles – I wanted to help people understand SQL Server concepts and ace interview questions so that they could get a great job they love, as much as I love my own job. If you are new to SQL Server, don’t think I left you out of my book writing efforts. If you are new to the subject or have not had to deal with SQL Server in a long time, this book is perfect for someone who wants or needs a last minute refresher. If you are facing an upcoming interview and want to impress your future bosses, this book is perfect for getting you up to speed in a short time. However, if you are already an expert, you will still find a lot to learn and many pointers and suggestions that go deep into the subject. As I said before, I wrote this book in order to help my community, and I certainly hoped that this book would become popular. However, we decided to print a very limited number of copies to begin with. We did not think that it would sell out since much of the information is available for free online. We could not have been more wrong! We incorrectly estimated what people wanted. We did not realize that there is still a need and an interest for structured learning. So, with great reservations, we printed quite a large number of copies – and it still ran out in 36 hours! We got call from the online store with a request for more copies within 12 hours. But we had printed only as many as we had sent them. There were no extra copies. We finally talked to the printer to get more copies. However, due to festivals and holidays the copies could not be shipped to the online retailer for two days. We knew for sure that they were going to be out of the book for 48 hours. 48 hours – this was very difficult as the book was very highly anticipated. Many people wanted to buy this book quickly, and receive it soon in order to meet a deadline or to study for an upcoming test of their knowledge. But now this book was out of stock on the retail store. The way the online store works is that if the Indian-priced book is not there they list the US version of the book so that buyers will not be disappointed. The problem was that the US price of the book is three times more than the Indian price – which means one has to pay three times as much to buy this book instead of the previous very low price. We received a lot of communication on this subject, here are some examples: We are now businessmen and only focusing on money Why has the price tripled in 36 hours Why we are not honest with the price If the prices will ever come down And some of the letters we cannot post here! Well, finally after 48 hours the Indian stock was finally available online. Thanks to our printer who worked day and night to get all the copies printed. He divided the complete stock in two parts. The first part they sent immediately to online retailer  and the second part they kept with them to sell. Finally, the online retailer got them online promptly as well, and the price returned to normal. Our book once again got in business and became the eighth most popular new release in 36 hours. We appreciate your love and support. Without all of your interest and love we would have never come this far and the book would not be so successful. After thinking about all your support and how patient you were with our online troubles, the online retailer has decided to give an extra 25% discount for a limited time only. I think the 48 hours when the book was out of stock were very horrible and stressful and I’d like to apologize to my loyal readers for the mishap. I hope that the 25% off is enough to sooth any remaining hurt feelings, and that everyone will continue to learn and discover things in the book. Once again thank you so much and I truly hope that you all enjoy reading the book as much as I enjoyed writing it. My book SQL Server Interview Questions and Answers is available now. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: About Me, Pinal Dave, PostADay, SQL, SQL Authority, SQL Interview Questions and Answers, SQL Query, SQL Server, SQL Tips and Tricks, SQLAuthority Book Review, SQLAuthority News, T SQL, Technology

    Read the article

  • SQL SERVER – Solution – Puzzle – SELECT * vs SELECT COUNT(*)

    - by pinaldave
    Earlier I have published Puzzle Why SELECT * throws an error but SELECT COUNT(*) does not. This question have received many interesting comments. Let us go over few of the answers, which are valid. Before I start the same, let me acknowledge Rob Farley who has not only answered correctly very first but also started interesting conversation in the same thread. The usual question will be what is the right answer. I would like to point to official Microsoft Connect Items which discusses the same. RGarvao https://connect.microsoft.com/SQLServer/feedback/details/671475/select-test-where-exists-select tiberiu utan http://connect.microsoft.com/SQLServer/feedback/details/338532/count-returns-a-value-1 Rob Farley count(*) is about counting rows, not a particular column. It doesn’t even look to see what columns are available, it’ll just count the rows, which in the case of a missing FROM clause, is 1. “select *” is designed to return columns, and therefore barfs if there are none available. Even more odd is this one: select ‘blah’ where exists (select *) You might be surprised at the results… Koushik The engine performs a “Constant scan” for Count(*) where as in the case of “SELECT *” the engine is trying to perform either Index/Cluster/Table scans. amikolaj When you query ‘select * from sometable’, SQL replaces * with the current schema of that table. With out a source for the schema, SQL throws an error. so when you query ‘select count(*)’, you are counting the one row. * is just a constant to SQL here. Check out the execution plan. Like the description states – ‘Scan an internal table of constants.’ You could do ‘select COUNT(‘my name is adam and this is my answer’)’ and get the same answer. Netra Acharya SELECT * Here, * represents all columns from a table. So it always looks for a table (As we know, there should be FROM clause before specifying table name). So, it throws an error whenever this condition is not satisfied. SELECT COUNT(*) Here, COUNT is a Function. So it is not mandetory to provide a table. Check it out this: DECLARE @cnt INT SET @cnt = COUNT(*) SELECT @cnt SET @cnt = COUNT(‘x’) SELECT @cnt Naveen Select 1 / Select ‘*’ will return 1/* as expected. Select Count(1)/Count(*) will return the count of result set of select statement. Count(1)/Count(*) will have one 1/* for each row in the result set of select statement. Select 1 or Select ‘*’ result set will contain only 1 result. so count is 1. Where as “Select *” is a sysntax which expects the table or equauivalent to table (table functions, etc..). It is like compilation error for that query. Ramesh Hi Friends, Count is an aggregate function and it expects the rows (list of records) for a specified single column or whole rows for *. So, when we use ‘select *’ it definitely give and error because ‘*’ is meant to have all the fields but there is not any table and without table it can only raise an error. So, in the case of ‘Select Count(*)’, there will be an error as a record in the count function so you will get the result as ’1'. Try using : Select COUNT(‘RAMESH’) and think there is an error ‘Must specify table to select from.’ in place of ‘RAMESH’ Pinal : If i am wrong then please clarify this. Sachin Nandanwar Any aggregate function expects a constant or a column name as an expression. DO NOT be confused with * in an aggregate function.The aggregate function does not treat it as a column name or a set of column names but a constant value, as * is a key word in SQL. You can replace any value instead of * for the COUNT function.Ex Select COUNT(5) will result as 1. The error resulting from select * is obvious it expects an object where it can extract the result set. I sincerely thank you all for wonderful conversation, I personally enjoyed it and I am sure all of you have the same feeling. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: CodeProject, Pinal Dave, PostADay, Readers Contribution, Readers Question, SQL, SQL Authority, SQL Puzzle, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQLServer, T SQL, Technology

    Read the article

  • SQL SERVER – Fundamentals of Columnstore Index

    - by pinaldave
    There are two kind of storage in database. Row Store and Column Store. Row store does exactly as the name suggests – stores rows of data on a page – and column store stores all the data in a column on the same page. These columns are much easier to search – instead of a query searching all the data in an entire row whether the data is relevant or not, column store queries need only to search much lesser number of the columns. This means major increases in search speed and hard drive use. Additionally, the column store indexes are heavily compressed, which translates to even greater memory and faster searches. I am sure this looks very exciting and it does not mean that you convert every single index from row store to column store index. One has to understand the proper places where to use row store or column store indexes. Let us understand in this article what is the difference in Columnstore type of index. Column store indexes are run by Microsoft’s VertiPaq technology. However, all you really need to know is that this method of storing data is columns on a single page is much faster and more efficient. Creating a column store index is very easy, and you don’t have to learn new syntax to create them. You just need to specify the keyword “COLUMNSTORE” and enter the data as you normally would. Keep in mind that once you add a column store to a table, though, you cannot delete, insert or update the data – it is READ ONLY. However, since column store will be mainly used for data warehousing, this should not be a big problem. You can always use partitioning to avoid rebuilding the index. A columnstore index stores each column in a separate set of disk pages, rather than storing multiple rows per page as data traditionally has been stored. The difference between column store and row store approaches is illustrated below: In case of the row store indexes multiple pages will contain multiple rows of the columns spanning across multiple pages. In case of column store indexes multiple pages will contain multiple single columns. This will lead only the columns needed to solve a query will be fetched from disk. Additionally there is good chance that there will be redundant data in a single column which will further help to compress the data, this will have positive effect on buffer hit rate as most of the data will be in memory and due to same it will not need to be retrieved. Let us see small example of how columnstore index improves the performance of the query on a large table. As a first step let us create databaseset which is large enough to show performance impact of columnstore index. The time taken to create sample database may vary on different computer based on the resources. USE AdventureWorks GO -- Create New Table CREATE TABLE [dbo].[MySalesOrderDetail]( [SalesOrderID] [int] NOT NULL, [SalesOrderDetailID] [int] NOT NULL, [CarrierTrackingNumber] [nvarchar](25) NULL, [OrderQty] [smallint] NOT NULL, [ProductID] [int] NOT NULL, [SpecialOfferID] [int] NOT NULL, [UnitPrice] [money] NOT NULL, [UnitPriceDiscount] [money] NOT NULL, [LineTotal] [numeric](38, 6) NOT NULL, [rowguid] [uniqueidentifier] NOT NULL, [ModifiedDate] [datetime] NOT NULL ) ON [PRIMARY] GO -- Create clustered index CREATE CLUSTERED INDEX [CL_MySalesOrderDetail] ON [dbo].[MySalesOrderDetail] ( [SalesOrderDetailID]) GO -- Create Sample Data Table -- WARNING: This Query may run upto 2-10 minutes based on your systems resources INSERT INTO [dbo].[MySalesOrderDetail] SELECT S1.* FROM Sales.SalesOrderDetail S1 GO 100 Now let us do quick performance test. I have kept STATISTICS IO ON for measuring how much IO following queries take. In my test first I will run query which will use regular index. We will note the IO usage of the query. After that we will create columnstore index and will measure the IO of the same. -- Performance Test -- Comparing Regular Index with ColumnStore Index USE AdventureWorks GO SET STATISTICS IO ON GO -- Select Table with regular Index SELECT ProductID, SUM(UnitPrice) SumUnitPrice, AVG(UnitPrice) AvgUnitPrice, SUM(OrderQty) SumOrderQty, AVG(OrderQty) AvgOrderQty FROM [dbo].[MySalesOrderDetail] GROUP BY ProductID ORDER BY ProductID GO -- Table 'MySalesOrderDetail'. Scan count 1, logical reads 342261, physical reads 0, read-ahead reads 0. -- Create ColumnStore Index CREATE NONCLUSTERED COLUMNSTORE INDEX [IX_MySalesOrderDetail_ColumnStore] ON [MySalesOrderDetail] (UnitPrice, OrderQty, ProductID) GO -- Select Table with Columnstore Index SELECT ProductID, SUM(UnitPrice) SumUnitPrice, AVG(UnitPrice) AvgUnitPrice, SUM(OrderQty) SumOrderQty, AVG(OrderQty) AvgOrderQty FROM [dbo].[MySalesOrderDetail] GROUP BY ProductID ORDER BY ProductID GO It is very clear from the results that query is performance extremely fast after creating ColumnStore Index. The amount of the pages it has to read to run query is drastically reduced as the column which are needed in the query are stored in the same page and query does not have to go through every single page to read those columns. If we enable execution plan and compare we can see that column store index performance way better than regular index in this case. Let us clean up the database. -- Cleanup DROP INDEX [IX_MySalesOrderDetail_ColumnStore] ON [dbo].[MySalesOrderDetail] GO TRUNCATE TABLE dbo.MySalesOrderDetail GO DROP TABLE dbo.MySalesOrderDetail GO In future posts we will see cases where Columnstore index is not appropriate solution as well few other tricks and tips of the columnstore index. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Index, SQL Optimization, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • SQL SERVER – CXPACKET – Parallelism – Usual Solution – Wait Type – Day 6 of 28

    - by pinaldave
    CXPACKET has to be most popular one of all wait stats. I have commonly seen this wait stat as one of the top 5 wait stats in most of the systems with more than one CPU. Books On-Line: Occurs when trying to synchronize the query processor exchange iterator. You may consider lowering the degree of parallelism if contention on this wait type becomes a problem. CXPACKET Explanation: When a parallel operation is created for SQL Query, there are multiple threads for a single query. Each query deals with a different set of the data (or rows). Due to some reasons, one or more of the threads lag behind, creating the CXPACKET Wait Stat. There is an organizer/coordinator thread (thread 0), which takes waits for all the threads to complete and gathers result together to present on the client’s side. The organizer thread has to wait for the all the threads to finish before it can move ahead. The Wait by this organizer thread for slow threads to complete is called CXPACKET wait. Note that not all the CXPACKET wait types are bad. You might experience a case when it totally makes sense. There might also be cases when this is unavoidable. If you remove this particular wait type for any query, then that query may run slower because the parallel operations are disabled for the query. Reducing CXPACKET wait: We cannot discuss about reducing the CXPACKET wait without talking about the server workload type. OLTP: On Pure OLTP system, where the transactions are smaller and queries are not long but very quick usually, set the “Maximum Degree of Parallelism” to 1 (one). This way it makes sure that the query never goes for parallelism and does not incur more engine overhead. EXEC sys.sp_configure N'cost threshold for parallelism', N'1' GO RECONFIGURE WITH OVERRIDE GO Data-warehousing / Reporting server: As queries will be running for long time, it is advised to set the “Maximum Degree of Parallelism” to 0 (zero). This way most of the queries will utilize the parallel processor, and long running queries get a boost in their performance due to multiple processors. EXEC sys.sp_configure N'cost threshold for parallelism', N'0' GO RECONFIGURE WITH OVERRIDE GO Mixed System (OLTP & OLAP): Here is the challenge. The right balance has to be found. I have taken a very simple approach. I set the “Maximum Degree of Parallelism” to 2, which means the query still uses parallelism but only on 2 CPUs. However, I keep the “Cost Threshold for Parallelism” very high. This way, not all the queries will qualify for parallelism but only the query with higher cost will go for parallelism. I have found this to work best for a system that has OLTP queries and also where the reporting server is set up. Here, I am setting ‘Cost Threshold for Parallelism’ to 25 values (which is just for illustration); you can choose any value, and you can find it out by experimenting with the system only. In the following script, I am setting the ‘Max Degree of Parallelism’ to 2, which indicates that the query that will have a higher cost (here, more than 25) will qualify for parallel query to run on 2 CPUs. This implies that regardless of the number of CPUs, the query will select any two CPUs to execute itself. EXEC sys.sp_configure N'cost threshold for parallelism', N'25' GO EXEC sys.sp_configure N'max degree of parallelism', N'2' GO RECONFIGURE WITH OVERRIDE GO Read all the post in the Wait Types and Queue series. Additionally a must read comment of Jonathan Kehayias. Note: The information presented here is from my experience and I no way claim it to be accurate. I suggest you all to read the online book for further clarification. All the discussion of Wait Stats over here is generic and it varies from system to system. It is recommended that you test this on the development server before implementing on the production server. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: DMV, Pinal Dave, PostADay, SQL, SQL Authority, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQL Wait Stats, SQL Wait Types, T SQL, Technology

    Read the article

  • SQL SERVER – Interview Questions and Answers – Frequently Asked Questions – Introduction – Day 1 of 31

    - by pinaldave
    List of all the Interview Questions and Answers Series blogs Posts covering interview questions and answers always make for interesting reading.  Some people like the subject for their helpful hints and thought provoking subject, and others dislike these posts because they feel it is nothing more than cheating.  I’d like to discuss the pros and cons of a Question and Answer format here. Interview Questions and Answers are Helpful Just like blog posts, books, and articles, interview Question and Answer discussions are learning material.  The popular Dummy’s books or Idiots Guides are not only for “dummies,” but can help everyone relearn the fundamentals.  Question and Answer discussions can serve the same purpose.  You could call this SQL Server Fundamentals or SQL Server 101. I have administrated hundreds of interviews during my career and I have noticed that sometimes an interviewee with several years of experience lacks an understanding of the fundamentals.  These individuals have been in the industry for so long, usually working on a very specific project, that the ABCs of the business have slipped their mind. Or, when a college graduate is looking to get into the industry, he is not expected to have experience since he is just graduated. However, the new grad is expected to have an understanding of fundamentals and theory.  Sometimes after the stress of final exams and graduation, it can be difficult to remember the correct answers to interview questions, though. An interview Question and Answer discussion can be very helpful to both these individuals.  It is simply a way to go back over the building blocks of a topic.  Many times a simple review like this will help “jog” your memory, and all those previously-memorized facts will come flooding back to you.  It is not a way to re-learn a topic, but a way to remind yourself of what you already know. A Question and Answer discussion can also be a way to go over old topics in a more interesting manner.  Especially if you have been working in the industry, or taking lots of classes on the topic, everything you read can sound like a repeat of what you already know.  Going over a topic in a new format can make the material seem fresh and interesting.  And an interested mind will be more engaged and remember more in the end. Interview Questions and Answers are Harmful A common argument against a Question and Answer discussion is that it will give someone a “cheat sheet.” A new guy with relatively little experience can read the interview questions and answers, and then memorize them. When an interviewer asks him the same questions, he will repeat the answers and get the job. Honestly, is he good hire because he memorized the interview questions? Wouldn’t it be better for the interviewer to hire someone with actual experience?  The answer is not as easy as it seems – there are many different factors to be considered. If the interviewer is asking fundamentals-related questions only, he gets the answers he wants to hear, and then hires this first candidate – there is a good chance that he is hiring based on personality rather than experience.  If the interviewer is smart he will ask deeper questions, have more than one person on the interview team, and interview a variety of candidates.  If one interviewee happens to memorize some answers, it usually doesn’t mean he will automatically get the job at the expense of more qualified candidates. Another argument against interview Question and Answers is that it will give candidates a false sense of confidence, and that they will appear more qualified than they are. Well, if that is true, it will not last after the first interview when the candidate is asked difficult questions and he cannot find the answers in the list of interview Questions and Answers.  Besides, confidence is one of the best things to walk into an interview with! In today’s competitive job market, there are often hundreds of candidates applying for the same position.  With so many applicants to choose from, interviewers must make decisions about who to call back and who to hire based on their gut feeling.  One drawback to reading an interview Question and Answer article is that you might sound very boring in your interview – saying the same thing as every single candidate, and parroting answers that sound like someone else wrote them for you – because they did.  However, it is definitely better to go to an interview prepared, just make sure that you give a lot of thought to your answers to make them sound like your own voice.  Remember that you will be hired based on your skills as well as your personality, so don’t think that having all the right answers will make get you hired.  A good interviewee will be prepared, confident, and know how to stand out. My Opinion A list of interview Questions and Answers is really helpful as a refresher or for beginners. To really ace an interview, one needs to have real-world, hands-on experience with SQL Server as well. Interview questions just serve as a starter or easy read for experienced professionals. When I have to learn new technology, I often search online for interview questions and get an idea about the breadth and depth of the technology. Next Action I am going to write about interview Questions and Answers for next 30 days. I have previously written a series of interview questions and answers; now I have re-written them keeping the latest version of SQL Server and current industry progress in mind. If you have faced interesting interview questions or situations, please write to me and I will publish them as a guest post. If you want me to add few more details, leave a comment and I will make sure that I do my best to accommodate. Tomorrow we will start the interview Questions and Answers series, with a few interesting stories, best practices and guest posts. We will have a prize give-away and other awards when the series ends. List of all the Interview Questions and Answers Series blogs Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Interview Questions and Answers, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • SQL SERVER – 5 Tips for Improving Your Data with expressor Studio

    - by pinaldave
    It’s no secret that bad data leads to bad decisions and poor results.  However, how do you prevent dirty data from taking up residency in your data store?  Some might argue that it’s the responsibility of the person sending you the data.  While that may be true, in practice that will rarely hold up.  It doesn’t matter how many times you ask, you will get the data however they decide to provide it. So now you have bad data.  What constitutes bad data?  There are quite a few valid answers, for example: Invalid date values Inappropriate characters Wrong data Values that exceed a pre-set threshold While it is certainly possible to write your own scripts and custom SQL to identify and deal with these data anomalies, that effort often takes too long and becomes difficult to maintain.  Instead, leveraging an ETL tool like expressor Studio makes the data cleansing process much easier and faster.  Below are some tips for leveraging expressor to get your data into tip-top shape. Tip 1:     Build reusable data objects with embedded cleansing rules One of the new features in expressor Studio 3.2 is the ability to define constraints at the metadata level.  Using expressor’s concept of Semantic Types, you can define reusable data objects that have embedded logic such as constraints for dealing with dirty data.  Once defined, they can be saved as a shared atomic type and then re-applied to other data attributes in other schemas. As you can see in the figure above, I’ve defined a constraint on zip code.  I can then save the constraint rules I defined for zip code as a shared atomic type called zip_type for example.   The next time I get a different data source with a schema that also contains a zip code field, I can simply apply the shared atomic type (shown below) and the previously defined constraints will be automatically applied. Tip 2:     Unlock the power of regular expressions in Semantic Types Another powerful feature introduced in expressor Studio 3.2 is the option to use regular expressions as a constraint.   A regular expression is used to identify patterns within data.   The patterns could be something as simple as a date format or something much more complex such as a street address.  For example, I could define that a valid IP address should be made up of 4 numbers, each 0 to 255, and separated by a period.  So 192.168.23.123 might be a valid IP address whereas 888.777.0.123 would not be.   How can I account for this using regular expressions? A very simple regular expression that would look for any 4 sets of 3 digits separated by a period would be:  ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ Alternatively, the following would be the exact check for truly valid IP addresses as we had defined above:  ^(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])$ .  In expressor, we would enter this regular expression as a constraint like this: Here we select the corrective action to be ‘Escalate’, meaning that the expressor Dataflow operator will decide what to do.  Some of the options include rejecting the offending record, skipping it, or aborting the dataflow. Tip 3:     Email pattern expressions that might come in handy In the example schema that I am using, there’s a field for email.  Email addresses are often entered incorrectly because people are trying to avoid spam.  While there are a lot of different ways to define what constitutes a valid email address, a quick search online yields a couple of really useful regular expressions for validating email addresses: This one is short and sweet:  \b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b (Source: http://www.regular-expressions.info/) This one is more specific about which characters are allowed:  ^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$ (Source: http://regexlib.com/REDetails.aspx?regexp_id=26 ) Tip 4:     Reject “dirty data” for analysis or further processing Yet another feature introduced in expressor Studio 3.2 is the ability to reject records based on constraint violations.  To capture reject records on input, simply specify Reject Record in the Error Handling setting for the Read File operator.  Then attach a Write File operator to the reject port of the Read File operator as such: Next, in the Write File operator, you can configure the expressor operator in a similar way to the Read File.  The key difference would be that the schema needs to be derived from the upstream operator as shown below: Once configured, expressor will output rejected records to the file you specified.  In addition to the rejected records, expressor also captures some diagnostic information that will be helpful towards identifying why the record was rejected.  This makes diagnosing errors much easier! Tip 5:    Use a Filter or Transform after the initial cleansing to finish the job Sometimes you may want to predicate the data cleansing on a more complex set of conditions.  For example, I may only be interested in processing data containing males over the age of 25 in certain zip codes.  Using an expressor Filter operator, you can define the conditional logic which isolates the records of importance away from the others. Alternatively, the expressor Transform operator can be used to alter the input value via a user defined algorithm or transformation.  It also supports the use of conditional logic and data can be rejected based on constraint violations. However, the best tip I can leave you with is to not constrain your solution design approach – expressor operators can be combined in many different ways to achieve the desired results.  For example, in the expressor Dataflow below, I can post-process the reject data from the Filter which did not meet my pre-defined criteria and, if successful, Funnel it back into the flow so that it gets written to the target table. I continue to be impressed that expressor offers all this functionality as part of their FREE expressor Studio desktop ETL tool, which you can download from here.  Their Studio ETL tool is absolutely free and they are very open about saying that if you want to deploy their software on a dedicated Windows Server, you need to purchase their server software, whose pricing is posted on their website. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • SQL SERVER – Extending SQL Azure with Azure worker role – Guest Post by Paras Doshi

    - by pinaldave
    This is guest post by Paras Doshi. Paras Doshi is a research Intern at SolidQ.com and a Microsoft student partner. He is currently working in the domain of SQL Azure. SQL Azure is nothing but a SQL server in the cloud. SQL Azure provides benefits such as on demand rapid provisioning, cost-effective scalability, high availability and reduced management overhead. To see an introduction on SQL Azure, check out the post by Pinal here In this article, we are going to discuss how to extend SQL Azure with the Azure worker role. In other words, we will attempt to write a custom code and host it in the Azure worker role; the aim is to add some features that are not available with SQL Azure currently or features that need to be customized for flexibility. This way we extend the SQL Azure capability by building some solutions that run on Azure as worker roles. To understand Azure worker role, think of it as a windows service in cloud. Azure worker role can perform background processes, and to handle processes such as synchronization and backup, it becomes our ideal tool. First, we will focus on writing a worker role code that synchronizes SQL Azure databases. Before we do so, let’s see some scenarios in which synchronization between SQL Azure databases is beneficial: scaling out access over multiple databases enables us to handle workload efficiently As of now, SQL Azure database can be hosted in one of any six datacenters. By synchronizing databases located in different data centers, one can extend the data by enabling access to geographically distributed data Let us see some scenarios in which SQL server to SQL Azure database synchronization is beneficial To backup SQL Azure database on local infrastructure Rather than investing in local infrastructure for increased workloads, such workloads could be handled by cloud Ability to extend data to different datacenters located across the world to enable efficient data access from remote locations Now, let us develop cloud-based app that synchronizes SQL Azure databases. For an Introduction to developing cloud based apps, click here Now, in this article, I aim to provide a bird’s eye view of how a code that synchronizes SQL Azure databases look like and then list resources that can help you develop the solution from scratch. Now, if you newly add a worker role to the cloud-based project, this is how the code will look like. (Note: I have added comments to the skeleton code to point out the modifications that will be required in the code to carry out the SQL Azure synchronization. Note the placement of Setup() and Sync() function.) Click here (http://parasdoshi1989.files.wordpress.com/2011/06/code-snippet-1-for-extending-sql-azure-with-azure-worker-role1.pdf ) Enabling SQL Azure databases synchronization through sync framework is a two-step process. In the first step, the database is provisioned and sync framework creates tracking tables, stored procedures, triggers, and tables to store metadata to enable synchronization. This is one time step. The code for the same is put in the setup() function which is called once when the worker role starts. Now, the second step is continuous (or on demand) synchronization of SQL Azure databases by propagating changes between databases. This is done on a continuous basis by calling the sync() function in the while loop. The code logic to synchronize changes between SQL Azure databases should be put in the sync() function. Discussing the coding part step by step is out of the scope of this article. Therefore, let me suggest you a resource, which is given here. Also, note that before you start developing the code, you will need to install SYNC framework 2.1 SDK (download here). Further, you will reference some libraries before you start coding. Details regarding the same are available in the article that I just pointed to. You will be charged for data transfers if the databases are not in the same datacenter. For pricing information, go here Currently, a tool named DATA SYNC, which is built on top of sync framework, is available in CTP that allows SQL Azure <-> SQL server and SQL Azure <-> SQL Azure synchronization (without writing single line of code); however, in some cases, the custom code shown in this blogpost provides flexibility that is not available with Data SYNC. For instance, filtering is not supported in the SQL Azure DATA SYNC CTP2; if you wish to have such a functionality now, then you have the option of developing a custom code using SYNC Framework. Now, this code can be easily extended to synchronize at some schedule. Let us say we want the databases to get synchronized every day at 10:00 pm. This is what the code will look like now: (http://parasdoshi1989.files.wordpress.com/2011/06/code-snippet-2-for-extending-sql-azure-with-azure-worker-role.pdf) Don’t you think that by writing such a code, we are imitating the functionality provided by the SQL server agent for a SQL server? Think about it. We are scheduling our administrative task by writing custom code – in other words, we have developed a “Light weight SQL server agent for SQL Azure!” Since the SQL server agent is not currently available in cloud, we have developed a solution that enables us to schedule tasks, and thus we have extended SQL Azure with the Azure worker role! Now if you wish to track jobs, you can do so by storing this data in SQL Azure (or Azure tables). The reason is that Windows Azure is a stateless platform, and we will need to store the state of the job ourselves and the choice that you have is SQL Azure or Azure tables. Note that this solution requires custom code and also it is not UI driven; however, for now, it can act as a temporary solution until SQL server agent is made available in the cloud. Moreover, this solution does not encompass functionalities that a SQL server agent provides, but it does open up an interesting avenue to schedule some of the tasks such as backup and synchronization of SQL Azure databases by writing some custom code in the Azure worker role. Now, let us see one more possibility – i.e., running BCP through a worker role in Azure-hosted services and then uploading the backup files either locally or on blobs. If you upload it locally, then consider the data transfer cost. If you upload it to blobs residing in the same datacenter, then no transfer cost applies but the cost on blob size applies. So, before choosing the option, you need to evaluate your preferences keeping the cost associated with each option in mind. In this article, I have shown that Azure worker role solution could be developed to synchronize SQL Azure databases. Moreover, a light-weight SQL server agent for SQL Azure can be developed. Also we discussed the possibility of running BCP through a worker role in Azure-hosted services for backing up our precious SQL Azure data. Thus, we can extend SQL Azure with the Azure worker role. But remember: you will be charged for running Azure worker roles. So at the end of the day, you need to ask – am I willing to build a custom code and pay money to achieve this functionality? I hope you found this blog post interesting. If you have any questions/feedback, you can comment below or you can mail me at Paras[at]student-partners[dot]com Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Azure, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • SQL SERVER – Guest Post – Architecting Data Warehouse – Niraj Bhatt

    - by pinaldave
    Niraj Bhatt works as an Enterprise Architect for a Fortune 500 company and has an innate passion for building / studying software systems. He is a top rated speaker at various technical forums including Tech·Ed, MCT Summit, Developer Summit, and Virtual Tech Days, among others. Having run a successful startup for four years Niraj enjoys working on – IT innovations that can impact an enterprise bottom line, streamlining IT budgets through IT consolidation, architecture and integration of systems, performance tuning, and review of enterprise applications. He has received Microsoft MVP award for ASP.NET, Connected Systems and most recently on Windows Azure. When he is away from his laptop, you will find him taking deep dives in automobiles, pottery, rafting, photography, cooking and financial statements though not necessarily in that order. He is also a manager/speaker at BDOTNET, Asia’s largest .NET user group. Here is the guest post by Niraj Bhatt. As data in your applications grows it’s the database that usually becomes a bottleneck. It’s hard to scale a relational DB and the preferred approach for large scale applications is to create separate databases for writes and reads. These databases are referred as transactional database and reporting database. Though there are tools / techniques which can allow you to create snapshot of your transactional database for reporting purpose, sometimes they don’t quite fit the reporting requirements of an enterprise. These requirements typically are data analytics, effective schema (for an Information worker to self-service herself), historical data, better performance (flat data, no joins) etc. This is where a need for data warehouse or an OLAP system arises. A Key point to remember is a data warehouse is mostly a relational database. It’s built on top of same concepts like Tables, Rows, Columns, Primary keys, Foreign Keys, etc. Before we talk about how data warehouses are typically structured let’s understand key components that can create a data flow between OLTP systems and OLAP systems. There are 3 major areas to it: a) OLTP system should be capable of tracking its changes as all these changes should go back to data warehouse for historical recording. For e.g. if an OLTP transaction moves a customer from silver to gold category, OLTP system needs to ensure that this change is tracked and send to data warehouse for reporting purpose. A report in context could be how many customers divided by geographies moved from sliver to gold category. In data warehouse terminology this process is called Change Data Capture. There are quite a few systems that leverage database triggers to move these changes to corresponding tracking tables. There are also out of box features provided by some databases e.g. SQL Server 2008 offers Change Data Capture and Change Tracking for addressing such requirements. b) After we make the OLTP system capable of tracking its changes we need to provision a batch process that can run periodically and takes these changes from OLTP system and dump them into data warehouse. There are many tools out there that can help you fill this gap – SQL Server Integration Services happens to be one of them. c) So we have an OLTP system that knows how to track its changes, we have jobs that run periodically to move these changes to warehouse. The question though remains is how warehouse will record these changes? This structural change in data warehouse arena is often covered under something called Slowly Changing Dimension (SCD). While we will talk about dimensions in a while, SCD can be applied to pure relational tables too. SCD enables a database structure to capture historical data. This would create multiple records for a given entity in relational database and data warehouses prefer having their own primary key, often known as surrogate key. As I mentioned a data warehouse is just a relational database but industry often attributes a specific schema style to data warehouses. These styles are Star Schema or Snowflake Schema. The motivation behind these styles is to create a flat database structure (as opposed to normalized one), which is easy to understand / use, easy to query and easy to slice / dice. Star schema is a database structure made up of dimensions and facts. Facts are generally the numbers (sales, quantity, etc.) that you want to slice and dice. Fact tables have these numbers and have references (foreign keys) to set of tables that provide context around those facts. E.g. if you have recorded 10,000 USD as sales that number would go in a sales fact table and could have foreign keys attached to it that refers to the sales agent responsible for sale and to time table which contains the dates between which that sale was made. These agent and time tables are called dimensions which provide context to the numbers stored in fact tables. This schema structure of fact being at center surrounded by dimensions is called Star schema. A similar structure with difference of dimension tables being normalized is called a Snowflake schema. This relational structure of facts and dimensions serves as an input for another analysis structure called Cube. Though physically Cube is a special structure supported by commercial databases like SQL Server Analysis Services, logically it’s a multidimensional structure where dimensions define the sides of cube and facts define the content. Facts are often called as Measures inside a cube. Dimensions often tend to form a hierarchy. E.g. Product may be broken into categories and categories in turn to individual items. Category and Items are often referred as Levels and their constituents as Members with their overall structure called as Hierarchy. Measures are rolled up as per dimensional hierarchy. These rolled up measures are called Aggregates. Now this may seem like an overwhelming vocabulary to deal with but don’t worry it will sink in as you start working with Cubes and others. Let’s see few other terms that we would run into while talking about data warehouses. ODS or an Operational Data Store is a frequently misused term. There would be few users in your organization that want to report on most current data and can’t afford to miss a single transaction for their report. Then there is another set of users that typically don’t care how current the data is. Mostly senior level executives who are interesting in trending, mining, forecasting, strategizing, etc. don’t care for that one specific transaction. This is where an ODS can come in handy. ODS can use the same star schema and the OLAP cubes we saw earlier. The only difference is that the data inside an ODS would be short lived, i.e. for few months and ODS would sync with OLTP system every few minutes. Data warehouse can periodically sync with ODS either daily or weekly depending on business drivers. Data marts are another frequently talked about topic in data warehousing. They are subject-specific data warehouse. Data warehouses that try to span over an enterprise are normally too big to scope, build, manage, track, etc. Hence they are often scaled down to something called Data mart that supports a specific segment of business like sales, marketing, or support. Data marts too, are often designed using star schema model discussed earlier. Industry is divided when it comes to use of data marts. Some experts prefer having data marts along with a central data warehouse. Data warehouse here acts as information staging and distribution hub with spokes being data marts connected via data feeds serving summarized data. Others eliminate the need for a centralized data warehouse citing that most users want to report on detailed data. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Best Practices, Business Intelligence, Data Warehousing, Database, Pinal Dave, PostADay, Readers Contribution, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • SQL SERVER – A Quick Look at Logging and Ideas around Logging

    - by pinaldave
    This blog post is written in response to the T-SQL Tuesday post on Logging. When someone talks about logging, personally I get lots of ideas about it. I have seen logging as a very generic term. Let me ask you this question first before I continue writing about logging. What is the first thing comes to your mind when you hear word “Logging”? Now ask the same question to the guy standing next to you. I am pretty confident that you will get  a different answer from different people. I decided to do this activity and asked 5 SQL Server person the same question. Question: What is the first thing comes to your mind when you hear the word “Logging”? Strange enough I got a different answer every single time. Let me just list what answer I got from my friends. Let us go over them one by one. Output Clause The very first person replied output clause. Pretty interesting answer to start with. I see what exactly he was thinking. SQL Server 2005 has introduced a new OUTPUT clause. OUTPUT clause has access to inserted and deleted tables (virtual tables) just like triggers. OUTPUT clause can be used to return values to client clause. OUTPUT clause can be used with INSERT, UPDATE, or DELETE to identify the actual rows affected by these statements. Here are some references for Output Clause: OUTPUT Clause Example and Explanation with INSERT, UPDATE, DELETE Reasons for Using Output Clause – Quiz Tips from the SQL Joes 2 Pros Development Series – Output Clause in Simple Examples Error Logs I was expecting someone to mention Error logs when it is about logging. The error log is the most looked place when there is any error either with the application or there is an error with the operating system. I have kept the policy to check my server’s error log every day. The reason is simple – enough time in my career I have figured out that when I am looking at error logs I find something which I was not expecting. There are cases, when I noticed errors in the error log and I fixed them before end user notices it. Other common practices I always tell my DBA friends to do is that when any error happens they should find relevant entries in the error logs and document the same. It is quite possible that they will see the same error in the error log  and able to fix the error based on the knowledge base which they have created. There can be many different kinds of error log files exists in SQL Server as well – 1) SQL Server Error Logs 2) Windows Event Log 3) SQL Server Agent Log 4) SQL Server Profile Log 5) SQL Server Setup Log etc. Here are some references for Error Logs: Recycle Error Log – Create New Log file without Server Restart SQL Error Messages Change Data Capture I got surprised with this answer. I think more than the answer I was surprised by the person who had answered me this one. I always thought he was expert in HTML, JavaScript but I guess, one should never assume about others. Indeed one of the cool logging feature is Change Data Capture. Change Data Capture records INSERTs, UPDATEs, and DELETEs applied to SQL Server tables, and makes a record available of what changed, where, and when, in simple relational ‘change tables’ rather than in an esoteric chopped salad of XML. These change tables contain columns that reflect the column structure of the source table you have chosen to track, along with the metadata needed to understand the changes that have been made. Here are some references for Change Data Capture: Introduction to Change Data Capture (CDC) in SQL Server 2008 Tuning the Performance of Change Data Capture in SQL Server 2008 Download Script of Change Data Capture (CDC) CDC and TRUNCATE – Cannot truncate table because it is published for replication or enabled for Change Data Capture Dynamic Management View (DMV) I like this answer. If asked I would have not come up with DMV right away but in the spirit of the original question, I think DMV does log the data. DMV logs or stores or records the various data and activity on the SQL Server. Dynamic management views return server state information that can be used to monitor the health of a server instance, diagnose problems, and tune performance. One can get plethero of information from DMVs – High Availability Status, Query Executions Details, SQL Server Resources Status etc. Here are some references for Dynamic Management View (DMV): SQL SERVER – Denali – DMV Enhancement – sys.dm_exec_query_stats – New Columns DMV – sys.dm_os_windows_info – Information about Operating System DMV – sys.dm_os_wait_stats Explanation – Wait Type – Day 3 of 28 DMV sys.dm_exec_describe_first_result_set_for_object – Describes the First Result Metadata for the Module Transaction Log Impact Detection Using DMV – dm_tran_database_transactions Log Files I almost flipped with this final answer from my friend. This should be probably the first answer. Yes, indeed log file logs the SQL Server activities. One can write infinite things about log file. SQL Server uses log file with the extension .ldf to manage transactions and maintain database integrity. Log file ensures that valid data is written out to database and system is in a consistent state. Log files are extremely useful in case of the database failures as with the help of full backup file database can be brought in the desired state (point in time recovery is also possible). SQL Server database has three recovery models – 1) Simple, 2) Full and 3) Bulk Logged. Each of the model uses the .ldf file for performing various activities. It is very important to take the backup of the log files (along with full backup) as one never knows when backup of the log file come into the action and save the day! How to Stop Growing Log File Too Big Reduce the Virtual Log Files (VLFs) from LDF file Log File Growing for Model Database – model Database Log File Grew Too Big master Database Log File Grew Too Big SHRINKFILE and TRUNCATE Log File in SQL Server 2008 Can I just say I loved this month’s T-SQL Tuesday Question. It really provoked very interesting conversation around me. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Optimization, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

< Previous Page | 8 9 10 11 12 13 14 15 16 17 18 19  | Next Page >