Search Results

Search found 848 results on 34 pages for 'sqlauthority'.

Page 29/34 | < Previous Page | 25 26 27 28 29 30 31 32 33 34 | Next Page >

SQL SERVER – A Puzzle – Swap Value of Column Without Case Statement

- by pinaldave

For the last few weeks, I have been doing Friday Puzzles and I am really loving it. Yesterday I received a very interesting question by Navneet Chaurasia on Facebook Page. He was asked this question in one of the interview questions for job. Please read the original thread for a complete idea of the conversation. I am presenting the same question here. Puzzle Let us assume there is a single column in the table called Gender. The challenge is to write a single update statement which will flip or swap the value in the column. For example if the value in the gender column is ‘male’ swap it with ‘female’ and if the value is ‘female’ swap it with ‘male’. Here is the quick setup script for the puzzle. USE tempdb GO CREATE TABLE SimpleTable (ID INT, Gender VARCHAR(10)) GO INSERT INTO SimpleTable (ID, Gender) SELECT 1, 'female' UNION ALL SELECT 2, 'male' UNION ALL SELECT 3, 'male' GO SELECT * FROM SimpleTable GO The above query will return following result set. The puzzle was to write a single update column which will generate following result set. There are multiple answers to this simple puzzle. Let me show you three different ways. I am assuming that the column will have either value ‘male’ or ‘female’ only. Method 1: Using CASE Statement I believe this is going to be the most popular solution as we are all familiar with CASE Statement. UPDATE SimpleTable SET Gender = CASE Gender WHEN 'male' THEN 'female' ELSE 'male' END GO SELECT * FROM SimpleTable GO Method 2: Using REPLACE Function I totally understand it is the not cleanest solution but it will for sure work in giving situation. UPDATE SimpleTable SET Gender = REPLACE(('fe'+Gender),'fefe','') GO SELECT * FROM SimpleTable GO Method 3: Using IIF in SQL Server 2012 If you are using SQL Server 2012 you can use IIF and get the same effect as CASE statement. UPDATE SimpleTable SET Gender = IIF(Gender = 'male', 'female', 'male') GO SELECT * FROM SimpleTable GO You can read my article series on SQL Server 2012 various functions over here. SQL SERVER – Denali – Logical Function – IIF() – A Quick Introduction SQL SERVER – Detecting Leap Year in T-SQL using SQL Server 2012 – IIF, EOMONTH and CONCAT Function Let us clean up. DROP TABLE SimpleTable GO Question to you: I came up with three simple tricks where there is a single UPDATE statement which swaps the values in the column. Do you know any other simple trick? If yes, please post here in the comments. I will pick two random winners from all the valid answers. Winners will get 1) Print Copy of SQL Server Interview Questions and Answers 2) Free Learning Code for Online Video Courses I will announce the winners on coming Monday. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: CodeProject, PostADay, SQL, SQL Authority, SQL Interview Questions and Answers, SQL Puzzle, SQL Query, SQL Server, SQL Tips and Tricks, SQLServer, T SQL, Technology

Read the article
SQL SERVER – Another lesser known feature of SQL Server Management Studio 2012 – Guest Post by Balmukund Lakhani

- by Pinal Dave

This is a fantastic blog post from my dear friend Balmukund ( blog | twitter | facebook ). He had presented a fantastic session in our last UG and there were lots of requests from attendees that he blogs about it. Well, here is the blog post about the same very popular UG session. Let us read the entire blog post in the voice of the Balmukund himself. In one of my previous guest blog on SQL Authority, I wrote about “Additional Connection Parameter” tab of login screen in SQL Server Management Studio (a.k.a. SSMS). On the similar lines, this blog is going to show little less known new feature of login main screen (“Connect to Server”) of SSMS 2012. You might have seen below screen countless times and you might wonder what is there is blog about in this simple screen. Well, continue reading and you would get the answer. Many times, DBA have to login to production server from non-regular machine, may be a developer’s workstation. Once you login to SQL, do your work and close the management studio. Do you know that your server name is saved in management studio? Of course, very useful feature because you may not like to type server name/IP address every time. Whatever servers you have connected, it would be stored by management studio. But sometime, it’s annoying! What you would do if you want SQL Server Management Studio to forget “all” the servers listed in drop down of Server name? To do that, you need to know how and where it’s stored. You can use one of my favorite tool from sysinternals called Process Monitor (also known as ProcMon) and easily figure out that this is stored in a file under your windows user profile. Below is the file in SQL 2008 R2 Management Studio. %appdata%\Microsoft\Microsoft SQL Server\100\Tools\Shell\SqlStudio.bin For SQL Server 2012, here is what we can see in ProcMon So, the path is %appdata%\Microsoft\Microsoft SQL Server\110\Tools\Shell\SqlStudio.bin So far, you might wonder, where is the new feature? I have been asked by many users to delete entries from SSMS “Connect to Server” server name list. Well, unofficially, you can delete the file directly which we found via ProcMon. Note that delete file to get rid of server list is not officially supported by Microsoft. Better way to achieve this is provided in SSMS 2012. To delete the servers from the list, highlight the name we want to delete (via keyboard or mouse) and then press delete key via keyboard. We can’t be multi-select and has to be done one by one. We can delete as many entries we want. I have delete few from first screenshot taken and here is the modified version. This is not available in SQL 2008 R2 and its previous version. This came from feedback given to SQL Server Product group. Hope you have learned something new today! Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Server Management Studio, SQL Tips and Tricks, T SQL, Technology

Read the article
SQL SERVER – Identity Fields – Contest Win Joes 2 Pros Combo (USD 198) – Day 2 of 5

- by pinaldave

August 2011 we ran a contest where every day we give away one book for an entire month. The contest had extreme success. Lots of people participated and lots of give away. I have received lots of questions if we are doing something similar this month. Absolutely, instead of running a contest a month long we are doing something more interesting. We are giving away USD 198 worth gift every day for this week. We are giving away Joes 2 Pros 5 Volumes (BOOK) SQL 2008 Development Certification Training Kit every day. One copy in India and One in USA. Total 2 of the giveaway (worth USD 198). All the gifts are sponsored from the Koenig Training Solution and Joes 2 Pros. The books are available here Amazon | Flipkart | Indiaplaza How to Win: Read the Question Read the Hints Answer the Quiz in Contact Form in following format Question Answer Name of the country (The contest is open for USA and India residents only) 2 Winners will be randomly selected announced on August 20th. Question of the Day: Which of the following statement is incorrect? a) Identity value can be negative. b) Identity value can have negative interval. c) Identity value can be of datatype VARCHAR d) Identity value can have increment interval larger than 1 Query Hints: BIG HINT POST A simple way to determine if a table contains an identity field is to use the SSMS Object Explorer Design Interface. Navigate to the table, then right-click it and choose Design from the pop-up window. When your design tab opens, select the first field in the table to view its list of properties in the lower pane of the tab (In this case the field is ProductID). Look to see if the Identity Specification property in the lower pane is set to either yes or no. SQL Server will allow you to utilize IDENTITY_INSERT with just one table at a time. After you’ve completed the needed work, it’s very important to reset the IDENTITY_INSERT back to OFF. Additional Hints: I have previously discussed various concepts from SQL Server Joes 2 Pros Volume 2. SQL Joes 2 Pros Development Series – Output Clause in Simple Examples SQL Joes 2 Pros Development Series – Ranking Functions – Advanced NTILE in Detail SQL Joes 2 Pros Development Series – Ranking Functions – RANK( ), DENSE_RANK( ), and ROW_NUMBER( ) SQL Joes 2 Pros Development Series – Advanced Aggregates with the Over Clause SQL Joes 2 Pros Development Series – Aggregates with the Over Clause SQL Joes 2 Pros Development Series – Overriding Identity Fields – Tricks and Tips of Identity Fields SQL Joes 2 Pros Development Series – Many to Many Relationships Next Step: Answer the Quiz in Contact Form in following format Question Answer Name of the country (The contest is open for USA and India) Bonus Winner Leave a comment with your favorite article from the “additional hints” section and you may be eligible for surprise gift. There is no country restriction for this Bonus Contest. Do mention why you liked it any particular blog post and I will announce the winner of the same along with the main contest. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Joes 2 Pros, PostADay, SQL, SQL Authority, SQL Puzzle, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
MySQL – Video Course – MySQL Backup and Recovery Fundamentals

- by Pinal Dave

Data is the one of the most crucial things for any organization and keeping data safe is the biggest challenge for any DBA. This is true for any organizations. Think about the scenario that you have a database which is extremely important and suddenly you accidently delete the most important table from that database. I am sure this is a very difficult time. In times like this people often get stressed or just make even second mistake. In my career of 10 years I have done often this mistake and often got stressed out due to un-availability of the database backup. In the SQL Server field, we have plenty of the help on this subject, but in MySQL domain there is not enough help. For the same reason I have build this MySQL course on Backup and Recovery. Course Outline Data is very important to any application and business. It is very important that every business plan for data safety. Database backup strategies are often discussed after the disaster has already happened. In this introductory course we will explore a few of the basic backup strategies every business should implement for data safely. We will explore how we can recover our server quickly after any unfriendly incident to our MySQL database. Click to View Course Here are various important aspects which we have discussed in this course. How to take backup of single database? How to take backup of multiple database? How to backup various database objects? How to restore a single database? How to restore multiple databases? How to use MySQL Workbench for Backup and Restore? How to restore Point in Time for any database? What is the best time to backup? How to copy database from one server to another server? All of the above concepts and many more subjects are covered in the MySQL Backup and Recovery Fundamentals course. It is available on Pluralsight. Scenarios As learning about Backup and Recovery can be very much boring, I decided to create two fictitious characters and demonstrate the entire course based on their conversation. The story is about Mike and Rahul. Mike is Sr. Database administrator in USA and Rahul is an intern in India. Rahul aspires to become a senior database administrator and this is a story about his challenges and how he overcomes those challenges. I had a great time to build this course and I have got very good feedback on this course. I encourage all of you to attempt to learn MySQL Backup and Recovery Fundamental course with this innovative effort. It will be very valuable to know your feedback. You will need a valid Pluralsight subscription to watch this course. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: MySQL, PostADay, SQL, SQL Authority, SQL Backup and Restore, SQL Query, SQL Tips and Tricks, T SQL

Read the article
SQL SERVER – Expanding Views – Contest Win Joes 2 Pros Combo (USD 198) – Day 4 of 5

- by pinaldave

August 2011 we ran a contest where every day we give away one book for an entire month. The contest had extreme success. Lots of people participated and lots of give away. I have received lots of questions if we are doing something similar this month. Absolutely, instead of running a contest a month long we are doing something more interesting. We are giving away USD 198 worth gift every day for this week. We are giving away Joes 2 Pros 5 Volumes (BOOK) SQL 2008 Development Certification Training Kit every day. One copy in India and One in USA. Total 2 of the giveaway (worth USD 198). All the gifts are sponsored from the Koenig Training Solution and Joes 2 Pros. The books are available here Amazon | Flipkart | Indiaplaza How to Win: Read the Question Read the Hints Answer the Quiz in Contact Form in following format Question Answer Name of the country (The contest is open for USA and India residents only) 2 Winners will be randomly selected announced on August 20th. Question of the Day: Which of the following key word will force the query to use indexes created on views? a) ENCRYPTION b) SCHEMABINDING c) NOEXPAND d) CHECK OPTION Query Hints: BIG HINT POST Usually, the assumption is that Index on the table will use Index on the table and Index on view will be used by view. However, that is the misconception. It does not happen this way. In fact, if you notice the image, you will find the both of them (table and view) use both the index created on the table. The index created on the view is not used. The reason for the same as listed in BOL. The cost of using the indexed view may exceed the cost of getting the data from the base tables, or the query is so simple that a query against the base tables is fast and easy to find. This often happens when the indexed view is defined on small tables. You can use the NOEXPAND hint if you want to force the query processor to use the indexed view. This may require you to rewrite your query if you don’t initially reference the view explicitly. You can get the actual cost of the query with NOEXPAND and compare it to the actual cost of the query plan that doesn’t reference the view. If they are close, this may give you the confidence that the decision of whether or not to use the indexed view doesn’t matter. Additional Hints: I have previously discussed various concepts from SQL Server Joes 2 Pros Volume 4. SQL Joes 2 Pros Development Series – Structured Error Handling SQL Joes 2 Pros Development Series – SQL Server Error Messages SQL Joes 2 Pros Development Series – Table-Valued Functions SQL Joes 2 Pros Development Series – Table-Valued Store Procedure Parameters SQL Joes 2 Pros Development Series – Easy Introduction to CHECK Options SQL Joes 2 Pros Development Series – Introduction to Views SQL Joes 2 Pros Development Series – All about SQL Constraints Next Step: Answer the Quiz in Contact Form in following format Question Answer Name of the country (The contest is open for USA and India) Bonus Winner Leave a comment with your favorite article from the “additional hints” section and you may be eligible for surprise gift. There is no country restriction for this Bonus Contest. Do mention why you liked it any particular blog post and I will announce the winner of the same along with the main contest. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Joes 2 Pros, PostADay, SQL, SQL Authority, SQL Puzzle, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
SQL SERVER – IO_COMPLETION – Wait Type – Day 10 of 28

- by pinaldave

For any good system three things are vital: CPU, Memory and IO (disk). Among these three, IO is the most crucial factor of SQL Server. Looking at real-world cases, I do not see IT people upgrading CPU and Memory frequently. However, the disk is often upgraded for either improving the space, speed or throughput. Today we will look at an IO-related wait types. From Book On-Line: Occurs while waiting for I/O operations to complete. This wait type generally represents non-data page I/Os. Data page I/O completion waits appear as PAGEIOLATCH_* waits. IO_COMPLETION Explanation: Any tasks are waiting for I/O to finish. This is a good indication that IO needs to be looked over here. Reducing IO_COMPLETION wait: When it is an issue concerning the IO, one should look at the following things related to IO subsystem: Proper placing of the files is very important. We should check the file system for proper placement of files – LDF and MDF on a separate drive, TempDB on another separate drive, hot spot tables on separate filegroup (and on separate disk),etc. Check the File Statistics and see if there is higher IO Read and IO Write Stall SQL SERVER – Get File Statistics Using fn_virtualfilestats. Check event log and error log for any errors or warnings related to IO. If you are using SAN (Storage Area Network), check the throughput of the SAN system as well as the configuration of the HBA Queue Depth. In one of my recent projects, the SAN was performing really badly so the SAN administrator did not accept it. After some investigations, he agreed to change the HBA Queue Depth on development (test environment) set up and as soon as we changed the HBA Queue Depth to quite a higher value, there was a sudden big improvement in the performance. It is very possible that there are no proper indexes in the system and there are lots of table scans and heap scans. Creating proper index can reduce the IO bandwidth considerably. If SQL Server can use appropriate cover index instead of clustered index, it can effectively reduce lots of CPU, Memory and IO (considering cover index has lesser columns than cluster table and all other; it depends upon the situation). You can refer to the two articles that I wrote; they are about how to optimize indexes: Create Missing Indexes Drop Unused Indexes Checking Memory Related Perfmon Counters SQLServer: Memory Manager\Memory Grants Pending (Consistent higher value than 0-2) SQLServer: Memory Manager\Memory Grants Outstanding (Consistent higher value, Benchmark) SQLServer: Buffer Manager\Buffer Hit Cache Ratio (Higher is better, greater than 90% for usually smooth running system) SQLServer: Buffer Manager\Page Life Expectancy (Consistent lower value than 300 seconds) Memory: Available Mbytes (Information only) Memory: Page Faults/sec (Benchmark only) Memory: Pages/sec (Benchmark only) Checking Disk Related Perfmon Counters Average Disk sec/Read (Consistent higher value than 4-8 millisecond is not good) Average Disk sec/Write (Consistent higher value than 4-8 millisecond is not good) Average Disk Read/Write Queue Length (Consistent higher value than benchmark is not good) Note: The information presented here is from my experience and there is no way that I claim it to be accurate. I suggest reading Book OnLine for further clarification. All the discussions of Wait Stats in this blog are generic and vary from system to system. It is recommended that you test this on a development server before implementing it to a production server. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQL Wait Types, SQL White Papers, T SQL, Technology

Read the article
SQL SERVER – Quiz and Video – Introduction to Hierarchical Query using a Recursive CTE

- by pinaldave

This blog post is inspired from SQL Queries Joes 2 Pros: SQL Query Techniques For Microsoft SQL Server 2008 – SQL Exam Prep Series 70-433 – Volume 2.[Amazon] | [Flipkart] | [Kindle] | [IndiaPlaza] This is follow up blog post of my earlier blog post on the same subject - SQL SERVER – Introduction to Hierarchical Query using a Recursive CTE – A Primer. In the article we discussed various basics terminology of the CTE. The article further covers following important concepts of common table expression. What is a Common Table Expression (CTE) Building a Recursive CTE Identify the Anchor and Recursive Query Add the Anchor and Recursive query to a CTE Add an expression to track hierarchical level Add a self-referencing INNER JOIN statement Above six are the most important concepts related to CTE and SQL Server. There are many more things one has to learn but without beginners fundamentals one can’t learn the advanced concepts. Let us have small quiz and check how many of you get the fundamentals right. Quiz 1) You have an employee table with the following data. EmpID FirstName LastName MgrID 1 David Kennson 11 2 Eric Bender 11 3 Lisa Kendall 4 4 David Lonning 11 5 John Marshbank 4 6 James Newton 3 7 Sally Smith NULL You need to write a recursive CTE that shows the EmpID, FirstName, LastName, MgrID, and employee level. The CEO should be listed at Level 1. All people who work for the CEO will be listed at Level 2. All of the people who work for those people will be listed at Level 3. Which CTE code will achieve this result? WITH EmpList AS (SELECT Boss.EmpID, Boss.FName, Boss.LName, Boss.MgrID, 1 AS Lvl FROM Employee AS Boss WHERE Boss.MgrID IS NULL UNION ALL SELECT E.EmpID, E.FirstName, E.LastName, E.MgrID, EmpList.Lvl + 1 FROM Employee AS E INNER JOIN EmpList ON E.MgrID = EmpList.EmpID) SELECT * FROM EmpList WITH EmpListAS (SELECT EmpID, FirstName, LastName, MgrID, 1 as Lvl FROM Employee WHERE MgrID IS NULL UNION ALL SELECT EmpID, FirstName, LastName, MgrID, 2 as Lvl ) SELECT * FROM BossList WITH EmpList AS (SELECT EmpID, FirstName, LastName, MgrID, 1 as Lvl FROM Employee WHERE MgrID is NOT NULL UNION SELECT EmpID, FirstName, LastName, MgrID, BossList.Lvl + 1 FROM Employee INNER JOIN EmpList BossList ON Employee.MgrID = BossList.EmpID) SELECT * FROM EmpList 2) You have a table named Employee. The EmployeeID of each employee’s manager is in the ManagerID column. You need to write a recursive query that produces a list of employees and their manager. The query must also include the employee’s level in the hierarchy. You write the following code segment: WITH EmployeeList (EmployeeID, FullName, ManagerName, Level) AS ( –PICK ANSWER CODE HERE ) SELECT EmployeeID, FullName, ” AS [ManagerID], 1 AS [Level] FROM Employee WHERE ManagerID IS NULL UNION ALL SELECT emp.EmployeeID, emp.FullName mgr.FullName, 1 + 1 AS [Level] FROM Employee emp JOIN Employee mgr ON emp.ManagerID = mgr.EmployeeId SELECT EmployeeID, FullName, ” AS [ManagerID], 1 AS [Level] FROM Employee WHERE ManagerID IS NULL UNION ALL SELECT emp.EmployeeID, emp.FullName, mgr.FullName, mgr.Level + 1 FROM EmployeeList mgr JOIN Employee emp ON emp.ManagerID = mgr.EmployeeId Now make sure that you write down all the answers on the piece of paper. Watch following video and read earlier article over here. If you want to change the answer you still have chance. Solution 1) 1 2) 2 Now compare let us check the answers and compare your answers to following answers. I am very confident you will get them correct. Available at USA: Amazon India: Flipkart | IndiaPlaza Volume: 1, 2, 3, 4, 5 Please leave your feedback in the comment area for the quiz and video. Did you know all the answers of the quiz? Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Joes 2 Pros, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
SQL SERVER – Speed Up! – Parallel Processes and Unparalleled Performance – TechEd 2012 India

- by pinaldave

TechEd India 2012 is just around the corner and I will be presenting there on two different session. SQL Server Performance Tuning is a very challenging subject that requires expertise in Database Administration and Database Development. I always have enjoyed talking about SQL Server Performance tuning subject. Just like doctors I like to call my every attempt to improve the performance of SQL Server queries and database server as a practice too. I have been working with SQL Server for more than 8 years and I believe that many of the performance tuning concept I have mastered. However, performance tuning is not a simple subject. However there are occasions when I feel stumped, there are occasional when I am not sure what should be the next step. When I face situation where I cannot figure things out easily, it makes me most happy because I clearly see this as a learning opportunity. I have been presenting in TechEd India for last three years. This is my fourth time opportunity to present a technical session on SQL Server. Just like every other year, I decided to present something different, something which I have spend years of learning. This time, I am going to present about parallel processes. It is widely believed that more the CPU will improve performance of the server. It is true in many cases. However, there are cases when limiting the CPU usages have improved overall health of the server. I will be presenting on the subject of Parallel Processes and its effects. I have spent more than a year working on this subject only. After working on various queries on multi-CPU systems I have personally learned few things. In coming TechEd session, I am going to share my experience with parallel processes and performance tuning. Session Details Title: Speed Up! – Parallel Processes and Unparalleled Performance (Add to Calendar) Abstract: “More CPU More Performance” – A very common understanding is that usage of multiple CPUs can improve the performance of the query. To get maximum performance out of any query – one has to master various aspects of the parallel processes. In this deep dive session, we will explore this complex subject with a very simple interactive demo. An attendee will walk away with proper understanding of CX_PACKET wait types, MAXDOP, parallelism threshold and various other concepts. Date and Time: March 23, 2012, 12:15 to 13:15 Location: Hotel Lalit Ashok - Kumara Krupa High Grounds, Bengaluru – 560001, Karnataka, India. Add to Calendar Please submit your questions in the comments area and I will be for sure discussing them during my session. If I pick your question to discuss during my session, here is your gift I commit right now – SQL Server Interview Questions and Answers Book. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, SQL Wait Stats, SQL Wait Types, T SQL, Technology Tagged: TechEd, TechEdIn

Read the article
SQL SERVER – Why Do We Need Data Quality Services – Importance and Significance of Data Quality Services (DQS)

- by pinaldave

Databases are awesome. I’m sure my readers know my opinion about this – I have made SQL Server my life’s work after all! I love technology and all things computer-related. Of course, even with my love for technology, I have to admit that it has its limits. For example, it takes a human brain to notice that data has been input incorrectly. Computer “brains” might be faster than humans, but human brains are still better at pattern recognition. For example, a human brain will notice that “300” is a ridiculous age for a human to be, but to a computer it is just a number. A human will also notice similarities between “P. Dave” and “Pinal Dave,” but this would stump most computers. In a database, these sorts of anomalies are incredibly important. Databases are often used by multiple people who rely on this data to be true and accurate, so data quality is key. That is why the improved SQL Server features Master Data Management talks about Data Quality Services. This service has the ability to recognize and flag anomalies like out of range numbers and similarities between data. This allows a human brain with its pattern recognition abilities to double-check and ensure that P. Dave is the same as Pinal Dave. A nice feature of Data Quality Services is that once you set the rules for the program to follow, it will not only keep your data organized in the future, but go to the past and “fix up” any data that has already been entered. It also allows you do combine data from multiple places and it will apply these rules across the board, so that you don’t have any weird issues that crop up when trying to fit a round peg into a square hole. There are two parts of Data Quality Services that help you accomplish all these neat things. The first part is DQL Server, which you can think of as the hardware component of the system. It is installed on the side of (it needs to install separately after SQL Server is installed) SQL Server and runs quietly in the background, performing all its cleanup services. DQS Client is the user interface that you can interact with to set the rules and check over your data. There are three main aspects of Client: knowledge base management, data quality projects and administration. Knowledge base management is the part of the system that allows you to set the rules, or program the “knowledge base,” so that your database is clean and consistent. Data Quality projects are what run in the background and clean up the data that is already present. The administration allows you to check out what DQS Client is doing, change rules, and generally oversee the entire process. The whole process is user-friendly and a pleasure to use. I highly recommend implementing Data Quality Services in your database. Here are few of my blog posts which are related to Data Quality Services and I encourage you to try this out. SQL SERVER – Installing Data Quality Services (DQS) on SQL Server 2012 SQL SERVER – Step by Step Guide to Beginning Data Quality Services in SQL Server 2012 – Introduction to DQS SQL SERVER – DQS Error – Cannot connect to server – A .NET Framework error occurred during execution of user-defined routine or aggregate “SetDataQualitySessions” – SetDataQualitySessionPhaseTwo SQL SERVER – Configuring Interactive Cleansing Suggestion Min Score for Suggestions in Data Quality Services (DQS) – Sensitivity of Suggestion SQL SERVER – Unable to DELETE Project in Data Quality Projects (DQS) Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: Data Quality Services, DQS

Read the article
SQL SERVER – Simple Demo of New Cardinality Estimation Features of SQL Server 2014

- by Pinal Dave

SQL Server 2014 has new cardinality estimation logic/algorithm. The cardinality estimation logic is responsible for quality of query plans and majorly responsible for improving performance for any query. This logic was not updated for quite a while, but in the latest version of SQL Server 2104 this logic is re-designed. The new logic now incorporates various assumptions and algorithms of OLTP and warehousing workload. Cardinality estimates are a prediction of the number of rows in the query result. The query optimizer uses these estimates to choose a plan for executing the query. The quality of the query plan has a direct impact on improving query performance. ~ Souce MSDN Let us see a quick example of how cardinality improves performance for a query. I will be using the AdventureWorks database for my example. Before we start with this demonstration, remember that even though you have SQL Server 2014 to see the effect of new cardinality estimates, you will need your database compatibility mode set to 120 which is for SQL Server 2014. If your server instance of SQL Server 2014 but you have set up your database compatibility mode to 110 or any other earlier version, you will get performance from your query like older version of SQL Server. Now we will execute following query in two different compatibility mode and see its performance. (Note that my SQL Server instance is of version 2014). USE AdventureWorks2014 GO -- ------------------------------- -- NEW Cardinality Estimation ALTER DATABASE AdventureWorks2014 SET COMPATIBILITY_LEVEL = 120 GO EXEC [dbo].[uspGetManagerEmployees] 44 GO -- ------------------------------- -- Old Cardinality Estimation ALTER DATABASE AdventureWorks2014 SET COMPATIBILITY_LEVEL = 110 GO EXEC [dbo].[uspGetManagerEmployees] 44 GO Result of Statistics IO Compatibility level 120 Table ‘Person’. Scan count 0, logical reads 6, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table ‘Employee’. Scan count 2, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table ‘Worktable’. Scan count 2, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Compatibility level 110 Table ‘Worktable’. Scan count 2, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table ‘Person’. Scan count 0, logical reads 137, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table ‘Employee’. Scan count 2, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. You will notice in the case of compatibility level 110 there 137 logical read from table person where as in the case of compatibility level 120 there are only 6 physical reads from table person. This drastically improves the performance of the query. If we enable execution plan, we can see the same as well. I hope you will find this quick example helpful. You can read more about this in my latest Pluralsight Course. Reference: Pinal Dave (http://blog.SQLAuthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
SQL SERVER – Query Hint – Contest Win Joes 2 Pros Combo (USD 198) – Day 1 of 5

- by pinaldave

August 2011 we ran a contest where every day we give away one book for an entire month. The contest had extreme success. Lots of people participated and lots of give away. I have received lots of questions if we are doing something similar this month. Absolutely, instead of running a contest a month long we are doing something more interesting. We are giving away USD 198 worth gift every day for this week. We are giving away Joes 2 Pros 5 Volumes (BOOK) SQL 2008 Development Certification Training Kit every day. One copy in India and One in USA. Total 2 of the giveaway (worth USD 198). All the gifts are sponsored from the Koenig Training Solution and Joes 2 Pros. The books are available here Amazon | Flipkart | Indiaplaza How to Win: Read the Question Read the Hints Answer the Quiz in Contact Form in following format Question Answer Name of the country (The contest is open for USA and India residents only) 2 Winners will be randomly selected announced on August 20th. Question of the Day: Which of the following queries will return dirty data? a) SELECT * FROM Table1 (READUNCOMMITED) b) SELECT * FROM Table1 (NOLOCK) c) SELECT * FROM Table1 (DIRTYREAD) d) SELECT * FROM Table1 (MYLOCK) Query Hints: BIG HINT POST Most SQL people know what a “Dirty Record” is. You might also call that an “Intermediate record”. In case this is new to you here is a very quick explanation. The simplest way to describe the steps of a transaction is to use an example of updating an existing record into a table. When the insert runs, SQL Server gets the data from storage, such as a hard drive, and loads it into memory and your CPU. The data in memory is changed and then saved to the storage device. Finally, a message is sent confirming the rows that were affected. For a very short period of time the update takes the data and puts it into memory (an intermediate state), not a permanent state. For every data change to a table there is a brief moment where the change is made in the intermediate state, but is not committed. During this time, any other DML statement needing that data waits until the lock is released. This is a safety feature so that SQL Server evaluates only official data. For every data change to a table there is a brief moment where the change is made in this intermediate state, but is not committed. During this time, any other DML statement (SELECT, INSERT, DELETE, UPDATE) needing that data must wait until the lock is released. This is a safety feature put in place so that SQL Server evaluates only official data. Additional Hints: I have previously discussed various concepts from SQL Server Joes 2 Pros Volume 1. SQL Joes 2 Pros Development Series – Dirty Records and Table Hints SQL Joes 2 Pros Development Series – Row Constructors SQL Joes 2 Pros Development Series – Finding un-matching Records SQL Joes 2 Pros Development Series – Efficient Query Writing Strategy SQL Joes 2 Pros Development Series – Finding Apostrophes in String and Text SQL Joes 2 Pros Development Series – Wildcard – Querying Special Characters SQL Joes 2 Pros Development Series – Wildcard Basics Recap Next Step: Answer the Quiz in Contact Form in following format Question Answer Name of the country (The contest is open for USA and India) Bonus Winner Leave a comment with your favorite article from the “additional hints” section and you may be eligible for surprise gift. There is no country restriction for this Bonus Contest. Do mention why you liked it any particular blog post and I will announce the winner of the same along with the main contest. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Joes 2 Pros, PostADay, SQL, SQL Authority, SQL Puzzle, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
SQL SERVER – Introduction to LEAD and LAG – Analytic Functions Introduced in SQL Server 2012

- by pinaldave

SQL Server 2012 introduces new analytical function LEAD() and LAG(). This functions accesses data from a subsequent row (for lead) and previous row (for lag) in the same result set without the use of a self-join . It will be very difficult to explain this in words so I will attempt small example to explain you this function. Instead of creating new table, I will be using AdventureWorks sample database as most of the developer uses that for experiment. Let us fun following query. USE AdventureWorks GO SELECT s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty, LEAD(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID ) LeadValue, LAG(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID ) LagValue FROM Sales.SalesOrderDetail s WHERE SalesOrderID IN (43670, 43669, 43667, 43663) ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty GO Above query will give us following result. When we look at above resultset it is very clear that LEAD function gives us value which is going to come in next line and LAG function gives us value which was encountered in previous line. If we have to generate the same result without using this function we will have to use self join. In future blog post we will see the same. Let us explore this function a bit more. This function not only provide previous or next line but it can also access any line before or after using offset. Let us fun following query, where LEAD and LAG function accesses the row with offset of 2. USE AdventureWorks GO SELECT s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty, LEAD(SalesOrderDetailID,2) OVER (ORDER BY SalesOrderDetailID ) LeadValue, LAG(SalesOrderDetailID,2) OVER (ORDER BY SalesOrderDetailID ) LagValue FROM Sales.SalesOrderDetail s WHERE SalesOrderID IN (43670, 43669, 43667, 43663) ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty GO Above query will give us following result. You can see the LEAD and LAG functions now have interval of rows when they are returning results. As there is interval of two rows the first two rows in LEAD function and last two rows in LAG function will return NULL value. You can easily replace this NULL Value with any other default value by passing third parameter in LEAD and LAG function. Let us fun following query. USE AdventureWorks GO SELECT s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty, LEAD(SalesOrderDetailID,2,0) OVER (ORDER BY SalesOrderDetailID ) LeadValue, LAG(SalesOrderDetailID,2,0) OVER (ORDER BY SalesOrderDetailID ) LagValue FROM Sales.SalesOrderDetail s WHERE SalesOrderID IN (43670, 43669, 43667, 43663) ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty GO Above query will give us following result, where NULL are now replaced with value 0. Just like any other analytic function we can easily partition this function as well. Let us see the use of PARTITION BY in this clause. USE AdventureWorks GO SELECT s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty, LEAD(SalesOrderDetailID) OVER (PARTITION BY SalesOrderID ORDER BY SalesOrderDetailID ) LeadValue, LAG(SalesOrderDetailID) OVER (PARTITION BY SalesOrderID ORDER BY SalesOrderDetailID ) LagValue FROM Sales.SalesOrderDetail s WHERE SalesOrderID IN (43670, 43669, 43667, 43663) ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty GO Above query will give us following result, where now the data is partitioned by SalesOrderID and LEAD and LAG functions are returning the appropriate result in that window. As now there are smaller partition in my query, you will see higher presence of NULL. In future blog post we will see how this functions are compared to SELF JOIN. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Function, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
SQL SERVER – OVER clause with FIRST _VALUE and LAST_VALUE – Analytic Functions Introduced in SQL Server 2012 – ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING

- by pinaldave

Yesterday I had discussed two analytical functions FIRST_VALUE and LAST_VALUE. After reading the blog post I received very interesting question. “Don’t you think there is bug in your first example where FIRST_VALUE is remain same but the LAST_VALUE is changing every line. I think the LAST_VALUE should be the highest value in the windows or set of result.” I find this question very interesting because this is very commonly made mistake. No there is no bug in the code. I think what we need is a bit more explanation. Let me attempt that first. Before you do that I suggest you read yesterday’s blog post as this question is related to that blog post. Now let’s have fun following query: USE AdventureWorks GO SELECT s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty, FIRST_VALUE(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID) FstValue, LAST_VALUE(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID) LstValue FROM Sales.SalesOrderDetail s WHERE SalesOrderID IN (43670, 43669, 43667, 43663) ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty GO The above query will give us the following result: As per the reader’s question the value of the LAST_VALUE function should be always 114 and not increasing as the rows are increased. Let me re-write the above code once again with bit extra T-SQL Syntax. Please pay special attention to the ROW clause which I have added in the above syntax. USE AdventureWorks GO SELECT s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty, FIRST_VALUE(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) FstValue, LAST_VALUE(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) LstValue FROM Sales.SalesOrderDetail s WHERE SalesOrderID IN (43670, 43669, 43667, 43663) ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty GO Now once again check the result of the above query. The result of both the query is same because in OVER clause the default ROWS selection is always UNBOUNDED PRECEDING AND CURRENT ROW. If you want the maximum value of the windows with OVER clause you need to change the syntax to UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING for ROW clause. Now run following query and pay special attention to ROW clause again. USE AdventureWorks GO SELECT s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty, FIRST_VALUE(SalesOrderDetailID) OVER (PARTITION BY SalesOrderID ORDER BY SalesOrderDetailID ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) FstValue, LAST_VALUE(SalesOrderDetailID) OVER (PARTITION BY SalesOrderID ORDER BY SalesOrderDetailID ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) LstValue FROM Sales.SalesOrderDetail s WHERE SalesOrderID IN (43670, 43669, 43667, 43663) ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty GO Here is the resultset of the above query which is what questioner was asking. So in simple word, there is no bug but there is additional syntax needed to add to get your desired answer. The same logic also applies to PARTITION BY clause when used. Here is quick example of how we can further partition the query by SalesOrderDetailID with this new functions. USE AdventureWorks GO SELECT s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty, FIRST_VALUE(SalesOrderDetailID) OVER (PARTITION BY SalesOrderID ORDER BY SalesOrderDetailID ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) FstValue, LAST_VALUE(SalesOrderDetailID) OVER (PARTITION BY SalesOrderID ORDER BY SalesOrderDetailID ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) LstValue FROM Sales.SalesOrderDetail s WHERE SalesOrderID IN (43670, 43669, 43667, 43663) ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty GO Above query will give us windowed resultset on SalesOrderDetailsID as well give us FIRST and LAST value for the windowed resultset. There are lots to discuss for this two functions and we have just explored tip of the iceberg. In future post I will discover it further deep. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Function, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
Database – Beginning with Cloud Database As A Service

- by Pinal Dave

I love my weekend projects. Everybody does different activities in their weekend – like traveling, reading or just nothing. Every weekend I try to do something creative and different in the database world. The goal is I learn something new and if I enjoy my learning experience I share with the world. This weekend, I decided to explore Cloud Database As A Service – Morpheus. In my career I have managed many databases in the cloud and I have good experience in managing them. I should highlight that today’s applications use multiple databases from SQL for transactions and analytics, NoSQL for documents, In-Memory for caching to Indexing for search. Provisioning and deploying these databases often require extensive expertise and time. Often these databases are also not deployed on the same infrastructure and can create unnecessary latency between the application layer and the databases. Not to mention the different quality of service based on the infrastructure and the service provider where they are deployed. Moreover, there are additional problems that I have experienced with traditional database setup when hosted in the cloud: Database provisioning & orchestration Slow speed due to hardware issues Poor Monitoring Tools High network latency Now if you have a great software and expert network engineer, you can continuously work on above problems and overcome them. However, not every organization have the luxury to have top notch experts in the field. Now above issues are related to infrastructure, but there are a few more problems which are related to software/application as well. Here are the top three things which can be problems if you do not have application expert: Replication and Clustering Simple provisioning of the hard drive space Automatic Sharding Well, Morpheus looks like a product build by experts who have faced similar situation in the past. The product pretty much addresses all the pain points of developers and database administrators. What is different about Morpheus is that it offers a variety of databases from MySQL, MongoDB, ElasticSearch to Reddis as a service. Thus users can pick and chose any combination of these databases. All of them can be provisioned in a matter of minutes with a simple and intuitive point and click user interface. The Morpheus cloud is built on Solid State Drives (SSD) and is designed for high-speed database transactions. In addition it offers a direct link to Amazon Web Services to minimize latency between the application layer and the databases. Here are the few steps on how one can get started with Morpheus. Follow along with me. First go to http://www.gomorpheus.com and register for a new and free account. Step 1: Signup It is very simple to signup for Morpheus. Step 2: Select your database I use MySQL for my daily routine, so I have selected MySQL. Upon clicking on the big red button to add Instance, it prompted a dialogue of creating a new instance. Step 3: Create User Now we just have to create a user in our portal which we will use to connect to a database hosted at Morpheus. Click on your database instance and it will bring you to User Screen. Over here you will notice once again a big red button to create a new user. I created a user with my first name. Step 4: Configure your MySQL client I used MySQL workbench and connected to MySQL instance, which I had created with an IP address and user. That’s it! You are connecting to MySQL instance. Now you can create your objects just like you would create on your local box. You will have all the features of the Morpheus when you are working with your database. Dashboard While working with Morpheus, I was most impressed with its dashboard. In future blog posts, I will write more about this feature. Also with Morpheus you use the same process for provisioning and connecting with other databases: MongoDB, ElasticSearch and Reddis. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: MySQL, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
SQL SERVER – UNION ALL and ORDER BY – How to Order Table Separately While Using UNION ALL

- by pinaldave

I often see developers trying following syntax while using ORDER BY. SELECT Columns FROM TABLE1 ORDER BY Columns UNION ALL SELECT Columns FROM TABLE2 ORDER BY Columns However the above query will return following error. Msg 156, Level 15, State 1, Line 5 Incorrect syntax near the keyword ‘ORDER’. It is not possible to use two different ORDER BY in the UNION statement. UNION returns single resultsetand as per the Logical Query Processing Phases. However, if your requirement is such that you want your top and bottom query of the UNION resultset independently sorted but in the same resultset you can add an additional static column and order by that column. Let us re-create the same scenario. First create two tables and populated with sample data. USE tempdb GO -- Create table CREATE TABLE t1 (ID INT, Col1 VARCHAR(100)); CREATE TABLE t2 (ID INT, Col1 VARCHAR(100)); GO -- Sample Data Build INSERT INTO t1 (ID, Col1) SELECT 1, 'Col1-t1' UNION ALL SELECT 2, 'Col2-t1' UNION ALL SELECT 3, 'Col3-t1'; INSERT INTO t2 (ID, Col1) SELECT 3, 'Col1-t2' UNION ALL SELECT 2, 'Col2-t2' UNION ALL SELECT 1, 'Col3-t2'; GO If we SELECT the data from both the table using UNION ALL . -- SELECT without ORDER BY SELECT ID, Col1 FROM t1 UNION ALL SELECT ID, Col1 FROM t2 GO We will get the data in following order. However, our requirement is to get data in following order. If we need data ordered by Column1 we can ORDER the resultset ordered by Column1. -- SELECT with ORDER BY SELECT ID, Col1 FROM t1 UNION ALL SELECT ID, Col1 FROM t2 ORDER BY ID GO Now to get the data in independently sorted in UNION ALL let us add additional column OrderKey and use ORDER BY on that column. I think the description does not do proper justice let us see the example here. -- SELECT with ORDER BY - with ORDER KEY SELECT ID, Col1, 'id1' OrderKey FROM t1 UNION ALL SELECT ID, Col1, 'id2' OrderKey FROM t2 ORDER BY OrderKey, ID GO The above query will give the desired result. Now do not forget to clean up the database by running the following script. -- Clean up DROP TABLE t1; DROP TABLE t2; GO Here is the complete script used in this example. USE tempdb GO -- Create table CREATE TABLE t1 (ID INT, Col1 VARCHAR(100)); CREATE TABLE t2 (ID INT, Col1 VARCHAR(100)); GO -- Sample Data Build INSERT INTO t1 (ID, Col1) SELECT 1, 'Col1-t1' UNION ALL SELECT 2, 'Col2-t1' UNION ALL SELECT 3, 'Col3-t1'; INSERT INTO t2 (ID, Col1) SELECT 3, 'Col1-t2' UNION ALL SELECT 2, 'Col2-t2' UNION ALL SELECT 1, 'Col3-t2'; GO -- SELECT without ORDER BY SELECT ID, Col1 FROM t1 UNION ALL SELECT ID, Col1 FROM t2 GO -- SELECT with ORDER BY SELECT ID, Col1 FROM t1 UNION ALL SELECT ID, Col1 FROM t2 ORDER BY ID GO -- SELECT with ORDER BY - with ORDER KEY SELECT ID, Col1, 'id1' OrderKey FROM t1 UNION ALL SELECT ID, Col1, 'id2' OrderKey FROM t2 ORDER BY OrderKey, ID GO -- Clean up DROP TABLE t1; DROP TABLE t2; GO I am sure there are many more ways to achieve this, what method would you use if you have to face the similar situation? Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Best Practices, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
SQL SERVER – Updating Data in A Columnstore Index

- by pinaldave

So far I have written two articles on Columnstore Indexes, and both of them got very interesting readership. In fact, just recently I got a query on my previous article on Columnstore Index. Read the following two articles to get familiar with the Columnstore Index. They will give you a reference to the question which was asked by a certain reader: SQL SERVER – Fundamentals of Columnstore Index SQL SERVER – How to Ignore Columnstore Index Usage in Query Here is the reader’s question: ” When I tried to update my table after creating the Columnstore index, it gives me an error. What should I do?” When the Columnstore index is created on the table, the table becomes Read-Only table and it does not let any insert/update/delete on the table. The basic understanding is that Columnstore Index will be created on the table that is very huge and holds lots of data. If a table is small enough, there is no need to create a Columnstore index. The regular index should just help it. The reason why Columnstore index was needed is because the table was so big that retrieving the data was taking a really, really long time. Now, updating such a huge table is always a challenge by itself. If the Columnstore Index is created on the table, and the table needs to be updated, you need to know that there are various ways to update it. The easiest way is to disable the Index and enable it. Consider the following code: USE AdventureWorks GO -- Create New Table CREATE TABLE [dbo].[MySalesOrderDetail]( [SalesOrderID] [int] NOT NULL, [SalesOrderDetailID] [int] NOT NULL, [CarrierTrackingNumber] [nvarchar](25) NULL, [OrderQty] [smallint] NOT NULL, [ProductID] [int] NOT NULL, [SpecialOfferID] [int] NOT NULL, [UnitPrice] [money] NOT NULL, [UnitPriceDiscount] [money] NOT NULL, [LineTotal] [numeric](38, 6) NOT NULL, [rowguid] [uniqueidentifier] NOT NULL, [ModifiedDate] [datetime] NOT NULL ) ON [PRIMARY] GO -- Create clustered index CREATE CLUSTERED INDEX [CL_MySalesOrderDetail] ON [dbo].[MySalesOrderDetail] ( [SalesOrderDetailID]) GO -- Create Sample Data Table -- WARNING: This Query may run upto 2-10 minutes based on your systems resources INSERT INTO [dbo].[MySalesOrderDetail] SELECT S1.* FROM Sales.SalesOrderDetail S1 GO 100 -- Create ColumnStore Index CREATE NONCLUSTERED COLUMNSTORE INDEX [IX_MySalesOrderDetail_ColumnStore] ON [MySalesOrderDetail] (UnitPrice, OrderQty, ProductID) GO -- Attempt to Update the table UPDATE [dbo].[MySalesOrderDetail] SET OrderQty = OrderQty +1 WHERE [SalesOrderID] = 43659 GO /* It will throw following error Msg 35330, Level 15, State 1, Line 2 UPDATE statement failed because data cannot be updated in a table with a columnstore index. Consider disabling the columnstore index before issuing the UPDATE statement, then rebuilding the columnstore index after UPDATE is complete. */ A similar error also shows up for Insert/Delete function. Here is the workaround. Disable the Columnstore Index and performance update, enable the Columnstore Index: -- Disable the Columnstore Index ALTER INDEX [IX_MySalesOrderDetail_ColumnStore] ON [dbo].[MySalesOrderDetail] DISABLE GO -- Attempt to Update the table UPDATE [dbo].[MySalesOrderDetail] SET OrderQty = OrderQty +1 WHERE [SalesOrderID] = 43659 GO -- Rebuild the Columnstore Index ALTER INDEX [IX_MySalesOrderDetail_ColumnStore] ON [dbo].[MySalesOrderDetail] REBUILD GO This time it will not throw an error while the update of the table goes successfully. Let us do a cleanup of our tables using this code: -- Cleanup DROP INDEX [IX_MySalesOrderDetail_ColumnStore] ON [dbo].[MySalesOrderDetail] GO TRUNCATE TABLE dbo.MySalesOrderDetail GO DROP TABLE dbo.MySalesOrderDetail GO In the next post we will see how we can use Partition to update the Columnstore Index. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Index, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
SQL SERVER – Basic Calculation and PEMDAS Order of Operation

- by pinaldave

After thinking a long time, I have decided to write about this blog post. I had no plan to create a blog post about this subject but the amount of conversation this one has created on my Facebook page, I decided to bring up a few of the question and concerns discussed on the Facebook page. There are more than 10,000 comments here so far. There are lots of discussion about what should be the answer. Well, as far as I can tell there is a big debate going on on Facebook, for educational purpose you should go ahead and read some of the comments. They are very interesting and for sure teach some new stuff. Even though some of the comments are clearly wrong they have made some good points and I believe it for sure develops some logic. Here is my take on this subject. I believe the answer is 9 as I follow PEMDAS Order of Operation. PEMDAS stands for parentheses, exponents, multiplication, division, addition, subtraction. PEMDAS is commonly known as BODMAS in India. BODMAS stands for Brackets, Orders (ie Powers and Square Roots, etc), Division, Multiplication, Addition and Subtraction. PEMDAS and BODMAS are almost same and both of them follow the operation order from LEFT to RIGHT. Let us try to simplify above statement using the PEMDAS or BODMAS (whatever you prefer to call). Step 1: 6 ÷ 2 (1+2) (parentheses first) Step 2: = 6 ÷ 2 * (1+2) (adding multiplication sign for further clarification) Step 3: = 6 ÷ 2* (3) (single digit in parentheses – simplify using operator) Step 4: = 6 ÷ 2 * 3 (Remember next Operation should be LEFT to RIGHT) Step 5: = 3 * 3 (because 6 ÷ 2 = 3; remember LEFT to RIGHT) Step 6: = 9 (final answer) Some often find Step 4 confusing and often ended up multiplying 2 and 3 resulting Step 5 to be 6 ÷ 6, this is incorrect because in this case we did not follow the order of LEFT to RIGHT. When we do not follow the order of operation from LEFT to RIGHT we end up with the answer 1 which is incorrect. Let us see what SQL Server returns as a result. I executed following statement in SQL Server Management Studio SELECT 6/2*(1+2) It is clear that SQL Server also thinks that the answer should be 9. Let us go ahead and ask Google what will be the answer of above question in Google I have searched for the following term: 6/2(1+2) The result also says the answer should be 9. If you want a further reference here is a great video which describes why the answer should be 9 and not 1. And here is a fantastic conversation on Google Groups. Well, now what is your take on this subject? You are welcome to share constructive feedback and your answer may be different from my answer. NOTE: A healthy conversation about this subject is indeed encouraged but if there is a single bad word or comment is flaming it will be deleted without any notification (it does not matter how valuable information it contains). Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: About Me, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
SQL SERVER – Read Only Files and SQL Server Management Studio (SSMS)

- by pinaldave

Just like any other Developer or DBA SQL Server Management Studio is my favorite application. Any any moment of the time I have multiple instances of the same application are open and I am working on it. Recently, I have come across a very interesting feature in SSMS related to “Read Only” files. I believe it is a little unknown feature as well so decided to write a blog about the same. First create a read only SQL file. You can make any file read by Right Click >> Properties >> Select Attribute Read Only. Now open the same file in SQL Server Management Studio. You will find that besides the file name there is a small ‘lock’ icon. This small icon indicates that the file is read only. Now let us attempt to edit the read only file. It will let us edit the file any way we want, however when we attempt to save it, it gives following pop-up value. The options in the pop-up are self explanatory and I liked it. The goal of the read only file is to prevent users to make un-intended changes. However, when a user should have complete control over the user file. User should be aware that the file is read only but if he wants to edit the file or save as a new file the choices should be present in front of it and the pop-up menu precisely captures the same. Now let us check option related to this feature in SSMS. Go to Menu >> Options >> Environment >> Documents You will find the third option which is “Allow editing of read-only files; warn when attempt to save”. In the above scenario it was already checked. Let us uncheck the same and do the same exercise which we have done earlier. I closed all the earlier window to avoid confusion. With the new option selected when I attempt to even modify the Read Only file, it gives me totally different pop up screen. It gives me an option like “Edit In-Memory”, “Make Writeable” etc. When you select “Edit In-Memory” it allows you to edit the file and later you can save as new file – just like the earlier scenario which we have discussed. . If clicked on the Make Writeable it will remove the restriction of the Read Only and file can be edited as pleased. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Server Management Studio, SQL Tips and Tricks, T SQL, Technology

Read the article
Big Data – Beginning Big Data Series Next Month in 21 Parts

- by Pinal Dave

Big Data is the next big thing. There was a time when we used to talk in terms of MB and GB of the data. However, the industry is changing and we are now moving to a conversation where we discuss about data in Petabyte, Exabyte and Zettabyte. It seems that the world is now talking about increased Volume of the data. In simple world we all think that Big Data is nothing but plenty of volume. In reality Big Data is much more than just a huge volume of the data. When talking about the data we need to understand about variety and volume along with volume. Though Big data look like a simple concept, it is extremely complex subject when we attempt to start learning the same. My Journey I have recently presented on Big Data in quite a few organizations and I have received quite a few questions during this roadshow event. I have collected all the questions which I have received and decided to post about them on the blog. In the month of October 2013, on every weekday we will be learning something new about Big Data. Every day I will share a concept/question and in the same blog post we will learn the answer of the same. Big Data – Plenty of Questions I received quite a few questions during my road trip. Here are few of the questions. I want to learn Big Data – where should I start? Do I need to know SQL to learn Big Data? What is Hadoop? There are so many organizations talking about Big Data, and every one has a different approach. How to start with big Data? Do I need to know Java to learn about Big Data? What is different between various NoSQL languages. I will attempt to answer most of the questions during the month long series in the next month. Big Data – Big Subject Big Data is a very big subject and I no way claim that I will be covering every single big data concept in this series. However, I promise that I will be indeed sharing lots of basic concepts which are revolving around Big Data. We will discuss from fundamentals about Big Data and continue further learning about it. I will attempt to cover the concept so simple that many of you might have wondered about it but afraid to ask. Your Role! During this series next month, I need your one help. Please keep on posting questions you might have related to big data as blog post comments and on Facebook Page. I will monitor them closely and will try to answer them as well during this series. Now make sure that you do not miss any single blog post in this series as every blog post will be linked to each other. You can subscribe to my feed or like my Facebook page or subscribe via email (by entering email in the blog post). Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Big Data, PostADay, SQL, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
Big Data – Data Mining with Hive – What is Hive? – What is HiveQL (HQL)? – Day 15 of 21

- by Pinal Dave

In yesterday’s blog post we learned the importance of the operational database in Big Data Story. In this article we will understand what is Hive and HQL in Big Data Story. Yahoo started working on PIG (we will understand that in the next blog post) for their application deployment on Hadoop. The goal of Yahoo to manage their unstructured data. Similarly Facebook started deploying their warehouse solutions on Hadoop which has resulted in HIVE. The reason for going with HIVE is because the traditional warehousing solutions are getting very expensive. What is HIVE? Hive is a datawarehouseing infrastructure for Hadoop. The primary responsibility is to provide data summarization, query and analysis. It supports analysis of large datasets stored in Hadoop’s HDFS as well as on the Amazon S3 filesystem. The best part of HIVE is that it supports SQL-Like access to structured data which is known as HiveQL (or HQL) as well as big data analysis with the help of MapReduce. Hive is not built to get a quick response to queries but it it is built for data mining applications. Data mining applications can take from several minutes to several hours to analysis the data and HIVE is primarily used there. HIVE Organization The data are organized in three different formats in HIVE. Tables: They are very similar to RDBMS tables and contains rows and tables. Hive is just layered over the Hadoop File System (HDFS), hence tables are directly mapped to directories of the filesystems. It also supports tables stored in other native file systems. Partitions: Hive tables can have more than one partition. They are mapped to subdirectories and file systems as well. Buckets: In Hive data may be divided into buckets. Buckets are stored as files in partition in the underlying file system. Hive also has metastore which stores all the metadata. It is a relational database containing various information related to Hive Schema (column types, owners, key-value data, statistics etc.). We can use MySQL database over here. What is HiveSQL (HQL)? Hive query language provides the basic SQL like operations. Here are few of the tasks which HQL can do easily. Create and manage tables and partitions Support various Relational, Arithmetic and Logical Operators Evaluate functions Download the contents of a table to a local directory or result of queries to HDFS directory Here is the example of the HQL Query: SELECT upper(name), salesprice FROM sales; SELECT category, count(1) FROM products GROUP BY category; When you look at the above query, you can see they are very similar to SQL like queries. Tomorrow In tomorrow’s blog post we will discuss about very important components of the Big Data Ecosystem – Pig. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
Big Data – Operational Databases Supporting Big Data – Key-Value Pair Databases and Document Databases – Day 13 of 21

- by Pinal Dave

In yesterday’s blog post we learned the importance of the Relational Database and NoSQL database in the Big Data Story. In this article we will understand the role of Key-Value Pair Databases and Document Databases Supporting Big Data Story. Now we will see a few of the examples of the operational databases. Relational Databases (Yesterday’s post) NoSQL Databases (Yesterday’s post) Key-Value Pair Databases (This post) Document Databases (This post) Columnar Databases (Tomorrow’s post) Graph Databases (Tomorrow’s post) Spatial Databases (Tomorrow’s post) Key Value Pair Databases Key Value Pair Databases are also known as KVP databases. A key is a field name and attribute, an identifier. The content of that field is its value, the data that is being identified and stored. They have a very simple implementation of NoSQL database concepts. They do not have schema hence they are very flexible as well as scalable. The disadvantages of Key Value Pair (KVP) database are that they do not follow ACID (Atomicity, Consistency, Isolation, Durability) properties. Additionally, it will require data architects to plan for data placement, replication as well as high availability. In KVP databases the data is stored as strings. Here is a simple example of how Key Value Database will look like: Key Value Name Pinal Dave Color Blue Twitter @pinaldave Name Nupur Dave Movie The Hero As the number of users grow in Key Value Pair databases it starts getting difficult to manage the entire database. As there is no specific schema or rules associated with the database, there are chances that database grows exponentially as well. It is very crucial to select the right Key Value Pair Database which offers an additional set of tools to manage the data and provides finer control over various business aspects of the same. Riak Rick is one of the most popular Key Value Database. It is known for its scalability and performance in high volume and velocity database. Additionally, it implements a mechanism for collection key and values which further helps to build manageable system. We will further discuss Riak in future blog posts. Key Value Databases are a good choice for social media, communities, caching layers for connecting other databases. In simpler words, whenever we required flexibility of the data storage keeping scalability in mind – KVP databases are good options to consider. Document Database There are two different kinds of document databases. 1) Full document Content (web pages, word docs etc) and 2) Storing Document Components for storage. The second types of the document database we are talking about over here. They use Javascript Object Notation (JSON) and Binary JSON for the structure of the documents. JSON is very easy to understand language and it is very easy to write for applications. There are two major structures of JSON used for Document Database – 1) Name Value Pairs and 2) Ordered List. MongoDB and CouchDB are two of the most popular Open Source NonRelational Document Database. MongoDB MongoDB databases are called collections. Each collection is build of documents and each document is composed of fields. MongoDB collections can be indexed for optimal performance. MongoDB ecosystem is highly available, supports query services as well as MapReduce. It is often used in high volume content management system. CouchDB CouchDB databases are composed of documents which consists fields and attachments (known as description). It supports ACID properties. The main attraction points of CouchDB are that it will continue to operate even though network connectivity is sketchy. Due to this nature CouchDB prefers local data storage. Document Database is a good choice of the database when users have to generate dynamic reports from elements which are changing very frequently. A good example of document usages is in real time analytics in social networking or content management system. Tomorrow In tomorrow’s blog post we will discuss about various other Operational Databases supporting Big Data. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
Big Data – Basics of Big Data Architecture – Day 4 of 21

- by Pinal Dave

In yesterday’s blog post we understood how Big Data evolution happened. Today we will understand basics of the Big Data Architecture. Big Data Cycle Just like every other database related applications, bit data project have its development cycle. Though three Vs (link) for sure plays an important role in deciding the architecture of the Big Data projects. Just like every other project Big Data project also goes to similar phases of the data capturing, transforming, integrating, analyzing and building actionable reporting on the top of the data. While the process looks almost same but due to the nature of the data the architecture is often totally different. Here are few of the question which everyone should ask before going ahead with Big Data architecture. Questions to Ask How big is your total database? What is your requirement of the reporting in terms of time – real time, semi real time or at frequent interval? How important is the data availability and what is the plan for disaster recovery? What are the plans for network and physical security of the data? What platform will be the driving force behind data and what are different service level agreements for the infrastructure? This are just basic questions but based on your application and business need you should come up with the custom list of the question to ask. As I mentioned earlier this question may look quite simple but the answer will not be simple. When we are talking about Big Data implementation there are many other important aspects which we have to consider when we decide to go for the architecture. Building Blocks of Big Data Architecture It is absolutely impossible to discuss and nail down the most optimal architecture for any Big Data Solution in a single blog post, however, we can discuss the basic building blocks of big data architecture. Here is the image which I have built to explain how the building blocks of the Big Data architecture works. Above image gives good overview of how in Big Data Architecture various components are associated with each other. In Big Data various different data sources are part of the architecture hence extract, transform and integration are one of the most essential layers of the architecture. Most of the data is stored in relational as well as non relational data marts and data warehousing solutions. As per the business need various data are processed as well converted to proper reports and visualizations for end users. Just like software the hardware is almost the most important part of the Big Data Architecture. In the big data architecture hardware infrastructure is extremely important and failure over instances as well as redundant physical infrastructure is usually implemented. NoSQL in Data Management NoSQL is a very famous buzz word and it really means Not Relational SQL or Not Only SQL. This is because in Big Data Architecture the data is in any format. It can be unstructured, relational or in any other format or from any other data source. To bring all the data together relational technology is not enough, hence new tools, architecture and other algorithms are invented which takes care of all the kind of data. This is collectively called NoSQL. Tomorrow Next four days we will answer the Buzz Words – Hadoop. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
SQL – Step by Step Guide to Download and Install NuoDB – Getting Started with NuoDB

- by Pinal Dave

Let us take a look at the application you own at your business. If you pay attention to the underlying database for that application you will be amazed. Every successful business these days processes way more data than they used to process before. The number of transactions and the amount of data is growing at an exponential rate. Every single day there is way more data to process than before. Big data is no longer a concept; it is now turning into reality. If you look around there are so many different big data solutions and it can be a quite difficult task to figure out where to begin. Personally, I have been experimenting with a lot of different solutions which allow my database to scale immediately without much hassle while maintaining optimal database performance. There are for sure some solutions out there, but for many I even have to learn their specific language and there is a lot of new exploration to do. Honestly, what I prefer is a product, which works with the language I know (SQL) and follows all the RDBMS concepts which I am familiar with (ACID etc.). NuoDB is one such solution. It is an operational NewSQL database built on a patented emergent architecture with full support for SQL and ACID guarantees. In this blog post, I will explore how one can download and install NuoDB database. Step 1: Follow me and go to the NuoDB download page. Simply fill out the form, accept the online license agreement, and you will be taken directly to a page where you can select any platform you prefer to install NuoDB. In my example below, I select the Windows 64-bit platform as it is one of the most popular NuoDB platforms. (You can also run NuoDB on Amazon Web Services but I prefer to install it on my local machine for the purposes of this blog). Step 2: Once you have downloaded the NuoDB installer, double click on it to install it on the Windows platform. Here is the enlarged the icon of the installer. Step 3: Follow the wizard installation, as it is pretty straight forward and easy to do so. I have selected all the options to install as the overall installation is very simple and it does not take up much space. I have installed it on my C drive but you can select your preferred drive. It is quite possible that if you do not have 64 bit Java, it will throw following error. If you face following error, I suggest you to download 64-bit Java from here. Make sure that you download 64-bit Java from following link: http://java.com/en/download/manual.jsp If already have Java 64-bit installed, you can continue with the installation as described in following image. Otherwise, install Java and start from with Step 1. As in my case, I already have 64-bit Java installed – and you won’t believe me when I say that the entire installation of NuoDB only took me around 90 seconds. Click on Finish to end to exit the installation. Step 4: Once the installation is successful, NuoDB will automatically open the following two tabs – Console and DevCenter — in your preferred browser. On the Console tab you can explore various components of the NuoDB solution, e.g. QuickStart, Admin, Explorer, Storefront and Samples. We will see various components and their usage in future blog posts. If you follow these steps in this post, which I have followed to install NuoDB, you will agree that the installation of NuoDB is extremely smooth and it was indeed a pleasure to install a database product with such ease. If you have installed other database products in the past, you will absolutely agree with me. So download NuoDB and install it today, and in tomorrow’s blog post I will take the installation to the next level. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: NuoDB

Read the article
SQL SERVER – Shard No More – An Innovative Look at Distributed Peer-to-peer SQL Database

- by pinaldave

There is no doubt that SQL databases play an important role in modern applications. In an ideal world, a single database can handle hundreds of incoming connections from multiple clients and scale to accommodate the related transactions. However the world is not ideal and databases are often a cause of major headaches when applications need to scale to accommodate more connections, transactions, or both. In order to overcome scaling issues, application developers often resort to administrative acrobatics, also known as database sharding. Sharding helps to improve application performance and throughput by splitting the database into two or more shards. Unfortunately, this practice also requires application developers to code transactional consistency into their applications. Getting transactional consistency across multiple SQL database shards can prove to be very difficult. Sharding requires developers to think about things like rollbacks, constraints, and referential integrity across tables within their applications when these types of concerns are best handled by the database. It also makes other common operations such as joins, searches, and memory management very difficult. In short, the very solution implemented to overcome throughput issues becomes a bottleneck in and of itself. What if database sharding was no longer required to scale your application? Let me explain. For the past several months I have been following and writing about NuoDB, a hot new SQL database technology out of Cambridge, MA. NuoDB is officially out of beta and they have recently released their first release candidate so I decided to dig into the database in a little more detail. Their architecture is very interesting and exciting because it completely eliminates the need to shard a database to achieve higher throughput. Each NuoDB database consists of at least three or more processes that enable a single database to run across multiple hosts. These processes include a Broker, a Transaction Engine and a Storage Manager. Brokers are responsible for connecting client applications to Transaction Engines and maintain a global view of the network to keep track of the multiple Transaction Engines available at any time. Transaction Engines are in-memory processes that client applications connect to for processing SQL transactions. Storage Managers are responsible for persisting data to disk and serving up records to the Transaction Managers if they don’t exist in memory. The secret to NuoDB’s approach to solving the sharding problem is that it is a truly distributed, peer-to-peer, SQL database. Each of its processes can be deployed across multiple hosts. When client applications need to connect to a Transaction Engine, the Broker will automatically route the request to the most available process. Since multiple Transaction Engines and Storage Managers running across multiple host machines represent a single logical database, you never have to resort to sharding to get the throughput your application requires. NuoDB is a new pioneer in the SQL database world. They are making database scalability simple by eliminating the need for acrobatics such as sharding, and they are also making general administration of the database simpler as well. Their distributed database appears to you as a user like a single SQL Server database. With their RC1 release they have also provided a web based administrative console that they call NuoConsole. This tool makes it extremely easy to deploy and manage NuoDB processes across one or multiple hosts with the click of a mouse button. See for yourself by downloading NuoDB here. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: CodeProject, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQLServer, T SQL, Technology Tagged: NuoDB

Read the article
Big Data – Basics of Big Data Analytics – Day 18 of 21

- by Pinal Dave

In yesterday’s blog post we learned the importance of the various components in Big Data Story. In this article we will understand what are the various analytics tasks we try to achieve with the Big Data and the list of the important tools in Big Data Story. When you have plenty of the data around you what is the first thing which comes to your mind? “What do all these data means?” Exactly – the same thought comes to my mind as well. I always wanted to know what all the data means and what meaningful information I can receive out of it. Most of the Big Data projects are built to retrieve various intelligence all this data contains within it. Let us take example of Facebook. When I look at my friends list of Facebook, I always want to ask many questions such as - On which date my maximum friends have a birthday? What is the most favorite film of my most of the friends so I can talk about it and engage them? What is the most liked placed to travel my friends? Which is the most disliked cousin for my friends in India and USA so when they travel, I do not take them there. There are many more questions I can think of. This illustrates that how important it is to have analysis of Big Data. Here are few of the kind of analysis listed which you can use with Big Data. Slicing and Dicing: This means breaking down your data into smaller set and understanding them one set at a time. This also helps to present various information in a variety of different user digestible ways. For example if you have data related to movies, you can use different slide and dice data in various formats like actors, movie length etc. Real Time Monitoring: This is very crucial in social media when there are any events happening and you wanted to measure the impact at the time when the event is happening. For example, if you are using twitter when there is a football match, you can watch what fans are talking about football match on twitter when the event is happening. Anomaly Predication and Modeling: If the business is running normal it is alright but if there are signs of trouble, everyone wants to know them early on the hand. Big Data analysis of various patterns can be very much helpful to predict future. Though it may not be always accurate but certain hints and signals can be very helpful. For example, lots of data can help conclude that if there is lots of rain it can increase the sell of umbrella. Text and Unstructured Data Analysis: unstructured data are now getting norm in the new world and they are a big part of the Big Data revolution. It is very important that we Extract, Transform and Load the unstructured data and make meaningful data out of it. For example, analysis of lots of images, one can predict that people like to use certain colors in certain months in their cloths. Big Data Analytics Solutions There are many different Big Data Analystics Solutions out in the market. It is impossible to list all of them so I will list a few of them over here. Tableau – This has to be one of the most popular visualization tools out in the big data market. SAS – A high performance analytics and infrastructure company IBM and Oracle – They have a range of tools for Big Data Analysis Tomorrow In tomorrow’s blog post we will discuss about very important components of the Big Data Ecosystem – Data Scientist. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article

Search Results

Search found 848 results on 34 pages for 'sqlauthority'.

Page 29/34 | < Previous Page | 25 26 27 28 29 30 31 32 33 34 | Next Page >

- by pinaldave

- by Pinal Dave

- by pinaldave

- by Pinal Dave

- by pinaldave

- by pinaldave

- by pinaldave

- by pinaldave

- by pinaldave

- by Pinal Dave

- by pinaldave

- by pinaldave

- by pinaldave

- by Pinal Dave

- by pinaldave

- by pinaldave

- by pinaldave

- by pinaldave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by pinaldave

- by Pinal Dave

< Previous Page | 25 26 27 28 29 30 31 32 33 34 | Next Page >