Search Results

Search found 2466 results on 99 pages for 'dave mankoff'.


SQL SERVER – Learn SQL Server 2014 Online in a Day – My Latest Pluralsight Course

- by Pinal Dave

Click here to watch SQL Server 2014 Administration New Features.

SQL Server 2014 was released earlier this year and has been extremely popular in the Microsoft world. Here is an announcement for everyone who has been asking me to build a tutorial around SQL Server 2014: I have authored my latest Pluralsight course on the subject. The course is 4 hours and 17 minutes long, and the best part is that it covers the latest features of SQL Server 2014. I have built this course on the assumption that the DBA is familiar with earlier versions of SQL Server and wants to explore and learn the new features of SQL Server 2014.

The Challenge I Faced

The biggest challenge I faced was how to come up with the outline for the course. So many different features were introduced in SQL Server 2014 that it is difficult to cover each of them in a single course. I wanted to cover the topics that are most relevant and useful to developers, and I also wanted to cover topics that developers would find useful if only they knew they existed in the product. I finally decided to rely on blog readers and a few SQL experts. I reached out to 20 selected people via email and gave them a list of the topics I was considering for this course. They all work in different organizations and have a good understanding of the needs of DBAs and developers. Based on their feedback, I was able to come up with a very good outline, and the course is currently very popular in the Pluralsight library. Lots of people have asked me how I was able to come up with such an accurate course outline. The credit goes to the developers and DBAs who voted on the topics and helped me build a very solid outline for the course.

Outline of the Course

Here is a quick outline for the course:

- Introduction
- Backup Enhancements
- Security Enhancements
- Columnstore Enhancements
- Online Data Operations Enhancements
- Enhancements with Microsoft Azure
- SSD Buffer Pool Extensions
- Resource Governor IO
- Miscellaneous Features
- Online Index Rebuilding
- Live Plans for Long Running Queries
- Transaction Durability
- Cardinality Estimation
- In Memory OLTP Optimization

I had great fun working on the topics in the outline. I am confident that once you start the course, you will see how each topic builds on the previous ones and how it is presented. I have made sure that each topic begins with a vivid and clear story: I first tell the story and right after that I explain the concept.

Who Should Attend This Course

Everyone who has basic knowledge of SQL Server and wants to get up to date with SQL Server 2014 should attend this course. I have made sure that the course is easy to understand, and I have divided complex subjects into multiple parts. This way the learning is progressive, and anyone with limited knowledge of the subject has enough time to understand the concept presented.

How to Watch the Video Course

This course is available at Pluralsight, and you will need a valid Pluralsight login. If you do not have one, you can quickly sign up for the free trial.

Reference: Pinal Dave (http://blog.SQLAuthority.com)
Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQL Training, SQLAuthority News, T SQL, Video

Big Data – Data Mining with Hive – What is Hive? – What is HiveQL (HQL)? – Day 15 of 21

- by Pinal Dave

In yesterday’s blog post we learned the importance of the operational database in the Big Data story. In this article we will understand what Hive and HQL are in the Big Data story.

Yahoo started working on Pig (we will cover that in the next blog post) for their application deployment on Hadoop. Yahoo’s goal was to manage their unstructured data. Similarly, Facebook started deploying their warehouse solutions on Hadoop, which resulted in Hive. The reason for going with Hive is that traditional warehousing solutions are getting very expensive.

What is Hive?

Hive is a data warehousing infrastructure for Hadoop. Its primary responsibility is to provide data summarization, query and analysis. It supports analysis of large datasets stored in Hadoop’s HDFS as well as on the Amazon S3 filesystem. The best part of Hive is that it supports SQL-like access to structured data, known as HiveQL (or HQL), as well as big data analysis with the help of MapReduce. Hive is not built to give a quick response to queries; it is built for data mining applications. Data mining applications can take from several minutes to several hours to analyze the data, and Hive is primarily used there.

Hive Organization

The data is organized in three different ways in Hive.

- Tables: They are very similar to RDBMS tables and contain rows and columns. Hive is just layered over the Hadoop File System (HDFS), hence tables are directly mapped to directories of the filesystem. It also supports tables stored in other native file systems.
- Partitions: Hive tables can have more than one partition. Partitions are mapped to subdirectories in the file system as well.
- Buckets: In Hive, data may be divided into buckets. Buckets are stored as files in the partition in the underlying file system.

Hive also has a metastore which stores all the metadata. It is a relational database containing various information related to the Hive schema (column types, owners, key-value data, statistics etc.). We can use a MySQL database here.

What is HiveQL (HQL)?

The Hive query language provides basic SQL-like operations. Here are a few of the tasks which HQL can do easily:

- Create and manage tables and partitions
- Support various relational, arithmetic and logical operators
- Evaluate functions
- Download the contents of a table to a local directory, or the result of queries to an HDFS directory

Here are two examples of HQL queries:

SELECT upper(name), salesprice FROM sales;
SELECT category, count(1) FROM products GROUP BY category;

As you can see, these queries look very similar to ordinary SQL queries.

Tomorrow

In tomorrow’s blog post we will discuss a very important component of the Big Data ecosystem: Pig.

Reference: Pinal Dave (http://blog.sqlauthority.com)
Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL
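To try the two HQL example queries from the Hive article without a Hadoop cluster, here is a minimal sketch that runs them through Python's built-in sqlite3 module. SQLite is only a stand-in for Hive (an assumption made for convenience), the sample rows are invented, and an ORDER BY has been added to each query so the output order is predictable.

```python
import sqlite3

# SQLite stands in for Hive here (assumption); the rows below are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (name TEXT, salesprice REAL);
    CREATE TABLE products (id INTEGER, category TEXT);
    INSERT INTO sales VALUES ('widget', 9.5), ('gadget', 20.0);
    INSERT INTO products VALUES (1, 'tools'), (2, 'tools'), (3, 'toys');
""")

# First example query: apply a function to a column.
upper_rows = conn.execute(
    "SELECT upper(name), salesprice FROM sales ORDER BY name"
).fetchall()

# Second example query: count rows per category.
category_counts = conn.execute(
    "SELECT category, count(1) FROM products GROUP BY category ORDER BY category"
).fetchall()

print(upper_rows)
print(category_counts)
```

The statements are plain SQL, which is the point the article makes: the same text would be accepted by Hive, where it would be executed as batch jobs over files in HDFS rather than as an interactive lookup.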

SQL – Step by Step Guide to Download and Install NuoDB – Getting Started with NuoDB

- by Pinal Dave

Let us take a look at the applications you own at your business. If you pay attention to the underlying database for each application, you will be amazed. Every successful business these days processes far more data than it used to. The number of transactions and the amount of data are growing at an exponential rate, and every single day there is more data to process than before. Big data is no longer a concept; it is turning into reality.

If you look around, there are so many different big data solutions that it can be quite difficult to figure out where to begin. Personally, I have been experimenting with a lot of different solutions that allow my database to scale immediately without much hassle while maintaining optimal database performance. There are certainly some solutions out there, but for many of them I have to learn a specific language and there is a lot of new exploration to do. Honestly, what I prefer is a product that works with the language I know (SQL) and follows all the RDBMS concepts I am familiar with (ACID etc.). NuoDB is one such solution. It is an operational NewSQL database built on a patented emergent architecture with full support for SQL and ACID guarantees.

In this blog post, I will explore how one can download and install the NuoDB database.

Step 1: Go to the NuoDB download page. Simply fill out the form, accept the online license agreement, and you will be taken directly to a page where you can select the platform on which you want to install NuoDB. In my example, I select the Windows 64-bit platform as it is one of the most popular NuoDB platforms. (You can also run NuoDB on Amazon Web Services, but I prefer to install it on my local machine for the purposes of this blog.)

Step 2: Once you have downloaded the NuoDB installer, double-click it to install NuoDB on the Windows platform.

Step 3: Follow the installation wizard, which is pretty straightforward. I selected all the options, as the overall installation is very simple and does not take up much space. I installed it on my C drive, but you can select your preferred drive. If you do not have 64-bit Java, the installer may throw an error. If you face this error, download 64-bit Java from the following link: http://java.com/en/download/manual.jsp. If you already have 64-bit Java installed, you can continue with the installation. Otherwise, install Java and start again from Step 1. In my case, I already had 64-bit Java installed, and you won’t believe me when I say that the entire installation of NuoDB took me only around 90 seconds. Click Finish to exit the installer.

Step 4: Once the installation is successful, NuoDB will automatically open two tabs, Console and DevCenter, in your preferred browser. On the Console tab you can explore various components of the NuoDB solution, e.g. QuickStart, Admin, Explorer, Storefront and Samples. We will see the various components and their usage in future blog posts.

If you follow the steps in this post, which are the ones I followed to install NuoDB, you will agree that the installation is extremely smooth; it was a pleasure to install a database product with such ease. If you have installed other database products in the past, you will absolutely agree with me. So download NuoDB and install it today, and in tomorrow’s blog post I will take the installation to the next level.

Reference: Pinal Dave (http://blog.sqlauthority.com)
Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology
Tagged: NuoDB

Big Data – Beginning Big Data Series Next Month in 21 Parts

- by Pinal Dave

Big Data is the next big thing. There was a time when we used to talk about data in terms of MB and GB. However, the industry is changing, and we are now moving to a conversation where we discuss data in petabytes, exabytes and zettabytes. It seems that the world is now talking about the increased volume of data. In simple words, we all think that Big Data is nothing but plenty of volume. In reality, Big Data is much more than just a huge volume of data. When talking about the data we need to understand variety and velocity along with volume. Though Big Data looks like a simple concept, it is an extremely complex subject once we start learning it.

My Journey

I have recently presented on Big Data at quite a few organizations, and I received quite a few questions during this roadshow. I have collected all the questions I received and decided to post about them on the blog. In the month of October 2013, on every weekday we will learn something new about Big Data. Every day I will share a concept or question, and in the same blog post we will learn the answer.

Big Data: Plenty of Questions

I received quite a few questions during my road trip. Here are a few of them:

- I want to learn Big Data. Where should I start?
- Do I need to know SQL to learn Big Data?
- What is Hadoop?
- There are so many organizations talking about Big Data, and every one has a different approach. How do I start with Big Data?
- Do I need to know Java to learn about Big Data?
- What is the difference between the various NoSQL languages?

I will attempt to answer most of these questions during the month-long series next month.

Big Data: Big Subject

Big Data is a very big subject, and I in no way claim that I will cover every single Big Data concept in this series. However, I promise that I will share lots of the basic concepts that revolve around Big Data. We will start from the fundamentals of Big Data and continue learning from there. I will attempt to cover the simple concepts that many of you might have wondered about but were afraid to ask.

Your Role!

During this series next month, I need your help with one thing. Please keep posting any questions you have related to Big Data as blog post comments and on the Facebook page. I will monitor them closely and will try to answer them during this series.

Make sure that you do not miss a single blog post in this series, as every post will be linked to the others. You can subscribe to my feed, like my Facebook page, or subscribe via email (by entering your email in the blog post).

Reference: Pinal Dave (http://blog.SQLAuthority.com)
Filed under: Big Data, PostADay, SQL, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Big Data – Basics of Big Data Architecture – Day 4 of 21

- by Pinal Dave

In yesterday’s blog post we understood how the Big Data evolution happened. Today we will understand the basics of Big Data architecture.

Big Data Cycle

Just like every other database-related application, a Big Data project has its development cycle, though the three Vs certainly play an important role in deciding the architecture of Big Data projects. Just like every other project, a Big Data project goes through similar phases of capturing, transforming, integrating and analyzing data, and building actionable reporting on top of the data.

While the process looks almost the same, the architecture is often totally different due to the nature of the data. Here are a few of the questions everyone should ask before going ahead with a Big Data architecture.

Questions to Ask

- How big is your total database?
- What is your reporting requirement in terms of time: real time, semi real time, or at frequent intervals?
- How important is data availability, and what is the plan for disaster recovery?
- What are the plans for network and physical security of the data?
- What platform will be the driving force behind the data, and what are the different service level agreements for the infrastructure?

These are just basic questions; based on your application and business needs you should come up with your own custom list of questions to ask. As I mentioned earlier, these questions may look quite simple, but the answers will not be. When we talk about a Big Data implementation, there are many other important aspects we have to consider when we decide on the architecture.

Building Blocks of Big Data Architecture

It is impossible to discuss and nail down the optimal architecture for any Big Data solution in a single blog post; however, we can discuss the basic building blocks of Big Data architecture. Here is the image which I have built to explain how the building blocks of the Big Data architecture work.

The above image gives a good overview of how the various components of a Big Data architecture are associated with each other. In Big Data, many different data sources are part of the architecture, hence extraction, transformation and integration form one of the most essential layers of the architecture. Most of the data is stored in relational as well as non-relational data marts and data warehousing solutions. As per the business need, the data is processed and converted into proper reports and visualizations for end users. Just like the software, the hardware is one of the most important parts of a Big Data architecture: the hardware infrastructure is extremely important, and failover instances as well as redundant physical infrastructure are usually implemented.

NoSQL in Data Management

NoSQL is a very famous buzzword, and it is usually read as Not Relational SQL or Not Only SQL. This is because in a Big Data architecture the data can be in any format. It can be unstructured, relational, or in any other format, and it can come from any data source. Relational technology is not enough to bring all this data together, hence new tools, architectures and algorithms have been invented which take care of all kinds of data. Collectively, these are called NoSQL.

Tomorrow

Over the next four days we will answer the buzzword: Hadoop.

Reference: Pinal Dave (http://blog.sqlauthority.com)
Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Big Data – Basics of Big Data Analytics – Day 18 of 21

- by Pinal Dave

In yesterday’s blog post we learned the importance of the various components in the Big Data story. In this article we will understand the various analytics tasks we try to achieve with Big Data, along with a list of the important tools in the Big Data story.

When you have plenty of data around you, what is the first thing that comes to your mind? “What does all this data mean?” Exactly, the same thought comes to my mind as well. I always want to know what all the data means and what meaningful information I can get out of it. Most Big Data projects are built to retrieve the intelligence that all this data contains. Let us take the example of Facebook. When I look at my friends list on Facebook, I always want to ask many questions, such as:

- On which date do most of my friends have a birthday?
- What is the favorite film of most of my friends, so I can talk about it and engage them?
- What is the most liked place to travel among my friends?
- Which is the most disliked cuisine among my friends in India and the USA, so that when they travel, I do not take them there?

There are many more questions I can think of. This illustrates how important it is to have analysis of Big Data. Here are a few kinds of analysis you can use with Big Data.

Slicing and Dicing: This means breaking your data down into smaller sets and understanding them one set at a time. It also helps to present information in a variety of ways that users can digest. For example, if you have data related to movies, you can slice and dice the data in various ways, such as by actor or by movie length.

Real Time Monitoring: This is crucial in social media, when an event is happening and you want to measure its impact while it happens. For example, if you are using Twitter during a football match, you can watch what fans are saying about the match on Twitter while it is being played.

Anomaly Prediction and Modeling: If the business is running normally, all is well, but if there are signs of trouble, everyone wants to know about them early. Big Data analysis of various patterns can be very helpful in predicting the future. Though it may not always be accurate, certain hints and signals can be very helpful. For example, lots of data can help conclude that lots of rain increases the sale of umbrellas.

Text and Unstructured Data Analysis: Unstructured data is becoming the norm in the new world and is a big part of the Big Data revolution. It is very important that we extract, transform and load the unstructured data and make meaningful data out of it. For example, by analyzing lots of images, one can predict that people like to wear certain colors in certain months.

Big Data Analytics Solutions

There are many different Big Data analytics solutions on the market. It is impossible to list all of them, so I will list a few of them here.

- Tableau: This has to be one of the most popular visualization tools in the big data market.
- SAS: A high performance analytics and infrastructure company.
- IBM and Oracle: They have a range of tools for Big Data analysis.

Tomorrow

In tomorrow’s blog post we will discuss a very important component of the Big Data ecosystem: the Data Scientist.

Reference: Pinal Dave (http://blog.sqlauthority.com)
Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL
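The slicing and dicing idea from the analytics article, using its movie example, can be sketched in a few lines of Python. The records, the `slice_by` helper and the 120-minute threshold are all hypothetical and chosen only for illustration; real Big Data tools do the same grouping across far larger datasets.

```python
from collections import defaultdict

# Hypothetical movie records; none of these values come from the article.
movies = [
    {"title": "A", "actor": "Asha", "minutes": 95},
    {"title": "B", "actor": "Asha", "minutes": 140},
    {"title": "C", "actor": "Ravi", "minutes": 120},
]


def slice_by(records, key_func):
    """Group record titles into smaller sets keyed by key_func(record)."""
    groups = defaultdict(list)
    for record in records:
        groups[key_func(record)].append(record["title"])
    return dict(groups)


# The same data, sliced two different ways.
by_actor = slice_by(movies, lambda m: m["actor"])
by_length = slice_by(movies, lambda m: "long" if m["minutes"] >= 120 else "short")

print(by_actor)
print(by_length)
```

Each call produces a different view of the same records, which is what makes it possible to study one small set at a time.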

SQL – Business Intelligence: Derive Data or Information?

- by Pinal Dave

We all know the value of information in our lives. Whether it’s a personal decision or a business one, people need it. But the question is: who makes the distinction between data and information? We all come across a whole lot of data daily, which may or may not be significant. We filter what’s required and forget about the rest. Information is filtered and distilled data. Filtering and distillation can also alter its actual meaning and natural state. Therefore, in this blog we discover some ways to ensure that we’re using business intelligence derived from the right information when making critical management decisions.

Four key questions managers must ask themselves before making a decision:

1. Am I working with data or information?
2. What is its context?
3. How recent is it?
4. How was it derived, or what is the source?

The first question is probably the most important. You must know what you’re dealing with. If you see adjectives used and conclusions drawn, it’s information, not raw data.

Your very next concern must be whether it has been shaped to present a particular viewpoint or perspective. It makes a lot of difference if you base a decision on someone’s propaganda that distorts the real facts. Therefore, the context and the intentions of the distillation process must be clear to you.

The next consideration is whether the data is recent enough to hold any value. Since data has a very short shelf life, you must ensure that its context and value have not been lost over time.

The last consideration, and an equally important one, is how it was derived in the first place. The observer effect is what calls the shots here. The source can change the context to a great extent if the collection methodology and purpose are not clear. Gathering intelligence for decision making requires users to be keen observers and not take the information provided at face value alone.

These probing questions will allow you to make sure that you’re working with clean and accurate data, free of influence or manipulation. Only then can you be sure of deriving true business intelligence for your organization.

BI technology is also a great way to ensure the accuracy of reports. The SQL BI platform provides advanced tools and techniques for all your BI needs and concerns. Koenig Solutions offers this course along with a host of other Business Intelligence and IT courses on the latest technologies available in the market today.

Reference: Pinal Dave (http://blog.sqlauthority.com)
Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

SQL SERVER – What are Actions in SSAS and How to Make a Reporting Action

- by Pinal Dave

Actions are used for customized browsing and drilling of data by the end user. An action is an event that a user can raise while accessing the cube data. Actions are used in cube browsers like Excel and are triggered when a user in a client tool clicks on a particular member, level, dimension, cell, or maybe the cube itself. For example, a user might be able to see a Reporting Services report, open a web page, or drill through to detailed information related to the cube data.

Analysis Services supports 3 types of actions:

- Report
- Drill-through
- Standard actions

In this blog post, I will explain the reporting action. The objective of this action is to return a report with details of the products whose sales amount is greater than 10,000 in the cube browser analysis. You need to create a basic cube first with the facts and dimensions you want in the analysis. Following are the steps to create a reporting action.

1.) Go to SQL Server Data Tools and open the Analysis Services project. Navigate to Actions and click on New Reporting Action.

2.) Specify the name of the action and choose Attribute Members as the target type, since we have to create the action on the members of an attribute.

3.) Specify the target object of your report action. The target object is the dimension or attribute on which you want the report to appear. In our case it is product name.

4.) Next, define the condition under which you want the report link to appear. This is an optional feature. In this example we specify a condition that checks whether the sales amount is greater than 10,000, so that the link appears only for those products where the defined condition is met.

5.) Next, specify the name of the server on which the report is present, the report path, and the format in which you want the report to appear.

6.) Additionally, you can specify parameters. As with the conditional expression, the parameters should be valid MDX expressions. The parameter name should be the same as the one defined in the report.

7.) Deploy your solution after you are done specifying parameters, and go to the cube browser.

8.) Click on the Analyze in Excel button. This will open your cube in Excel.

9.) Build an analysis which shows product names and their sales amounts.

10.) Right-click on a product whose sales amount is greater than 10,000 and you will see the reporting action link. Click on it and you will be taken to your Reporting Services report.

11.) Clicking on the link takes you to the URL of the report. I created this report using the Report Project Wizard in SQL Server Data Tools.

So, this is how we can launch reports from a cube browser. Similarly, you can open web pages, run applications, and perform a number of other tasks. Koenig Solutions offers SSAS training which covers all of Analysis Services, including reporting, in great detail. In my next blog post I will talk about drill-through actions.

Author: Namita Sharma, Senior Corporate Trainer at Koenig Solutions.

Reference: Pinal Dave (http://blog.sqlauthority.com)
Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL
Tagged: SSAS

SQL SERVER – How to Roll Back SQL Server Database Changes

- by Pinal Dave

In a perfect scenario, no unexpected and unplanned changes occur. There are no unpleasant surprises, no inadvertent changes. However, even with all precautions and testing, there is sometimes a need to revert a structure or data change.

One of the methods that can be used in this situation is to use an older database backup that has the records or database object structure you want to revert to. For this method, you need an adequate full database backup, and a tool that helps you with comparison and synchronization is preferred.

In this article, we will focus on another method: rolling back the changes. This can be done by using:

- An option in SQL Server Management Studio
- T-SQL, or
- ApexSQL Log

The first two solutions have been described in this article. The disadvantages of these methods are that you have to know exactly when the change you want to revert happened, and that all transactions executed on the database in a specific time range are rolled back: the ones you want to undo and the ones you don’t.

How to easily roll back SQL Server database changes using ApexSQL Log?

The biggest challenge is to roll back just specific changes, not all changes that happened in a specific time range. While the SQL Server Management Studio option and T-SQL read and roll forward all transactions in the transaction log files, I will show you a solution that finds and scripts only the specific changes that match your criteria. Therefore, you don’t need to worry about all the other database changes that you don’t want to roll back.

ApexSQL Log is a SQL Server disaster recovery tool that reads transaction logs and provides a wide range of filters that enable you to easily roll back only specific data changes.

First, connect to the online database where you want to roll back the changes. Once you select the database, ApexSQL Log will show its recovery model. Note that changes can be rolled back even for a database in the Simple recovery model, when no database and transaction log backups are available. However, ApexSQL Log achieves the best results when the database is in the Full recovery model and you have a chain of subsequent transaction log backups, back to the moment when the change occurred. In this example, we will use only the online transaction log.

In the next step, use filters to read only the transactions that happened in a specific time range. To remove noise, it’s recommended to use as many filters as possible. Besides filtering by the time of the transaction, ApexSQL Log can filter by operation type and table name, as well as by transaction state (committed, aborted, running, and unknown), the name of the user who committed the change, specific field values, server process IDs, and transaction description.

You can select only the tables affected by the changes you want to roll back. However, if you’re not certain which tables were affected, you can leave them all selected and, once the results are shown in the main grid, analyze them to find the ones you want to roll back.

Once you have set the filters, you can select how to present the results. ApexSQL Log can automatically create undo or redo scripts, export the transactions into an XML, HTML, CSV, SQL, or SQL Bulk file, and create a batch file that you can use for unattended transaction log reading. In this example, I will open the results in the grid, as I want to analyze them before rolling back the transactions.

The results contain information about each transaction, as well as who made it and when. For UPDATEs, ApexSQL Log shows both old and new values, so you can easily see what has happened.

To create an UNDO script that rolls back the changes, select the transactions you want to roll back and click Create undo script in the menu. For the DELETE statement selected in this example, the undo script is:

INSERT INTO [Sales].[PersonCreditCard] ([BusinessEntityID], [CreditCardID], [ModifiedDate]) VALUES (297, 8010, '20050901 00:00:00.000')

When it comes to rolling back database changes, ApexSQL Log has a big advantage, as it rolls back only specific transactions while leaving all other transactions that occurred in the same time range intact. That makes ApexSQL Log a good solution for rolling back inadvertent data and schema changes on your SQL Server databases.

Reference: Pinal Dave (http://blog.sqlauthority.com)
Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL
Tagged: ApexSQL
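The idea behind the undo script in the ApexSQL Log article (a DELETE is reversed by an INSERT that restores the old row, while unrelated changes stay intact) can be illustrated with a small Python sketch. This is not how ApexSQL Log works internally: the tool reads the transaction log, whereas this sketch simply captures the row before deleting it. SQLite stands in for SQL Server, the `delete_with_undo` helper is hypothetical, and the second and third rows are invented.

```python
import sqlite3

# SQLite stands in for SQL Server (assumption); only row 297 comes from the article.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PersonCreditCard ("
    "BusinessEntityID INTEGER, CreditCardID INTEGER, ModifiedDate TEXT)"
)
conn.executemany(
    "INSERT INTO PersonCreditCard VALUES (?, ?, ?)",
    [(297, 8010, "20050901 00:00:00.000"), (298, 8011, "20050901 00:00:00.000")],
)

UNDO_INSERT = (
    "INSERT INTO PersonCreditCard "
    "(BusinessEntityID, CreditCardID, ModifiedDate) VALUES (?, ?, ?)"
)


def delete_with_undo(connection, business_entity_id):
    """Delete matching rows and return the statements that would restore them."""
    rows = connection.execute(
        "SELECT BusinessEntityID, CreditCardID, ModifiedDate "
        "FROM PersonCreditCard WHERE BusinessEntityID = ?",
        (business_entity_id,),
    ).fetchall()
    connection.execute(
        "DELETE FROM PersonCreditCard WHERE BusinessEntityID = ?",
        (business_entity_id,),
    )
    return [(UNDO_INSERT, row) for row in rows]


# The inadvertent change we will later want to revert.
undo_script = delete_with_undo(conn, 297)

# An unrelated change made afterwards, which must survive the rollback.
conn.execute(
    "INSERT INTO PersonCreditCard VALUES (?, ?, ?)",
    (299, 8012, "20050902 00:00:00.000"),
)

# Apply only the undo statements for the deleted row.
for statement, params in undo_script:
    conn.execute(statement, params)

restored_ids = [
    row[0]
    for row in conn.execute(
        "SELECT BusinessEntityID FROM PersonCreditCard ORDER BY BusinessEntityID"
    )
]
print(restored_ids)
```

A point-in-time restore would have discarded row 299 along with the unwanted delete; replaying a targeted undo statement keeps it, which is the advantage the article describes.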

Big Data – How to become a Data Scientist and Learn Data Science? – Day 19 of 21

- by Pinal Dave

In yesterday’s blog post we learned the importance of the analytics in Big Data Story. In this article we will understand how to become a Data Scientist for Big Data Story. Data Scientist is a new buzz word, everyone seems to be wanting to become Data Scientist. Let us go over a few key topics related to Data Scientist in this blog post. First of all we will understand what is a Data Scientist. In the new world of Big Data, I see pretty much everyone wants to become Data Scientist and there are lots of people I have already met who claims that they are Data Scientist. When I ask what is their role, I have got a wide variety of answers. What is Data Scientist? Data scientists are the experts who understand various aspects of the business and know how to strategies data to achieve the business goals. They should have a solid foundation of various data algorithms, modeling and statistics methodology. What do Data Scientists do? Data scientists understand the data very well. They just go beyond the regular data algorithms and builds interesting trends from available data. They innovate and resurrect the entire new meaning from the existing data. They are artists in disguise of computer analyst. They look at the data traditionally as well as explore various new ways to look at the data. Data Scientists do not wait to build their solutions from existing data. They think creatively, they think before the data has entered into the system. Data Scientists are visionary experts who understands the business needs and plan ahead of the time, this tremendously help to build solutions at rapid speed. Besides being data expert, the major quality of Data Scientists is “curiosity”. They always wonder about what more they can get from their existing data and how to get maximum out of future incoming data. Data Scientists do wonders with the data, which goes beyond the job descriptions of Data Analysist or Business Analysist. 
Skills Required for Data Scientists Here are a few of the skills a Data Scientist must have: expert level skills with statistical tools like SAS, Excel, R etc.; an understanding of mathematical models; hands-on experience with visualization tools like Tableau, PowerPivot, D3.js etc.; analytical skills to understand business needs; communication skills. On the technology front, any Data Scientist should know the underlying technologies (like Hadoop and Cloudera) as well as their entire ecosystem (programming languages, analysis and visualization tools etc.). Remember that to become a successful Data Scientist one must have excellent skills; just having a degree in a relevant field will not suffice. Final Note Data Scientist is indeed a very exciting job profile. As per research, there are not enough Data Scientists in the world to handle the current data explosion. In the near future data is going to expand exponentially, and the need for Data Scientists will increase along with it. It is indeed the job to focus on if you like data and the science of statistics. Courtesy: emc Tomorrow In tomorrow’s blog post we will discuss various Big Data learning resources. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
SQL SERVER – Puzzle #1 – Querying Pattern Ranges and Wild Cards

- by Pinal Dave

Note: Read at the end of the blog post how you can win one of five copies of Joes 2 Pros Book #1 and a surprise gift. I have been blogging for almost 7 years and every other day I receive questions about Querying Pattern Ranges. The most common way to solve the problem is to use wildcards. However, not everyone knows how to use wildcards properly. SQL Queries 2012 Joes 2 Pros Volume 1 – The SQL Queries 2012 Hands-On Tutorial for Beginners: Book On Amazon | Book On Flipkart. Learn SQL Server, get all the five parts combo kit: Kit on Amazon | Kit on Flipkart. Many people know wildcards are great for finding patterns in character data. There are also some special sequences with wildcards that can give you even more power. This series from SQL Queries 2012 Joes 2 Pros® Volume 1 will show you some of these cool tricks. All supporting files are available with a free download from the www.Joes2Pros.com web site. This example is from the SQL 2012 series Volume 1 in the file SQLQueries2012Vol1Chapter2.2Setup.sql. If you need help setting up, then look in the “Free Videos” section on Joes2Pros under “Getting Started” for the video called “How to install your labs”. Querying Pattern Ranges The % wildcard character represents any string of zero or more characters. Let’s find all first names that end in the letter ‘A’. By using the percentage ‘%’ sign with the letter ‘A’, we achieve this goal using the code sample below:
SELECT * FROM Employee WHERE FirstName LIKE '%A'
To find all FirstName values beginning with the letters ‘A’ or ‘B’ we can use two predicates in our WHERE clause, separating them with the OR operator. 
Finding names beginning with an ‘A’ or ‘B’ is easy, and this works fine until we want a larger range of letters, as in the example below for ‘A’ through ‘K’:
SELECT * FROM Employee WHERE FirstName LIKE 'A%' OR FirstName LIKE 'B%' OR FirstName LIKE 'C%' OR FirstName LIKE 'D%' OR FirstName LIKE 'E%' OR FirstName LIKE 'F%' OR FirstName LIKE 'G%' OR FirstName LIKE 'H%' OR FirstName LIKE 'I%' OR FirstName LIKE 'J%' OR FirstName LIKE 'K%'
The previous query does find FirstName values beginning with the letters ‘A’ through ‘K’. However, when a query requires a large range of letters, the LIKE operator has an even better option. Since the first letter of the FirstName field can be ‘A’, ‘B’, ‘C’, ‘D’, ‘E’, ‘F’, ‘G’, ‘H’, ‘I’, ‘J’ or ‘K’, simply list all these choices inside a set of square brackets followed by the ‘%’ wildcard, as in the example below:
SELECT * FROM Employee WHERE FirstName LIKE '[ABCDEFGHIJK]%'
A more elegant form of this technique recognizes that all these letters are in a continuous range, so we really only need to list the first and last letter of the range, separated by a hyphen, inside the square brackets, followed by the ‘%’ wildcard allowing for any number of characters after the first letter in the range:
SELECT * FROM Employee WHERE FirstName LIKE '[A-K]%'
Note: A predicate that uses a range will not work with the ‘=’ operator (equals sign). It will not raise an error, but it will not return the rows you expect either, because ‘=’ compares FirstName against the literal string '[A-K]%' instead of treating it as a pattern.
--Bad query (will not error or return any records)
SELECT * FROM Employee WHERE FirstName = '[A-K]%'
Question: You want to find all first names that start with the letters A-M in your Customer table and end with the letter Z. Which SQL code would you use?
a. SELECT * FROM Customer WHERE FirstName LIKE 'm%z'
b. SELECT * FROM Customer WHERE FirstName LIKE 'a-m%z'
c. SELECT * FROM Customer WHERE FirstName LIKE 'a-m%z'
d. SELECT * FROM Customer WHERE FirstName LIKE '[a-m]%z'
e. SELECT * FROM Customer WHERE FirstName LIKE '[a-m]z%'
f. SELECT * FROM Customer WHERE FirstName LIKE '[a-m]%z'
g. SELECT * FROM Customer WHERE FirstName LIKE '[a-m]z%'
Contest Leave a valid answer before June 18, 2013 in the comment section. 5 winners will be selected from all the valid answers and will receive Joes 2 Pros Book #1. 1 lucky person will get a surprise gift from Joes 2 Pros. The contest is open to all the countries where Amazon ships the book (USA, UK, Canada, India and many others). Special Note: Read all the options before you provide a valid answer, as there is a small trick hidden in the answers. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Joes 2 Pros, PostADay, SQL, SQL Authority, SQL Puzzle, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology
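The wildcard and bracket-range rules from the pattern-range post above can be tried out without a SQL Server instance. The following is only a rough sketch in Python: the helper names `_like_to_regex` and `like` are made up for illustration, the translation covers just `%`, `_`, `[...]` and `[^...]`, and case-insensitive matching is an assumption that mirrors a typical case-insensitive collation, not a guarantee about any particular server.

```python
import re


def _like_to_regex(pattern):
    """Translate a T-SQL style LIKE pattern into a regex source string.

    Simplified sketch: handles %, _, [set], [a-k] ranges and [^set] only.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        # Look for a closing bracket that leaves at least one character inside.
        close = pattern.find("]", i + 2) if ch == "[" else -1
        if ch == "%":
            out.append(".*")          # any string of zero or more characters
        elif ch == "_":
            out.append(".")           # exactly one character
        elif ch == "[" and close != -1:
            body = pattern[i + 1:close]
            negate = body.startswith("^") and len(body) > 1
            if negate:
                body = body[1:]
            # Escape only what is special inside a regex character class.
            safe = "".join("\\" + c if c in "\\^[" else c for c in body)
            out.append("[" + ("^" if negate else "") + safe + "]")
            i = close                 # skip past the closing bracket
        else:
            out.append(re.escape(ch))  # everything else is a literal
        i += 1
    return "".join(out)


def like(value, pattern):
    """Return True when value matches the LIKE-style pattern as a whole."""
    regex = _like_to_regex(pattern)
    return re.fullmatch(regex, value, re.IGNORECASE | re.DOTALL) is not None


if __name__ == "__main__":
    for name in ["Adam", "Kim", "Liz", "Zed"]:
        print(name, like(name, "[A-K]%"), like(name, "[a-m]%z"))
```

Without the square brackets, `a-m%z` is matched literally (the name would have to start with the three characters `a-m`), and a plain equality test against `'[A-K]%'` only matches a name that is exactly that string.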

Read the article
Developer Training – 6 Online Courses to Learn SQL Server, MySQL and Technology

- by Pinal Dave

Video courses are the next big thing, and I am so happy that I have so far authored 6 different video courses with Pluralsight. I have listed all of my video courses here. Note: If you click on a course and it does not open, you need to log in to Pluralsight with a valid username and password or sign up for a FREE trial. Please leave a comment with your favorite course in the comment section. 10 random winners will get a surprise gift via email. Bonus: If you list your favorite module from the course site. SQL Server Performance: Introduction to Query Tuning SQL Server performance tuning is an in-depth topic, and an art to master. A key component of overall application performance tuning is query tuning. Writing queries in an efficient manner, and making sure they execute in the most optimal way possible, is always a challenge. The basics revolve around the details of how SQL Server carries out query execution, so the optimizations explored in this course follow along the same lines. Click to View Course SQL Server Performance: Indexing Basics Indexes are the most crucial objects of the database. They are the first stop for any DBA and Developer when it comes to performance tuning. There is a good side as well as an evil side to indexes. To master the art of performance tuning one has to understand the fundamentals of indexes and the best practices associated with them. This course is for every DBA and Developer who deals with performance tuning and wants to use indexes to improve the performance of the server. Click to View Course SQL Server Questions and Answers This course is designed to help you better understand how to use SQL Server effectively. The course presents many of the common misconceptions about SQL Server, and then carefully debunks those misconceptions with clear explanations and short but compelling demos, showing you how SQL Server really works. 
This course is for anyone working with SQL Server databases who wants to improve her knowledge and understanding of this complex platform. Click to View Course MySQL Fundamentals MySQL is a popular choice of database for use in web applications, and is a central component of the widely used LAMP open source web application software stack. This course covers the fundamentals of MySQL, including how to install MySQL as well as how to write basic data retrieval and data modification queries. Click to View Course Building a Successful Blog Expressing yourself is the most common behavior of humans. Blogging has made it easy to express yourself. Just like a letter or book has a structure and formula, blogging also has a structure and formula. In this introductory course on blogging we will go over a few of the basics of blogging and show the way to get started with blogging immediately. If you already have a blog, this course will be even more relevant, as it will discuss many of the common questions and issues you face in your blogging routine. Click to View Course Introduction to ColdFusion ColdFusion is a rapid web application development platform. In this course you will learn the basics of how to use the ColdFusion platform and rapidly develop web sites. The course begins with learning the basics of ColdFusion Markup Language and moves to common development language practices. From there we move to frequent database operations and advanced concepts of Forms, Sessions and Cookies. The last module sums up all the concepts covered in the course with a sample application. Click to View Course Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQL Training, T SQL, Technology

Read the article
Big Data – Operational Databases Supporting Big Data – Columnar, Graph and Spatial Database – Day 14 of 21

- by Pinal Dave

In yesterday’s blog post we learned the importance of the Key-Value Pair Databases and Document Databases in the Big Data Story. In this article we will understand the role of Columnar, Graph and Spatial Databases in supporting the Big Data Story. Now we will see a few examples of the operational databases: Relational Databases (the day before yesterday’s post); NoSQL Databases (the day before yesterday’s post); Key-Value Pair Databases (yesterday’s post); Document Databases (yesterday’s post); Columnar Databases (today’s post); Graph Databases (today’s post); Spatial Databases (today’s post). Columnar Databases A relational database is a row store or row oriented database. Columnar databases are column oriented or column store databases. As we discussed earlier, in Big Data we have different kinds of data and we need to store different kinds of data in the database. With a columnar database this is very easy to do, as we can just add a new column. HBase is one of the most popular columnar databases. It stores its data on the Hadoop file system (HDFS) and works with MapReduce for processing. However, remember this is not a good solution for every application. It is particularly good for databases where a high volume of incremental data is gathered and processed. Graph Databases For highly interconnected data, a Graph Database is a suitable choice. This database has a node-relationship structure. Nodes and relationships contain Key-Value Pairs where the data is stored. The major advantage of this database is that it supports faster navigation among various relationships. For example, Facebook uses a graph database to list and demonstrate various relationships between users. Neo4j is one of the most popular open source graph databases. 
One of the major disadvantages attributed to the Graph Database is that self-referencing (self joins in RDBMS terms) may not be supported, and there might be real world scenarios where this is required. Spatial Databases We all use Foursquare, Google+ as well as Facebook Check-ins for location aware check-ins. All the location aware applications figure out the position of the phone with the help of the Global Positioning System (GPS). Think about it: so many different users at different locations in the world, all checking in together. Additionally, the applications are now feature rich and users are demanding more and more information from them, for example movies, coffee shops or places to see. They are all running with the help of Spatial Databases. Spatial data are standardized by the Open Geospatial Consortium, known as OGC. Spatial data helps answer many interesting questions like “Distance between two locations, area of interesting places etc.” When we think of it, it is very clear that handling spatial data and returning meaningful results is one big task when there are millions of users moving dynamically from one place to another and requesting various spatial information. The PostGIS/OpenGIS suite is a very popular spatial database. It runs as a layer on top of the RDBMS PostgreSQL. This makes it unique, as it offers the best of both worlds. Courtesy: mushroom network Tomorrow In tomorrow’s blog post we will discuss a very important component of the Big Data Ecosystem – Hive. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL
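The “distance between two locations” question mentioned in the spatial database post can be illustrated with a small calculation. This is a minimal sketch in Python using the haversine great-circle formula on a spherical Earth; it is not how PostGIS or any particular spatial database computes distance, and the function name `haversine_km`, the radius constant and the sample coordinates are all assumptions chosen for illustration.

```python
import math

# Mean Earth radius in kilometres; treating the Earth as a sphere is an
# approximation, so results can differ slightly from ellipsoidal methods.
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


if __name__ == "__main__":
    # Hypothetical check-in locations, roughly Chennai and Bangalore.
    print(round(haversine_km(13.08, 80.27, 12.97, 77.59), 1), "km")
```

A spatial database adds indexing on top of calculations like this, so that “what is near me” queries do not have to compare every pair of points.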

Read the article
Big Data – Interacting with Hadoop – What is Sqoop? – What is Zookeeper? – Day 17 of 21

- by Pinal Dave

In yesterday’s blog post we learned the importance of Pig and Pig Latin in the Big Data Story. In this article we will understand what Sqoop and ZooKeeper are in the Big Data Story. There are two important components one should learn about when learning to interact with Hadoop – Sqoop and ZooKeeper. What is Sqoop? Most businesses store their data in RDBMS as well as other data warehouse solutions. They need a way to move data to the Hadoop system to do various processing and to return it from the Hadoop system back to the RDBMS. The data movement can happen in real time or in bulk at various intervals. We need a tool which can help us move this data from SQL to Hadoop and from Hadoop to SQL. Sqoop (SQL to Hadoop) is such a tool: it extracts data from non-Hadoop data sources, transforms it into a format which Hadoop can use, and then loads it into HDFS. Essentially it is an ETL tool which Extracts, Transforms and Loads from SQL to Hadoop. The best part is that it also extracts data from Hadoop and loads it into non-Hadoop (RDBMS) data stores. Essentially, Sqoop is a command line tool which does SQL to Hadoop and Hadoop to SQL. It is a command line interpreter. It creates MapReduce jobs behind the scenes to import data from an external database into HDFS. It is a very effective and easy to learn tool for nonprogrammers. What is ZooKeeper? ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. In other words, ZooKeeper is a replicated synchronization service with eventual consistency. In simpler words – in a Hadoop cluster there are many different nodes and one node is the master. Let us assume that the master node fails for any reason. In this case, the role of the master node has to be transferred to a different node. The main role of the master node is managing the writers, as that task requires the order of writes to be preserved. 
In this kind of scenario ZooKeeper will assign a new master node and make sure that the Hadoop cluster performs without any glitch. ZooKeeper is Hadoop’s method of coordinating all the elements of these distributed systems. Here are a few of the tasks which ZooKeeper is responsible for. ZooKeeper manages the entire workflow of starting and stopping various nodes in the Hadoop cluster. In a Hadoop cluster, when any process needs certain configuration to complete a task, ZooKeeper makes sure that the node gets the necessary configuration consistently. In case the master node fails, ZooKeeper can assign a new master node and make sure the cluster works as expected. There are many other tasks ZooKeeper performs when it comes to the Hadoop cluster and communication. Without the help of a coordination service such as ZooKeeper, it is considerably harder to design any new fault tolerant distributed application. Tomorrow In tomorrow’s blog post we will discuss a very important component of the Big Data Ecosystem – Big Data Analytics. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL
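The master failover described in the ZooKeeper post above can be pictured with a toy simulation. The sketch below is plain in-memory Python and does not use the ZooKeeper client API at all; the class `ToyCluster` and its methods are hypothetical, and the “lowest sequence number becomes master” rule is an assumption borrowed from the commonly described leader-election recipe, where each node registers with an increasing sequence number.

```python
class ToyCluster:
    """In-memory stand-in for a coordination service (illustration only)."""

    def __init__(self):
        self._next_seq = 0
        self._nodes = {}  # node name -> sequence number assigned at join time

    def join(self, name):
        """Register a node and hand it the next sequence number."""
        self._nodes[name] = self._next_seq
        self._next_seq += 1

    def fail(self, name):
        """Simulate a node crash: its registration disappears."""
        self._nodes.pop(name, None)

    def master(self):
        """The live node with the lowest sequence number acts as master."""
        if not self._nodes:
            return None
        return min(self._nodes, key=self._nodes.get)


if __name__ == "__main__":
    cluster = ToyCluster()
    for node in ["node-a", "node-b", "node-c"]:
        cluster.join(node)
    print("master:", cluster.master())
    cluster.fail("node-a")  # the master goes down
    print("master after failure:", cluster.master())
```

In a real deployment the registrations would be tied to live sessions, so a crashed node drops out automatically and the remaining nodes are notified; the toy version only shows how the master role moves to another node.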

Read the article
SQL Authority News – Secret Tool Box of Successful Bloggers: 52 Tips to Build a High Traffic Top Ranking Blog

- by Pinal Dave

When I started this blog, it was meant as a bookmark for myself for helpful tips and tricks. Gradually, it grew into a blog that others were reading and commenting on. While SQL and databases are my first love and the reason I started this blog, the side effect was that I discovered I loved writing. I discovered a secret goal I didn’t even know I wanted – I wanted to become an author. For a long time, writing this blog satisfied that urge. Gradually, though, I wanted to see my name in print. 12th Book Over the past few years I have authored and co-authored a number of books – they are all based on my knowledge of SQL Server, and were meant to spread my years of experience into the world, to share what I have learned with my community. I currently have eleven of these “manuals” available for sale. As exciting as it was to see my name in print, I still felt that there was more I could do as an author. That is when I realized that I am more than just a SQL expert. I have been writing this blog now for more than 10 years, and it grew from a personal bookmark to a thriving website with over 2 million views per month. I thought to myself “I could write a book about how to create a successful blog!” And that is exactly what I did. I am extremely excited to share with all of you my new book – “Secret Toolbox of Successful Bloggers.” A Labor of Love This project has been a labor of love for me. It started out as a series for this blog – I would post one article a week until I felt the topic had been covered. I found that as I wrote, new topics kept popping up in my mind, and eventually this small blog series grew into a full book. The blog series was large enough to last a whole year, so I definitely thought that it could be a full book. Ideas on how to become a successful blogger were so frequent that, I will admit, I feel like there is so much I left out of this book. I had a lot more to say than I originally thought! 
I am so excited to be sharing this book with all of you. I am so passionate about this topic, and I feel like there are so many people who can benefit from this book. I know that when I started this blog, I did not know what I was doing, and I would have loved a “helping hand” to tell what to do and what not to do. If this book can act that way to any of my readers, I feel it is a success. Rules of Thumb If you are interested in the topic of becoming a blogger, as you read this book, keep in mind that it is suggestions only. Blogging is so new to the world that while there are “rules of thumb” about what to do and what not to do, a map of steps (“first, do x, then do y”) is not going to work for every single blogger. This book is meant to encourage new bloggers to put their content out there in the world, to be brave and create a community like the one I have here at SQL Authority. I have gained so much from this community, I wanted to give something back, and this book is just one small part. I hope that everyone who reads this books finds at least one helpful tip, and that everyone can experience the joy of blogging. That is the whole reason I wrote this book, and what I hope everyone takes away from it. Where Can You Get It? You can get the book from following URL: Kindle eBook | Print Book Reference: Pinal Dave (http://blog.SQLAuthority.com)Filed under: About Me, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQLAuthority Author Visit, T SQL

Read the article
SQLAuthority News – Advantages of Distance Learning

- by Pinal Dave

Distance education is extremely popular – almost overnight, it seems. Almost everyone has taken an online course, or knows someone who has, or is considering joining an online school. There are many advantages and disadvantages to attending an online school – but the same can be said of attending a physical school! Let’s take a look at the top reasons to use distance education. 1) Flexibility. Physical universities are usually willing to make some concessions to student – like night classes, study hours, and online networks. However, nothing is going to beat the flexibility of distance education. You can attend classes and take notes anytime, anywhere, wearing anything you’d like! 2) Affordability. We don’t need to get into hard numbers to understand how an expensive university can be. Students are taking on more and more debt just to get an education. Many of these fees pay for room, board, and facilities. Distance education cuts out all these costs, and makes attending school much more affordable for the average student. 3) Try before you buy. Did you know that the average college student changes his or her major 10 times before they graduate? You can imagine that this kind of indecision plays a huge part in WHEN you graduate – not being able to make up your mind can cost you big bucks if you have to stay in school for extra years! Distance education allows you to take different classes from a wide range of disciplines. Do you want to study forensic science or English literature? Now you don’t have to pay for classes you can’t afford just to find out. 4) Pace yourself. Some students struggle in a traditional classroom setting – classes can be taught too fast, too slow, or there are too many distractions. Distance education allows mature students to set the pace themselves. 
They can rewatch lectures they didn’t catch the first time, or go through classes quickly if they are already familiar with the material – cutting out the chance of burning out or getting bored. 5) Lifelong learning. Maybe you already have a degree, but would like to learn more about your field, or a related field, or maybe even about something completely unrelated – just because you are curious! Distance education allows you to learn whatever you want, whenever you want (and yes, wearing anything you’d like!). 6) Attend whatever college you want. Because of the popularity of distance education, physical campuses are getting in on the game by offering online courses – often just uploaded versions of classes already taught at their campus. Ever wanted to attend Harvard, but knew you couldn’t get in? Take a class online! Of course, you probably should not attempt to lie and say you have a Harvard degree, but Ivy League colleges are prestigious because they are the best in their field – take advantage of the best by taking an online course! I am a big believer in continuing education, whether it is online courses, returning to school, or even taking informal classes online. Distance education can be a great way to accomplish these goals and become a lifelong learner. My friends at provides training through virtual classrooms for students who want to avoid travelling. Distance learning courses allow IT aspirants to connect with trainers using the internet. I encourage everyone to check it out! Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQL Training, T SQL, Technology

Read the article
SQL Authority News – FalafelCON 2014: 2 days with the Best Developers in the World

- by Pinal Dave

I love presenting at various forums on various technologies. I am extremely excited that I got invited to speak at Falafel Conference 2014 in San Francisco. I will present two technology sessions on SQL Server. If you are into web development, or if you just want to attend a conference with the best speakers in the industry, this may be the right conference for you. What sets this conference apart from other conferences is the technology presented as well as the speakers. Usually one has to attend a very expensive, large-scale event to hear good speakers. At this conference, you will find quite a few industry legends available to present on bleeding edge technology. Here are a few of the reasons why I believe you should attend this conference: Choose from four tracks covering Web, Mobile development and testing, Sitefinity, and Automated Testing, or attend sessions from all four! Learn from the best developers and testers in the business in an intimate setting. Surround yourself with your peers and the opportunity to network. Learn about the latest platforms and technologies including Kendo UI, AngularJS, ASP.NET MVC, WebAPI, and more! Here are the details for the sessions which I am going to present at Falafel Conference. Secrets of SQL Server: Database Worst Practices Abstract: Chances are you have heard, or even uttered, this expression. This demo-oriented session will show many examples where database professionals were dumbfounded by their own mistakes, and could even bring back memories of your own early DBA days. The goal of this session is to expose the small details that can be dangerous to the production environment and SQL Server as a whole, as well as talk about worst practices and how to avoid them. Shedding light on some of these perils and the tricks to avoid them may even save your current job. 
After attending this session, Developers will only need 60 seconds to improve the performance of the database server in their SharePoint implementation. We will have a quiz during the session to keep the conversation alive. Developers will walk out with scripts and knowledge that can be applied to their servers immediately after the session. Additionally, all attendees of the session will have access to the learning material presented in the session. The Unsung Hero Abstract: Slow running queries are the most common problem that developers face while working with SQL Server. While it is easy to blame SQL Server for unsatisfactory performance, the issue often lies in the way queries have been written and how indexes have been set up. The session will focus on ways of identifying problems that slow down SQL Server, and indexing tricks to fix them. Developers will walk out with scripts and knowledge that can be applied to their servers immediately after the session. Register Now! I have learned from the Falafel Team that they are running out of tickets and will soon close the registration. For the next 10 days the price for the registration is only USD 149. Trust me, you can’t get such a world class training and networking opportunity at such a low price. Click to Register Here! Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQLAuthority News, T SQL

Read the article
SQL SERVER – How to Get SQL Server Restart Notification?

- by Pinal Dave

A few days back my friend called me to ask if there is any tool which can be used to get a restart notification for SQL Server in their environment. I told him that SQL Server can do it by itself with some configuration. He was happy and surprised to know that he need not spend any extra money. In SQL Server, we can configure stored procedure(s) to run at start-up of SQL Server. This blog post gives the steps to achieve it. There are many situations where this feature can be used. Below are a few: logging SQL Server startup timings; modifying data in some table during startup (e.g. a table in tempdb); sending a notification about SQL Server start. Step 1 – Enable ‘scan for startup procs’ This can be done using either T-SQL or the user interface of Management Studio.
EXEC sys.sp_configure N'Show Advanced Options', N'1'
GO
RECONFIGURE WITH OVERRIDE
GO
EXEC sys.sp_configure N'scan for startup procs', N'1'
GO
RECONFIGURE WITH OVERRIDE
GO
Below is the interface to change the setting. We need to go to “Server” > “Properties” and use the “Advanced” tab. “Scan for Startup Procs” is the parameter under the “Miscellaneous” section as shown below. We need to set the value to “True” and hit OK. Step 2 – Create stored procedure It’s important to note that the procedure is executed after recovery is finished for ALL databases. Here is a sample stored procedure, to be created in the master database. You can use your own logic in the procedure.
CREATE PROCEDURE SQLStartupProc AS BEGIN CREATE TABLE ##ThisTableShouldAlwaysExists (AnyColumn INT) END
Step 3 – Set Procedure to run at startup We need to use sp_procoption to mark the procedure to run at startup. Here is the code to let SQL Server know that this is a startup procedure.
EXEC sp_procoption 'SQLStartupProc', 'startup', 'true'
This can be used only for procedures in the master database. Otherwise the following error is raised: Msg 15398, Level 11, State 1, Procedure sp_procoption, Line 89 Only objects in the master database owned by dbo can have the startup setting changed. 
We also need to remember that such a procedure must not have any input/output parameters. Here is the error which would be raised otherwise. Msg 15399, Level 11, State 1, Procedure sp_procoption, Line 107 Could not change startup option because this option is restricted to objects that have no parameters. Verification Here is the query to find which procedures are marked as startup procedures.
SELECT name FROM sys.objects WHERE OBJECTPROPERTY(object_id, 'ExecIsStartup') = 1
Once this was done, I restarted the SQL Server instance, and here is what we see in the SQL ERRORLOG: Launched startup procedure 'SQLStartupProc'. This confirms that the stored procedure is executed. You can also notice that this is done after all databases are recovered: Recovery is complete. This is an informational message only. No user action is required. After a few days my friend called me again and asked – how do I turn this OFF? Use the comments section and post the answer for him. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQL Utility, T SQL

Read the article
Big Data – ClustrixDB – Extreme Scale SQL Database with Real-time Analytics, Releases Software Download – NewSQL

- by Pinal Dave

There are so many things to learn and there is so little time we all have. As we have little time, we need to be selective about what we learn. I believe I know quite a lot of things in SQL but I still do not know what is around SQL. I have started to learn about NewSQL recently. If you wonder what NewSQL is, I encourage all of you to read my blog post about NewSQL over here: Big Data – Buzz Words: What is NewSQL – Day 10 of 21. NewSQL databases are quickly becoming popular – providing the scale of NoSQL with SQL features and transactions. As a part of learning about NewSQL databases, I have recently started to learn about ClustrixDB. ClustrixDB has been the most mature NewSQL database, used by some of the largest internet sites in the world for over 3 years, with extensive SQL support. In addition to scale, it provides fast real-time analytics by bringing massively parallel processing (MPP), previously available only in warehousing databases, to the transactional database. The reason I am more intrigued about learning ClustrixDB is their recent announcement on Oct 31. ClustrixDB was only available as an appliance, but now, with their software release on Oct 31, everyone can use it. It is now available forever free for up to 12 cores with community support, and there is a 45 day trial for unlimited cluster sizes. With the forever free offer, I am indeed interested in ClustrixDB now. I know that a few of the leading eCommerce sites in the world use it for their transactional database. Here are a few of the details I have quickly noted for ClustrixDB. ClustrixDB allows users to: scale by simply adding nodes to the cluster with a single command; run billions of transactions a day; run fast real-time analytics; achieve high availability with recovery from node failure; let the database manage itself; easily migrate from MySQL, as it is nearly plug-and-play compatible and uses MySQL drivers, tools and replication. 
While going through the documentation, I realized that ClustrixDB also has extensive support for SQL features, including complex queries involving joins on a dozen or more tables, aggregates, sorts and sub-queries. It also supports stored procedures, triggers, foreign keys, partitioned and temporary tables, and fully online schema changes. It is indeed a very mature product and SQL solution.

Since Clustrix sounds like a very promising solution, I decided to dig a bit deeper to understand who the current customers of Clustrix are, as the company has been in the industry for quite a few years. Their client list is indeed very interesting, and here is my quick research about them:

- Twoo.com: Europe’s largest social discovery (dating) site runs 4.4 billion transactions a day, with table sizes over a terabyte, on a 168-core cluster.
- EngageBDR: a top 3 company in the online advertising category uses ClustrixDB to serve 6.9 billion ads a day through its real-time bidding platform. Their reports went from 4 hours to 15 seconds.
- NoMoreRack: the second fastest growing e-commerce company in the US used ClustrixDB for high availability and fast growth through the Amazon cloud.
- MakeMyTrip: India’s leading travel site runs on ClustrixDB, with two clusters running as multi-master in Chennai and Bangalore.

Many enterprises, such as AOL, CSC, Rakuten and Symantec, use ClustrixDB when their applications need scale.

I must admit that I am impressed with the information I have learned so far, and now is the time to get some hands-on experience with the product. I want to learn this technology so that in the future, when the conversation turns to NewSQL, I know what I am talking about. Read more about why Clustrix believes ClustrixDB might be the right database for you. Download ClustrixDB with me today and install it on your machine, so that when we discuss its technical aspects in the future, we are all on the same page. The software can be downloaded here.
Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Big Data, MySQL, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: Clustrix

Developer’s Life – Summary of Superhero Articles

- by Pinal Dave

Earlier this year, I wrote an article series in which I talked about the developer’s life and compared it with that of a superhero. I got an amazing response to this series, and I have been receiving quite a lot of email suggesting that I write more blog posts in it. I am not currently planning to write more of these posts, but I will soon continue with another series. In this blog post, I have summarized the entire series. Let me know if you want me to write about any particular superhero, and I will see what I can do about that hero.

Developer’s Life – Every Developer is a Captain America
Captain America was first created as a comic book character in the 1940s as a way to boost morale during World War II. Aimed at a children’s audience, his legacy faded away when the war ended. However, he has recently had a major reboot and become a popular movie character who deals with modern issues.

Developer’s Life – Every Developer is the Incredible Hulk
The Incredible Hulk is possibly one of the scariest superheroes out there. All superheroes are meant to be “out of this world” and awe-inspiring, but I think most people will agree when I say the Hulk takes this to the next level. He is the result of an industrial accident, which is scary enough in its own right. Plus, when mild-mannered Bruce Banner is angered, he transforms into a destructive monster that he cannot control and has no memories of.

Developer’s Life – Every Developer is a Wonder Woman
We have focused a lot lately on this “superhero series.” I love fantasy books and movies, and I feel there is a lot to be learned from them. As I am writing this series, though, I have noticed that every superhero I write about is a man. So today, I would like to talk about the major female superhero: Wonder Woman.

Developer’s Life – Every Developer is a Harry Potter
Harry Potter might not be a superhero in the traditional sense, but I believe he still has a lot to teach us and show us about life as a developer.
If you have been living under a rock for the last 17 years, you might not know that Harry Potter is the main character in an extremely popular series of books and movies documenting the education and tribulations of a young wizard (and his friends).

Developer’s Life – Every Developer is Like Transformers
Transformers may not be superheroes: they don’t wear capes, they don’t have amazing powers outside of their size and folding ability, and they’re not even human (technically). Part of their enduring popularity is that while we are enjoying over-the-top movies, we are learning about good leadership and strong personal skills.

Developer’s Life – Every Developer is a Iron Man
Iron Man is another superhero who is not naturally “super,” but relies on his brain (and money) to turn himself into a fighting machine. While traditional superheroes are still popular, a three-movie franchise and incorporation into the new Avengers series show that Iron Man is popular enough on his own.

Developer’s Life – Every Developer is a Sherlock Holmes
I have been thinking a lot about how developers are like superheroes, and I have now written two blog posts comparing them to Spiderman and Superman. I have a lot of love and respect for developers, and I hope that they are enjoying these articles and that others are learning a little bit about the profession. There is another fictional character who, while not technically a superhero, is very powerful, and who I think also stands as a good example of a developer. That character is Sherlock Holmes. Sherlock Holmes is a British detective, first made popular in the late 19th century by author Sir Arthur Conan Doyle. The original Sherlock Holmes was a brilliant detective who could solve the most mind-boggling crime through simple observation and deduction.

Developer’s Life – Every Developer is a Chhota Bheem
Chhota Bheem is a cartoon character who is extremely popular where I live. He is one of my daughter’s favorite characters.
I like to say that children love Chhota Bheem more than their parents, so it is lucky for us that he is not real! Children love Chhota Bheem because he is the absolute “good guy.” He is smart, loyal, and strong. He and his friends live in Dholakpur and fight off their many enemies, and always win, in every episode. In each episode, they learn something about friendship, bravery, and being kind to others. Chhota Bheem is a good role model for children, and I think that he is a good role model for developers as well.

Developer’s Life – Every Developer is a Batman
Batman is one of the darkest superheroes in the fantasy canon. He does not come to his powers through any sort of magical coincidence or radioactive insect, but through a lot of psychological scarring caused by witnessing the death of his parents. Despite his dark back story, he possesses a lot of admirable abilities that I feel bear comparison to developers.

Developer’s Life – Every Developer is a Superman
I enjoyed comparing developers to Spiderman so much that I have decided to continue the trend and encourage some of my favorite people (developers) with another favorite superhero: Superman. Superman is probably the most famous superhero, and one of the most inspiring.

Developer’s Life – Every Developer is a Spiderman
I have to admit, Spiderman is my favorite superhero. The most recent movie was recently released in theaters, so it has been at the front of my mind for some time. Spiderman was my favorite superhero even before the latest movie came out, but of course I took my whole family to see the movie as soon as I could! Every one of us loved it, including my daughter. We all left the movie thinking how great it would be to be Spiderman. So, with that in mind, I started thinking about how we are like Spiderman in our everyday lives, especially developers.

I would like to know which superhero is your favorite!
Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: Developer, Superhero

SQLAuthority News – Don’t Be Afraid To Fool The World – Video by John Sonmez

- by Pinal Dave

Sometimes certain words and statements grab your attention, and it is hard to stop thinking about them afterwards. Something similar happened a few days ago when I read a Twitter statement by my friend and Pluralsight author John Sonmez. A few days ago he tweeted a very interesting statement: “I don’t know a single successful person, who doesn’t deep down think that have the world fooled. #fooltheworld” by John Sonmez. When I read it, I was extremely intrigued. I read it many times, I shared it with my family, and I just could not stop interpreting it. It was indeed fun to read it again and again, and there are so many different meanings one can take away from the statement.

I know John very well. He is a wonderful person and has a very positive energy for life. I just had to ask him to build a video around it. Five days after my request, John created a wonderful video on this subject, and I watched it multiple times. I am not going to write much about what is in the video, as I suggest you watch the video itself. Here is one personal story I want to share which is absolutely relevant to this video. I think my story resonates 100% with John’s story.

A Real Story from My Past
Three years ago, I submitted a SQL Server session to a SharePoint conference. My session was accepted and I prepared it very well. I put more than two months into preparing for the session and I was very excited to present it. I traveled thousands of miles to reach the event. However, there was a little mix-up with the session. There were multiple sessions with titles similar to mine. One of the other speakers had also proposed a database-related session, which was selected. When the material went to print, the printing team got confused and swapped the sessions by mistake.
The other speaker got the Performance with SQL Server session, and I received the Performance with SharePoint session. It was indeed a big mix-up, but that is how it appeared in the event guide, and it was marketed the same way everywhere at the event.

A Big Mix Up
I talked with the event organizer, and we came to the conclusion that we all had good intentions but things had just got mixed up, and now was the time when “the show must go on.” I had a great amount of hesitation about presenting the session, as I had personally never worked so closely with SharePoint in my life, and my session abstract talked about SharePoint tricks in depth. Two hours before the session, I took the help of one of my friends and installed SharePoint on my box. He showed me a few things here and there, but there was never enough time to learn everything I wanted to learn.

The Moments of Confidence
I was very scared and nervous about going on stage, as SharePoint was not something I felt comfortable with. However, I decided to go on stage with confidence, as a SharePoint expert. Though I did not know SharePoint at its best, I had confidence that whatever I knew was correct and that I would not misguide people. I had no intention of fooling people, but I also had no intention of telling the audience that I was a fool and that they had all wasted their time and money attending my session. I decided to be honest, but at the same time decided to take the session beyond my expertise.

The sixty minutes of the session went very well, and I was able to handle all the difficult questions at a satisfactory level. When the session was over, my feeling was that I would not have presented or talked any differently if I had had more knowledge of SharePoint at that time. I think it was one of my best sessions, and this was reflected in the session feedback as well. I was the best speaker across all the tracks and my session had the highest ranking. I was delighted and I learned a very valuable lesson.
I must go beyond my limits and knowledge. I must aim higher and work harder. I should not lie, but I should have confidence that I have a good heart and that I put 100% into my efforts.

Lessons Learned
Since this incident I have learned a lot about SharePoint, and I am now a regular speaker at various SharePoint conferences along with my SQL Server sessions. I am motivated and I am not afraid. I know people have lots of expectations of me, but I have learned not to judge myself before I do my best. I leave the judgement of my efforts to my audience. I do not take the burden of the feedback on myself, even though I know my audience has expectations of me. I know what I know and I put in my best. I must go out there; if I fail, I learn from my mistake, but I must keep my progress trajectory very high. As John said in the video, sometimes success is not something we can achieve 100%, but we can keep getting nearer to it. As long as we do not lose focus on our goal and do not deviate from our path of progress, we are doing things right.

Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: About Me, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

SQL SERVER – SSMS: Memory Usage By Memory Optimized Objects Report

- by Pinal Dave

At conferences and at speaking engagements at the local UG, there is one question that keeps coming up which I wish were never asked: “Why is SQL Server using up all the memory and not releasing it even when idle?” Well, the answer can be long, and with the release of SQL Server 2014 it got even more complicated. SQL Server 2014 introduces the option of In-Memory OLTP, which is a completely new concept, and our dependency on memory has increased manifold. In reality, nothing much changes, but we additionally have memory-optimized objects (tables and stored procedures) which reside completely in memory and improve performance. As a DBA, it is humanly impossible to get the hang of all the innovations and new features introduced in each new version. So today’s blog is about the report added to SSMS which gives a high-level view of this new feature.

This report is available only from SQL Server 2014 onwards, because the feature was introduced in SQL Server 2014. Earlier versions of SQL Server Management Studio will not show the report in the list. If we try to launch the report on a database which does not have an In-Memory filegroup defined, we see a message in the report. To demonstrate, I have created a fresh database called MemoryOptimizedDB with no special filegroup. Here is the query used to identify whether a database has a memory-optimized filegroup or not.

    SELECT TOP(1) 1
    FROM sys.filegroups FG
    WHERE FG.[type] = 'FX'

Once we add the filegroup using the command below, we see a different version of the report.

    USE [master]
    GO
    ALTER DATABASE [MemoryOptimizedDB] ADD FILEGROUP [IMO_FG] CONTAINS MEMORY_OPTIMIZED_DATA
    GO

The report is still empty because we have not defined any memory-optimized table in the database. Total allocated size is shown as 0 MB. Now, let’s add the folder location to the filegroup and also create a few in-memory tables.
We have used the nomenclature of IMO to denote “InMemory Optimized” objects.

    USE [master]
    GO
    ALTER DATABASE [MemoryOptimizedDB]
    ADD FILE (
        NAME = N'MemoryOptimizedDB_IMO',
        FILENAME = N'E:\Program Files\Microsoft SQL Server\MSSQL12.SQL2014\MSSQL\DATA\MemoryOptimizedDB_IMO')
    TO FILEGROUP [IMO_FG]
    GO

You may have to change the path based on your SQL Server configuration. Below is the script to create the table.

    USE MemoryOptimizedDB
    GO
    --Drop table if it already exists.
    IF OBJECT_ID('dbo.SQLAuthority','U') IS NOT NULL
        DROP TABLE dbo.SQLAuthority
    GO
    CREATE TABLE dbo.SQLAuthority
    (
        ID INT IDENTITY NOT NULL,
        Name CHAR(500) COLLATE Latin1_General_100_BIN2 NOT NULL DEFAULT 'Pinal',
        CONSTRAINT PK_SQLAuthority_ID PRIMARY KEY NONCLUSTERED (ID),
        INDEX hash_index_sample_memoryoptimizedtable_c2 HASH (Name) WITH (BUCKET_COUNT = 131072)
    ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)
    GO

As soon as the above script is executed, both the table and the index are created. If we run the report again, we see something like below. Notice that table memory is zero but the index is using memory. This is because the hash index needs memory to manage the buckets created. So even if the table is empty, the index consumes memory. More about the internals of how In-Memory indexes and tables work will be reserved for future posts. Now, use the script below to populate the table with 10000 rows.

    INSERT INTO SQLAuthority VALUES (DEFAULT)
    GO 10000

Here is the same report after inserting 10000 rows into our InMemory table. There are three sections in the whole report:

- Total memory consumed by In-Memory objects
- Pie chart showing memory distribution based on type of consumer: table, index and system
- Details of memory usage by each table

The information for all three is taken from one single DMV, sys.dm_db_xtp_table_memory_stats. This DMV contains memory usage statistics for both user and system In-Memory tables.
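To put a rough number on why the hash index of an empty table still shows memory in the report, the arithmetic can be sketched in a few lines of Python. This is only an illustration under two assumptions taken from the SQL Server documentation as commonly described, not from measurements in this post: the requested BUCKET_COUNT is rounded up to the next power of two, and each bucket holds an 8-byte pointer. The function names are hypothetical.

```python
BYTES_PER_BUCKET = 8  # assumption: one 8-byte pointer per hash bucket


def next_power_of_two(n: int) -> int:
    """Return the smallest power of two that is >= n."""
    if n < 1:
        raise ValueError("bucket count must be positive")
    return 1 << (n - 1).bit_length()


def hash_index_bucket_bytes(bucket_count: int) -> int:
    """Estimate the fixed memory taken by the bucket array of a hash index."""
    return next_power_of_two(bucket_count) * BYTES_PER_BUCKET


if __name__ == "__main__":
    requested = 131072  # the BUCKET_COUNT used for dbo.SQLAuthority
    size = hash_index_bucket_bytes(requested)
    print(f"{requested} buckets -> {size} bytes ({size / 1024 / 1024:.2f} MB)")
```

Under these assumptions the bucket array alone is allocated up front, regardless of how many rows the table holds, which is consistent with the report showing index memory before any row is inserted.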
If we query the DMV and look at the data, we can easily notice that the system tables have negative object IDs. So, to look at user table memory usage, below is an over-simplified version of the query.

    USE MemoryOptimizedDB
    GO
    SELECT OBJECT_NAME(OBJECT_ID), *
    FROM sys.dm_db_xtp_table_memory_stats
    WHERE OBJECT_ID > 0
    GO

This report helps a DBA identify which in-memory objects are taking a lot of memory, which can be used as a pointer when designing a solution. I am sure that in the future we will discuss the whole concept of In-Memory tables at length on this blog. To read more about In-Memory OLTP, have a look at the In-Memory OLTP Series at Balmukund’s Blog.

Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Server Management Studio, SQL Tips and Tricks, T SQL Tagged: SQL Memory, SQL Reports

SQL SERVER – SSMS: Backup and Restore Events Report

- by Pinal Dave

A DBA wears multiple hats and in fact does more than meets the eye. One of the core tasks of a DBA is to take backups. This looks so trivial that most developers shrug it off as the only activity a DBA might be doing. I have huge respect for DBAs all around the world. They may seem relaxed with all the scripting, automation and maintenance that works round the clock to keep the business running 24×7, almost 365 days a year, but their true worth becomes known on the day the system or HDD crashes just when you have an important delivery to make. That is when the backup tasks and maintenance jobs come in handy, and they are no longer as trivial as many consider them to be.

Important questions like “When was the last backup taken?”, “How much time did the last backup take?” and “What type of backup was taken last?” are tricky questions, and this report provides answers to them in a jiffy.

The SSMS report we are talking about can be used to find the backup and restore operations done for the selected database. Whenever we perform any backup or restore operation, the information is stored in the msdb database. This report uses that information to show the size, the time taken and also the file location for those operations. Here is how this report can be launched. Once we launch this report, we can see the 4 major sections listed below:

- Average Time Taken For Backup Operations
- Successful Backup Operations
- Backup Operation Errors
- Successful Restore Operations

Let us look at each section next.

Average Time Taken For Backup Operations
The information shown in the “Average Time Taken For Backup Operations” section is taken from the backupset table in the msdb database.
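As a rough illustration of what this section computes, the averaging can be mimicked outside SQL Server. The Python sketch below is not the report's code; the function name and the sample rows are hypothetical, and it assumes each row carries a backup type code (D for full, I for differential, L for log) together with start and finish times, as the backupset table does. It averages whole seconds per type with integer arithmetic and only then converts to minutes.

```python
from collections import defaultdict
from datetime import datetime


def average_backup_minutes(rows):
    """Average backup duration in minutes per backup type.

    rows: iterable of (backup_type, start, finish) tuples, where start
    and finish are datetime objects.
    """
    seconds_by_type = defaultdict(list)
    for backup_type, start, finish in rows:
        elapsed = int((finish - start).total_seconds())
        seconds_by_type[backup_type].append(elapsed)
    # Integer average of the seconds first, then convert to minutes.
    return {
        backup_type: (sum(durations) // len(durations)) / 60.0
        for backup_type, durations in sorted(seconds_by_type.items())
    }


if __name__ == "__main__":
    # Hypothetical sample history: two full backups and one differential.
    sample = [
        ("D", datetime(2014, 11, 1, 1, 0, 0), datetime(2014, 11, 1, 1, 1, 30)),
        ("D", datetime(2014, 11, 2, 1, 0, 0), datetime(2014, 11, 2, 1, 2, 30)),
        ("I", datetime(2014, 11, 2, 13, 0, 0), datetime(2014, 11, 2, 13, 0, 30)),
    ]
    for backup_type, minutes in average_backup_minutes(sample).items():
        print(backup_type, minutes)
```

Grouping by type matters here because full, differential and log backups usually have very different durations, so a single overall average would say little about any one of them.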
Here is the query, along with the expanded version of that particular section.

    USE msdb;
    SELECT (ROW_NUMBER() OVER (ORDER BY t1.TYPE))%2 AS l1
         , 1 AS l2
         , 1 AS l3
         , t1.TYPE AS [type]
         , (AVG(DATEDIFF(ss,backup_start_date, backup_finish_date)))/60.0 AS AverageBackupDuration
    FROM backupset t1
    INNER JOIN sys.databases t3 ON ( t1.database_name = t3.name)
    WHERE t3.name = N'AdventureWorks2014'
    GROUP BY t1.TYPE
    ORDER BY t1.TYPE

On my small database the time taken for the differential backup was less than a minute, hence the value of zero is displayed. This is an important piece of information about backup operations which might help you in planning maintenance windows.

Successful Backup Operations
Here is the expanded version of this section. This information is derived from the various backup tracking tables in the msdb database. Here is a simplified version of the query, which can be used separately as well.

    SELECT *
    FROM sys.databases t1
    INNER JOIN backupset t3 ON (t3.database_name = t1.name)
    LEFT OUTER JOIN backupmediaset t5 ON ( t3.media_set_id = t5.media_set_id)
    LEFT OUTER JOIN backupmediafamily t6 ON ( t6.media_set_id = t5.media_set_id)
    WHERE (t1.name = N'AdventureWorks2014')
    ORDER BY backup_start_date DESC,t3.backup_set_id,t6.physical_device_name;

The report does some calculations to show the data in a more readable format. For example, the backup size is shown in KB, MB or GB. I have expanded the first row by clicking on (+) in the “Device type” column. That shows me the path of the physical backup file. Personally, looking at this section, I find the Backup Size, Device Type and Backup Name critical and worth noting. As mentioned in the previous section, this section also has the Duration embedded in it.

Backup Operation Errors
This section of the report gets its data from the default trace. You might wonder how. One of the events tracked by the default trace is “ErrorLog”. This means that whatever message is written to the errorlog gets written to the default trace file as well.
Interestingly, whenever there is a backup failure, an error message is written to ERRORLOG and hence to the default trace. This section takes advantage of that and shows the information. We can read the message below under this section, which confirms the above logic.

No backup operations errors occurred for (AdventureWorks2014) database in the recent past or default trace is not enabled.

Successful Restore Operations
This section may not be very useful on a production server (how often do you restore a database there?), but it might be useful in development and log shipping secondary environments, where we might be interested in seeing the restore operations for a particular database. Here is the expanded version of the section. To fill this section of the report, I have restored the same backups which were taken to populate the earlier sections. Here is a simplified version of the query used to populate this output.

    USE msdb;
    SELECT *
    FROM restorehistory t1
    LEFT OUTER JOIN restorefile t2 ON ( t1.restore_history_id = t2.restore_history_id)
    LEFT OUTER JOIN backupset t3 ON ( t1.backup_set_id = t3.backup_set_id)
    WHERE t1.destination_database_name = N'AdventureWorks2014'
    ORDER BY restore_date DESC, t1.restore_history_id,t2.destination_phys_name

Have you ever looked at the backup strategy of your key databases? Are the backups in sync, and is there scope for improvement? Then this is the report to analyze after a week or a month of maintenance plans running on your database. Do chime in with the strategies you are using in your environments.

Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Backup and Restore, SQL Query, SQL Server, SQL Server Management Studio, SQL Tips and Tricks, T SQL Tagged: SQL Reports

Big Data – Buzz Words: Importance of Relational Database in Big Data World – Day 9 of 21

- by Pinal Dave

In yesterday’s blog post we learned what HDFS is. In this article we will take a quick look at the importance of the relational database in the Big Data world.

A Big Question?
Here are a few questions I have often received since the beginning of the Big Data series:

- Does the relational database have no place in the story of Big Data?
- Is the relational database no longer relevant as Big Data evolves?
- Is the relational database not capable of handling Big Data?
- Is it true that one no longer has to learn about relational data if Big Data is the final destination?

Well, every single time I hear that someone wants to learn about Big Data and is no longer interested in learning about relational databases, I find it a bit far-fetched. I am not here to give an ambiguous answer of “It Depends.” I am personally very clear that anyone who aspires to become a Big Data Scientist or Big Data Expert should learn about relational databases.

NoSQL Movement
The reason for the NoSQL movement in recent times is two important advantages of NoSQL databases:

- Performance
- Flexible Schema

In my personal experience, I have found both of the advantages listed above when using a NoSQL database. There are instances when I found a relational database too restrictive, when my data was unstructured or came in a datatype which my relational database did not support. Likewise, there are cases when I have found a NoSQL solution performing much better than relational databases. I must say that I have become a big fan of NoSQL solutions in recent times, but I have also seen occasions and situations where a relational database is still a perfect fit, even though the database is growing rapidly and has all the symptoms of Big Data.

Situations in Which the Relational Database Outperforms
Ad hoc reporting is one of the most common scenarios where NoSQL does not have an optimal solution.
For example, reporting queries often need to aggregate on columns which are not indexed, or which are built while the report is running. In this kind of scenario NoSQL databases (document database stores, distributed key value stores) often do not perform well. In the case of ad hoc reporting, I have often found it much easier to work with relational databases.

SQL is the most popular computer language of all time. I have been using it for over 10 years and I feel that I will be using it for a long time to come. There are plenty of tools and connectors for SQL, and plenty of awareness of the language in the industry. Pretty much every programming language has drivers written for SQL, and most developers learned this language during their school or college years. In many cases, writing a query in SQL is much easier than writing queries in the languages NoSQL supports. I believe this is the current situation, but in the future it could reverse, once NoSQL query languages are equally popular.

ACID (Atomicity, Consistency, Isolation, Durability): not all NoSQL solutions offer ACID compliance. There are always situations (for example banking transactions, eCommerce shopping carts etc.) where, without ACID, operations can be invalid and database integrity can be at risk. Even when the data volume does qualify as Big Data, there are always operations in the application which absolutely need a mature, ACID-compliant system.

The Mixed Bag
I have often heard the argument that all the big social media sites nowadays have moved away from relational databases. Actually, this is not entirely true. While researching Big Data and relational databases, I found that many of the popular social media sites use Big Data solutions along with a relational database.
Many use relational databases to deliver results to the end user at run time, and many still use a relational database as their major backbone. Here are a few examples:

- Facebook uses MySQL to display the timeline. (Reference Link)
- Twitter uses MySQL. (Reference Link)
- Tumblr uses Sharded MySQL. (Reference Link)
- Wikipedia uses MySQL for data storage. (Reference Link)

There are many more prominent organizations running large-scale applications that use relational databases along with various Big Data frameworks to satisfy their various business needs.

Summary
I believe that RDBMS is like vanilla ice cream. Everybody loves it and everybody has it. NoSQL and other solutions are like chocolate ice cream or custom ice cream: there is a huge base which loves them and wants them, but not every ice cream maker can make it just right for everyone’s taste. No matter how fancy an ice cream store is, there is always plain vanilla ice cream available there. In the same way, there are always cases and situations in the Big Data story where the traditional relational database is part of the whole story. In real-world scenarios there will always be cases where relational database concepts and ideology are needed. It is extremely important to accept the relational database as one of the key components of Big Data instead of treating it as a substandard technology.

Ray of Hope – NewSQL
In this module we discussed that there are places where we need ACID compliance from our Big Data application, and NoSQL will not support that out of the box. There is a new term coined for the applications and tools which support most of the properties of the traditional RDBMS and also support Big Data infrastructure: NewSQL.

Tomorrow
In tomorrow’s blog post we will discuss NewSQL.

Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

SQL SERVER – Example of Performance Tuning for Advanced Users with DB Optimizer

- by Pinal Dave

Performance tuning is a subject that everyone wants to master. In the beginning everybody is at a novice level and spends lots of time learning how to master the art of performance tuning. However, as we progress further, tuning the system keeps getting more difficult. I understood early in my career that there is no need for ego in the technology field. There are always better solutions and better ideas out there and we should not resist them. Instead of resisting change and the new wave, I personally adopt it.

Here is an example. As I personally progress toward the master level of performance tuning, I find that it is getting harder to come up with optimal solutions. In such scenarios I rely on various tools to teach me how I can do things better. Once I learn from the tools, I am often able to come up with better solutions when I face a similar situation the next time.

A few days ago I received a query which the user wanted to tune further to get the maximum performance out of it. I have re-written a similar query with the help of the AdventureWorks sample database.

    SELECT *
    FROM HumanResources.Employee e
    INNER JOIN HumanResources.EmployeeDepartmentHistory edh ON e.BusinessEntityID = edh.BusinessEntityID
    INNER JOIN HumanResources.Shift s ON edh.ShiftID = s.ShiftID;

The user had a query similar to the one above, which was used in a very critical report, and wanted to get the best out of it. When I looked at the query, here were my initial thoughts:

- Use only the columns the application actually needs in the SELECT statement
- Look at the query pattern and data workload and find the optimal index for it

Before I could offer further solutions, the user told me that they need all the columns from all the tables and that creating indexes was not allowed in their system. He could only re-write queries or use hints to further tune this query.
Now I was boxed in by constraints. I believe * is not a great idea, but if they want all the columns, we can’t do much besides using *. Additionally, if I cannot create another index, I must come up with some creative way to write this query. I personally do not like to use hints in my applications, but there are cases when hints work out magically and give optimal solutions.

Finally, I decided to use Embarcadero’s DB Optimizer. It is a fantastic tool and very helpful when it comes to performance tuning. I have previously explained how it works over here. First open DB Optimizer and open a Tuning Job from File >> New >> Tuning Job. Once you have opened the DB Optimizer Tuning Job, follow the various steps indicated in the following diagram. Essentially, we take our original script and paste it into Step 1: New SQL Text. Right after that we enable Step 2 for generating various cases, Step 3 for detailed analysis and Step 4 for executing each generated case. Finally, we click on Analysis in Step 5, which generates the detailed analysis report in the result pane.

The detailed pane looks like this. It generates various cases of T-SQL based on the original query. It applies the various available hints to the query, generates the resulting execution plans, and displays them in the results. You can clearly see that the original query had a cost of 0.0841 and about 607 logical reads (pages), whereas the various options which follow it have different execution costs as well as logical reads. There are a few cases where we have higher logical reads and a few cases where we have very low logical reads. If we pay attention, the row right after the original query has Merge_Join_Query in its description, and it has the lowest execution cost, 0.044, and the lowest logical reads, 29. This row contains the query which is the most optimal re-write of the original query. Let us double click on it.
Here is the query:

SELECT *
FROM HumanResources.Employee e
INNER JOIN HumanResources.EmployeeDepartmentHistory edh ON e.BusinessEntityID = edh.BusinessEntityID
INNER JOIN HumanResources.Shift s ON edh.ShiftID = s.ShiftID
OPTION (MERGE JOIN)

Notice that the query above has the additional hint MERGE JOIN. With the help of this query hint, the query now performs much better than before. The entire process takes less than 60 seconds.

Please note that the Merge Join hint was optimal for this query, but the same hint will not necessarily help every query. Additionally, if the workload or data pattern changes, the merge join hint may no longer be the optimal choice. In that case, we will have to redo the entire exercise. This is the reason I do not like to use hints in my queries, and I discourage all of my users from using them.

However, this example is a great case where a hint optimizes the performance of the query. It is not humanly possible to test every combination of query hints and index options to figure out which solution is optimal. Sometimes we need to depend on efficiency tools like DB Optimizer to guide us and to select the best option from the suggestions provided.

Let me know what you think of this article as well as your experience with DB Optimizer. Please leave a comment.

Reference: Pinal Dave (http://blog.sqlauthority.com)

Filed under: PostADay, SQL, SQL Authority, SQL Joins, SQL Optimization, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology
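For a small number of candidates, the comparison that the tool automates can be approximated by hand. The sketch below assumes an AdventureWorks sample database (the database name is an assumption) and a session in SQL Server Management Studio. It runs the query once with each of the three T-SQL join query hints, LOOP JOIN, MERGE JOIN and HASH JOIN, with STATISTICS IO enabled, so the logical reads of each variant can be compared in the Messages tab. It covers only these three hints, not the full set of cases that DB Optimizer generates.

```sql
-- Assumption: the sample database is named AdventureWorks on this server.
USE AdventureWorks;
GO

SET STATISTICS IO ON;
GO

-- Variant 1: force nested loop joins for the whole query.
SELECT *
FROM HumanResources.Employee e
INNER JOIN HumanResources.EmployeeDepartmentHistory edh
    ON e.BusinessEntityID = edh.BusinessEntityID
INNER JOIN HumanResources.Shift s
    ON edh.ShiftID = s.ShiftID
OPTION (LOOP JOIN);
GO

-- Variant 2: force merge joins (the rewrite discussed in the article).
SELECT *
FROM HumanResources.Employee e
INNER JOIN HumanResources.EmployeeDepartmentHistory edh
    ON e.BusinessEntityID = edh.BusinessEntityID
INNER JOIN HumanResources.Shift s
    ON edh.ShiftID = s.ShiftID
OPTION (MERGE JOIN);
GO

-- Variant 3: force hash joins.
SELECT *
FROM HumanResources.Employee e
INNER JOIN HumanResources.EmployeeDepartmentHistory edh
    ON e.BusinessEntityID = edh.BusinessEntityID
INNER JOIN HumanResources.Shift s
    ON edh.ShiftID = s.ShiftID
OPTION (HASH JOIN);
GO

SET STATISTICS IO OFF;
GO
```

Because the best hint can change with the workload or data pattern, a comparison like this would need to be repeated whenever either one changes.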


