Search Results

Search found 2461 results on 99 pages for 'dave clausen'.

Page 20/99 | < Previous Page | 16 17 18 19 20 21 22 23 24 25 26 27 | Next Page >

Big Data – Basics of Big Data Analytics – Day 18 of 21

- by Pinal Dave

In yesterday’s blog post we learned the importance of the various components in Big Data Story. In this article we will understand what are the various analytics tasks we try to achieve with the Big Data and the list of the important tools in Big Data Story. When you have plenty of the data around you what is the first thing which comes to your mind? “What do all these data means?” Exactly – the same thought comes to my mind as well. I always wanted to know what all the data means and what meaningful information I can receive out of it. Most of the Big Data projects are built to retrieve various intelligence all this data contains within it. Let us take example of Facebook. When I look at my friends list of Facebook, I always want to ask many questions such as - On which date my maximum friends have a birthday? What is the most favorite film of my most of the friends so I can talk about it and engage them? What is the most liked placed to travel my friends? Which is the most disliked cousin for my friends in India and USA so when they travel, I do not take them there. There are many more questions I can think of. This illustrates that how important it is to have analysis of Big Data. Here are few of the kind of analysis listed which you can use with Big Data. Slicing and Dicing: This means breaking down your data into smaller set and understanding them one set at a time. This also helps to present various information in a variety of different user digestible ways. For example if you have data related to movies, you can use different slide and dice data in various formats like actors, movie length etc. Real Time Monitoring: This is very crucial in social media when there are any events happening and you wanted to measure the impact at the time when the event is happening. For example, if you are using twitter when there is a football match, you can watch what fans are talking about football match on twitter when the event is happening. Anomaly Predication and Modeling: If the business is running normal it is alright but if there are signs of trouble, everyone wants to know them early on the hand. Big Data analysis of various patterns can be very much helpful to predict future. Though it may not be always accurate but certain hints and signals can be very helpful. For example, lots of data can help conclude that if there is lots of rain it can increase the sell of umbrella. Text and Unstructured Data Analysis: unstructured data are now getting norm in the new world and they are a big part of the Big Data revolution. It is very important that we Extract, Transform and Load the unstructured data and make meaningful data out of it. For example, analysis of lots of images, one can predict that people like to use certain colors in certain months in their cloths. Big Data Analytics Solutions There are many different Big Data Analystics Solutions out in the market. It is impossible to list all of them so I will list a few of them over here. Tableau – This has to be one of the most popular visualization tools out in the big data market. SAS – A high performance analytics and infrastructure company IBM and Oracle – They have a range of tools for Big Data Analysis Tomorrow In tomorrow’s blog post we will discuss about very important components of the Big Data Ecosystem – Data Scientist. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
Big Data – Data Mining with Hive – What is Hive? – What is HiveQL (HQL)? – Day 15 of 21

- by Pinal Dave

In yesterday’s blog post we learned the importance of the operational database in Big Data Story. In this article we will understand what is Hive and HQL in Big Data Story. Yahoo started working on PIG (we will understand that in the next blog post) for their application deployment on Hadoop. The goal of Yahoo to manage their unstructured data. Similarly Facebook started deploying their warehouse solutions on Hadoop which has resulted in HIVE. The reason for going with HIVE is because the traditional warehousing solutions are getting very expensive. What is HIVE? Hive is a datawarehouseing infrastructure for Hadoop. The primary responsibility is to provide data summarization, query and analysis. It supports analysis of large datasets stored in Hadoop’s HDFS as well as on the Amazon S3 filesystem. The best part of HIVE is that it supports SQL-Like access to structured data which is known as HiveQL (or HQL) as well as big data analysis with the help of MapReduce. Hive is not built to get a quick response to queries but it it is built for data mining applications. Data mining applications can take from several minutes to several hours to analysis the data and HIVE is primarily used there. HIVE Organization The data are organized in three different formats in HIVE. Tables: They are very similar to RDBMS tables and contains rows and tables. Hive is just layered over the Hadoop File System (HDFS), hence tables are directly mapped to directories of the filesystems. It also supports tables stored in other native file systems. Partitions: Hive tables can have more than one partition. They are mapped to subdirectories and file systems as well. Buckets: In Hive data may be divided into buckets. Buckets are stored as files in partition in the underlying file system. Hive also has metastore which stores all the metadata. It is a relational database containing various information related to Hive Schema (column types, owners, key-value data, statistics etc.). We can use MySQL database over here. What is HiveSQL (HQL)? Hive query language provides the basic SQL like operations. Here are few of the tasks which HQL can do easily. Create and manage tables and partitions Support various Relational, Arithmetic and Logical Operators Evaluate functions Download the contents of a table to a local directory or result of queries to HDFS directory Here is the example of the HQL Query: SELECT upper(name), salesprice FROM sales; SELECT category, count(1) FROM products GROUP BY category; When you look at the above query, you can see they are very similar to SQL like queries. Tomorrow In tomorrow’s blog post we will discuss about very important components of the Big Data Ecosystem – Pig. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
SQL SERVER – What are Actions in SSAS and How to Make a Reporting Action

- by Pinal Dave

Actions are used for customized browsing and drilling of data for the end-user. It’s an event that a user can raise while accessing the cube data. They are used in cube browsers like excel and are triggered when a user in a client tool clicks on a particular member, level, dimension, cells or may be the cube itself. For example a user might be able to see a reporting services report, open a web page or drill through to detailed information related to the cube data. Analysis server supports 3 types of actions :- Report Drill-through Standard Actions In this blog post, I will explain the Reporting action. The objective of this action is to return a report with details of the product where the sales amount is greater than 1000 in cube browser analysis. You need to create a basic cube first with the facts and dimensions you want in the analysis. Following are the steps to create reporting action. Go to SQL server data tools and open the analysis services project. Navigate to actions and click on new reporting action. 2.) Specify the name of the action and choose target type as attribute members since we have to create the action on members for a attribute. 3.) Specify the Target object of your report action. Target object would be the dimension or attribute on which you want the report to appear. In our case it is product name. 4.) Next you have to define the condition on which you want the report link to appear. However, this is an optional feature. In this example we are specifying a condition, which will check if the sales amount is greater than 10,000. So, that the link appears only for those products where the defined condition is met. 5.) Next you have to specify the server name on which the report is present, report path and the report format in which you want the report to appear. 6.) Additionally you can specify the parameters. As with conditional expression, the parameters should be a valid MDX expression. The parameter name should be same as the one defined in the report. 7.) Deploy your solution after you are done with specifying parameters and go to the cube browser. 8.) Click on the analyze in excel button, this will open your cube in excel 9.) Make an analysis which shows product names and their sales amount. 10.) Right click on a product where sales amount is greater than 10000 you will see the reporting action link. Click on that and you will be taken to your reporting services report. 11.) Clicking on the link will take you to the URL of the report. I created this report using report project wizard in SQL server data tools. So, this is how we can launch reports from a cube browser. Similarly you can open web pages, run applications and a number of other tasks. Koenig Solutions offers SSAS training which contains all Analysis Services including Reporting in great detail. In my next blog post I will talk about drill-through actions. Author: Namita Sharma, Senior Corporate Trainer at Koenig Solutions. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: SSAS

Read the article
SQL – Business Intelligence: Derive Data or Information?

- by Pinal Dave

We all know the value of information in our lives. Whether it’s a personal decision or a business initiated one, people need it. But the question is: who is to make the distinction between data and information? We all come across a whole lot of data daily, that may be significant or not. We filter what’s required and forget about the rest. Information is filtered and distilled data. Filtering and distillation can also alter its actual meaning and natural state. Therefore, in this blog we discover some ways to ensure that we’re using business intelligence derived from the right information for making critical management decisions. Four key questions managers must ask themselves before making a decision: 1. Am I working with data or information? 2. What is it’s context? 3. How recent is it? 4. How was it derived or what is the source? The first question is probably the most important. You must know what you’re dealing with here. If you see use of adjectives and conclusions drawn, it’s information. Not raw data. You very next concern must be whether this is guised to present a particular viewpoint or perspective. It makes a lot of difference if you take a decision based on someone’s propaganda to distort real facts. Therefore, the context and the intentions of the distillation process must be clear to you. The next consideration is whether data is recent enough to hold any value. Since it has a very short shelf life, you must ensure that its context and value is not lost out of time. The last and the most important consideration is how was it derived in the first place. The observer effect is what calls the shots here. The source can change the context to a great extent if the collection methodology and purpose is not clear. Gathering intelligence for decision making requires users to be keen observers and not take the information provided on its face value alone. These probing questions will allow you to make sure that you’re working with clean and accurate data devoid of any influence or manipulations. Only then can you be sure of deriving true business intelligence for your organization. BI technology is also a great way to ensure accuracy of reports. SQL BI Platform provides advanced tools and techniques for all your BI needs and concerns. Koenig Solutions offers this course along with a host of other Business Intelligence and IT courses on all latest technologies available in the market today. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
Big Data – How to become a Data Scientist and Learn Data Science? – Day 19 of 21

- by Pinal Dave

In yesterday’s blog post we learned the importance of the analytics in Big Data Story. In this article we will understand how to become a Data Scientist for Big Data Story. Data Scientist is a new buzz word, everyone seems to be wanting to become Data Scientist. Let us go over a few key topics related to Data Scientist in this blog post. First of all we will understand what is a Data Scientist. In the new world of Big Data, I see pretty much everyone wants to become Data Scientist and there are lots of people I have already met who claims that they are Data Scientist. When I ask what is their role, I have got a wide variety of answers. What is Data Scientist? Data scientists are the experts who understand various aspects of the business and know how to strategies data to achieve the business goals. They should have a solid foundation of various data algorithms, modeling and statistics methodology. What do Data Scientists do? Data scientists understand the data very well. They just go beyond the regular data algorithms and builds interesting trends from available data. They innovate and resurrect the entire new meaning from the existing data. They are artists in disguise of computer analyst. They look at the data traditionally as well as explore various new ways to look at the data. Data Scientists do not wait to build their solutions from existing data. They think creatively, they think before the data has entered into the system. Data Scientists are visionary experts who understands the business needs and plan ahead of the time, this tremendously help to build solutions at rapid speed. Besides being data expert, the major quality of Data Scientists is “curiosity”. They always wonder about what more they can get from their existing data and how to get maximum out of future incoming data. Data Scientists do wonders with the data, which goes beyond the job descriptions of Data Analysist or Business Analysist. Skills Required for Data Scientists Here are few of the skills a Data Scientist must have. Expert level skills with statistical tools like SAS, Excel, R etc. Understanding Mathematical Models Hands-on with Visualization Tools like Tableau, PowerPivots, D3. j’s etc. Analytical skills to understand business needs Communication skills On the technology front any Data Scientists should know underlying technologies like (Hadoop, Cloudera) as well as their entire ecosystem (programming language, analysis and visualization tools etc.) . Remember that for becoming a successful Data Scientist one require have par excellent skills, just having a degree in a relevant education field will not suffice. Final Note Data Scientists is indeed very exciting job profile. As per research there are not enough Data Scientists in the world to handle the current data explosion. In near future Data is going to expand exponentially, and the need of the Data Scientists will increase along with it. It is indeed the job one should focus if you like data and science of statistics. Courtesy: emc Tomorrow In tomorrow’s blog post we will discuss about various Big Data Learning resources. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
SQL SERVER – How to Roll Back SQL Server Database Changes

- by Pinal Dave

In a perfect scenario, no unexpected and unplanned changes occur. There are no unpleasant surprises, no inadvertent changes. However, even with all precautions and testing, there is sometimes a need to revert a structure or data change. One of the methods that can be used in this situation is to use an older database backup that has the records or database object structure you want to revert to. For this method, you have to have the adequate full database backup and a tool that will help you with comparison and synchronization is preferred. In this article, we will focus on another method: rolling back the changes. This can be done by using: An option in SQL Server Management Studio T-SQL, or ApexSQL Log The first two solutions have been described in this article The disadvantages of these methods are that you have to know when exactly the change you want to revert happened and that all transactions on the database executed in a specific time range are rolled back – the ones you want to undo and the ones you don’t. How to easily roll back SQL Server database changes using ApexSQL Log? The biggest challenge is to roll back just specific changes, not all changes that happened in a specific time range. While SQL Server Management Studio option and T-SQL read and roll forward all transactions in the transaction log files, I will show you a solution that finds and scripts only the specific changes that match your criteria. Therefore, you don’t need to worry about all other database changes that you don’t want to roll back. ApexSQL Log is a SQL Server disaster recovery tool that reads transaction logs and provides a wide range of filters that enable you to easily rollback only specific data changes. First, connect to the online database where you want to roll back the changes. Once you select the database, ApexSQL Log will show its recovery model. Note that changes can be rolled back even for a database in the Simple recovery model, when no database and transaction log backups are available. However, ApexSQL Log achieves best results when the database is in the Full recovery model and you have a chain of subsequent transaction log backups, back to the moment when the change occurred. In this example, we will use only the online transaction log. In the next step, use filters to read only the transactions that happened in a specific time range. To remove noise, it’s recommended to use as many filters as possible. Besides filtering by the time of the transaction, ApexSQL Log can filter by the operation type: Table name: As well as transaction state (committed, aborted, running, and unknown), name of the user who committed the change, specific field values, server process IDs, and transaction description. You can select only the tables affected by the changes you want to roll back. However, if you’re not certain which tables were affected, you can leave them all selected and once the results are shown in the main grid, analyze them to find the ones you to roll back. When you set the filters, you can select how to present the results. ApexSQL Log can automatically create undo or redo scripts, export the transactions into an XML, HTML, CSV, SQL, or SQL Bulk file, and create a batch file that you can use for unattended transaction log reading. In this example, I will open the results in the grid, as I want to analyze them before rolling back the transactions. The results contain information about the transaction, as well as who and when made it. For UPDATEs, ApexSQL Log shows both old and new values, so you can easily see what has happened. To create an UNDO script that rolls back the changes, select the transactions you want to roll back and click Create undo script in the menu. For the DELETE statement selected in the screenshot above, the undo script is: INSERT INTO [Sales].[PersonCreditCard] ([BusinessEntityID], [CreditCardID], [ModifiedDate]) VALUES (297, 8010, '20050901 00:00:00.000') When it comes to rolling back database changes, ApexSQL Log has a big advantage, as it rolls back only specific transactions, while leaving all other transactions that occurred at the same time range intact. That makes ApexSQL Log a good solution for rolling back inadvertent data and schema changes on your SQL Server databases. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: ApexSQL

Read the article
SQL SERVER – Puzzle #1 – Querying Pattern Ranges and Wild Cards

- by Pinal Dave

Note: Read at the end of the blog post how you can get five Joes 2 Pros Book #1 and a surprise gift. I have been blogging for almost 7 years and every other day I receive questions about Querying Pattern Ranges. The most common way to solve the problem is to use Wild Cards. However, not everyone knows how to use wild card properly. SQL Queries 2012 Joes 2 Pros Volume 1 – The SQL Queries 2012 Hands-On Tutorial for Beginners Book On Amazon | Book On Flipkart Learn SQL Server get all the five parts combo kit Kit on Amazon | Kit on Flipkart Many people know wildcards are great for finding patterns in character data. There are also some special sequences with wildcards that can give you even more power. This series from SQL Queries 2012 Joes 2 Pros® Volume 1 will show you some of these cool tricks. All supporting files are available with a free download from the www.Joes2Pros.com web site. This example is from the SQL 2012 series Volume 1 in the file SQLQueries2012Vol1Chapter2.2Setup.sql. If you need help setting up then look in the “Free Videos” section on Joes2Pros under “Getting Started” called “How to install your labs” Querying Pattern Ranges The % wildcard character represents any number of characters of any length. Let’s find all first names that end in the letter ‘A’. By using the percentage ‘%’ sign with the letter ‘A’, we achieve this goal using the code sample below: SELECT * FROM Employee WHERE FirstName LIKE '%A' To find all FirstName values beginning with the letters ‘A’ or ‘B’ we can use two predicates in our WHERE clause, by separating them with the OR statement. Finding names beginning with an ‘A’ or ‘B’ is easy and this works fine until we want a larger range of letters as in the example below for ‘A’ thru ‘K’: SELECT * FROM Employee WHERE FirstName LIKE 'A%' OR FirstName LIKE 'B%' OR FirstName LIKE 'C%' OR FirstName LIKE 'D%' OR FirstName LIKE 'E%' OR FirstName LIKE 'F%' OR FirstName LIKE 'G%' OR FirstName LIKE 'H%' OR FirstName LIKE 'I%' OR FirstName LIKE 'J%' OR FirstName LIKE 'K%' The previous query does find FirstName values beginning with the letters ‘A’ thru ‘K’. However, when a query requires a large range of letters, the LIKE operator has an even better option. Since the first letter of the FirstName field can be ‘A’, ‘B’, ‘C’, ‘D’, ‘E’, ‘F’, ‘G’, ‘H’, ‘I’, ‘J’ or ‘K’, simply list all these choices inside a set of square brackets followed by the ‘%’ wildcard, as in the example below: SELECT * FROM Employee WHERE FirstName LIKE '[ABCDEFGHIJK]%' A more elegant example of this technique recognizes that all these letters are in a continuous range, so we really only need to list the first and last letter of the range inside the square brackets, followed by the ‘%’ wildcard allowing for any number of characters after the first letter in the range. Note: A predicate that uses a range will not work with the ‘=’ operator (equals sign). It will neither raise an error, nor produce a result set. --Bad query (will not error or return any records) SELECT * FROM Employee WHERE FirstName = '[A-K]%' Question: You want to find all first names that start with the letters A-M in your Customer table and end with the letter Z. Which SQL code would you use? a. SELECT * FROM Customer WHERE FirstName LIKE 'm%z' b. SELECT * FROM Customer WHERE FirstName LIKE 'a-m%z' c. SELECT * FROM Customer WHERE FirstName LIKE 'a-m%z' d. SELECT * FROM Customer WHERE FirstName LIKE '[a-m]%z' e. SELECT * FROM Customer WHERE FirstName LIKE '[a-m]z%' f. SELECT * FROM Customer WHERE FirstName LIKE '[a-m]%z' g. SELECT * FROM Customer WHERE FirstName LIKE '[a-m]z%' Contest Leave a valid answer before June 18, 2013 in the comment section. 5 winners will be selected from all the valid answers and will receive Joes 2 Pros Book #1. 1 Lucky person will get a surprise gift from Joes 2 Pros. The contest is open for all the countries where Amazon ships the book (USA, UK, Canada, India and many others). Special Note: Read all the options before you provide valid answer as there is a small trick hidden in answers. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Joes 2 Pros, PostADay, SQL, SQL Authority, SQL Puzzle, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
Big Data – Operational Databases Supporting Big Data – Columnar, Graph and Spatial Database – Day 14 of 21

- by Pinal Dave

In yesterday’s blog post we learned the importance of the Key-Value Pair Databases and Document Databases in the Big Data Story. In this article we will understand the role of Columnar, Graph and Spatial Database supporting Big Data Story. Now we will see a few of the examples of the operational databases. Relational Databases (The day before yesterday’s post) NoSQL Databases (The day before yesterday’s post) Key-Value Pair Databases (Yesterday’s post) Document Databases (Yesterday’s post) Columnar Databases (Tomorrow’s post) Graph Databases (Today’s post) Spatial Databases (Today’s post) Columnar Databases Relational Database is a row store database or a row oriented database. Columnar databases are column oriented or column store databases. As we discussed earlier in Big Data we have different kinds of data and we need to store different kinds of data in the database. When we have columnar database it is very easy to do so as we can just add a new column to the columnar database. HBase is one of the most popular columnar databases. It uses Hadoop file system and MapReduce for its core data storage. However, remember this is not a good solution for every application. This is particularly good for the database where there is high volume incremental data is gathered and processed. Graph Databases For a highly interconnected data it is suitable to use Graph Database. This database has node relationship structure. Nodes and relationships contain a Key Value Pair where data is stored. The major advantage of this database is that it supports faster navigation among various relationships. For example, Facebook uses a graph database to list and demonstrate various relationships between users. Neo4J is one of the most popular open source graph database. One of the major dis-advantage of the Graph Database is that it is not possible to self-reference (self joins in the RDBMS terms) and there might be real world scenarios where this might be required and graph database does not support it. Spatial Databases We all use Foursquare, Google+ as well Facebook Check-ins for location aware check-ins. All the location aware applications figure out the position of the phone with the help of Global Positioning System (GPS). Think about it, so many different users at different location in the world and checking-in all together. Additionally, the applications now feature reach and users are demanding more and more information from them, for example like movies, coffee shop or places see. They are all running with the help of Spatial Databases. Spatial data are standardize by the Open Geospatial Consortium known as OGC. Spatial data helps answering many interesting questions like “Distance between two locations, area of interesting places etc.” When we think of it, it is very clear that handing spatial data and returning meaningful result is one big task when there are millions of users moving dynamically from one place to another place & requesting various spatial information. PostGIS/OpenGIS suite is very popular spatial database. It runs as a layer implementation on the RDBMS PostgreSQL. This makes it totally unique as it offers best from both the worlds. Courtesy: mushroom network Tomorrow In tomorrow’s blog post we will discuss about very important components of the Big Data Ecosystem – Hive. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
Developer Training – 6 Online Courses to Learn SQL Server, MySQL and Technology

- by Pinal Dave

Video courses are the next big thing and I am so happy that I have so far authored 6 different video courses with Pluralsight. Here is the list of the courses. I have listed all of my video courses over here. Note: If you click on the courses and it does not open, you need to login to Pluralsight with a valid username and password or sign up for a FREE trial. Please leave a comment with your favorite course in the comment section. Random 10 winners will get surprise gift via email. Bonus: If you list your favorite module from the course site. SQL Server Performance: Introduction to Query Tuning SQL Server performance tuning is an in-depth topic, and an art to master. A key component of overall application performance tuning is query tuning. Writing queries in an efficient manner, and making sure they execute in the most optimal way possible, is always a challenge. The basics revolve around the details of how SQL Server carries out query execution, so the optimizations explored in this course follow along the same lines. Click to View Course SQL Server Performance: Indexing Basics Indexes are the most crucial objects of the database. They are the first stop for any DBA and Developer when it is about performance tuning. There is a good side as well evil side of the indexes. To master the art of performance tuning one has to understand the fundamentals of the indexes and the best practices associated with the same. This course is for every DBA and Developer who deals with performance tuning and wants to use indexes to improve the performance of the server. Click to View Course SQL Server Questions and Answers This course is designed to help you better understand how to use SQL Server effectively. The course presents many of the common misconceptions about SQL Server, and then carefully debunks those misconceptions with clear explanations and short but compelling demos, showing you how SQL Server really works. This course is for anyone working with SQL Server databases who wants to improve her knowledge and understanding of this complex platform. Click to View Course MySQL Fundamentals MySQL is a popular choice of database for use in web applications, and is a central component of the widely used LAMP open source web application software stack. This course covers the fundamentals of MySQL, including how to install MySQL as well as written basic data retrieval and data modification queries. Click to View Course Building a Successful Blog Expressing yourself is the most common behavior of humans. Blogging has made easy to express yourself. Just like a letter or book has a structure and formula, blogging also has structure and formula. In this introductory course on blogging we will go over a few of the basics of blogging and show the way to get started with blogging immediately. If you already have a blog, this course will be even more relevant as this will discuss many of the common questions and issue you face in your blogging routine. Click to View Course Introduction to ColdFusion ColdFusion is rapid web application development platform. In this course you will learn the basics of how to use ColdFusion platform and rapidly develop web sites. The course begins with learning basics of ColdFusion Markup Language and moves to common development language practices. From there we move to frequent database operations and advanced concepts of Forms, Sessions and Cookies. The last module sums up all the concepts covered in the course with sample application. Click to View Course Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQL Training, T SQL, Technology

Read the article
Big Data – Interacting with Hadoop – What is Sqoop? – What is Zookeeper? – Day 17 of 21

- by Pinal Dave

In yesterday’s blog post we learned the importance of the Pig and Pig Latin in Big Data Story. In this article we will understand what is Sqoop and Zookeeper in Big Data Story. There are two most important components one should learn when learning about interacting with Hadoop – Sqoop and Zookper. What is Sqoop? Most of the business stores their data in RDBMS as well as other data warehouse solutions. They need a way to move data to the Hadoop system to do various processing and return it back to RDBMS from Hadoop system. The data movement can happen in real time or at various intervals in bulk. We need a tool which can help us move this data from SQL to Hadoop and from Hadoop to SQL. Sqoop (SQL to Hadoop) is such a tool which extract data from non-Hadoop data sources and transform them into the format which Hadoop can use it and later it loads them into HDFS. Essentially it is ETL tool where it Extracts, Transform and Load from SQL to Hadoop. The best part is that it also does extract data from Hadoop and loads them to Non-SQL (or RDBMS) data stores. Essentially, Sqoop is a command line tool which does SQL to Hadoop and Hadoop to SQL. It is a command line interpreter. It creates MapReduce job behinds the scene to import data from an external database to HDFS. It is very effective and easy to learn tool for nonprogrammers. What is Zookeeper? ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. In other words Zookeeper is a replicated synchronization service with eventual consistency. In simpler words – in Hadoop cluster there are many different nodes and one node is master. Let us assume that master node fails due to any reason. In this case, the role of the master node has to be transferred to a different node. The main role of the master node is managing the writers as that task requires persistence in order of writing. In this kind of scenario Zookeeper will assign new master node and make sure that Hadoop cluster performs without any glitch. Zookeeper is the Hadoop’s method of coordinating all the elements of these distributed systems. Here are few of the tasks which Zookeepr is responsible for. Zookeeper manages the entire workflow of starting and stopping various nodes in the Hadoop’s cluster. In Hadoop cluster when any processes need certain configuration to complete the task. Zookeeper makes sure that certain node gets necessary configuration consistently. In case of the master node fails, Zookeepr can assign new master node and make sure cluster works as expected. There many other tasks Zookeeper performance when it is about Hadoop cluster and communication. Basically without the help of Zookeeper it is not possible to design any new fault tolerant distributed application. Tomorrow In tomorrow’s blog post we will discuss about very important components of the Big Data Ecosystem – Big Data Analytics. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
SQL Authority News – FalafelCON 2014: 2 days with the Best Developers in the World

- by Pinal Dave

I love presenting at various forums on various technologies. I am extremely excited that I got invited to speak at Falafel Conference 2014 in San Francisco. I will present two technology sessions on SQL Server. If you are into web development or if you just want to attend a conference with the best of the industry speakers, this may be the right conference for you. What set apart this conference from other conference is technology presented as well as speakers. Usually one has to attend very expensive and high scale event when they have to hear good speakers. At this conference, you will find quite a many industry legends are available to present on the bleeding edge technology. Here are few of the reasons why I believe you should attend this conference: Choose from four tracks covering Web, Mobile development and testing, Sitefinity, and Automated Testing, or attend sessions from all four! Learn from the best developers and testers in the business in an intimate setting. Surround yourself with your peers and the opportunity to network Learn about the latest platforms and technologies including Kendo UI, AngularJS, ASP.NET MVC, WebAPI, and more! Here are the details for the sessions which I am going to present at Falafel Conference. Secrets of SQL Server: Database Worst Practices Abstract: Chances are you have heard, or even uttered, this expression. This demo-oriented session will show many examples where database professionals were dumbfounded by their own mistakes, and could even bring back memories of your own early DBA days. The goal of this session is to expose the small details that can be dangerous to the production environment and SQL Server as a whole, as well as talk about worst practices and how to avoid them. Shedding light on some of these perils and the tricks to avoid them may even save your current job. After attending this session, Developers will only need 60 seconds to improve performance of their database server in their SharePoint implementation. We will have a quiz during the session to keep the conversation alive. Developers will walk out with scripts and knowledge that can be applied to their servers, immediately post the session. Additionally, all attendees of the session will have access to learning material presented in the session. The Unsung Hero Abstract: Slow Running Queries are the most common problem that developers face while working with SQL Server. While it is easy to blame the SQL Server for unsatisfactory performance, however the issue often persists with the way queries have been written, and how Indexes has been set up. The session will focus on the ways of identifying problems that slow down SQL Server, and Indexing tricks to fix them. Developers will walk out with scripts and knowledge that can be applied to their servers, immediately post the session. Register Now! I have learned from the Falafel Team that they are running out of tickets and soon they will close the registration. For next 10 days the price for the registration is only USD 149. Trust me, you can’t get such a world class training and networking opportunity at such a low price. Click to Register Here! Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQLAuthority News, T SQL

Read the article
SQLAuthority News – Advantages of Distance Learning

- by Pinal Dave

Distance education is extremely popular – almost overnight, it seems. Almost everyone has taken an online course, or knows someone who has, or is considering joining an online school. There are many advantages and disadvantages to attending an online school – but the same can be said of attending a physical school! Let’s take a look at the top reasons to use distance education. 1) Flexibility. Physical universities are usually willing to make some concessions to student – like night classes, study hours, and online networks. However, nothing is going to beat the flexibility of distance education. You can attend classes and take notes anytime, anywhere, wearing anything you’d like! 2) Affordability. We don’t need to get into hard numbers to understand how an expensive university can be. Students are taking on more and more debt just to get an education. Many of these fees pay for room, board, and facilities. Distance education cuts out all these costs, and makes attending school much more affordable for the average student. 3) Try before you buy. Did you know that the average college student changes his or her major 10 times before they graduate? You can imagine that this kind of indecision plays a huge part in WHEN you graduate – not being able to make up your mind can cost you big bucks if you have to stay in school for extra years! Distance education allows you to take different classes from a wide range of disciplines. Do you want to study forensic science or English literature? Now you don’t have to pay for classes you can’t afford just to find out. 4) Pace yourself. Some students struggle in a traditional classroom setting – classes can be taught too fast, too slow, or there are too many distractions. Distance education allows mature students to set the pace themselves. They can rewatch lectures they didn’t catch the first time, or go through classes quickly if they are already familiar with the material – cutting out the chance of burning out or getting bored. 5) Lifelong learning. Maybe you already have a degree, but would like to learn more about your field, or a related field, or maybe even about something completely unrelated – just because you are curious! Distance education allows you to learn whatever you want ,whenever you want (and yes, wearing anything you’d like!). 6) Attend whatever college you want. Because of the popularity of distance education, physical campuses are getting in on the game by offering online courses – often just uploaded versions of classes already taught at their campus. Ever wanted to attend Harvard, but knew you couldn’t get in? Take a class online! Of course, you probably should not attempt to lie and say you have a Harvard degree, but Ivy League colleges are prestigious because they are the best in their field – take advantage of the best by taking an online course! I am a big believer in continuing education, whether it is online courses, returning to school, or even take informal classes online. Distance education can be a great way to accomplish these goals and become a lifelong learner. My friends at provides training through virtual classrooms for students who want to avoid travelling. Distance learning course allows IT aspirants to connect with trainers using the internet. I encourage everyone to check it out! Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQL Training, T SQL, Technology

Read the article
SQL Authority News – Secret Tool Box of Successful Bloggers: 52 Tips to Build a High Traffic Top Ranking Blog

- by Pinal Dave

When I started this blog, it was meant as a bookmark for myself for helpful tips and tricks. Gradually, it grew into a blog that others were reading and commenting on. While SQL and databases are my first love and the reason I started this blog, the side effect was that I discovered I loved writing. I discovered a secret goal I didn’t even know I wanted – I wanted to become an author. For a long time, writing this blog satisfied that urge. Gradually, though, I wanted to see my name in print. 12th Book Over the past few years I have authored and co-authored a number of books – they are all based on my knowledge of SQL Server, and were meant to spread my years of experience into the world, to share what I have learned with my community. I currently have elevan of these “manuals” available for sale. As exciting as it was to see my name in print, I still felt that there was more I could do as an author. That is when I realized that I am more than just a SQL expert. I have been writing this blog now for more than 10 years, and it grew from a personal bookmark to a thriving website with over 2 million views per month. I thought to myself “I could write a book about how to create a successful blog!” And that is exactly what I did. I am extremely excited to share with all of you my new book – “Secret Toolbox of Successful Bloggers.” A Labor of Love This project has been a labor of love for me. It started out as a series for this blog – I would post one article a week until I felt the topic had been covered. I found that as I wrote, new topics kept popping up in my mind, and eventually this small blog series grew into a full book. The blog series was large enough to last a whole year, so I definitely thought that it could be a full book. Ideas on how to become a successful blogger were so frequent that, I will admit, I feel like there is so much I left out of this book. I had a lot more to say than I originally thought! I am so excited to be sharing this book with all of you. I am so passionate about this topic, and I feel like there are so many people who can benefit from this book. I know that when I started this blog, I did not know what I was doing, and I would have loved a “helping hand” to tell what to do and what not to do. If this book can act that way to any of my readers, I feel it is a success. Rules of Thumb If you are interested in the topic of becoming a blogger, as you read this book, keep in mind that it is suggestions only. Blogging is so new to the world that while there are “rules of thumb” about what to do and what not to do, a map of steps (“first, do x, then do y”) is not going to work for every single blogger. This book is meant to encourage new bloggers to put their content out there in the world, to be brave and create a community like the one I have here at SQL Authority. I have gained so much from this community, I wanted to give something back, and this book is just one small part. I hope that everyone who reads this books finds at least one helpful tip, and that everyone can experience the joy of blogging. That is the whole reason I wrote this book, and what I hope everyone takes away from it. Where Can You Get It? You can get the book from following URL: Kindle eBook | Print Book Reference: Pinal Dave (http://blog.SQLAuthority.com)Filed under: About Me, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQLAuthority Author Visit, T SQL

Read the article
SQL SERVER – How to Get SQL Server Restart Notification?

- by Pinal Dave

Few days back my friend called me to know if there is any tool which can be used to get restart notification about SQL in their environment. I told that SQL Server can do it by itself with some configurations. He was happy and surprised to know that he need not spend any extra money. In SQL Server, we can configure stored procedure(s) to run at start-up of SQL Server. This blog would give steps to achieve how to achieve it. There are many situations where this feature can be used. Below are few. Logging SQL Server startup timings Modify data in some table during startup (i.e. table in tempdb) Sending notification about SQL start. Step 1 – Enable ‘scan for startup procs’ This can be done either using T-SQL or User Interface of Management Studio. EXEC sys.sp_configure N'Show Advanced Options', N'1' GO RECONFIGURE WITH OVERRIDE GO EXEC sys.sp_configure N'scan for startup procs', N'1' GO RECONFIGURE WITH OVERRIDE GO Below is the interface to change the setting. We need to go to “Server” > “Properties” and use “Advanced” tab. “Scan for Startup Procs” is the parameter under “Miscellaneous” section as shown below. We need to make value as “True” and hit OK. Step 2 – Create stored procedure It’s important to note that the procedure is executed after recovery is finished for ALL databases. Here is a sample stored procedure. You can use your own logic in the procedure. CREATE PROCEDURE SQLStartupProc AS BEGIN CREATE TABLE ##ThisTableShouldAlwaysExists (AnyColumn INT) END Step 3 – Set Procedure to run at startup We need to use sp_procoption to mark the procedure to run at startup. Here is the code to let SQL know that this is startup proc. sp_procoption 'SQLStartupProc', 'startup', 'true' This can be used only for procedures in master database. Msg 15398, Level 11, State 1, Procedure sp_procoption, Line 89 Only objects in the master database owned by dbo can have the startup setting changed. We also need to remember that such procedure should not have any input/output parameter. Here is the error which would be raised. Msg 15399, Level 11, State 1, Procedure sp_procoption, Line 107 Could not change startup option because this option is restricted to objects that have no parameters. Verification Here is the query to find which procedures is marked as startup procedures. SELECT name FROM sys.objects WHERE OBJECTPROPERTY(OBJECT_ID, 'ExecIsStartup') = 1 Once this is done, I have restarted SQL instance and here is what we would see in SQL ERRORLOG Launched startup procedure 'SQLStartupProc'. This confirms that stored procedure is executed. You can also notice that this is done after all databases are recovered. Recovery is complete. This is an informational message only. No user action is required. After few days my friend again called me and asked – I want to turn this OFF? Use comments section and post the answer for him. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQL Utility, T SQL

Read the article
Big Data – ClustrixDB – Extreme Scale SQL Database with Real-time Analytics, Releases Software Download – NewSQL

- by Pinal Dave

There are so many things to learn and there is so little time we all have. As we have little time we need to be selective to learn whatever we learn. I believe I know quite a lot of things in SQL but I still do not know what is around SQL. I have started to learn about NewSQL recently. If you wonder what is NewSQL I encourage all of you to read my blog post about NewSQL over here Big Data – Buzz Words: What is NewSQL – Day 10 of 21. NewSQL databases are quickly becoming popular – providing the scale of NoSQL with the SQL features and transactions. As a part of learning NewSQL database, I have recently started to learn about ClustrixDB. ClustrixDB has been the most mature NewSQL database used by some of the largest internet sites in the world for over 3 years, with extensive SQL support. In addition to scale, it provides fast real-time analytics by bringing massively parallel processing (MPP), available only in warehousing databases, to the transactional database. The reason I am more intrigued about learning ClustrixDB is their recent announcement on Oct 31. ClustrixDB was only available as an appliance, but now with their software release on Oct 31, everyone can use it. It is now available as forever free for up to 12 cores with community support, and there is a 45 day trial for unlimited cluster sizes. With the forever free world, I am indeed interested in ClustrixDB now. I know that few of the leading eCommerce sites in the world uses them for their transactional database. Here are few of the details I have quickly noted for ClustrixDB. ClustrixDB allows user to: Scale by simply adding nodes to the cluster with a single command Run billions of transactions a day Run fast real-time analytics Achieve high-availability with recovery from node failure Manages itself Easily migrate from MySQL as it is nearly plug-and-play compatible, use MySQL drivers, tools and replication. While I was going through the documentation I realized that ClustrixDB also has extensive support for SQL features including complex queries involving joins on a dozen or more tables, aggregates, sorts, sub-queries. It also supports stored procedures, triggers, foreign keys, partitioned and temporary tables, and fully online schema changes. It is indeed a very matured product and SQL solution. Indeed Clusterix sound very promising solution, I decided to dig a bit deeper to understand who are current customers of the Clustrix as they exist in the industry for quite a few years. Their client list is indeed very interesting and here is my quick research about them. Twoo.com – Europe’s largest social discovery (dating) site runs 4.4 Billion Transactions a day with table sizes over a Terabyte, on a 168 core cluster. EngageBDR – Top 3 in the online advertising category uses ClustrixDB to serve 6.9 billion ads a day through real-time bidding platform. Their reports went from 4 hours to 15 seconds. NoMoreRack – Top 2 fastest growing e-commerce company in US used ClustrixDB for high availability and fast growth through Amazon cloud. MakeMyTrip – India’s leading travel site runs on ClustrixDB with two clusters running as multi-master in Chennai and Bangalore. Many enterprises such as AOL, CSC, Rakuten, Symantec use ClustrixDB when their applications need scale. I must accept that I am impressed with the information I have learned so far and now is the time to do some hand’s on experience with their product. I want to learn this technology so in future when it is about NewSQL, I know what I am talking about. Read more why Clustrix explains why you ClustrixDB might be the right database for you. Download ClustrixDB with me today and install it on your machine so in future when we discuss the technical aspects of it, we all are on the same page. The software can be downloaded here. Reference : Pinal Dave (http://blog.SQLAuthority.com)Filed under: Big Data, MySQL, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: Clustrix

Read the article
Developer’s Life – Summary of Superhero Articles

- by Pinal Dave

Earlier this year, I wrote an article series where I talked about developer’s life and compared it with Superhero. I have got amazing response to this series and I have been receiving quite a lots of email suggesting that I should write more blog post about them. Currently I am not planning to write more blog post but I will soon continue another series. In this blog post, I have summarized the entire series. Let me know if you want me to write about any superhero. I will see what I can do about that hero. Developer’s Life – Every Developer is a Captain America Captain America was first created as a comic book character in the 1940’s as a way to boost morale during World War II. Aimed at a children’s audience, his legacy faded away when the war ended. However, he has recently has a major reboot to become a popular movie character that deals with modern issues. Developer’s Life – Every Developer is the Incredible Hulk The Incredible Hulk is possibly one of the scariest superheroes out there. All superheroes are meant to be “out of this world” and awe-inspiring, but I think most people will agree with I say The Hulk takes this to the next level. He is the result of an industrial accident, which is scary enough in it’s own right. Plus, when mild-mannered Bruce Banner is angered, he goes completely out-of-control and transforms into a destructive monster that he cannot control and has no memories of. Developer’s Life – Every Developer is a Wonder Woman We have focused a lot lately on this “superhero series.” I love fantasy books and movies, and I feel like there is a lot to be learned from them. As I am writing this series, though, I have noticed that every super hero I write about is a man. So today, I would like to talk about the major female super hero – Wonder Woman. Developer’s Life – Every Developer is a Harry Potter Harry Potter might not be a superhero in the traditional sense, but I believe he still has a lot to teach us and show us about life as a developer. If you have been living under a rock for the last 17 years, you might not know that Harry Potter is the main character in an extremely popular series of books and movies documenting the education and tribulation of a young wizard (and his friends). Developer’s Life – Every Developer is Like Transformers Transformers may not be superheroes – they don’t wear capes, they don’t have amazing powers outside of their size and folding ability, they’re not even human (technically). Part of their enduring popularity is that while we are enjoying over-the-top movies, we are learning about good leadership and strong personal skills. Developer’s Life – Every Developer is a Iron Man Iron Man is another superhero who is not naturally “super,” but relies on his brain (and money) to turn him into a fighting machine. While traditional superheroes are still popular, a three-movie franchise and incorporation into the new Avengers series shows that Iron Man is popular enough on his own. Developer’s Life – Every Developer is a Sherlock Holmes I have been thinking a lot about how developers are like super heroes, and I have written two blog posts now comparing them to Spiderman and Superman. I have a lot of love and respect for developers, and I hope that they are enjoying these articles, and others are learning a little bit about the profession. There is another fictional character who, while not technically asuper hero, is very powerful, and I also think stands as a good example of a developer. That character is Sherlock Holmes. Sherlock Holmes is a British detective, first made popular at the turn of the 19thcentury by author Sir Arthur Conan Doyle. The original Sherlock Holmes was a brilliant detective who could solve the most mind-boggling crime through simple observations and deduction. Developer’s Life – Every Developer is a Chhota Bheem Chhota Bheem is a cartoon character that is extremely popular where I live. He is my daughter’s favorite characters. I like to say that children love Chhota Bheem more than their parents – it is lucky for us he is not real! Children love Chhota Bheem because he is the absolute “good guy.” He is smart, loyal, and strong. He and his friends live in Dholakpur and fight off their many enemies – and always win – in every episode. In each episode, they learn something about friendship, bravery, and being kind to others. Chhota Bheem is a good role model for children, and I think that he is a good role model for developers are well. Developer’s Life – Every Developer is a Batman Batman is one of the darkest superheroes in the fantasy canon. He does not come to his powers through any sort of magical coincidence or radioactive insect, but through a lot of psychological scarring caused by witnessing the death of his parents. Despite his dark back story, he possesses a lot of admirable abilities that I feel bear comparison to developers. Developer’s Life – Every Developer is a Superman I enjoyed comparing developers to Spiderman so much, that I have decided to continue the trend and encourage some of my favorite people (developers) with another favorite superhero – Superman. Superman is probably the most famous superhero – and one of the most inspiring. Developer’s Life – Every Developer is a Spiderman I have to admit, Spiderman is my favorite superhero. The most recent movie recently was released in theaters, so it has been at the front of my mind for some time. Spiderman was my favorite superhero even before the latest movie came out, but of course I took my whole family to see the movie as soon as I could! Every one of us loved it, including my daughter. We all left the movie thinking how great it would be to be Spiderman. So, with that in mind, I started thinking about how we are like Spiderman in our everyday lives, especially developers. I would like to know which Superhero is your favorite hero! Reference: Pinal Dave (http://blog.SQLAuthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: Developer, Superhero

Read the article
SQLAuthority News – Don’t Be Afraid To Fool The World – Video by John Sonmez

- by Pinal Dave

Sometime some words and statements grabs your attention and it is hard to stop thinking about that after a while. Something similar happened a few days ago when I read the twitter statement of my friend and Pluralsight author John Sonmez. He twitted few days ago very interesting statement. “I don’t know a single successful person, who doesn’t deep down think that have the world fooled. #fooltheworld” by John Sonmez. When I read it, I was extremely intrigued by this statement. I read it many times, I shared with my family and I just could not stop interpreting this statement. It was indeed fun to read it again and again and there are so many different meanings one can take away from the statement. I know John very well, he is a wonderful person and have very positive energy for the life. I just had to request him to build a video around it. Right after 5 days of my request, John created a wonderful video around this subject. I watched it multiple times as it was a wonderful video. I am not going to write about what was in the video much as I suggest you to watch the video itself. Here is one of the personal stories I want to share which is absolutely relevant to this video. I think my story 100% resonant the story of John. A Real Story from My Past Three years ago, I submitted a session in one of the SharePoint conference as a SQL Server session. My session was accepted and I prepared it very well. I put more than 2 month’s time to prepare for the session and I was very excited to present the session. I reached to the event place traveling thousands of the miles and I was very much excited to present the session. However, there was a little mixed up in the session. There were multiple session which were similar to my session title. One of the other speakers also had proposed a database related session and was selected. When the material went to print the printing team got confused and by mistake swapped the sessions. The other speaker got Performance with SQL Server session and I had received Performance with SharePoint session. IT was indeed a big mixed up but now that is how it was in the event guide and it was marketed the same way everything in the event. A Big Mix Up I had to talk with the event organizer and we come to the conclusion that we all had good intention but things just got mixed up and now was the time when “The show must go on“. I had a great amount of hesitation to go and present the session as I had personally never worked with Sharepoint so close in my life and my session abstracted talked about SharePoint tricks in depth. Two hours before the session I took the help of one of my friend and installed the SharePoint on my box. He showed me a few things here and there but it was never a good enough time to learn everything which I wanted to learn. The Moments of Confidence I was very scared and nervous to go on the stage as a SharePoint was not something I felt comfortable. However, I decided to go on stage with confidence as a SharePoint expert. Though I did not know SharePoint at the best, I had confidence that whatever I know is correct and I will not misguide people. I had no intention to fool people but I had no intention to accept that I am a fool and you all wasted your time and money to dedicate your time to attend my session. I decided to be honest but at the same time decided to take the session beyond my expertise. The sixty minutes of the session went very fine and I was able to manage all the difficult question at a satisfactory level. When the session was over my feeling was that I would have not presented or talked any different if I had more knowledge of the SharePoint at that time. I think it was one of my best sessions and it was reflected in the session feedback as well. I was the best speaker across all the track and my session had highest ranking. I was delighted and I learned a very valuable lesson. I must go beyond my limits and knowledge. I must aim higher and work harder. I should not lie but I should have confidence that I have a good heart and I put 100% in my efforts. Lessions Learned Since this incident I have learned a lot about SharePoint and I am now a regular speaker at various SharePoint conferences along with SQL Server sessions. I am motivated and I am not afraid. I know people have lots of expectation from me but I have learned not to judge myself before I do my best. I leave the judgement of my efforts to my audience. I do not take the burden of the feedback on me, even though I know my audience have expected from me. I know what I know and I put my best. I must go out, if I fail, I learn from my mistake but I must keep my progress trajectory very high. As John said in the video, sometime success is not something we can achieve 100% but we can keep on going near to it. As long as we do not lose our focus from our goal and do not deviate from our progress path, we are doing things right. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: About Me, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
SQL SERVER – SSMS: Memory Usage By Memory Optimized Objects Report

- by Pinal Dave

At conferences and at speaking engagements at the local UG, there is one question that keeps on coming which I wish were never asked. The question around, “Why is SQL Server using up all the memory and not releasing even when idle?” Well, the answer can be long and with the release of SQL Server 2014, this got even more complicated. This release of SQL Server 2014 has the option of introducing In-Memory OLTP which is completely new concept and our dependency on memory has increased multifold. In reality, nothing much changes but we have memory optimized objects (Tables and Stored Procedures) additional which are residing completely in memory and improving performance. As a DBA, it is humanly impossible to get a hang of all the innovations and the new features introduced in the next version. So today’s blog is around the report added to SSMS which gives a high level view of this new feature addition. This reports is available only from SQL Server 2014 onwards because the feature was introduced in SQL Server 2014. Earlier versions of SQL Server Management Studio would not show the report in the list. If we try to launch the report on the database which is not having In-Memory File group defined, then we would see the message in report. To demonstrate, I have created new fresh database called MemoryOptimizedDB with no special file group. Here is the query used to identify whether a database has memory-optimized file group or not. SELECT TOP(1) 1 FROM sys.filegroups FG WHERE FG.[type] = 'FX' Once we add filegroup using below command, we would see different version of report. USE [master] GO ALTER DATABASE [MemoryOptimizedDB] ADD FILEGROUP [IMO_FG] CONTAINS MEMORY_OPTIMIZED_DATA GO The report is still empty because we have not defined any Memory Optimized table in the database. Total allocated size is shown as 0 MB. Now, let’s add the folder location into the filegroup and also created few in-memory tables. We have used the nomenclature of IMO to denote “InMemory Optimized” objects. USE [master] GO ALTER DATABASE [MemoryOptimizedDB] ADD FILE ( NAME = N'MemoryOptimizedDB_IMO', FILENAME = N'E:\Program Files\Microsoft SQL Server\MSSQL12.SQL2014\MSSQL\DATA\MemoryOptimizedDB_IMO') TO FILEGROUP [IMO_FG] GO You may have to change the path based on your SQL Server configuration. Below is the script to create the table. USE MemoryOptimizedDB GO --Drop table if it already exists. IF OBJECT_ID('dbo.SQLAuthority','U') IS NOT NULL DROP TABLE dbo.SQLAuthority GO CREATE TABLE dbo.SQLAuthority ( ID INT IDENTITY NOT NULL, Name CHAR(500) COLLATE Latin1_General_100_BIN2 NOT NULL DEFAULT 'Pinal', CONSTRAINT PK_SQLAuthority_ID PRIMARY KEY NONCLUSTERED (ID), INDEX hash_index_sample_memoryoptimizedtable_c2 HASH (Name) WITH (BUCKET_COUNT = 131072) ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA) GO As soon as above script is executed, table and index both are created. If we run the report again, we would see something like below. Notice that table memory is zero but index is using memory. This is due to the fact that hash index needs memory to manage the buckets created. So even if table is empty, index would consume memory. More about the internals of how In-Memory indexes and tables work will be reserved for future posts. Now, use below script to populate the table with 10000 rows INSERT INTO SQLAuthority VALUES (DEFAULT) GO 10000 Here is the same report after inserting 1000 rows into our InMemory table. There are total three sections in the whole report. Total Memory consumed by In-Memory Objects Pie chart showing memory distribution based on type of consumer – table, index and system. Details of memory usage by each table. The information about all three is taken from one single DMV, sys.dm_db_xtp_table_memory_stats This DMV contains memory usage statistics for both user and system In-Memory tables. If we query the DMV and look at data, we can easily notice that the system tables have negative object IDs. So, to look at user table memory usage, below is the over-simplified version of query. USE MemoryOptimizedDB GO SELECT OBJECT_NAME(OBJECT_ID), * FROM sys.dm_db_xtp_table_memory_stats WHERE OBJECT_ID > 0 GO This report would help DBA to identify which in-memory object taking lot of memory which can be used as a pointer for designing solution. I am sure in future we will discuss at lengths the whole concept of In-Memory tables in detail over this blog. To read more about In-Memory OLTP, have a look at In-Memory OLTP Series at Balmukund’s Blog. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Server Management Studio, SQL Tips and Tricks, T SQL Tagged: SQL Memory, SQL Reports

Read the article
Big Data – Buzz Words: Importance of Relational Database in Big Data World – Day 9 of 21

- by Pinal Dave

In yesterday’s blog post we learned what is HDFS. In this article we will take a quick look at the importance of the Relational Database in Big Data world. A Big Question? Here are a few questions I often received since the beginning of the Big Data Series - Does the relational database have no space in the story of the Big Data? Does relational database is no longer relevant as Big Data is evolving? Is relational database not capable to handle Big Data? Is it true that one no longer has to learn about relational data if Big Data is the final destination? Well, every single time when I hear that one person wants to learn about Big Data and is no longer interested in learning about relational database, I find it as a bit far stretched. I am not here to give ambiguous answers of It Depends. I am personally very clear that one who is aspiring to become Big Data Scientist or Big Data Expert they should learn about relational database. NoSQL Movement The reason for the NoSQL Movement in recent time was because of the two important advantages of the NoSQL databases. Performance Flexible Schema In personal experience I have found that when I use NoSQL I have found both of the above listed advantages when I use NoSQL database. There are instances when I found relational database too much restrictive when my data is unstructured as well as they have in the datatype which my Relational Database does not support. It is the same case when I have found that NoSQL solution performing much better than relational databases. I must say that I am a big fan of NoSQL solutions in the recent times but I have also seen occasions and situations where relational database is still perfect fit even though the database is growing increasingly as well have all the symptoms of the big data. Situations in Relational Database Outperforms Adhoc reporting is the one of the most common scenarios where NoSQL is does not have optimal solution. For example reporting queries often needs to aggregate based on the columns which are not indexed as well are built while the report is running, in this kind of scenario NoSQL databases (document database stores, distributed key value stores) database often does not perform well. In the case of the ad-hoc reporting I have often found it is much easier to work with relational databases. SQL is the most popular computer language of all the time. I have been using it for almost over 10 years and I feel that I will be using it for a long time in future. There are plenty of the tools, connectors and awareness of the SQL language in the industry. Pretty much every programming language has a written drivers for the SQL language and most of the developers have learned this language during their school/college time. In many cases, writing query based on SQL is much easier than writing queries in NoSQL supported languages. I believe this is the current situation but in the future this situation can reverse when No SQL query languages are equally popular. ACID (Atomicity Consistency Isolation Durability) – Not all the NoSQL solutions offers ACID compliant language. There are always situations (for example banking transactions, eCommerce shopping carts etc.) where if there is no ACID the operations can be invalid as well database integrity can be at risk. Even though the data volume indeed qualify as a Big Data there are always operations in the application which absolutely needs ACID compliance matured language. The Mixed Bag I have often heard argument that all the big social media sites now a days have moved away from Relational Database. Actually this is not entirely true. While researching about Big Data and Relational Database, I have found that many of the popular social media sites uses Big Data solutions along with Relational Database. Many are using relational databases to deliver the results to end user on the run time and many still uses a relational database as their major backbone. Here are a few examples: Facebook uses MySQL to display the timeline. (Reference Link) Twitter uses MySQL. (Reference Link) Tumblr uses Sharded MySQL (Reference Link) Wikipedia uses MySQL for data storage. (Reference Link) There are many for prominent organizations which are running large scale applications uses relational database along with various Big Data frameworks to satisfy their various business needs. Summary I believe that RDBMS is like a vanilla ice cream. Everybody loves it and everybody has it. NoSQL and other solutions are like chocolate ice cream or custom ice cream – there is a huge base which loves them and wants them but not every ice cream maker can make it just right for everyone’s taste. No matter how fancy an ice cream store is there is always plain vanilla ice cream available there. Just like the same, there are always cases and situations in the Big Data’s story where traditional relational database is the part of the whole story. In the real world scenarios there will be always the case when there will be need of the relational database concepts and its ideology. It is extremely important to accept relational database as one of the key components of the Big Data instead of treating it as a substandard technology. Ray of Hope – NewSQL In this module we discussed that there are places where we need ACID compliance from our Big Data application and NoSQL will not support that out of box. There is a new termed coined for the application/tool which supports most of the properties of the traditional RDBMS and supports Big Data infrastructure – NewSQL. Tomorrow In tomorrow’s blog post we will discuss about NewSQL. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
SQL SERVER – SSMS: Backup and Restore Events Report

- by Pinal Dave

A DBA wears multiple hats and in fact does more than what an eye can see. One of the core task of a DBA is to take backups. This looks so trivial that most developers shrug this off as the only activity a DBA might be doing. I have huge respect for DBA’s all around the world because even if they seem cool with all the scripting, automation, maintenance works round the clock to keep the business working almost 365 days 24×7, their worth is knowing that one day when the systems / HDD crashes and you have an important delivery to make. So these backup tasks / maintenance jobs that have been done come handy and are no more trivial as they might seem to be as considered by many. So the important question like: “When was the last backup taken?”, “How much time did the last backup take?”, “What type of backup was taken last?” etc are tricky questions and this report lands answers to the same in a jiffy. So the SSMS report, we are talking can be used to find backups and restore operation done for the selected database. Whenever we perform any backup or restore operation, the information is stored in the msdb database. This report can utilize that information and provide information about the size, time taken and also the file location for those operations. Here is how this report can be launched. Once we launch this report, we can see 4 major sections shown as listed below. Average Time Taken For Backup Operations Successful Backup Operations Backup Operation Errors Successful Restore Operations Let us look at each section next. Average Time Taken For Backup Operations Information shown in “Average Time Taken For Backup Operations” section is taken from a backupset table in the msdb database. Here is the query and the expanded version of that particular section USE msdb; SELECT (ROW_NUMBER() OVER (ORDER BY t1.TYPE))%2 AS l1 , 1 AS l2 , 1 AS l3 , t1.TYPE AS [type] , (AVG(DATEDIFF(ss,backup_start_date, backup_finish_date)))/60.0 AS AverageBackupDuration FROM backupset t1 INNER JOIN sys.databases t3 ON ( t1.database_name = t3.name) WHERE t3.name = N'AdventureWorks2014' GROUP BY t1.TYPE ORDER BY t1.TYPE On my small database the time taken for differential backup was less than a minute, hence the value of zero is displayed. This is an important piece of backup operation which might help you in planning maintenance windows. Successful Backup Operations Here is the expanded version of this section. This information is derived from various backup tracking tables from msdb database. Here is the simplified version of the query which can be used separately as well. SELECT * FROM sys.databases t1 INNER JOIN backupset t3 ON (t3.database_name = t1.name) LEFT OUTER JOIN backupmediaset t5 ON ( t3.media_set_id = t5.media_set_id) LEFT OUTER JOIN backupmediafamily t6 ON ( t6.media_set_id = t5.media_set_id) WHERE (t1.name = N'AdventureWorks2014') ORDER BY backup_start_date DESC,t3.backup_set_id,t6.physical_device_name; The report does some calculations to show the data in a more readable format. For example, the backup size is shown in KB, MB or GB. I have expanded first row by clicking on (+) on “Device type” column. That has shown me the path of the physical backup file. Personally looking at this section, the Backup Size, Device Type and Backup Name are critical and are worth a note. As mentioned in the previous section, this section also has the Duration embedded inside it. Backup Operation Errors This section of the report gets data from default trace. You might wonder how. One of the event which is tracked by default trace is “ErrorLog”. This means that whatever message is written to errorlog gets written to default trace file as well. Interestingly, whenever there is a backup failure, an error message is written to ERRORLOG and hence default trace. This section takes advantage of that and shows the information. We can read below message under this section, which confirms above logic. No backup operations errors occurred for (AdventureWorks2014) database in the recent past or default trace is not enabled. Successful Restore Operations This section may not be very useful in production server (do you perform a restore of database?) but might be useful in the development and log shipping secondary environment, where we might be interested to see restore operations for a particular database. Here is the expanded version of the section. To fill this section of the report, I have restored the same backups which were taken to populate earlier sections. Here is the simplified version of the query used to populate this output. USE msdb; SELECT * FROM restorehistory t1 LEFT OUTER JOIN restorefile t2 ON ( t1.restore_history_id = t2.restore_history_id) LEFT OUTER JOIN backupset t3 ON ( t1.backup_set_id = t3.backup_set_id) WHERE t1.destination_database_name = N'AdventureWorks2014' ORDER BY restore_date DESC, t1.restore_history_id,t2.destination_phys_name Have you ever looked at the backup strategy of your key databases? Are they in sync and do we have scope for improvements? Then this is the report to analyze after a week or month of maintenance plans running in your database. Do chime in with what are the strategies you are using in your environments. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Backup and Restore, SQL Query, SQL Server, SQL Server Management Studio, SQL Tips and Tricks, T SQL Tagged: SQL Reports

Read the article
Big Data – Role of Cloud Computing in Big Data – Day 11 of 21

- by Pinal Dave

In yesterday’s blog post we learned the importance of the NewSQL. In this article we will understand the role of Cloud in Big Data Story What is Cloud? Cloud is the biggest buzzword around from last few years. Everyone knows about the Cloud and it is extremely well defined online. In this article we will discuss cloud in the context of the Big Data. Cloud computing is a method of providing a shared computing resources to the application which requires dynamic resources. These resources include applications, computing, storage, networking, development and various deployment platforms. The fundamentals of the cloud computing are that it shares pretty much share all the resources and deliver to end users as a service. Examples of the Cloud Computing and Big Data are Google and Amazon.com. Both have fantastic Big Data offering with the help of the cloud. We will discuss this later in this blog post. There are two different Cloud Deployment Models: 1) The Public Cloud and 2) The Private Cloud Public Cloud Public Cloud is the cloud infrastructure build by commercial providers (Amazon, Rackspace etc.) creates a highly scalable data center that hides the complex infrastructure from the consumer and provides various services. Private Cloud Private Cloud is the cloud infrastructure build by a single organization where they are managing highly scalable data center internally. Here is the quick comparison between Public Cloud and Private Cloud from Wikipedia: Public Cloud Private Cloud Initial cost Typically zero Typically high Running cost Unpredictable Unpredictable Customization Impossible Possible Privacy No (Host has access to the data Yes Single sign-on Impossible Possible Scaling up Easy while within defined limits Laborious but no limits Hybrid Cloud Hybrid Cloud is the cloud infrastructure build with the composition of two or more clouds like public and private cloud. Hybrid cloud gives best of the both the world as it combines multiple cloud deployment models together. Cloud and Big Data – Common Characteristics There are many characteristics of the Cloud Architecture and Cloud Computing which are also essentially important for Big Data as well. They highly overlap and at many places it just makes sense to use the power of both the architecture and build a highly scalable framework. Here is the list of all the characteristics of cloud computing important in Big Data Scalability Elasticity Ad-hoc Resource Pooling Low Cost to Setup Infastructure Pay on Use or Pay as you Go Highly Available Leading Big Data Cloud Providers There are many players in Big Data Cloud but we will list a few of the known players in this list. Amazon Amazon is arguably the most popular Infrastructure as a Service (IaaS) provider. The history of how Amazon started in this business is very interesting. They started out with a massive infrastructure to support their own business. Gradually they figured out that their own resources are underutilized most of the time. They decided to get the maximum out of the resources they have and hence they launched their Amazon Elastic Compute Cloud (Amazon EC2) service in 2006. Their products have evolved a lot recently and now it is one of their primary business besides their retail selling. Amazon also offers Big Data services understand Amazon Web Services. Here is the list of the included services: Amazon Elastic MapReduce – It processes very high volumes of data Amazon DynammoDB – It is fully managed NoSQL (Not Only SQL) database service Amazon Simple Storage Services (S3) – A web-scale service designed to store and accommodate any amount of data Amazon High Performance Computing – It provides low-tenancy tuned high performance computing cluster Amazon RedShift – It is petabyte scale data warehousing service Google Though Google is known for Search Engine, we all know that it is much more than that. Google Compute Engine – It offers secure, flexible computing from energy efficient data centers Google Big Query – It allows SQL-like queries to run against large datasets Google Prediction API – It is a cloud based machine learning tool Other Players Besides Amazon and Google we also have other players in the Big Data market as well. Microsoft is also attempting Big Data with the Cloud with Microsoft Azure. Additionally Rackspace and NASA together have initiated OpenStack. The goal of Openstack is to provide a massively scaled, multitenant cloud that can run on any hardware. Thing to Watch The cloud based solutions provides a great integration with the Big Data’s story as well it is very economical to implement as well. However, there are few things one should be very careful when deploying Big Data on cloud solutions. Here is a list of a few things to watch: Data Integrity Initial Cost Recurring Cost Performance Data Access Security Location Compliance Every company have different approaches to Big Data and have different rules and regulations. Based on various factors, one can implement their own custom Big Data solution on a cloud. Tomorrow In tomorrow’s blog post we will discuss about various Operational Databases supporting Big Data. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
SQL – Quick Start with Explorer Sections of NuoDB – Query NuoDB Database

- by Pinal Dave

This is the third post in the series of the blog posts I am writing about NuoDB. NuoDB is very innovative and easy-to-use product. I can clearly see how one can scale-out NuoDB with so much ease and confidence. In my very first blog post we discussed how we can install NuoDB (link), and in my second post I discussed how we can manage the NuoDB database transaction engines and storage managers with a few clicks (link). Note: You can Download NuoDB from here. In this post, we will learn how we can use the Explorer feature of NuoDB to do various SQL operations. NuoDB has a browser-based Explorer, which is very powerful and has many of the features any IDE would normally have. Let us see how it works in the following step-by-step tutorial. Let us go to the NuoDBNuoDB Console by typing the following URL in your browser: http://localhost:8080/ It will bring you to the QuickStart screen. Make sure that you have created the sample database. If you have not created sample database, click on Create Database and create it successfully. Now go to the NuoDB Explorer by clicking on the main tab, and it will ask you for your domain username and password. Enter the username as a domain and password as a bird. Alternatively you can also enter username as a quickstart and password as a quickstart. Once you enter the password you will be able to see the databases. In our example we have installed the Sample Database hence you will see the Test database in our Database Hierarchy screen. When you click on database it will ask for the database login. Note that Database Login is different from Domain login and you will have to enter your database login over here. In our case the database username is dba and password is goalie. Once you enter a valid username and password it will display your database. Further expand your database and you will notice various objects in your database. Once you explore various objects, select any database and click on Open. When you click on execute, it will display the SQL script to select the data from the table. The autogenerated script displays entire result set from the database. The NuoDB Explorer is very powerful and makes the life of developers very easy. If you click on List SQL Statements it will list all the available SQL statements right away in Query Editor. You can see the popup window in following image. Here is the cool thing for geeks. You can even click on Query Plan and it will display the text based query plan as well. In case of a SELECT, the query plan will be much simpler, however, when we write complex queries it will be very interesting. We can use the query plan tab for performance tuning of the database. Here is another feature, when we click on List Tables in NuoDB Explorer. It lists all the available tables in the query editor. This is very helpful when we are writing a long complex query. Here is a relatively complex example I have built using Inner Join syntax. Right below I have displayed the Query Plan. The query plan displays all the little details related to the query. Well, we just wrote multi-table query and executed it against the NuoDB database. You can use the NuoDB Admin section and do various analyses of the query and its performance. NuoDB is a distributed database built on a patented emergent architecture with full support for SQL and ACID guarantees. It allows you to add Transaction Engine processes to a running system to improve the performance of your system. You can also add a second Storage Engine to your running system for redundancy purposes. Conversely, you can shut down processes when you don’t need the extra database resources. NuoDB also provides developers and administrators with a single intuitive interface for centrally monitoring deployments. If you have read my blog posts and have not tried out NuoDB, I strongly suggest that you download it today and catch up with the learnings with me. Trust me though the product is very powerful, it is extremely easy to learn and use. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: NuoDB

Read the article
SQL SERVER – Example of Performance Tuning for Advanced Users with DB Optimizer

- by Pinal Dave

Performance tuning is such a subject that everyone wants to master it. In beginning everybody is at a novice level and spend lots of time learning how to master the art of performance tuning. However, as we progress further the tuning of the system keeps on getting very difficult. I have understood in my early career there should be no need of ego in the technology field. There are always better solutions and better ideas out there and we should not resist them. Instead of resisting the change and new wave I personally adopt it. Here is a similar example, as I personally progress to the master level of performance tuning, I face that it is getting harder to come up with optimal solutions. In such scenarios I rely on various tools to teach me how I can do things better. Once I learn about tools, I am often able to come up with better solutions when I face the similar situation next time. A few days ago I had received a query where the user wanted to tune it further to get the maximum out of the performance. I have re-written the similar query with the help of AdventureWorks sample database. SELECT * FROM HumanResources.Employee e INNER JOIN HumanResources.EmployeeDepartmentHistory edh ON e.BusinessEntityID = edh.BusinessEntityID INNER JOIN HumanResources.Shift s ON edh.ShiftID = s.ShiftID; User had similar query to above query was used in very critical report and wanted to get best out of the query. When I looked at the query – here were my initial thoughts Use only column in the select statements as much as you want in the application Let us look at the query pattern and data workload and find out the optimal index for it Before I give further solutions I was told by the user that they need all the columns from all the tables and creating index was not allowed in their system. He can only re-write queries or use hints to further tune this query. Now I was in the constraint box – I believe * was not a great idea but if they wanted all the columns, I believe we can’t do much besides using *. Additionally, if I cannot create a further index, I must come up with some creative way to write this query. I personally do not like to use hints in my application but there are cases when hints work out magically and gives optimal solutions. Finally, I decided to use Embarcadero’s DB Optimizer. It is a fantastic tool and very helpful when it is about performance tuning. I have previously explained how it works over here. First open DBOptimizer and open Tuning Job from File >> New >> Tuning Job. Once you open DBOptimizer Tuning Job follow the various steps indicates in the following diagram. Essentially we will take our original script and will paste that into Step 1: New SQL Text and right after that we will enable Step 2 for Generating Various cases, Step 3 for Detailed Analysis and Step 4 for Executing each generated case. Finally we will click on Analysis in Step 5 which will generate the report detailed analysis in the result pan. The detailed pan looks like. It generates various cases of T-SQL based on the original query. It applies various hints and available hints to the query and generate various execution plans of the query and displays them in the resultant. You can clearly notice that original query had a cost of 0.0841 and logical reads about 607 pages. Whereas various options which are just following it has different execution cost as well logical read. There are few cases where we have higher logical read and there are few cases where as we have very low logical read. If we pay attention the very next row to original query have Merge_Join_Query in description and have lowest execution cost value of 0.044 and have lowest Logical Reads of 29. This row contains the query which is the most optimal re-write of the original query. Let us double click over it. Here is the query: SELECT * FROM HumanResources.Employee e INNER JOIN HumanResources.EmployeeDepartmentHistory edh ON e.BusinessEntityID = edh.BusinessEntityID INNER JOIN HumanResources.Shift s ON edh.ShiftID = s.ShiftID OPTION (MERGE JOIN) If you notice above query have additional hint of Merge Join. With the help of this Merge Join query hint this query is now performing much better than before. The entire process takes less than 60 seconds. Please note that it the join hint Merge Join was optimal for this query but it is not necessary that the same hint will be helpful in all the queries. Additionally, if the workload or data pattern changes the query hint of merge join may be no more optimal join. In that case, we will have to redo the entire exercise once again. This is the reason I do not like to use hints in my queries and I discourage all of my users to use the same. However, if you look at this example, this is a great case where hints are optimizing the performance of the query. It is humanly not possible to test out various query hints and index options with the query to figure out which is the most optimal solution. Sometimes, we need to depend on the efficiency tools like DB Optimizer to guide us the way and select the best option from the suggestion provided. Let me know what you think of this article as well your experience with DB Optimizer. Please leave a comment. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Joins, SQL Optimization, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
SQL SERVER – Parsing SSIS Catalog Messages – Notes from the Field #030

- by Pinal Dave

[Note from Pinal]: This is a new episode of Notes from the Field series. SQL Server Integration Service (SSIS) is one of the most key essential part of the entire Business Intelligence (BI) story. It is a platform for data integration and workflow applications. The tool may also be used to automate maintenance of SQL Server databases and updates to multidimensional cube data. In this episode of the Notes from the Field series I requested SSIS Expert Andy Leonard to discuss one of the most interesting concepts of SSIS Catalog Messages. There are plenty of interesting and useful information captured in the SSIS catalog and we will learn together how to explore the same. The SSIS Catalog captures a lot of cool information by default. Here’s a query I use to parse messages from the catalog.operation_messages table in the SSISDB database, where the logged messages are stored. This query is set up to parse a default message transmitted by the Lookup Transformation. It’s one of my favorite messages in the SSIS log because it gives me excellent information when I’m tuning SSIS data flows. The message reads similar to: Data Flow Task:Information: The Lookup processed 4485 rows in the cache. The processing time was 0.015 seconds. The cache used 1376895 bytes of memory. The query: USE SSISDB GO DECLARE @MessageSourceType INT = 60 DECLARE @StartOfIDString VARCHAR(100) = 'The Lookup processed ' DECLARE @ProcessingTimeString VARCHAR(100) = 'The processing time was ' DECLARE @CacheUsedString VARCHAR(100) = 'The cache used ' DECLARE @StartOfIDSearchString VARCHAR(100) = '%' + @StartOfIDString + '%' DECLARE @ProcessingTimeSearchString VARCHAR(100) = '%' + @ProcessingTimeString + '%' DECLARE @CacheUsedSearchString VARCHAR(100) = '%' + @CacheUsedString + '%' SELECT operation_id , SUBSTRING(MESSAGE, (PATINDEX(@StartOfIDSearchString,MESSAGE) + LEN(@StartOfIDString) + 1), ((CHARINDEX(' ', MESSAGE, PATINDEX(@StartOfIDSearchString,MESSAGE) + LEN(@StartOfIDString) + 1)) - (PATINDEX(@StartOfIDSearchString, MESSAGE) + LEN(@StartOfIDString) + 1))) AS LookupRowsCount , SUBSTRING(MESSAGE, (PATINDEX(@ProcessingTimeSearchString,MESSAGE) + LEN(@ProcessingTimeString) + 1), ((CHARINDEX(' ', MESSAGE, PATINDEX(@ProcessingTimeSearchString,MESSAGE) + LEN(@ProcessingTimeString) + 1)) - (PATINDEX(@ProcessingTimeSearchString, MESSAGE) + LEN(@ProcessingTimeString) + 1))) AS LookupProcessingTime , CASE WHEN (CONVERT(numeric(3,3),SUBSTRING(MESSAGE, (PATINDEX(@ProcessingTimeSearchString,MESSAGE) + LEN(@ProcessingTimeString) + 1), ((CHARINDEX(' ', MESSAGE, PATINDEX(@ProcessingTimeSearchString,MESSAGE) + LEN(@ProcessingTimeString) + 1)) - (PATINDEX(@ProcessingTimeSearchString, MESSAGE) + LEN(@ProcessingTimeString) + 1))))) = 0 THEN 0 ELSE CONVERT(bigint,SUBSTRING(MESSAGE, (PATINDEX(@StartOfIDSearchString,MESSAGE) + LEN(@StartOfIDString) + 1), ((CHARINDEX(' ', MESSAGE, PATINDEX(@StartOfIDSearchString,MESSAGE) + LEN(@StartOfIDString) + 1)) - (PATINDEX(@StartOfIDSearchString, MESSAGE) + LEN(@StartOfIDString) + 1)))) / CONVERT(numeric(3,3),SUBSTRING(MESSAGE, (PATINDEX(@ProcessingTimeSearchString,MESSAGE) + LEN(@ProcessingTimeString) + 1), ((CHARINDEX(' ', MESSAGE, PATINDEX(@ProcessingTimeSearchString,MESSAGE) + LEN(@ProcessingTimeString) + 1)) - (PATINDEX(@ProcessingTimeSearchString, MESSAGE) + LEN(@ProcessingTimeString) + 1)))) END AS LookupRowsPerSecond , SUBSTRING(MESSAGE, (PATINDEX(@CacheUsedSearchString,MESSAGE) + LEN(@CacheUsedString) + 1), ((CHARINDEX(' ', MESSAGE, PATINDEX(@CacheUsedSearchString,MESSAGE) + LEN(@CacheUsedString) + 1)) - (PATINDEX(@CacheUsedSearchString, MESSAGE) + LEN(@CacheUsedString) + 1))) AS LookupBytesUsed ,CASE WHEN (CONVERT(bigint,SUBSTRING(MESSAGE, (PATINDEX(@StartOfIDSearchString,MESSAGE) + LEN(@StartOfIDString) + 1), ((CHARINDEX(' ', MESSAGE, PATINDEX(@StartOfIDSearchString,MESSAGE) + LEN(@StartOfIDString) + 1)) - (PATINDEX(@StartOfIDSearchString, MESSAGE) + LEN(@StartOfIDString) + 1)))))= 0 THEN 0 ELSE CONVERT(bigint,SUBSTRING(MESSAGE, (PATINDEX(@CacheUsedSearchString,MESSAGE) + LEN(@CacheUsedString) + 1), ((CHARINDEX(' ', MESSAGE, PATINDEX(@CacheUsedSearchString,MESSAGE) + LEN(@CacheUsedString) + 1)) - (PATINDEX(@CacheUsedSearchString, MESSAGE) + LEN(@CacheUsedString) + 1)))) / CONVERT(bigint,SUBSTRING(MESSAGE, (PATINDEX(@StartOfIDSearchString,MESSAGE) + LEN(@StartOfIDString) + 1), ((CHARINDEX(' ', MESSAGE, PATINDEX(@StartOfIDSearchString,MESSAGE) + LEN(@StartOfIDString) + 1)) - (PATINDEX(@StartOfIDSearchString, MESSAGE) + LEN(@StartOfIDString) + 1)))) END AS LookupBytesPerRow FROM [catalog].[operation_messages] WHERE message_source_type = @MessageSourceType AND MESSAGE LIKE @StartOfIDSearchString GO Note that you have to set some parameter values: @MessageSourceType [int] – represents the message source type value from the following results: Value Description 10 Entry APIs, such as T-SQL and CLR Stored procedures 20 External process used to run package (ISServerExec.exe) 30 Package-level objects 40 Control Flow tasks 50 Control Flow containers 60 Data Flow task 70 Custom execution message Note: Taken from Reza Rad’s (excellent!) helper.MessageSourceType table found here. @StartOfIDString [VarChar(100)] – use this to uniquely identify the message field value you wish to parse. In this case, the string ‘The Lookup processed ‘ identifies all the Lookup Transformation messages I desire to parse. @ProcessingTimeString [VarChar(100)] – this parameter is message-specific. I use this parameter to specifically search the message field value for the beginning of the Lookup Processing Time value. For this execution, I use the string ‘The processing time was ‘. @CacheUsedString [VarChar(100)] – this parameter is also message-specific. I use this parameter to specifically search the message field value for the beginning of the Lookup Cache Used value. It returns the memory used, in bytes. For this execution, I use the string ‘The cache used ‘. The other parameters are built from variations of the parameters listed above. The query parses the values into text. The string values are converted to numeric values for ratio calculations; LookupRowsPerSecond and LookupBytesPerRow. Since ratios involve division, CASE statements check for denominators that equal 0. Here are the results in an SSMS grid: This is not the only way to retrieve this information. And much of the code lends itself to conversion to functions. If there is interest, I will share the functions in an upcoming post. If you want to get started with SSIS with the help of experts, read more over at Fix Your SQL Server. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: Notes from the Field, PostADay, SQL, SQL Authority, SQL Backup and Restore, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: SSIS

Read the article
Big Data – What is Big Data – 3 Vs of Big Data – Volume, Velocity and Variety – Day 2 of 21

- by Pinal Dave

Data is forever. Think about it – it is indeed true. Are you using any application as it is which was built 10 years ago? Are you using any piece of hardware which was built 10 years ago? The answer is most certainly No. However, if I ask you – are you using any data which were captured 50 years ago, the answer is most certainly Yes. For example, look at the history of our nation. I am from India and we have documented history which goes back as over 1000s of year. Well, just look at our birthday data – atleast we are using it till today. Data never gets old and it is going to stay there forever. Application which interprets and analysis data got changed but the data remained in its purest format in most cases. As organizations have grown the data associated with them also grew exponentially and today there are lots of complexity to their data. Most of the big organizations have data in multiple applications and in different formats. The data is also spread out so much that it is hard to categorize with a single algorithm or logic. The mobile revolution which we are experimenting right now has completely changed how we capture the data and build intelligent systems. Big organizations are indeed facing challenges to keep all the data on a platform which give them a single consistent view of their data. This unique challenge to make sense of all the data coming in from different sources and deriving the useful actionable information out of is the revolution Big Data world is facing. Defining Big Data The 3Vs that define Big Data are Variety, Velocity and Volume. Volume We currently see the exponential growth in the data storage as the data is now more than text data. We can find data in the format of videos, musics and large images on our social media channels. It is very common to have Terabytes and Petabytes of the storage system for enterprises. As the database grows the applications and architecture built to support the data needs to be reevaluated quite often. Sometimes the same data is re-evaluated with multiple angles and even though the original data is the same the new found intelligence creates explosion of the data. The big volume indeed represents Big Data. Velocity The data growth and social media explosion have changed how we look at the data. There was a time when we used to believe that data of yesterday is recent. The matter of the fact newspapers is still following that logic. However, news channels and radios have changed how fast we receive the news. Today, people reply on social media to update them with the latest happening. On social media sometimes a few seconds old messages (a tweet, status updates etc.) is not something interests users. They often discard old messages and pay attention to recent updates. The data movement is now almost real time and the update window has reduced to fractions of the seconds. This high velocity data represent Big Data. Variety Data can be stored in multiple format. For example database, excel, csv, access or for the matter of the fact, it can be stored in a simple text file. Sometimes the data is not even in the traditional format as we assume, it may be in the form of video, SMS, pdf or something we might have not thought about it. It is the need of the organization to arrange it and make it meaningful. It will be easy to do so if we have data in the same format, however it is not the case most of the time. The real world have data in many different formats and that is the challenge we need to overcome with the Big Data. This variety of the data represent represent Big Data. Big Data in Simple Words Big Data is not just about lots of data, it is actually a concept providing an opportunity to find new insight into your existing data as well guidelines to capture and analysis your future data. It makes any business more agile and robust so it can adapt and overcome business challenges. Tomorrow In tomorrow’s blog post we will try to answer discuss Evolution of Big Data. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article

Search Results

Search found 2461 results on 99 pages for 'dave clausen'.

Page 20/99 | < Previous Page | 16 17 18 19 20 21 22 23 24 25 26 27 | Next Page >

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

< Previous Page | 16 17 18 19 20 21 22 23 24 25 26 27 | Next Page >