Search Results

Search found 6404 results on 257 pages for 'mad han ror tips'.

Page 62/257 | < Previous Page | 58 59 60 61 62 63 64 65 66 67 68 69 | Next Page >

SQLAuthority News – Pluralsight Course Review – Practices for Software Startups – Part 2 of 2

- by pinaldave

This is the second part of the two part series of Practices for Software Startup Pluralsight Course. Please read the first part of this series over here. The course is written by Stephen Forte (Blog | Twitter). Stephen Forte is the Chief Strategy Officer of the venture backed company, Telerik. Personal Learning Schedule After these three sessions it was 6:30 am and time to do my own blog. But for the rest of the day, I kept thinking about the course, and wanted to go back and finish. I was wishing that I had woken up at 3 am so I could finish all at one go. All day long I was digesting what I had learned. At 10 pm, after my daughter had gone to bed, I sighed on again. I was not disappointed by the long wait. As I mentioned before, Stephen has started four to six companies, and all of them are very successful today. Here is the video I promised yesterday – it discusses the importance of Right Sizing Your Startup. The Heartbeat of Startup – Technology Stephen has combined all technology knowledge into one 30 minute session. He discussed how to start your project, how to deal with opinions, and how to deal with multiple ideas – every start up has multiple directions it can go. He spent a lot of time emphasized deciding which direction to go and how to decide which will be the best for you. He called it a continuous development cycle. One of the biggest hazards for a start-up company is one person deciding the direction the company will go, until down the road another team member announces that there is a glitch in their part of the work and that everyone will have to start over. Even though a team of two or five people can move quickly, often the decision has gone too long and cannot be easily fixed. Stephen used an example from his own life: he was biased for one type of technology, and his teammate for another. In the end they opted for his teammate’s choice , and in the end it was a good decision, even though he was unfamiliar with that particular program. He argues that technology should not be a barrier to progress, that you cannot rely on your experience only. This really spoke to me because I am a big fan of SQL, but I know there is more out there, and I should be more open to it. I give my thanks to Stephen, I learned something in this module besides startups. Money, Success and Epic Win! The longest, but most interesting, the module was funding your start-up. You need to fund the start-up right at the very beginning, if not done right you will run into trouble. The good news is that a few years ago start-ups required a lot more money – think millions of dollars – but now start-ups can get off the ground for thousands. Stephen used an example of a company that years ago would have needed a million dollars, but today could be started for $600. It is true that things have changed, but you still need money. For $600 you can start small and add dynamically, as needed. But the truth is that if you have $600, $6000, or $6 million, it will be spent. Don’t think of it as trying to save money, think of it as investing in your future. You will need money, and you will need to (quickly) decide what you do with the money: shares, stakeholders, investing in a team, hiring a CEO. This is so important because once you have money and start the company, the company IS your money. It is your biggest currency – having a percentage of ownership in the company. Investors will want percentages as repayment for their investment, and they will want a say in the business as well. You will have to decide how far you will dilute your shares, and how the company will be divided, if at all. If you don’t plan in advance, you will find that after gaining three or four investors, suddenly you are the minority owner in your own dream. You need to understand funding carefully. This single module is worth all the money you would have spent on the whole course alone. I encourage everyone to listen to this single module even if they don’t watch any of the others. Press End to Start the Game – Exists! The final module is exit strategies. You did all this work, dealt with all political and legal issues. What are you going to get out of it? The answer is simple: money. Maybe you want your company to be bought out, for you talent to bring you a profit. You can sell the company to someone and still head it. Many options are available. You could sell and still work as an employee but no longer own the company. There are many exit strategies. This is where all your hard work comes into play. It is important not to feel fooled at any step. There are so many good ideas that end up in the garbage because of poor planning, so that if you find yourself successful, you don’t want to blow it at this step! The exit is important. I thought that this aspect of the course was completely unique, and I loved Stephen’s point of view. I was lost deep in thought after this module ended. I actually took two hours worth of notes on this section alone – and it was only a three hour course. I am planning on attending this course one more time next week, just to catch up on all the small bits of wisdom I’m sure I missed. Thank you Stephen for bringing your real world experience with us! I recommend that everyone attends this course, even if they don’t want to begin their own start-up company. It was indeed a long day for me. Do not forget to read part 1 of this story and attend course Practices for Software Startup Pluralsight Course. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Best Practices, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
SQL SERVER – Interview Questions and Answers – Frequently Asked Questions – Introduction – Day 1 of 31

- by pinaldave

List of all the Interview Questions and Answers Series blogs Posts covering interview questions and answers always make for interesting reading. Some people like the subject for their helpful hints and thought provoking subject, and others dislike these posts because they feel it is nothing more than cheating. I’d like to discuss the pros and cons of a Question and Answer format here. Interview Questions and Answers are Helpful Just like blog posts, books, and articles, interview Question and Answer discussions are learning material. The popular Dummy’s books or Idiots Guides are not only for “dummies,” but can help everyone relearn the fundamentals. Question and Answer discussions can serve the same purpose. You could call this SQL Server Fundamentals or SQL Server 101. I have administrated hundreds of interviews during my career and I have noticed that sometimes an interviewee with several years of experience lacks an understanding of the fundamentals. These individuals have been in the industry for so long, usually working on a very specific project, that the ABCs of the business have slipped their mind. Or, when a college graduate is looking to get into the industry, he is not expected to have experience since he is just graduated. However, the new grad is expected to have an understanding of fundamentals and theory. Sometimes after the stress of final exams and graduation, it can be difficult to remember the correct answers to interview questions, though. An interview Question and Answer discussion can be very helpful to both these individuals. It is simply a way to go back over the building blocks of a topic. Many times a simple review like this will help “jog” your memory, and all those previously-memorized facts will come flooding back to you. It is not a way to re-learn a topic, but a way to remind yourself of what you already know. A Question and Answer discussion can also be a way to go over old topics in a more interesting manner. Especially if you have been working in the industry, or taking lots of classes on the topic, everything you read can sound like a repeat of what you already know. Going over a topic in a new format can make the material seem fresh and interesting. And an interested mind will be more engaged and remember more in the end. Interview Questions and Answers are Harmful A common argument against a Question and Answer discussion is that it will give someone a “cheat sheet.” A new guy with relatively little experience can read the interview questions and answers, and then memorize them. When an interviewer asks him the same questions, he will repeat the answers and get the job. Honestly, is he good hire because he memorized the interview questions? Wouldn’t it be better for the interviewer to hire someone with actual experience? The answer is not as easy as it seems – there are many different factors to be considered. If the interviewer is asking fundamentals-related questions only, he gets the answers he wants to hear, and then hires this first candidate – there is a good chance that he is hiring based on personality rather than experience. If the interviewer is smart he will ask deeper questions, have more than one person on the interview team, and interview a variety of candidates. If one interviewee happens to memorize some answers, it usually doesn’t mean he will automatically get the job at the expense of more qualified candidates. Another argument against interview Question and Answers is that it will give candidates a false sense of confidence, and that they will appear more qualified than they are. Well, if that is true, it will not last after the first interview when the candidate is asked difficult questions and he cannot find the answers in the list of interview Questions and Answers. Besides, confidence is one of the best things to walk into an interview with! In today’s competitive job market, there are often hundreds of candidates applying for the same position. With so many applicants to choose from, interviewers must make decisions about who to call back and who to hire based on their gut feeling. One drawback to reading an interview Question and Answer article is that you might sound very boring in your interview – saying the same thing as every single candidate, and parroting answers that sound like someone else wrote them for you – because they did. However, it is definitely better to go to an interview prepared, just make sure that you give a lot of thought to your answers to make them sound like your own voice. Remember that you will be hired based on your skills as well as your personality, so don’t think that having all the right answers will make get you hired. A good interviewee will be prepared, confident, and know how to stand out. My Opinion A list of interview Questions and Answers is really helpful as a refresher or for beginners. To really ace an interview, one needs to have real-world, hands-on experience with SQL Server as well. Interview questions just serve as a starter or easy read for experienced professionals. When I have to learn new technology, I often search online for interview questions and get an idea about the breadth and depth of the technology. Next Action I am going to write about interview Questions and Answers for next 30 days. I have previously written a series of interview questions and answers; now I have re-written them keeping the latest version of SQL Server and current industry progress in mind. If you have faced interesting interview questions or situations, please write to me and I will publish them as a guest post. If you want me to add few more details, leave a comment and I will make sure that I do my best to accommodate. Tomorrow we will start the interview Questions and Answers series, with a few interesting stories, best practices and guest posts. We will have a prize give-away and other awards when the series ends. List of all the Interview Questions and Answers Series blogs Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Interview Questions and Answers, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
Big Data – Is Big Data Relevant to me? – Big Data Questionnaires – Guest Post by Vinod Kumar

- by Pinal Dave

This guest post is by Vinod Kumar. Vinod Kumar has worked with SQL Server extensively since joining the industry over a decade ago. Working on various versions of SQL Server 7.0, Oracle 7.3 and other database technologies – he now works with the Microsoft Technology Center (MTC) as a Technology Architect. Let us read the blog post in Vinod’s own voice. I think the series from Pinal is a good one for anyone planning to start on Big Data journey from the basics. In my daily customer interactions this buzz of “Big Data” always comes up, I react generally saying – “Sir, do you really have a ‘Big Data’ problem or do you have a big Data problem?” Generally, there is a silence in the air when I ask this question. Data is everywhere in organizations – be it big data, small data, all data and for few it is bad data which is same as no data :). Wow, don’t discount me as someone who opposes “Big Data”, I am a big supporter as much as I am a critic of the abuse of this term by the people. In this post, I wanted to let my mind flow so that you can also think in the direction I want you to see these concepts. In any case, this is not an exhaustive dump of what is in my mind – but you will surely get the drift how I am going to question Big Data terms from customers!!! Is Big Data Relevant to me? Many of my customers talk to me like blank whiteboard with no idea – “why Big Data”. They want to jump into the bandwagon of technology and they want to decipher insights from their unexplored data a.k.a. unstructured data with structured data. So what are these industry scenario’s that come to mind? Here are some of them: Financials Fraud detection: Banks and Credit cards are monitoring your spending habits on real-time basis. Customer Segmentation: applies in every industry from Banking to Retail to Aviation to Utility and others where they deal with end customer who consume their products and services. Customer Sentiment Analysis: Responding to negative brand perception on social or amplify the positive perception. Sales and Marketing Campaign: Understand the impact and get closer to customer delight. Call Center Analysis: attempt to take unstructured voice recordings and analyze them for content and sentiment. Medical Reduce Re-admissions: How to build a proactive follow-up engagements with patients. Patient Monitoring: How to track Inpatient, Out-Patient, Emergency Visits, Intensive Care Units etc. Preventive Care: Disease identification and Risk stratification is a very crucial business function for medical. Claims fraud detection: There is no precise dollars that one can put here, but this is a big thing for the medical field. Retail Customer Sentiment Analysis, Customer Care Centers, Campaign Management. Supply Chain Analysis: Every sensors and RFID data can be tracked for warehouse space optimization. Location based marketing: Based on where a check-in happens retail stores can be optimize their marketing. Telecom Price optimization and Plans, Finding Customer churn, Customer loyalty programs Call Detail Record (CDR) Analysis, Network optimizations, User Location analysis Customer Behavior Analysis Insurance Fraud Detection & Analysis, Pricing based on customer Sentiment Analysis, Loyalty Management Agents Analysis, Customer Value Management This list can go on to other areas like Utility, Manufacturing, Travel, ITES etc. So as you can see, there are obviously interesting use cases for each of these industry verticals. These are just representative list. Where to start? A lot of times I try to quiz customers on a number of dimensions before starting a Big Data conversation. Are you getting the data you need the way you want it and in a timely manner? Can you get in and analyze the data you need? How quickly is IT to respond to your BI Requests? How easily can you get at the data that you need to run your business/department/project? How are you currently measuring your business? Can you get the data you need to react WITHIN THE QUARTER to impact behaviors to meet your numbers or is it always “rear-view mirror?” How are you measuring: The Brand Customer Sentiment Your Competition Your Pricing Your performance Supply Chain Efficiencies Predictive product / service positioning What are your key challenges of driving collaboration across your global business? What the challenges in innovation? What challenges are you facing in getting more information out of your data? Note: Garbage-in is Garbage-out. Hold good for all reporting / analytics requirements Big Data POCs? A number of customers get into the realm of setting a small team to work on Big Data – well it is a great start from an understanding point of view, but I tend to ask a number of other questions to such customers. Some of these common questions are: To what degree is your advanced analytics (natural language processing, sentiment analysis, predictive analytics and classification) paired with your Big Data’s efforts? Do you have dedicated resources exploring the possibilities of advanced analytics in Big Data for your business line? Do you plan to employ machine learning technology while doing Advanced Analytics? How is Social Media being monitored in your organization? What is your ability to scale in terms of storage and processing power? Do you have a system in place to sort incoming data in near real time by potential value, data quality, and use frequency? Do you use event-driven architecture to manage incoming data? Do you have specialized data services that can accommodate different formats, security, and the management requirements of multiple data sources? Is your organization currently using or considering in-memory analytics? To what degree are you able to correlate data from your Big Data infrastructure with that from your enterprise data warehouse? Have you extended the role of Data Stewards to include ownership of big data components? Do you prioritize data quality based on the source system (that is Facebook/Twitter data has lower quality thresholds than radio frequency identification (RFID) for a tracking system)? Do your retention policies consider the different legal responsibilities for storing Big Data for a specific amount of time? Do Data Scientists work in close collaboration with Data Stewards to ensure data quality? How is access to attributes of Big Data being given out in the organization? Are roles related to Big Data (Advanced Analyst, Data Scientist) clearly defined? How involved is risk management in the Big Data governance process? Is there a set of documented policies regarding Big Data governance? Is there an enforcement mechanism or approach to ensure that policies are followed? Who is the key sponsor for your Big Data governance program? (The CIO is best) Do you have defined policies surrounding the use of social media data for potential employees and customers, as well as the use of customer Geo-location data? How accessible are complex analytic routines to your user base? What is the level of involvement with outside vendors and third parties in regard to the planning and execution of Big Data projects? What programming technologies are utilized by your data warehouse/BI staff when working with Big Data? These are some of the important questions I ask each customer who is actively evaluating Big Data trends for their organizations. These questions give you a sense of direction where to start, what to use, how to secure, how to analyze and more. Sign off Any Big data is analysis is incomplete without a compelling story. The best way to understand this is to watch Hans Rosling – Gapminder (2:17 to 6:06) videos about the third world myths. Don’t get overwhelmed with the Big Data buzz word, the destination to what your data speaks is important. In this blog post, we did not particularly look at any Big Data technologies. This is a set of questionnaire one needs to keep in mind as they embark their journey of Big Data. I did write some of the basics in my blog: Big Data – Big Hype yet Big Opportunity. Do let me know if these questions make sense? Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
SQL SERVER – Shrinking Database is Bad – Increases Fragmentation – Reduces Performance

- by pinaldave

Earlier, I had written two articles related to Shrinking Database. I wrote about why Shrinking Database is not good. SQL SERVER – SHRINKDATABASE For Every Database in the SQL Server SQL SERVER – What the Business Says Is Not What the Business Wants I received many comments on Why Database Shrinking is bad. Today we will go over a very interesting example that I have created for the same. Here are the quick steps of the example. Create a test database Create two tables and populate with data Check the size of both the tables Size of database is very low Check the Fragmentation of one table Fragmentation will be very low Truncate another table Check the size of the table Check the fragmentation of the one table Fragmentation will be very low SHRINK Database Check the size of the table Check the fragmentation of the one table Fragmentation will be very HIGH REBUILD index on one table Check the size of the table Size of database is very HIGH Check the fragmentation of the one table Fragmentation will be very low Here is the script for the same. USE MASTER GO CREATE DATABASE ShrinkIsBed GO USE ShrinkIsBed GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Create FirstTable CREATE TABLE FirstTable (ID INT, FirstName VARCHAR(100), LastName VARCHAR(100), City VARCHAR(100)) GO -- Create Clustered Index on ID CREATE CLUSTERED INDEX [IX_FirstTable_ID] ON FirstTable ( [ID] ASC ) ON [PRIMARY] GO -- Create SecondTable CREATE TABLE SecondTable (ID INT, FirstName VARCHAR(100), LastName VARCHAR(100), City VARCHAR(100)) GO -- Create Clustered Index on ID CREATE CLUSTERED INDEX [IX_SecondTable_ID] ON SecondTable ( [ID] ASC ) ON [PRIMARY] GO -- Insert One Hundred Thousand Records INSERT INTO FirstTable (ID,FirstName,LastName,City) SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Insert One Hundred Thousand Records INSERT INTO SecondTable (ID,FirstName,LastName,City) SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO Let us check the table size and fragmentation. Now let us TRUNCATE the table and check the size and Fragmentation. USE MASTER GO CREATE DATABASE ShrinkIsBed GO USE ShrinkIsBed GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Create FirstTable CREATE TABLE FirstTable (ID INT, FirstName VARCHAR(100), LastName VARCHAR(100), City VARCHAR(100)) GO -- Create Clustered Index on ID CREATE CLUSTERED INDEX [IX_FirstTable_ID] ON FirstTable ( [ID] ASC ) ON [PRIMARY] GO -- Create SecondTable CREATE TABLE SecondTable (ID INT, FirstName VARCHAR(100), LastName VARCHAR(100), City VARCHAR(100)) GO -- Create Clustered Index on ID CREATE CLUSTERED INDEX [IX_SecondTable_ID] ON SecondTable ( [ID] ASC ) ON [PRIMARY] GO -- Insert One Hundred Thousand Records INSERT INTO FirstTable (ID,FirstName,LastName,City) SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Insert One Hundred Thousand Records INSERT INTO SecondTable (ID,FirstName,LastName,City) SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO You can clearly see that after TRUNCATE, the size of the database is not reduced and it is still the same as before TRUNCATE operation. After the Shrinking database operation, we were able to reduce the size of the database. If you notice the fragmentation, it is considerably high. The major problem with the Shrink operation is that it increases fragmentation of the database to very high value. Higher fragmentation reduces the performance of the database as reading from that particular table becomes very expensive. One of the ways to reduce the fragmentation is to rebuild index on the database. Let us rebuild the index and observe fragmentation and database size. -- Rebuild Index on FirstTable ALTER INDEX IX_SecondTable_ID ON SecondTable REBUILD GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO You can notice that after rebuilding, Fragmentation reduces to a very low value (almost same to original value); however the database size increases way higher than the original. Before rebuilding, the size of the database was 5 MB, and after rebuilding, it is around 20 MB. Regular rebuilding the index is rebuild in the same user database where the index is placed. This usually increases the size of the database. Look at irony of the Shrinking database. One person shrinks the database to gain space (thinking it will help performance), which leads to increase in fragmentation (reducing performance). To reduce the fragmentation, one rebuilds index, which leads to size of the database to increase way more than the original size of the database (before shrinking). Well, by Shrinking, one did not gain what he was looking for usually. Rebuild indexing is not the best suggestion as that will create database grow again. I have always remembered the excellent post from Paul Randal regarding Shrinking the database is bad. I suggest every one to read that for accuracy and interesting conversation. Let us run following script where we Shrink the database and REORGANIZE. -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO -- Shrink the Database DBCC SHRINKDATABASE (ShrinkIsBed); GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO -- Rebuild Index on FirstTable ALTER INDEX IX_SecondTable_ID ON SecondTable REORGANIZE GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO You can see that REORGANIZE does not increase the size of the database or remove the fragmentation. Again, I no way suggest that REORGANIZE is the solution over here. This is purely observation using demo. Read the blog post of Paul Randal. Following script will clean up the database -- Clean up USE MASTER GO ALTER DATABASE ShrinkIsBed SET SINGLE_USER WITH ROLLBACK IMMEDIATE GO DROP DATABASE ShrinkIsBed GO There are few valid cases of the Shrinking database as well, but that is not covered in this blog post. We will cover that area some other time in future. Additionally, one can rebuild index in the tempdb as well, and we will also talk about the same in future. Brent has written a good summary blog post as well. Are you Shrinking your database? Well, when are you going to stop Shrinking it? Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Index, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQLServer, T SQL, Technology

Read the article
SQL SERVER – Introduction to SQL Server 2014 In-Memory OLTP

- by Pinal Dave

In SQL Server 2014 Microsoft has introduced a new database engine component called In-Memory OLTP aka project “Hekaton” which is fully integrated into the SQL Server Database Engine. It is optimized for OLTP workloads accessing memory resident data. In-memory OLTP helps us create memory optimized tables which in turn offer significant performance improvement for our typical OLTP workload. The main objective of memory optimized table is to ensure that highly transactional tables could live in memory and remain in memory forever without even losing out a single record. The most significant part is that it still supports majority of our Transact-SQL statement. Transact-SQL stored procedures can be compiled to machine code for further performance improvements on memory-optimized tables. This engine is designed to ensure higher concurrency and minimal blocking. In-Memory OLTP alleviates the issue of locking, using a new type of multi-version optimistic concurrency control. It also substantially reduces waiting for log writes by generating far less log data and needing fewer log writes. Points to remember Memory-optimized tables refer to tables using the new data structures and key words added as part of In-Memory OLTP. Disk-based tables refer to your normal tables which we used to create in SQL Server since its inception. These tables use a fixed size 8 KB pages that need to be read from and written to disk as a unit. Natively compiled stored procedures refer to an object Type which is new and is supported by in-memory OLTP engine which convert it into machine code, which can further improve the data access performance for memory –optimized tables. Natively compiled stored procedures can only reference memory-optimized tables, they can’t be used to reference any disk –based table. Interpreted Transact-SQL stored procedures, which is what SQL Server has always used. Cross-container transactions refer to transactions that reference both memory-optimized tables and disk-based tables. Interop refers to interpreted Transact-SQL that references memory-optimized tables. Using In-Memory OLTP In-Memory OLTP engine has been available as part of SQL Server 2014 since June 2013 CTPs. Installation of In-Memory OLTP is part of the SQL Server setup application. The In-Memory OLTP components can only be installed with a 64-bit edition of SQL Server 2014 hence they are not available with 32-bit editions. Creating Databases Any database that will store memory-optimized tables must have a MEMORY_OPTIMIZED_DATA filegroup. This filegroup is specifically designed to store the checkpoint files needed by SQL Server to recover the memory-optimized tables, and although the syntax for creating the filegroup is almost the same as for creating a regular filestream filegroup, it must also specify the option CONTAINS MEMORY_OPTIMIZED_DATA. Here is an example of a CREATE DATABASE statement for a database that can support memory-optimized tables: CREATE DATABASE InMemoryDB ON PRIMARY(NAME = [InMemoryDB_data], FILENAME = 'D:\data\InMemoryDB_data.mdf', size=500MB), FILEGROUP [SampleDB_mod_fg] CONTAINS MEMORY_OPTIMIZED_DATA (NAME = [InMemoryDB_mod_dir], FILENAME = 'S:\data\InMemoryDB_mod_dir'), (NAME = [InMemoryDB_mod_dir], FILENAME = 'R:\data\InMemoryDB_mod_dir') LOG ON (name = [SampleDB_log], Filename='L:\log\InMemoryDB_log.ldf', size=500MB) COLLATE Latin1_General_100_BIN2; Above example code creates files on three different drives (D: S: and R:) for the data files and in memory storage so if you would like to run this code kindly change the drive and folder locations as per your convenience. Also notice that binary collation was specified as Windows (non-SQL). BIN2 collation is the only collation support at this point for any indexes on memory optimized tables. It is also possible to add a MEMORY_OPTIMIZED_DATA file group to an existing database, use the below command to achieve the same. ALTER DATABASE AdventureWorks2012 ADD FILEGROUP hekaton_mod CONTAINS MEMORY_OPTIMIZED_DATA; GO ALTER DATABASE AdventureWorks2012 ADD FILE (NAME='hekaton_mod', FILENAME='S:\data\hekaton_mod') TO FILEGROUP hekaton_mod; GO Creating Tables There is no major syntactical difference between creating a disk based table or a memory –optimized table but yes there are a few restrictions and a few new essential extensions. Essentially any memory-optimized table should use the MEMORY_OPTIMIZED = ON clause as shown in the Create Table query example. DURABILITY clause (SCHEMA_AND_DATA or SCHEMA_ONLY) Memory-optimized table should always be defined with a DURABILITY value which can be either SCHEMA_AND_DATA or SCHEMA_ONLY the former being the default. A memory-optimized table defined with DURABILITY=SCHEMA_ONLY will not persist the data to disk which means the data durability is compromised whereas DURABILITY= SCHEMA_AND_DATA ensures that data is also persisted along with the schema. Indexing Memory Optimized Table A memory-optimized table must always have an index for all tables created with DURABILITY= SCHEMA_AND_DATA and this can be achieved by declaring a PRIMARY KEY Constraint at the time of creating a table. The following example shows a PRIMARY KEY index created as a HASH index, for which a bucket count must also be specified. CREATE TABLE Mem_Table ( [Name] VARCHAR(32) NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 100000), [City] VARCHAR(32) NULL, [State_Province] VARCHAR(32) NULL, [LastModified] DATETIME NOT NULL, ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA); Now as you can see in the above query example we have used the clause MEMORY_OPTIMIZED = ON to make sure that it is considered as a memory optimized table and not just a normal table and also used the DURABILITY Clause= SCHEMA_AND_DATA which means it will persist data along with metadata and also you can notice this table has a PRIMARY KEY mentioned upfront which is also a mandatory clause for memory-optimized tables. We will talk more about HASH Indexes and BUCKET_COUNT in later articles on this topic which will be focusing more on Row and Index storage on Memory-Optimized tables. So stay tuned for that as well. Now as we covered the basics of Memory Optimized tables and understood the key things to remember while using memory optimized tables, let’s explore more using examples to understand the Performance gains using memory-optimized tables. I will be using the database which i created earlier in this article i.e. InMemoryDB in the below Demo Exercise. USE InMemoryDB GO -- Creating a disk based table CREATE TABLE dbo.Disktable ( Id INT IDENTITY, Name CHAR(40) ) GO CREATE NONCLUSTERED INDEX IX_ID ON dbo.Disktable (Id) GO -- Creating a memory optimized table with similar structure and DURABILITY = SCHEMA_AND_DATA CREATE TABLE dbo.Memorytable_durable ( Id INT NOT NULL PRIMARY KEY NONCLUSTERED Hash WITH (bucket_count =1000000), Name CHAR(40) ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA) GO -- Creating an another memory optimized table with similar structure but DURABILITY = SCHEMA_Only CREATE TABLE dbo.Memorytable_nondurable ( Id INT NOT NULL PRIMARY KEY NONCLUSTERED Hash WITH (bucket_count =1000000), Name CHAR(40) ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_only) GO -- Now insert 100000 records in dbo.Disktable and observe the Time Taken DECLARE @i_t bigint SET @i_t =1 WHILE @i_t<= 100000 BEGIN INSERT INTO dbo.Disktable(Name) VALUES('sachin' + CONVERT(VARCHAR,@i_t)) SET @i_t+=1 END -- Do the same inserts for Memory table dbo.Memorytable_durable and observe the Time Taken DECLARE @i_t bigint SET @i_t =1 WHILE @i_t<= 100000 BEGIN INSERT INTO dbo.Memorytable_durable VALUES(@i_t, 'sachin' + CONVERT(VARCHAR,@i_t)) SET @i_t+=1 END -- Now finally do the same inserts for Memory table dbo.Memorytable_nondurable and observe the Time Taken DECLARE @i_t bigint SET @i_t =1 WHILE @i_t<= 100000 BEGIN INSERT INTO dbo.Memorytable_nondurable VALUES(@i_t, 'sachin' + CONVERT(VARCHAR,@i_t)) SET @i_t+=1 END The above 3 Inserts took 1.20 minutes, 54 secs, and 2 secs respectively to insert 100000 records on my machine with 8 Gb RAM. This proves the point that memory-optimized tables can definitely help businesses achieve better performance for their highly transactional business table and memory- optimized tables with Durability SCHEMA_ONLY is even faster as it does not bother persisting its data to disk which makes it supremely fast. Koenig Solutions is one of the few organizations which offer IT training on SQL Server 2014 and all its updates. Now, I leave the decision on using memory_Optimized tables on you, I hope you like this article and it helped you understand the fundamentals of IN-Memory OLTP . Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: Koenig

Read the article
Developer’s Life – Disaster Lessons – Notes from the Field #039

- by Pinal Dave

[Note from Pinal]: This is a 39th episode of Notes from the Field series. What is the best solution do you have when you encounter a disaster in your organization. Now many of you would answer that in this scenario you would have another standby machine or alternative which you will plug in. Now let me ask second question – What would you do if you as an individual faces disaster? In this episode of the Notes from the Field series database expert Mike Walsh explains a very crucial issue we face in our career, which is not technical but more to relate to human nature. Read on this may be the best blog post you might read in recent times. Howdy! When it was my turn to share the Notes from the Field last time, I took a departure from my normal technical content to talk about Attitude and Communication.(http://blog.sqlauthority.com/2014/05/08/developers-life-attitude-and-communication-they-can-cause-problems-notes-from-the-field-027/) Pinal said it was a popular topic so I hope he won’t mind if I stick with Professional Development for another of my turns at sharing some information here. Like I said last time, the “soft skills” of the IT world are often just as important – sometimes more important – than the technical skills. As a consultant with Linchpin People – I see so many situations where the professional skills I’ve gained and use are more valuable to clients than knowing the best way to tune a query. Today I want to continue talking about professional development and tell you about the way I almost got myself hit by a train – and why that matters in our day jobs. Sometimes we can learn a lot from disasters. Whether we caused them or someone else did. If you are interested in learning about some of my observations in these lessons you can see more where I talk about lessons from disasters on my blog. For now, though, onto how I almost got my vehicle hit by a train… The Train Crash That Almost Was…. My family and I own a little schoolhouse building about a 10 mile drive away from our house. We use it as a free resource for families in the area that homeschool their children – so they can have some class space. I go up there a lot to check in on the property, to take care of the trash and to do work on the property. On the way there, there is a very small Stop Sign controlled railroad intersection. There is only two small freight trains a day passing there. Actually the same train, making a journey south and then back North. That’s it. This road is a small rural road, barely ever a second car driving in the neighborhood there when I am. The stop sign is pretty much there only for the train crossing. When we first bought the building, I was up there a lot doing renovations on the property. Being familiar with the area, I am also familiar with the train schedule and know the tracks are normally free of trains. So I developed a bad habit. You see, I’d approach the stop sign and slow down as I roll through it. Sometimes I’d do a quick look and come to an “almost” stop there but keep on going. I let my impatience and complacency take over. And that is because most of the time I was going there long after the train was done for the day or in between the runs. This habit became pretty well established after a couple years of driving the route. The behavior reinforced a bit by the success ratio. I saw others doing it as well from the neighborhood when I would happen to be there around the time another car was there. Well. You already know where this ends up by the title and backstory here. A few months ago I came to that little crossing, and I started to do the normal routine. I’d pretty much stopped looking in some respects because of the pattern I’d gotten into. For some reason I looked and heard and saw the train slowly approaching and slammed on my brakes and stopped. It was an abrupt stop, and it was close. I probably would have made it okay, but I sat there thinking about lessons for IT professionals from the situation once I started breathing again and watched the cars loaded with sand and propane slowly labored down the tracks… Here are Those Lessons… It’s easy to get stuck into a routine – That isn’t always bad. Except when it’s a bad routine. Momentum and inertia are powerful. Once you have a habit and a routine developed – it’s really hard to break that. Make sure you are setting the right routines and habits TODAY. What almost dangerous things are you doing today? How are you almost messing up your production environment today? Stop doing that. Be Deliberate – (Even when you are the only one) – Like I said – a lot of people roll through that stop sign. Perhaps the neighbors or other drivers think “why is he fully stopping and looking… The train only comes two times a day!” – they can think that all they want. Through deliberate actions and forcing myself to pay attention, I will avoid that oops again. Slow down. Take a deep breath. Be Deliberate in your job. Pay attention to the small stuff and go out of your way to be careful. It will save you later. Be Observant – Keep your eyes open. By looking around, observing the situation and understanding what your servers, databases, users and vendors are doing – you’ll notice when something is out of place. But if you don’t know what is normal, if you don’t look to make sure nothing has changed – that train will come and get you. Where can you be more observant? What warning signs are you ignoring in your environment today? In the IT world – trains are everywhere. Projects move fast. Decisions happen fast. Problems turn from a warning sign to a disaster quickly. If you get stuck in a complacent pattern of “Everything is okay, it always has been and always will be” – that’s the time that you will most likely get stuck in a bad situation. Don’t let yourself get complacent, don’t let your team get complacent. That will lead to being proactive. And a proactive environment spends less money on consultants for troubleshooting problems you should have seen ahead of time. You can spend your money and IT budget on improving for your customers. If you want to get started with performance analytics and triage of virtualized SQL Servers with the help of experts, read more over at Fix Your SQL Server. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: Notes from the Field, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
SQL SERVER – Extending SQL Azure with Azure worker role – Guest Post by Paras Doshi

- by pinaldave

This is guest post by Paras Doshi. Paras Doshi is a research Intern at SolidQ.com and a Microsoft student partner. He is currently working in the domain of SQL Azure. SQL Azure is nothing but a SQL server in the cloud. SQL Azure provides benefits such as on demand rapid provisioning, cost-effective scalability, high availability and reduced management overhead. To see an introduction on SQL Azure, check out the post by Pinal here In this article, we are going to discuss how to extend SQL Azure with the Azure worker role. In other words, we will attempt to write a custom code and host it in the Azure worker role; the aim is to add some features that are not available with SQL Azure currently or features that need to be customized for flexibility. This way we extend the SQL Azure capability by building some solutions that run on Azure as worker roles. To understand Azure worker role, think of it as a windows service in cloud. Azure worker role can perform background processes, and to handle processes such as synchronization and backup, it becomes our ideal tool. First, we will focus on writing a worker role code that synchronizes SQL Azure databases. Before we do so, let’s see some scenarios in which synchronization between SQL Azure databases is beneficial: scaling out access over multiple databases enables us to handle workload efficiently As of now, SQL Azure database can be hosted in one of any six datacenters. By synchronizing databases located in different data centers, one can extend the data by enabling access to geographically distributed data Let us see some scenarios in which SQL server to SQL Azure database synchronization is beneficial To backup SQL Azure database on local infrastructure Rather than investing in local infrastructure for increased workloads, such workloads could be handled by cloud Ability to extend data to different datacenters located across the world to enable efficient data access from remote locations Now, let us develop cloud-based app that synchronizes SQL Azure databases. For an Introduction to developing cloud based apps, click here Now, in this article, I aim to provide a bird’s eye view of how a code that synchronizes SQL Azure databases look like and then list resources that can help you develop the solution from scratch. Now, if you newly add a worker role to the cloud-based project, this is how the code will look like. (Note: I have added comments to the skeleton code to point out the modifications that will be required in the code to carry out the SQL Azure synchronization. Note the placement of Setup() and Sync() function.) Click here (http://parasdoshi1989.files.wordpress.com/2011/06/code-snippet-1-for-extending-sql-azure-with-azure-worker-role1.pdf ) Enabling SQL Azure databases synchronization through sync framework is a two-step process. In the first step, the database is provisioned and sync framework creates tracking tables, stored procedures, triggers, and tables to store metadata to enable synchronization. This is one time step. The code for the same is put in the setup() function which is called once when the worker role starts. Now, the second step is continuous (or on demand) synchronization of SQL Azure databases by propagating changes between databases. This is done on a continuous basis by calling the sync() function in the while loop. The code logic to synchronize changes between SQL Azure databases should be put in the sync() function. Discussing the coding part step by step is out of the scope of this article. Therefore, let me suggest you a resource, which is given here. Also, note that before you start developing the code, you will need to install SYNC framework 2.1 SDK (download here). Further, you will reference some libraries before you start coding. Details regarding the same are available in the article that I just pointed to. You will be charged for data transfers if the databases are not in the same datacenter. For pricing information, go here Currently, a tool named DATA SYNC, which is built on top of sync framework, is available in CTP that allows SQL Azure <-> SQL server and SQL Azure <-> SQL Azure synchronization (without writing single line of code); however, in some cases, the custom code shown in this blogpost provides flexibility that is not available with Data SYNC. For instance, filtering is not supported in the SQL Azure DATA SYNC CTP2; if you wish to have such a functionality now, then you have the option of developing a custom code using SYNC Framework. Now, this code can be easily extended to synchronize at some schedule. Let us say we want the databases to get synchronized every day at 10:00 pm. This is what the code will look like now: (http://parasdoshi1989.files.wordpress.com/2011/06/code-snippet-2-for-extending-sql-azure-with-azure-worker-role.pdf) Don’t you think that by writing such a code, we are imitating the functionality provided by the SQL server agent for a SQL server? Think about it. We are scheduling our administrative task by writing custom code – in other words, we have developed a “Light weight SQL server agent for SQL Azure!” Since the SQL server agent is not currently available in cloud, we have developed a solution that enables us to schedule tasks, and thus we have extended SQL Azure with the Azure worker role! Now if you wish to track jobs, you can do so by storing this data in SQL Azure (or Azure tables). The reason is that Windows Azure is a stateless platform, and we will need to store the state of the job ourselves and the choice that you have is SQL Azure or Azure tables. Note that this solution requires custom code and also it is not UI driven; however, for now, it can act as a temporary solution until SQL server agent is made available in the cloud. Moreover, this solution does not encompass functionalities that a SQL server agent provides, but it does open up an interesting avenue to schedule some of the tasks such as backup and synchronization of SQL Azure databases by writing some custom code in the Azure worker role. Now, let us see one more possibility – i.e., running BCP through a worker role in Azure-hosted services and then uploading the backup files either locally or on blobs. If you upload it locally, then consider the data transfer cost. If you upload it to blobs residing in the same datacenter, then no transfer cost applies but the cost on blob size applies. So, before choosing the option, you need to evaluate your preferences keeping the cost associated with each option in mind. In this article, I have shown that Azure worker role solution could be developed to synchronize SQL Azure databases. Moreover, a light-weight SQL server agent for SQL Azure can be developed. Also we discussed the possibility of running BCP through a worker role in Azure-hosted services for backing up our precious SQL Azure data. Thus, we can extend SQL Azure with the Azure worker role. But remember: you will be charged for running Azure worker roles. So at the end of the day, you need to ask – am I willing to build a custom code and pay money to achieve this functionality? I hope you found this blog post interesting. If you have any questions/feedback, you can comment below or you can mail me at Paras[at]student-partners[dot]com Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Azure, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
SQL SERVER – Step by Step Guide to Beginning Data Quality Services in SQL Server 2012 – Introduction to DQS

- by pinaldave

Data Quality Services is a very important concept of SQL Server. I have recently started to explore the same and I am really learning some good concepts. Here are two very important blog posts which one should go over before continuing this blog post. Installing Data Quality Services (DQS) on SQL Server 2012 Connecting Error to Data Quality Services (DQS) on SQL Server 2012 This article is introduction to Data Quality Services for beginners. We will be using an Excel file Click on the image to enlarge the it. In the first article we learned to install DQS. In this article we will see how we can learn about building Knowledge Base and using it to help us identify the quality of the data as well help correct the bad quality of the data. Here are the two very important steps we will be learning in this tutorial. Building a New Knowledge Base Creating a New Data Quality Project Let us start the building the Knowledge Base. Click on New Knowledge Base. In our project we will be using the Excel as a knowledge base. Here is the Excel which we will be using. There are two columns. One is Colors and another is Shade. They are independent columns and not related to each other. The point which I am trying to show is that in Column A there are unique data and in Column B there are duplicate records. Clicking on New Knowledge Base will bring up the following screen. Enter the name of the new knowledge base. Clicking NEXT will bring up following screen where it will allow to select the EXCE file and it will also let users select the source column. I have selected Colors and Shade both as a source column. Creating a domain is very important. Here you can create a unique domain or domain which is compositely build from Colors and Shade. As this is the first example, I will create unique domain – for Colors I will create domain Colors and for Shade I will create domain Shade. Here is the screen which will demonstrate how the screen will look after creating domains. Clicking NEXT it will bring you to following screen where you can do the data discovery. Clicking on the START will start the processing of the source data provided. Pre-processed data will show various information related to the source data. In our case it shows that Colors column have unique data whereas Shade have non-unique data and unique data rows are only two. In the next screen you can actually add more rows as well see the frequency of the data as the values are listed unique. Clicking next will publish the knowledge base which is just created. Now the knowledge base is created. We will try to take any random data and attempt to do DQS implementation over it. I am using another excel sheet here for simplicity purpose. In reality you can easily use SQL Server table for the same. Click on New Data Quality Project to see start DQS Project. In the next screen it will ask which knowledge base to use. We will be using our Colors knowledge base which we have recently created. In the Colors knowledge base we had two columns – 1) Colors and 2) Shade. In our case we will be using both of the mappings here. User can select one or multiple column mapping over here. Now the most important phase of the complete project. Click on Start and it will make the cleaning process and shows various results. In our case there were two columns to be processed and it completed the task with necessary information. It demonstrated that in Colors columns it has not corrected any value by itself but in Shade value there is a suggestion it has. We can train the DQS to correct values but let us keep that subject for future blog posts. Now click next and keep the domain Colors selected left side. It will demonstrate that there are two incorrect columns which it needs to be corrected. Here is the place where once corrected value will be auto-corrected in future. I manually corrected the value here and clicked on Approve radio buttons. As soon as I click on Approve buttons the rows will be disappeared from this tab and will move to Corrected Tab. If I had rejected tab it would have moved the rows to Invalid tab as well. In this screen you can see how the corrected 2 rows are demonstrated. You can click on Correct tab and see previously validated 6 rows which passed the DQS process. Now let us click on the Shade domain on the left side of the screen. This domain shows very interesting details as there DQS system guessed the correct answer as Dark with the confidence level of 77%. It is quite a high confidence level and manual observation also demonstrate that Dark is the correct answer. I clicked on Approve and the row moved to corrected tab. On the next screen DQS shows the summary of all the activities. It also demonstrates how the correction of the quality of the data was performed. The user can explore their data to a SQL Server Table, CSV file or Excel. The user also has an option to either explore data and all the associated cleansing info or data only. I will select Data only for demonstration purpose. Clicking explore will generate the files. Let us open the generated file. It will look as following and it looks pretty complete and corrected. Well, we have successfully completed DQS Process. The process is indeed very easy. I suggest you try this out yourself and you will find it very easy to learn. In future we will go over advanced concepts. Are you using this feature on your production server? If yes, would you please leave a comment with your environment and business need. It will be indeed interesting to see where it is implemented. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Business Intelligence, Data Warehousing, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: Data Quality Services, DQS

Read the article
SQL SERVER – Guest Post – Architecting Data Warehouse – Niraj Bhatt

- by pinaldave

Niraj Bhatt works as an Enterprise Architect for a Fortune 500 company and has an innate passion for building / studying software systems. He is a top rated speaker at various technical forums including Tech·Ed, MCT Summit, Developer Summit, and Virtual Tech Days, among others. Having run a successful startup for four years Niraj enjoys working on – IT innovations that can impact an enterprise bottom line, streamlining IT budgets through IT consolidation, architecture and integration of systems, performance tuning, and review of enterprise applications. He has received Microsoft MVP award for ASP.NET, Connected Systems and most recently on Windows Azure. When he is away from his laptop, you will find him taking deep dives in automobiles, pottery, rafting, photography, cooking and financial statements though not necessarily in that order. He is also a manager/speaker at BDOTNET, Asia’s largest .NET user group. Here is the guest post by Niraj Bhatt. As data in your applications grows it’s the database that usually becomes a bottleneck. It’s hard to scale a relational DB and the preferred approach for large scale applications is to create separate databases for writes and reads. These databases are referred as transactional database and reporting database. Though there are tools / techniques which can allow you to create snapshot of your transactional database for reporting purpose, sometimes they don’t quite fit the reporting requirements of an enterprise. These requirements typically are data analytics, effective schema (for an Information worker to self-service herself), historical data, better performance (flat data, no joins) etc. This is where a need for data warehouse or an OLAP system arises. A Key point to remember is a data warehouse is mostly a relational database. It’s built on top of same concepts like Tables, Rows, Columns, Primary keys, Foreign Keys, etc. Before we talk about how data warehouses are typically structured let’s understand key components that can create a data flow between OLTP systems and OLAP systems. There are 3 major areas to it: a) OLTP system should be capable of tracking its changes as all these changes should go back to data warehouse for historical recording. For e.g. if an OLTP transaction moves a customer from silver to gold category, OLTP system needs to ensure that this change is tracked and send to data warehouse for reporting purpose. A report in context could be how many customers divided by geographies moved from sliver to gold category. In data warehouse terminology this process is called Change Data Capture. There are quite a few systems that leverage database triggers to move these changes to corresponding tracking tables. There are also out of box features provided by some databases e.g. SQL Server 2008 offers Change Data Capture and Change Tracking for addressing such requirements. b) After we make the OLTP system capable of tracking its changes we need to provision a batch process that can run periodically and takes these changes from OLTP system and dump them into data warehouse. There are many tools out there that can help you fill this gap – SQL Server Integration Services happens to be one of them. c) So we have an OLTP system that knows how to track its changes, we have jobs that run periodically to move these changes to warehouse. The question though remains is how warehouse will record these changes? This structural change in data warehouse arena is often covered under something called Slowly Changing Dimension (SCD). While we will talk about dimensions in a while, SCD can be applied to pure relational tables too. SCD enables a database structure to capture historical data. This would create multiple records for a given entity in relational database and data warehouses prefer having their own primary key, often known as surrogate key. As I mentioned a data warehouse is just a relational database but industry often attributes a specific schema style to data warehouses. These styles are Star Schema or Snowflake Schema. The motivation behind these styles is to create a flat database structure (as opposed to normalized one), which is easy to understand / use, easy to query and easy to slice / dice. Star schema is a database structure made up of dimensions and facts. Facts are generally the numbers (sales, quantity, etc.) that you want to slice and dice. Fact tables have these numbers and have references (foreign keys) to set of tables that provide context around those facts. E.g. if you have recorded 10,000 USD as sales that number would go in a sales fact table and could have foreign keys attached to it that refers to the sales agent responsible for sale and to time table which contains the dates between which that sale was made. These agent and time tables are called dimensions which provide context to the numbers stored in fact tables. This schema structure of fact being at center surrounded by dimensions is called Star schema. A similar structure with difference of dimension tables being normalized is called a Snowflake schema. This relational structure of facts and dimensions serves as an input for another analysis structure called Cube. Though physically Cube is a special structure supported by commercial databases like SQL Server Analysis Services, logically it’s a multidimensional structure where dimensions define the sides of cube and facts define the content. Facts are often called as Measures inside a cube. Dimensions often tend to form a hierarchy. E.g. Product may be broken into categories and categories in turn to individual items. Category and Items are often referred as Levels and their constituents as Members with their overall structure called as Hierarchy. Measures are rolled up as per dimensional hierarchy. These rolled up measures are called Aggregates. Now this may seem like an overwhelming vocabulary to deal with but don’t worry it will sink in as you start working with Cubes and others. Let’s see few other terms that we would run into while talking about data warehouses. ODS or an Operational Data Store is a frequently misused term. There would be few users in your organization that want to report on most current data and can’t afford to miss a single transaction for their report. Then there is another set of users that typically don’t care how current the data is. Mostly senior level executives who are interesting in trending, mining, forecasting, strategizing, etc. don’t care for that one specific transaction. This is where an ODS can come in handy. ODS can use the same star schema and the OLAP cubes we saw earlier. The only difference is that the data inside an ODS would be short lived, i.e. for few months and ODS would sync with OLTP system every few minutes. Data warehouse can periodically sync with ODS either daily or weekly depending on business drivers. Data marts are another frequently talked about topic in data warehousing. They are subject-specific data warehouse. Data warehouses that try to span over an enterprise are normally too big to scope, build, manage, track, etc. Hence they are often scaled down to something called Data mart that supports a specific segment of business like sales, marketing, or support. Data marts too, are often designed using star schema model discussed earlier. Industry is divided when it comes to use of data marts. Some experts prefer having data marts along with a central data warehouse. Data warehouse here acts as information staging and distribution hub with spokes being data marts connected via data feeds serving summarized data. Others eliminate the need for a centralized data warehouse citing that most users want to report on detailed data. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Best Practices, Business Intelligence, Data Warehousing, Database, Pinal Dave, PostADay, Readers Contribution, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
Web Platform Installer bundles for Visual Studio 2010 SP1 - and how you can build your own WebPI bundles

- by Jon Galloway

Visual Studio SP1 is now available via the Web Platform Installer, which means you've got three options: Download the 1.5 GB ISO image Run the 750KB Web Installer (which figures out what you need to download) Install via Web PI Note: I covered some tips for installing VS2010 SP1 last week - including some that apply to all of these, such as removing options you don't use prior to installing the service pack to decrease the installation time and download size. Two Visual Studio 2010 SP1 Web PI packages There are actually two WebPI packages for VS2010 SP1. There's the standard Visual Studio 2010 SP1 package [Web PI link], which includes (quoting ScottGu's post): VS2010 2010 SP1 ASP.NET MVC 3 (runtime + tools support) IIS 7.5 Express SQL Server Compact Edition 4.0 (runtime + tools support) Web Deployment 2.0 The notes on that package sum it up pretty well: Looking for the latest everything? Look no further. This will get you Visual Studio 2010 Service Pack 1 and the RTM releases of ASP.NET MVC 3, IIS 7.5 Express, SQL Server Compact 4.0 with tooling, and Web Deploy 2.0. It's the value meal of Microsoft products. Tell your friends! Note: This bundle includes the Visual Studio 2010 SP1 web installer, which will dynamically determine the appropriate service pack components to download and install. This is typically in the range of 200-500 MB and will take 30-60 minutes to install, depending on your machine configuration. There is also a Visual Studio 2010 SP1 Core package [Web PI link], which only includes only the SP without any of the other goodies (MVC3, IIS Express, etc.). If you're doing any web development, I'd highly recommend the main pack since it the other installs are small, simple installs, but if you're working in another space, you might want the core package. Installing via the Web Platform Installer I generally like to go with the Web PI when possible since it simplifies most software installations due to things like: Smart dependency management - installing apps or tools which have software dependencies will automatically figure out which dependencies you don't have and add them to the list (which you can review before install) Simultaneous download and install - if your install includes more than one package, it will automatically pull the dependencies first and begin installing them while downloading the others Lists the latest downloads - no need to search around, as they're all listed based on a live feed Includes open source applications - a lot of popular open source applications are included as well as Microsoft software and tools No worries about reinstallation - WebPI installations detect what you've got installed, so for instance if you've got MVC 3 installed you don't need to worry about the VS2010 SP1 package install messing anything up In addition to the links I included above, you can install the WebPI from http://www.microsoft.com/web/downloads/platform.aspx, and if you have Web PI installed you can just tap the Windows key and type "Web Platform" to bring it up in the Start search list. You'll see Visual Studio SP1 listed in the spotlight list as shown below. That's the standard package, which includes MVC 3 / IIS 7.5 Express / SQL Compact / Web Deploy. If you just want the core install, you can use the search box in the upper right corner, typing in "Visual Studio SP1" as shown. Core Install: Use Web PI or the Visual Studio Web Installer? I think the big advantage of using Web PI to install VS 2010 SP1 is that it includes the other new bits. If you're going to install the SP1 core, I don't think there's as much advantage to using Web PI, as the Web PI Core install just downloads the Visual Studio Web Installer anyways. I think Web PI makes it a little easier to find the download, but not a lot. The Visual Studio Web Installer checks dependencies, so there's no big advantage there. If you do happen to hit any problems installing Visual Studio SP1 via Web PI, I'd recommend running the Visual Studio Web Installer, then running the Web PI VS 2010 SP1 package to get all the other goodies. I talked to one person who hit some random snag, recommended that, and it worked out. Custom Web Platform Installer bundles You can create links that will launch the Web Platform Installer with a custom list of tools. You can see an example of this by clicking through on the install button at http://asp.net/downloads (cancelling the installation dialog). You'll see this in the address bar: http://www.microsoft.com/web/gallery/install.aspx?appsxml=&appid=MVC3;ASPNET;NETFramework4;SQLExpress;VWD Notice that the appid querystring parameter includes a semicolon delimited list, and you can make your own custom Web PI links with your own desired app list. I can think of a lot of cases where that would be handy: linking to a recommended software configuration from a software project or product, setting up a recommended / documented / supported install list for a software development team or IT shop, etc. For instance, here's a link that installs just VS2010 SP1 Core and the SQL CE tools: http://www.microsoft.com/web/gallery/install.aspx?appsxml=&appid=VS2010SP1Core;SQLCETools Note: If you've already got all or some of the products installed, the display will reflect that. On my dev box which has the full SP1 package, here's what the above link gives me: Here's another example - on a fresh box I created a link to install MVC 3 and the Web Farm Framework (http://www.microsoft.com/web/gallery/install.aspx?appsxml=&appid=MVC3;WebFarmFramework) and got the following items added to the cart: But where do I get the App ID's? Aha, that's the trick. You can link to a list of cool packages, but you need to know the App ID's to link to them. To figure that out, I turned on tracing in Web Platform Installer (also handy if you're ever having trouble with a WebPI install) and from the trace logs saw that the list of packages is pulled from an XML file: DownloadManager Information: 0 : Loading product xml from: https://go.microsoft.com/?linkid=9763242 DownloadManager Verbose: 0 : Connecting to https://go.microsoft.com/?linkid=9763242 with (partial) headers: Referer: wpi://2.1.0.0/Microsoft Windows NT 6.1.7601 Service Pack 1 If-Modified-Since: Wed, 09 Mar 2011 14:15:27 GMT User-Agent:Platform-Installer/3.0.3.0(Microsoft Windows NT 6.1.7601 Service Pack 1) DownloadManager Information: 0 : https://go.microsoft.com/?linkid=9763242 responded with 302 DownloadManager Information: 0 : Response headers: HTTP/1.1 302 Found Cache-Control: private Content-Length: 175 Content-Type: text/html; charset=utf-8 Expires: Wed, 09 Mar 2011 22:52:28 GMT Location: https://www.microsoft.com/web/webpi/3.0/webproductlist.xml Server: Microsoft-IIS/7.5 X-AspNet-Version: 2.0.50727 X-Powered-By: ASP.NET Date: Wed, 09 Mar 2011 22:53:27 GMT Browsing to https://www.microsoft.com/web/webpi/3.0/webproductlist.xml shows the full list. You can search through that in your browser / text editor if you'd like, open it in Excel as an XML table, etc. Here's a list of the App ID's as of today: SMO SMO32 PHP52ForIISExpress PHP53ForIISExpress StaticContent DefaultDocument DirectoryBrowse HTTPErrors HTTPRedirection ASPNET NETExtensibility ASP CGI ISAPIExtensions ISAPIFilters ServerSideIncludes HTTPLogging LoggingTools RequestMonitor Tracing CustomLogging ODBCLogging BasicAuthentication WindowsAuthentication DigestAuthentication ClientCertificateMappingAuthentication IISClientCertificateMappingAuthentication URLAuthorization RequestFiltering IPSecurity StaticContentCompression DynamicContentCompression IISManagementConsole IISManagementScriptsAndTools ManagementService MetabaseAndIIS6Compatibility WASProcessModel WASNetFxEnvironment WASConfigurationAPI IIS6WPICompatibility IIS6ScriptingTools IIS6ManagementConsole LegacyFTPServer FTPServer WebDAV LegacyFTPManagementConsole FTPExtensibility AdminPack AdvancedLogging WebFarmFrameworkNonLoc ExternalCacheNonLoc WebFarmFramework WebFarmFrameworkv2 WebFarmFrameworkv2_beta ExternalCache ECacheUpdate ARRv1 ARRv2Beta1 ARRv2Beta2 ARRv2RC ARRv2NonLoc ARRv2 ARRv2Update MVC MVCBeta MVCRC1 MVCRC2 DBManager DbManagerUpdate DynamicIPRestrictions DynamicIPRestrictionsUpdate DynamicIPRestrictionsLegacy DynamicIPRestrictionsBeta2 FTPOOB IISPowershellSnapin RemoteManager SEOToolkit VS2008RTM MySQL SQLDriverPHP52IIS SQLDriverPHP53IIS SQLDriverPHP52IISExpress SQLDriverPHP53IISExpress SQLExpress SQLManagementStudio SQLExpressAdv SQLExpressTools UrlRewrite UrlRewrite2 UrlRewrite2NonLoc UrlRewrite2RC UrlRewrite2Beta UrlRewrite10 UrlScan MVC3Installer MVC3 MVC3LocInstaller MVC3Loc MVC2 VWD VWD2010SP1Pack NETFramework4 WebMatrix WebMatrix_v1Refresh IISExpress IISExpress_v1 IIS7 AspWebPagesVS AspWebPagesVS_1_0 Plan9 Plan9Loc WebMatrix_WHP SQLCE SQLCETools SQLCEVSTools SQLCEVSTools_4_0 SQLCEVSToolsInstaller_4_0 SQLCEVSToolsInstallerNew_4_0 SQLCEVSToolsInstallerRepair_EN_4_0 SQLCEVSToolsInstallerRepair_JA_4_0 SQLCEVSToolsInstallerRepair_FR_4_0 SQLCEVSToolsInstallerRepair_DE_4_0 SQLCEVSToolsInstallerRepair_ES_4_0 SQLCEVSToolsInstallerRepair_IT_4_0 SQLCEVSToolsInstallerRepair_RU_4_0 SQLCEVSToolsInstallerRepair_KO_4_0 SQLCEVSToolsInstallerRepair_ZH_CN_4_0 SQLCEVSToolsInstallerRepair_ZH_TW_4_0 VWD2008 WebDAVOOB WDeploy WDeploy_v2 WDeployNoSMO WDeploy11 WinCache52 WinCache53 NETFramework35 WindowsImagingComponent VC9Redist NETFramework20SP2 WindowsInstaller31 PowerShell PowerShellMsu PowerShell2 WindowsInstaller45 FastCGIUpdate FastCGIBackport FastCGIIIS6 IIS51 IIS60 SQLNativeClient SQLNativeClient2008 SQLNativeClient2005 SQLCLRTypes SQLCLRTypes32 SMO_10_1 MySQLConnector PHP52 PHP53 PHPManager VSVWD2010Feature VWD2010WebFeature_0 VWD2010WebFeature_1 VWD2010WebFeature_2 VS2010SP1Prerequisite RIAServicesToolkitMay2010 Silverlight4Toolkit Silverlight4Tools VSLS SSMAMySQL WebsitePanel VS2010SP1Core VS2010SP1Installer VS2010SP1Pack MissingVWDOrVSVWD2010Feature VB2010Beta2Express VCS2010Beta2Express VC2010Beta2Express RIAServicesToolkitApr2010 VS2010Beta1 VS2010RC VS2010Beta2 VS2010Beta2Express VS2k8RTM VSCPP2k8RTM VSVB2k8RTM VSCS2k8RTM VSVWDFeature LegacyWinCache SQLExpress2005 SSMS2005

Read the article
MySQL – Scalability on Amazon RDS: Scale out to multiple RDS instances

- by Pinal Dave

Today, I’d like to discuss getting better MySQL scalability on Amazon RDS. The question of the day: “What can you do when a MySQL database needs to scale write-intensive workloads beyond the capabilities of the largest available machine on Amazon RDS?” Let’s take a look. In a typical EC2/RDS set-up, users connect to app servers from their mobile devices and tablets, computers, browsers, etc. Then app servers connect to an RDS instance (web/cloud services) and in some cases they might leverage some read-only replicas. Figure 1. A typical RDS instance is a single-instance database, with read replicas. This is not very good at handling high write-based throughput. As your application becomes more popular you can expect an increasing number of users, more transactions, and more accumulated data. User interactions can become more challenging as the application adds more sophisticated capabilities. The result of all this positive activity: your MySQL database will inevitably begin to experience scalability pressures. What can you do? Broadly speaking, there are four options available to improve MySQL scalability on RDS. 1. Larger RDS Instances – If you’re not already using the maximum available RDS instance, you can always scale up – to larger hardware. Bigger CPUs, more compute power, more memory et cetera. But the largest available RDS instance is still limited. And they get expensive. “High-Memory Quadruple Extra Large DB Instance”: 68 GB of memory 26 ECUs (8 virtual cores with 3.25 ECUs each) 64-bit platform High I/O Capacity Provisioned IOPS Optimized: 1000Mbps 2. Provisioned IOPs – You can get provisioned IOPs and higher throughput on the I/O level. However, there is a hard limit with a maximum instance size and maximum number of provisioned IOPs you can buy from Amazon and you simply cannot scale beyond these hardware specifications. 3. Leverage Read Replicas – If your application permits, you can leverage read replicas to offload some reads from the master databases. But there are a limited number of replicas you can utilize and Amazon generally requires some modifications to your existing application. And read-replicas don’t help with write-intensive applications. 4. Multiple Database Instances – Amazon offers a fourth option: “You can implement partitioning,thereby spreading your data across multiple database Instances” (Link) However, Amazon does not offer any guidance or facilities to help you with this. “Multiple database instances” is not an RDS feature. And Amazon doesn’t explain how to implement this idea. In fact, when asked, this is the response on an Amazon forum: Q: Is there any documents that describe the partition DB across multiple RDS? I need to use DB with more 1TB but exist a limitation during the create process, but I read in the any FAQ that you need to partition database, but I don’t find any documents that describe it. A: “DB partitioning/sharding is not an official feature of Amazon RDS or MySQL, but a technique to scale out database by using multiple database instances. The appropriate way to split data depends on the characteristics of the application or data set. Therefore, there is no concrete and specific guidance.” So now what? The answer is to scale out with ScaleBase. Amazon RDS with ScaleBase: What you get – MySQL Scalability! ScaleBase is specifically designed to scale out a single MySQL RDS instance into multiple MySQL instances. Critically, this is accomplished with no changes to your application code. Your application continues to “see” one database. ScaleBase does all the work of managing and enforcing an optimized data distribution policy to create multiple MySQL instances. With ScaleBase, data distribution, transactions, concurrency control, and two-phase commit are all 100% transparent and 100% ACID-compliant, so applications, services and tooling continue to interact with your distributed RDS as if it were a single MySQL instance. The result: now you can cost-effectively leverage multiple MySQL RDS instance to scale out write-intensive workloads to an unlimited number of users, transactions, and data. Amazon RDS with ScaleBase: What you keep – Everything! And how does this change your Amazon environment? 1. Keep your application, unchanged – There is no change your application development life-cycle at all. You still use your existing development tools, frameworks and libraries. Application quality assurance and testing cycles stay the same. And, critically, you stay with an ACID-compliant MySQL environment. 2. Keep your RDS value-added services – The value-added services that you rely on are all still available. Amazon will continue to handle database maintenance and updates for you. You can still leverage High Availability via Multi A-Z. And, if it benefits youra application throughput, you can still use read replicas. 3. Keep your RDS administration – Finally the RDS monitoring and provisioning tools you rely on still work as they did before. With your one large MySQL instance, now split into multiple instances, you can actually use less expensive, smallersmaller available RDS hardware and continue to see better database performance. Conclusion Amazon RDS is a tremendous service, but it doesn’t offer solutions to scale beyond a single MySQL instance. Larger RDS instances get more expensive. And when you max-out on the available hardware, you’re stuck. Amazon recommends scaling out your single instance into multiple instances for transaction-intensive apps, but offers no services or guidance to help you. This is where ScaleBase comes in to save the day. It gives you a simple and effective way to create multiple MySQL RDS instances, while removing all the complexities typically caused by “DIY” sharding andwith no changes to your applications . With ScaleBase you continue to leverage the AWS/RDS ecosystem: commodity hardware and value added services like read replicas, multi A-Z, maintenance/updates and administration with monitoring tools and provisioning. SCALEBASE ON AMAZON If you’re curious to try ScaleBase on Amazon, it can be found here – Download NOW. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: MySQL, PostADay, SQL, SQL Authority, SQL Optimization, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
SQL – Migrate Database from SQL Server to NuoDB – A Quick Tutorial

- by Pinal Dave

Data is growing exponentially and every organization with growing data is thinking of next big innovation in the world of Big Data. Big data is a indeed a future for every organization at one point of the time. Just like every other next big thing, big data has its own challenges and issues. The biggest challenge associated with the big data is to find the ideal platform which supports the scalability and growth of the data. If you are a regular reader of this blog, you must be familiar with NuoDB. I have been working with NuoDB for a while and their recent release is the best thus far. NuoDB is an elastically scalable SQL database that can run on local host, datacenter and cloud-based resources. A key feature of the product is that it does not require sharding (read more here). Last week, I was able to install NuoDB in less than 90 seconds and have explored their Explorer and Admin sections. You can read about my experiences in these posts: SQL – Step by Step Guide to Download and Install NuoDB – Getting Started with NuoDB SQL – Quick Start with Admin Sections of NuoDB – Manage NuoDB Database SQL – Quick Start with Explorer Sections of NuoDB – Query NuoDB Database Many SQL Authority readers have been following me in my journey to evaluate NuoDB. One of the frequently asked questions I’ve received from you is if there is any way to migrate data from SQL Server to NuoDB. The fact is that there is indeed a way to do so and NuoDB provides a fantastic tool which can help users to do it. NuoDB Migrator is a command line utility that supports the migration of Microsoft SQL Server, MySQL, Oracle, and PostgreSQL schemas and data to NuoDB. The migration to NuoDB is a three-step process: NuoDB Migrator generates a schema for a target NuoDB database It loads data into the target NuoDB database It dumps data from the source database Let’s see how we can migrate our data from SQL Server to NuoDB using a simple three-step approach. But before we do that we will create a sample database in MSSQL and later we will migrate the same database to NuoDB: Setup Step 1: Build a sample data CREATE DATABASE [Test]; CREATE TABLE [Department]( [DepartmentID] [smallint] NOT NULL, [Name] VARCHAR(100) NOT NULL, [GroupName] VARCHAR(100) NOT NULL, [ModifiedDate] [datetime] NOT NULL, CONSTRAINT [PK_Department_DepartmentID] PRIMARY KEY CLUSTERED ( [DepartmentID] ASC ) ) ON [PRIMARY]; INSERT INTO Department SELECT * FROM AdventureWorks2012.HumanResources.Department; Note that I am using the SQL Server AdventureWorks database to build this sample table but you can build this sample table any way you prefer. Setup Step 2: Install Java 64 bit Before you can begin the migration process to NuoDB, make sure you have 64-bit Java installed on your computer. This is due to the fact that the NuoDB Migrator tool is built in Java. You can download 64-bit Java for Windows, Mac OSX, or Linux from the following link: http://java.com/en/download/manual.jsp. One more thing to remember is that you make sure that the path in your environment settings is set to your JAVA_HOME directory or else the tool will not work. Here is how you can do it: Go to My Computer >> Right Click >> Select Properties >> Click on Advanced System Settings >> Click on Environment Variables >> Click on New and enter the following values. Variable Name: JAVA_HOME Variable Value: C:\Program Files\Java\jre7 Make sure you enter your Java installation directory in the Variable Value field. Setup Step 3: Install JDBC driver for SQL Server. There are two JDBC drivers available for SQL Server. Select the one you prefer to use by following one of the two links below: Microsoft JDBC Driver jTDS JDBC Driver In this example we will be using jTDS JDBC driver. Once you download the driver, move the driver to your NuoDB installation folder. In my case, I have moved the JAR file of the driver into the C:\Program Files\NuoDB\tools\migrator\jar folder as this is my NuoDB installation directory. Now we are all set to start the three-step migration process from SQL Server to NuoDB: Migration Step 1: NuoDB Schema Generation Here is the command I use to generate a schema of my SQL Server Database in NuoDB. First I go to the folder C:\Program Files\NuoDB\tools\migrator\bin and execute the nuodb-migrator.bat file. Note that my database name is ‘test’. Additionally my username and password is also ‘test’. You can see that my SQL Server database is running on my localhost on port 1433. Additionally, the schema of the table is ‘dbo’. nuodb-migrator schema –source.driver=net.sourceforge.jtds.jdbc.Driver –source.url=jdbc:jtds:sqlserver://localhost:1433/ –source.username=test –source.password=test –source.catalog=test –source.schema=dbo –output.path=/tmp/schema.sql The above script will generate a schema of all my SQL Server tables and will put it in the folder C:\tmp\schema.sql . You can open the schema.sql file and execute this file directly in your NuoDB instance. You can follow the link here to see how you can execute the SQL script in NuoDB. Please note that if you have not yet created the schema in the NuoDB database, you should create it before executing this step. Step 2: Generate the Dump File of the Data Once you have recreated your schema in NuoDB from SQL Server, the next step is very easy. Here we create a CSV format dump file, which will contain all the data from all the tables from the SQL Server database. The command to do so is very similar to the above command. Be aware that this step may take a bit of time based on your database size. nuodb-migrator dump –source.driver=net.sourceforge.jtds.jdbc.Driver –source.url=jdbc:jtds:sqlserver://localhost:1433/ –source.username=test –source.password=test –source.catalog=test –source.schema=dbo –output.type=csv –output.path=/tmp/dump.cat Once the above command is successfully executed you can find your CSV file in the C:\tmp\ folder. However, you do not have to do anything manually. The third and final step will take care of completing the migration process. Migration Step 3: Load the Data into NuoDB After building schema and taking a dump of the data, the very next step is essential and crucial. It will take the CSV file and load it into the NuoDB database. nuodb-migrator load –target.url=jdbc:com.nuodb://localhost:48004/mytest –target.schema=dbo –target.username=test –target.password=test –input.path=/tmp/dump.cat Please note that in the above script we are now targeting the NuoDB database, which we have already created with the name of “MyTest”. If the database does not exist, create it manually before executing the above script. I have kept the username and password as “test”, but please make sure that you create a more secure password for your database for security reasons. Voila! You’re Done That’s it. You are done. It took 3 setup and 3 migration steps to migrate your SQL Server database to NuoDB. You can now start exploring the database and build excellent, scale-out applications. In this blog post, I have done my best to come up with simple and easy process, which you can follow to migrate your app from SQL Server to NuoDB. Download NuoDB I strongly encourage you to download NuoDB and go through my 3-step migration tutorial from SQL Server to NuoDB. Additionally here are two very important blog post from NuoDB CTO Seth Proctor. He has written excellent blog posts on the concept of the Administrative Domains. NuoDB has this concept of an Administrative Domain, which is a collection of hosts that can run one or multiple databases. Each database has its own TEs and SMs, but all are managed within the Admin Console for that particular domain. http://www.nuodb.com/techblog/2013/03/11/getting-started-provisioning-a-domain/ http://www.nuodb.com/techblog/2013/03/14/getting-started-running-a-database/ Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: NuoDB

Read the article
Developer’s Life – Attitude and Communication – They Can Cause Problems – Notes from the Field #027

- by Pinal Dave

[Note from Pinal]: This is a 27th episode of Notes from the Field series. The biggest challenge for anyone is to understand human nature. We human have so many things on our mind at any moment of time. There are cases when what we say is not what we mean and there are cases where what we mean we do not say. We do say and things as per our mood and our agenda in mind. Sometimes there are incidents when our attitude creates confusion in the communication and we end up creating a situation which is absolutely not warranted. In this episode of the Notes from the Field series database expert Mike Walsh explains a very crucial issue we face in our career, which is not technical but more to relate to human nature. Read on this may be the best blog post you might read in recent times. In this week’s note from the field, I’m taking a slight departure from technical knowledge and concepts explained. We’ll be back to it next week, I’m sure. Pinal wanted us to explain some of the issues we bump into and how we see some of our customers arrive at problem situations and how we have helped get them back on the right track. Often it is a technical problem we are officially solving – but in a lot of cases as a consultant, we are really helping fix some communication difficulties. This is a technical blog post and not an “advice column” in a newspaper – but the longer I am a consultant, the more years I add to my experience in technology the more I learn that the vast majority of the problems we encounter have “soft skills” included in the chain of causes for the issue we are helping overcome. This is not going to be exhaustive but I hope that sharing four pieces of advice inspired by real issues starts a process of searching for places where we can be the cause of these challenges and look at fixing them in ourselves. Or perhaps we can begin looking at resolving them in teams that we manage. I’ll share three statements that I’ve either heard, read or said and talk about some of the communication or attitude challenges highlighted by the statement. 1 – “But that’s the SAN Administrator’s responsibility…” I heard that early on in my consulting career when talking with a customer who had serious corruption and no good recent backups – potentially no good backups at all. The statement doesn’t have to be this one exactly, but the attitude here is an attitude of “my job stops here, and I don’t care about the intent or principle of why I’m here.” It’s also a situation of having the attitude that as long as there is someone else to blame, I’m fine… You see in this case, the DBA had a suspicion that the backups were not being handled right. They were the DBA and they knew that they had responsibility to ensure SQL backups were good to go – it’s a basic requirement of a production DBA. In my “As A DBA Where Do I start?!” presentation, I argue that is job #1 of a DBA. But in this case, the thought was that there was someone else to blame. Rather than create extra work and take on responsibility it was decided to just let it be another team’s responsibility. This failed the company, the company’s customers and no one won. As technologists – we should strive to go the extra mile. If there is a lack of clarity around roles and responsibilities and we know it – we should push to get it resolved. Especially as the DBAs who should act as the advocates of the data contained in the databases we are responsible for. 2 – “We’ve always done it this way, it’s never caused a problem before!” Complacency. I have to say that many failures I’ve been paid good money to help recover from would have not happened had it been for an attitude of complacency. If any thoughts like this have entered your mind about your situation you may be suffering from it. If, while reading this, you get this sinking feeling in your stomach about that one thing you know should be fixed but haven’t done it.. Why don’t you stop and go fix it then come back.. “We should have better backups, but we’re on a SAN so we should be fine really.” “Technically speaking that could happen, but what are the chances?” “We’ll just clean that up as a fast follow” ..and so on. In the age of tightening IT budgets, increased expectations of up time, availability and performance there is no room for complacency. Our customers and business units expect – no demand – the best. Complacency says “we will give you second best or hopefully good enough and we accept the risk and know this may hurt us later. Sometimes an organization will opt for “good enough” and I agree with the concept that at times the perfect can be the enemy of the good. But when we make those decisions in a vacuum and are not reporting them up and discussing them as an organization that is different. That is us unilaterally choosing to do something less than the best and purposefully playing a game of chance. 3 – “This device must accept interference from other devices but not create any” I’ve paraphrased this one – but it’s something the Federal Communications Commission – a federal agency in the United States that regulates electronic communication – requires of all manufacturers of any device that could cause or receive interference electronically. I blogged in depth about this here (http://www.straightpathsql.com/archives/2011/07/relationship-advice-from-the-fcc/) so I won’t go into much detail other than to say this… If we all operated more on the premise that we should do our best to not be the cause of conflict, and to be less easily offended and less upset when we perceive offense life would be easier in many areas! This doesn’t always cause the issues we are called in to help out. Not directly. But where we see it is in unhealthy relationships between the various technology teams at a client. We’ll see teams hoarding knowledge, not sharing well with others and almost working against other teams instead of working with them. If you trace these problems back far enough it often stems from someone or some group of people violating this principle from the FCC. To Sum It Up Technology problems are easy to solve. At Linchpin People we help many customers get past the toughest technological challenge – and at the end of the day it is really just a repeatable process of pattern based troubleshooting, logical thinking and starting at the beginning and carefully stepping through to the end. It’s easy at the end of the day. The tough part of what we do as consultants is the people skills. Being able to help get teams working together, being able to help teams take responsibility, to improve team to team communication? That is the difficult part, and we get to use the soft skills on every engagement. Work on professional development (http://professionaldevelopment.sqlpass.org/) and see continuing improvement here, not just with technology. I can teach just about anyone how to be an excellent DBA and performance tuner, but some of these soft skills are much more difficult to teach. If you want to get started with performance analytics and triage of virtualized SQL Servers with the help of experts, read more over at Fix Your SQL Server. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: Notes from the Field, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
SQL SERVER – SSMS Automatically Generates TOP (100) PERCENT in Query Designer

- by pinaldave

Earlier this week, I was surfing various SQL forums to see what kind of help developer need in the SQL Server world. One of the question indeed caught my attention. I am here regenerating complete question as well scenario to illustrate the point in a precise manner. Additionally, I have added added second part of the question to give completeness. Question: I am trying to create a view in Query Designer (not in the New Query Window). Every time I am trying to create a view it always adds TOP (100) PERCENT automatically on the T-SQL script. No matter what I do, it always automatically adds the TOP (100) PERCENT to the script. I have attempted to copy paste from notepad, build a query and a few other things – there is no success. I am really not sure what I am doing wrong with Query Designer. Here is my query script: (I use AdventureWorks as a sample database) SELECT Person.Address.AddressID FROM Person.Address INNER JOIN Person.AddressType ON Person.Address.AddressID = Person.AddressType.AddressTypeID ORDER BY Person.Address.AddressID This script automatically replaces by following query: SELECT TOP (100) PERCENT Person.Address.AddressID FROM Person.Address INNER JOIN Person.AddressType ON Person.Address.AddressID = Person.AddressType.AddressTypeID ORDER BY Person.Address.AddressID However, when I try to do the same from New Query Window it works totally fine. However, when I attempt to create a view of the same query it gives following error. Msg 1033, Level 15, State 1, Procedure myView, Line 6 The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP, OFFSET or FOR XML is also specified. It is pretty clear to me now that the script which I have written seems to need TOP (100) PERCENT, so Query . Why do I need it? Is there any work around to this issue. I particularly find this question pretty interesting as it really touches the fundamentals of the T-SQL query writing. Please note that the query which is automatically changed is not in New Query Editor but opened from SSMS using following way. Database >> Views >> Right Click >> New View (see the image below) Answer: The answer to the above question can be very long but I will keep it simple and to the point. There are three things to discuss in above script 1) Reason for Error 2) Reason for Auto generates TOP (100) PERCENT and 3) Potential solutions to the above error. Let us quickly see them in detail. 1) Reason for Error The reason for error is already given in the error. ORDER BY is invalid in the views and a few other objects. One has to use TOP or other keywords along with it. The way semantics of the query works where optimizer only follows(honors) the ORDER BY in the same scope or the same SELECT/UPDATE/DELETE statement. There is a possibility that one can order after the scope of the view again the efforts spend to order view will be wasted. The final resultset of the query always follows the final ORDER BY or outer query’s order and due to the same reason optimizer follows the final order of the query and not of the views (as view will be used in another query for further processing e.g. in SELECT statement). Due to same reason ORDER BY is now allowed in the view. For further accuracy and clear guidance I suggest you read this blog post by Query Optimizer Team. They have explained it very clear manner the same subject. 2) Reason for Auto Generated TOP (100) PERCENT One of the most popular workaround to above error is to use TOP (100) PERCENT in the view. Now TOP (100) PERCENT allows user to use ORDER BY in the query and allows user to overcome above error which we discussed. This gives the impression to the user that they have resolved the error and successfully able to use ORDER BY in the View. Well, this is incorrect as well. The way this works is when TOP (100) PERCENT is used the result is not guaranteed as well it is ignored in our the query where the view is used. Here is the blog post on this subject: Interesting Observation – TOP 100 PERCENT and ORDER BY. Now when you create a new view in the SSMS and build a query with ORDER BY to avoid the error automatically it adds the TOP 100 PERCENT. Here is the connect item for the same issue. I am sure there will be more connect items as well but I could not find them. 3) Potential Solutions If you are reading this post from the beginning in that case, it is clear by now that ORDER BY should not be used in the View as it does not serve any purpose unless there is a specific need of it. If you are going to use TOP 100 PERCENT with ORDER BY there is absolutely no need of using ORDER BY rather avoid using it all together. Here is another blog post of mine which describes the same subject ORDER BY Does Not Work – Limitation of the Views Part 1. It is valid to use ORDER BY in a view if there is a clear business need of using TOP with any other percentage lower than 100 (for example TOP 10 PERCENT or TOP 50 PERCENT etc). In most of the cases ORDER BY is not needed in the view and it should be used in the most outer query for present result in desired order. User can remove TOP 100 PERCENT and ORDER BY from the view before using the view in any query or procedure. In the most outer query there should be ORDER BY as per the business need. I think this sums up the concept in a few words. This is a very long topic and not easy to illustrate in one single blog post. I welcome your comments and suggestions. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Server Management Studio, SQL Tips and Tricks, SQL View, T SQL, Technology

Read the article
SQL SERVER – SSMS: Disk Usage Report

- by Pinal Dave

Let us start with humor! I think we the series on various reports, we come to a logical point. We covered all the reports at server level. This means the reports we saw were targeted towards activities that are related to instance level operations. These are mostly like how a doctor diagnoses a patient. At this point I am reminded of a dialog which I read somewhere: Patient: Doc, It hurts when I touch my head. Doc: Ok, go on. What else have you experienced? Patient: It hurts even when I touch my eye, it hurts when I touch my arms, it even hurts when I touch my feet, etc. Doc: Hmmm … Patient: I feel it hurts when I touch anywhere in my body. Doc: Ahh … now I get it. You need a plaster to your finger John. Sometimes the server level gives an indicator to what is happening in the system, but we need to get to the root cause for a specific database. So, this is the first blog in series where we would start discussing about database level reports. To launch database level reports, expand selected server in Object Explorer, expand the Databases folder, and then right-click any database for which we want to look at reports. From the menu, select Reports, then Standard Reports, and then any of database level reports. In this blog, we would talk about four “disk” reports because they are similar: Disk Usage Disk Usage by Top Tables Disk Usage by Table Disk Usage by Partition Disk Usage This report shows multiple information about the database. Let us discuss them one by one. We have divided the output into 5 different sections. Section 1 shows the high level summary of the database. It shows the space used by database files (mdf and ldf). Under the hood, the report uses, various DMVs and DBCC Commands, it is using sys.data_spaces and DBCC SHOWFILESTATS. Section 2 and 3 are pie charts. One for data file allocation and another for the transaction log file. Pie chart for “Data Files Space Usage (%)” shows space consumed data, indexes, allocated to the SQL Server database, and unallocated space which is allocated to the SQL Server database but not yet filled with anything. “Transaction Log Space Usage (%)” used DBCC SQLPERF (LOGSPACE) and shows how much empty space we have in the physical transaction log file. Section 4 shows the data from Default Trace and looks at Event IDs 92, 93, 94, 95 which are for “Data File Auto Grow”, “Log File Auto Grow”, “Data File Auto Shrink” and “Log File Auto Shrink” respectively. Here is an expanded view for that section. If default trace is not enabled, then this section would be replaced by the message “Trace Log is disabled” as highlighted below. Section 5 of the report uses DBCC SHOWFILESTATS to get information. Here is the enhanced version of that section. This shows the physical layout of the file. In case you have In-Memory Objects in the database (from SQL Server 2014), then report would show information about those as well. Here is the screenshot taken for a different database, which has In-Memory table. I have highlighted new things which are only shown for in-memory database. The new sections which are highlighted above are using sys.dm_db_xtp_checkpoint_files, sys.database_files and sys.data_spaces. The new type for in-memory OLTP is ‘FX’ in sys.data_space. The next set of reports is targeted to get information about a table and its storage. These reports can answer questions like: Which is the biggest table in the database? How many rows we have in table? Is there any table which has a lot of reserved space but its unused? Which partition of the table is having more data? Disk Usage by Top Tables This report provides detailed data on the utilization of disk space by top 1000 tables within the Database. The report does not provide data for memory optimized tables. Disk Usage by Table This report is same as earlier report with few difference. First Report shows only 1000 rows First Report does order by values in DMV sys.dm_db_partition_stats whereas second one does it based on name of the table. Both of the reports have interactive sort facility. We can click on any column header and change the sorting order of data. Disk Usage by Partition This report shows the distribution of the data in table based on partition in the table. This is so similar to previous output with the partition details now. Here is the query taken from profiler. SELECT row_number() OVER (ORDER BY a1.used_page_count DESC, a1.index_id) AS row_number , (dense_rank() OVER (ORDER BY a5.name, a2.name))%2 AS l1 , a1.OBJECT_ID , a5.name AS [schema] , a2.name , a1.index_id , a3.name AS index_name , a3.type_desc , a1.partition_number , a1.used_page_count * 8 AS total_used_pages , a1.reserved_page_count * 8 AS total_reserved_pages , a1.row_count FROM sys.dm_db_partition_stats a1 INNER JOIN sys.all_objects a2 ON ( a1.OBJECT_ID = a2.OBJECT_ID) AND a1.OBJECT_ID NOT IN (SELECT OBJECT_ID FROM sys.tables WHERE is_memory_optimized = 1) INNER JOIN sys.schemas a5 ON (a5.schema_id = a2.schema_id) LEFT OUTER JOIN sys.indexes a3 ON ( (a1.OBJECT_ID = a3.OBJECT_ID) AND (a1.index_id = a3.index_id) ) WHERE (SELECT MAX(DISTINCT partition_number) FROM sys.dm_db_partition_stats a4 WHERE (a4.OBJECT_ID = a1.OBJECT_ID)) >= 1 AND a2.TYPE <> N'S' AND a2.TYPE <> N'IT' ORDER BY a5.name ASC, a2.name ASC, a1.index_id, a1.used_page_count DESC, a1.partition_number Using all of the above reports, you should be able to get the usage of database files and also space used by tables. I think this is too much disk information for a single blog and I hope you have used them in the past to get data. Do let me know if you found anything interesting using these reports in your environments. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Server Management Studio, SQL Tips and Tricks, T SQL Tagged: SQL Reports

Read the article
SQL SERVER – Weekly Series – Memory Lane – #035

- by Pinal Dave

Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles I have selected a few of my most favorite articles and have listed them here with additional notes below it. Let me know which one of the following is your favorite article from memory lane. 2007 Row Overflow Data Explanation In SQL Server 2005 one table row can contain more than one varchar(8000) fields. One more thing, the exclusions has exclusions also the limit of each individual column max width of 8000 bytes does not apply to varchar(max), nvarchar(max), varbinary(max), text, image or xml data type columns. Comparison Index Fragmentation, Index De-Fragmentation, Index Rebuild – SQL SERVER 2000 and SQL SERVER 2005 An old but like a gold article. Talks about lots of concepts related to Index and the difference from earlier version to the newer version. I strongly suggest that everyone should read this article just to understand how SQL Server has moved forward with the technology. Improvements in TempDB SQL Server 2005 had come up with quite a lots of improvements and this blog post describes them and explains the same. If you ask me what is my the most favorite article from early career. I must point out to this article as when I wrote this one I personally have learned a lot of new things. Recompile All The Stored Procedure on Specific TableI prefer to recompile all the stored procedure on the table, which has faced mass insert or update. sp_recompiles marks stored procedures to recompile when they execute next time. This blog post explains the same with the help of a script. 2008 SQLAuthority Download – SQL Server Cheatsheet You can download and print this cheat sheet and use it for your personal reference. If you have any suggestions, please let me know and I will see if I can update this SQL Server cheat sheet. Difference Between DBMS and RDBMS What is the difference between DBMS and RDBMS? DBMS – Data Base Management System RDBMS – Relational Data Base Management System or Relational DBMS High Availability – Hot Add Memory Hot Add CPU and Hot Add Memory are extremely interesting features of the SQL Server, however, personally I have not witness them heavily used. These features also have few restriction as well. I blogged about them in detail. 2009 Delete Duplicate Rows I have demonstrated in this blog post how one can identify and delete duplicate rows. Interesting Observation of Logon Trigger On All Servers – Solution The question I put forth in my previous article was – In single login why the trigger fires multiple times; it should be fired only once. I received numerous answers in thread as well as in my MVP private news group. Now, let us discuss the answer for the same. The answer is – It happens because multiple SQL Server services are running as well as intellisense is turned on. Blog post demonstrates how we can do the same with the help of SQL scripts. Management Studio New Features I have selected my favorite 5 features and blogged about it. IntelliSense for Query Editing Multi Server Query Query Editor Regions Object Explorer Enhancements Activity Monitors Maximum Number of Index per Table One of the questions I asked in my user group was – What is the maximum number of Index per table? I received lots of answers to this question but only two answers are correct. Let us now take a look at them in this blog post. 2010 Default Statistics on Column – Automatic Statistics on Column The truth is, Statistics can be in a table even though there is no Index in it. If you have the auto- create and/or auto-update Statistics feature turned on for SQL Server database, Statistics will be automatically created on the Column based on a few conditions. Please read my previously posted article, SQL SERVER – When are Statistics Updated – What triggers Statistics to Update, for the specific conditions when Statistics is updated. 2011 T-SQL Scripts to Find Maximum between Two Numbers In this blog post there are two different scripts listed which demonstrates way to find the maximum number between two numbers. I need your help, which one of the script do you think is the most accurate way to find maximum number? Find Details for Statistics of Whole Database – DMV – T-SQL Script I was recently asked is there a single script which can provide all the necessary details about statistics for any database. This question made me write following script. I was initially planning to use sp_helpstats command but I remembered that this is marked to be deprecated in future. 2012 Introduction to Function SIGN SIGN Function is very fundamental function. It will return the value 1, -1 or 0. If your value is negative it will return you negative -1 and if it is positive it will return you positive +1. Let us start with a simple small example. Template Browser – A Very Important and Useful Feature of SSMS Templates are like a quick cheat sheet or quick reference. Templates are available to create objects like databases, tables, views, indexes, stored procedures, triggers, statistics, and functions. Templates are also available for Analysis Services as well. The template scripts contain parameters to help you customize the code. You can Replace Template Parameters dialog box to insert values into the script. An invalid floating point operation occurred If you run any of the above functions they will give you an error related to invalid floating point. Honestly there is no workaround except passing the function appropriate values. SQRT of a negative number will give you result in real numbers which is not supported at this point of time as well LOG of a negative number is not possible (because logarithm is the inverse function of an exponential function and the exponential function is NEVER negative). Validating Spatial Object with IsValidDetailed Function SQL Server 2012 has introduced the new function IsValidDetailed(). This function has made my life very easy. In simple words, this function will check if the spatial object passed is valid or not. If it is valid it will give information that it is valid. If the spatial object is not valid it will return the answer that it is not valid and the reason for the same. This makes it very easy to debug the issue and make the necessary correction. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
SQL SERVER – Faster SQL Server Databases and Applications – Power and Control with SafePeak Caching Options

- by Pinal Dave

Update: This blog post is written based on the SafePeak, which is available for free download. Today, I’d like to examine more closely one of my preferred technologies for accelerating SQL Server databases, SafePeak. Safepeak’s software provides a variety of advanced data caching options, techniques and tools to accelerate the performance and scalability of SQL Server databases and applications. I’d like to look more closely at some of these options, as some of these capabilities could help you address lagging database and performance on your systems. To better understand the available options, it is best to start by understanding the difference between the usual “Basic Caching” vs. SafePeak’s “Dynamic Caching”. Basic Caching Basic Caching (or the stale and static cache) is an ability to put the results from a query into cache for a certain period of time. It is based on TTL, or Time-to-live, and is designed to stay in cache no matter what happens to the data. For example, although the actual data can be modified due to DML commands (update/insert/delete), the cache will still hold the same obsolete query data. Meaning that with the Basic Caching is really static / stale cache. As you can tell, this approach has its limitations. Dynamic Caching Dynamic Caching (or the non-stale cache) is an ability to put the results from a query into cache while maintaining the cache transaction awareness looking for possible data modifications. The modifications can come as a result of: DML commands (update/insert/delete), indirect modifications due to triggers on other tables, executions of stored procedures with internal DML commands complex cases of stored procedures with multiple levels of internal stored procedures logic. When data modification commands arrive, the caching system identifies the related cache items and evicts them from cache immediately. In the dynamic caching option the TTL setting still exists, although its importance is reduced, since the main factor for cache invalidation (or cache eviction) become the actual data updates commands. Now that we have a basic understanding of the differences between “basic” and “dynamic” caching, let’s dive in deeper. SafePeak: A comprehensive and versatile caching platform SafePeak comes with a wide range of caching options. Some of SafePeak’s caching options are automated, while others require manual configuration. Together they provide a complete solution for IT and Data managers to reach excellent performance acceleration and application scalability for a wide range of business cases and applications. Automated caching of SQL Queries: Fully/semi-automated caching of all “read” SQL queries, containing any types of data, including Blobs, XMLs, Texts as well as all other standard data types. SafePeak automatically analyzes the incoming queries, categorizes them into SQL Patterns, identifying directly and indirectly accessed tables, views, functions and stored procedures; Automated caching of Stored Procedures: Fully or semi-automated caching of all read” stored procedures, including procedures with complex sub-procedure logic as well as procedures with complex dynamic SQL code. All procedures are analyzed in advance by SafePeak’s Metadata-Learning process, their SQL schemas are parsed – resulting with a full understanding of the underlying code, objects dependencies (tables, views, functions, sub-procedures) enabling automated or semi-automated (manually review and activate by a mouse-click) cache activation, with full understanding of the transaction logic for cache real-time invalidation; Transaction aware cache: Automated cache awareness for SQL transactions (SQL and in-procs); Dynamic SQL Caching: Procedures with dynamic SQL are pre-parsed, enabling easy cache configuration, eliminating SQL Server load for parsing time and delivering high response time value even in most complicated use-cases; Fully Automated Caching: SQL Patterns (including SQL queries and stored procedures) that are categorized by SafePeak as “read and deterministic” are automatically activated for caching; Semi-Automated Caching: SQL Patterns categorized as “Read and Non deterministic” are patterns of SQL queries and stored procedures that contain reference to non-deterministic functions, like getdate(). Such SQL Patterns are reviewed by the SafePeak administrator and in usually most of them are activated manually for caching (point and click activation); Fully Dynamic Caching: Automated detection of all dependent tables in each SQL Pattern, with automated real-time eviction of the relevant cache items in the event of “write” commands (a DML or a stored procedure) to one of relevant tables. A default setting; Semi Dynamic Caching: A manual cache configuration option enabling reducing the sensitivity of specific SQL Patterns to “write” commands to certain tables/views. An optimization technique relevant for cases when the query data is either known to be static (like archive order details), or when the application sensitivity to fresh data is not critical and can be stale for short period of time (gaining better performance and reduced load); Scheduled Cache Eviction: A manual cache configuration option enabling scheduling SQL Pattern cache eviction based on certain time(s) during a day. A very useful optimization technique when (for example) certain SQL Patterns can be cached but are time sensitive. Example: “select customers that today is their birthday”, an SQL with getdate() function, which can and should be cached, but the data stays relevant only until 00:00 (midnight); Parsing Exceptions Management: Stored procedures that were not fully parsed by SafePeak (due to too complex dynamic SQL or unfamiliar syntax), are signed as “Dynamic Objects” with highest transaction safety settings (such as: Full global cache eviction, DDL Check = lock cache and check for schema changes, and more). The SafePeak solution points the user to the Dynamic Objects that are important for cache effectiveness, provides easy configuration interface, allowing you to improve cache hits and reduce cache global evictions. Usually this is the first configuration in a deployment; Overriding Settings of Stored Procedures: Override the settings of stored procedures (or other object types) for cache optimization. For example, in case a stored procedure SP1 has an “insert” into table T1, it will not be allowed to be cached. However, it is possible that T1 is just a “logging or instrumentation” table left by developers. By overriding the settings a user can allow caching of the problematic stored procedure; Advanced Cache Warm-Up: Creating an XML-based list of queries and stored procedure (with lists of parameters) for periodically automated pre-fetching and caching. An advanced tool allowing you to handle more rare but very performance sensitive queries pre-fetch them into cache allowing high performance for users’ data access; Configuration Driven by Deep SQL Analytics: All SQL queries are continuously logged and analyzed, providing users with deep SQL Analytics and Performance Monitoring. Reduce troubleshooting from days to minutes with database objects and SQL Patterns heat-map. The performance driven configuration helps you to focus on the most important settings that bring you the highest performance gains. Use of SafePeak SQL Analytics allows continuous performance monitoring and analysis, easy identification of bottlenecks of both real-time and historical data; Cloud Ready: Available for instant deployment on Amazon Web Services (AWS). As you can see, there are many options to configure SafePeak’s SQL Server database and application acceleration caching technology to best fit a lot of situations. If you’re not familiar with their technology, they offer free-trial software you can download that comes with a free “help session” to help get you started. You can access the free trial here. Also, SafePeak is available to use on Amazon Cloud. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
SQL SERVER – Weekly Series – Memory Lane – #048

- by Pinal Dave

Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles I have selected a few of my most favorite articles and have listed them here with additional notes below it. Let me know which one of the following is your favorite article from memory lane. 2007 Order of Result Set of SELECT Statement on Clustered Indexed Table When ORDER BY is Not Used Above theory is true in most of the cases. However SQL Server does not use that logic when returning the resultset. SQL Server always returns the resultset which it can return fastest.In most of the cases the resultset which can be returned fastest is the resultset which is returned using clustered index. Effect of TRANSACTION on Local Variable – After ROLLBACK and After COMMIT One of the Jr. Developer asked me this question (What will be the Effect of TRANSACTION on Local Variable – After ROLLBACK and After COMMIT?) while I was rushing to an important meeting. I was getting late so I asked him to talk with his Application Tech Lead. When I came back from meeting both of them were looking for me. They said they are confused. I quickly wrote down following example for them. 2008 SQL SERVER – Guidelines and Coding Standards Complete List Download Coding standards and guidelines are very important for any developer on the path of a successful career. A coding standard is a set of guidelines, rules and regulations on how to write code. Coding standards should be flexible enough or should take care of the situation where they should not prevent best practices for coding. They are basically the guidelines that one should follow for better understanding. Download Guidelines and Coding Standards complete List Download Get Answer in Float When Dividing of Two Integer Many times we have requirements of some calculations amongst different fields in Tables. One of the software developers here was trying to calculate some fields having integer values and divide it which gave incorrect results in integer where accurate results including decimals was expected. Puzzle – Computed Columns Datatype Explanation SQL Server automatically does a cast to the data type having the highest precedence. So the result of INT and INT will be INT, but INT and FLOAT will be FLOAT because FLOAT has a higher precedence. If you want a different data type, you need to do an EXPLICIT cast. Renaming SP is Not Good Idea – Renaming Stored Procedure Does Not Update sys.procedures I have written many articles about renaming a tables, columns and procedures SQL SERVER – How to Rename a Column Name or Table Name, here I found something interesting about renaming the stored procedures and felt like sharing it with you all. The interesting fact is that when we rename a stored procedure using SP_Rename command, the Stored Procedure is successfully renamed. But when we try to test the procedure using sp_helptext, the procedure will be having the old name instead of new names. 2009 Insert Values of Stored Procedure in Table – Use Table Valued Function It is clear from the result set that , where I have converted stored procedure logic into the table valued function, is much better in terms of logic as it saves a large number of operations. However, this option should be used carefully. The performance of the stored procedure is “usually” better than that of functions. Interesting Observation – Index on Index View Used in Similar Query Recently, I was working on an optimization project for one of the largest organizations. While working on one of the queries, we came across a very interesting observation. We found that there was a query on the base table and when the query was run, it used the index, which did not exist in the base table. On careful examination, we found that the query was using the index that was on another view. This was very interesting as I have personally never experienced a scenario like this. In simple words, “Query on the base table can use the index created on the indexed view of the same base table.” Interesting Observation – Execution Plan and Results of Aggregate Concatenation Queries Working with SQL Server has never seemed to be monotonous – no matter how long one has worked with it. Quite often, I come across some excellent comments that I feel like acknowledging them as blog posts. Recently, I wrote an article on SQL SERVER – Execution Plan and Results of Aggregate Concatenation Queries Depend Upon Expression Location, which is well received in the community. 2010 I encourage all of you to go through complete series and write your own on the subject. If you write an article and send it to me, I will publish it on this blog with due credit to you. If you write on your own blog, I will update this blog post pointing to your blog post. SQL SERVER – ORDER BY Does Not Work – Limitation of the View 1 SQL SERVER – Adding Column is Expensive by Joining Table Outside View – Limitation of the View 2 SQL SERVER – Index Created on View not Used Often – Limitation of the View 3 SQL SERVER – SELECT * and Adding Column Issue in View – Limitation of the View 4 SQL SERVER – COUNT(*) Not Allowed but COUNT_BIG(*) Allowed – Limitation of the View 5 SQL SERVER – UNION Not Allowed but OR Allowed in Index View – Limitation of the View 6 SQL SERVER – Cross Database Queries Not Allowed in Indexed View – Limitation of the View 7 SQL SERVER – Outer Join Not Allowed in Indexed Views – Limitation of the View 8 SQL SERVER – SELF JOIN Not Allowed in Indexed View – Limitation of the View 9 SQL SERVER – Keywords View Definition Must Not Contain for Indexed View – Limitation of the View 10 SQL SERVER – View Over the View Not Possible with Index View – Limitations of the View 11 2011 Startup Parameters Easy to Configure If you are a regular reader of this blog, you must be aware that I have written about SQL Server Denali recently. Here is the quickest way to reach into the screen where we can change the startup parameters. Go to SQL Server Configuration Manager >> SQL Server Services >> Right Click on the Server >> Properties >> Startup Parameters 2012 Validating Unique Columnname Across Whole Database I sometimes come across very strange requirements and often I do not receive a proper explanation of the same. Here is the one of those examples. For example “Our business requirement is when we add new column we want it unique across current database.” Read the solution to this strange request in this blog post. Excel Losing Decimal Values When Value Pasted from SSMS ResultSet It is very common when users are coping the resultset to Excel, the floating point or decimals are missed. The solution is very much simple and it requires a small adjustment in the Excel. By default Excel is very smart and when it detects the value which is getting pasted is numeric it changes the column format to accommodate that. Basic Calculation and PEMDAS Order of Operation Read this interesting blog post for fantastic conversation about the subject. Copy Column Headers from Resultset – SQL in Sixty Seconds #027 – Video http://www.youtube.com/watch?v=x_-3tLqTRv0 Delete From Multiple Table – Update Multiple Table in Single Statement There are two questions which I get every single day multiple times. In my gmail, I have created standard canned reply for them. Let us see the questions here. I want to delete from multiple table in a single statement how will I do it? I want to update multiple table in a single statement how will I do it? Read the answer in the blog post. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
SQL SERVER – Weekly Series – Memory Lane – #052

- by Pinal Dave

Let us continue with the final episode of the Memory Lane Series. Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles I have selected a few of my most favorite articles and have listed them here with additional notes below it. Let me know which one of the following is your favorite article from memory lane. 2007 Set Server Level FILLFACTOR Using T-SQL Script Specifies a percentage that indicates how full the Database Engine should make the leaf level of each index page during index creation or alteration. fillfactor must be an integer value from 1 to 100. The default is 0. Limitation of Online Index Rebuld Operation Online operation means when online operations are happening in the database are in normal operational condition, the processes which are participating in online operations does not require exclusive access to the database. Get Permissions of My Username / Userlogin on Server / Database A few days ago, I was invited to one of the largest database company. I was asked to review database schema and propose changes to it. There was special username or user logic was created for me, so I can review their database. I was very much interested to know what kind of permissions I was assigned per server level and database level. I did not feel like asking Sr. DBA the question about permissions. Simple Example of WHILE Loop With CONTINUE and BREAK Keywords This question is one of those questions which is very simple and most of the users get it correct, however few users find it confusing for the first time. I have tried to explain the usage of simple WHILE loop in the first example. BREAK keyword will exit the stop the while loop and control is moved to the next statement after the while loop. CONTINUE keyword skips all the statement after its execution and control is sent to the first statement of while loop. Forced Parameterization and Simple Parameterization – T-SQL and SSMS When the PARAMETERIZATION option is set to FORCED, any literal value that appears in a SELECT, INSERT, UPDATE or DELETE statement is converted to a parameter during query compilation. When the PARAMETERIZATION database option is SET to SIMPLE, the SQL Server query optimizer may choose to parameterize the queries. 2008 Transaction and Local Variables – Swap Variables – Update All At Once Concept Summary : Transaction have no effect over memory variables. When UPDATE statement is applied over any table (physical or memory) all the updates are applied at one time together when the statement is committed. First of all I suggest that you read the article listed above about the effect of transaction on local variant. As seen there local variables are independent of any transaction effect. Simulate INNER JOIN using LEFT JOIN statement – Performance Analysis Just a day ago, while I was working with JOINs I find one interesting observation, which has prompted me to create following example. Before we continue further let me make very clear that INNER JOIN should be used where it cannot be used and simulating INNER JOIN using any other JOINs will degrade the performance. If there are scopes to convert any OUTER JOIN to INNER JOIN it should be done with priority. 2009 Introduction to Business Intelligence – Important Terms & Definitions Business intelligence (BI) is a broad category of application programs and technologies for gathering, storing, analyzing, and providing access to data from various data sources, thus providing enterprise users with reliable and timely information and analysis for improved decision making. Difference Between Candidate Keys and Primary Key Candidate Key – A Candidate Key can be any column or a combination of columns that can qualify as unique key in database. There can be multiple Candidate Keys in one table. Each Candidate Key can qualify as Primary Key. Primary Key – A Primary Key is a column or a combination of columns that uniquely identify a record. Only one Candidate Key can be Primary Key. 2010 Taking Multiple Backup of Database in Single Command – Mirrored Database Backup I recently had a very interesting experience. In one of my recent consultancy works, I was told by our client that they are going to take the backup of the database and will also a copy of it at the same time. I expressed that it was surely possible if they were going to use a mirror command. In addition, they told me that whenever they take two copies of the database, the size of the database, is always reduced. Now this was something not clear to me, I said it was not possible and so I asked them to show me the script. Corrupted Backup File and Unsuccessful Restore The CTO, who was also present at the location, got very upset with this situation. He then asked when the last successful restore test was done. As expected, the answer was NEVER.There were no successful restore tests done before. During that time, I was present and I could clearly see the stress, confusion, carelessness and anger around me. I did not appreciate the feeling and I was pretty sure that no one in there wanted the atmosphere like me. 2011 TRACEWRITE – Wait Type – Wait Related to Buffer and Resolution SQL Trace is a SQL Server database engine technology which monitors specific events generated when various actions occur in the database engine. When any event is fired it goes through various stages as well various routes. One of the routes is Trace I/O Provider, which sends data to its final destination either as a file or rowset. DATEDIFF – Accuracy of Various Dateparts If you want to have accuracy in seconds, you need to use a different approach. In the first example, the accurate method is to find the number of seconds first and then divide it by 60 to convert it in minutes. Dedicated Access Control for SQL Server Express Edition http://www.youtube.com/watch?v=1k00z82u4OI Book Signing at SQLPASS 2012 Who I Am And How I Got Here – True Story as Blog Post If there was a shortcut to success – I want to know. I learnt SQL Server hard way and I am still learning. There are so many things, I have to learn. There is not enough time to learn everything which we want to learn. I am constantly working on it every day. I welcome you to join my journey as well. Please join me in my journey to learn SQL Server – more the merrier. Vacation, Travel and Study – A New Concept Even those who have advanced degrees and went to college for years, or even decades, find studying hard. There is a difference between studying for a career and studying for a certification. At least to get a degree there is a variety of subjects, with labs, exams, and practice problems to make things more interesting. Order By Numeric Values Formatted as String We have a table which has a column containing alphanumeric data. The data always has first as an integer and later part as a string. The business need is to order the data based on the first part of the alphanumeric data which is an integer. Now the problem is that no matter how we use ORDER BY the result is not produced as expected. Let us understand this with an example. Resolving SQL Server Connection Errors – SQL in Sixty Seconds #030 – Video One of the most famous errors related to SQL Server is about connecting to SQL Server itself. Here is how it goes, most of the time developers have worked with SQL Server and knows pretty much every error which they face during development language. However, hardly they install fresh SQL Server. As the installation of the SQL Server is a rare occasion unless you are a DBA who is responsible for such an instance – the error faced during installations are pretty rare as well. http://www.youtube.com/watch?v=1k00z82u4OI Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
SQL SERVER – Thinking about Deprecated, Discontinued Features and Breaking Changes while Upgrading to SQL Server 2012 – Guest Post by Nakul Vachhrajani

- by pinaldave

Nakul Vachhrajani is a Technical Specialist and systems development professional with iGATE having a total IT experience of more than 7 years. Nakul is an active blogger with BeyondRelational.com (150+ blogs), and can also be found on forums at SQLServerCentral and BeyondRelational.com. Nakul has also been a guest columnist for SQLAuthority.com and SQLServerCentral.com. Nakul presented a webcast on the “Underappreciated Features of Microsoft SQL Server” at the Microsoft Virtual Tech Days Exclusive Webcast series (May 02-06, 2011) on May 06, 2011. He is also the author of a research paper on Database upgrade methodologies, which was published in a CSI journal, published nationwide. In addition to his passion about SQL Server, Nakul also contributes to the academia out of personal interest. He visits various colleges and universities as an external faculty to judge project activities being carried out by the students. Disclaimer: The opinions expressed herein are his own personal opinions and do not represent his employer’s view in anyway. Blog | LinkedIn | Twitter | Google+ Let us hear the thoughts of Nakul in first person - Those who have been following my blogs would be aware that I am recently running a series on the database engine features that have been deprecated in Microsoft SQL Server 2012. Based on the response that I have received, I was quite surprised to know that most of the audience found these to be breaking changes, when in fact, they were not! It was then that I decided to write a little piece on how to plan your database upgrade such that it works with the next version of Microsoft SQL Server. Please note that the recommendations made in this article are high-level markers and are intended to help you think over the specific steps that you would need to take to upgrade your database. Refer the documentation – Understand the terms Change is the only constant in this world. Therefore, whenever customer requirements, newer architectures and designs require software vendors to make a change to the keywords, functions, etc; they ensure that they provide their end users sufficient time to migrate over to the new standards before dropping off the old ones. Microsoft does that too with it’s Microsoft SQL Server product. Whenever a new SQL Server release is announced, it comes with a list of the following features: Breaking changes These are changes that would break your currently running applications, scripts or functionalities that are based on earlier version of Microsoft SQL Server These are mostly features whose behavior has been changed keeping in mind the newer architectures and designs Lesson: These are the changes that you need to be most worried about! Discontinued features These features are no longer available in the associated version of Microsoft SQL Server These features used to be “deprecated” in the prior release Lesson: Without these changes, your database would not be compliant/may not work with the version of Microsoft SQL Server under consideration Deprecated features These features are those that are still available in the current version of Microsoft SQL Server, but are scheduled for removal in a future version. These may be removed in either the next version or any other future version of Microsoft SQL Server The features listed for deprecation will compose the list of discontinued features in the next version of SQL Server Lesson: Plan to make necessary changes required to remove/replace usage of the deprecated features with the latest recommended replacements Once a feature appears on the list, it moves from bottom to the top, i.e. it is first marked as “Deprecated” and then “Discontinued”. We know of “Breaking change” comes later on in the product life cycle. What this means is that if you want to know what features would not work with SQL Server 2012 (and you are currently using SQL Server 2008 R2), you need to refer the list of breaking changes and discontinued features in SQL Server 2012. Use the tools! There are a lot of tools and technologies around us, but it is rarely that I find teams using these tools religiously and to the best of their potential. Below are the top two tools, from Microsoft, that I use every time I plan a database upgrade. The SQL Server Upgrade Advisor Ever since SQL Server 2005 was announced, Microsoft provides a small, very light-weight tool called the “SQL Server upgrade advisor”. The upgrade advisor analyzes installed components from earlier versions of SQL Server, and then generates a report that identifies issues to fix either before or after you upgrade. The analysis examines objects that can be accessed, such as scripts, stored procedures, triggers, and trace files. Upgrade Advisor cannot analyze desktop applications or encrypted stored procedures. Refer the links towards the end of the post to know how to get the Upgrade Advisor. The SQL Server Profiler Another great tool that you can use is the one most SQL Server developers & administrators use often – the SQL Server profiler. SQL Server Profiler provides functionality to monitor the “Deprecation” event, which contains: Deprecation announcement – equivalent to features to be deprecated in a future release of SQL Server Deprecation final support – equivalent to features to be deprecated in the next release of SQL Server You can learn more using the links towards the end of the post. A basic checklist There are a lot of finer points that need to be taken care of when upgrading your database. But, it would be worth-while to identify a few basic steps in order to make your database compliant with the next version of SQL Server: Monitor the current application workload (on a test bed) via the Profiler in order to identify usage of features marked as Deprecated If none appear, you are all set! (This almost never happens) Note down all the offending queries and feature usages Run analysis sessions using the SQL Server upgrade advisor on your database Based on the inputs from the analysis report and Profiler trace sessions, Incorporate solutions for the breaking changes first Next, incorporate solutions for the discontinued features Revisit and document the upgrade strategy for your deployment scenarios Revisit the fall-back, i.e. rollback strategies in case the upgrades fail Because some programming changes are dependent upon the SQL server version, this may need to be done in consultation with the development teams Before any other enhancements are incorporated by the development team, send out the database changes into QA QA strategy should involve a comparison between an environment running the old version of SQL Server against the new one Because minimal application changes have gone in (essential changes for SQL Server version compliance only), this would be possible As an ongoing activity, keep incorporating changes recommended as per the deprecated features list As a DBA, update your coding standards to ensure that the developers are using ANSI compliant code – this code will require a change only if the ANSI standard changes Remember this: Change management is a continuous process. Keep revisiting the product release notes and incorporate recommended changes to stay prepared for the next release of SQL Server. May the power of SQL Server be with you! Links Referenced in this post Breaking changes in SQL Server 2012: Link Discontinued features in SQL Server 2012: Link Get the upgrade advisor from the Microsoft Download Center at: Link Upgrade Advisor page on MSDN: Link Profiler: Review T-SQL code to identify objects no longer supported by Microsoft: Link Upgrading to SQL Server 2012 by Vinod Kumar: Link Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: Upgrade

Read the article
SQL SERVER – Core Concepts – Elasticity, Scalability and ACID Properties – Exploring NuoDB an Elastically Scalable Database System

- by pinaldave

I have been recently exploring Elasticity and Scalability attributes of databases. You can see that in my earlier blog posts about NuoDB where I wanted to look at Elasticity and Scalability concepts. The concepts are very interesting, and intriguing as well. I have discussed these concepts with my friend Joyti M and together we have come up with this interesting read. The goal of this article is to answer following simple questions What is Elasticity? What is Scalability? How ACID properties vary from NOSQL Concepts? What are the prevailing problems in the current database system architectures? Why is NuoDB an innovative and welcome change in database paradigm? Elasticity This word’s original form is used in many different ways and honestly it does do a decent job in holding things together over the years as a person grows and contracts. Within the tech world, and specifically related to software systems (database, application servers), it has come to mean a few things - allow stretching of resources without reaching the breaking point (on demand). What are resources in this context? Resources are the usual suspects – RAM/CPU/IO/Bandwidth in the form of a container (a process or bunch of processes combined as modules). When it is about increasing resources the simplest idea which comes to mind is the addition of another container. Another container means adding a brand new physical node. When it is about adding a new node there are two questions which comes to mind. 1) Can we add another node to our software system? 2) If yes, does adding new node cause downtime for the system? Let us assume we have added new node, let us see what the new needs of the system are when a new node is added. Balancing incoming requests to multiple nodes Synchronization of a shared state across multiple nodes Identification of “downstate” and resolution action to bring it to “upstate” Well, adding a new node has its advantages as well. Here are few of the positive points Throughput can increase nearly horizontally across the node throughout the system Response times of application will increase as in-between layer interactions will be improved Now, Let us put the above concepts in the perspective of a Database. When we mention the term “running out of resources” or “application is bound to resources” the resources can be CPU, Memory or Bandwidth. The regular approach to “gain scalability” in the database is to look around for bottlenecks and increase the bottlenecked resource. When we have memory as a bottleneck we look at the data buffers, locks, query plans or indexes. After a point even this is not enough as there needs to be an efficient way of managing such large workload on a “single machine” across memory and CPU bound (right kind of scheduling) workload. We next move on to either read/write separation of the workload or functionality-based sharing so that we still have control of the individual. But this requires lots of planning and change in client systems in terms of knowing where to go/update/read and for reporting applications to “aggregate the data” in an intelligent way. What we ideally need is an intelligent layer which allows us to do these things without us getting into managing, monitoring and distributing the workload. Scalability In the context of database/applications, scalability means three main things Ability to handle normal loads without pressure E.g. X users at the Y utilization of resources (CPU, Memory, Bandwidth) on the Z kind of hardware (4 processor, 32 GB machine with 15000 RPM SATA drives and 1 GHz Network switch) with T throughput Ability to scale up to expected peak load which is greater than normal load with acceptable response times Ability to provide acceptable response times across the system E.g. Response time in S milliseconds (or agreed upon unit of measure) – 90% of the time The Issue – Need of Scale In normal cases one can plan for the load testing to test out normal, peak, and stress scenarios to ensure specific hardware meets the needs. With help from Hardware and Software partners and best practices, bottlenecks can be identified and requisite resources added to the system. Unfortunately this vertical scale is expensive and difficult to achieve and most of the operational people need the ability to scale horizontally. This helps in getting better throughput as there are physical limits in terms of adding resources (Memory, CPU, Bandwidth and Storage) indefinitely. Today we have different options to achieve scalability: Read & Write Separation The idea here is to do actual writes to one store and configure slaves receiving the latest data with acceptable delays. Slaves can be used for balancing out reads. We can also explore functional separation or sharing as well. We can separate data operations by a specific identifier (e.g. region, year, month) and consolidate it for reporting purposes. For functional separation the major disadvantage is when schema changes or workload pattern changes. As the requirement grows one still needs to deal with scale need in manual ways by providing an abstraction in the middle tier code. Using NOSQL solutions The idea is to flatten out the structures in general to keep all values which are retrieved together at the same store and provide flexible schema. The issue with the stores is that they are compromising on mostly consistency (no ACID guarantees) and one has to use NON-SQL dialect to work with the store. The other major issue is about education with NOSQL solutions. Would one really want to make these compromises on the ability to connect and retrieve in simple SQL manner and learn other skill sets? Or for that matter give up on ACID guarantee and start dealing with consistency issues? Hybrid Deployment – Mac, Linux, Cloud, and Windows One of the challenges today that we see across On-premise vs Cloud infrastructure is a difference in abilities. Take for example SQL Azure – it is wonderful in its concepts of throttling (as it is shared deployment) of resources and ability to scale using federation. However, the same abilities are not available on premise. This is not a mistake, mind you – but a compromise of the sweet spot of workloads, customer requirements and operational SLAs which can be supported by the team. In today’s world it is imperative that databases are available across operating systems – which are a commodity and used by developers of all hues. An Ideal Database Ability List A system which allows a linear scale of the system (increase in throughput with reasonable response time) with the addition of resources A system which does not compromise on the ACID guarantees and require developers to learn new paradigms A system which does not force fit a new way interacting with database by learning Non-SQL dialect A system which does not force fit its mechanisms for providing availability across its various modules. Well NuoDB is the first database which has all of the above abilities and much more. In future articles I will cover my hands-on experience with it. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: NuoDB

Read the article
SQL SERVER – Weekly Series – Memory Lane – #032

- by Pinal Dave

Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles I have selected a few of my most favorite articles and have listed them here with additional notes below it. Let me know which one of the following is your favorite article from memory lane. 2007 Complete Series of Database Coding Standards and Guidelines SQL SERVER Database Coding Standards and Guidelines – Introduction SQL SERVER – Database Coding Standards and Guidelines – Part 1 SQL SERVER – Database Coding Standards and Guidelines – Part 2 SQL SERVER Database Coding Standards and Guidelines Complete List Download Explanation and Example – SELF JOIN When all of the data you require is contained within a single table, but data needed to extract is related to each other in the table itself. Examples of this type of data relate to Employee information, where the table may have both an Employee’s ID number for each record and also a field that displays the ID number of an Employee’s supervisor or manager. To retrieve the data tables are required to relate/join to itself. Insert Multiple Records Using One Insert Statement – Use of UNION ALL This is very interesting question I have received from new developer. How can I insert multiple values in table using only one insert? Now this is interesting question. When there are multiple records are to be inserted in the table following is the common way using T-SQL. Function to Display Current Week Date and Day – Weekly Calendar Straight blog post with script to find current week date and day based on the parameters passed in the function. 2008 In my beginning years, I have almost same confusion as many of the developer had in their earlier years. Here are two of the interesting question which I have attempted to answer in my early year. Even if you are experienced developer may be you will still like to read following two questions: Order Of Column In Index Order of Conditions in WHERE Clauses Example of DISTINCT in Aggregate Functions Have you ever used DISTINCT with the Aggregation Function? Here is a simple example about how users can do it. Create a Comma Delimited List Using SELECT Clause From Table Column Straight to script example where I explained how to do something easy and quickly. Compound Assignment Operators SQL SERVER 2008 has introduced new concept of Compound Assignment Operators. Compound Assignment Operators are available in many other programming languages for quite some time. Compound Assignment Operators is operator where variables are operated upon and assigned on the same line. PIVOT and UNPIVOT Table Examples Here is a very interesting question – the answer to the question can be YES or NO both. “If we PIVOT any table and UNPIVOT that table do we get our original table?” Read the blog post to get the explanation of the question above. 2009 What is Interim Table – Simple Definition of Interim Table The interim table is a table that is generated by joining two tables and not the final result table. In other words, when two tables are joined they create an interim table as resultset but the resultset is not final yet. It may be possible that more tables are about to join on the interim table, and more operations are still to be applied on that table (e.g. Order By, Having etc). Besides, it may be possible that there is no interim table; sometimes final table is what is generated when the query is run. 2010 Stored Procedure and Transactions If Stored Procedure is transactional then, it should roll back complete transactions when it encounters any errors. Well, that does not happen in this case, which proves that Stored Procedure does not only provide just the transactional feature to a batch of T-SQL. Generate Database Script for SQL Azure When talking about SQL Azure the most common complaint I hear is that the script generated from stand-along SQL Server database is not compatible with SQL Azure. This was true for some time for sure but not any more. If you have SQL Server 2008 R2 installed you can follow the guideline below to generate a script which is compatible with SQL Azure. Convert IN to EXISTS – Performance Talk It is NOT necessary that every time when IN is replaced by EXISTS it gives better performance. However, in our case listed above it does for sure give better performance. You can read about this subject in the associated blog post. Subquery or Join – Various Options – SQL Server Engine Knows the Best Every single time whenever there is a performance tuning exercise, I hear the conversation from developer where some prefer subquery and some prefer join. In this two part blog post, I explain the same in the detail with examples. Part 1 | Part 2 Merge Operations – Insert, Update, Delete in Single Execution MERGE is a new feature that provides an efficient way to do multiple DML operations. In earlier versions of SQL Server, we had to write separate statements to INSERT, UPDATE, or DELETE data based on certain conditions; however, at present, by using the MERGE statement, we can include the logic of such data changes in one statement that even checks when the data is matched and then just update it, and similarly, when the data is unmatched, it is inserted. 2011 Puzzle – Statistics are not updated but are Created Once Here is the quick scenario about my setup. Create Table Insert 1000 Records Check the Statistics Now insert 10 times more 10,000 indexes Check the Statistics – it will be NOT updated – WHY? Question to You – When to use Function and When to use Stored Procedure Personally, I believe that they are both different things - they cannot be compared. I can say, it will be like comparing apples and oranges. Each has its own unique use. However, they can be used interchangeably at many times and in real life (i.e., production environment). I have personally seen both of these being used interchangeably many times. This is the precise reason for asking this question. 2012 In year 2012 I had two interesting series ran on the blog. If there is no fun in learning, the learning becomes a burden. For the same reason, I had decided to build a three part quiz around SEQUENCE. The quiz was to identify the next value of the sequence. I encourage all of you to take part in this fun quiz. Guess the Next Value – Puzzle 1 Guess the Next Value – Puzzle 2 Guess the Next Value – Puzzle 3 Guess the Next Value – Puzzle 4 Simple Example to Configure Resource Governor – Introduction to Resource Governor Resource Governor is a feature which can manage SQL Server Workload and System Resource Consumption. We can limit the amount of CPU and memory consumption by limiting /governing /throttling on the SQL Server. If there are different workloads running on SQL Server and each of the workload needs different resources or when workloads are competing for resources with each other and affecting the performance of the whole server resource governor is a very important task. Tricks to Replace SELECT * with Column Names – SQL in Sixty Seconds #017 – Video Retrieves unnecessary columns and increases network traffic When a new columns are added views needs to be refreshed manually Leads to usage of sub-optimal execution plan Uses clustered index in most of the cases instead of using optimal index It is difficult to debug SQL SERVER – Load Generator – Free Tool From CodePlex The best part of this SQL Server Load Generator is that users can run multiple simultaneous queries again SQL Server using different login account and different application name. The interface of the tool is extremely easy to use and very intuitive as well. A Puzzle – Swap Value of Column Without Case Statement Let us assume there is a single column in the table called Gender. The challenge is to write a single update statement which will flip or swap the value in the column. For example if the value in the gender column is ‘male’ swap it with ‘female’ and if the value is ‘female’ swap it with ‘male’. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
SQL SERVER – Weekly Series – Memory Lane – #050

- by Pinal Dave

Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles I have selected a few of my most favorite articles and have listed them here with additional notes below it. Let me know which one of the following is your favorite article from memory lane. 2007 Executing Remote Stored Procedure – Calling Stored Procedure on Linked Server In this example we see two different methods of how to call Stored Procedures remotely. Connection Property of SQL Server Management Studio SSMS A very simple example of the how to build connection properties for SQL Server with the help of SSMS. Sample Example of RANKING Functions – ROW_NUMBER, RANK, DENSE_RANK, NTILE SQL Server has a total of 4 ranking functions. Ranking functions return a ranking value for each row in a partition. All the ranking functions are non-deterministic. T-SQL Script to Add Clustered Primary Key Jr. DBA asked me three times in a day, how to create Clustered Primary Key. I gave him following sample example. That was the last time he asked “How to create Clustered Primary Key to table?” 2008 2008 – TRIM() Function – User Defined Function SQL Server does not have functions which can trim leading or trailing spaces of any string at the same time. SQL does have LTRIM() and RTRIM() which can trim leading and trailing spaces respectively. SQL Server 2008 also does not have TRIM() function. User can easily use LTRIM() and RTRIM() together and simulate TRIM() functionality. http://www.youtube.com/watch?v=1-hhApy6MHM 2009 Earlier I have written two different articles on the subject Remove Bookmark Lookup. This article is as part 3 of original article. Please read the first two articles here before continuing reading this article. Query Optimization – Remove Bookmark Lookup – Remove RID Lookup – Remove Key Lookup Query Optimization – Remove Bookmark Lookup – Remove RID Lookup – Remove Key Lookup – Part 2 Query Optimization – Remove Bookmark Lookup – Remove RID Lookup – Remove Key Lookup – Part 3 Interesting Observation – Query Hint – FORCE ORDER SQL Server never stops to amaze me. As regular readers of this blog already know that besides conducting corporate training, I work on large-scale projects on query optimizations and server tuning projects. In one of the recent projects, I have noticed that a Junior Database Developer used the query hint Force Order; when I asked for details, I found out that the basic concept was not properly understood by him. Queries Waiting for Memory Allocation to Execute In one of the recent projects, I was asked to create a report of queries that are waiting for memory allocation. The reason was that we were doubtful regarding whether the memory was sufficient for the application. The following query can be useful in similar cases. Queries that do not have to wait on a memory grant will not appear in the result set of following query. 2010 Quickest Way to Identify Blocking Query and Resolution – Dirty Solution As the title suggests, this is quite a dirty solution; it’s not as elegant as you expect. However, it works totally fine. Simple Explanation of Data Type Precedence While I was working on creating a question for SQL SERVER – SQL Quiz – The View, The Table and The Clustered Index Confusion, I had actually created yet another question along with this question. However, I felt that the one which is posted on the SQL Quiz is much better than this one because what makes that more challenging question is that it has a multiple answer. Encrypted Stored Procedure and Activity Monitor I recently had received questionable if any stored procedure is encrypted can we see its definition in Activity Monitor.Answer is - No. Let us do a quick test. Let us create following Stored Procedure and then launch the Activity Monitor and check the text. Indexed View always Use Index on Table A single table can have maximum 249 non clustered indexes and 1 clustered index. In SQL Server 2008, a single table can have maximum 999 non clustered indexes and 1 clustered index. It is widely believed that a table can have only 1 clustered index, and this belief is true. I have some questions for all of you. Let us assume that I am creating view from the table itself and then create a clustered index on it. In my view, I am selecting the complete table itself. 2011 Detecting Database Case Sensitive Property using fn_helpcollations() I received a question on how to determine the case sensitivity of the database. The quick answer to this is to identify the collation of the database and check the properties of the collation. I have previously written how one can identify database collation. Once you have figured out the collation of the database, you can put that in the WHERE condition of the following T-SQL and then check the case sensitivity from the description. Server Side Paging in SQL Server CE (Compact Edition) SQL Server Denali is coming up with new T-SQL of Paging. I have written about the same earlier.SQL SERVER – Server Side Paging in SQL Server Denali – A Better Alternative, SQL SERVER – Server Side Paging in SQL Server Denali Performance Comparison, SQL SERVER – Server Side Paging in SQL Server Denali – Part2 What is very interesting is that SQL Server CE 4.0 have the same feature introduced. Here is the quick example of the same. To run the script in the example, you will have to do installWebmatrix 4.0 and download sample database. Once done you can run following script. Why I am Going to Attend PASS Summit Unite 2011 The four-day event will be marked by a lot of learning, sharing, and networking, which will help me increase both my knowledge and contacts. Every year, PASS Summit provides me a golden opportunity to build my network as well as to identify and meet potential customers or employees. 2012 Manage Help Settings – CTRL + ALT + F1 This is very interesting read as my daughter once accidently came across a screen in SQL Server Management Studio. It took me 2-3 minutes to figure out how she has created the same screen. Recover the Accidentally Renamed Table “I accidentally renamed table in my SSMS. I was scrolling very fast and I made mistakes. It was either because I double clicked or clicked on F2 (shortcut key for renaming). However, I have made the mistake and now I have no idea how to fix this. If you have renamed the table, I think you pretty much is out of luck. Here are few things which you can do which can give you an idea about what your table name can be if you are lucky. Identify Numbers of Non Clustered Index on Tables for Entire Database Here is the script which will give you numbers of non clustered indexes on any table in entire database. Identify Most Resource Intensive Queries – SQL in Sixty Seconds #029 – Video Here is the complete complete script which I have used in the SQL in Sixty Seconds Video. Thanks Harsh for important Tip in the comment. http://www.youtube.com/watch?v=3kDHC_Tjrns Advanced Data Quality Services with Melissa Data – Azure Data Market For the purposes of the review, I used a database I had in an Excel spreadsheet with name and address information. Upon a cursory inspection, there are miscellaneous problems with these records; some addresses are missing ZIP codes, others missing a city, and some records are slightly misspelled or have unparsed suites. With DQS, I can easily add a knowledge base to help standardize my values, such as for state abbreviations. But how do I know that my address is correct? Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
SQL SERVER – Weekly Series – Memory Lane – #034

- by Pinal Dave

Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles I have selected a few of my most favorite articles and have listed them here with additional notes below it. Let me know which one of the following is your favorite article from memory lane. 2007 UDF – User Defined Function to Strip HTML – Parse HTML – No Regular Expression The UDF used in the blog does fantastic task – it scans entire HTML text and removes all the HTML tags. It keeps only valid text data without HTML task. This is one of the quite commonly requested tasks many developers have to face everyday. De-fragmentation of Database at Operating System to Improve Performance Operating system skips MDF file while defragging the entire filesystem of the operating system. It is absolutely fine and there is no impact of the same on performance. Read the entire blog post for my conversation with our network engineers. Delay Function – WAITFOR clause – Delay Execution of Commands How do you delay execution of the commands in SQL Server – ofcourse by using WAITFOR keyword. In this blog post, I explain the same with the help of T-SQL script. Find Length of Text Field To measure the length of TEXT fields the function is DATALENGTH(textfield). Len will not work for text field. As of SQL Server 2005, developers should migrate all the text fields to VARCHAR(MAX) as that is the way forward. Retrieve Current Date Time in SQL Server CURRENT_TIMESTAMP, GETDATE(), {fn NOW()} There are three ways to retrieve the current datetime in SQL SERVER. CURRENT_TIMESTAMP, GETDATE(), {fn NOW()} Explanation and Comparison of NULLIF and ISNULL An interesting observation is NULLIF returns null if it comparison is successful, whereas ISNULL returns not null if its comparison is successful. In one way they are opposite to each other. Here is my question to you - How to create infinite loop using NULLIF and ISNULL? If this is even possible? 2008 Introduction to SERVERPROPERTY and example SERVERPROPERTY is a very interesting system function. It returns many of the system values. I use it very frequently to get different server values like Server Collation, Server Name etc. SQL Server Start Time We can use DMV to find out what is the start time of SQL Server in 2008 and later version. In this blog you can see how you can do the same. Find Current Identity of Table Many times we need to know what is the current identity of the column. I have found one of my developers using aggregated function MAX () to find the current identity. However, I prefer following DBCC command to figure out current identity. Create Check Constraint on Column Some time we just need to create a simple constraint over the table but I have noticed that developers do many different things to make table column follow rules than just creating constraint. I suggest constraint is a very useful concept and every SQL Developer should pay good attention to this subject. 2009 List Schema Name and Table Name for Database This is one of the blog post where I straight forward display script. One of the kind of blog posts, which I still love to read and write. Clustered Index on Separate Drive From Table Location A table devoid of primary key index is called heap, and here data is not arranged in a particular order, which gives rise to issues that adversely affect performance. Data must be stored in some kind of order. If we put clustered index on it then the order will be forced by that index and the data will be stored in that particular order. Understanding Table Hints with Examples Hints are options and strong suggestions specified for enforcement by the SQL Server query processor on DML statements. The hints override any execution plan the query optimizer might select for a query. 2010 Data Pages in Buffer Pool – Data Stored in Memory Cache One of my earlier year article, which I still read it many times and point developers to read it again. It is clear from the Resultset that when more than one index is used, datapages related to both or all of the indexes are stored in Memory Cache separately. TRANSACTION, DML and Schema Locks Can you create a situation where you can see Schema Lock? Well, this is a very simple question, however during the interview I notice over 50 candidates failed to come up with the scenario. In this blog post, I have demonstrated the situation where we can see the schema lock in database. 2011 Solution – Puzzle – Statistics are not updated but are Created Once In this example I have created following situation: Create Table Insert 1000 Records Check the Statistics Now insert 10 times more 10,000 indexes Check the Statistics – it will be NOT updated Auto Update Statistics and Auto Create Statistics for database is TRUE Now I have requested two things in the example 1) Why this is happening? 2) How to fix this issue? Selecting Domain from Email Address This is a straight to script blog post where I explain how to select only domain name from entire email address. Solution – Generating Zero Without using Any Numbers in T-SQL How to get zero digit without using any digit? This is indeed a very interesting question and the answer is even interesting. Try to come up with answer in next 10 minutes and if you can’t come up with the answer the blog post read this post for solution. 2012 Simple Explanation and Puzzle with SOUNDEX Function and DIFFERENCE Function In simple words - SOUNDEX converts an alphanumeric string to a four-character code to find similar-sounding words or names. DIFFERENCE function returns an integer value. The integer returned is the number of characters in the SOUNDEX values that are the same. Read Only Files and SQL Server Management Studio (SSMS) I have come across a very interesting feature in SSMS related to “Read Only” files. I believe it is a little unknown feature as well so decided to write a blog about the same. Identifying Column Data Type of uniqueidentifier without Querying System Tables How do I know if any table has a uniqueidentifier column and what is its value without using any DMV or System Catalogues? Only information you know is the table name and you are allowed to return any kind of error if the table does not have uniqueidentifier column. Read the blog post to find the answer. Solution – User Not Able to See Any User Created Object in Tables – Security and Permissions Issue Interesting question – “When I try to connect to SQL Server, it lets me connect just fine as well let me open and explore the database. I noticed that I do not see any user created instances but when my colleague attempts to connect to the server, he is able to explore the database as well see all the user created tables and other objects. Can you help me fix it?” Importing CSV File Into Database – SQL in Sixty Seconds #018 – Video Here is interesting small 60 second video on how to import CSV file into Database. ColumnStore Index – Batch Mode vs Row Mode Here is the logic behind when Columnstore Index uses Batch Mode and when it uses Row Mode. A batch typically represents about 1000 rows of data. Batch mode processing also uses algorithms that are optimized for the multicore CPUs and increased memory throughput. Follow up – Usage of $rowguid and $IDENTITY This is an excellent follow up blog post of my earlier blog post where I explain where to use $rowguid and $identity. If you do not know the difference between them, this is a blog with a script example. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
SQL SERVER – Using expressor Composite Types to Enforce Business Rules

- by pinaldave

One of the features that distinguish the expressor Data Integration Platform from other products in the data integration space is its concept of composite types, which provide an effective and easily reusable way to clearly define the structure and characteristics of data within your application. An important feature of the composite type approach is that it allows you to easily adjust the content of a record to its ultimate purpose. For example, a record used to update a row in a database table is easily defined to include only the minimum set of columns, that is, a value for the key column and values for only those columns that need to be updated. Much like a class in higher level programming languages, you can also use the composite type as a way to enforce business rules onto your data by encapsulating a datum’s name, data type, and constraints (for example, maximum, minimum, or acceptable values) as a single entity, which ensures that your data can not assume an invalid value. To what extent you use this functionality is a decision you make when designing your application; the expressor design paradigm does not force this approach on you. Let’s take a look at how these features are used. Suppose you want to create a group of applications that maintain the employee table in your human resources database. Your table might have a structure similar to the HumanResources.Employee table in the AdventureWorks database. This table includes two columns, EmployeID and rowguid, that are maintained by the relational database management system; you cannot provide values for these columns when inserting new rows into the table. Additionally, there are columns such as VacationHours and SickLeaveHours that you might choose to update for all employees on a monthly basis, which justifies creation of a dedicated application. By creating distinct composite types for the read, insert and update operations against this table, you can more easily manage this table’s content. When developing this application within expressor Studio, your first task is to create a schema artifact for the database table. This process is completely driven by a wizard, only requiring that you select the desired database schema and table. The resulting schema artifact defines the mapping of result set records to a record within the expressor data integration application. The structure of the record within the expressor application is a composite type that is given the default name CompositeType1. As you can see in the following figure, all columns from the table are included in the result set and mapped to an identically named attribute in the default composite type. If you are developing an application that needs to read this table, perhaps to prepare a year-end report of employees by department, you would probably not be interested in the data in the rowguid and ModifiedDate columns. A typical approach would be to drop this unwanted data in a downstream operator. But using an alternative composite type provides a better approach in which the unwanted data never enters your application. While working in expressor Studio’s schema editor, simply create a second composite type within the same schema artifact, which you could name ReadTable, and remove the attributes corresponding to the unwanted columns. The value of an alternative composite type is even more apparent when you want to insert into or update the table. In the composite type used to insert rows, remove the attributes corresponding to the EmployeeID primary key and rowguid uniqueidentifier columns since these values are provided by the relational database management system. And to update just the VacationHours and SickLeaveHours columns, use a composite type that includes only the attributes corresponding to the EmployeeID, VacationHours, SickLeaveHours and ModifiedDate columns. By specifying this schema artifact and composite type in a Write Table operator, your upstream application need only deal with the four required attributes and there is no risk of unintentionally overwriting a value in a column that does not need to be updated. Now, what about the option to use the composite type to enforce business rules? If you review the composition of the default composite type CompositeType1, you will note that the constraints defined for many of the attributes mirror the table column specifications. For example, the maximum number of characters in the NationaIDNumber, LoginID and Title attributes is equivalent to the maximum width of the target column, and the size of the MaritalStatus and Gender attributes is limited to a single character as required by the table column definition. If your application code leads to a violation of these constraints, an error will be raised. The expressor design paradigm then allows you to handle the error in a way suitable for your application. For example, a string value could be truncated or a numeric value could be rounded. Moreover, you have the option of specifying additional constraints that support business rules unrelated to the table definition. Let’s assume that the only acceptable values for marital status are S, M, and D. Within the schema editor, double-click on the MaritalStatus attribute to open the Edit Attribute window. Then click the Allowed Values checkbox and enter the acceptable values into the Constraint Value text box. The schema editor is updated accordingly. There is one more option that the expressor semantic type paradigm supports. Since the MaritalStatus attribute now clearly specifies how this type of information should be represented (a single character limited to S, M or D), you can convert this attribute definition into a shared type, which will allow you to quickly incorporate this definition into another composite type or into the description of an output record from a transform operator. Again, double-click on the MaritalStatus attribute and in the Edit Attribute window, click Convert, which opens the Share Local Semantic Type window that you use to name this shared type. There’s no requirement that you give the shared type the same name as the attribute from which it was derived. You should supply a name that makes it obvious what the shared type represents. In this posting, I’ve overviewed the expressor semantic type paradigm and shown how it can be used to make your application development process more productive. The beauty of this feature is that you choose when and to what extent you utilize the functionality, but I’m certain that if you opt to follow this approach your efforts will become more efficient and your work will progress more quickly. As always, I encourage you to download and evaluate expressor Studio for your current and future data integration needs. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: CodeProject, Pinal Dave, PostADay, SQL, SQL Authority, SQL Documentation, SQL Query, SQL Server, SQL Tips and Tricks, SQLServer, T SQL, Technology

Read the article

Search Results

Search found 6404 results on 257 pages for 'mad han ror tips'.

Page 62/257 | < Previous Page | 58 59 60 61 62 63 64 65 66 67 68 69 | Next Page >

- by pinaldave

- by pinaldave

- by Pinal Dave

- by pinaldave

- by Pinal Dave

- by Pinal Dave

- by pinaldave

- by pinaldave

- by pinaldave

- by Jon Galloway

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by pinaldave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by pinaldave

- by pinaldave

- by Pinal Dave

- by Pinal Dave

- by Pinal Dave

- by pinaldave

< Previous Page | 58 59 60 61 62 63 64 65 66 67 68 69 | Next Page >