Search Results

Search found 4849 results on 194 pages for 'sqlauthority news'.

Page 180/194 | < Previous Page | 176 177 178 179 180 181 182 183 184 185 186 187  | Next Page >

  • looking for advise on importing excel into mysql with php

    - by Ole Media
    Alright, see if I can pick your brains from you all. I'm currently working on a project where all the information comes from different clients, the only thing in common is that the received data is done with excel. The excel spread sheet that they present is just a bunch of references and codes, and the problem than I'm facing is that I need the references and codes to be entered in certain format in order for the website to work. The perfect situation will be to go to each client and teach how I would need the data, but I can't do that because of the large number of clients, and more importantly I will be interrupting their work flow. Each client has its own codes and reference model and they are not willing to change their process The good news is that there is a standard pattern for the codes, but I'm talking close to 200 thousand codes with a bunch of combination. They way that we are currently solving the problem is that we have a person who checks each excel sheet received, runs a few macros, and manually fixes those codes in which the macro was not able to fix. The person that is doing this, is already burn out and frustrated and I would like to automatize this process with php. Suggestions?

    Read the article

  • Extraction from string - Ruby

    - by Dantes
    I have a string. That string is a html code and it serves as a teaser for the blog posts I am creating. The whole html code (teaser) is stored in a field in the database. My goal: I'd like to make that when a user (facebook like social button) likes certain blog post, right data is displayed on his news feeds. In order to do that I need to extract from the teaser in the first occurrence of an image an image path inside src="i-m-a-g-e--p-a-t-h". I succeeded when a user puts only one image in teaser, but if he accidentally puts two images or more the whole thing craches. Furthermore, for description field I need to extract text inside the first occurrence inside <p> tag. The problem is also that a user can put an image inside the first tag. I would very much appreciate if an expert could help me resolve this what's been bugging me for days. Text string with a regular expression for extracting src can be found here: http://rubular.com/r/gajzivoBSf Thanks!

    Read the article

  • visual analysis of web pages in ruby

    - by Clint Miller
    I'm looking to write some code that does visual analysis of web pages, preferably using Ruby. My code will need to be able to determine the top, left, width, height, background color, color, and font size for all the elements in the DOM. Of course, these values can only be calculated once all CSS is applied. So, I don't think that Nokogiri is up for the job. Ultimately, I'm trying to use this data in a VIPS-like (Vision-Based Page Segmentation) algorithm in an attempt to find the main content in downloaded news articles. I've considered using Watir to drive Chrome or Firefox and then extract the data. The problem is that browsers can't be run headless through Watir (I think). Ultimately, this code will be running on an array of Linux servers in a data center. So, the code won't have easy access to an X Server for displaying the browser. I suppose one solution is to use Watir and run a headless X Server on the Linux servers. That's a bit of a pain, but it looks like my best option right now. Does anyone have any better ideas?

    Read the article

  • How to automatically show Title of the Entries/Articles in the Browser Title Bar in ExpressionEngine 2?

    - by Ibn Saeed
    How would I output the title of an entry in ExpressionEngine and display it in the browser's title bar? Here is the content of my page's header: <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <title>Test Site</title> <link rel="stylesheet" href="{stylesheet=site/site_css}" type="text/css" media="screen" /> </head> What I need is for each page to display the title of the entry in my browser's title bar — how can I achieve that? Part of UPDATED Code: Here is how i have done it : {exp:channel:entries channel="news_articles" status="open|Featured Top Story|Top Story" limit="1" disable="member_data|trackbacks|pagination"} {embed="includes/document_header" page_title=" | {title}"} <body class="home"> <div id="layoutWrapper"> {embed="includes/masthead_navigation"} <div id="content"> <div id="article"> <img src="{article_image}" alt="News Article Image" /> <h4>{title}</h4> <h5><span class="by">By</span> {article_author}</h5> <p>{entry_date format="%M %d, %Y"} -- Updated {gmt_edit_date format="%M %d, %Y"}</p> {article_body} {/exp:channel:entries} </div> What do you think?

    Read the article

  • RegEx expression or jQuery selector to NOT match "external" links in href

    - by TrueBlueAussie
    I have a jQuery plugin that overrides link behavior, to allow Ajax loading of page content. Simple enough with a delegated event like $(document).on('click','a', function(){});. but I only want it to apply to links that are not like these ones (Ajax loading is not applicable to them, so links like these need to behave normally): target="_blank" // New browser window href="#..." // Bookmark link (page is already loaded). href="afs://..." // AFS file access. href="cid://..." // Content identifiers for MIME body part. href="file://..." // Specifies the address of a file from the locally accessible drive. href="ftp://..." // Uses Internet File Transfer Protocol (FTP) to retrieve a file. href="http://..." // The most commonly used access method. href="https://..." // Provide some level of security of transmission href="mailto://..." // Opens an email program. href="mid://..." // The message identifier for email. href="news://..." // Usenet newsgroup. href="x-exec://..." // Executable program. href="http://AnythingNotHere.com" // External links Sample code: $(document).on('click', 'a:not([target="_blank"])', function(){ var $this = $(this); if ('some additional check of href'){ // Do ajax load and stop default behaviour return false; } // allow link to work normally }); Q: Is there a way to easily detect all "local links" that would only navigate within the current website? excluding all the variations mentioned above. Note: This is for an MVC 5 Razor website, so absolute site URLs are unlikely to occur.

    Read the article

  • Complex sorting on MySQL database

    - by ChrisR
    I'm facing the following situation. We've got an CMS with an entity with translations. These translations are stored in a different table with a one-to-many relationship. For example newsarticles and newsarticle_translations. The amount of available languages is dynamically determined by the same CMS. When entering a new newsarticle the editor is required to enter at least one translation, which one of the available languages he chooses is up to him. In the newsarticle overview in our CMS we would like to show a column with the (translated) article title, but since none of the languages are mandatory (one of them is mandatory but i don't know which one) i don't really know how to construct my mysql query to select a title for each newsarticle, regardless of the entered language. And to make it all a little harder, our manager asked for the possibilty to also be able to sort on title, so fetching the translations in a separate query is ruled out as far as i know. Anyone has an idea on how to solve this in the most efficient way? Here are my table schema's it it might help > desc news; +-----------------+----------------+------+-----+-------------------+----------------+ | Field | Type | Null | Key | Default | Extra | +-----------------+----------------+------+-----+-------------------+----------------+ | id | int(10) | NO | PRI | NULL | auto_increment | | category_id | int(1) | YES | | NULL | | | created | timestamp | NO | | CURRENT_TIMESTAMP | | | user_id | int(10) | YES | | NULL | | +-----------------+----------------+------+-----+-------------------+----------------+ > desc news_translations; +-----------------+------------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +-----------------+------------------+------+-----+---------+----------------+ | id | int(10) unsigned | NO | PRI | NULL | auto_increment | | enabled | tinyint(1) | NO | | 0 | | | news_id | int(1) unsigned | NO | | NULL | | | title | varchar(255) | NO | | | | | summary | text | YES | | NULL | | | body | text | NO | | NULL | | | language | varchar(2) | NO | | NULL | | +-----------------+------------------+------+-----+---------+----------------+ PS: i've though about subqueries and coalesce() solutions but those seem rather dirty tricks, wondering if something better is know that i'm not thinking of?

    Read the article

  • Order words by number of letters, then place words neatly

    - by bmaster
    I have a list of words in javascript similar to this: var words = ["mine", "minute", "mist", "mixed", "money", "monkey", "month", "moon", "morning", "mother", "motion", "mountain", "mouth", "move", "much", "muscle", "music", "nail", "name", "narrow", "nation", "natural", "near", "necessary", "neck", "need", "needle", "nerve", "net", "new", "news", "night"]; The words can be 1-25? letters long. I have a div id="words", with a set width of 700px (but I might change it from this). Using css/javascript/jquery, how can I make it: Order the words by number of letters Place the words inside the div tag, left to right, but so that there are no gaps at the right edge of the words div, and there is even spacing between words on a line. Each word should have a border around it and a background. Like this: |reallylongwordssdf shorterwordfdf dfsdfsdfsdf sdfsdfsdf| |sdfsdfsdf sdffsdop sdfjpogs sdfsds dfsdsd dfsdsd dfsdsd| I really have no idea where to begin with this. Perhaps I could manage to write code to order the words by number of letters, but after that, I'd be stuck. Edit: I forgot to add, the words must be links.

    Read the article

  • Is this the best way to make an API request using PHP CURL?

    - by Abs
    Hello all, I have a site that has a simple API which can be used via http. I wish to make use of the API and submit data about 1000-1500 times at one time. Here is their API: http://api.jum.name/ I have constructed the URL to make a submission but now I am wondering what is the best way to make these 1000-1500 API GET requests? Here is the PHP CURL implementation I was thinking of: $add = 'http://www.mysite.com/3rdparty/API/api.php?fn=post&username=test&password=tester&url=http://google.com&category=21&title=story a&content=content text&tags=Season,news'; curl_setopt ($ch, CURLOPT_URL, "$add"); curl_setopt ($ch, CURLOPT_POST, 0); curl_setopt ($ch, CURLOPT_COOKIEFILE, 'files/cookie.txt'); curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, 0); curl_setopt ($ch, CURLOPT_RETURNTRANSFER, TRUE); $postdata = curl_exec ($ch); Shall I close the CURL connection every time I make a submission? Can I re-write the above in a better way that will make these 1000-1500 submissions quicker? Thanks all

    Read the article

  • drop down menu will not display outside containing div in IE7..

    - by playahabana
    I am tearing my hair out over this, I have a dropdown menu using CSS and jQuery (thanks to Soh Tanaka) and it works perfectly in Firefox, Safari, Google chrome and I.E. 8, but in IE 7 it will not drop down outside the 'Banner div'. It drops below the nav div however. I have moved the nav div higher in the banner the result is the same, menu drops until it reaches the border of the banner div and then vanishes.... Below is the css. This is my first website and I have some limited understanding of what I am doing. The drop down menu includes transparent png's as links (I know, I know...but it's what the Boss wants...) please could someone take a quick scan at the below CSS and let me know what is wroong? Is this some form of the IE z-index bug? i have tried all different combinations of z-index and still I can't get a different result. . The html is below as well. Thankyou in advance #banner { position: relative; width: 62.5em; height: 12em; background-color: #46280A; background-image: url('images/includes/banner2.jpg'); background-repeat: no-repeat; background-position: center; -moz-box-shadow: -4px 6px 8px #000; -webkit-box-shadow: -4px 6px 8px #000; box-shadow: -4px 6px 8px #000; /* For IE 8 */ -ms-filter: "progid:DXImageTransform.Microsoft.Shadow(Strength=8, Direction=225, Color='#000000')"; /* For IE 5.5 - 7 */ filter: progid:DXImageTransform.Microsoft.Shadow(Strength=8, Direction=225, Color='#000000'); z-index: 1; } /*------------------------------------SCROLLER---------------------------------------------*/ #headlines{ position: absolute; top: 1.3em; right: 2.75em; overflow: hidden; height: 2.5em; width: 24em; background-color: #000000; display: block; z-index: 3; } #news{ position: relative; height: 3.1em; line-height: 2.5em; font-size: 0.8em color: #FFFF99; white-space: nowrap; overflow: hidden; font-family: Georgia,Arial; } #scrollerglass{ position: absolute; top: 0.95em; right: 2em; height: 52px; width: 410px; border: none; padding: 0.2em 0em 0em 0em; line-height: 0.7em; text-align: center; background-image: url('images/includes/scrollerglass.png'); background-color: transparent; background-repeat: no-repeat; background-position: center; opacity: 20; z-index: 10; } #scrollerglass a i { visibility: hiddn ; } /-------------------------------------NAVIGATION-----------------------------------------/ #nav { position: absolute; top: 5.8em; left: 0.2em; font-family: trebuchet, sans-serif; font-size: 1em; line-height: 3.75em; text-align: center; color: #FFFF00; z-index: 3; } ul.navlist { list-style: none; padding: 0em; margin: 1em; float: left; width: 62.5em; background: transparent; font-size: 1em; } ul.navlist li { position: relative; /*--Declare X and Y axis base for sub navigation--*/ float: left; margin: 0em 1.4em; padding: 0em 0.7em 0em 0em; z-index: 1; } ul.navlist li a{ display: block; text-decoration: none; float: left; border: 0px solid; } ul.navlist li img{ border: 0px solid; } ul.navlist li span { trigger styles--*/ width: 1.2em; height: 5.25em; float: left; background: url(images/links/downlogo.png) no-repeat center top; } ul.navlist li span.subhover { background-position: center bottom; cursor: pointer; } ul.navlist li ul.navdrop { list-style: none; position: absolute; float: left; top: 5.3em; left: -2.4em; height: 15.0em; width: 11.25em; margin: 0; padding: 0.5em 0em 0em 0em; display: none; background-position: center; background-image: url('images/includes/slider.jpg'); background-color: transparent; background-repeat: no-repeat; -moz-box-shadow: -4px 6px 8px #000; -webkit-box-shadow: -4px 6px 8px #000; box-shadow: -4px 6px 8px #000; /* For IE 8 */ -ms-filter: "progid:DXImageTransform.Microsoft.Shadow(Strength=8, Direction=225, Color='#000000')"; /* For IE 5.5 - 7 */ filter: progid:DXImageTransform.Microsoft.Shadow(Strength=8, Direction=225, Color='#000000'); z-index:1; } ul.navlist li ul.navdrop li{ margin: 0em 2.3em 0em 0em; padding: 1em 0em 0em 0em; width: 8em; clear: both; } html ul.navlist li ul.navdrop li a { border: 0px solid; width: 11.25em; } html ul.navlist li ul.navdrop li a:hover { background: transparent; } <div id="banner"> <div id="headlines"> <div id="news"> Whatever we want to promote </div> </div> <div id="scrollerglass"> <a href="vintagecigars.php"> <i>------s-c-r-o-l-l-e-r - - - -l-i-n-k-s--------<br /> <br>------s-c-r-o-l-l-e-r - - - -l-i-n-k-s------</i></a> </div> <div id="nav"> <ul class="navmenu"> <li><a href="index.php"><img src="images/links/home.png" alt="Home" ></a></li> <li><a href="ourbar.php"><img src="images/links/ourbar.png" alt="Our Bar" ></a> <ul class="navdrop"> <li ><a href="ourcocktails.php"><img src="images/links/cockteles.png" alt="Our Cocktails" ></a></li> <li ><a href="celebrate.php"><img src="images/links/celebrate.png" alt="Celebrate in Style" ></a></li> </ul> </li> <li><a href="ourcigars.php"><img src="images/links/ourcigars.png" alt="Our Cigars" ></a> <ul class="navdrop"> <li ><a href="edicioneslimitadas.php"><img src="images/links/edicioneslimitadas.png" alt="Edition Limitadas" ></a></li> <li ><a href="cigartasting.php"><img src="images/links/cigartasting.png" alt="Cigar Tastings" ></a></li> </ul> </li> <li><a href="personalroller.php"><img src="images/links/personalcigar.png" alt="Personal Cigar Roller" ></a></li> <li><a href="galleryentrance.php"><img src="images/links/photogallery.png" alt="Photo Gallery" ></a></li> <li><a href="contactus.php"><img src="images/links/contactus.png" alt="Contact Us" ></a></li> </ul></div></div><!--end banner-->

    Read the article

  • Which web solution should I use for my project?

    - by BenIOs
    I'm going to create a fairly large (from my point of view anyway) web project with a friend. We will create a site with roads and other road related info. Our calculations is that we will have around 100k items in our database. Each item will contain some information like location, name etc. (about 30 thing each). We are counting on having a few hundred thousand unique visitors per month. The 100k items and their locations (that will be searchable) will be the main part of the page but we will also have some articles, comments, news and later on some more social functions (accounts, forums, picture uploads etc.). We were going to use Google AppEngine to develop our project since it is really scalable and free (at least for a while). But I'm actually starting to doubt that AppEngine is right for us. It seems to be for webbapps and not sites like ours. Which system (language/framework etc.) would you guys recommend us to use? It doesn't really mater if we know the language since before (we like learning new stuff) but it would be good if it's something that is future proof.

    Read the article

  • database design - empty fields

    - by imanc
    Hey, I am currently debating an issue with a guy on my dev team. He believes that empty fields are bad news. For instance, if we have a customer details table that stores data for customers from different countries, and each country has a slightly different address configuration - plus 1-2 extra fields, e.g. French customer details may also store details for entry code, and floor/level plus title fields (madamme, etc.). South Africa would have a security number. And so on. Given that we're talking about minor variances my idea is to put all of the fields into the table and use what is needed on each form. My colleague believes we should have a separate table with extra data. E.g. customer_info_fr. But this seams to totally defeat the purpose of a combined table in the first place. His argument is that empty fields / columns is bad - but I'm struggling to find justification in terms of database design principles for or against this argument and preferred solutions. Another option is a separate mini EAV table that stores extra data with parent_id, key, val fields. Or to serialise extra data into an extra_data column in the main customer_data table. I think I am confused because what I'm discussing is not covered by 3NF which is what I would typically use as a reference for how to structure data. So my question specifically: - if you have slight variances in data for each record (1-2 different fields for instance) what is the best way to proceed?

    Read the article

  • ajax on parked domain

    - by Daryl
    I'm currently writing this jquery and for some reason (I don't know why) it works on the normal domain, but on the parked domain it doesn't. Normal domain - http://www.thefinishedbox.com Parked domain - http://www.tfbox.com If you scroll down to the colony news and hit the click me link you'll see it will retrieve data via jquery ajax on the Normal domain, but on the parked domain it wont. Here is the jQuery code I have so far (its pretty standard): $(function() { $.ajaxSetup({ cache: false }); var ajax_load = "Load me plz"; // load() functions var loadUrl = "http://thefinishedbox.com/wp-content/themes/tfbox-beta/test.php"; $('.overlay').css({ opacity: '0' }); $('.toggle').click(function() { $('.overlay').css({ display: 'block' }).animate({ opacity: '1' }, 300); $(".overlay .content").html(ajax_load).load(loadUrl); return false; }); $('.close').click(function() { $('.overlay').animate({ opacity: '0' }, 300); $('.overlay').queue(function() { $(this).css({ display: 'none' }); $(this).dequeue(); }); return false; }); I'm a complete noob when it comes to ajax so any help would be massivly appreciated.

    Read the article

  • Handling Email Bouncebacks in Rails

    - by aressidi
    Hi there, I've built a very basic CRM system for my rails app that allows me to send weekly user activity digests with custom text and create multi-part marketing messages which I can configure and send through a basic admin interface. I'm happy with what I've put together on the send-side of things (minus the fact that I haven't tried to volume test its capabilities), but I'm concerned about how to handle bounce-backs. I came across this plugin with associated scripts: http://github.com/kovyrin/bounces-handler I'm using Google Apps to handle my mail and really don't know enough about Perl to want to mess with the above plugin - I have enough headaches. I'm looking for a simple solution for processing bounce-backs in Rails. All my email will go out from an address like this which will be managed in Google Apps: "[email protected]." What's the best workflow for this? Can anyone post an example solution they're using keeping in mind the fact that I'm using Google Apps for the mail? Any guidance, links, or basic workflow best-practices to handle this would be greatly appreciated. Thanks! -A

    Read the article

  • PHP OOP class Sensative To Counter field name !

    - by Mac Taylor
    hey guys recently i managed to write a class for my stories , and everything is fine , unless counter field that stores page's hits here is my class : class nk_posts { var $data = array(); public function _data() { global $db; $result = $db->sql_query(" SELECT bt_stories.*, bt_tags.*, bt_topics.*, group_concat(DISTINCT bt_tags.tag ) as mytags, group_concat(DISTINCT bt_topics.topicname ) as mytopics FROM bt_stories,bt_tags,bt_topics WHERE CONCAT( ' ', bt_stories.tags, ' ' ) LIKE CONCAT( '%', bt_tags.tid, '%' ) AND CONCAT( ' ', bt_stories.associated, ' ' ) LIKE CONCAT( '%', bt_topics.topicid, '%' ) AND time<=now() AND section='news' AND approved='1' GROUP BY bt_stories.sid ORDER BY bt_stories.sid DESC "); while ($this->data = $db->sql_fetchrow($result)) { $this->sid = $this->data['sid']; $this->title = $this->data['title']; $this->counter = $this->data['counter']; //------Rest of Fields ------ $this->_output(); } } public function _output() { themeindex( $this->sid, $this->title, $this->counter, //------Rest of Fields ------ ); } } problem this class can't show counter filed value , but if i change counter field name to other thing like hit , .. everything goes fine im sure its okay if i write it in normal php mysql way , but i need this to be in OOP way any comment why it's sensitive to counter field name ?!

    Read the article

  • Vacancy Tracking Algorithm implementation in C++

    - by Dave
    I'm trying to use the vacancy tracking algorithm to perform transposition of multidimensional arrays in C++. The arrays come as void pointers so I'm using address manipulation to perform the copies. Basically, there is an algorithm that starts with an offset and works its way through the whole 1-d representation of the array like swiss cheese, knocking out other offsets until it gets back to the original one. Then, you have to start at the next, untouched offset and do it again. You repeat until all offsets have been touched. Right now, I'm using a std::set to just fill up all possible offsets (0 up to the multiplicative fold of the dimensions of the array). Then, as I go through the algorithm, I erase from the set. I figure this would be fastest because I need to randomly access offsets in the tree/set and delete them. Then I need to quickly find the next untouched/undeleted offset. First of all, filling up the set is very slow and it seems like there must be a better way. It's individually calling new[] for every insert. So if I have 5 million offsets, there's 5 million news, plus re-balancing the tree constantly which as you know is not fast for a pre-sorted list. Second, deleting is slow as well. Third, assuming 4-byte data types like int and float, I'm using up actually the same amount of memory as the array itself to store this list of untouched offsets. Fourth, determining if there are any untouched offsets and getting one of them is fast -- a good thing. Does anyone have suggestions for any of these issues?

    Read the article

  • UnknownHostException again!

    - by Nitesh Panchal
    Hello, I posted one question previously and all of them answered that there is some problem with DNS but i changed my DNS to many addressed and now i have the most reliable, google DNS :- 8.8.8.8 Still i get the same UnknownHostException. What can be the problem? This is my code :- DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance(); DocumentBuilder db = dbf.newDocumentBuilder(); Document doc = db.parse("http://rss.news.yahoo.com/rss/india"); Infact if i pass address as something very common like :- http://google.com i still get the same error. Please help me :(. I have my submissions tomorrow. Thanks in advance :) EDIT : If i type the same address in my mozilla, it works great. So, i am sure that there is no DNS problem. 2nd EDIT :- I found this link http://www.ehow.com/how_4747553_fix-unknownhostexception-java-applications-ubuntu.html But when i run the command sudo apt-get install lib32nss-mdns i get package not found. Somebody even mentioned :- -Djava.net.preferIPv4Stack=true But where do i write this statement of Djava? I am using Netbeans 6.8 to run my web application

    Read the article

  • how to retrieve data from db and display it in list?

    - by raji
    In the below code I am passing catID to db.getNews(catID) super.onCreate(savedInstanceState); setContentView(R.layout.newslist); Bundle extras = getIntent().getExtras(); int catID = extras.getInt("cat_id"); mInflater = (LayoutInflater)getSystemService(Context.LAYOUT_INFLATER_SERVICE); final ListView lv = (ListView) findViewById(R.id.category); lv.setAdapter(new ArrayAdapter<String>(this, R.layout.newslist, db.getNews(catID)){ public View getView(int position, View convertView, ViewGroup parent) { View row; if (null == convertView) { row = mInflater.inflate(R.layout.newslist, null); } else { row = convertView; } TextView tv = (TextView) row.findViewById(R.id.newslist_text); tv.setText(getItem(position)); return row; } }); the getnews(int catid) function is below: public NewsItemCursor getNews(int catID) { String sql = "SELECT title FROM news WHERE catid = " + catID + " ORDER BY id ASC"; SQLiteDatabase d = getReadableDatabase(); NewsItemCursor c = (NewsItemCursor) d.rawQueryWithFactory( new NewsItemCursor.Factory(), sql, null, null); c.moveToFirst(); d.close(); return c; } But I'm getting bug as array adapter undefined... Can anyone help me to resolve this, and retrieve data to make it display in list.

    Read the article

  • SQL SERVER – Developer Training Resources and Summary Roundup

    - by pinaldave
    It is always pleasure for any author when other renowned authors in the industry write about you. Earlier I wrote a five part blog series on Developer Training and I have received a phenomenal response to the series. I have received plenty of comments, questions and feedback. I thought it would be nice to sum up the whole series as well answer a few of the questions received. Quick Recap Developer Training - Importance and Significance - Part 1 In this part we discussed the importance of training in the real world. The most important and valuable resource any company is its employee. Employees who have been well-trained will be better at their jobs and produce a better product.  An employee who is well trained obviously knows more about their job and all the technical aspects. I have a very high opinion about training employees and it is the most important task. Developer Training – Employee Morals and Ethics – Part 2 In this part we discussed the most crucial components of training. Often employees are expecting the company to pay for their training and the company expresses no interest in training the employee. Quite often training expenses are the real issue for both the employee and employer. There are companies that pay for 100% of the expenses and there are employees who opt for training on their own expense during their personal time. Training is often looked at as vacation by employee and employers and we need to change this mind-set. One of the ways is to report back the learning to your manager and implement newly learned knowledge in day-to-day work. Developer Training – Difficult Questions and Alternative Perspective - Part 3 This part was the most difficult to write as I tried to address a few difficult questions and answers. Training is such a sensitive issue that many developers when not receiving chance for training think about leaving the organization. The manager often feels pressure to accommodate every single employee for training even though his training budget is limited. It is indeed the responsibility of the developer to get maximum advantage from the training. Training immediately helps organizations but stays as a part of an employee’s knowledge forever. Developer Training – Various Options for Developer Training – Part 4 In this part I tried to explore a few methods and options for training. The generic feedback I received on this blog post was short and I should have explored each of the subject of the training in details. I believe there are two big buckets of training 1) Instructor Lead Training and 2) Self Lead Training. The common element between both the methods is “learning material”. Learning material can be of any format – videos, books, paper notes or just a plain black board. Instructor-led training is a very effective mode but not possible every single time. During the course of the developer’s career, one has to learn lots of new technology and it is almost impossible to have a quality trainer available on that subject at that time. Books are most effective and proven methods, however, it always helps if someone explains the concepts of the book with a demonstration. In recent times I have started to believe in online trainings which leads to a hybrid experience. Online trainings take the best part of the books and the best part of the instructor-led training and gives effective training in a matter of hours. Developer Training – A Conclusive Summary- Part 5 In this part, I shared what I was continuously thinking about developer training. There is no better teacher than oneself. There is no better motivation than a personal desire to learn new technology. Honestly there is nothing more personal learning. That “change is the only constant” and “adapt & overcome” are the essential lessons of life. One cannot stop the learning and resist the change. In the IT industry “ego of knowing all” and the “resistance to change” are the most challenging issues. Once someone overcomes them, life is much easier. I believe that proper and appropriate high quality training can help to address the burning issues. Opinion of Friends I invited a few of my friends to express their opinion about developer training and here are their opinions. I am listing them here in the order of the blog post publishing date. Nakul Vachhrajani - Developer Trainings-Importance, Benefits, Tips and follow-up Nakul’s sums of many of the concepts which are complementary to my blog posts. Nakul addresses the burning question of developer training with different angles. I am personally very impressed by his following statement - “Being skilled does not mean having just a stack of certifications, but it also means having an understanding about the internals of the products that you are working on – and using that knowledge to improve the efficiency & productivity at the workplace in turn resulting in better products, better consulting abilities and a happier self.” Nakul also suggests the online training options of Pluralsight. Vinod Kumar - Training–a necessity or bonus Vinod Kumar comes up with excellent follow up on developer training. Vinod is known for his inspirational writing about SQL Server. Vinod starts with a story of a student who is extremely eager to learn the wisdom of life from a monk but the monk does not accept him as a disciple for a long time. The conversation between student and monk is indeed an essence of all learning. We all want to learn quickly and be successful but the most important thing in life is to have the right attitude towards learning and more so towards life. The blog post end with a very important thought about how to avoid the famous excuse – “I don’t have enough time.” Ritesh Shah - Training – useful or useless? Ritesh brings up very important concept related to training. Ritesh in his meticulous style explains why training is an important and lifelong process. Training must not stop at any age but should continue forever. The moment training stops, progress stops along with. Paras Doshi - Professional Development Resource Paras is known for his to–the-point writing, and has summarized the five part series very precisely. He read the five part series and created a digest summary of the blog post. If you are in a rush and have no time to read my five series – I suggest you read his blog post. Training Resources I am often asked what the best resources for learning new technology are. This is the most difficult question EVER. There are plenty of good training resources available. When it is about training our needs are different, our preference of learning is different and we all have an opinion. Additionally, we all are located in different geographic locations worldwide and there is no way one solution will fit all. However, let me list a few of the training resources which I have built so far and you can consume them if you find it relevant to your need. SQL Server Books SQL Server Interview Questions and Answers SQL Wait Stats SQL Programming Joes 2 Pros SQL Server Video Tutorials SQL Server Questions and Answers SQL Server Performance: Indexing Basics SQL Server Performance: Introduction to Query Tuning SQL in Sixty Seconds Series of Sixty Seconds Learning Video on YouTube Trust me worldwide web is very big and there are plenty of high quality learning materials available worldwide – trainer-led as well online. I suggest you explore various options and make the best choice for yourself. Remember, training is your personal journey and it should never stop. Are you ready? Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Developer Training, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • SQL SERVER – How to Recover SQL Database Data Deleted by Accident

    - by Pinal Dave
    In Repair a SQL Server database using a transaction log explorer, I showed how to use ApexSQL Log, a SQL Server transaction log viewer, to recover a SQL Server database after a disaster. In this blog, I’ll show you how to use another SQL Server disaster recovery tool from ApexSQL in a situation when data is accidentally deleted. You can download ApexSQL Recover here, install, and play along. With a good SQL Server disaster recovery strategy, data recovery is not a problem. You have a reliable full database backup with valid data, a full database backup and subsequent differential database backups, or a full database backup and a chain of transaction log backups. But not all situations are ideal. Here we’ll address some sub-optimal scenarios, where you can still successfully recover data. If you have only a full database backup This is the least optimal SQL Server disaster recovery strategy, as it doesn’t ensure minimal data loss. For example, data was deleted on Wednesday. Your last full database backup was created on Sunday, three days before the records were deleted. By using the full database backup created on Sunday, you will be able to recover SQL database records that existed in the table on Sunday. If there were any records inserted into the table on Monday or Tuesday, they will be lost forever. The same goes for records modified in this period. This method will not bring back modified records, only the old records that existed on Sunday. If you restore this full database backup, all your changes (intentional and accidental) will be lost and the database will be reverted to the state it had on Sunday. What you have to do is compare the records that were in the table on Sunday to the records on Wednesday, create a synchronization script, and execute it against the Wednesday database. If you have a full database backup followed by differential database backups Let’s say the situation is the same as in the example above, only you create a differential database backup every night. Use the full database backup created on Sunday, and the last differential database backup (created on Tuesday). In this scenario, you will lose only the data inserted and updated after the differential backup created on Tuesday. If you have a full database backup and a chain of transaction log backups This is the SQL Server disaster recovery strategy that provides minimal data loss. With a full chain of transaction logs, you can recover the SQL database to an exact point in time. To provide optimal results, you have to know exactly when the records were deleted, because restoring to a later point will not bring back the records. This method requires restoring the full database backup first. If you have any differential log backup created after the last full database backup, restore the most recent one. Then, restore transaction log backups, one by one, it the order they were created starting with the first created after the restored differential database backup. Now, the table will be in the state before the records were deleted. You have to identify the deleted records, script them and run the script against the original database. Although this method is reliable, it is time-consuming and requires a lot of space on disk. How to easily recover deleted records? The following solution enables you to recover SQL database records even if you have no full or differential database backups and no transaction log backups. To understand how ApexSQL Recover works, I’ll explain what happens when table data is deleted. Table data is stored in data pages. When you delete table records, they are not immediately deleted from the data pages, but marked to be overwritten by new records. Such records are not shown as existing anymore, but ApexSQL Recover can read them and create undo script for them. How long will deleted records stay in the MDF file? It depends on many factors, as time passes it’s less likely that the records will not be overwritten. The more transactions occur after the deletion, the more chances the records will be overwritten and permanently lost. Therefore, it’s recommended to create a copy of the database MDF and LDF files immediately (if you cannot take your database offline until the issue is solved) and run ApexSQL Recover on them. Note that a full database backup will not help here, as the records marked for overwriting are not included in the backup. First, I’ll delete some records from the Person.EmailAddress table in the AdventureWorks database.   I can delete these records in SQL Server Management Studio, or execute a script such as DELETE FROM Person.EmailAddress WHERE BusinessEntityID BETWEEN 70 AND 80 Then, I’ll start ApexSQL Recover and select From DELETE operation in the Recovery tab.   In the Select the database to recover step, first select the SQL Server instance. If it’s not shown in the drop-down list, click the Server icon right to the Server drop-down list and browse for the SQL Server instance, or type the instance name manually. Specify the authentication type and select the database in the Database drop-down list.   In the next step, you’re prompted to add additional data sources. As this can be a tricky step, especially for new users, ApexSQL Recover offers help via the Help me decide option.   The Help me decide option guides you through a series of questions about the database transaction log and advises what files to add. If you know that you have no transaction log backups or detached transaction logs, or the online transaction log file has been truncated after the data was deleted, select No additional transaction logs are available. If you know that you have transaction log backups that contain the delete transactions you want to recover, click Add transaction logs. The online transaction log is listed and selected automatically.   Click Add if to add transaction log backups. It would be best if you have a full transaction log chain, as explained above. The next step for this option is to specify the time range.   Selecting a small time range for the time of deletion will create the recovery script just for the accidentally deleted records. A wide time range might script the records deleted on purpose, and you don’t want that. If needed, you can check the script generated and manually remove such records. After that, for all data sources options, the next step is to select the tables. Be careful here, if you deleted some data from other tables on purpose, and don’t want to recover them, don’t select all tables, as ApexSQL Recover will create the INSERT script for them too.   The next step offers two options: to create a recovery script that will insert the deleted records back into the Person.EmailAddress table, or to create a new database, create the Person.EmailAddress table in it, and insert the deleted records. I’ll select the first one.   The recovery process is completed and 11 records are found and scripted, as expected.   To see the script, click View script. ApexSQL Recover has its own script editor, where you can review, modify, and execute the recovery script. The insert into statements look like: INSERT INTO Person.EmailAddress( BusinessEntityID, EmailAddressID, EmailAddress, rowguid, ModifiedDate) VALUES( 70, 70, N'[email protected]' COLLATE SQL_Latin1_General_CP1_CI_AS, 'd62c5b4e-c91f-403f-b630-7b7e0fda70ce', '20030109 00:00:00.000' ); To execute the script, click Execute in the menu.   If you want to check whether the records are really back, execute SELECT * FROM Person.EmailAddress WHERE BusinessEntityID BETWEEN 70 AND 80 As shown, ApexSQL Recover recovers SQL database data after accidental deletes even without the database backup that contains the deleted data and relevant transaction log backups. ApexSQL Recover reads the deleted data from the database data file, so this method can be used even for databases in the Simple recovery model. Besides recovering SQL database records from a DELETE statement, ApexSQL Recover can help when the records are lost due to a DROP TABLE, or TRUNCATE statement, as well as repair a corrupted MDF file that cannot be attached to as SQL Server instance. You can find more information about how to recover SQL database lost data and repair a SQL Server database on ApexSQL Solution center. There are solutions for various situations when data needs to be recovered. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Backup and Restore, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

  • SQL SERVER – Interview Questions and Answers – Frequently Asked Questions – Introduction – Day 1 of 31

    - by pinaldave
    List of all the Interview Questions and Answers Series blogs Posts covering interview questions and answers always make for interesting reading.  Some people like the subject for their helpful hints and thought provoking subject, and others dislike these posts because they feel it is nothing more than cheating.  I’d like to discuss the pros and cons of a Question and Answer format here. Interview Questions and Answers are Helpful Just like blog posts, books, and articles, interview Question and Answer discussions are learning material.  The popular Dummy’s books or Idiots Guides are not only for “dummies,” but can help everyone relearn the fundamentals.  Question and Answer discussions can serve the same purpose.  You could call this SQL Server Fundamentals or SQL Server 101. I have administrated hundreds of interviews during my career and I have noticed that sometimes an interviewee with several years of experience lacks an understanding of the fundamentals.  These individuals have been in the industry for so long, usually working on a very specific project, that the ABCs of the business have slipped their mind. Or, when a college graduate is looking to get into the industry, he is not expected to have experience since he is just graduated. However, the new grad is expected to have an understanding of fundamentals and theory.  Sometimes after the stress of final exams and graduation, it can be difficult to remember the correct answers to interview questions, though. An interview Question and Answer discussion can be very helpful to both these individuals.  It is simply a way to go back over the building blocks of a topic.  Many times a simple review like this will help “jog” your memory, and all those previously-memorized facts will come flooding back to you.  It is not a way to re-learn a topic, but a way to remind yourself of what you already know. A Question and Answer discussion can also be a way to go over old topics in a more interesting manner.  Especially if you have been working in the industry, or taking lots of classes on the topic, everything you read can sound like a repeat of what you already know.  Going over a topic in a new format can make the material seem fresh and interesting.  And an interested mind will be more engaged and remember more in the end. Interview Questions and Answers are Harmful A common argument against a Question and Answer discussion is that it will give someone a “cheat sheet.” A new guy with relatively little experience can read the interview questions and answers, and then memorize them. When an interviewer asks him the same questions, he will repeat the answers and get the job. Honestly, is he good hire because he memorized the interview questions? Wouldn’t it be better for the interviewer to hire someone with actual experience?  The answer is not as easy as it seems – there are many different factors to be considered. If the interviewer is asking fundamentals-related questions only, he gets the answers he wants to hear, and then hires this first candidate – there is a good chance that he is hiring based on personality rather than experience.  If the interviewer is smart he will ask deeper questions, have more than one person on the interview team, and interview a variety of candidates.  If one interviewee happens to memorize some answers, it usually doesn’t mean he will automatically get the job at the expense of more qualified candidates. Another argument against interview Question and Answers is that it will give candidates a false sense of confidence, and that they will appear more qualified than they are. Well, if that is true, it will not last after the first interview when the candidate is asked difficult questions and he cannot find the answers in the list of interview Questions and Answers.  Besides, confidence is one of the best things to walk into an interview with! In today’s competitive job market, there are often hundreds of candidates applying for the same position.  With so many applicants to choose from, interviewers must make decisions about who to call back and who to hire based on their gut feeling.  One drawback to reading an interview Question and Answer article is that you might sound very boring in your interview – saying the same thing as every single candidate, and parroting answers that sound like someone else wrote them for you – because they did.  However, it is definitely better to go to an interview prepared, just make sure that you give a lot of thought to your answers to make them sound like your own voice.  Remember that you will be hired based on your skills as well as your personality, so don’t think that having all the right answers will make get you hired.  A good interviewee will be prepared, confident, and know how to stand out. My Opinion A list of interview Questions and Answers is really helpful as a refresher or for beginners. To really ace an interview, one needs to have real-world, hands-on experience with SQL Server as well. Interview questions just serve as a starter or easy read for experienced professionals. When I have to learn new technology, I often search online for interview questions and get an idea about the breadth and depth of the technology. Next Action I am going to write about interview Questions and Answers for next 30 days. I have previously written a series of interview questions and answers; now I have re-written them keeping the latest version of SQL Server and current industry progress in mind. If you have faced interesting interview questions or situations, please write to me and I will publish them as a guest post. If you want me to add few more details, leave a comment and I will make sure that I do my best to accommodate. Tomorrow we will start the interview Questions and Answers series, with a few interesting stories, best practices and guest posts. We will have a prize give-away and other awards when the series ends. List of all the Interview Questions and Answers Series blogs Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Interview Questions and Answers, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • Big Data – Is Big Data Relevant to me? – Big Data Questionnaires – Guest Post by Vinod Kumar

    - by Pinal Dave
    This guest post is by Vinod Kumar. Vinod Kumar has worked with SQL Server extensively since joining the industry over a decade ago. Working on various versions of SQL Server 7.0, Oracle 7.3 and other database technologies – he now works with the Microsoft Technology Center (MTC) as a Technology Architect. Let us read the blog post in Vinod’s own voice. I think the series from Pinal is a good one for anyone planning to start on Big Data journey from the basics. In my daily customer interactions this buzz of “Big Data” always comes up, I react generally saying – “Sir, do you really have a ‘Big Data’ problem or do you have a big Data problem?” Generally, there is a silence in the air when I ask this question. Data is everywhere in organizations – be it big data, small data, all data and for few it is bad data which is same as no data :). Wow, don’t discount me as someone who opposes “Big Data”, I am a big supporter as much as I am a critic of the abuse of this term by the people. In this post, I wanted to let my mind flow so that you can also think in the direction I want you to see these concepts. In any case, this is not an exhaustive dump of what is in my mind – but you will surely get the drift how I am going to question Big Data terms from customers!!! Is Big Data Relevant to me? Many of my customers talk to me like blank whiteboard with no idea – “why Big Data”. They want to jump into the bandwagon of technology and they want to decipher insights from their unexplored data a.k.a. unstructured data with structured data. So what are these industry scenario’s that come to mind? Here are some of them: Financials Fraud detection: Banks and Credit cards are monitoring your spending habits on real-time basis. Customer Segmentation: applies in every industry from Banking to Retail to Aviation to Utility and others where they deal with end customer who consume their products and services. Customer Sentiment Analysis: Responding to negative brand perception on social or amplify the positive perception. Sales and Marketing Campaign: Understand the impact and get closer to customer delight. Call Center Analysis: attempt to take unstructured voice recordings and analyze them for content and sentiment. Medical Reduce Re-admissions: How to build a proactive follow-up engagements with patients. Patient Monitoring: How to track Inpatient, Out-Patient, Emergency Visits, Intensive Care Units etc. Preventive Care: Disease identification and Risk stratification is a very crucial business function for medical. Claims fraud detection: There is no precise dollars that one can put here, but this is a big thing for the medical field. Retail Customer Sentiment Analysis, Customer Care Centers, Campaign Management. Supply Chain Analysis: Every sensors and RFID data can be tracked for warehouse space optimization. Location based marketing: Based on where a check-in happens retail stores can be optimize their marketing. Telecom Price optimization and Plans, Finding Customer churn, Customer loyalty programs Call Detail Record (CDR) Analysis, Network optimizations, User Location analysis Customer Behavior Analysis Insurance Fraud Detection & Analysis, Pricing based on customer Sentiment Analysis, Loyalty Management Agents Analysis, Customer Value Management This list can go on to other areas like Utility, Manufacturing, Travel, ITES etc. So as you can see, there are obviously interesting use cases for each of these industry verticals. These are just representative list. Where to start? A lot of times I try to quiz customers on a number of dimensions before starting a Big Data conversation. Are you getting the data you need the way you want it and in a timely manner? Can you get in and analyze the data you need? How quickly is IT to respond to your BI Requests? How easily can you get at the data that you need to run your business/department/project? How are you currently measuring your business? Can you get the data you need to react WITHIN THE QUARTER to impact behaviors to meet your numbers or is it always “rear-view mirror?” How are you measuring: The Brand Customer Sentiment Your Competition Your Pricing Your performance Supply Chain Efficiencies Predictive product / service positioning What are your key challenges of driving collaboration across your global business?  What the challenges in innovation? What challenges are you facing in getting more information out of your data? Note: Garbage-in is Garbage-out. Hold good for all reporting / analytics requirements Big Data POCs? A number of customers get into the realm of setting a small team to work on Big Data – well it is a great start from an understanding point of view, but I tend to ask a number of other questions to such customers. Some of these common questions are: To what degree is your advanced analytics (natural language processing, sentiment analysis, predictive analytics and classification) paired with your Big Data’s efforts? Do you have dedicated resources exploring the possibilities of advanced analytics in Big Data for your business line? Do you plan to employ machine learning technology while doing Advanced Analytics? How is Social Media being monitored in your organization? What is your ability to scale in terms of storage and processing power? Do you have a system in place to sort incoming data in near real time by potential value, data quality, and use frequency? Do you use event-driven architecture to manage incoming data? Do you have specialized data services that can accommodate different formats, security, and the management requirements of multiple data sources? Is your organization currently using or considering in-memory analytics? To what degree are you able to correlate data from your Big Data infrastructure with that from your enterprise data warehouse? Have you extended the role of Data Stewards to include ownership of big data components? Do you prioritize data quality based on the source system (that is Facebook/Twitter data has lower quality thresholds than radio frequency identification (RFID) for a tracking system)? Do your retention policies consider the different legal responsibilities for storing Big Data for a specific amount of time? Do Data Scientists work in close collaboration with Data Stewards to ensure data quality? How is access to attributes of Big Data being given out in the organization? Are roles related to Big Data (Advanced Analyst, Data Scientist) clearly defined? How involved is risk management in the Big Data governance process? Is there a set of documented policies regarding Big Data governance? Is there an enforcement mechanism or approach to ensure that policies are followed? Who is the key sponsor for your Big Data governance program? (The CIO is best) Do you have defined policies surrounding the use of social media data for potential employees and customers, as well as the use of customer Geo-location data? How accessible are complex analytic routines to your user base? What is the level of involvement with outside vendors and third parties in regard to the planning and execution of Big Data projects? What programming technologies are utilized by your data warehouse/BI staff when working with Big Data? These are some of the important questions I ask each customer who is actively evaluating Big Data trends for their organizations. These questions give you a sense of direction where to start, what to use, how to secure, how to analyze and more. Sign off Any Big data is analysis is incomplete without a compelling story. The best way to understand this is to watch Hans Rosling – Gapminder (2:17 to 6:06) videos about the third world myths. Don’t get overwhelmed with the Big Data buzz word, the destination to what your data speaks is important. In this blog post, we did not particularly look at any Big Data technologies. This is a set of questionnaire one needs to keep in mind as they embark their journey of Big Data. I did write some of the basics in my blog: Big Data – Big Hype yet Big Opportunity. Do let me know if these questions make sense?  Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

  • SQL SERVER – 5 Tips for Improving Your Data with expressor Studio

    - by pinaldave
    It’s no secret that bad data leads to bad decisions and poor results.  However, how do you prevent dirty data from taking up residency in your data store?  Some might argue that it’s the responsibility of the person sending you the data.  While that may be true, in practice that will rarely hold up.  It doesn’t matter how many times you ask, you will get the data however they decide to provide it. So now you have bad data.  What constitutes bad data?  There are quite a few valid answers, for example: Invalid date values Inappropriate characters Wrong data Values that exceed a pre-set threshold While it is certainly possible to write your own scripts and custom SQL to identify and deal with these data anomalies, that effort often takes too long and becomes difficult to maintain.  Instead, leveraging an ETL tool like expressor Studio makes the data cleansing process much easier and faster.  Below are some tips for leveraging expressor to get your data into tip-top shape. Tip 1:     Build reusable data objects with embedded cleansing rules One of the new features in expressor Studio 3.2 is the ability to define constraints at the metadata level.  Using expressor’s concept of Semantic Types, you can define reusable data objects that have embedded logic such as constraints for dealing with dirty data.  Once defined, they can be saved as a shared atomic type and then re-applied to other data attributes in other schemas. As you can see in the figure above, I’ve defined a constraint on zip code.  I can then save the constraint rules I defined for zip code as a shared atomic type called zip_type for example.   The next time I get a different data source with a schema that also contains a zip code field, I can simply apply the shared atomic type (shown below) and the previously defined constraints will be automatically applied. Tip 2:     Unlock the power of regular expressions in Semantic Types Another powerful feature introduced in expressor Studio 3.2 is the option to use regular expressions as a constraint.   A regular expression is used to identify patterns within data.   The patterns could be something as simple as a date format or something much more complex such as a street address.  For example, I could define that a valid IP address should be made up of 4 numbers, each 0 to 255, and separated by a period.  So 192.168.23.123 might be a valid IP address whereas 888.777.0.123 would not be.   How can I account for this using regular expressions? A very simple regular expression that would look for any 4 sets of 3 digits separated by a period would be:  ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ Alternatively, the following would be the exact check for truly valid IP addresses as we had defined above:  ^(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])$ .  In expressor, we would enter this regular expression as a constraint like this: Here we select the corrective action to be ‘Escalate’, meaning that the expressor Dataflow operator will decide what to do.  Some of the options include rejecting the offending record, skipping it, or aborting the dataflow. Tip 3:     Email pattern expressions that might come in handy In the example schema that I am using, there’s a field for email.  Email addresses are often entered incorrectly because people are trying to avoid spam.  While there are a lot of different ways to define what constitutes a valid email address, a quick search online yields a couple of really useful regular expressions for validating email addresses: This one is short and sweet:  \b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b (Source: http://www.regular-expressions.info/) This one is more specific about which characters are allowed:  ^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$ (Source: http://regexlib.com/REDetails.aspx?regexp_id=26 ) Tip 4:     Reject “dirty data” for analysis or further processing Yet another feature introduced in expressor Studio 3.2 is the ability to reject records based on constraint violations.  To capture reject records on input, simply specify Reject Record in the Error Handling setting for the Read File operator.  Then attach a Write File operator to the reject port of the Read File operator as such: Next, in the Write File operator, you can configure the expressor operator in a similar way to the Read File.  The key difference would be that the schema needs to be derived from the upstream operator as shown below: Once configured, expressor will output rejected records to the file you specified.  In addition to the rejected records, expressor also captures some diagnostic information that will be helpful towards identifying why the record was rejected.  This makes diagnosing errors much easier! Tip 5:    Use a Filter or Transform after the initial cleansing to finish the job Sometimes you may want to predicate the data cleansing on a more complex set of conditions.  For example, I may only be interested in processing data containing males over the age of 25 in certain zip codes.  Using an expressor Filter operator, you can define the conditional logic which isolates the records of importance away from the others. Alternatively, the expressor Transform operator can be used to alter the input value via a user defined algorithm or transformation.  It also supports the use of conditional logic and data can be rejected based on constraint violations. However, the best tip I can leave you with is to not constrain your solution design approach – expressor operators can be combined in many different ways to achieve the desired results.  For example, in the expressor Dataflow below, I can post-process the reject data from the Filter which did not meet my pre-defined criteria and, if successful, Funnel it back into the flow so that it gets written to the target table. I continue to be impressed that expressor offers all this functionality as part of their FREE expressor Studio desktop ETL tool, which you can download from here.  Their Studio ETL tool is absolutely free and they are very open about saying that if you want to deploy their software on a dedicated Windows Server, you need to purchase their server software, whose pricing is posted on their website. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • SQL SERVER – Shrinking Database is Bad – Increases Fragmentation – Reduces Performance

    - by pinaldave
    Earlier, I had written two articles related to Shrinking Database. I wrote about why Shrinking Database is not good. SQL SERVER – SHRINKDATABASE For Every Database in the SQL Server SQL SERVER – What the Business Says Is Not What the Business Wants I received many comments on Why Database Shrinking is bad. Today we will go over a very interesting example that I have created for the same. Here are the quick steps of the example. Create a test database Create two tables and populate with data Check the size of both the tables Size of database is very low Check the Fragmentation of one table Fragmentation will be very low Truncate another table Check the size of the table Check the fragmentation of the one table Fragmentation will be very low SHRINK Database Check the size of the table Check the fragmentation of the one table Fragmentation will be very HIGH REBUILD index on one table Check the size of the table Size of database is very HIGH Check the fragmentation of the one table Fragmentation will be very low Here is the script for the same. USE MASTER GO CREATE DATABASE ShrinkIsBed GO USE ShrinkIsBed GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Create FirstTable CREATE TABLE FirstTable (ID INT, FirstName VARCHAR(100), LastName VARCHAR(100), City VARCHAR(100)) GO -- Create Clustered Index on ID CREATE CLUSTERED INDEX [IX_FirstTable_ID] ON FirstTable ( [ID] ASC ) ON [PRIMARY] GO -- Create SecondTable CREATE TABLE SecondTable (ID INT, FirstName VARCHAR(100), LastName VARCHAR(100), City VARCHAR(100)) GO -- Create Clustered Index on ID CREATE CLUSTERED INDEX [IX_SecondTable_ID] ON SecondTable ( [ID] ASC ) ON [PRIMARY] GO -- Insert One Hundred Thousand Records INSERT INTO FirstTable (ID,FirstName,LastName,City) SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Insert One Hundred Thousand Records INSERT INTO SecondTable (ID,FirstName,LastName,City) SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO Let us check the table size and fragmentation. Now let us TRUNCATE the table and check the size and Fragmentation. USE MASTER GO CREATE DATABASE ShrinkIsBed GO USE ShrinkIsBed GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Create FirstTable CREATE TABLE FirstTable (ID INT, FirstName VARCHAR(100), LastName VARCHAR(100), City VARCHAR(100)) GO -- Create Clustered Index on ID CREATE CLUSTERED INDEX [IX_FirstTable_ID] ON FirstTable ( [ID] ASC ) ON [PRIMARY] GO -- Create SecondTable CREATE TABLE SecondTable (ID INT, FirstName VARCHAR(100), LastName VARCHAR(100), City VARCHAR(100)) GO -- Create Clustered Index on ID CREATE CLUSTERED INDEX [IX_SecondTable_ID] ON SecondTable ( [ID] ASC ) ON [PRIMARY] GO -- Insert One Hundred Thousand Records INSERT INTO FirstTable (ID,FirstName,LastName,City) SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Insert One Hundred Thousand Records INSERT INTO SecondTable (ID,FirstName,LastName,City) SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.name) RowID, 'Bob', CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith' ELSE 'Brown' END, CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino' WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles' ELSE 'Houston' END FROM sys.all_objects a CROSS JOIN sys.all_objects b GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO You can clearly see that after TRUNCATE, the size of the database is not reduced and it is still the same as before TRUNCATE operation. After the Shrinking database operation, we were able to reduce the size of the database. If you notice the fragmentation, it is considerably high. The major problem with the Shrink operation is that it increases fragmentation of the database to very high value. Higher fragmentation reduces the performance of the database as reading from that particular table becomes very expensive. One of the ways to reduce the fragmentation is to rebuild index on the database. Let us rebuild the index and observe fragmentation and database size. -- Rebuild Index on FirstTable ALTER INDEX IX_SecondTable_ID ON SecondTable REBUILD GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO You can notice that after rebuilding, Fragmentation reduces to a very low value (almost same to original value); however the database size increases way higher than the original. Before rebuilding, the size of the database was 5 MB, and after rebuilding, it is around 20 MB. Regular rebuilding the index is rebuild in the same user database where the index is placed. This usually increases the size of the database. Look at irony of the Shrinking database. One person shrinks the database to gain space (thinking it will help performance), which leads to increase in fragmentation (reducing performance). To reduce the fragmentation, one rebuilds index, which leads to size of the database to increase way more than the original size of the database (before shrinking). Well, by Shrinking, one did not gain what he was looking for usually. Rebuild indexing is not the best suggestion as that will create database grow again. I have always remembered the excellent post from Paul Randal regarding Shrinking the database is bad. I suggest every one to read that for accuracy and interesting conversation. Let us run following script where we Shrink the database and REORGANIZE. -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO -- Shrink the Database DBCC SHRINKDATABASE (ShrinkIsBed); GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO -- Rebuild Index on FirstTable ALTER INDEX IX_SecondTable_ID ON SecondTable REORGANIZE GO -- Name of the Database and Size SELECT name, (size*8) Size_KB FROM sys.database_files GO -- Check Fragmentations in the database SELECT avg_fragmentation_in_percent, fragment_count FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('SecondTable'), NULL, NULL, 'LIMITED') GO You can see that REORGANIZE does not increase the size of the database or remove the fragmentation. Again, I no way suggest that REORGANIZE is the solution over here. This is purely observation using demo. Read the blog post of Paul Randal. Following script will clean up the database -- Clean up USE MASTER GO ALTER DATABASE ShrinkIsBed SET SINGLE_USER WITH ROLLBACK IMMEDIATE GO DROP DATABASE ShrinkIsBed GO There are few valid cases of the Shrinking database as well, but that is not covered in this blog post. We will cover that area some other time in future. Additionally, one can rebuild index in the tempdb as well, and we will also talk about the same in future. Brent has written a good summary blog post as well. Are you Shrinking your database? Well, when are you going to stop Shrinking it? Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Index, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQLServer, T SQL, Technology

    Read the article

  • SQL SERVER – Introduction to SQL Server 2014 In-Memory OLTP

    - by Pinal Dave
    In SQL Server 2014 Microsoft has introduced a new database engine component called In-Memory OLTP aka project “Hekaton” which is fully integrated into the SQL Server Database Engine. It is optimized for OLTP workloads accessing memory resident data. In-memory OLTP helps us create memory optimized tables which in turn offer significant performance improvement for our typical OLTP workload. The main objective of memory optimized table is to ensure that highly transactional tables could live in memory and remain in memory forever without even losing out a single record. The most significant part is that it still supports majority of our Transact-SQL statement. Transact-SQL stored procedures can be compiled to machine code for further performance improvements on memory-optimized tables. This engine is designed to ensure higher concurrency and minimal blocking. In-Memory OLTP alleviates the issue of locking, using a new type of multi-version optimistic concurrency control. It also substantially reduces waiting for log writes by generating far less log data and needing fewer log writes. Points to remember Memory-optimized tables refer to tables using the new data structures and key words added as part of In-Memory OLTP. Disk-based tables refer to your normal tables which we used to create in SQL Server since its inception. These tables use a fixed size 8 KB pages that need to be read from and written to disk as a unit. Natively compiled stored procedures refer to an object Type which is new and is supported by in-memory OLTP engine which convert it into machine code, which can further improve the data access performance for memory –optimized tables. Natively compiled stored procedures can only reference memory-optimized tables, they can’t be used to reference any disk –based table. Interpreted Transact-SQL stored procedures, which is what SQL Server has always used. Cross-container transactions refer to transactions that reference both memory-optimized tables and disk-based tables. Interop refers to interpreted Transact-SQL that references memory-optimized tables. Using In-Memory OLTP In-Memory OLTP engine has been available as part of SQL Server 2014 since June 2013 CTPs. Installation of In-Memory OLTP is part of the SQL Server setup application. The In-Memory OLTP components can only be installed with a 64-bit edition of SQL Server 2014 hence they are not available with 32-bit editions. Creating Databases Any database that will store memory-optimized tables must have a MEMORY_OPTIMIZED_DATA filegroup. This filegroup is specifically designed to store the checkpoint files needed by SQL Server to recover the memory-optimized tables, and although the syntax for creating the filegroup is almost the same as for creating a regular filestream filegroup, it must also specify the option CONTAINS MEMORY_OPTIMIZED_DATA. Here is an example of a CREATE DATABASE statement for a database that can support memory-optimized tables: CREATE DATABASE InMemoryDB ON PRIMARY(NAME = [InMemoryDB_data], FILENAME = 'D:\data\InMemoryDB_data.mdf', size=500MB), FILEGROUP [SampleDB_mod_fg] CONTAINS MEMORY_OPTIMIZED_DATA (NAME = [InMemoryDB_mod_dir], FILENAME = 'S:\data\InMemoryDB_mod_dir'), (NAME = [InMemoryDB_mod_dir], FILENAME = 'R:\data\InMemoryDB_mod_dir') LOG ON (name = [SampleDB_log], Filename='L:\log\InMemoryDB_log.ldf', size=500MB) COLLATE Latin1_General_100_BIN2; Above example code creates files on three different drives (D:  S: and R:) for the data files and in memory storage so if you would like to run this code kindly change the drive and folder locations as per your convenience. Also notice that binary collation was specified as Windows (non-SQL). BIN2 collation is the only collation support at this point for any indexes on memory optimized tables. It is also possible to add a MEMORY_OPTIMIZED_DATA file group to an existing database, use the below command to achieve the same. ALTER DATABASE AdventureWorks2012 ADD FILEGROUP hekaton_mod CONTAINS MEMORY_OPTIMIZED_DATA; GO ALTER DATABASE AdventureWorks2012 ADD FILE (NAME='hekaton_mod', FILENAME='S:\data\hekaton_mod') TO FILEGROUP hekaton_mod; GO Creating Tables There is no major syntactical difference between creating a disk based table or a memory –optimized table but yes there are a few restrictions and a few new essential extensions. Essentially any memory-optimized table should use the MEMORY_OPTIMIZED = ON clause as shown in the Create Table query example. DURABILITY clause (SCHEMA_AND_DATA or SCHEMA_ONLY) Memory-optimized table should always be defined with a DURABILITY value which can be either SCHEMA_AND_DATA or  SCHEMA_ONLY the former being the default. A memory-optimized table defined with DURABILITY=SCHEMA_ONLY will not persist the data to disk which means the data durability is compromised whereas DURABILITY= SCHEMA_AND_DATA ensures that data is also persisted along with the schema. Indexing Memory Optimized Table A memory-optimized table must always have an index for all tables created with DURABILITY= SCHEMA_AND_DATA and this can be achieved by declaring a PRIMARY KEY Constraint at the time of creating a table. The following example shows a PRIMARY KEY index created as a HASH index, for which a bucket count must also be specified. CREATE TABLE Mem_Table ( [Name] VARCHAR(32) NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 100000), [City] VARCHAR(32) NULL, [State_Province] VARCHAR(32) NULL, [LastModified] DATETIME NOT NULL, ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA); Now as you can see in the above query example we have used the clause MEMORY_OPTIMIZED = ON to make sure that it is considered as a memory optimized table and not just a normal table and also used the DURABILITY Clause= SCHEMA_AND_DATA which means it will persist data along with metadata and also you can notice this table has a PRIMARY KEY mentioned upfront which is also a mandatory clause for memory-optimized tables. We will talk more about HASH Indexes and BUCKET_COUNT in later articles on this topic which will be focusing more on Row and Index storage on Memory-Optimized tables. So stay tuned for that as well. Now as we covered the basics of Memory Optimized tables and understood the key things to remember while using memory optimized tables, let’s explore more using examples to understand the Performance gains using memory-optimized tables. I will be using the database which i created earlier in this article i.e. InMemoryDB in the below Demo Exercise. USE InMemoryDB GO -- Creating a disk based table CREATE TABLE dbo.Disktable ( Id INT IDENTITY, Name CHAR(40) ) GO CREATE NONCLUSTERED INDEX IX_ID ON dbo.Disktable (Id) GO -- Creating a memory optimized table with similar structure and DURABILITY = SCHEMA_AND_DATA CREATE TABLE dbo.Memorytable_durable ( Id INT NOT NULL PRIMARY KEY NONCLUSTERED Hash WITH (bucket_count =1000000), Name CHAR(40) ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA) GO -- Creating an another memory optimized table with similar structure but DURABILITY = SCHEMA_Only CREATE TABLE dbo.Memorytable_nondurable ( Id INT NOT NULL PRIMARY KEY NONCLUSTERED Hash WITH (bucket_count =1000000), Name CHAR(40) ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_only) GO -- Now insert 100000 records in dbo.Disktable and observe the Time Taken DECLARE @i_t bigint SET @i_t =1 WHILE @i_t<= 100000 BEGIN INSERT INTO dbo.Disktable(Name) VALUES('sachin' + CONVERT(VARCHAR,@i_t)) SET @i_t+=1 END -- Do the same inserts for Memory table dbo.Memorytable_durable and observe the Time Taken DECLARE @i_t bigint SET @i_t =1 WHILE @i_t<= 100000 BEGIN INSERT INTO dbo.Memorytable_durable VALUES(@i_t, 'sachin' + CONVERT(VARCHAR,@i_t)) SET @i_t+=1 END -- Now finally do the same inserts for Memory table dbo.Memorytable_nondurable and observe the Time Taken DECLARE @i_t bigint SET @i_t =1 WHILE @i_t<= 100000 BEGIN INSERT INTO dbo.Memorytable_nondurable VALUES(@i_t, 'sachin' + CONVERT(VARCHAR,@i_t)) SET @i_t+=1 END The above 3 Inserts took 1.20 minutes, 54 secs, and 2 secs respectively to insert 100000 records on my machine with 8 Gb RAM. This proves the point that memory-optimized tables can definitely help businesses achieve better performance for their highly transactional business table and memory- optimized tables with Durability SCHEMA_ONLY is even faster as it does not bother persisting its data to disk which makes it supremely fast. Koenig Solutions is one of the few organizations which offer IT training on SQL Server 2014 and all its updates. Now, I leave the decision on using memory_Optimized tables on you, I hope you like this article and it helped you understand  the fundamentals of IN-Memory OLTP . Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: Koenig

    Read the article

  • SQL SERVER – Step by Step Guide to Beginning Data Quality Services in SQL Server 2012 – Introduction to DQS

    - by pinaldave
    Data Quality Services is a very important concept of SQL Server. I have recently started to explore the same and I am really learning some good concepts. Here are two very important blog posts which one should go over before continuing this blog post. Installing Data Quality Services (DQS) on SQL Server 2012 Connecting Error to Data Quality Services (DQS) on SQL Server 2012 This article is introduction to Data Quality Services for beginners. We will be using an Excel file Click on the image to enlarge the it. In the first article we learned to install DQS. In this article we will see how we can learn about building Knowledge Base and using it to help us identify the quality of the data as well help correct the bad quality of the data. Here are the two very important steps we will be learning in this tutorial. Building a New Knowledge Base  Creating a New Data Quality Project Let us start the building the Knowledge Base. Click on New Knowledge Base. In our project we will be using the Excel as a knowledge base. Here is the Excel which we will be using. There are two columns. One is Colors and another is Shade. They are independent columns and not related to each other. The point which I am trying to show is that in Column A there are unique data and in Column B there are duplicate records. Clicking on New Knowledge Base will bring up the following screen. Enter the name of the new knowledge base. Clicking NEXT will bring up following screen where it will allow to select the EXCE file and it will also let users select the source column. I have selected Colors and Shade both as a source column. Creating a domain is very important. Here you can create a unique domain or domain which is compositely build from Colors and Shade. As this is the first example, I will create unique domain – for Colors I will create domain Colors and for Shade I will create domain Shade. Here is the screen which will demonstrate how the screen will look after creating domains. Clicking NEXT it will bring you to following screen where you can do the data discovery. Clicking on the START will start the processing of the source data provided. Pre-processed data will show various information related to the source data. In our case it shows that Colors column have unique data whereas Shade have non-unique data and unique data rows are only two. In the next screen you can actually add more rows as well see the frequency of the data as the values are listed unique. Clicking next will publish the knowledge base which is just created. Now the knowledge base is created. We will try to take any random data and attempt to do DQS implementation over it. I am using another excel sheet here for simplicity purpose. In reality you can easily use SQL Server table for the same. Click on New Data Quality Project to see start DQS Project. In the next screen it will ask which knowledge base to use. We will be using our Colors knowledge base which we have recently created. In the Colors knowledge base we had two columns – 1) Colors and 2) Shade. In our case we will be using both of the mappings here. User can select one or multiple column mapping over here. Now the most important phase of the complete project. Click on Start and it will make the cleaning process and shows various results. In our case there were two columns to be processed and it completed the task with necessary information. It demonstrated that in Colors columns it has not corrected any value by itself but in Shade value there is a suggestion it has. We can train the DQS to correct values but let us keep that subject for future blog posts. Now click next and keep the domain Colors selected left side. It will demonstrate that there are two incorrect columns which it needs to be corrected. Here is the place where once corrected value will be auto-corrected in future. I manually corrected the value here and clicked on Approve radio buttons. As soon as I click on Approve buttons the rows will be disappeared from this tab and will move to Corrected Tab. If I had rejected tab it would have moved the rows to Invalid tab as well. In this screen you can see how the corrected 2 rows are demonstrated. You can click on Correct tab and see previously validated 6 rows which passed the DQS process. Now let us click on the Shade domain on the left side of the screen. This domain shows very interesting details as there DQS system guessed the correct answer as Dark with the confidence level of 77%. It is quite a high confidence level and manual observation also demonstrate that Dark is the correct answer. I clicked on Approve and the row moved to corrected tab. On the next screen DQS shows the summary of all the activities. It also demonstrates how the correction of the quality of the data was performed. The user can explore their data to a SQL Server Table, CSV file or Excel. The user also has an option to either explore data and all the associated cleansing info or data only. I will select Data only for demonstration purpose. Clicking explore will generate the files. Let us open the generated file. It will look as following and it looks pretty complete and corrected. Well, we have successfully completed DQS Process. The process is indeed very easy. I suggest you try this out yourself and you will find it very easy to learn. In future we will go over advanced concepts. Are you using this feature on your production server? If yes, would you please leave a comment with your environment and business need. It will be indeed interesting to see where it is implemented. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Business Intelligence, Data Warehousing, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: Data Quality Services, DQS

    Read the article

< Previous Page | 176 177 178 179 180 181 182 183 184 185 186 187  | Next Page >