Search Results

Search found 36816 results on 1473 pages for 'sql pass'.


  • The Data Scientist

    - by BuckWoody
    A new term - well, perhaps not that new - has come up and I’m actually very excited about it. The term is Data Scientist, and since it’s new, it’s fairly undefined. I’ll explain what I think it means, and why I’m excited about it.

    In general, I’ve found the term deals at its most basic with analyzing data. Of course, we all do that, and the term itself in that definition is redundant. There is no science that I know of that does not work with analyzing lots of data. But the term seems to refer to more than the common practices of looking at data visually, putting it in a spreadsheet or report, or even using simple coding to examine data sets.

    A Data Scientist (as far as I can make out this early in its use) is someone who has a strong understanding of data sources, relevance (statistical and otherwise) and processing methods as well as front-end displays of large sets of complicated data. Some - but not all - Business Intelligence professionals have these skills. In other cases, senior developers, database architects or others fill these needs, but in my experience, many lack the strong mathematical skills needed to make these choices properly.

    I’ve divided the knowledge base for someone who would wear this title into three large segments. It remains to be seen if a given Data Scientist would be responsible for knowing all these areas or would specialize. There are pretty high requirements on the math side, specifically in graduate-degree level statistics, but in my experience a company will only have a few of these folks, so they are expected to know quite a bit in each of these areas.

    Persistence

    The first area is finding, cleaning and storing the data. In some cases, no cleaning is done prior to storage - it’s just identified and the cleansing is done in a later step. This area is where the professional would be able to tell if a particular data set should be stored in a Relational Database Management System (RDBMS), across a set of key/value pair storage (NoSQL) or in a file system like HDFS (part of the Hadoop landscape) or other methods. Or do you examine the stream of data without storing it in another system at all?

    This is an important decision - it’s a foundation choice that deals not only with a lot of expense of purchasing systems or even using Cloud Computing (PaaS, SaaS or IaaS) to source it, but also the skillsets and other resources needed to care for and feed the system for a long time. The Data Scientist sets something into motion that will probably outlast his or her career at a company or organization.

    Often these choices are made by senior developers, database administrators or architects in a company. But sometimes each of these has a certain bias towards making a decision one way or another. The Data Scientist would examine these choices in light of the data itself, starting perhaps even before the business requirements are created. The business may not even be aware of all the strategic and tactical data sources that they have access to.

    Processing

    Once the decision is made to store the data, the next set of decisions is based around how to process the data. An RDBMS scales well to a certain level, and provides a high degree of ACID compliance as well as offering a well-known set-based language to work with this data. In other cases, scale should be spread among multiple nodes (as in the case of Hadoop landscapes or NoSQL offerings) or even across a Cloud provider like Windows Azure Table Storage.

    In fact, in many cases - most of the ones I’m dealing with lately - the data should be split among multiple types of processing environments. This is a newer idea. Many data professionals simply pick a methodology (RDBMS with Star Schemas, NoSQL, etc.) and put all data there, regardless of its shape, processing needs and so on. A Data Scientist is familiar not only with the various processing methods, but how they work, so that they can choose the right one for a given need. This is a huge time commitment, hence the need for a dedicated title like this one.

    Presentation

    This is where the need for a Data Scientist is most often already being filled, sometimes with more or less success. The latest Business Intelligence systems are quite good at allowing you to create amazing graphics - but it’s the data behind the graphics that is the most important component of truly effective displays.

    This is where the mathematics requirement of the Data Scientist title is the most unforgiving. In fact, someone without a good foundation in statistics is not a good candidate for creating reports. Even a basic level of statistics can be dangerous. Anyone who works in analyzing data will tell you that there are multiple errors possible when data just seems right - and basic statistics bears out that you’re on the right track - that are only solvable when you understand why the statistical formula works the way it does.

    And there are lots of ways of presenting data. Sometimes all you need is a “yes” or “no” answer that can only come after heavy analysis work. In that case, a simple e-mail might be all the reporting you need. In others, complex relationships and multiple components require a deep understanding of the various graphical methods of presenting data. Knowing which kind of chart, color, graphic or shape conveys a particular datum best is essential knowledge for the Data Scientist.

    Why I’m excited

    I love this area of study. I like math, stats, and computing technologies, but it goes beyond that. I love what data can do - how it can help an organization. I’ve been fortunate enough in my professional career these past two decades to work with lots of folks who perform this role at companies from aerospace to medical firms, from manufacturing to retail. Interestingly, the size of the company really isn’t germane here. I worked with one very small bio-tech (cryogenics) company that worked deeply with analysis of complex interrelated data.

    So watch this space. No, I’m not leaving Azure or distributed computing or Microsoft. In fact, I think I’m perfectly situated to investigate this role further. We have a huge set of tools, from RDBMS to Hadoop, to allow me to explore. And I’m happy to share what I learn along the way.

    Read the article

  • SSMS Tools Pack now supports Denali CTP1

    - by AaronBertrand
    Earlier today, Mladen Prajdic ( blog | twitter ) released an updated version of his SSMS Tools Pack (v.1.9.4), a free add-in for Management Studio that provides a ton of helpful functionality that isn't available with the native tools. I'm really glad this happened, because I've installed Denali on all of my VMs and have been using it for most of my work, and I've been missing some of the little things the tool adds. In addition to adding Denali support, Mladen also fixed a handful of minor bugs...(read more)

    Read the article

  • Smart defaults [SSDT]

    - by jamiet
    I’ve just discovered a new, somewhat hidden, feature in SSDT that I didn’t know about and figured it would be worth highlighting here because I’ll bet not many others know it either; the feature is called Smart Defaults. It gets around the problem of adding a NOT NULLable column to an existing table that has got data in it. Previous to SSDT you would need to define a DEFAULT constraint; however, it does feel rather cumbersome to create an object purely for the purpose of pushing through a deployment, and that’s the situation that Smart Defaults is meant to alleviate.

    The Smart Defaults option exists in the advanced section of a Publish Profile file. The description of the setting is “Automatically provides a default value when updating a table that contains data with a column that does not allow null values”; in other words, checking that option will cause SSDT to insert an arbitrary default value into your newly created NOT NULLable column.

    In case you’re wondering how it does it, here’s how: SSDT creates a DEFAULT CONSTRAINT at the same time as the column is created and then immediately removes that constraint:

        ALTER TABLE [dbo].[T1]
            ADD [C1] INT NOT NULL,
                CONSTRAINT [SD_T1_1df7a5f76cf44bb593506d05ff9a1e2b] DEFAULT 0 FOR [C1];
        ALTER TABLE [dbo].[T1] DROP CONSTRAINT [SD_T1_1df7a5f76cf44bb593506d05ff9a1e2b];

    You can then update the value as appropriate in a Post-Deployment script. Pretty cool!

    On the downside, you can only specify this option for the whole project, not for an individual table or even an individual column. I’m not sure that I’d want to turn this on for an entire project as it could hide problems that a failed deployment would highlight; in other words, smart defaults could be seen to be “papering over the cracks”. If you think that should be improved go and vote (and leave a comment) at [SSDT] Allow us to specify Smart defaults per table or even per column. @Jamiet
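    The follow-up step mentioned above, updating the arbitrary value in a Post-Deployment script, might look something like the sketch below. It reuses the [dbo].[T1] and [C1] names from the generated script; the replacement value 42 is purely hypothetical and stands in for whatever the real business rule is.

```sql
-- Post-deployment script sketch (assumption: 42 is a placeholder value).
-- Smart Defaults filled every pre-existing row with 0, so rows still
-- holding 0 right after the column is added are the ones to correct.
UPDATE [dbo].[T1]
SET    [C1] = 42
WHERE  [C1] = 0;
```

    Post-deployment scripts run on every publish, so a real script would need a guard of some kind to stop it from overwriting legitimate zeros on later deployments.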

    Read the article

  • T-SQL Tuesday #005: On Technical Reporting

    - by Adam Machanic
    Reports. They're supposed to look nice. They're supposed to be a method by which people can get vital information into their heads. And that's obvious, right? So obvious that you're undoubtedly getting ready to close this tab and go find something better to do with your life. "Why is Adam wasting my time with this garbage?" Because apparently, it's not obvious. In the world of reporting we have a number of different types of reports: business reports, status reports, analytical reports, dashboards,...(read more)

    Read the article

  • Always use dtexec.exe to test performance of your dataflows. No exceptions.

    - by jamiet
    Earlier this evening I posted a blog post entitled Investigation: Can different combinations of components effect Dataflow performance? where I compared the performance of three different dataflows all working to the same overall goal. I wanted to make one last point related to the results but I thought it warranted a blog post all of its own. Here is a screenshot of one of the dataflows that I was testing: Pretty complicated I’m sure you’ll agree. Now, when I executed this dataflow in the test it took ~19 seconds; however, in that case I was executing it with the command-line tool dtexec. I also tried executing it inside the BIDS development environment, and there it took much longer: 139 seconds. That’s more than seven times as long. The point I want to make is very simple. If you are testing your dataflows for performance please use dtexec. Nothing else will suffice. @Jamiet

    Read the article

  • SQL Saturday Birmingham #328 Database Design Precon In One Week

    - by drsql
    On September 22, I will be doing my "How to Design a Relational Database" pre-conference session in Birmingham, Alabama. You can see the abstract here if you are interested, and you can sign up there too, naturally. At just $100, which includes a free ebook copy of my database design book, it is a great bargain and I totally promise it will be a little over 7 hours of talking about and designing databases, which will certainly be better than what you do on a normal work day, even a Friday....(read more)

    Read the article

  • Does multiple files in SQL Server when using RAID help reduce conflicts in growth and file-locking?

    - by Dr Giles M
    I've been reading around and get the impression that if you are using RAID then using multiple SQL Server files within a filegroup won't yield any further improvement, and that the benefits are purely administrative (if you started to run out of space, or wanted to partition data into manageable chunks for backups or for balancing the data around your big server room). However, being a reasonably savvy software person, I don't think it's unreasonable to hypothesise that, even for smaller databases, SQL Server performs growth and locking operations (for writes) on a LOGICAL file basis. So even if you are using RAID, it seems to make sense to have multiple files in a filegroup to balance I/O. Or does the time taken to reconstruct the data from distributed filegroups outweigh the benefits of reduced locking? I'm also aware that the behaviour and benefits may be different for tables, indexes and logs. Is there a good site that distinguishes the benefits of multiple files when RAID is already in place?

    Read the article

  • Should we have a database independent SQL like query language in Django?

    - by Yugal Jindle
    Note: I know we already have the Django ORM, which keeps things database independent and converts queries to database-specific SQL. Once things start getting complicated, it is preferred to write raw SQL queries for better efficiency. When you write raw SQL queries, your code gets trapped with the database you are using. I also understand it's important to use the full power of your database, which cannot be achieved with the Django ORM alone.

    My question: until I use a database-specific feature, why should I be trapped with the database? For instance, say we have a query with multiple joins and we decide to write it as a raw SQL query. That makes my website Postgres specific, even though I have not used any Postgres-specific feature. I feel there should be some fake SQL language that can be translated to any database's SQL. Even Django's ORM could be built on top of it, so that if you step outside the ORM but use nothing database specific, you still remain database independent.

    I asked Jacob Kaplan-Moss the same question in person. He advised me to stay with the database that I like and make use of its whole power, to which I agree. But my point was not that we should always be database independent. My point is that we should be database independent until we use a database-specific feature. Please explain: why should there be a fake SQL layer over the actual SQL?

    Read the article

  • 24 Hours of PASS scheduling

    - by Rob Farley
    I have a new appreciation for Tom LaRock (@sqlrockstar), who is doing a tremendous job leading the organising committee for the 24 Hours of PASS event (Twitter: #24hop). We’ve just been going through the list of speakers and their preferences for time slots, and hopefully we’ve kept everyone fairly happy.

    All the submitted sessions (59 of them) were put up for a vote, and over a thousand of you picked your favourites. The top 28 sessions as voted were all included (24 sessions plus 4 reserves), and duplicates (when a single presenter had two sessions in the top 28) were swapped out for others. For example, both sessions submitted by Cindy Gross were in the top 28. These swaps were chosen by the committee to get a good balance of topics.

    Amazingly, some big names missed out, and even the top ten included some surprises. T-SQL, Indexes and Reporting featured well in the top ten, and in the end, the mix between BI, Dev and DBA worked out quite nicely too. The ten most voted-for sessions were (in order):

      1. Jennifer McCown - T-SQL Code Sins: The Worst Things We Do to Code and Why
      2. Michelle Ufford - Index Internals for Mere Mortals
      3. Audrey Hammonds - T-SQL Awesomeness: 3 Ways to Write Cool SQL
      4. Cindy Gross - SQL Server Performance Tools
      5. Jes Borland - Reporting Services 201: the Next Level
      6. Isabel de la Barra - SQL Server Performance
      7. Karen Lopez - Five Physical Database Design Blunders and How to Avoid Them
      8. Julie Smith - Cool Tricks to Pull From Your SSIS Hat
      9. Kim Tessereau - Indexes and Execution Plans
      10. Jen Stirrup - Dashboards Design and Practice using SSRS

    I think you’ll all agree this is shaping up to be an excellent event.

    Read the article

  • Is the SAN dying???

    - by RickHeiges
    Is the SAN dying? The reason that I ask this question is that MSFT has unleashed technologies this year that point in that direction:

      - Always ON Availability Groups shuns shared storage
      - Windows 2012 has Storage Replication Technology that does not require a SAN
      - Windows 2012 has Hyper-V Replica Technology that does not require a SAN
      - PDW v2 continues to reinforce the approach to avoid shared storage

    I'm not saying that SAN technology does not have its place or does not have benefits inherent to the beast....(read more)

    Read the article

  • Defaults for Exporting Data in Oracle SQL Developer

    - by thatjeffsmith
    I was testing a reported bug in SQL Developer today – so the bug I was looking for wasn’t there (YES!) but I found a different one (NO!) – and I was getting frustrated by having to check the same boxes over and over again. What I wanted was INSERT STATEMENTS to the CLIPBOARD. Not what I want! I’m always doing the same thing, over and over again. And I never go to FILE – that’s too permanent for my type of work. I either want stuff to the clipboard or to the worksheet. Surely there’s a way to tell SQL Developer how to behave? Oh yeah, check the preferences So you can set the defaults for this dialog. Go to: Tools – Preferences – Database – Utilities – Export Now I will always start with ‘INSERT’ and ‘Clipboard’ – woohoo! Now, I can also go INTO the preferences for each of the different formats to save me a few more clicks. I prefer pointy hats (^) for my delimiters, don’t you? So, spend a few minutes and set each of these to what you’re normally doing and save yourself a bunch of time going forward.

    Read the article

  • SQL Server 2008 R2 Service Pack 2 CTP is available

    - by AaronBertrand
    You can download the Service Pack 2 CTP from the following URL: http://www.microsoft.com/en-us/download/details.aspx?id=29848 The build # is 10.50.3720. This service pack contains all of the fixes from Service Pack 1 & Cumulative Updates 1 through 5, and a couple of other minor fixes (a couple of SSRS bugs and a bug about an ALTER TABLE batch not being cached correctly). It does not include fixes from Service Pack 1 Cumulative Update #6, which I mentioned recently . You should *NOT* install this...(read more)

    Read the article

  • Oracle Database 12c By Example – SQL Developer and Multitenant

    - by thatjeffsmith
    As you may have heard, Oracle Database 12c is now available. In addition to the binaries and docs going out, we also published a few new Oracle By Example (OBE) chapters. You can find those links here on our product page. Do you know who found these, practically the minute they were published? An enterprising DBA-extraordinaire who just happened to be presenting at the ODTUG KScope13 conference in New Orleans. He thought it would be a good idea to download the new software over a hotel WIFI, install and create a new multitenant database, watch a few OBEs, and then demo that live for his ‘SQL Developer for DBAs‘ session. Pretty crazy, right? Well, he did it, and I was there to watch. Way cool. You can listen to @leight0nn tell his story in his own words via this ODTUG interview with @oraclenered. In case you’re too giddy to sit through the video, I’ll give you a preview – he successfully cloned a pluggable database in about a minute with only a couple of clicks using Oracle SQL Developer 3.2.20.09 while connected to a 12c database.

    Read the article

  • SQL Server Add Primary Key

    - by Derek D.
    Adding a primary key can be done either after a table is created, or at the same a table is created. It is important to note, that by default a primary key is clustered. This may or may not be the preferred method of creation. For more information on clustered vs non [...]
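    As a rough illustration of the two approaches the excerpt describes, the sketch below declares a primary key at table creation and then adds one to an existing table. The table, column and constraint names are hypothetical and do not come from the original article.

```sql
-- Hypothetical tables, for illustration only.

-- 1. Primary key declared when the table is created.
--    With no other clustered index present, it is clustered by default.
CREATE TABLE dbo.Customer
(
    CustomerID   INT          NOT NULL,
    CustomerName VARCHAR(100) NOT NULL,
    CONSTRAINT PK_Customer PRIMARY KEY (CustomerID)
);

-- 2. Primary key added after the table already exists.
CREATE TABLE dbo.OrderHeader
(
    OrderID    INT      NOT NULL,
    CustomerID INT      NOT NULL,
    OrderDate  DATETIME NOT NULL
);

-- NONCLUSTERED overrides the default when a clustered key is not wanted.
ALTER TABLE dbo.OrderHeader
    ADD CONSTRAINT PK_OrderHeader PRIMARY KEY NONCLUSTERED (OrderID);
```

    Naming the constraint explicitly keeps later ALTER TABLE ... DROP CONSTRAINT statements predictable, since an unnamed key gets a system-generated name.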

    Read the article

  • SQL Server 2000 need to prevent logons whilst performing a backup for a side by side migration

    - by pigeon
    I'm looking for a way to prevent logons from occurring in order to take a full backup of a Database to migrate from its current SQL Server 2000 instance to a new SQL 2005 instance. A friend of mine suggested running a script which would put the DB into a rollback state. Not being a DBA my DDL is very poor and running a script that I don't understand may not be the best idea. One option which might be easier is to simply detach and copy, to the new server. Any suggestions would be greatly appreciated.
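    The script the friend suggested is probably something along the lines of the sketch below, which rolls back open transactions and restricts the database to a single connection before the backup is taken. This is an assumption about what was meant, not a confirmed answer; the database name and backup path are hypothetical, and it should be tried on a test copy first.

```sql
-- Sketch only: [MyDatabase] and the backup path are hypothetical.

-- Roll back open transactions and allow only one connection.
-- Run this from the session that will take the backup.
ALTER DATABASE [MyDatabase] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;

-- Take the full backup to restore on the new instance.
BACKUP DATABASE [MyDatabase]
    TO DISK = 'D:\Backup\MyDatabase_migration.bak'
    WITH INIT;

-- Optionally stop further changes to the old copy after the backup.
ALTER DATABASE [MyDatabase] SET READ_ONLY;

-- To return the database to normal use instead:
-- ALTER DATABASE [MyDatabase] SET MULTI_USER;
```

    If administrators still need access during the cutover, RESTRICTED_USER can be used in place of SINGLE_USER.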

    Read the article

  • PASS: The Budget Process

    - by Bill Graziano
    Every fiscal year PASS creates a detailed budget. This helps us set priorities and communicate to our members what we’re going to do in the upcoming year. You can review the current budget on the PASS Governance page. That page currently requires you to log in but I’m talking with HQ to see if there are any legal issues with opening that up.

    The Accounting Team

    The PASS accounting team is two people: the Executive Vice-President of Finance (“EVP”) and the PASS Accounting Manager. Sandy Cherry is the accounting manager and works at PASS HQ. Sandy has been with PASS since we switched management companies in 2007. Throughout this document when I talk about any actual work related to the budget that’s all Sandy :) She’s the glue that gets us through this process. Last year we went through 32 iterations of the budget before the Board approved so it’s a pretty busy time for us – well, mostly her.

    Fiscal Year

    The PASS fiscal year runs from July 1st through June 30th the following year. Right now we’re in fiscal year 2011. Our 2010 Summit actually occurred in FY2011. We switched to this schedule from a calendar year in 2006. Our goal was to have the Summit occur early in our fiscal year. That gives us the rest of the year to handle any significant financial impact from the Summit. If registrations are down we can reduce spending. If registrations are up we can decide how much to increase our reserves and how much to spend. Keep in mind that the Summit is budgeted to generate 82% of our revenue this year. How it performs has a significant impact on our financials. The other benefit of this fiscal year is that it matches the Microsoft fiscal year. We sign an annual sponsorship agreement with Microsoft and it’s very helpful that our fiscal years match.

    This year our budget process will probably start in earnest in March or April. I’d like to be done in early June so we can publish before July 1st. I was late publishing it this year and I’m trying not to repeat that.

    Our Budget

    Our actual budget is an Excel spreadsheet with 36 sheets. We remove some of those when we publish it since they include salary information. The budget is broken up into various portfolios or departments. We have 20 portfolios. They include chapters, marketing, virtual chapters, etc. Ideally each portfolio is assigned to a Board member. Each portfolio also typically has a staff person assigned to it. Portfolios that aren’t assigned to a Board member are monitored by HQ and the ExecVP-Finance (me). These are typically smaller portfolios such as deferred membership or Summit futures. (More on those in a later post.) All portfolios are reviewed by all Board members during the budget approval process, when interim financials are released internally and at year-end.

    The Process

    Our first step is to budget revenues. The Board determines a target attendee number. We have formulas based on historical performance that convert that to an overall attendee revenue number. Other revenue projections (such as vendor sponsorships) come from different parts of the organization. I hope to have another post with more details on how we project revenues.

    The next step is to budget expenses. Board members fill out a sample spreadsheet with their budget for the year. They can add line items and notes describing what the amounts are for. Each Board portfolio typically has from 10 to 30 line items. Any new initiatives they want to pursue need to be budgeted. The Summit operations budget is managed by HQ. It includes the cost for food, electrical, internet, etc. Most of these come from our estimate of attendees and our contract with the convention center. During this process the Board can ask for more or less to be spent on various line items. For example, if we weren’t happy with the Internet at the last Summit we can ask them to look into different options and/or increasing the budget. HQ will also make adjustments to these numbers based on what they see at the events and the feedback we receive on the surveys.

    After we have all the initial estimates we start reviewing the entire budget. It is sent out to the Board and we can see what each portfolio requested and what the overall profit and loss number is. We usually start with too much in expenses and need to cut. In years past the Board started haggling over these numbers as a group. This past year they decided I should take a first cut and present them with a reasonable budget and a list of what I changed. That worked well and I think we’ll continue to do that in the future.

    We go through a number of iterations on the budget. If I remember correctly, we went through 32 iterations before we passed the budget. At each iteration various revenue and expense numbers can change. Keep in mind that the PASS budget has 200+ line items spread over 20 portfolios. Many of these depend on other numbers. For example, if we decide to increase the projected attendees that cascades through our budget. At each iteration we list what changed and the impact. Ideally these discussions will take place at a face-to-face Board meeting. Many of them also take place over the phone. Board members explain any increase they are asking for while performing due diligence on other budget requests. Eventually a budget emerges and is passed.

    Publishing

    After the budget is passed we create a version without the formulas and salaries for posting on the web site. Sandy also creates some charts to help our members understand the budget. The EVP writes a nice little letter describing some of the changes from last year’s budget. You can see my letter and our budget on the PASS Governance page. And then, eight months later, we start all over again.

    Read the article

  • SQL Server Manageability Series: how to change the default path of .cache files of a data collector? #sql #mdw #dba

    - by ssqa.net
    How do you change the default path of a data collector's .cache files after the Management Data Warehouse (MDW) has been set up? This was the question asked by one of the DBAs at a client's site. I immediately asked whether any folder had been specified while setting up the MDW, and the obvious answer was no: everything had been left at its default. This means all the .CACHE files are stored under the %C\TEMP directory, which may pose an out-of-disk-space problem on the server where the MDW is set up to collect. Going back...(read more)

    Read the article

  • High Performance SQL Views Using WITH(NOLOCK)

    - by gt0084e1
    Every now and then you find a simple way to make everything much faster. We often find customers creating data warehouses or OLAP cubes even though they have a relatively small amount of data (a few gigs) compared to their server memory. If you have more server memory than the size of your database or working set, nearly any aggregate query should run in a second or less.

    In some situations there may be high traffic from the transactional application, and SQL Server may wait for several other queries to run before giving you your results. The purpose of this is to make sure you don’t get two versions of the truth. In an ATM system, you want to give the bank balance after the withdrawal, not before, or you may get a very unhappy customer. So by default databases are rightly very conservative about this kind of thing.

    Unfortunately this split-second precision comes at a cost. The performance of the query may not be acceptable by today’s standards because the database has to maintain locks on the server. Fortunately, SQL Server gives you a simple way to ask for the current version of the data without waiting for pending transactions to finish. To better facilitate reporting, you can create a view that includes these directives.

        -- Column names other than CategoryID are assumed (Northwind-style).
        -- SELECT * would fail in a view here because CategoryID exists in both tables.
        CREATE VIEW CategoriesAndProducts AS
        SELECT c.CategoryID, c.CategoryName, p.ProductID, p.ProductName
        FROM dbo.Categories AS c WITH (NOLOCK)
        INNER JOIN dbo.Products AS p WITH (NOLOCK)
            ON c.CategoryID = p.CategoryID

    In some cases queries that are taking minutes end up taking seconds. Much easier than moving the data to a separate database and it’s still pretty much real time give or take a few milliseconds. You’ve been warned not to use this for bank balances though.

    Read the article

  • A different interface for the Sql Server Reporting Service?

    - by AngryHacker
    I have a SQL Server 2005 SQL Reporting Services implementation. It seems that the only way to actually access the reports is for the users to use Internet Explorer. The web page uses an ActiveX control to do its printing (and probably other functions as well). Does SSRS have a different way to access its functionality via the web browser? Like maybe Java or HTML based? If so, how do I actually turn it on? The reason I am asking is because the security is being tightened and ActiveX controls will be banished, thus the users won't be able to print.

    Read the article

  • MERGE gives better OUTPUT options

    - by Rob Farley
    MERGE is very cool. There are a ton of useful things about it – mostly around the fact that you can implement a ton of change against a table all at once. This is great for data warehousing, handling changes made to relational databases by applications, all kinds of things. One of the more subtle things about MERGE is the power of the OUTPUT clause. Useful for logging.

    If you’re not familiar with the OUTPUT clause, you really should be – it basically makes your DML (INSERT/DELETE/UPDATE/MERGE) statement return data back to you. This is a great way of returning identity values from INSERT commands (so much better than SCOPE_IDENTITY() or the older (and worse) @@IDENTITY, because you can get lots of rows back). You can even use it to grab default values that are set using non-deterministic functions like NEWID() – things you couldn’t normally get back without running another query (or with a trigger, I guess, but that’s not pretty).

    That inserted table I referenced – that’s part of the ‘behind-the-scenes’ work that goes on with all DML changes. When you insert data, this internal table called inserted gets populated with rows, and then used to inflict the appropriate inserts on the various structures that store data (HoBTs – the Heaps or B-Trees used to store data as tables and indexes). When deleting, the deleted table gets populated. Updates get a matching row in both tables (although this doesn’t mean that an update is a delete followed by an insert, it’s just the way it’s handled with these tables). These tables can be referenced by the OUTPUT clause, which can show you the before and after for any DML statement. Useful stuff.

    MERGE is slightly different though. With MERGE, you get a mix of entries. Your MERGE statement might be doing some INSERTs, some UPDATEs and some DELETEs. One of the most common examples of MERGE is to perform an UPSERT command, where data is updated if it already exists, or inserted if it’s new. And in a single operation too.
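    The original post's code samples did not survive in this excerpt, so the following is a stand-in sketch rather than the author's own query. It shows OUTPUT on a plain INSERT and then an UPSERT-style MERGE that reports each row's action. The table dbo.Item and its columns are hypothetical.

```sql
-- Hypothetical demo table; not taken from the original post.
CREATE TABLE dbo.Item
(
    id       INT IDENTITY(1,1) PRIMARY KEY,
    val      VARCHAR(20)      NOT NULL,
    row_guid UNIQUEIDENTIFIER NOT NULL DEFAULT NEWID()
);

-- OUTPUT on a plain INSERT returns every generated identity value
-- and every NEWID() default, one row per inserted row.
INSERT dbo.Item (val)
OUTPUT inserted.id, inserted.val, inserted.row_guid
VALUES ('alpha'), ('beta');

-- UPSERT: update rows that already exist, insert the ones that do not.
-- The source carries an extra src column that the target never stores.
MERGE dbo.Item AS t
USING (VALUES (1, 'alpha v2', 'feed A'),
              (3, 'gamma',    'feed B')) AS s (id, val, src)
    ON t.id = s.id
WHEN MATCHED THEN
    UPDATE SET val = s.val
WHEN NOT MATCHED BY TARGET THEN
    INSERT (val) VALUES (s.val)
-- deleted.* is NULL for inserted rows; updated rows show before and after.
OUTPUT $action      AS merge_action,
       deleted.id   AS old_id,
       deleted.val  AS old_val,
       inserted.id  AS new_id,
       inserted.val AS new_val;
```

    This sketch lets the identity column generate ids instead of turning on IDENTITY_INSERT, so it differs in that detail from the sample the author describes.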
    In an UPSERT like that, you can see the usefulness of the deleted and inserted tables, which clearly reflect the type of operation (but then again, MERGE lets you use an extra column called $action to show this). (Don’t worry about the fact that I turned on IDENTITY_INSERT; that’s just so that I could insert the values.)
    One of the things I love about MERGE is that it feels almost cursor-like – the UPDATE bit feels like “WHERE CURRENT OF …”, and the INSERT bit feels like a single-row insert. And it is – but into the inserted and deleted tables. The operations to maintain the HoBTs are still done using the whole set of changes, which is very cool. And $action – very convenient.
    But as cool as $action is, that’s not the point of my post. If it were, I hope you’d all be disappointed, as you can’t really go near the MERGE statement without learning about it. The subtle thing that I love about MERGE with OUTPUT is that you can hook into more than just inserted and deleted. Say my source query had a ‘src’ field that wasn’t used in the insert. Normally, this would be somewhat pointless to include in my source query. But with MERGE, I can put that in the OUTPUT clause.
    This is useful stuff, particularly when you need to audit the changes. Suppose your query involved consolidating data from a number of sources, but you didn’t need to insert the source details into the actual table, just into a table for audit. This is now very doable, either using the INTO clause of OUTPUT, or surrounding the whole MERGE statement in brackets (parentheses if you’re American) and using a regular INSERT statement.
    This is also doable if you’re using MERGE to just do INSERTs, which matters because the OUTPUT clause of a plain INSERT can only reference the inserted table, not the columns of the source query. In case you hadn’t realised, you can use MERGE in place of an INSERT statement. It’s just like the UPSERT-style statement described above, except that we want nothing to match. That’s easy to do, we just use ON 1=2. This is obviously more convoluted than a straight INSERT.
And it’s slightly more effort for the database engine too. But, if you want the extra audit capabilities, the ability to hook into the other source columns is definitely useful. Oh, and before people ask if you can also hook into the target table’s columns... Yes, of course. That’s what deleted and inserted give you.
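    As a rough illustration of the insert-only MERGE with auditing described above, here is a small T-SQL sketch. The #target and #audit tables, the column names and the 'feed' values are assumptions made up for this example; the technique shown is ON 1 = 2 combined with OUTPUT ... INTO, with a source-only column (src) captured in the audit rows.

```sql
-- Hypothetical target table: note there is no column for the source name.
CREATE TABLE #target (
    id  int IDENTITY(1,1) PRIMARY KEY,
    val varchar(20) NOT NULL
);

-- Hypothetical audit table that records where each inserted row came from.
CREATE TABLE #audit (
    merge_action nvarchar(10) NOT NULL,
    new_id       int          NOT NULL,
    val          varchar(20)  NOT NULL,
    src          varchar(20)  NOT NULL
);

-- ON 1 = 2 never matches, so every source row takes the NOT MATCHED branch
-- and the MERGE behaves like a plain INSERT.
MERGE #target AS t
USING (VALUES ('one', 'feed A'),
              ('two', 'feed B')) AS s (val, src)
    ON 1 = 2
WHEN NOT MATCHED BY TARGET THEN
    INSERT (val) VALUES (s.val)
-- s.src is never stored in #target, but OUTPUT can still read it.
OUTPUT $action, inserted.id, inserted.val, s.src
    INTO #audit (merge_action, new_id, val, src);

-- Each audit row pairs the generated id with the source it came from.
SELECT merge_action, new_id, val, src
FROM #audit;

DROP TABLE #audit;
DROP TABLE #target;
```

    The design trade-off is the one the post already names: this is more verbose than a straight INSERT, but it lets the audit rows carry both the generated identity value and a column that only exists in the source.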

    Read the article

  • Q&A: Will my favourite ORM Foo work with SQL Azure?

    - by Eric Nelson
    Short answer: Quite probably, as SQL Azure is very similar to SQL Server.
    Longer answer: Object Relational Mappers (ORMs) that work with SQL Server are likely, but not guaranteed, to work with SQL Azure. The differences between the RDBMS versions are small, but they may cause problems, for example in tools used to create the mapping between objects and tables, or in SQL generated by the ORM which expects “certain things” :-)
    More specifically:
      - ADO.NET Entity Framework / LINQ to Entities can be used with SQL Azure, but the Visual Studio designer does not currently work. You will need to point the designer at a version of your database running on SQL Server to create the mapping, then change the connection details to run against SQL Azure.
      - LINQ to SQL has similar issues to ADO.NET Entity Framework above.
      - NHibernate can be used against SQL Azure.
      - DevExpress XPO supports SQL Azure from version 9.3.
      - DataObjects.Net supports SQL Azure.
      - Open Access from Telerik works “seamlessly” (their words, not mine :-)
    The list above is by no means comprehensive; please leave a comment with details of other ORMs that work (or do not work) with SQL Azure.
    Related Links: General guidelines and limitations of SQL Azure; SQL Azure vs SQL Server

    Read the article
