Search Results

Search found 38336 results on 1534 pages for 'sql wait types'.

Page 100/1534 | < Previous Page | 96 97 98 99 100 101 102 103 104 105 106 107  | Next Page >

  • System Variables, Stored Procedures or Functions for Meta Data

    - by BuckWoody
    Whenever you want to know something about SQL Server’s configuration, whether that’s the Instance itself or a database, you have a few options. If you want to know “dynamic” data, such as how much memory or CPU is consumed or what a particular query is doing, you should be using the Dynamic Management Views (DMVs) that you can read about here: http://msdn.microsoft.com/en-us/library/ms188754.aspx  But if you’re looking for how much memory is installed on the server, the version of the Instance, the drive letters of the backups and so on, you have other choices. The first of these are system variables. You access these with a SELECT statement, and they are useful when you need a discrete value for use, say in another query or to put into a table. You can read more about those here: http://msdn.microsoft.com/en-us/library/ms173823.aspx You also have a few stored procedures you can use. These often bring back a lot more data, pre-formatted for the screen. You access these with the EXECUTE syntax. It is a bit more difficult to take the data they return and get a single value or place the results in another table, but it is possible. You can read more about those here: http://msdn.microsoft.com/en-us/library/ms187961.aspx Yet another option is to use a system function, which you access with a SELECT statement, which also brings back a discrete value that you can use in a test or to place in another table. You can read about those here: http://msdn.microsoft.com/en-us/library/ms187812.aspx  By the way, many of these constructs simply query from tables in the master or msdb databases for the Instance or the system tables in a user database. You can get much of the information there as well, and there are even system views in each database to show you the meta-data dealing with structure – more on that here: http://msdn.microsoft.com/en-us/library/ms186778.aspx  Some of these choices are the only way to get at a certain piece of data. But others overlap – you can use one or the other, they both come back with the same data. So, like many Microsoft products, you have multiple ways to do the same thing. And that’s OK – just research what each is used for and how it’s intended to be used, and you’ll be able to select (pun intended) the right choice. Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

    Read the article

  • Export all SSIS packages from msdb using Powershell

    - by jamiet
    Have you ever wanted to dump all the SSIS packages stored in msdb out to files? Of course you have, who wouldn’t? Right? Well, at least one person does because this was the subject of a thread (save all ssis packages to file) on the SSIS forum earlier today. Some of you may have already figured out a way of doing this but for those that haven’t here is a nifty little script that will do it for you and it uses our favourite jack-of-all tools … Powershell!! Imagine I have the following package folder structure on my Integration Services server (i.e. in [msdb]): There are two packages in there called “20110111 Chaining Expression components” & “Package”, I want to export those two packages into a folder structure that mirrors that in [msdb]. Here is the Powershell script that will do that:Param($SQLInstance = "localhost") #####Add all the SQL goodies (including Invoke-Sqlcmd)##### add-pssnapin sqlserverprovidersnapin100 -ErrorAction SilentlyContinue add-pssnapin sqlservercmdletsnapin100 -ErrorAction SilentlyContinue cls $Packages = Invoke-Sqlcmd -MaxCharLength 10000000 -ServerInstance $SQLInstance -Query "WITH cte AS ( SELECT cast(foldername as varchar(max)) as folderpath, folderid FROM msdb..sysssispackagefolders WHERE parentfolderid = '00000000-0000-0000-0000-000000000000' UNION ALL SELECT cast(c.folderpath + '\' + f.foldername as varchar(max)), f.folderid FROM msdb..sysssispackagefolders f INNER JOIN cte c ON c.folderid = f.parentfolderid ) SELECT c.folderpath,p.name,CAST(CAST(packagedata AS VARBINARY(MAX)) AS VARCHAR(MAX)) as pkg FROM cte c INNER JOIN msdb..sysssispackages p ON c.folderid = p.folderid WHERE c.folderpath NOT LIKE 'Data Collector%'" Foreach ($pkg in $Packages) { $pkgName = $Pkg.name $folderPath = $Pkg.folderpath $fullfolderPath = "c:\temp\$folderPath\" if(!(test-path -path $fullfolderPath)) { mkdir $fullfolderPath | Out-Null } $pkg.pkg | Out-File -Force -encoding ascii -FilePath "$fullfolderPath\$pkgName.dtsx" } To run it simply change the “localhost” parameter of the server you want to connect to either by editing the script or passing it in when the script is executed. It will create the folder structure in C:\Temp (which you can also easily change if you so wish – just edit the script accordingly). Here’s the folder structure that it created for me: Notice how it is a mirror of the folder structure in [msdb]. Hope this is useful! @Jamiet UPDATE: THis post prompted Chad Miller to write a post describing his Powershell add-in that utilises a SSIS API to do exporting of packages. Go take a read here: http://sev17.com/2011/02/importing-and-exporting-ssis-packages-using-powershell/

    Read the article

  • Export all SSIS packages from msdb using Powershell

    - by jamiet
    Have you ever wanted to dump all the SSIS packages stored in msdb out to files? Of course you have, who wouldn’t? Right? Well, at least one person does because this was the subject of a thread (save all ssis packages to file) on the SSIS forum earlier today. Some of you may have already figured out a way of doing this but for those that haven’t here is a nifty little script that will do it for you and it uses our favourite jack-of-all tools … Powershell!!   Imagine I have the following package folder structure on my Integration Services server (i.e. in [msdb]): There are two packages in there called “20110111 Chaining Expression components” & “Package”, I want to export those two packages into a folder structure that mirrors that in [msdb]. Here is the Powershell script that will do that:   Param($SQLInstance = "localhost") #####Add all the SQL goodies (including Invoke-Sqlcmd)##### add-pssnapin sqlserverprovidersnapin100 -ErrorAction SilentlyContinue add-pssnapin sqlservercmdletsnapin100 -ErrorAction SilentlyContinue cls $Packages = Invoke-Sqlcmd -MaxCharLength 10000000 -ServerInstance $SQLInstance -Query "WITH cte AS ( SELECT cast(foldername as varchar(max)) as folderpath, folderid FROM msdb..sysssispackagefolders WHERE parentfolderid = '00000000-0000-0000-0000-000000000000' UNION ALL SELECT cast(c.folderpath + '\' + f.foldername as varchar(max)), f.folderid FROM msdb..sysssispackagefolders f INNER JOIN cte c ON c.folderid = f.parentfolderid ) SELECT c.folderpath,p.name,CAST(CAST(packagedata AS VARBINARY(MAX)) AS VARCHAR(MAX)) as pkg FROM cte c INNER JOIN msdb..sysssispackages p ON c.folderid = p.folderid WHERE c.folderpath NOT LIKE 'Data Collector%'" Foreach ($pkg in $Packages) { $pkgName = $Pkg.name $folderPath = $Pkg.folderpath $fullfolderPath = "c:\temp\$folderPath\" if(!(test-path -path $fullfolderPath)) { mkdir $fullfolderPath | Out-Null } $pkg.pkg | Out-File -Force -encoding ascii -FilePath "$fullfolderPath\$pkgName.dtsx" }   To run it simply change the “localhost” parameter of the server you want to connect to either by editing the script or passing it in when the script is executed. It will create the folder structure in C:\Temp (which you can also easily change if you so wish – just edit the script accordingly). Here’s the folder structure that it created for me: Notice how it is a mirror of the folder structure in [msdb]. Hope this is useful! @Jamiet

    Read the article

  • Joins in single-table queries

    - by Rob Farley
    Tables are only metadata. They don’t store data. I’ve written something about this before, but I want to take a viewpoint of this idea around the topic of joins, especially since it’s the topic for T-SQL Tuesday this month. Hosted this time by Sebastian Meine (@sqlity), who has a whole series on joins this month. Good for him – it’s a great topic. In that last post I discussed the fact that we write queries against tables, but that the engine turns it into a plan against indexes. My point wasn’t simply that a table is actually just a Clustered Index (or heap, which I consider just a special type of index), but that data access always happens against indexes – never tables – and we should be thinking about the indexes (specifically the non-clustered ones) when we write our queries. I described the scenario of looking up phone numbers, and how it never really occurs to us that there is a master list of phone numbers, because we think in terms of the useful non-clustered indexes that the phone companies provide us, but anyway – that’s not the point of this post. So a table is metadata. It stores information about the names of columns and their data types. Nullability, default values, constraints, triggers – these are all things that define the table, but the data isn’t stored in the table. The data that a table describes is stored in a heap or clustered index, but it goes further than this. All the useful data is going to live in non-clustered indexes. Remember this. It’s important. Stop thinking about tables, and start thinking about indexes. So let’s think about tables as indexes. This applies even in a world created by someone else, who doesn’t have the best indexes in mind for you. I’m sure you don’t need me to explain Covering Index bit – the fact that if you don’t have sufficient columns “included” in your index, your query plan will either have to do a Lookup, or else it’ll give up using your index and use one that does have everything it needs (even if that means scanning it). If you haven’t seen that before, drop me a line and I’ll run through it with you. Or go and read a post I did a long while ago about the maths involved in that decision. So – what I’m going to tell you is that a Lookup is a join. When I run SELECT CustomerID FROM Sales.SalesOrderHeader WHERE SalesPersonID = 285; against the AdventureWorks2012 get the following plan: I’m sure you can see the join. Don’t look in the query, it’s not there. But you should be able to see the join in the plan. It’s an Inner Join, implemented by a Nested Loop. It’s pulling data in from the Index Seek, and joining that to the results of a Key Lookup. It clearly is – the QO wouldn’t call it that if it wasn’t really one. It behaves exactly like any other Nested Loop (Inner Join) operator, pulling rows from one side and putting a request in from the other. You wouldn’t have a problem accepting it as a join if the query were slightly different, such as SELECT sod.OrderQty FROM Sales.SalesOrderHeader AS soh JOIN Sales.SalesOrderDetail as sod on sod.SalesOrderID = soh.SalesOrderID WHERE soh.SalesPersonID = 285; Amazingly similar, of course. This one is an explicit join, the first example was just as much a join, even thought you didn’t actually ask for one. You need to consider this when you’re thinking about your queries. But it gets more interesting. Consider this query: SELECT SalesOrderID FROM Sales.SalesOrderHeader WHERE SalesPersonID = 276 AND CustomerID = 29522; It doesn’t look like there’s a join here either, but look at the plan. That’s not some Lookup in action – that’s a proper Merge Join. The Query Optimizer has worked out that it can get the data it needs by looking in two separate indexes and then doing a Merge Join on the data that it gets. Both indexes used are ordered by the column that’s indexed (one on SalesPersonID, one on CustomerID), and then by the CIX key SalesOrderID. Just like when you seek in the phone book to Farley, the Farleys you have are ordered by FirstName, these seek operations return the data ordered by the next field. This order is SalesOrderID, even though you didn’t explicitly put that column in the index definition. The result is two datasets that are ordered by SalesOrderID, making them very mergeable. Another example is the simple query SELECT CustomerID FROM Sales.SalesOrderHeader WHERE SalesPersonID = 276; This one prefers a Hash Match to a standard lookup even! This isn’t just ordinary index intersection, this is something else again! Just like before, we could imagine it better with two whole tables, but we shouldn’t try to distinguish between joining two tables and joining two indexes. The Query Optimizer can see (using basic maths) that it’s worth doing these particular operations using these two less-than-ideal indexes (because of course, the best indexese would be on both columns – a composite such as (SalesPersonID, CustomerID – and it would have the SalesOrderID column as part of it as the CIX key still). You need to think like this too. Not in terms of excusing single-column indexes like the ones in AdventureWorks2012, but in terms of having a picture about how you’d like your queries to run. If you start to think about what data you need, where it’s coming from, and how it’s going to be used, then you will almost certainly write better queries. …and yes, this would include when you’re dealing with regular joins across multiples, not just against joins within single table queries.

    Read the article

  • Collation errors in business

    - by Rob Farley
    At the PASS Summit last month, I did a set (Lightning Talk) about collation, and in particular, the difference between the “English” spoken by people from the US, Australia and the UK. One of the examples I gave was that in the US drivers might stop for gas, whereas in Australia, they just open the window a little. This is what’s known as a paraprosdokian, where you suddenly realise you misunderstood the first part of the sentence, based on what was said in the second. My current favourite is Emo Phillip’s line “I like to play chess with old men in the park, but it can be hard to find thirty-two of them.” Essentially, this a collation error, one that good comedians can get mileage from. Unfortunately, collation is at its worst when we have a computer comparing two things in different collations. They might look the same, and sound the same, but if one of the things is in SQL English, and the other one is in Windows English, the poor database server (with no sense of humour) will get suspicious of developers (who all have senses of humour, obviously), and declare a collation error, worried that it might not realise some nuance of the language. One example is the common scenario of a case-sensitive collation and a case-insensitive one. One may think that “Rob” and “rob” are the same, but the other might not. Clearly one of them is my name, and the other is a verb which means to steal (people called “Nick” have the same problem, of course), but I have no idea whether “Rob” and “rob” should be considered the same or not – it depends on the collation. I told a lie before – collation isn’t at its worst in the computer world, because the computer has the sense to complain about the collation issue. People don’t. People will say something, with their own understanding of what they mean. Other people will listen, and apply their own collation to it. I remember when someone was asking me about a situation which had annoyed me. They asked if I was ‘pissed’, and I said yes. I meant that I was annoyed, but they were asking if I’d been drinking. It took a moment for us to realise the misunderstanding. In business, the problem is escalated. A business user may explain something in a particular way, using terminology that they understand, but using words that mean something else to a technical person. I remember a situation with a checkbox on a form (back in VB6 days from memory). It was used to indicate that something was approved, and indicated whether a particular database field should store True or False – nothing more. However, the client understood it to mean that an entire workflow system would be implemented, with different users have permission to approve items and more. The project manager I’d just taken over from clearly hadn’t appreciated that, and I faced a situation of explaining the misunderstanding to the client. Lots of fun... Collation errors aren’t just a database setting that you can ignore. You need to remember that Americans speak a different type of English to Aussies and Poms, and techies speak a different language to their clients.

    Read the article

  • Cloud – the forecast is improving

    - by Rob Farley
    There is a lot of discussion about “the cloud”, and how that affects people’s data stories. Today the discussion enters the realm of T-SQL Tuesday, hosted this month by Jorge Segarra. Over the years, companies have invested a lot in making sure that their data is good, and I mean every aspect of it – the quality of it, the security of it, the performance of it, and more. Experts such as those of us at LobsterPot Solutions have helped these companies with this, and continue to work with clients to make sure that data is a strong part of their business, not an oversight. Whether business intelligence systems are being utilised or not, every business needs to be able to rely on its data, and have the confidence in it. Data should be a foundation upon which a business is built. In the past, data had been stored in paper-based systems. Filing cabinets stored vital information. Today, people have server rooms with storage of various kinds, recognising that filing cabinets don’t necessarily scale particularly well. It’s easy to ‘lose’ data in a filing cabinet, when you have people who need to make sure that the sheets of paper are in the right spot, and that you know how things are stored. Databases help solve that problem, but still the idea of a large filing cabinet continues, it just doesn’t involve paper. If something happens to the physical ‘filing cabinet’, then the problems are larger still. Then the data itself is under threat. Many clients have generators in case the power goes out, redundant cables in case the connectivity dies, and spare servers in other buildings just in case they’re required. But still they’re maintaining filing cabinets. You see, people like filing cabinets. There’s something to be said for having your data ‘close’. Even if the data is not in readable form, living as bits on a disk somewhere, the idea that its home is ‘in the building’ is comforting to many people. They simply don’t want to move their data anywhere else. The cloud offers an alternative to this, and the human element is an obstacle. By leveraging the cloud, companies can have someone else look after their filing cabinet. A lot of people really don’t like the idea of this, partly because the administrators of the data, those people who could potentially log in with escalated rights and see more than they should be allowed to, who need to be trusted to respond if there’s a problem, are now a faceless entity in the cloud. But this doesn’t mean that the cloud is bad – this is simply a concern that some people may have. In new functionality that’s on its way, we see other hybrid mechanisms that mean that people can leverage parts of the cloud with less fear. Companies can use cloud storage to hold their backup data, for example, backups that have been encrypted and are therefore not able to be read by anyone (including administrators) who don’t have the right password. Companies can have a database instance that runs locally, but which has its data files in the cloud, complete with Transparent Data Encryption if needed. There can be a higher level of control, making the change easier to accept. Hybrid options allow people who have had fears (potentially very justifiable) to take a new look at the cloud, and to start embracing some of the benefits of the cloud (such as letting someone else take care of storage, high availability, and more) without losing the feeling of the data being close. @rob_farley

    Read the article

  • My server is slower than the average user's computer, should I still offload Access queries to SQL Server? [closed]

    - by andrewb
    Possible Duplicate: How do you do Load Testing and Capacity Planning for Databases I have a database set up with MS Access 2007 front ends and an SQL Server 2005 back end. At the moment, all the queries are saved in the front end as I've only recently moved to an SQL Server backend. I'm wondering how much of those queries I should save as stored procedures/views on SQL Server. About the system The number of concurrent users is only a handful, though it could be as high as 25 at one time (very unlikely). The average computer has an Intel i3-2120 CPU running at 3.3 GHz, which gets a PassMark score of 3,987, whilst the server has an Intel Xeon E5335 running at 2.0 GHz, which gets a PassMark score of 2,637. Always an awkward situation when an i3 outperforms a Xeon... though the i3 is from Q1 2011 and the Xeon is Q2 2009. There is potential for a server upgrade in the future, though it wouldn't come easy. I'm inclined to move the queries to the back end, as they are beginning to take noticeable time and I figure that is a better way of doing things. I like the idea of throwing everything at the server, then pushing for a server upgrade. It makes more sense in my mind to be upgrading one server rather than 30 PCs. Or am I being overzealous? Why my question isn't a duplicate It seems that my question has been misinterpreted and labelled a duplicate of quite a different question, one about testing and capacity planning. I'll try explain how my question is very different from the linked question. The crux of my question is something like "Even though my server is technically slower, is it better to have it doing more of the queries?" There's two ways that people could have answered this: I agree the server is going to be slower, but the extra benefits of such and such (like the less Access the better) means you should move most to the server anyway. (OR no it doesn't outweigh the benefit, keep them in Access) Actually the server will be faster because of such and such. I'm hoping that people out there could provide some answers like this, and the question in the dupe link doesn't really provide either of these answers. Ok sure, I suppose I could do extensive performance testing to compare Access queries running on a local machine to SQL Server queries running on the server, but that sounds like a very hard task (particularly performance testing of access) compared to someone giving some quick general guidance, and again, my question is looking for a lot more than immediate performance benefit.

    Read the article

  • Install Sql Server Developer Edition 32-bit (or Enterprise Edition) on Windows 7 Home Premium 64-bit

    - by ali62b
    Is there any work around to Successfully install SQL server 2008 32-bit on Windows 7 Home premium 64-bit ? If this is the case I first installed VS 2008 SP 1 on my machine and when I click on install.exe file for installing SQL Server 2008 (Developer Edition) I get an error related to .NET Framework version which is installed already on my PC. { I get the same error trying to install Enterprise Edition}

    Read the article

  • How do I get SQL Profiler to show statements with column names like 'password'?

    - by Kev
    I'm profiling a database just now and need to see the UPDATE and INSERT statements being executed on a particular table. However, because the table has a 'Password' column the SQL Profiler is being understandingly cautious and replacing the TextData column with: -- 'password' was found in the text of this event. -- The text has been replaced with this comment for security reasons. How do I prevent it doing this because I need to see the SQL statement being executed?

    Read the article

  • SQL Server Express with Advanced Services (with Reporting Services)???

    - by Fretwizard
    I have tried to download SQL Server 2005 Express edition about 4 times trying to find the correct version that has business intelligence studio and reporting services in it? Every time I try to unhide the advanced configuration during install, it's never there... Can anyone point me to the correct download? Looking for 2005 (not 2008) because my work SQL server that I am trying to learn this for is 2005, and the training material I have is for 2005 and VS 2008 does not want to integrate with SQL2008 express.

    Read the article

  • Alias a linked Server in SQL server management studio?

    - by absentmindeduk
    Hoping someone can help - is there a way in SQL server management studio 2008 R2 that I can alias a linked SQL server? I have a server, added by IP address, to which I do not have the login credentials - however as the connection is already setup I can login ok. Issue is that, this is a dev environment, prior to a live deployment and the IP I have as a linked server needs to be 'accessible' by my stored procs under a different name, eg 'myserver' not 192.168.xxx.xxx... Any help much appreciated.

    Read the article

  • SSIS - XML Source Script

    - by simonsabin
    The XML Source in SSIS is great if you have a 1 to 1 mapping between entity and table. You can do more complex mapping but it becomes very messy and won't perform. What other options do you have? The challenge with XML processing is to not need a huge amount of memory. I remember using the early versions of Biztalk with loaded the whole document into memory to map from one document type to another. This was fine for small documents but was an absolute killer for large documents. You therefore need a streaming approach. For flexibility however you want to be able to generate your rows easily, and if you've ever used the XmlReader you will know its ugly code to write. That brings me on to LINQ. The is an implementation of LINQ over XML which is really nice. You can write nice LINQ queries instead of the XMLReader stuff. The downside is that by default LINQ to XML requires a whole XML document to work with. No streaming. Your code would look like this. We create an XDocument and then enumerate over a set of annoymous types we generate from our LINQ statement XDocument x = XDocument.Load("C:\\TEMP\\CustomerOrders-Attribute.xml");   foreach (var xdata in (from customer in x.Elements("OrderInterface").Elements("Customer")                        from order in customer.Elements("Orders").Elements("Order")                        select new { Account = customer.Attribute("AccountNumber").Value                                   , OrderDate = order.Attribute("OrderDate").Value }                        )) {     Output0Buffer.AddRow();     Output0Buffer.AccountNumber = xdata.Account;     Output0Buffer.OrderDate = Convert.ToDateTime(xdata.OrderDate); } As I said the downside to this is that you are loading the whole document into memory. I did some googling and came across some helpful videos from a nice UK DPE Mike Taulty http://www.microsoft.com/uk/msdn/screencasts/screencast/289/LINQ-to-XML-Streaming-In-Large-Documents.aspx. Which show you how you can combine LINQ and the XmlReader to get a semi streaming approach. I took what he did and implemented it in SSIS. What I found odd was that when I ran it I got different numbers between using the streamed and non streamed versions. I found the cause was a little bug in Mikes code that causes the pointer in the XmlReader to progress past the start of the element and thus foreach (var xdata in (from customer in StreamReader("C:\\TEMP\\CustomerOrders-Attribute.xml","Customer")                                from order in customer.Elements("Orders").Elements("Order")                                select new { Account = customer.Attribute("AccountNumber").Value                                           , OrderDate = order.Attribute("OrderDate").Value }                                ))         {             Output0Buffer.AddRow();             Output0Buffer.AccountNumber = xdata.Account;             Output0Buffer.OrderDate = Convert.ToDateTime(xdata.OrderDate);         } These look very similiar and they are the key element is the method we are calling, StreamReader. This method is what gives us streaming, what it does is return a enumerable list of elements, because of the way that LINQ works this results in the data being streamed in. static IEnumerable<XElement> StreamReader(String filename, string elementName) {     using (XmlReader xr = XmlReader.Create(filename))     {         xr.MoveToContent();         while (xr.Read()) //Reads the first element         {             while (xr.NodeType == XmlNodeType.Element && xr.Name == elementName)             {                 XElement node = (XElement)XElement.ReadFrom(xr);                   yield return node;             }         }         xr.Close();     } } This code is specifically designed to return a list of the elements with a specific name. The first Read reads the root element and then the inner while loop checks to see if the current element is the type we want. If not we do the xr.Read() again until we find the element type we want. We then use the neat function XElement.ReadFrom to read an element and all its sub elements into an XElement. This is what is returned and can be consumed by the LINQ statement. Essentially once one element has been read we need to check if we are still on the same element type and name (the inner loop) This was Mikes mistake, if we called .Read again we would advance the XmlReader beyond the start of the Element and so the ReadFrom method wouldn't work. So with the code above you can use what ever LINQ statement you like to flatten your XML into the rowsets you want. You could even have multiple outputs and generate your own surrogate keys.        

    Read the article

  • The Data Scientist

    - by BuckWoody
    A new term - well, perhaps not that new - has come up and I’m actually very excited about it. The term is Data Scientist, and since it’s new, it’s fairly undefined. I’ll explain what I think it means, and why I’m excited about it. In general, I’ve found the term deals at its most basic with analyzing data. Of course, we all do that, and the term itself in that definition is redundant. There is no science that I know of that does not work with analyzing lots of data. But the term seems to refer to more than the common practices of looking at data visually, putting it in a spreadsheet or report, or even using simple coding to examine data sets. The term Data Scientist (as far as I can make out this early in it’s use) is someone who has a strong understanding of data sources, relevance (statistical and otherwise) and processing methods as well as front-end displays of large sets of complicated data. Some - but not all - Business Intelligence professionals have these skills. In other cases, senior developers, database architects or others fill these needs, but in my experience, many lack the strong mathematical skills needed to make these choices properly. I’ve divided the knowledge base for someone that would wear this title into three large segments. It remains to be seen if a given Data Scientist would be responsible for knowing all these areas or would specialize. There are pretty high requirements on the math side, specifically in graduate-degree level statistics, but in my experience a company will only have a few of these folks, so they are expected to know quite a bit in each of these areas. Persistence The first area is finding, cleaning and storing the data. In some cases, no cleaning is done prior to storage - it’s just identified and the cleansing is done in a later step. This area is where the professional would be able to tell if a particular data set should be stored in a Relational Database Management System (RDBMS), across a set of key/value pair storage (NoSQL) or in a file system like HDFS (part of the Hadoop landscape) or other methods. Or do you examine the stream of data without storing it in another system at all? This is an important decision - it’s a foundation choice that deals not only with a lot of expense of purchasing systems or even using Cloud Computing (PaaS, SaaS or IaaS) to source it, but also the skillsets and other resources needed to care and feed the system for a long time. The Data Scientist sets something into motion that will probably outlast his or her career at a company or organization. Often these choices are made by senior developers, database administrators or architects in a company. But sometimes each of these has a certain bias towards making a decision one way or another. The Data Scientist would examine these choices in light of the data itself, starting perhaps even before the business requirements are created. The business may not even be aware of all the strategic and tactical data sources that they have access to. Processing Once the decision is made to store the data, the next set of decisions are based around how to process the data. An RDBMS scales well to a certain level, and provides a high degree of ACID compliance as well as offering a well-known set-based language to work with this data. In other cases, scale should be spread among multiple nodes (as in the case of Hadoop landscapes or NoSQL offerings) or even across a Cloud provider like Windows Azure Table Storage. In fact, in many cases - most of the ones I’m dealing with lately - the data should be split among multiple types of processing environments. This is a newer idea. Many data professionals simply pick a methodology (RDBMS with Star Schemas, NoSQL, etc.) and put all data there, regardless of its shape, processing needs and so on. A Data Scientist is familiar not only with the various processing methods, but how they work, so that they can choose the right one for a given need. This is a huge time commitment, hence the need for a dedicated title like this one. Presentation This is where the need for a Data Scientist is most often already being filled, sometimes with more or less success. The latest Business Intelligence systems are quite good at allowing you to create amazing graphics - but it’s the data behind the graphics that are the most important component of truly effective displays. This is where the mathematics requirement of the Data Scientist title is the most unforgiving. In fact, someone without a good foundation in statistics is not a good candidate for creating reports. Even a basic level of statistics can be dangerous. Anyone who works in analyzing data will tell you that there are multiple errors possible when data just seems right - and basic statistics bears out that you’re on the right track - that are only solvable when you understanding why the statistical formula works the way it does. And there are lots of ways of presenting data. Sometimes all you need is a “yes” or “no” answer that can only come after heavy analysis work. In that case, a simple e-mail might be all the reporting you need. In others, complex relationships and multiple components require a deep understanding of the various graphical methods of presenting data. Knowing which kind of chart, color, graphic or shape conveys a particular datum best is essential knowledge for the Data Scientist. Why I’m excited I love this area of study. I like math, stats, and computing technologies, but it goes beyond that. I love what data can do - how it can help an organization. I’ve been fortunate enough in my professional career these past two decades to work with lots of folks who perform this role at companies from aerospace to medical firms, from manufacturing to retail. Interestingly, the size of the company really isn’t germane here. I worked with one very small bio-tech (cryogenics) company that worked deeply with analysis of complex interrelated data. So  watch this space. No, I’m not leaving Azure or distributed computing or Microsoft. In fact, I think I’m perfectly situated to investigate this role further. We have a huge set of tools, from RDBMS to Hadoop to allow me to explore. And I’m happy to share what I learn along the way.

    Read the article

  • .Net Windows Service Throws EventType clr20r3 system.data.sqlclient.sql error

    - by William Edmondson
    I have a .Net/c# 2.0 windows service. The entry point is wrapped in a try catch block yet when I look at the server's application event log I seem a number of "EventType clr20r3" errors that are causing the service to die unexpectedly. The catch block has a "catch (Exception ex)". Each sql commands is of the type "CommandType.StoredProcedure" and are executed with SqlDataReader's. These sproc calls function correctly 99% of time and have all been thoroughly unit tested, profiled, and QA'd. I additionally wrapped these calls in try catch blocks just to be sure and am still experiencing these unhandled exceptions. This only in our production environment and cannot be duplicated in our dev or staging environments (even under heavy load). Why would my error handling not catch this particular error? Is there anyway to capture more detail as to the root cause of the problem? Here is an example of the event log: EventType clr20r3, P1 RDC.OrderProcessorService, P2 1.0.0.0, P3 4ae6a0d0, P4 system.data, P5 2.0.0.0, P6 4889deaf, P7 2490, P8 2c, P9 system.data.sqlclient.sql, P10 NIL. Additionally The Order Processor service terminated unexpectedly. It has done this 1 time(s). The following corrective action will be taken in 60000 milliseconds: Restart the service.

    Read the article

  • LINQ To SQL ignore unique constraint exception and continue

    - by Martin
    I have a single table in a database called Users Users ------ ID (PK, Identity) Username (Unique Index) I have setup a unique index on the Username table to prevent duplicates. I am then enumerating through a collection and creating a new user in the database for each item. What I want to do is just insert a new user and ignore the exception if the unique key constraint is violated (as it's clearly a duplicate record in that case). This is to avoid having to craft where not exists kind of queries. First off, is this going to be any more efficient or should my insert code be checking for duplicates instead? I'm drawn more to the database having that logic as this prevents any other type of client from inserting duplicate data. My other issue is related to LINQ To SQL. I have the following code: public class TestRepo { DatabaseDataContext database = new DatabaseDataContext(); public void Add(string username) { database.Users.InsertOnSubmit(new User() { Username = username }); } public void Save() { database.SubmitChanges(); } } And then I iterate over a collection and insert new users, ignoring any exceptions: TestRepo repo = new TestRepo(); foreach (var name in new string[] { "Tim", "Bob", "John" }) { try { repo.Add(name); repo.Save(); } catch { } } The first time this is run, great I have three users in the table. If I remove the second one and run this code again, nothing is inserted. I expected the first insert to fail with the exception, the second to succeed (as I just removed that item from the DB) and the third to then fail. What seems to be happening is that once the SqlException is thrown (even though the loop continues to iterate) all of the next inserts fail - even when there isn't a row in the table that would cause a unique violation. Can anyone explain this? P.S. The only workaround I could find was to instantiate the repo each time before the insert, then it worked exactly as excepted - indicating that it's something to do with the LINQ To SQL DataContext. Thanks.

    Read the article

  • Error Handling in T-SQL Scalar Function

    - by hydroparadise
    Ok.. this question could easily take multiple paths, so I will hit the more specific path first. While working with SQL Server 2005, I'm trying to create a scalar funtion that acts as a 'TryCast' from varchar to int. Where I encounter a problem is when I add a TRY block in the function; CREATE FUNCTION u_TryCastInt ( @Value as VARCHAR(MAX) ) RETURNS Int AS BEGIN DECLARE @Output AS Int BEGIN TRY SET @Output = CONVERT(Int, @Value) END TRY BEGIN CATCH SET @Output = 0 END CATCH RETURN @Output END Turns out theres all sorts of things wrong with this statement including "Invalid use of side-effecting or time-dependent operator in 'BEGIN TRY' within a function" and "Invalid use of side-effecting or time-dependent operator in 'END TRY' within a function". I can't seem to find any examples of using try statements within a scalar function, which got me thinking, is error handling in a function is possible? The goal here is to make a robust version of the Convert or Cast functions to allow a SELECT statement carry through depsite conversion errors. For example, take the following; CREATE TABLE tblTest ( f1 VARCHAR(50) ) GO INSERT INTO tblTest(f1) VALUES('1') INSERT INTO tblTest(f1) VALUES('2') INSERT INTO tblTest(f1) VALUES('3') INSERT INTO tblTest(f1) VALUES('f') INSERT INTO tblTest(f1) VALUES('5') INSERT INTO tblTest(f1) VALUES('1.1') SELECT CONVERT(int,f1) AS f1_num FROM tblTest DROP TABLE tblTest It never reaches point of dropping the table because the execution gets hung on trying to convert 'f' to an integer. I want to be able to do something like this; SELECT u_TryCastInt(f1) AS f1_num FROM tblTest fi_num __________ 1 2 3 0 5 0 Any thoughts on this? Is there anything that exists that handles this? Also, I would like to try and expand the conversation to support SQL Server 2000 since Try blocks are not an option in that scenario. Thanks in advance.

    Read the article

  • Complex SQL Query similar to a z order problem

    - by AaronLS
    I have a complex SQL problem in MS SQL Server, and in drawing on a piece of paper I realized that I could think of it as a single bar filled with rectangles, each rectangle having segments with different Z orders. In reality it has nothing to do with z order or graphics at all, but more to do with some complex business rules that would be difficult to explain. Howoever, if anyone has ideas on how to solve the below that will give me my solution. I have the following data: ObjectID, PercentOfBar, ZOrder (where smaller is closer) A, 100, 6 B, 50, 5 B, 50, 4 C, 30, 3 C, 70, 6 The result of my query that I want is this, in any order: PercentOfBar, ZOrder 50, 5 20, 4 30, 3 Think of it like this, if I drew rectangle A, it would fill 100% of the bar and have a z order of 6. 66666666666 AAAAAAAAAAA If I then layed out rectangle B, consisting of two segments, both segments would cover up rectangle A resulting in the following rendering: 4444455555 BBBBBBBBBB As a rule of thumb, for a given rectangle, it's segments should be layed out such that the highest z order is to the right of the lower Z orders. Finally rectangle C would cover up only portions of Rectangle B with it's 30% segment that is z order 3, which would be on the left. You can hopefully see how the is represented in the output dataset I listed above: 3334455555 CCCBBBBBBB Now to make things more complicated I actually have a 4th column such that this grouping occurs for each key: Input: SomeKey, ObjectID, PercentOfBar, ZOrder (where smaller is closer) X, A, 100, 6 X, B, 50, 5 X, B, 50, 4 X, C, 30, 3 X, C, 70, 6 Y, A, 100, 6 Z, B, 50, 2 Z, B, 50, 6 Z, C, 100, 5 Output: SomeKey, PercentOfBar, ZOrder X, 50, 5 X, 20, 4 X, 30, 3 Y, 100, 6 Z, 50, 2 Z, 50, 5 Notice in the output, the PercentOfBar for each SomeKey would add up to 100%. This is one I know I'm going to be thinking about when I go to bed tonight. Just to be explicit and have a question: What would be a query that would produce the results described above?

    Read the article

  • Atomic UPSERT in SQL Server 2005

    - by rabidpebble
    What is the correct pattern for doing an atomic "UPSERT" (UPDATE where exists, INSERT otherwise) in SQL Server 2005? I see a lot of code on SO (e.g. see http://stackoverflow.com/questions/639854/tsql-check-if-a-row-exists-otherwise-insert) with the following two-part pattern: UPDATE ... FROM ... WHERE <condition> -- race condition risk here IF @@ROWCOUNT = 0 INSERT ... or IF (SELECT COUNT(*) FROM ... WHERE <condition>) = 0 -- race condition risk here INSERT ... ELSE UPDATE ... where will be an evaluation of natural keys. None of the above approaches seem to deal well with concurrency. If I cannot have two rows with the same natural key, it seems like all of the above risk inserting rows with the same natural keys in race condition scenarios. I have been using the following approach but I'm surprised not to see it anywhere in people's responses so I'm wondering what is wrong with it: INSERT INTO <table> SELECT <natural keys>, <other stuff...> FROM <table> WHERE NOT EXISTS -- race condition risk here? ( SELECT 1 FROM <table> WHERE <natural keys> ) UPDATE ... WHERE <natural keys> (Note: I'm assuming that rows will not be deleted from this table. Although it would be nice to discuss how to handle the case where they can be deleted -- are transactions the only option? Which level of isolation?) Is this atomic? I can't locate where this would be documented in SQL Server documentation.

    Read the article

  • Fastest way to remove non-numeric characters from a VARCHAR in SQL Server

    - by Dan Herbert
    I'm writing an import utility that is using phone numbers as a unique key within the import. I need to check that the phone number does not already exist in my DB. The problem is that phone numbers in the DB could have things like dashes and parenthesis and possibly other things. I wrote a function to remove these things, the problem is that it is slow and with thousands of records in my DB and thousands of records to import at once, this process can be unacceptably slow. I've already made the phone number column an index. I tried using the script from this post: http://stackoverflow.com/questions/52315/t-sql-trim-nbsp-and-other-non-alphanumeric-characters But that didn't speed it up any. Is there a faster way to remove non-numeric characters? Something that can perform well when 10,000 to 100,000 records have to be compared. Whatever is done needs to perform fast. Update Given what people responded with, I think I'm going to have to clean the fields before I run the import utility. To answer the question of what I'm writing the import utility in, it is a C# app. I'm comparing BIGINT to BIGINT now, with no need to alter DB data and I'm still taking a performance hit with a very small set of data (about 2000 records). Could comparing BIGINT to BIGINT be slowing things down? I've optimized the code side of my app as much as I can (removed regexes, removed unneccessary DB calls). Although I can't isolate SQL as the source of the problem anymore, I still feel like it is.

    Read the article

  • Can I select 0 columns in SQL Server?

    - by Woody Zenfell III
    I am hoping this question fares a little better than the similar Create a table without columns. Yes, I am asking about something that will strike most as pointlessly academic. It is easy to produce a SELECT result with 0 rows (but with columns), e.g. SELECT a = 1 WHERE 1 = 0. Is it possible to produce a SELECT result with 0 columns (but with rows)? e.g. something like SELECT NO COLUMNS FROM Foo. (This is not valid T-SQL.) I came across this because I wanted to insert several rows without specifying any column data for any of them. e.g. (SQL Server 2005) CREATE TABLE Bar (id INT NOT NULL IDENTITY PRIMARY KEY) INSERT INTO Bar SELECT NO COLUMNS FROM Foo -- Invalid column name 'NO'. -- An explicit value for the identity column in table 'Bar' can only be specified when a column list is used and IDENTITY_INSERT is ON. One can insert a single row without specifying any column data, e.g. INSERT INTO Foo DEFAULT VALUES. One can query for a count of rows (without retrieving actual column data from the table), e.g. SELECT COUNT(*) FROM Foo. (But that result set, of course, has a column.) I tried things like INSERT INTO Bar () SELECT * FROM Foo -- Parameters supplied for object 'Bar' which is not a function. -- If the parameters are intended as a table hint, a WITH keyword is required. and INSERT INTO Bar DEFAULT VALUES SELECT * FROM Foo -- which is a standalone INSERT statement followed by a standalone SELECT statement. I can do what I need to do a different way, but the apparent lack of consistency in support for degenerate cases surprises me. I read through the relevant sections of BOL and didn't see anything. I was surprised to come up with nothing via Google either.

    Read the article

  • How to Convert using of SqlLit to Simple SQL command in C#

    - by Nasser Hajloo
    I want to get start with DayPilot control I do not use SQLLite and this control documented based on SQLLite. I want to use SQL instead of SQL Lite so if you can, please do this for me. main site with samples http://www.daypilot.org/calendar-tutorial.html The database contains a single table with the following structure CREATE TABLE event ( id VARCHAR(50), name VARCHAR(50), eventstart DATETIME, eventend DATETIME); Loading Events private DataTable dbGetEvents(DateTime start, int days) { SQLiteDataAdapter da = new SQLiteDataAdapter("SELECT [id], [name], [eventstart], [eventend] FROM [event] WHERE NOT (([eventend] <= @start) OR ([eventstart] >= @end))", ConfigurationManager.ConnectionStrings["db"].ConnectionString); da.SelectCommand.Parameters.AddWithValue("start", start); da.SelectCommand.Parameters.AddWithValue("end", start.AddDays(days)); DataTable dt = new DataTable(); da.Fill(dt); return dt; } Update private void dbUpdateEvent(string id, DateTime start, DateTime end) { using (SQLiteConnection con = new SQLiteConnection(ConfigurationManager.ConnectionStrings["db"].ConnectionString)) { con.Open(); SQLiteCommand cmd = new SQLiteCommand("UPDATE [event] SET [eventstart] = @start, [eventend] = @end WHERE [id] = @id", con); cmd.Parameters.AddWithValue("id", id); cmd.Parameters.AddWithValue("start", start); cmd.Parameters.AddWithValue("end", end); cmd.ExecuteNonQuery(); } }

    Read the article

  • Linq to SQL not inserting data onto the DB

    - by Jesus Rodriguez
    Hello! I have a little / weird behaviour here and Im looking over internet and SO and I didn't find a response. I have to admit that this is my first time using databases, I know how to use them with SQL but never used it actually. Anyway, I have a problem with my app inserting data, I just created a very simple project for testing that and no solution yet. I have an example database with Sql Server Id - int (identity primary key) Name - nchar(10) (not null) The table is called "Person", simple as pie. I have this: static void Main(string[] args) { var db = new ExampleDBDataContext {Log = Console.Out}; var jesus = new Person {Name = "Jesus"}; db.Persons.InsertOnSubmit(jesus); db.SubmitChanges(); var query = from person in db.Persons select person; foreach (var p in query) { Console.WriteLine(p.Name); } } As you can see, nothing extrange. It show Jesus in the console. But if you see the table data, there is no data, just empty. I comment the object creation and insertion and the foreach doesn't print a thing (normal, there is no data in the database) The weird thing is that I created a row in the database manually and the Id was 2 and no 1 (Was the linq really playing with the database but it didn't create the row?) There is the log: INSERT INTO [dbo].Person VALUES (@p0) SELECT CONVERT(Int,SCOPE_IDENTITY()) AS [value] -- @p0: Input NChar (Size = 10; Prec = 0; Scale = 0) [Jesus] -- Context: SqlProvider(Sql2005) Model: AttributedMetaModel Build: 3.5.30729.4926 SELECT [t0].[Id], [t0].[Name] FROM [dbo].[Person] AS [t0] -- Context: SqlProvider(Sql2005) Model: AttributedMetaModel Build: 3.5.30729.4926 I am really confused, All the blogs / books use this kind of snippet to insert an element to a database. Thank you for helping.

    Read the article

  • integer Max value constants in SQL Server T-SQL?

    - by AaronLS
    Are there any constants in T-SQL like there are in some other languages that provide the max and min values ranges of data types such as int? I have a code table where each row has an upper and lower range column, and I need an entry that represents a range where the upper range is the maximum value an int can hold(sort of like a hackish infinity). I would prefer not to hard code it and instead use something like SET UpperRange = int.Max

    Read the article

  • SQL Server 2005 Reporting Services and the Report Viewer

    - by Kendra
    I am having an issue embedding my report into an aspx page. Here's my setup: 1 Server running SQL Server 2005 and SQL Server 2005 Reporting Services 1 Workstation running XP and VS 2005 The server is not on a domain. Reporting Services is a default installation. I have one report called TestMe in a folder called TestReports using a shared datasource. If I view the report in Report Manager, it renders fine. If I view the report using the http ://myserver/reportserver url it renders fine. If I view the report using the http ://myserver/reportserver?/TestReports/TestMe it renders fine. If I try to view the report using http ://myserver/reportserver/TestReports/TestMe, it just goes to the folder navigation page of the home directory. My web application is impersonating somebody specific to get around the server not being on a domain. When I call the report from the report viewer using http ://myserver/reportserver as the server and /TestReports/TestMe as the path I get this error: For security reasons DTD is prohibited in this XML document. To enable DTD processing set the ProhibitDtd property on XmlReaderSettings to false and pass the settings into XmlReader.Create method. When I change the server to http ://myserver/reportserver? I get this error when I run the report: Client found response content type of '', but expected 'text/xml'. The request failed with an empty response. I have been searching for a while and haven't found anything that fixes my issue. Please let me know if there is more information needed. Thanks in advance, Kendra

    Read the article

< Previous Page | 96 97 98 99 100 101 102 103 104 105 106 107  | Next Page >