Search Results

Search found 794 results on 32 pages for 'ssis'.

Page 3/32 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

SSIS Reporting Pack v0.4 – Execution Report updated

- by jamiet

SSIS Reporting Pack is a suite of reports that I maintain at http://ssisreportingpack.codeplex.com/ that provide visualisation over the SSIS Catalog in SQL Server 2012 and attempt to add value over the reports that ship in the box. Work on the reports has stalled (my last SSIS Reporting Pack blog post was on 4th September 2011) as I’ve had rather more important things going on my life of late however I have recently checked-in a fix that couldn’t really be delayed. I discovered a problem with the Execution report that was causing the report to effectively hang, it was caused by this bit of SQL hidden away in the report definition: [generated_executables] AS ( SELECT [new_executable].[execution_path],[new_executable].[parent_execution_path] FROM ( SELECT [execution_path] = SUBSTRING([loop_iteration].[execution_path] ,1, [loop_iteration].length_exec_path - [loop_iteration].[char_index_close_square] + 1) , [parent_execution_path] = SUBSTRING([loop_iteration].[execution_path] ,1, [loop_iteration].length_exec_path - [loop_iteration].[char_index_open_square]) FROM ( SELECT [execution_path] , [char_index_open_square] = CHARINDEX('[',REVERSE([execution_path]),1) , [char_index_close_square] = CHARINDEX(']',REVERSE([execution_path]),1) , [length_exec_path] = LEN([execution_path]) FROM [exec_stats] es WHERE execution_path LIKE '%\[%]%' ESCAPE '\' )AS [loop_iteration] ) AS [new_executable] GROUP BY [new_executable].[execution_path],[new_executable].[parent_execution_path]) It was there because SSIS does not currently treat a loop iteration as an executable yet I figured there was still value in being able to view it as such – this SQL essentially “invents” new executables for those loop iterations; its what enabled the following visualisation: where each of the three iterations of a For Each Loop called “FEL Loop over top performing regions” appear in the report. Unfortunately, as I alluded, this could under certain circumstances (most likely when there were many loop iterations) cause the report to hang as it waited for the results to be constructed and returned. The change that I have made eradicates this generation of “fake” executables and thus produces this visualisation instead: Notice that the three “children” of the For Each Loop are no longer the three iterations but actually the task (“EPT Call Data Export Package”) contained within that For Each Loop. The problem here is of course that there is no longer a visual distinction between those three iterations; I have instead made the full execution path viewable via a tooltip: If you preferred the “old” way of presenting this information and are happy to put up with the performance degradation then I have kept the old version of the report hanging around in the reporting pack as “execution loop with iterations” however none of the other reports link to it so you will have to browse to it manually if you want to use it. Please let me know if you ARE using it – I would be very interested to hear about your experiences. The last change to make you aware of in the execution report is that by default I no longer show OnPreValidate or OnPostValidate messages as I consider them to be superfluous and only serve to clutter up the results. If you want to put them back, well, its open source so go right ahead! The latest release of SSIS Reporting Pack that contains all of these changes is v0.4 and can be downloaded from http://ssisreportingpack.codeplex.com/releases/view/88178 Feedback on all of the above changes would be very much appreciated. @Jamiet

Read the article
Working with Decimal fields in SSIS

- by CoffeeAddict

I'm using SQL Server 2008 w/SP2. I've got an incoming decimal(9,2) field incoming through my OLE DB transformation to my recordset destination transformation. It's like it's reading it as something other than a decimal? I don't know..I'm not an SSIS guru. So continuing on...the problem I have starts here with me trying to stuff the value into a variable for this decimal field. In a foreach loop, I have a variable to represent this decimal field so I can work with it. The first problem that I believe is pretty well known is SSIS variables do not have a decimal type. And from my own testing and what I've read out there, people are using type object for the variable to make SSIS "happy" with decimal values? It makes mine happy. But, then in my foreach loop, I have a for loop. And inside that I'm using an E*xecute SQL Task transformation*. In it, I need to create a parameter mapping to my variable so I can work with that decimal field in my T-SQL call in here. So now I see a type decimal for the parameter and use it and set that to point to my variable. When I run SSIS and it hits my SQL call, I get this in my output window.: The type is not supported.DBTYPE_DECIMAL So I am hitting a wall here. All I wanna do is work with a decimal!!!

Read the article
ODBC in SSIS 2012

- by jamiet

In August 2011 the SQL Server client team published a blog post entitled Microsoft is Aligning with ODBC for Native Relational Data Access in which they basically said "OLE DB is the past, ODBC is the future. Deal with it.". From that blog post:We encourage you to adopt ODBC in the development of your new and future versions of your application. You don’t need to change your existing applications using OLE DB, as they will continue to be supported on Denali throughout its lifecycle. While this gives you a large window of opportunity for changing your applications before the deprecation goes into effect, you may want to consider migrating those applications to ODBC as a part of your future roadmap.I recently undertook a project using SSIS2012 and heeded that advice by opting to use ODBC Connection Managers rather than OLE DB Connection Managers. Unfortunately my finding was that the ODBC Connection Manager is not yet ready for primetime use in SSIS 2012. The main issue I found was that you can't populate an Object variable with a recordset when using an Execute SQL Task connecting to an ODBC data source; any attempt to do so will result in an error:"Disconnected recordsets are not available from ODBC connections." I have filed a bug on Connect at ODBC Connection Manager does not have same funcitonality as OLE DB. For this reason I strongly recommend that you don't make the move to ODBC Connection Managers in SSIS just yet - best to wait for the next version of SSIS before doing that.I found another couple of issues with the ODBC Connection Manager that are worth keeping in mind:It doesn't recognise System Data Source Names (DSNs), only User DSNs (bug filed at ODBC System DSNs are not available in the ODBC Connection Manager) UPDATE: According to a comment on that Connect item this may only be a problem on 64bit.In the OLE DB Connection Manager parameter ordinals are 0-based, in the ODBC Connection Manager they are 1-based (oh I just can't wait for the upgrade mess that ensues from this one!!!)You have been warned!@jamiet

Read the article
SSIS Dashboard v0.4

- by Davide Mauri

Following the post on SSISDB script on Gist, I’ve been working on a HTML5 SSIS Dashboard, in order to have a nice looking, user friendly and, most of all, useful, SSIS Dashboard. Since this is a “spare-time” project, I’ve decided to develop it using Python since it’s THE data language (R aside), it’s a beautiful & powerful, well established and well documented and with a rich ecosystem around. Plus it has full support in Visual Studio, through the amazing Python Tools For Visual Studio plugin, I decided also to use Flask, a very good micro-framework to create websites, and use the SB Admin 2.0 Bootstrap admin template, since I’m all but a Web Designer. The result is here: https://github.com/yorek/ssis-dashboard and I can say I’m pretty satisfied with the work done so far (I’ve worked on it for probably less than 24 hours). Though there’s some features I’d like to add in t future (historical execution time, some charts, connection with AzureML to do prediction on expected execution times) it’s already usable. Of course I’ve tested it only on my development machine, so check twice before putting it in production but, give the fact that, virtually, there is not installation needed (you only need to install Python), and that all queries are based on standard SSISDB objects, I expect no big problems. If you want to test, contribute and/or give feedback please fell free to do it…I would really love to see this little project become a community project! Enjoy!

Read the article
The one feature that would make me invest in SSIS 2012

- by Peter Larsson

This week I was invited my Microsoft to give two presentations in Slovenia. My presentations went well and I had good energy and the audience was interacting with me. When I had some time over from networking and partying, I attended a few other presentations. At least the ones who where held in English. One of these was "SQL Server Integration Services 2012 - All the News, and More", given by Davide Mauri, a fellow co-worker from SolidQ. We started to talk and soon came into the details of the new things in SSIS 2012. All of the official things Davide talked about are good stuff, but for me, the best thing is one he didn't cover in his presentation. In earlier versions of SSIS than 2012, it is possible to have a stored procedure to act as a data source, as long as it doesn't have a temp table in it. In that case, you will get an error message from SSIS that "Metadata could not be found". This is still true with SSIS 2012, so the thing I am talking about is not really a SSIS feature, it's a SQL Server 2012 feature. And this is the EXECUTE WITH RESULTSETS feature! With this, you can have a stored procedure with a temp table to deliver the resultset to SSIS, if you execute the stored procedure from SSIS and add the "WITH RESULTSETS" option. If you do this, SSIS is able to take the metadata from the code you write in SSIS and not from the stored procedure! And it's very fast too. Let's say you have a stored procedure in earlier versions and when referencing that stored procedure in SSIS forced SSIS to call the stored procedure (which can take hours), to retrieve the metadata. Now, with RESULTSETS, SSIS 2012 can continue in milliseconds! This is because you provide the metadata in the RESULTSETS clause, and if the data from the stored procedure doesn't match this RESULTSETS, you will get an error anyway, so it makes sense Microsoft has provided this optimization for us.

Read the article
ODBC in SSIS 2012

- by jamiet

In August 2011 the SQL Server client team published a blog post entitled Microsoft is Aligning with ODBC for Native Relational Data Access in which they basically said "OLE DB is the past, ODBC is the future. Deal with it.". From that blog post:We encourage you to adopt ODBC in the development of your new and future versions of your application. You don’t need to change your existing applications using OLE DB, as they will continue to be supported on Denali throughout its lifecycle. While this gives you a large window of opportunity for changing your applications before the deprecation goes into effect, you may want to consider migrating those applications to ODBC as a part of your future roadmap.I recently undertook a project using SSIS2012 and heeded that advice by opting to use ODBC Connection Managers rather than OLE DB Connection Managers. Unfortunately my finding was that the ODBC Connection Manager is not yet ready for primetime use in SSIS 2012. The main issue I found was that you can't populate an Object variable with a recordset when using an Execute SQL Task connecting to an ODBC data source; any attempt to do so will result in an error:"Disconnected recordsets are not available from ODBC connections." I have filed a bug on Connect at ODBC Connection Manager does not have same funcitonality as OLE DB. For this reason I strongly recommend that you don't make the move to ODBC Connection Managers in SSIS just yet - best to wait for the next version of SSIS before doing that.I found another couple of issues with the ODBC Connection Manager that are worth keeping in mind:It doesn't recognise System Data Source Names (DSNs), only User DSNs (bug filed at ODBC System DSNs are not available in the ODBC Connection Manager) UPDATE: According to a comment on that Connect item this may only be a problem on 64bit.In the OLE DB Connection Manager parameter ordinals are 0-based, in the ODBC Connection Manager they are 1-based (oh I just can't wait for the upgrade mess that ensues from this one!!!)You have been warned!@jamiet

Read the article
Merge Join component sorted outputs [SSIS]

- by jamiet

One question that I have been asked a few times of late in regard to performance tuning SSIS data flows is this: Why isn’t the Merge Join output sorted (i.e.IsSorted=True)? This is a fair question. After all both of the Merge Join inputs are sorted, hence why wouldn’t the output be sorted as well? Well here’s a little secret, the Merge Join output IS sorted! There’s a caveat though – it is only under certain circumstances and SSIS itself doesn’t do a good job of informing you of it. Let’s take a look at an example. Here we have a dataflow that consumes data from the [AdventureWorks2008].[Sales].[SalesOrderHeader] & [AdventureWorks2008].[Sales].[SalesOrderDetail] tables then joins them using a Merge Join component: Let’s take a look inside the editor of the Merge Join: We are joining on the [SalesOrderId] field (which is what the two inputs just happen to be sorted upon). We are also putting [SalesOrderHeader].[SalesOrderId] into the output. Believe it or not the output from this Merge Join component is sorted (i.e. has IsSorted=True) but unfortunately the Merge Join component does not have an Advanced Editor hence it is hidden away from us. There are a couple of ways to prove to you that is the case; I could open up the package XML inside the .dtsx file and show you the metadata but there is an easier way than that – I can attach a Sort component to the output. Take a look: Notice that the Sort component is attempting to sort on the [SalesOrderId] column. This gives us the following warning: Validation warning. DFT Get raw data: {992B7C9A-35AD-47B9-A0B0-637F7DDF93EB}: The data is already sorted as specified so the transform can be removed. The warning proves that the output from the Merge Join is sorted! It must be noted that the Merge Join output will only have IsSorted=True if at least one of the join columns is included in the output. So there you go, the Merge Join component can indeed produce a sorted output and that’s very useful in order to avoid unnecessary expensive Sort operations downstream. Hope this is useful to someone out there! @Jamiet P.S. Thank you to Bob Bojanic on the SSIS product team who pointed this out to me!

Read the article
REPLACENULL in SSIS 2012

- by Davide Mauri

While preparing my slides e demos for the forthcoming SQL Server Conference 2012 in Italy, I’ve come across a nice addition to DTS Expression language which I never noticed before and that seems unknown also to the blogosphere: REPLACENULL. REPLACENULL is the same of ISNULL in T-SQL. It’s *very* useful especially when loading a fact table of your BI solution when you need to replace unexisting reference to dimension with dummy values. Here’s an example of how it can be used (please notice that in this example I’m NOT loading a fact table): I’ve noticed that the feature was requested by fellow MVP John Welch http://connect.microsoft.com/SQLServer/feedback/details/636057/ssis-add-a-replacenull-function-to-the-expression-language So: Thanks John and Thanks SSIS Team ! Ah, btw, the Help online is here http://msdn.microsoft.com/en-us/library/hh479601(v=sql.110).aspx Enjoy!

Read the article
Merge Join component sorted outputs [SSIS]

- by jamiet

One question that I have been asked a few times of late in regard to performance tuning SSIS data flows is this: Why isn’t the Merge Join output sorted (i.e.IsSorted=True)? This is a fair question. After all both of the Merge Join inputs are sorted, hence why wouldn’t the output be sorted as well? Well here’s a little secret, the Merge Join output IS sorted! There’s a caveat though – it is only under certain circumstances and SSIS itself doesn’t do a good job of informing you of it. Let’s take a look at an example. Here we have a dataflow that consumes data from the [AdventureWorks2008].[Sales].[SalesOrderHeader] & [AdventureWorks2008].[Sales].[SalesOrderDetail] tables then joins them using a Merge Join component: Let’s take a look inside the editor of the Merge Join: We are joining on the [SalesOrderId] field (which is what the two inputs just happen to be sorted upon). We are also putting [SalesOrderHeader].[SalesOrderId] into the output. Believe it or not the output from this Merge Join component is sorted (i.e. has IsSorted=True) but unfortunately the Merge Join component does not have an Advanced Editor hence it is hidden away from us. There are a couple of ways to prove to you that is the case; I could open up the package XML inside the .dtsx file and show you the metadata but there is an easier way than that – I can attach a Sort component to the output. Take a look: Notice that the Sort component is attempting to sort on the [SalesOrderId] column. This gives us the following warning: Validation warning. DFT Get raw data: {992B7C9A-35AD-47B9-A0B0-637F7DDF93EB}: The data is already sorted as specified so the transform can be removed. The warning proves that the output from the Merge Join is sorted! It must be noted that the Merge Join output will only have IsSorted=True if at least one of the join columns is included in the output. So there you go, the Merge Join component can indeed produce a sorted output and that’s very useful in order to avoid unnecessary expensive Sort operations downstream. Hope this is useful to someone out there! @Jamiet P.S. Thank you to Bob Bojanic on the SSIS product team who pointed this out to me!

Read the article
SSIS: Building SQL databases on-the-fly using concatenated SQL scripts

- by DrJohn

Over the years I have developed many techniques which help automate the whole SQL Server build process. In my current process, where I need to build entire OLAP data marts on-the-fly, I make regular use of a simple but very effective mechanism to concatenate all the SQL Scripts together from my SSMS (SQL Server Management Studio) projects. This proves invaluable because in two clicks I can redeploy an entire SQL Server database with all tables, views, stored procedures etc. Indeed, I can also use the concatenated SQL scripts with SSIS to build SQL Server databases on-the-fly. You may be surprised to learn that I often redeploy the database several times per day, or even several times per hour, during the development process. This is because the deployment errors are logged and you can quickly see where SQL Scripts have object dependency errors. For example, after changing a table structure you may have forgotten to change any related views. The deployment log immediately points out all the objects which failed to build so you can fix and redeploy the database very quickly. The alternative approach (i.e. doing changes in the database directly using the SSMS UI) would require you to check all dependent objects before making changes. The chances are that you will miss something and wonder why your app returns the wrong data – a common problem caused by changing a table without re-creating dependent views. Using SQL Projects in SSMS A great many developers fail to make use of SQL Projects in SSMS (SQL Server Management Studio). To me they are invaluable way of organizing your SQL Scripts. The screenshot below shows a typical SSMS solution made up of several projects – one project for tables, another for views etc. The key point is that the projects naturally fall into the right order in file system because of the project name. The number in the folder or file name ensures that the projects the SQL scripts are concatenated together in the order that they need to be executed. Hence the script filenames start with 100, 110 etc. Concatenating SQL Scripts To concatenate the SQL Scripts together into one file, I use notepad.exe to create a simple batch file (see example screenshot) which uses the TYPE command to write the content of the SQL Script files into a combined file. As the SQL Scripts are in several folders, I simply use several TYPE command multiple times and append the output together. If you are unfamiliar with batch files, you may not know that the angled bracket (>) means write output of the program into a file. Two angled brackets (>>) means append output of this program into a file. So the command-line DIR > filelist.txt would write the content of the DIR command into a file called filelist.txt. In the example shown above, the concatenated file is called SB_DDS.sql If, like me you place the concatenated file under source code control, then the source code control system will change the file's attribute to "read-only" which in turn would cause the TYPE command to fail. The ATTRIB command can be used to remove the read-only flag. Using SQLCmd to execute the concatenated file Now that the SQL Scripts are all in one big file, we can execute the script against a database using SQLCmd using another batch file as shown below: SQLCmd has numerous options, but the script shown above simply executes the SS_DDS.sql file against the SB_DDS_DB database on the local machine and logs the errors to a file called SB_DDS.log. So after executing the batch file you can simply check the error log to see if your database built without a hitch. If you have errors, then simply fix the source files, re-create the concatenated file and re-run the SQLCmd to rebuild the database. This two click operation allows you to quickly identify and fix errors in your entire database definition.Using SSIS to execute the concatenated file To execute the concatenated SQL script using SSIS, you simply drop an Execute SQL task into your package and set the database connection as normal and then select File Connection as the SQLSourceType (as shown below). Create a file connection to your concatenated SQL script and you are ready to go. Tips and TricksAdd a new-line at end of every fileThe most common problem encountered with this approach is that the GO statement on the last line of one file is placed on the same line as the comment at the top of the next file by the TYPE command. The easy fix to this is to ensure all your files have a new-line at the end.Remove all USE database statementsThe SQLCmd identifies which database the script should be run against. So you should remove all USE database commands from your scripts - otherwise you may get unintentional side effects!!Do the Create Database separatelyIf you are using SSIS to create the database as well as create the objects and populate the database, then invoke the CREATE DATABASE command against the master database using a separate package before calling the package that executes the concatenated SQL script.

Read the article
Export all SSIS packages from msdb using Powershell

- by jamiet

Have you ever wanted to dump all the SSIS packages stored in msdb out to files? Of course you have, who wouldn’t? Right? Well, at least one person does because this was the subject of a thread (save all ssis packages to file) on the SSIS forum earlier today. Some of you may have already figured out a way of doing this but for those that haven’t here is a nifty little script that will do it for you and it uses our favourite jack-of-all tools … Powershell!! Imagine I have the following package folder structure on my Integration Services server (i.e. in [msdb]): There are two packages in there called “20110111 Chaining Expression components” & “Package”, I want to export those two packages into a folder structure that mirrors that in [msdb]. Here is the Powershell script that will do that: Param($SQLInstance = "localhost") #####Add all the SQL goodies (including Invoke-Sqlcmd)##### add-pssnapin sqlserverprovidersnapin100 -ErrorAction SilentlyContinue add-pssnapin sqlservercmdletsnapin100 -ErrorAction SilentlyContinue cls $Packages = Invoke-Sqlcmd -MaxCharLength 10000000 -ServerInstance $SQLInstance -Query "WITH cte AS ( SELECT cast(foldername as varchar(max)) as folderpath, folderid FROM msdb..sysssispackagefolders WHERE parentfolderid = '00000000-0000-0000-0000-000000000000' UNION ALL SELECT cast(c.folderpath + '\' + f.foldername as varchar(max)), f.folderid FROM msdb..sysssispackagefolders f INNER JOIN cte c ON c.folderid = f.parentfolderid ) SELECT c.folderpath,p.name,CAST(CAST(packagedata AS VARBINARY(MAX)) AS VARCHAR(MAX)) as pkg FROM cte c INNER JOIN msdb..sysssispackages p ON c.folderid = p.folderid WHERE c.folderpath NOT LIKE 'Data Collector%'" Foreach ($pkg in $Packages) { $pkgName = $Pkg.name $folderPath = $Pkg.folderpath $fullfolderPath = "c:\temp\$folderPath\" if(!(test-path -path $fullfolderPath)) { mkdir $fullfolderPath | Out-Null } $pkg.pkg | Out-File -Force -encoding ascii -FilePath "$fullfolderPath\$pkgName.dtsx" } To run it simply change the “localhost” parameter of the server you want to connect to either by editing the script or passing it in when the script is executed. It will create the folder structure in C:\Temp (which you can also easily change if you so wish – just edit the script accordingly). Here’s the folder structure that it created for me: Notice how it is a mirror of the folder structure in [msdb]. Hope this is useful! @Jamiet

Read the article
SSIS Denali CTP1 Source Assistant

- by andyleonard

I like the new Data Flow Source Assistant in SSIS Denali. The default view is shown above, with the "Show installed only" checkbox checked. When not checked, the list of Source types changes: In previous versions of SSIS, I rarely created connections in the Connection Managers pane - I usually hit a New button in either a Source or Destination Adapter, or in a task. It was just easier letting the task or adapter pick the proper Connection Manager editor. This is handy and a time-saver. :{>...(read more)

Read the article
The SSIS tuning tip that everyone misses

- by Rob Farley

I know that everyone misses this, because I’m yet to find someone who doesn’t have a bit of an epiphany when I describe this. When tuning Data Flows in SQL Server Integration Services, people see the Data Flow as moving from the Source to the Destination, passing through a number of transformations. What people don’t consider is the Source, getting the data out of a database. Remember, the source of data for your Data Flow is not your Source Component. It’s wherever the data is, within your database, probably on a disk somewhere. You need to tune your query to optimise it for SSIS, and this is what most people fail to do. I’m not suggesting that people don’t tune their queries – there’s plenty of information out there about making sure that your queries run as fast as possible. But for SSIS, it’s not about how fast your query runs. Let me say that again, but in bolder text: The speed of an SSIS Source is not about how fast your query runs. If your query is used in a Source component for SSIS, the thing that matters is how fast it starts returning data. In particular, those first 10,000 rows to populate that first buffer, ready to pass down the rest of the transformations on its way to the Destination. Let’s look at a very simple query as an example, using the AdventureWorks database: We’re picking the different Weight values out of the Product table, and it’s doing this by scanning the table and doing a Sort. It’s a Distinct Sort, which means that the duplicates are discarded. It'll be no surprise to see that the data produced is sorted. Obvious, I know, but I'm making a comparison to what I'll do later. Before I explain the problem here, let me jump back into the SSIS world... If you’ve investigated how to tune an SSIS flow, then you’ll know that some SSIS Data Flow Transformations are known to be Blocking, some are Partially Blocking, and some are simply Row transformations. Take the SSIS Sort transformation, for example. I’m using a larger data set for this, because my small list of Weights won’t demonstrate it well enough. Seven buffers of data came out of the source, but none of them could be pushed past the Sort operator, just in case the last buffer contained the data that would be sorted into the first buffer. This is a blocking operation. Back in the land of T-SQL, we consider our Distinct Sort operator. It’s also blocking. It won’t let data through until it’s seen all of it. If you weren’t okay with blocking operations in SSIS, why would you be happy with them in an execution plan? The source of your data is not your OLE DB Source. Remember this. The source of your data is the NCIX/CIX/Heap from which it’s being pulled. Picture it like this... the data flowing from the Clustered Index, through the Distinct Sort operator, into the SELECT operator, where a series of SSIS Buffers are populated, flowing (as they get full) down through the SSIS transformations. Alright, I know that I’m taking some liberties here, because the two queries aren’t the same, but consider the visual. The data is flowing from your disk and through your execution plan before it reaches SSIS, so you could easily find that a blocking operation in your plan is just as painful as a blocking operation in your SSIS Data Flow. Luckily, T-SQL gives us a brilliant query hint to help avoid this. OPTION (FAST 10000) This hint means that it will choose a query which will optimise for the first 10,000 rows – the default SSIS buffer size. And the effect can be quite significant. First let’s consider a simple example, then we’ll look at a larger one. Consider our weights. We don’t have 10,000, so I’m going to use OPTION (FAST 1) instead. You’ll notice that the query is more expensive, using a Flow Distinct operator instead of the Distinct Sort. This operator is consuming 84% of the query, instead of the 59% we saw from the Distinct Sort. But the first row could be returned quicker – a Flow Distinct operator is non-blocking. The data here isn’t sorted, of course. It’s in the same order that it came out of the index, just with duplicates removed. As soon as a Flow Distinct sees a value that it hasn’t come across before, it pushes it out to the operator on its left. It still has to maintain the list of what it’s seen so far, but by handling it one row at a time, it can push rows through quicker. Overall, it’s a lot more work than the Distinct Sort, but if the priority is the first few rows, then perhaps that’s exactly what we want. The Query Optimizer seems to do this by optimising the query as if there were only one row coming through: This 1 row estimation is caused by the Query Optimizer imagining the SELECT operation saying “Give me one row” first, and this message being passed all the way along. The request might not make it all the way back to the source, but in my simple example, it does. I hope this simple example has helped you understand the significance of the blocking operator. Now I’m going to show you an example on a much larger data set. This data was fetching about 780,000 rows, and these are the Estimated Plans. The data needed to be Sorted, to support further SSIS operations that needed that. First, without the hint. ...and now with OPTION (FAST 10000): A very different plan, I’m sure you’ll agree. In case you’re curious, those arrows in the top one are 780,000 rows in size. In the second, they’re estimated to be 10,000, although the Actual figures end up being 780,000. The top one definitely runs faster. It finished several times faster than the second one. With the amount of data being considered, these numbers were in minutes. Look at the second one – it’s doing Nested Loops, across 780,000 rows! That’s not generally recommended at all. That’s “Go and make yourself a coffee” time. In this case, it was about six or seven minutes. The faster one finished in about a minute. But in SSIS-land, things are different. The particular data flow that was consuming this data was significant. It was being pumped into a Script Component to process each row based on previous rows, creating about a dozen different flows. The data flow would take roughly ten minutes to run – ten minutes from when the data first appeared. The query that completes faster – chosen by the Query Optimizer with no hints, based on accurate statistics (rather than pretending the numbers are smaller) – would take a minute to start getting the data into SSIS, at which point the ten-minute flow would start, taking eleven minutes to complete. The query that took longer – chosen by the Query Optimizer pretending it only wanted the first 10,000 rows – would take only ten seconds to fill the first buffer. Despite the fact that it might have taken the database another six or seven minutes to get the data out, SSIS didn’t care. Every time it wanted the next buffer of data, it was already available, and the whole process finished in about ten minutes and ten seconds. When debugging SSIS, you run the package, and sit there waiting to see the Debug information start appearing. You look for the numbers on the data flow, and seeing operators going Yellow and Green. Without the hint, I’d sit there for a minute. With the hint, just ten seconds. You can imagine which one I preferred. By adding this hint, it felt like a magic wand had been waved across the query, to make it run several times faster. It wasn’t the case at all – but it felt like it to SSIS.

Read the article
SSIS Script Component Testing Strategy

- by Paul Kohler

This question is in respect to the script component specifically. I am aware of ssisUnit etc… With simple SSIS Scripts Components, it’s sufficient to let basic testing flesh out issues, however I am working with a script that has grown in complexity over time. To better test the functionality I am considering abstracting the script logic into a DLL that gets deployed with the package, and then use the custom component in the script. The advantage is that the function will be more testable etc but it’s one more deployment artefact that needs to be managed. My question is, does anyone know of a better way to test such an SSIS script in a more isolated manner than to run the whole package and examine the output?

Read the article
FileNameColumnName property, Flat File Source Adapter : SSIS Nugget

- by jamiet

I saw a question on MSDN’s SSIS forum the other day that went something like this: I’m loading data into a table from a flat file but I want to be able to store the name of that file as well. Is there a way of doing that? I don’t want to come across as disrespecting those who took the time to reply but there was a few answers along the lines of “loop over the files using a For Each, store the file name in a variable yadda yadda yadda” when in fact there is a much much simpler way of accomplishing this; it just happens to be a little hidden away as I shall now explain! The Flat File Source Adapter has a property called FileNameColumnName which for some reason it isn’t exposed through the Flat File Source editor, it is however exposed via the Advanced Properties: You’ll see in the screenshot above that I have set FileNameColumnName=“Filename” (it doesn’t matter what name you use, anything except a non-zero string will work). What this will do is create a new column in our dataflow called “Filename” that contains, unsurprisingly, the name of the file from which the row was sourced. All very simple. This is particularly useful if you are extracting data from multiple files using the MultiFlatFile Connection Manager as it allows you to differentiate between data from each of the files as you can see in the following screenshot: So there you have it, the FileNameColumnName property; a little known secret of SSIS. I hope it proves to be useful to someone out there. @Jamiet Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
FileNameColumnName property, Flat File Source Adapter : SSIS Nugget

- by jamiet

I saw a question on MSDN’s SSIS forum the other day that went something like this: I’m loading data into a table from a flat file but I want to be able to store the name of that file as well. Is there a way of doing that? I don’t want to come across as disrespecting those who took the time to reply but there was a few answers along the lines of “loop over the files using a For Each, store the file name in a variable yadda yadda yadda” when in fact there is a much much simpler way of accomplishing this; it just happens to be a little hidden away as I shall now explain! The Flat File Source Adapter has a property called FileNameColumnName which for some reason it isn’t exposed through the Flat File Source editor, it is however exposed via the Advanced Properties: You’ll see in the screenshot above that I have set FileNameColumnName=“Filename” (it doesn’t matter what name you use, anything except a non-zero string will work). What this will do is create a new column in our dataflow called “Filename” that contains, unsurprisingly, the name of the file from which the row was sourced. All very simple. This is particularly useful if you are extracting data from multiple files using the MultiFlatFile Connection Manager as it allows you to differentiate between data from each of the files as you can see in the following screenshot: So there you have it, the FileNameColumnName property; a little known secret of SSIS. I hope it proves to be useful to someone out there. @Jamiet Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
SSIS – Delete all files except for the most recent one

- by jorg

Quite often one or more sources for a data warehouse consist of flat files. Most of the times these files are delivered as a zip file with a date in the file name, for example FinanceDataExport_20100528.zip Currently I work at a project that does a full load into the data warehouse every night. A zip file with some flat files in it is dropped in a directory on a daily basis. Sometimes there are multiple zip files in the directory, this can happen because the ETL failed or somebody puts a new zip file in the directory manually. Because the ETL isn’t incremental only the most recent file needs to be loaded. To implement this I used the simple code below; it checks which file is the most recent and deletes all other files. Note: In a previous blog post I wrote about unzipping zip files within SSIS, you might also find this useful: SSIS – Unpack a ZIP file with the Script Task Public Sub Main() 'Use this piece of code to loop through a set of files in a directory 'and delete all files except for the most recent one based on a date in the filename. 'File name example: 'DataExport_20100413.zip Dim rootDirectory As New DirectoryInfo(Dts.Variables("DirectoryFromSsisVariable").Value.ToString) Dim mostRecentFile As String = "" Dim currentFileDate As Integer Dim mostRecentFileDate As Integer = 0 'Check which file is the most recent For Each fi As FileInfo In rootDirectory.GetFiles("*.zip") currentFileDate = CInt(Left(Right(fi.Name, 12), 8)) 'Get date from current filename (based on a file that ends with: YYYYMMDD.zip) If currentFileDate > mostRecentFileDate Then mostRecentFileDate = currentFileDate mostRecentFile = fi.Name End If Next 'Delete all files except the most recent one For Each fi As FileInfo In rootDirectory.GetFiles("*.zip") If fi.Name <> mostRecentFile Then File.Delete(rootDirectory.ToString + "\" + fi.Name) End If Next Dts.TaskResult = ScriptResults.Success End Sub Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
SSIS Prehistory video

- by jamiet

I’m currently wasting spending my Easter bank holiday putting together my presentation SSIS Dataflow Performance Tuning for the upcoming SQL Bits conference in London and in doing so I’m researching some old material about how the dataflow actually works. Boring as it is I’ve gotten easily sidelined and have chanced upon an old video on Channel 9 entitled Euan Garden - Tour of SQL Server Team (part I). Euan is a former member of the SQL Server team and in this series of videos he walks the halls of the SQL Server building on Microsoft’s Redmond campus talking to some of the various protagonists and in this one he happens upon the SQL Server Integration Services team. The video was shot in 2004 so this is a fascinating (to me anyway) glimpse into the development of SSIS from before it was ever shipped and if you’re a geek like me you’ll really enjoy this behind-the-scenes look into how and why the product was architected. The video is also notable for the presence of the cameraman – none other than the now-rather-more-famous-than-he-was-then Robert Scoble. See it at http://channel9.msdn.com/posts/TheChannel9Team/Euan-Garden-Tour-of-SQL-Server-Team-part-I/ Enjoy! @Jamiet Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
SSIS Prehistory video

- by jamiet

I’m currently wasting spending my Easter bank holiday putting together my presentation SSIS Dataflow Performance Tuning for the upcoming SQL Bits conference in London and in doing so I’m researching some old material about how the dataflow actually works. Boring as it is I’ve gotten easily sidelined and have chanced upon an old video on Channel 9 entitled Euan Garden - Tour of SQL Server Team (part I). Euan is a former member of the SQL Server team and in this series of videos he walks the halls of the SQL Server building on Microsoft’s Redmond campus talking to some of the various protagonists and in this one he happens upon the SQL Server Integration Services team. The video was shot in 2004 so this is a fascinating (to me anyway) glimpse into the development of SSIS from before it was ever shipped and if you’re a geek like me you’ll really enjoy this behind-the-scenes look into how and why the product was architected. The video is also notable for the presence of the cameraman – none other than the now-rather-more-famous-than-he-was-then Robert Scoble. See it at http://channel9.msdn.com/posts/TheChannel9Team/Euan-Garden-Tour-of-SQL-Server-Team-part-I/ Enjoy! @Jamiet Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
Tale of an Encrypted SSIS Package in msdb and a Lost Password

- by Argenis

Yesterday a Developer at work asked for a copy of an SSIS package in Production so he could work on it (please, dear Reader – withhold judgment on Source Control – I know!). I logged on to the SSIS instance, and when I went to export the package… Oops. I didn’t have that password. The DBA who uploaded the package to Production is long gone; my fellow DBA had no idea either - and the Devs returned a cricket sound when queried. So I posed the obligatory question on #SQLHelp and a bunch of folks jumped in – some to help and some to make fun of me (thanks, @SQLSoldier @crummel4 @maryarcia and @sqljoe). I tried their suggestions to no avail…even ran some queries to see if I could figure out how to extract the package XML from the system tables in msdb: SELECT CAST(CAST(p.packagedata AS varbinary(max)) AS varchar(max)) FROM msdb.dbo.sysssispackages p WHERE p.name = 'LePackage' This just returned a bunch of XML with encrypted data on it: I knew there was a job in SQL Agent scheduled to execute the package, and when I tried to look at details on the job step I got the following: Not very helpful. The password had to be saved somewhere, but where?? All of a sudden I remembered that there was a system table I hadn’t queried yet: SELECT sjs.command FROM msdb.dbo.sysjobs sj JOIN msdb.dbo.sysjobsteps sjs ON sj.job_id = sjs.job_id WHERE sj.name = 'Run LePackage' The result: “Well, that’s really secure”, I thought to myself. Cheers, -Argenis

Read the article
Running SSIS packages from C#

- by Piotr Rodak

Most of the developers and DBAs know about two ways of deploying packages: You can deploy them to database server and run them using SQL Server Agent job or you can deploy the packages to file system and run them using dtexec.exe utility. Both approaches have their pros and cons. However I would like to show you that there is a third way (sort of) that is often overlooked, and it can give you capabilities the ‘traditional’ approaches can’t. I have been working for a few years with applications that run packages from host applications that are implemented in .NET. As you know, SSIS provides programming model that you can use to implement more flexible solutions. SSIS applications are usually thought to be batch oriented, with fairly rigid architecture and processing model, with fixed timeframes when the packages are executed to process data. It doesn’t to be the case, you don’t have to limit yourself to batch oriented architecture. I have very good experiences with service oriented architectures processing large amounts of data. These applications are more complex than what I would like to show here, but the principle stays the same: you can execute packages as a service, on ad-hoc basis. You can also implement and schedule various signals, HTTP calls, file drops, time schedules, Tibco messages and other to run the packages. You can implement event handler that will trigger execution of SSIS when a certain event occurs in StreamInsight stream. This post is just a small example of how you can use the API and other features to create a service that can run SSIS packages on demand. I thought it might be a good idea to implement a restful service that would listen to requests and execute appropriate actions. As it turns out, it is trivial in C#. The application is implemented as console application for the ease of debugging and running. In reality, you might want to implement the application as Windows service. To begin, you have to reference namespace System.ServiceModel.Web and then add a few lines of code: Uri baseAddress = new Uri("http://localhost:8011/"); WebServiceHost svcHost = new WebServiceHost(typeof(PackRunner), baseAddress); try { svcHost.Open(); Console.WriteLine("Service is running"); Console.WriteLine("Press enter to stop the service."); Console.ReadLine(); svcHost.Close(); } catch (CommunicationException cex) { Console.WriteLine("An exception occurred: {0}", cex.Message); svcHost.Abort(); } The interesting lines are 3, 7 and 13. In line 3 you create a WebServiceHost object. In line 7 you start listening on the defined URL and then in line 13 you shut down the service. As you have noticed, the WebServiceHost constructor is accepting type of an object (here: PackRunner) that will be instantiated as singleton and subsequently used to process the requests. This is the class where you put your logic, but to tell WebServiceHost how to use it, the class must implement an interface which declares methods to be used by the host. The interface itself must be ornamented with attribute ServiceContract. [ServiceContract] public interface IPackRunner { [OperationContract] [WebGet(UriTemplate = "runpack?package={name}")] string RunPackage1(string name); [OperationContract] [WebGet(UriTemplate = "runpackwithparams?package={name}&rows={rows}")] string RunPackage2(string name, int rows); } Each method that is going to be used by WebServiceHost has to have attribute OperationContract, as well as WebGet or WebInvoke attribute. The detailed discussion of the available options is outside of scope of this post. I also recommend using more descriptive names to methods . Then, you have to provide the implementation of the interface: public class PackRunner : IPackRunner { ... There are two methods defined in this class. I think that since the full code is attached to the post, I will show only the more interesting method, the RunPackage2. /// <summary> /// Runs package and sets some of its variables. /// </summary> /// <param name="name">Name of the package</param> /// <param name="rows">Number of rows to export</param> /// <returns></returns> public string RunPackage2(string name, int rows) { try { string pkgLocation = ConfigurationManager.AppSettings["PackagePath"]; pkgLocation = Path.Combine(pkgLocation, name.Replace("\"", "")); Console.WriteLine(); Console.WriteLine("Calling package {0} with parameter {1}.", name, rows); Application app = new Application(); Package pkg = app.LoadPackage(pkgLocation, null); pkg.Variables["User::ExportRows"].Value = rows; DTSExecResult pkgResults = pkg.Execute(); Console.WriteLine(); Console.WriteLine(pkgResults.ToString()); if (pkgResults == DTSExecResult.Failure) { Console.WriteLine(); Console.WriteLine("Errors occured during execution of the package:"); foreach (DtsError er in pkg.Errors) Console.WriteLine("{0}: {1}", er.ErrorCode, er.Description); Console.WriteLine(); return "Errors occured during execution. Contact your support."; } Console.WriteLine(); Console.WriteLine(); return "OK"; } catch (Exception ex) { Console.WriteLine(ex); return ex.ToString(); } } The method accepts package name and number of rows to export. The packages are deployed to the file system. The path to the packages is configured in the application configuration file. This way, you can implement multiple services on the same machine, provided you also configure the URL for each instance appropriately. To run a package, you have to reference Microsoft.SqlServer.Dts.Runtime namespace. This namespace is implemented in Microsoft.SQLServer.ManagedDTS.dll which in my case was installed in the folder “C:\Program Files (x86)\Microsoft SQL Server\100\SDK\Assemblies”. Once you have done it, you can create an instance of Microsoft.SqlServer.Dts.Runtime.Application as in line 18 in the above snippet. It may be a good idea to create the Application object in the constructor of the PackRunner class, to avoid necessity of recreating it each time the service is invoked. Then, in line 19 you see that an instance of Microsoft.SqlServer.Dts.Runtime.Package is created. The method LoadPackage in its simplest form just takes package file name as the first parameter. Before you run the package, you can set its variables to certain values. This is a great way of configuring your packages without all the hassle with dtsConfig files. In the above code sample, variable “User:ExportRows” is set to value of the parameter “rows” of the method. Eventually, you execute the package. The method doesn’t throw exceptions, you have to test the result of execution yourself. If the execution wasn’t successful, you can examine collection of errors exposed by the package. These are the familiar errors you often see during development and debugging of the package. I you run the package from the code, you have opportunity to persist them or log them using your favourite logging framework. The package itself is very simple; it connects to my AdventureWorks database and saves number of rows specified in variable “User::ExportRows” to a file. You should know that before you run the package, you can change its connection strings, logging, events and many more. I attach solution with the test service, as well as a project with two test packages. To test the service, you have to run it and wait for the message saying that the host is started. Then, just type (or copy and paste) the below command to your browser. http://localhost:8011/runpackwithparams?package=%22ExportEmployees.dtsx%22&rows=12 When everything works fine, and you modified the package to point to your AdventureWorks database, you should see "OK” wrapped in xml: I stopped the database service to simulate invalid connection string situation. The output of the request is different now: And the service console window shows more information: As you see, implementing service oriented ETL framework is not a very difficult task. You have ability to configure the packages before you run them, you can implement logging that is consistent with the rest of your system. In application I have worked with we also have resource monitoring and execution control. We don’t allow to run more than certain number of packages to run simultaneously. This ensures we don’t strain the server and we use memory and CPUs efficiently. The attached zip file contains two projects. One is the package runner. It has to be executed with administrative privileges as it registers HTTP namespace. The other project contains two simple packages. This is really a cool thing, you should check it out!

Read the article
What is SSIS order of data transformation component method calls

- by Ron Ruble

I am working on a custom data transformation component. I'm using NUnit and NMock2 to test as I code. Testing and getting the custom UI and other features right is a huge pain, in part because I can't find any documentation about the order in which SSIS invokes methods on the component at design time as well as runtime. I can correct the issues readily enough, but it's tedious and time consuming to unregister the old version, register the new version, fire up the test ssis package, try to display the UI, get an obscure error message, backtrace it, modify the component and continue. One of the big issues involves the UI component needing access to the componentmetadata and buffermanager properties of the component at design time, and what I need to provide for to support properties that won't be initialized until after the user enters them in the UI. I can work through it; but if someone knows of some docs or tips that would speed me up, I'd greatly appreciate it. The samples I've found havn't been much use; they seem to be directed to showing off cool stuff (Twitter, weather.com) rather than actual work. Thanks in advance.

Read the article
SSIS Design Pattern: Loading Variable-Length Rows

- by andyleonard

Introduction I encounter flat file sources with variable-length rows on occassion. Here, I supply one SSIS Design Pattern for loading them. What's a Variable-Length Row Flat File? Great question - let's start with a definition. A variable-length row flat file is a text source of some flavor - comma-separated values (CSV), tab-delimited file (TDF), or even fixed-length, positional-, or ordinal-based (where the location of the data on the row defines its field). The major difference between a "normal"...(read more)

Read the article
Getting Dynamic in SSIS Queries

- by ejohnson2010

When you start working with SQL Server and SSIS, it isn’t long before you find yourself wishing you could change bits of SQL queries dynamically. Most commonly, I see people that want to change the date portion of a query so that you can limit your query to the last 30 days, for example. This can be done using a combination of expressions and variables. I will do this in two parts, first I will build a variable that will always contain the 1 st day of the previous month and then I will dynamically...(read more)

Read the article
SQLAuthority News – Interesting Whitepaper – We Loaded 1TB in 30 Minutes with SSIS, and So Can You

- by pinaldave

In February 2008, Microsoft announced a record-breaking data load using Microsoft SQL Server Integration Services (SSIS): 1 TB of data in less than 30 minutes. That data load, using SQL Server Integration Services, was 30% faster than the previous best time using a commercial ETL tool. This paper outlines what it took: the software, hardware, [...]

Read the article

Search Results

Search found 794 results on 32 pages for 'ssis'.

Page 3/32 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

- by jamiet

- by CoffeeAddict

- by jamiet

- by Davide Mauri

- by Peter Larsson

- by jamiet

- by jamiet

- by Davide Mauri

- by jamiet

- by DrJohn

- by jamiet

- by andyleonard

- by Rob Farley

- by Paul Kohler

- by jamiet

- by jamiet

- by jorg

- by jamiet

- by jamiet

- by Argenis

- by Piotr Rodak

- by Ron Ruble

- by andyleonard

- by ejohnson2010

- by pinaldave

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >