ssis - Page 15 - Developer IT

Using SQL Server Intergration Services (SSIS) can you read a file from a FileStream column in SQL Se

- by tbrovold

I am trying to create a tool that I can upload different types of files "csv", Excel, XML and load those files into a FileStream column in the database as "Source" untouched over the web. Then using SSIS on the server I want to create a package that will process that file to be loaded into other tables to be used by the web application. Is it possible to have SSIS read a file from FileStream column? if so how?

Read the article

How do I configure SSIS logging to overwrite the log file?

- by theog

My SSIS package has logging configured with a SSIS log provider for text files, which works fine, but each time the package is run the log appends to the end of the log file. I want it to truncate the file and only keep the log from the most recent execution of the package, but I don't see an option anywhere to do that. I've tried both file usage types (Existing file and New file) in the File Connection manager with the same results.

Read the article

SSIS: Way to handle hot folder items in parallel?

- by Dr. Zim

We have eight Xeon (i7) cores and 16 gig of RAM on our SSIS box. We have about 200 image files we want to convert using a command line utility every day. Currently the process is using Adobe Photoshop and droplets (very manual, taking upwards of two hours a day) Using SSIS hot folders, is there a way to execute up to eight conversions at once? Is there any way to tell a process completed or execute code upon it's completion?

Read the article

SSIS Debugging Tip: Using Data Viewers

- by Jim Giercyk

When you have an SSIS package error, it is often very helpful to see the data records that are causing the problem. After all, if your input has 50,000 records and 1 of them has corrupt data, it can be a chore. Your execution results will tell you which column contains the bad data, but not which record…..enter the Data Viewer. In this scenario I have created a truncation error. The input length of [lastname] is 50, but the output table has a length of 15. When it runs, at least one of the records causes the package to fail. Now what? We can tell from our execution results that there is a problem with [lastname], but we have no idea WHICH record? Let’s identify the row that is actually causing the problem. First, we grab the oft’ forgotten Row Count shape from our toolbar and connect it to the error output from our input query. Remember that in order to intercept errors with the error output, you must redirect them. The Row Count shape requires 1 integer variable. For our purposes, we will not reference the variable, but it is still required in order for the package to run. Typically we would use the variable to hold the number of rows in the table and refer back to it later in our process. We are simply using the Row Count as a “Dead End” for errors. I called my variable RowCounter. To create a variable, with no shapes selected, right-click on the background and choose Variable. Once we have setup the Row Count shape, we can right-click on the red line (error output) from the query, and select Data Viewers. In the popup, we click the add button and we will see this: There are other fancier options we can play with, but for now we just want to view the output in a grid. WE select Grid, then click OK on all of the popup windows to shut them down. We should now see a grid with a pair of glasses on the error output line. So, we are ready to catch the error output in a grid and see that is causing the problem! This time when we run the package, it does not fail because we directed the error to the Row Count. We also get a popup window showing the error record in a grid. If there were multiple errors we would see them all. Indeed, the [lastname] column is longer than 15 characters. Notice the last column in the grid, [Error Code – Description]. We knew this was a truncation error before we added the grid, but if you have worked with SSIS for any length of time, you know that some errors are much more obscure. The description column can be very useful under those circumstances! Data viewers can be used any time we want to see the data that is actually in the pipeline; they stop the package temporarily until we shut them. Also remember that the Row Count shape can be used as a “Dead End”. It is useful during development when we want to see the output from a dataflow, but don’t want to update a table or file with the data. Data viewers are an invaluable tool for both development and debugging. Just remember to REMOVE THEM before putting your package into production

Read the article

Perm SSIS Developer Urgently Required

- by blakmk

Job Role To provide dedicated data services support to the company, by designing, creating, maintaining and enhancing database objects, ensuring data quality, consistency and integrity. Migrating data from various sources to central SQL 2008 data warehouse will be the primary function. Migration of data from bespoke legacy database’s to SQL 2008 data warehouse. Understand key business requirements, Liaising with various aspects of the company. Create advanced transformations of data, with focus on data cleansing, redundant data and duplication. Creating complex business rules regarding data services, migration, Integrity and support (Best Practices). Experience · Minimum 3 year SSIS experience, in a project or BI Development role and involvement in at least 3 full ETL project life cycles, using the following methodologies and tools o Excellent knowledge of ETL concepts including data migration & integrity, focusing on SSIS. o Extensive experience with SQL 2005 products, SQL 2008 desirable. o Working knowledge of SSRS and its integration with other BI products. o Extensive knowledge of T-SQL, stored procedures, triggers (Table/Database), views, functions in particular coding and querying. o Data cleansing and harmonisation. o Understanding and knowledge of indexes, statistics and table structure. o SQL Agent – Scheduling jobs, optimisation, multiple jobs, DTS. o Troubleshoot, diagnose and tune database and physical server performance. o Knowledge and understanding of locking, blocks, table and index design and SQL configuration. · Demonstrable ability to understand and analyse business processes. · Experience in creating business rules on best practices for data services. · Experience in working with, supporting and troubleshooting MS SQL servers running enterprise applications · Proven ability to work well within a team and liaise with other technical support staff such as networking administrators, system administrators and support engineers. · Ability to create formal documentation, work procedures, and service level agreements. · Ability to communicate technical issues at all levels including to a non technical audience. · Good working knowledge of MS Word, Excel, PowerPoint, Visio and Project. Location Based in Crawley with possibility of some remote working Contact me for more info: http://sqlblogcasts.com/blogs/blakmk/contact.aspx

Read the article

PASS Summit 2010 Presentation Feedback

- by andyleonard

Introduction It's always an honor to present anywhere. Presenting at the PASS Summit is a special honor. I delivered three presentations last month: Database Design for Developers SSIS Design Patterns, Part 2 A Lightning Talk on SSIS Database Design for Developers First, a bit of explanation (defense): I submitted this abstract to the PASS Abstracts folks by mistake . I kid you not. Inspired by Adam Machanic ( Blog | @AdamMachanic ) I maintain a document of current presentations. I've recently published...(read more)

Read the article

Running 32-bit SSIS in a 64-bit Environment

- by John Paul Cook

After my recent post on where to find the 32-bit ODBC Administrator on a 64-bit SQL Server, a new question was asked about how to get SSIS to run with the 32-bit ODBC instead of the 64-bit ODBC. You need to make a simple configuration change to the properties of your BIDS solution. Here I have a solution called 32bitODBC and it needs to run in 32-bit mode, not 64-bit mode. Since I have a 64-bit SQL Server, BIDS defaults to using the 64-bit runtime. To override this setting, go to the property pages...(read more)

Read the article

SSIS Tips & Tricks (Presentation)

This has been a rather well used presentation title but it does allow a certain degree of flexibility, and we covered a good range of topics in my session at the UK SQL Server User Group in Cambridge last night. Thanks to all who attended. Here is the rather limited slide deck and the all important demo packages for download as promised. For reference, high level topics covered were BIDS Helper Inserts and Updates Transactions Script Debugging Data Flow Checkpoints I’ll update the post with a link to the Live Meeting recording when I get it. Presentation & Demo Packages (194KB) SSIS Tips & Tricks - Darren Green.zip

Read the article

SSIS Tips & Tricks (Presentation)

This has been a rather well used presentation title but it does allow a certain degree of flexibility, and we covered a good range of topics in my session at the UK SQL Server User Group in Cambridge last night. Thanks to all who attended. Here is the rather limited slide deck and the all important demo packages for download as promised. For reference, high level topics covered were BIDS Helper Inserts and Updates Transactions Script Debugging Data Flow Checkpoints I’ll update the post with a link to the Live Meeting recording when I get it. Presentation & Demo Packages (194KB) SSIS Tips & Tricks - Darren Green.zip

Read the article

Simple step by step process to import MS Access data into SQL Server using SSIS

Sometimes we need to import information from MS Access. We could use the Microsoft SQL Server Migration Assistant, but sometimes we need to add custom transformations and it is necessary to use more sophisticated tools. In this tip, we are going to walk through step by step how to migrate a MS Access table to SQL Server using SQL Server Integration Services (SSIS). What are your servers really trying to tell you? Find out with new SQL Monitor 3.0, an easy-to-use tool built for no-nonsense database professionals.For effortless insights into SQL Server, download a free trial today.

Read the article

Is there a reason why SSIS significantly slows down after a few minutes?

- by Mark

I'm running a fairly substantial SSIS package against SQL 2008 - and I'm getting the same results both in my dev environment (Win7-x64 + SQL-x64-Developer) and the production environment (Server 2008 x64 + SQL Std x64). The symptom is that initial data loading screams at between 50K - 500K records per second, but after a few minutes the speed drops off dramatically and eventually crawls embarrasingly slowly. The database is in Simple recovery model, the target tables are empty, and all of the prerequisites for minimally logged bulk inserts are being met. The data flow is a simple load from a RAW input file to a schema-matched table (i.e. no complex transforms of data, no sorting, no lookups, no SCDs, etc.) The problem has the following qualities and resiliences: Problem persists no matter what the target table is. RAM usage is lowish (45%) - there's plenty of spare RAM available for SSIS buffers or SQL Server to use. Perfmon shows buffers are not spooling, disk response times are normal, disk availability is high. CPU usage is low (hovers around 25% shared between sqlserver.exe and DtsDebugHost.exe) Disk activity primarily on TempDB.mdf, but I/O is very low (< 600 Kb/s) OLE DB destination and SQL Server Destination both exhibit this problem. To sum it up, I expect either disk, CPU or RAM to be exhausted before the package slows down, but instead its as if the SSIS package is taking an afternoon nap. SQL server remains responsive to other queries, and I can't find any performance counters or logged events that betray the cause of the problem. I'll gratefully reward any reasonable answers / suggestions.

Read the article

Downloading a file over HTTP the SSIS way

This post shows you how to download files from a web site whilst really making the most of the SSIS objects that are available. There is no task to do this, so we have to use the Script Task and some simple VB.NET or C# (if you have SQL Server 2008) code. Very often I see suggestions about how to use the .NET class System.Net.WebClient and of course this works, you can code pretty much anything you like in .NET. Here I’d just like to raise the profile of an alternative. This approach uses the HTTP Connection Manager, one of the stock connection managers, so you can use configurations and property expressions in the same way you would for all other connections. Settings like the security details that you would want to make configurable already are, but if you take the .NET route you have to write quite a lot of code to manage those values via package variables. Using the connection manager we get all of that flexibility for free. The screenshot below illustrate some of the options we have. Using the HttpClientConnection class makes for much simpler code as well. I have demonstrated two methods, DownloadFile which just downloads a file to disk, and DownloadData which downloads the file and retains it in memory. In each case we show a message box to note the completion of the download. You can download a sample package below, but first the code: Imports System Imports System.IO Imports System.Text Imports System.Windows.Forms Imports Microsoft.SqlServer.Dts.Runtime Public Class ScriptMain Public Sub Main() ' Get the unmanaged connection object, from the connection manager called "HTTP Connection Manager" Dim nativeObject As Object = Dts.Connections("HTTP Connection Manager").AcquireConnection(Nothing) ' Create a new HTTP client connection Dim connection As New HttpClientConnection(nativeObject) ' Download the file #1 ' Save the file from the connection manager to the local path specified Dim filename As String = "C:\Temp\Sample.txt" connection.DownloadFile(filename, True) ' Confirm file is there If File.Exists(filename) Then MessageBox.Show(String.Format("File {0} has been downloaded.", filename)) End If ' Download the file #2 ' Read the text file straight into memory Dim buffer As Byte() = connection.DownloadData() Dim data As String = Encoding.ASCII.GetString(buffer) ' Display the file contents MessageBox.Show(data) Dts.TaskResult = Dts.Results.Success End Sub End Class Sample Package HTTPDownload.dtsx (74KB)

Read the article

Downloading a file over HTTP the SSIS way

This post shows you how to download files from a web site whilst really making the most of the SSIS objects that are available. There is no task to do this, so we have to use the Script Task and some simple VB.NET or C# (if you have SQL Server 2008) code. Very often I see suggestions about how to use the .NET class System.Net.WebClient and of course this works, you can code pretty much anything you like in .NET. Here I’d just like to raise the profile of an alternative. This approach uses the HTTP Connection Manager, one of the stock connection managers, so you can use configurations and property expressions in the same way you would for all other connections. Settings like the security details that you would want to make configurable already are, but if you take the .NET route you have to write quite a lot of code to manage those values via package variables. Using the connection manager we get all of that flexibility for free. The screenshot below illustrate some of the options we have. Using the HttpClientConnection class makes for much simpler code as well. I have demonstrated two methods, DownloadFile which just downloads a file to disk, and DownloadData which downloads the file and retains it in memory. In each case we show a message box to note the completion of the download. You can download a sample package below, but first the code: Imports System Imports System.IO Imports System.Text Imports System.Windows.Forms Imports Microsoft.SqlServer.Dts.Runtime Public Class ScriptMain Public Sub Main() ' Get the unmanaged connection object, from the connection manager called "HTTP Connection Manager" Dim nativeObject As Object = Dts.Connections("HTTP Connection Manager").AcquireConnection(Nothing) ' Create a new HTTP client connection Dim connection As New HttpClientConnection(nativeObject) ' Download the file #1 ' Save the file from the connection manager to the local path specified Dim filename As String = "C:\Temp\Sample.txt" connection.DownloadFile(filename, True) ' Confirm file is there If File.Exists(filename) Then MessageBox.Show(String.Format("File {0} has been downloaded.", filename)) End If ' Download the file #2 ' Read the text file straight into memory Dim buffer As Byte() = connection.DownloadData() Dim data As String = Encoding.ASCII.GetString(buffer) ' Display the file contents MessageBox.Show(data) Dts.TaskResult = Dts.Results.Success End Sub End Class Sample Package HTTPDownload.dtsx (74KB)

Read the article

Search SSIS packages for table/column references

- by Nigel Rivett

A lot of companies now use TFS or some other system and keep all their packages in a single project. This means that a copy of all the packages will end up on your local disk. There is major failing with SSIS that it is sometimes quite difficult to find what a package is actually doing, what it accesses and what it affects. This is a simple dos script which will search through all packages in a folder for a string and write the names of found packages to an output file. Just copy the text to a .bat file (I use aaSearch.bat) in the folder with all the package scripts Change the output filename (twice), change the find string value and run it in a dos window. It works on any text file type so you can also search store procedure scripts – but there are easier ways of doing that. echo. > aaSearch_factSales.txt for /f “delims=” %%a in (‘dir /B /s *.dtsx’) do call :subr “%%a” goto:EOF :subr findstr “factSales” %1 if %ERRORLEVEL% NEQ 1 echo %1 >> aaSearch_factSales.txt goto:EOF

Read the article

SSIS - Upgrade from 2005 to 2008 - How to set a project property when I don't have a project

- by Greg

I have about 160 SSIS packages that I'm trying to upgrade from 2005 to 2008. When I run SSISUpgrade.exe on them, I get the following error messages on many of the packages: Error 0xc0209303: ...: SSIS Error Code DTS_E_OLEDB_NOPROVIDER_64BIT_ERROR. The requested OLE DB provider MICROSOFT.JET.OLEDB.4.0 is not registered -- perhaps no 64-bit provider is available. enter code here`Error code: 0x00000000. An OLE DB record is available. Source: "Microsoft OLE DB Service Components" Hresult: 0x80040154 Description: "Class not registered". This fellow says that to fix this I need to set the run64bitruntime debugging property to False. However each of these packages exists outside of a project file. How can I set this property without having a project file?

Read the article

SSIS - How do I see/set the field types in a Recordset?

- by thursdaysgeek

I'm looking at an inherited SSIS package, and a stored procedure is sending records to a recordset called USER:NEW_RECORDS. It's of type Object, and the value is System.Object. It is then used for inputting that data to a SQL table. We're getting an error, because it seems that the numeric results of the stored procedure are being put in a DT_WSTR field, and then failing when it is then put into a decimal field in the database. Most of the records are working, but one, which happens to have a longer number of decimal digits, is failing. I want to see exactly what my SSIS recordset field types are, and probably change them, so I can force the data to be truncated properly and copied. Or, perhaps, I'm not even looking at this correctly. The data is put into the recordset using a SQL Task that executes the stored procedure.

Read the article

Is there a size limitation for an Access DB destination in SSIS?

- by Adam V

I'm creating an SSIS package, which will read through a user's SQL database and populate the tables in an Access database. However, for the largest user databases, I start getting errors around the time the Access file reaches approx. 2 GB. Has anyone run into this problem? Is this a size limitation for this operation? More information: I'm getting the error code 0xC020907B, but no additional information that I can see. Error: 0xC0209029 at , [733]: SSIS Error Code DTS_E_INDUCEDTRANSFORMFAILUREONERROR. The "input "OLE DB Destination Input" (746)" failed because error code 0xC020907B occurred, and the error row disposition on "input "OLE DB Destination Input" (746)" specifies failure on error. An error occurred on the specified object of the specified component. There may be error messages posted before this with more information about the failure.

Read the article

Using SSIS, how do you read a datetime field into a variable that is of Data Type string?

- by Mark Kadlec

This one has bugged me for the longest time and a great question to ask the Stackoverflow users I think. I have a rather large SSIS flow that uses a string variable to store the datetime. I would now like to dynamically read the datetime value from the database, but how would you construct the SSIS to do this? My first obvious thought would be to simply execute a SQL task to get the datetime and store it in the variable, but got the "differs from the current variable type" error. Is there a simple way to convert the database datetime into a String variable? Any help from the community would be appreciated,

Read the article

How to delete rows based on comparison from Data Flow Task in an SSIS?

- by vikasde

I have a DataFlow task with two OLE DB Source objects. This is the SQL I want to achieve using SSIS: Insert into server2.db.dbo.[table2] (...) Select col1, col2, col3 ... from Server1.db.dbo.[table1] where [table1.col1] not in (Select col5 from server2.db.dbo.[table2] Where ...) I am pretty new to SSIS and not sure how to achieve this. I thought I could do this using the Data Flow task and populating the first source with the data from server1.db.dbo.table1 and the second source with server2.db.dbo.[table2] and then do the conditional check before inserting it into server2.db.dbo.[table2]. I am not sure how to do the conditional check though. Any help is appreciated.

Read the article

How to handle added input data colums without having to maintain multiple versions of SSIS packages?

- by GLFelger

I’m writing to solicit ideas for a solution to an upcoming problem. The product that provides data to our ETL process currently has multiple versions. Our clients are all using some version of the product, but not all use the same version and they will not all be upgraded at the same time. As new versions of the product are rolled out, the most common change is to add new data columns. Columns being dropped or renamed may happen occasionally, but our main focus right now is how to handle new columns being added. The problem that we want to address is how to handle the data for clients who use an older version of the product. If we don’t account for the new columns in our SSIS packages, then the data in those columns for clients using an older product version will not be processed. What we want to avoid is having to maintain a separate version of the SSIS packages for each version of the product. Has anyone successfully implemented a solution for this situation?

Read the article

Within SSIS - Is it possible to deploy one package multiple times in the same instance and set diffe

- by Matt

In my environment my Dev and QA Database Instances are on the same server. I would like to deploy the same package (or different versions of the package) into SSIS and set the filter to select different rows in the Config table. Is this possible? This is SQL 2005. For the sake of this question lets say I have one variable, which is a directory path. I would like to have these variables in the table twice (with different Filters applied (Dev and QA) as below (simplified) . . . Filter / Variable Value / Variable Name Dev / c:\data\dev / FilePath QA / c:\data\qa / FilePath Do I need to apply a change within the settings of the package in SSIS or is it changed on the job step within Agent? Any help would be appreciated.

Read the article

CreationName for SSIS 2008 and adding components programmatically

If you are building SSIS 2008 packages programmatically and adding data flow components, you will probably need to know the creation name of the component to add. I can never find a handy reference when I need one, hence this rather mundane post. See also CreationName for SSS 2005. We start with a very simple snippet for adding a component: // Add the Data Flow Task package.Executables.Add("STOCK:PipelineTask"); // Get the task host wrapper, and the Data Flow task TaskHost taskHost = package.Executables[0] as TaskHost; MainPipe dataFlowTask = (MainPipe)taskHost.InnerObject; // Add OLE-DB source component - ** This is where we need the creation name ** IDTSComponentMetaData90 componentSource = dataFlowTask.ComponentMetaDataCollection.New(); componentSource.Name = "OLEDBSource"; componentSource.ComponentClassID = "DTSAdapter.OLEDBSource.2"; So as you can see the creation name for a OLE-DB Source is DTSAdapter.OLEDBSource.2. CreationName Reference ADO NET Destination Microsoft.SqlServer.Dts.Pipeline.ADONETDestination, Microsoft.SqlServer.ADONETDest, Version=10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91 ADO NET Source Microsoft.SqlServer.Dts.Pipeline.DataReaderSourceAdapter, Microsoft.SqlServer.ADONETSrc, Version=10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91 Aggregate DTSTransform.Aggregate.2 Audit DTSTransform.Lineage.2 Cache Transform DTSTransform.Cache.1 Character Map DTSTransform.CharacterMap.2 Checksum Konesans.Dts.Pipeline.ChecksumTransform.ChecksumTransform, Konesans.Dts.Pipeline.ChecksumTransform, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b2ab4a111192992b Conditional Split DTSTransform.ConditionalSplit.2 Copy Column DTSTransform.CopyMap.2 Data Conversion DTSTransform.DataConvert.2 Data Mining Model Training MSMDPP.PXPipelineProcessDM.2 Data Mining Query MSMDPP.PXPipelineDMQuery.2 DataReader Destination Microsoft.SqlServer.Dts.Pipeline.DataReaderDestinationAdapter, Microsoft.SqlServer.DataReaderDest, Version=10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91 Derived Column DTSTransform.DerivedColumn.2 Dimension Processing MSMDPP.PXPipelineProcessDimension.2 Excel Destination DTSAdapter.ExcelDestination.2 Excel Source DTSAdapter.ExcelSource.2 Export Column TxFileExtractor.Extractor.2 Flat File Destination DTSAdapter.FlatFileDestination.2 Flat File Source DTSAdapter.FlatFileSource.2 Fuzzy Grouping DTSTransform.GroupDups.2 Fuzzy Lookup DTSTransform.BestMatch.2 Import Column TxFileInserter.Inserter.2 Lookup DTSTransform.Lookup.2 Merge DTSTransform.Merge.2 Merge Join DTSTransform.MergeJoin.2 Multicast DTSTransform.Multicast.2 OLE DB Command DTSTransform.OLEDBCommand.2 OLE DB Destination DTSAdapter.OLEDBDestination.2 OLE DB Source DTSAdapter.OLEDBSource.2 Partition Processing MSMDPP.PXPipelineProcessPartition.2 Percentage Sampling DTSTransform.PctSampling.2 Performance Counters Source DataCollectorTransform.TxPerfCounters.1 Pivot DTSTransform.Pivot.2 Raw File Destination DTSAdapter.RawDestination.2 Raw File Source DTSAdapter.RawSource.2 Recordset Destination DTSAdapter.RecordsetDestination.2 RegexClean Konesans.Dts.Pipeline.RegexClean.RegexClean, Konesans.Dts.Pipeline.RegexClean, Version=2.0.0.0, Culture=neutral, PublicKeyToken=d1abe77e8a21353e Row Count DTSTransform.RowCount.2 Row Count Plus Konesans.Dts.Pipeline.RowCountPlusTransform.RowCountPlusTransform, Konesans.Dts.Pipeline.RowCountPlusTransform, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b2ab4a111192992b Row Number Konesans.Dts.Pipeline.RowNumberTransform.RowNumberTransform, Konesans.Dts.Pipeline.RowNumberTransform, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b2ab4a111192992b Row Sampling DTSTransform.RowSampling.2 Script Component Microsoft.SqlServer.Dts.Pipeline.ScriptComponentHost, Microsoft.SqlServer.TxScript, Version=10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91 Slowly Changing Dimension DTSTransform.SCD.2 Sort DTSTransform.Sort.2 SQL Server Compact Destination Microsoft.SqlServer.Dts.Pipeline.SqlCEDestinationAdapter, Microsoft.SqlServer.SqlCEDest, Version=10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91 SQL Server Destination DTSAdapter.SQLServerDestination.2 Term Extraction DTSTransform.TermExtraction.2 Term Lookup DTSTransform.TermLookup.2 Trash Destination Konesans.Dts.Pipeline.TrashDestination.Trash, Konesans.Dts.Pipeline.TrashDestination, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b8351fe7752642cc TxTopQueries DataCollectorTransform.TxTopQueries.1 Union All DTSTransform.UnionAll.2 Unpivot DTSTransform.UnPivot.2 XML Source Microsoft.SqlServer.Dts.Pipeline.XmlSourceAdapter, Microsoft.SqlServer.XmlSrc, Version=10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91 Here is a simple console program that can be used to enumerate the pipeline components installed on your machine, and dumps out a list of all components like that above. You will need to add a reference to the Microsoft.SQLServer.ManagedDTS assembly. using System; using System.Diagnostics; using Microsoft.SqlServer.Dts.Runtime; public class Program { static void Main(string[] args) { Application application = new Application(); PipelineComponentInfos componentInfos = application.PipelineComponentInfos; foreach (PipelineComponentInfo componentInfo in componentInfos) { Debug.WriteLine(componentInfo.Name + "\t" + componentInfo.CreationName); } Console.Read(); } }

Read the article

How do I tweak columns in a Flat File Destination in SSIS?

- by theog

I have an OLE DB Data source and a Flat File Destination in the Data Flow of my SSIS Project. The goal is simply to pump data into a text file, and it does that. Where I'm having problems is with the formatting. I need to be able to rtrim() a couple of columns to remove trailing spaces, and I have a couple more that need their leading zeros preserved. The current process is losing all the leading zeros. The rtrim() can be done by simple truncation and ignoring the truncation errors, but that's very inelegant and error prone. I'd like to find a better way, like actually doing the rtrim() function where needed. Exploring similar SSIS questions & answers on SO, the thing to do seems to be "Use a Script Task", but that's ususally just thrown out there with no details, and it's not at all an intuitive thing to set up. I don't see how to use scripting to do what I need. Do I use a Script Task on the Control Flow, or a Script Component in the Data Flow? Can I do rtrim() and pad strings where needed in a script? Anybody got an example of doing this or similar things? Many thanks in advance.

Read the article

Investigation: Can different combinations of components effect Dataflow performance?

- by jamiet

Introduction The Dataflow task is one of the core components (if not the core component) of SQL Server Integration Services (SSIS) and often the most misunderstood. This is not surprising, its an incredibly complicated beast and we’re abstracted away from that complexity via some boxes that go yellow red or green and that have some lines drawn between them. Example dataflow In this blog post I intend to look under that facade and get into some of the nuts and bolts of the Dataflow Task by investigating how the decisions we make when building our packages can affect performance. I will do this by comparing the performance of three dataflows that all have the same input, all produce the same output, but which all operate slightly differently by way of having different transformation components. I also want to use this blog post to challenge a common held opinion that I see perpetuated over and over again on the SSIS forum. That is, that people assume adding components to a dataflow will be detrimental to overall performance. Its not surprising that people think this –it is intuitive to think that more components means more work- however this is not a view that I share. I have always been of the opinion that there are many factors affecting dataflow duration and the number of components is actually one of the less important ones; having said that I have never proven that assertion and that is one reason for this investigation. I have actually seen evidence that some people think dataflow duration is simply a function of number of rows and number of components. I’ll happily call that one out as a myth even without any investigation! The Setup I have a 2GB datafile which is a list of 4731904 (~4.7million) customer records with various attributes against them and it contains 2 columns that I am going to use for categorisation: [YearlyIncome] [BirthDate] The data file is a SSIS raw format file which I chose to use because it is the quickest way of getting data into a dataflow and given that I am testing the transformations, not the source or destination adapters, I want to minimise external influences as much as possible. In the test I will split the customers according to month of birth (12 of those) and whether or not their yearly income is above or below 50000 (2 of those); in other words I will be splitting them into 24 discrete categories and in order to do it I shall be using different combinations of SSIS’ Conditional Split and Derived Column transformation components. The 24 datapaths that occur will each input to a rowcount component, again because this is the least resource intensive means of terminating a datapath. The test is being carried out on a Dell XPS Studio laptop with a quad core (8 logical Procs) Intel Core i7 at 1.73GHz and Samsung SSD hard drive. Its running SQL Server 2008 R2 on Windows 7. The Variables Here are the three combinations of components that I am going to test: One Conditional Split - A single Conditional Split component CSPL Split by Month of Birth and income category that will use expressions on [YearlyIncome] & [BirthDate] to send each row to one of 24 outputs. This next screenshot displays the expression logic in use: Derived Column & Conditional Split - A Derived Column component DER Income Category that adds a new column [IncomeCategory] which will contain one of two possible text values {“LessThan50000”,”GreaterThan50000”} and uses [YearlyIncome] to determine which value each row should get. A Conditional Split component CSPL Split by Month of Birth and Income Category then uses that new column in conjunction with [BirthDate] to determine which of the same 24 outputs to send each row to. Put more simply, I am separating the Conditional Split of #1 into a Derived Column and a Conditional Split. The next screenshots display the expression logic in use: DER Income Category CSPL Split by Month of Birth and Income Category Three Conditional Splits - A Conditional Split component that produces two outputs based on [YearlyIncome], one for each Income Category. Each of those outputs will go to a further Conditional Split that splits the input into 12 outputs, one for each month of birth (identical logic in each). In this case then I am separating the single Conditional Split of #1 into three Conditional Split components. The next screenshots display the expression logic in use: CSPL Split by Income Category CSPL Split by Month of Birth 1& 2 Each of these combinations will provide an input to one of the 24 rowcount components, just the same as before. For illustration here is a screenshot of the dataflow containing three Conditional Split components: As you can these dataflows have a fair bit of work to do and remember that they’re doing that work for 4.7million rows. I will execute each dataflow 10 times and use the average for comparison. I foresee three possible outcomes: The dataflow containing just one Conditional Split (i.e. #1) will be quicker There is no significant difference between any of them One of the two dataflows containing multiple transformation components will be quicker Regardless of which of those outcomes come to pass we will have learnt something and that makes this an interesting test to carry out. Note that I will be executing the dataflows using dtexec.exe rather than hitting F5 within BIDS. The Results and Analysis The table below shows all of the executions, 10 for each dataflow. It also shows the average for each along with a standard deviation. All durations are in seconds. I’m pasting a screenshot because I frankly can’t be bothered with the faffing about needed to make a presentable HTML table. It is plain to see from the average that the dataflow containing three conditional splits is significantly faster, the other two taking 43% and 52% longer respectively. This seems strange though, right? Why does the dataflow containing the most components outperform the other two by such a big margin? The answer is actually quite logical when you put some thought into it and I’ll explain that below. Before progressing, a side note. The standard deviation for the “Three Conditional Splits” dataflow is orders of magnitude smaller – indicating that performance for this dataflow can be predicted with much greater confidence too. The Explanation I refer you to the screenshot above that shows how CSPL Split by Month of Birth and salary category in the first dataflow is setup. Observe that there is a case for each combination of Month Of Date and Income Category – 24 in total. These expressions get evaluated in the order that they appear and hence if we assume that Month of Date and Income Category are uniformly distributed in the dataset we can deduce that the expected number of expression evaluations for each row is 12.5 i.e. 1 (the minimum) + 24 (the maximum) divided by 2 = 12.5. Now take a look at the screenshots for the second dataflow. We are doing one expression evaluation in DER Income Category and we have the same 24 cases in CSPL Split by Month of Birth and Income Category as we had before, only the expression differs slightly. In this case then we have 1 + 12.5 = 13.5 expected evaluations for each row – that would account for the slightly longer average execution time for this dataflow. Now onto the third dataflow, the quick one. CSPL Split by Income Category does a maximum of 2 expression evaluations thus the expected number of evaluations per row is 1.5. CSPL Split by Month of Birth 1 & CSPL Split by Month of Birth 2 both have less work to do than the previous Conditional Split components because they only have 12 cases to test for thus the expected number of expression evaluations is 6.5 There are two of them so total expected number of expression evaluations for this dataflow is 6.5 + 6.5 + 1.5 = 14.5. 14.5 is still more than 12.5 & 13.5 though so why is the third dataflow so much quicker? Simple, the conditional expressions in the first two dataflows have two boolean predicates to evaluate – one for Income Category and one for Month of Birth; the expressions in the Conditional Split in the third dataflow however only have one predicate thus they are doing a lot less work. To sum up, the difference in execution times can be attributed to the difference between: MONTH(BirthDate) == 1 && YearlyIncome <= 50000 and MONTH(BirthDate) == 1 In the first two dataflows YearlyIncome <= 50000 gets evaluated an average of 12.5 times for every row whereas in the third dataflow it is evaluated once and once only. Multiply those 11.5 extra operations by 4.7million rows and you get a significant amount of extra CPU cycles – that’s where our duration difference comes from. The Wrap-up The obvious point here is that adding new components to a dataflow isn’t necessarily going to make it go any slower, moreover you may be able to achieve significant improvements by splitting logic over multiple components rather than one. Performance tuning is all about reducing the amount of work that needs to be done and that doesn’t necessarily mean use less components, indeed sometimes you may be able to reduce workload in ways that aren’t immediately obvious as I think I have proven here. Of course there are many variables in play here and your mileage will most definitely vary. I encourage you to download the package and see if you get similar results – let me know in the comments. The package contains all three dataflows plus a fourth dataflow that will create the 2GB raw file for you (you will also need the [AdventureWorksDW2008] sample database from which to source the data); simply disable all dataflows except the one you want to test before executing the package and remember, execute using dtexec, not within BIDS. If you want to explore dataflow performance tuning in more detail then here are some links you might want to check out: Inequality joins, Asynchronous transformations and Lookups Destination Adapter Comparison Don’t turn the dataflow into a cursor SSIS Dataflow – Designing for performance (webinar) Any comments? Let me know! @Jamiet

Read the article

Presenting to the New England SQL Server Users Group 10 Jun 2010!

- by andyleonard

I am honored to present Applied SSIS Design Patterns to the New England SQL Server Users Group on 10 Jun 2010! This is a reprise of the spotlight session presented at the PASS Summit 2009. Abstract "Design Patterns" is more than a trendy buzz phrase; design patterns are a way of breaking down complex development projects into manageable tasks. They lend themselves to several development methodologies and apply to SSIS development. Chances are you're using your own design patterns now! In this spotlight...(read more)

Search Results

Search found 794 results on 32 pages for 'ssis'.

Page 15/32 | < Previous Page | 11 12 13 14 15 16 17 18 19 20 21 22 | Next Page >

- by tbrovold

- by theog

- by Dr. Zim

- by Jim Giercyk

- by blakmk

- by andyleonard

- by John Paul Cook

- by Mark

- by Nigel Rivett

- by Greg

- by thursdaysgeek

- by Adam V

- by Mark Kadlec

- by vikasde

- by GLFelger

- by Matt

- by theog

- by jamiet

- by andyleonard

< Previous Page | 11 12 13 14 15 16 17 18 19 20 21 22 | Next Page >