Search Results

Search found 1221 results on 49 pages for 'ssis pivot'.

Page 4/49 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

SSIS Prehistory video

- by jamiet

I’m currently wasting spending my Easter bank holiday putting together my presentation SSIS Dataflow Performance Tuning for the upcoming SQL Bits conference in London and in doing so I’m researching some old material about how the dataflow actually works. Boring as it is I’ve gotten easily sidelined and have chanced upon an old video on Channel 9 entitled Euan Garden - Tour of SQL Server Team (part I). Euan is a former member of the SQL Server team and in this series of videos he walks the halls of the SQL Server building on Microsoft’s Redmond campus talking to some of the various protagonists and in this one he happens upon the SQL Server Integration Services team. The video was shot in 2004 so this is a fascinating (to me anyway) glimpse into the development of SSIS from before it was ever shipped and if you’re a geek like me you’ll really enjoy this behind-the-scenes look into how and why the product was architected. The video is also notable for the presence of the cameraman – none other than the now-rather-more-famous-than-he-was-then Robert Scoble. See it at http://channel9.msdn.com/posts/TheChannel9Team/Euan-Garden-Tour-of-SQL-Server-Team-part-I/ Enjoy! @Jamiet Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
Tale of an Encrypted SSIS Package in msdb and a Lost Password

- by Argenis

Yesterday a Developer at work asked for a copy of an SSIS package in Production so he could work on it (please, dear Reader – withhold judgment on Source Control – I know!). I logged on to the SSIS instance, and when I went to export the package… Oops. I didn’t have that password. The DBA who uploaded the package to Production is long gone; my fellow DBA had no idea either - and the Devs returned a cricket sound when queried. So I posed the obligatory question on #SQLHelp and a bunch of folks jumped in – some to help and some to make fun of me (thanks, @SQLSoldier @crummel4 @maryarcia and @sqljoe). I tried their suggestions to no avail…even ran some queries to see if I could figure out how to extract the package XML from the system tables in msdb: SELECT CAST(CAST(p.packagedata AS varbinary(max)) AS varchar(max)) FROM msdb.dbo.sysssispackages p WHERE p.name = 'LePackage' This just returned a bunch of XML with encrypted data on it: I knew there was a job in SQL Agent scheduled to execute the package, and when I tried to look at details on the job step I got the following: Not very helpful. The password had to be saved somewhere, but where?? All of a sudden I remembered that there was a system table I hadn’t queried yet: SELECT sjs.command FROM msdb.dbo.sysjobs sj JOIN msdb.dbo.sysjobsteps sjs ON sj.job_id = sjs.job_id WHERE sj.name = 'Run LePackage' The result: “Well, that’s really secure”, I thought to myself. Cheers, -Argenis

Read the article
Running SSIS packages from C#

- by Piotr Rodak

Most of the developers and DBAs know about two ways of deploying packages: You can deploy them to database server and run them using SQL Server Agent job or you can deploy the packages to file system and run them using dtexec.exe utility. Both approaches have their pros and cons. However I would like to show you that there is a third way (sort of) that is often overlooked, and it can give you capabilities the ‘traditional’ approaches can’t. I have been working for a few years with applications that run packages from host applications that are implemented in .NET. As you know, SSIS provides programming model that you can use to implement more flexible solutions. SSIS applications are usually thought to be batch oriented, with fairly rigid architecture and processing model, with fixed timeframes when the packages are executed to process data. It doesn’t to be the case, you don’t have to limit yourself to batch oriented architecture. I have very good experiences with service oriented architectures processing large amounts of data. These applications are more complex than what I would like to show here, but the principle stays the same: you can execute packages as a service, on ad-hoc basis. You can also implement and schedule various signals, HTTP calls, file drops, time schedules, Tibco messages and other to run the packages. You can implement event handler that will trigger execution of SSIS when a certain event occurs in StreamInsight stream. This post is just a small example of how you can use the API and other features to create a service that can run SSIS packages on demand. I thought it might be a good idea to implement a restful service that would listen to requests and execute appropriate actions. As it turns out, it is trivial in C#. The application is implemented as console application for the ease of debugging and running. In reality, you might want to implement the application as Windows service. To begin, you have to reference namespace System.ServiceModel.Web and then add a few lines of code: Uri baseAddress = new Uri("http://localhost:8011/"); WebServiceHost svcHost = new WebServiceHost(typeof(PackRunner), baseAddress); try { svcHost.Open(); Console.WriteLine("Service is running"); Console.WriteLine("Press enter to stop the service."); Console.ReadLine(); svcHost.Close(); } catch (CommunicationException cex) { Console.WriteLine("An exception occurred: {0}", cex.Message); svcHost.Abort(); } The interesting lines are 3, 7 and 13. In line 3 you create a WebServiceHost object. In line 7 you start listening on the defined URL and then in line 13 you shut down the service. As you have noticed, the WebServiceHost constructor is accepting type of an object (here: PackRunner) that will be instantiated as singleton and subsequently used to process the requests. This is the class where you put your logic, but to tell WebServiceHost how to use it, the class must implement an interface which declares methods to be used by the host. The interface itself must be ornamented with attribute ServiceContract. [ServiceContract] public interface IPackRunner { [OperationContract] [WebGet(UriTemplate = "runpack?package={name}")] string RunPackage1(string name); [OperationContract] [WebGet(UriTemplate = "runpackwithparams?package={name}&rows={rows}")] string RunPackage2(string name, int rows); } Each method that is going to be used by WebServiceHost has to have attribute OperationContract, as well as WebGet or WebInvoke attribute. The detailed discussion of the available options is outside of scope of this post. I also recommend using more descriptive names to methods . Then, you have to provide the implementation of the interface: public class PackRunner : IPackRunner { ... There are two methods defined in this class. I think that since the full code is attached to the post, I will show only the more interesting method, the RunPackage2. /// <summary> /// Runs package and sets some of its variables. /// </summary> /// <param name="name">Name of the package</param> /// <param name="rows">Number of rows to export</param> /// <returns></returns> public string RunPackage2(string name, int rows) { try { string pkgLocation = ConfigurationManager.AppSettings["PackagePath"]; pkgLocation = Path.Combine(pkgLocation, name.Replace("\"", "")); Console.WriteLine(); Console.WriteLine("Calling package {0} with parameter {1}.", name, rows); Application app = new Application(); Package pkg = app.LoadPackage(pkgLocation, null); pkg.Variables["User::ExportRows"].Value = rows; DTSExecResult pkgResults = pkg.Execute(); Console.WriteLine(); Console.WriteLine(pkgResults.ToString()); if (pkgResults == DTSExecResult.Failure) { Console.WriteLine(); Console.WriteLine("Errors occured during execution of the package:"); foreach (DtsError er in pkg.Errors) Console.WriteLine("{0}: {1}", er.ErrorCode, er.Description); Console.WriteLine(); return "Errors occured during execution. Contact your support."; } Console.WriteLine(); Console.WriteLine(); return "OK"; } catch (Exception ex) { Console.WriteLine(ex); return ex.ToString(); } } The method accepts package name and number of rows to export. The packages are deployed to the file system. The path to the packages is configured in the application configuration file. This way, you can implement multiple services on the same machine, provided you also configure the URL for each instance appropriately. To run a package, you have to reference Microsoft.SqlServer.Dts.Runtime namespace. This namespace is implemented in Microsoft.SQLServer.ManagedDTS.dll which in my case was installed in the folder “C:\Program Files (x86)\Microsoft SQL Server\100\SDK\Assemblies”. Once you have done it, you can create an instance of Microsoft.SqlServer.Dts.Runtime.Application as in line 18 in the above snippet. It may be a good idea to create the Application object in the constructor of the PackRunner class, to avoid necessity of recreating it each time the service is invoked. Then, in line 19 you see that an instance of Microsoft.SqlServer.Dts.Runtime.Package is created. The method LoadPackage in its simplest form just takes package file name as the first parameter. Before you run the package, you can set its variables to certain values. This is a great way of configuring your packages without all the hassle with dtsConfig files. In the above code sample, variable “User:ExportRows” is set to value of the parameter “rows” of the method. Eventually, you execute the package. The method doesn’t throw exceptions, you have to test the result of execution yourself. If the execution wasn’t successful, you can examine collection of errors exposed by the package. These are the familiar errors you often see during development and debugging of the package. I you run the package from the code, you have opportunity to persist them or log them using your favourite logging framework. The package itself is very simple; it connects to my AdventureWorks database and saves number of rows specified in variable “User::ExportRows” to a file. You should know that before you run the package, you can change its connection strings, logging, events and many more. I attach solution with the test service, as well as a project with two test packages. To test the service, you have to run it and wait for the message saying that the host is started. Then, just type (or copy and paste) the below command to your browser. http://localhost:8011/runpackwithparams?package=%22ExportEmployees.dtsx%22&rows=12 When everything works fine, and you modified the package to point to your AdventureWorks database, you should see "OK” wrapped in xml: I stopped the database service to simulate invalid connection string situation. The output of the request is different now: And the service console window shows more information: As you see, implementing service oriented ETL framework is not a very difficult task. You have ability to configure the packages before you run them, you can implement logging that is consistent with the rest of your system. In application I have worked with we also have resource monitoring and execution control. We don’t allow to run more than certain number of packages to run simultaneously. This ensures we don’t strain the server and we use memory and CPUs efficiently. The attached zip file contains two projects. One is the package runner. It has to be executed with administrative privileges as it registers HTTP namespace. The other project contains two simple packages. This is really a cool thing, you should check it out!

Read the article
What is SSIS order of data transformation component method calls

- by Ron Ruble

I am working on a custom data transformation component. I'm using NUnit and NMock2 to test as I code. Testing and getting the custom UI and other features right is a huge pain, in part because I can't find any documentation about the order in which SSIS invokes methods on the component at design time as well as runtime. I can correct the issues readily enough, but it's tedious and time consuming to unregister the old version, register the new version, fire up the test ssis package, try to display the UI, get an obscure error message, backtrace it, modify the component and continue. One of the big issues involves the UI component needing access to the componentmetadata and buffermanager properties of the component at design time, and what I need to provide for to support properties that won't be initialized until after the user enters them in the UI. I can work through it; but if someone knows of some docs or tips that would speed me up, I'd greatly appreciate it. The samples I've found havn't been much use; they seem to be directed to showing off cool stuff (Twitter, weather.com) rather than actual work. Thanks in advance.

Read the article
SSIS Design Pattern: Loading Variable-Length Rows

- by andyleonard

Introduction I encounter flat file sources with variable-length rows on occassion. Here, I supply one SSIS Design Pattern for loading them. What's a Variable-Length Row Flat File? Great question - let's start with a definition. A variable-length row flat file is a text source of some flavor - comma-separated values (CSV), tab-delimited file (TDF), or even fixed-length, positional-, or ordinal-based (where the location of the data on the row defines its field). The major difference between a "normal"...(read more)

Read the article
Getting Dynamic in SSIS Queries

- by ejohnson2010

When you start working with SQL Server and SSIS, it isn’t long before you find yourself wishing you could change bits of SQL queries dynamically. Most commonly, I see people that want to change the date portion of a query so that you can limit your query to the last 30 days, for example. This can be done using a combination of expressions and variables. I will do this in two parts, first I will build a variable that will always contain the 1 st day of the previous month and then I will dynamically...(read more)

Read the article
SQLAuthority News – Interesting Whitepaper – We Loaded 1TB in 30 Minutes with SSIS, and So Can You

- by pinaldave

In February 2008, Microsoft announced a record-breaking data load using Microsoft SQL Server Integration Services (SSIS): 1 TB of data in less than 30 minutes. That data load, using SQL Server Integration Services, was 30% faster than the previous best time using a commercial ETL tool. This paper outlines what it took: the software, hardware, [...]

Read the article
SSIS and Parallelism: The Unseen Minions

Sometimes, a procedural database process cannot easily be reduced to a set-based algorithm in order to reduce the time it takes. Then, you have to find other ways to parallelise it. Other ways? Josef shows how to use SSIS to drastically reduce the time that such a process takes.

Read the article
March 2012 - SSIS Training in London!

- by andyleonard

I am honored to announce I will be delivering From Zero To SSIS! in London, England 5-9 Mar 2012. This course is delivered in cooperation with my friends at TechniTrain who provide awesome training by talented technologists like Chris Webb ( Blog ), Gavin Payne ( Blog ), and Christian Bolton ( Blog ). This opportunity grew out of conversations at SQLBits 9 in Liverpool in September 2011. I had an awesome time at SQLBits and encourage everyone to attend the conference if you have the opportunity to...(read more)

Read the article
SSIS Dashboard 0.5.2 and Live Demo Website

- by Davide Mauri

In the last days I’ve worked again on the SQL Server Integration Service Dashboard and I did some updates: Beta Added support for "*" wildcard in project names. Now you can filter a specific project name using an url like: http://<yourserver>/project/MyPro* Added initial support for Package Execution History. Just click on a package name and you'll see its latest 15 executions and I’ve also created a live demo website for all those who want to give it a try before downloading and using it: http://ssis-dashboard.azurewebsites.net/

Read the article
SSIS Basics: Using the Merge Join Transformation

SSIS is able to take sorted data from more than one OLE DB data source and merge them into one table which can then be sent to an OLE DB destination. This 'Merge Join' transformation works in a similar way to a SQL join by specifying a 'join key' relationship. this transformation can save a great deal of processing on the destination. Annette Allen, as usual, gives clear guidance on how to do it.

Read the article
SSIS Lookup component only returns matching rows when using partial mode

- by Myat Htut

I am attempting to use a lookup transformation in my data transformation package and all of the other lookup transformations went well but one component returns the matching rows only when I enable the partial cache mode. If I use the full cache mode, all data is routed to the error path. I am using SQL 2005 SSIS. Any help appreciated..

Read the article
Can Sql Server 2005 Pivot table have nText passed into it?

- by manemawanna

Right bit of a simple question can I input nText into a pivot table? (SQL Server 2005) What I have is a table which records the answers to a questionnaire consisting of the following elements for example: UserID QuestionNumber Answer Mic 1 Yes Mic 2 No Mic 3 Yes Ste 1 Yes Ste 2 No Ste 3 Yes Bob 1 Yes Bob 2 No Bob 3 Yes With the answers being held in nText. Anyway what id like a Pivot table to do is: UserID 1 2 3 Mic Yes No Yes Ste Yes No Yes Bob Yes No Yes I have some test code, that creates a pivot table but at the moment it just shows the number of answers in each column (code can be found below). So I just want to know is it possible to add nText to a pivot table? As when I've tried it brings up errors and someone stated on another site that it wasn't possible, so I would like to check if this is the case or not. Just for further reference I don't have the opportunity to change the database as it's linked to other systems that I haven't created or have access too. Heres the SQL code I have at present below: DECLARE @query NVARCHAR(4000) DECLARE @count INT DECLARE @concatcolumns NVARCHAR(4000) SET @count = 1 SET @concatcolumns = '' WHILE (@count <=52) BEGIN IF @COUNT > 1 AND @COUNT <=52 SET @concatcolumns = (@concatcolumns + ' + ') SET @concatcolumns = (@concatcolumns + 'CAST ([' + CAST(@count AS NVARCHAR) + '] AS NVARCHAR)') SET @count = (@count+1) END DECLARE @columns NVARCHAR(4000) SET @count = 1 SET @columns = '' WHILE (@count <=52) BEGIN IF @COUNT > 1 AND @COUNT <=52 SET @columns = (@columns + ',') SET @columns = (@columns + '[' + CAST(@count AS NVARCHAR) + '] ') SET @count = (@count+1) END SET @query = ' SELECT UserID, ' + @concatcolumns + ' FROM( SELECT UserID, QuestionNumber AS qNum from QuestionnaireAnswers where QuestionnaireID = 7 ) AS t PIVOT ( COUNT (qNum) FOR qNum IN (' + @columns + ') ) AS PivotTable' select @query exec(@query)

Read the article
Enforce SSIS naming conventions using BI-xPress

- by jamiet

A long long long time ago (in 2006 in fact) I published a blog post entitled Suggested Best Practises and naming conventions in which I suggested a bunch of acronyms that folks could use to prefix object names in their SSIS packages, thus allowing easier identification of those objects in log records, here is a sample of some of those suggestions: If you have adopted these naming conventions (and I am led to believe that a bunch of people have) then you might like to know that you can now check for adherence to these conventions using a tool called BI-xPress from Pragmatic Works. BI-xPress includes a feature called the Best Practices Analyzer that scans your packages and assess them according to some rules that you specify. In addition Pragmatic Works have made available a collection of these rules that adhere to the naming conventions I specified in 2006 You can download this collection however I recommend you first read the accompanying article that demonstrates the capabilities of the Best Practices Analyzer. Pretty cool stuff. @Jamiet

Read the article
SSIS - The expression cannot be parsed because it contains invalid elements at the location specifie

- by simonsabin

If you get the following error when trying to write an expression there is an easy solution Attempt to parse the expression "@[User::FilePath] + "\" + @[User::FileName] + ".raw"" failed. The token "." at line number "1", character number "<some position>" was not recognized. The expression cannot be parsed because it contains invalid elements at the location specified. The SSIS expression language is a C based language and the \ is a token, this means you have to escape it with another one. i.e "\" becomes "\\", unlike C# you can't prefix the string with a @, you have to use the escaping route. In summary when ever you want to use \ you need to use two \\

Read the article
How hard it will be for the programmer to learn MS SSRS adn SSIS [closed]

- by user75380

I have a programming background in php/python/java for 5 years and I know MySQL and PostgreSQL. Currently in our company the MSQL Business Intelligence person is leaving his job in 4 months. I am thinking of trying to go to his place, at least try as I want to move in Business Intelligence field in SSRS and SSIS. I just want to know that is it possible for me to get my head around those things in 4 months because I have no idea how they work and how hard it will be for me to pick up those things. Can I do that? I just want to know from experienced people if I can move towards that field? At least how should I start? In my area there are shortages of person, so once I know the stuff I can get into junior jobs easily but I want to know from experienced people.

Read the article
SSIS is Case-Sensitive

- by andyleonard

Introduction SSIS is case-sensitive even if the database is case-insensitive. Imagine... ... you work in an ETL shop where someone who believes in natural keys won the Battle of the Joins. Imagine one of your natural keys is a string. (I know it's a stretch... play along!). Let's build some tables to sketch it out. If you do not have a TestDB database, why not? Build one! You'll use it often. Use TestDB go Create Table SSIS1 ( StrID char ( 5 ) , Name varchar ( 15 ) , Value int ) Insert Into SSIS1...(read more)

Read the article
SQL Server 2008 - Script Data as Insert Statements from SSIS Package

- by Brandon King

SQL Server 2008 provides the ability to script data as Insert statements using the Generate Scripts option in Management Studio. Is it possible to access the same functionality from within a SSIS package? Here's what I'm trying to accomplish... I have a scheduled job that nightly scripts out all the schema and data for a SQL Server 2008 database and then uses the script to create a "mirror copy" SQLCE 3.5 database. I've been using Narayana Vyas Kondreddi's sp_generate_inserts stored procedure to accomplish this, but it has problems with some datatypes and greater-than-4K columns (holdovers from SQL Server 2000 days). The Script Data function looks like it could solve my problems, if only I could automate it. Any suggestions?

Read the article
SSIS - XML Source Script

- by simonsabin

The XML Source in SSIS is great if you have a 1 to 1 mapping between entity and table. You can do more complex mapping but it becomes very messy and won't perform. What other options do you have? The challenge with XML processing is to not need a huge amount of memory. I remember using the early versions of Biztalk with loaded the whole document into memory to map from one document type to another. This was fine for small documents but was an absolute killer for large documents. You therefore need a streaming approach. For flexibility however you want to be able to generate your rows easily, and if you've ever used the XmlReader you will know its ugly code to write. That brings me on to LINQ. The is an implementation of LINQ over XML which is really nice. You can write nice LINQ queries instead of the XMLReader stuff. The downside is that by default LINQ to XML requires a whole XML document to work with. No streaming. Your code would look like this. We create an XDocument and then enumerate over a set of annoymous types we generate from our LINQ statement XDocument x = XDocument.Load("C:\\TEMP\\CustomerOrders-Attribute.xml"); foreach (var xdata in (from customer in x.Elements("OrderInterface").Elements("Customer") from order in customer.Elements("Orders").Elements("Order") select new { Account = customer.Attribute("AccountNumber").Value , OrderDate = order.Attribute("OrderDate").Value } )) { Output0Buffer.AddRow(); Output0Buffer.AccountNumber = xdata.Account; Output0Buffer.OrderDate = Convert.ToDateTime(xdata.OrderDate); } As I said the downside to this is that you are loading the whole document into memory. I did some googling and came across some helpful videos from a nice UK DPE Mike Taulty http://www.microsoft.com/uk/msdn/screencasts/screencast/289/LINQ-to-XML-Streaming-In-Large-Documents.aspx. Which show you how you can combine LINQ and the XmlReader to get a semi streaming approach. I took what he did and implemented it in SSIS. What I found odd was that when I ran it I got different numbers between using the streamed and non streamed versions. I found the cause was a little bug in Mikes code that causes the pointer in the XmlReader to progress past the start of the element and thus foreach (var xdata in (from customer in StreamReader("C:\\TEMP\\CustomerOrders-Attribute.xml","Customer") from order in customer.Elements("Orders").Elements("Order") select new { Account = customer.Attribute("AccountNumber").Value , OrderDate = order.Attribute("OrderDate").Value } )) { Output0Buffer.AddRow(); Output0Buffer.AccountNumber = xdata.Account; Output0Buffer.OrderDate = Convert.ToDateTime(xdata.OrderDate); } These look very similiar and they are the key element is the method we are calling, StreamReader. This method is what gives us streaming, what it does is return a enumerable list of elements, because of the way that LINQ works this results in the data being streamed in. static IEnumerable<XElement> StreamReader(String filename, string elementName) { using (XmlReader xr = XmlReader.Create(filename)) { xr.MoveToContent(); while (xr.Read()) //Reads the first element { while (xr.NodeType == XmlNodeType.Element && xr.Name == elementName) { XElement node = (XElement)XElement.ReadFrom(xr); yield return node; } } xr.Close(); } } This code is specifically designed to return a list of the elements with a specific name. The first Read reads the root element and then the inner while loop checks to see if the current element is the type we want. If not we do the xr.Read() again until we find the element type we want. We then use the neat function XElement.ReadFrom to read an element and all its sub elements into an XElement. This is what is returned and can be consumed by the LINQ statement. Essentially once one element has been read we need to check if we are still on the same element type and name (the inner loop) This was Mikes mistake, if we called .Read again we would advance the XmlReader beyond the start of the Element and so the ReadFrom method wouldn't work. So with the code above you can use what ever LINQ statement you like to flatten your XML into the rowsets you want. You could even have multiple outputs and generate your own surrogate keys.

Read the article
SSIS - Range lookups

- by Repieter

When developing an ETL solution in SSIS we sometimes need to do range lookups in SSIS. Several solutions for this can be found on the internet, but now we have built another solution which I would like to share, since it's pretty easy to implement and the performance is fast. You can download the sample package to see how it works. Make sure you have the AdventureWorks2008R2 and AdventureWorksDW2008R2 databases installed. (Apologies for the layout of this blog, I don't do this too often :)) To give a little bit more information about the example, this is basically what is does: we load a facttable and do an SCD type 2 lookup operation of the Product dimension. This is done with a script component. First we query the Data warehouse to create the lookup dataset. The query that is used for that is: SELECT [ProductKey] ,[ProductAlternateKey] ,[StartDate] ,ISNULL([EndDate], '9999-01-01') AS EndDate FROM [DimProduct] The output of this query is stored in a DataTable: string lookupQuery = @" SELECT [ProductKey] ,[ProductAlternateKey] ,[StartDate] ,ISNULL([EndDate], '9999-01-01') AS EndDate FROM [DimProduct]"; OleDbCommand oleDbCommand = new OleDbCommand(lookupQuery, _oleDbConnection); OleDbDataAdapter adapter = new OleDbDataAdapter(oleDbCommand); _dataTable = new DataTable(); adapter.Fill(_dataTable); Now that the dimension data is stored in the DataTable we use the following method to do the actual lookup: public int RangeLookup(string businessKey, DateTime lookupDate) { // set default return value (Unknown) int result = -1; DataRow[] filteredRows; filteredRows = _dataTable.Select(string.Format("ProductAlternateKey = '{0}'", businessKey)); for (int i = 0; i < filteredRows.Length; i++) { // check if the lookupdate is found between the startdate and enddate of any of the records if (lookupDate >= (DateTime)filteredRows[i][2] && lookupDate < (DateTime)filteredRows[i][3]) { result = (filteredRows[i][0] == null) ? -1 : (int)filteredRows[i][0]; break; } } filteredRows = null; return result; } This method is executed for every row that passes the script component. This is implemented in the ProcessInputRow method public override void Input0_ProcessInputRow(Input0Buffer Row) { // Perform the lookup operation on the current row and put the value in the Surrogate Key Attribute Row.ProductKey = RangeLookup(Row.ProductNumber, Row.OrderDate); } Now what actually happens?! 1. Every record passes the business key and the orderdate to the RangeLookup method. 2. The DataTable is then filtered on the business key of the current record. The output is stored in a DataRow [] object. 3. We loop over the DataRow[] object to see where the orderdate meets the following expression: (lookupDate >= (DateTime)filteredRows[i][2] && lookupDate < (DateTime)filteredRows[i][3]) 4. When the expression returns true (so where the data is between the Startdate and the EndDate), the surrogate key of the dimension record is returned We have done some testing with this solution and it works great for us. Hope others can use this example to do their range lookups.

Read the article
Why I don't use SSIS checkpoint files

- by jamiet

In a recent discussion in regard to general ETL best practises the subject of checkpoint files as a means for package restartability came up and I stated that I was dead against using them. For anyone that may care, here is why: Configuring them is distinctly unintuitive (that's a matter of opinion but if you follow the link I'll wager that you will agree) they don't make any allowance for loop iterations they cannot store variables of type Object they are limited in ability. There are many scenarios where you may want to execute certain containers regardless of whether the package is started from a checkpoint file but the current usage model does not allow for this. they are ignored by eventhandlers, which wouldn't be so bad if there were a way to toggle this behaviour in certain scenarios they dont work properly I'll expand on the last bullet point. I have encountered situations where the behaviour for tasks executing concurrently is unpredictable. That is, sometimes the completion of a task that executes concurrently with a failed/failing task will make it into the checkpoint file and sometimes it won't. This is near-impossible to reproduce but it does happen as my good friend John Welch will hopefully concur (if he is reading). Is anyone out there making successful use of checkpoint files within SSIS? I would be interested in knowing about that if so. @Jamiet

Read the article
Execute a SSIS package in Sync or Async mode from SQL Server 2012

- by Davide Mauri

Today I had to schedule a package stored in the shiny new SSIS Catalog store that can be enabled with SQL Server 2012. (http://msdn.microsoft.com/en-us/library/hh479588(v=SQL.110).aspx) Once your packages are stored here, they will be executed using the new stored procedures created for this purpose. This is the script that will get executed if you try to execute your packages right from management studio or through a SQL Server Agent job, will be similar to the following: Declare @execution_id bigint EXEC [SSISDB].[catalog].[create_execution] @package_name='my_package.dtsx', @execution_id=@execution_id OUTPUT, @folder_name=N'BI', @project_name=N'DWH', @use32bitruntime=False, @reference_id=Null Select @execution_id DECLARE @var0 smallint = 1 EXEC [SSISDB].[catalog].[set_execution_parameter_value] @execution_id, @object_type=50, @parameter_name=N'LOGGING_LEVEL', @parameter_value=@var0 DECLARE @var1 bit = 0 EXEC [SSISDB].[catalog].[set_execution_parameter_value] @execution_id, @object_type=50, @parameter_name=N'DUMP_ON_ERROR', @parameter_value=@var1 EXEC [SSISDB].[catalog].[start_execution] @execution_id GO The problem here is that the procedure will simply start the execution of the package and will return as soon as the package as been started…thus giving you the opportunity to execute packages asynchrously from your T-SQL code. This is just *great*, but what happens if I what to execute a package and WAIT for it to finish (and thus having a synchronous execution of it)? You have to be sure that you add the “SYNCHRONIZED” parameter to the package execution. Before the start_execution procedure: exec [SSISDB].[catalog].[set_execution_parameter_value] @execution_id, @object_type=50, @parameter_name=N'SYNCHRONIZED', @parameter_value=1 And that’s it . PS From the RC0, the SYNCHRONIZED parameter is automatically added each time you schedule a package execution through the SQL Server Agent. If you’re using an external scheduler, just keep this post in mind .

Read the article
Inequality joins, Asynchronous transformations and Lookups : SSIS

- by jamiet

It is pretty much accepted by SQL Server Integration Services (SSIS) developers that synchronous transformations are generally quicker than asynchronous transformations (for a description of synchronous and asynchronous transformations go read Asynchronous and synchronous data flow components). Notice I said “generally” and not “always”; there are circumstances where using asynchronous transformations can be beneficial and in this blog post I’ll demonstrate such a scenario, one that is pretty common when building data warehouses. Imagine I have a [Customer] dimension table that manages information about all of my customers as a slowly-changing dimension. If that is a type 2 slowly changing dimension then you will likely have multiple rows per customer in that table. Furthermore you might also have datetime fields that indicate the effective time period of each member record. Here is such a table that contains data for four dimension members {Terry, Max, Henry, Horace}: Notice that we have multiple records per customer and that the [SCDStartDate] of a record is equivalent to the [SCDEndDate] of the record that preceded it (if there was one). (Note that I am on record as saying I am not a fan of this technique of storing an [SCDEndDate] but for the purposes of clarity I have included it here.) Anyway, the idea here is that we will have some incoming data containing [CustomerName] & [EffectiveDate] and we need to use those values to lookup [Customer].[CustomerId]. The logic will be: Lookup a [CustomerId] WHERE [CustomerName]=[CustomerName] AND [SCDStartDate] <= [EffectiveDate] AND [EffectiveDate] <= [SCDEndDate] The conventional approach to this would be to use a full cached lookup but that isn’t an option here because we are using inequality conditions. The obvious next step then is to use a non-cached lookup which enables us to change the SQL statement to use inequality operators: Let’s take a look at the dataflow: Notice these are all synchronous components. This approach works just fine however it does have the limitation that it has to issue a SQL statement against your lookup set for every row thus we can expect the execution time of our dataflow to increase linearly in line with the number of rows in our dataflow; that’s not good. OK, that’s the obvious method. Let’s now look at a different way of achieving this using an asynchronous Merge Join transform coupled with a Conditional Split. I’ve shown it post-execution so that I can include the row counts which help to illustrate what is going on here: Notice that there are more rows output from our Merge Join component than on the input. That is because we are joining on [CustomerName] and, as we know, we have multiple records per [CustomerName] in our lookup set. Notice also that there are two asynchronous components in here (the Sort and the Merge Join). I have embedded a video below that compares the execution times for each of these two methods. The video is just over 8minutes long. View on Vimeo For those that can’t be bothered watching the video I’ll tell you the results here. The dataflow that used the Lookup transform took 36 seconds whereas the dataflow that used the Merge Join took less than two seconds. An illustration in case it is needed: Pretty conclusive proof that in some scenarios it may be quicker to use an asynchronous component than a synchronous one. Your mileage may of course vary. The scenario outlined here is analogous to performance tuning procedural SQL that uses cursors. It is common to eliminate cursors by converting them to set-based operations and that is effectively what we have done here. Our non-cached lookup is performing a discrete operation for every single row of data, exactly like a cursor does. By eliminating this cursor-in-disguise we have dramatically sped up our dataflow. I hope all of that proves useful. You can download the package that I demonstrated in the video from my SkyDrive at http://cid-550f681dad532637.skydrive.live.com/self.aspx/Public/BlogShare/20100514/20100514%20Lookups%20and%20Merge%20Joins.zip Comments are welcome as always. @Jamiet Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
Tool to automatically combine many large pivot table in many large Excel sheet ?

- by Sim

Hi all, We have several Excel files that contains large pivot data table with the same structure For example File A Pivot table (Field A, B, C) File B Pivot table (Field A, B, C) We want to combine them into 1 pivot table (A, B, C). Just want to know which ways we can do it ? Manual way : open a new empty sheet, copy and paste the pivot there and create the pivot again Automatic way : is there some tool that did this ? Thanks

Read the article
Pivot table not refreshing sort order

- by William Anthony

I have Pivot Table that get its data source from another sheet, same workbook. I want the sort order of data is same as the data source order, I choose "Sort in data source order" in Pivot Table option. The problem is, when I change the data order on data source worksheet, then I refresh the Pivot Table, the sort order didn't change. I googled that the Pivot Table should be unlink first then re-link again in order to work properly, so I tried the following: The original data source has named range: origdata. The fake data source has names range: dummydata I changed manually data source to dummydata then changed back to origdata. The sort order did change as expected. Now I want to make the operation automated, so I'm using this code in Worksheet.activate event. Note that, PT is PivotTable instance. ... PT.SourceData = "dummydata" PT.RefreshTable PT.SourceData = "origdata" PT.RefreshTable ... Change data source from VBA didn't change the sort order just like I did with manual method. Why is that? Am I missing something? Maybe there are some routine called when I changed the data source manually via toolbar button? Thanks in advance for your help.

Read the article

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >