Working on a data warehouse project, the person who gave us the tutorial advised that we stick to SQL queries rather than defining a lot of data flow transformations, arguing that the transformations consume a lot of memory on the ETL box, so we'd rather leave the processing to the DB box. Is this really advisable? Where's the balance between relying on the GUI tools and executing a bunch of SQL scripts in your Integration Services package?
And honestly, I'd like to avoid writing SQL queries as much as I can.
In SSMS I've gotten my FOR XML PATH query written and it's beautiful.
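Just to give the flavor (the real query is much bigger, and the table and column names here are made up), it's roughly this shape:

SELECT o.OrderId AS '@Id',
       o.CustomerName,
       o.OrderDate
FROM dbo.Orders o
FOR XML PATH('Order'), ROOT('Orders')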
I put it in an "Execute SQL Task" in the Control Flow, set the resultset to XML.
Now how do I get the results into an actual XML file that I can turn around and FTP to a third party?
This should have been so easy! I would put the XML into a variable, but we are looking at a HUGE file, possibly 100 MB+.
Do I need to use a Script Task? (I'd like to avoid that if there is another option.)
I've created two simple, yet very useful, scripts to extract some useful data for quickly monitoring SSIS package executions in SQL Server 2012 and later:

- get-ssis-execution-status
- get-ssis-data-pumped-rows

I've started to use Gist, since it comes in very handy for these "quick'n'dirty" scripts and snippets, and you can find the above scripts and others there (hopefully the number will increase over time... I plan to use Gist to store all the code snippets I used to keep in a dedicated folder on my machine).

Now, back to the aforementioned scripts. The first one ("get-ssis-execution-status") returns a list of all executed and executing packages along with:

- the latest successful and running executions (so that one can have an idea of the expected run time)
- error messages
- warning messages related to duplicate rows found in lookups

The second one ("get-ssis-data-pumped-rows") returns information on Data Flow status. Here there's something interesting, IMHO. Nothing exceptional, let it be clear, but nonetheless useful: the script extracts information on destinations and rows sent to destinations right from the messages produced by the Data Flow components. This helps to quickly understand how many rows have been sent and where... without having to increase the logging level.

Enjoy!

PS: I haven't tested them with SQL Server 2014, but AFAIK they should work without problems. Of course, any feedback on this is welcome.
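To give just a taste here, without opening the gist: this is not the full scripts, only a minimal sketch of the kind of queries they build on, straight against the SSISDB catalog views (simplified, and the message_type filter covers errors and warnings only):

-- Latest executions, most recent first
SELECT e.execution_id, e.folder_name, e.project_name, e.package_name,
       e.status, e.start_time, e.end_time
FROM SSISDB.catalog.executions AS e
ORDER BY e.start_time DESC;

-- Error and warning messages produced during those executions
SELECT em.operation_id, em.message_time, em.message_source_name, em.message
FROM SSISDB.catalog.event_messages AS em
WHERE em.message_type IN (120, 110)   -- 120 = error, 110 = warning
ORDER BY em.message_time DESC;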
I right clicked on a Database in the object explorer of SQL Server 2008 Management Studio. I went to Tasks > Import Data, and imported some data from a flat text file, opting to save the package on the server.
Now how the heck do I get to the package to edit or run it again? Where in SQL Server Management Studio do I go? I've expanded everything and I can't find it. It's driving me nuts.
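In case it helps narrow down where to look, the only way I've found so far to even confirm the package was stored is to query msdb directly (I'm assuming these are the right table names for SQL 2008):

SELECT f.foldername, p.[name], p.[description], p.createdate
FROM msdb.dbo.sysssispackages AS p
LEFT JOIN msdb.dbo.sysssispackagefolders AS f
    ON p.folderid = f.folderid
ORDER BY p.createdate DESC;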
I have a simple package which reads data from a CSV file and loads it into a SQL table. The file is located on another server and is shared; I use a UNC path in the package. The package is scheduled using a SQL Agent job. The job worked fine for a week and then suddenly started giving the error: "The file name "\\124.0.48.173\basel2\Commercial\Input\ACBS_GSU.csv" specified in the connection was not valid. End Error Error: 2010-04-20 16:15:07.19 Code: 0xC0202070 Source: ACBS_GSU Connection manager "CSV file conection" Description: Connection "CSV file conection" failed validation."
Any help will be appreciated.
Is there any way in the OnError event handler to identify the error code and then send an email? For example, on the failure of a particular component; in my case it is the Flat File Source. If the Flat File Source fails for any reason, I want to send an email.
What I am trying to accomplish is using this Script Task to continually insert into a generated recordset. I know how to access it in the script; however, I do not know how to update it after my changes to the DataTable have been made.
Code is Below:
' Adapter used to read the ADO recordset stored in the SSIS Object variable into a DataTable
Dim EmailsToSend As New OleDb.OleDbDataAdapter
Dim EmailsToSendDt As New DataTable("EmailsToSend")
Dim CurrentEmailsToSend As New DataTable
Dim EmailsToSendRow As DataRow

' Define the columns of the new table ("System.Int32" is the valid .NET type name for an integer column)
EmailsToSendDt.Columns.Add("SiteMgrUserId", System.Type.GetType("System.Int32"))
EmailsToSendDt.Columns.Add("EmailAddress", System.Type.GetType("System.String"))
EmailsToSendDt.Columns.Add("EmailMessage", System.Type.GetType("System.String"))

' Build the single new row and add it to the table
EmailsToSendRow = EmailsToSendDt.NewRow()
EmailsToSendRow.Item("SiteMgrUserId") = siteMgrUserId
EmailsToSendRow.Item("EmailAddress") = siteMgrEmail
EmailsToSendRow.Item("EmailMessage") = EmailMessage.ToString()
EmailsToSendDt.Rows.Add(EmailsToSendRow)

' Pull the current recordset out of the package variable and merge it into the new table
EmailsToSend.Fill(CurrentEmailsToSend, Dts.Variables("EmailsToSend").Value)
EmailsToSendDt.Merge(CurrentEmailsToSend, True)
Basically my goal is to create a single row in a new DataTable, get the current recordset, and merge the two so I have my resulting DataTable. Now I just need to update the ReadWriteVariable for my script. I don't know if I have to do anything special or if I can just assign the DataTable to it directly, i.e. Dts.Variables("EmailsToSend").Value = EmailsToSendDt
Thanks in advance for the help.
Hey all,
I'm looking to extract a SharePoint list (WSS 2.0) to a SQL Server 2005 table using SQL Server Integration Services.
First off, I am aware of the "adapter" that does this, from http://msdn.microsoft.com/en-us/library/dd365137.aspx; however, for compatibility purposes I'm just wondering whether it can be done "out of the box".
There are only a limited number of "Data Flow Sources" to select as alternatives, and I am unsure whether any of these would be able to work in a similar way, either directly against SharePoint or via SharePoint's web services (e.g. http://server_name/_vti_bin/Lists.asmx). From the list of these sources it looks like the best option would be the OLE DB connector, but I'm not sure how it would do this.
Any help you have would be great,
Mark
I'm trying to debug some legacy Integration Services code, and really want some confirmation on what I think the problem is:
We have a very large data task inside a control flow container. This control flow container is set up with TransactionOption = supported - i.e. it will 'inherit' transactions from parent containers, but none are set up here.
Inside the data flow there is a call to a stored proc that writes to a table with pseudo code something like:
"If a record doesn't exist that matches these parameters then write it"
Now, the issue is that there are three records being passed into this proc all with the same parameters, so logically the first record doesn't find a match and a record is created. The second record (with the same parameters) also doesn't find a match and another record is created.
My understanding is that the first 'record' passed to the proc in the dataflow is uncommitted and therefore can't be 'read' by the second call. The upshot being that all three records create a row, when logically only the first should.
In this scenario am I right in thinking that it is the uncommitted transaction that stops the second call from seeing the first? Even setting the isolation level on the container doesn't help because it's not being wrapped in a transaction anyway....
Hope that makes sense, and any advice gratefully received. Work-arounds confer god-like status on you.
My Send Mail Task works fine for email IDs like [email protected], but it throws an error for email IDs like [email protected].
Is there any way I can make it work for such IDs also?
Thanks.
I have an ID in a package variable that I need to add as a column (with each row having that package variable value) in a Data Flow.
Is there a way to do this with only the Derived Column? I know I can use the Derived Column to make a new column and then set the value using a Script Component, but that seems inefficient.
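In plain T-SQL terms, what I'm after for every row is the equivalent of this (the names and the literal value are just illustrative):

DECLARE @BatchId INT = 42;  -- stands in for the value held in the package variable
SELECT s.*, @BatchId AS BatchId
FROM dbo.SourceTable AS s;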
I know how to configure logging for individual packages through BIDS. But the drawback I see here is that I have to add a connection string for each task, and when I have to deploy these packages to the server I have to change the log file connection string for all packages. Currently I have 32 packages and this seems time-consuming.
Is there any way where I can set up logging for all packages in one place?
Three years ago I created a catalog using InDesign's data merge feature and third-party software called EasyCatalog. It wasn't a great success. Has anyone used data merge in InDesign or catalog-building software to produce a print catalog?
I want to use the Foreach container to iterate through a folder matching something like "Filename_MMYYYY.xls". That's easy enough to do, but I can't seem to find a way to parse the MMYYYY from the filename and add it to a variable (or something) that I can use as a lookup field for my DimDate table. It seems possible with a flat file data source, but not an Excel connection. I'm using Visual Studio 2005. Please help!
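To be clear about the parsing I'm after, in T-SQL it would just be this (the example filename is made up):

DECLARE @FileName NVARCHAR(100) = 'Filename_122010.xls';
SELECT SUBSTRING(@FileName, CHARINDEX('_', @FileName) + 1, 6) AS MMYYYY;  -- returns '122010'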
In one SQL Task, can I create a table variable?
DECLARE @TableVar TABLE (...)
Then, in another SQL Task or data source destination, can I select from or insert into the table variable?
The other option I have considered is using a Temp Table.
CREATE TABLE #TempTable (...)
I would prefer to use the table variable so that it remains in memory, but I can use a temp table if the table variable is not possible. Also, I cannot use the Recordset Destination, as I need to perform straight SQL tasks on the data later on.
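For reference, the pattern I'm hoping to get working across two Execute SQL Tasks would look roughly like this (the column list is just a placeholder):

-- SQL Task 1
CREATE TABLE #TempTable (Id INT, SomeValue NVARCHAR(50));

-- SQL Task 2, later in the control flow
INSERT INTO #TempTable (Id, SomeValue)
SELECT Id, SomeValue
FROM dbo.SourceTable;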
In a Data Flow, I have a Derived Column task. For one of the columns, I have the following expression:
[siteid] == "100" ? "1101" : [siteid] == "110" ? "1001" : [siteid] == "120" ? "2101" : [siteid] == "140" ? "1102" : [siteid] == "210" ? "2001" : [siteid] == "310" ? "3001" : [siteid]
This works just fine. However, I intend to reuse this in at least a dozen other places, so I want to store it in a variable and use the variable in the Derived Column instead of the hard-coded expression. When I attempt to create a variable using the expression above, I get a syntax error saying 'siteid' is not defined. I guess this makes sense, because it isn't. But how can I get the expression to work by using a variable? It seems like I need some way to tell it that 'siteid' will be the column containing the data I want to apply the expression to.
I am using SQL Server 2005 Business Intelligence Studio and struggling with returning an integer value from a very simple execute SQL Task.
For a very simple test, I wrote the SQL Statement as:
Select 35 As 'TotalRecords'
Then, I specified ResultSet as
ResultName = TotalRecords and
VariableName = User::TotalRecords
When I execute this, the statement runs but the variable doesn't have the updated value; it still has the default value that I specified in the variable definition.
Returning a date variable works, but the integer variable isn't working. The type of User::TotalRecords is Int32, with package scope.
Thanks for any hints
Hi
I am trying to generate a CSV file from a DB query as the source. One of my columns has datatype nvarchar(50), with values like "01050007029604301001".
After the export, when the CSV file is viewed using Excel, the value appears as "1.0500E18".
How can I stop this?
Please suggest.
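The only workaround I can think of so far (not sure it's the right approach) is to force Excel to treat the value as text by wrapping it in the source query itself, along these lines (the table and column names are just examples):

SELECT '="' + AccountNumber + '"' AS AccountNumber
FROM dbo.Accounts;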
I have a Script Component which does transformations, data type conversions, and creates some calculated columns. All of the transform validations, data type conversion methods, and new column generation are placed in a custom .dll.
As this Script Component will be the same for all other tables, the only thing left to do is define the input/output columns and apply the validation methods to the required columns.
This all works fine. On the production server, where do I need to deploy my .dll?
Would just putting it into the GAC be enough, or do I need to do something else?
Regards
I have a flat file source from Excel that has a structure like this:
**People** Day1 Day2 Day3 Day4
Person1 someValue ...
Person2
Person3
And I would like the package to put this information in a database with standard columns 'Person', 'Day', 'Value'. Does anybody know how to do this? At the moment, because the days are going along the top, the package is assuming these are separate data columns when they are not really, and the mapping is not working.
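If the data were already sitting in a staging table with that layout, the shape I'm after is basically what a T-SQL UNPIVOT would produce (the table and column names are illustrative):

SELECT Person, DayName AS [Day], DayValue AS [Value]
FROM dbo.StagingSheet
UNPIVOT (DayValue FOR DayName IN (Day1, Day2, Day3, Day4)) AS u;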
I am using ADSI Edit to look at LDAP properties of a single user account in AD. I see properties such as userPrincipalName, but I do not see one for the fully qualified domain name (FQDN) or the netbios domain name.
We will be setting up the Global Catalog (GC) to give us LDAP access to multiple domains and through configuration in an application we map LDAP properties to user profile properties within the application. With typical AD the FQDN and netbios domain name are the same for all users, but with the GC involved we need this additional information. We really only need the netbios domain name (the FQDN is not good enough).
Maybe there is an LDAP query that can be done to request this information from a more top-level object in AD?