SSIS: Deploying OLAP cubes using C# script tasks and AMO
- by DrJohn
As part of the continuing series on Building dynamic OLAP data marts on-the-fly, this blog entry will focus on how to automate the deployment of OLAP cubes using SQL Server Integration Services (SSIS) and Analysis Services Management Objects (AMO).
OLAP cube deployment is usually done using the Analysis Services Deployment Wizard. However, this option was dismissed for a variety of reasons. Firstly, invoking external processes from SSIS is fraught with problems as (a) it is not always possible to ensure SSIS waits for the external program to terminate; (b) we cannot log the outcome properly and (c) it is not always possible to control the server's configuration to ensure the executable works correctly. Another reason for rejecting the Deployment Wizard is that it requires the 'answers' to be written into four XML files. These XML files record the three things we need to change: the name of the server, the name of the OLAP database and the connection string to the data mart. Although it would be reasonably straight forward to change the content of the XML files programmatically, this adds another set of complication and level of obscurity to the overall process.
When I first investigated the possibility of using C# to deploy a cube, I was surprised to find that there are no other blog entries about the topic. I can only assume everyone else is happy with the Deployment Wizard!
SSIS "forgets" assembly references
If you build your script task from scratch, you will have to remember how to overcome one of the major annoyances of working with SSIS script tasks: the forgetful nature of SSIS when it comes to assembly references. Basically, you can go through the process of adding an assembly reference using the Add Reference dialog, but when you close the script window, SSIS "forgets" the assembly reference so the script will not compile. After repeating the operation several times, you will find that SSIS only remembers the assembly reference when you specifically press the Save All icon in the script window. This problem is not unique to the AMO assembly and has certainly been a "feature" since SQL Server 2005, so I am not amazed it is still present in SQL Server 2008 R2!
Sample Package
So let's take a look at the sample SSIS package I have provided which can be downloaded from here: DeployOlapCubeExample.zip Below is a screenshot after a successful run.
Connection Managers
The package has three connection managers:
AsDatabaseDefinitionFile is a file connection manager pointing to the .asdatabase file you wish to deploy. Note that this can be found in the bin directory of you OLAP database project once you have clicked the "Build" button in Visual Studio
TargetOlapServerCS is an Analysis Services connection manager which identifies both the deployment server and the target database name.
SourceDataMart is an OLEDB connection manager pointing to the data mart which is to act as the source of data for your cube. This will be used to replace the connection string found in your .asdatabase file
Once you have configured the connection managers, the sample should run and deploy your OLAP database in a few seconds. Of course, in a production environment, these connection managers would be associated with package configurations or set at runtime.
When you run the sample, you should see that the script logs its activity to the output screen (see screenshot above). If you configure logging for the package, then these messages will also appear in your SSIS logging.
Sample Code Walkthrough
Next let's walk through the code. The first step is to parse the connection string provided by the TargetOlapServerCS connection manager and obtain the name of both the target OLAP server and also the name of the OLAP database. Note that the target database does not have to exist to be referenced in an AS connection manager, so I am using this as a convenient way to define both properties.
We now connect to the server and check for the existence of the OLAP database. If it exists, we drop the database so we can re-deploy.
svr.Connect(olapServerName);
if (svr.Connected)
{
// Drop the OLAP database if it already exists
Database db = svr.Databases.FindByName(olapDatabaseName);
if (db != null)
{
db.Drop();
}
// rest of script
}
Next we start building the XMLA command that will actually perform the deployment. Basically this is a small chuck of XML which we need to wrap around the large .asdatabase file generated by the Visual Studio build process.
// Start generating the main part of the XMLA command
XmlDocument xmlaCommand = new
XmlDocument();
xmlaCommand.LoadXml(string.Format("<Batch Transaction='false' xmlns='http://schemas.microsoft.com/analysisservices/2003/engine'><Alter AllowCreate='true' ObjectExpansion='ExpandFull'><Object><DatabaseID>{0}</DatabaseID></Object><ObjectDefinition/></Alter></Batch>", olapDatabaseName));
Next we need to merge two XML files which we can do by simply using setting the InnerXml property of the ObjectDefinition node as follows:
// load OLAP Database definition from .asdatabase file identified by connection manager
XmlDocument olapCubeDef = new
XmlDocument();
olapCubeDef.Load(Dts.Connections["AsDatabaseDefinitionFile"].ConnectionString);
// merge the two XML files by obtain a reference to the ObjectDefinition node
oaRootNode.InnerXml = olapCubeDef.InnerXml;
One hurdle I had to overcome was removing detritus from the .asdabase file left by the Visual Studio build. Through an iterative process, I found I needed to remove several nodes as they caused the deployment to fail. The XMLA error message read "Cannot set read-only node: CreatedTimestamp" or similar. In comparing the XMLA generated with by the Deployment Wizard with that generated by my code, these read-only nodes were missing, so clearly I just needed to strip them out. This was easily achieved using XPath to find the relevant XML nodes, of which I show one example below:
foreach (XmlNode node in rootNode.SelectNodes("//ns1:CreatedTimestamp", nsManager))
{
node.ParentNode.RemoveChild(node);
}
Now we need to change the database name in both the ID and Name nodes using code such as:
XmlNode databaseID = xmlaCommand.SelectSingleNode("//ns1:Database/ns1:ID", nsManager);
if (databaseID != null)
databaseID.InnerText = olapDatabaseName;
Finally we need to change the connection string to point at the relevant data mart. Again this is easily achieved using XPath to search for the relevant nodes and then replace the content of the node with the new name or connection string.
XmlNode connectionStringNode = xmlaCommand.SelectSingleNode("//ns1:DataSources/ns1:DataSource/ns1:ConnectionString", nsManager);
if (connectionStringNode != null)
{
connectionStringNode.InnerText = Dts.Connections["SourceDataMart"].ConnectionString;
}
Finally we need to perform the deployment using the Execute XMLA command and check the returned XmlaResultCollection for errors before setting the Dts.TaskResult.
XmlaResultCollection oResults = svr.Execute(xmlaCommand.InnerXml);
// check for errors during deployment
foreach (Microsoft.AnalysisServices.XmlaResult oResult in oResults)
{
foreach (Microsoft.AnalysisServices.XmlaMessage oMessage in oResult.Messages)
{
if ((oMessage.GetType().Name == "XmlaError"))
{
FireError(oMessage.Description);
HadError = true;
}
}
}
If you are not familiar with XML programming, all this may all seem a bit daunting, but perceiver as the sample code is pretty short.
If you would like the script to process the OLAP database, simply uncomment the lines in the vicinity of Process method. Of course, you can extend the script to perform your own custom processing and to even synchronize the database to a front-end server. Personally, I like to keep the deployment and processing separate as the code can become overly complex for support staff.If you want to know more, come see my session at the forthcoming SQLBits conference.