Posts tagged: XML

Creating an XML output file from SSIS with XSLT

By , March 29, 2012

A frequent question on a number of forums is how to export data to an XML file with SSIS.  There is even a Connect request for an SSIS: XML Destination Adapter, which has been marked as Closed – won’t fix. For awhile I’ve been using a script task to generate XML output based on the example that  Denis Gobo (blog | twitter) blogged about.

Last fall/winter I took the free on-line Introduction to Databases course offered by Stanford University, where part of the course touched on XML, including a section on XSLT.  The course provided a good overview and pointed me in the direction of trying to utilize XSLT within SSIS.  According to the website the course may be formally offered later in 2012 – I would highly recommend taking it.  It was an awesome experience.

For this example, I create an XML output file using SSIS.  The formatted XML data is retrieved from a stored procedure. There are a few modifications that need to be made to the retrieved XML to create the desired output result:

  • Add the XML Version <?xml version=”1.0″ ?>
  • Rename  the <ROOT> tag to <CustomerData>
  • Specify the name space for the XSD used for validation.

Getting all the pieces in place

First, I created the XML Schema that would be used to validate the XML.  It’s located online and is also show here:

<?xml version="1.0" encoding="utf-8" ?>
<!--Created with Liquid XML Studio 2012 Designer Edition 10.0.2.3955 (http://www.liquid-technologies.com)-->
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="CustomerData">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="Customer" minOccurs="0" maxOccurs="unbounded">
                    <xs:complexType>
                        <xs:sequence>
                            <xs:element name="CustomerID" type="xs:unsignedShort" />
                            <xs:element name="FirstName" type="xs:string" />
                            <xs:element name="MiddleName" type="xs:string" minOccurs="0" maxOccurs="1" />
                            <xs:element name="LastName" type="xs:string" />
                            <xs:element name="Phone" type="xs:string" minOccurs="0" />
                            <xs:element name="EmailAddress" type="xs:string" minOccurs="0" />
                            <xs:element name="CustomerAddresses" minOccurs="0" maxOccurs="unbounded">
                                <xs:complexType>
                                    <xs:sequence>
                                        <xs:element name="AddressType" type="xs:string" />
                                        <xs:element name="AddressLine1" type="xs:string" />
                                        <xs:element name="AddressLine2" minOccurs="0" maxOccurs="1" />
                                        <xs:element name="City" type="xs:string" />
                                        <xs:element name="StateProvinceName" type="xs:string" />
                                        <xs:element name="PostalCode" type="xs:unsignedInt" />
                                        <xs:element name="CountryRegionName" type="xs:string" />
                                    </xs:sequence>
                                </xs:complexType>
                            </xs:element>
                        </xs:sequence>
                    </xs:complexType>
                </xs:element>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>

Next, I created a simple stored procedure that is used to query the AdventureWorks database and produce a result set in XML with name and address information for a small subset of customers. The customer IDs for four customers are hardcoded in the query to reduce the size of the sample XML.

USE AdventureWorks
GO
/*
** Stored Procedure to create sample XML data from
** AdventureWorks
*/
CREATE PROCEDURE Sales.getCustomerAddresses
AS

BEGIN
       SELECT   i. CustomerID
              , c. Title
              , c. FirstName
              , c. MiddleName
              , c. LastName
              , c. Suffix
              , c. Phone
              , c. EmailAddress
              , (
                 SELECT at. Name AS AddressType
                      , a. AddressLine1
                      , a. AddressLine2
                      , a. City
                      , StateProvinceName = sp.Name
                      , a. PostalCode
                      , CountryRegionName = cr.Name
                 FROM   Sales. CustomerAddress ca
                        INNER JOIN Person.Address a
                            ON a. AddressID = ca .AddressID
                        INNER JOIN Person.StateProvince sp
                            ON sp. StateProvinceID = a.StateProvinceID
                        INNER JOIN Person.CountryRegion cr
                            ON cr. CountryRegionCode = sp.CountryRegionCode
                        INNER JOIN Person.AddressType at
                            ON ca. AddressTypeID = at.AddressTypeID
                 WHERE  ca. CustomerID = i .CustomerID
                FOR
                 XML PATH ('CustomerAddresses')
                   , TYPE
                )
       FROM     Sales. Individual i
                INNER JOIN Person.Contact c
                    ON c. ContactID = i .ContactID
       WHERE    i. CustomerID IN (11532, 11279, 13145 , 13311)
      FOR
       XML PATH ('Customer')
         , TYPE

END
GO

Executing the stored procedure in SSMS:

EXEC Sales.getCustomerAddresses

Returns this result set:

<Customer>
  <CustomerID>11279</CustomerID>
  <FirstName>Amanda</FirstName>
  <MiddleName>S</MiddleName>
  <LastName>Cook</LastName>
  <Phone>252-555-0177</Phone>
  <EmailAddress>amanda3@adventure-works.com</EmailAddress>
  <CustomerAddresses>
    <AddressType>Shipping</AddressType>
    <AddressLine1>4098 Woodcrest Dr.</AddressLine1>
    <City>Everett</City>
    <StateProvinceName>Washington</StateProvinceName>
    <PostalCode>98201</PostalCode>
    <CountryRegionName>United States</CountryRegionName>
  </CustomerAddresses>
  <CustomerAddresses>
    <AddressType>Home</AddressType>
    <AddressLine1>9187 Vista Del Sol</AddressLine1>
    <City>Everett</City>
    <StateProvinceName>Washington</StateProvinceName>
    <PostalCode>98201</PostalCode>
    <CountryRegionName>United States</CountryRegionName>
  </CustomerAddresses>
</Customer>
<Customer>
  <CustomerID>11532</CustomerID>
  <FirstName>Lauren</FirstName>
  <LastName>Miller</LastName>
  <Phone>827-555-0183</Phone>
  <EmailAddress>lauren24@adventure-works.com</EmailAddress>
  <CustomerAddresses>
    <AddressType>Shipping</AddressType>
    <AddressLine1>6202 Seeno St.</AddressLine1>
    <City>Sammamish</City>
    <StateProvinceName>Washington</StateProvinceName>
    <PostalCode>98074</PostalCode>
    <CountryRegionName>United States</CountryRegionName>
  </CustomerAddresses>
  <CustomerAddresses>
    <AddressType>Home</AddressType>
    <AddressLine1>6437 Brookview Dr.</AddressLine1>
    <City>Redmond</City>
    <StateProvinceName>Washington</StateProvinceName>
    <PostalCode>98052</PostalCode>
    <CountryRegionName>United States</CountryRegionName>
  </CustomerAddresses>
</Customer>
<Customer>
  <CustomerID>13145</CustomerID>
  <FirstName>Gilbert</FirstName>
  <LastName>Xu</LastName>
  <Phone>798-555-0118</Phone>
  <EmailAddress>gilbert9@adventure-works.com</EmailAddress>
  <CustomerAddresses>
    <AddressType>Shipping</AddressType>
    <AddressLine1>1010 Maple</AddressLine1>
    <City>Baltimore</City>
    <StateProvinceName>Maryland</StateProvinceName>
    <PostalCode>21201</PostalCode>
    <CountryRegionName>United States</CountryRegionName>
  </CustomerAddresses>
  <CustomerAddresses>
    <AddressType>Home</AddressType>
    <AddressLine1>7779 Merry Drive</AddressLine1>
    <City>Cheektowaga</City>
    <StateProvinceName>New York</StateProvinceName>
    <PostalCode>14227</PostalCode>
    <CountryRegionName>United States</CountryRegionName>
  </CustomerAddresses>
</Customer>
<Customer>
  <CustomerID>13311</CustomerID>
  <FirstName>Isabella</FirstName>
  <LastName>Reed</LastName>
  <Phone>754-555-0118</Phone>
  <EmailAddress>isabella85@adventure-works.com</EmailAddress>
  <CustomerAddresses>
    <AddressType>Shipping</AddressType>
    <AddressLine1>5297 Algiers Drive</AddressLine1>
    <City>Renton</City>
    <StateProvinceName>Washington</StateProvinceName>
    <PostalCode>98055</PostalCode>
    <CountryRegionName>United States</CountryRegionName>
  </CustomerAddresses>
  <CustomerAddresses>
    <AddressType>Home</AddressType>
    <AddressLine1>6351 22nd Ave.</AddressLine1>
    <City>Renton</City>
    <StateProvinceName>Washington</StateProvinceName>
    <PostalCode>98055</PostalCode>
    <CountryRegionName>United States</CountryRegionName>
  </CustomerAddresses>
</Customer>

Now, the we’ve got the XSD, and the stored procedure completed we can build the SSIS Package to execute the procedure and create the XML output file.

Building the SSIS Package:

First, create an ADO.NET Connection to the AdventureWorks database and name the connection AdventureWorks.

Declare a variable with a data type of sting.  This will be used to hold the returned result set from the stored procedures:

DeclaredVariable

Next we add an Execute SQL Task to call our stored procedure

ExecuteSQL

Setting up the Execute SQL Task.

On the General settings tab, specify the connection information for our ADO.NET connection, the stored procedure to be executed, and that the ResultSet is XML:

ExecuteSQLGeneralTab

On the Result Set tab, we are going to assign the result set returned from the stored procedure to the User::XMLResult variable defined earlier. It’s also important that you change the Result Name from the default of “NewResultName” to Zero (0), otherwise when executing the package you’ll receive an error

ExecuteSQLResultSetTab

Setting up the XML Task

Next add an XML Task to the Project

XMLTask

On the General settings tab this is how the default properties initially display:

XMLTaskGeneralDefault

What we need to do first is change the Input Operation Type from Diff to XSLT.  This will expose the properties needed for using an XSLT.

XMLTaskGeneralXSLT1

Now make the following changes.

  • The SourceType, which is the XML we want to transform is a variable.  The variable, User::XMLResult, holds the output from the stored procedure executed in the previous Execute SQL Task
  • The Output Specifies where we want to save the results of the transformation. In our case, we want to save the results to a file so we select a File Connection.
  • When presented with the <New File connection…> option in the dropdown, specify a new file for the output:

NewFileConnection1

And fill out the dialog to create a new file

NewFileConnectionDialog

The Second Operand, is where we specify the XSLT to be applied against the Input Source.  For this example, we’re using direct input.  If this package was to be deployed to a production system, I would store the XSLT in a file rather than using direct input.

When presented with the XML Source Editor, paste the XSLT shown below into the source editor:

XMLSourceEditor

The completed XML Task – General settings tab.

XMLTaskGeneralXSLT2

XSLT

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml"
            indent="yes"
            omit-xml-declaration="no" />

<xsl:template match="*|@*|text()">
   <xsl:copy>
      <xsl:apply-templates select="*|@*|text()" />
   </xsl:copy>
</xsl:template>

<xsl:template match="ROOT">
	<CustomerData xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://www.jaybonk.com/schemas/CustomerData.xsd">
     <xsl:apply-templates/>
  </CustomerData>
</xsl:template>

</xsl:stylesheet>

What the XSLT does…

  • The <xsl:stylesheet> tag specifies that our output should be rendered as XML
  • The first <xsl:template> matches everything in the source document; all elements (*), attributes (@*) and text (text()).  The remainder of the tag specifies that the document is to be copied and any additionally defined templates are recursively applied
  • The second <xsl:template>, looks for any elements matching “ROOT”, and rewrites the opening and closing tags as “CustomerData”, with the opening tag additionally containing the required namespaces.
  • Within the second template, the <xsl:apply-templates/> signifies that all children of the current node are processed.  In our case copying them to the output document.

Full disclosure, I’m a novice when it comes to working with XSLT and I was able to get this example working with just some rudimentary knowledge.  I do see XML, XPATH, XQuery, and XSLT as very powerful, flexible, and extremely useful.  I am looking forward to learning and working with them much more.

The resulting output file, after the XSLT has been applied:

<?xml version="1.0" encoding="utf-8"?>
<CustomerData xsi:noNamespaceSchemaLocation="http://www.jaybonk.com/schemas/CustomerData.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Customer>
    <CustomerID>11279</CustomerID>
    <FirstName>Amanda</FirstName>
    <MiddleName>S</MiddleName>
    <LastName>Cook</LastName>
    <Phone>252-555-0177</Phone>
    <EmailAddress>amanda3@adventure-works.com</EmailAddress>
    <CustomerAddresses>
      <AddressType>Shipping</AddressType>
      <AddressLine1>4098 Woodcrest Dr.</AddressLine1>
      <City>Everett</City>
      <StateProvinceName>Washington</StateProvinceName>
      <PostalCode>98201</PostalCode>
      <CountryRegionName>United States</CountryRegionName>
    </CustomerAddresses>
    <CustomerAddresses>
      <AddressType>Home</AddressType>
      <AddressLine1>9187 Vista Del Sol</AddressLine1>
      <City>Everett</City>
      <StateProvinceName>Washington</StateProvinceName>
      <PostalCode>98201</PostalCode>
      <CountryRegionName>United States</CountryRegionName>
    </CustomerAddresses>
  </Customer>
  <Customer>
    <CustomerID>11532</CustomerID>
    <FirstName>Lauren</FirstName>
    <LastName>Miller</LastName>
    <Phone>827-555-0183</Phone>
    <EmailAddress>lauren24@adventure-works.com</EmailAddress>
    <CustomerAddresses>
      <AddressType>Shipping</AddressType>
      <AddressLine1>6202 Seeno St.</AddressLine1>
      <City>Sammamish</City>
      <StateProvinceName>Washington</StateProvinceName>
      <PostalCode>98074</PostalCode>
      <CountryRegionName>United States</CountryRegionName>
    </CustomerAddresses>
    <CustomerAddresses>
      <AddressType>Home</AddressType>
      <AddressLine1>6437 Brookview Dr.</AddressLine1>
      <City>Redmond</City>
      <StateProvinceName>Washington</StateProvinceName>
      <PostalCode>98052</PostalCode>
      <CountryRegionName>United States</CountryRegionName>
    </CustomerAddresses>
  </Customer>
  <Customer>
    <CustomerID>13145</CustomerID>
    <FirstName>Gilbert</FirstName>
    <LastName>Xu</LastName>
    <Phone>798-555-0118</Phone>
    <EmailAddress>gilbert9@adventure-works.com</EmailAddress>
    <CustomerAddresses>
      <AddressType>Shipping</AddressType>
      <AddressLine1>1010 Maple</AddressLine1>
      <City>Baltimore</City>
      <StateProvinceName>Maryland</StateProvinceName>
      <PostalCode>21201</PostalCode>
      <CountryRegionName>United States</CountryRegionName>
    </CustomerAddresses>
    <CustomerAddresses>
      <AddressType>Home</AddressType>
      <AddressLine1>7779 Merry Drive</AddressLine1>
      <City>Cheektowaga</City>
      <StateProvinceName>New York</StateProvinceName>
      <PostalCode>14227</PostalCode>
      <CountryRegionName>United States</CountryRegionName>
    </CustomerAddresses>
  </Customer>
  <Customer>
    <CustomerID>13311</CustomerID>
    <FirstName>Isabella</FirstName>
    <LastName>Reed</LastName>
    <Phone>754-555-0118</Phone>
    <EmailAddress>isabella85@adventure-works.com</EmailAddress>
    <CustomerAddresses>
      <AddressType>Shipping</AddressType>
      <AddressLine1>5297 Algiers Drive</AddressLine1>
      <City>Renton</City>
      <StateProvinceName>Washington</StateProvinceName>
      <PostalCode>98055</PostalCode>
      <CountryRegionName>United States</CountryRegionName>
    </CustomerAddresses>
    <CustomerAddresses>
      <AddressType>Home</AddressType>
      <AddressLine1>6351 22nd Ave.</AddressLine1>
      <City>Renton</City>
      <StateProvinceName>Washington</StateProvinceName>
      <PostalCode>98055</PostalCode>
      <CountryRegionName>United States</CountryRegionName>
    </CustomerAddresses>
  </Customer>
</CustomerData>

So where did the <ROOT> tag come from?

The XSLT, applies changes to the “ROOT” element.  But if you look at the XML as returned from the stored procedure there is no <ROOT> element, the first element returned is the first <Customer> element.  The Execute SQL task wraps the result set in a <ROOT></ROOT> node as shown below.  I added a script task to display the User::XMLResult variable in a message box.

TheRootTag

OfficeFolders theme by Themocracy