Scripting an initial load into PDW

gravatar

Greg Galloway

When migrating from to SQL Server Parallel Data Warehouse (also called Analytics Platform System), using Biml to generate SSIS packages that do an initial historical load is very convenient. This script shows how easy Mist 4.0 makes the Biml to accomplish this.

published 02.15.15

last updated 02.21.15


Share

Tags

  • APS
  • PDW
                            


<#@ template language="C#" hostspecific="true"#>
<#@ import namespace="System.Data" #>
	<Biml xmlns="http://schemas.varigence.com/biml.xsd">
	<Connections>
		<OleDbConnection Name="Source" ConnectionString="Provider=SQLNCLI11.1;Server=localhost;Initial Catalog=YourDB;Integrated Security=SSPI;" CreateInProject="false"/>
		<OleDbConnection Name="PdwOledb" ConnectionString="Provider=SQLNCLI11.1;Server=PDW1-CTL01,17001;Initial Catalog=YourDestinationDB;Integrated Security=SSPI;" CreateInProject="false"/>
		<SqlServerPdwConnection Name="SQLPDWConnection" CreateInProject="false" ConnectionString="Data Source=PDW1-CTL01,17001;Initial Catalog=YourDestinationDB;Integrated Security=True" StagingDatabase="" LoaderPort="8002" EncryptData="false" DelayValidation="false"/>
	</Connections>


	<Packages>
            <# 
                string metadataConnectionString = "Provider=SQLNCLI11.1;Server=PDW1-CTL01,17001;Initial Catalog=YourDestinationDB;Integrated Security=SSPI;";
				string query = @"SELECT '[' + s.name + '].[' + t.name + ']', s.name + ' ' + t.name
,'select d.* '
 +'from [' + s.name + '].[' + t.name + '] d (nolock)'
 FROM sys.tables t 
 INNER JOIN sys.schemas s on t.schema_id = s.schema_id
 ";
                DataTable tables = ExternalDataAccess.GetDataTable(metadataConnectionString, query);
                foreach (DataRow row in tables.Rows)
                { #>
		<Package Name="InitialLoad <#=row[1]#>" ConstraintMode="Linear" AutoCreateConfigurationsType="Xml" PackageSubpath="InitialLoad" Type="InitialLoadType">
			<Tasks>
				<ExecuteSQL ConnectionName="PdwOledb" Name="Truncate destination">
					<DirectInput>truncate table <#=row[0]#></DirectInput>
                </ExecuteSQL>
				<Dataflow Name="Copy Data">
					<Transformations>
						<OleDbSource Name="Retrieve Data" ConnectionName="Source">
							<DirectInput><#=row[2]#></DirectInput>
						</OleDbSource>
						<SqlServerPdwDestination Name="PDW dest" ConnectionName="SQLPDWConnection" LoadingMode="FastAppend" RollbackOnFailure="false" LocaleId="None" ValidateExternalMetadata="true">
							<ExternalTableOutput Table="<#=row[0]#>">
							</ExternalTableOutput>
						</SqlServerPdwDestination>
					</Transformations>
				</Dataflow>
			</Tasks>
		</Package>
                <# } #>
		<Package Name="InitialLoad Master" ConstraintMode="Parallel" AutoCreateConfigurationsType="None" PackageSubpath="InitialLoad" Type="InitialLoadType" MaxConcurrentExecutables="10">
			<Tasks>
				<#
				foreach (DataRow row in tables.Rows)
                { #>
				<ExecutePackage Name="InitialLoad <#=row[1]#>">
					<Package PackageName="InitialLoad <#=row[1]#>"></Package>
				</ExecutePackage>
                <# } #>
            </Tasks>
		</Package>
	</Packages>
</Biml>
                        

Prior to Mist 4.0, using Biml to automate building SSIS packages that target PDW was very complex. For example, see http://www.varigence.com/Forums?threadID=2406.

With Mist 4.0 and the convenient SqlServerPdwDestination tag, Biml for PDW is very straightforward.

The above Biml script has been tested on Mist 4.0 targeting PDW v2 Appliance Update 3 (circa December 2014). It requires you have the PDW SSIS destination adapter that matches your version of SSIS and your appliance version (appliance update or AU). The AU3 installers can be found here.

Note that this script isn't supported in BIDS Helper. This script requires Mist 4.0 or higher. Only sources and destinations that are installed with the SSIS installer itself are supported by Biml in BIDS Helper. For example, the PDW destination must be installed after installing SSIS.

Note that sys.tables in this example is a straightforward way to get a list of SQL tables, but many times ImportDB or GetSchema may be helpful in that it retrieves much more rich information including columns and data types.

You are not authorized to comment. A verification email has been sent to your email address. Please verify your account.

Comments

gravatar

Simon6

11:10am 06.11.15

Will support for PDW come in BIDS helper?

gravatar

Bertrand Renotte

12:25pm 09.16.15

Hi Greg,

Do you know how to handle upsert loading mode and setting key columns ?

Thx.

Bertrand