Professional Microsoft SQL Server 2014 Integration Services (Wrox Programmer to Programmer), 1st Edition (2014)
Appendix A. SSIS Crib Notes
In this appendix you find a list of the most commonly used expressions, tasks, and transforms in SSIS with a description of when to use them. Reference these tables when you have a package to build in SSIS, and you are not sure which SSIS component to use to perform the needed actions.
WHEN TO USE CONTROL FLOW TASKS
TASKS |
WHEN TO USE |
Data Flow Task |
Use this task when you need to pass data from a source to a destination. The source and destination can be a flat file, an OLE DB Connection, or any other connections supported in the connection manager. |
Execute Package Task |
Use this task when you need to call another package from within a package. The package performing the call is the parent package. The called package is the child package. Information can be passed from the parent package to the child package with configurations. |
Execute Process Task |
Use this task to call an executable. The executable can be a batch file or an application. This task can call applications to perform functions on the files in SSIS, such as compressing a file. This task is commonly used to call third-party programs like compression or FTP tools. |
Execute SQL Task |
Use this task to perform any T-SQL operation. The SQL can be saved directly in the task, in a file, or in a variable. This task is commonly used to call stored procedures. |
File System Task |
Use this task to manipulate files. This task can move, rename, copy, and delete files and directories. You can also change the attributes of a file. A common use is archiving files after loading them. |
FTP Task |
Use this task to send or receive a file via the FTP protocol. You must have a valid FTP connection to perform this task. This task is commonly used to receive files from an FTP host for loading in a database. |
Message Queue Task |
Use this task to send or receive messages to a message queue. You must have a valid MSMQ connection to perform this task. |
Script Task |
Use this task to perform complex tasks that are not available in SSIS. This task allows you to leverage the .NET Framework to perform just about any task. Checking for the existence of a file is common use of this task. |
Send Mail Task |
Use this task to send e-mail via SMTP. You must have a valid SMTP server connection to use this task. You can use this task to send notification of the package information to recipients. You can also send files via the attachments on the e-mail. |
Web Service Task |
Use this task to call a web service. You need a valid web service URL to perform this task. |
XML Task |
Use this task to perform XML functions. This task can perform common XML tasks such as Diff, used to compare two XML files and find the differences. |
WHEN TO USE DATA FLOW TRANSFORMS
TRANSFORMS |
WHEN TO USE |
Aggregate |
Use this transform to perform grouping and summing of data. This is similar to the Group By function in T-SQL. |
Audit |
Use this transform to add a column to a Data Flow with package information. You can add items like the package name and user name as a new column in the Data Flow. |
Conditional Split |
Use this transform to divide data into different paths based on a Boolean expression. You can use all the paths from the split or ignore some outputs. This transform equates to a CASE statement in T-SQL. |
Copy Column |
Use this transform to create a new column in the Data Flow that is an exact copy of another column. |
Data Conversion |
Use this transform to convert data from one data type to another. For example, you can change Unicode to non-Unicode or change a string to an integer. |
Derived Column |
This is the most important transform. Use this transform to create or replace a column in the Data Flow with a column created by an expression. You can combine columns or use functions like getdate to create new data. |
Export Column |
Use this transform to send a column in a Data Flow to a file. The data types can be DT_TEXT, DT_NTEXT, and DT_IMAGE. |
Fuzzy Grouping |
Use this transform to group data together based on a percentage match. In this transform the data does not have to be an exact match to be grouped together. You can control the percentage of matching needed to group the data. |
Fuzzy Lookup |
Use this transform to find matching data in a table. The data does not have to match exactly. You can control the percentage of matching needed to group the data. |
Import Column |
Use this transform to import data from files into rows in a data set. |
Lookup |
Use this transform to compare data in a Data Flow to a table. This will find exact matches in the date and give you a match and no-match output from the transform. This transform equates to an INNER JOIN statement in T-SQL. |
Merge |
Use this transform to combine two sets of data in a way similar to a Union All. This transform requires both inputs to be sorted. |
Merge Join |
Use this transform to combine two sets of data in a way similar to a left outer join. This transform requires both inputs to be sorted. |
Multicast |
Use this transform to clone the data set and send it to different locations. This transform does not alter the data. |
OLE DB Command |
Use this transform to send T-SQL commands to a database. This can be used to insert data into a table using the T-SQL Insert command. |
Percentage Sampling |
Use this transform to select a percentage of the rows in a Data Flow. The rows are randomly selected. You can set a seed to select the same rows on every execution of the transform. The unselected rows will follow a different path in the Data Flow. |
Pivot |
Use this transform to convert normalized data to denormalized data. This transform changes the rows into columns. |
Row Count |
Use this transform to write the row count in a Data Flow to a variable. |
Row Sampling |
Use this transform to select a number of rows in the Data Flow. The number of rows is set in the transform. The unselected rows will follow a different path in the Data Flow. |
Script Component |
Use this transform to perform complex transforms that are not available in SSIS. This transform allows you to leverage the .NET Framework to perform just about any transform. |
Slowly Changing Dimension |
Use this transform to create a dimension load for a data warehouse. This is a wizard that will walk you through all the decision making in setting up a dimensional load. |
Sort |
Use this transform to order the data by a column or more than one column. This is similar to an “order by” command in T-SQL. |
Term Extraction |
Use this transform to find words in a Data Flow and create an output with the words listed and a score. |
Term Lookup |
Use this transform to compare to data in a Data Flow and determine if a word exists in the data. |
Union All |
Use this transform to combine two sets of data on top of each other. This is similar to the Union command in T-SQL. |
Unpivot |
Use this transform to convert denormalized data to normalized data. This transform changes the columns into rows. |
COMMON EXPRESSIONS AND SCRIPTS
PROBLEM |
QUICK SOLUTION |
Loop over a list of files and load each one. |
Tasks Required: Foreach Loop, Data Flow Task |
Conditionally executing tasks |
Solution: Double-click the precedence constraint and set the Evaluation property to Expression and Constraint. Type the condition that you want to evaluate in the Expression box. |
Move and rename the file at the same time. |
Tasks Required: File System Task |
Loop over an array of data in a table and perform a set of tasks for each row. |
Tasks Required: Execute SQL Task, Foreach Loop |
Perform an incremental load of data. |
Tasks Required: Two Execute SQL Tasks, Data Flow Task |
Perform a conditional update and insert. |
Components Required: Data Flow Task, Conditional Split, Lookup Transform or Merge Join, OLE DB Command Transform |
Create a file name with today’s date. |
Expression on the Flat File or File Connection Manager: |
Use a two-digit date. For example, retrieve a month in two-digit form (03 for March instead of 3). |
RIGHT("0"+(DT_WSTR,4)MONTH(Getdate()),2) |
Multiple condition if statement. In this example, the statement determines that if the ColumnName column is blank or null, it will be set to unknown. To make a Logical AND condition, use && instead of the || operator. |
ISNULL(ColumnName) || TRIM(ColumnName)== "" ? "Unknown" : ColumnName |
Return the first five characters from a zip code. |
Derived Column Transform in the Data Flow: |
Remove a given character from a string (example shows how to remove dashes from a Social Security number). |
Derived Column Transform in the Data Flow: |
Uppercase data |
Derived Column Transform in the Data Flow: |
Replace NULL with another value. |
Derived Column Transform in the Data Flow: |
Replace blanks with NULL values. |
Derived Column Transform in the Data Flow: |
Remove any non-numeric data from a column. |
Script Transform in the Data Flow Task with the code as follows: |
Convert text to proper case (first letter in each word uppercase). |
Script Transform with the line of partial code as follows (note that this code should go on one line): |