Teradata Parallel Data Pump
Teradata Parallel Data Pump Reference - Teradata Developer ...
Chapter 1: Overview<br />
The <strong>Teradata</strong> T<strong>Pump</strong> Task<br />
Task Limits<br />
<strong>Teradata</strong> T<strong>Pump</strong> supports only single-row, primary index operations. Up to 2430 of these<br />
operations can be packed into a single request for network efficiency. The 2430-statement<br />
upper limit is arbitrary; the effective limit can be lower when statements carry large data<br />
parcels that would push the request past the overall 64 KB request limit, or when a statement<br />
itself is very long.<br />
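The interaction of the two limits can be sketched as simple arithmetic. This is a rough estimate only: `bytes_per_statement` is a hypothetical figure standing for the combined size of one statement and its data parcel, and any per-request protocol overhead is ignored.

```python
MAX_STATEMENTS = 2430          # statement ceiling quoted in the text
MAX_REQUEST_BYTES = 64 * 1024  # overall request limit quoted in the text


def effective_pack(bytes_per_statement: int) -> int:
    """Estimate how many statements fit in one request under both limits.

    bytes_per_statement is an assumed average size of one statement plus
    its data parcel; real requests carry overhead not modeled here.
    """
    if bytes_per_statement <= 0:
        raise ValueError("bytes_per_statement must be positive")
    by_size = MAX_REQUEST_BYTES // bytes_per_statement
    return min(MAX_STATEMENTS, by_size)
```

With small rows the statement ceiling is the binding limit; with large rows the 64 KB request limit takes over well before 2430 statements are reached.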
DML Commands<br />
DML commands appear with their associated INSERT, UPDATE, or DELETE DML<br />
statements, together with the IMPORT commands that identify data to be read from the<br />
client.<br />
<strong>Teradata</strong> T<strong>Pump</strong> DML statements support conditional apply logic similar to that of MultiLoad, in<br />
which DML statements are applied based on record field contents.<br />
Specified DML statements following a DML command apply data from one or more separate<br />
data sources. The data sources contain a record for each table row to which one or more<br />
statements apply. Each IMPORT command identifies a separate data source, and references<br />
LAYOUT and DML commands. The IMPORT command matches records of the data source<br />
to the applicable DML statement or statements by means of its APPLY clauses.<br />
The LAYOUT command defines the layout of the records of a data source, using its own<br />
parameters and a sequence of FIELD, FILLER, and TABLE commands. The DML command<br />
identifies the set of one or more DML statements that immediately follows it.<br />
Each DML statement is converted into a macro and used for the duration of the import.<br />
As <strong>Teradata</strong> T<strong>Pump</strong> reaches the end of one data source, as identified by the IMPORT<br />
command, it continues with the next IMPORT command.<br />
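The relationship between a layout, the records of a data source, and conditional apply logic can be modeled in a few lines. This is a conceptual sketch, not Teradata TPump script syntax: the fixed-width layout, the field names, and the labels `ins_group` and `del_group` are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]
ApplyClause = Tuple[str, Callable[[Record], bool]]


def parse_record(line: str, layout: List[Tuple[str, int]]) -> Record:
    """Slice a fixed-width line into named fields, in layout order."""
    record: Record = {}
    pos = 0
    for name, width in layout:
        record[name] = line[pos:pos + width].strip()
        pos += width
    return record


def route(record: Record, apply_clauses: List[ApplyClause]) -> List[str]:
    """Return the label of every DML group whose condition matches."""
    return [label for label, condition in apply_clauses if condition(record)]


# Hypothetical layout: a 1-character operation code, a 4-character id,
# and a 10-character name.
layout = [("op", 1), ("id", 4), ("name", 10)]

# Hypothetical apply conditions keyed on the contents of the "op" field.
apply_clauses: List[ApplyClause] = [
    ("ins_group", lambda r: r["op"] == "I"),
    ("del_group", lambda r: r["op"] == "D"),
]

rec = parse_record("I0042Alice     ", layout)
labels = route(rec, apply_clauses)
```

Here `route` may return more than one label, reflecting that a record can be matched to one or more DML statements.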
Upsert Feature<br />
<strong>Teradata</strong> T<strong>Pump</strong>’s upsert feature is a composite of UPDATE and INSERT functionality applied<br />
to a single row. <strong>Teradata</strong> T<strong>Pump</strong> upsert logic is similar to that used in MultiLoad, the only<br />
other load utility with this feature. The DML statements required to execute each iteration of<br />
upsert are a single UPDATE statement, followed by a single INSERT statement.<br />
With upsert, if the UPDATE fails because the target row does not exist, <strong>Teradata</strong> T<strong>Pump</strong><br />
automatically executes the INSERT statement. This capability can save considerable loading<br />
time by completing this operation in a single pass instead of two.<br />
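The update-then-insert fallback can be illustrated with a minimal sketch. It uses SQLite from the Python standard library purely as a stand-in database, with a hypothetical `accounts` table. It issues two client-side calls, so it shows only the logic of upsert, not the single-pass behavior described above.

```python
import sqlite3


def upsert(conn: sqlite3.Connection, key: int, value: int) -> str:
    """Run the UPDATE first; run the INSERT only if no row was updated."""
    cur = conn.execute(
        "UPDATE accounts SET balance = ? WHERE id = ?", (value, key)
    )
    if cur.rowcount == 0:
        # The target row does not exist, so fall back to the INSERT.
        conn.execute(
            "INSERT INTO accounts (id, balance) VALUES (?, ?)", (key, value)
        )
        return "insert"
    return "update"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")

first = upsert(conn, 1, 100)   # row is missing, so the INSERT runs
second = upsert(conn, 1, 250)  # row now exists, so the UPDATE succeeds
balance = conn.execute(
    "SELECT balance FROM accounts WHERE id = 1"
).fetchone()[0]
```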
<strong>Teradata</strong> T<strong>Pump</strong> Macros<br />
Before beginning a load, <strong>Teradata</strong> T<strong>Pump</strong> creates equivalent macros on the database, based on<br />
the actual DML statements. That is, <strong>Teradata</strong> T<strong>Pump</strong> creates one macro for every INSERT, UPDATE,<br />
DELETE, and upsert statement that follows a DML command. These<br />
macros are then executed iteratively, in place of the actual DML statement, when an import<br />
task begins, and are removed when all import tasks are complete. The use of macros in place<br />
of lengthy requests helps to minimize network and parsing overhead.<br />
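To make the idea concrete, the sketch below wraps a DML statement in a parameterized macro definition and then builds the short request that invokes it. It is an illustration under stated assumptions: the macro name `m_upd`, the parameter list, and the `accounts` table are hypothetical, and the text Teradata TPump actually generates is not given here.

```python
from typing import List, Tuple


def build_macro(name: str, params: List[Tuple[str, str]], dml: str) -> str:
    """Wrap one DML statement in CREATE MACRO text.

    params is a list of (parameter name, data type) pairs; the DML
    statement refers to them with a leading colon, such as :id.
    """
    declarations = ", ".join(f"{p} {t}" for p, t in params)
    body = dml.rstrip().rstrip(";")
    return f"CREATE MACRO {name} ({declarations}) AS ({body};);"


def build_exec(name: str, values: List[str]) -> str:
    """Build the short request that runs the macro for one row."""
    return f"EXEC {name} ({', '.join(values)});"


create_text = build_macro(
    "m_upd",  # hypothetical macro name
    [("id", "INTEGER"), ("bal", "INTEGER")],
    "UPDATE accounts SET balance = :bal WHERE id = :id;",
)
exec_text = build_exec("m_upd", ["1", "250"])
```

The per-row request carries only the macro name and the row values, which is much shorter than the full statement text.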