25.09.2015 Views

Teradata Parallel Data Pump

Teradata Parallel Data Pump Reference - Teradata Developer ...


Chapter 3: Teradata TPump Commands

DML

The Basic Upsert Feature

Example Upsert

When using the basic upsert feature:

• There must be exactly two DML statements in this DML group.

• The first DML statement must be an UPDATE statement that follows all of the Teradata TPump task rules.

• The second DML statement must be an INSERT statement.

• Both DML statements must refer to the same table.

• The INSERT statement, when built, must reflect the same primary index specified in the WHERE clause of the UPDATE statement. This is true for both a single-column primary index and a compound primary index.

By following these rules, a number of uses for the DO INSERT ROWS option can be found. In the past, the data either had to be presorted into INSERTs and UPDATEs, or an UPDATE had to be attempted with all the data, followed by an INSERT for each UPDATE that failed. With upsert, Teradata TPump needs only one pass of the data to UPDATE rows that need to be updated and INSERT rows that need to be inserted.
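The single-pass behavior described above can be sketched outside of Teradata TPump. The following is a minimal illustration only, assuming an in-memory SQLite database as a stand-in for the Teradata Database and a hypothetical helper named `upsert_phone`; the table and column names (Employee, EmpNo, PhoneNo) follow the example discussed later in this section. It is not Teradata TPump script syntax.

```python
import sqlite3


def upsert_phone(conn, emp_no, phone_no):
    """Hypothetical helper: attempt the UPDATE first, INSERT only if no row matched."""
    cur = conn.execute(
        "UPDATE Employee SET PhoneNo = ? WHERE EmpNo = ?", (phone_no, emp_no)
    )
    if cur.rowcount == 0:
        # The UPDATE found no row, so the INSERT half of the upsert runs.
        conn.execute(
            "INSERT INTO Employee (EmpNo, PhoneNo) VALUES (?, ?)", (emp_no, phone_no)
        )
        return "inserted"
    return "updated"


# SQLite stands in for the target database (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (EmpNo INTEGER, PhoneNo TEXT)")
conn.execute("INSERT INTO Employee VALUES (1, '555-0100')")

# One pass over the input: no presorting into INSERTs and UPDATEs is needed.
records = [(1, "555-0111"), (2, "555-0122")]
results = [upsert_phone(conn, emp_no, fone) for emp_no, fone in records]
rows = conn.execute("SELECT EmpNo, PhoneNo FROM Employee ORDER BY EmpNo").fetchall()
```

Both statements address the same table and the same key column, which mirrors the rule that the INSERT must reflect the primary index used in the WHERE clause of the UPDATE.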

Note: To ensure data integrity, the SERIALIZE parameter defaults to ON in the absence of an explicit value if there are upserts in the Teradata TPump job.

When MARK MISSING UPDATE ROWS is specified together with DO INSERT ROWS, Teradata TPump records any UPDATE that fails. This record appears in the Application Error Table, together with an error code that shows that the INSERT of the DO INSERT ROWS was then executed. If the INSERT fails, the INSERT row is also recorded in the Application Error Table.

The default for an upsert function, however, is not to mark missing update rows. This is because when the upsert function is performed, the INSERT is expected to occur when the UPDATE fails. The failure of the UPDATE portion of an upsert does not, in itself, constitute an error and should not be treated as one.
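The difference between the default and MARK MISSING UPDATE ROWS can be modeled in a few lines. This is a sketch under stated assumptions: the table is a plain dictionary, the error table is a list, `run_upsert` is a hypothetical function, and the reason string is a placeholder rather than an actual Teradata TPump error code.

```python
def run_upsert(table, records, mark_missing_update=False):
    """Hypothetical model of an upsert pass; returns the simulated error table."""
    error_table = []
    for emp_no, fone in records:
        if emp_no in table:
            table[emp_no] = fone  # UPDATE succeeds
        else:
            if mark_missing_update:
                # Placeholder text, not a real TPump error code.
                error_table.append((emp_no, "UPDATE found no row; INSERT executed"))
            table[emp_no] = fone  # INSERT half of the upsert
    return error_table


employee = {1: "555-0100"}
records = [(1, "555-0111"), (2, "555-0122")]

# Default for an upsert: a missing update row is not treated as an error.
default_errors = run_upsert(dict(employee), records)

# With marking enabled, the failed UPDATE is recorded even though the INSERT ran.
marked_errors = run_upsert(dict(employee), records, mark_missing_update=True)
```

In both runs the resulting table contents are identical; only the error table differs.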

The MARK MISSING DELETE ROW option has no meaning when used with the DO INSERT ROWS option.

The option of MARK (IGNORE) EXTRA DELETE (UPDATE) ROWS provides Teradata TPump with a way to protect against an update or delete affecting multiple rows, which can happen in Teradata TPump because the primary index can be non-unique.
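To see why a non-unique primary index makes this protection useful, the sketch below counts how many rows a given key would touch. It assumes SQLite as a stand-in database, an ordinary non-unique index in place of a Teradata primary index, and a hypothetical helper `rows_matched`. It only detects the multi-row case; what Teradata TPump does with such a row is governed by the MARK or IGNORE setting and is not modeled here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (EmpNo INTEGER, PhoneNo TEXT)")
# A non-unique index stands in for a non-unique primary index (an assumption).
conn.execute("CREATE INDEX EmpIdx ON Employee (EmpNo)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [(1, "555-0100"), (1, "555-0101"), (2, "555-0102")],
)


def rows_matched(conn, emp_no):
    """Hypothetical helper: number of rows an UPDATE or DELETE on this key would affect."""
    return conn.execute(
        "SELECT COUNT(*) FROM Employee WHERE EmpNo = ?", (emp_no,)
    ).fetchone()[0]


# Keys for which a single input record would affect more than one row.
extra = [emp_no for emp_no in (1, 2) if rows_matched(conn, emp_no) > 1]
```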

MARK is the default for all DML options, except for an upsert.<br />

Each record in the following example contains the value of the primary index column (EmpNo) of a row of the Employee table whose PhoneNo column is to be assigned a new phone number from field Fone.

When the UPDATE fails, the INSERT statement is activated and Teradata TPump enters the upsert mode. In this case, each record contains the primary index value (EmpNum) of a row that is to be inserted successively into the Employee table whose columns are EmpNo and PhoneNo.

Teradata Parallel Data Pump Reference 123
