Teradata Parallel Data Pump

Chapter 1: Overview
Teradata TPump Utility

How it Works

• Optionally, Teradata TPump supports data serialization for a given row, which guarantees that if a row insert is immediately followed by a row update, the insert is processed first. This is done by hashing records to a given session.
• Teradata TPump supports bulletproof restartability using time-based checkpoints. Frequent checkpoints make restarting easier, but at the expense of checkpointing overhead.
• Teradata TPump supports upsert logic similar to MultiLoad.
• Teradata TPump supports insert/update/delete statements in multiple-record requests.
• Teradata TPump uses macros to minimize network overhead. Before Teradata TPump begins a load, it sends the statements to Teradata Database to create equivalent macros for every insert/update/delete statement used in the job script. The execute macro requests, rather than lengthy text requests, are then executed iteratively during a job run.
• Teradata TPump supports interpretive, record manipulating and restarting features similar to MultiLoad.
• Teradata TPump supports conditional apply logic, similar to MultiLoad.
• Teradata TPump supports error treatment options, similar to MultiLoad.
• Teradata TPump runs as a single process.
• Teradata TPump supports Teradata Database internationalization features such as kanji character sets.
• Up to 2430 operations can be packed into a single request for network efficiency. The limit of 2430 may vary because the overall limit for a request is one megabyte. Teradata TPump assumes that every statement is a one-step or, for fallback, two-step request.

Teradata TPump is a Teradata utility with functions similar to the MultiLoad utility. MultiLoad edits Teradata tables by processing inserts, updates, and deletes, and so does Teradata TPump. This section provides insight into the important differences between MultiLoad and Teradata TPump.
All of the information in this section is discussed in further detail later in this document, either explicitly or by implication.

Methods of Operation

MultiLoad performs Teradata Database updates in phases. During the first phase of operation, MultiLoad uses a special database and CLIv2 protocol for efficiently sending large (64 KB) data messages to the database. The data is stored in a temporary table. During the second phase of operation, the temporary table is sorted, then changes from it are applied to various target tables. In this phase, processing is entirely in the database while MultiLoad on the client waits to see if the job completes successfully.

Teradata TPump performs Teradata Database updates asynchronously. Changes are sent in conventional CLIv2 parcels and applied immediately to target tables. To improve its efficiency, Teradata TPump builds multiple statement requests and provides the serialize option to help reduce locking overhead.
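The two efficiency techniques just mentioned, serializing by hashing records to a session and building multiple statement requests, can be illustrated with a small Python model. This is a sketch under stated assumptions, not Teradata TPump's implementation: the function names `session_for` and `pack_requests`, the use of CRC32 as the hash, and the session count and pack factor are all hypothetical choices made for the example.

```python
import zlib
from collections import defaultdict


def session_for(key: str, num_sessions: int) -> int:
    """Map a row key to a session index.

    Hypothetical: CRC32 stands in for whatever hash the utility uses.
    The only property that matters is that equal keys map to the same session.
    """
    return zlib.crc32(key.encode("utf-8")) % num_sessions


def pack_requests(changes, num_sessions: int, pack: int):
    """Route (key, statement) pairs to sessions, then split each session's
    statements, in arrival order, into requests of at most `pack` statements."""
    per_session = defaultdict(list)
    for key, statement in changes:
        per_session[session_for(key, num_sessions)].append(statement)

    requests = []
    for session, statements in sorted(per_session.items()):
        for start in range(0, len(statements), pack):
            requests.append((session, statements[start:start + pack]))
    return requests


# Illustrative input: the update for cust-1 arrives after its insert.
changes = [
    ("cust-1", "INSERT cust-1"),
    ("cust-2", "INSERT cust-2"),
    ("cust-1", "UPDATE cust-1"),
    ("cust-3", "INSERT cust-3"),
]

requests = pack_requests(changes, num_sessions=2, pack=2)
for session, statements in requests:
    print(f"session {session}: {statements}")
```

Because both changes for `cust-1` hash to the same session and each session keeps arrival order, the insert is always sent before the update, which is the guarantee serialization is described as providing. Grouping several statements per request models the network saving of multiple statement requests.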
Economy of Scale and Performance

MultiLoad performance improves as change volume increases because, in phase two of MultiLoad, changes are applied to target tables in a single pass. All changes for any physical data block are effected using one read and one write of the block. Furthermore, the temporary table and the sorting process used by MultiLoad are additional overheads that must be “amortized” through the volume of changes.

Teradata TPump, on the other hand, performs better for relatively low change volume because there is no temporary table overhead. Teradata TPump becomes expensive for large volumes of data because multiple updates to a physical data block will most likely result in multiple reads and writes of the block.

Loading No Primary Index (NoPI) Tables

A NoPI table has no primary index. These tables can be used as staging tables where data is always appended to the table, making population of the table generally faster than that of a traditional table containing a primary index. NoPI tables could increase performance for Teradata TPump Array INSERT.

Multiple Statement Requests

The most important technique used by Teradata TPump to improve performance over MultiLoad is the multiple statement request. Placing more statements in a single request is beneficial for two reasons. First, it reduces network overhead because large messages are more efficient than small ones. Second, in ROBUST mode, it reduces Teradata TPump recovery overhead, which amounts to one extra database row written for each request. Teradata TPump automatically packs multiple statements into a request based upon the PACK specification in the BEGIN LOAD command.

Macro Creation

Teradata TPump uses macros to efficiently modify tables rather than actual DML commands. The technique of changing statements into equivalent macros before beginning the job greatly improves performance.
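To make the statement-to-macro idea concrete, the following Python sketch builds the text of a parameterized INSERT, an equivalent CREATE MACRO statement, and the shorter EXEC request that would be sent repeatedly in its place. It is illustrative only: the helper names, the macro name `m_ins_orders`, and the table `sales_db.orders` with its columns are hypothetical, and the naming and exact text Teradata TPump generates are not shown in this section.

```python
def insert_text(table, columns):
    """Text of a parameterized INSERT; `columns` is a list of (name, sql_type)."""
    names = ", ".join(name for name, _ in columns)
    values = ", ".join(f":{name}" for name, _ in columns)
    return f"INSERT INTO {table} ({names}) VALUES ({values});"


def create_macro_text(macro, table, columns):
    """Text of a CREATE MACRO statement wrapping the same INSERT."""
    params = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return f"CREATE MACRO {macro} ({params}) AS ({insert_text(table, columns)});"


def exec_text(macro, columns):
    """Text of the short request that executes the macro."""
    args = ", ".join(f":{name}" for name, _ in columns)
    return f"EXEC {macro} ({args});"


# Hypothetical table and macro names, for illustration only.
columns = [
    ("order_id", "INTEGER"),
    ("customer_name", "VARCHAR(60)"),
    ("order_total", "DECIMAL(12,2)"),
]
dml = insert_text("sales_db.orders", columns)
macro_ddl = create_macro_text("m_ins_orders", "sales_db.orders", columns)
call = exec_text("m_ins_orders", columns)

print(macro_ddl)   # sent once, before the load begins
print(call)        # sent for every request during the load
print(len(dml), len(call))
```

The macro definition is sent once, so its length is not a recurring cost; only the EXEC text is repeated. The saving in request text grows with the length and complexity of the original statement.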
Specifically, the benefits of using macros are:

• The size of network (and channel) messages sent to the database by Teradata TPump is reduced.
• Teradata Database parsing engine overhead is reduced because the execution plans (or steps) for macros are cached and re-used. This eliminates normal parser handling, where each request sent by Teradata TPump is planned and optimized.

Because the space required by macros is negligible, the only issue regarding macros is where they are placed in the database. Macros are put into the database that contains the restart log table or the database specified using the MACRODB keyword in the BEGIN LOAD command.

Locking and Transactional Logic

In contrast to MultiLoad, Teradata TPump uses conventional row hash locking, which allows for some amount of concurrent read and write access to its target tables. At any point,