Sponsored by PRISE Ltd.
www.prisetools.com
www.prisetools.com
Shall I run DELETE and INSERT separately?
We often have permanent or temporary tables in our data warehouse which must be purged before they are populated again. We can do the operation within a transaction or separately. It is not the same...Options - (example)
The examples assume Teradata (BTET) mode, which is Auto-commit by default- Together: both statements run in the same transaction
- Explicit transaction
begin transaction;
delete from TABLE1;
insert into TABLE1 SELECT .....;
end transaction;
- Multi-statement
delete from TABLE1
;insert into TABLE1 SELECT .....;
- Explicit transaction
- Separately: the statements run in separate transaction
delete from TABLE1;
insert into TABLE1 SELECT .....;
(Please note if you run it in Teradata Administrator aka. WinDDI, this will be run in one explicit transaction, since the application will place a BT; before the script and an ET; after.)
What is the difference?
The result
If everything is OK, we will not experience any difference at the end. The same data will be found in TABLE1 after we finished each ways. The difference shows if the INSERT step fails eg. for running out of perm space:- In A. cases the all transaction will be rolled back, therefore we will see all the data in the table we had before the delete.
- In B. case we will delete the data from the tab, independently the success of the INSERT phase. After a failed INSERT the table will be empty
We have another additional transient difference:
- In A. case we will not "see" an empty table from another session in any point of time: until a point the table seems as a "before" version, and when the transaction finished it will switch to an "after" image for other sessions in no time.
- In B. case when the DELETE finished, the table seems to be empty up to the moment of COMMITing the INSERT statement. So if the INSERT lasts 2 hours, others will face and empty TABLE1 for 2 hours.
So I do not recommend to use option B. if the "empty table" state is not valid.
The performance
The A. and B. options will be executed significantly different manner. The DELETE operation will know that all the data will be dropped, so internally it just administers the data blocks to be free instead of clearing their content record-by-record, BUT only if the statement is the only statement in the transaction! In this case only the data block administration must be logged (to be able to roll back for some reason) instead of every block changes. This results far less Transient Journal activity besides less block I/Os.Let's see some measurements.
I've put 10M records into a table, and then executed the same DELETE + INSERT statement on it. The results are the following:
- CPU: 9 sec I/O: 39K
- CPU >0.1sec I/O: 4K
Sponsored by PRISE Ltd.
www.prisetools.com
www.prisetools.com
This comment has been removed by the author.
ReplyDeleteHi Akos -
ReplyDeleteTransient Journal not used DELETE ALL command so that would make big performance change here, please add your insights.
e.g DELETE tablename ALL;