Sponsored by PRISE Ltd.
www.prisetools.com
www.prisetools.com
Delete or Insert? Which one is faster?
The question
If I got to delete rows from a table, practically I have two options:- Issue a DELETE command with the desired WHERE / JOIN condition
- Create an empty table with the same structure and INSERT the remaining records using the negated WHERE condition and the rename tables
The first one is simple and straightforward, the second is complicated and will require more perm space. So it is a question?
Yes, it is. Because of the backgrounds of the execution:
- DELETE
It will modify each blocks that contain record(s) to be deleted. It has to write Transient Journal (transaction log) to enable transaction related functionality. In addition those data blocks which fill-up-rate falls below a threshold (eg. 50%) they will be merged. All these may cause really lot of I/O overhead. In some cases the deletion of the rows will happen in "random-like" order therefore the I/O access will be random, which is painfully slow. - INSERT
In this case it is quite simple: we insert into an empty table, which causes almost zero Transient Journal data, and the system has to write only that amount of data blocks, what can store the records that will remain, once each block. An additional advantage: the I/O will always be "linear" instead of "random-like" therefore it can be multi-block operation, which is pretty fast.
The answer
No, absolutely not. DELETE is good enough if the amount of the records that must be deleted is not that much.We have two factors that will help us in the decision:
- How many records to be deleted?
- How many records will remain after deletion?
I do not have well defined universal thresholds where it is worth to choose this or that, it depends.
I used to use the CREATE+INSERT+RENAME method if
- The table is big enough (> > 0,1-0,5Gbytes or 10-50k rows / AMP)
and
I got to delete more than 20..40% of the records
OR
- The table has high hash collision (HC) rate on the PI
and
The table is MULTISET*
and
The delete is a JOIN-ed one
* If the HC rate is high and the table is SET then the INSERT will be very slow because of the duplication check method
Sponsored by PRISE Ltd.
www.prisetools.com
www.prisetools.com
No comments:
Post a Comment