Sponsored by PRISE Ltd.
www.prisetools.com
www.prisetools.com
Hiding enemy: the hidden "product" join
Introduction
When we check query performance one of the first things we search for is the PRODUCT JOIN. Usually this is the component we hate and pursue, because we belive that it is the root problem. It is not always true, but it is a different topic.Sometimes we find merge join, but it is still very slow, in spite there is no skewness, PIs seem OK.
So the question is: how can a simple merge join be so pathetically slow sometimes?
What can be the problem?
The real evil in this case is - my favourite - the hash collision.Let's imagine the tables below (transactions' columns splitted into two tables):
CREATE TABLE TRX_A (
User_id INTEGER,
User_trx_seq INTEGER
ColA...
ColB...
...
) WITH DATA
PRIMARY INDEX (User_id)
;
CREATE TABLE TRX_B (
User_id INTEGER,
User_trx_seq INTEGER
ColC...
ColD...
...
...
) WITH DATA
PRIMARY INDEX (User_id)
;
The individual users produce generally equal number of transaction in our cited case, so table skewness is not experienced. The INSERTs into these tables are not that fast, but the daily increment is not that big, so it makes no outstanding problem - in our imagined case.We have all required statistics either.
Let's take this query:
SELECT
...
...
FROM TRX_1 a
JOIN TRX_2 b on a.User_id =b.User_id and a.User_trx_seq = b.User_trx_seq
How does the optimizer think?
Distribution of the tables are excellent, both have equal PI and in addition the column is part of an "AND"-ed join condition. The matchable records reside on the same AMP - the only thing to do is evaluate the "residual" condition: a.User_trx_id = b.User_trx_id.
However it is not that simple.
In the beginning each user has few transactions, eg. 10.
What does the system do? Each records of User1 in TRX_1 table must be compared with 10 records from TRX_2. That is 100 comparisons.
What if each user has 1000 transactions? 1million comparisons for each user!!
The growth is in the order of X2, where X is the number of records per PI (exactly PI hash) value.
In this type of cases the number of records per PI value grows linearly or quicker as time goes by, what means longer and longer execution times.
Please note thet the total time required for the join step is determined by the maximum number of records per PI value, since all these records will fall to the same AMP, therefore that one will perform the slowest.
We can experience this problem in two cases:...
...
FROM TRX_1 a
JOIN TRX_2 b on a.User_id =b.User_id and a.User_trx_seq = b.User_trx_seq
How does the optimizer think?
Distribution of the tables are excellent, both have equal PI and in addition the column is part of an "AND"-ed join condition. The matchable records reside on the same AMP - the only thing to do is evaluate the "residual" condition: a.User_trx_id = b.User_trx_id.
However it is not that simple.
In the beginning each user has few transactions, eg. 10.
What does the system do? Each records of User1 in TRX_1 table must be compared with 10 records from TRX_2. That is 100 comparisons.
What if each user has 1000 transactions? 1million comparisons for each user!!
The growth is in the order of X2, where X is the number of records per PI (exactly PI hash) value.
In this type of cases the number of records per PI value grows linearly or quicker as time goes by, what means longer and longer execution times.
Please note thet the total time required for the join step is determined by the maximum number of records per PI value, since all these records will fall to the same AMP, therefore that one will perform the slowest.
What is the solution?
- Sources are regular tables of our datamodel
- Sources (at least one of them) is a Spool during a query execution
Another option is to do a pre-fetch from the TRX_1 and TRX_2 tables into volatile tables having PI: (User_id,User_trx_seq). But note that it may be expensive, if we cannot apply a pretty selective filtering. This means redistribution of the tables.
In the second case we may apply the method of storing the partial result. Put the Spool into a volatile table with the appropriate PI (User_id,User_trx_seq), and this will hopefully enforce the other table/spool to be redistributed to this hash-collision free distribution.
How can we catch those guys?
Unfortunately the "explain plan" will not have a blinking LED to highlight this situation, we need to have an experience to suspect that someting's wrong. The main indicator is the cost/input ratio. Check how much records are in the bigger source table and compare with the step's impactCPU, while your other eye is on the sources' and target's skew. If the skews are low, but the CPU (or I/O) is unreasonably high, you probably got it!Does it seem complicated? Maybe yes, but quite simple if using PRISE Tuning Assistant:
Sponsored by PRISE Ltd.
www.prisetools.com
www.prisetools.com
As always, awesome Akos; Glad that you are back -- Cheeli
ReplyDeleteThanks Cheeli, I have some new ideas, hopefully will have enough time to share
ReplyDelete