Sponsored by PRISE Ltd.
www.prisetools.com

How to optimize slow OUTER JOINs

Case description

We have a (LEFT or RIGHT) OUTER JOIN, and it runs a long time while causing skewed CPU / Spool usage. In practice most of the time during the query execution only 1 AMP will work, while others have nothing to do, causing poor parallel efficiency.

How to identify

The query typically runs long time, contains a "MERGE JOIN" step in the Explain description, and that step consumes skewed CPU consumption (MaxAMPCPUTime * Number-of-AMPS >> AMPCPUTime) and lasts long.

In the DBQL you should find skewed, high CPU usage queries (
dbc.DBQLogtbl.MaxAMPCPUTime * (hashamp()+1) / nullifzero(dbc.DBQLogtbl.AmpCPUTime) > 1.2 and dbc.DBQLogtbl.AMPCPUTime > 1000 , depends on system size) which also has "left outer join" expression in the execution plan text (dbc.DBQLExplaintbl.ExplainText like '%left outer joined using a merge join%')
This is only an approximation since the skewness causing step may be a different one.

PRISE Tuning Assistant supplies easy-to-use GUI based search function.

Explanation

Let's assume that we outer join Table1 and Table2 on a condition that causes no product join (merge join instead), eg.:


select Table1.x,Table2.y

from Table1

LEFT JOIN Table2 on Table1.x = Table2.x

...

If Table2 is not a "small" table, Teradata optimizer will choose to "equi-distribute" (place matchable records on the same AMP) the two tables on the join field(s), in our case: Table1.x and Table2.x respectively.
If Table1.x contains significant percentage of NULLs, then the distribution will be skewed, since all "x is NULL" records will get to the same AMP.
We know that the NULL value never results in a join match, so those records are useless to examine, but they have to appear in the resultset, since it is an OUTER JOIN.

Solution

Let's handle the Table1 into two separate subsets: NULL(x) and NotNULL(x), and modify the select this way:


select Table1.x,Table2.y

from Table1

LEFT INNER JOIN Table2 on Table1.x = Table2.x

where Table1.x is not null -- This will eliminate skewed spool

UNION ALL

select Table1.x,NULL

from Table1

where Table1.x IS NULL;

Practical example:
Some of our transactions are contributed by an operator, in this case OpID is filled, else null. We would like to query the number of transactions by operators including the non-contributed ones. Most of the transactions are non contributed ones (OpID is null).


select

a.Transaction_id, b.OperatorName as ContribOpName

from TransactionT a

LEFT JOIN OperatorT b on a.OpID = b.OpID

Optimized form:


select

  a.Transaction_id, b.OperatorName as ContribOpName

from TransactionT a

LEFT JOIN OperatorT b on a.OpID = b.OpID

where a.OpID is not null


UNION ALL

select

  a.Transaction_id, NULL as ContribOpName

from TransactionT

where OpID is null;

The execution will not cause a skewed CPU / Spool, because the those records of Table1 that caused peak ( x is NULL ) are excluded from processing of the join.
The second part will supply the "x is NULL" records to the result set without join processing.

The tradeoff is two full scans and a UNION ALL operation, which are comparably much less cost than a strongly skewed redistribution and a JOIN processing.

What's next

Next post will discuss unexpectedly slow INSERTs (hash collision).

Sponsored by PRISE Ltd.
www.prisetools.com

PRISE Teradata Database Optimization

Wednesday, September 25, 2013

Boost slow (LEFT/RIGHT) OUTER JOINs