
Thursday, April 17, 2014

Using Queryband


How to use querybanding in Teradata?

What is queryband?

Teradata is a diligent RDBMS that sometimes runs millions of SQL statements a day. You will see them in DBQL (DataBase Query Logging) - if it is switched on - but it is hard to find your way around in that mass of queries. How can I find a specific query? What was that query run by? If I want to analyze or modify something, I need to identify the source of the execution as exactly as possible.
Queryband is a labelling facility for flagging queries so that their source job/report/etc. can be found easily.

Who defines the queryband?

Setting the queryband is usually the responsibility of the query runner:

  • ETL software or solution that executes it
  • OLAP tool that issues it
  • The person who runs it ad-hoc

How to set the queryband?

Technically it is quite simple: Teradata provides a command to set it:

SET QUERY_BAND = {'<variable1>=<value1>;<variable2>=<value2>;...' / NONE} [UPDATE] for SESSION/TRANSACTION;

, where:
<variable1>=<value1>;
A queryband can consist of an arbitrary number of "variable"="value" pairs. Both are string values. Do not forget to put the semicolon after each variable-value pair!

NONE: clears the queryband 

UPDATE: if specified, variables that have been defined previously are updated with the new value, and the others are added with the given value. An empty value string is valid content and will not remove the variable. Please note that deleting a variable is only possible by redefining the queryband without that specific variable.


SESSION/TRANSACTION: determines whether the queryband applies to the whole session or only to the current transaction.
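
A minimal sketch of the UPDATE behavior (the variable names and values are arbitrary examples):

SET QUERY_BAND = 'PROJECT=Alpha;TASK=Load;' FOR SESSION;
SET QUERY_BAND = 'TASK=Transform;STEP=1;' UPDATE FOR SESSION;
/* The session queryband is now: PROJECT=Alpha;TASK=Transform;STEP=1;
   TASK was updated, STEP was added, PROJECT remained untouched. */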

Where can I check the queryband?

The values are reflected in the dbc.SessionInfoX.QueryBand and the dbc.DBQLogTbl.QueryBand columns. The following examples show their content:

SET QUERY_BAND='PROJECT=TeraTuningBlog;TASK=QB_example;' for session;

(For the logged-in session)
SELECT queryband FROM dbc.sessioninfoX WHERE sessionNo=session;
----------------------------------------------------
PROJECT=TeraTuningBlog;TASK=QB_example;

(For formerly run queries; the "=S>" prefix in DBQLogTbl marks a session-level queryband)
SELECT queryband FROM dbc.dbqlogtbl WHERE Queryid=...;
----------------------------------------------------
=S> PROJECT=TeraTuningBlog;TASK=QB_example;

(For a specific variable, e.g. "PROJECT")
SELECT QB_PROJECT FROM
(
   SELECT CAST((case when index(queryband,'PROJECT=') >0 then substr(queryband,index(queryband,'PROJECT=') ) else '' end) AS VARCHAR(2050)) tmp_PROJECT
     ,CAST( (substr(tmp_PROJECT,characters('PROJECT=')+1, nullifzero(index(tmp_PROJECT,';'))-characters('PROJECT=')-1)) AS VARCHAR(2050)) QB_PROJECT
   FROM dbc.sessioninfoX 
WHERE sessionNo=session
) x ;

----------------------------------------------------
TeraTuningBlog

(Which queries have been run by the "LoadCustomers" project?)
   SELECT a.*, CAST((case when index(queryband,'PROJECT=') >0 then substr(queryband,index(queryband,'PROJECT=') ) else '' end) AS VARCHAR(2050)) tmp_PROJECT
     ,CAST( (substr(tmp_PROJECT,characters('PROJECT=')+1, nullifzero(index(tmp_PROJECT,';'))-characters('PROJECT=')-1)) AS VARCHAR(2050)) QB_PROJECT
   FROM dbc.dbqlogtbl a 
WHERE QB_PROJECT='LoadCustomers'
;

Designing querybanding

We know how to set the queryband, and it is quite easy to build into or configure in the ETL tool, OLAP software and other query-running applications. But which variables should we define, and how should we populate them? I give a best practice below, but it is just a recommendation that can be modified according to your taste.

First of all, some things to mind:
  • Use short variable names and values, since they will be logged in each DBQL record
  • Define a consistent structure in each source system to make the data easy to analyze
  • Record information as detailed as you need: no more, no less. Define unique values for the items you later want to differentiate. Using a lookup/hierarchy table you can easily merge what you need, but you can never drill down into what was aggregated.
I recommend defining these variables (see the example after the list):
  • SYS: Maximum of 3 characters ID of the system that ran the Query, like INF (Informatica), MST (Microstrategy), SLJ (SLJM), BO (Business Objects), AH (ad-hoc query tool)
  • ENV: P (Production) / Tx (Test x) / Dx (Development x), the identifier of the environment. The x may be omitted if it does not matter
  • JOB: Which job or report contains that specific query (the name of it)
  • STP: (Step) Which SQL script, or other sub-structure does the query belong to (name of it)
  • VER: Version of the JOB. This will determine the version of the script (if available)
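
Putting it together, a queryband set according to this convention could look like the following (all values are illustrative):

SET QUERY_BAND = 'SYS=INF;ENV=P;JOB=LoadCustomers;STP=cust_upd;VER=3;' FOR SESSION;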


Thursday, February 20, 2014

DBQL analysis IV - Monitor index usage


Analyze "Index usage" in DBQL

Please note that the solutions in this article work on the DBQL logs, which cover only the activity of those users for whom logging is switched on. The "Object" option of DBQL logging must be "on" to use the scripts provided.

About Indices

Teradata provides the possibility of creating INDEX objects to allow alternative access paths to the data records. They are quite different structures than the good old B*Tree or Bitmap indices (common in non-MPP RDBMSes).
The main goal of index objects is to improve data access performance in exchange for storage and maintenance processing capacity.
Having indices is not free (storage and maintenance resources); those that do not bring enough gain are better dropped.
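
As a starting point, the existing index objects can be listed from the standard dbc.Indices view; a minimal sketch (the index type codes are the same ones used by the scripts below):

SELECT DatabaseName, TableName, IndexNumber, IndexType, UniqueFlag, ColumnName
FROM dbc.Indices
WHERE IndexType IN ('S','V','H','O','I')  /* non-primary index types */
ORDER BY DatabaseName, TableName, IndexNumber, ColumnPosition;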

Index usage footprint in DBQL

If a query uses an index to access the data, it is declared in the "explain text" and also registered in DBQL: dbqlobjtbl. How indices appear depends on the index type. In the case of a primary/secondary index only the IndexId and the columns of the index are registered, while join/hash indices appear like a regular table: at database, object and column levels alike.

If the join/hash index is covering, the base table may not be listed in the DBQL objects, therefore be careful when analyzing table usage from DBQLObjTbl.
I recommend using PRISE Tuning Assistant to easily find all types of access to a table's data.

Examples

Primary index access (2 columns):
Plan:
  1) First, we do a single-AMP RETRIEVE step from d01.z by way of the
     primary index "d01.z.i = 1, d01.z.j = 1" with no residual

DBQLobjtbl:
ObjectDatabaseName  ObjectTableName  ObjectColumnName  ObjectNum  ObjectType
D01                 Z                i                 1          Idx
D01                 Z                j                 1          Idx

Secondary index access (1 column):
Plan:
  3) We do an all-AMPs RETRIEVE step from d01.x by way of index # 4
     without accessing the base table "d01.x.j = 1" with no residual
DBQLobjtbl:
ObjectDatabaseName  ObjectTableName  ObjectColumnName  ObjectNum  ObjectType
D01                 X                                  4          Idx

Join index:
create join index d01.ji as sel a,b from d01.q primary index (a);
select b from d01.q where a=1;
Plan:
  1) First, we do a single-AMP RETRIEVE step from D01.JI by way of the
     primary index "D01.JI.a = 1" with no residual conditions into
DBQLobjtbl:
ObjectDatabaseName  ObjectTableName  ObjectColumnName  ObjectNum  ObjectType
D01                 JI                                 0          JIx
D01                 JI               a                 1          Idx
D01                 JI               b                 1026       Col


Please note that
  • ObjectNum identifies the index (refers to dbc.Indices.IndexNumber)
  • As many rows appear as the index has columns
  • E.g. in V13.10 Teradata Express the single-column secondary index lacks the column name in the logs
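
Based on these footprints, a quick (illustrative) check of how the example objects above show up in the online DBQL object log:

SELECT ObjectDatabaseName, ObjectTableName, ObjectNum, ObjectType, COUNT(*) Nbr
FROM dbc.DBQLObjTbl
WHERE ObjectDatabaseName = 'D01'          /* example database from above */
  AND ObjectTableName IN ('Z','X','JI')   /* example tables/join index */
  AND ObjectType IN ('Idx','JIx','Hix')
GROUP BY 1,2,3,4
ORDER BY 1,2,3;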


Analyzing DBQL data 

Prepare data

CREATE VOLATILE TABLE DBQLIdx_tmp1 AS (
SELECT databasename,tablename,indexnumber,indextype,uniqueflag,Columnposition,Columnname
, SUM (1) OVER (partition BY databasename,tablename,indexnumber ORDER BY Columnname ROWS UNBOUNDED PRECEDING) ABCOrder
FROM dbc.indices WHERE indextype IN ('K','P','Q','S','V','H','O','I')
) WITH DATA
PRIMARY INDEX (databasename,tablename,indexnumber)
ON COMMIT PRESERVE ROWS;


CREATE VOLATILE TABLE DBQLIdx_tmp2 AS (
WITH RECURSIVE idxs (Databasename,Tablename,Indexnumber,Indextype,Uniqueflag,Indexcolumns,DEPTH)
AS (
SELECT
databasename,tablename,indexnumber,indextype,uniqueflag,TRIM (Columnname) (VARCHAR (1000)),ABCorder
FROM DBQLIdx_tmp1 WHERE ABCorder = 1
UNION ALL
SELECT
b.databasename,b.tablename,b.indexnumber,b.indextype,b.uniqueflag,b.Indexcolumns||','||TRIM (a.Columnname),a.ABCOrder
FROM DBQLIdx_tmp1 a
JOIN idxs b ON a.databasename = b.databasename AND a.tablename = b.tablename AND a.indexnumber = b.indexnumber AND a.ABCOrder = b.Depth + 1
)
SELECT databasename db_name,tablename table_name,indextype,uniqueflag,indexcolumns
,indexnumber
,CASE WHEN uniqueflag = 'Y' AND indextype IN ('P','Q','K') THEN 'UPI'
WHEN uniqueflag = 'N' AND indextype IN ('P','Q') THEN 'NUPI'
WHEN uniqueflag = 'Y' AND indextype IN ('S','V','H','O') THEN 'USI'
WHEN uniqueflag = 'N' AND indextype IN ('S','V','H','O') THEN 'NUSI'
WHEN indextype = 'I' THEN 'O-SI'
ELSE NULL
END Index_code
FROM idxs
QUALIFY SUM (1) OVER (partition BY db_name,table_name,indexnumber ORDER BY DEPTH DESC ROWS UNBOUNDED PRECEDING) = 1
) WITH DATA
PRIMARY INDEX (db_name,table_name)
ON COMMIT PRESERVE ROWS
;

UPDATE a
FROM DBQLIdx_tmp2 a,DBQLIdx_tmp2 b
SET Index_code = 'PK'
WHERE a.db_name = b.db_name
AND a.table_name = b.table_name
AND a.Index_code = 'UPI'
AND a.indextype = 'K'
AND b.Index_code = 'NUPI'
AND b.indextype <> 'K'
;

Report: How many times have the indices been used?

You may need to modify the script:
  • Date filtering (use between for interval)
  • Online/archived DBQL: use commented section for archived
SELECT
  COALESCE(usg.objectdatabasename,idx.db_name) db
, COALESCE(usg.objecttablename,idx.table_name) tbl
, COALESCE(usg.ObjectNum,idx.IndexNumber) idxNo
, idx.Index_code
, idx.Indexcolumns Index_columns
, coalesce(usg.drb,0) Nbr_of_usg
FROM
(SELECT objectdatabasename,objecttablename,objecttype,ObjectNum,COUNT (*) drb

--  Archived DBQL
--  FROM dbql_arch.dbqlobjtbl_hst WHERE logdate = '2014-02-20' (date)

--  Online DBQL
  FROM dbc.dbqlobjtbl WHERE  cast(collecttimestamp as char(10)) = '2014-02-20'
AND objecttablename IS NOT NULL
AND ((objecttype IN ('JIx','Hix') AND objectcolumnname IS NULL)
OR
(objecttype IN ('Idx'))
)
AND objectnum <> 1
GROUP BY 1,2,3,4
) usg
FULL OUTER JOIN
( SELECT db_name,table_name,Indextype,Uniqueflag,indexcolumns,Indexnumber,Index_code 

  FROM DBQLIdx_tmp2 a WHERE indextype NOT IN ('P','Q','K')
UNION ALL
SELECT databasename, tablename, tablekind, cast(null as char(1))
     , cast(null as varchar(1000)), cast(null as smallint)
     , case when tablekind='I' then 'JIX' else 'HIX' end
FROM dbc.tables WHERE tablekind IN ('I','N')
) idx ON usg.objectdatabasename = idx.db_name
AND usg.objecttablename = idx.table_name
AND ((usg.ObjectNum = idx.IndexNumber) or usg.objecttype IN ('JIx','Hix'))
ORDER BY 6 DESC
;


Monday, January 13, 2014

DBQL analysis III - Monitor "collect statistics"


Analyze "collect statistics" in DBQL

Please note that the solutions in this article work on the DBQL logs, which cover only the activity of those users for whom logging is switched on. The "Object" and "SQL" options of DBQL logging must be "on" to use the scripts provided.
This article is applicable up to V13.10 without modifications; statistics handling changed from V14.

About Statistics

"Statistics" is a descriptive object in the Teradata database that are used by the optimizer for transforming SQLs to effective execution plans.
Statistics reflect the key data demographic information of one or more table column(s).
These objects should be created and maintained, the RDBMS will not do it by itself.
Statistics internally contain value histogram, which needs the table data (or sample) to be analyzed, which is an expensive task.

Summarized: appropriate statistics are required for getting good and effective executon plans for SQLs, but statistics consume resources to be collected or refreshed.

"Statistics" footprint in DBQL

When a "statistics" is created or refreshed it is executed by an SQL command: collect statistics....
This command will create a log entry into the DBQL if the logging is switched on.

One can track when which "statistics" was collected, and how much CPU and I/O it consumed.
Those statements are very easy to identify in the central table:

select * from dbc.DBQLogTbl where StatementType='Collect statistics'
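
Building on that, a minimal sketch that shows the daily count and resource consumption of statistics collection (all columns are standard DBQLogTbl fields):

SELECT CAST(StartTime AS DATE) Log_day
     , COUNT(*) Nbr_of_collects
     , SUM(AMPCpuTime) CPU_secs
     , SUM(TotalIOCount) IOs
FROM dbc.DBQLogTbl
WHERE StatementType = 'Collect statistics'
GROUP BY 1
ORDER BY 1;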

Analyzing DBQL data 

Prepare data

You may need to modify the script:
  • Date (interval)
  • Online/archived: use commented section
  • QueryBand: "JOB" variable is used, modify according to your ETL settings
create volatile table DBQLStat_tmp1
as
(
sel a.procId,a.QueryId,a.StartTime,a.AMPCpuTime,a.TotalIOCount

,case when a.querytext like '% sample %' then 'S' else 'F' end Full_Sample
,UserName,(FirstRespTime - StartTime) DAY(4) TO SECOND(4) AS RUNINTERVAL      
,(EXTRACT(DAY FROM RUNINTERVAL) * 86400 + EXTRACT(HOUR FROM RUNINTERVAL)  * 3600 + EXTRACT(MINUTE FROM RUNINTERVAL)  * 60 + EXTRACT(SECOND FROM RUNINTERVAL) ) (decimal(10,1)) Duration
,b.ObjectDatabaseName DatabaseName,b.ObjectTableName TableName,c.ObjectColumnName ColumnName
,case when d.SQLTextInfo like any ('%"PARTITION"%', '%,PARTITION %', '%,PARTITION,%', '% PARTITION,%', '% PARTITION %', '%(PARTITION,%', '%(PARTITION %', '%,PARTITION)%', '% PARTITION)%', '%(PARTITION)%') then 'Y' else 'N' end inclPartition
,CAST((case when index(queryband,'JOB=') >0 then  substr(queryband,index(queryband,'JOB=') ) else '' end) AS VARCHAR(500)) tmp_Q
,case when queryband = '' then 'N/A'
         when tmp_q = '' then '-Other'
else CAST( (substr(tmp_Q,characters('JOB=')+1, nullifzero(index(tmp_Q,';'))-characters('JOB=')-1)) AS VARCHAR(500)) end QB_info
,sum(1) over (partition by a.procid,a.Queryid order by c.ObjectColumnName, a.QueryID rows unbounded preceding) Rnk
from
/* For archived tables
     dbql_arch.DBQLogTbl_hst       a
join dbql_arch.DBQLObjTbl_hst      b on b.ObjectType='Tab' and a.procid=b.procid and a.QueryID=b.QueryID and a.logDate=b.logDate
left join dbql_arch.DBQLObjTbl_hst c on c.ObjectType='Col' and a.procid=c.procid and a.QueryID=c.QueryID and a.logDate=c.logDate
join dbql_arch.DBQLSQLTbl_hst      d on d.SQLRowNo=1       and a.procid=d.procid and a.QueryID=d.QueryID and a.logDate=d.logDate
where a.logDate=1140113
*/
/*end*/
/* For online tables */
     dbc.DBQLogTbl       a
join dbc.DBQLObjTbl      b on b.ObjectType='Tab' and a.procid=b.procid and a.QueryID=b.QueryID
left join dbc.DBQLObjTbl c on c.ObjectType='Col' and a.procid=c.procid and a.QueryID=c.QueryID
join dbc.DBQLSQLTbl      d on d.SQLRowNo=1       and a.procid=d.procid and a.QueryID=d.QueryID
where cast(cast(a.starttime as char(10)) as date) = '2014-01-13' (date)
/*end*/
and a.StatementType='Collect statistics'
) with data
primary index (procId,QueryId)
on commit preserve rows
;

create volatile table DBQLStat
as
(
WITH RECURSIVE rec_tbl
(
 procId,QueryId,StartTime,AMPCpuTime,TotalIOCount,Duration,Full_Sample,UserName,DatabaseName,TableName,QB_info,inclPartition,ColumnName,Rnk,SColumns
)
AS
(
select
 procId,QueryId,StartTime,AMPCpuTime,TotalIOCount,Duration,Full_Sample,UserName,DatabaseName,TableName,QB_info,inclPartition,ColumnName,Rnk,cast(case when ColumnName is null and inclPartition='Y' then '' else '('||ColumnName end as varchar(10000)) SColumns
from DBQLStat_tmp1 where Rnk=1
UNION ALL
select
  a.procId,a.QueryId,a.StartTime,a.AMPCpuTime,a.TotalIOCount,a.Duration,a.Full_Sample,a.UserName,a.DatabaseName,a.TableName,a.QB_info,a.inclPartition,a.ColumnName,a.Rnk,b.SColumns ||','||a.ColumnName
from DBQLStat_tmp1     a
join rec_tbl b on a.procId=b.ProcId and a.QueryId=b.QueryID and a.Rnk=b.Rnk+1
)
select   procId,QueryId,StartTime,AMPCpuTime,TotalIOCount,Duration,Full_Sample,UserName,DatabaseName,TableName,QB_info,Rnk NumOfColumns
        ,case when SColumns = '' then '(PARTITION)' else SColumns || case when inclPartition='Y' then ',PARTITION)' else ')' end end StatColumns
from rec_tbl qualify sum(1) over (partition by procid,queryid order by Rnk desc, QueryID rows unbounded preceding) = 1
) with data
primary index (procid,queryid)
on commit preserve rows
;

Reports

  • How many statistics have been collected, and at what resource cost?

select
  UserName /*Or: DatabaseName*//*Or: Full_sample*/
, count(*) Nbr
, sum(AMPCpuTIme) CPU
, sum(TotalIOCount) IO
from DBQLStat
group by 1
order by 1
;
  • Which statistics have been collected multiple times?
    (If several days are in the prepared data, the frequency can be determined; remove the "qualify" clause)
select a.*,
sum(1) over (partition by databasename,tablename,statcolumns)  Repl
from DBQLStat a

/* Comment for frequency report*/
qualify sum(1) over (partition by databasename,tablename,statcolumns) > 1

/*end*/
order by repl desc, databasename,tablename,statcolumns
;



Thursday, December 19, 2013

DBQL analysis II. - Monitor "Top skewed/impact queries"


Find Skewed CPU and I/O in DBQL

Please note that the solutions in this article work on the DBQL logs, which cover only the activity of those users for whom logging is switched on.

About Skew

If you are not comfortably familiar with skewness or DBQL, please read the corresponding posts first:

TeradataTuning - Skewness
TeradataTuning - DBQL I
As you can see, the difference between Impact[Resource] and Total[Resource] is a net technological loss, which should be minimized.

Skewness is a good indicator for highlighting the workload that wastes lots of valuable CPU and I/O resources because of inefficient parallelism.
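
A quick worked example of the metrics used below: on a 20-AMP system, a query with AMPCPUTime = 100 seconds and MaxAMPCPUTime = 10 seconds has CPUImpact = 10 * 20 = 200 seconds (as if every AMP had worked as hard as the busiest one), and CPUSkew% = 100 - (100/20) * 100 / 10 = 50. The 100 CPU seconds of difference between Impact and Total is the loss.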

Find bad queries in DBQL

With this SQL you can list the top impacting queries (replace the date value; you may have to adjust the date filtering according to your local settings):

select  top 50
  ProcID
, QueryID
, AMPCPUTime
, MaxAMPCPUTime * (hashamp () + 1) CPUImpact
, CAST (100 - ((AmpCPUTime / (hashamp () + 1)) * 100 / NULLIFZERO (MaxAMPCPUTime)) AS INTEGER) "CPUSkew%"
, TotalIOCount
, MaxAMPIO * (hashamp () + 1) IOImpact
, CAST (100 - ((TotalIOCount / (hashamp () + 1)) * 100 / NULLIFZERO (MaxAMPIO) ) AS INTEGER) "IOSkew%"
, AMPCPUTime * 1000 / nullifzero (TotalIOCount) LHR
, TotalIOCount / nullifzero (AMPCPUTime * 1000) RHL
, ParserCPUTime
, Queryband
, Substr(QueryText,1,2000) QueryText
from
/* For archived DBQL
    dbql_arch.dbqlogtbl_hst where logdate=1131201
and ampcputime>0
*/
/* For online DBQL*/
    dbc.dbqlogtbl where
    cast(cast(starttime as char(10)) as date) = '2013-12-18' (date)
    and ampcputime>0
order by CPUImpact desc


Explanation of extra fields:
ParserCPUTime: Time the parser spent on producing the execution plan. This can be high if the SQL is too complex or too much random AMP sampling has to be done.
LHR/RHL: Larry Higa (inverse Larry Higa) index. An empirical index that shows the CPU vs. I/O rate. By experience it should usually be around one (the exact value can differ depending on your system configuration, but it is a constant for a given system). If it is far from 1, it indicates CPU or I/O dominance, which means unbalanced resource consumption - but that is a different dimension than skew.
QueryBand: Labels that sessions use to identify themselves within the DBQL logs
QueryText: The first characters of the query (how many are logged depends on the DBQL settings)

OK, we've listed the terrible top consumers, but what's next?
We have to identify those queries. If your ETL and analytics software is configured to use QueryBand properly (this area deserves a separate post...), you can find out which job or report issued that SQL; in any case, you can look at the QueryText field.

If you want to get the full SQL text, select it from the DBQLSQLTbl (SQL logging needs to be switched on), replacing the appropriate <procid> and <queryid> values:

select SQLTextInfo from dbc.dbqlsqltbl where procid=<procid> and queryid=<queryid>
order by SQLRowNo asc


You will get the SQL in several records, broken up into 30K blocks; simply concatenate them. Unfortunately the SQL will have a very ugly makeup; you can use PRISE Tuning Assistant to beautify and highlight it for easy reading.

System level Skewness

Totals

We've found those bad queries, nice. But what can we say about the whole system? What is the total parallel efficiency? Can we report how much resource was wasted due to bad parallel efficiency?
The answer is: yes, we can estimate it quite closely. We cannot calculate the exact value, because DBQL does not log per-AMP information for the query execution, only the most important metrics.
We also cannot account for the situation when several skewed queries run at the same time but peak on different AMPs. This reduces the system-level resource waste, but it is hard to calculate with; anyway, its probability and effect are negligible here.

select
  sum(AMPCPUTime) AMPCPUTimeSum
, sum(MaxAMPCPUTime * (hashamp () + 1)) CPUImpactSum
, sum(TotalIOCount) TotalIOCountSum
, sum(MaxAMPIO * (hashamp () + 1)) IOImpactSum
, cast(100 - (AMPCPUTimeSum / CPUImpactSum) * 100 as integer) "CPUSkew%"
, cast(100 - (TotalIOCountSum / IOImpactSum) * 100 as integer) "IOSkew%"
from
/* For archived DBQL
    dbql_arch.dbqlogtbl_hst where logdate = '2013-12-18' (date)     

    and (ampcputime>0 or TotalIOCount > 0)
*/
/* For online DBQL*/
    dbc.dbqlogtbl where
    cast(cast(starttime as char(10)) as date) = '2013-12-18' (date)
    and (ampcputime>0 or TotalIOCount > 0)


Look at the last two columns. That percent of your CPU and I/O goes to the sink...

Top bad guys

OK, let's check: how many queries accumulate 5%, 10%, 25%, 50%, 75% and 90% of this loss?
Here you are (CPU version; transform it for I/O analogously):

select 'How many queries?' as "_",min(limit5) "TOP5%Loss",min(limit10) "TOP10%Loss",min(limit25) "TOP25%Loss",min(limit50) "TOP50%Loss",min(limit75) "TOP75%Loss",min(limit90) "TOP90%Loss", max(rnk) TotalQueries, sum(ResourceTotal) "TotalResource", sum(ResourceImpact) "ImpactResource"
from
(
select
 case when ResRatio < 5.00 then null else rnk end limit5
,case when ResRatio < 10.00 then null else rnk end limit10
,case when ResRatio < 25.00 then null else rnk end limit25
,case when ResRatio < 50.00 then null else rnk end limit50
,case when ResRatio < 75.00 then null else rnk end limit75
,case when ResRatio < 90.00 then null else rnk end limit90
,rnk
, ResourceTotal
, ResourceImpact
from
(
select
  sum(ResourceLoss) over (order by ResourceLoss desc ) totalRes
, sum(ResourceLoss) over (order by ResourceLoss desc  rows unbounded preceding) subtotalRes
, subtotalRes *100.00 / totalRes Resratio
, sum(1) over (order by ResourceLoss desc  rows unbounded preceding) rnk
, ResourceTotal
, ResourceImpact
from
(
select
  AMPCPUTime ResourceTotal
, (MaxAMPCPUTime * (hashamp () + 1)) ResourceImpact
,  ResourceImpact - ResourceTotal ResourceLoss
/* For archived DBQL
from dbql_arch.dbqlogtbl_hst where logdate=1131207
and ampcputime>0
*/
/* For online DBQL*/
from dbc.dbqlogtbl where
cast(cast(starttime as char(10)) as date) = '2013-12-18' (date)
and ampcputime>0
) x
) y
) z
group by 1



I expect you are a bit shocked now at how few queries waste how much golden resource.
I think we will agree that it is worth tuning those dozen queries, saving on the order of 100K..1M USD for your company annually, am I right?

PRISE Tuning Assistant helps you to find those queries and to get the hang of how to accelerate them.

Typical reasons for skewness - in a nutshell

  • Skewed tables: Bad choice of PI, Skewed data 
  • Bad execution plans (typically skewed redistributions)
    • Bad data model (normalization,data types,PI, etc.)
    • Missing or stale statistics
    • Too many joins (break up the query!)
  • Hash collision (load time problem)


Tuesday, December 10, 2013

DBQL analysis I. - Monitor "Top CPU consumers"


CPU usage distribution

About DBQL

What is it?


DataBase Query Logging.
It is a nice feature of the Teradata RDBMS that comprehensively logs the execution of issued queries - if it is switched on.

The configuration can be checked/administered e.g. in the Teradata tools or from DBC.DBQLRuleTbl.
Logging can be set at global/user level, and with respect to the level of detail (see DBQL tables).

For detailed information please refer to the Teradata documentation of your version.

DBQL tables


Table           Content
DBQLogTbl       Central table, 1 record for each query.
DBQLSQLTbl      The whole SQL command, broken up into 30K blocks.
DBQLStepTbl     Execution steps of the query, one row for each step.
DBQLObjTbl      Objects participating in the query. Logged at different levels (db, table, column, index, etc.).
DBQLExplainTbl  English explain text, broken up into 30K blocks.
DBQLXMLTbl      Explain in XML format, broken up into 30K blocks.
DBQLSummaryTbl  PEs' aggregated table, which accounts at the desired level.

The DBQL tables are logically organized into a 1:N structure, where DBQLogTbl is the master entity and the others (except DBQLSummaryTbl) are its children.
The join fields are ProcID and QueryID together, e.g.:
...
from DBQLogTbl a
join   DBQLStepTbl b on a.ProcID=b.ProcID and a.QueryID = b.QueryID
...
Unfortunately the PIs of the DBQL tables are not in sync with the logical PK-FK relations (even in the latest V14.10), therefore joined selects against the online DBQL tables are not optimal.

Cost of using DBQL

DBQL basically consumes a negligible amount of processing resources, since it uses cached, batched writes and generates data proportional to the issued queries (the flush rate is a DBSControl parameter).
It is important to regularly purge/archive the logs from the DBC tables; Teradata has a recommendation for it. This ensures that the PERM space consumption of DBQL remains low.
In an environment where ~1M SQLs are issued a day, comprehensive logging generates ~8..10G of DBQL data daily without XML and Summary. Fewer SQLs generate proportionally less data.

It is worth switching on every option except XML and Summary, since the former generates a huge data volume (it roughly doubles it), and the latter is similar to the Acctg info. If you want to utilize them, they should be switched on, of course.
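
For reference, a sketch of switching comprehensive logging on for all users; the exact options and limits should follow your site standards and Teradata version:

BEGIN QUERY LOGGING WITH SQL, OBJECTS, EXPLAIN, STEPINFO LIMIT SQLTEXT=2000 ON ALL;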

What is it good for?

It contains:
  • Query run time, duration
  • Consumed resources
  • Environment info (user, default db, etc)
  • SQL text
  • Explain
  • Step resource info
  • Objects involved
  • Etc.
One can get a lot of useful aggregated and query-specific tuning information from it; I will share some of this in the blog.

CPU usage distribution info

(Everything applies to I/O also, just replace CPU with I/O, AMPCPUTime with TotalIOCount...)

Do you think Query optimization is rewarding?


Yes, I know it is hard work to find out why ONE query runs sub-optimally, and what to do about it.

But guess how many queries consume how much of the processing resources (CPU) within a whole day's workload.
Take a guess and write it down for these CPU% levels: 5%, 10%, 25% and 50%.

And now run the query below, which will give you the result. (Replace the date value; you may have to adjust the date filtering according to your local settings.)

select 'How many queries?' as "_",min(limit5) "TOP5%CPU",min(limit10) "TOP10%CPU",min(limit25) "TOP25%CPU",min(limit50) "TOP50%CPU", max(rnk) TotalQueries
from
(
select
case when CPURatio < 5.00 then null else rnk end limit5
,case when CPURatio < 10.00 then null else rnk end limit10
,case when CPURatio < 25.00 then null else rnk end limit25
,case when CPURatio < 50.00 then null else rnk end limit50
,rnk
from
(
select
  sum(ampcputime) over (order by ampcputime desc ) totalCPU
, sum(ampcputime) over (order by ampcputime desc  rows unbounded preceding) subtotalCPU
, subtotalCPU *100.00 / totalCPU CPUratio
, sum(1) over (order by ampcputime desc  rows unbounded preceding) rnk
from
(
select *

/* For archived DBQL
from dbql_arch.dbqlogtbl_hst where logdate=1131201 

and ampcputime>0
*/
/* For online DBQL*/
from dbc.dbqlogtbl where
cast(cast(starttime as char(10)) as date) = '2013-12-10' (date) 

and ampcputime>0
) x
) y
) z
group by 1



Are you surprised?
I bet:
  • Less than 10 queries will consume 5% of the CPU
  • Less than 1% of the queries will consume 50% of the CPU
Let's calculate.
How much does your Teradata system cost a year? It all pays for storage and processing capacity.
If you can save e.g. X% of CPU & I/O and X% of storage using MVC optimization, you have saved X% of the price of the Teradata system, by:
  • Improved user experience (earlier loads, faster responses)
  • Resources freed up for additional reports and applications
  • Enabling the postponement of a very expensive Teradata hardware upgrade

PRISE Tuning Assistant helps you to find those queries and to get the hang of how to accelerate them.
