Thursday, April 17, 2014

Using Queryband


How to use querybanding in Teradata?

What is queryband?

Teradata is a diligent RDBMS that sometimes runs millions of SQL statements a day. You will see them in the DBQL (DataBase Query Logging) area - if it is switched on - but it is hard to find your way around in that mass of queries. How can I find a specific query? What was that query run by? If I want to analyze or modify something, I need to identify the source of the execution as precisely as possible.
Queryband is a labelling facility for flagging queries so that their source job/report/etc. can be found easily.

Who defines the queryband?

Setting the queryband is usually the responsibility of the query runner:

  • ETL software or solution that executes it
  • OLAP tool that issues it
  • The person who runs it ad hoc

How to set the queryband?

Technically it is quite simple: Teradata provides a command to set it:

SET QUERY_BAND = {'<variable1>=<value1>;<variable2>=<value2>;...' / NONE} [UPDATE] for SESSION/TRANSACTION;

, where:
<variable1>=<value1>;
Queryband can consist of an arbitrary number of "variable"-"value" pairs. Both are string values. Do not forget to put the semicolon after each variable-value pair!

NONE: clears the queryband 

UPDATE: if specified, the variables that have previously been defined are updated with their new values, and the others are added with the given values. An empty value string is valid content and will not remove the variable. Please note that deleting a variable is only possible by redefining the queryband without that specific variable.


SESSION/TRANSACTION: what it says...
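
For illustration, a minimal sketch of how UPDATE behaves (variable names and values are only examples):

SET QUERY_BAND = 'PROJECT=TeraTuningBlog;TASK=QB_example;' FOR SESSION;
-- Later in the same session: TASK gets a new value, RELEASE is added,
-- PROJECT remains unchanged
SET QUERY_BAND = 'TASK=QB_example2;RELEASE=2014.04;' UPDATE FOR SESSION;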

Where can I check queryband?

The values are reflected in dbc.SessionInfoX.QueryBand and dbc.DBQLogTbl.QueryBand. The following example shows their content:

SET QUERY_BAND='PROJECT=TeraTuningBlog;TASK=QB_example;' for session;

(For the logged in session)
SELECT queryband FROM dbc.sessioninfoX WHERE sessionNo=session;
----------------------------------------------------
PROJECT=TeraTuningBlog;TASK=QB_example;

(For formerly run queries)
SELECT queryband FROM dbc.dbqlogtbl WHERE Queryid=...;
----------------------------------------------------
=S> PROJECT=TeraTuningBlog;TASK=QB_example;

(For a specific variable, e.g. "PROJECT")
SELECT QB_PROJECT FROM
(
   SELECT CAST((case when index(queryband,'PROJECT=') >0 then substr(queryband,index(queryband,'PROJECT=') ) else '' end) AS VARCHAR(2050)) tmp_PROJECT
     ,CAST( (substr(tmp_PROJECT,characters('PROJECT=')+1, nullifzero(index(tmp_PROJECT,';'))-characters('PROJECT=')-1)) AS VARCHAR(2050)) QB_PROJECT
   FROM dbc.sessioninfoX 
WHERE sessionNo=session
) x ;

----------------------------------------------------
TeraTuningBlog

(Which queries have been run by the "LoadCustomers" project?)
   SELECT a.*, CAST((case when index(queryband,'PROJECT=') >0 then substr(queryband,index(queryband,'PROJECT=') ) else '' end) AS VARCHAR(2050)) tmp_PROJECT
     ,CAST( (substr(tmp_PROJECT,characters('PROJECT=')+1, nullifzero(index(tmp_PROJECT,';'))-characters('PROJECT=')-1)) AS VARCHAR(2050)) QB_PROJECT
   FROM dbc.dbqlogtbl a 
WHERE QB_PROJECT='LoadCustomers'
;

Designing querybanding

We know how to set the queryband; it is quite easy to build into or configure in the ETL tool, OLAP software and other query running applications. But what variables should we define, and how should we populate them? I give a best practice below, but it is just a recommendation and can be modified according to your taste.

First of all, some things to mind:
  • Use short variable names and values, since they will be logged in each DBQL record
  • Define a consistent structure in each source system so the data is easy to analyze
  • Record information as detailed as you need, no more, no less. Define distinct values for those items you later want to differentiate. Using a lookup/hierarchy table you can easily merge what you need, but you can never drill down into what has already been aggregated.
I recommend these variables to be defined:
  • SYS: Maximum of 3 characters ID of the system that ran the Query, like INF (Informatica), MST (Microstrategy), SLJ (SLJM), BO (Business Objects), AH (ad-hoc query tool)
  • ENV: P (Production) / Tx (Test x) / Dx (Development x), the identifier of the environment. x may be omitted if it does not matter
  • JOB: Which job or report contains that specific query (the name of it)
  • STP: (Step) Which SQL script, or other sub-structure does the query belong to (name of it)
  • VER: Version of the JOB. This will determine the version of the script (if available)
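
Putting the recommendation together, a queryband set by an ETL job might look like this (all values are purely illustrative):

SET QUERY_BAND = 'SYS=INF;ENV=P;JOB=LoadCustomers;STP=stg_to_dwh;VER=1.3;' FOR SESSION;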

Sponsored by PRISE Ltd.
www.prisetools.com

Thursday, April 03, 2014

Using Partitioned Primary Index II


How to choose partitioned primary index (PPI)

This post is a follow-up to my PPI basics post.

What is the difference between NPPI and PPI?

  • NPPI: Non partitioned primary index
    The good old regular PI. The rows are distributed by HASHAMP(HASHBUCKET(HASHROW(PI))), and ordered by HASHROW(PI), nothing special
  • PPI: Partitioned primary index
    Distribution is the same, but the ordering is different: <PartitionID><HASHROW(PI)>. The <PartitionID> is a value stored in each row, allocating 2 or 8 bytes (see below).
The only difference is the storing order of the records (and the 2/8 bytes overhead).

What is PPI good for?

The partitioning feature - like in many other databases - usually solves performance issues by eliminating needless work in specific situations.
  • SELECT
    Eligible "where" conditions result in serious partition elimination, which means that usually only a small fraction of the table has to be scanned instead of the whole one.
  • INSERT
    Consider the storing order of an NPPI table: the records are in "hash" order, so if I insert a series of records into a Teradata table, they will land in widely scattered data blocks. If the table is big enough, my new - say - 1000 records will get into ~1000 different data blocks, which means 1000 expensive "random writes". However, if those 1000 records go into a PPI table and share the same PartitionID, they will very likely get into far fewer than 1000 data blocks. In real-life situations we often write to contiguous data blocks with much cheaper "sequential writes".
  • DELETE
    Same as INSERT
  • BACKUP
    Teradata allows archiving only one or more selected partitions, saving a lot of time and tape. Older data in transaction tables usually does not change, therefore it is unnecessary to back it up every time.
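
As a reference, a minimal sketch of a daily partitioned transaction table (table and column names are made up for illustration):

CREATE TABLE SalesTrx
(
  Trx_id    INTEGER NOT NULL
, Trx_date  DATE    NOT NULL
, Amount    DECIMAL(12,2)
) PRIMARY INDEX (Trx_id)
PARTITION BY RANGE_N (Trx_date BETWEEN DATE '2010-01-01' AND DATE '2020-12-31' EACH INTERVAL '1' DAY, NO RANGE, UNKNOWN);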

"Good-to-know"s

Costs of partitioning

Like all good things in the world, partitioning has trade-offs also:
  • Extra 2/8 bytes of storage space allocated per record
    Depending on the maximal number of partitions. See the "Number of partitions" chapter
  • Slower "SAMPLE" scans
    Proper random sampling is more complex, since the physical storing order correlates with the partitioning value
  • Extra sort operations / Sliding window joins
    Joining to a table that has an NPPI, or a PPI with a not exactly matching definition, results in a preparatory "sort" step, or leads to a "sliding window merge join", which is technically N x M merge joins between the partitions of TableA and TableB.

Number of partitions

How many partitions should I have?
How many partitions do I have?
What does an empty partition look like?
These are all interesting questions; let's take a look at how Teradata implements partitioning.

A partition is not an object, it is just a calculated (and stored) value in the record, which determines the physical storing order of the record. A partition does not allocate space; an "empty partition" simply means that no record exists with that partition's PartitionID, nothing else.
How many partitions do I have in the table? As many distinct PartitionID values as occur in the existing records, which depends on the occurring values of the partitioning column.
How many partitions can I have in the table? It depends on the table definition. One must use the RANGE_N or the CASE_N function to define the PartitionID calculation, and its definition unambiguously determines how many different PartitionID values may occur. Up to V13.10, 65535 partitions are allowed (2-byte PartitionID); from V14.00 we can have as many as ~9.2 quintillion (8-byte PartitionID). A table definition cannot be altered to switch between the 2- and 8-byte layouts.

What is the drawback of having many partitions? The sliding-window merge join. Consider including the partitioning column in the PI if possible (otherwise PI-based filtering causes as many accesses as there are partitions).

What happens with the out-of-range records?

The PPI definition can contain the clauses NO RANGE and NO CASE. They designate a PartitionID for the records that fall outside the defined ranges or cases: all such records go into this partition. It can be a hidden trap: if you forget to maintain the date partition definition of a transaction table, from a certain moment on every new record goes into this partition. The partition keeps fattening, and queries somehow keep slowing down...

Multi level partitioning

This is a good trick. One can define partitioning "hierarchically", which is simply a "Cartesian product" of the partitions at each level; the result is a single PartitionID. In case of 2-byte partitioning, the size of this "Cartesian product" must stay below 65535.

What is remarkable in Teradata's implementation of multi-level PPI? Even if you filter only on lower-level partitioning key(s), partition elimination still happens. How? The optimizer calculates all possible combinations and produces the list of PartitionIDs to be scanned - excellent.
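
A minimal sketch of a two-level partitioning definition (names and ranges are illustrative):

CREATE TABLE SalesTrx_ML
(
  Trx_id     INTEGER NOT NULL
, Trx_date   DATE    NOT NULL
, Country_cd INTEGER NOT NULL
) PRIMARY INDEX (Trx_id)
PARTITION BY (
  RANGE_N (Trx_date BETWEEN DATE '2010-01-01' AND DATE '2020-12-31' EACH INTERVAL '1' MONTH, NO RANGE)
, CASE_N (Country_cd < 100, Country_cd < 200, NO CASE)
);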

Partitioning granularity

The next good question is: how fine should I define partitioning?
It depends. Basically I'd branch to two main cases:
  • "Temporal" (date) partitioning
    The best partition size is the day. Most of the filtering is at day level, and we have ~365 days a year, which is not too many partitions even over many years. If we partition by month, the partition elimination ranges become coarser, while we still get 12 partitions a year, which can also be too many in case of a PPI-NPPI join.
  • All others
    It really depends. It depends on the goal and on the value demographics. It is good to align it with the filtering pattern (what the frequent, relevant 'where' condition fragment looks like).
I hope this helped; please ask if something is missing or confusing.

Sponsored by PRISE Ltd.
www.prisetools.com

Tuesday, March 18, 2014

Storing date&time columns


How to store date and time info effectively

Introduction

Data Warehouse databases usually contain a significant amount of date/time information. The physical modeling technique can seriously influence their storage space and usability.

Aspects

Date/time information can be stored in different ways/data types, each of which has its own specialities.
Basic options:
  • Joint storage: Timestamp
    • Timestamp(n) , when n means the fractional digits of seconds
  • Separate storage: Date & Time
    • Date column + Time column

Storage space

The data types require the following space (if uncompressed):

  • Date: 4 bytes
  • Integer time (integer format '99:99:99'): 4 bytes
  • Time(n): 6 bytes, independent of n*, where n:[0..6]
  • Time(n) with time zone: 8 bytes, independent of n*, where n:[0..6]
  • Timestamp(n): 10 bytes, independent of n*, where n:[0..6]
  • Timestamp(n) with time zone: 12 bytes, independent of n*, where n:[0..6]
* n means the number of fractional second digits

Usage complexity

Teradata is not the most ergonomic tool for handling date-time data. Operations with these data types are typically tricky and sometimes hide traps (try add_months('2014-01-31',1) ). Converting a date and converting a timestamp behave differently, so decisions must be made considering both storage and usage aspects.
  • Conversions
    • Date: implicit conversions work, easy and comfortable
    • Integer time: works fine, but insert-select will lose the formatting, only the integer value will remain
    • Time(n): implicit conversion to string is not working. This fails: select cast('2014-01-31' as date) || ' ' ||cast('12:00:00' as time(0))
    • Timestamp(n): brrr. Different precisions will not convert automatically either. I don't like it.
  • Filtering: comparing date/datetime values with < / <= /between operators
    • Joint storage (timestamps)
      Straightforward, just use the values - if they are equivalent data types
    • Separate storage
      You have to convert to a "joint" format, either a string or a timestamp before
  • Arithmetic
    • Date: ok, adding a constant, subtracting dates work fine
    • Integer time: do not use arithmetic, results are bad!
    • Time(n): interval types are accepted. Not really comfortable, e.g. at most a 99-second-long interval is accepted (V13.10)
    • Timestamp(n): same as Time(n)
    Regarding arithmetic, I suggest building your own UDF library; it will ease your life.
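
As noted under "Filtering" above, separately stored date and time must be converted to a joint format before range comparisons. A minimal sketch of one possible conversion (assuming hypothetical columns Call_date DATE and Call_time INTEGER holding hhmmss-encoded values in a hypothetical table T_CALL):

SELECT
  CAST(Call_date AS TIMESTAMP(0))
  + (Call_time / 10000)       * INTERVAL '1' HOUR
  + (Call_time / 100 MOD 100) * INTERVAL '1' MINUTE
  + (Call_time MOD 100)       * INTERVAL '1' SECOND AS call_ts
FROM T_CALL
WHERE Call_date BETWEEN DATE '2014-03-01' AND DATE '2014-03-18';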

Recommendations

Choosing data type

I recommend to choose data types depending on the table type and usage purposes.
I differentiate between "transaction" and "all other" table types, because transaction tables usually allocate most of the PERM space, while the others are many in number but allocate "negligible" space.
  • Transaction
    • Separate storage
    • Integer time
  • All others
    • Joint type (timestamp)

Saving space - store "delta"

The biggest tables in data warehouses are the "transaction tables" (call/purchase/transfer/etc. transactions, depending on the industry), and most of them contain several date fields, which are typically strongly correlated. Let me explain what I mean. Let's take a call record (telco) that has the following date(&time) columns:
  • Channel_seizure
  • Call_start
  • Call_end
  • Bill_cycle_start
  • Bill_cycle_end
The date component of the first three columns is the same in 99% of the records, and the last two differ from the first ones by at most 30 days.

My recommendation is the following:
  • Choose a "primary date"
    Must be not null, and it is typically used as the partitioning key as well, since it is the most frequent date filtering condition. In our case this will be Call_start
  • Choose separate date-time storing
    Eg. Date and Integer time , as this combination requires the least space
  • Store the non-primary dates as deltas, multi-value compressed
    Compute it in the load process, like this:
    Call_end_delta := Call_end-Call_start
  • Compress the "delta" columns
    They will show low deviation and are highly compressible; use PRISE Compress Wizard
  • Convert to absolute dates back in the view layer
    Call_start + Call_end_delta as "Call_end"
Example:

CREATE TABLE T2000_CALL_TRX
(
...
, Call_start_date Date NOT NULL
, Call_end_date_delta Integer COMPRESS (0)
...
) PRIMARY INDEX (...,Call_start_date)
PARTITION BY RANGE_N ( Call_start_date BETWEEN date '2010-01-01' AND date '2020-12-31' EACH interval '1' day, NO RANGE, UNKNOWN)
;

CREATE VIEW V2000_CALL_TRX
as
SELECT
...
, Call_end_date_delta +Call_start_date as "Call_end_date"
...
FROM
T2000_CALL_TRX

;

Sponsored by PRISE Ltd.
www.prisetools.com

Friday, March 07, 2014

How to optimize a Teradata query?


Teradata SQL optimization techniques

Introduction

The typical goal of SQL optimization is to get the result (data set) using less computing resources and/or with a shorter response time. We can follow several methodologies depending on our experience and studies, but in the end we have to answer the following questions:
  • Is the task really heavy, or is only the execution of the query non-optimal?
  • What is/are the weak point(s) of the query execution?
  • What can I do to make the execution optimal?

Methodologies

The common part of all methodologies is that we have to understand - more or less - what is happening during the execution. The more we understand what goes on behind the scenes, the better we can spot the appropriate point of intervention. One can start with the trivial stuff: collect some statistics, create indices, and continue with query rewriting, or even modifying the base table structures.

What is our goal?

First of all we should branch on what we have to do:
  1. Optimize a specific query that has been run before and we have the execution detail info
    Step details clearly show where the big resources were burnt
  2. In general, optimize the non-optimal queries: find them, solve them
    Like case 1, but first we have to find those queries, and then solve them one by one
  3. Optimize a query that has no detailed execution info, just the SQL (and "explain")
    Deeper knowledge of the base data and of the "Teradata way of thinking" is required, since no easy and trustworthy resource peak detection is available. You have to imagine what will happen, and what can be done better

Optimization in practice

This section describes case 2, and assumes that detailed DBQL data is available.
In this post I will not attach example SQLs, because I have also switched to using PRISE Tuning Assistant to get all the information needed for performance tuning, instead of writing complex SQL queries and making heaps of paper notes.

Prerequisites

My opinion is that DBQL (DataBase Query Logging) is the fundamental basis of Teradata system performance management - from the SQL optimization point of view. I strongly recommend switching DBQL comprehensively ON (SQL, Step, Explain and Object logging are important; XML can be excluded, since it is huge but adds little extra), and archiving the online tables daily - just follow the Teradata recommendations.

Finding good candidate queries

DBQL is an excellent source for selecting the "low hanging fruits" of performance tuning. The basic rule: big savings can only be gained on expensive items, so let's focus on the top resource consuming queries first. But what counts as high resource consumption? I usually check the top queries by one or more of these properties (a minimal DBQL query sketch follows the list):
  • Absolute CPU (CPU totals used by AMPs)
  • Impact CPU (CPU usage corrected by skewness)
  • Absolute I/O (I/O totals used by AMPs)
  • Impact I/O   (Disk I/O usage corrected by skewness)
  • Spool usage
  • Run duration
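
For orientation, one possible search directly against DBQL (assuming logging into dbc.dbqlogtbl; the date is illustrative):

SELECT TOP 20
       UserName, QueryID, StartTime, AMPCPUTime, TotalIOCount, SpoolUsage
FROM   dbc.dbqlogtbl
WHERE  CAST(StartTime AS DATE) = DATE '2014-03-07'
ORDER  BY AMPCPUTime DESC;
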
PRISE Tuning Assistant supplies an easy to use and quick search function for that:



Finding weak point of a query

Examining a query begins with the following steps:
  • Does it have few or many "peak steps", that consume much resources? 
    • Which one(s)?
    • What type of operations are they?
  • Does it have high skewness?
    Bad parallel efficiency, very harmful
  • Does it consume an extremely large amount of spool?
    Compared to other queries...
PRISE Tuning Assistant again.
Check the yellow highlights in the middle, those are the top consuming steps:

    Most of the queries will have one "peak step", that consumes most of the total resources. Typical cases:
    • "Retrieve step" with redistribution
      Large number of rows and/or skewed target spool
    • "Retrieve step" with "duplication-to-all-AMPs"
      Large number of rows duplicated to all AMPs
    • Product join
      Huge number of comparisons: N * M
    • Merge or Hash join
      Skewed base or prepared (spool) data
    • OLAP function
      Large data set or skewed operation
    • Merge step
      Skewness and/or many hash collisions
    • Any kind of step
      Non small, but strongly skewed result

    What can we do?

    The Teradata optimizer tries its best when producing the execution plan for a query, however it sometimes lacks proper information or its algorithms are not perfect. We - as humans - may have additional knowledge of either the data or the execution, and we can nudge the optimizer towards better decisions. Let's see our possibilities.
    • Supplement missing / refresh stale statistics
    • Drop disturbing statistics (sometimes occurs...)
    • Restructure the query
    • Break up the query, place part result into volatile table w/ good PI and put statistics on
    • Correct primary index of target / source tables
    • Build secondary/join index/indices
    • Add extra components to the query.
      You may know some additional "easy" filter that lightens the work. E.g. if you know that the join will only match the last 3 days' data of a year-covering table, you can add a date filter, which costs pennies compared to the join.
    • Restrict the result requirements to the real information demand.
      Does the end user really need that huge amount of data, or just an extract of it?

    What should we do?

    First of all, we have to find the root cause(s). Why does that specific top step consume such a huge amount of resources, or why does it execute so skewed? If we find and eliminate the cause, the problem is usually solved.
    My method is the following:
    1. Find the top consuming step, and determine why it is a high consumer
      • Its result is huge
      • Its result is skewed
      • Its work is huge
      • Its input(s) is/are huge
    2. Track the spool flow backwards from the top step, and find
      • Low fidelity results (row count falls far from estimated row count)
      • NO CONFIDENCE steps, specifically w/low fidelity
      • Skewed spool, specifically non small ones
      • Big duplications, specifically w/NO CONFIDENCE
    3. Find the solution
      • Supplement missing statistics, typically on PI, join fields or filter condition
        NO CONFIDENCE, low fidelity, big duplications
      • Break up the query
        Store the part result whose fidelity is very bad or whose spool is skewed into a volatile table, and choose a better PI for it (see the sketch after this list)
      • Modify PI of the target table
        Slow MERGE step, typical hash-collision problem.
      • Eliminate product joins
      • Decompose large product join-s
      • Etc.
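
    A minimal sketch of the "break up the query" technique (table, column and index names are made up for illustration):

    CREATE VOLATILE TABLE vt_part_result AS
    (
      SELECT cust_id, SUM(amount) AS amt     -- the problematic sub-select of the original query
      FROM SalesTrx
      GROUP BY 1
    ) WITH DATA
    PRIMARY INDEX (cust_id)                  -- PI chosen to support the subsequent join
    ON COMMIT PRESERVE ROWS;

    COLLECT STATISTICS ON vt_part_result COLUMN (cust_id);

    -- Then join vt_part_result to the rest of the original query.
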
    Have a good optimization! :)

    Sponsored by PRISE Ltd.
    www.prisetools.com

    Monday, March 03, 2014

    No more spool space


    Why do I get "No more spool space" error?

    This is the most familiar error message in Teradata world:
    "Failure 2646 No more spool space"

    What does it really mean, what is it caused by?
    Let's get back to the basics.

    What is spool space?

    Spool space is a temporary area that is used to store partial results during query processing, as well as volatile tables. All free space in the database that is not allocated by PERM data can technically be used as spool area, since no PERM data occupies it.

    Each database user may have a "spool limit" that prevents the user from allocating more spool area at a time than that limit. Keep in mind that all active sessions of a user share the spool limit.

    Teradata is a massively parallel system, therefore the spool limit is interpreted at AMP level:
    E.g. on a 100-AMP system, a user with a 10G spool limit gets 100M of spool per AMP.
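
    A quick way to check a user's spool limit and current usage is a sketch like this against dbc.DiskSpaceV (the user name is illustrative; on older releases use dbc.DiskSpace):

    SELECT DatabaseName
         , SUM(MaxSpool)     AS max_spool_bytes
         , SUM(CurrentSpool) AS current_spool_bytes
    FROM dbc.DiskSpaceV
    WHERE DatabaseName = 'ETL_USER'
    GROUP BY 1;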

    What is spool space limit good for?

    This limitation method is quite a simple way to cut off those queries from the system that would suck up too many resources. There is no exact relationship between high spool usage and an ineffective query, but statistically the correlation is high.
    Practically: a bad query is kicked off before it unnecessarily consumes too many resources.

    No more spool space scenarios

    System ran out of spool space

    This is the rarest situation, you can almost forget about it. It means there is too little free space on the system, but this is usually avoided by defining a "SpoolReserve" database in which no objects are created, so that area is always available for spool.
    If many "big spool limit" users run high-spool queries in parallel, this rare situation can still occur.

    Multiple session of the user are active together

    This is also quite a rare situation. Check the active sessions of the user in dbc.sessioninfo.
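
    A simple sketch for checking how many active sessions each user has:

    SELECT UserName, COUNT(*) AS session_count
    FROM dbc.sessioninfo
    GROUP BY 1
    ORDER BY 2 DESC;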

    Volatile tables

    All existing volatile tables reside in your available spool space, reducing what remains. If you create many of them - possibly with skewed distribution - you can fill your spool up. Choose the primary index carefully when defining volatile tables, too.

    Improper execution plan

    These make up more than 90% of the cases that cause "No more spool space" errors. Let's see how:
    • "Duplication to all AMPs" of a non-small set of records
      The root cause is typically missing or stale statistics. Either the system thinks that far fewer records will be duplicated than in reality (sometimes billions of records end up in this kind of spool), or it knows the number exactly, but the other branch of the query has even lower quality estimations and this execution seems cheaper.
    • Redistribution of records by a hash that causes skewed distribution
      Check the corresponding blog post: Accelerate skewed joins
    • Retrieve huge amount of records into spool (locally or redistributed onto the AMPs)
      Specific query structures imply this execution, like: join to a view that "union all"-s big tables.
    I suggest using PRISE Tuning Assistant to identify the problem; it clearly displays which execution step runs into the issues above.
    Increasing the spool limit will not solve the problem in most cases.

    Too big task

    Sometimes a given SQL query requires a big spool area even with the best execution plan.
    This is the only case when raising the spool limit is the solution. But first you have to make sure that the task is really that big. PRISE Tuning Assistant is a good tool for determining this in a minute.

    Sponsored by PRISE Ltd.
    www.prisetools.com

    Friday, February 28, 2014

    Using Join Index


    How to use Join Index (JI) in Teradata

    About JIs

    A join index is most like a "materialized view": it is the stored result of an SQL SELECT, kept like a table, and you can define the primary index (PI) of the stored result.

    What are the main differences between a JI and a Secondary Index?
    • Different internal structure and purposes
    • Appears as separate object (not a sub-table) in the DBC tables/views
    • Can reside in different database than the base table(s)
    • Maintenance is logged in separate DBQL STEP, therefore cost can be easily measured

    What is the difference between JI and a regular table in which I can store the same query result? 
    • You can't access it directly from an SQL query
    • The optimizer automatically decides whether to use it or not:
      • By logic: if the JI can support the query with its data content
      • By cost: if using the JI results in a "cheaper" execution (always collect statistics!)
    • Maintained automatically if the content of the base table(s) change
    • "Locks" the base table(s) against DROP, RENAME, MLOAD, FASTLOAD, RESTORE and ALTER of the indexed column(s).

    JI types

    The join index types I list below are not differentiated by an SQL phrase, but by the structure of the SQL SELECT used in the JI definition.
    They can also be combined in reasonable ways, e.g. <single table - aggregate - sparse> or <multi table - aggregate>, etc.

    Let's take these base tables for our examples:
    CREATE TABLE TBL_A
    (
      Col1 integer
    , Col2 integer
    , Col3 integer
    ) PRIMARY INDEX (Col1)
    ;
    CREATE TABLE TBL_B
    (
      Col1 integer
    , Col2 integer
    ) PRIMARY INDEX (Col1)
    ;

    Single table JI

    This is the simplest case of a join index. We include only one table, and typically choose a different PI than the base table has. There are two significantly different kinds of usage:
    • Non-covering
      We select only the filtering column(s) (these will also be the PI of the JI) and the "ROWID" pseudo column in the JI definition. In this case the filter is strongly selective, and the rowids are put into a spool to be joined to the base table's matching records. The JI can be very small this way, but note that we get an additional "join back" phase.
    • Covering
      The JI selects all columns (or all columns required by the SQL to be supported). That means the base table is not needed at all to satisfy the query. This results in a very fast operation. This case is typically used to eliminate frequent redistributions of some "central" tables.
    This example shows a non-covering index for the query below:
    create join index JI_1 as
    SELECT Col1, Col2
    FROM TBL_A
    PRIMARY INDEX (Col2)
    ;

    select Col3 from TBL_A where Col2=1;

     Multi table JI

    This kind of JI is for accelerating frequent join statements. Technically it stores the result of a join. It can cover all the columns, store just the key pairs, or anything in between.

    create join index JI_2 as
    SELECT a.Col3, b.Col1
    FROM TBL_A a
    join TBL_B b on a.Col2=b.Col2
    PRIMARY INDEX (Col3)
    ;

    Aggregate JI

    The JI's SELECT contains a GROUP BY clause. This case is for caching frequent aggregations. It can be very useful for supporting OLAP applications that do not have internal aggregation-caching methods. Teradata's optimizer is quite clever: it can recognize "intermediate" aggregate JIs and aggregate them further, instead of using the base table. Example:

    create join index JI_3 as
    SELECT Col1, Col2, sum(Col3) X
    FROM TBL_A
    GROUP BY 1,2
    PRIMARY INDEX (Col2)
    ;

    All three SELECTs below can be served from JI_3:
    SQL1: select Col1       , sum(Col3) from TBL_A group by 1;
    SQL2: select Col2       , sum(Col3) from TBL_A group by 1;
    SQL3: select Col1, Col2, sum(Col3) from TBL_A group by 1,2;

    Sparse JI

    These JIs contain a where clause, that is, the index does not cover the whole table. If the rows selected by the supported SQL's where condition form a logical subset of the rows covered by the JI's where condition, then the JI can support the query.
    Typical usage is on transaction tables: we frequently access a transaction table by Customer_id, but its PI is Trx_id. We observe that 90% of the SELECTs fall onto the last 30 days.
    We can put a sparse, non-covering, single-table JI on the table with PI: Customer_id.
    90% of the selects will finish in no time, and 10% of the queries will result in a full table scan, while our JI size remains pretty small: we index only 30 days instead of e.g. 365 days.

    Please note that the where condition of a sparse JI cannot contain "changing" values like current_date. That means the JI must be regularly redefined if a moving-window sparse JI is to be maintained.

    create join index JI_4 as
    SELECT Col1, Col2, Col3
    FROM TBL_A
    WHERE Col3 between 1 and 10
    PRIMARY INDEX (Col2)
    ;

    Query the JI can serve:
    select Col1 from TBL_A where Col2=20 and Col3=4;
    Query the JI cannot serve:
    select Col1 from TBL_A where Col2=20 and Col3=11;
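
    Since current_date is not allowed in the definition, a moving-window sparse JI has to be re-created periodically with literal dates. A minimal sketch, using the Customer_id/Trx_id example above (object and column names are illustrative):

    DROP JOIN INDEX JI_LAST30;
    CREATE JOIN INDEX JI_LAST30 AS
    SELECT Customer_id, Trx_id, ROWID
    FROM TRX_TABLE
    WHERE Trx_date BETWEEN DATE '2014-01-29' AND DATE '2014-02-28'  -- the last ~30 days at re-creation time
    PRIMARY INDEX (Customer_id);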

    When will join index help?

    Join indices typically serve two purposes (or the combination of them):

    • Store frequently produced (partial) results
    • Provide alternative access path to the data
    Practical examples:
    • Frequent aggregation or join operation
    • Frequent redistribution of a table
    • Non primary index based filtering on a table (causing full table scan)
    • Non primary index based join of a table (causing redistribution)

    Important things about JIs

    Statistics

    Always collect statistics on the PI of the JIs; it is essential for encouraging the optimizer to use them.
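
    For JI_1 defined above, that could look like this (one possible syntax up to V13.10):

    COLLECT STATISTICS ON JI_1 COLUMN (Col2);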

    Choosing PI

    The PI of the JI will be the most effective access path to the JI's records; just like with tables, use the regular PI choosing methodology. Mind the frequent access patterns (by which column(s) the data is filtered or joined), the distribution (skewness) and hash collisions.


    Sponsored by PRISE Ltd.
    www.prisetools.com

    Thursday, February 20, 2014

    DBQL analysis IV - Monitor index usage


    Analyze "Index usage" in DBQL

    Please note that the solutions in this article work on the DBQL logs, which cover only the activity of those users for whom logging is switched on. The "Object" option of DBQL logging must be "on" to use the scripts provided.

    About Indices

    Teradata provides the possibility of creating INDEX objects to allow alternative access paths to the data records. They are quite different structures than the good old B*Tree or Bitmap indices (common in non-MPP RDBMSes).
    The main goal of the index objects is to improve data access performance in exchange for storage space and maintenance processing capacity.
    Having indices is not free (storage and maintenance resources); those that do not bring enough gain are better dropped.

    Index usage footprint in DBQL

    If a query uses an index to access the data, this is declared in the "explain text" and also registered in the DBQL table dbqlobjtbl. How an index appears there depends on its type: in case of a primary/secondary index only the IndexId and the columns of the index are registered, while join/hash indices appear like a regular table: at database, object and column level alike.

    If the join/hash index is covering, the base table may not be listed in the DBQL objects, therefore be careful when analyzing table usage from DBQLobjtbl.
    I recommend using PRISE Tuning Assistant to easily find all types of access to a table's data.

    Examples

    Primary index access (2 columns):
    Plan:
      1) First, we do a single-AMP RETRIEVE step from d01.z by way of the
         primary index "d01.z.i = 1, d01.z.j = 1" with no residual

    DBQLobjtbl:
    ObjectDatabaseName | ObjectTableName | ObjectColumnName | ObjectNum | ObjectType   (other columns omitted)
    D01                | Z               | i                | 1         | Idx
    D01                | Z               | j                | 1         | Idx

    Secondary index access (1 column):
    Plan:
      3) We do an all-AMPs RETRIEVE step from d01.x by way of index # 4
         without accessing the base table "d01.x.j = 1" with no residual
    DBQLobjtbl:
    ObjectDatabaseName | ObjectTableName | ObjectColumnName | ObjectNum | ObjectType
    D01                | X               |                  | 4         | Idx

    Join index:
    create join index d01.ji as sel a,b from d01.q primary index (a);
    select b from d01.q where a=1;
    Plan:
      1) First, we do a single-AMP RETRIEVE step from D01.JI by way of the
         primary index "D01.JI.a = 1" with no residual conditions into
    DBQLobjtbl:
    ObjectDatabaseName | ObjectTableName | ObjectColumnName | ObjectNum | ObjectType
    D01                | JI              |                  | 0         | JIx
    D01                | JI              | a                | 1         | Idx
    D01                | JI              | b                | 1026      | Col


    Please note that
    • ObjectNum identifies the index (it refers to dbc.indices.IndexNumber)
    • As many rows appear as the index has columns
    • E.g. in V13.10 Teradata Express, a single-column secondary index lacks the column name in the logs


    Analyzing DBQL data 

    Prepare data

    CREATE VOLATILE TABLE DBQLIdx_tmp1 AS (
    SELECT databasename,tablename,indexnumber,indextype,uniqueflag,Columnposition,Columnname
    , SUM (1) OVER (partition BY databasename,tablename,indexnumber ORDER BY Columnname ROWS UNBOUNDED PRECEDING) ABCOrder
    FROM dbc.indices WHERE indextype IN ('K','P','Q','S','V','H','O','I')
    ) WITH DATA
    PRIMARY INDEX (databasename,tablename,indexnumber)
    ON COMMIT PRESERVE ROWS
    ;


    CREATE VOLATILE TABLE DBQLIdx_tmp2 AS (
    WITH RECURSIVE idxs (Databasename,Tablename,Indexnumber,Indextype,Uniqueflag,Indexcolumns,DEPTH)
    AS (
    SELECT
    databasename,tablename,indexnumber,indextype,uniqueflag,TRIM (Columnname) (VARCHAR (1000)),ABCorder
    FROM DBQLIdx_tmp1 WHERE ABCorder = 1
    UNION ALL
    SELECT
    b.databasename,b.tablename,b.indexnumber,b.indextype,b.uniqueflag,b.Indexcolumns||','||TRIM (a.Columnname),a.ABCOrder
    FROM DBQLIdx_tmp1 a
    JOIN idxs b ON a.databasename = b.databasename AND a.tablename = b.tablename AND a.indexnumber = b.indexnumber AND a.ABCOrder = b.Depth + 1
    )
    SELECT databasename db_name,tablename table_name,indextype,uniqueflag,indexcolumns
    ,indexnumber
    ,CASE WHEN uniqueflag = 'Y' AND indextype IN ('P','Q','K') THEN 'UPI'
    WHEN uniqueflag = 'N' AND indextype IN ('P','Q') THEN 'NUPI'
    WHEN uniqueflag = 'Y' AND indextype IN ('S','V','H','O') THEN 'USI'
    WHEN uniqueflag = 'N' AND indextype IN ('S','V','H','O') THEN 'NUSI'
    WHEN indextype = 'I' THEN 'O-SI'
    ELSE NULL
    END Index_code
    FROM idxs
    QUALIFY SUM (1) OVER (partition BY db_name,table_name,indexnumber ORDER BY DEPTH DESC ROWS UNBOUNDED PRECEDING) = 1
    ) WITH DATA
    PRIMARY INDEX (db_name,table_name)
    ON COMMIT PRESERVE ROWS
    ;

    UPDATE a
    FROM DBQLIdx_tmp2 a,DBQLIdx_tmp2 b
    SET Index_code = 'PK'
    WHERE a.db_name = b.db_name
    AND a.table_name = b.table_name
    AND a.Index_code = 'UPI'
    AND a.indextype = 'K'
    AND b.Index_code = 'NUPI'
    AND b.indextype <> 'K'
    ;

    Report: How many times have the indices been used?

    You may need to modify the script:
    • Date filtering (use between for interval)
    • Online/archived DBQL: use commented section for archived
    SELECT
      COALESCE(usg.objectdatabasename,idx.db_name) db
    , COALESCE(usg.objecttablename,idx.table_name) tbl
    , COALESCE(usg.ObjectNum,idx.IndexNumber) idxNo
    , idx.Index_code
    , idx.Indexcolumns Index_columns
    , coalesce(usg.drb,0) Nbr_of_usg
    FROM
    (SELECT objectdatabasename,objecttablename,objecttype,ObjectNum,COUNT (*) drb

    --  Archived DBQL
    --  FROM dbql_arch.dbqlobjtbl_hst WHERE logdate = '2014-02-20' (date)

    --  Online DBQL
      FROM dbc.dbqlobjtbl WHERE  cast(collecttimestamp as char(10)) = '2014-02-20'
    AND objecttablename IS NOT NULL
    AND ((objecttype IN ('JIx','Hix') AND objectcolumnname IS NULL)
    OR
    (objecttype IN ('Idx'))
    )
    AND objectnum <> 1
    GROUP BY 1,2,3,4
    ) usg
    FULL OUTER JOIN
    ( SELECT db_name,table_name,Indextype,Uniqueflag,indexcolumns,Indexnumber,Index_code 

      FROM DBQLIdx_tmp2 a WHERE indextype NOT IN ('P','Q','K')
    union all
    SELECT databasename,tablename,tablekind,cast(null as char(1))

                  ,cast(null as varchar(1000)),cast(null as smallint)
    ,case when tablekind='I' then 'JIX' else 'HIX' end from dbc.tables where tablekind in ('I','N')
    ) idx ON usg.objectdatabasename = idx.db_name
    AND usg.objecttablename = idx.table_name
    AND ((usg.ObjectNum = idx.IndexNumber) or usg.objecttype IN ('JIx','Hix'))
    ORDER BY 6 DESC
    ;

    Sponsored by PRISE Ltd.
    www.prisetools.com

    Monday, January 13, 2014

    DBQL analysis III - Monitor "collect statistics"


    Analyze "collect statistics" in DBQL

    Please note that the solutions in this article work on the DBQL logs, which cover only the activity of those users for whom logging is switched on. The "Object" and "SQL" options of DBQL logging must be "on" to use the scripts provided.
    This article is applicable up to V13.10 without modifications; statistics handling changed from V14.

    About Statistics

    "Statistics" is a descriptive object in the Teradata database that is used by the optimizer for transforming SQLs into effective execution plans.
    Statistics reflect the key data demographic information of one or more table column(s).
    These objects must be created and maintained explicitly; the RDBMS will not do it by itself.
    Statistics internally contain a value histogram, which requires the table data (or a sample of it) to be analyzed - an expensive task.

    Summarized: appropriate statistics are required for getting good and effective execution plans for SQLs, but statistics consume resources to be collected or refreshed.

    "Statistics" footprint in DBQL

    When a "statistics" is created or refreshed, this is executed by an SQL command: collect statistics ...
    This command creates a log entry in the DBQL if logging is switched on.

    One can track when each "statistics" was collected, and how much CPU and I/O it consumed.
    Those statements are very easy to identify in the central table:

    select * from dbc.DBQLogTbl where StatementType='Collect statistics'

    Analyzing DBQL data 

    Prepare data

    You may need to modify the script:
    • Date (interval)
    • Online/archived: use commented section
    • QueryBand: "JOB" variable is used, modify according to your ETL settings
    create volatile table DBQLStat_tmp1
    as
    (
    sel a.procId,a.QueryId,a.StartTime,a.AMPCpuTime,a.TotalIOCount

    ,case when a.querytext like '% sample %' then 'S' else 'F' end Full_Sample
    ,UserName,(FirstRespTime - StartTime) DAY(4) TO SECOND(4) AS RUNINTERVAL      
    ,(EXTRACT(DAY FROM RUNINTERVAL) * 86400 + EXTRACT(HOUR FROM RUNINTERVAL)  * 3600 + EXTRACT(MINUTE FROM RUNINTERVAL)  * 60 + EXTRACT(SECOND FROM RUNINTERVAL) ) (decimal(10,1)) Duration
    ,b.ObjectDatabaseName DatabaseName,b.ObjectTableName TableName,c.ObjectColumnName ColumnName
    ,case when d.SQLTextInfo like any ('%"PARTITION"%', '%,PARTITION %', '%,PARTITION,%', '% PARTITION,%', '% PARTITION %', '%(PARTITION,%', '%(PARTITION %', '%,PARTITION)%', '% PARTITION)%', '%(PARTITION)%') then 'Y' else 'N' end inclPartition
    ,CAST((case when index(queryband,'JOB=') >0 then  substr(queryband,index(queryband,'JOB=') ) else '' end) AS VARCHAR(500)) tmp_Q
    ,case when queryband = '' then 'N/A'
             when tmp_q = '' then '-Other'
    else CAST( (substr(tmp_Q,characters('JOB=')+1, nullifzero(index(tmp_Q,';'))-characters('JOB=')-1)) AS VARCHAR(500)) end QB_info
    ,sum(1) over (partition by a.procid,a.Queryid order by c.ObjectColumnName, a.QueryID rows unbounded preceding) Rnk
    from
    /* For achived tables
         dbql_arch.DBQLogTbl_hst       a
    join dbql_arch.DBQLObjTbl_hst      b on b.ObjectType='Tab' and a.procid=b.procid and a.QueryID=b.QueryID and a.logDate=b.logDate
    left join dbql_arch.DBQLObjTbl_hst c on c.ObjectType='Col' and a.procid=c.procid and a.QueryID=c.QueryID and a.logDate=c.logDate
    join dbql_arch.DBQLSQLTbl_hst      d on d.SQLRowNo=1       and a.procid=d.procid and a.QueryID=d.QueryID and a.logDate=d.logDate
    where a.logDate=1140113
    */
    /*end*/
    /* For online tables */
         dbc.DBQLogTbl       a
    join dbc.DBQLObjTbl      b on b.ObjectType='Tab' and a.procid=b.procid and a.QueryID=b.QueryID
    left join dbc.DBQLObjTbl c on c.ObjectType='Col' and a.procid=c.procid and a.QueryID=c.QueryID
    join dbc.DBQLSQLTbl      d on d.SQLRowNo=1       and a.procid=d.procid and a.QueryID=d.QueryID
    where cast(cast(a.starttime as char(10)) as date) = '2014-01-13' (date)
    /*end*/
    and a.StatementType='Collect statistics'
    ) with data
    primary index (procId,QueryId)
    on commit preserve rows
    ;

    create volatile table DBQLStat
    as
    (
    WITH RECURSIVE rec_tbl
    (
     procId,QueryId,StartTime,AMPCpuTime,TotalIOCount,Duration,Full_Sample,UserName,DatabaseName,TableName,QB_info,inclPartition,ColumnName,Rnk,SColumns
    )
    AS
    (
    select
     procId,QueryId,StartTime,AMPCpuTime,TotalIOCount,Duration,Full_Sample,UserName,DatabaseName,TableName,QB_info,inclPartition,ColumnName,Rnk,cast(case when ColumnName is null and inclPartition='Y' then '' else '('||ColumnName end as varchar(10000)) SColumns
    from DBQLStat_tmp1 where Rnk=1
    UNION ALL
    select
      a.procId,a.QueryId,a.StartTime,a.AMPCpuTime,a.TotalIOCount,a.Duration,a.Full_Sample,a.UserName,a.DatabaseName,a.TableName,a.QB_info,a.inclPartition,a.ColumnName,a.Rnk,b.SColumns ||','||a.ColumnName
    from DBQLStat_tmp1     a
    join rec_tbl b on a.procId=b.ProcId and a.QueryId=b.QueryID and a.Rnk=b.Rnk+1
    )
    select   procId,QueryId,StartTime,AMPCpuTime,TotalIOCount,Duration,Full_Sample,UserName,DatabaseName,TableName,QB_info,Rnk NumOfColumns
            ,case when SColumns = '' then '(PARTITION)' else SColumns || case when inclPartition='Y' then ',PARTITION)' else ')' end end StatColumns
    from rec_tbl qualify sum(1) over (partition by procid,queryid order by Rnk desc, QueryID rows unbounded preceding) = 1
    ) with data
    primary index (procid,queryid)
    on commit preserve rows
    ;

    Reports

    • How many statistics have been collected, and using how much resource?

    select
      UserName /*Or: DatabaseName*//*Or: Full_sample*/
    , count(*) Nbr
    , sum(AMPCpuTIme) CPU
    , sum(TotalIOCount) IO
    from DBQLStat
    group by 1
    order by 1
    ;
    • Which statistics have been collected multiple times?
      (If several days are in the prepared data, the frequency can be determined; remove the "qualify" clause)
    select a.*,
    sum(1) over (partition by databasename,tablename,statcolumns)  Repl
    from DBQLStat a

    /* Comment for frequency report*/
    qualify sum(1) over (partition by databasename,tablename,statcolumns) > 1

    /*end*/
    order by repl desc, databasename,tablename,statcolumns
    ;


    Sponsored by PRISE Ltd.
    www.prisetools.com

    Thursday, January 02, 2014

    PRISE Tuning Assistant use cases


    What is PRISE Tuning Assistant (PTA) good for?

    PRISE developed this tool to help DBAs and developers solve performance issues quickly and easily. In this post I describe a couple of typical situations when it can assist you.
    A full functioning trial version of PRISE Tuning Assistant is available for download.

    In general, the PTA application consists of two main functions:
    • Query finder
      Enables applying lots of search criteria like: thresholds, skewness, queryband, SQL text, Explain text, used objects, time window, etc.
    • Query analyzer
      Unfolds the execution process of one selected query and helps in understanding the query and its performance bottlenecks. It collects all the required supplementary information like table sizes, DDLs, statistics, etc. in one comprehensive dashboard, and provides a "walking tour" feature with which you can roam around the execution plan.
    All this enables even an averagely skilled Teradata expert to find which queries need to be optimized, and to figure out the weak points of even a complex query in minutes.

    PTA's key architectural properties:
    • Supports active DBQL (dbc.dbqlogtbl and friends...) and also archived DBQL tables.
    • Optimized DBQL querying (fast response times and low resource usage)
    • Requires read only access to database
    • Portable, multi platform Java application, requires no installation
    Let's see typical use cases of the application.

    Find the TOP consuming queries

    The "low hanging fruits" of optimization are the top consuming queries, I mean the CPU- or I/O-stuffing SQL statements. They are often few in number, yet consume a significant percentage of the total system resources, as I mentioned in this post.

    How to find them?
    Simply set your focused time interval, select "CPU", "I/O" or "Duration" (for the long-running ones) as the TOP criterion, and click the "Get DBQL data" button. You will be given the list of the qualifying queries in descending order of your TOP criterion. Afterwards you can sort it by the other KPIs, like:
    • Skewness
    • Spool usage
    • LHR/RHL (Larry Higa ratio)
    • QueryBand (and specific variables of it)
    • Etc.

    Skewed CPU or I/O usage

    Teradata is a massive parallel system. It can utilize its full power if the work is spread evenly across the AMPs, otherwise (skewed queries) the response time increases dramatically while virtual resource consumption can remain low.

    How to find the bad guys?
    Start the same way as in the "TOP queries" case, but choose "CPU Impact" or "I/O Impact" as the TOP criterion. This will return the queries that have the highest "Impact" (skew-normalized resource consumption) on the system.

    "No more spool space"

    Teradata uses spool limiting, in effect, for cancelling queries that have bad execution plans. Therefore "Failure 2646: No more spool space" usually indicates a bad query execution plan, and the solution is usually not raising the spool barrier. Yet those spool limits are often raised until the query is able to finish, and bad queries throttle the system... Why does a query usually run out of spool?
    • Duplicates a large spool/table to all AMPs
      The optimizer thinks it is small
    • Redistributes a spool/table by a skewed value
      The optimizer thinks it is not skewed
    • Processes (copies to spool) an unreasonably large number of records
    First two cases are usually caused by missing or stale statistics that can be easily found and supplied.

    How to select?
    Start the same way as in the "TOP queries" case, but choose "Spool" as the TOP criterion. This will return the queries with the highest spool consumption, and the SQL dashboard will help you figure out which statistics should be collected. In addition you can filter for "Error 2646" with the Error code option, which will return the queries that have reached the spool limit.

    Find a specific query

    The DBA gets a question: "My query runs slow, please help me!" - occasionally with the SQL attached in the email. PTA lets you figure out which query it was in the DBQL. You can filter on:
    • Username
    • Session ID
    • Full query text (with LIKE)
    • QueryID (internal DBQL ID)
    • QueryBand (with LIKE)
    • Error code

    Search disliked plan elements

    PTA enables text search in Explain text of the queries. Eg. find quickly:
    • Product joins
    • Sliding-window joins
    • "Redistributed by"
    If it is combined with a CPU and/or I/O limitation (CPU >=; I/O >= criteria), then we get those queries where these plan elements really cause a problem.

    Object usage

    We have a table or index in the database, and we look for all the queries that utilize that object.
    It is very useful for:
    • Preparing for a PI change
      A PI change is usually triggered by skewed data distribution, but the more important factor is the access path the PI provides. Before you change a PI, you need to examine the workload accessing that table: what are the typical filter/join criteria, and does the current/new PI support a good access path? PTA enables you to do this quickly and easily.
    • Determine usage of a table
      How many times a table is accessed a day? Is it used extensively, or just kept loaded for nothing?
    • Determine usage of a hash/join index
      A hash/join index requires processing and storage resources. Is it well utilized, or not?
    How?
    Simply set the "Object used" filter criterion. In addition you can use the "Thru join/hash index" option to include data accesses through join/hash indices.

    Recurring queries

    Which queries should be optimized? There may be medium consuming queries that run many times a day, consuming a lot of resources in total. Use the "Recurring query finder" function of PTA to see aggregated information: total and average Duration, CPU and I/O.

    What was a query run by?

    If your ETL and OLAP systems use QueryBand properly, you can select specific variables (a maximum of 6) to be parsed into separate columns in the resulting query list. Sort the result by them, and find which queries of a load job should be optimized.


    Sponsored by PRISE Ltd.
    www.prisetools.com