Sponsored by PRISE Ltd.
www.prisetools.com

Why do I get "No more spool space" error?

This is the most familiar error message in Teradata world:

"Failure 2646 No more spool space"

What does it really mean, what is it caused by?

Let's get back to the basics.

What is spool space?

Spool space is a temporary area, that can be used to store the part-results during query processing, as well as volatile tables. All free space in the database that are not allocated by PERM data, technically can be used for spool area, since a PERM data does not want to allocate that area.

Each database users may have a "spool limit" that restricts the user to allocate more spool area at a time, than its limit. Keep in mind that all active sessions of a username must share the spool limit together.

Teradata is a massive parallel system, therefore the spool limit must be interpreted on AMP level:
Eg: 100AMP system, a user with 10G spool limit means: 100M spool/AMP

What is spool space limit good for?

This limitation method is a quite simple way to cut those queries from the system that would suck too much resources. There is no exact relationship between high spool usage and ineffective query, but statistically the correlation is high.
Practically: a bad query is being kicked off before is consumes too much resources unnecessarily.

No more spool space scenarios

System ran out of spool space

This is the most rare situation, forget about. There are too few free space on the system, but this situation used to be avoided by defining a "SpoolReserve" database, where no objects are created, this way that area is always available for spool.
If many "big spool limit" users run high spool queries parallel, then this rare situation can yet occure.

Multiple session of the user are active together

This is a quite rare situation also. Check the active users from dbc.sessioninfo.

Volatile tables

All existing volatile tables reside in your available spool space, reducing the available. If you create many, and even with skewed distribution, you can stuff your spool up. Choose "primary index" carefully, when defining volatile tables also.

Improper execution plan

These are the >90% of cases that cause the "No more spool space" errors. Let' see how:

"Duplication to all AMPs" of a non-small set of records
The root cause is typically missing or stale statistics. Either system thinks that much less records will be duplicated than the real (sometimes billions of records get in this kind of spools), or knows this exactly, but on the other branch of the query there are more low quality estimations, and this execution seems to be cheaper.
Redistribution of records by a hash that causes skewed distribution
Check the corresponding blog post: Accelerate skewed joins
Retrieve huge amount of records into spool (locally or redistributed onto the AMPs)
Specific query structures imply this execution, like: join to a view that "union all"-s big tables.

I suggest to use PRISE Tuning Assistant to identify what is the problem. It spectacularly displays which execution step falls in the problems above.
Increasing the spool limit will not solve the problems in the most cases.

Too big task

Sometimes a given SQL query requires big spool area to be performed, even with the best execution plan.

This is the only case when raising spool limit is the solution. But first you have to understand that the task is really big. PRISE Tuning Assistant is a good tool for identify this in a minute.