After performing RUNSTATS on a non-empty table, why does its cardinality still equal to 0?
A 'select count ...' query on the table returns a non-zero value, while
a 'select card from syscat.tables where tabname='<tablename>' and
tabschema='<schemaname>' returns 0.
RUNSTATS was invoked from a partition which does not contain the table's data
The RUNSTATS command collects statistics for a
table on the database partition from which it is invoked. If the table
does not exist on that database partition, the first database partition
in the database partition group is selected.
Therefore, in a DPF environment, RUNSTATS needs to be invoked from a partition that contains the table's tablespace?
The design of RUNSTATS is such that it will
collect information on only one partition, and if it is started on a
partition that is not part of the node group for the tablespace, it will
locate the first partition
from the node group and perform the
collection there. If the table cardinality is 0, it suggests that
RUNSTATS was invoked on a partition that does not contain table data;
therefore, if user did not export to a particular node/partition before
running RUNSTATS, then it suggests that the first partition in the
partition group does not contain table data.
Such a situation
is possible if the table data is partitioned by a distribution key, and
the distribution key is sub-optimal (thus, distribution is skewed and
some partitions may not have data).
Suppose RUNSTATS was
running from a partition that does contain table data, RUNSTATS (by
design) would calculate the value for the total number of rows in the
table by simply extrapolating
the count from that one partition and then multiplying it by the total number of partitions on which the tablespace is found.
Therefore, to ensure that a non-zero and accurate cardinality is returned:
1. Run the following query to determine how table data is distributed,
select dbpartitionnum (<column>), count (*) from <schema>.<table> group by dbpartitionnum (<column>)
then export DB2NODE to a partition that contains table data, then run RUNSTATS from that partition.
2. Ensure that the distribution key used is well-designed and data distribution is as even as possible across the partitions.