Sub-Partition Histogram Analysis (dhstgrmsub.sql)

What This Script Does

This script analyzes column histograms for sub-partitioned tables, providing detailed statistics about data distribution within each sub-partition. It helps DBAs understand how the optimizer estimates cardinality for complex partitioning schemes and identify potential performance issues with skewed data distributions.

Script

rem dhstgrm.sql
rem
set linesize 132
rem
ttitle 'Sub-Partition Histograms'
col OWNER format a8 heading 'OWNER'
col TABLE_NAME format a15 heading 'TABLE NAME'
col SUBPARTITION_NAME format a14 heading 'SUBPARTITION'
col COLUMN_NAME format a17 heading 'COLUMN NAME'
col DATA_TYPE format a9 heading 'DATA TYPE'
col nullable format a1 heading 'NULL?'
col num_distinct format 9,999,999 heading 'NUM|DISTINCT'
col low_value  heading 'LOW VALUE'
col high_value heading 'HIGH VALUE'
col density format 0.9999 heading 'DENSITY'
col num_nulls format 9,999,999 heading 'NUM|NULLS'
col num_buckets format 999 heading 'BUCKETS'
col ENDPOINT_NUMBER format 9,999,999 heading 'ENDPOINT#'
col BUCKET_NUMBER format 9,999,999 heading 'BUCKET#'
rem col ENDPOINT_VALUE format 999999999999999999 heading 'ENDPOINT VALUE'
col ENDPOINT_VALUE heading 'ENDPOINT VALUE'
col ENDPOINT_ACTUAL_VALUE format a10 trunc heading 'ENDPOINT|ACTUAL|VALUE'
rem
break on owner on table_name on column_name -
 on DATA_TYPE -
 on nullable -
 on subpartition_name -
 on num_distinct -
 on low_value -
 on high_value -
 on density -
 on num_nulls -
 on num_buckets
rem
select
 c.OWNER,
 c.TABLE_NAME,
 c.COLUMN_NAME,
 c.DATA_TYPE,
 decode(c.NULLABLE, 'Y', null, c.NULLABLE) nullable,
 h.subpartition_name,
 pc.num_distinct,
-- c.low_value,
-- c.high_value,
 pc.density,
 pc.num_nulls,
 pc.num_buckets,
 -- h.BUCKET_NUMBER,
 -- h.ENDPOINT_VALUE,
 -- h.ENDPOINT_ACTUAL_VALUE
 count(*)
from dba_subpart_histograms h,
     DBA_SUBPART_COL_STATISTICS pc,
     dba_tab_columns c
where h.owner like nvl(upper('&owner'),'%')
  and h.table_name like nvl(upper('&table'),'%')
  and h.column_name like nvl(upper('&column'),'%')
  and h.subpartition_name like nvl(upper('&subpartition'),'%')
  and c.owner=h.owner
  and c.table_name=h.table_name
  and c.column_name=h.column_name
  and pc.owner=h.owner
  and pc.table_name=h.table_name
  and pc.subpartition_name=h.subpartition_name
  and pc.column_name=h.column_name
group by
 c.OWNER,
 c.TABLE_NAME,
 c.COLUMN_NAME,
 c.DATA_TYPE,
 decode(c.NULLABLE, 'Y', null, c.NULLABLE),
 h.subpartition_name,
 pc.num_distinct,
 pc.density,
 pc.num_nulls,
 pc.num_buckets
order by
 c.OWNER,
 c.TABLE_NAME,
 c.COLUMN_NAME,
 h.subpartition_name
 --, h.BUCKET_NUMBER
/
rem
set linesize 80

Usage

-- Run the script in SQL*Plus or SQLcl
@dhstgrmsub.sql

-- When prompted, enter parameters:
-- owner: Schema owner (% for all)
-- table: Table name pattern (% for all)
-- column: Column name pattern (% for all)
-- subpartition: Sub-partition name pattern (% for all)

-- Examples:
Enter value for owner: SALES
Enter value for table: ORDERS
Enter value for column: %
Enter value for subpartition: %

-- Specific column analysis
Enter value for owner: DW
Enter value for table: FACT_SALES
Enter value for column: PRODUCT_ID
Enter value for subpartition: %

Parameters

&owner - Schema owner (supports wildcards)
&table - Table name pattern
&column - Column name pattern
&subpartition - Sub-partition name pattern

Required Privileges

SELECT on DBA_SUBPART_HISTOGRAMS
SELECT on DBA_SUBPART_COL_STATISTICS
SELECT on DBA_TAB_COLUMNS
Typically requires DBA role

Sample Output

Sub-Partition Histograms

OWNER    TABLE NAME      SUBPARTITION   COLUMN NAME       DATA TYPE     NUM   NUM      BUCKETS COUNT(*)
                                                         NULL? DISTINCT NULLS
-------- --------------- -------------- ----------------- --------- --------- --------- ------- --------
SALES    ORDERS          P2023Q1_EAST   CUSTOMER_ID       NUMBER           1234      45      75      127
                                        ORDER_STATUS      VARCHAR2            5       0       4        5
                                        PRIORITY          VARCHAR2            3       0       3        3

         ORDERS          P2023Q1_WEST   CUSTOMER_ID       NUMBER           2345      67      75      127
                                        ORDER_STATUS      VARCHAR2            5       0       4        5
                                        PRIORITY          VARCHAR2            3       0       3        3

         ORDERS          P2023Q2_EAST   CUSTOMER_ID       NUMBER           3456      89      75      127
                                        ORDER_STATUS      VARCHAR2            4       0       4        4
                                        PRIORITY          VARCHAR2            3       0       3        3

Key Output Columns

OWNER - Schema owning the table
TABLE NAME - Name of the partitioned table
SUBPARTITION - Sub-partition name
COLUMN NAME - Column with histogram
DATA TYPE - Column data type
NULL? - Nullable indicator
NUM DISTINCT - Number of distinct values
NUM NULLS - Count of null values
BUCKETS - Number of histogram buckets
COUNT(*) - Number of histogram entries

Understanding Sub-Partition Histograms

Histogram Types

FREQUENCY - One bucket per distinct value
HEIGHT BALANCED - Equal number of rows per bucket
HYBRID - Combination for popular values

Benefits

Improved cardinality estimates for complex queries
Better execution plans for partitioned joins
Accurate statistics for partition-wise operations

Common Use Cases

Analyze Data Skew

SELECT owner, table_name, subpartition_name,
       column_name, num_distinct, num_nulls,
       ROUND(num_nulls/NULLIF(num_rows,0)*100,2) null_pct
FROM dba_subpart_col_statistics
WHERE owner = '&owner'
AND table_name = '&table'
ORDER BY null_pct DESC;

Compare Distribution Across Sub-Partitions

SELECT table_name, column_name,
       COUNT(DISTINCT subpartition_name) subparts,
       MIN(num_distinct) min_distinct,
       MAX(num_distinct) max_distinct,
       STDDEV(num_distinct) stddev_distinct
FROM dba_subpart_col_statistics
WHERE owner = '&owner'
GROUP BY table_name, column_name
HAVING STDDEV(num_distinct) > 100
ORDER BY stddev_distinct DESC;

Find Columns with No Histograms

SELECT c.table_name, c.column_name
FROM dba_tab_columns c
WHERE c.owner = '&owner'
AND c.table_name IN (
    SELECT table_name
    FROM dba_part_tables
    WHERE partitioning_type = 'RANGE'
    AND subpartitioning_type IS NOT NULL
)
AND NOT EXISTS (
    SELECT 1 FROM dba_subpart_histograms h
    WHERE h.owner = c.owner
    AND h.table_name = c.table_name
    AND h.column_name = c.column_name
);

Performance Impact

Query Optimization

-- Check optimizer estimates vs actual
EXPLAIN PLAN FOR
SELECT COUNT(*) FROM table_name SUBPARTITION (subpart_name)
WHERE column_name = 'value';

SELECT operation, object_name, cardinality, cost
FROM plan_table
WHERE plan_id = (SELECT MAX(plan_id) FROM plan_table);

Partition Pruning

-- Verify partition pruning with histograms
SELECT partition_start, partition_stop,
       access_predicates, filter_predicates
FROM v$sql_plan
WHERE sql_id = '&sql_id'
AND operation = 'PARTITION RANGE';

Maintenance and Monitoring

Gather Sub-Partition Statistics

-- Gather stats for specific sub-partition
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => 'OWNER',
    tabname => 'TABLE_NAME',
    partname => 'SUBPARTITION_NAME',
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
    method_opt => 'FOR ALL COLUMNS SIZE AUTO'
  );
END;
/

Monitor Stale Statistics

SELECT owner, table_name, subpartition_name,
       last_analyzed,
       ROUND(SYSDATE - last_analyzed) days_old,
       stale_stats
FROM dba_tab_subpartitions
WHERE owner = '&owner'
AND stale_stats = 'YES'
ORDER BY last_analyzed;

Best Practices

Histogram Management

Gather statistics after significant data changes
Use AUTO method for histogram creation
Monitor histogram effectiveness
Consider manual histogram creation for skewed data

Performance Tuning

Ensure statistics are current
Use partition-wise joins when possible
Monitor execution plans for accuracy
Consider histogram size for large tables

Troubleshooting

Missing Histograms

-- Find columns that should have histograms
SELECT owner, table_name, column_name,
       num_distinct, density,
       'Consider histogram' recommendation
FROM dba_tab_col_statistics
WHERE owner = '&owner'
AND histogram = 'NONE'
AND num_distinct > 1
AND density * num_rows < 1
ORDER BY density;

Outdated Statistics

-- Check for outdated sub-partition statistics
SELECT sp.owner, sp.table_name, sp.subpartition_name,
       sp.last_analyzed,
       mt.inserts + mt.updates + mt.deletes modifications
FROM dba_tab_subpartitions sp,
     dba_tab_modifications mt
WHERE sp.table_owner = mt.table_owner
AND sp.table_name = mt.table_name
AND sp.subpartition_name = mt.subpartition_name
AND mt.inserts + mt.updates + mt.deletes > 1000
ORDER BY modifications DESC;

Table Histograms (../schema-analysis/dtabhist.md) - Table-level histograms
Large Table Statistics (../schema-analysis/dstatbig.md) - Overall statistics
Table Column Analysis (../schema-analysis/dtabcol.md) - Column details

Advanced Analysis

Histogram Effectiveness

-- Compare optimizer estimates with actual cardinality
WITH hist_analysis AS (
  SELECT owner, table_name, column_name,
         subpartition_name, num_distinct,
         CASE
           WHEN num_buckets > 1 THEN 'Has Histogram'
           ELSE 'No Histogram'
         END hist_status
  FROM dba_subpart_col_statistics
  WHERE owner = '&owner'
)
SELECT hist_status,
       COUNT(*) columns,
       AVG(num_distinct) avg_distinct
FROM hist_analysis
GROUP BY hist_status;

Data Distribution Analysis

-- Analyze value distribution patterns
SELECT h.table_name, h.column_name,
       h.subpartition_name,
       COUNT(*) bucket_count,
       MIN(h.endpoint_number) min_endpoint,
       MAX(h.endpoint_number) max_endpoint
FROM dba_subpart_histograms h
WHERE h.owner = '&owner'
GROUP BY h.table_name, h.column_name, h.subpartition_name
HAVING COUNT(*) > 75  -- Tables with many buckets
ORDER BY COUNT(*) DESC;