Sub-Partition Histogram Analysis (dhstgrmsub.sql)
What This Script Does
Section titled “What This Script Does”This script analyzes column histograms for sub-partitioned tables, providing detailed statistics about data distribution within each sub-partition. It helps DBAs understand how the optimizer estimates cardinality for complex partitioning schemes and identify potential performance issues with skewed data distributions.
Script
Section titled “Script”rem dhstgrm.sqlremset linesize 132remttitle 'Sub-Partition Histograms'col OWNER format a8 heading 'OWNER'col TABLE_NAME format a15 heading 'TABLE NAME'col SUBPARTITION_NAME format a14 heading 'SUBPARTITION'col COLUMN_NAME format a17 heading 'COLUMN NAME'col DATA_TYPE format a9 heading 'DATA TYPE'col nullable format a1 heading 'NULL?'col num_distinct format 9,999,999 heading 'NUM|DISTINCT'col low_value heading 'LOW VALUE'col high_value heading 'HIGH VALUE'col density format 0.9999 heading 'DENSITY'col num_nulls format 9,999,999 heading 'NUM|NULLS'col num_buckets format 999 heading 'BUCKETS'col ENDPOINT_NUMBER format 9,999,999 heading 'ENDPOINT#'col BUCKET_NUMBER format 9,999,999 heading 'BUCKET#'rem col ENDPOINT_VALUE format 999999999999999999 heading 'ENDPOINT VALUE'col ENDPOINT_VALUE heading 'ENDPOINT VALUE'col ENDPOINT_ACTUAL_VALUE format a10 trunc heading 'ENDPOINT|ACTUAL|VALUE'rembreak on owner on table_name on column_name - on DATA_TYPE - on nullable - on subpartition_name - on num_distinct - on low_value - on high_value - on density - on num_nulls - on num_bucketsremselect c.OWNER, c.TABLE_NAME, c.COLUMN_NAME, c.DATA_TYPE, decode(c.NULLABLE, 'Y', null, c.NULLABLE) nullable, h.subpartition_name, pc.num_distinct,-- c.low_value,-- c.high_value, pc.density, pc.num_nulls, pc.num_buckets, -- h.BUCKET_NUMBER, -- h.ENDPOINT_VALUE, -- h.ENDPOINT_ACTUAL_VALUE count(*)from dba_subpart_histograms h, DBA_SUBPART_COL_STATISTICS pc, dba_tab_columns cwhere h.owner like nvl(upper('&owner'),'%') and h.table_name like nvl(upper('&table'),'%') and h.column_name like nvl(upper('&column'),'%') and h.subpartition_name like nvl(upper('&subpartition'),'%') and c.owner=h.owner and c.table_name=h.table_name and c.column_name=h.column_name and pc.owner=h.owner and pc.table_name=h.table_name and pc.subpartition_name=h.subpartition_name and pc.column_name=h.column_namegroup by c.OWNER, c.TABLE_NAME, c.COLUMN_NAME, c.DATA_TYPE, decode(c.NULLABLE, 'Y', null, c.NULLABLE), h.subpartition_name, pc.num_distinct, pc.density, pc.num_nulls, pc.num_bucketsorder by c.OWNER, c.TABLE_NAME, c.COLUMN_NAME, h.subpartition_name --, h.BUCKET_NUMBER/remset linesize 80
-- Run the script in SQL*Plus or SQLcl@dhstgrmsub.sql
-- When prompted, enter parameters:-- owner: Schema owner (% for all)-- table: Table name pattern (% for all)-- column: Column name pattern (% for all)-- subpartition: Sub-partition name pattern (% for all)
-- Examples:Enter value for owner: SALESEnter value for table: ORDERSEnter value for column: %Enter value for subpartition: %
-- Specific column analysisEnter value for owner: DWEnter value for table: FACT_SALESEnter value for column: PRODUCT_IDEnter value for subpartition: %
Parameters
Section titled “Parameters”- &owner - Schema owner (supports wildcards)
- &table - Table name pattern
- &column - Column name pattern
- &subpartition - Sub-partition name pattern
Required Privileges
Section titled “Required Privileges”- SELECT on DBA_SUBPART_HISTOGRAMS
- SELECT on DBA_SUBPART_COL_STATISTICS
- SELECT on DBA_TAB_COLUMNS
- Typically requires DBA role
Sample Output
Section titled “Sample Output”Sub-Partition Histograms
OWNER TABLE NAME SUBPARTITION COLUMN NAME DATA TYPE NUM NUM BUCKETS COUNT(*) NULL? DISTINCT NULLS-------- --------------- -------------- ----------------- --------- --------- --------- ------- --------SALES ORDERS P2023Q1_EAST CUSTOMER_ID NUMBER 1234 45 75 127 ORDER_STATUS VARCHAR2 5 0 4 5 PRIORITY VARCHAR2 3 0 3 3
ORDERS P2023Q1_WEST CUSTOMER_ID NUMBER 2345 67 75 127 ORDER_STATUS VARCHAR2 5 0 4 5 PRIORITY VARCHAR2 3 0 3 3
ORDERS P2023Q2_EAST CUSTOMER_ID NUMBER 3456 89 75 127 ORDER_STATUS VARCHAR2 4 0 4 4 PRIORITY VARCHAR2 3 0 3 3
Key Output Columns
Section titled “Key Output Columns”- OWNER - Schema owning the table
- TABLE NAME - Name of the partitioned table
- SUBPARTITION - Sub-partition name
- COLUMN NAME - Column with histogram
- DATA TYPE - Column data type
- NULL? - Nullable indicator
- NUM DISTINCT - Number of distinct values
- NUM NULLS - Count of null values
- BUCKETS - Number of histogram buckets
- COUNT(*) - Number of histogram entries
Understanding Sub-Partition Histograms
Section titled “Understanding Sub-Partition Histograms”Histogram Types
Section titled “Histogram Types”- FREQUENCY - One bucket per distinct value
- HEIGHT BALANCED - Equal number of rows per bucket
- HYBRID - Combination for popular values
Benefits
Section titled “Benefits”- Improved cardinality estimates for complex queries
- Better execution plans for partitioned joins
- Accurate statistics for partition-wise operations
Common Use Cases
Section titled “Common Use Cases”Analyze Data Skew
Section titled “Analyze Data Skew”SELECT owner, table_name, subpartition_name, column_name, num_distinct, num_nulls, ROUND(num_nulls/NULLIF(num_rows,0)*100,2) null_pctFROM dba_subpart_col_statisticsWHERE owner = '&owner'AND table_name = '&table'ORDER BY null_pct DESC;
Compare Distribution Across Sub-Partitions
Section titled “Compare Distribution Across Sub-Partitions”SELECT table_name, column_name, COUNT(DISTINCT subpartition_name) subparts, MIN(num_distinct) min_distinct, MAX(num_distinct) max_distinct, STDDEV(num_distinct) stddev_distinctFROM dba_subpart_col_statisticsWHERE owner = '&owner'GROUP BY table_name, column_nameHAVING STDDEV(num_distinct) > 100ORDER BY stddev_distinct DESC;
Find Columns with No Histograms
Section titled “Find Columns with No Histograms”SELECT c.table_name, c.column_nameFROM dba_tab_columns cWHERE c.owner = '&owner'AND c.table_name IN ( SELECT table_name FROM dba_part_tables WHERE partitioning_type = 'RANGE' AND subpartitioning_type IS NOT NULL)AND NOT EXISTS ( SELECT 1 FROM dba_subpart_histograms h WHERE h.owner = c.owner AND h.table_name = c.table_name AND h.column_name = c.column_name);
Performance Impact
Section titled “Performance Impact”Query Optimization
Section titled “Query Optimization”-- Check optimizer estimates vs actualEXPLAIN PLAN FORSELECT COUNT(*) FROM table_name SUBPARTITION (subpart_name)WHERE column_name = 'value';
SELECT operation, object_name, cardinality, costFROM plan_tableWHERE plan_id = (SELECT MAX(plan_id) FROM plan_table);
Partition Pruning
Section titled “Partition Pruning”-- Verify partition pruning with histogramsSELECT partition_start, partition_stop, access_predicates, filter_predicatesFROM v$sql_planWHERE sql_id = '&sql_id'AND operation = 'PARTITION RANGE';
Maintenance and Monitoring
Section titled “Maintenance and Monitoring”Gather Sub-Partition Statistics
Section titled “Gather Sub-Partition Statistics”-- Gather stats for specific sub-partitionBEGIN DBMS_STATS.GATHER_TABLE_STATS( ownname => 'OWNER', tabname => 'TABLE_NAME', partname => 'SUBPARTITION_NAME', estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE, method_opt => 'FOR ALL COLUMNS SIZE AUTO' );END;/
Monitor Stale Statistics
Section titled “Monitor Stale Statistics”SELECT owner, table_name, subpartition_name, last_analyzed, ROUND(SYSDATE - last_analyzed) days_old, stale_statsFROM dba_tab_subpartitionsWHERE owner = '&owner'AND stale_stats = 'YES'ORDER BY last_analyzed;
Best Practices
Section titled “Best Practices”Histogram Management
Section titled “Histogram Management”- Gather statistics after significant data changes
- Use AUTO method for histogram creation
- Monitor histogram effectiveness
- Consider manual histogram creation for skewed data
Performance Tuning
Section titled “Performance Tuning”- Ensure statistics are current
- Use partition-wise joins when possible
- Monitor execution plans for accuracy
- Consider histogram size for large tables
Troubleshooting
Section titled “Troubleshooting”Missing Histograms
Section titled “Missing Histograms”-- Find columns that should have histogramsSELECT owner, table_name, column_name, num_distinct, density, 'Consider histogram' recommendationFROM dba_tab_col_statisticsWHERE owner = '&owner'AND histogram = 'NONE'AND num_distinct > 1AND density * num_rows < 1ORDER BY density;
Outdated Statistics
Section titled “Outdated Statistics”-- Check for outdated sub-partition statisticsSELECT sp.owner, sp.table_name, sp.subpartition_name, sp.last_analyzed, mt.inserts + mt.updates + mt.deletes modificationsFROM dba_tab_subpartitions sp, dba_tab_modifications mtWHERE sp.table_owner = mt.table_ownerAND sp.table_name = mt.table_nameAND sp.subpartition_name = mt.subpartition_nameAND mt.inserts + mt.updates + mt.deletes > 1000ORDER BY modifications DESC;
Related Scripts
Section titled “Related Scripts”- Table Histograms (../schema-analysis/dtabhist.md) - Table-level histograms
- Large Table Statistics (../schema-analysis/dstatbig.md) - Overall statistics
- Table Column Analysis (../schema-analysis/dtabcol.md) - Column details
Advanced Analysis
Section titled “Advanced Analysis”Histogram Effectiveness
Section titled “Histogram Effectiveness”-- Compare optimizer estimates with actual cardinalityWITH hist_analysis AS ( SELECT owner, table_name, column_name, subpartition_name, num_distinct, CASE WHEN num_buckets > 1 THEN 'Has Histogram' ELSE 'No Histogram' END hist_status FROM dba_subpart_col_statistics WHERE owner = '&owner')SELECT hist_status, COUNT(*) columns, AVG(num_distinct) avg_distinctFROM hist_analysisGROUP BY hist_status;
Data Distribution Analysis
Section titled “Data Distribution Analysis”-- Analyze value distribution patternsSELECT h.table_name, h.column_name, h.subpartition_name, COUNT(*) bucket_count, MIN(h.endpoint_number) min_endpoint, MAX(h.endpoint_number) max_endpointFROM dba_subpart_histograms hWHERE h.owner = '&owner'GROUP BY h.table_name, h.column_name, h.subpartition_nameHAVING COUNT(*) > 75 -- Tables with many bucketsORDER BY COUNT(*) DESC;