We have a multi-terabyte table (20+ TB actually) that we recently re-partitioned with a range partition on a date/time (e.g., CREATED_DATE) but also with a sub-hash partition on a particular unique device identifier (e.g., DEVICE_ID).
When I asked Oracle's cost based optimizer (CBO) to return a plan it came back with an index that was built with the columns in the order with the device identifier preceding the creation date. From an indexing perspective this makes sense. I want to start with the specificity of a numeric ID first but then focus in on a particular range of data for that device (e.g., search for all data for device X but do so for only a certain range of data between dates Y and Z).
At the same time our re-partitioning strategy was driven by a desire to implement an Information Life-cycle Management (ILM) Architecture where we could query and move our older data from faster to slower disk storage. This is why we chose to implement range partitioning first on the creation date thereby bucketizing data by year/month and then for added performance we created a sub-hash partition (i.e., additional buckets) on the device identifier.
At the end of the day my confusion was with how the CBO would pick an index that was not constructed in such a way that it aligned with our partitioning strategy. Below is how in my mind I pictured our B-tree index being laid out with respect to leaf nodes and their traversal to retrieve data for the example I provided earlier (i.e., for Device X search for data between dates Y and Z).
Again from an indexing perspective this makes sense. I'm starting with the most specific column using an equality in my predicate (i.e., DEVICE_ID = 103) and then performing a range scan for CREATED_DATE for a particular device identifier. What didn't make sense to me was how Oracle reconciled this against my partitioning strategy where my first level of "buckets" were by CREATED_DATE on Year/Month.
At that point one of our contractors was able to connect the dots for me explaining the difference between Global and Local indexes. Take first Global indexes. They are built essentially irrespective of the underlying partitioning strategy. They work well if you do not have the capability to include in your predicate or underlying index a qualifier that would allow you to take advantage of partition elimination (i.e., in our case CREATED_DATE).
As depicted below, each leaf node in the index can have references to any of the underlying table partitions. This also means that especially on range scans (e.g., let's say I only knew DEVICE_ID) you might end up touching every partition to retrieve your result set. Again can be slightly more efficient if one doesn't have a predicate that aligns with your partitioning strategy.
I can also now look at the EXPLAIN PLAN output for my query and see that it is actually using partition elimination. Notice in the case below it used a PARTITION RANGE ITERATOR because the predicate in my query included a CREATED_DATE that went across multiple partitions.
-----------------------------------------------
| Id | Operation |
-----------------------------------------------
| 0 | SELECT STATEMENT |
| 1 | SORT AGGREGATE |
|* 2 | FILTER |
| 3 | PARTITION RANGE ITERATOR |
|* 4 | TABLE ACCESS BY LOCAL INDEX ROWID|
|* 5 | INDEX RANGE SCAN |
-----------------------------------------------
When adjusting my predicate to specify a range within a particular partition the plan changed indicating that partition elimination further reduced my results to a single partition.
In a subsequent post I'll go into greater detail on simple points for reading the output of EXPLAIN PLAN traces. Going through this particular indexing exercise forced me to relearn what previously second nature to me. Specifically how one can use output from EXPLAIN PLAN to further tune your queries.
-----------------------------------------------
| Id | Operation |
-----------------------------------------------
| 0 | SELECT STATEMENT |
| 1 | SORT AGGREGATE |
|* 2 | FILTER |
| 3 | PARTITION RANGE SINGLE |
|* 4 | TABLE ACCESS BY LOCAL INDEX ROWID|
|* 5 | INDEX RANGE SCAN |
-----------------------------------------------
In a subsequent post I'll go into greater detail on simple points for reading the output of EXPLAIN PLAN traces. Going through this particular indexing exercise forced me to relearn what previously second nature to me. Specifically how one can use output from EXPLAIN PLAN to further tune your queries.
No comments:
Post a Comment