Cardinality and Index in Oracle

oracle-techonol-prajwal

Cardinality in Oracle

Cardinality means the number of something but it gets used in a variety of contexts.

It depends a bit on context.

  • When you’re building a data model, cardinality often refers to the number of rows in table A that relate to table B.That is, are there 1 row in B for every row in A (1:1), are there N rows in B for every row in A (1:N), are there M rows in B for every N rows in A (N:M), etc.
  • When you are looking at things like whether it would be more efficient to use a b*-tree index or a bitmap index or how selective a predicate is, cardinality refers to the number of distinct values in a particular column.If you have a PERSON table, for example, GENDER is likely to be a very low cardinality column (there are probably only two values in GENDER) while PERSON_ID is likely to be a very high cardinality column (every row will have a different value).
  • When you are looking at query plans, cardinality refers to the number of rows that are expected to be returned from a particular operation.

There are probably other situations where people talk about cardinality using a different context and mean something else.

Types Of Index in Oracle

There are two types of indexe in oracle

  • BitMap (for low cardinality data)
  • B-Tree (for high cardinality data)

Cardinality On Index Choice

The advantages of using bitmap indexes are greatest for columns in which the ratio of the number of distinct values to the number of rows in the table is small. We refer to this ratio as the degree of cardinality. A gender column, which has only two distinct values (male and female), is optimal for a bitmap index.

For example, on a table with one million rows, a column with 10,000 distinct values is a candidate for a bitmap index. A bitmap index on this column can outperform a B-tree index, particularly when this column is often queried in conjunction with other indexed columns.

B-tree indexes are most effective for high-cardinality data: that is, for data with many possible values, such as customer_name or phone_number. In a data warehouse, B-tree indexes should be used only for unique columns or other columns with very high cardinalities (that is, columns that are almost unique). The majority of indexes in a data warehouse should be bitmap indexes.

In ad hoc queries and similar situations, bitmap indexes can dramatically improve query performance. AND and OR conditions in the WHERE clause of a query can be resolved quickly by performing the corresponding Boolean operations directly on the bitmaps before converting the resulting bitmap to rowids. If the resulting number of rows is small, the query can be answered quickly without resorting to a full table scan.

Example : Bitmap Index

The following shows a portion of a company’s customers table.

SELECT cust_id, cust_gender, cust_marital_status, cust_income_level
FROM customers;

CUST_ID    C CUST_MARITAL_STATUS  CUST_INCOME_LEVEL
---------- - -------------------- ---------------------
... 
        70 F                      D: 70,000 - 89,999
        80 F married              H: 150,000 - 169,999
        90 M single               H: 150,000 - 169,999
       100 F                      I: 170,000 - 189,999
       110 F married              C: 50,000 - 69,999
       120 M single               F: 110,000 - 129,999
       130 M                      J: 190,000 - 249,999
       140 M married              G: 130,000 - 149,999
...

Because cust_gender, cust_marital_status, and cust_income_level are all low-cardinality columns (there are only three possible values for marital status, two possible values for gender, and 12 for income level), bitmap indexes are ideal for these columns. Do not create a bitmap index on cust_id because this is a unique column. Instead, a unique B-tree index on this column provides the most efficient representation and retrieval.

Bitmap Indexes and Nulls

Unlike most other types of indexes,bitmap indexes include rows that have NULL values. Indexing of nulls can be useful for some types of SQL statements, such as queries with the aggregate function COUNT.

Example : Bitmap Index

SELECT COUNT(*) FROM customers WHERE cust_marital_status IS NULL;

This query uses a bitmap index on cust_marital_status. Note that this query would not be able to use a B-tree index, because B-tree indexes do not store the NULLvalues.

SELECT COUNT(*) FROM customers;

Any bitmap index can be used for this query because all table rows are indexed, including those that have NULL data. If nulls were not indexed, the optimizer would be able to use indexes only on columns with NOT NULL constraints.

Bitmap Join Index Restrictions and Requirements

Join results must be stored, therefore, bitmap join indexes have the following restrictions:

  • Parallel DML is only supported on the fact table. Parallel DML on one of the participating dimension tables will mark the index as unusable.
  • Only one table can be updated concurrently by different transactions when using the bitmap join index.
  • No table can appear twice in the join.
  • You cannot create a bitmap join index on an index-organized table or a temporary table.
  • The columns in the index must all be columns of the dimension tables.
  • The dimension table join columns must be either primary key columns or have unique constraints.
  • The dimension table column(s) participating the join with the fact table must be either the primary key column(s) or with the unique constraint.
  • If a dimension table has composite primary key, each column in the primary key must be part of the join.
  • The restrictions for creating a regular bitmap index also apply to a bitmap join index. For example, you cannot create a bitmap index with the UNIQUE attribute

Using B-Tree Indexes in Data Warehouses

A B-tree index is organized like an upside-down tree. The bottom level of the index holds the actual data values and pointers to the corresponding rows, much as the index in a book has a page number associated with each index entry.

In general, use B-tree indexes when you know that your typical query refers to the indexed column and retrieves a few rows. In these queries, it is faster to find the rows by looking at the index. However, using the book index analogy, if you plan to look at every single topic in a book, you might not want to look in the index for the topic and then look up the page. It might be faster to read through every chapter in the book. Similarly, if you are retrieving most of the rows in a table, it might not make sense to look up the index to find the table rows. Instead, you might want to read or scan the table.

B-tree indexes are most commonly used in a data warehouse to enforce unique keys. In many cases, it may not even be necessary to index these columns in a data warehouse, because the uniqueness was enforced as part of the preceding ETL processing, and because typical data warehouse queries may not work better with such indexes. B-tree indexes are more common in environments using third normal form schemas. In general, bitmap indexes should be more common than B-tree indexes in most data warehouse environments.

Choosing Between Local Indexes and Global Indexes

B-tree indexes on partitioned tables can be global or local. With Oracle8i and earlier releases, Oracle recommended that global indexes not be used in data warehouse environments because a partition DDL statement (for example, ALTER TABLEDROP PARTITION) would invalidate the entire index, and rebuilding the index is expensive. In Oracle Database 10g, global indexes can be maintained without Oracle marking them as unusable after DDL. This enhancement makes global indexes more effective for data warehouse environments.

However, local indexes will be more common than global indexes. Global indexes should be used when there is a specific requirement which cannot be met by local indexes (for example, a unique index on a non-partitioning key, or a performance requirement).

Bitmap indexes on partitioned tables are always local.

Top Posts

  1. RANK, DENSE_RANK, FIRST and LAST Analytic Functions
  2. Partitioned Tables And Indexes in Oracle
  3. GROUP BY Functions in Oracle
  4. PIVOT Function in Oracle
  5. Oracle Architecture – SQL Query Processing

 

error: Content is protected !!