Advantage of Indexes
The database engine can use indexes to boost performance in a number of
different queries. Sometimes these performance improvements are
dramatic. An important feature of SQL Server 2000 is a component known
as the query optimizer. The query optimizer's job is to find the fastest
and least resource intensive means of executing incoming queries. An
important part of this job is selecting the best index or indexes to
perform the task. In the following sections we will examine the types of
queries with the best chance of benefiting from an index.
Searching For Records
The most obvious use for an index is in finding a record or set of
records matching a WHERE clause. Indexes can aid queries looking for
values inside of a range (as we demonstrated earlier), as well as
queries looking for a specific value. By way of example, the following
queries can all benefit from an index on UnitPrice:
DELETE FROM Products WHERE UnitPrice = 1
UPDATE Products SET Discontinued = 1 WHERE UnitPrice > 15
SELECT * FROM PRODUCTS WHERE UnitPrice BETWEEN 14 AND 16
UPDATE Products SET Discontinued = 1 WHERE UnitPrice > 15
SELECT * FROM PRODUCTS WHERE UnitPrice BETWEEN 14 AND 16
Indexes work just as well when searching for a record in DELETE andUPDATE commands as they do for SELECT statements.
Sorting Records
When we ask for a sorted dataset, the database will try to find an index
and avoid sorting the results during execution of the query. We control
sorting of a dataset by specifying a field, or fields, in an ORDER BY
clause, with the sort order as ASC (ascending) or DESC (descending). For
example, the following query returns all products sorted by price:
SELECT * FROM Products ORDER BY UnitPrice ASC
With no index, the database will scan the Products table and sort the
rows to process the query. However, the index we created on UnitPrice
(IDX_UnitPrice) earlier provides the database with a presorted list of
prices. The database can simply scan the index from the first entry to
the last entry and retrieve the rows in sorted order.
The same index works equally well with the following query, simply by scanning the index in reverse.
SELECT * FROM Products ORDER BY UnitPrice DESC
Grouping Records
We can use a GROUP BY clause to group records and aggregate values, for
example, counting the number of orders placed by a customer. To process a
query with a GROUP BY clause, the database will often sort the results
on the columns included in the GROUP BY. The following query counts the
number of products at each price by grouping together records with the
same UnitPrice value.
SELECT Count(*), UnitPrice FROM Products GROUP BY UnitPrice
The database can use the IDX_UnitPrice index to retrieve the prices in
order. Since matching prices appear in consecutive index entries, the
database is able count the number of products at each price quickly.
Indexing a field used in a GROUP BY clause can often speed up a query.
Maintaining a Unique Column
Columns requiring unique values (such as primary key columns) must have a
unique index applied. There are several methods available to create a
unique index. Marking a column as a primary key will automatically
create a unique index on the column. We can also create a unique index
by checking the Create UNIQUE checkbox in the dialog shown earlier. The
screen shot of the dialog displayed the index used to enforce the
primary key of the Products table. In this case, the Create UNIQUE
checkbox is disabled, since an index to enforce a primary key must be a
unique index. However, creating new indexes not used to enforce primary
keys will allow us to select the Create UNIQUE checkbox. We can also
create a unique index using SQL with the following command:
CREATE UNIQUE INDEX IDX_ProductName On Products (ProductName)
The above SQL command will not allow any duplicate values in the
ProductName column, and an index is the best tool for the database to
use to enforce this rule. Each time an application adds or modifies a
row in the table, the database needs to search all existing records to
ensure none of values in the new data duplicate existing values.
Indexes, as we should know by now, will improve this search time.
Index Drawbacks
There are tradeoffs to almost any feature in computer programming, and
indexes are no exception. While indexes provide a substantial
performance benefit to searches, there is also a downside to indexing.
Let's talk about some of those drawbacks now.
Indexes and Disk Space
Indexes are stored on the disk, and the amount of space required will
depend on the size of the table, and the number and types of columns
used in the index. Disk space is generally cheap enough to trade for
application performance, particularly when a database serves a large
number of users. To see the space required for a table, use the
sp_spaceused system stored procedure in a query window.
EXEC sp_spaceused Orders
Given a table name (Orders), the procedure will return the amount of
space used by the data and all indexes associated with the table, like
so:
Name rows reserved data index_size unused ------- -------- ----------- ------ ---------- ------- Orders 830 504 KB 160 KB 320 KB 24 KB
According to the output above, the table data uses 160 kilobytes, while
the table indexes use twice as much, or 320 kilobytes. The ratio of
index size to table size can vary greatly, depending on the columns,
data types, and number of indexes on a table.
Indexes and Data Modification
Another downside to using an index is the performance implication on
data modification statements. Any time a query modifies the data in a
table (INSERT, UPDATE, or DELETE), the database needs to update all of
the indexes where data has changed. As we discussed earlier, indexing
can help the database during data modification statements by allowing
the database to quickly locate the records to modify, however, we now
caveat the discussion with the understanding that providing too many
indexes to update can actually hurt the performance of data
modifications. This leads to a delicate balancing act when tuning the
database for performance.
A Disadvantage to Clustered Indexes
If we update a record and change the value of an indexed column in a
clustered index, the database might need to move the entire row into a
new position to keep the rows in sorted order. This behavior essentially
turns an update query into a DELETE followed by an INSERT, with an
obvious decrease in performance. A table clustered index can often be
found on the primary key or a foreign key column, because key values
generally do not change once a record is inserted into the database.