Index Seek + lookup: This is considered very bad in SQL server.But I think it is normal because you can not have all the columns in the index key always and thus to get the columns which are not part of index key,you need to get it from data page.
Anyway, there are number of ways to enhance the performance where index key + lookup is causing the issue.All these are specific e.g. one is to create a covering index for a given query.Thus making sure that we have just the index seek and no key lookup. But can we do it for all our queries?No.However, you can use this for the queries which are quite frequently used or which are causing real perf issues and creating a covering index avoid this.
Anyway, I am going to introduce a method which is quite specific for given type of queries where you can use existing index to improve the overall perrformance of query.
This is related to the data correlation between the non clustered index key and clustered index key.In DB terms it is called as clustering of the data as well.E.g. Orderdate in Sales.salesorderheader table and the Salesorderid has a correlation such that when one value is increasing other one is increasing as well.Like for a given orderdate the salesorderid will be greater than the salesorderid for any orderdate before the given orderdate.
I am giving the query below which has all the explaination.
PS: This method is quite specific as the data is perfectly correlated. Thus before applying this technique make sure following
1. You have used other Performance Techniques like creating covering index and any other technique.
2. Also,make sure that the clustering is almost perfect between non clustered index key + clustered index key
When you have applied the normal performance technique and still the performance is not improved and the data clustering is perfect. Then try this method and benchmark the performance of this method against your existing or best method. Then decide whether this technique is good enough as compared to your best method.
Benchmark your perf results before applying any new change.
In below code the elpased time and CPU times varies a lot as i have actual exec plan enabled. This impcats these metrics. Thus you should run the query without using that and see the avg cpu time and avg elapsed time over a number of executions.
This method is not recomended if the data clustering is not good or your query retrunrs most of the table’s data.
if OBJECT_ID(‘actsoh’) is not null
drop table actsoh
select * into actsoh from sales.SalesOrderHeader
create unique clustered index idx_pk_salesorderid on actsoh(salesorderid)
create nonclustered index idx_ord_dt on actsoh(orderdate)
update Statistics actsoh with fullscan
–now query the table say based on the order date range of say around 1 day
–or 1 year
drop proc nproc
create proc nproc
@days smallint = 0
select * from actsoh
where orderdate between @dt and DATEADD(dd,@days,@dt)
exec nproc @dt= ‘20010701’,@days = 0 with recompile
–131 IO’s. 0 ms CPU,170ms elapsed time.. –using index seek + key lookup
exec nproc @dt= ‘20010701’,@days = 30 with recompile
–554 IO’s. 0 ms ,382 ms –using index seek + key lookup
–try for 1 year
exec nproc @dt= ‘20010701’,@days = 365 with recompile
–804 IO’s. 31 ms CPU,791ms elapsed time..Using table scan or cluestered index scan
–Now I will apply my technique..In THis case the clustering is perfect..I will use the min and max salesorderid
drop proc clustproc
create proc clustproc
@days smallint = 0
declare @minsalesorderid int,@maxsalesorderid int
actsoh idx with (index = 2)
and idx.OrderDate between @dt and DATEADD(dd,@days,@dt)
select tbl.* from
actsoh tbl with (index = 1)
and tbl.SalesOrderID between @minsalesorderid and @maxsalesorderid
and tbl.OrderDate between @dt and DATEADD(dd,@days,@dt)
exec clustproc @dt= ‘20010701’,@days = 0 with recompile
–7 IO’s. 0 ms CPU,166ms elapsed time..
exec clustproc @dt= ‘20010701’,@days = 30 with recompile
–10 IO’s. 0 ms ,223 ms
–try for 1 year
exec clustproc @dt= ‘20010701’,@days = 365 with recompile
–90 IO’s. 16 ms CPU,266ms elapsed time.
–for full table
exec clustproc @dt= ‘20020701’,@days = 30500 with recompile
–864 IO’s for full table..whereas normal table scan is just 803 IO’s. so in the case where you need full table’s data then this might be slightly
I will explain the disadvantages and the scenarios where it could cause perfromance get worse in the next post.