Index Fragmentation in SQL Server 2008

You must have seen many articles and scripts, and even TechNet/MSDN suggestions, that recommend reorganizing and rebuilding indexes purely on the basis of the counter avg fragmentation in percent. The counter avg page space used in percent is usually not mentioned at all. In my opinion it should also be used when deciding whether to rebuild or reorganize an index. I will show you why it is so important; personally I consider it even more important than avg fragmentation, because avg fragmentation mainly impacts large scans where physical IO's are involved, whereas avg page space used impacts physical IO, logical IO, CPU and memory, and thus deserves much more attention.

Conclusion:

Suppose avg page space used in percent is, say, 50% and the total page count is 1000. I can see the following issues with this.

1. You need to read twice as many pages to get the same data. If this table had fully packed pages it would have clocked 500 logical IO’s, whereas now it will need 1000.
2. More logical IO’s mean more physical IO’s. The impact could be huge if the table were quite big.

3. This, in my view, is the main reason to include this counter in your rebuild and reorganize strategy. When data is brought into memory it comes in pages, so you will hold 1000 pages in memory where the same data could have been stored in 500 pages; you are wasting the memory equivalent of 500 pages, and memory is very precious in SQL Server. Now imagine that you have 30GB of total memory and your table’s pages are 70% full. You are actually using 30GB where the same data could have been stored in 21GB, so 9GB is wasted. That 9GB could have been used to keep other tables’ pages in memory, or by sort and hash operations, which in turn would have reduced physical IO’s and improved overall server performance (see the sketch after this list for a way to estimate this waste).
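As a rough way to estimate how much buffer pool memory is being lost to half-empty pages, you can sum the free space reported by sys.dm_os_buffer_descriptors. This is a minimal sketch, not part of the demo script below; it only counts pages that are currently cached, and the DMV itself can be expensive to query on servers with a lot of memory.

--Approximate buffer pool memory wasted as free space inside cached pages of the current database.
select
DB_NAME(database_id) as database_name,
count(*) as cached_pages,
count(*) * 8 / 1024 as cached_mb,
sum(cast(free_space_in_bytes as bigint)) / 1024 / 1024 as wasted_mb
from sys.dm_os_buffer_descriptors
where database_id = DB_ID()
group by database_id
go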

Now what if your script were checking only avg fragmentation in percent and, say, rebuilding an index only when its fragmentation is more than 30%? This index would not be picked up for rebuild and could keep causing the performance issue.
So while deciding your strategy for index rebuild and reorganize, use avg page space used in percent along with avg fragmentation.
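The following is a minimal sketch of that kind of check. The 30%/75% thresholds and the 1000-page floor are just example values you should tune for your own system, and the SAMPLED mode is a compromise between accuracy and cost; it flags an index when either counter looks bad.

--Flag indexes whose fragmentation OR page density looks bad. Thresholds are examples only.
select
OBJECT_NAME(ips.object_id) as table_name,
i.name as index_name,
ips.index_id,
ips.page_count,
ips.avg_fragmentation_in_percent,
ips.avg_page_space_used_in_percent,
case
when ips.avg_fragmentation_in_percent > 30 or ips.avg_page_space_used_in_percent < 75 then 'REBUILD'
when ips.avg_fragmentation_in_percent > 10 then 'REORGANIZE'
else 'NONE'
end as suggested_action
from sys.dm_db_index_physical_stats(db_id(),null,null,null,'SAMPLED') ips
join sys.indexes i on i.object_id = ips.object_id and i.index_id = ips.index_id
where ips.page_count > 1000 --ignore tiny indexes
and ips.index_level = 0 --leaf level only
go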

Thus I would say that avg page space used in percent is as important as avg fragmentation. I will research the exact impact of high avg fragmentation and will post about it.

The rest of the details are in the script below.

/*
You must have seen many articles and scripts, and even TechNet/MSDN suggestions, that recommend reorganizing and rebuilding indexes purely on the basis of the counter avg Fragmentation in percent. The counter
avg page space used in percent is not mentioned at all. For me it should be used in deciding when to rebuild or reorganize
indexes. I will show you why this is so important and what should be done in the script below.
*/
--Create a database
if exists ( select 1 from sys.databases where name = 'MyFragMentation')
begin
drop database MyFragMentation
end
go
create database MyFragMentation
ON Primary
(
Name = 'Mydata',Filename = 'C:\BOOK\DB\MyData.mdf',size = 1024MB
)
LOG ON
(
Name = 'MyLog',Filename = 'C:\BOOK\DB\MyLog.ldf',size = 512MB
)
go

--Now create a table called mysoh which will have the same data as the Sales.SalesOrderHeader table.
use MyFragMentation
go
If OBJECT_ID('mysoh','U') is not null
begin
drop table mysoh
end
go
select * into mysoh from AdventureWorks.Sales.SalesOrderHeader
go
--Now I will create the primary key with a clustered index
alter table mysoh add constraint pk_mysoh_SalesOrderId primary key (SalesOrderId)
go

--Now check the fragmentation
select * from sys.dm_db_index_physical_stats(db_id(),object_id('mysoh'),null,null,'DETAILED')
go

/*
Page count 799.
avg fragmentation in percent 37.54%. This is a new table and the primary key was just created,
and still I see fragmentation. This is quite odd.
avg page space used in percent is 98.5%. This is quite good. I personally look at this counter more than
avg fragmentation in percent.
*/

--Anyway, let us rebuild the clustered index
alter index all on mysoh rebuild
go

select * from sys.dm_db_index_physical_stats(db_id(),object_id('mysoh'),null,null,'DETAILED')
go

/*
Page count 798, slightly reduced.
avg fragmentation in percent 37.59%. This is odd: why is it increasing even though I just rebuilt the clustered index?
avg page space used in percent is 98.61%. This is quite good again.
*/
--Now I will truncate the table and then insert the data
truncate table mysoh
go
set identity_insert mysoh on
go
insert into mysoh with (tablockx)
( SalesOrderID
,[RevisionNumber]
,[OrderDate]
,[DueDate]
,[ShipDate]
,[Status]
,[OnlineOrderFlag]
,[SalesOrderNumber]
,[PurchaseOrderNumber]
,[AccountNumber]
,[CustomerID]
,[ContactID]
,[SalesPersonID]
,[TerritoryID]
,[BillToAddressID]
,[ShipToAddressID]
,[ShipMethodID]
,[CreditCardID]
,[CreditCardApprovalCode]
,[CurrencyRateID]
,[SubTotal]
,[TaxAmt]
,[Freight]
,[TotalDue]
,[Comment]
,[rowguid]
,[ModifiedDate])
select
SalesOrderID
,[RevisionNumber]
,[OrderDate]
,[DueDate]
,[ShipDate]
,[Status]
,[OnlineOrderFlag]
,[SalesOrderNumber]
,[PurchaseOrderNumber]
,[AccountNumber]
,[CustomerID]
,[ContactID]
,[SalesPersonID]
,[TerritoryID]
,[BillToAddressID]
,[ShipToAddressID]
,[ShipMethodID]
,[CreditCardID]
,[CreditCardApprovalCode]
,[CurrencyRateID]
,[SubTotal]
,[TaxAmt]
,[Freight]
,[TotalDue]
,[Comment]
,[rowguid]
,[ModifiedDate]
from AdventureWorks.Sales.SalesOrderHeader
option( querytraceon 610)
go

--Check the fragmentation again

select * from sys.dm_db_index_physical_stats(db_id(),object_id('mysoh'),null,null,'DETAILED')
go

/*
Now I got what I wanted.
Page count 798.
avg fragmentation in percent 0%. This is what everyone is looking at.
avg page space used in percent is 98.61%. This is quite good again.

SELECT INTO followed by ALTER TABLE ... ADD PRIMARY KEY had very high fragmentation, and even rebuilding the index did not improve it.
I am not sure about this behaviour. I will post about it some day when I have more information. But this is
not my point today, so you can skip it.
*/

--Now let us delete some data. I will delete all rows which have an even SalesOrderID

delete from mysoh where SalesOrderID%2= 0
go

--Check the fragmentation

select * from sys.dm_db_index_physical_stats(db_id(),object_id('mysoh'),null,null,'DETAILED')
go

/*
Page count 798, unchanged.
avg fragmentation in percent 0%. I have deleted half of my data but there is no change in this counter, and that is expected because
it represents the logical fragmentation of pages in the file: 0 fragmentation means that for every page in the clustered index, the page its next-page pointer
points to is also physically the next page.
avg page space used in percent is 49.294%. This is seriously bad. We should have this value as close to 100% as possible.
As we deleted half of the rows, our data should fit within approximately 798/2 = 399 pages, but it is actually still stored on 798 pages.
This is a sheer waste of space. By itself this is not the issue; the following are its main impacts.
1. You need to read twice as many pages to get the same data. If this table had fully packed pages it would have clocked 399 logical
IO's, whereas now it will need 798.
2. More logical IO's mean more physical IO's. The impact could be huge if the table were quite big.
3. This, in my view, is the main reason to include this counter in your rebuild and reorganize strategy.
When data is brought into memory it comes in pages. Thus you will have 798 pages in memory whereas
you could have stored the same data in 399 pages. You are wasting the memory of 399 pages, which is around 3.1MB. Memory is very precious in SQL Server.
This is not huge in absolute terms, but now imagine that you have 30GB of total memory and your table's pages are 70% full. Then you are actually using 30GB whereas
the same data could have been stored in 21GB. Thus 9GB would be wasted. This 9GB could have been used to store more
tables' data, avoiding physical IO's, or it could have been used by sort and hash operations in memory.

This is like getting your data compressed by 30% without actually compressing it.

Now what if your script were checking only avg fragmentation in percent and, say, rebuilding the index only when fragmentation is more than 30%?
This index would not be picked up for rebuild and could keep causing the performance issue.
So while deciding your strategy for index rebuild and reorganize, use avg page space used in percent along with avg fragmentation.

*/

--You saw previously that the delete did not cause a change in avg fragmentation in percent. That is not always the case: with large inserts and deletes
--the avg fragmentation will change accordingly. But before that, I will show you that for indexes
--where the avg fragmentation is quite low but the avg page space used in percent is also quite low, you do not need to rebuild the index.
--Just reorganize it; that is a lighter operation compared to a rebuild.

alter index all on mysoh reorganize
go
/*
CAUTION : If you do not have any fragmentation but your avg page space used is low, you might be tempted to use reorganize, but
this can cause the avg fragmentation counter to increase, so you might end up needing to rebuild the index anyway. Thus I would suggest
rebuilding the index if you have the window, space and resources.
*/
select * from sys.dm_db_index_physical_stats(db_id(),object_id('mysoh'),null,null,'DETAILED')
go
/*
Now you will see that the number of pages is reduced to around 400 and avg page space used is around 98.5%.
But our avg fragmentation has increased. Can you guess why?
Because the data has been reorganized onto fewer pages and some pages were freed up, so for some pages the index's next logical page
is no longer the next physical page.
*/
--I will rebuild the index so that I have minimum fragmentation, and then I will do some inserts and deletes so that avg page space used
--is reduced and at the same time the fragmentation is increased.

alter index all on mysoh rebuild with(online = on)
go

select * from sys.dm_db_index_physical_stats(db_id(),object_id('mysoh'),null,null,'DETAILED')
go

/*
now all counters seem fine to me..
page count 399 as expected
avg frag is 0
avg page space used is 98.61%
*/

--I will insert the rows from the Sales.SalesOrderHeader table which have an even SalesOrderID. This will cause lots of fragmentation,
--page splits and so on.

set identity_insert mysoh on
go
insert into mysoh with (tablockx)
( SalesOrderID
,[RevisionNumber]
,[OrderDate]
,[DueDate]
,[ShipDate]
,[Status]
,[OnlineOrderFlag]
,[SalesOrderNumber]
,[PurchaseOrderNumber]
,[AccountNumber]
,[CustomerID]
,[ContactID]
,[SalesPersonID]
,[TerritoryID]
,[BillToAddressID]
,[ShipToAddressID]
,[ShipMethodID]
,[CreditCardID]
,[CreditCardApprovalCode]
,[CurrencyRateID]
,[SubTotal]
,[TaxAmt]
,[Freight]
,[TotalDue]
,[Comment]
,[rowguid]
,[ModifiedDate])
select
SalesOrderID
,[RevisionNumber]
,[OrderDate]
,[DueDate]
,[ShipDate]
,[Status]
,[OnlineOrderFlag]
,[SalesOrderNumber]
,[PurchaseOrderNumber]
,[AccountNumber]
,[CustomerID]
,[ContactID]
,[SalesPersonID]
,[TerritoryID]
,[BillToAddressID]
,[ShipToAddressID]
,[ShipMethodID]
,[CreditCardID]
,[CreditCardApprovalCode]
,[CurrencyRateID]
,[SubTotal]
,[TaxAmt]
,[Freight]
,[TotalDue]
,[Comment]
,[rowguid]
,[ModifiedDate]
from AdventureWorks.Sales.SalesOrderHeader
where SalesOrderID%2= 0
option( querytraceon 610)
go

select * from sys.dm_db_index_physical_stats(db_id(),object_id('mysoh'),null,null,'DETAILED')
go
/*
Now there are a number of issues with the data:
1. The number of pages is 1004 instead of 799.
2. Avg page space used is 78.37%, which is around 20% less.
3. The good thing is that avg fragmentation is 78.68%, which means the script you created will rebuild this index.
*/

--I will rebuild the index

alter index all on mysoh rebuild with(online = on)
go

select * from sys.dm_db_index_physical_stats(db_id(),object_id('mysoh'),null,null,'DETAILED')
go

/*
Much better now. There are around 802 pages
avg fragmentation is 1.24%
and avg space used is 98.24%
*/
/*
Conclusion: DO NOT RELY ON THE COUNTER AVG FRAGMENTATION ALONE. USE IT TOGETHER WITH AVG PAGE SPACE USED.

*/

Finding Top N Worst Performing Queries

You might have seen a number of queries which give you the top 10 or top 5 worst performing queries, either by logical IO's, by CPU time or by elapsed time. I have a similar query, but it does quite a bit more.

I personally do not rank by total logical reads, total CPU time or total elapsed time individually. I usually use this query to get the queries which are in the top N on a number of counters at once, say the top 5 by 'Total Logical Reads' and top 5 by 'Total CPU Time' and top 5 by 'Total Elapsed Time', etc. That way I find the queries which are performing badly not just on one counter but on several, and thus the "true" worst performing queries. I use the word "true" because sometimes a query uses a lot of CPU time simply because it is a parallel query, whereas its elapsed time is not particularly high (its total elapsed time rank may not even be in the top 20). However, there is still something missing: queries which take a memory grant for sorts and hash joins. That value is not included here. For such queries, though, CPU time tends to be higher, so they will often still show up in the top queries by CPU time.

Be careful while executing it. The query does a lot of sorting on different columns to compute the rankings, so if your system is already under pressure it might add to your performance issues. I would suggest starting with one of the counters, commenting out the others, and then proceeding gradually to make sure it does not put pressure on the system.

Also, this query works on SQL Server 2008 and compatibility level 100 and above.


with PerformanceMetrics
as
(
select
--dest.text,
--statement_start_offset,
--statement_end_offset,
--LEN(dest.text) ln,
substring
(
dest.text,
deqs.statement_start_offset/2 + 1,
(case when deqs.statement_end_offset = -1 then datalength(dest.text)
else deqs.statement_end_offset
end - deqs.statement_start_offset)/2 + 1
) as 'Text of the SQL' ,
deqs.plan_generation_num as 'Number of times the plan was generated for this SQL',
execution_count as 'Total Number of Times the SQL was executed',
DENSE_RANK() over(order by execution_count desc) as 'Rank of the SQL by Total number of Executions',
total_elapsed_time/1000 as 'Total Elapsed Time in ms consumed by this SQL',
DENSE_RANK() over(order by total_elapsed_time desc) as 'Rank of the SQL by Total Elapsed Time',
Max_elapsed_time/1000 as 'Maximum Elapsed Time in ms consumed by this SQL',
min_elapsed_time/1000 as 'Minimum Elapsed Time in ms consumed by this SQL',
total_elapsed_time/1000/nullif(execution_count,0) as 'Average Elapsed Time in ms consumed by this SQL',
DENSE_RANK() over(order by total_elapsed_time/nullif(execution_count,0) desc) as 'Rank of the SQL by Average Elapsed Time',
total_worker_time/1000 as 'Total CPU Time in ms consumed by this SQL', --worker time is reported in microseconds
DENSE_RANK() over(order by total_worker_time desc) as 'Rank of the SQL by Total CPU Time',
Max_worker_time/1000 as 'Maximum CPU Time in ms consumed by this SQL',
min_worker_time/1000 as 'Minimum CPU Time in ms consumed by this SQL',
total_worker_time/1000/nullif(execution_count,0) as 'Average CPU Time in ms consumed by this SQL',
DENSE_RANK() over(order by total_worker_time/nullif(execution_count,0) desc) as 'Rank of the SQL by Average CPU Time',
total_logical_reads as 'Total Logical Reads Clocked by this SQL',
DENSE_RANK() over(order by total_logical_reads desc) as 'Rank of the SQL by Total Logical reads',
Max_logical_reads as 'Maximum Logical Reads Clocked by this SQL',
min_logical_reads as 'Minimum Logical Reads Clocked by this SQL',
total_logical_reads/nullif(execution_count,0) as 'Average Logical Reads Clocked by this SQL',
DENSE_RANK() over(order by total_logical_reads/nullif(execution_count,0) desc) as 'Rank of the SQL by Average Logical reads',
total_physical_reads as 'Total Physical Reads Clocked by this SQL',
DENSE_RANK() over(order by total_physical_reads desc) as 'Rank of the SQL by Total Physical Reads',
Max_physical_reads as 'Maximum Physical Reads Clocked by this SQL',
min_physical_reads as 'Minimum Physical Reads Clocked by this SQL',
total_physical_reads/nullif(execution_count,0) as 'Average Physical Reads Clocked by this SQL',
DENSE_RANK() over(order by total_physical_reads/nullif(execution_count,0) desc) as 'Rank of the SQL by Average Physical Reads',
total_logical_writes as 'Total Logical Writes Clocked by this SQL',
DENSE_RANK() over(order by total_logical_writes desc) as 'Rank of the SQL by Total Logical Writes',
Max_logical_writes as 'Maximum Logical Writes Clocked by this SQL',
min_logical_writes as 'Minimum Logical Writes Clocked by this SQL',
total_logical_writes/nullif(execution_count,0) as 'Average Logical Writes Clocked by this SQL',
DENSE_RANK() over(order by total_logical_writes/nullif(execution_count,0) desc) as 'Rank of the SQL by Average Logical Writes',
deqp.query_plan as 'Plan of Query'
--similarly you can add the ranks for maximum values as well.That is quite useful in finding some of the perf issues.
from
sys.dm_exec_query_stats deqs
/*F0C6560A-9AD1-448B-9521-05258EF7E3FA*/ --use a newid so that we can exclude this query itself from the performance metrics output
outer apply sys.dm_exec_query_plan(deqs.plan_handle) deqp --sometimes the plan might not be in the cache any longer.So using outer apply
outer apply sys.dm_exec_sql_text(deqs.sql_handle) dest --Sometimes the text is not returned by the dmv so use outer apply.
where
dest.text not like '%F0C6560A-9AD1-448B-9521-05258EF7E3FA%'
)
select
*
from
PerformanceMetrics
where
1=1
--apply any of these where clause in any combinations or one by one..
--and [Rank of the SQL by Average CPU Time] <= 20 --Use this to find the top N queries by avg CPU time.
--and [Rank of the SQL by Average Elapsed Time] <= 20 --Use this to find the top N queries by avg elapsed time.
--and [Rank of the SQL by Average Logical reads] <= 20 --Use this to find the top N queries by avg logical reads.
--and [Rank of the SQL by Average Physical Reads] <= 20 --Use this to find the top N queries by avg physical reads.
and [Rank of the SQL by Total CPU Time] <= 20 --Use this to find the top N queries by total CPU time.
and [Rank of the SQL by Total Elapsed Time] <= 20 --Use this to find the top N queries by total elapsed time.
and [Rank of the SQL by Total Logical reads] <= 20 --Use this to find the top N queries by Total Logical reads.
and [Rank of the SQL by Total Physical Reads] <= 20 --Use this to find the top N queries by Total Physical Reads.
and [Rank of the SQL by Total number of Executions] <= 20 --Use this to find the top N queries by Total number of Executions.
--and [Rank of the SQL by Average Logical Writes] <= 20 --Use this to find the top N queries by Average Logical Writes.
and [Rank of the SQL by Total Logical Writes] <= 20 --Use this to find the top N queries by Total Logical Writes.

--I usually query by 6 rank types together: Total Logical Reads, Total CPU Time, Total Elapsed Time, Total Execution Count, Total Physical Reads and Total Logical Writes. Sometimes I exclude the last two counters if I do not get any query in the output.
--If some queries are, say, in the top 10 in all 6 of these categories then those need to be tuned first.
--Sometimes you might not get any rows at all when using all 6 categories; in that case remove one of the categories or try them one by one.

Improved Merge Join Algorithm in SQL Server 2008

I assume that you know how a merge join works. There is, however, an issue with the existing merge join algorithm in SQL Server 2008. I am sure it could be improved further to reduce the number of rows processed and the number of IO's.

When the inner table is accessed, the scan starts at the first row and the table is scanned until it finds a join key value that is greater than the join key value from the outer table. As soon as the join key value is greater than the outer table's join key, the scan can stop. That is quite an efficient way to end the scan. However, the scan always starts at the first row (I am assuming that there are no other SARGs which would cause the scan to start somewhere else or cause another index to be used). This part can be very inefficient if, say, the last value of the join key is almost at the end of the inner table. Then the whole clustered index or table is scanned, even though we might only need, say, 1000 rows and thus only the pages holding those 1000 rows.

Solution: For me the scan should start at the minimum value of the join key from the outer table. For example, say we have a clustered index on id and the outer table has a minimum value of 100000 and a maximum value of 105000. Then the scan should start at 100000 and end at 105000. Currently that is not happening: it starts at 1 and finishes at 105000, so the rows from 1 to 99999 are processed uselessly. This wastes CPU and causes more logical reads.
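In general the workaround boils down to seeding the inner side's clustered key with the minimum value implied by the outer predicate. The condensed sketch below assumes the mysoh/mysod copies of SalesOrderHeader/SalesOrderDetail that the demo script creates; the script itself builds this up step by step with measurements.

--Condensed form of the workaround: restrict the inner table's clustered key
--to the range implied by the outer predicate, so the merge scan does not start at row 1.
declare @orderdate datetime = '20010702'
declare @minsalesorderid int

select @minsalesorderid = MIN(SalesOrderID)
from mysoh
where OrderDate between @orderdate and DATEADD(dd,1,@orderdate)

select *
from mysoh soh
inner merge join mysod sod
on soh.SalesOrderID = sod.SalesOrderID
where soh.OrderDate between @orderdate and DATEADD(dd,1,@orderdate)
and sod.SalesOrderID >= @minsalesorderid --extra predicate lets the inner scan start at the right point
go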

The script below shows the issue in more detail, along with the alternate method.

--I will explain the better merge algorithm. I will use the AdventureWorks database
use AdventureWorks
go
drop table mysoh
go
select * into mysoh from sales.SalesOrderHeader
go
alter table mysoh add constraint pk_mysoh primary key (SalesorderId)
go
create nonclustered index idx_ord_date_soh on mysoh(OrderDate)
go
drop table mysod
go
select * into mysod from sales.SalesOrderDetail
go
alter table mysod add constraint pk_mysod primary key (SalesorderId,salesorderdetailid)
go

drop procedure ExistingMergeAlgo
go
create procedure ExistingMergeAlgo
(
@orderdate datetime
)
as
begin
select *
from
mysoh soh
inner join mysod sod
on soh.SalesOrderID = sod.SalesOrderID
where
soh.OrderDate between @orderdate and DATEADD(dd,1,@orderdate)
end
go
--Now let us see how it is performing..
set statistics io,time on
go
exec ExistingMergeAlgo @orderdate = '20010701'
go
set statistics io,time off
go

--The proc got executed, but a nested loop join was used because it is better than a merge join here. Since I will be showing the better merge algorithm,
--I will force merge joins so that nested loop and hash joins are not used at all. This is not recommended on production systems; it is just for demo
--purposes.

drop procedure ExistingMergeAlgo
go
create procedure ExistingMergeAlgo
(
@orderdate datetime
)
as
begin
select *
from
mysoh soh
inner merge join mysod sod
on soh.SalesOrderID = sod.SalesOrderID
where
soh.OrderDate between @orderdate and DATEADD(dd,1,@orderdate)
end
go

set statistics io,time on
go
exec ExistingMergeAlgo @orderdate = '20010701'
go
set statistics io,time off
go

--This is the first date in the table, so the actual number of rows from the mysod table is 362, which is one more than the
--actual number of rows returned. Thus the merge join is quite good here.
--Now I will use the next date, 20010702

set statistics io,time on
go
exec ExistingMergeAlgo @orderdate = '20010702'
go
set statistics io,time off
go

--actual number of rows is 367; I expected it to be around 370.. Now I will select the next date..

set statistics io,time on
go
exec ExistingMergeAlgo @orderdate = '20010703'
go
set statistics io,time off
go

--actual number of rows from the mysod table is 369; I expected 376..

--let us select the next date..

set statistics io,time on
go
exec ExistingMergeAlgo @orderdate = '20010704'
go
set statistics io,time off
go
--now the rows are 374 instead of 381.. As the date value increases, so does the actual number of rows processed by the CPU,
--and also the IO's, because we have to start the scan at the beginning and read till the last matching SalesOrderID value.

--now try the last day, i.e. the last value of OrderDate

set statistics io,time on
go
exec ExistingMergeAlgo @orderdate = '20040731'
go
set statistics io,time off
go

--for this date the actual number of rows is 121317, which is the total number of rows, and look at the IO's: around 1502,
--which is very high. There were just 96 rows returned. I expected the optimizer to start the scan of mysod at
--the minimum SalesOrderID value..

--I just wanted to show that the merge join was starting the scan of the table mysod from the first leaf page
--and then kept scanning until the SalesOrderID value exceeded the SalesOrderID value in the mysoh table
--for the given date range. As the date increases, the scan still starts at the first page and continues
--till the last matching SalesOrderID value.

--This is the problem I am going to discuss with respect to the merge join. The optimizer is smart enough to know
--where to stop scanning the second table. However, it is not smart enough to find the start point of the scan.
--The inner input is sorted and we have 47311 as the starting SalesOrderID (for date 20010704). Thus the optimizer should start the scan
--from 47311 and then, as it usually does, scan till the last value is reached..
--I hope that in future the optimizer will be able to do this, which would make sure that we scan the minimum number of pages and rows and thus
--reduce both the logical reads and the CPU processing.

--We cannot wait until Microsoft puts this feature in the SQL Server optimizer. However, we can use a workaround.

drop procedure ExistingMergeAlgo
go
create procedure ExistingMergeAlgo
(
@orderdate datetime
)
as
begin
declare @minsalesorderid int
select @minsalesorderid= MIN(salesorderid) from mysoh where OrderDate between @orderdate and DATEADD(dd,1,@orderdate)
select *
from
mysoh soh
inner merge join mysod sod
on soh.SalesOrderID = sod.SalesOrderID
where
soh.OrderDate between @orderdate and DATEADD(dd,1,@orderdate)
--we will start processing from the first SalesOrderID of the date range rather than the first row of the table..
and sod.SalesOrderID >= @minsalesorderid
end
go

set statistics io,time on
go
exec ExistingMergeAlgo @orderdate = '20010701'
go
set statistics io,time off
go

--This won't have any impact, as the scan starts from the first day anyway... there are 362 rows processed
--for the mysod table..
--Now try the next date..

set statistics io,time on
go
exec ExistingMergeAlgo @orderdate = '20010702'
go
set statistics io,time off
go

--Now the actual number of rows is 10 instead of 367. This means the operator had to process about 3% of the rows it was processing before.
--The logical IO's are reduced as well, because the scan actually started at the min value of SalesOrderID rather than the first row of the table.
--try the next date

set statistics io,time on
go
exec ExistingMergeAlgo @orderdate = '20010703'
go
set statistics io,time off
go

--the number of rows processed is 8 instead of 362.. Also the IO's for mysod are 4 instead of 10..

--Now I will jump directly to the last date..

set statistics io,time on
go
exec ExistingMergeAlgo @orderdate = '20040731'
go
set statistics io,time off
go

--the actual number of rows is now just 96 instead of 121317 as shown above..
--Also the IO's are just 5 instead of 1502. That is quite a big improvement in
--the IO's, and it is not just the IO's: only 96 rows were processed, which means
--less CPU-intensive work..

--the data was perfectly clustered and thus the performance improvement was very good. Now let us try some other column, say CustomerID..

create nonclustered index idx_customer_mysoh on mysoh(Customerid)
go

drop procedure ExistingMergeAlgo
go
create procedure ExistingMergeAlgo
(
@customerid int
)
as
begin
--declare @minsalesorderid int
--select @minsalesorderid= MIN(salesorderid) from mysoh where customerid = @customerid
select *
from
mysoh soh
inner merge join mysod sod
on soh.SalesOrderID = sod.SalesOrderID
where
soh.CustomerID =@customerid
--we will start processing from the first SalesOrderID of the range rather than the first row of the table..
--and sod.SalesOrderID >= @minsalesorderid
end
go
--find the customers whose SalesOrderID values are spread the most and the least (a measure of clustering)

select customerid,COUNT(*)cnt ,MAX(SalesOrderID)-MIN(SalesOrderID) clstring from
mysoh group by customerid
order by 3 desc
/*
top 2
16612 2 30313
16496 2 30107
bottom 3
12563 2 6
14981 2 6
14977 1 0
*/

--let us start with the bottom ones first

set statistics io,time on
go
exec ExistingMergeAlgo @customerid = 12563
go
set statistics io,time off
go
--rows processed 77324 and IO's 996 for the mysod table..

--let us try 14977

set statistics io,time on
go
exec ExistingMergeAlgo @customerid = 14977
go
set statistics io,time off
go

--IO's 1493 and rows processed around 120K.. This means that for customer id 14977 the SalesOrderIDs were generated much more recently than for customer
--12563.. Now let us try the top 2

set statistics io,time on
go
exec ExistingMergeAlgo @customerid = 16612
go
set statistics io,time off
go

--1477 IO's and 118K rows processed for the mysod table. Thus the rows processed are simply determined by the max SalesOrderID:
--the lower this value, the lower the IO's and the lower the number of rows processed.

--Now let us try the technique I used earlier..

drop procedure ExistingMergeAlgo
go
create procedure ExistingMergeAlgo
(
@customerid int
)
as
begin
declare @minsalesorderid int
select @minsalesorderid= MIN(salesorderid) from mysoh where customerid = @customerid
select *
from
mysoh soh
inner merge join mysod sod
on soh.SalesOrderID = sod.SalesOrderID
where
soh.CustomerID =@customerid
--we will start processing from the first SalesOrderID of the range rather than the first row of the table..
and sod.SalesOrderID >= @minsalesorderid
end
go

--let us start with the bottom ones first

set statistics io,time on
go
exec ExistingMergeAlgo @customerid = 12563
go
set statistics io,time off
go
--rows processed are 16 instead of 77324 and IO's are 5 instead of 996 for the mysod table..

--let us try 14977

set statistics io,time on
go
exec ExistingMergeAlgo @customerid = 14977
go
set statistics io,time off
go

--IO's are 4 instead of 1493 and rows processed are 3 instead of 120K..

--Now let us try the top 2

set statistics io,time on
go
exec ExistingMergeAlgo @customerid = 16612
go
set statistics io,time off
go

--1472 IO's instead of 1477 and 118K rows instead of 118K rows processed for the mysod table.

--Thus the new technique performs similarly to the original method when the min and max values are very far apart..
--But it performs superbly if the difference is quite small...

--Thus you can see that in the worst case this method performs almost the same as the original method,
--but it can boost performance by an order of magnitude in lots of scenarios...

/*
Conclusion:
This is not meant to replace the nested loop join or any other costing method. It is just to show that
the optimizer should start the scan of the inner table from an appropriate point rather than the first row while doing a merge join, especially on already sorted data.
The suggestion is simply that the merge join algorithm could be improved to reduce logical reads and CPU processing by determining
the start point of the scan for the inner table.

The workaround I have shown should not be used unless you do the following.
1. Make sure that other performance techniques have been tried before going for this approach.
2. Be aware that this technique requires a code change, as you have to find the minimum value of the clustered index key.
3. Before applying this technique, benchmark it against the best plan and method for your queries; based on the results decide whether to use it. Also, it only works if the inner table has a clustered index (or is accessed through another index on the join column).
4. Microsoft might provide an alternative for this in future, in which case the extra code would become redundant.
*/

Performance issues related to nvarchar(max) in SQL Server 2008

Lots of people have started using nvarchar(max) instead of defining fixed-length nvarchar columns. This is mostly to avoid modifying code when a column's length changes, or because they are not sure what the right length for a column would be. This should really be resolved at the database design level: the designer should define a proper length for the column based on the business requirements. However, that is either not always possible, or people simply do not bother and create the column as nvarchar(max). I will show you the disadvantages of using nvarchar(max) instead of a fixed-length column, especially when the size of the column is less than 8000 bytes. Columns that need more than 8000 bytes cannot use a fixed-length definition anyway, as that would truncate the data for some rows.

Anyway, below is a script you can run to see why nvarchar(max) should not be used where it is possible to use nvarchar(fixedlength).

Use AdventureWorks
go
--create a table with some records..
if OBJECT_ID('nvarcharfixed') is not null
drop table nvarcharfixed
go

create table nvarcharfixed ( id int primary key not null,col1 nvarchar(200) not null,col2 char(400) not null,col3 int not null)
go
insert into nvarcharfixed
select SalesOrderID,SUBSTRING(replicate(reverse(SalesOrderNumber),100),1,100),SalesOrderNumber,soh.CustomerID
from Sales.SalesOrderHeader soh
go

create nonclustered index idx_col1_fixed on nvarcharfixed(col3) include(col1)
go
update Statistics nvarcharfixed with fullscan
go

if OBJECT_ID('nvarcharmax') is not null
drop table nvarcharmax
go
create table nvarcharmax ( id int primary key not null,col1 nvarchar(max) not null,col2 char(400) not null,col3 int not null)
go

insert into nvarcharmax
select *
from nvarcharfixed
go
create nonclustered index idx_col1_max on nvarcharmax(col3) include (col1)
go
update Statistics nvarcharmax with fullscan
go

--Now let us check the index size, table size etc. to see what impact nvarchar(max) has on storage.
with IndexPhysStats
as
(
select OBJECT_NAME(object_id) as TblName,* from sys.dm_db_index_physical_stats(db_id(),object_id('nvarcharfixed'),null,null,'DETAILED')
union all
select OBJECT_NAME(object_id) as TblName,* from sys.dm_db_index_physical_stats(db_id(),object_id('nvarcharmax'),null,null,'DETAILED')
)
select * from IndexPhysStats
go

--There is no difference in record size, index_depth, page_count, space used or even fragmentation. However, there is an extra LOB_DATA row for the max table, and of course it is empty..

--Now let us get a value from both tables. Turn on the actual execution plan
set statistics io,time on
go
select * from nvarcharfixed where id = 43659
select * from nvarcharmax where id = 43659
go
set statistics io,time off
go

--Estimated cost is the same and everything else is the same, but the estimated row size is 4443B for the max table and 619B for the fixed one.
--This is a big difference. It can impact sorting etc., because the optimizer will use 4443B to assign memory in the max case
--instead of 619B. So let us see what difference it makes

set statistics io,time on
go
select * from nvarcharfixed where id between 43659 and 43670 order by col1
select * from nvarcharmax where id between 43659 and 43670 order by col1
go
set statistics io,time off
go

--No diff except the row size.. Though 1 extra IO for the max table..

--now query based on col3..

set statistics io,time on
go
select * from nvarcharfixed where col3 = 442 --order by col1
select * from nvarcharmax where col3 = 442 --order by col1
go
set statistics io,time off
go
--No diff except the row size.. Though 1 extra IO for the max table..

set statistics io,time on
go
select * from nvarcharfixed where col3 = 442 order by col1
select * from nvarcharmax where col3 = 442 order by col1
go
set statistics io,time off
go
--No diff except the row size.. Though 1 extra IO for the max table..

--Now I will sort on the column, but for the full table..

set statistics io,time on
go
select * from nvarcharfixed order by col1
select * from nvarcharmax order by col1
option(maxdop 1)
go
set statistics io,time off
go
--Look at the plan costs: 3.99 for fixed and 356.5 for max. That is a lot of difference. It caused the optimizer
--to opt for a parallel plan for the max table. Also the memory grant is almost 5 times higher: 25K for fixed and 121K for max.

--Now I will sort on column col2

set statistics io,time on
go
select * from nvarcharfixed order by col2
select * from nvarcharmax order by col2
option(maxdop 1)
go
set statistics io,time off
go

--same as above..

--Now exclude the col1 column and get the remaining columns..

set statistics io,time on
go
select id,col3,col2 from nvarcharfixed order by col2
select id,col3,col2 from nvarcharmax order by col2
option(maxdop 1)
go
set statistics io,time off
go

--Now both have the same row size, the same estimated cost and the same amount of memory grant..

--finally I would like to query based on col1.. There are no indexes on col1, so let us see what the difference will be..

set statistics io,time on
go
select * from nvarcharfixed where col1 like '10%' order by col2
select * from nvarcharmax where col1 like '10%' order by col2
option(maxdop 1)
go
set statistics io,time off
go

--Now there is no diff at all except the row size, which is now 619B and 643B, much less than 4443 bytes..
--Is it because of the nonclustered index scan that we are getting the smaller row size? So let us avoid the
--use of the nonclustered index scan and see what happens.

set statistics io,time on
go
select * from nvarcharfixed where col1 like '%' order by col2
select * from nvarcharmax where col1 like '%' order by col2
option(maxdop 1)
go
set statistics io,time off
go
--The sizes are 619B and 643B. Thus the row size is reduced even though the nonclustered index
--is not used.. The memory grant is a bit more than for the fixed table.

--was it because the stats were updated and the optimizer is using them for the row size..??
--Before checking that, let us rerun the table scan with the order by

set statistics io,time on
go
select * from nvarcharfixed order by col2
select * from nvarcharmax order by col2
option(maxdop 1)
go
set statistics io,time off
go

--the memory grant is still 106K and the estimated row size is 4443 bytes again.. So the stats did not do anything; it is just that providing
--col1 in the where clause makes the difference..

--try again

set statistics io,time on
go
select * from nvarcharfixed where col1 like '%' order by col2
select * from nvarcharmax where col1 like '%' order by col2
option(maxdop 1)
go
set statistics io,time off
go

--row size is reduced to 643 bytes again, and so are the memory grant and estimated cost.. Thus using the nvarcharmax column in the where clause
--reduces the estimated row size, and in turn the estimated cost and memory grant..

--let me drop the statistics on the col1 column, switch auto creation of statistics off as well and see how it performs
--(the auto-created statistics name used below will be different on your system)

select * from sys.stats where object_id = object_id('nvarcharmax','U')
go
drop statistics nvarcharmax._WA_Sys_00000002_413A6905
go
select * from sys.stats where object_id = object_id('nvarcharmax','U')
go
alter database adventureworks set auto_create_statistics off
go
set statistics io,time on
go
select * from nvarcharfixed where col1 like '%' order by col2
select * from nvarcharmax where col1 like '%' order by col2
option(maxdop 1)
go
set statistics io,time off
go

--The estimated row size is again 4443 bytes and thus a large memory grant is needed; in this case it is 74K.
--You will also get a warning that there are no statistics for col1

alter database adventureworks set auto_create_statistics on
go

Conclusion:
If you are sure that the length of your nvarchar column will always be less than 8000 bytes, do not define the column as nvarchar(max);
define it as nvarchar(fixedlength) wherever possible. The main advantages I see for using a fixed length are the following.

1. You can create an index on a fixed-length column but not on an nvarchar(max) column. If you are using the column in the where clause of your queries, you should always use a fixed-length column instead of nvarchar(max); this alone should be reason enough. I will list the others as well. Note, however, that the maximum index key length is 900 bytes, so an indexed nvarchar column whose values exceed that limit will cause issues (such inserts or updates will fail).
2. Memory grants can be a big issue when the server is already memory-starved. Because the estimated row size is larger, the optimizer
estimates a bigger memory grant; this value is much higher than what is actually required, and it wastes a resource as precious as memory.
If you have a couple of queries which use an nvarchar(max) column and need sorting, the server might run into memory related issues.
This can be a big performance issue.
3. You cannot rebuild indexes online when the table or index has an nvarchar(max) column. For example, the following fails:
alter index all on nvarcharmax
rebuild with (online=on)

However, if you have already defined your columns as nvarchar(max) and cannot change them now, you can reduce the memory grant related issues
by creating statistics on the column and adding the column to the where clause (if it is not there already) with a condition like colname like '%'. This makes sure that
the memory grant is smaller (at least for columns whose average length is well below 8000 bytes), because the estimated row size is reduced. That way
memory is not wasted, which reduces the memory pressure caused by the memory grants related to nvarchar(max) columns combined with order by or sort operations.
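As a minimal sketch of that workaround, using the demo tables from the script above (the statistics name is just an example, and the with fullscan option is optional):

--Create statistics on the max column explicitly (instead of relying on the auto-created _WA_Sys_... stats),
--then reference the column in the where clause with a predicate that matches every row.
create statistics st_nvarcharmax_col1 on nvarcharmax(col1) with fullscan
go
select * from nvarcharmax
where col1 like '%' --harmless predicate on a NOT NULL column; lets the optimizer use the column statistics for row size estimation
order by col2
option(maxdop 1)
go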

Performance Enhancement for Index Seek + Key Lookup in SQL Server (Part 2)

In the last post you saw a way to improve the performance of the key lookup when the data is clustered perfectly. In this post I will show the scenarios where the performance might be worse than the usual methods.

1. The data is perfectly clustered but most of your queries get most of the data from the table. The threshold at which this method starts performing badly depends on the table size, the indexes and the queries, so test your queries and see at what point the tweaked SQL performs worse.

exec clustproc @dt= '20020701',@days = 30500 with recompile
go

--864 IO's for the full table, whereas a normal table scan is just 803 IO's. So when you need the full table's data this method can perform slightly worse.

2. This method is best when the data in the index key is perfectly clustered with the clustered index key. DO NOT USE this method when the clustering is not good. To show this, I will create the clustered index on the rowguid column so that the clustering is very bad for the OrderDate column.

if OBJECT_ID('actsoh') is not null
begin
drop table actsoh
end
select * into actsoh from sales.SalesOrderHeader
go
alter table actsoh alter column rowguid varbinary(32)
go
create unique clustered index idx_pk_rowguid on actsoh(rowguid)
go
create nonclustered index idx_ord_dt on actsoh(orderdate)
go

update Statistics actsoh with fullscan
go
--now query the table based on an order date range of around 1 day
--1 month
--or 1 year
exec nproc @dt= '20010701',@days = 0 with recompile
--142 IO's, 0 ms CPU, 170ms elapsed time.. --using index seek + key lookup

--1 month
exec nproc @dt= '20010701',@days = 30 with recompile

--574 IO's, 0 ms CPU, 382 ms elapsed time --using index seek + key lookup
--try for 1 year
exec nproc @dt= '20010701',@days = 365 with recompile
go
--804 IO's for a full table scan..
--Now I will create clustproc again, this time with the rowguid column..
--Now I will apply my technique.. In this case the clustering is very bad, so let us see how it performs..
drop proc clustproc
go
create proc clustproc
(@dt datetime,
@days smallint = 0
)
as
begin
declare @minrowguid varbinary(32),@maxrowguid varbinary(32)

select
@minrowguid= MIN(rowguid),@maxrowguid=max(rowguid)
from
actsoh idx with (index = 2)
where
1=1
and idx.OrderDate between @dt and DATEADD(dd,@days,@dt)

select tbl.* from
actsoh tbl with (index = 1)
where
1=1
and tbl.rowguid between @minrowguid and @maxrowguid
and tbl.OrderDate between @dt and DATEADD(dd,@days,@dt)
end
go

exec clustproc @dt= '20010701',@days = 0 with recompile
--759 IO's, 0 ms CPU, 130ms elapsed time.. This is much higher than the 142 IO's of the normal proc
--1 month
exec clustproc @dt= '20010701',@days = 30 with recompile
--804 IO's, equivalent to a table scan.. it is worse than the 574 IO's of the normal index seek + key lookup
--try for 1 year
exec clustproc @dt= '20010701',@days = 365 with recompile
go
--828 IO's, 31 ms CPU, 314 ms elapsed time.
--for the full table
exec clustproc @dt= '20020701',@days = 30500 with recompile
go
--938 IO's.
/*

Conclusion: Before using this method please make sure of the following.
1. Apply normal performance techniques, like covering indexes, to improve the proc's performance first.
2. Make sure that the data is clustered almost perfectly and that most of your queries fetch only a part of the data, not the whole table.
3. Benchmark the performance of this method against the best method you already have. If it provides enough gain, then you can implement it.
4. Test it in dev before applying it to prod, and verify that it returns the data correctly.
*/

Conclusion:

This is quite a good way to reduce IO and overall resource usage when the data is perfectly clustered. But before using it, make sure that you try the normal performance techniques, like covering indexes or reducing the number of columns so that an existing index covers the query. Also make sure that you benchmark it against your best method for all possible scenarios, and based on that decide whether to use it or not.

Performance Enhancement for Index Seek + Key Lookup in SQL Server (Part 1)

Index Seek + Key Lookup: this combination is often considered very bad in SQL Server. I think it is normal, because you cannot always have all the columns in the index key, so to get the columns which are not part of the index key you need to fetch them from the data page.

Anyway, there are a number of ways to improve performance where index seek + key lookup is causing an issue. All of them are specific: one, for example, is to create a covering index for a given query, making sure that we have just an index seek and no key lookup. But can we do that for all our queries? No. However, you can use it for the queries which run frequently or which cause real performance issues, and a covering index avoids the lookup.
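For example, a covering index for a query that filters on OrderDate and returns only a few columns might look like the sketch below (the index name and the included columns are just assumptions for a hypothetical query):

--Covering index: the query below is satisfied entirely from the index, so no key lookup is needed.
create nonclustered index idx_soh_orderdate_covering
on Sales.SalesOrderHeader(OrderDate)
include (CustomerID, TotalDue)
go
select OrderDate, CustomerID, TotalDue
from Sales.SalesOrderHeader
where OrderDate between '20010701' and '20010731'
go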

Anyway, I am going to introduce a method which is quite specific to a given type of query, where you can use an existing index to improve the overall performance of the query.

It relies on the data correlation between the nonclustered index key and the clustered index key. In database terms this is also called clustering of the data. For example, OrderDate in the Sales.SalesOrderHeader table and SalesOrderID are correlated in such a way that when one value increases, the other increases as well: for a given OrderDate, the SalesOrderIDs are greater than the SalesOrderIDs of any earlier OrderDate.
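Before relying on this, you may want to check how well the two keys actually correlate. A minimal sketch, assuming the candidate nonclustered key is OrderDate and the clustered key is SalesOrderID: for each OrderDate value, compare its row count against the width of the SalesOrderID range it spans; when the spread is close to the row count, the clustering is nearly perfect.

--Rough clustering check: a spread close to the row count means each nonclustered key
--value maps to a tight, contiguous range of clustered key values.
select
OrderDate,
count(*) as row_cnt,
max(SalesOrderID) - min(SalesOrderID) + 1 as salesorderid_spread
from Sales.SalesOrderHeader
group by OrderDate
order by max(SalesOrderID) - min(SalesOrderID) + 1 - count(*) desc
go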

The script below has all the explanation.

PS: This method is quite specific, as it requires the data to be almost perfectly correlated. Before applying this technique, make sure of the following:
1. You have already used other performance techniques, like creating a covering index.
2. The clustering between the nonclustered index key and the clustered index key is almost perfect.

When you have applied the normal performance techniques, the performance is still not good and the data clustering is perfect, then try this method and benchmark it against your existing or best method. Then decide whether this technique is good enough compared to your best method.

Benchmark your perf results before applying any new change.

In the code below the elapsed time and CPU time vary a lot because I have the actual execution plan enabled, which impacts these metrics. You should run the queries without it and look at the average CPU time and average elapsed time over a number of executions.

This method is not recommended if the data clustering is not good or your query returns most of the table's data.

use AdventureWorks
go
if OBJECT_ID('actsoh') is not null
begin
drop table actsoh
end
go
select * into actsoh from sales.SalesOrderHeader
go
create unique clustered index idx_pk_salesorderid on actsoh(salesorderid)
go
create nonclustered index idx_ord_dt on actsoh(orderdate)
go
update Statistics actsoh with fullscan
go

--now query the table based on an order date range of around 1 day
--1 month
--or 1 year
drop proc nproc
go
create proc nproc
(@dt datetime,
@days smallint = 0
)
as
begin
select * from actsoh
where orderdate between @dt and DATEADD(dd,@days,@dt)

end
go
exec nproc @dt= '20010701',@days = 0 with recompile
--131 IO's, 0 ms CPU, 170ms elapsed time.. --using index seek + key lookup

--1 month
exec nproc @dt= '20010701',@days = 30 with recompile

--554 IO's, 0 ms CPU, 382 ms elapsed time --using index seek + key lookup
--try for 1 year
exec nproc @dt= '20010701',@days = 365 with recompile
go
--804 IO's, 31 ms CPU, 791ms elapsed time.. Using a table scan (clustered index scan)

--Now I will apply my technique.. In this case the clustering is perfect.. I will use the min and max SalesOrderID
drop proc clustproc
go
create proc clustproc
(@dt datetime,
@days smallint = 0
)
as
begin
declare @minsalesorderid int,@maxsalesorderid int

select
@minsalesorderid= MIN(SalesOrderID),@maxsalesorderid=max(SalesOrderID)
from
actsoh idx with (index = 2)
where
1=1
and idx.OrderDate between @dt and DATEADD(dd,@days,@dt)

select tbl.* from
actsoh tbl with (index = 1)
where
1=1
and tbl.SalesOrderID between @minsalesorderid and @maxsalesorderid
and tbl.OrderDate between @dt and DATEADD(dd,@days,@dt)
end
go

exec clustproc @dt= '20010701',@days = 0 with recompile
--7 IO's, 0 ms CPU, 166ms elapsed time..
--1 month
exec clustproc @dt= '20010701',@days = 30 with recompile

--10 IO's, 0 ms CPU, 223 ms elapsed time
--try for 1 year
exec clustproc @dt= '20010701',@days = 365 with recompile
go
--90 IO's, 16 ms CPU, 266ms elapsed time.
--for the full table
exec clustproc @dt= '20020701',@days = 30500 with recompile
go

--864 IO's for the full table, whereas a normal table scan is just 803 IO's. So in the case where you need the full table's data this might perform
--slightly worse.

I will explain the disadvantages and the scenarios where this could make performance worse in the next post.