Ranking and Numbering rows – and subsets of rows – in T-SQL

I recently had to deal with a scenario where I needed to pivot out some rows after ordering (ranking) them according to specific rules so I could present some rows of data as columns, but in a specific order (don’t ask why, it’ll make me grind my teeth about data analysts that don’t understand how to analyse data…). The ordering in itself was only part of the solution, as to Pivot the data, the keys need to be specified in the query, so the natural keys can’t be used. The scenario is set out below:

Fig 1: Rank and Pivot. The Rank column needed to be added
Fig 1: Rank and Pivot. The Rank column needed to be added

My first thought was that I’d have to solve this with a cursor, which wasn’t a practical option as there were 1.5m rows of data to process, and if my solution involves a cursor I instantly think it’s a lousy solution. However I was pleased to discover the T-SQL function ROW_NUMBER() which allows you to add row numbering to ordered data and even subgroups of that data. (The below samples use the AdventureWorks2008 database.)

First up, basic row numbering:

SELECT ROW_NUMBER() OVER (ORDER BY ProductId) AS ID_Key
,        [ProductID]
,        [LocationID]
,        [Shelf]
,        [Bin]
,        [Quantity]

FROM [Production].[ProductInventory]

WHERE [ProductID] IN (1,2,3,4)

The above query adds an ID key to the data based on ordering by the ProductID field. The ROW_NUMBER() function requires an OVER clause to know on what basis it should assign the key, and this has to be an ORDER BY statement. The end result looks like this:

Fig 2: Simple row numbering
Fig 2: Simple row numbering

You can extend this to order within a subgroup, by specifying a PARTITION BY clause so ROW_NUMBER() operates with that subgroup. In the example below I partition by ProductId:

SELECT ROW_NUMBER() OVER (PARTITION BY ProductId ORDER BY Quantity DESC) AS Subset_ID_Key
,        [ProductID]
,        [LocationID]
,        [Shelf]
,        [Bin]
,        [Quantity]

FROM [Production].[ProductInventory]

WHERE [ProductID] IN (1,2,3,4)

Which yields this result, with the ranking now only applying within a Product Id:

Fig 3: Row numbering within a Subgroup
Fig 3: Row numbering within a Subgroup

Which can then be pivoted on the rank, as the key of the rank is now known:

SELECT ProductID
,        [1] AS Bin_1
,        [2] AS Bin_3
,        [3] AS Bin_3

FROM

(

SELECT ROW_NUMBER() OVER (PARTITION BY ProductId ORDER BY Quantity DESC) AS Subset_ID_Key
,        [ProductID]
,        [Bin]

FROM [Production].[ProductInventory]

WHERE [ProductID] IN (1,2,3,4)

) AS Pivot_Source

PIVOT

(
MAX(Bin)
FOR Subset_ID_Key IN ([1],[2],[3])
) AS Pivot_Output

Which yields this final output:

Fig 4: Ranked and Pivoted
Fig 4: Ranked and Pivoted

All done within a single query, and not a cursor in sight. ROW_NUMBER() was a great function to discover!

MSDN Documentation is here for:

  • ROW_NUMBER() – the key function
  • OVER – ordering and subgrouping the results of ROW_NUMBER
  • PIVOT – for pivoting out the results

4 thoughts on “Ranking and Numbering rows – and subsets of rows – in T-SQL

  1. This Partition and Order By section really helped me solve a tough row to col challenge that I had. Thanks!

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>