Understanding RANK() vs. DENSE

Understanding RANK() vs. DENSE_RANK()

November 02, 2024

Understanding RANK() vs. DENSE_RANK() in SQL

The RANK() and DENSE_RANK() window functions are powerful tools in SQL for ranking rows within a result set. This guide explains their differences, particularly in how they handle ties in ranking.

Overview

RANK(): Assigns a rank to each row, leaving gaps in rank values when there are ties.
DENSE_RANK(): Similar to RANK(), but does not leave gaps in the ranking sequence when there are ties.

Syntax


function() OVER (PARTITION BY <column> ORDER BY <column>)

PARTITION BY: (optional) Divides the result set into partitions for ranking.
ORDER BY: Specifies the order in which to rank the rows.

Example Dataset

Consider the following employees table:

Name	Salary
Alice	70000
Bob	70000
Charlie	60000
David	80000
Eva	60000

Example Queries

Using RANK()


SELECT Name, Salary,
       RANK() OVER (ORDER BY Salary DESC) AS Rank
FROM employees;

Result Set with RANK()

Name	Salary	Rank
David	80000	1
Alice	70000	2
Bob	70000	2
Charlie	60000	4
Eva	60000	4

Using DENSE_RANK()


SELECT Name, Salary,
       DENSE_RANK() OVER (ORDER BY Salary DESC) AS DenseRank
FROM employees;

Result Set with DENSE_RANK()

Name	Salary	DenseRank
David	80000	1
Alice	70000	2
Bob	70000	2
Charlie	60000	3
Eva	60000	3

Key Differences

Gaps in Ranking:
- RANK(): Introduces gaps in the ranking sequence when there are ties.
- DENSE_RANK(): Provides a continuous ranking sequence without gaps.
Use Cases:
- Use RANK() for identifying ranks with gaps.
- Use DENSE_RANK() for sequential ranking without gaps.

Summary

Both RANK() and DENSE_RANK() are valuable for assigning rankings based on specified criteria. The choice depends on whether you want to maintain gaps in the ranks (using RANK()) or ensure a continuous ranking sequence (using DENSE_RANK()).

Search This Blog

Data Engineered

Understanding RANK() vs. DENSE_RANK()

Understanding RANK() vs. DENSE_RANK() in SQL

Overview

Syntax

Example Dataset

Example Queries

Using RANK()

Result Set with RANK()

Using DENSE_RANK()

Result Set with DENSE_RANK()

Key Differences

Summary

Comments

Post a Comment

Popular posts from this blog

Databricks Auto Loader

Optimizing Spark memory - Resolving OutOfMemory (OOM)

PySpark tips for day-to-day use