One of the interview questions I frequently make has to do with creating a query that will return the nth row from the result set, there has been several times the prospect does not answer because they don’t think which way they can do it, but sometimes they answer something that works, but it’s not like the very optimal.
For example, I ask them to tell me who was the second place from a list of Students and Grades. To explain better, let’s create a table called Students only with StudentName and Grade.
We have our table created, now let’s generate some random names and grades using the RAND() function assuming nobody had less than 45 and up to 100.
Okay so, one of the guys told me he would do a query like this:
This query is not wrong at all. If we execute each query, we can see that Student10 is in fact the second place with a grade of 96.
However, I remember he mentioned that this was not the very best way to do it and it certainly there would be a better way, and the answer is Yes!
ROW_NUMBER()
I know people that have used this function without knowing it was a Window Function. For those who haven’t, ROW_NUMBER() assigns a number to every row in the result set. The order of the numbers is based on the ORDER BY clause, which is independent to the ORDER BY in the outer query.
To use the ROW_NUMBER() function, we do it following the syntax:
The name of the function and the OVER clause. The latter can use an optional PARTITION BY, and what it does is, it defines the group of rows to work with.
Here we can see that we will get the StudentName, Grade and RowNumber ordered by Grade in a descending way. Since we didn’t provide a PARTITION BY, it will use the entire set of rows to work. In other words, every time the column we provide as PARTITION BY changes, it will reset the number of row. For this example it doesn’t work since there are no repeated students and the row number will be 1 for every body.
Now, for our initial question, we wanted to know who had second place from the list of students, we’re going to use our query with a Common Table Expression (or CTE), and in the outer query, we only add the WHERE clause asking for RowNumber equals to 2, and that’s it.
It’s worth mentioning that we’re not taking performance into account for now, we will look into that in some future post.
Here’s the link to the github repo with the code we used for this post: https://github.com/learningsql/sql-code
I hope you guys liked the post, don’t forget to share and like, see you soon. Cheers!
Discover more from Ben Rodríguez
Subscribe to get the latest posts sent to your email.