Hello, and welcome back to another post!
In the last posts I have been writing about SQL Server basics which I think it’s very useful. In this post, I will continue with the basics, but I’ll move on to something a bit different. This is something not a lot of people know, but I consider it’s important. Let’s see how SQL Server processes the queries we send it.
PHASES OF LOGICAL QUERY PROCESSING
What exactly is logical query processing? It’s the way SQL Server executes our queries. One would think that the way we write our queries is the same way it is executed by the engine, well is not that way.
Typically, you would write your query like this:
SELECT TOP 5 Col1, COUNT(*) FROM dbo.MyTable GROUP BY Col1 HAVING COUNT(*) > 1 ORDER BY COUNT(*) DESC;
Which you’d read: Bring me the following columns from MyTable grouped by Col1, filter the groups where there are more than one record and order my data by the amount of records for each group in a descending way.
SQL Server would read the query this way:
1 - FROM <Table> 1.1 - JOIN <join type> 1.2 - APPLY 1.3 - PIVOT 1.4 - UNPIVOT 2 - WHERE 3 - GROUP BY 4 - HAVING 5 - SELECT Col1, Col2 ... ColN 5.1 - DISTINCT 6 - ORDER BY 7 - OFFSET - FETCH, TOP
Using the same query as above: From MyTable, grouped by Col1, filter the groups where there are more than one record, bring back Col1 and the number of records per each group created, and lastly order my data by the number or records in a descending way.
Every phase described creates a virtual table and a temporary result set that SQL Server will use in subsequent phases. To make it more clear, let me show you an example.
SELECT CustomerID, COUNT(OrderID) AS OrderCount, YEAR(OrderDate) AS OrderYear FROM Orders AS o WHERE OrderYear = 2018 GROUP BY CustomerID, YEAR(OrderDate) ORDER BY CustomerID Msg 207, Level 16, State 1, Line 11 Invalid column name 'OrderYear'.
If we try to execute this query, we will get an error message saying that the column name OrderYear does not exists. Do you know why? Well, if you scroll up a bit, you will see that the WHERE
clause goes before than the SELECT
and its list of columns, so SQL Server hasn’t created that column yet.
Here’s another query very similar to that one.
SELECT CustomerID, COUNT(OrderID) AS OrderCount, YEAR(OrderDate) AS OrderYear FROM Orders AS o WHERE YEAR(OrderDate) = 2018 GROUP BY CustomerID, YEAR(OrderDate) ORDER BY OrderYear DESC; CustomerID OrderCount OrderYear ALFKI 1 2018 ANATR 2 2018 ANTON 3 2018 AROUT 1 2018 BERGS 7 2018 BLAUS 2 2018 BLONP 5 2018 BONAP 6 2018 BOTTM 4 2018 BSBEV 6 2018
Here, I changed the WHERE
condition and I added the alias OrderYear to the ORDER BY
clause. Since ORDER BY
happens after we define the column list and the result set it’s generated, that’s when the data is presented in an ordered way.
There are more complicated aspects of query processing like when Window Functions, PIVOT
, UNPIVOT
and APPLY
operators are executed, but I’ll leave those examples for another time.
That’s it for now, I hope you liked the post. Don’t forget to like, share and if you want to receive the most recent news in your inbox, subscribe to my news letter using your email in the bar in the right.
Thanks for reading, see you soon. Cheers! 🍻
Discover more from Ben Rodríguez
Subscribe to get the latest posts sent to your email.