Skip to content

Advanced SQL Techniques for Analytics: JOIN LATERAL

  • by

Hey there, welcome back! ✌️

If you saw my post on medium, you might already know what this post’s going to be about. In there I mentioned that the ANSI SQL standard doesn’t support the APPLY operator, but it has a similar feature called LATERAL derived table.

Let’s take our candidate_skills table as in our previous example:

CREATE TABLE public.candidate_skills
(
		candidate_id int,
		skill1 text,
		skill2 text,
		skill3 text,
		skill4 text,
		skill5 text,
		skill6 text,
		skill7 text,
		skill8 text,
		skill9 text,
		skill10 text
)

To get familiar with the syntax, let’s try to answer the same questions as in our previous example:

• How many skills each candidate have?

SELECT
    c.candidate_id,
    COUNT(s.skill) AS num_skills
FROM
    public.candidate_skills AS c
    JOIN LATERAL
    (
        SELECT skill1 AS skill UNION ALL
        SELECT skill2 UNION ALL
        SELECT skill3 UNION ALL
        SELECT skill4 UNION ALL
        SELECT skill5 UNION ALL
        SELECT skill6 UNION ALL
        SELECT skill7 UNION ALL
        SELECT skill8 UNION ALL
        SELECT skill9 UNION ALL
        SELECT skill10
    ) s(skill)
    ON TRUE
GROUP BY c.candidate_id
ORDER BY c.candidate_id;

Result:
candidate_id num_skills
------------ -----------
1            9
2            8
3            8
4            7
5            7
6            7
7            7
8            10
9            7
10           8

As you can see, the syntax is very similar, it’s placed right after the FROM clause using the type of JOIN and then the keyword LATERAL and we need to define an alias for the derived table “s” and the name of the column we’re going to return “skill”. The difference with the APPLY operator is that in here, we need to add the ON TRUE which basically means there is no specific join condition between the left table and the right input query or table expression.

Let’s check out another example to illustrate how can we use JOIN LATERAL derived tables with functions.

Suppose that you have this table:

CREATE TABLE public.candidates 
(
    candidate_id   INT,
    first_name     TEXT,
    last_name      TEXT,
    email          TEXT
)


SELECT * FROM public.candidates LIMIT 10;
Results from candidates table
Result from query above.

And you have the following function that creates the Full Name concatenating the first_name and last_name columns:

CREATE OR REPLACE FUNCTION public.get_candidate_fullname(candidate_id INT)
RETURNS TEXT
AS $$
DECLARE
    fullname TEXT;
BEGIN
    SELECT CONCAT(c.first_name, ' ', c.last_name) INTO fullname
    FROM public.candidates as c
    WHERE c.candidate_id = get_candidate_fullname.candidate_id;
    
    RETURN fullname;
END;
$$
LANGUAGE PLPGSQL;

Now, you want to see the candidate id, the full name and all the skill for each candidate. We’re going to achieve this using our tables: candidates, candidate_skills and get_candidate_fullname function.

SELECT
    cs.candidate_id,
    f.fullname,
    cs.skill1, cs.skill2, cs.skill3, cs.skill4, cs.skill5, cs.skill6, cs.skill7, cs.skill8, cs.skill9, cs.skill10
FROM
    public.candidate_skills AS cs 
    JOIN LATERAL public.get_candidate_fullname(cs.candidate_id) AS f(fullname) ON TRUE
ORDER BY cs.candidate_id
LIMIT 10;
Result from query above

Of course there is a better way to do this, but I just wanted to show you how to work with the JOIN LATERAL derived table. Also, if you want to transpose the columns of the skill and display as rows, we can use the two JOIN LATERAL derived tables to do it, like this:

SELECT
    cs.candidate_id,
    f.fullname,
    s.skill
FROM
    public.candidate_skills AS cs 
    JOIN LATERAL public.get_candidate_fullname(cs.candidate_id) AS f(fullname) ON TRUE
    JOIN LATERAL
    (
        SELECT skill1 AS skill UNION ALL
        SELECT skill2 UNION ALL
        SELECT skill3 UNION ALL
        SELECT skill4 UNION ALL
        SELECT skill5 UNION ALL
        SELECT skill6 UNION ALL
        SELECT skill7 UNION ALL
        SELECT skill8 UNION ALL
        SELECT skill9 UNION ALL
        SELECT skill10 
    ) AS s(skill)
    ON TRUE
WHERE cs.candidate_id = 461    
ORDER BY cs.candidate_id;
Result from query above.

Now you know how to use JOIN LATERAL derived tables. For this example I’m using PostgreSQL 16, but I think it’s available since 9.3. I have tested this in PostgreSQL 9.4, 15. I haven’t tested MySQL, but I know it’s been available since 8.1.14. It also works in Snowflake, syntax is a bit different though, but you can easily pick it up following the same syntax we used here.

I hope you liked this post, thank you for stopping by. Cheers! 🍻


Discover more from Ben Rodríguez

Subscribe to get the latest posts sent to your email.

Discover more from Ben Rodríguez

Subscribe now to keep reading and get access to the full archive.

Continue reading