Glad You're Ready. Let's Get Started!

Let us know how we can contact you.

Thank you!

We'll respond shortly.

  • Blog Navigation
Removing subqueries from 'negative joins' in SQL

I recently worked on a project that had Users who were members of Groups,
and we had to figure out how to get a list of all Groups that a User was not a part of. In this post I’ll describe two ways of accomplishing this: one with a subquery, and one without.

Removing subqueries may, in some cases, increase the performance of the query. If you have a subquery in an SQL statement and the query optimizer is telling you that the subquery is expensive, you can probably accomplish the same thing with regular joins and improve things a bit. The only way to tell is to run them on realistic datasets.

The data model

First, create the Users, Groups and Memberships tables, and add just enough data to test the queries, like so (using PostgreSQL syntax):

create table users (id serial, name varchar);
create table groups (id serial, name varchar);
create table memberships (id serial, user_id int, group_id int);
insert into users (name) values ('Dexter'), ('Lumen');
insert into groups (name) values ('Dark Passengers'), ('Vigilantes'), ('Normal People');
insert into memberships (user_id, group_id) values (1,1), (1,2), (2,2);

The users table should look like:

select * from users;

 id |  name
  1 | Dexter
  2 | Lumen

The groups table should look like:

select * from groups;

 id |     name
  1 | Dark Passengers
  2 | Vigilantes
  3 | Normal People

And the memberships table should look like:

from memberships
inner join users on = memberships.user_id
inner join groups on = memberships.group_id;

  name  |      name
 Dexter | Dark Passengers
 Dexter | Vigilantes
 Lumen  | Vigilantes

Notice how Dexter is a member 2 groups, and the ‘Normal People’ group has no members.

Example 1: Using a subquery

Finding the groups that Dexter is not a member of is easy with a subquery, and looks something like this:

select name
from groups
where id not in (
  select group_id from memberships where user_id = 1

 Normal People

As you can see, the dataset returned is correct, and the query is relatively simple to read.

Example 2: Grouping and counting members

NOTE: see tk’s comment below for a simpler version.

Depending on your database setup, it might be more performant to eliminate that subquery. Here’s one way to do it:

from groups
left join memberships on memberships.group_id =
left join memberships filtered_memberships on =
  and filtered_memberships.user_id != 1
group by,
having count(memberships.user_id) = count(filtered_memberships.user_id);

 Normal People

Let’s break that down a bit. The query above is doing this:

  • Getting all groups
  • Counting how many members are in each group
  • Counting how many non-Dexter members are in each group
  • If the two counts are the same, we know that Dexter must not be in that group

Here’s another way of looking at it, without the group by clause:

  memberships.user_id as member_id,
  filtered_memberships.user_id as filtered_member_id
from groups
left join memberships on memberships.group_id =
left join memberships filtered_memberships on =
  and filtered_memberships.user_id != 1;

      name       | member_id | filtered_member_id
 Dark Passengers |         1 |
 Vigilantes      |         1 |
 Vigilantes      |         2 |                  2
 Normal People   |           |

To figure out which groups Dexter is not a part of, count the number of entries in each column, and if they match, it means that Dexter is not a part of that group.

At first this may look odd, because the query joins to memberships twice: once for all members, once for just non-Dexter members.
It’s important that the filter (user_id != 1) be applied to the join clause, and not in a where clause, which would end up improperly removing rows from the result set.

Thanks to Charles LeRose for coming up with the idea of comparing the counts of the memberships.

Can you think of a better way to get the same result set without a subquery? If so, post it in the comments below!

  • tk

    How about

    select from groups left outer join memberships on memberships.group_id = where memberships.group_id is null;

  • Please excuse brevity and typos, i’m on my phone.

    Groups left join members on (groups.group_id = members.group_id and members.user_id = 1) where members.user_id is null

    I think that works because if the left join fails the condition then the user is not a member, but since it’s a left join you’ll get a null on the right side.

    Of course, it’s untested because I don’t have PG on my phone (yet :-) )


  • tk

    Sorry, I meant

    select name from groups
    left outer join memberships on memberships.group_id =
    and memberships.user_id = 1
    where group_id is null;

  • Jeff Dean

    Much more elegant. Thanks!

  • You don’t need it for this query, but remember that in addition to UNION, PostgreSQL also provides INTERSECTION and EXCEPT. I think EXCEPT can be used anytime that, if you had Ruby arrays, you’d want to do array1 – array2.

Share This