SQL Beginner Tips: Joins!

An introduction & syntax review of basic joins in SQL

If you’re new to SQL (or just need a fresher) feel free to check out my previous post about how to build SQL queries (previous post).

Now, let’s talk about joins in SQL. The purpose of this post is to cover the basic foundations on how to join data from different tables in Microsoft SQL Server. For this post, I’ve build two small tables that I can join together. Here’s a quick peek at them:

select * 
from OFFICE_LOCATION
Table 1: 3 records
IDCITYSTATELAT_NLONG_W
13PhoenixAZ33112
44DenverCO40105
66San JoseCA4768

The code to re-build this table if you want it to follow again is here:

/*
CREATE TABLE OFFICE_LOCATION
(ID INTEGER PRIMARY KEY,
CITY CHAR(20),
STATE CHAR(20),
LAT_N INTEGER,
LONG_W INTEGER);

INSERT INTO OFFICE_LOCATION VALUES (13, 'Phoenix', 'AZ', 33, 112);
INSERT INTO OFFICE_LOCATION VALUES (44, 'Denver', 'CO', 40, 105);
INSERT INTO OFFICE_LOCATION VALUES (66, 'San Jose', 'CA', 47, 68);

*/

The second table contains fake employee-location information:

SELECT * 
FROM EMPLOYEE_LOCATION
Table 2: 5 records
EMP_IDEMP_NAMECITYST
123DavidSan FranciscoCA
1344SarahDenverCO
664AnnaPhoenixAZ
664PaulDenverCo
664SeanSan JoesCosta Rica

If you want to follow along directly, here’s the queries to build this table:

/*
CREATE TABLE EMPLOYEE_LOCATION
(EMP_ID INTEGER,
EMP_NAME CHAR(50),
CITY CHAR(20),
ST CHAR (20)
)
INSERT INTO EMPLOYEE_LOCATION VALUES (123, 'David', 'San Francisco', 'CA' );
INSERT INTO EMPLOYEE_LOCATION VALUES (1344, 'Sarah', 'Denver', 'CO');
INSERT INTO EMPLOYEE_LOCATION VALUES (664, 'Anna', 'Phoenix', 'AZ');
INSERT INTO EMPLOYEE_LOCATION VALUES (664, 'Paul', 'Denver', 'CO');
INSERT INTO EMPLOYEE_LOCATION VALUES (664, 'Sean', 'New York', 'NY');
*/

A great cheat sheet to have on hand for SQL joins is this:

This joins image above outlines seven different joins but really they can be broken into four categories: left join, right join, inner join and outer join.

First, let’s focus on the left join which are the two joins outlined in the image on the lefthandside. The only difference between these two joins is that you can use the WHERE statement to filter your output to only return records from one table. In general though, the left join is very similar. The only trick to remember is that the first table in the query (the one listed in the FROM statement) is your “base table”. In other words, all records will be returned from this table regardless of if they match to the second table. Records in the second table (the one listed in the LEFT JOIN statement) will only be returned in the output if it matches to a record in the base table.

The base syntax for a left join (in which we return all fields of both tables):


SELECT A.*, B.*
FROM OFFICE_LOCATION A
LEFT JOIN EMPLOYEE_LOCATION B 
ON A.CITY = B.CITY
Table 3: 4 records
IDCITYSTATELAT_NLONG_WEMP_IDEMP_NAMECITYST
13PhoenixAZ33112664AnnaPhoenixAZ
44DenverCO401051344SarahDenverCO
44DenverCO40105664PaulDenverCo
66San JoseCA4768NANANANA

In our output, you can see that the last record for the city of San Jose CA does not have any data for the final four fields. These fields are from the second table, employee_location, and so this indicates that there is not a city that matches “San Jose” in the second table. If you only want to return the records that match between both tables, you can update the query to be the following:

SELECT A.*, B.*
FROM OFFICE_LOCATION A
LEFT JOIN EMPLOYEE_LOCATION B 
ON A.CITY = B.CITY
WHERE B.CITY IS NOT NULL
Table 4: 3 records
IDCITYSTATELAT_NLONG_WEMP_IDEMP_NAMECITYST
44DenverCO401051344SarahDenverCO
13PhoenixAZ33112664AnnaPhoenixAZ
44DenverCO40105664PaulDenverCo

If you only want to return the records that do not match, you can just remove the “NOT” from that final where clause.

All other types of joins slightly vary from this syntax. A right join is identical to left join but the second table is the main table and the first table listed in the query is matched onto the second.

The inner join function only returns records that match between both tables. In inner joins, it does not matter which table you put first in the query and which you put second. You can also just use the ‘JOIN’ function and SQL defaults to an inner join. Our example above with an inner join would like like the following:

SELECT A.*, B.*
FROM OFFICE_LOCATION A
INNER JOIN EMPLOYEE_LOCATION B 
ON A.CITY = B.CITY
Table 5: 3 records
IDCITYSTATELAT_NLONG_WEMP_IDEMP_NAMECITYST
44DenverCO401051344SarahDenverCO
13PhoenixAZ33112664AnnaPhoenixAZ
44DenverCO40105664PaulDenverCo

You can see that the left join with the where clause is idential to the output from the inner join. That’s because those queries only return the matched records between the two tables.

The final type of join that I rely on regularly is a full join. The full join returns everything from both tables. In our example, it would look like this:

SELECT A.*, B.*
FROM OFFICE_LOCATION A
FULL JOIN EMPLOYEE_LOCATION B 
ON A.CITY = B.CITY
Table 6: 6 records
IDCITYSTATELAT_NLONG_WEMP_IDEMP_NAMECITYST
13PhoenixAZ33112664AnnaPhoenixAZ
44DenverCO401051344SarahDenverCO
44DenverCO40105664PaulDenverCo
66San JoseCA4768NANANANA
NANANANANA123DavidSan FranciscoCA
NANANANANA664SeanSan JoesCosta Rica

The records with no ID value but values in the later fields are from the EMPLOYEE_LOCATION table but do not match onto the first table. You can see here that San Francisco is not in the first table and San Jose is spelled wrong in the second table - hence they don’t match. However, since it is a full join, all the records even the ones without a match are included in the output.

This has been a quick (and hopefully helpful!) guide to basic joins in SQL. I encourage you to keep playing around with joining on multiple fields and utilizing where clauses to filter your outputs. SQL also has super cool between joins that are great to look into if you’re joining tables with dates. Enjoy your joining!

Next
Previous