A SQL self join is a regular join, but the table is joined with itself. In other words, a self join is a query in which a table is joined with itself to find the relationships among its own rows.
Syntax of SQL SELF JOIN
SELECT column1, column2, column3, ... FROM table1 t1 JOIN table1 t2 ON t1.column = t2.column;
In the above syntax, table1 is joined with itself as t1 and t2. The alias t1 and t2 are used to differentiate between the two instances of the table in the join condition.
Example of SQL SELF JOIN
Consider the following table, “employees”:
EmployeeID | EmployeeName | ManagerID |
---|---|---|
E001 | John | E003 |
E002 | Jane | E003 |
E003 | Bob | NULL |
E004 | Alice | E003 |
Now, if we want to find the employees and their respective managers, we can use the following SQL SELF JOIN statement:
SELECT e1.EmployeeID AS Employee, e1.EmployeeName, e2.EmployeeID AS Manager, e2.EmployeeName FROM employees e1 JOIN employees e2 ON e1.ManagerID = e2.EmployeeID;
The output of this SQL SELF JOIN statement would be:
Employee | EmployeeName | Manager | ManagerName |
---|---|---|---|
E001 | John | E003 | Bob |
E002 | Jane | E003 | Bob |
E004 | Alice | E003 | Bob |
In the above output, each employee is matched with its respective manager based on the ManagerID column. The NULL value in the ManagerID column for Bob indicates that he is the top-most manager in the hierarchy.
Multiple Self Joins
We can also use multiple self joins to join a table with itself multiple times. For example, consider the following table “orders”:
OrderID | CustomerID | OrderDate |
---|---|---|
1 | C001 | 2022-01-01 |
2 | C002 | 2022-01-02 |
3 | C001 | 2022-02-01 |
4 | C002 | 2022-02-02 |
If we want to find the customers who have placed two or more orders, we can use the following SQL SELF JOIN statement:
SELECT t1.CustomerID, t1.OrderID, t1.OrderDate, t2.OrderID, t2.OrderDate FROM orders t1 JOIN orders t2 ON t1.CustomerID = t2.CustomerID AND t1.OrderID <> t2.OrderID;
The output of this SQL SELF JOIN statement would be:
CustomerID | OrderID | OrderDate | OrderID | OrderDate |
---|---|---|---|---|
C001 | 1 | 2022-01-01 | 3 | 2022-02-01 |
C002 | 2 | 2022-01-02 | 4 | 2022-02-02 |
In the above output, each customer who has placed two or more orders is matched with all of their respective orders. The AND t1.OrderID <> t2.OrderID
clause is used to exclude the rows where the same order is matched with itself.
Conclusion
SQL self join is a powerful tool that allows us to join a table with itself to find relationships among its own rows. It is useful in finding hierarchical relationships, duplicates, and other patterns in the data. By using multiple self joins, we can even find complex relationships among the rows of a single table.