SQL's UNION operator is a powerful tool that allows you to combine the results of two or more SELECT statements into a single result set. This versatile operator is essential for data analysts and database administrators who need to merge data from different tables or even different databases. In this comprehensive guide, we'll explore the UNION operator in depth, covering its syntax, use cases, and best practices.

Understanding the UNION Operator

The UNION operator is used to combine the result sets of two or more SELECT statements. It removes duplicate rows from the final result set unless the UNION ALL variant is used. The basic syntax of the UNION operator is as follows:

SELECT column1, column2, ... FROM table1
UNION
SELECT column1, column2, ... FROM table2;

🔑 Key points to remember:

  • The number and order of columns in all SELECT statements must be the same.
  • The data types of corresponding columns should be compatible.
  • By default, UNION removes duplicate rows from the result set.

Let's dive into some practical examples to illustrate how UNION works.

Example 1: Basic UNION Operation

Suppose we have two tables: employees and contractors. We want to create a list of all people working for the company, including both employees and contractors.

Table: employees
| id | name | department |
|—-|——-|————|
| 1 | Alice | Sales |
| 2 | Bob | Marketing |
| 3 | Carol | IT |

Table: contractors
| id | name | project |
|—-|——-|————|
| 1 | David | Website |
| 2 | Eve | Mobile App |
| 3 | Alice | Database |

To combine these lists, we can use the following UNION query:

SELECT id, name, department AS work_area FROM employees
UNION
SELECT id, name, project AS work_area FROM contractors;

Result:
| id | name | work_area |
|—-|——-|————|
| 1 | Alice | Sales |
| 2 | Bob | Marketing |
| 3 | Carol | IT |
| 1 | David | Website |
| 2 | Eve | Mobile App |
| 3 | Alice | Database |

📌 Note: Even though Alice appears in both tables, she's only listed once in the result. This is because UNION removes duplicates by default.

Example 2: UNION ALL

If you want to keep all rows, including duplicates, you can use UNION ALL instead of UNION. Let's modify our previous example:

SELECT id, name, department AS work_area FROM employees
UNION ALL
SELECT id, name, project AS work_area FROM contractors;

Result:
| id | name | work_area |
|—-|——-|————|
| 1 | Alice | Sales |
| 2 | Bob | Marketing |
| 3 | Carol | IT |
| 1 | David | Website |
| 2 | Eve | Mobile App |
| 3 | Alice | Database |

📌 Note: Now Alice appears twice in the result set because UNION ALL doesn't remove duplicates.

Example 3: UNION with WHERE Clause

You can use WHERE clauses in your SELECT statements to filter the data before combining it with UNION. Let's say we want to list all people in the Sales department or working on the Website project:

SELECT id, name, department AS work_area FROM employees
WHERE department = 'Sales'
UNION
SELECT id, name, project AS work_area FROM contractors
WHERE project = 'Website';

Result:
| id | name | work_area |
|—-|——-|———–|
| 1 | Alice | Sales |
| 1 | David | Website |

Example 4: UNION with ORDER BY

When using UNION, you can only have one ORDER BY clause, which must come at the end of the query and applies to the entire result set:

SELECT id, name, department AS work_area FROM employees
UNION
SELECT id, name, project AS work_area FROM contractors
ORDER BY name;

Result:
| id | name | work_area |
|—-|——-|————|
| 1 | Alice | Sales |
| 2 | Bob | Marketing |
| 3 | Carol | IT |
| 1 | David | Website |
| 2 | Eve | Mobile App |

📌 Note: The ORDER BY clause sorts the entire combined result set by name.

Example 5: UNION with Different Data Types

When using UNION, the data types of corresponding columns must be compatible. If they're not exactly the same, SQL will attempt to convert them. Let's look at an example with a products table and a services table:

Table: products
| id | name | price |
|—-|———|——-|
| 1 | Laptop | 1000 |
| 2 | Mouse | 20 |

Table: services
| id | name | hourly_rate |
|—-|—————-|————-|
| 1 | IT Consulting | 150 |
| 2 | Web Design | 100 |

We can combine these using UNION, but we need to ensure the data types match:

SELECT id, name, CAST(price AS DECIMAL(10,2)) AS cost FROM products
UNION
SELECT id, name, hourly_rate AS cost FROM services;

Result:
| id | name | cost |
|—-|—————-|——–|
| 1 | Laptop | 1000.00|
| 2 | Mouse | 20.00 |
| 1 | IT Consulting | 150.00 |
| 2 | Web Design | 100.00 |

📌 Note: We used CAST to ensure the price from the products table matches the data type of hourly_rate from the services table.

Example 6: UNION with Subqueries

UNION can also be used with subqueries. This is particularly useful when you need to combine results from complex queries. Let's say we want to list the highest-paid employee and the highest-paid contractor:

SELECT 'Employee' AS type, name, salary
FROM (
    SELECT name, salary
    FROM employees
    ORDER BY salary DESC
    LIMIT 1
) AS top_employee
UNION
SELECT 'Contractor' AS type, name, rate
FROM (
    SELECT name, rate
    FROM contractors
    ORDER BY rate DESC
    LIMIT 1
) AS top_contractor;

Result:
| type | name | salary |
|————|——-|——–|
| Employee | Alice | 75000 |
| Contractor | David | 100 |

📌 Note: This query uses subqueries to find the top-paid individual in each category before combining the results with UNION.

Best Practices and Performance Considerations

When using the UNION operator, keep these best practices in mind:

  1. 🚀 Use UNION ALL instead of UNION when you don't need to remove duplicates. UNION ALL is faster because it doesn't need to perform the extra step of removing duplicates.

  2. 📊 Ensure that the number and order of columns in all SELECT statements are the same to avoid errors.

  3. 🔍 Use appropriate WHERE clauses in each SELECT statement to filter data before the UNION operation. This can improve performance by reducing the amount of data that needs to be processed.

  4. 🔢 Be mindful of data types. Ensure that corresponding columns have compatible data types, or use CAST to convert them explicitly.

  5. 📈 When using ORDER BY with UNION, remember that it applies to the entire result set and must come at the end of the query.

  6. 🧠 Consider using UNION in combination with other SQL features like subqueries, JOIN operations, or aggregate functions to create more complex and powerful queries.

Conclusion

The SQL UNION operator is a versatile tool for combining result sets from multiple queries. Whether you're merging data from different tables, databases, or even completely different data sources, UNION provides a straightforward way to create unified result sets. By understanding its syntax, variants like UNION ALL, and best practices for its use, you can leverage UNION to solve complex data combination challenges efficiently.

Remember, the key to mastering UNION is practice. Try creating your own examples, experiment with different table structures, and combine UNION with other SQL features to fully grasp its potential. Happy querying!