DSPython Logo DSPython

SQL Subqueries

Master nested queries to solve complex data retrieval problems using WHERE, FROM, and SELECT subqueries.

SQL Advanced 75 min

🧠 Introduction: The Nested Query

A **Subquery** (or inner query) is a query nested inside another SQL query (the outer query). It executes first and passes its result back to the outer query, enabling complex logic that cannot be achieved with a simple WHERE or JOIN clause.

Subqueries are powerful and can be used in the SELECT, FROM, and WHERE clauses. They are often used to find a single value or a set of values needed to complete a main query.

πŸ“ Basic Subquery Syntax (WHERE clause):

SELECT * FROM TableA WHERE ColumnX IN (SELECT ColumnY FROM TableB WHERE Condition);

🎯 Topic 1: Non-Correlated Subqueries

A **Non-Correlated Subquery** is completely independent of the outer query. It runs once and its result is used by the main query. This is the most common and easiest type of subquery.

βœ… Example Use Case:

Find employees whose salary is greater than the **average salary of ALL employees**. The inner query calculates the average once, and the outer query uses that single value.

πŸ’» Example: Above-Average Salary

SELECT FirstName, Salary FROM Employees
WHERE Salary > (SELECT AVG(Salary) FROM Employees);

πŸ”„ Topic 2: Correlated Subqueries

A **Correlated Subquery** depends on the outer query for its values. It executes **once for every row** processed by the outer query. This makes them less efficient than non-correlated subqueries or Joins, but they are necessary for comparing data within groups.

Correlated subqueries typically use the **EXISTS** or **NOT EXISTS** operators to check for the presence of rows.

πŸ’» Example: Employee with Max Salary in Dept

SELECT FirstName, Department FROM Employees E1
WHERE Salary = (
SELECT MAX(Salary) FROM Employees E2
WHERE E1.Department = E2.Department
);

πŸ—‚οΈ Topic 3: Subqueries in WHERE, FROM, and SELECT

Subqueries can be categorized based on where they appear in the main query:

1. WHERE Clause (Row Subqueries):

Returns a single column or multiple rows/columns. Used with comparison operators (=, <, >) or multi-value operators (IN, ANY, ALL, EXISTS).

2. FROM Clause (Derived Tables):

The subquery is treated as an inline view (a temporary table) from which the outer query selects data. **Must be aliased.**

SELECT T.Dept, T.AvgSalary FROM (SELECT Department Dept, AVG(Salary) AvgSalary FROM Employees GROUP BY Department) T;

3. SELECT Clause (Scalar Subqueries):

Returns a **single value** (one column, one row). Used to display aggregated or derived data alongside individual rows.

SELECT FirstName, Salary, (SELECT AVG(Salary) FROM Employees) AS GlobalAvg FROM Employees;

πŸ”‘ Topic 4: Operators for Subqueries

Operators dictate how the main query processes the result of the subquery:

1. IN / NOT IN:

Tests membership in the set of values returned by the subquery. (e.g., `WHERE Department IN (SELECT DeptName FROM Projects)`).

2. ANY / ALL:

Used with a comparison operator: `> ANY` means greater than *at least one* value (equivalent to `> MIN()`). `> ALL` means greater than *every* value (equivalent to `> MAX()`).

3. EXISTS / NOT EXISTS:

Used for Correlated Subqueries. Checks if the subquery returns **any rows**. It does not care about the actual data, only if the condition is met for the outer row.

πŸ“š Module Summary

  • Subquery: A query nested inside another query (Outer Query).
  • Non-Correlated: Runs once; independent of the outer query.
  • Correlated: Runs for *every* row of the outer query; depends on the outer query's data.
  • Types: Use in WHERE (Row/Column values), FROM (Derived Tables), SELECT (Scalar value).

πŸ€” Interview Q&A

Tap on the questions below to reveal the answers.

A **Correlated Subquery** depends on the outer query and executes once for **every row** returned by the outer query. A Non-Correlated Subquery executes only **once**. Because of this repeated execution, Correlated Subqueries are typically slower and should be converted to a JOIN where possible.

You must use an alias when the subquery appears in the **FROM clause** (a Derived Table). The database engine treats the subquery's result set as a temporary table, and all tables (including temporary ones) in the FROM clause require an alias for referencing their columns in the outer query.

The **`IN`** operator checks if a value in the outer query is *equal* to **any** value returned by the subquery's result set (which can contain many values). The **`EXISTS`** operator checks for the *existence* of any rows returned by the subquery, typically a correlated one. `EXISTS` is often faster because the inner query can stop execution as soon as the first matching row is found.

A **Scalar Subquery** is a subquery that returns a single column and a single row (a single value). It can be used anywhere a single expression or column name is valid, most commonly in the SELECT or WHERE clause (with a single-value comparison operator like `=`).

πŸ€–
DSPython AI Assistant βœ–
πŸ‘‹ Hi! I’m your AI assistant. Paste your code here, I will find bugs for you.