SQL Aggregate Functions
Learn to summarize data using GROUP BY, SUM, AVG, and COUNT.
🔢 Topic 1: Aggregate Functions
**Aggregate functions** perform a calculation on a set of rows and return a single summary value. They summarize data vertically across an entire column or a specific group.
📈 The 5 Core Aggregates:
COUNT(): Returns the number of rows/values.SUM(): Returns the total sum of a numeric column.AVG(): Returns the average value of a numeric column.MIN(): Returns the smallest value.MAX(): Returns the largest value.
COUNT Variations (Important Distinction):
COUNT(*): Counts **all rows**, including those with `NULL` values.COUNT(ColumnName): Counts **only non-NULL** values in that specific column.COUNT(DISTINCT ColumnName): Counts the number of **unique, non-NULL** values in that column.
💻 Example: COUNT vs COUNT DISTINCT
-- Total number of employees (all rows)SELECT COUNT(*) FROM Employees;
-- Total number of unique departmentsSELECT COUNT(DISTINCT Department) FROM Employees;
👥 Topic 2: The GROUP BY Clause
The **GROUP BY** clause is used to group identical data from one or more columns into summary rows. Instead of getting one total salary for the company, you can get the total salary **for each department**.
⚠️ The Golden Rule of Grouping:
Any non-aggregated column in the SELECT list (a column that is not wrapped in SUM(), AVG(), etc.) **MUST** appear in the GROUP BY clause.
💻 Example: Sum by Department
-- Get the average salary and employee count for each departmentSELECT Department, AVG(Salary), COUNT(*)FROM EmployeesGROUP BY Department;
⚖️ Topic 3: HAVING vs WHERE
These two clauses both filter data, but their placement and use case are entirely different. This is the **most common topic** for SQL interviews.
1. WHERE Clause:
Filters **individual rows** *before* any grouping or aggregation takes place. **Cannot** use aggregate functions.
2. HAVING Clause:
Filters **grouped results** *after* the grouping and aggregation are complete. **Must** use aggregate functions in the condition.
💻 Example: Filtering Grouped Results
-- Get departments where the total salary is OVER 100000SELECT Department, SUM(Salary)FROM EmployeesGROUP BY DepartmentHAVING SUM(Salary) > 100000;
🧠 Topic 4: SQL Logical Execution Order
The order in which you write the SQL keywords is different from the order in which the database **logically executes** the query. Understanding this flow is essential for debugging and writing efficient SQL.
[Image of SQL execution order flow chart]📝 Execution Steps (The Database's Order):
- FROM: Determines the initial dataset (the source table).
- WHERE: Filters out individual rows (cannot use aggregates).
- GROUP BY: Groups the remaining rows based on the specified columns.
- HAVING: Filters out groups based on aggregate results.
- SELECT: Calculates the final expressions (aggregates, simple columns).
- ORDER BY: Sorts the final result set.
📚 Module Summary
- Aggregate Functions: Summarize vertically (SUM, AVG, COUNT).
- GROUP BY: Splits the data into groups to apply aggregates.
- WHERE: Filters rows BEFORE grouping (no aggregates).
- HAVING: Filters groups AFTER aggregation (must use aggregates).
- Order: FROM -> WHERE -> GROUP BY -> HAVING -> SELECT -> ORDER BY.
🤔 Interview Q&A
Tap on the questions below to reveal the answers.