20. GROUP BY & HAVING -- Grouping and Filtering Aggregates
Level: Guided beginner
- GROUP BY syntax -- organize rows by column
- GROUP BY with aggregate functions (SUM, AVG, COUNT)
- HAVING clause -- filter grouped results
- GROUP BY multiple columns
- GROUP BY with ORDER BY
- Reporting and dashboard queries
- Common GROUP BY mistakes
- Best practices for grouping
Aggregate functions (SUM, AVG, COUNT) combine multiple rows. GROUP BY organizes rows by category before aggregating. HAVING filters groups. This article teaches reporting queries.
Why GROUP BY & HAVING Matter for Backend Developers
APIs need summary data for dashboards:
Frontend asks: "Show average exam marks per class"
v
Backend needs: Average marks grouped by class, only for classes with 10+ students
v
GROUP BY ClassName + AVG(MarksObtained)
HAVING COUNT(*) >= 10
v
Class summary report
GROUP BY & HAVING power reports.
GROUP BY Syntax
SELECT
GroupColumn,
COUNT(*) AS RowCount,
SUM(NumericColumn) AS Total,
AVG(NumericColumn) AS Average
FROM Table
GROUP BY GroupColumn;
Rule: All non-aggregate columns in SELECT must be in GROUP BY.
Example 1: Count Students per Class
Group students by class:
SELECT
s.ClassName,
COUNT(s.StudentId) AS StudentCount
FROM Student s
GROUP BY s.ClassName
ORDER BY s.ClassName;
Result (students per class):
ClassName | StudentCount
10-A | 3
10-B | 2
11-A | 4
GROUP BY ClassName groups all students by class, COUNT per group.
Example 2: Average Exam Marks by Subject
Get average marks per subject:
SELECT
sub.Name AS Subject,
COUNT(r.ExamResultId) AS StudentsTested,
AVG(r.MarksObtained) AS AvgMarks,
MAX(r.MarksObtained) AS HighestMarks,
MIN(r.MarksObtained) AS LowestMarks
FROM Subject sub
LEFT JOIN Exam e ON sub.SubjectId = e.SubjectId
LEFT JOIN ExamResult r ON e.ExamId = r.ExamId
GROUP BY sub.SubjectId, sub.Name
ORDER BY AvgMarks DESC;
Result (subject statistics):
Subject | StudentsTested | AvgMarks | HighestMarks | LowestMarks
Maths | 12 | 82.5 | 98 | 65
Science | 11 | 79.3 | 95 | 58
English | 10 | 85.2 | 99 | 70
Multiple aggregates per group. LEFT JOIN includes subjects with no results (NULL counts).
Example 3: HAVING Clause (filter groups)
Show classes with average marks > 80:
SELECT
s.ClassName,
COUNT(s.StudentId) AS StudentCount,
AVG(r.MarksObtained) AS AvgMarks
FROM Student s
LEFT JOIN ExamResult r ON s.StudentId = r.StudentId
GROUP BY s.StudentId, s.ClassName
HAVING AVG(r.MarksObtained) > 80
ORDER BY AvgMarks DESC;
Result (only high-performing classes):
ClassName | StudentCount | AvgMarks
10-A | 3 | 82.5
11-A | 4 | 85.3
HAVING filters groups AFTER grouping. WHERE filters rows BEFORE grouping.
Example 4: Multiple GROUP BY Columns
Exam results grouped by class and subject:
SELECT
s.ClassName,
sub.Name AS Subject,
COUNT(r.ExamResultId) AS StudentCount,
AVG(r.MarksObtained) AS AvgMarks
FROM Student s
INNER JOIN ExamResult r ON s.StudentId = r.StudentId
INNER JOIN Exam e ON r.ExamId = e.ExamId
INNER JOIN Subject sub ON e.SubjectId = sub.SubjectId
GROUP BY s.ClassName, sub.SubjectId, sub.Name
ORDER BY s.ClassName, sub.Name;
Result (per class per subject):
ClassName | Subject | StudentCount | AvgMarks
10-A | English | 2 | 82.5
10-A | Maths | 3 | 80.2
10-B | Maths | 2 | 79.8
11-A | English | 3 | 85.5
GROUP BY multiple columns creates separate group per combination.
Example 5: Fee Collection Report
Total fees by payment status:
SELECT
f.Status AS PaymentStatus,
COUNT(f.FeeAccountId) AS StudentCount,
SUM(f.TotalFees) AS TotalAmount,
SUM(f.PaidAmount) AS AmountPaid,
SUM(f.TotalFees) - SUM(f.PaidAmount) AS AmountDue
FROM FeeAccount f
GROUP BY f.Status
ORDER BY TotalAmount DESC;
Result (fee summary):
PaymentStatus | StudentCount | TotalAmount | AmountPaid | AmountDue
Paid | 45 | 2250000 | 2250000 | 0
Partial | 12 | 600000 | 300000 | 300000
Pending | 8 | 400000 | 0 | 400000
GROUP BY payment status shows financial summary.
Example 6: Attendance Report per Subject
Student attendance percentage per subject:
SELECT
sub.Name AS Subject,
COUNT(DISTINCT a.StudentId) AS StudentCount,
COUNT(DISTINCT a.Date) AS ClassesHeld,
SUM(CASE WHEN a.IsPresent = 1 THEN 1 ELSE 0 END) AS TotalPresent,
CAST(
SUM(CASE WHEN a.IsPresent = 1 THEN 1 ELSE 0 END) * 100.0 /
COUNT(a.AttendanceId) AS DECIMAL(5,2)
) AS AttendancePercentage
FROM Subject sub
LEFT JOIN Attendance a ON sub.SubjectId = a.SubjectId
GROUP BY sub.SubjectId, sub.Name
HAVING COUNT(a.AttendanceId) > 0
ORDER BY AttendancePercentage DESC;
Result (attendance statistics):
Subject | StudentCount | ClassesHeld | TotalPresent | AttendancePercentage
Maths | 30 | 40 | 1140 | 95.00
English | 28 | 40 | 1064 | 94.86
Complex formula: attendance percentage per subject.
Example 7: Teachers with Multiple Subjects
Teachers and subject count:
SELECT
t.Name AS Teacher,
COUNT(sub.SubjectId) AS SubjectCount
FROM Teacher t
LEFT JOIN Subject sub ON t.TeacherId = sub.TeacherId
GROUP BY t.TeacherId, t.Name
HAVING COUNT(sub.SubjectId) >= 2
ORDER BY SubjectCount DESC;
Result (teachers with 2+ subjects):
Teacher | SubjectCount
Dr. Mehta | 3
Mrs. Rao | 2
HAVING filters to teachers with at least 2 subjects.
GROUP BY with WHERE and HAVING
WHERE filters rows BEFORE grouping. HAVING filters groups AFTER grouping.
SELECT
s.ClassName,
AVG(r.MarksObtained) AS AvgMarks
FROM Student s
INNER JOIN ExamResult r ON s.StudentId = r.StudentId
WHERE r.MarksObtained >= 60 -- Filter: only pass marks
GROUP BY s.ClassName
HAVING AVG(r.MarksObtained) > 75 -- Filter: class average > 75
ORDER BY AvgMarks DESC;
WHERE: Exclude fail marks. HAVING: Only show classes averaging > 75.
Common GROUP BY Mistakes
Mistake 1: Non-aggregate column not in GROUP BY
Wrong:
SELECT
s.ClassName,
s.Name, -- Name not in GROUP BY!
COUNT(r.ExamResultId) AS ExamCount
FROM Student s
LEFT JOIN ExamResult r ON s.StudentId = r.StudentId
GROUP BY s.ClassName;
-- ERROR: Cannot select non-aggregate column (Name) not in GROUP BY
Result: SQL error.
Fix: Add to GROUP BY or aggregate:
SELECT
s.ClassName,
COUNT(r.ExamResultId) AS ExamCount
FROM Student s
LEFT JOIN ExamResult r ON s.StudentId = r.StudentId
GROUP BY s.ClassName;
Mistake 2: Using WHERE instead of HAVING
Wrong:
SELECT
s.ClassName,
AVG(r.MarksObtained) AS AvgMarks
FROM Student s
LEFT JOIN ExamResult r ON s.StudentId = r.StudentId
GROUP BY s.ClassName
WHERE AVG(r.MarksObtained) > 80;
-- ERROR: Cannot use aggregate in WHERE
Result: Invalid syntax error.
Fix: Use HAVING:
SELECT
s.ClassName,
AVG(r.MarksObtained) AS AvgMarks
FROM Student s
LEFT JOIN ExamResult r ON s.StudentId = r.StudentId
GROUP BY s.ClassName
HAVING AVG(r.MarksObtained) > 80;
Mistake 3: GROUP BY with NULL causes unexpected group
Wrong:
SELECT
sub.Name AS Subject,
COUNT(r.ExamResultId) AS ExamCount
FROM Subject sub
LEFT JOIN ExamResult r ON sub.SubjectId = r.SubjectId
GROUP BY sub.SubjectId, sub.Name;
-- Subject with no exam results: NULL group created
Result includes NULL group (subjects never tested).
Fix: Filter if needed:
SELECT
sub.Name AS Subject,
COUNT(r.ExamResultId) AS ExamCount
FROM Subject sub
LEFT JOIN ExamResult r ON sub.SubjectId = r.SubjectId
GROUP BY sub.SubjectId, sub.Name
HAVING COUNT(r.ExamResultId) > 0;
-- Exclude untested subjects
Mistake 4: Wrong aggregate function
Wrong:
SELECT
s.ClassName,
MAX(s.Name) AS StudentName -- MAX on VARCHAR meaningless
FROM Student s
GROUP BY s.ClassName;
Result: Max student name alphabetically (nonsense).
Fix: Use correct aggregate:
SELECT
s.ClassName,
COUNT(*) AS StudentCount
FROM Student s
GROUP BY s.ClassName;
Mistake 5: GROUP BY with ORDER BY wrong column
Wrong:
SELECT
s.ClassName,
AVG(r.MarksObtained) AS AvgMarks
FROM Student s
LEFT JOIN ExamResult r ON s.StudentId = r.StudentId
GROUP BY s.ClassName
ORDER BY s.Name; -- ORDER BY: s.Name not in GROUP BY
-- ERROR
Fix: ORDER BY grouped or aggregate column:
SELECT
s.ClassName,
AVG(r.MarksObtained) AS AvgMarks
FROM Student s
LEFT JOIN ExamResult r ON s.StudentId = r.StudentId
GROUP BY s.ClassName
ORDER BY AvgMarks DESC;
Best Practices for GROUP BY & HAVING
- All non-aggregate columns in GROUP BY -- Strict rule
- Use HAVING for group filters -- Not WHERE
- Use WHERE for row pre-filters -- Before grouping
- Multiple GROUP BY for detail -- Drill-down reports
- Test NULL handling -- LEFT JOINs with GROUP BY create NULL groups
- Order by aggregate -- For top-N reports
- Use derived table for complex logic -- If multiple GROUP BY levels needed
WHERE: Filters BEFORE grouping (individual rows).
FROM Student s
WHERE s.ClassName = '10-A' -- Filter: only 10-A students
HAVING: Filters AFTER grouping (groups).
GROUP BY s.ClassName
HAVING COUNT(*) > 5 -- Filter: classes with 5+ students
WHERE: Row filter. HAVING: Group filter.
Each group must have single value per column:
SELECT s.ClassName, COUNT(*) FROM Student s GROUP BY s.ClassName;
-- ClassName has 1 value per group (all 10-A students)
-- COUNT(*) is aggregate (1 value per group)
-- Valid
If Name not in GROUP BY:
SELECT s.ClassName, s.Name FROM Student s GROUP BY s.ClassName;
-- ClassName: 1 value (10-A)
-- Name: multiple values (Ravi, Priya, Arjun) -- which one to show?
-- ERROR
SQL requires deterministic result: 1 row per group, 1 column value per row.
Yes, some databases allow it (not SQL Server standard):
SELECT ClassName, COUNT(*) FROM Student
GROUP BY 1; -- Group by first column (ClassName)
Better: Explicit column name:
SELECT ClassName, COUNT(*) FROM Student
GROUP BY ClassName;
More readable. Always use explicit column names.
NULL values grouped together:
SELECT FeeStatus, COUNT(*) FROM FeeAccount
GROUP BY FeeStatus;
Result:
FeeStatus | COUNT
NULL | 5 <- NULL is a group
Paid | 45
Pending | 8
NULL = NULL, so all NULL rows in one group.
To handle: Use COALESCE:
SELECT COALESCE(FeeStatus, 'Unknown') AS Status, COUNT(*)
FROM FeeAccount
GROUP BY FeeStatus;
Use ORDER BY and TOP:
SELECT TOP 5
s.ClassName,
COUNT(*) AS StudentCount
FROM Student s
GROUP BY s.ClassName
ORDER BY StudentCount DESC;
Result: Top 5 classes by student count.
With OFFSET/FETCH:
SELECT
s.ClassName,
COUNT(*) AS StudentCount
FROM Student s
GROUP BY s.ClassName
ORDER BY StudentCount DESC
OFFSET 0 ROWS FETCH NEXT 5 ROWS ONLY;
Same result, more explicit.
DISTINCT: Get unique values only (no aggregates).
SELECT DISTINCT ClassName FROM Student;
-- Unique class names
GROUP BY: Aggregates per group.
SELECT ClassName, COUNT(*) FROM Student
GROUP BY ClassName;
-- Count per class
Use DISTINCT for unique values. GROUP BY for aggregates. DISTINCT simpler if no aggregate needed.
Use ChatGPT, Claude, or Copilot to go deeper on GROUP BY and HAVING for reporting in SQL Server. Try these prompts:
"Explain GROUP BY: average exam marks per class""What's the difference between WHERE and HAVING?""Show multiple GROUP BY columns: class and subject statistics""How do I use GROUP BY to create a fee collection report?""Quiz me: ask 5 questions about GROUP BY, HAVING, aggregates"
💡 Tip: After reading this article, paste your own code into AI and ask "What could go wrong here and why?" — fastest way to find edge cases and deepen understanding.