Skip to main content

18. Subqueries -- Nested SELECT Statements

Level: Guided beginner

ℹ️ What You'll Learn
  • Subquery syntax -- SELECT within SELECT
  • Subqueries in WHERE clause (IN, EXISTS, comparison)
  • Subqueries in FROM clause (derived tables)
  • Scalar subqueries (single value)
  • Correlated subqueries (reference outer query)
  • Subqueries vs JOINs comparison
  • Performance of subqueries
  • Common subquery mistakes

JOINs connect tables. Subqueries nest queries. Both solve "relate data" problems. This article teaches using SELECT inside SELECT for complex queries.

Why Subqueries Matter for Backend Developers

APIs need filtered related data:

Frontend asks: "Get students with exam marks > class average"
v
Backend needs:
1. Calculate class average (subquery)
2. Get students exceeding that average (outer query)
v
Subquery solution: WHERE MarksObtained > (SELECT AVG(MarksObtained) FROM ExamResult)
v
One query, clear logic

Subqueries solve "query within query" problems that JOINs can't.

Subquery Syntax

SELECT *
FROM Table1
WHERE Column IN (
SELECT Column FROM Table2
WHERE Condition
);

Parts:

  • Outer query: Main SELECT
  • Subquery (inner query): SELECT inside parentheses
  • Subquery executes first, result feeds outer query

Example 1: Subquery in WHERE with IN

Get students in class 10-A who scored > 80:

SELECT
s.Name AS Student,
e.ExamName,
r.MarksObtained
FROM Student s
INNER JOIN ExamResult r ON s.StudentId = r.StudentId
INNER JOIN Exam e ON r.ExamId = e.ExamId
WHERE s.StudentId IN (
SELECT StudentId FROM ExamResult
WHERE MarksObtained > 80
);

Result (only high scorers):

Student | ExamName | MarksObtained
Ravi Kumar | Final | 92
Arjun Reddy | Final | 88

Subquery filters: WHERE MarksObtained > 80 returns student IDs. Outer query gets their details.

Example 2: Subquery with Comparison Operator

Get teacher who teaches more subjects than Dr. Mehta:

SELECT
t.Name AS Teacher,
COUNT(s.SubjectId) AS SubjectCount
FROM Teacher t
LEFT JOIN Subject s ON t.TeacherId = s.TeacherId
GROUP BY t.TeacherId, t.Name
HAVING COUNT(s.SubjectId) > (
SELECT COUNT(SubjectId) FROM Subject
WHERE TeacherId = (SELECT TeacherId FROM Teacher WHERE Name = 'Dr. Mehta')
);

Result (teachers with more subjects than Dr. Mehta):

Teacher | SubjectCount
Mrs. Rao | 3

Nested subqueries: Inner finds Dr. Mehta's ID, middle counts his subjects, outer filters teachers exceeding that.

Example 3: Subquery in FROM Clause (Derived Table)

Get top 3 students by average marks:

SELECT
StudentName,
AvgMarks
FROM (
SELECT
s.Name AS StudentName,
AVG(r.MarksObtained) AS AvgMarks
FROM Student s
INNER JOIN ExamResult r ON s.StudentId = r.StudentId
GROUP BY s.StudentId, s.Name
) AS StudentAverages
WHERE AvgMarks >= 85
ORDER BY AvgMarks DESC;

Result (students averaging 85+):

StudentName | AvgMarks
Ravi Kumar | 89.5
Priya Sharma | 87.2

Subquery in FROM creates temporary result set. Outer query filters it.

Example 4: Scalar Subquery (single value)

Get each student's marks relative to class average:

SELECT
s.Name AS Student,
r.MarksObtained,
(
SELECT AVG(MarksObtained) FROM ExamResult
WHERE ExamId = r.ExamId
) AS ExamAverage,
r.MarksObtained - (
SELECT AVG(MarksObtained) FROM ExamResult
WHERE ExamId = r.ExamId
) AS AboveAverage
FROM Student s
INNER JOIN ExamResult r ON s.StudentId = r.StudentId;

Result (marks vs class average):

Student | MarksObtained | ExamAverage | AboveAverage
Ravi Kumar | 92 | 80 | +12
Priya Sharma| 75 | 80 | -5

Scalar subquery returns single value. Used in SELECT column list.

Example 5: Correlated Subquery

Get students who scored above their subject's average:

SELECT
s.Name AS Student,
sub.Name AS Subject,
r.MarksObtained
FROM Student s
INNER JOIN ExamResult r ON s.StudentId = r.StudentId
INNER JOIN Exam e ON r.ExamId = e.ExamId
INNER JOIN Subject sub ON e.SubjectId = sub.SubjectId
WHERE r.MarksObtained > (
SELECT AVG(er.MarksObtained) FROM ExamResult er
INNER JOIN Exam ex ON er.ExamId = ex.ExamId
WHERE ex.SubjectId = sub.SubjectId
);

Result (above-average in each subject):

Student | Subject | MarksObtained
Ravi Kumar | Maths | 92
Priya Sharma| English | 88

Correlated subquery: References outer query's sub.SubjectId. Executes once per outer row.

Example 6: EXISTS Subquery

Get students with at least one exam result:

SELECT
s.Name AS Student
FROM Student s
WHERE EXISTS (
SELECT 1 FROM ExamResult er
WHERE er.StudentId = s.StudentId
);

Result (students with exam records):

Student
Ravi Kumar
Priya Sharma
Arjun Reddy

EXISTS: Returns TRUE/FALSE. No result set needed. More efficient than IN for large datasets.

Example 7: NOT IN / NOT EXISTS

Get students with NO exam results:

SELECT
s.Name AS Student
FROM Student s
WHERE s.StudentId NOT IN (
SELECT DISTINCT StudentId FROM ExamResult
);

Result (students never took exams):

Student
Kiran Rao
Sneha Patel

NOT IN filters out matched rows. Students not in exam result table.

Subqueries vs JOINs

Subquery:

SELECT * FROM Student s
WHERE s.StudentId IN (
SELECT StudentId FROM ExamResult
WHERE MarksObtained > 80
);

JOIN equivalent:

SELECT DISTINCT s.*
FROM Student s
INNER JOIN ExamResult r ON s.StudentId = r.StudentId
WHERE r.MarksObtained > 80;

When use subquery:

  • Clearer intent (filtering by aggregate)
  • Single pass of data needed
  • Nested logic easier to read

When use JOIN:

  • Performance critical (JOIN optimized)
  • Need columns from both tables
  • Multiple matching rows expected

Common Subquery Mistakes

Mistake 1: Subquery returns multiple rows with single-value operator

Wrong:

SELECT * FROM Student s
WHERE s.StudentId = (
SELECT StudentId FROM ExamResult
WHERE MarksObtained > 80
);
-- ERROR: Subquery returns 3 rows, = expects 1

Result: Runtime error.

Fix: Use IN for multiple rows:

SELECT * FROM Student s
WHERE s.StudentId IN (
SELECT StudentId FROM ExamResult
WHERE MarksObtained > 80
);

Mistake 2: Subquery references table not selected

Wrong:

SELECT * FROM Student s
WHERE s.StudentId IN (
SELECT StudentId FROM Exam
WHERE ExamId = 5
);
-- Wrong: Exam has no StudentId column

Result: Column not found error.

Fix: Use correct table:

SELECT * FROM Student s
WHERE s.StudentId IN (
SELECT StudentId FROM ExamResult
WHERE ExamId = 5
);

Mistake 3: Correlated subquery without reference to outer query

Wrong:

SELECT s.Name FROM Student s
WHERE EXISTS (
SELECT 1 FROM ExamResult
WHERE MarksObtained > 80
);
-- Missing: WHERE er.StudentId = s.StudentId
-- Returns same result regardless of student

Result: Every student returns (meaningless).

Fix: Link outer and inner:

SELECT s.Name FROM Student s
WHERE EXISTS (
SELECT 1 FROM ExamResult er
WHERE er.StudentId = s.StudentId
AND er.MarksObtained > 80
);

Mistake 4: Subquery in SELECT with multiple rows

Wrong:

SELECT
s.Name,
(SELECT MarksObtained FROM ExamResult) AS Marks
FROM Student s;
-- ERROR: Subquery returns multiple rows for scalar context

Result: Error or wrong result.

Fix: Aggregate or filter:

SELECT
s.Name,
(SELECT AVG(MarksObtained) FROM ExamResult WHERE StudentId = s.StudentId) AS AvgMarks
FROM Student s;

Mistake 5: Performance -- too many correlated subqueries

Wrong:

SELECT
s.Name,
(SELECT COUNT(*) FROM ExamResult WHERE StudentId = s.StudentId),
(SELECT AVG(MarksObtained) FROM ExamResult WHERE StudentId = s.StudentId),
(SELECT MAX(MarksObtained) FROM ExamResult WHERE StudentId = s.StudentId)
FROM Student s;
-- Executes subquery 3 times per student (N*3 queries)

Result: Slow on large datasets.

Fix: Use single JOIN with aggregates:

SELECT
s.Name,
COUNT(r.ExamResultId) AS ExamCount,
AVG(r.MarksObtained) AS AvgMarks,
MAX(r.MarksObtained) AS MaxMarks
FROM Student s
LEFT JOIN ExamResult r ON s.StudentId = r.StudentId
GROUP BY s.StudentId, s.Name;

Best Practices for Subqueries

  1. Use IN/EXISTS for filtering -- Clearer than JOIN when filtering by presence
  2. Use JOIN for columns from multiple tables -- More efficient
  3. Correlated subqueries carefully -- Execute once per outer row (slow on large data)
  4. Use derived tables for complex logic -- FROM subqueries organize logic
  5. Test performance -- Compare subquery vs JOIN execution time
  6. Limit nesting depth -- 2-3 levels max (harder to read beyond)
  7. Use EXISTS over IN for subqueries without join needed -- EXISTS short-circuits

🎯 Q1: What's the difference between IN and EXISTS?

IN: List all subquery results, check if outer value in list.

WHERE StudentId IN (SELECT StudentId FROM ExamResult WHERE MarksObtained > 80)

EXISTS: Check if subquery returns ANY row (returns TRUE/FALSE).

WHERE EXISTS (SELECT 1 FROM ExamResult er WHERE er.StudentId = s.StudentId)

Performance: EXISTS often faster (doesn't build full list, short-circuits on first match).

🎯 Q2: Can subqueries be faster than JOINs?

Generally: JOINs faster (SQL optimizes them).

When subqueries better:

  • Filtering by existence (EXISTS faster than INNER JOIN with DISTINCT)
  • Derived tables for complex grouped data
  • Scalar subqueries on indexed columns

Rule: Test both. Profile with SET STATISTICS IO ON;

🎯 Q3: What's a correlated subquery?

Subquery that references outer query's table:

SELECT s.Name FROM Student s
WHERE EXISTS (
SELECT 1 FROM ExamResult er
WHERE er.StudentId = s.StudentId <- References outer s.StudentId
);

Executes once per outer row. For 100 students = 100 subquery executions.

Rule: Avoid for large outer datasets. Use JOIN instead.

🎯 Q4: Can I nest subqueries (subquery in subquery)?

Yes:

SELECT * FROM Student s
WHERE s.StudentId IN (
SELECT StudentId FROM ExamResult
WHERE MarksObtained > (
SELECT AVG(MarksObtained) FROM ExamResult
)
);

But gets hard to read. Keep to 2-3 levels max.

🎯 Q5: What's a derived table?

Subquery in FROM clause treated as temporary table:

SELECT * FROM (
SELECT s.Name, AVG(r.MarksObtained) AS AvgMarks
FROM Student s
LEFT JOIN ExamResult r ON s.StudentId = r.StudentId
GROUP BY s.StudentId, s.Name
) AS StudentStats
WHERE AvgMarks > 85;

Derived table: Subquery result aliased as StudentStats. Outer query treats it like table.

Common in: Multi-level grouping, complex aggregates.

🎯 Q6: Should I use subquery or JOIN to get students with exams?

Subquery (EXISTS):

SELECT s.Name FROM Student s
WHERE EXISTS (SELECT 1 FROM ExamResult WHERE StudentId = s.StudentId);

JOIN:

SELECT DISTINCT s.Name FROM Student s
INNER JOIN ExamResult r ON s.StudentId = r.StudentId;

Better: JOIN (clearer, faster). EXISTS if subquery much simpler.


🤖Use AI to Learn Faster
⚠️ Important for beginners: Do NOT use AI to write your code yet. Type every example yourself. Your brain learns by doing, not by reading AI output. Use AI only to explain and quiz you — not to code for you. Once you have strong fundamentals, AI becomes a powerful productivity tool for repetitive tasks.

Use ChatGPT, Claude, or Copilot to go deeper on Subqueries and nested SELECT in SQL Server. Try these prompts:

  • "Explain subquery in WHERE clause: students scoring > class average"
  • "What's the difference between IN and EXISTS subqueries?"
  • "Show derived table example: top students by average marks"
  • "What's correlated subquery? Show teacher/subject example"
  • "Quiz me: ask 5 questions about subqueries, performance, EXISTS vs IN"

💡 Tip: After reading this article, paste your own code into AI and ask "What could go wrong here and why?" — fastest way to find edge cases and deepen understanding.

Next Article

19. UNION & UNION ALL -- Combining Result Sets ->

nexcoding.in