35. Best Practices & Database Design -- Building Robust Systems
Level: Advanced SQL Server for real projects
- Normalization (1NF, 2NF, 3NF)
- Denormalization (performance vs. consistency)
- Naming conventions (tables, columns, indexes)
- Coding standards (formatting, comments)
- Documentation (schema, relationships)
- Schema versioning (migrations)
- Common design mistakes
- Best practices
Bad design = slow, hard-to-maintain system. Good design = scalable, reliable. This article teaches building robust schemas.
Why Design Matters for Backend Developers
Design impacts everything:
Bad design:
- Slow queries (missing relationships)
- Redundant data (inconsistency)
- Hard to extend (tightly coupled)
Good design:
- Fast queries (proper relationships)
- Single source of truth (consistency)
- Easy to extend (flexible schema)
Design decisions compound.
Normalization Levels
1NF (First Normal Form): No repeating groups.
BAD: Student table with Exams = [Exam1, Exam2, Exam3] (array in column)
GOOD: Separate ExamResult table
2NF (Second Normal Form): No partial dependencies.
BAD: (StudentId, ClassId) -> ClassName (depends on ClassId only)
GOOD: Separate Class table
3NF (Third Normal Form): No transitive dependencies.
BAD: (StudentId) -> ClassName -> TeacherId (transitive chain)
GOOD: Student -> Class, Class -> Teacher
Example 1: SMS Schema (Normalized)
Properly normalized design:
-- 1NF: No repeating groups
CREATE TABLE Student (
StudentId INT PRIMARY KEY,
Name NVARCHAR(100),
ClassName NVARCHAR(10), -- Points to Class table (2NF)
DateOfBirth DATE
);
-- 2NF: Separate Class table
CREATE TABLE Class (
ClassId INT PRIMARY KEY,
ClassName NVARCHAR(10),
TeacherId INT FOREIGN KEY REFERENCES Teacher(TeacherId) -- (3NF)
);
-- 3NF: Teacher separate
CREATE TABLE Teacher (
TeacherId INT PRIMARY KEY,
Name NVARCHAR(100),
Qualification NVARCHAR(100)
);
-- Exam results as separate table (1NF)
CREATE TABLE ExamResult (
ExamResultId INT PRIMARY KEY,
StudentId INT FOREIGN KEY REFERENCES Student(StudentId),
ExamId INT FOREIGN KEY REFERENCES Exam(ExamId),
MarksObtained DECIMAL(5,2)
);
Each fact in one place. No redundancy.
Example 2: Denormalization (Performance)
When to denormalize:
-- Normalized: Slow (JOIN required)
SELECT COUNT(*) FROM Student WHERE ClassName = '10-A';
-- Requires: Student -> Class (if ClassName in Class table)
-- Denormalized: Fast (no JOIN)
CREATE TABLE Class (
ClassId INT PRIMARY KEY,
ClassName NVARCHAR(10),
StudentCount INT, -- DENORMALIZED (stored, updated by trigger)
TeacherId INT
);
SELECT StudentCount FROM Class WHERE ClassName = '10-A';
-- No JOIN, instant
-- Trigger maintains count
CREATE TRIGGER trg_Student_UpdateCount
ON Student AFTER INSERT, DELETE
AS
BEGIN
UPDATE Class SET StudentCount = (SELECT COUNT(*) FROM Student WHERE ClassName = Class.ClassName);
END;
Denormalize: Store aggregate. Update with trigger. Trade consistency for speed.
Example 3: Naming Conventions
Standard naming prevents confusion:
-- Tables (singular, PascalCase)
CREATE TABLE Student (...);
CREATE TABLE ExamResult (...);
-- Columns (descriptive)
StudentId, Name, DateOfBirth, IsActive
-- Primary keys: TableNameId
StudentId, ExamId, TeacherId
-- Foreign keys: ReferencedTableNameId
StudentId (in ExamResult), TeacherId (in Class)
-- Indexes: IX_TableName_Columns
CREATE INDEX IX_Student_ClassName ON Student(ClassName);
CREATE INDEX IX_ExamResult_StudentId ON ExamResult(StudentId);
-- Stored procedures: sp_Purpose
CREATE PROCEDURE sp_InsertExamResult @StudentId INT, ...
-- Views: vw_Purpose
CREATE VIEW vw_StudentSummary AS ...
-- Functions: fn_Purpose
CREATE FUNCTION fn_CalculateAge(@DateOfBirth DATE) ...
-- Triggers: trg_TableName_Action
CREATE TRIGGER trg_Student_Insert ON Student AFTER INSERT
Consistency. Anyone reading code understands structure.
Example 4: Documentation
Schema documentation:
-- Table: Student
-- Purpose: Store student information (demographic data only)
-- Historical: No soft-delete. Mark IsActive = 0 instead.
-- Growth: ~50K students (5 years). Indexes on ClassName, RollNumber.
CREATE TABLE Student (
StudentId INT PRIMARY KEY IDENTITY(1,1),
-- Surrogate key, auto-increment
Name NVARCHAR(100) NOT NULL,
-- Full name, format: "First Last"
RollNumber NVARCHAR(20) UNIQUE NOT NULL,
-- Unique per class-year, format: "CLASS-YEAR-NUMBER"
ClassName NVARCHAR(10) NOT NULL,
-- References Class.ClassName
DateOfBirth DATE,
-- Birth date. Calculate age: DATEDIFF(YEAR, DOB, GETDATE())
Status NVARCHAR(20) DEFAULT 'Active',
-- Values: Active, Inactive, Graduated, Transferred
CreatedAt DATETIME DEFAULT GETDATE(),
-- Row inserted time (audit)
FOREIGN KEY (ClassName) REFERENCES Class(ClassName)
);
-- Index purposes:
CREATE INDEX IX_Student_ClassName ON Student(ClassName); -- Filter by class (GROUP BY, WHERE)
CREATE INDEX IX_Student_RollNumber ON Student(RollNumber); -- Unique lookup
Document tables, columns, relationships, indexes, growth estimates.
Example 5: Schema Versioning
Track schema changes:
-- Schema version table
CREATE TABLE SchemaVersion (
VersionId INT PRIMARY KEY IDENTITY(1,1),
VersionNumber INT UNIQUE,
AppliedOn DATETIME DEFAULT GETDATE(),
Description NVARCHAR(500),
RolledBackOn DATETIME NULL
);
-- Migration: Add new column
INSERT INTO SchemaVersion (VersionNumber, Description)
VALUES (2, 'Add StudentCount to Class table');
ALTER TABLE Class ADD StudentCount INT DEFAULT 0;
-- Migration: Create trigger
INSERT INTO SchemaVersion (VersionNumber, Description)
VALUES (3, 'Add trigger to maintain StudentCount');
CREATE TRIGGER trg_Student_UpdateCount ON Student ...;
-- View schema history
SELECT * FROM SchemaVersion ORDER BY VersionId;
Version migrations. Rollback if needed.
Example 6: Coding Standards
Consistent formatting:
-- GOOD: Readable, follows standards
SELECT
s.Name AS Student,
s.ClassName,
COUNT(r.ExamResultId) AS ExamsCompleted,
AVG(r.MarksObtained) AS AvgMarks
FROM Student s
LEFT JOIN ExamResult r ON s.StudentId = r.StudentId
WHERE s.Status = 'Active'
GROUP BY s.StudentId, s.Name, s.ClassName
HAVING COUNT(r.ExamResultId) > 0
ORDER BY AvgMarks DESC;
-- BAD: Hard to read
SELECT s.Name AS Student,s.ClassName,COUNT(r.ExamResultId) AS ExamsCompleted,AVG(r.MarksObtained) AS AvgMarks FROM Student s LEFT JOIN ExamResult r ON s.StudentId=r.StudentId WHERE s.Status='Active' GROUP BY s.StudentId,s.Name,s.ClassName HAVING COUNT(r.ExamResultId)>0 ORDER BY AvgMarks DESC;
Standards: Keywords uppercase, line breaks after main clauses, aliases, comments.
Common Design Mistakes
Mistake 1: Over-normalization (too many JOINs)
Wrong:
SELECT StudentName FROM Student s
JOIN Class c ON ...
JOIN Grade g ON ...
JOIN GradeScale gs ON ...
-- 4 tables for StudentName (slow)
Fix: Denormalize if query slow:
ALTER TABLE Student ADD ClassName VARCHAR(10);
-- Now direct: SELECT Name FROM Student (no JOIN)
Mistake 2: No relationship enforcement (foreign keys)
Wrong:
CREATE TABLE ExamResult (
StudentId INT,
ExamId INT
-- No FOREIGN KEY
-- Can insert StudentId = 999 (doesn't exist)
);
Fix: Add constraints:
FOREIGN KEY (StudentId) REFERENCES Student(StudentId),
FOREIGN KEY (ExamId) REFERENCES Exam(ExamId)
-- Prevent orphaned records
Mistake 3: No primary keys
Wrong:
CREATE TABLE Attendance (
StudentId INT,
Date DATE,
IsPresent BIT
-- No PRIMARY KEY
-- Duplicates possible
);
Fix:
PRIMARY KEY (StudentId, Date)
-- Ensures one record per student per day
Mistake 4: Inconsistent naming
Wrong:
CREATE TABLE student (...); -- lowercase
CREATE TABLE EXAM_RESULT (...); -- UPPERCASE
CREATE TABLE Cls (...); -- abbreviated
CREATE INDEX idx_name ON ...; -- inconsistent prefix
Result: Confusion, inconsistent queries.
Fix: Standard naming (PascalCase tables, ix_ prefixes, etc.).
Mistake 5: No documentation
Wrong:
CREATE TABLE T1 (
C1 INT,
C2 NVARCHAR(100),
C3 DATETIME
);
-- What does T1 mean? What are C1, C2, C3?
Result: New developers confused.
Fix: Clear names and comments:
CREATE TABLE Student (
StudentId INT, -- Primary key
Name NVARCHAR(100), -- Full name
CreatedAt DATETIME -- Row inserted time
);
Best Practices for Design
- Normalize to 3NF minimum -- Avoid redundancy
- Denormalize for performance -- If query slow
- Consistent naming -- Same conventions across schema
- Meaningful column names -- Not id1, id2
- Foreign keys required -- Enforce relationships
- Primary keys always -- Prevent duplicates
- Documentation essential -- Comment tables, columns
- Schema versioning -- Track changes, rollback capability
Remove data redundancy:
1NF: No repeating groups
2NF: No partial dependencies
3NF: No transitive dependencies
Result: Each fact stored once. Consistency.
When JOIN is slow:
Normalized:
SELECT COUNT(*) FROM Student WHERE ClassName = '10-A';
-- Slow: Need Class table join
Denormalized:
SELECT StudentCount FROM Class WHERE ClassName = '10-A';
-- Fast: Direct lookup
Denormalize if: Frequent aggregate queries, slow JOINs. Keep consistency with trigger.
Prevent orphaned records:
FOREIGN KEY (StudentId) REFERENCES Student(StudentId)
-- Can't insert exam result for non-existent student
-- Ensures data integrity
Enforces relationships at database level.
Tables: Singular, PascalCase (Student, ExamResult) Columns: Descriptive (StudentId, DateOfBirth) ForeignKeys: ReferencedTableId (StudentId) Indexes: IX_TableName_Columns Procedures: sp_Action Views: vw_Purpose Functions: fn_Purpose
Consistency helps everyone.
Yes, critical:
-- Table: Student
-- Purpose: Core student demographic data
-- Growth: ~50K rows
-- Indexes: ClassName, RollNumber (frequent filters)
Document: Purpose, growth expectations, important indexes.
Saves future developers time.
Version and track:
INSERT INTO SchemaVersion (Version, Description) VALUES (2, 'Add StudentCount');
ALTER TABLE Class ADD StudentCount INT;
-- Rollback: DELETE FROM SchemaVersion WHERE Version = 2; DROP COLUMN StudentCount;
Version every change. Enables rollback.
Use ChatGPT, Claude, or Copilot to go deeper on Database design and best practices in SQL Server. Try these prompts:
"Explain normalization: 1NF, 2NF, 3NF with examples""When should I denormalize? Show StudentCount example""Show naming conventions: tables, indexes, procedures""What should I document in schema?""Quiz me: ask 5 questions about normalization, naming, design patterns"
💡 Tip: After reading this article, paste your own code into AI and ask "What could go wrong here and why?" — fastest way to find edge cases and deepen understanding.
Next Article
36. Advanced Patterns -- Pivot Tables & Recursive Queries ->