SQL for Data Engineering

My full course to help you build production data pipelines with SQL

All video lessons are free on YouTube
Supporter Access unlocks structure, guided practice, and community

 

Unlock Full Access

Pricing

FREE OPTION

Free

Features:

🎥 Full YouTube Course (14+ hours)

🧑‍💻 Two Complete Portfolio Projects

🔗 Links to Required Materials & Resources

📊 Real-World Dataset (2023 to mid-2025)

 

SUPPORTERS

$49

One-time payment — Supporter Access adds:

🧪 170+ Interview-Level SQL Problems

📺 Playlist-Style Lesson Videos

⏳ Progress Tracking

💬 Community Access 

📝 Course Notes

📋 Cheat Sheets

🏆 Certificate of Completion

🎁 Full Real-World Dataset (2023–Present)


Course Outline

0️⃣ Course Intro
⏱️ 19 mins 📚 3 Lessons
Course Intro
⏱️ 6 mins 📦 1 concept
Course Intro
⏱️ 6 mins
What is SQL
⏱️ 6 mins 📦 1 concept
What is SQL
⏱️ 6 mins
Data & Pipeline Intro
⏱️ 6 mins 📦 1 concept
Data & Pipeline Intro
⏱️ 6 mins
1️⃣ SQL Foundations
⏱️ 4 hrs 12 mins 📚 12 Lessons
SQL & Dataset Setup
⏱️ 14 mins 📦 4 concepts
Lesson Intro
⏱️ 1 min
Where are we running SQL
⏱️ 3 mins
Create a MotherDuck Account
⏱️ 2 mins
MotherDuck UI Walkthrough
⏱️ 5 mins
Database Setup
⏱️ 4 mins
Basic Keywords
⏱️ 19 mins 📦 9 concepts
Lesson Intro
⏱️ <1 min
SELECT * / FROM
⏱️ 4 mins
LIMIT
⏱️ 1 min
DISTINCT
⏱️ 2 mins
WHERE
⏱️ 3 mins
IS NULL / IS NOT NULL
⏱️ 2 mins
Commenting Code
⏱️ 2 mins
ORDER BY
⏱️ 2 mins
Order of Commands
⏱️ 2 mins
DuckDB - Friendly Syntax
⏱️ 2 mins
Comparison & Logical Operators
⏱️ 23 mins 📦 5 concepts
Lesson Intro
⏱️ <1 min
Intro to Operators
⏱️ 1 min
Comparison Operators - Pt.1 (=, !=, <, >)
⏱️ 6 mins
Logical Operators (AND, OR, NOT)
⏱️ 6 mins
Comparison Operators - Pt.2 (BETWEEN, IN)
⏱️ 3 mins
Final Example
⏱️ 6 mins
Wildcards & Aliases
⏱️ 11 mins 📦 3 concepts
Lesson Intro
⏱️ <1 min
Wildcards w/ LIKE
⏱️ 4 mins
Alias w/ AS
⏱️ 2 mins
Final Example
⏱️ 4 mins
Arithmetic Operators
⏱️ 12 mins 📦 4 concepts
Lesson Intro
⏱️ <1 min
Arithmetic Operators Intro
⏱️ <1 min
Addition & Subtraction
⏱️ 5 mins
Multiplication & Division
⏱️ 3 mins
Modulus (%)
⏱️ 3 mins
Aggregate Functions
⏱️ 17 mins 📦 9 concepts
Lesson Intro
⏱️ <1 min
Aggregate Function Intro
⏱️ 1 min
COUNT()
⏱️ 2 mins
COUNT(DISTINCT)
⏱️ 1 min
SUM()
⏱️ 1 min
AVG()
⏱️ <1 min
GROUP BY
⏱️ 3 mins
MIN() / MAX()
⏱️ 2 mins
MEDIAN()
⏱️ 3 mins
HAVING
⏱️ 3 mins
Terminal Intro
⏱️ 33 mins 📦 5 concepts
Lesson Intro
⏱️ 1 min
Intro to the Terminal
⏱️ 4 mins
Installing / Opening the Terminal
⏱️ 7 mins
Basic Terminal Commands (pwd, ls, cd)
⏱️ 4 mins
Working with Files & Folders (mkdir, touch, rm)
⏱️ 15 mins
Getting Help
⏱️ 2 mins
Local DuckDB Intro
⏱️ 38 mins 📦 10 concepts
Lesson Intro
⏱️ <1 min
Local DuckDB Intro
⏱️ 2 mins
Install DuckDB (Disclaimer)
⏱️ 4 mins
Install DuckDB - Windows Users ("Easy" Option)
⏱️ 4 mins
Install DuckDB - Windows Users (Pro Option)
⏱️ 2 mins
Install DuckDB - Mac Users (Easy Option)
⏱️ 6 mins
Install DuckDB - Mac Users (Pro Option)
⏱️ 3 mins
Local DuckDB Terminal Intro
⏱️ 2 mins
Local DuckDB Database
⏱️ 6 mins
Local DuckDB UI
⏱️ 4 mins
Local DuckDB Connect to MotherDuck
⏱️ 4 mins
VS Code Intro
⏱️ 23 mins 📦 5 concepts
Lesson Intro
⏱️ 1 min
Why VS Code?
⏱️ 3 mins
VS Code Install & Intro
⏱️ 4 mins
VS Code SQL Setup
⏱️ 5 mins
Setting up DuckDB & MotherDuck
⏱️ 8 mins
Getting Help w/ GitHub Copilot
⏱️ 3 mins
Data Modeling Pt.1
⏱️ 20 mins 📦 3 concepts
Lesson Intro
⏱️ 1 min
Databases, Schemas, & Tables
⏱️ 6 mins
Entity Relationship Diagram (ERD)
⏱️ 7 mins
Database Metadata (information_schema)
⏱️ 7 mins
JOINs
⏱️ 22 mins 📦 6 concepts
Lesson Intro
⏱️ <1 min
What are JOINs?
⏱️ 1 min
LEFT JOIN
⏱️ 8 mins
RIGHT JOIN
⏱️ 2 mins
INNER JOIN
⏱️ 2 mins
FULL OUTER JOIN
⏱️ 2 mins
Final Example
⏱️ 7 mins
Order of Execution
⏱️ 19 mins 📦 5 concepts
Lesson Intro
⏱️ <1 min
Query Processing 101
⏱️ 2 mins
SQL Clause Order
⏱️ 1 min
Order of Execution
⏱️ 2 mins
Final Example Pt.1 - Query Order Execution
⏱️ 7 mins
Final Example Pt.2 - Execution w/ EXPLAIN
⏱️ 8 mins
📊 SQL Exploratory Data Analysis — Project 1
⏱️ 1 hr 40 mins 📚 7 Lessons
Project #1 Intro
⏱️ 8 mins 📦 3 concepts
Lesson Intro
⏱️ 1 min
Background: Data Warehouse
⏱️ 2 mins
Project #1 Goal
⏱️ 2 mins
Project #1 Scope
⏱️ 3 mins
EDA #1 - In-Demand Skills
⏱️ 9 mins 📦 1 concept
EDA #1 - In-Demand Skills
⏱️ 9 mins
EDA #2 - Highest Paying Skills
⏱️ 9 mins 📦 1 concept
EDA #2 - Highest Paying Skills
⏱️ 9 mins
EDA #3 - Most Optimal Skills
⏱️ 15 mins 📦 1 concept
EDA #3 - Most Optimal Skills
⏱️ 15 mins
README.md Build
⏱️ 16 mins 📦 3 concepts
Lesson Intro
⏱️ <1 min
README.md Intro
⏱️ 2 mins
Markdown Basics
⏱️ 7 mins
README.md Build
⏱️ 7 mins
Git & GitHub Pt.1
⏱️ 36 mins 📦 6 concepts
Lesson Intro
⏱️ 1 min
Git vs. GitHub
⏱️ 5 mins
Homebrew & Git Install (Mac Users Only)
⏱️ 3 mins
Git Setup (git config)
⏱️ 2 mins
Create Local Repository (git init, add, commit)
⏱️ 13 mins
GitHub Setup w/ Remote Repository
⏱️ 2 mins
Push & Pull Repo w/ GitHub (git push, git pull)
⏱️ 10 mins
Share Project #1
⏱️ 7 mins 📦 2 concepts
Lesson Intro
⏱️ <1 min
GitHub Final Push - Add README.md
⏱️ 3 mins
LinkedIn - Project Share & Post
⏱️ 3 mins
2️⃣ Production SQL
⏱️ 6 hrs 22 mins 📚 13 Lessons
Data Types
⏱️ 17 mins 📦 4 concepts
Lesson Intro
⏱️ 1 min
Data Types Intro
⏱️ 2 mins
Common Data Types
⏱️ 4 mins
Check Column Data Type
⏱️ 2 mins
CAST Operator
⏱️ 8 mins
DDL & DML Pt.1
⏱️ 38 mins 📦 8 concepts
Lesson Intro
⏱️ 2 mins
DDL vs. DML Intro
⏱️ 4 mins
CREATE / DROP DATABASE
⏱️ 4 mins
CREATE / DROP SCHEMA
⏱️ 4 mins
CREATE / DROP TABLE
⏱️ 9 mins
INSERT INTO
⏱️ 5 mins
ALTER TABLE - ADD / DROP COLUMN
⏱️ 2 mins
UPDATE
⏱️ 2 mins
ALTER TABLE - RENAME TABLE & RENAME/ALTER COLUMN
⏱️ 6 mins
DDL & DML Pt.2
⏱️ 25 mins 📦 6 concepts
Lesson Intro
⏱️ 1 min
DDL & DML - Refresher
⏱️ 3 mins
CTAS - CREATE TABLE AS SELECT
⏱️ 5 mins
CREATE VIEW
⏱️ 5 mins
CREATE TEMP TABLE
⏱️ 4 mins
DELETE
⏱️ 3 mins
TRUNCATE
⏱️ 4 mins
Subqueries and CTEs
⏱️ 36 mins 📦 4 concepts
Lesson Intro
⏱️ 1 min
What are Subqueries & CTEs?
⏱️ 4 mins
Subquery
⏱️ 10 mins
CTEs - Common Table Expressions
⏱️ 9 mins
Final Example - Existence Filtering w/ EXISTS
⏱️ 11 mins
DDL & DML Pt.3
⏱️ 39 mins 📦 6 concepts
Lesson Intro
⏱️ 1 min
Batch vs. Continuous Processing
⏱️ 5 mins
priority_roles - Table Load
⏱️ 3 mins
priority_jobs_snapshot - Initial Load
⏱️ 7 mins
UPDATE / INSERT / DELETE (Refresher)
⏱️ 14 mins
MERGE INTO
⏱️ 7 mins
CTAS vs. MERGE
⏱️ 3 mins
Data Modeling Pt.2
⏱️ 23 mins 📦 6 concepts
Lesson Intro
⏱️ 1 min
Data Modeling - Refresher
⏱️ 1 min
Why Data Modeling Matters?
⏱️ 3 mins
Source Systems to Analytical Systems
⏱️ 3 mins
Choosing a Database: OLTP vs OLAP
⏱️ 3 mins
Core Design Patterns
⏱️ 8 mins
SCDs - Slowly Changing Dimensions
⏱️ 4 mins
CASE Expressions
⏱️ 21 mins 📦 3 concepts
Lesson Intro
⏱️ <1 min
CASE Expressions
⏱️ 2 mins
CASE: Engineering Use Cases
⏱️ 12 mins
Final Example
⏱️ 6 mins
Date Functions
⏱️ 22 mins 📦 4 concepts
Lesson Intro
⏱️ 1 min
Intro to Dates
⏱️ 3 mins
EXTRACT()
⏱️ 5 mins
DATE_TRUNC()
⏱️ 5 mins
AT TIME ZONE
⏱️ 9 mins
SET Operators
⏱️ 17 mins 📦 2 concepts
Lesson Intro
⏱️ 1 min
UNION / INTERSECT / EXCEPT
⏱️ 6 mins
Final Example
⏱️ 10 mins
Text & NULL Functions
⏱️ 18 mins 📦 4 concepts
Lesson Intro
⏱️ <1 min
Text Functions - REPLACE / CONCAT
⏱️ 7 mins
Final Example - Text Functions
⏱️ 3 mins
NULL Functions - NULLIF / COALESCE
⏱️ 6 mins
Final Example - NULL Functions
⏱️ 2 mins
Window Functions
⏱️ 34 mins 📦 8 concepts
Lesson Intro
⏱️ 1 min
What are Window Functions?
⏱️ 3 mins
Window Function Syntax
⏱️ 3 mins
PARTITION BY
⏱️ 5 mins
ORDER BY
⏱️ 3 mins
PARTITION & ORDER BY
⏱️ 6 mins
Aggregation Functions
⏱️ 3 mins
Row & Rank Functions
⏱️ 5 mins
Navigation Functions
⏱️ 6 mins
Nested Functions
⏱️ 50 mins 📦 8 concepts
Lesson Intro
⏱️ 1 min
Intro to Nested Data Structures
⏱️ 5 mins
Arrays
⏱️ 7 mins
Structs
⏱️ 6 mins
Array of Structs
⏱️ 4 mins
Maps
⏱️ 6 mins
JSON - JavaScript Object Notation
⏱️ 6 mins
Final Example - Arrays
⏱️ 9 mins
Final Example - Array of Structs
⏱️ 6 mins
Git & GitHub Pt.2
⏱️ 42 mins 📦 7 concepts
Lesson Intro
⏱️ 1 min
What is a Branch?
⏱️ 4 mins
Managing Branches (git branch, git switch)
⏱️ 5 mins
Making Changes on a Branch
⏱️ 4 mins
Merging Branches: Fast Forward Merge
⏱️ 6 mins
Merging Branches: Three-Way Merge
⏱️ 10 mins
Pull Requests (PRs)
⏱️ 3 mins
.gitignore File
⏱️ 9 mins
🏗️ End-to-End Data Pipeline — Project 2
⏱️ 2 hrs 12 mins 📚 7 Lessons
Project #2 Intro
⏱️ 11 mins 📦 3 concepts
Lesson Intro
⏱️ 1 min
Data Warehouse vs. Data Mart - Recap
⏱️ 2 mins
Project #2 Goals
⏱️ 2 mins
Project #2 Scope
⏱️ 5 mins
Build Data Warehouse
⏱️ 38 mins 📦 5 concepts
Lesson Intro
⏱️ <1 min
Project #2 Git Workflow
⏱️ 2 mins
Create Star Schema Tables
⏱️ 11 mins
Load Data into Data Warehouse
⏱️ 16 mins
Data Validation
⏱️ 5 mins
Merge Feature Branch to Development
⏱️ 2 mins
Build Flat Table Mart
⏱️ 19 mins 📦 4 concepts
Lesson Intro
⏱️ 1 min
Why Build a Flat Table Mart?
⏱️ 2 mins
Build Flat Table Mart
⏱️ 11 mins
Data Validation
⏱️ 3 mins
Commit & Merge
⏱️ 2 mins
Build Skills Mart
⏱️ 28 mins 📦 4 concepts
Lesson Intro
⏱️ 1 min
Why Build a Skill Demand Mart?
⏱️ 3 mins
Building Skill Demand Mart
⏱️ 19 mins
Data Validation
⏱️ 2 mins
Update Master Build Script & Commit/Merge
⏱️ 2 mins
Build Priority Mart
⏱️ 22 mins 📦 6 concepts
Lesson Intro
⏱️ 1 min
Why Build This Priority Mart?
⏱️ 1 min
Create the Priority Mart
⏱️ 7 mins
Incremental Updates to Mart
⏱️ 6 mins
Update Master Build Script & Commit/Merge
⏱️ 2 mins
MotherDuck Deployment of DW & Mart
⏱️ 4 mins
Optional Exercise: Build Company Mart
⏱️ 1 min
README.md Build
⏱️ 11 mins 📦 3 concepts
Lesson Intro
⏱️ 1 min
Build Project #2 README.md
⏱️ 6 mins
Update Main Repo README.md
⏱️ 1 min
Commit & Merge
⏱️ 3 mins
Share Project #2
⏱️ 4 mins 📦 1 concept
Lesson Intro
⏱️ 1 min
LinkedIn Updates
⏱️ 4 mins

Course Resources

💽 Course Dataset — SQL Environment

This is the primary dataset used throughout the entire course. It contains real-world data engineering & analytics job postings (2023 to mid-2025) and is hosted in MotherDuck for instant querying.

🔗 Step 1 — Sign in to MotherDuck

Create your free account 👉 https://lukeb.co/motherduck

💻 Step 2 — Attach Database

Run this SQL inside the MotherDuck editor:

SQL
ATTACH 'md:_share/data_jobs/87603155-cdc7-4c80-85ad-3a6b0d760d93';

 

📊 Project 1 — SQL Exploratory Data Analysis


Explore real-world job data using SQL to uncover in-demand skills, salary trends, and hiring patterns. You’ll practice EDA techniques and build your first portfolio-ready project.

🔗 Project #1 Repo

👉 https://lukeb.co/sql-de-project1

 

 

 🏗️ Project 2 — Data Pipeline: Warehouse + Mart


Build a production-style SQL pipeline — modeling a data warehouse and creating analytical marts. You’ll apply data modeling, transformations, and best practices to deliver a second portfolio project.

🔗 Project #2 Repo

👉 https://lukeb.co/sql-de-project2

Supporter Resources

📝  Practice Problems

 
🧩 170+ Interview-Level Problems: Learn SQL faster with carefully designed exercises ranging from easy to challenging
 
🔍 Detailed Solutions and Results: Every problem comes with a full solution and the expected query results

 

 

📺  Structured Video Lessons

 
🚢 Navigate with Ease: Jump instantly to any lesson or specific topic within the course – no more wasting time scrubbing through hours of video to find what you need
 
🧠 Focused Learning: Master concepts more effectively with dedicated, bite-sized videos for each distinct lesson, allowing for better concentration and easier review

 

 

🗒️ Lesson Notes & Cheat Sheets

 
📖 Structured Lesson Notes: Step-by-step walkthroughs for every topic, helping you follow along with each lesson and understand why queries and pipelines are built the way they are
 
📋 Practical Cheat Sheets: Quick-reference guides for core SQL syntax, transformations, and data engineering concepts you’ll reuse across projects

 

✨ Certificate of Completion


🎖️ Certificate of Completion: Receive a certificate to validate your new skills and enhance your LinkedIn profile

🧑‍💻 Showcase Experience: Share how you used real-world data to help solve a problem for data professionals

 


About the Instructors

Luke Barousse - Course Instructor

 🌎 Real-world Experience with SQL
Spearheaded innovative projects in collaboration with MrBeast's team, integrating popular tools like SQL & Python.
 
💡🤖 Sharing Knowledge about Data & AI 
Guides a community of 600,000+ data nerds in harnessing analytical tools to revolutionize their professional workflows.

🎓 Trusted Course Developer 
Taught 30,000+ learners on DataCamp how to use analytical tools to work more efficiently.
 

 

Kelly Adams - Course Producer

 🕹️ Hands-on Experience with SQL
Driving strategic decisions within the social gaming industry at Golden Hearts Games, using popular tools like Google BigQuery and Looker.
 
📝 LinkedIn Content Creator
Documenting the day-to-day life of a full-time data analyst and teaching SQL to over 40,000 data professionals.
 
📹 Course Producer for Data Analytics Content
Educating an audience of 600,000+ analysts about the latest data analytics tools to improve their professional skill sets.
 

Rikki Singh - Content Developer

 🧑‍💻 Hands-on SQL & Analytics
Works across gaming, entertainment, and marketing, using Redshift and BigQuery to query and model data and building decision-ready dashboards in Looker and Tableau.
 
💼 Director-Level Operator 
Leads analytics initiatives—bringing a “what matters to the business” lens to every lesson and project.

🎬 Course Producer for Data Analytics Content
Builds high-signal practice problems by benchmarking a wide range of learning platforms and question styles, then translating the best patterns into realistic, interview-ready exercises.

100% Satisfaction Guarantee or Your Money Back

 

⏱️ If you don’t feel the course problems and notes help you learn SQL as they have for countless others, I’ll refund your money!

📫 Email me within 30 days of purchase explaining why you are unsatisfied, and I’ll refund the full purchase price as soon as possible.
