Clustering and Dimensionality Reduction
Find out what you'll learn throughout the course (if the video does not show, try allowing cookies in your browser).
What you'll learn
👉 K-means based clustering (k-means, k-modes, k-prototypes).
👉 Agglomerative clustering and linkage methods (including min, max, average, and Ward).
👉 Density based clustering (DBSCAN, HDBSCAN).
👉 Graph based clustering (Louvain algorithm).
👉 Clustering quality metrics, including DBCV, silhouette scores and graph clustering metrics.
👉 Clustering numerical, categorical and graph data.
👉 PCA dimensionality reduction.
👉 UMAP dimensionality reduction.
👉 Evaluating clustering quality with UMAP.
👉 Data preprocessing methods for clustering and dimensionality reduction (including data scaling, handling skewed data, encoding categorical data, and calculating distance metrics).
👉 Data preprocessing methods to create graphs from data (including KNN and SNN approaches).
👉 Python prerequisites.
What you'll get
Lifetime access
Instructor support
Certificate of completion
💬 English subtitles
Instructor
Dalibor Veljkovic
Dalibor is a data scientist and biostatistician with a Master’s degree in signal processing. He has analyzed complex biological data as well as economic data, studying market trends.
At work, he advocates for a balanced approach that combines theoretical learning with practical applications. Find out more about Dalibor on LinkedIn.
Can't afford it? Get in touch.
30-day money-back guarantee
If you're disappointed for whatever reason, you'll get a full refund.
So you can buy with confidence.
Clustering and Dimensionality Reduction Course
Welcome to the definitive course on unsupervised machine learning—designed to go deeper than any other resource online.
While platforms like Udemy and Coursera offer introductory content, this course delivers unmatched depth, combining rigorous theory, hands-on implementation, and real-world case studies you won’t find elsewhere.
Why This Course Stands Apart
✅ No fluff, no shortcuts – We dissect every critical algorithm, from foundational concepts to advanced optimizations.
✅ Beyond the basics – Most courses stop at K-means and PCA; we cover hierarchical clustering, DBSCAN, HDBSCAN, UMAP, and more.
✅ Real-world rigor – Apply techniques to complex datasets (like RNA profiling, geospatial clustering, customer clustering, and actor network clustering) that mirror cutting-edge industry challenges.
✅ Code + theory + intuition – Not just toy examples—you’ll build production-ready solutions.
This isn’t just another overview—it’s the deepest dive into unsupervised learning available online.
Why Unsupervised Learning
Unsupervised learning unlocks hidden patterns and structures in data (a process known as data mining) without relying on pre-labeled examples. This approach isn’t just useful—it’s often essential when labeling data is impractical or impossible.
In this course, we’ll focus on two transformative techniques:
- Cluster analysis – Groups similar data points, revealing underlying patterns.
- Dimensionality reduction – Reduces the number of features to simplify analysis, improve algorithm performance, and uncover meaningful structure.
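As a first taste of both techniques, here is a minimal sketch using scikit-learn on synthetic data (the course builds the algorithms from scratch before turning to the library; the toy dataset and parameter choices below are illustrative, not from the course itself):

```python
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Toy dataset: 300 points in 5 dimensions, drawn around 3 centers.
X, _ = make_blobs(n_samples=300, centers=3, n_features=5, random_state=42)

# Cluster analysis: group similar points without using any labels.
labels = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(X)

# Dimensionality reduction: project the 5 features down to 2,
# e.g. for plotting or as input to further analysis.
X_2d = PCA(n_components=2).fit_transform(X)

print(labels[:10])
print(X_2d.shape)  # (300, 2)
```

Real data is rarely this clean, which is why the course spends so much time on preprocessing, choosing the number of clusters, and validating cluster quality.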
Mastering these methods is key to extracting actionable insights—a must-have skill in data science.
Why It Matters
These techniques power real-world applications across industries:
- Marketing: Customer segmentation and behavior analysis.
- Healthcare: Disease pattern detection and patient profiling.
- Bioinformatics: Genetic data interpretation.
- Social Networks: Community structure analysis.
- Urban Planning: Traffic and infrastructure optimization.
What You’ll Learn
We’ll break down unsupervised learning algorithms—exploring how they work, their strengths, and their limitations. But we won’t stop at theory. You’ll implement them yourself through:
- Hands-on demonstrations
- Targeted case studies (each reinforcing key concepts)
- A capstone project: Clustering cells using RNA profiles—a real-world example of extracting insights from complex data.
By the end, you’ll have the skills to apply these techniques in your own projects. Whether you’re a practicing data scientist or a curious learner, this course will deepen your understanding of machine learning’s unsupervised frontier.
Who is this course for
From zero to hero—no prior expertise required.
- Beginners? We’ve got you. Dedicated Python primers will get you up to speed fast.
- Advanced learners? Skip straight to clustering and dimensionality reduction—then apply your skills immediately.
We designed this course so that even with minimal Python experience, you'll finish with the ability to analyze real data using clustering and dimensionality reduction—while advanced learners can dive straight into practical applications.
Course Curriculum
Look for the videos marked "preview" to sample some of our lessons.
- Chapter agenda (1:14)
- ---- Part 1 - basic python data types ---- (1:15)
- Numerical data types (4:59)
- Boolean data type (8:48)
- String data type (4:41)
- Python lists - part 1 (9:16)
- Python lists - part 2 (11:42)
- Sets and tuples (6:12)
- Dictionaries and "None" (6:29)
- Truthiness (3:34)
- ---- Part 2 - basic python functionalities ---- (0:56)
- Copying in python (shallow & deep copying) (8:24)
- Unpacking iterable data types (4:30)
- Python functions, *args and **kwargs (11:28)
- Python functions - demo (9:22)
- Lambda functions, scopes and decorators (11:31)
- Python classes (8:48)
- Python classes - demo (10:18)
- While loops and loop control statements (5:04)
- Comprehensions in python (12:27)
- Chapter summary (0:42)
- How are we doing? (0:26)
- Chapter agenda (0:49)
- ---- Part 1 - Numpy ---- (3:34)
- Indexing & slicing in numpy (12:22)
- Indexing & slicing in numpy - demo (10:52)
- Operations on single numpy arrays (9:53)
- Operations on single numpy arrays - demo (4:17)
- Operations between numpy arrays & broadcasting (7:12)
- Operations between numpy arrays & broadcasting - demo (2:25)
- Merging numpy arrays (4:27)
- Data types in numpy (3:10)
- Matrix operations in numpy (1:49)
- ---- Part 2 - pandas ---- (4:55)
- Pandas indexing and slicing (9:09)
- Creating data frames (6:28)
- Pandas indexing and slicing - demo (8:54)
- Operations on single data frames/series (12:51)
- Operations on single data frames/series - demo (4:07)
- Operations between data frames/series (9:53)
- Operations between data frames/series - demo (5:15)
- Other useful pandas functionalities (8:25)
- Pandas data types (4:14)
- Pandas data types - demo (2:18)
- Pandas group by statement (4:19)
- Pandas group by statement - demo (4:10)
- ---- Part 3 - Data visualizations ---- (7:18)
- Matplotlib basics (6:53)
- Seaborn basics (4:40)
- Chapter summary (0:48)
- How are we doing? (0:26)
- Chapter agenda (2:32)
- K-means clustering algorithm (11:55)
- Avoiding suboptimal solutions (8:06)
- Demo: Implementing k-means clustering algorithm from scratch - part 1 (13:23)
- Demo: Implementing k-means clustering algorithm from scratch - part 2 (8:13)
- K-means in sklearn (8:00)
- Data preprocessing for K-means (13:09)
- Adjusted rand index (16:09)
- Demo: Data preprocessing, k-means & adjusted rand index (9:16)
- Inferring number of clusters with inertia knee method (12:01)
- Silhouette scores (inferring number of clusters & analyzing cluster quality) (14:31)
- Demo: inertia knee method & silhouette scores - part 1 (13:21)
- Demo: inertia knee method & silhouette scores - part 2 (8:05)
- Chapter summary (1:55)
- How are we doing? (0:26)
- Chapter agenda (1:32)
- Feature coordinate systems (3:45)
- PCA and feature coordinate systems (16:13)
- Intuition behind Principal component analysis (12:45)
- PCA as a linear transformation of the data - introduction (1:25)
- Linear transformations (11:59)
- Eigenvectors and eigenvalues (6:33)
- Change of basis (13:04)
- Variance and covariance (12:30)
- PCA from eigendecomposition perspective (15:20)
- Principal component analysis for dimensionality reduction (21:19)
- Demo: Performing PCA by using eigendecomposition (11:32)
- Principal component analysis in sklearn (12:34)
- Demo: PCA in sklearn (artificial data) (4:54)
- Demo: PCA in sklearn (real data) (10:04)
- Guidelines for choosing number of principal components (6:30)
- Demo: Choosing number of principal components (9:43)
- Chapter summary (1:10)
- How are we doing? (0:26)
- Chapter agenda (1:19)
- Graph theory basics (9:05)
- UMAP introduction (11:00)
- Fuzzy set basics (4:59)
- Gradient descent & stochastic gradient descent (17:48)
- Sparse matrices with SciPy (12:31)
- UMAP theory - part 1 (26:39)
- Demo: Implementing UMAP from scratch - part 1 (20:08)
- UMAP theory - part 2 (27:50)
- Speeding up python code with numba (4:29)
- Demo: Implementing UMAP from scratch - part 2 (14:54)
- UMAP python package (umap-learn) (6:10)
- Demo: Running UMAP with umap-learn python package (5:24)
- Tuning UMAP parameters (7:40)
- Demo: Tuning UMAP parameters (5:53)
- UMAP caveats (12:52)
- Demo: UMAP caveats (3:59)
- Chapter summary (1:08)
- Chapter agenda (3:10)
- Yellowbrick python library (6:19)
- Characterizing clusters using data visualizations (3:15)
- Demo: K-means, yellowbrick and cluster characterization (16:55)
- Handling and encoding categorical features (10:22)
- Encoding categorical data in python (12:04)
- Measuring distance in categorical data (4:20)
- Distance measures & distance metrics (4:02)
- Calculating distance with SciPy & choosing distance measures in other algorithms (9:05)
- Demo: Calculating distances with SciPy and sklearn (applied to categorical data) (5:28)
- K-modes clustering algorithm (10:24)
- K-modes python package (5:10)
- Demo: Clustering categorical data using the K-means algorithm (8:41)
- Demo: Clustering categorical data using the K-modes algorithm (14:05)
- Mixed data & gower distance (10:23)
- K-prototypes clustering algorithm (6:56)
- K-prototypes python package (3:47)
- Clustering customers demo prerequisites (12:36)
- Demo: Clustering customers (mixed data) (31:46)
- K-means algorithm pros & cons (5:42)
- Demo: k-means algorithm limitations (3:00)
- Chapter summary (1:21)
- Understanding the data - central dogma of molecular biology (8:21)
- Case study intro (1:46)
- Understanding the data - single cell RNA sequencing (4:22)
- Analyzing the data - removing low quality cells (13:06)
- Analyzing the data - normalization, gene selection, PCA, UMAP and clustering (15:21)
- Demo prerequisite - statistical testing basics (14:45)
- Demo: analyzing the data - part 1 (13:30)
- Demo: analyzing the data - part 2 (16:15)
- Case study summary (2:11)
- Chapter agenda (2:24)
- Hierarchical & agglomerative clustering introduction (6:47)
- Dendrogram linkages & constructing dendrograms (16:16)
- Cophenetic distance & cophenetic correlation (3:28)
- Constructing dendrograms with SciPy (5:50)
- Demo: Constructing dendrograms with SciPy (8:31)
- Approaches for extracting clusters from dendrograms (12:36)
- Agglomerative clustering with SciPy and sklearn (7:55)
- Demo: Agglomerative clustering with SciPy & dendrogram manipulation (18:01)
- Demo: Agglomerative clustering with sklearn (6:20)
- Agglomerative clustering general guidelines (8:53)
- Demo: Clustering cars (numerical data) (13:44)
- Demo: Clustering animals (categorical data) (6:21)
- Demo: Clustering cars (mixed data) (13:01)
- Chapter summary (1:29)
- Chapter agenda (2:26)
- Density based clustering - introduction (3:43)
- DBSCAN clustering algorithm (15:31)
- Nearest neighbors basics (13:32)
- Nearest neighbors in sklearn (5:25)
- Demo: implementing DBSCAN from scratch (10:21)
- DBSCAN in sklearn (2:54)
- Tuning DBSCAN parameters (10:40)
- Demo: Tuning DBSCAN parameters (3:56)
- Density based clustering validation (DBCV) - part 1 (11:30)
- Density based clustering validation (DBCV) - part 2 (15:17)
- Demo: Implementing DBCV from scratch - part 1 (13:40)
- Demo: Implementing DBCV from scratch - part 2 + DBCV python function (5:47)
- DBSCAN general guidelines (4:49)
- Demo: Clustering digits (mnist784) with DBSCAN (10:59)
- Demo: Clustering animals with DBSCAN (categorical data) (5:03)
- HDBSCAN clustering algorithm - part 1 (6:22)
- HDBSCAN clustering algorithm - part 2 (14:11)
- HDBSCAN clustering algorithm - part 3 (3:15)
- HDBSCAN python library (hdbscan) (7:32)
- Demo: Implementing HDBSCAN (partial implementation) (14:22)
- HDBSCAN general guidelines (7:08)
- Demo: Clustering iris and digits (mnist784) with HDBSCAN (7:37)
- Demo: Clustering animals with HDBSCAN (categorical data) (2:29)
- Robust scaler (demo prerequisite) (4:12)
- Demo: Clustering phones with HDBSCAN (mixed data) (7:34)
- Case study: Geospatial clustering with DBSCAN and HDBSCAN - introduction (6:40)
- Case study: Geospatial clustering with DBSCAN and HDBSCAN (14:33)
- Chapter summary (1:48)
- Chapter agenda (2:10)
- Graphs, graph layouts and graph communities (8:30)
- Igraph python library (1:16)
- Demo: Igraph library capabilities (11:58)
- Modularity in graph community structures (6:07)
- Louvain clustering algorithm (16:18)
- Louvain clustering - resolution parameter (4:26)
- Demo: Implementing Louvain from scratch (19:49)
- Analyzing community structure quality (8:49)
- Igraph - other useful functionalities (7:22)
- Demo: Community quality metrics (12:15)
- Case study: Clustering actors (15:46)
- Using graph clustering with numerical/categorical/mixed data (KNN graph) (10:13)
- Shared neighbors graph (SNN graph) (4:11)
- Creating KNN (k nearest neighbors) graphs with sklearn (2:30)
- Graph clustering guidelines (5:22)
- Louvain algorithm pros & cons (3:01)
- Demo: Clustering digits (mnist784) with Louvain (15:31)
- Demo: Clustering animals with Louvain (categorical data) (3:55)
- Chapter summary (1:09)
Frequently Asked Questions
When does the course begin and end?
You can start taking the course from the moment you enroll. The course is self-paced, so you can watch the tutorials and apply what you learn whenever you find it most convenient.
For how long can I access the course?
The course has lifetime access. This means that once you enroll, you will have unlimited access to the course for as long as you like.
What if I don't like the course?
There is a 30-day money back guarantee. If you don't find the course useful, contact us within the first 30 days of purchase and you will get a full refund.
Will I get a certificate?
Yes, you'll get a certificate of completion after completing all lectures, quizzes and assignments.