Learning Roadmap
A structured path from Python competency to production ML engineer. Estimated total: 14–20 weeks at 10–15 hrs/week, with real portfolio projects at the end of each phase.
Phase 1: Python & Math Foundations
2–3 weeks. The bedrock. Skip this phase if you are already comfortable with NumPy broadcasting and can derive a partial derivative by hand.
- NumPy vectorised ops & broadcasting
- Linear algebra: matrices, eigenvectors, SVD
- Calculus: partial derivatives, chain rule
- Probability & statistics: expectation, Bayes, MLE
- Python typing, dataclasses, protocols (used throughout all projects)
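Broadcasting is the item on this list that trips people up most. A minimal sketch of the idea: standardising each column of a data matrix without writing a loop, because NumPy stretches the per-column statistics across every row.

```python
import numpy as np

# Three samples, three features.
X = np.array([[1.0, 10.0, 100.0],
              [2.0, 20.0, 200.0],
              [3.0, 30.0, 300.0]])

mean = X.mean(axis=0)   # shape (3,): one value per column
std = X.std(axis=0)     # shape (3,)

# (3, 3) minus (3,): the 1-D arrays broadcast across the rows.
Z = (X - mean) / std

print(Z.mean(axis=0))   # each column now has mean ~0
print(Z.std(axis=0))    # and standard deviation ~1
```

If you can predict the output shape of an expression like `(X - mean) / std` before running it, you are ready for Phase 2.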
Phase 2: PyTorch Core
3–4 weeks. Build the mental model for tensors, autograd, and training loops before adding model complexity.
- Tensor ops, device management (CPU/GPU/MPS)
- Autograd & computational graph
- nn.Module, optimisers, loss functions
- DataLoader, Dataset, transforms
- Training loop patterns: clipping, scheduling, validation
- Project: CNN image classifier (first portfolio project)
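The training-loop pattern itself is framework-agnostic. Here is a sketch in plain NumPy, fitting a hypothetical toy target `y = 3x + 1`, with the gradients written out by hand; these manual gradient lines are exactly the step PyTorch's autograd computes for you.

```python
import numpy as np

# Hypothetical toy data: y = 3x + 1 plus a little noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(256, 1))
y = 3.0 * x + 1.0 + 0.01 * rng.normal(size=(256, 1))

w, b = 0.0, 0.0   # model parameters
lr = 0.5          # learning rate

for epoch in range(200):
    pred = w * x + b                      # forward pass
    err = pred - y
    loss = float((err ** 2).mean())       # MSE loss
    grad_w = float(2 * (err * x).mean())  # what loss.backward() would give you
    grad_b = float(2 * err.mean())
    w -= lr * grad_w                      # optimiser step
    b -= lr * grad_b

print(round(w, 2), round(b, 2))   # ≈ 3.0 1.0
```

Once this forward / loss / backward / update cycle is second nature, the PyTorch version (with `optimizer.zero_grad()`, `loss.backward()`, `optimizer.step()`) is just a change of vocabulary.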
Phase 3: Transformer Architecture
4–5 weeks. The architecture behind every frontier model. Implement each component from scratch so nothing remains a black box.
- Scaled dot-product attention
- Positional encoding & multi-head attention
- Layer norm, residuals, feed-forward blocks
- Encoder / decoder stacks
- Pre-training vs fine-tuning: BERT vs GPT
- Hugging Face: Trainer, tokenizers, model hub
- Project: Build mini-GPT (generative LM from scratch)
- Project: BERT fine-tune for NLP (classification pipeline)
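Scaled dot-product attention is small enough to fit on a napkin, which is why "from scratch" is realistic here. A NumPy sketch of the standard formula Attention(Q, K, V) = softmax(QKᵀ/√d_k)·V, with illustrative shapes:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)  # (seq_q, seq_k) similarity
    weights = softmax(scores, axis=-1)              # each row sums to 1
    return weights @ V, weights                     # weighted mix of values

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))   # 4 query positions, d_k = 8
K = rng.normal(size=(6, 8))   # 6 key positions
V = rng.normal(size=(6, 8))
out, w = scaled_dot_product_attention(Q, K, V)
print(out.shape)              # (4, 8): one output vector per query
```

Multi-head attention is this function run h times on learned projections of Q, K, V, with the outputs concatenated; once the single-head version is clear, the rest is bookkeeping.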
Phase 4: Reinforcement Learning
4–5 weeks. Sequential decision making from first principles. The math maps directly to every algorithm you implement.
- MDPs, Bellman equations, value functions
- Dynamic programming: policy & value iteration
- Monte Carlo & temporal difference methods
- Q-learning & DQN
- Policy gradient theorem & REINFORCE
- Actor-Critic and PPO
- RL libraries: Tianshou pipeline
- Project: DQN for Atari Pong (CNN + replay buffer)
- Project: PPO MuJoCo HalfCheetah (capstone)
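Before DQN, it helps to see tabular Q-learning working on something trivial. A sketch on a hypothetical 5-state chain MDP (the agent must walk right to a goal): the single TD update line is the entire algorithm, and DQN is the same update with a neural network in place of the table.

```python
import random

# Toy chain MDP: states 0..4, start at 0; action 0 = left, 1 = right.
# Reaching state 4 pays reward 1 and ends the episode.
N_STATES, GOAL = 5, 4
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1   # step size, discount, exploration rate

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
    done = s2 == GOAL
    return s2, (1.0 if done else 0.0), done

random.seed(0)
Q = [[0.0, 0.0] for _ in range(N_STATES)]   # Q-table, all zeros

for _ in range(500):                        # episodes
    s, done = 0, False
    while not done:
        # epsilon-greedy action selection
        a = random.randrange(2) if random.random() < EPS \
            else max((0, 1), key=lambda x: Q[s][x])
        s2, r, done = step(s, a)
        # TD update: Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        target = r if done else r + GAMMA * max(Q[s2])
        Q[s][a] += ALPHA * (target - Q[s][a])
        s = s2

# The learned greedy policy: always go right.
policy = [max((0, 1), key=lambda a: Q[s][a]) for s in range(GOAL)]
print(policy)   # [1, 1, 1, 1]
```

Note how the learned `Q[3][1]` approaches 1.0 and the values decay by a factor of γ per step back from the goal, exactly as the Bellman equation predicts.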
Phase 5: Production & Scale
2–3 weeks. Close the gap between working code and code you can run overnight at scale.
- Mixed precision training
- Gradient checkpointing & memory profiling
- Distributed training basics
- Model export: TorchScript, ONNX, torch.compile
- FastAPI + Docker inference server
- Hyperparameter sweeps with W&B or Optuna
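The sweep pattern that W&B and Optuna automate is worth understanding on its own. A dependency-free sketch of random search, with a hypothetical stand-in objective (a real sweep would train a model and return its validation loss); the log-uniform learning-rate sampling mirrors what Optuna's `suggest_float(..., log=True)` does:

```python
import math
import random

# Hypothetical stand-in objective: "validation loss" minimised near
# lr = 1e-3, batch_size = 64. A real sweep would train and evaluate here.
def objective(lr, batch_size):
    return (math.log10(lr) + 3) ** 2 + 0.1 * (math.log2(batch_size) - 6) ** 2

random.seed(0)
best = None
for trial in range(50):
    lr = 10 ** random.uniform(-5, -1)                 # log-uniform over [1e-5, 1e-1]
    batch_size = random.choice([16, 32, 64, 128, 256])
    loss = objective(lr, batch_size)
    if best is None or loss < best[0]:
        best = (loss, lr, batch_size)

print(f"best loss={best[0]:.3f} lr={best[1]:.1e} batch={best[2]}")
```

Sampling learning rates log-uniformly matters: a uniform draw over [1e-5, 1e-1] would spend almost all trials above 1e-2. The library tools add the real value on top of this loop: pruning bad trials early, parallel workers, and logged dashboards.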
Start with what you know
Comfortable with Python and NumPy? Start with Phase 2 (PyTorch Core). Already know PyTorch? Jump straight to Transformers or RL.