Module 9: Reinforcement Learning
RL Engineering with Tianshou
Use a real RL library without losing conceptual clarity.
Why this module matters
At some point you need reproducible experiment tooling, not just scratch notebooks.
Prerequisites
- ▸ Experience implementing PPO or DQN from scratch
Learning objectives
- ▸ Understand policy, collector, buffer, and trainer abstractions
- ▸ Map scratch concepts to library components
- ▸ Run seeded experiments cleanly
Core concepts
Collector
Replay buffer APIs
Experiment abstraction
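To make the collector and replay buffer roles concrete, here is a minimal stdlib-only sketch of what those abstractions do under the hood. Every name in it (`ToyEnv`, `RingBuffer`, `collect`, `Transition`) is hypothetical and chosen for illustration; this is not Tianshou's API, only a scratch-level picture of the responsibilities the library's components take over.

```python
import random
from collections import deque
from dataclasses import dataclass


@dataclass
class Transition:
    obs: int
    act: int
    rew: float
    next_obs: int
    done: bool


class ToyEnv:
    """Hypothetical 1-D corridor: start at 0, the episode ends at position 3."""

    def reset(self) -> int:
        self.pos = 0
        return self.pos

    def step(self, act: int):
        # Action 1 moves right; any other action moves left (clamped at 0).
        self.pos = max(0, self.pos + (1 if act == 1 else -1))
        done = self.pos >= 3
        return self.pos, (1.0 if done else 0.0), done


class RingBuffer:
    """Fixed-capacity replay buffer: the oldest transitions are overwritten."""

    def __init__(self, capacity: int):
        self.data = deque(maxlen=capacity)

    def add(self, transition: Transition) -> None:
        self.data.append(transition)

    def sample(self, batch_size: int, rng: random.Random):
        # Uniform sampling without replacement from the stored transitions.
        return rng.sample(list(self.data), min(batch_size, len(self.data)))

    def __len__(self) -> int:
        return len(self.data)


def collect(env, policy, buffer, n_step: int) -> int:
    """Collector role: step the env with the policy and store each transition."""
    obs = env.reset()
    episodes = 0
    for _ in range(n_step):
        act = policy(obs)
        next_obs, rew, done = env.step(act)
        buffer.add(Transition(obs, act, rew, next_obs, done))
        obs = next_obs
        if done:
            episodes += 1
            obs = env.reset()
    return episodes


rng = random.Random(0)
buffer = RingBuffer(capacity=50)
episodes = collect(ToyEnv(), lambda obs: rng.choice([0, 1]), buffer, n_step=200)
batch = buffer.sample(8, rng)
print(f"stored={len(buffer)} finished_episodes={episodes} batch={len(batch)}")
```

When you move to the library, it helps to find where each of these three responsibilities lives: stepping the environment, storing transitions, and sampling batches for the update.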
Hands-on practice
- ▸ Rebuild an algorithm you previously wrote from scratch (for example, PPO or DQN) using Tianshou
Expected output
A reproducible RL experiment pipeline.
Study checklist
- ✅ Understand policy, collector, buffer, and trainer abstractions
- ✅ Map scratch concepts to library components
- ✅ Run seeded experiments cleanly
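One way to check the last item is to run the same experiment twice with the same seed and confirm the results match exactly. The sketch below does this for a toy two-armed bandit using only the standard library; `run_experiment` and its parameters are hypothetical. In a real pipeline you would also need to seed NumPy, PyTorch, and the environment itself, which this sketch does not cover.

```python
import random


def run_experiment(seed: int, episodes: int = 20) -> list:
    """Toy epsilon-greedy agent on a two-armed bandit; returns the reward per step."""
    rng = random.Random(seed)  # a single RNG owns all randomness in the run
    arm_probs = [0.3, 0.7]     # hypothetical payout probabilities
    values = [0.0, 0.0]        # running mean reward per arm
    counts = [0, 0]
    rewards = []
    for _ in range(episodes):
        if rng.random() < 0.1:
            arm = rng.randrange(2)                            # explore
        else:
            arm = max(range(2), key=lambda a: values[a])      # exploit
        rew = 1.0 if rng.random() < arm_probs[arm] else 0.0
        counts[arm] += 1
        values[arm] += (rew - values[arm]) / counts[arm]
        rewards.append(rew)
    return rewards


run_a = run_experiment(seed=42)
run_b = run_experiment(seed=42)

# Same seed, same code path: the reward sequences must be identical.
assert run_a == run_b
print("seeded runs match:", run_a == run_b)
```

If the two runs diverge, some source of randomness is not controlled by the seed, and that is worth finding before comparing against a scratch baseline.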
Common mistakes
- ⚠️ Using library abstractions without knowing what they hide
- ⚠️ Failing to align the library configuration with your scratch baseline, which makes comparisons between the two misleading
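A simple guard against the second mistake is to diff the two configurations before running anything. The sketch below is a minimal example under stated assumptions: `diff_configs` is a hypothetical helper, and the keys and values are made up for illustration rather than taken from any real baseline or from Tianshou's defaults.

```python
MISSING = "<missing>"


def diff_configs(scratch: dict, library: dict) -> dict:
    """Return {key: (scratch_value, library_value)} for every key that differs."""
    mismatches = {}
    for key in sorted(set(scratch) | set(library)):
        left = scratch.get(key, MISSING)
        right = library.get(key, MISSING)
        if left != right:
            mismatches[key] = (left, right)
    return mismatches


# Hypothetical hyperparameters for illustration only.
scratch_cfg = {"lr": 3e-4, "gamma": 0.99, "batch_size": 64, "seed": 0}
library_cfg = {"lr": 1e-3, "gamma": 0.99, "batch_size": 64, "seed": 0, "n_step": 3}

mismatches = diff_configs(scratch_cfg, library_cfg)
for key, (left, right) in mismatches.items():
    print(f"{key}: scratch={left} library={right}")
```

Here the diff reports `lr` (different values) and `n_step` (set only on the library side). Keys that exist on one side only are often library defaults that the scratch version never exposed.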
Module rhythm
- 1. Read the summary and why-it-matters section first.
- 2. Work through concepts before rushing into practice.
- 3. Use the checklist to verify real understanding, not just completion.
How to continue
As a final step, package a full project and compare your scratch implementation against the library-based one.
How to use this page well
Treat each module as a compact learning system: understand the intuition, verify the concepts, do one hands-on task, then use the checklist and mistakes section to pressure-test your understanding.