Module 9: Reinforcement Learning

RL Engineering with Tianshou

Use a real RL library without losing conceptual clarity.

Why this module matters

At some point you need reproducible experiment tooling, not just scratch notebooks.

Prerequisites

  • Scratch PPO or DQN experience

Learning objectives

  • Understand policy, collector, buffer, and trainer abstractions
  • Map scratch concepts to library components
  • Run seeded experiments cleanly

Core concepts

  • Collector
  • Replay buffer APIs
  • Experiment abstraction
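
To make the collector and buffer concepts concrete before opening the library, here is a minimal plain-Python sketch of what those two abstractions typically do. It is not Tianshou's implementation: `RingBuffer`, `SimpleCollector`, `CorridorEnv`, and `Transition` are hypothetical names invented for this illustration, and the real counterparts (`Collector` and `ReplayBuffer` in `tianshou.data`) handle batching, vectorized environments, and much more.

```python
import random
from dataclasses import dataclass


@dataclass
class Transition:
    obs: int
    act: int
    rew: float
    next_obs: int
    done: bool


class RingBuffer:
    """Fixed-capacity replay buffer that overwrites the oldest transition."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = []
        self.write_index = 0

    def add(self, transition):
        if len(self.data) < self.capacity:
            self.data.append(transition)
        else:
            self.data[self.write_index] = transition
        self.write_index = (self.write_index + 1) % self.capacity

    def sample(self, batch_size, rng):
        # Uniform sampling without replacement, capped at the current size.
        return rng.sample(self.data, min(batch_size, len(self.data)))

    def __len__(self):
        return len(self.data)


class CorridorEnv:
    """Toy environment (hypothetical): reach cell `length` to end the episode."""

    def __init__(self, length=5):
        self.length = length
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # Action 1 moves right; any other action moves left, floored at 0.
        self.pos = max(0, self.pos + (1 if action == 1 else -1))
        done = self.pos >= self.length
        return self.pos, (1.0 if done else 0.0), done


class SimpleCollector:
    """Runs a policy in an environment and stores every transition."""

    def __init__(self, policy, env, buffer):
        self.policy, self.env, self.buffer = policy, env, buffer
        self.obs = env.reset()

    def collect(self, n_step):
        for _ in range(n_step):
            act = self.policy(self.obs)
            next_obs, rew, done = self.env.step(act)
            self.buffer.add(Transition(self.obs, act, rew, next_obs, done))
            # Episode resets happen here, hidden from the training loop.
            self.obs = self.env.reset() if done else next_obs
        return n_step


rng = random.Random(0)
buffer = RingBuffer(capacity=8)
collector = SimpleCollector(lambda obs: rng.choice([0, 1]), CorridorEnv(), buffer)
collector.collect(n_step=20)
batch = buffer.sample(4, rng)
```

Note where the environment reset lives: inside `collect`, not in the training loop. That is the kind of detail a library collector hides, and it is worth locating the equivalent logic in your scratch code before you swap it out.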

Hands-on practice

  • Rebuild a previous algorithm using Tianshou

Expected output

A reproducible RL experiment pipeline.
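
As a rough illustration of what "reproducible" means in practice, the sketch below runs a placeholder experiment twice with the same seed and checks that the results match. `run_experiment` and its config keys are hypothetical stand-ins, not a Tianshou API. A real pipeline would also need to seed NumPy, PyTorch, and the environment, which this standard-library example leaves out.

```python
import json
import random


def run_experiment(config):
    """Hypothetical stand-in for a training run: returns per-episode returns."""
    # One explicit RNG per run, so no hidden global state leaks between runs.
    rng = random.Random(config["seed"])
    returns = []
    for _ in range(config["episodes"]):
        # Placeholder "episode": a sum of noisy rewards, not a real rollout.
        total = sum(rng.gauss(0.0, 1.0) for _ in range(config["horizon"]))
        returns.append(round(total, 6))
    return {"config": config, "returns": returns}


config = {"seed": 42, "episodes": 3, "horizon": 10}
first = run_experiment(config)
second = run_experiment(config)
other = run_experiment({**config, "seed": 43})

# Storing the config next to the results keeps each run self-describing.
record = json.dumps(first, sort_keys=True)
```

Here `first` and `second` are identical while `other` differs, which is the property to verify for your own pipeline before comparing algorithms across seeds.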

Study checklist

  • Understand policy, collector, buffer, and trainer abstractions
  • Map scratch concepts to library components
  • Run seeded experiments cleanly

Common mistakes

  • ⚠️ Using library abstractions without knowing what they hide
  • ⚠️ Failing to align config with scratch baselines
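
One way to guard against the second mistake is to diff the two configurations before comparing results. The helper below is a minimal sketch: `diff_configs` is a hypothetical name, and the hyperparameter keys and values are made up for illustration rather than taken from any library default.

```python
def diff_configs(scratch, library):
    """Return {key: (scratch_value, library_value)} for every mismatch."""
    missing = object()  # sentinel, so a stored None is not read as "absent"
    mismatches = {}
    for key in sorted(set(scratch) | set(library)):
        a = scratch.get(key, missing)
        b = library.get(key, missing)
        if a != b:
            mismatches[key] = (
                "<missing>" if a is missing else a,
                "<missing>" if b is missing else b,
            )
    return mismatches


# Illustrative values only; substitute the settings from your own runs.
scratch_cfg = {"lr": 3e-4, "gamma": 0.99, "batch_size": 64, "buffer_size": 20000}
library_cfg = {"lr": 1e-3, "gamma": 0.99, "batch_size": 64, "n_step": 3}

mismatches = diff_configs(scratch_cfg, library_cfg)
for key, (scratch_value, library_value) in mismatches.items():
    print(f"{key}: scratch={scratch_value} library={library_value}")
```

Keys present on only one side deserve the most attention, because they usually mean the library is applying a default that your scratch baseline never had.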

Module rhythm

  1. Read the summary and why-it-matters section first.
  2. Work through concepts before rushing into practice.
  3. Use the checklist to verify real understanding, not just completion.

How to continue

Finally, package a full project and compare your scratch implementation against the library-based one.


How to use this page well

Treat each module as a compact learning system: understand the intuition, verify the concepts, do one hands-on task, then use the checklist and mistakes section to pressure-test your understanding.