Welcome to the RLSpaceInvaders project!
This is a reinforcement learning initiative to train autonomous agents to play Space Invaders using deep reinforcement learning algorithms, including DQN, QR-DQN, Rainbow DQN, and PPO.
Quick Links
Proposal
Learn about our project goals, methodology, and evaluation plan.
Read Proposal →
Status
Check out our current progress and project milestones.
View Status →
Final Report
See our final results and conclusions.
Read Final Report →
What's New
The Final Report has just been published.
- DQN: Finalized hyperparameter tuning with temporal frame stacking (k=4), pushing the tuned agent to an average score of 42.95 and 324 frames of survival at 1M timesteps, a major improvement over the untuned stacked baseline. Final average reward: 32.
- QR-DQN: Distributional Q-learning paid off substantially, with QR-DQN reaching an average score of 76.5 and 588 survival frames at 1M steps, significantly outperforming DQN on both metrics. Final average reward: 50.
- Rainbow DQN: Achieved the highest overall performance, scoring 106.0 on average at 1M frames and continuing to improve all the way to ~292.0 at 5M frames; it was the only algorithm that showed no plateau, thanks to its distributional and noisy-network components. Final average reward: 125.
- PPO: Evaluated across four experimental configurations (Baseline Dual Decay, OpenAI Protocol, Genesis Reward Shaping, and Apex) plus a component ablation study isolating the roles of FrameStack, GAE, and learning-rate decay. Final average reward: 73.
- Added a final cross-algorithm comparison of all four models, summarizing trade-offs in learning stability, sample efficiency, and peak reward.
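The temporal frame stacking mentioned in the DQN entry is a standard Atari preprocessing step: the agent sees the k most recent frames at once so that motion (e.g. invader and projectile direction) is observable from a single input. The project's actual implementation isn't shown here; the sketch below is a minimal, hypothetical `FrameStack` helper in NumPy, assuming grayscale 84×84 frames.

```python
from collections import deque

import numpy as np


class FrameStack:
    """Keep the k most recent observations and expose them as one
    stacked array, in the style of DQN frame stacking (k=4)."""

    def __init__(self, k=4):
        self.k = k
        self.frames = deque(maxlen=k)

    def reset(self, obs):
        # On episode reset, fill the buffer with copies of the first frame
        # so the stacked observation always has exactly k entries.
        for _ in range(self.k):
            self.frames.append(obs)
        return self._stacked()

    def step(self, obs):
        # Append the newest frame; the oldest falls out automatically
        # because the deque has maxlen=k.
        self.frames.append(obs)
        return self._stacked()

    def _stacked(self):
        # Stack along a leading channel axis: shape (k, H, W).
        return np.stack(self.frames, axis=0)
```

In practice a wrapper like this sits between the emulator and the network, so the Q-network's input channels equal k rather than 1.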
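The PPO ablation isolates GAE (Generalized Advantage Estimation) as one component. For readers unfamiliar with it, this is a minimal sketch of GAE over a single rollout, not the project's code: episode boundaries are omitted, and the parameter names (`gamma`, `lam`) are conventional defaults rather than the values used in the experiments.

```python
import numpy as np


def gae_advantages(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation for one rollout.

    rewards:    shape (T,) rewards collected during the rollout
    values:     shape (T,) value estimates V(s_t) for each step
    last_value: bootstrap value V(s_T) for the state after the rollout
    """
    T = len(rewards)
    values_ext = np.append(values, last_value)
    adv = np.zeros(T)
    gae = 0.0
    # Sweep backwards, accumulating exponentially weighted TD residuals.
    for t in reversed(range(T)):
        delta = rewards[t] + gamma * values_ext[t + 1] - values_ext[t]
        gae = delta + gamma * lam * gae
        adv[t] = gae
    return adv
```

With `gamma=1` and `lam=1` this reduces to Monte Carlo returns minus the value baseline, which is a quick sanity check when tuning the estimator.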
Source Code
Check out our implementation on GitHub: