Our starter kit provides a real-time agent loop with modular components for perception (game frame recognition), planning and memory (long-term vs. short-term goals, knowledge storage), and control (Game Boy emulator action execution).
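As a rough illustration of how perception, planning and memory, and control can fit together in one loop, here is a minimal sketch. Every name in it (`Memory`, `perceive`, `plan`, `run_agent`, and the `get_frame` / `press_button` callables) is hypothetical and chosen for illustration only; the starter kit's actual classes and emulator API are not shown here and may look quite different. The planner is a fixed stub; in a real submission that step would be where a neural network selects the action.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# NOTE: all names here are hypothetical stand-ins, not the starter kit's API.


@dataclass
class Memory:
    """Planning state: long-term vs. short-term goals plus stored knowledge."""
    long_term_goal: str = "complete the game"
    short_term_goal: str = ""
    knowledge: List[str] = field(default_factory=list)


def perceive(frame: bytes) -> str:
    """Perception stub: turn a raw game frame into a text observation."""
    return f"frame of {len(frame)} bytes"


def plan(observation: str, memory: Memory) -> str:
    """Planning stub: store the observation and pick a button.

    A real agent would query a model here instead of returning a constant.
    """
    memory.knowledge.append(observation)
    memory.short_term_goal = "advance dialogue"
    return "A"


def run_agent(
    get_frame: Callable[[], bytes],
    press_button: Callable[[str], None],
    max_steps: int,
) -> Memory:
    """Run the perceive -> plan -> act loop for a fixed number of steps."""
    memory = Memory()
    for _ in range(max_steps):
        observation = perceive(get_frame())   # perception
        action = plan(observation, memory)    # planning and memory
        press_button(action)                  # control
    return memory
```

To try the loop without an emulator, pass simple stand-ins, for example `run_agent(lambda: bytes(16), print, max_steps=3)`. Keeping the emulator behind two plain callables is a design choice of this sketch so that each component can be swapped out or tested on its own.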
Applications closed and credits awarded. All compute credits have been distributed to approved teams.
Exact clones of organizer-hosted baselines are not eligible for prizes. Submissions must demonstrate novel approaches, meaningful modifications, or original implementations. Simple repackaging or minimal changes to existing baseline code will be disqualified from prize consideration.
Submissions for this track focus on achieving maximum game completion under time constraints. Your agent must interact exclusively through our custom Pokémon Emerald emulator API. Use any method, as long as the final action comes from a neural network.
Important: All submissions will undergo anti-cheat verification to ensure fair competition. This includes validation of agent behavior, action logs, and verification that submissions follow the competition rules.
Submissions must include submission.log and the detailed logs generated by the starter kit during your agent's run. These logs validate that your agent followed competition rules and provide action/state information for evaluation.
Code Modification Policy:
You are encouraged to modify, extend, or completely rewrite the starter kit code to implement your approach.
The only requirement is that your submission includes the valid logs (including submission.log) generated by the starter kit's logging system, which verify that your agent interacted with the game through the official API and followed competition rules.
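Because a run without valid logs cannot be verified, it may help to check for them before submitting. The sketch below is one way to do that, assuming the logs end up in a single run directory; the directory layout and the `missing_logs` helper are assumptions made for illustration, and the check only confirms that the files exist and are non-empty, not that their contents would pass verification.

```python
from pathlib import Path
from typing import List, Sequence


def missing_logs(
    run_dir: str,
    required: Sequence[str] = ("submission.log",),
) -> List[str]:
    """Return a list of problems with the required log files in run_dir.

    Hypothetical helper: the real location of the starter kit's logs may
    differ, so adjust run_dir and required to match your setup.
    """
    base = Path(run_dir)
    problems = []
    for name in required:
        path = base / name
        if not path.is_file():
            problems.append(f"{name}: not found")
        elif path.stat().st_size == 0:
            problems.append(f"{name}: empty")
    return problems
```

An empty list means every required file is present and non-empty. Any other detailed logs your run produces can be added to `required` by name.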
Final rankings are determined by raw performance metrics only (number of actions and time). Based on community feedback, we have simplified the main ranking to focus purely on objective performance measures.
Novel Methods Welcome: While we provide a starter kit with an LLM-scaffolded approach, we encourage submissions using a wide variety of methods including tool-augmented systems, reinforcement learning, purely text-based reasoning, hybrid architectures, and other innovative techniques. The competition is designed to be open to diverse methodologies—whether you're building a complex multi-agent system or a streamlined end-to-end model, your approach is welcome!
While scaffolding complexity does not affect the main rankings, teams must still document their methodology across five dimensions for consideration of separate Judges' Choice and innovation awards:
Judges' Choice Awards: Separate awards will recognize innovative approaches, including those with minimal scaffolding (a limited number of prompts to the LLM), creative tool use, and novel architectural designs. These awards encourage diverse methodologies while keeping the main competition ranking simple and objective.
Teams must document their scaffolding components in detail during submission for eligibility for Judges' Choice awards. The organizing committee will review submissions for these special recognitions.
Official competition website goes live with preliminary documentation.
Full rules and track timeline announced. Starter code with a baseline RPG agent (scaffolding and VLM setup) and emulator API available for beta testers.
Track 2 Competition Begins. Submit runs of your Pokémon Emerald agent to the leaderboard.
Final submission deadline.
Winners announced at NeurIPS 2025.
Top performing agents in the RPG speedrunning challenge will be awarded $4,500 and 1000 GCP total:
Senior organizers will award at most four projects with $400 or 500 GCP to help continue their work after the competition ends.
These projects do not necessarily have to place highly in the speedrun rankings but should propose a novel approach or demonstrate interesting capabilities in long-horizon planning or RPG navigation.