GeminiChess, an LLM-built engine
Posted: Tue Sep 23, 2025 1:30 am
I spent most of the weekend getting an LLM to build a legal-move UCI chess engine end to end. Using Gemini 2.5 Pro, I asked it to complete the first two milestones:
Step 1 — Core engine (C++ bitboards):
I prompted the model to produce a C++ bitboard engine that plays fully legal chess and speaks UCI. The model split the work across 151 responses.
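For anyone unfamiliar with the bitboard representation the model was asked to use, here is a minimal, generic sketch (my own illustration, not Gemini's generated code) of the standard shift-and-mask technique for computing knight attacks, with file masks to prevent wraparound across the board edge:

```cpp
#include <cstdint>

using Bitboard = uint64_t;  // one bit per square, a1 = bit 0, h8 = bit 63

constexpr Bitboard FILE_A = 0x0101010101010101ULL;
constexpr Bitboard FILE_B = FILE_A << 1;
constexpr Bitboard FILE_G = FILE_A << 6;
constexpr Bitboard FILE_H = FILE_A << 7;

// Knight attacks for all knights on the input bitboard.
// Shifts by one/two files are masked so moves don't wrap around the board.
Bitboard knightAttacks(Bitboard knights) {
    Bitboard l1 = (knights >> 1) & ~FILE_H;            // one file left
    Bitboard l2 = (knights >> 2) & ~(FILE_G | FILE_H); // two files left
    Bitboard r1 = (knights << 1) & ~FILE_A;            // one file right
    Bitboard r2 = (knights << 2) & ~(FILE_A | FILE_B); // two files right
    Bitboard h1 = l1 | r1;
    Bitboard h2 = l2 | r2;
    // Combine horizontal offsets with +/- one or two ranks.
    return (h1 << 16) | (h1 >> 16) | (h2 << 8) | (h2 >> 8);
}
```

In a real engine these attack sets are usually precomputed once per square into a 64-entry table rather than recomputed per move.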
Step 2 — Basic heuristics & time management:
I then asked Gemini to add a first pass of heuristics (its own suggestions) and simple time management. This took roughly 300 additional responses and produced the current build. I've uploaded the source and the Linux/Windows binaries (it should compile on other platforms as well).
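I won't reproduce the generated time-management code here, but a first pass typically amounts to dividing the remaining clock by an estimated number of moves left, plus part of the increment, with a safety cap. A hedged sketch under those assumptions (`allocateTimeMs` and its constants are mine, not the model's):

```cpp
#include <algorithm>

// Compute a per-move time budget in milliseconds from UCI "go" parameters
// (wtime/btime, winc/binc, movestogo). The 30-move horizon and the
// half-clock cap are illustrative baseline choices, not tuned values.
long long allocateTimeMs(long long remainingMs, long long incrementMs,
                         int movesToGo /* 0 if not supplied */) {
    int horizon = (movesToGo > 0) ? movesToGo : 30;  // assume ~30 moves left
    long long budget = remainingMs / horizon + incrementMs / 2;
    long long cap = remainingMs / 2;                 // never burn half the clock
    return std::max(1LL, std::min(budget, cap));
}
```

The engine would then stop its iterative-deepening search once the elapsed time approaches this budget.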
Automation pipeline:
Both steps were fully automated by a driver script that checked each LLM output for:
- successful compile,
- runs without crashing,
- basic UCI compliance,
- passing a small test suite.
About 15% of submissions were rejected by these checks (acceptance may have been higher early in the run). The pipeline advanced or halted based on these criteria; there were no manual edits in the loop.
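The UCI-compliance check in a driver like this can be as simple as piping "uci" into the candidate binary and scanning for the mandatory "uciok" reply. A POSIX sketch of such a probe (`respondsUciok` is a hypothetical name; my driver's actual checks were richer than this):

```cpp
#include <cstdio>
#include <cstring>
#include <string>

// Launch the engine command, feed it "uci" on stdin via the shell, and
// scan stdout for "uciok". Uses POSIX popen, so this is Linux/macOS only.
bool respondsUciok(const std::string& engineCmd) {
    std::string cmd = "echo uci | " + engineCmd + " 2>/dev/null";
    FILE* pipe = popen(cmd.c_str(), "r");
    if (!pipe) return false;
    char line[512];
    bool ok = false;
    while (std::fgets(line, sizeof line, pipe)) {
        if (std::strstr(line, "uciok")) { ok = true; break; }
    }
    pclose(pipe);
    return ok;
}
```

A production driver would also want a timeout around this, since a hung engine would otherwise stall the whole pipeline.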
Current strength & next steps:
Right now it plays at roughly Fairy-Max strength, so nothing groundbreaking yet, but I believe there's headroom for the LLM to keep improving it. I'm planning a third step to add more heuristics and better code structure.
Compute & cost notes:
I used over 95% of Google's starter API credit to get this far, and the run stopped before the model could fully finish step 2. I still have some runway, but I may switch to more economical LLMs or explore running locally (which would in any case require a GPU with decent VRAM).
I'm torn between pride and embarrassment, but mostly I'm curious what this community thinks. Feedback, testing ideas, or pitfalls I should watch for would be hugely appreciated!