Leaderboard open

Port NetHack 5.0 C → JavaScript with bit-exact parity

News

Older news

Several teleport-contest bugs fixed thanks to careful reports from @xeophon (#5) and @serteal (#6): the public session corpus is back in sync between contest and judge (38 files re-recorded), seed0030's seg-0 character mismatch is fixed, and the scorer now requires the cursor to land in the recorded position for a screen to count as matched. Pull the latest template and re-run bash frozen/score.sh to pick up the corrections.
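As a rough illustration of the stricter matching rule, the sketch below counts a step as matched only when both the screen text and the cursor position agree with the recording. The frame shape ({ lines, cursor }) and the function names screenMatches and countMatchedScreens are hypothetical, chosen for this example; the real logic lives in frozen/score.sh and may differ.

```javascript
// Hypothetical frame shape (an assumption, not the judge's format):
//   { lines: string[], cursor: { row: number, col: number } }

// A screen matches only if every line is identical AND the cursor
// lands in the recorded position.
function screenMatches(recorded, rendered) {
  if (recorded.lines.length !== rendered.lines.length) return false;
  for (let i = 0; i < recorded.lines.length; i++) {
    if (recorded.lines[i] !== rendered.lines[i]) return false;
  }
  return (
    recorded.cursor.row === rendered.cursor.row &&
    recorded.cursor.col === rendered.cursor.col
  );
}

// One point per recorded step whose rendered screen matches.
// A missing rendered step simply earns no point.
function countMatchedScreens(recordedSteps, renderedSteps) {
  let points = 0;
  for (let i = 0; i < recordedSteps.length; i++) {
    const rendered = renderedSteps[i];
    if (rendered && screenMatches(recordedSteps[i], rendered)) points++;
  }
  return points;
}

// Tiny one-line frames for brevity; real screens are 80×24.
const frame = (text, row, col) => ({ lines: [text], cursor: { row, col } });

const recordedSteps = [
  frame("You see here a dagger.", 0, 22),
  frame("What do you want to wield?", 0, 27),
  frame("You are empty handed.", 0, 21),
];
const renderedSteps = [
  frame("You see here a dagger.", 0, 22),     // text and cursor agree
  frame("What do you want to wield?", 0, 0),  // text agrees, cursor does not
  frame("You are empty-handed.", 0, 21),      // text differs
];

const matchedPoints = countMatchedScreens(recordedSteps, renderedSteps);
console.log(matchedPoints); // 1
```

Under this rule the second step above would have counted before the fix and no longer does, which is why re-running the scorer can change a fork's total.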

Contest updates: several improvements. Forks now declare a category (agentic, transpiled, or other): each fork must run set-category.sh once before it will be scored. Animation-frame parity is scored as a supplemental metric, supported by a new API. /play/<owner>/ now supports saving, loading, and an in-browser options editor at /nethackrc/, and the persistence API was simplified to a single opts.storage handle so save/restore survives a browser reload. (serteal’s port hasn’t regressed; their sessions just need a small migration to fit the new API.) The corpus was re-recorded with instrumentation fixes. Phase 2 is clarified as a test of maintainability.

serteal has submitted the first transpiled solution — an Emscripten compilation of the C source into a JavaScript emulation of the C state machine, including a simulated C heap. Click serteal’s name to inspect the JavaScript, and Play to play the working game in the browser. It is not yet a readable JS port, and there is still plenty of time to write one! Can you build a port that beats the transpiler in Phase 2?

# | Team | Points / 22,670 | PRNG | Screen | Anim | Speed | Playable | Sessions / 88 | Progress


Category is based on how the team plans to produce most (over 50%) of the code. Agentic codebases are mostly produced by generating code with an LLM, and Transpiled codebases rely mostly on transpiling the C sources with tools.

Points are shown as public + held-out and count matched 80×24 screens: one point per recorded step where the fork’s render matches C exactly. The 88 sessions split 44/44 across the public corpus and a held-out set kept private until contest end.

PRNG is advisory: it is the structural prerequisite for screens to match, but earns no points on its own.

Anim is a supplemental count of matched animation frames: forks that opt in via the new animation API render in-between frames (dart trajectories, explosion expansion, etc.) and earn one credit per matched frame. It is reported as a raw integer rather than a percentage so a fork that opts in stays visible even when most contestants haven’t wired up the API. It is not part of the official ranking.

Speed is a linear fit on the offline scoring path of the form startup_ms + per_move_ms × moves, computed against the same 88 sessions every fork is scored on. The cell shows two roughly comparable numbers: how much fixed cost a session pays before the first move, and how much each move adds.

Playable is a two-part browser-playability check. First, the judge loads the fork’s index.html in real headless Chromium and watches for failed module fetches, top-level script errors, or 4xx/5xx on any subresource. This catches deploy mistakes (missing files, broken import paths) that look fine in offline scoring but break the actual play page. Second, it drives the fork the same way the browser does (one moveloop call per keypress) and asks whether the aggregate ms/move stays under 5 ms. Both must pass.

Neither Speed nor Playable is part of the official ranking; both are reported as diagnostics.

Play opens that contestant’s build in your browser. Tests opens the Session Viewer scoped to the fork: scrub through each public session frame by frame and see where it diverges from C.
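To make the Speed column concrete, here is a minimal sketch of how a model of the form startup_ms + per_move_ms × moves could be estimated from per-session timings. It assumes ordinary least squares and a hypothetical input shape ({ moves, totalMs }); the judge's actual fitting method and data format are not stated here, so treat fitSpeed as an illustration only.

```javascript
// Hypothetical helper: ordinary least-squares fit of
//   totalMs ≈ startupMs + perMoveMs * moves
// The { moves, totalMs } input shape is an assumption for this sketch.
function fitSpeed(sessions) {
  const n = sessions.length;
  if (n < 2) throw new Error("need at least two sessions to fit a line");

  let sumX = 0, sumY = 0, sumXX = 0, sumXY = 0;
  for (const { moves, totalMs } of sessions) {
    sumX += moves;
    sumY += totalMs;
    sumXX += moves * moves;
    sumXY += moves * totalMs;
  }

  const denom = n * sumXX - sumX * sumX;
  if (denom === 0) {
    throw new Error("all sessions have the same move count; slope is undefined");
  }

  const perMoveMs = (n * sumXY - sumX * sumY) / denom; // slope
  const startupMs = (sumY - perMoveMs * sumX) / n;     // intercept
  return { startupMs, perMoveMs };
}

// Synthetic timings generated from 40 ms startup plus 0.5 ms per move.
const sampleTimings = [
  { moves: 100, totalMs: 90 },
  { moves: 200, totalMs: 140 },
  { moves: 400, totalMs: 240 },
];

const fit = fitSpeed(sampleTimings);
console.log(`${fit.startupMs} ms startup + ${fit.perMoveMs} ms/move`);
```

Separating the intercept from the slope is what lets a fork with a heavy one-time load (for example, a large module to initialize) be distinguished from one that is slow on every keypress.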

100% match 50–99% <50%