2 Comments
Justis Mills:

Notably, since I posted this, Claude has indeed made it through Mt. Moon! I don't think that really invalidates what I'm saying here; Claude's improved navigation/memory systems helped it in some ways relative to its first run (somewhat more efficient exploration), and hurt it in others (it invented the blackout strategy). Mt. Moon took it a little under 3 days the second time, and a little over 3 days the first time. So, improved scaffolding seems to be helping in aggregate... a little bit. But with totally unanticipated specific pitfalls.

Midi:

Stronger underlying models are the most straightforward solution, but I don't see why scaffolding alone would be insufficient for Pokemon specifically, even without overfitting to Pokemon.

The loops you provided as examples could easily be resolved with just a small amount of grounding. Here, grounding could be as simple as asking a fresh Claude instance, with minimal context in the prompt, "I am trying to do this/go here, and this is my plan. Does this sound like a good plan?" With its base knowledge of Pokemon, a fresh instance would easily catch these.

This approach would not work nearly as well for more complex questions or niche content (where context contamination is harder to avoid), but it would still cover a broad range of situations. Most hallucinations would not survive a majority vote unless the context is universally contaminated. And there are many other ways to obtain ground truth through tool use; e.g., Claude can run code to determine how many r's are in "strawberry".

That said, Claude's vision definitely needs improvement, which requires a stronger base model. It also struggles immensely with memory and learning in general, which seems difficult to remedy through scaffolding, but naive scaling also seems questionable as a solution. Regardless, these limitations don't seem significant enough to stop it from handling something as easy as Pokemon.
