Claude Mythos task length limits and capabilities

4 items · 3 sources · updated 36d ago · trend 0

Anthropic's Claude Mythos model demonstrates a substantially extended task horizon: METR's evaluation tools ran out of measurement range once the model's 50% time horizon (the task length at which it succeeds half the time) exceeded 16 hours. The result supports earlier claims about the model's general-purpose capabilities and long-context reasoning.

  • METR evaluation tools maxed out when measuring Claude Mythos task length, with the 50%-success time horizon exceeding 16 hours
  • Claude Mythos confirmed as general-purpose model with strong exploit-finding and reasoning capabilities
  • Extended task horizon capabilities expected to appear in competing models from OpenAI and Google within a similar timeframe
  • Human-AI team complementarity research shows teams outperform individuals only when error correlation is below a critical threshold ρ*
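
To make the "50% time horizon" figure above concrete, here is a minimal sketch of how such a number can be estimated and why a measurement can run out of range. It assumes the horizon is read off a logistic curve fitted to success against log2 of task length; the function names and both toy datasets are hypothetical and are not METR's code or results.

```python
import math

def fit_logistic(xs, ys, lr=0.1, steps=20000):
    """Fit p(success) = sigmoid(a + b*x) by gradient ascent on the log-likelihood."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            grad_a += y - p
            grad_b += (y - p) * x
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b

def horizon_50(task_minutes, successes):
    """Task length (minutes) at which the fitted success probability is 50%."""
    xs = [math.log2(m) for m in task_minutes]
    mean_x = sum(xs) / len(xs)
    centered = [x - mean_x for x in xs]      # centering keeps the fit stable
    a, b = fit_logistic(centered, successes)
    return 2 ** (mean_x - a / b)             # solve a + b*(x - mean_x) = 0

# Hypothetical task suite: 4 attempts at each of five task lengths (minutes).
lengths = [4] * 4 + [16] * 4 + [64] * 4 + [256] * 4 + [1024] * 4

# Toy model A: success falls off with length, so the horizon sits inside the suite.
model_a = [1, 1, 1, 1,  1, 1, 1, 0,  1, 1, 0, 0,  1, 0, 0, 0,  0, 0, 0, 0]
# Toy model B: mostly succeeds even on the longest tasks.
model_b = [1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 0,  1, 1, 1, 0]

in_range = horizon_50(lengths, model_a)   # about 64 minutes, by symmetry of the toy data
beyond = horizon_50(lengths, model_b)     # lands past the longest task in the suite

print(f"model A horizon: {in_range:.0f} min")
print(f"model B horizon exceeds longest task: {beyond > max(lengths)}")
```

For the second toy model the fitted 50% point falls beyond the longest task in the suite, so the estimate is an extrapolation rather than a measurement. That is the sense in which an evaluation "maxes out": the suite can show the horizon is above some length, but not where it actually lies.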