In a single day, JianJun Jin might switch from writing code on his computer to experimenting at a lab bench or working under ...
CEO-Bench: Can Agents Play the Long Game? GitHub repository: zlab-princeton/ceobench-src.