A containerized, OpenEnv-compliant, multi-domain reinforcement learning (RL) environment for evaluating LLM agents on real-world professional workflows. Switch domains by setting a single environment variable; the container and the API stay the same.
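One way the single-variable domain switch could work is sketched below. This is an illustration under stated assumptions, not the project's actual code: the variable name `ENV_DOMAIN`, the domain names `domain_a` and `domain_b`, and the `EchoEnv`, `StepResult`, and `make_env` names are all hypothetical placeholders, and the real OpenEnv interface is not reproduced here.

```python
import os
from dataclasses import dataclass
from typing import Callable, Dict, Mapping, Optional


@dataclass
class StepResult:
    """Minimal step result (hypothetical shape, not the OpenEnv type)."""
    observation: str
    reward: float
    done: bool


class EchoEnv:
    """Toy stand-in for a domain-specific environment (hypothetical)."""

    def __init__(self, domain: str) -> None:
        self.domain = domain

    def reset(self) -> str:
        # Start a new episode and return the first observation.
        return f"[{self.domain}] new episode"

    def step(self, action: str) -> StepResult:
        # Echo the action back; a real domain would score it.
        return StepResult(
            observation=f"[{self.domain}] received: {action}",
            reward=0.0,
            done=True,
        )


# Hypothetical registry: domain name -> factory. Every domain exposes the
# same reset/step interface, so callers never branch on the domain.
DOMAIN_REGISTRY: Dict[str, Callable[[], EchoEnv]] = {
    "domain_a": lambda: EchoEnv("domain_a"),
    "domain_b": lambda: EchoEnv("domain_b"),
}


def make_env(environ: Optional[Mapping[str, str]] = None) -> EchoEnv:
    """Build the environment selected by ENV_DOMAIN (assumed variable name)."""
    environ = os.environ if environ is None else environ
    name = environ.get("ENV_DOMAIN", "domain_a")
    try:
        factory = DOMAIN_REGISTRY[name]
    except KeyError:
        known = sorted(DOMAIN_REGISTRY)
        raise ValueError(f"unknown domain {name!r}; expected one of {known}") from None
    return factory()


if __name__ == "__main__":
    env = make_env()
    print(env.reset())
    print(env.step("hello"))
```

With a layout like this, the container image stays identical across domains; only the value passed at start-up changes, for example through Docker's `-e` flag.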