If you encounter any difficulties in using or reproducing the code, please contact me at zhaoyangyu713@gmail.com.
ReCode introduces recursive code generation for LLM agents, unifying plan and action into a single representation. By treating high-level plans as placeholder functions that recursively decompose into executable primitives, it achieves universal granularity control and dynamically adapts from strategic thinking to concrete actions. This repository hosts the reference implementation used in the paper, along with environment wrappers and experiment tooling.
ReCode adopts a divide-and-conquer strategy, decomposing complex tasks into executable code fragments:
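The recursive decomposition described above can be sketched in a few lines. This is a toy illustration, not the repository's implementation: the `PRIMITIVES` set, the `EXPANSIONS` table (which stands in for the LLM that would generate each expansion), and the `expand` function are all hypothetical names introduced here.

```python
from typing import Dict, List

# Hypothetical primitive actions that an environment could execute directly.
PRIMITIVES = {"goto", "take", "put"}

# Hypothetical lookup table standing in for the LLM: each placeholder
# function maps to the sequence of calls it decomposes into.
EXPANSIONS: Dict[str, List[str]] = {
    "solve_task": ["find_object", "deliver_object"],
    "find_object": ["goto", "take"],
    "deliver_object": ["goto", "put"],
}


def expand(call: str, depth: int = 0, max_depth: int = 10) -> List[str]:
    """Recursively decompose a call until only primitives remain."""
    if call in PRIMITIVES:
        return [call]
    if depth >= max_depth:
        raise RecursionError(f"max depth reached while expanding {call!r}")
    actions: List[str] = []
    for sub_call in EXPANSIONS[call]:
        actions.extend(expand(sub_call, depth + 1, max_depth))
    return actions


print(expand("solve_task"))  # ['goto', 'take', 'goto', 'put']
```

The depth limit plays the same role as a recursion cap: a placeholder that cannot be resolved within the budget fails loudly instead of expanding forever.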
run.py – CLI entry point that instantiates agents/envs, manages concurrency, and writes run summaries.
agents/recode/ – ReCode agent implementation, prompt templates, and utility helpers.
envs/ – Environment wrappers and assets for alfworld, webshop, and sciworld.
configs/ – LLM profile templates and (expected) pricing metadata used by the async client.
utils/ – Shared components: async OpenAI wrapper, constrained executor, logging helpers, error types.
figures/ – Paper figures used throughout this README.

To evaluate the effectiveness of ReCode, we divide our experiments into the inference part and the training part.
claude-4-sonnet.
Qwen2.5-7B-Instruct. ReCode+SFT delivers an impressive average performance of 70.4% across all environments, outperforming both ReAct+SFT (67.6%) and CodeAct+SFT (55.8%), highlighting its exceptional data efficiency.
ReCode runs inside a conda environment with Python 3.10 or newer.
Next, install the dependencies for the three environments. Whether they conflict when installed side by side has not been verified, so we suggest a separate conda environment for each.
conda create -n recode-envname python=3.10 # Replace "envname" with your environment name.
conda activate recode-envname
Follow the ALFWorld instructions.
Set ALFWORLD_DATA to the dataset root or edit envs/alfworld/base_config.yaml to point to your local paths:
export ALFWORLD_DATA=/path/to/alfworld
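Before launching a run, it can help to confirm that the variable is visible to Python and points at a real directory. The helper below is a hypothetical sanity check, not part of the repository; it only inspects `ALFWORLD_DATA` and makes no assumptions about the dataset layout.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_alfworld_data(environ: Mapping[str, str] = os.environ) -> Optional[Path]:
    """Return the ALFWORLD_DATA directory if it is set and exists, else None."""
    raw = environ.get("ALFWORLD_DATA")
    if not raw:
        return None
    path = Path(raw).expanduser()
    return path if path.is_dir() else None


data_dir = resolve_alfworld_data()
if data_dir is None:
    print("ALFWORLD_DATA is unset or does not point to a directory")
else:
    print(f"ALFWORLD_DATA resolved to {data_dir}")
```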
Thanks to ETO for providing a convenient script to configure the WebShop environment.
cd envs/webshop
pip install -e .
conda install -y -c conda-forge openjdk=11
pip install "en_core_web_lg @ https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-3.6.0/en_core_web_lg-3.6.0-py3-none-any.whl"
Run the provided helper to fetch the goal set and pre-built search index:
# The current path is "envs/webshop"
bash setup.sh
Install the remaining dependencies:
pip install -r requirements.txt # This list may be incomplete; please contact me if you run into problems
Ensure configs/profiles.yaml points to a valid API credential (copy configs/profiles_example.yaml if you need a template), then run a short dry run in any enabled environment:
python run.py -a recode -e alfworld -n 1 --split test --profile default
Replace alfworld with webshop or sciworld once their assets are available. Logs are written to logs/<run_id>/, and the console prints a condensed summary for quick diagnostics.
configs/profiles.yaml contains named profiles. The run.py --profile flag selects which profile to forward to AsyncLLM. Example:
models:
  default:
    api_key: "sk-your_api_key"
    base_url: "https://api.openai.com/v1"
    model: "gpt-4o-mini"
    temperature: 0.0
    track_costs: true
  gpt-4o:
    api_key: "sk-your_other_key"
    base_url: "https://api.openai.com/v1"
    model: "gpt-4o"
    temperature: 0.7
    max_tokens: 512
Cost tracking loads configs/prices.json. If you do not want to record costs, set track_costs: false for the profile.
As a fallback, you can omit configs/profiles.yaml and set OPENAI_API_KEY in the environment; the default profile will then use it.
A ready-to-edit template lives at configs/profiles_example.yaml; copy it to configs/profiles.yaml if you're starting from scratch:
cp configs/profiles_example.yaml configs/profiles.yaml
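The profile lookup and the OPENAI_API_KEY fallback described above could be sketched as follows. This is an assumption-laden illustration rather than the actual AsyncLLM logic: `resolve_profile` is a hypothetical helper, and the `profiles` argument stands in for the `models:` mapping after the YAML file has been parsed.

```python
import os
from typing import Any, Dict, Mapping, Optional


def resolve_profile(
    profiles: Optional[Mapping[str, Mapping[str, Any]]],
    name: str = "default",
    environ: Mapping[str, str] = os.environ,
) -> Dict[str, Any]:
    """Pick a named profile, falling back to OPENAI_API_KEY for 'default'."""
    if profiles and name in profiles:
        return dict(profiles[name])
    api_key = environ.get("OPENAI_API_KEY")
    if name == "default" and api_key:
        # Assumption: only the key comes from the environment; every other
        # setting would use whatever defaults the client defines.
        return {"api_key": api_key}
    raise KeyError(f"profile {name!r} not found and no fallback available")


# Usage with a mapping shaped like the example profiles file.
profiles = {
    "default": {"model": "gpt-4o-mini", "temperature": 0.0, "track_costs": True},
    "gpt-4o": {"model": "gpt-4o", "temperature": 0.7, "max_tokens": 512},
}
print(resolve_profile(profiles, "gpt-4o")["model"])  # gpt-4o
```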
run.py is the canonical entry point. It resolves agent/environment aliases, manages concurrency, streams logs, and emits a structured summary.
# ALFWorld, single instance
python run.py -a recode -e alfworld -n 1 --split test --profile default
# WebShop, 3 test goals, allow deeper recursion
python run.py -a recode -e webshop -n 3 --split test --profile default --max-depth 12
# ScienceWorld, run 5 instances with 2-way concurrency
python run.py -a recode -e sciworld -n 5 -c 2 --profile gpt-4o
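The last command caps the number of episodes running at once. One common way to get that behaviour in asyncio is a semaphore, sketched below. This is not taken from run.py: `run_episode` is a hypothetical stand-in for a real episode, and the peak counter exists only to show that the cap holds.

```python
import asyncio
from typing import Dict, List, Tuple


async def run_episode(instance_id: int) -> Dict[str, int]:
    """Hypothetical stand-in for one evaluation episode."""
    await asyncio.sleep(0.01)
    return {"id": instance_id}


async def run_all(instances: int, concurrent: int) -> Tuple[List[Dict[str, int]], int]:
    """Run all episodes with at most `concurrent` in flight; also report the peak."""
    semaphore = asyncio.Semaphore(concurrent)
    active = 0
    peak = 0

    async def guarded(instance_id: int) -> Dict[str, int]:
        nonlocal active, peak
        async with semaphore:
            active += 1
            peak = max(peak, active)
            try:
                return await run_episode(instance_id)
            finally:
                active -= 1

    results = await asyncio.gather(*(guarded(i) for i in range(instances)))
    return list(results), peak


results, peak = asyncio.run(run_all(instances=5, concurrent=2))
print(len(results), "episodes finished; peak concurrency:", peak)
```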
Key CLI flags:
-a / --agent – class path or alias (recode resolves to agents.recode.agent.ReCodeAgent).
-e / --env – environment class or alias (alfworld, webshop, sciworld).
-n / --instances – number of evaluation episodes.
-c / --concurrent – max concurrent episodes (rich progress UI automatically adapts).
--split, --seed, --max-depth, --profile – forwarded to both agent and environment.
-C / --config – YAML file whose keys override CLI flags; useful for complex sweeps.

Example YAML (configs/example.yaml):
agent: recode
env: alfworld
instances: 10
concurrent: 2
profile: gpt-4o
split: test
task_types: ["put", "clean"] # For ALFWorld
max_depth: 12
max_retry: 4
Run it with:
python run.py -C configs/example.yaml
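The "config keys override CLI flags" rule can be pictured as a dictionary merge in which the file wins. The sketch below is an assumption about the behaviour, not the code in run.py: the parser covers only a handful of flags, the default values are invented for illustration, and `file_config` stands in for the parsed YAML.

```python
import argparse
from typing import Any, Dict, List


def build_parser() -> argparse.ArgumentParser:
    """Small subset of the CLI; default values here are illustrative only."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-a", "--agent", default="recode")
    parser.add_argument("-e", "--env", default="alfworld")
    parser.add_argument("-n", "--instances", type=int, default=1)
    parser.add_argument("-c", "--concurrent", type=int, default=1)
    parser.add_argument("--max-depth", dest="max_depth", type=int, default=10)
    return parser


def merge_config(argv: List[str], file_config: Dict[str, Any]) -> Dict[str, Any]:
    """Parse CLI flags, then let keys from the config file override them."""
    merged = vars(build_parser().parse_args(argv))
    merged.update(file_config)  # file keys win; extra keys are carried along
    return merged


# The dict mirrors a few keys from the example YAML.
settings = merge_config(
    ["-n", "3", "-e", "webshop"],
    {"instances": 10, "max_depth": 12, "task_types": ["put", "clean"]},
)
print(settings["instances"], settings["env"], settings["max_depth"])  # 10 webshop 12
```

Keys that have no matching flag, such as task_types, simply pass through, which is one way environment-specific options could reach the environment.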
Each run produces logs/<run_id>/ with:
running_logs/run.log – aggregated stream of agent + environment logs.
running_logs/instance_<id>.log – per-instance traces (when multiple instances are launched).
<results.json> – structured summary written by write_summary, containing per-instance metrics and aggregated statistics (overall + per task type).

Implement the Env interface under envs/<your_env>/env.py. Use base.environment.Env as the contract: implement reset, _run, is_done, is_success, and report. Return {"observations": [...], "env_name": <name>, "env": self} from reset.
Add resources under agents/recode/resources/:
  prompts/<env_name>/actions.txt – concise description of valid run("...") calls/tools.
  fewshots/<env_name>/ – one or more .txt examples showing thought→execute patterns.
Update agents/recode/agent.py::_load_resources and agents/recode/utils.parse_raw_observation to parse initial observations correctly.
Register the environment in ENV_ALIASES in run.py (optional but convenient) and, if needed, add plan-specific logic in the agent utilities.
Add a setup script (e.g., envs/webshop/setup.sh) to document dataset fetching.

You can embed the agent directly inside your own loop by reusing the provided utilities:
import asyncio

from agents.recode.agent import ReCodeAgent
from envs.alfworld.env import AlfworldEnv


async def solve_once():
    config = {"split": "test", "task_types": ["put"], "max_depth": 10}
    env = AlfworldEnv(logger=None)
    agent = ReCodeAgent()

    # Reset the environment first; the agent needs the initial info it returns.
    init_info = env.reset(config)
    agent.reset(config, init_info)
    observations = init_info["observations"]

    # Alternate between agent actions and environment steps until the episode ends.
    while not env.is_done():
        actions = await agent.act(observations)
        observations = await env.run(actions)

    print(env.report())
    await env.close()


asyncio.run(solve_once())
The same pattern works for any Env implementation; be sure to pass a logger if you need file-backed traces.
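To make the contract concrete, here is a toy environment exposing the methods named earlier (reset, _run, is_done, is_success, report) and driven by the same loop. It is a standalone sketch: `CountdownEnv` is hypothetical, it deliberately does not inherit from base.environment.Env, and the `run` and `close` signatures are assumptions inferred from the loop above rather than taken from the base class.

```python
import asyncio
from typing import Any, Dict, List


class CountdownEnv:
    """Toy environment: the episode succeeds once the counter reaches zero."""

    def reset(self, config: Dict[str, Any]) -> Dict[str, Any]:
        self.remaining = int(config.get("start", 3))
        self.steps = 0
        return {
            "observations": [f"counter={self.remaining}"],
            "env_name": "countdown",
            "env": self,
        }

    async def _run(self, action: str) -> str:
        """Apply one action and return the resulting observation."""
        self.steps += 1
        if action == "decrement" and self.remaining > 0:
            self.remaining -= 1
        return f"counter={self.remaining}"

    async def run(self, actions: List[str]) -> List[str]:
        # Assumption: the real base class provides run() and dispatches to _run().
        return [await self._run(action) for action in actions]

    def is_done(self) -> bool:
        return self.remaining == 0

    def is_success(self) -> bool:
        return self.remaining == 0

    def report(self) -> Dict[str, Any]:
        return {"success": self.is_success(), "steps": self.steps}

    async def close(self) -> None:
        pass  # nothing to release in this toy example


async def demo() -> Dict[str, Any]:
    env = CountdownEnv()
    observations = env.reset({"start": 2})["observations"]
    while not env.is_done():
        # A real agent would choose actions from the observations.
        observations = await env.run(["decrement"])
    report = env.report()
    await env.close()
    return report


print(asyncio.run(demo()))  # {'success': True, 'steps': 2}
```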
@misc{yu2025recodeunifyplanaction,
title={ReCode: Unify Plan and Action for Universal Granularity Control},
author={Zhaoyang Yu and Jiayi Zhang and Huixue Su and Yufan Zhao and Yifan Wu and Mingyi Deng and Jinyu Xiang and Yizhang Lin and Lingxiao Tang and Yingchao Li and Yuyu Luo and Bang Liu and Chenglin Wu},
year={2025},
eprint={2510.23564},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2510.23564},
}