Architecture

This page walks through the overall architecture of the SWE-MiniSandbox framework.
If you just want to run it, please refer to the Quick Start guide.

SWE-Agent Integration

SWE-MiniSandbox integrates with SWE-Agent to provide a complete solution for batch execution of software engineering tasks.
For background, we recommend first reviewing the SWE-Agent Architecture.

The main flow is:

1. The sweagent command line executable initializes an instance of the SWEsbEnv class.
   - SWEsbEnv is a modification of the original SWEEnv class in SWE-Agent and manages the interaction between the agent and the MiniSandbox environments.
   - The original SWEEnv class is retained for compatibility with standard Gym environments.
   - We extend SWEEnv with additional functionality to support convenient reward computation.
2. SWEsbEnv initializes the MiniSandbox deployment, which manages our container-free local Gym environments.
3. The MiniSandbox deployment uses SWE-Rex to manage terminal sessions.
   - We introduce a new runtime class and modify the remote runtime in SWE-Rex to support this project.
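
The flow above can be pictured with a minimal Python sketch. It is an illustration only, not the project's code: FakeRuntime, FakeMiniSandboxDeployment, SketchSandboxEnv, and their methods are hypothetical stand-ins for the SWE-Rex runtime, the MiniSandbox deployment, and SWEsbEnv, and the reward rule (fraction of passing tests) is an assumption chosen to keep the example small.

```python
from typing import List, Optional


class FakeRuntime:
    """Hypothetical stand-in for a SWE-Rex runtime that owns a terminal session."""

    def __init__(self) -> None:
        self.history: List[str] = []

    def run(self, command: str) -> str:
        # A real runtime would execute the command in a shell session;
        # this stub only records it and echoes it back.
        self.history.append(command)
        return f"ran: {command}"


class FakeMiniSandboxDeployment:
    """Hypothetical stand-in for the container-free MiniSandbox deployment."""

    def __init__(self) -> None:
        self.runtime: Optional[FakeRuntime] = None

    def start(self) -> None:
        # The deployment creates the runtime that manages the terminal session.
        self.runtime = FakeRuntime()

    def stop(self) -> None:
        self.runtime = None


class SketchSandboxEnv:
    """Hypothetical analogue of SWEsbEnv: sits between the agent and the deployment."""

    def __init__(self, deployment: FakeMiniSandboxDeployment) -> None:
        self.deployment = deployment

    def start(self) -> None:
        self.deployment.start()

    def communicate(self, command: str) -> str:
        # Forward an agent command to the terminal session.
        if self.deployment.runtime is None:
            raise RuntimeError("call start() before communicate()")
        return self.deployment.runtime.run(command)

    def compute_reward(self, tests_passed: int, tests_total: int) -> float:
        # Assumed reward rule for illustration: fraction of passing tests.
        return tests_passed / tests_total if tests_total else 0.0

    def close(self) -> None:
        self.deployment.stop()


env = SketchSandboxEnv(FakeMiniSandboxDeployment())
env.start()
output = env.communicate("pytest -q")
reward = env.compute_reward(tests_passed=3, tests_total=4)
env.close()
```

The point of the layering is that the environment never touches the shell directly: it talks to the deployment, and the deployment owns the runtime.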

In addition, we implement an Agent_sb class, based on the original Agent class, to support the MiniSandbox-specific agent workflow.
The original Agent class remains available for standard Gym environment interactions.

Sky-RL Integration

Sky-RL typically uses a user-defined custom generator class to launch agent rollout processes.
For SWE-MiniSandbox, we re-implement the generator class following the mini-sweagent example.

The SweAgentGenerator class launches multiple init_and_run_sb or init_and_run_container processes to perform agent rollouts in parallel.
Each process initializes an SWEsbEnv or SWEEnv instance, respectively, to manage the interaction between the agent and its environment.
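
As a rough sketch of this fan-out pattern, the example below dispatches one rollout per task and collects the results in order. Several things here are assumptions: the function signatures, the returned dictionaries, the SketchGenerator class, and the use_sandbox flag are hypothetical, and a standard-library thread pool stands in for the real process launching so that the example stays self-contained.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def init_and_run_sb(instance_id: str) -> Dict[str, object]:
    # Stub with an assumed signature: the real function would create an
    # SWEsbEnv and run the agent inside a MiniSandbox environment.
    return {"instance": instance_id, "env": "SWEsbEnv", "reward": 0.0}


def init_and_run_container(instance_id: str) -> Dict[str, object]:
    # Stub with an assumed signature: the real function would create an
    # SWEEnv and run the agent inside a container.
    return {"instance": instance_id, "env": "SWEEnv", "reward": 0.0}


class SketchGenerator:
    """Hypothetical analogue of SweAgentGenerator."""

    def __init__(self, use_sandbox: bool, max_workers: int = 4) -> None:
        # Pick the rollout entry point once, based on the environment type.
        self.rollout_fn = init_and_run_sb if use_sandbox else init_and_run_container
        self.max_workers = max_workers

    def generate(self, instance_ids: List[str]) -> List[Dict[str, object]]:
        # Threads stand in for the real rollout processes here.
        with ThreadPoolExecutor(max_workers=self.max_workers) as pool:
            # Executor.map returns results in the order of the inputs.
            return list(pool.map(self.rollout_fn, instance_ids))


results = SketchGenerator(use_sandbox=True).generate(["task-1", "task-2", "task-3"])
```

Switching use_sandbox to False routes every rollout through init_and_run_container instead, which mirrors the choice between SWEsbEnv and SWEEnv described above.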