Introduction
This section introduces the environment pre-cache process for SWE-MiniSandbox.
The environment cache is composed of two main parts:
- Venv-based Python environment cache
- GitHub repo cache (optional)
Venv-Based Python Environment Cache
We pre-install a list of miniconda3 environments with different Python versions (see Conda Backends). These conda environments are used as bases to create virtual environments (venvs) for specific tasks.
When creating a new sandbox, SWE-MiniSandbox:
- Selects a conda environment according to the task’s Python version requirement.
- Creates a new venv based on that conda environment and installs the dependencies.
- Packs the venv and stores it under a shared directory.
The venv directory structure is:
shared_venv_dir/
    repo_id_1/                  # The repo id of the task
        python_version/         # The python version
            image_name_1/       # The image name of the task
                venv.tar.gz     # Packed venv directory
            image_name_2/
                venv.tar.gz
    repo_id_2/
        python_version/
            image_name_1/
                venv.tar.gz
So the path to the venv of a specific task is:
{shared_venv_dir}/{repo_id}/{python_version}/{image_name}/venv.tar.gz
shared_venv_dir is user-defined. Do not rename this directory after pre-caching, because its path is currently hard-coded inside each venv. (A more flexible mechanism will be implemented in the future.)
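As a minimal sketch of the layout above, the archive location can be assembled from its four components. The helper name venv_archive_path and the example values are hypothetical and are not part of SWE-MiniSandbox; the function only mirrors the documented path template.

```python
from pathlib import Path


def venv_archive_path(shared_venv_dir, repo_id, python_version, image_name):
    """Return the expected location of a task's packed venv.

    Hypothetical helper: it mirrors the documented layout only.
    """
    return Path(shared_venv_dir) / repo_id / python_version / image_name / "venv.tar.gz"


# All values below are made up for illustration.
path = venv_archive_path("/data/shared_venv", "owner__project.abc123", "3.9", "image_a")
print(path.as_posix())
```

Because the directory name is baked into the venv, the first argument should be the same value that was used during pre-caching.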
How repo_id Is Determined
Every deployment config gets a data item ds from the dataset.
The repo_id is obtained through the mapping function swesandbox.sandbox_deployment.map_to_git_id:
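def map_to_git_id(ds, data_type):
    """Map a dataset item to the repo id used in the cache layout."""
    if data_type == "swebench":
        return ds["repo"]
    elif data_type == "swesmith":
        # Prefer an explicit "repo" field when the item has one.
        if "repo" in ds:
            return ds["repo"]
        # Otherwise keep the first two dot-separated parts of instance_id.
        instance_id = ds["instance_id"]
        git_id = ".".join(instance_id.split(".")[:2])
        return git_id
    else:
        return ds["repo"]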
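The following sketch shows how the mapping behaves for the two dataset types. The function body is repeated from above so that the example runs on its own; the dataset items and their ids are invented for illustration and do not come from a real dataset.

```python
def map_to_git_id(ds, data_type):
    # Repeated from the documentation above so the example is self-contained.
    if data_type == "swebench":
        return ds["repo"]
    elif data_type == "swesmith":
        if "repo" in ds:
            return ds["repo"]
        instance_id = ds["instance_id"]
        git_id = ".".join(instance_id.split(".")[:2])
        return git_id
    else:
        return ds["repo"]


# Hypothetical dataset items.
swebench_item = {"repo": "owner/project", "instance_id": "owner__project-123"}
smith_item = {"instance_id": "owner__project.abc123.some_bug_1"}

print(map_to_git_id(swebench_item, "swebench"))  # owner/project
print(map_to_git_id(smith_item, "swesmith"))     # owner__project.abc123
```

Only the swesmith branch falls back to instance_id; every other data type requires a "repo" field and raises KeyError without one.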
How python_version Is Determined
The python_version is obtained from the dataset:
python_version = self._config.ds.get("version", "latest")
If not specified, it defaults to "latest".
How image_name Is Determined
The image_name is obtained from the dataset:
image_name = self._config.ds.get("image_name", "default")
If not specified, it defaults to "default".
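The two lookups can be tried out on a plain dictionary. This is a minimal sketch: the helper cache_keys is hypothetical, and ds here is an ordinary dict standing in for the dataset item held by the config.

```python
def cache_keys(ds):
    """Return (python_version, image_name) using the documented defaults.

    Hypothetical helper; it repeats the two lookups shown above.
    """
    python_version = ds.get("version", "latest")
    image_name = ds.get("image_name", "default")
    return python_version, image_name


print(cache_keys({"version": "3.9", "image_name": "image_a"}))  # ('3.9', 'image_a')
print(cache_keys({}))                                           # ('latest', 'default')
```

Note that dict.get only applies the default when the key is absent. A key that is present but set to None is returned as None.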
GitHub Repo Cache
After installing the GitHub repo into the corresponding venv, we can cache the repository on local disk. This cache is:
- Required for the SWE-bench dataset
- Optional for SWE-smith
The GitHub repo cache directory structure is:
cached_git/
    repo_id_1/                    # The repo id of the task
        python_version/           # The python version
            instance_id_1/        # The instance id of the task
                testbed.tar.gz    # Packed GitHub repo
            instance_id_2/
                testbed.tar.gz
    repo_id_2/
        python_version/
            instance_id_1/
                testbed.tar.gz
The path to the cached GitHub repo for a specific task is:
{cached_git_dir}/{repo_id}/{python_version}/{instance_id}/testbed.tar.gz
The instance_id is obtained from the dataset:
instance_id = ds.get("instance_id", "default")
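A quick way to check whether a task already has a cached repository is to build the path from the template and test for the file. This is a sketch under the documented layout; cached_repo_path and find_cached_repo are hypothetical names, not functions provided by SWE-MiniSandbox.

```python
from pathlib import Path


def cached_repo_path(cached_git_dir, repo_id, python_version, instance_id):
    """Build the documented location of a task's packed repository."""
    return Path(cached_git_dir) / repo_id / python_version / instance_id / "testbed.tar.gz"


def find_cached_repo(cached_git_dir, repo_id, python_version, instance_id):
    """Return the archive path if it exists on disk, otherwise None."""
    path = cached_repo_path(cached_git_dir, repo_id, python_version, instance_id)
    return path if path.is_file() else None


# Hypothetical values; prints None unless such an archive exists locally.
print(find_cached_repo("cached_git", "owner__project.abc123", "latest", "default"))
```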
Disabling GitHub Repo Cache
If you do not want to cache the GitHub repo, set the cache_git parameter to False in the deployment config class SandboxDeploymentConfig.
- If cache_git = False, the GitHub repo is fetched directly from the remote repository and installed in editable mode into the venv.
- If cache_git = True, the fetched and installed repo is packed and stored as described above.
For SWE-bench, cache_git must be set to True to ensure correct functionality.
For SWE-smith, it can be set to False to save storage space. By default, it is set to True.
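The rule above can be expressed as a small validation step. The sketch below uses a stand-in dataclass, ExampleDeploymentConfig, which is an assumption made for illustration and is not the real SandboxDeploymentConfig. Only the cache_git field and its default of True come from this document, and the data_type strings are borrowed from map_to_git_id.

```python
from dataclasses import dataclass


@dataclass
class ExampleDeploymentConfig:
    """Stand-in for illustration only; not the real SandboxDeploymentConfig."""

    data_type: str
    cache_git: bool = True  # documented default


def check_cache_git(config):
    """Reject the one combination the documentation says does not work."""
    if config.data_type == "swebench" and not config.cache_git:
        raise ValueError("cache_git must be True for SWE-bench")
    return config


# SWE-smith may skip the repo cache to save storage space.
smith_config = check_cache_git(ExampleDeploymentConfig("swesmith", cache_git=False))
print(smith_config.cache_git)  # False
```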
Environment Pre-Cache Pipeline
The environment pre-cache pipeline is implemented in the class sweagent.agent.empty_agent.EmptyAgent.
Workflow:
- Prepare the environment (venv + repo).
- Run an evaluation script to validate the environment.
- Mark each environment as passed, failed, or another status.
Only environments marked as passed are used in subsequent training and evaluation.
For failed environments, you can inspect the error log (output_dir/instance_id/exception.log) and the prediction file (output_dir/instance_id/instance_id.pred) to debug manually or with the help of a large language model. An automatic pipeline will be provided in the future.
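For manual debugging, the two files of a failed instance can be loaded together. This is a minimal sketch: load_failure_details is a hypothetical helper, and it assumes the .pred file contains a single JSON object.

```python
import json
from pathlib import Path


def load_failure_details(output_dir, instance_id):
    """Collect the debugging files for one instance.

    Hypothetical helper; missing files are reported as None.
    """
    instance_dir = Path(output_dir) / instance_id
    exception_log = instance_dir / "exception.log"
    pred_file = instance_dir / f"{instance_id}.pred"
    return {
        # exception.log is only written when setup raised an exception.
        "exception": exception_log.read_text() if exception_log.is_file() else None,
        # Assumption: the .pred file holds one JSON object.
        "pred": json.loads(pred_file.read_text()) if pred_file.is_file() else None,
    }
```

The returned dictionary can be printed directly or pasted into a prompt when asking a language model for help.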
Pre-Cache Status Outputs
The status of environment pre-cache is stored under the output_dir, with the following structure:
output_dir/
    instance_id_1/
        instance_id_1.config.yaml   # Config file used to set up the environment
        instance_id_1.debug.log     # Debug log for environment setup (not used by default)
        instance_id_1.pred          # Prediction JSON from the evaluation script. Key fields:
                                    #   - reward   : 1 if passed, 0 if failed
                                    #   - test_out : output of the evaluation script
                                    #   - p2p      : success rate of pass-to-pass cases
                                    #   - f2p      : success rate of fail-to-pass cases
        instance_id_1.traj          # Agent trajectory; empty for EmptyAgent
        (exception.log)             # If any exception occurs during setup, it is stored here
    instance_id_2/
        ...
    preds.json                      # Aggregated predictions of all instances (not useful for EmptyAgent)
    run_batch_exit_statuses.yaml    # Exit status of all instances:
                                    #   - "passed" / "failed" / other Exceptions for EmptyAgent
                                    #   - "submitted" / other Exceptions for other agents
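To get a quick overview of a pre-cache run, the per-instance .pred files can be scanned and grouped by their reward field. The sketch below is an illustration under two assumptions: each .pred file is a JSON object, and reward is a top-level key in it. summarize_rewards is a hypothetical helper; run_batch_exit_statuses.yaml remains the authoritative record of exit statuses.

```python
import json
from pathlib import Path


def summarize_rewards(output_dir):
    """Group instance ids by the reward stored in their .pred file."""
    summary = {"passed": [], "failed": [], "no_pred": []}
    instance_dirs = sorted(p for p in Path(output_dir).iterdir() if p.is_dir())
    for instance_dir in instance_dirs:
        instance_id = instance_dir.name
        pred_file = instance_dir / f"{instance_id}.pred"
        if not pred_file.is_file():
            # For example, setup failed before the evaluation script ran.
            summary["no_pred"].append(instance_id)
            continue
        pred = json.loads(pred_file.read_text())
        # Assumption: reward is a top-level key, 1 if passed and 0 if failed.
        key = "passed" if pred.get("reward") == 1 else "failed"
        summary[key].append(instance_id)
    return summary
```

Instances listed under no_pred are good candidates for checking exception.log first.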