I've been enjoying running mini-swe-agent on NFS (Network File System).
The way it works is to split the dataset into shards and then use several machines in parallel, each one processing a single shard (the indexing can be done via the `--slice` parameter of `swebench.py`) and writing the results into a shared working directory that sits on NFS.
There is only a small wrinkle with this: the preds.json file gets corrupted because it is protected only by an in-process lock (https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/run/extra/swebench.py#L49), which leads to errors when another process tries to read a half-written file.
Currently I work around this by just commenting out the code that reads and writes preds.json (since I don't actually need that file).
There are several ways to support this workflow better if we wanted:
- Let people fork the `swebench.py` script if they want to do similar things
- Replace the in-process lock with https://pypi.org/project/filelock/ -- this has the advantage that it would allow using mini-swe-agent out of the box on NFS, but the disadvantage that it introduces a new dependency (it is a very small and stable dependency, though). If we want to do this, we could also make the dependency on the filelock package optional.
- Encourage people who want to do something like this to run the shards in different directories and merge them manually at the end
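For the filelock option, a minimal sketch of what the write path could look like (the `update_preds` helper and its signature are hypothetical, not the actual code in `swebench.py`; whether `FileLock` or `SoftFileLock` is the right choice on a given NFS setup would need checking against the filelock docs):

```python
import json
from pathlib import Path

from filelock import FileLock  # pip install filelock


def update_preds(output_dir: Path, instance_id: str, prediction: dict) -> None:
    """Merge one prediction into a shared preds.json under a cross-process lock.

    Unlike a threading.Lock, the lock file lives on disk next to preds.json,
    so separate processes (including ones on other machines mounting the same
    directory) serialize their read-modify-write cycles.
    """
    preds_path = output_dir / "preds.json"
    lock = FileLock(str(preds_path) + ".lock")
    with lock:
        preds = json.loads(preds_path.read_text()) if preds_path.exists() else {}
        preds[instance_id] = prediction
        # Write to a temp file and rename, so a concurrent reader never
        # observes a half-written preds.json even outside the lock.
        tmp_path = preds_path.with_name(preds_path.name + ".tmp")
        tmp_path.write_text(json.dumps(preds, indent=2))
        tmp_path.replace(preds_path)
```

The write-to-temp-then-rename step is what protects readers that don't take the lock; the lock itself protects concurrent writers from losing each other's updates.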