Languages should be more configurable

Inspired by my attempt to get the test suite to pass on Arch Linux, which required the following changes:
* I use rustup, so `/usr/bin/rustc` isn't useful on its own. I installed a separate toolchain via `RUSTUP_HOME=/opt/rustup rustup toolchain install stable`, which meant changing the rustc path to `/opt/rustup/toolchains/stable-x86_64-unknown-linux-gnu/bin/rustc` (and adding `/opt` to the sandbox).
* On Arch, `/usr/bin/pypy` is a symlink into `/opt` as well, so this also requires the `/opt` bind mount.
* Arch's GHC doesn't support static linking, so I needed to replace the `-static` flag passed to it with `-dynamic`.

I think fixing this satisfactorily is two-fold:
* first, languages ought to have more control over the parameters of the sandbox. This would allow moving https://github.com/cms-dev/cms/blob/9375d44d8a10dd7f0b61aef56ddadc1f22965946/cms/grading/Sandbox.py#L975-L982 into the language-specific code. Apart from the mapped directories, languages could also configure things like the environment variables and open file limits (which are needed for dotnet).
* second, system administrators ought to have control over the parameters of a language. This would allow changing the executable path of compilers/interpreters if necessary, adding extra bind mounts, or perhaps allow arbitrary tweaking of compiler flags.

Regarding the first point, we could add methods like `configure_{compilation,evaluation}_sandbox` to each language, which take the Sandbox that is about to be used as an argument. This would be the most flexible from our POV, but would effectively make the Sandbox class a part of the "public" API that's necessary for adding new languages. I'm not sure we really want to make any stability guarantees about these extensions, though, so maybe that's not a big issue. The alternative is a slightly more restricted configuration, e.g. returning a dict which can contain some specific sandbox parameters.
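To make the two alternatives concrete, here is a rough sketch of both. Everything in it is hypothetical: `FakeSandbox` is a stand-in that only records calls to `maybe_add_mapped_directory` (the one Sandbox method visible in the linked lines), and the class names, `evaluation_sandbox_params`, `apply_sandbox_params` and the `extra_dirs` key are made up for illustration, not existing CMS API.

```python
class FakeSandbox:
    """Stand-in for the real Sandbox; it only records mapping requests."""

    def __init__(self):
        self.mapped_dirs = []

    def maybe_add_mapped_directory(self, path, options=None):
        self.mapped_dirs.append((path, options))


class Language:
    # Alternative A: hooks that receive the Sandbox object directly.
    def configure_compilation_sandbox(self, sandbox):
        """Default: no language-specific setup."""

    def configure_evaluation_sandbox(self, sandbox):
        """Default: no language-specific setup."""

    # Alternative B: return a restricted dict of sandbox parameters.
    def evaluation_sandbox_params(self):
        return {}


class CSharpMono(Language):
    def configure_evaluation_sandbox(self, sandbox):
        # Mono reads /etc/mono/config at startup.
        sandbox.maybe_add_mapped_directory("/etc/mono", options="noexec")


class Php(Language):
    def evaluation_sandbox_params(self):
        # (path, options) pairs; the "extra_dirs" key is an assumed name.
        return {"extra_dirs": [("/etc/alternatives", None)]}


def apply_sandbox_params(sandbox, params):
    """Core-side code for alternative B: apply the dict to the sandbox."""
    for path, options in params.get("extra_dirs", []):
        sandbox.maybe_add_mapped_directory(path, options=options)
```

With alternative A the language code touches the Sandbox directly, while with alternative B only `apply_sandbox_params` does, so the Sandbox class stays out of the language-facing API.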

For the second point, we could give each language a set of "options" that are configured in `cms.toml`. I think by default this could include `compiler_path` or `interpreter_path`, `compilation_extra_dirs` (for additional directories to bind-mount for compilation), `evaluation_extra_dirs`, and maybe `compiler_extra_flags` or `interpreter_extra_flags`. Each language could also add its own options (e.g. static vs dynamic linking in Haskell). In the config file, this might look like:
```toml
[languages."Rust"]
compiler_path = "/opt/rustup/toolchains/stable-x86_64-unknown-linux-gnu/bin/rustc"

[languages."Python 3 / PyPy"]
compilation_extra_dirs = ["/opt"]
evaluation_extra_dirs = ["/opt"]

[languages."Haskell / ghc"]
use_dynamic_linking = true
```
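On the Python side, consuming these options could look roughly like the sketch below. It assumes the config file has already been parsed into a nested dict. `LanguageOptions`, `load_language_options` and `ghc_command` are hypothetical names, and the default `/usr/bin/ghc` path and the exact shape of the command line are assumptions for illustration rather than what CMS currently does.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class LanguageOptions:
    """Per-language options an administrator may override (hypothetical)."""
    compiler_path: Optional[str] = None
    compilation_extra_dirs: List[str] = field(default_factory=list)
    evaluation_extra_dirs: List[str] = field(default_factory=list)
    compiler_extra_flags: List[str] = field(default_factory=list)
    # Language-specific option, as in the Haskell example above.
    use_dynamic_linking: bool = False


def load_language_options(config, language_name):
    """Build the options for one language from the parsed config dict."""
    raw = config.get("languages", {}).get(language_name, {})
    known = {f.name for f in fields(LanguageOptions)}
    unknown = sorted(set(raw) - known)
    if unknown:
        # Reject typos early instead of silently ignoring them.
        raise ValueError(f"unknown options for {language_name}: {unknown}")
    return LanguageOptions(**raw)


def ghc_command(opts, source, executable):
    """Assemble an illustrative ghc invocation from the options."""
    compiler = opts.compiler_path or "/usr/bin/ghc"  # assumed default
    linking = "-dynamic" if opts.use_dynamic_linking else "-static"
    return [compiler, linking, *opts.compiler_extra_flags,
            "-o", executable, source]
```

The `config` dict could come from `tomllib.load` (Python 3.11+) or whatever parser the config loader already uses. Languages without an entry in the file simply get the defaults.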

For reference, the linked lines in `cms/grading/Sandbox.py` are:

    # Needed on Ubuntu by PHP (and more), since /usr/bin only contains a
    # symlink to one out of many alternatives.
    self.maybe_add_mapped_directory("/etc/alternatives")

    # Likewise, needed by C# programs. The Mono runtime looks in
    # /etc/mono/config to obtain the default DllMap, which includes, in
    # particular, the System.Native assembly.
    self.maybe_add_mapped_directory("/etc/mono", options="noexec")

Languages should be more configurable #1480
