Commit f9ebc40

Craigacp authored and jhalexand committed
Adding more markdown docs.
1 parent 1429463 commit f9ebc40

3 files changed, +145 -1 lines changed

README-Configuration.md

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@ are performing. Your Pipeline and its stages are represented as components
 in the configuration file.

 ```xml
-<?xml version="1.0" encoding="UTF-8"?\>
+<?xml version="1.0" encoding="UTF-8"?>
 <config>
 <component name="myPipeline" type="com.oracle.labs.sound.Pipeline">
 <property name="numThreads" value="2"/>

docs/Internals.md

Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
# Internals

## Command Shell
The command shell allows developers to implement CommandGroup, and then
inspects the class via reflection, pulling out all methods annotated @Command.
These methods are added to a CLI, with appropriate help and argument
completion. The CLI is based on jline 3, but only provides the commands
specified by the developer, along with a set of default commands that control
how the inputs and outputs are directed, and some help and status
commands.

The arguments for the command shell are parsed from the supplied String into
the set of supported arguments: Strings, String arrays (only as the final
argument), primitives, primitive boxes, and File.
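
The reflection scan and argument parsing described above can be sketched in plain Java. This is a minimal, self-contained illustration and not OLCUT's implementation: the `Cmd` annotation, the `MathCommands` group, and the `scan` and `dispatch` helpers are hypothetical stand-ins, and only `int`, `String`, and a trailing `String[]` are converted.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

final class CommandScanSketch {
    /** Hypothetical stand-in for a command annotation; not OLCUT's own type. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Cmd { String usage(); }

    /** Hypothetical command group: two commands and one ordinary method. */
    static final class MathCommands {
        @Cmd(usage = "<a> <b> - adds two integers")
        public String add(int a, int b) { return Integer.toString(a + b); }

        @Cmd(usage = "<words...> - joins the words with '-'")
        public String join(String[] words) { return String.join("-", words); }

        public String notACommand() { return "hidden"; }
    }

    /** Collects every annotated method of the group, keyed by method name. */
    static Map<String, Method> scan(Class<?> group) {
        Map<String, Method> commands = new TreeMap<>();
        for (Method m : group.getDeclaredMethods()) {
            if (m.isAnnotationPresent(Cmd.class)) {
                commands.put(m.getName(), m);
            }
        }
        return commands;
    }

    /** Splits the line, converts tokens to the parameter types, and invokes the command. */
    static String dispatch(Object group, String line) {
        String[] tokens = line.trim().split("\\s+");
        Method m = scan(group.getClass()).get(tokens[0]);
        if (m == null) {
            throw new IllegalArgumentException("Unknown command: " + tokens[0]);
        }
        Class<?>[] types = m.getParameterTypes();
        Object[] args = new Object[types.length];
        for (int i = 0; i < types.length; i++) {
            if (types[i] == String[].class && i == types.length - 1) {
                // A trailing String[] swallows all remaining tokens.
                args[i] = Arrays.copyOfRange(tokens, i + 1, tokens.length);
            } else if (i + 1 >= tokens.length) {
                throw new IllegalArgumentException("Missing argument " + (i + 1));
            } else if (types[i] == int.class) {
                args[i] = Integer.parseInt(tokens[i + 1]);
            } else if (types[i] == String.class) {
                args[i] = tokens[i + 1];
            } else {
                throw new IllegalArgumentException("Unsupported type: " + types[i]);
            }
        }
        try {
            return (String) m.invoke(group, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling `CommandScanSketch.dispatch(new CommandScanSketch.MathCommands(), "add 2 3")` resolves the `add` method by name and parses both tokens as integers before invoking it. The unannotated `notACommand` method is never exposed.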

## Configuration System
The configuration, options and provenance system provides ways for users to
configure and control their programs through a combination of command line
options and configuration files. The provenance system allows the generation of
immutable provenance objects which record configured values and optionally
runtime values depending on how the user implements the Provenance interface.

The configuration system allows developers to mark fields @Config in classes
which implement the Configurable interface. Those fields can have their values
written in on construction based on the values in the configuration file, or
supplied on the command line. The supported types are primitives (and
primitive boxes), primitive arrays, strings, string arrays, classes which
implement Configurable, arrays of classes which implement Configurable, lists,
sets, maps, enums, enum sets, AtomicInteger, AtomicLong, File, Path, URL,
OffsetDateTime, LocalDate, OffsetTime, and Random (which is deprecated). The
lists and sets accept generic parameters of any of the non-collection types,
and maps have String keys and any accepted non-collection type as values. The
collection types are specified as a list of values which is written into a
concrete ArrayList, a HashSet, or a HashMap, but the annotation must be on a
field which is typed with the List, Set or Map interface.
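
To make the field typing rules concrete, here is a small self-contained sketch of annotation-driven field injection. It is an illustration under stated assumptions rather than OLCUT's code: the `Cfg` annotation, the `Worker` class, and the `convert` and `configure` helpers are hypothetical, and only a handful of the supported types are handled. It shows string values being converted to the field type, and a field declared as the `List` interface receiving a concrete `ArrayList`.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ConfigFieldSketch {
    /** Hypothetical stand-in for a configuration annotation; not OLCUT's own type. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Cfg {}

    enum Mode { FAST, SAFE }

    /** Hypothetical configurable class with defaults set by field initialisers. */
    static final class Worker {
        @Cfg private int threads = 1;
        @Cfg private Mode mode = Mode.SAFE;
        @Cfg private List<String> tags = new ArrayList<>();
        private String notConfigurable = "untouched";
    }

    /** Converts a raw configuration value into the declared field type. */
    static Object convert(Class<?> type, Object raw) {
        if (type == int.class || type == Integer.class) {
            return Integer.parseInt((String) raw);
        }
        if (type == String.class) {
            return raw;
        }
        if (type.isEnum()) {
            for (Object constant : type.getEnumConstants()) {
                if (((Enum<?>) constant).name().equals(raw)) {
                    return constant;
                }
            }
            throw new IllegalArgumentException("Unknown enum constant " + raw);
        }
        // Collection fields are declared by interface and filled with a concrete type.
        if (type == List.class) {
            return new ArrayList<>((List<?>) raw);
        }
        if (type == Set.class) {
            return new HashSet<>((List<?>) raw);
        }
        throw new IllegalArgumentException("Unsupported field type " + type);
    }

    /** Writes values into annotated fields only; other fields are ignored. */
    static void configure(Object target, Map<String, Object> values) {
        try {
            for (Field f : target.getClass().getDeclaredFields()) {
                if (f.isAnnotationPresent(Cfg.class) && values.containsKey(f.getName())) {
                    f.setAccessible(true);
                    f.set(target, convert(f.getType(), values.get(f.getName())));
                }
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    static String describe(Worker w) {
        return w.threads + " " + w.mode + " " + w.tags + " " + w.notConfigurable;
    }
}
```

Because `configure` only touches annotated fields, a value supplied for `notConfigurable` is silently ignored in this sketch. A real system might prefer to reject unknown names.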

## Lifetime of a ConfigurationManager

This summarises the lifetime of a ConfigurationManager from construction until
it goes out of scope.

1. A new ConfigurationManager is constructed.
    - If it's supplied a list of file paths then those files are passed to the appropriate ConfigLoader subclass based on their file type. New subtypes can be registered by calling a static method on ConfigurationManager before instantiating it. This mechanism is how the json and edn file types are registered; the xml file type is available by default.
    - Configuration files are processed by reading the configuration into ConfigurationData objects, which are `Map<String,Union<String,List<Union<String,Class>>>>` for each object's configuration, along with a few metadata fields like the class name, the name given in the configuration file, and whether the configuration is overlaid from some other object.
2. Configurable objects can be supplied to the ConfigurationManager. These objects have their fields marked @Config inspected via reflection, and the values recorded in the configuration. This is performed recursively if the object contains Configurable object fields, or collections thereof.
3. A configuration can be looked up by name or by class.
    - If by class, every configuration whose class is the requested class or a subclass of it is instantiated, and the results are returned as a list.
    - Configured objects are instantiated as follows:
        1. The relevant class file is loaded, triggering static initialisers. Invalid or unknown classes trigger a PropertyException terminating instantiation.
        2. An instance of the class is instantiated by calling its no-args constructor, making it accessible if necessary. If this constructor is not present then a PropertyException is triggered.
        3. Fields are initialised to their default values, as specified in the class file.
        4. Each field from the configuration is processed, making it accessible first if necessary, triggering further object instantiation if required, and writing each field value into the new object. Note: this step cannot recur infinitely, as the object under construction hasn't been published yet, so circular references will cause a PropertyException because the object cannot be found.
        5. Invalid field values trigger PropertyException, and the instantiation terminates without publishing the object.
        6. After each field has been written, its access privileges are reset to those specified in the class.
        7. After all fields have been written, the object has its postConfig method called. This is intended to perform object specific validation and checking, similar to a standard Java constructor. If the object is invalid it may throw PropertyException, or another RuntimeException, and the instantiated object is discarded.
        8. Finally the object is published by storing it into the ConfigurationManager's map of instantiated objects, and then returning it to the caller.
4. The current configuration can be written out to a file on disk, either including just the objects that have been instantiated, or including all configurations known to the system (this may include configurations which are invalid, or for which the class is not available on the classpath).
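
The instantiation steps above can be sketched as a tiny registry that constructs, validates, and only then publishes an object. This is a simplified model and not OLCUT's `ConfigurationManager`: the `Component` interface, the `Data` holder, and the `Greeter` class are hypothetical, only `String` fields are written, and standard Java exceptions stand in for `PropertyException`.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

final class LifecycleSketch {
    /** Hypothetical stand-in for a configurable type with a validation hook. */
    interface Component { void postConfig(); }

    /** Hypothetical flat configuration record: a class name plus field values. */
    static final class Data {
        final String className;
        final Map<String, String> fields;
        Data(String className, Map<String, String> fields) {
            this.className = className;
            this.fields = fields;
        }
    }

    static final class Greeter implements Component {
        private String greeting = "hello"; // default, overwritten by configuration
        private Greeter() {}
        String greeting() { return greeting; }
        @Override public void postConfig() {
            if (greeting.isEmpty()) {
                throw new IllegalStateException("greeting must not be empty");
            }
        }
    }

    private final Map<String, Data> configs = new HashMap<>();
    private final Map<String, Component> published = new HashMap<>();

    void addConfig(String name, Data data) { configs.put(name, data); }

    boolean isPublished(String name) { return published.containsKey(name); }

    Component lookup(String name) {
        Component existing = published.get(name);
        if (existing != null) {
            return existing;
        }
        Data data = configs.get(name);
        if (data == null) {
            throw new IllegalArgumentException("No configuration named " + name);
        }
        try {
            Class<?> clazz = Class.forName(data.className);        // load the class
            Constructor<?> ctor = clazz.getDeclaredConstructor();   // no-args constructor
            ctor.setAccessible(true);
            Component instance = (Component) ctor.newInstance();    // defaults now set
            for (Map.Entry<String, String> e : data.fields.entrySet()) {
                Field f = clazz.getDeclaredField(e.getKey());
                f.setAccessible(true);
                f.set(instance, e.getValue());                      // write the value
                f.setAccessible(false);                             // restore access
            }
            instance.postConfig();           // validate before anyone can see it
            published.put(name, instance);   // publish only after validation
            return instance;
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Failed to instantiate " + name, e);
        }
    }
}
```

If `postConfig` throws, the exception propagates before `published.put` runs, so the half-built instance is never stored and a later lookup starts from scratch.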

## Lifetime of a ConfigurationManager processing Options

This summarises the steps executed by a `ConfigurationManager` on construction
when parsing options.

1. A class implementing Options, with fields marked @Option for values that can be read in from the command line, is constructed.
2. A new ConfigurationManager is constructed, passing in the command line argument array, along with the options subclass.
3. The options subclass is validated and the usage statement constructed.
    - A valid options subclass has fields which are either options subclasses, or marked @Option with a type that is one of the supported configuration types mentioned above.
    - It also requires that the charName and longName of the option fields are unique in this particular instantiation, including all fields on nested options subclasses.
4. The arguments are checked for the "help" or "usage" arguments. If either is present, a UsageException is thrown carrying the usage statement.
5. The arguments are checked for any which specify a config file format (i.e. a fully qualified class name which implements FileFormatFactory). The format factories are instantiated by calling their no-args constructor and passed to the ConfigurationManager's addFileFormatFactory method.
6. The arguments are checked for the configuration file list argument, and any files found are processed by the appropriate ConfigLoader subclass.
7. The arguments are checked for arguments starting "--@" and the resulting "<object-name>.<field-name> <value>" tuples are written into the configuration, overwriting or adding to what was loaded from the configuration files.
8. All arguments which match a charName or longName in the supplied options instance (and any nested options instances) have their values parsed (using the same logic as the configuration system), and assigned to the marked field. This can trigger object instantiation from the configuration: any fields which are subtypes of Configurable or collections of Configurable treat the supplied value as a configurable object name and look it up in the configuration.
9. All remaining unparsed arguments are stored in a String array for inspection from the ConfigurationManager.
10. The ConfigurationManager constructor returns.
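
Steps 7 and 9 can be illustrated with a short stand-alone parser. This is a sketch under assumptions, not OLCUT's argument handling: the `OverrideSketch` class and its `Parsed` result type are hypothetical, the object and field names are split at the first dot, and every argument that is not a `--@` override is simply treated as unparsed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class OverrideSketch {
    /** Hypothetical result type: per-object field overrides plus leftover arguments. */
    static final class Parsed {
        final Map<String, Map<String, String>> overrides = new LinkedHashMap<>();
        final List<String> remaining = new ArrayList<>();
    }

    /** Extracts "--@<object-name>.<field-name> <value>" tuples from the arguments. */
    static Parsed parse(String[] args) {
        Parsed parsed = new Parsed();
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            if (arg.startsWith("--@")) {
                String target = arg.substring(3);
                int dot = target.indexOf('.');
                boolean malformed = dot <= 0 || dot == target.length() - 1;
                if (malformed || i + 1 >= args.length) {
                    throw new IllegalArgumentException("Malformed override: " + arg);
                }
                String objectName = target.substring(0, dot);
                String fieldName = target.substring(dot + 1);
                // Later overrides of the same field replace earlier ones.
                parsed.overrides
                      .computeIfAbsent(objectName, k -> new LinkedHashMap<>())
                      .put(fieldName, args[++i]);
            } else {
                parsed.remaining.add(arg);
            }
        }
        return parsed;
    }
}
```

For example, parsing `--@myPipeline.numThreads 4 input.txt` records an override of `numThreads` on the `myPipeline` object and leaves `input.txt` among the remaining arguments.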

docs/Security.md

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
# Security Considerations

OLCUT is a library which performs dependency injection, configuration, command
line argument processing, and includes a full interactive CLI. However all of
these mechanisms are under the control of the developer incorporating OLCUT
into their system. For the command shell, it's important to prevent
unauthorised users from executing any command shells present in a program. For
the configuration system it's important to use the most specific types possible
for configurable fields, as this reduces the scope of values that a malicious
configuration file could insert into a program, and to properly validate field
values in the `Configurable` subclass's `postConfig` method (as you would in a
normal constructor).

The configuration system is designed to prevent cyclic references, as objects
are only published once they have been fully constructed and their `postConfig`
methods have executed. It should be impossible to see a half constructed object
in any place other than that object's own `postConfig` method. If you find a
way, please raise an issue using the appropriate channels.

## Serialized files

OLCUT allows the loading of Java serialized objects as specified in the
configuration files. Due to the inherent issues with Java serialization, these
object files should be stored in trusted locations where third parties do not
have access. We recommend at minimum using a [JEP
290](https://openjdk.java.net/jeps/290) allowlist to prevent users from
deserializing arbitrary code. As OLCUT targets Java 8 and the allowlist is an
API level feature only from Java 9 onwards, we recommend using a process level
allowlist specified at JVM startup time. When OLCUT migrates to a newer Java
version, we will include API support for specifying the allowlist in a
configuration file.
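
As an illustration of what a JEP 290 allowlist does, the sketch below applies a pattern filter to a single stream using the `java.io.ObjectInputFilter` API available from Java 9. It is not part of OLCUT, and the class names and the pattern are assumptions chosen for the example. On a Java 8 process the same pattern syntax would instead be supplied at JVM startup through the `jdk.serialFilter` system property.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class SerialFilterSketch {
    /** Hypothetical class that is not on the allowlist. */
    static final class Untrusted implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    static byte[] serialize(Serializable value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads one object, rejecting any class the pattern does not allow. */
    static Object readWithAllowlist(byte[] data, String pattern)
            throws IOException, ClassNotFoundException {
        ObjectInputFilter filter = ObjectInputFilter.Config.createFilter(pattern);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.setObjectInputFilter(filter); // must be set before the first read
            return in.readObject();
        }
    }

    /** Returns true if the value survives a round trip under the given allowlist. */
    static boolean isAccepted(Serializable value, String pattern) {
        try {
            readWithAllowlist(serialize(value), pattern);
            return true;
        } catch (InvalidClassException e) {
            return false; // the filter rejected a class in the stream
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With the pattern `java.util.*;java.lang.*;!*`, classes in those two packages are allowed and the trailing `!*` rejects everything else, so a serialized `ArrayList` of strings is accepted while `Untrusted` is refused.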

## Threat model

As a library incorporated into other programs, OLCUT expects its inputs to be
checked by the wider program, and to have the locations it reads from
controlled appropriately. Below we discuss a few threats specific to how OLCUT
operates.

| Threat name                  | Description | Exposed assets              | Mitigations |
|------------------------------|-------------|-----------------------------|-------------|
| Malicious configuration file | Configuration files can contain arbitrarily large trees, and could cause DoS issues. Alternatively the configuration files could be arbitrarily large and cause DoS issues when being read. | None | The developer controls which fields are configurable, and should ensure that the configurable structure is always bounded by not allowing arbitrary recursion in configurable fields. Additionally OLCUT's config storage format is flat, and will not allow circular references between configured objects. To mitigate the large config file issue we could limit the file size that is allowed to be read, though this is trickier when loading configuration from jar files. |
| Configuration saving | Configuration files can be generated from the state of existing configurable objects (under developer control). This program state could potentially include sensitive information. | Sensitive program state | Do not mark fields @Config if they could contain sensitive information. Alternatively if it is critical they be configurable for the program to operate, then mark them @Config(redact=true), which will prevent the sensitive fields from being written out in configuration or provenance. Finally if the information must be saved, then ensure that the file is written to secure storage. |
| Provenance capture | Provenance tracks the state and construction path of objects by inspecting their fields. Each provenance object is either implemented by the developer, or autogenerated from a configurable object. If the object state contains sensitive information this will persist in the provenance. | Sensitive program state | The mitigations for configuration saving apply, along with an additional one which is to not store sensitive non-configurable fields in the provenance object implemented by the developer. |
| Imprecise Config annotations | The @Config annotation can be applied to a range of Java primitives and immutable classes, along with things which implement Configurable. If the developer designs a class hierarchy where the configurable fields have relaxed type boundaries (i.e. the field is of type Configurable rather than the specific type Foo) then it allows malicious configuration files to inject any other class that implements Configurable that is available on the class path. This could result in an invalid program state if not properly checked, and potentially result in unexpected behaviour as the postConfig methods are executed on the Configurable objects. | Program state and execution | @Config and @Option annotations should use the most specific field type available to them. postConfig methods should properly validate their arguments like constructors do, and be written defensively with respect to odd inputs. |

## Java Security Manager

The configuration and provenance systems use reflection to construct and
inspect classes, so when running with a Java security manager you need to
give the OLCUT jar appropriate permissions. We have tested this set of
permissions which allows the configuration and provenance systems to work:

```
// OLCUT permissions
grant codeBase "file:/path/to/olcut/olcut-core.jar" {
    permission java.lang.RuntimePermission "accessDeclaredMembers";
    permission java.lang.reflect.ReflectPermission "suppressAccessChecks";
    permission java.util.logging.LoggingPermission "control";
    permission java.io.FilePermission "<<ALL FILES>>", "read";
    permission java.util.PropertyPermission "*", "read,write";
};
```

The read FilePermission can be restricted to the jars which contain
configuration files, configuration files on disk, and the locations of
serialised objects. The one here provides access to the complete filesystem, as
the necessary read locations are program specific. If you need to save an OLCUT
configuration out then you will also need to add write permissions for the save
location.
