-
Notifications
You must be signed in to change notification settings - Fork 113
Design of the HNix code base
Module connections give the idea of the project design & structure:
Update: (2020-12-23)
Update: (2020-12-23)
Update: (2021-02-02)
Update: (2021-02-02)
Welcome to the HNix code! You may notice some strange things as you venture into this realm, so this document is meant to prepare you, dear reader, for the secrets of the labyrinth as we've designed them.
The first thing to note is that HNix was primarily designed so that Haskell authors could use it to craft custom tooling around the Nix ecosystem. Thus, it was never fully intended for just the end user. As a result, we use a great deal of abstraction so that these enterprising Haskell authors may add new behavior at various points: within the type hierarchy, the value representation, and the evaluator.
To this end, you'll see a lot of type variables floating around, almost everywhere. These provide many of the "injection points" mentioned above. There is a strict convention followed for the naming of these variables, the lexicon for which is stated here.
HNix uses recursion schemes in the code part of the project to express the recursive nature of Nix language.
*F
type is F
/f
(Functor) for its particular F-algebra.
F-algebra in Haskell:
(quoting Oleg GrenRus)
The core idea ... is that instead of writing recursive functions on a recursive datatype, we prefer to write non-recursive functions on a related, non-recursive datatype we call the "base functor".
(quoting Gabriel Gonzales)
Actually, even in Haskell recursion is not completely first-class because the compiler does a terrible job of optimizing recursive code. This is why F-algebras and F-coalgebras are pervasive in high-performance Haskell libraries like vector, because they transform recursive code to non-recursive code, and the compiler does an amazing job of optimizing non-recursive code.
2018-04: Johns "Program Reduction: A Win for Recursion Schemes during HNix creation.
-
NValueF
are "base functor data type's for Nix typesNValue' = (NValueF (NValue))
: -
NValue'
which during evaluation runtime becomeNValue = Free NValue'
): -
NValue
Nix types.
Since:
-
NValueF (NValue)
produces: NValueF (NValue) -> NValue' -> NValue
- Transitivity:
NValueF (NValue) -> NValue
- Which is a F-algebra.
So it is real-deal F-algebra. It implements Nix language in the most direct way from Haskell, Nix becomes a DSL of Haskell from now on. With Haskell type-checking system. Moreover, F-algebra is very convenient implementation - it is an endlessly conveniently transformable thing through just several basic operations, as would be seen further. The transformable thing that is fully recursive in its nature, but from now on that does not bother us anymore, since it is under control.
NValue
can be met everywhere in evaluation operations.
There are NValue* t f m
and such, where:
t
is the type of thunks. It turns out that HNix doesn't actually need to
know how thunks are represented, at all. It only needs to know that the
interface can be honored: pending action that yield values may be turned into
thunks, and thunks can later be forced into values.
f
is the type of a comonadic and applicative functor that gets injected at
every level of a value's recursive structure. In the standard evaluation
scheme, this is used to provide "provenance" information to track which
expression context a value originated from (e.g., this 10 came from that
expression "5 + 5" over in this file, here).
m
is the "monad of evaluation", which must support at least the features
required by the evaluator. The fact that the user can evaluate in his own base
monad makes it possible to create custom builtins that make use of arbitrary
effects.
v
is the type of values, which is almost always going to be NValue t f m
,
though it could be NValueNF t f m
, the type of normal form values. Very few
points in the code are generic over both.
Base funtor data types (NValueF a
) also call "unfixed".
Morphim that is transformations between 2 F-algebras called recursion scheme.
So recursion schemes are just simply different operations on initial F-algebra. Why it is said as just any operations? Because any f a
(of kind >1) - is a functor, so any F a b c d..
to infinite complexity - is also a functor. And the only remained case is kind 1 type. And surprise - it is isomorphic to the Id a
, and that is a functor, so from 1-type F-algebra also can be constructed: Id a -> a
, and as regular F-algebra they are the first-class citizen for recursion schemes, there is no longer any difference for us.
Recursion scheme - transformation takes the initial(term) F-algebra and returns terminal(term) F-algebra. Folds, unfolds, traverses are just recursion scheme transformation. For folds, unfolds, traverses we need to provide the implementation of how particularly fold, unfold, traverse, and we do for the recursion scheme just the same.
Only the core Nix types are described through the use of the *F
(data NValueF
) they all to the one are here: https://github.com/haskell-nix/hnix/blob/573c035c80f0dce6a90732eca522de883c6ded11/src/Nix/Value.hs#L48-L84
data NValueF p m r -- Functors for their F-algebras
= NVConstantF NAtom
| NVStrF NixString
| NVPathF FilePath
| NVListF [r]
| NVSetF (AttrSet r) (AttrSet SourcePos)
| NVClosureF (Params ()) (p -> m r)
| NVBuiltinF String (p -> m r)
-- | The Nix values in WHNF (Weak head normal form), or "head normal form",
-- only the outer layer is evaluated to its normalized form, the rest of the body stays in a lazy form,
-- lazy form of the data called thunk.
newtype NValue' t f m a =
NValue
{ _nValue :: f (NValueF (NValue t f m) m a) -- Applying F-algebra functor to the F-algebra carrier, aka `F (A)` OR `f a`
}
-- | Nix language is fully recursive, so some recursions may never terminate.
-- The 'Free' structure is used here to represent the possibility that
-- cycles may appear during normalization.
type NValue t f m = Free (NValue' t f m) t
For example in parsers - are so called "unfixed" types for/from recursion schemes (common recursive operations), these types are a generalization of the recursive data types into open (non-explicitly recursive, but allowed through polymorphism) form. So Fix
ed data type versions - are (equivalent to/are) regular data types to work with (aka list), and the *F
- are their generalizations (one layer open for possible recursion (through self-application) - but without explicit recursion). *F
is a type that both can become recursive structure by applying Fix
to it, and as a single layer of the structure - internal value/data can be extracted from it in the initial form. For example - from the list cell the value extracts in the initial form, and it is impossible to extract initial data from the list (recursive structure) without referring to list cells (without referring to building blocks - the recursive structure can transform only into the recursive structure - which never terminates into receiving initial data), so addressing the cell in a list - is a unFix
of a list to the outermost cell, and from the cell, the data can be extracted.
Fixed types - are the actual presentation of the data types.