mirror of
https://github.com/NixOS/nix
synced 2025-07-10 04:43:53 +02:00
Expand manual on derivation outputs
Note, this includes some text adapted from Eelco's dissertation
parent
31923aaac0
commit
2aa6e0f084
12 changed files with 508 additions and 174 deletions
297
doc/manual/source/store/derivation/index.md
Normal file
|
@ -0,0 +1,297 @@
|
|||
# Store Derivation and Deriving Path
|
||||
|
||||
Besides functioning as a [content addressed store], the Nix store layer works as a [build system].
|
||||
Other systems (like Git or IPFS) also store and transfer immutable data, but they don't concern themselves with *how* that data was created.
|
||||
|
||||
This is where Nix distinguishes itself.
|
||||
*Derivations* represent individual build steps, and *deriving paths* are needed to refer to the *outputs* of those build steps before they are built.
|
||||
<!-- The two concepts need to be introduced together because, as described below, each depends on the other. -->
|
||||
|
||||
## Store Derivation {#store-derivation}
|
||||
|
||||
A derivation is a specification for running an executable on precisely defined input to produce one or more [store objects][store object].
|
||||
These store objects are known as the derivation's *outputs*.
|
||||
|
||||
Derivations are *built*: a process is spawned according to the spec, and when it exits, it is required to have left behind files which will (after post-processing) become the outputs of the derivation.
|
||||
This process is described in detail in [Building](@docroot@/store/building.md).
|
||||
|
||||
<!--
|
||||
Some of these things are described directly below, but we envision that with more material the exposition will probably want to migrate to separate pages beneath this.
|
||||
See outputs spec for an example of this one that migrated to its own page.
|
||||
-->
|
||||
|
||||
A derivation consists of:
|
||||
|
||||
- A name
|
||||
|
||||
- An [inputs specification][inputs], a set of [deriving paths][deriving path]
|
||||
|
||||
- An [outputs specification][outputs], specifying which outputs should be produced, and various metadata about them.
|
||||
|
||||
- The ["system" type][system] (e.g. `x86_64-linux`) where the executable is to run.
|
||||
|
||||
- The [process creation fields]: to spawn the arbitrary process which will perform the build step.
|
||||
|
||||
[store derivation]: #store-derivation
|
||||
[inputs]: #inputs
|
||||
[input]: #inputs
|
||||
[outputs]: ./outputs/index.md
|
||||
[output]: ./outputs/index.md
|
||||
[process creation fields]: #process-creation-fields
|
||||
[builder]: #builder
|
||||
[args]: #args
|
||||
[env]: #env
|
||||
[system]: #system
|
||||
|
||||
### Referencing derivations {#derivation-path}
|
||||
|
||||
Derivations are always referred to by the [store path] of the store object they are encoded to.
|
||||
See the [encoding section](#derivation-encoding) for more details on how this encoding works, and thus exactly what store path we would end up with for a given derivation.
|
||||
|
||||
The store path of the store object which encodes a derivation is often called a *derivation path* for brevity.
|
||||
|
||||
## Deriving path {#deriving-path}
|
||||
|
||||
Deriving paths are a way to refer to [store objects][store object] that may or may not yet be [realised][realise].
|
||||
There are two forms:
|
||||
|
||||
- [*constant*]{#deriving-path-constant}: just a [store path].
|
||||
It can be made [valid][validity] by copying it into the store: from the evaluator, command line interface or another store.
|
||||
|
||||
- [*output*]{#deriving-path-output}: a pair of a [store path] to a [store derivation] and an [output] name.
|
||||
|
||||
In pseudo code:
|
||||
|
||||
```typescript
|
||||
type OutputName = String;
|
||||
|
||||
type ConstantPath = {
  path: StorePath;
};

type OutputPath = {
  drvPath: StorePath;
  output: OutputName;
};
|
||||
|
||||
type DerivingPath = ConstantPath | OutputPath;
|
||||
```
|
||||
|
||||
Deriving paths are necessary because, in general and particularly for [content-addressing derivations][content-addressing derivation], the [store path] of an [output] is not known in advance.
|
||||
We can use an output deriving path to refer to such an output, instead of the store path which we do not yet know.
|
||||
|
||||
[deriving path]: #deriving-path
|
||||
[validity]: @docroot@/glossary.md#gloss-validity
|
||||
|
||||
## Parts of a derivation
|
||||
|
||||
A derivation is constructed from the parts documented in the following subsections.
|
||||
|
||||
### Inputs {#inputs}
|
||||
|
||||
The inputs are a set of [deriving paths][deriving path], referring to all store objects needed in order to perform this build step.
|
||||
|
||||
The [process creation fields] will presumably include many [store paths][store path]:
|
||||
|
||||
- The path to the executable normally starts with a store path
|
||||
- The arguments and environment variables likely contain many other store paths.
|
||||
|
||||
But rather than somehow scanning all the other fields for inputs, Nix requires that all inputs be explicitly collected in the inputs field. It is the responsibility of the creator of a derivation (e.g. the evaluator) to ensure that every store object referenced in another field (e.g. referenced by store path) is included in this inputs field.
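
The responsibility described above can be illustrated with a small check that a derivation's creator might run. This is only a sketch under stated assumptions: the `ProcessFields` shape, the helper names, the `/nix/store` store directory, and the example store paths are all hypothetical, and only constant inputs are considered.

```typescript
// Hypothetical, simplified shapes for illustration; not Nix's real data model.
type StorePath = string;

interface ProcessFields {
  builder: string;
  args: string[];
  env: Record<string, string>;
}

// Assumes the store directory is /nix/store and a 32-character hash part.
const STORE_PATH = /\/nix\/store\/[0-9a-z]{32}-[0-9A-Za-z+\-._?=]+/g;

// Collect every distinct store path mentioned in the process creation fields.
function referencedStorePaths(fields: ProcessFields): StorePath[] {
  const envValues = Object.keys(fields.env).map((key) => fields.env[key]);
  const strings = [fields.builder].concat(fields.args, envValues);
  const found: StorePath[] = [];
  for (const s of strings) {
    const matches: string[] = s.match(STORE_PATH) || [];
    for (const match of matches) {
      if (found.indexOf(match) === -1) {
        found.push(match);
      }
    }
  }
  return found;
}

// Store paths that are referenced but not declared as (constant) inputs.
function missingInputs(fields: ProcessFields, inputs: StorePath[]): StorePath[] {
  return referencedStorePaths(fields).filter((p) => inputs.indexOf(p) === -1);
}

// Made-up store paths, for illustration only.
const bash = "/nix/store/lxrn8v5aamkikg6agxwdqd1jz7746wz4-bash-5.2";
const helloSrc = "/nix/store/lxrn8v5aamkikg6agxwdqd1jz7746wz4-hello-src";
const exampleFields: ProcessFields = {
  builder: bash + "/bin/bash",
  args: ["-c", "build"],
  env: { src: helloSrc },
};
const undeclared = missingInputs(exampleFields, [bash]);
```

Here `undeclared` holds the `hello-src` path, because it appears in `env` but was not listed among the inputs. Nix itself does not perform such a scan; the sketch only shows what the creator of a derivation has to guarantee.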
|
||||
|
||||
### System {#system}
|
||||
|
||||
The system type on which the [`builder`](#builder) executable is meant to be run.
|
||||
|
||||
A necessary condition for Nix to schedule a given derivation on some Nix instance is for the "system" of that derivation to match that instance's [`system` configuration option].
|
||||
|
||||
By putting the `system` in each derivation, Nix allows *heterogeneous* build plans, where not all steps can be run on the same machine or same sort of machine.
|
||||
Nix can schedule builds such that it automatically builds on other platforms by [forwarding build requests](@docroot@/advanced-topics/distributed-builds.md) to other Nix instances.
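
As a rough illustration of the scheduling condition, the following sketch picks a Nix instance for a derivation by comparing system strings. The `Derivation` and `NixInstance` shapes and the function names are hypothetical, and a real scheduler considers much more than this one necessary condition.

```typescript
// Hypothetical shapes for illustration; real Nix instances carry far more state.
interface Derivation {
  name: string;
  system: string;
}

interface NixInstance {
  label: string;
  system: string; // value of that instance's `system` setting
}

// Necessary (not sufficient) condition for scheduling a derivation on an instance.
function systemMatches(drv: Derivation, instance: NixInstance): boolean {
  return drv.system === instance.system;
}

// Pick the first instance whose system matches, or undefined if there is none.
function pickInstance(drv: Derivation, candidates: NixInstance[]): NixInstance | undefined {
  for (const instance of candidates) {
    if (systemMatches(drv, instance)) {
      return instance;
    }
  }
  return undefined;
}

const instances: NixInstance[] = [
  { label: "local", system: "x86_64-linux" },
  { label: "remote-mac", system: "aarch64-darwin" },
];
const chosen = pickInstance({ name: "hello", system: "aarch64-darwin" }, instances);
```

In this example the derivation cannot run on the local `x86_64-linux` instance, so the build request would be forwarded to the instance labelled `remote-mac`.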
|
||||
|
||||
[`system` configuration option]: @docroot@/command-ref/conf-file.md#conf-system
|
||||
|
||||
[content-addressing derivation]: @docroot@/glossary.md#gloss-content-addressing-derivation
|
||||
[realise]: @docroot@/glossary.md#gloss-realise
|
||||
[store object]: @docroot@/store/store-object.md
|
||||
[store path]: @docroot@/store/store-path.md
|
||||
|
||||
### Process creation fields {#process-creation-fields}
|
||||
|
||||
These are the three fields which describe how to spawn the process which (along with any of its own child processes) will perform the build.
|
||||
You may note that this has everything needed for an `execve` system call.
|
||||
|
||||
#### Builder {#builder}
|
||||
|
||||
This is the path to an executable that will perform the build and produce the [outputs].
|
||||
|
||||
#### Arguments {#args}
|
||||
|
||||
Command-line arguments to be passed to the [`builder`](#builder) executable.
|
||||
|
||||
Note that these are the arguments after the first argument.
|
||||
The first argument passed to the `builder` will be the value of `builder`, as per the usual convention on Unix.
|
||||
See [Wikipedia](https://en.wikipedia.org/wiki/Argv) for details.
|
||||
|
||||
#### Environment Variables {#env}
|
||||
|
||||
Environment variables which will be passed to the [builder](#builder) executable.
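
To make the correspondence with `execve` concrete, here is a minimal sketch that assembles the three conventional `execve` arguments from the process creation fields. The `ProcessFields` and `ExecveCall` shapes and the example values are assumptions made for illustration; this is not how Nix represents them internally.

```typescript
// Hypothetical shape of the three process creation fields.
interface ProcessFields {
  builder: string;
  args: string[];
  env: Record<string, string>;
}

// The three conventional arguments of an execve(path, argv, envp) call.
interface ExecveCall {
  path: string;
  argv: string[];
  envp: string[];
}

function toExecve(fields: ProcessFields): ExecveCall {
  return {
    path: fields.builder,
    // By Unix convention, argv[0] is the program itself; `args` follow it.
    argv: [fields.builder].concat(fields.args),
    // Environment entries are passed as "KEY=value" strings.
    envp: Object.keys(fields.env).map((key) => key + "=" + fields.env[key]),
  };
}

const execveCall = toExecve({
  builder: "/bin/sh",
  args: ["-c", "echo hello > $out"],
  env: { out: "/tmp/example-out" },
});
```

Note that `argv` has one more element than `args`, matching the remark above that the derivation's arguments are those after the first.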
|
||||
|
||||
### Placeholders
|
||||
|
||||
Placeholders are opaque values used within the [process creation fields] to refer to [store objects][store object] for which we don't yet know [store paths][store path].
|
||||
They are strings in the form `/<hash>` that are embedded anywhere within the strings of those fields. We are [considering](https://github.com/NixOS/nix/issues/12361) adding store-path-like placeholders.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> Output deriving paths exist to solve the same problem as placeholders: referring to store objects for which we don't yet know a store path.
|
||||
> They also have a string syntax with `^`, [described in the encoding section](#deriving-path-encoding).
|
||||
> We could use that syntax instead of `/<hash>` for placeholders, but its human-legibility would cause problems.
|
||||
|
||||
There are two types of placeholder, corresponding to the two cases where this problem arises:
|
||||
|
||||
- [Output placeholder]{#output-placeholder}:
|
||||
|
||||
This is a placeholder for a derivation's own output.
|
||||
|
||||
- [Input placeholder]{#input-placeholder}:
|
||||
|
||||
This is a placeholder for a derivation's non-constant [input], i.e. an input that is an [output deriving path](#deriving-path-output).
|
||||
|
||||
> **Explanation**
|
||||
>
|
||||
> In general, we need to [realise] a [store object] in order to be sure to have a store path for it.
|
||||
> But for these two cases this is either impossible or impractical:
|
||||
>
|
||||
> - In the output case this is impossible:
|
||||
>
|
||||
> We cannot build the output until we have a correct derivation, and we cannot have a correct derivation (without using placeholders) until we have the output path.
|
||||
>
|
||||
> - In the input case this is impractical:
|
||||
>
|
||||
> If we always build a dependency first, and then refer to its output by store path, we would lose the ability for a derivation graph to describe an entire build plan consisting of multiple build steps.
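
Once the store paths behind placeholders become known, the placeholders can be replaced textually wherever they occur. The sketch below shows that idea only; the function name, the placeholder value, and the store path are made up, and it does not reproduce how Nix computes or rewrites placeholders.

```typescript
// Replace every occurrence of each placeholder with the store path it stands for.
function substitutePlaceholders(text: string, resolved: Record<string, string>): string {
  let result = text;
  for (const placeholder of Object.keys(resolved)) {
    result = result.split(placeholder).join(resolved[placeholder]);
  }
  return result;
}

// Made-up placeholder of the `/<hash>` form and a made-up store path.
const outPlaceholder = "/0000000000000000000000000000000000000000000000000000";
const resolvedOut = "/nix/store/lxrn8v5aamkikg6agxwdqd1jz7746wz4-hello";
const table: Record<string, string> = {};
table[outPlaceholder] = resolvedOut;

const configured = substitutePlaceholders("--prefix=" + outPlaceholder + "/share", table);
```

Because placeholders are opaque strings embedded inside larger strings, plain substring replacement is enough for this illustration.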
|
||||
|
||||
## Encoding
|
||||
|
||||
### Derivation {#derivation-encoding}
|
||||
|
||||
There are two formats, documented separately:
|
||||
|
||||
- The legacy ["ATerm" format](@docroot@/protocols/derivation-aterm.md)
|
||||
|
||||
- The experimental, currently under development and changing [JSON format](@docroot@/protocols/json/derivation.md)
|
||||
|
||||
Every derivation has a canonical choice of encoding used to serialize it to a store object.
|
||||
This ensures that there is a canonical [store path] used to refer to the derivation, as described in [Referencing derivations](#derivation-path).
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> Currently, the canonical encoding for every derivation is the "ATerm" format,
|
||||
> but this is subject to change for types of derivations which are not yet stable.
|
||||
|
||||
Regardless of the format used, when serializing a derivation to a store object, that store object will be content-addressed.
|
||||
|
||||
In the common case, the inputs to store objects are either:
|
||||
|
||||
- [constant deriving paths](#deriving-path-constant) for content-addressed source objects, which are "initial inputs" rather than the outputs of some other derivation
|
||||
|
||||
- the outputs of other derivations
|
||||
|
||||
If those other derivations *also* abide by this common case (and likewise for transitive inputs), then the entire closure of the serialized derivation will be content-addressed.
|
||||
|
||||
### Deriving Path {#deriving-path-encoding}
|
||||
|
||||
- *constant*
|
||||
|
||||
Constant deriving paths are encoded simply as the underlying store path is.
|
||||
Thus, we see that every encoded store path is also a valid encoded (constant) deriving path.
|
||||
|
||||
- *output*
|
||||
|
||||
Output deriving paths are encoded by
|
||||
|
||||
- encoding of a store path referring to a derivation
|
||||
|
||||
- a `^` separator (or `!` in some legacy contexts)
|
||||
|
||||
- the name of an output of the previously referred derivation
|
||||
|
||||
> **Example**
|
||||
>
|
||||
> ```
|
||||
> /nix/store/lxrn8v5aamkikg6agxwdqd1jz7746wz4-firefox-98.0.2.drv^out
|
||||
> ```
|
||||
>
|
||||
> This parses like so:
|
||||
>
|
||||
> ```
|
||||
> /nix/store/lxrn8v5aamkikg6agxwdqd1jz7746wz4-firefox-98.0.2.drv^out
|
||||
> |------------------------------------------------------------| |-|
|
||||
> store path (usual encoding)                                    output name
>                                                           |--|
>                                                           note the ".drv"
|
||||
> ```
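
The encoding just described can be sketched in a few lines. The types mirror the pseudocode from the deriving path section, with `StorePath` simplified to a plain string; the function names are hypothetical, and the parser assumes that a store path never contains `^`.

```typescript
// Illustrative types mirroring the pseudocode above; not Nix's actual API.
type StorePath = string;
type OutputName = string;

type ConstantPath = { path: StorePath };
type OutputPath = { drvPath: StorePath; output: OutputName };
type DerivingPath = ConstantPath | OutputPath;

// Encode a deriving path as a string, using `^` as the separator.
function encodeDerivingPath(p: DerivingPath): string {
  return "drvPath" in p ? p.drvPath + "^" + p.output : p.path;
}

// Parse the string form back. Assumes store paths never contain `^`.
function parseDerivingPath(s: string): DerivingPath {
  const i = s.indexOf("^");
  if (i === -1) {
    return { path: s };
  }
  return { drvPath: s.slice(0, i), output: s.slice(i + 1) };
}

const firefox: DerivingPath = {
  drvPath: "/nix/store/lxrn8v5aamkikg6agxwdqd1jz7746wz4-firefox-98.0.2.drv",
  output: "out",
};
const encoded = encodeDerivingPath(firefox);
```

A string without `^` parses as a constant deriving path, which matches the observation that every encoded store path is also a valid encoded deriving path.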
|
||||
|
||||
## Extending the model to be higher-order
|
||||
|
||||
**Experimental feature**: [`dynamic-derivations`](@docroot@/development/experimental-features.md#xp-feature-dynamic-derivations)
|
||||
|
||||
So far, we have used store paths to refer to derivations.
|
||||
That works because we've implicitly assumed that all derivations are created *statically* --- created by some mechanism out of band, and then manually inserted into the store.
|
||||
But what if derivations could also be created dynamically within Nix?
|
||||
In other words, what if derivations could be the outputs of other derivations?
|
||||
|
||||
:::{.note}
|
||||
In the parlance of "Build Systems à la carte", we are generalizing the Nix store layer to be a "Monadic" instead of "Applicative" build system.
|
||||
:::
|
||||
|
||||
How should we refer to such derivations?
|
||||
A deriving path works, the same as how we refer to other derivation outputs.
|
||||
But what about a dynamic derivation's output?
|
||||
(i.e. how do we refer to the output of an output of a derivation?)
|
||||
For that we need to generalize the definition of deriving path, replacing the store path used to refer to the derivation with a nested deriving path:
|
||||
|
||||
```diff
|
||||
 type OutputPath = {
-  drvPath: StorePath;
+  drvPath: DerivingPath;
   output: OutputName;
 };
|
||||
```
|
||||
|
||||
Now, the `drvPath` field of `OutputPath` is itself a `DerivingPath` instead of a `StorePath`.
|
||||
|
||||
With that change, here is the updated definition:
|
||||
|
||||
```typescript
|
||||
type OutputName = String;
|
||||
|
||||
type ConstantPath = {
  path: StorePath;
};

type OutputPath = {
  drvPath: DerivingPath;
  output: OutputName;
};
|
||||
|
||||
type DerivingPath = ConstantPath | OutputPath;
|
||||
```
|
||||
|
||||
Under this extended model, `DerivingPath`s are thus inductively built up from a root `ConstantPath`, wrapped with zero or more outer `OutputPath`s.
|
||||
|
||||
### Encoding {#deriving-path-encoding}
|
||||
|
||||
The encoding is adjusted in the natural way, encoding the `drvPath` field recursively using the same deriving path encoding.
|
||||
The result of this is that it is possible to have a chain of `^<output-name>` at the end of the final string, as opposed to just a single one.
|
||||
|
||||
> **Example**
|
||||
>
|
||||
> ```
|
||||
> /nix/store/lxrn8v5aamkikg6agxwdqd1jz7746wz4-firefox-98.0.2.drv^foo.drv^bar.drv^out
|
||||
> |----------------------------------------------------------------------------| |-|
|
||||
> inner deriving path (usual encoding)                                           output name
> |--------------------------------------------------------------------| |-----|
> even more inner deriving path (usual encoding)                         output name
> |------------------------------------------------------------| |-----|
> innermost constant store path (usual encoding)                 output name
|
||||
> ```
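
The recursive encoding can be sketched as follows, under the same assumptions as before: `StorePath` is simplified to a string, the function names are hypothetical, and store paths and output names are assumed never to contain `^`.

```typescript
// Illustrative types for the extended (higher-order) model; not Nix's actual API.
type StorePath = string;
type OutputName = string;

type ConstantPath = { path: StorePath };
type OutputPath = { drvPath: DerivingPath; output: OutputName };
type DerivingPath = ConstantPath | OutputPath;

// Encode recursively: the inner deriving path first, then `^` and the output name.
function encodeNested(p: DerivingPath): string {
  return "drvPath" in p ? encodeNested(p.drvPath) + "^" + p.output : p.path;
}

// Parse by starting from the root constant path and wrapping it once per `^` segment.
function parseNested(s: string): DerivingPath {
  const parts = s.split("^");
  let result: DerivingPath = { path: parts[0] };
  for (let i = 1; i < parts.length; i++) {
    result = { drvPath: result, output: parts[i] };
  }
  return result;
}

const chain =
  "/nix/store/lxrn8v5aamkikg6agxwdqd1jz7746wz4-firefox-98.0.2.drv^foo.drv^bar.drv^out";
const nested = parseNested(chain);
```

The loop in `parseNested` reflects the inductive structure described above: a root `ConstantPath` wrapped with zero or more outer `OutputPath`s.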
|
192
doc/manual/source/store/derivation/outputs/content-address.md
Normal file
|
@ -0,0 +1,192 @@
|
|||
# Content-addressing derivation outputs
|
||||
|
||||
The content-addressing of an output only depends on that store object itself, not on any external information (such as how it was made, when it was made, etc.).
|
||||
As a consequence, a store object will be content-addressed the same way regardless of whether it was manually inserted into the store, outputted by some derivation, or outputted by some other derivation.
|
||||
|
||||
The output spec for a content-addressed output must contain the following field:
|
||||
|
||||
- *method*: how the data of the store object is digested into a content address
|
||||
|
||||
The possible choices of *method* are described in the [section on content-addressing store objects](@docroot@/store/store-object/content-address.md).
|
||||
Given the method, the output's name (computed from the derivation name and output spec mapping as described above), and the data of the store object, the output's store path will be computed as described in that section.
|
||||
|
||||
## Fixed-output content-addressing {#fixed}
|
||||
|
||||
In this case the content address of the output is *fixed* in advance by the derivation itself.
|
||||
In other words, when the derivation has finished [building](@docroot@/store/building.md) and the provisional output's content address is computed as part of the process to turn it into a *bona fide* store object, the calculated content address must match that given in the derivation, or the build of that derivation will be deemed a failure.
|
||||
|
||||
The output spec for an output with a fixed content address additionally contains:
|
||||
|
||||
- *hash*, the hash expected from digesting the store object's file system objects.
|
||||
This hash may be of a freely-chosen hash algorithm (that Nix supports).
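
The check at the end of a fixed-output build can be pictured as a simple comparison. In this sketch the `FixedOutputSpec` shape follows the fields named above, while `checkFixedOutput`, `CheckResult`, and the second hash value are hypothetical; the computed hash is assumed to come from digesting the provisional output with the spec's method and algorithm.

```typescript
// Fields of a fixed content-addressed output spec, as described above.
interface FixedOutputSpec {
  method: string;
  hashAlgo: string;
  hash: string;
}

type CheckResult = { ok: true } | { ok: false; reason: string };

// Compare the hash computed from the provisional output with the expected one.
function checkFixedOutput(spec: FixedOutputSpec, computedHash: string): CheckResult {
  if (computedHash === spec.hash) {
    return { ok: true };
  }
  return {
    ok: false,
    reason: "hash mismatch: expected " + spec.hash + ", got " + computedHash,
  };
}

const fixedSpec: FixedOutputSpec = {
  method: "nar",
  hashAlgo: "sha256",
  hash: "1md7jsfd8pa45z73bz1kszpp01yw6x5ljkjk2hx7wl800any6465",
};
const matching = checkFixedOutput(fixedSpec, fixedSpec.hash);
// A made-up hash standing in for unexpected content.
const mismatching = checkFixedOutput(fixedSpec, "0000000000000000000000000000000000000000000000000000");
```

A mismatch corresponds to the build of the derivation being deemed a failure.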
|
||||
|
||||
> **Design note**
|
||||
>
|
||||
> In principle, the output spec could also specify the references the store object should have, since the references and file system objects are equally parts of a content-addressed store object proper that contribute to its content address.
|
||||
> However, at this time, this is not done, because all fixed content-addressed outputs are required to have no references (including no self-reference).
|
||||
>
|
||||
> Also in principle, rather than specifying the references and file system object data with separate hashes, a single hash that constrains both could be used.
|
||||
> This could be done with the final store path's digest, or better yet, the hash that will become the store path's digest before it is truncated.
|
||||
>
|
||||
> These possible future extensions are included to elucidate the core property of fixed-output content addressing --- that all parts of the output must be cryptographically fixed with one or more hashes --- separate from the particulars of the currently-supported store object content-addressing schemes.
|
||||
|
||||
### Design rationale
|
||||
|
||||
What is the purpose of fixing an output's content address in advance?
|
||||
In abstract terms, the answer is carefully controlled impurity.
|
||||
Unlike a regular derivation, the [builder] executable of a derivation that produces fixed outputs has access to the network.
|
||||
The outputs' guaranteed content-addresses are supposed to mitigate the risk of the builder being given these capabilities;
|
||||
regardless of what the builder does *during* the build, it cannot influence downstream builds in unanticipated ways because all information it passes downstream flows through the outputs whose content-addresses are fixed.
|
||||
|
||||
[builder]: @docroot@/store/derivation/index.md#builder
|
||||
|
||||
In concrete terms, the purpose of this feature is fetching fixed input data like source code from the network.
|
||||
For example, consider a family of "fetch URL" derivations.
|
||||
These derivations download files from a given URL.
|
||||
To ensure that the downloaded file has not been modified, each derivation must also specify a cryptographic hash of the file.
|
||||
For example,
|
||||
|
||||
```jsonc
|
||||
{
  "outputs": {
    "out": {
      "method": "nar",
      "hashAlgo": "sha256",
      "hash": "1md7jsfd8pa45z73bz1kszpp01yw6x5ljkjk2hx7wl800any6465",
    },
  },
  "env": {
    "url": "http://ftp.gnu.org/pub/gnu/hello/hello-2.1.1.tar.gz"
    // ...
  },
  // ...
}
|
||||
```
|
||||
|
||||
It sometimes happens that the URL of the file changes,
|
||||
e.g., because servers are reorganised or no longer available.
|
||||
In these cases, we then must update the call to `fetchurl`, e.g.,
|
||||
|
||||
```diff
|
||||
"env": {
|
||||
- "url": "http://ftp.gnu.org/pub/gnu/hello/hello-2.1.1.tar.gz"
|
||||
+ "url": "ftp://ftp.nluug.nl/pub/gnu/hello/hello-2.1.1.tar.gz"
|
||||
// ...
|
||||
},
|
||||
```
|
||||
|
||||
If a `fetchurl` derivation's outputs were [input-addressed][input addressing], the output paths of the derivation and of *all derivations depending on it* would change.
|
||||
For instance, if we were to change the URL of the Glibc source distribution in Nixpkgs (a package on which almost all other packages depend on Linux), massive rebuilds would be needed.
|
||||
This is unfortunate for a change which we know cannot have a real effect as it propagates upwards through the dependency graph.
|
||||
|
||||
For content-addressed outputs (fixed or floating), on the other hand, the outputs' store path only depends on the derivation's name, data, and the `method` of the outputs' specs.
|
||||
The rest of the derivation is ignored for the purpose of computing the output path.
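
The difference can be illustrated with two stand-in functions. This sketch deliberately does not reproduce Nix's store path computation: `inputAddressedKey` and `contentAddressedKey` are hypothetical placeholders that only capture *which parts* of a simplified derivation each scheme depends on.

```typescript
// Hypothetical, heavily simplified model of a "fetch URL" derivation.
interface FetchDerivation {
  name: string;
  output: { method: string; hashAlgo: string; hash: string };
  env: Record<string, string>;
}

// Stand-in for an input-addressed output path: a function of the derivation,
// including `env` (and therefore the URL).
function inputAddressedKey(drv: FetchDerivation): string {
  return JSON.stringify([drv.name, drv.env]);
}

// Stand-in for a content-addressed output path: only the name, the method,
// and the (hash of the) data matter.
function contentAddressedKey(drv: FetchDerivation): string {
  return JSON.stringify([drv.name, drv.output.method, drv.output.hashAlgo, drv.output.hash]);
}

function fetchHello(url: string): FetchDerivation {
  return {
    name: "hello-2.1.1.tar.gz",
    output: {
      method: "nar",
      hashAlgo: "sha256",
      hash: "1md7jsfd8pa45z73bz1kszpp01yw6x5ljkjk2hx7wl800any6465",
    },
    env: { url: url },
  };
}

const oldMirror = fetchHello("http://ftp.gnu.org/pub/gnu/hello/hello-2.1.1.tar.gz");
const newMirror = fetchHello("ftp://ftp.nluug.nl/pub/gnu/hello/hello-2.1.1.tar.gz");
```

Changing the URL changes `inputAddressedKey` but leaves `contentAddressedKey` untouched, which is why a mirror change need not trigger rebuilds of everything downstream.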
|
||||
|
||||
> **History Note**
|
||||
>
|
||||
> Fixed content-addressing is especially important both today and historically as the *only* form of content-addressing that is stabilized.
|
||||
> This is why the rationale above contrasts it with [input addressing].
|
||||
|
||||
## (Floating) Content-Addressing {#floating}
|
||||
|
||||
> **Warning**
|
||||
> This is part of an [experimental feature](@docroot@/development/experimental-features.md).
|
||||
>
|
||||
> To use this type of output addressing, you must enable the
|
||||
> [`ca-derivations`][xp-feature-ca-derivations] experimental feature.
|
||||
> For example, in [nix.conf](@docroot@/command-ref/conf-file.md) you could add:
|
||||
>
|
||||
> ```
|
||||
> extra-experimental-features = ca-derivations
|
||||
> ```
|
||||
|
||||
With this experimental feature enabled, derivation outputs can also be content-addressed *without* fixing in the output spec what the outputs' content address must be.
|
||||
|
||||
### Purity
|
||||
|
||||
Because the derivation output is not fixed (just like with [input addressing]), the [builder] is not given any impure capabilities [^purity].
|
||||
|
||||
> **Configuration note**
|
||||
>
|
||||
> Strictly speaking, the extent to which sandboxing and deprivileging is possible varies with the environment Nix is running in.
|
||||
> Nix's configuration settings indicate what level of sandboxing is required or enabled.
|
||||
> Builds of derivations will fail if they request an absence of sandboxing which is not allowed.
|
||||
> Builds of derivations will also fail if the level of sandboxing specified in the configuration exceeds what is possible in the given environment.
|
||||
>
|
||||
> (The "environment", in this case, consists of attributes such as the operating system Nix runs atop, along with the operating-system-specific privileges that Nix has been granted.
|
||||
> Because of how conventional operating systems like macOS, Linux, etc. work, granting builders *fewer* privileges may ironically require that Nix be run with *more* privileges.)
|
||||
|
||||
That said, derivations producing floating content-addressed outputs may declare their builders as impure (like the builders of derivations producing fixed outputs).
|
||||
This is provisionally supported as part of the [`impure-derivations`][xp-feature-impure-derivations] experimental feature.
|
||||
|
||||
### Compatibility negotiation
|
||||
|
||||
Any derivation producing a floating content-addressed output implicitly requires the `ca-derivations` [system feature](@docroot@/command-ref/conf-file.md#conf-system-features).
|
||||
This prevents scheduling the building of the derivation on a machine without the experimental feature enabled.
|
||||
Even once the experimental feature is stabilized, this is still useful in order to allow using remote builders running older versions of Nix, or alternative implementations that do not support floating content addressing.
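
The negotiation amounts to a subset check between the features a derivation requires and those a machine advertises. The sketch below assumes a hypothetical `Derivation` shape with an explicit `outputKind` tag; only the implicit `ca-derivations` requirement is taken from the text above.

```typescript
// Hypothetical, simplified model for illustration.
type OutputKind = "input-addressed" | "fixed" | "floating";

interface Derivation {
  name: string;
  outputKind: OutputKind;
  requiredSystemFeatures: string[];
}

// Floating content-addressed outputs implicitly add the `ca-derivations` requirement.
function effectiveRequiredFeatures(drv: Derivation): string[] {
  const features = drv.requiredSystemFeatures.slice();
  if (drv.outputKind === "floating" && features.indexOf("ca-derivations") === -1) {
    features.push("ca-derivations");
  }
  return features;
}

// A machine is eligible only if it supports every required feature.
function supportsAll(required: string[], supported: string[]): boolean {
  return required.every((feature) => supported.indexOf(feature) !== -1);
}

const floatingDrv: Derivation = {
  name: "hello",
  outputKind: "floating",
  requiredSystemFeatures: ["big-parallel"],
};
const requiredFeatures = effectiveRequiredFeatures(floatingDrv);
const olderBuilder = ["big-parallel"];
const newerBuilder = ["big-parallel", "ca-derivations"];
```

With these values, the derivation would be schedulable on `newerBuilder` but not on `olderBuilder`.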
|
||||
|
||||
### Determinism
|
||||
|
||||
In the earlier [discussion of how self-references are handled when content-addressing store objects](@docroot@/store/store-object/content-address.md#self-references), it was pointed out that methods of producing store objects ought to be deterministic regardless of the choice of provisional store path.
|
||||
For store objects produced by manual insertion into the store, the "method of production" is an informal concept: formally, Nix has no idea where the store object came from, and content-addressing is crucial in order to ensure that the store object is *intrinsically* tamper-proof.
|
||||
But for store objects produced by derivation, the "method of production" is quite formal: the whole point of derivations is to be a formal notion of building, after all.
|
||||
In this case, we can elevate this informal property to a formal one.
|
||||
|
||||
A *deterministic* content-addressing derivation should produce outputs with the same content addresses:
|
||||
|
||||
1. Every time the builder is run
|
||||
|
||||
This is because either the builder is completely sandboxed, or because any remaining impurities that leak inside the build sandbox are ignored by the builder and do not influence its behavior.
|
||||
|
||||
2. Regardless of the choice of any provisional output paths
|
||||
|
||||
Provisional store paths must be chosen for any output that has a self-reference.
|
||||
The choice of provisional store path can be thought of as an impurity, since it is an arbitrary choice.
|
||||
|
||||
If provisional output paths are deterministically chosen, we are in the first branch of part (1).
|
||||
The builder may vary the data it produces based on the provisional path in arbitrary ways, but this gets us closer to [input addressing].
|
||||
Deterministically choosing the provisional path may be considered "complete sandboxing" by removing an impurity, but this is unsatisfactory
|
||||
|
||||
<!--
|
||||
|
||||
TODO
|
||||
(Both these points will be expanded-upon below.)
|
||||
|
||||
-->
|
||||
|
||||
If provisional output paths are randomly chosen, we are in the second branch of part (1).
|
||||
The builder *must* not let the random input affect the final outputs it produces, and multiple builds may be performed and then compared in order to ensure that this is in fact the case.
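
Comparing multiple builds can be sketched as follows. A build is modelled, purely for illustration, as a function from the provisional output path to the content address of the result; the `Build` type, the function names, and the example values are hypothetical.

```typescript
// A build is modelled as a function from the provisional output path to the
// content address of what it produced. Hypothetical, for illustration only.
type Build = (provisionalPath: string) => string;

// Run the build once per provisional path and check that all results agree.
function agreesAcrossPaths(build: Build, provisionalPaths: string[]): boolean {
  const addresses = provisionalPaths.map((p) => build(p));
  return addresses.every((a) => a === addresses[0]);
}

// A well-behaved builder: the provisional path does not influence the result.
const ignoresPath: Build = () => "sha256:aaaa";

// A misbehaving builder: something derived from the provisional path leaks
// into the output (here, its length), so the content address varies.
const leaksPath: Build = (p) => "sha256:" + p.length;

const provisionalPaths = ["/nix/store/tmp-1-hello", "/nix/store/tmp-22-hello"];
const wellBehaved = agreesAcrossPaths(ignoresPath, provisionalPaths);
const misbehaving = agreesAcrossPaths(leaksPath, provisionalPaths);
```

Agreement across a handful of runs is evidence rather than proof of determinism, but disagreement is conclusive.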
|
||||
|
||||
### Floating versus Fixed
|
||||
|
||||
While the distinction between content- and input-addressing is one of *mechanism*, the distinction between fixed and floating content addressing is more one of *policy*.
|
||||
A fixed output that passes its content address check is just like a floating output.
|
||||
It is only in the potential for that check to fail that they are different.
|
||||
|
||||
> **Design Note**
|
||||
>
|
||||
> In a future world where floating content-addressing is also stable, we in principle no longer need separate [fixed](#fixed) content-addressing.
|
||||
> Instead, we could always use floating content-addressing, and separately assert the precise content address of a given store object to be used as an input (of another derivation).
|
||||
> A stand-alone assertion object of this sort is not yet implemented, but its possible creation is tracked in [Issue #11955](https://github.com/NixOS/nix/issues/11955).
|
||||
>
|
||||
> In the current version of Nix, fixed outputs which fail their hash check are still registered as valid store objects, just not registered as outputs of the derivation which produced them.
|
||||
> This is an optimization that means if the wrong output hash is specified in a derivation, and then the derivation is recreated with the right output hash, the derivation does not need to be rebuilt, which avoids downloading potentially large amounts of data twice.
|
||||
> This optimisation prefigures the design above:
|
||||
> If the output hash assertion were moved outside the derivation itself, Nix could not only register that outputted store object like today, but could also make note that the derivation did in fact successfully download some data.
|
||||
> For example, for the "fetch URL" example above, making such a note is tantamount to recording what data is available at the time of download at the given URL.
|
||||
> It would only be when Nix subsequently tries to build something with that (refining our example) downloaded source code that Nix would be forced to check the output hash assertion, preventing it from e.g. building compromised malware.
|
||||
>
|
||||
> Recapping, Nix would
|
||||
>
|
||||
> 1. successfully download data
|
||||
> 2. insert that data into the store
|
||||
> 3. associate (presumably with some sort of expiration policy) the downloaded data with the derivation that downloaded it
|
||||
>
|
||||
> But only use the downloaded store object in subsequent derivations that depended upon the assertion if the assertion passed.
|
||||
>
|
||||
> This possible future extension is included to illustrate this distinction:
|
||||
|
||||
[input addressing]: ./input-address.md
|
||||
[xp-feature-ca-derivations]: @docroot@/development/experimental-features.md#xp-feature-ca-derivations
|
||||
[xp-feature-git-hashing]: @docroot@/development/experimental-features.md#xp-feature-git-hashing
|
||||
[xp-feature-impure-derivations]: @docroot@/development/experimental-features.md#xp-feature-impure-derivations
|
97
doc/manual/source/store/derivation/outputs/index.md
Normal file
|
@ -0,0 +1,97 @@
|
|||
# Derivation Outputs and Types of Derivations
|
||||
|
||||
As stated on the [main page on derivations](../index.md#store-derivation),
|
||||
a derivation produces [store objects], which are known as the *outputs* of the derivation.
|
||||
Indeed, the entire point of derivations is to produce these outputs, and to reliably and reproducibly produce them each time the derivation is run.
|
||||
|
||||
One of the parts of a derivation is its *outputs specification*, which specifies certain information about the outputs the derivation produces when run.
|
||||
The outputs specification is a map, from names to specifications for individual outputs.
|
||||
|
||||
## Output Names {#outputs}
|
||||
|
||||
Output names can be any string which is also a valid [store path] name.
|
||||
The name mapped to each output specification is not actually the name of the output.
|
||||
In the general case, the output store object has the name `derivationName + "-" + outputSpecName`; no other metadata about it contributes to the name.
|
||||
However, an output spec named "out" describes an output store object whose name is just the derivation name.
|
||||
|
||||
> **Example**
|
||||
>
|
||||
> A derivation is named `hello`, and has two outputs, `out`, and `dev`
|
||||
>
|
||||
> - The derivation's path will be: `/nix/store/<hash>-hello.drv`.
|
||||
>
|
||||
> - The store path of `out` will be: `/nix/store/<hash>-hello`.
|
||||
>
|
||||
> - The store path of `dev` will be: `/nix/store/<hash>-hello-dev`.
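
The naming rule can be written down directly. The function name is hypothetical; the rule itself (including the special case for `out`) is the one stated above.

```typescript
// Name of the output store object, given the derivation name and the name
// under which the output's specification is mapped.
function outputStoreObjectName(derivationName: string, outputSpecName: string): string {
  return outputSpecName === "out"
    ? derivationName
    : derivationName + "-" + outputSpecName;
}

const outName = outputStoreObjectName("hello", "out");
const devName = outputStoreObjectName("hello", "dev");
```

For the `hello` example, `outName` is `hello` and `devName` is `hello-dev`, matching the store paths listed above.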
|
||||
|
||||
The outputs of a derivation are the [store objects][store object] it is obligated to produce.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> The formal terminology here is somewhat at odds with everyday communication in the Nix community today.
> "Output" in casual usage tends to refer either to the actual output store object, or to the notional output spec, depending on context.
|
||||
>
|
||||
> For example "hello's `dev` output" means the store object referred to by the store path `/nix/store/<hash>-hello-dev`.
|
||||
> It is unusual to call this the "`hello-dev` output", even though `hello-dev` is the actual name of that store object.
|
||||
|
||||
## Types of output addressing
|
||||
|
||||
The main information contained in an output specification is how the derivation output is addressed.
|
||||
In particular, the specification decides:
|
||||
|
||||
- whether the output is [content-addressed](./content-address.md) or [input-addressed](./input-address.md)
|
||||
|
||||
- if the output is content-addressed, how it is content-addressed
|
||||
|
||||
- if the output is content-addressed, [what its content address is](./content-address.md#fixed) (and thus what its [store path] is)
|
||||
|
||||
## Types of derivations
|
||||
|
||||
The sections on each type of derivation output addressing ended up discussing other attributes of the derivation besides its outputs, such as purity, scheduling, determinism, etc.
|
||||
This is no coincidence; the type of a derivation is in fact one-for-one with the type of its outputs:
|
||||
|
||||
- A derivation that produces *xyz-addressed* outputs is an *xyz-addressing* derivation.
|
||||
|
||||
The rules for this are fairly concise:
|
||||
|
||||
- All the outputs must be of the same type / use the same addressing
|
||||
|
||||
- The derivation must have at least one output
|
||||
|
||||
- Additionally, if the outputs are fixed content-addressed, there must be exactly one output, whose specification is mapped from the name `out`.
|
||||
(The name `out` is special, according to the rules described above.
|
||||
Having only one output and calling its specification `out` means the single output is effectively anonymous; the store path just has the derivation name.)
|
||||
|
||||
(This is an arbitrary restriction that could be lifted.)
|
||||
|
||||
- The output is either *fixed* or *floating*, indicating whether its store path is known prior to building it.
|
||||
|
||||
- With fixed content-addressing it is fixed.
|
||||
|
||||
> A *fixed content-addressing* derivation is also called a *fixed-output derivation*, since that is the only currently-implemented form of fixed-output addressing.
|
||||
|
||||
- With floating content-addressing it is floating. With input-addressing the store path is computed from the derivation itself, so it is likewise known before building.
|
||||
|
||||
> Thus, historically with Nix, with no experimental features enabled, *all* outputs are fixed.
|
||||
|
||||
- The derivation may be *pure* or *impure*, indicating what read access to the outside world the [builder](../index.md#builder) has.
|
||||
|
||||
- An input-addressing derivation *must* be pure.
|
||||
|
||||
> If it were impure, we would have a serious problem, because an input-addressed derivation always produces outputs with the same paths.
|
||||
|
||||
|
||||
- A content-addressing derivation may be pure or impure
|
||||
|
||||
- If it is impure, it may be fixed (typical), or it may be floating if the additional [`impure-derivations`][xp-feature-impure-derivations] experimental feature is enabled.
|
||||
|
||||
- If it is pure, it must be floating.
|
||||
|
||||
- Pure, fixed content-addressing derivations are not supported.
|
||||
|
||||
> There is no use for this fourth combination.
|
||||
> The sole purpose of an output's store path being fixed is to support the derivation being impure.
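
The rules in this section can be condensed into a small validity check. The following Python sketch is an illustration under assumptions, not Nix's implementation: the kind labels, the function names `derivation_type` and `check_purity`, and the `impure_derivations_enabled` flag are all hypothetical names introduced here.

```python
from typing import Dict

# Hypothetical labels for the three kinds of output addressing.
KINDS = {"input-addressed", "fixed-content-addressed", "floating-content-addressed"}


def derivation_type(outputs: Dict[str, str]) -> str:
    """Map {output spec name: addressing kind} to the derivation's type."""
    if not outputs:
        raise ValueError("a derivation must have at least one output")
    kinds = set(outputs.values())
    if not kinds <= KINDS:
        raise ValueError("unknown addressing kind: %s" % sorted(kinds - KINDS))
    if len(kinds) != 1:
        raise ValueError("all outputs must use the same addressing")
    (kind,) = kinds
    if kind == "fixed-content-addressed" and list(outputs) != ["out"]:
        raise ValueError("a fixed content-addressed derivation needs exactly one output, named 'out'")
    # xyz-addressed outputs make an xyz-addressing derivation.
    return kind.replace("addressed", "addressing")


def check_purity(kind: str, pure: bool, impure_derivations_enabled: bool = False) -> None:
    """Raise ValueError for combinations the rules above rule out."""
    if kind == "input-addressed" and not pure:
        raise ValueError("an input-addressing derivation must be pure")
    if kind == "fixed-content-addressed" and pure:
        raise ValueError("pure, fixed content-addressing derivations are not supported")
    if kind == "floating-content-addressed" and not pure and not impure_derivations_enabled:
        raise ValueError("an impure floating derivation needs the impure-derivations feature")


print(derivation_type({"out": "input-addressed", "dev": "input-addressed"}))
```

Keeping the structural rules (`derivation_type`) separate from the purity rules (`check_purity`) mirrors the two groups of bullets above; a real implementation may well organize this differently.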
|
||||
|
||||
[xp-feature-ca-derivations]: @docroot@/development/experimental-features.md#xp-feature-ca-derivations
|
||||
[xp-feature-git-hashing]: @docroot@/development/experimental-features.md#xp-feature-git-hashing
|
||||
[xp-feature-impure-derivations]: @docroot@/development/experimental-features.md#xp-feature-impure-derivations
|
31
doc/manual/source/store/derivation/outputs/input-address.md
Normal file
|
@ -0,0 +1,31 @@
|
|||
# Input-addressing derivation outputs
|
||||
|
||||
[input addressing]: #input-addressing
|
||||
|
||||
"Input addressing" means addressing the store object by the *way it was made* rather than *what it is*.
|
||||
That is to say, an input-addressed output's store path is a function not of the output itself, but of the derivation that produced it.
|
||||
Even if two store objects have the same contents, if they are produced in different ways, and one is input-addressed, then they will have different store paths, and thus are guaranteed not to be the same store object.
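
The idea can be made concrete with a toy model. The Python sketch below is not Nix's real path computation (which hashes the derivation in a more involved way); the function `toy_input_addressed_path`, the textual derivation descriptions, and the truncated hex digest are all assumptions chosen only to show that the path depends on the recipe rather than on the result.

```python
import hashlib


def toy_input_addressed_path(derivation_text: str, name: str) -> str:
    """Toy model: derive a store path from the derivation, never from its output."""
    digest = hashlib.sha256(derivation_text.encode("utf-8")).hexdigest()[:32]
    return "/nix/store/" + digest + "-" + name


# Two different recipes that would write identical contents to their output.
drv_a = "builder: sh; script: echo hi > $out"
drv_b = "builder: sh; script: echo 'hi' > $out"

path_a = toy_input_addressed_path(drv_a, "greeting")
path_b = toy_input_addressed_path(drv_b, "greeting")

print(path_a != path_b)  # True
```

Because the contents never enter the computation, the path is available before anything is built, and identical results from different recipes still land at different paths.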
|
||||
|
||||
<!---
|
||||
|
||||
### Modulo fixed-output derivations
|
||||
|
||||
**TODO hash derivation modulo.**
|
||||
|
||||
So how do we compute the hash part of the output path of a derivation?
|
||||
This is done by the function `hashDrv`, shown in Figure 5.10.
|
||||
It distinguishes between two cases.
|
||||
If the derivation is a fixed-output derivation, then it computes a hash over just the `outputHash` attributes.
|
||||
|
||||
If the derivation is not a fixed-output derivation, we replace each element in the derivation’s `inputDrvs` with the result of a call to `hashDrv` for that element.
|
||||
(The derivation at each store path in `inputDrvs` is converted from its on-disk ATerm representation back to a `StoreDrv` by the function `parseDrv`.) In essence, `hashDrv` partitions store derivations into equivalence classes, and for hashing purposes it replaces each store path in a derivation graph with its equivalence class.
|
||||
|
||||
The recursion in Figure 5.10 is inefficient:
|
||||
it will call itself once for each path by which a subderivation can be reached, i.e., `O(V k)` times for a derivation graph with `V` derivations and with out-degree of at most `k`.
|
||||
In the actual implementation, memoisation is used to reduce this to `O(V + E)` complexity for a graph with E edges.
|
||||
|
||||
-->
|
||||
|
||||
[xp-feature-ca-derivations]: @docroot@/development/experimental-features.md#xp-feature-ca-derivations
|
||||
[xp-feature-git-hashing]: @docroot@/development/experimental-features.md#xp-feature-git-hashing
|
||||
[xp-feature-impure-derivations]: @docroot@/development/experimental-features.md#xp-feature-impure-derivations
|