Rename doc/manual{src -> source}

This is needed to avoid https://github.com/mesonbuild/meson/issues/13774 when we go back to making our subproject directory `src`.
2025-07-07 14:21:48 +02:00 · 2024-10-10 12:04:33 -04:00 · 2024-10-10 12:04:33 -04:00 · eb7d7780b1
commit eb7d7780b1
parent d5c45952ac
221 changed files with 75 additions and 74 deletions
--- a/doc/manual/source/store/file-system-object.md
+++ b/doc/manual/source/store/file-system-object.md
@@ -0,0 +1,64 @@
+# File System Object
+
+Nix uses a simplified model of the file system, which consists of file system objects.
+Every file system object is one of the following:
+
+ - File
+
+   - A possibly empty sequence of bytes for contents
+   - A single boolean representing the [executable](https://en.m.wikipedia.org/wiki/File-system_permissions#Permissions) permission
+
+ - Directory
+
+   Mapping of names to child file system objects
+
+ - [Symbolic link](https://en.m.wikipedia.org/wiki/Symbolic_link)
+
+   An arbitrary string.
+   Nix does not assign any semantics to symbolic links.
+
+File system objects and their children form a tree.
+A bare file or symlink can be a root file system object.
+
+Nix does not encode any other file system notions such as [hard links](https://en.m.wikipedia.org/wiki/Hard_link), [permissions](https://en.m.wikipedia.org/wiki/File-system_permissions), timestamps, or other metadata.
+
+## Examples of file system objects
+
+A plain file:
+
+```
+50 B, executable: false
+```
+
+An executable file:
+
+```
+122 KB, executable: true
+```
+
+A symlink:
+
+```
+-> /usr/bin/sh
+```
+
+A directory with contents:
+
+```
+├── bin
+│   └── hello: 35 KB, executable: true
+└── share
+    ├── info
+    │   └── hello.info: 36 KB, executable: false
+    └── man
+        └── man1
+            └── hello.1.gz: 790 B, executable: false
+```
+
+A directory that contains a symlink and other directories:
+
+```
+├── bin -> share/go/bin
+├── nix-support/
+└── share/
+```
--- a/doc/manual/source/store/file-system-object/content-address.md
+++ b/doc/manual/source/store/file-system-object/content-address.md
@@ -0,0 +1,85 @@
+# Content-Addressing File System Objects
+
+For many operations, Nix needs to calculate [a content address](@docroot@/glossary.md#gloss-content-address) of [a file system object][file system object].
+Usually this is needed as part of
+[content addressing store objects](../store-object/content-address.md),
+since store objects always have a root file system object.
+But some command-line utilities also just work on "raw" file system objects, not part of any store object.
+
+Every content addressing scheme Nix uses ultimately involves feeding data into a [hash function](https://en.wikipedia.org/wiki/Hash_function), and getting back an opaque fixed-size digest which is deemed a content address.
+The various *methods* of content addressing thus differ in how abstract data (in this case, a file system object and its descendants) is fed into the hash function.
+
+## Serialising File System Objects { #serial }
+
+The simplest method is to serialise the entire file system object tree into a single binary string, and then hash that binary string, yielding the content address.
+In this section we describe the currently-supported methods of serialising file system objects.
+
+### Flat { #serial-flat }
+
+A single file object can just be hashed by its contents.
+This is not enough information to encode the fact that the file system object is a file,
+but if we *already* know that the FSO is a single non-executable file by other means, it is sufficient.
+
+Because the hashed data is just the raw file, as is, this choice is good for compatibility with other systems.
+For example, Unix commands like `sha256sum` or `sha1sum` will produce hashes for single files that match this.
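As a minimal sketch of the flat method (the helper name `flat_hash` is ours and not part of Nix), hashing the raw bytes with Python's `hashlib` yields the same digest that `sha256sum` or `sha1sum` would compute for a file with that content. Nix may render the same digest in a different textual encoding.

```python
import hashlib


def flat_hash(contents: bytes, algorithm: str = "sha256") -> str:
    """Hash file contents as-is and return the digest in hexadecimal.

    Nothing about the file type or the executable bit enters the hash,
    which is why the caller must already know the object is a plain,
    non-executable file.
    """
    return hashlib.new(algorithm, contents).hexdigest()


if __name__ == "__main__":
    print(flat_hash(b"hello\n"))
    print(flat_hash(b"hello\n", "sha1"))
```

Because only the contents are hashed, two files with the same bytes but different executable bits would get the same flat hash.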
+
+### Nix Archive (NAR) { #serial-nix-archive }
+
+For the other cases of [file system objects][file system object], especially directories with arbitrary descendants, we need a more complex serialisation format.
+Examples of such serialisations are the ZIP and TAR file formats.
+However, for our purposes these formats have several problems:
+
+- They do not have a canonical serialisation, meaning that given an FSO, there can
+  be many different serialisations.
+  For instance, TAR files can have variable amounts of padding between archive members;
+  and some archive formats leave the order of directory entries undefined.
+  This would be bad because we use serialisation to compute cryptographic hashes over file system objects, and for those hashes to be useful as a content address or for integrity checking, uniqueness is crucial.
+  Otherwise, correct hashes would report false mismatches, and the store would fail to find the content.
+
+- They store more information than we have in our notion of FSOs, such as time stamps.
+  This can cause FSOs that Nix should consider equal to hash to different values on different machines, just because the dates differ.
+
+- As a practical consideration, the TAR format is the only truly universal format in the Unix environment.
+  It has many problems, such as an inability to deal with long file names and files larger than 2^33 bytes.
+  Current implementations such as GNU Tar work around these limitations in various ways.
+
+For these reasons, Nix has its very own archive format—the Nix Archive (NAR) format,
+which is carefully designed to avoid the problems described above.
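The timestamp problem described above can be reproduced with Python's standard `tarfile` module. This is only an illustration of why TAR is unsuitable; the helper names are ours, and nothing here reflects how Nix itself is implemented.

```python
import hashlib
import io
import tarfile


def tar_bytes(contents: bytes, mtime: int) -> bytes:
    """Serialise a single file named hello.txt into an in-memory TAR archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        info = tarfile.TarInfo(name="hello.txt")
        info.size = len(contents)
        info.mtime = mtime  # metadata that Nix's file system model does not have
        archive.addfile(info, io.BytesIO(contents))
    return buffer.getvalue()


def tar_digest(contents: bytes, mtime: int) -> str:
    """SHA-256 over the TAR serialisation, in hexadecimal."""
    return hashlib.sha256(tar_bytes(contents, mtime)).hexdigest()


if __name__ == "__main__":
    # Same file contents, different timestamps: the digests differ.
    print(tar_digest(b"hello\n", mtime=0))
    print(tar_digest(b"hello\n", mtime=1))
```

The two archives describe what Nix would consider the same file system object, yet they hash differently because the modification time is part of the serialisation.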
+
+The exact specification of the Nix Archive format is in `protocols/nix-archive.md`.
+
+## Content addressing File System Objects beyond a single serialisation pass
+
+Serialising the entire tree and then hashing that binary string is not the only option for content addressing, however.
+Another technique is that of a [Merkle graph](https://en.wikipedia.org/wiki/Merkle_tree), where previously computed hashes are included in subsequent byte strings to be hashed.
+
+In particular, the Merkle graphs can match the original graph structure of file system objects:
+we can first hash (serialised) child file system objects, and then hash parent objects using the hashes of their children in the serialisation (to be hashed) of the parent file system objects.
+
+Currently, there is one such Merkle DAG content addressing method supported.
+
+### Git ([experimental][xp-feature-git-hashing]) { #git }
+
+> **Warning**
+>
+> This method is part of the [`git-hashing`][xp-feature-git-hashing] experimental feature.
+
+Git's file system model is very close to Nix's, and so Git's content addressing method is a pretty good fit.
+Just as with regular Git, files and symlinks are hashed as git "blobs", and directories are hashed as git "trees".
+
+However, one difference between Nix's and Git's file system model needs special treatment.
+In Git, plain files, executable files, and symlinks are not differentiated as distinctly addressable objects, but by their context: by the directory entry that refers to them.
+That means so long as the root object is a directory, there is no problem:
+every non-directory object is owned by a parent directory, and the entry that refers to it provides the missing information.
+However, if the root object is not a directory, then we have no way of knowing which one of an executable file, non-executable file, or symlink it is supposed to be.
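The following sketch shows Git's SHA-1 object hashing for blobs and trees, to make concrete where the file type is recorded. The function names are ours. The entry ordering is simplified to sorting by name, which is an assumption that holds for the flat example here but does not cover Git's special ordering rule for subdirectories.

```python
import hashlib


def git_object_hash(kind: str, payload: bytes) -> bytes:
    """Hash a Git object: header '<kind> <size>' + NUL, followed by the payload."""
    header = f"{kind} {len(payload)}".encode() + b"\0"
    return hashlib.sha1(header + payload).digest()


def blob_hash(contents: bytes) -> bytes:
    """Files and symlink targets are both hashed as blobs."""
    return git_object_hash("blob", contents)


def tree_hash(entries: list[tuple[str, str, bytes]]) -> bytes:
    """Hash a directory from (mode, name, child_hash) entries.

    The mode lives in the directory entry, not in the child object:
    '100644' plain file, '100755' executable, '120000' symlink, '40000' tree.
    Sorting by name only is a simplification (see the lead-in).
    """
    payload = b"".join(
        f"{mode} {name}".encode() + b"\0" + child
        for mode, name, child in sorted(entries, key=lambda entry: entry[1])
    )
    return git_object_hash("tree", payload)


if __name__ == "__main__":
    script = blob_hash(b"#!/bin/sh\necho hello\n")
    # The blob hash is identical either way; only the parent tree differs.
    print(tree_hash([("100644", "hello", script)]).hex())
    print(tree_hash([("100755", "hello", script)]).hex())
```

The child hash is embedded in the parent's serialisation, which is the Merkle construction described earlier, and the executable bit only changes the tree hash, never the blob hash.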
+
+In response to this, we have decided to treat a bare file as a non-executable file.
+This is similar to what we do with [flat serialisation](#serial-flat), which also lacks this information.
+To avoid an address collision, attempts to hash a bare executable file or symlink will result in an error (just as would happen for flat serialisation also).
+Thus, Git can encode some, but not all of Nix's "File System Objects", and this sort of content-addressing is likewise partial.
+
+In the future, we may support a Git-like hash for such file system objects, or we may adopt another Merkle DAG format which is capable of representing all Nix file system objects.
+
+[file system object]: ../file-system-object.md
+[store object]: ../store-object.md
+[xp-feature-git-hashing]: @docroot@/development/experimental-features.md#xp-feature-git-hashing
--- a/doc/manual/source/store/index.md
+++ b/doc/manual/source/store/index.md
@@ -0,0 +1,5 @@
+# Nix Store
+
+The *Nix store* is an abstraction to store immutable file system data (such as software packages) that can have dependencies on other such data.
+
+There are [multiple types of Nix stores](./types/index.md) with different capabilities, such as the default one on the [local filesystem](./types/local-store.md) (`/nix/store`) or [binary caches](./types/http-binary-cache-store.md).
--- a/doc/manual/source/store/meson.build
+++ b/doc/manual/source/store/meson.build
@@ -0,0 +1,18 @@
+types_dir = custom_target(
+  command : [
+    python.full_path(),
+    '@INPUT0@',
+    '@OUTPUT@',
+    '--'
+  ] + nix_eval_for_docs + [
+    '--expr',
+    'import @INPUT1@ (builtins.fromJSON (builtins.readFile ./@INPUT2@)).stores',
+  ],
+  input : [
+    '../../remove_before_wrapper.py',
+    '../../generate-store-types.nix',
+    nix3_cli_json,
+  ],
+  output : 'types',
+  env : nix_env_for_docs,
+)
--- a/doc/manual/source/store/store-object.md
+++ b/doc/manual/source/store/store-object.md
@@ -0,0 +1,10 @@
+## Store Object
+
+A Nix store is a collection of *store objects* with *references* between them.
+A store object consists of
+
+  - A [file system object](./file-system-object.md) as data
+  - A set of [store paths](./store-path.md) as references to other store objects
+
+Store objects are [immutable](https://en.wikipedia.org/wiki/Immutable_object):
+Once created, they do not change until they are deleted.
--- a/doc/manual/source/store/store-object/content-address.md
+++ b/doc/manual/source/store/store-object/content-address.md
@@ -0,0 +1,95 @@
+# Content-Addressing Store Objects
+
+Just [like][fso-ca] [File System Objects][File System Object],
+[Store Objects][Store Object] can also be [content-addressed](@docroot@/glossary.md#gloss-content-addressed),
+unless they are [input-addressed](@docroot@/glossary.md#gloss-input-addressed-store-object).
+
+For store objects, the content address we produce will take the form of a [Store Path] rather than a regular hash.
+In particular, the content-addressing scheme will ensure that the digest of the store path is solely computed from the
+
+- file system object graph (the root one and its children, if it has any)
+- references
+- [store directory](../store-path.md#store-directory)
+- name
+
+of the store object, and not any other information, which would not be an intrinsic property of that store object.
+
+For the full specification of the algorithms involved, see the [specification of store path digests][sp-spec].
+
+[File System Object]: ../file-system-object.md
+[Store Object]: ../store-object.md
+[Store Path]: ../store-path.md
+
+## Content addressing each part of a store object
+
+### File System Objects
+
+With all currently supported store object content addressing methods, the file system object is always [content-addressed][fso-ca] first, and then that hash is incorporated into content address computation for the store object.
+
+### References
+
+With all currently supported store object content addressing methods,
+other objects are referred to by their regular (string-encoded) [store paths][Store Path].
+
+Self-references however cannot be referred to by their path, because we are in the midst of describing how to compute that path!
+
+> The alternative would require finding a hash function fixed point, i.e. the solution to an equation in the form
+> ```
+> digest = hash(..... || digest || ....)
+> ```
+> which is computationally infeasible.
+> As far as we know, this is equivalent to finding a hash collision.
+
+Instead we just have a "has self reference" boolean, which will end up affecting the digest.
+
+### Name and Store Directory
+
+These two items affect the digest in a way that is standard for store path digest computations and not specific to content-addressing.
+Consult the [specification of store path digests][sp-spec] for further details.
+
+## Content addressing Methods
+
+For historical reasons, we don't support all features in all combinations.
+Each currently supported method of content addressing chooses a single method of file system object hashing, and may impose some restrictions on references.
+The names and store directories are unrestricted however.
+
+### Flat { #method-flat }
+
+This uses the corresponding [Flat](../file-system-object/content-address.md#serial-flat) method of file system object content addressing.
+
+References are not supported: store objects with flat hashing *and* references cannot be created.
+
+### Text { #method-text }
+
+This also uses the corresponding [Flat](../file-system-object/content-address.md#serial-flat) method of file system object content addressing.
+
+References to other store objects are supported, but self references are not.
+
+This is the only store-object content-addressing method that is not named identically with a corresponding file system object method.
+It is somewhat obscure, mainly used for "drv files"
+(derivations serialized as store objects in their ["ATerm" file format](@docroot@/protocols/derivation-aterm.md)).
+Prefer another method if possible.
+
+### Nix Archive { #method-nix-archive }
+
+This uses the corresponding [Nix Archive](../file-system-object/content-address.md#serial-nix-archive) method of file system object content addressing.
+
+References (to other store objects and self-references alike) are supported so long as the hash algorithm is SHA-256; with any other hash algorithm, neither kind of reference is supported.
+
+### Git { #method-git }
+
+> **Warning**
+>
+> This method is part of the [`git-hashing`][xp-feature-git-hashing] experimental feature.
+
+This uses the corresponding [Git](../file-system-object/content-address.md#git) method of file system object content addressing.
+
+References are not supported.
+
+Only SHA-1 is supported at this time.
+If [SHA-256-based Git](https://git-scm.com/docs/hash-function-transition)
+becomes more widespread, this restriction will be revisited.
+
+[fso-ca]: ../file-system-object/content-address.md
+[sp-spec]: @docroot@/protocols/store-path.md
+[xp-feature-git-hashing]: @docroot@/development/experimental-features.md#xp-feature-git-hashing
--- a/doc/manual/source/store/store-path.md
+++ b/doc/manual/source/store/store-path.md
@@ -0,0 +1,79 @@
+# Store Path
+
+> **Example**
+>
+> `/nix/store/a040m110amc4h71lds2jmr8qrkj2jhxd-git-2.38.1`
+>
+> A rendered store path
+
+Nix implements references to [store objects](./index.md#store-object) as *store paths*.
+
+Think of a store path as an [opaque], [unique identifier]:
+The only way to obtain a store path is by adding or building store objects.
+A store path will always reference exactly one store object.
+
+[opaque]: https://en.m.wikipedia.org/wiki/Opaque_data_type
+[unique identifier]: https://en.m.wikipedia.org/wiki/Unique_identifier
+
+Store paths are pairs of
+
+- A 20-byte digest for identification
+- A symbolic name for people to read
+
+> **Example**
+>
+> - Digest: `b6gvzjyb2pg0kjfwrjmg1vfhh54ad73z`
+> - Name:   `firefox-33.1`
+
+To make store objects accessible to operating system processes, stores have to expose store objects through the file system.
+
+A store path is rendered to a file system path as the concatenation of
+
+- [Store directory](#store-directory) (typically `/nix/store`)
+- Path separator (`/`)
+- Digest rendered in a custom variant of [Base32](https://en.wikipedia.org/wiki/Base32) (20 arbitrary bytes become 32 ASCII characters)
+- Hyphen (`-`)
+- Name
+
+> **Example**
+>
+> ```
+>   /nix/store/b6gvzjyb2pg0kjfwrjmg1vfhh54ad73z-firefox-33.1
+>   |--------| |------------------------------| |----------|
+> store directory            digest                 name
+> ```
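The rendering above can be reversed with a small helper. This is a sketch only: the name `split_store_path` and the `RenderedStorePath` type are hypothetical, and the digest is checked for length alone, not for Nix's Base32 alphabet.

```python
from typing import NamedTuple

# 20 digest bytes rendered in Base32 take 32 characters (160 bits / 5 bits each).
DIGEST_CHARS = 32


class RenderedStorePath(NamedTuple):
    store_dir: str
    digest: str
    name: str


def split_store_path(path: str) -> RenderedStorePath:
    """Split a rendered store path into store directory, digest, and name."""
    store_dir, separator, base_name = path.rpartition("/")
    if not separator or not store_dir:
        raise ValueError(f"no store directory in {path!r}")
    digest = base_name[:DIGEST_CHARS]
    hyphen = base_name[DIGEST_CHARS:DIGEST_CHARS + 1]
    name = base_name[DIGEST_CHARS + 1:]
    if len(digest) != DIGEST_CHARS or hyphen != "-" or not name:
        raise ValueError(f"not a rendered store path: {path!r}")
    return RenderedStorePath(store_dir, digest, name)


if __name__ == "__main__":
    print(split_store_path(
        "/nix/store/b6gvzjyb2pg0kjfwrjmg1vfhh54ad73z-firefox-33.1"
    ))
```

Splitting on the fixed digest width instead of the first hyphen matters because names such as `firefox-33.1` contain hyphens themselves.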
+
+Exactly how the digest is calculated depends on the type of store path.
+Store path digests are *supposed* to be opaque, and so for most operations, it is not necessary to know the details.
+That said, the manual has a full [specification of store path digests](@docroot@/protocols/store-path.md).
+
+## Store Directory
+
+Every [Nix store](./index.md) has a store directory.
+
+Not every store can be accessed through the file system.
+But if the store has a file system representation, the store directory contains the store’s [file system objects], which can be addressed by [store paths](#store-path).
+
+[file system objects]: ./file-system-object.md
+
+This means a store path is not just derived from the referenced store object itself, but depends on the store that the store object is in.
+
+> **Note**
+>
+> The store directory defaults to `/nix/store`, but is in principle arbitrary.
+
+It is important which store a given store object belongs to:
+Files in the store object can contain store paths, and processes may read these paths.
+Nix can only guarantee referential integrity if store paths do not cross store boundaries.
+
+Therefore one can only copy store objects to a different store if
+
+- The source and target stores' directories match
+
+  or
+
+- The store object in question has no references, that is, contains no store paths
+
+One cannot copy a store object that has references to a store with a different store directory.
+Instead, it has to be rebuilt, together with all its dependencies.
+It is in general not enough to replace the store directory string in file contents, as this may render executables unusable by invalidating their internal offsets or checksums.
--- a/doc/manual/source/store/types/index.md.in
+++ b/doc/manual/source/store/types/index.md.in
@@ -0,0 +1,43 @@
+Nix supports different types of stores:
+
+@store-types@
+
+## Store URL format
+
+Stores are specified using a URL-like syntax. For example, the command
+
+```console
+# nix path-info --store https://cache.nixos.org/ --json \
+  /nix/store/a7gvj343m05j2s32xcnwr35v31ynlypr-coreutils-9.1
+```
+
+fetches information about a store path in the HTTP binary cache
+located at https://cache.nixos.org/, which is a type of store.
+
+Store URLs can specify **store settings** using URL query strings,
+i.e. by appending `?name1=value1&name2=value2&...` to the URL. For
+instance,
+
+```
+--store ssh://machine.example.org?ssh-key=/path/to/my/key
+```
+
+tells Nix to access the store on a remote machine via the SSH
+protocol, using `/path/to/my/key` as the SSH private key. The
+supported settings for each store type are documented below.
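To illustrate how settings ride along in the query string, here is a sketch using Python's standard `urllib.parse`. It is not Nix's own parser, and the function name `parse_store_url` is hypothetical.

```python
from urllib.parse import parse_qsl, urlsplit


def parse_store_url(url: str) -> tuple[str, dict[str, str]]:
    """Split a store URL into the URL without its query and a settings dict."""
    parts = urlsplit(url)
    # Each name=value pair in the query string is one store setting.
    settings = dict(parse_qsl(parts.query, keep_blank_values=True))
    base_url = parts._replace(query="").geturl()
    return base_url, settings


if __name__ == "__main__":
    base, settings = parse_store_url(
        "ssh://machine.example.org?ssh-key=/path/to/my/key"
    )
    print(base)
    print(settings)
```

For the example URL, the sketch separates the store location `ssh://machine.example.org` from the single setting `ssh-key`.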
+
+The special store URL `auto` causes Nix to automatically select a
+store as follows:
+
+* Use the [local store](./local-store.md) `/nix/store` if `/nix/var/nix`
+  is writable by the current user.
+
+* Otherwise, if `/nix/var/nix/daemon-socket/socket` exists, [connect
+  to the Nix daemon listening on that socket](./local-daemon-store.md).
+
+* Otherwise, on Linux only, use the [local chroot store](./local-store.md)
+  `~/.local/share/nix/root`, which will be created automatically if it
+  does not exist.
+
+* Otherwise, use the [local store](./local-store.md) `/nix/store`.
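The selection rules for `auto` listed above can be written as a short decision function. This is a sketch of the documented order only; the function name, the injectable probes, and the returned descriptions are ours and do not correspond to Nix internals.

```python
import os
import sys
from typing import Callable


def select_auto_store(
    is_writable: Callable[[str], bool] = lambda path: os.access(path, os.W_OK),
    exists: Callable[[str], bool] = os.path.exists,
    is_linux: bool = sys.platform.startswith("linux"),
) -> str:
    """Return a description of the store that `auto` would select."""
    # 1. A writable state directory means we can use the local store directly.
    if is_writable("/nix/var/nix"):
        return "local store /nix/store"
    # 2. Otherwise talk to a running daemon if its socket exists.
    if exists("/nix/var/nix/daemon-socket/socket"):
        return "Nix daemon via /nix/var/nix/daemon-socket/socket"
    # 3. On Linux only, fall back to a per-user chroot store.
    if is_linux:
        return "local chroot store ~/.local/share/nix/root"
    # 4. Everywhere else, fall back to the local store.
    return "local store /nix/store"


if __name__ == "__main__":
    print(select_auto_store())
```

The probes are parameters so that the order of the checks can be exercised without touching the real file system.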
+