The existing header is a bit too big. Now the following use-cases are
separated, and get their own headers:
- Using or implementing an arbitrary store: remaining `store-api.hh`
This is closer to just being about the `Store` (and `StoreConfig`)
classes, as one would expect.
- Opening a store from a textual description: `store-open.hh`
Opening an aribtrary store implementation like this requires some sort
of store registration mechanism to exists, but the caller doesn't need
to know how it works. This just exposes the functions which use such a
mechanism, without exposing the mechanism itself
- Registering a store implementation: `store-registration.hh`
This requires understanding how the mechanism actually works, and the
mechanism in question involves templated machinery in headers we
rather not expose to things that don't need it, as it would slow down
compilation for no reason.
I can't find a good way to benchmark in isolation from the
git cache, but common sense dictates that creating (and destroying)
a 131KiB std::vector for each regular file from the archive imposes
quite a significant overhead regardless of the IO bound git cache.
AFAICT there is no reason to keep a copy of the data since
it always gets fed into the sink and there are no coroutines/threads
in sight.
As it turns out using `std::regex` is actually the bottleneck
for root discovery. Just substituting `std::` -> `boost::`
makes root discovery twice as fast (3x if counting only userspace time).
Some rather ad-hoc measurements to motivate the switch:
(On master)
```
nix build github:nixos/nix/1e822bd4149a8bce1da81ee2ad9404986b07914c#nix-cli --out-link result-1e822bd4149a8bce1da81ee2ad9404986b07914c
taskset -c 2,3 hyperfine "result-1e822bd4149a8bce1da81ee2ad9404986b07914c/bin/nix store gc --dry-run --max 0"
Benchmark 1: result-1e822bd4149a8bce1da81ee2ad9404986b07914c/bin/nix store gc --dry-run --max 0
Time (mean ± σ): 481.6 ms ± 3.9 ms [User: 336.2 ms, System: 142.0 ms]
Range (min … max): 474.6 ms … 487.7 ms 10 runs
```
(After this patch)
```
taskset -c 2,3 hyperfine "result/bin/nix store gc --dry-run --max 0"
Benchmark 1: result/bin/nix store gc --dry-run --max 0
Time (mean ± σ): 254.7 ms ± 9.7 ms [User: 111.1 ms, System: 141.3 ms]
Range (min … max): 246.5 ms … 281.3 ms 10 runs
```
`boost::regex` is a drop-in replacement for `std::regex`, but much faster.
Doing a simple before/after comparison doesn't surface any change in behavior:
```
result/bin/nix store gc --dry-run -vvvvv --max 0 |& grep "got additional" | wc -l
result-1e822bd4149a8bce1da81ee2ad9404986b07914c/bin/nix store gc --dry-run -vvvvv --max 0 |& grep "got additional" | wc -l
```
Remove outdated and no longer relevant TODO. It's more confusing
now, since symbol table must now be addressed by uint32_t indices
in order to keep Attr size down to 16 bytes on 64 bit machines.
This patch finally applies the transition to std::less<>,
which is a transparent comparator. There's no functional
change and string lookups in sets are now more efficient
and don't produce temporaries (e.g. set.find(std::string_view{"key"})).
Unfortunately Feature is just an alias to `std::string`
and not a new-type, so a ton of code relies on it being
exactly a `std::string`.
Using transparent comparators just for StringSet necessitates
using it here as well.
The intention is to switch to transparent comparators from N3657 for
ordered set containers for strings and using the alias consistently
would simplify things.
This trades off some executable size for measurable lexer performance
improvements.
Note on the explicitly enabling 8bit scanner.
This is needed due to the default behavior of flex (excerpt from the manual [1]):
> Flex’s default behavior is to generate an 8-bit scanner unless you
> use the ‘-Cf’ or ‘-CF’, in which case flex defaults to generating
> 7-bit scanners unless your site was always configured to generate 8-bit
> scanners.
Some quantifyable metrics:
Nixpkgs revision: a6e3f45acf4e817532a861ab0eda4ab5485fecc1
Parsing the largest file in nixpkgs: pkgs/development/haskell-modules/hackage-packages.nix.
(Before this patch)
```
$ nix build github:nixos/nix/9fe3077d4#nix-expr
$ du --apparent-size result/lib/libnixexpr.so
2518 result/lib/libnixexpr.so
$ nix build github:nixos/nix/9fe3077d4#nix-cli
$ taskset -c 2,3 hyperfine "GC_INITIAL_HEAP_SIZE=16g \
result/bin/nix-instantiate --parse \
../nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix > /dev/null"
Time (mean ± σ): 375.5 ms ± 6.3 ms [User: 316.9 ms, System: 56.7 ms]
Range (min … max): 368.5 ms … 388.3 ms 10 runs
```
(After the patch)
```
$ nix build .#nix-expr
$ du --apparent-size result/lib/libnixexpr.so
2685 result/lib/libnixexpr.so
$ nix build .#nix-cli
$ taskset -c 2,3 hyperfine "GC_INITIAL_HEAP_SIZE=16g \
result/bin/nix-instantiate --parse \
../nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix > /dev/null"
Time (mean ± σ): 326.8 ms ± 4.9 ms [User: 269.5 ms, System: 55.3 ms]
Range (min … max): 319.7 ms … 335.5 ms 10 runs
```
Overall, the change is roughly:
- 2518KiB -> 2685KiB ~ 150 KiB of machine code
- 375ms -> 325ms ~ 50ms
The perf uplift for eval-heavy test cases is obviously less noticeable,
but it doesn't make sense not to take this free perf win.
[1]: https://westes.github.io/flex/manual/Options-Affecting-Scanner-Behavior.html#Options-Affecting-Scanner-Behavior
When a PUT is redirected, some of the data can be sent by curl before headers are read. This means the subsequent PUT operation needs to seek back to origin.
The current example relies upon [nixfmt's deprecated tree traversal
behavior](https://github.com/NixOS/nixfmt/pull/240). The simplest
alternative is the new `nixfmt-tree` wrapper for `nixfmt`/`treefmt`.
Since we dropped fs::symlink_exists, we no longer have a need for the fs
namespace. Having less abstractions makes it easier to lookup the
functions in reference documentations.
As it turns out the orignal implementation of symlink_exists cannot be
used in Nix because it did now std::filesystem::filesystem_error.
The new implementation fixes that but is now actually the same as
pathExists except for the path type.