Due this by introducing `DerivationCreationAndRealisationGoal`.
`DerivationGoal` now only tracks a single output, and is back to
tracking a plain store path `drvPath`, not a deriving path one. Its
`addWantedOutputs` method is gone. These changes will allow subsequent
PRs to simplify it greatly.
Because the purpose of each goal is back to being immutable, we can also
once again make `Goal::buildResult` a public field, and get rid of the
`getBuildResult` method. This simplifies things also.
`DerivationCreationAndRealisationGoal` is a cheap "trampoline" goal. It
takes immutable sets of wanted outputs, and just kicks of
`DerivationGoal`s for them. Since now "actual work" is done in these
goals, it is not wasteful to have separate ones for separate sets of
outputs, even if those outputs (and the derivations they are from)
overlap.
This separation of concerns will make it possible for future work on
issues like #11928, and to continue the path of having more goal types,
but each goal type does fewer things (issue #12628).
This commit in some sense reverts
f4f28cdd0e, but that one kept around
`addWantedOutputs`. I am quite sure it was having two layers of goals
with `addWantedOutputs` that caused the issues --- restarting logic like
`addWantedOutputs` has is very tempermental! In this version of the
change, we have *zero* layers of `addWantedOutputs` --- no goal type
needs it, or otherwise has a mutable objective --- and so I think this
change is safe.
This method does *not* create a new type of goal. We instead just make
`DerivationGoal` more sophisticated, which is much easier to do now that
`DerivationBuildingGoal` has been split from it (and so many fields are
gone, or or local variables instead).
This avoids the need for a secondarily trampoline goal that interacted
poorly with `addWantedOutputs`. That, I hope, will mean the bugs from
before do not reappear.
There may in fact be a reason to introduce such a trampoline in the
future, but it would only happen in conjunction with getting rid of
`addWantedOutputs`.
Restores the functionality (and tests) that was reverted in
f4f28cdd0e.
This wires up the {pre,post}FunctionCallHook machinery
in EvalState::callFunction and migrates FunctionCallTrace
to use the new EvalProfiler mechanisms for tracing.
Note that branches when the hook gets called are marked with [[unlikely]]
as a hint to the compiler that this is not a hot path. For non-tracing
evaluation this should be a 100% predictable branch, so the performance
cost is nonexistent.
Some measurements to prove support this point:
```
nix build .#nix-cli
nix build github:nixos/nix/d692729759e4e370361cc5105fbeb0e33137ca9e#nix-cli --out-link before
```
(Before)
```
$ taskset -c 2,3 hyperfine "GC_INITIAL_HEAP_SIZE=16g before/bin/nix eval nixpkgs#gnome --no-eval-cache" --warmup 4
Benchmark 1: GC_INITIAL_HEAP_SIZE=16g before/bin/nix eval nixpkgs#gnome --no-eval-cache
Time (mean ± σ): 2.517 s ± 0.032 s [User: 1.464 s, System: 0.476 s]
Range (min … max): 2.464 s … 2.557 s 10 runs
```
(After)
```
$ taskset -c 2,3 hyperfine "GC_INITIAL_HEAP_SIZE=16g result/bin/nix eval nixpkgs#gnome --no-eval-cache" --warmup 4
Benchmark 1: GC_INITIAL_HEAP_SIZE=16g result/bin/nix eval nixpkgs#gnome --no-eval-cache
Time (mean ± σ): 2.499 s ± 0.022 s [User: 1.448 s, System: 0.478 s]
Range (min … max): 2.472 s … 2.537 s 10 runs
```
This patch adds an EvalProfiler and MultiEvalProfiler that can be used
to insert hooks into the evaluation for the purposes of function tracing
(what function-trace currently does) or for flamegraph/tracy profilers.
See the following commits for how this is supposed to be integrated into
the evaluator and performance considerations.
Ensure relative path inputs are relative to the parent node's _actual_
`outPath`, instead of the subtly different `sourceInfo.outPath`.
Additionally, non-flake inputs now also have a `sourceInfo` attribute.
This fixes the relationship between `self.outPath` and
`self.sourceInfo.outPath` in some edge cases.
Fixes#13164
Previous code had a sneaky bug due to which no caching
actually happened:
```cpp
auto linesForInput = (*lines)[origin->offset];
```
That should have been:
```cpp
auto & linesForInput = (*lines)[origin->offset];
```
See [1].
Now that it also makes sense to make the cache bound in side
in order not to memoize all the sources without freeing any memory.
The default cache size has been chosen somewhat arbitrarily to be ~64k
origins. For reference, 25.05 nixpkgs has ~50k .nix files.
Simple benchmark:
```nix
let
pkgs = import <nixpkgs> { };
in
builtins.foldl' (acc: el: acc + el.line) 0 (
builtins.genList (x: builtins.unsafeGetAttrPos "gcc" pkgs) 10000
)
```
(After)
```
$ hyperfine "result/bin/nix eval -f ./test.nix"
Benchmark 1: result/bin/nix eval -f ./test.nix
Time (mean ± σ): 292.7 ms ± 3.9 ms [User: 131.0 ms, System: 120.5 ms]
Range (min … max): 288.1 ms … 300.5 ms 10 runs
```
(Before)
```
hyperfine "nix eval -f ./test.nix"
Benchmark 1: nix eval -f ./test.nix
Time (mean ± σ): 666.7 ms ± 6.4 ms [User: 428.3 ms, System: 191.2 ms]
Range (min … max): 659.7 ms … 681.3 ms 10 runs
```
If the origin happens to be a `all-packages.nix` or similar in size then the
difference is much more dramatic.
[1]: 22e3f0e987