mirror of https://github.com/NixOS/nix synced 2025-07-07 14:21:48 +02:00

gc: replace ordered sets with unordered sets for in-memory caches

During garbage collection we cache several things -- a set of known-dead
paths, a set of known-alive paths, and a map of paths to their referrers.
Currently they use STL maps and sets, which are ordered structures that
are typically backed by balanced binary trees. Since we are putting
pseudorandom paths into these and looking them up by exact key, we don't
need the ordering, and we're paying a nontrivial cost per insertion.

The existing maps require O(n) memory, with one separately allocated
tree node per element, and have O(log n) insertion and lookup time.

We could instead use unordered maps, which are typically backed by
hash tables. These require O(n) memory and have average-case O(1)
insertion and lookup time.
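
To make the comparison concrete, here is a minimal sketch of the two
lookup styles. PathKey is a hypothetical stand-in for a store path and is
not Nix's actual StorePath type; the helper names are also made up for
illustration. The point it shows is that an unordered container needs a
hash and an equality operator for its key, whereas an ordered one needs
an ordering.

```cpp
#include <cstddef>
#include <functional>
#include <set>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for a store path (assumption, not Nix's StorePath).
struct PathKey {
    std::string name;
    bool operator==(const PathKey & other) const { return name == other.name; }
    bool operator<(const PathKey & other) const { return name < other.name; }
};

// std::unordered_set requires a hash for its key type; this one simply
// reuses the standard string hash on the path name.
namespace std {
template<>
struct hash<PathKey> {
    size_t operator()(const PathKey & p) const noexcept {
        return hash<string>{}(p.name);
    }
};
}

// Ordered cache: a lookup descends a balanced tree, O(log n) comparisons.
inline bool isKnownDeadOrdered(const std::set<PathKey> & dead, const PathKey & p) {
    return dead.find(p) != dead.end();
}

// Unordered cache: a lookup hashes the key and probes one bucket,
// O(1) on average (O(n) in the worst case if many keys collide).
inline bool isKnownDeadUnordered(const std::unordered_set<PathKey> & dead, const PathKey & p) {
    return dead.find(p) != dead.end();
}
```

Both helpers answer the same exact-key membership question, which is all
the GC caches described above need; only the cost per operation differs.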

On my system this appears to result in a dramatic speedup -- prior to
this patch I was able to delete 400k paths out of 9.5 million over the
course of 34.5 hours. After this patch the same result took 89 minutes.

This result should NOT be taken at face value because the two runs
aren't really comparable; in particular the first started when I had 9.5
million store paths and the second started with 7.8 million, so we are
deleting a different set of paths starting from a much cleaner
filesystem. But I do think it's indicative.

Related: https://github.com/NixOS/nix/issues/9581
Andrew Poelstra 2025-01-12 18:36:32 +00:00
parent a44ae8b5a9
commit 4fac767b52
No known key found for this signature in database
GPG key ID: C588D63CE41B97C1

@@ -455,7 +455,7 @@ void LocalStore::collectGarbage(const GCOptions & options, GCResults & results)
     bool gcKeepOutputs = settings.gcKeepOutputs;
     bool gcKeepDerivations = settings.gcKeepDerivations;
-    StorePathSet roots, dead, alive;
+    std::unordered_set<StorePath> roots, dead, alive;
     struct Shared
     {
@@ -661,7 +661,7 @@ void LocalStore::collectGarbage(const GCOptions & options, GCResults & results)
         }
     };
-    std::map<StorePath, StorePathSet> referrersCache;
+    std::unordered_map<StorePath, StorePathSet> referrersCache;
     /* Helper function that visits all paths reachable from `start`
        via the referrers edges and optionally derivers and derivation