Description
After #35112 landed we saw some notable improvements in allocator scalability, but performance still plateaus at around 20-24 cores. The cause is mcentral, which has become the new scalability bottleneck: each per-size-class mcentral is a pair of linked lists protected by a lock, and that lock covers all iteration over and operations on the mcentral. While a single addition to or removal from a linked list isn't a source of great contention, caching a span from an mcentral involves iteration, which is a source of significant contention on this lock.
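To make the contention pattern concrete, here is a toy model of the design described above. It is only a sketch: `span`, `central`, and `cacheSpan` are simplified stand-ins that I'm assuming for illustration, not the runtime's actual types, and "sweeping" is reduced to checking a free-object count.

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
)

// span is a hypothetical stand-in for the runtime's mspan.
type span struct {
	id       int
	freeObjs int
}

// central models one per-size-class mcentral: two lists behind a single lock.
type central struct {
	mu       sync.Mutex
	nonempty *list.List // spans that may have free objects
	empty    *list.List // spans that are full or handed out to a cache
}

func newCentral() *central {
	return &central{nonempty: list.New(), empty: list.New()}
}

// cacheSpan walks the nonempty list until it finds a usable span.
// The whole walk happens with the lock held, so every other caller
// for this size class waits for the full iteration, not just for a
// single list insertion or removal.
func (c *central) cacheSpan() *span {
	c.mu.Lock()
	defer c.mu.Unlock()
	for e := c.nonempty.Front(); e != nil; {
		next := e.Next() // capture before Remove clears the element's links
		s := e.Value.(*span)
		c.nonempty.Remove(e)
		c.empty.PushBack(s)
		if s.freeObjs > 0 {
			return s
		}
		e = next
	}
	return nil
}

func main() {
	c := newCentral()
	for i, free := range []int{0, 0, 5} {
		c.nonempty.PushBack(&span{id: i, freeObjs: free})
	}
	s := c.cacheSpan()
	fmt.Println("cached span", s.id, "after visiting", c.empty.Len(), "spans")
}
```

The longer the run of unusable spans at the front of the list, the longer the lock is held per call, which is the behavior that stops scaling as core counts grow.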
Furthermore, the code around mcentral is fairly confusing. The main source of this confusion is span ownership: when should a span be in an mcentral? Currently a span may be owned simultaneously by:
- An mcentral.
- A concurrent sweeper.
- mheap_.sweepSpans.
- An mcache.
This makes reasoning about the span lifecycle and ownership tricky. With some refactoring, I think we can achieve scalability and also change the span ownership model to limit the number of simultaneous owners. @aclements previously suggested repurposing gcSweepBuf, a data structure built for fast concurrent access, to replace the linked lists in the mcentral. We can take this idea further and also use these data structures for sweeping, instead of having a separate mheap_.sweepSpans. This means that markrootSpans will need a different mechanism for finding spans with specials, but we can use a bitmap for that, similar to the one the page reclaimer uses.
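As a rough illustration of the bitmap idea for finding spans with specials, the sketch below keeps one bit per span index and lets a marker walk only the set bits. The name `specialsBitmap`, the one-bit-per-span-index layout, and the method names are all assumptions made for this example; the real layout would presumably be tied to the heap's page or arena metadata.

```go
package main

import (
	"fmt"
	"math/bits"
	"sync/atomic"
)

// specialsBitmap is a hypothetical bitmap with one bit per span index.
// A set bit means "this span has specials".
type specialsBitmap struct {
	words []atomic.Uint64
}

func newSpecialsBitmap(nspans int) *specialsBitmap {
	return &specialsBitmap{words: make([]atomic.Uint64, (nspans+63)/64)}
}

// set marks span i as having specials. It is safe to call concurrently.
func (b *specialsBitmap) set(i int) {
	w := &b.words[i/64]
	mask := uint64(1) << (uint(i) % 64)
	for {
		old := w.Load()
		if old&mask != 0 || w.CompareAndSwap(old, old|mask) {
			return
		}
	}
}

// forEach calls fn for every span index whose bit is set, in ascending order.
// Words that are entirely zero are skipped with a single load.
func (b *specialsBitmap) forEach(fn func(i int)) {
	for wi := range b.words {
		w := b.words[wi].Load()
		for w != 0 {
			fn(wi*64 + bits.TrailingZeros64(w))
			w &= w - 1 // clear the lowest set bit
		}
	}
}

func main() {
	b := newSpecialsBitmap(200)
	for _, i := range []int{3, 64, 129} {
		b.set(i)
	}
	b.forEach(func(i int) { fmt.Println("span with specials:", i) })
}
```

The point of the sketch is that a root-marking pass no longer needs a list of spans at all: it scans words and visits only the spans whose bits are set.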
Unifying the sweep queue with mcentral also lets concurrent sweepers take complete ownership of a span, which makes reasoning about (*mspan).sweep much easier as well. Finally, since we don't need to acquire a lock, there's nothing wrong with an mcache taking complete ownership of a span.
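One way to picture "taking complete ownership" without a lock: if spans live in a concurrent container, whoever successfully removes a span is its only owner. The sketch below uses a Treiber-style stack built on `atomic.Pointer` (Go 1.19+) purely for illustration. It is not gcSweepBuf's actual design, and `spanStack`, `node`, and `span` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// span is a hypothetical stand-in for the runtime's mspan.
type span struct{ id int }

type node struct {
	s    *span
	next *node
}

// spanStack is a lock-free stack. A successful pop hands the span to
// exactly one caller, which then owns it exclusively.
type spanStack struct {
	head atomic.Pointer[node]
}

func (st *spanStack) push(s *span) {
	n := &node{s: s} // fresh node per push, so nodes are never reused
	for {
		old := st.head.Load()
		n.next = old
		if st.head.CompareAndSwap(old, n) {
			return
		}
	}
}

func (st *spanStack) pop() *span {
	for {
		old := st.head.Load()
		if old == nil {
			return nil
		}
		if st.head.CompareAndSwap(old, old.next) {
			return old.s
		}
	}
}

func main() {
	var st spanStack
	for i := 0; i < 1000; i++ {
		st.push(&span{id: i})
	}
	var wg sync.WaitGroup
	var claimed atomic.Int64
	for w := 0; w < 4; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for st.pop() != nil {
				claimed.Add(1) // this goroutine is the span's only owner here
			}
		}()
	}
	wg.Wait()
	fmt.Println("spans claimed:", claimed.Load()) // prints "spans claimed: 1000"
}
```

No goroutine ever iterates while others wait, and no span is seen by two goroutines, which is the ownership property the proposal is after for both sweepers and mcaches.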
There remains one place where multiple span ownership would still exist and that's with the page reclaimer, which will probably never be able to take ownership of a span. But that's OK, since it only ever sweeps spans which will be freed to the heap, so all the other mechanisms can just ignore spans which are picked up by the page reclaimer (identified by their sweepgen value).
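A minimal sketch of how a sweepgen value can signal that a span has already been picked up by someone else, so other mechanisms can skip it. It assumes the convention that a span whose sweepgen is two behind the heap's needs sweeping and one behind means it is being swept; the `span` type and the helpers `tryAcquire` and `finishSweep` are hypothetical names for this example.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// span is a hypothetical stand-in for the runtime's mspan.
type span struct {
	sweepgen atomic.Uint32
}

// tryAcquire attempts to claim s for sweeping in the cycle heapGen.
// Assumed convention: heapGen-2 means "needs sweeping" and heapGen-1
// means "being swept". Only one caller can win the CAS; everyone else
// sees a different sweepgen and simply ignores the span.
func tryAcquire(s *span, heapGen uint32) bool {
	return s.sweepgen.CompareAndSwap(heapGen-2, heapGen-1)
}

// finishSweep marks s as swept for the cycle heapGen.
func finishSweep(s *span, heapGen uint32) {
	s.sweepgen.Store(heapGen)
}

func main() {
	const heapGen = 4
	s := &span{}
	s.sweepgen.Store(heapGen - 2)

	fmt.Println("reclaimer claims span:", tryAcquire(s, heapGen))
	fmt.Println("sweeper claims span:", tryAcquire(s, heapGen))
	finishSweep(s, heapGen)
	fmt.Println("final sweepgen:", s.sweepgen.Load())
}
```

In this model the second caller does not block or retry; a failed claim is just the signal to move on to the next span.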