Cache-Line Archaeology: Finding and Fixing False Sharing in Production
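Your threads are doing independent work on independent data, and yet adding a second thread makes everything six times slower. This is false sharing: variables that are logically independent end up on the same cache line, so every write from one core invalidates the copy held by the other. It hides in struct layouts and in per-thread counters packed side by side, across more production codebases than anyone wants to admit.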
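
To make the struct-layout case concrete, here is a minimal sketch that uses Python's `ctypes` to mirror a C layout and report which cache line each field starts on. The 64-byte line size, the struct names, the field names, and the `cache_lines` helper are all assumptions made up for illustration. The sketch also assumes the struct itself begins on a line boundary, which a real allocator does not guarantee.

```python
import ctypes

CACHE_LINE = 64  # assumed line size in bytes; common on x86-64, not universal


class PackedCounters(ctypes.Structure):
    """Hypothetical layout: two per-thread counters placed back to back."""
    _fields_ = [
        ("thread0_hits", ctypes.c_uint64),
        ("thread1_hits", ctypes.c_uint64),
    ]


class PaddedCounters(ctypes.Structure):
    """Same counters, with explicit padding so each one starts a new line."""
    _fields_ = [
        ("thread0_hits", ctypes.c_uint64),
        ("_pad0", ctypes.c_char * (CACHE_LINE - 8)),
        ("thread1_hits", ctypes.c_uint64),
        ("_pad1", ctypes.c_char * (CACHE_LINE - 8)),
    ]


def cache_lines(struct_type, line_size=CACHE_LINE):
    """Map each non-padding field to the index of the cache line it starts on.

    Offsets are relative to the start of the struct, so the result is only
    meaningful if the struct is itself aligned to `line_size` in memory.
    """
    return {
        name: getattr(struct_type, name).offset // line_size
        for name, _ctype in struct_type._fields_
        if not name.startswith("_pad")
    }


if __name__ == "__main__":
    print("packed:", cache_lines(PackedCounters))
    print("padded:", cache_lines(PaddedCounters))
```

In the packed layout both counters land on line 0, so two threads updating "their own" counter would still contend for one line. In the padded layout they land on lines 0 and 1, at the cost of growing the struct from 16 to 128 bytes.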