Modern servers typically process request streams by assigning a worker thread to each request and rely on a round-robin policy for context switching. Although this programming paradigm is intuitive, it is oblivious to the execution state and ignores each software module's affinity to the processor caches. As a result, resumed threads of execution suffer additional delays due to conflict and compulsory misses while populating the caches with their evicted working sets. Alternatively, the...