Replaces drcachesim's loops over all ways with a hashtable lookup. For larger cache hierarchies and caches with higher associativity this increases performance by 15% in cpu-bound tests on offline traces, when we use a large initial table size to avoid resizes which seem to outweigh the gains.
The hashtable unfortunately results in a 15% slowdown on simple cache hierarchies, due to the extra time in erase() and other maintenance operations outweighing the smaller gains in lookup. Thus, we make the default to not use a hashtable and use the original linear walk, providing a method to optionally enable the hashtable. The cache simulator enables the hashtables for any 3+-level cache hierarchy with either coherence or many cores.
Adds coherence to some existing 3-level-hierarchy tests to ensure we have tests that cover the hashtable path.
The TLB simulator will need to tweak these hashtables: but it looks like it is already doing the wrong thing in invalidate() and other simulator_t methods, filed as #4816.
Activity
requested review from @root
363 363 success_ = false; 364 364 return; 365 365 } 366 // For larger hierarchies, especially with coherence, using hashtables - Last updated by Derek Bruening
161 204 addr_t last_tag_; 162 205 int last_way_; 163 206 int last_block_idx_; 207 // Optimization: keep a hashtable for quick lookup of {block,way} 208 // given a tag, if using a large cache hierarchy where serial 209 // walks over the associativity end up as bottlenecks. 210 // We can't easily remove the blocks_ array and replace with just 211 // the hashtable as replace_which_way(), etc. want quick access to 212 // every way for a given line index. 213 std::unordered_map<addr_t, std::pair<caching_device_block_t *, int>, Created by: xyzsam
Two comments:
- absl containers are open-source, have we considered migrating to them? I wonder how much perf we could get with that. Obviously not for this PR of course.
- Is it necessary to provide the hash function signature? IIUC
std::hash<addr_t>
is expected to returnstd::size_t
.
-
That is worth considering. There are other places where the std tables were sub-optimal: I think it was raw2trace. Here we go: https://github.com/DynamoRIO/dynamorio/issues/2056#issuecomment-333167409. I filed #4844 on absl.
-
It seems to be required for the hash function as a lambda.
-
- Last updated by Derek Bruening