I think we may be looking at these the wrong way. Yes, there’s a visible throughput/latency improvement here, but what about other factors? Power savings? Cache efficiency? CPU cycles freed up for other co-running processes?
These are going to be pretty hard to measure without an x86_64 simulator, so I don’t fault them for not including such benchmarks. But there might be more to the story here.