Date of Original Version
Abstract or Description
Thread-level speculation (TLS) allows us to automatically parallelize general-purpose programs by supporting parallel execution of threads that might not actually be independent. In this paper, we show that the key to good performance ties in the three different ways to communicate a value between speculative threads: speculation, synchronization and prediction. The difficult part is deciding how and when to apply each method. This paper shows how we can apply value prediction, dynamic synchronization and hardware instruction prioritization to improve value communication and hence performance in several SPECint benchmarks that have been automatically transformed by our compiler to exploit TLS. We find that value prediction can be effective when properly throttled to avoid the high costs of mis-prediction, while most of the gains of value prediction can be more easily achieved by exploiting silent stores. We also show that dynamic synchronization is quite effective for most benchmarks, while hardware instruction prioritization is not. Overall, we find that these techniques have great potential for improving the performance of TLS.
Proceedings of the Eighth International Symposium on High-Performance Computer Architecture (HPCA '02).