Wei Wu / @lazyparser

PLCT Lab. OSDT/HelloGCC/HelloLLVM. RISC-V Ambassador.

Download as .zip Download as .tar.gz View on GitHub



开个玩笑。你肯定不是没有节操的人,但是严谨的你需要阅读一下Todd Mytkowicz等人的论文,防止自己在实验环节出现“Measurement Bias”错误。计算机发展到现在已经变得非常复杂,各种看起来微不足道的因素都有可能影响到你的性能测试结果。只有在实验的时候充分的考虑到这些可能的因素并积极的避免,才可能会让自己的实验结果更加的可信。


This paper presents a surprising result: changing a seemingly innocuous aspect of an experimental setup can cause a systems researcher to draw wrong conclusions from an experiment. What appears to be an innocuous aspect in the experimental setup may in fact introduce a significant bias in an evaluation. This phenomenon is called measurement bias in the natural and social sciences.

Our results demonstrate that measurement bias is significant and commonplace in computer system evaluation. By significant we mean that measurement bias can lead to a performance analysis that either over-states an effect or even yields an incorrect conclusion. By commonplace we mean that measurement bias occurs in all architectures that we tried (Pentium 4, Core 2, and m5 O3CPU), both compilers that we tried (gcc and Intel’s C compiler), and most of the SPEC CPU2006 C programs. Thus, we cannot ignore measurement bias. Nevertheless, in a literature survey of 133 recent papers from ASPLOS, PACT, PLDI, and CGO, we determined that none of the papers with experimental results adequately consider measurement bias.

Inspired by similar problems and their solutions in other sciences, we describe and demonstrate two methods, one for detecting (causal analysis) and one for avoiding (setup randomization) measurement bias.

[1]: Producing wrong data without doing anything obviously wrong