Question
Why does this code
const float x[16] = { 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8,
                      1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6};
const float z[16] = {1.123, 1.234, 1.345, 156.467, 1.578, 1.689, 1.790, 1.812,
                     1.923, 2.034, 2.145, 2.256, 2.367, 2.478, 2.589, 2.690};
float y[16];
for (int i = 0; i < 16; i++)
{
    y[i] = x[i];
}
for (int j = 0; j < 9000000; j++)
{
    for (int i = 0; i < 16; i++)
    {
        y[i] *= x[i];
        y[i] /= z[i];
        y[i] = y[i] + 0.1f; // <--
        y[i] = y[i] - 0.1f; // <--
    }
}
run about 10 times faster than the following code?
const float x[16] = { 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8,
                      1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6};
const float z[16] = {1.123, 1.234, 1.345, 156.467, 1.578, 1.689, 1.790, 1.812,
                     1.923, 2.034, 2.145, 2.256, 2.367, 2.478, 2.589, 2.690};
float y[16];
for (int i = 0; i < 16; i++)
{
    y[i] = x[i];
}
for (int j = 0; j < 9000000; j++)
{
    for (int i = 0; i < 16; i++)
    {
        y[i] *= x[i];
        y[i] /= z[i];
        y[i] = y[i] + 0; // <--
        y[i] = y[i] - 0; // <--
    }
}
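Answer
This is caused by denormalized (subnormal) floating-point numbers.
Processors typically handle operations on denormal floats 10 to 100 times more slowly than operations on normal floats. In the second version, the values in y keep shrinking until they become denormal, whereas adding and then subtracting 0.1f rounds away the tiny low-order bits, so the values stay normal or become exactly zero. The timings below were measured for the code above, varying the constant in the two marked lines: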
Constant | Run 1 | Run 2 | Run 3 |
---|---|---|---|
0.1f | 0.771s | 0.683s | 0.663s |
0 | 12.157s | 12.226s | 12.496s |
0.0f | 12.108s | 12.171s | 12.161s |
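
The effect behind these timings can be checked without a stopwatch. The following is a minimal sketch, not the benchmark used above: it applies the question's inner-loop arithmetic to a single element (x[0] and z[0]) and classifies the final value with std::fpclassify. The helper names run_recurrence and final_class are hypothetical, and the sketch assumes an IEEE 754 target in the default rounding mode, with float arithmetic done in single precision and no flush-to-zero (for example, no -ffast-math).

```cpp
#include <cmath>

// Hypothetical helper: repeats the question's inner-loop arithmetic on one
// element. `offset` plays the role of the 0.1f / 0 constant in the question.
float run_recurrence(float offset, int iterations) {
    const float x = 1.1f;    // x[0] from the question
    const float z = 1.123f;  // z[0] from the question
    float y = x;
    for (int j = 0; j < iterations; ++j) {
        y *= x;
        y /= z;          // x / z < 1, so y shrinks on every pass
        y = y + offset;  // with 0.1f, the sum is rounded near 0.1 ...
        y = y - offset;  // ... so y cannot keep shrinking indefinitely
    }
    return y;
}

// Hypothetical helper: returns FP_NORMAL, FP_SUBNORMAL, FP_ZERO, etc.
// for the value the recurrence ends on.
int final_class(float offset, int iterations) {
    return std::fpclassify(run_recurrence(offset, iterations));
}
```

Under those assumptions, final_class(0.0f, 20000) should report FP_SUBNORMAL, while final_class(0.1f, 20000) should report FP_NORMAL. That matches the slow and fast rows of the table respectively.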
For more on denormal and normal floating-point numbers, see: https://blog.csdn.net/AaricYang/article/details/91358149