基于五阶段流水线的RISC-V CPU模拟器实现 (8)

日期：2021-08-09 栏目：程序人生浏览：次

具体实现如下

bool BranchPredictor::predict(uint32_t pc, uint32_t insttype, int64_t op1, int64_t op2, int64_t offset) { switch (this->strategy) { case NT: return false; case AT: return true; case BTFNT: { if (offset >= 0) { return false; } else { return true; } } break; case BPB: { PredictorState state = this->predbuf[pc % PRED_BUF_SIZE]; if (state == STRONG_TAKEN || state == WEAK_TAKEN) { return true; } else if (state == STRONG_NOT_TAKEN || state == WEAK_NOT_TAKEN) { return false; } else { dbgprintf("Strange Prediction Buffer!\n"); } } break; default: dbgprintf("Unknown Branch Perdiction Strategy!\n"); break; } return false; } void BranchPredictor::update(uint32_t pc, bool branch) { int id = pc % PRED_BUF_SIZE; PredictorState state = this->predbuf[id]; if (branch) { if (state == STRONG_NOT_TAKEN) { this->predbuf[id] = WEAK_NOT_TAKEN; } else if (state == WEAK_NOT_TAKEN) { this->predbuf[id] = WEAK_TAKEN; } else if (state == WEAK_TAKEN) { this->predbuf[id] = STRONG_TAKEN; } // do nothing if STRONG_TAKEN } else { // not taken if (state == STRONG_TAKEN) { this->predbuf[id] = WEAK_TAKEN; } else if (state == WEAK_TAKEN) { this->predbuf[id] = WEAK_NOT_TAKEN; } else if (state == WEAK_NOT_TAKEN) { this->predbuf[id] = STRONG_NOT_TAKEN; } // do noting if STRONG_NOT_TAKEN } }

并且在解码阶段和执行阶段添加有关分支预测的代码

// Sumulator::decode() bool predictedBranch = false; if (isBranch(insttype)) { predictedBranch = this->branchPredictor->predict(this->fReg.pc, insttype, op1, op2, offset); if (predictedBranch) { this->predictedPC = this->fReg.pc + offset; this->anotherPC = this->fReg.pc + 4; this->fRegNew.bubble = true; } else { this->anotherPC = this->fReg.pc + offset; } } // Simulator::execute() if (isBranch(inst)) { ... // this->dReg.pc: fetch original inst addr, not the modified one this->branchPredictor->update(this->dReg.pc, branch); }

需要注意的是将PC修改为预测器预测的PC的时机，必须要在一个周期的结束时，也就是simulate()函数中循环的末尾处。

// The Branch perdiction happens here to avoid strange bugs in branch prediction if (!this->dReg.bubble && !this->dReg.stall && !this->fReg.stall && this->dReg.predictedBranch) { this->pc = this->predictedPC; }

这样即可完成分支预测模块的实现，并且很容易能够扩展出新的分支预测策略。

5.2 分支预测模块的性能评测

有趣的是，有了这个分支预测模块之后，我们可以对不同分支预测策略的性能进行评测。下面的表格是一个对分支预测准确率的简单统计。

评测程序 Always Taken BTFNT Prediction Buffer
helloworld.riscv 0.4706 0.7059 0.4706
quicksort.riscv 0.5075 0.9506 0.9587
matrixmult.riscv 0.6235 0.6325 0.6275
ackermann.riscv 0.4955 0.5053 0.9593

我们可以看到，对于helloworld程序，由于程序过于简单，其中绝大多数指令只会被执行一次，所以基于历史信息的Prediction Buffer方法退化到了Always Taken方法(因为默认预测是选择跳转)，而基于程序结构的经验性判断方法BTFNT反而取得了最高的准确率。

对于快速排序评测程序，我们发现Prediction Buffer和BTFNT都取得了极其高的预测准确率。这是因为排序元素较多(100个)，并且绝大多数情况下都在反复执行很少的一段代码。由于这些代码绝大多数都满足向前会跳转的性质，所以BTFNT方法的准确率很高。由于循环的执行长度非常长(约100次)，所以基于历史信息的Predicton Buffer能够很好地获得较高的预测准确性。

对于矩阵乘法程序，三个分支预测算法的表现非常接近。这可能是由于矩阵乘法中每次循环的执行长度都很短(10个元素)，限制了BTFNT和Prediction Buffer的性能。

对于Ackermann函数求解程序，其中完全没有循环语句，只有函数递归调用和条件判断语句，绝大多数的分支指令都在递归调用的函数内，因此，这时基于历史信息的Prediction Buffer就能发挥出最大威力，得出相当高的预测准确率，而BTFNT在此则相对比较受限了，如果递归函数内刚好两个if语句，一个if语句是向前跳转，一个if语句是向后跳转，而两条语句在大多数情况下都是跳转，那么BTFNT的准确率就会在50%左右徘徊。

5.3 意见和建议

转载注明出处：https://www.heiqu.com/zyfxjz.html

基于五阶段流水线的RISC-V CPU模拟器实现 (8)

相关推荐