Lab6

  • November 2019

Paul Fake and Stephen Beard
Lab 6 Write-up

Note: We would like to give credit to Adam Miller for his test suite, which we used to test our instructions. Also, with mispim compiled with O3, Shang should run in about 3 minutes for O3 and about 20 minutes for no optimization.

1) If you are building a processor and have to do static branch prediction (meaning you have to assume at compile time whether a branch is taken or not), how should you do it? You can make a different decision for branches that go forward or backward.

Backward branches typically close loops. A loop condition passes on every iteration except the last, so it may pass many times but fails only once per loop. Backward branches are therefore taken more often than not, and it is useful for the compiler to assume a backward branch will be taken and place one of the branch-taken instructions into the branch delay slot. Forward branches behave differently depending on optimization: without optimization, many more forward branches are taken than not taken, but with optimization many more are not taken than taken. Predicting not taken slows down the non-optimized case but speeds up the optimized case, so the processor should predict that forward branches will not be taken.

2) How good is the gcc MIPS compiler in filling the branch delay slot?

Without any optimization, the compiler does not put any useful instructions into the branch delay slot. With O3, however, there are 8.25 times as many useful instructions after a branch as there are useless ones, so the compiler does a fairly good job of putting useful instructions in the delay slot.

3) How good is the gcc MIPS compiler in avoiding load-use hazards?

Whether optimized or not, the instruction after a load never uses the result of that load. Where a hazard would otherwise occur, the compiler places a no-op after the load instead of the hazardous instruction. So the compiler does a good job of avoiding load-use hazards.
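The load-use check from question 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the instrumentation used in the lab: the `Instr` record, the `LOAD_OPS` set, and the two tiny instruction sequences are hypothetical, and a real check would decode instructions from the simulator's trace.

```python
from typing import List, NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    """Hypothetical decoded instruction (not the lab's data format)."""
    op: str                  # mnemonic, e.g. "lw", "addu", "nop"
    dest: Optional[str]      # register written, or None
    srcs: Tuple[str, ...]    # registers read


# Assumed set of MIPS integer load mnemonics for this sketch.
LOAD_OPS = {"lw", "lh", "lhu", "lb", "lbu"}


def count_load_use_hazards(trace: List[Instr]) -> int:
    """Count loads whose result is read by the very next instruction."""
    hazards = 0
    for prev, cur in zip(trace, trace[1:]):
        if (
            prev.op in LOAD_OPS
            and prev.dest is not None
            and prev.dest != "$zero"
            and prev.dest in cur.srcs
        ):
            hazards += 1
    return hazards


# A load immediately followed by a use of the loaded register.
hazardous = [
    Instr("lw", "$t0", ("$sp",)),
    Instr("addu", "$t1", ("$t0", "$t2")),
]

# The same sequence with a no-op separating the load from its use.
padded = [
    Instr("lw", "$t0", ("$sp",)),
    Instr("nop", None, ()),
    Instr("addu", "$t1", ("$t0", "$t2")),
]

print(count_load_use_hazards(hazardous))
print(count_load_use_hazards(padded))
```

A count of zero over a whole trace would match the observation above that the instruction after a load never uses the loaded value.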
4) If you are building a 256-byte direct-mapped cache, what should you choose as your block (line) size?

For O3, a block size of 32 bytes yields the highest hit rate, so that would be an ideal size for a 256-byte cache (assuming that most programs are released with optimization).

5) What conclusions can you draw about the differences between compiling with no optimization and -O3 optimization?

O3 optimization places useful instructions in branch delay slots, puts more non-hazardous instructions after a load, and makes better use of registers, resulting in fewer loads and stores. Without optimization, branch delay slots are filled with no-ops, load-use hazards are remedied with no-ops, and most variables are kept in memory instead of registers.
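For the block-size question (4), the kind of experiment involved can be sketched as a small direct-mapped cache model swept over several block sizes. This is a minimal sketch, not the lab's simulator: the `hit_rate` function is hypothetical, and the address trace is a synthetic sequential scan, so its numbers do not reproduce the 32-byte result reported above.

```python
from typing import Iterable, List, Optional


def hit_rate(addresses: Iterable[int], cache_bytes: int, block_bytes: int) -> float:
    """Hit rate of a direct-mapped cache over a sequence of byte addresses."""
    num_lines = cache_bytes // block_bytes
    tags: List[Optional[int]] = [None] * num_lines  # one stored tag per line
    hits = 0
    total = 0
    for addr in addresses:
        block = addr // block_bytes   # block number in memory
        index = block % num_lines     # cache line this block maps to
        tag = block // num_lines      # distinguishes blocks sharing a line
        total += 1
        if tags[index] == tag:
            hits += 1
        else:
            tags[index] = tag         # miss: replace the line's contents
    return hits / total if total else 0.0


# Synthetic trace (an assumption): two sequential passes of 4-byte reads
# over a 1 KiB array, which is larger than the 256-byte cache.
trace = [addr for _ in range(2) for addr in range(0, 1024, 4)]

for block_bytes in (4, 8, 16, 32, 64):
    print(block_bytes, hit_rate(trace, 256, block_bytes))
```

On a purely sequential trace like this one, larger blocks always help because every miss prefetches neighboring words. Real program traces also contain conflicts between lines, and with only 256 bytes a larger block means fewer lines, which is why a measured optimum can fall at an intermediate size.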
