Rewrite exercise stuff

peteraa 2019-06-07 19:54:18 +02:00
parent 932413bb3d
commit f5d038eaf6
9 changed files with 330 additions and 148 deletions

@ -1,31 +0,0 @@
This is the coursework for the graded part of the TDT4255 course at NTNU.
Since it is the author's opinion that most tools out there are vastly underdesigned, this project comes
with a lot of added homegrown utilities, including a RISC-V parser, assembler and interpreter.
When you test a design with a given program, that program is first parsed, then run in a software interpreter
to get correct output, then assembled into a binary.
This binary will then be loaded to your synthesized design (your processor) by the test harness provided in
the skeleton code, along with any initial state.
Your processor will run the supplied binary, and the changes to state (memory and registers) will be recorded
and compared with the interpreter log.
If it matches, your processor works, if not, you get an execution trace, hopefully showing what went wrong and
where.
To get started, read the exercise.org file, it goes over the first pieces of the puzzle.
If you want to learn chisel on your own and use this project please send me some feedback on what you liked,
disliked and what could have been improved :)
If you end up using it for a course you're teaching I would be thrilled too.
In this case, you can spend the time you're saving by sending a pull request with some improvements!
Pull requests are more than welcome!
Nice to have list:
* More sophisticated test feedback. A detailed error report on why the processor design failed.
* Scaffolding to run synthesized designs. Preferably targeting the PYNQ platform.
* A fix for whatever problems *you* run into when using this project.
* Either a battery of tests to find corner cases stress testing forwarders, hazard detectors etc, or even better, the tools to generate code with hazards automatically.

58
README.org Normal file
@ -0,0 +1,58 @@
This is the coursework for the graded part of the TDT4255 course at NTNU.
* Instructions
To get started with designing your 5-stage RISC-V pipeline you should follow the
[[./exercise.org][Exercise instructions]]
* About
Since much of the tooling for HW design is rather difficult to work with, this skeleton comes
with a lot of reinvented wheels, which should make it a little clearer to inspect what is
really going on.
The FiveStage suite works in the following way:
** Parsing a test
The [[./src/test/scala/RISCV/Parser.scala][Parser]] parses an assembly test found in the test resource directory.
The resulting program can then be loaded onto a VM, or assembled into machine code.
** Interpreting the test
Next the parsed assembly code is run on a virtual machine.
Relevant information is then compiled in an execution trace log which shows which instruction was
performed at a given step and what the resulting state was.
** Preparing your circuit
Next up the chisel design is synthesized into a circuit emulator.
The (relatively seamless) test harness provided for your circuit is then used to preload
the instruction memory with the assembled machine code, as well as test-defined initial memory and
register configurations.
** Running your circuit
As with the VM, your circuit will leave an extensive log which is parsed and used to verify the
correctness of your design.
** Checking the result
If your processor performed the same updates to registers and memory, and terminated at the same
address the test is successful.
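The comparison described above can be pictured with a small plain-Scala sketch. This is not the suite's actual code: ~Update~, ~RegUpdate~, ~MemUpdate~ and ~TraceCheck~ are hypothetical names invented for illustration, and the real log types may look quite different.

#+BEGIN_SRC scala
// Hypothetical model of a recorded state update (assumption, not the suite's types).
sealed trait Update
final case class RegUpdate(reg: Int, value: Int) extends Update
final case class MemUpdate(address: Int, value: Int) extends Update

object TraceCheck {
  // Returns None when both logs agree, otherwise the index of the first
  // disagreement together with what each side recorded at that index.
  def firstMismatch(
      vm: Seq[Update],
      circuit: Seq[Update]
  ): Option[(Int, Option[Update], Option[Update])] = {
    val len = math.max(vm.length, circuit.length)
    (0 until len)
      .map(i => (i, vm.lift(i), circuit.lift(i)))
      .find { case (_, a, b) => a != b }
  }
}
#+END_SRC

A missing update on either side shows up as ~None~ for that side, so a processor that stops too early is reported just like one that writes a wrong value.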
** Debugging a failed test
When a test fails (or if you have enabled verbose logging), a side-by-side execution log is shown,
allowing you to pinpoint exactly where your processor went wrong.
* Intended use
This coursework is intended to be used!
If you are a tutor currently teaching computer architecture you may freely use this project, but
I would be very grateful if you provided me with feedback. Pull requests always welcome!
* Contributing
Considering the very significant amount of work saved on making your own coursework, you could
maybe help by adding features.
Take a look at [[./TODO.org][the TODO file]] (does not render well on GitHub) to get an idea of nice features to
have, or add different features altogether!
Additionally, if you write your own tests, please send a pull request! The more tests the better!
* Solution
This is a graded coursework, so I would prefer that if you fork this project you keep the solution
private.
If you want access to the solution please send me a message verifying that you are a tutor and I
will make it available to you.

@ -23,7 +23,6 @@
*** TODO Basic programs *** TODO Basic programs
Needs more Needs more
** DONE Merge in LF changes ** DONE Merge in LF changes
** TODO Breakpoints ** TODO Breakpoints
*** TODO VM breakpoints *** TODO VM breakpoints
**** TODO Record breakpoints in chisel tester **** TODO Record breakpoints in chisel tester
@ -40,7 +39,16 @@
*** DONE Use DONE address *** DONE Use DONE address
** DONE Hazard generator ** DONE Hazard generator
good enough good enough
** TODO Semantic logging
Currently logging is quite awkward, a combination of fansi and regular strings.
Ideally a markup format such as HTML should be used. There are already plenty of
good scala libraries for this, such as lihaoyi's stuff (big shoutout!)
** TODO Interactive stepping
This one is a pretty big undertaking, but it could be very useful to run the circuit in an interactive
environment.
https://venus.cs61c.org/ is a good example of how useful this can be for a virtual machine.
This task requires pretty good understanding of chisel.
* Maybe * Maybe
** DONE Move instruction recording to IMEM rather than IF? ** DONE Move instruction recording to IMEM rather than IF?
Only care about what IF gets, won't have to deal with whatever logic is in IF. Only care about what IF gets, won't have to deal with whatever logic is in IF.

@ -1,11 +1,138 @@
* Exercise 1 & 2 * Exercise 1
The task in this exercise is to implement a 5-stage pipelined processor for The task in this exercise is to implement a 5-stage pipelined processor for
the RISCV32I instruction set. the [[./instructions.org][RISCV32I instruction set]].
You will use the skeleton code which comes with a freebies, namely the registers, For exercise 1 you will build a 5-stage processor which handles one instruction
instruction memory and data memory. at a time, whereas in exercise 2 your design will handle multiple instructions
at a time.
Handling one instruction at a time is achieved by inserting 4 NOP instructions in between each
source instruction, enabling us to use the same tests for both exercise 1 and 2.
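As a rough illustration of the NOP padding, here is a minimal plain-Scala sketch. ~NopPad~ and its string representation of instructions are hypothetical, and the sketch pads after every instruction, which is an assumption about how the test suite does it.

#+BEGIN_SRC scala
object NopPad {
  // In RISC-V, nop is a pseudo-instruction for `addi x0, x0, 0`.
  val Nop = "nop"

  // Follow every source instruction with `n` NOPs, so that an instruction
  // has left the pipeline before the next real one enters it.
  def pad(program: Seq[String], n: Int = 4): Seq[String] =
    program.flatMap(instr => instr +: Seq.fill(n)(Nop))
}
#+END_SRC

With two source instructions and the default of 4, the padded program has 10 entries: each original instruction followed by four NOPs.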
In the project skeleton files ([[./src/main/scala/][Found here]]) you can see that a lot of code has
already been provided.
Before going further it is useful to get an overview of what is provided out
of the box.
+ [[./src/main/scala/Tile.scala]]
This is the top level module for the system as a whole. This is where the test
harness accesses your design, providing the necessary IO.
*You should not modify this module for other purposes than debugging.*
+ [[./src/main/scala/CPU.scala]]
This is the top level module for your processor.
In this module the various stages and barriers that make up your processor
should be declared and wired together.
Some of these modules have already been declared in order to wire up the
debugging logic for your test harness.
*This module is intended to be further fleshed out by you.*
+ [[./src/main/scala/IF.scala]]
This is the instruction fetch stage.
In this stage instruction fetching should happen, meaning you will have to
add logic for handling branches, jumps, and for exercise 2, stalls.
The reason this module is already included is that it contains the instruction
memory (described next), which is heavily coupled to the testing harness.
*This module is intended to be further fleshed out by you.*
+ [[./src/main/scala/IMem.scala]]
This module contains the instruction memory for your processor.
Upon testing the test harness loads your program into the instruction memory,
freeing you from the hassle.
*You should not modify this module for other purposes than maaaaybe debugging.*
+ [[./src/main/scala/ID.scala]]
The instruction decode stage.
The reason this module is included is that the registers reside here, thus
for the test harness to work it must be wired up to the register unit to
record its state updates.
*This module is intended to be further fleshed out by you.*
+ [[./src/main/scala/Registers.scala]]
Contains the registers for your processor. Note that the zero register is already
disabled, you do not need to do this yourself.
The test harness ensures that all register updates are recorded.
*You should not modify this module for other purposes than maaaaybe debugging.*
+ [[./src/main/scala/MEM.scala]]
Like ID and IF, the MEM skeleton module is included so that the test harness
can set up and monitor the data memory.
*This module is intended to be further fleshed out by you.*
+ [[./src/main/scala/DMem.scala]]
Like the registers and Imem, the DMem is already implemented.
*You should not modify this module for other purposes than maaaaybe debugging.*
+ [[./src/main/scala/Const.scala]]
Contains helpful constants for decoding, used by the decoder which is provided.
*This module may be fleshed out further by you if you so choose.*
+ [[./src/main/scala/Decoder.scala]]
The decoder shows how to conveniently demux the instruction.
In the provided ID.scala file a decoder module has already been instantiated.
You should flesh it out further.
You may find it useful to alter this module, especially in exercise 2.
*This module should be further fleshed out by you.*
+ [[./src/main/scala/ToplevelSignals.scala]]
Contains helpful constants.
You should add your own constants here when you find the need for them.
You are not required to use it at all, but it is very helpful.
*This module can be further fleshed out by you.*
+ [[./src/main/scala/SetupSignals.scala]]
You should obviously not modify this file.
You may choose to create a similar file for debug signals, modeled on how
the test harness is built.
*You should not modify this module at all.*
** Tests
In addition to the skeleton files it's useful to take a look at how the tests work.
You will not need to alter anything here other than the [[./src/test/scala/Manifest.scala][test manifest]], but some
of these settings can be quite useful to alter.
The main attraction is the test options. By altering the verbosity settings you
may change what is output.
The settings are
+ printIfSuccessful
Enables logging on tests that succeed
+ printErrors
Enables logging of errors. You obviously want this one on, at least on the single
test.
+ printParsedProgram
Prints the desugared program. Useful when the test asm contains instructions that
need to be expanded or altered.
Unsure what "bnez" means? Turn this setting on and see!
+ printVMtrace
Enables printing of the VM trace, showing how the ideal machine executes a test
+ printVMfinal
Enables printing of the final VM state, showing how the registers look after
completion. Useful if you want to see what a program returns.
+ printMergedTrace
Enables printing of a merged trace. With this option enabled you get to see how
the VM and your processor executed the program side by side.
This setting is extremely helpful to track down where your program goes wrong!
This option attempts to synchronize the execution traces as best as it can, however
once your processor design derails this becomes impossible, leading to rather
nonsensical output.
Instructions that were executed only by the VM or only by your design are colored red or
blue.
*IF YOU ARE COLOR BLIND YOU SHOULD ALTER THE DISPLAY COLORS!*
+ nopPadded
Set this to false when you're ready to enter the big-boy league
+ breakPoints
Not implemented. It's there as a teaser, urging you to implement it so I don't have to.
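To make the settings above concrete, here is a hypothetical mirror of them as a plain Scala case class. The field names come from the list above, but the types, the default values and the ~ExampleManifest~ object are assumptions for illustration; check the test manifest for the real definitions.

#+BEGIN_SRC scala
// Hypothetical mirror of the listed options; types and defaults are assumptions.
final case class TestOptions(
  printIfSuccessful: Boolean = false,
  printErrors: Boolean = true,
  printParsedProgram: Boolean = false,
  printVMtrace: Boolean = false,
  printVMfinal: Boolean = false,
  printMergedTrace: Boolean = true,
  nopPadded: Boolean = true,
  breakPoints: List[Int] = Nil // not implemented in the suite
)

object ExampleManifest {
  // Show the desugared program while tracking down an unfamiliar pseudo-instruction.
  val debugOptions = TestOptions(printParsedProgram = true)

  // Same settings, but without the merged trace and the parsed program.
  val quietOptions = debugOptions.copy(printMergedTrace = false, printParsedProgram = false)
}
#+END_SRC

Named arguments and ~copy~ keep each variation down to the settings that actually differ.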
These are contained in the files Registers.scala, Dmem.scala and Imem.scala
** Getting started ** Getting started
In order to make a correct design in a somewhat expedient fashion you need to be In order to make a correct design in a somewhat expedient fashion you need to be
@ -13,108 +140,130 @@
This means you should have a good idea of how your processor should work *before* This means you should have a good idea of how your processor should work *before*
you start writing code. While chisel is more pleasant to work with than other HDLs you start writing code. While chisel is more pleasant to work with than other HDLs
the bricoleur approach is not recommended. the [[https://i.imgur.com/6IpVNA7.jpg][bricoleur]] approach is not recommended.
My recommended approach is therefore to create a sketch of your processor design. My recommended approach is therefore to create an RTL sketch of your processor design.
Start with an overall sketch showing all the components, then drill down. Start with an overall sketch showing all the components, then drill down.
In your sketch you will eventually add a box for registers, IMEM and DMEM, which In your sketch you will eventually add a box for registers, IMEM and DMEM, which
should make it clear how the already finished modules fit into the grander design, should make it clear how the already finished modules fit into the grander design,
making the skeleton-code less mysterious. making the skeleton-code less mysterious.
Next, your focus should be to get the simplest possible program to work, a program
that simply does a single add operation. Info is progressively being omitted in the
later steps, after all brevity is ~~the soul of~~ wit
Step 0: ** Adding numbers
In order to verify that the project is set up properly, open sbt in your project root In order to get started designing your processor the following steps guide you to
by typing ./sbt (or simply sbt if you already use scala). implementing the necessary functionality for adding two integers.
sbt, which stands for scala build tool will provide you with a repl where you can
compile and test your code.
The initial run will take quite a while to boot as all the necessary stuff is downloaded. Info is progressively being omitted in the later steps in order to not bog you down
in repeated details. After all brevity is ~~the soul of~~ wit
Step ¼: *** Step 0
In your console, type `compile` to verify that everything compiles correctly. In order to verify that the project is set up properly, open sbt in your project root
by typing ~./sbt.sh~ (or simply sbt if you already use scala).
sbt, which stands for scala build tool will provide you with a repl where you can
compile and test your code.
Step ½: The initial run will take quite a while to boot as all the necessary stuff is downloaded.
In your console, type `test` to verify that the tests run, and that chisel can correctly
build your design.
This command will unleash the full battery of tests on you.
Step ¾: **** Step ¼:
In your console, type `testOnly FiveStage.SelectedTests` to run only the tests that you In your console, type ~compile~ to verify that everything compiles correctly.
have defined in the testConf.scala file.
In the skeleton this will run the simple add test only, but you should alter this
manifest as you build your processor to run more complex tests as a stopgap between
running single tests and the full battery.
Be aware that chisel will make quite a lot of noise during test running. I'm not **** Step ½:
aware of a good way to get rid of this sadly. In your console, type ~test~ to verify that the tests run, and that chisel can correctly
build your design.
This command will unleash the full battery of tests on you.
Step 1: **** Step ¾:
In order to do this, your processor must be able to select new instructions, so in In your console, type ~testOnly FiveStage.SingleTest~ to run only the tests that you
your IF.scala you must increment the PC. have defined in the [[./src/test/scala/Manifest.scala][test manifest]] (currently set to ~"forward2.s"~).
Step 2: As you will first implement addition you should change this to the [[./src/test/resources/tests/basic/immediate/addi.s][add immediate test]].
Next, the instruction must be forwarded to the ID stage, so you will need to add the Luckily you do not have to deal with file paths, simply changing ~"forward2.s"~ to
instruction to the io part of InstructionFetch as an output. ~"addi.s"~ suffices.
Step 3: Ensure that the addi test is run by repeating the ~testOnly FiveStage.SingleTest~
Your ID stage must take in an instruction in its io bundle, and decode it. In the command.
skeleton code a decoder has already been instantiated in the InstructionDecode module,
but it is given a dummy instruction.
Likewise, you must ensure that the register gets the relevant data.
This can be done by using the instruction class methods (TopLevelSignals.scala) which
lets us access the relevant part of the instruction with the dot operator.
For instance:
#+BEGIN_SRC scala *** Step 1:
myModule.io.funct6 := io.instruction.funct6 In order to execute instructions your processor must be able to fetch them.
#+END_SRC In [[./src/test/main/IF.scala]] you can see that the IMEM module is already set to fetch
the current program counter address (line 41), however since the current PC is stuck
at 0 it will fetch the same instruction over and over. Rectify this by commenting in
~// PC := PC + 4.U~ at line 43.
You can now verify that your design fetches new instructions each cycle by running
the test as in the previous step.
drives funct6 of `myModule` with the 26th to 31st bit of `instruction`. *** Step 2:
Next, the instruction must be forwarded to the ID stage, so you will need to add the
instruction to the io interface of the IF module as an output signal.
In [[./src/main/scala/IF.scala]] at line 21 you can see how the program counter is already
defined as an output.
You should do the same with the instruction signal.
Step 4:
Your IF should now have an instruction as an OUTPUT, and your ID as an INPUT, however
they are not connected. This must be done in the CPU class where both the ID and IF are
instantiated.
Step 4½: *** Step 3:
You should now verify that the correct control signals are produced. Using printf, ensure As you defined the instruction as an output for your IF module, declare it as an input
that: in your ID module ([[./src/main/scala/ID.scala]] line 21).
+ The program counter is increasing in increments of 4
+ The instruction in ID is as expected
+ The decoder output is as expected
+ The correct operands are fetched from the registers
Step 5: Next you need to ensure that the registers and decoder get the relevant data from the
You will now have to create the EX stage. Use the structure of the IF and ID modules to instruction.
guide you here.
In your EX stage you should have an ALU, preferably in its own module a la registers in ID.
While the ALU is hugely complex, it's very easy to describe in hardware design languages!
Using the same approach as in the decoder should be sufficient:
#+BEGIN_SRC scala This is made more convenient by the fact that `Instruction` is a class, allowing you
val ALUopMap = Array( to access methods defined on it.
ADD -> (io.op1 + io.op2), Keep in mind that it is only a class at compile and synthesis time, it will be
SUB -> (io.op1 - io.op2), indistinguishable from a regular ~UInt(32.W)~ in your finished circuit.
... The methods can be accessed like this:
) #+BEGIN_SRC scala
// Drive funct6 of myModule with the 26th to 31st bit of instruction
myModule.io.funct6 := io.instruction.funct6
#+END_SRC
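To see what such field accessors compute, here is a plain-Scala software model of slicing a 32-bit instruction word. The ~Fields~ object and its method names are hypothetical and are not the methods defined on the skeleton's ~Instruction~ class; the bit positions are the standard RV32I I-type layout.

#+BEGIN_SRC scala
object Fields {
  // Extract bits hi..lo (inclusive) of a 32-bit word; intended for fields narrower than 32 bits.
  def bits(word: Int, hi: Int, lo: Int): Int =
    (word >>> lo) & ((1 << (hi - lo + 1)) - 1)

  def opcode(w: Int): Int = bits(w, 6, 0)
  def rd(w: Int): Int     = bits(w, 11, 7)
  def funct3(w: Int): Int = bits(w, 14, 12)
  def rs1(w: Int): Int    = bits(w, 19, 15)

  // I-type immediate: bits 31..20. The arithmetic shift sign-extends it.
  def immI(w: Int): Int = w >> 20
}
#+END_SRC

For example, ~addi x1, x0, 5~ assembles to ~0x00500093~, which this model splits into opcode ~0x13~, rd 1, funct3 0, rs1 0 and immediate 5.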
io.aluResult := MuxLookup(io.aluOp, 0.U(32.W), ALUopMap) *** Step 4:
#+END_SRC Your IF should now have an instruction as an OUTPUT, and your ID as an INPUT, however
they are not connected. This must be done in the CPU class where both the ID and IF are
instantiated.
Step 6: **** Step 4½:
Your MEM stage does very little when an ADD instruction is executed, so implementing it should You should now verify that the correct control signals are produced. Using printf, ensure
be easy that:
+ The program counter is increasing in increments of 4
+ The instruction in ID is as expected
+ The decoder output is as expected
+ The correct operands are fetched from the registers
Step 7: Keep in mind that printf might not always be cycle accurate, the point is to ensure that
You now need to actually write the result back to your register bank. your processor design at least does something!
This should be handled at the CPU level.
If you sketched your processor already you probably made sure to keep track of the control
signals for the instruction currently in WB, so writing to the correct register address should
be easy for you ;)
Step 8: *** Step 5:
Ensure that the simplest add test works, give yourself a pat on the back, you've just found the You will now have to create the EX stage. Use the structure of the IF and ID modules to
corner pieces of the puzzle, so filling in the rest is "simply" being methodical. guide you here.
In your EX stage you should have an ALU, preferably in its own module a la registers in ID.
While the ALU is hugely complex, it's very easy to describe in hardware design languages!
Using the same approach as in the decoder should be sufficient:
#+BEGIN_SRC scala
val ALUopMap = Array(
ADD -> (io.op1 + io.op2),
SUB -> (io.op1 - io.op2),
...
)
// MuxLookup takes the selector first, then the default, then the mapping
io.aluResult := MuxLookup(io.aluOp, 0.U(32.W), ALUopMap)
#+END_SRC
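If the lookup-with-default idea is unfamiliar, the following plain-Scala model may help. It is only a software analogy, not chisel code: ~AluModel~ is a hypothetical name, and the op codes are copied from the ~ALUOps~ object in the skeleton.

#+BEGIN_SRC scala
object AluModel {
  // Same numbering as the first entries of ALUOps in the skeleton.
  val ADD = 0
  val SUB = 1
  val AND = 2
  val OR  = 3
  val XOR = 4

  // Every candidate result is computed, then the one matching `op` is
  // selected, much like a mux tree in hardware. Unknown ops yield 0.
  def alu(op: Int, a: Int, b: Int): Int = {
    val table: Seq[(Int, Int)] = Seq(
      ADD -> (a + b),
      SUB -> (a - b),
      AND -> (a & b),
      OR  -> (a | b),
      XOR -> (a ^ b)
    )
    table.find(_._1 == op).map(_._2).getOrElse(0)
  }
}
#+END_SRC

Calling ~AluModel.alu(AluModel.ADD, 2, 3)~ selects the sum, while an op code that is not in the table falls through to the default of 0.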
*** Step 6:
Your MEM stage does very little when an ADDI instruction is executed, so implementing it should
be easy. All you have to do is forward signals.
*** Step 7:
You now need to actually write the result back to your register bank.
This should be handled at the CPU level.
If you sketched your processor already you probably made sure to keep track of the control
signals for the instruction currently in WB, so writing to the correct register address should
be easy for you ;)
If you ended up driving the register write address with the instruction from IF you should take
a moment to reflect on why that was the wrong choice.
*** Step 8:
Ensure that the simplest addi test works, and give yourself a pat on the back!
You've just found the corner pieces of the puzzle, so filling in the rest is "simply" being methodical.

@ -5,26 +5,6 @@ import chisel3.util._
import chisel3.core.Input import chisel3.core.Input
import chisel3.iotesters.PeekPokeTester import chisel3.iotesters.PeekPokeTester
// From RISC-V reference card
object ALUOps {
val ADD = 0.U(4.W)
val SUB = 1.U(4.W)
val AND = 2.U(4.W)
val OR = 3.U(4.W)
val XOR = 4.U(4.W)
val SLT = 5.U(4.W)
val SLL = 6.U(4.W)
val SLTU = 7.U(4.W)
val SRL = 8.U(4.W)
val SRA = 9.U(4.W)
val COPY_A = 10.U(4.W)
val COPY_B = 11.U(4.W)
val DC = 15.U(4.W)
}
object lookup { object lookup {
def BEQ = BitPat("b?????????????????000?????1100011") def BEQ = BitPat("b?????????????????000?????1100011")
def BNE = BitPat("b?????????????????001?????1100011") def BNE = BitPat("b?????????????????001?????1100011")

@ -40,7 +40,7 @@ class Decoder() extends Module {
* The reason for this is that it serves as convenient sugar to make maps. * The reason for this is that it serves as convenient sugar to make maps.
* *
* This doesn't matter to you, just fill in the blanks in the style currently * This doesn't matter to you, just fill in the blanks in the style currently
* used, I just want to demystify some of the magic. * used, I just want to demystify some of the scala magic.
* *
* `a -> b` == `(a, b)` == `Tuple2(a, b)` * `a -> b` == `(a, b)` == `Tuple2(a, b)`
*/ */

@ -40,7 +40,7 @@ class InstructionFetch extends MultiIOModule {
io.PC := PC io.PC := PC
IMEM.io.instructionAddress := PC IMEM.io.instructionAddress := PC
PC := PC + 4.U // PC := PC + 4.U
val instruction = Wire(new Instruction) val instruction = Wire(new Instruction)
instruction := IMEM.io.instruction.asTypeOf(new Instruction) instruction := IMEM.io.instruction.asTypeOf(new Instruction)

@ -103,3 +103,21 @@ object ImmFormat {
val SHAMT = 5.asUInt(3.W) val SHAMT = 5.asUInt(3.W)
val DC = 0.asUInt(3.W) val DC = 0.asUInt(3.W)
} }
object ALUOps {
val ADD = 0.U(4.W)
val SUB = 1.U(4.W)
val AND = 2.U(4.W)
val OR = 3.U(4.W)
val XOR = 4.U(4.W)
val SLT = 5.U(4.W)
val SLL = 6.U(4.W)
val SLTU = 7.U(4.W)
val SRL = 8.U(4.W)
val SRA = 9.U(4.W)
val COPY_A = 10.U(4.W)
val COPY_B = 11.U(4.W)
val DC = 15.U(4.W)
}

@ -20,7 +20,7 @@ object Manifest {
val singleTest = "forward2.s" val singleTest = "forward2.s"
val nopPadded = false val nopPadded = true
val singleTestOptions = TestOptions( val singleTestOptions = TestOptions(
printIfSuccessful = true, printIfSuccessful = true,