gem5 Project
Computer Architecture
Project 1
Objective: Become familiar with gem5 and learn the implications of different processor
parameters, such as fetch width and commit width, and of different execution units.
In this project, you will use the gem5 processor simulator, which is popular in academia. You
will run experiments with different benchmarks and learn how to simulate their performance on
gem5. For these benchmarks, you will analyze the impact of different pipeline width and
execution unit parameters on performance metrics. To learn how to download, build, and use
gem5, please refer to the supporting resources.
Note: We are using gem5 to simulate an ARM processor. If you followed all the steps in the
Getting Started document, this should already be configured.
Benchmarks: Performance-based system design has made benchmarking a crucial aspect of the
design process. Different benchmarks target specific areas of computation. For example, the
widely used SPEC (Standard Performance Evaluation Corporation) CPU benchmarks characterize
workloads for general-purpose computers. They are used to evaluate the efficacy of different
microarchitecture design aspects, such as data-level or instruction-level parallelism
techniques and sophisticated caching or branch-prediction techniques. For this project, we will
use benchmarks from the MiBench suite, which consists of benchmarks for embedded processors.
We will mainly work with the networking benchmark dijkstra, the binary of which is provided to
you. You may also simulate other benchmarks, such as FFT, qsort, or basicmath, on gem5 and
analyze the performance statistics.
CPU Model: gem5 can simulate several CPU models. Please refer to
http://www.m5sim.org/CPU_Models to learn more about them. The AtomicSimpleCPU model is a
functional simulator that uses atomic memory accesses and can be used to profile instructions
and collect simulation statistics. The TimingSimpleCPU model uses timing memory accesses and
models memory accesses in greater detail, which is needed when dealing with different cache
configurations. In this project we will use an out-of-order CPU model called DerivO3CPU. The
O3 CPU model exposes parameters such as fetch width and issue width, and execution units such
as integer and floating-point arithmetic logic units (IntALU, FP_ALU), integer and
floating-point multiply/divide units, and SIMD units.
How to run a benchmark program on gem5?
To simulate the execution of the dijkstra benchmark on an ARM processor using the DerivO3CPU
model with L1 caches and an L2 cache enabled, run the following command (entered as a single
line):
> ./build/ARM/gem5.opt ./configs/example/se.py --cpu-type=DerivO3CPU --caches --l2cache -c ./benchmarks/dijkstra/dijkstra_small -o ./benchmarks/dijkstra/input.dat
This command uses the default cache sizes and associativities. A specific cache configuration
(for example, a direct-mapped L1 data cache of 1 kB, a 16-way set-associative L2 cache of
16 kB, or an L1 instruction cache of 2 kB) is selected through additional options. To list
those options and to understand what each of the above parameters represents, run:
> build/ARM/gem5.opt configs/example/se.py -h
Things to do before the project:
In this project, you will analyze the performance statistics for different fetch, issue, and
commit widths of the O3 CPU model and for different numbers of ALUs and multiply/divide units
in the processor.
You will need to store the results of different configurations of the same benchmark. You can
use the -d option, as shown below, to run a configuration and store its output in a separate
folder (enter the command as a single line):
> ./build/ARM/gem5.opt -d dijkstra-Conf1 ./configs/example/se.py --cpu-type=DerivO3CPU --caches --l2cache -c ./benchmarks/dijkstra/dijkstra_small -o ./benchmarks/dijkstra/input.dat
By default, gem5 stores its output files in the folder m5out. The -d option instead redirects
the output files, including stats.txt, to a directory called dijkstra-Conf1 (which is created
at run time). To learn more about how to save your stats in your own directory rather than
m5out, explore:
> build/ARM/gem5.opt -h
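As a convenience, the per-configuration command lines can be generated rather than typed by
hand. The following is a minimal sketch that only builds and prints the commands; it does not
run gem5. The paths are the ones used in this handout, and the helper name gem5_command is
hypothetical.

```python
import shlex

# Paths as used in this handout; adjust them if your checkout differs.
GEM5 = "./build/ARM/gem5.opt"
SE_SCRIPT = "./configs/example/se.py"
BENCH = "./benchmarks/dijkstra/dijkstra_small"
INPUT = "./benchmarks/dijkstra/input.dat"


def gem5_command(outdir):
    """Return the argument list for one run whose output goes to outdir."""
    return [
        GEM5, "-d", outdir,  # -d selects the output directory
        SE_SCRIPT,
        "--cpu-type=DerivO3CPU",
        "--caches", "--l2cache",
        "-c", BENCH,
        "-o", INPUT,
    ]


if __name__ == "__main__":
    # Print one shell-ready command per configuration directory.
    for n in (1, 2, 3):
        print(shlex.join(gem5_command(f"dijkstra-Conf{n}")))
```

Because gem5 has to be rebuilt between configurations, the printed commands are meant to be
run one at a time, each after the corresponding rebuild.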
Problem 1 [5 Points]: Impact of Various Widths of the O3 Processor.
An out-of-order processor is characterized by its fetch width, decode width, rename width,
dispatch width, issue width, writeback width, commit width, and squash width. These affect the
performance of the processor, for example the execution time, CPI, and IPC. In this problem
you will change these widths. Open the O3 CPU configuration file:
> vi src/cpu/o3/O3CPU.py
Change the following parameters in O3CPU.py.
1. fetchWidth
2. decodeWidth
3. renameWidth
4. dispatchWidth
5. issueWidth
6. wbWidth
7. commitWidth
8. squashWidth
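The eight parameters above can be edited by hand. As an alternative, the following minimal
sketch rewrites all of them in one pass. It assumes each parameter is defined on its own line
in the form `name = Param.Unsigned(<number>, "...")`; check this against your copy of
O3CPU.py before relying on it. The function name set_widths is hypothetical.

```python
import re

WIDTH_PARAMS = [
    "fetchWidth", "decodeWidth", "renameWidth", "dispatchWidth",
    "issueWidth", "wbWidth", "commitWidth", "squashWidth",
]


def set_widths(source, width):
    """Return source with the default of every width parameter set to width.

    Assumes lines of the form: name = Param.Unsigned(8, "description").
    """
    for name in WIDTH_PARAMS:
        # Capture everything up to the opening parenthesis, replace the number.
        pattern = rf"^(\s*{name}\s*=\s*Param\.Unsigned\()\d+"
        source, count = re.subn(
            pattern, rf"\g<1>{width}", source, flags=re.MULTILINE
        )
        if count != 1:
            raise ValueError(
                f"expected exactly one definition of {name}, found {count}"
            )
    return source
```

A typical use would read the file, call set_widths(text, 2) for Configuration 1, and write
the result back. The function raises an error instead of silently skipping a parameter it
cannot find, so a mismatch with the assumed line format is noticed immediately.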
Run the dijkstra benchmark on gem5 with the following pipeline configurations. You can use a
command similar to the one described before. Please use the DerivO3CPU model.
Configuration #   Fetch   Decode   Rename   Dispatch   Issue   WB      Commit   Squash
                  Width   Width    Width    Width      Width   Width   Width    Width
1                 2       2        2        2          2       2       2        2
2                 4       4        4        4          4       4       4        4
3 (default)       8       8        8        8          8       8       8        8
Generate three folders for the above configurations and name them dijkstra-Conf1,
dijkstra-Conf2, and dijkstra-Conf3 respectively.
Note: After modifying the O3CPU.py file and before running the dijkstra benchmark, you need to
recompile gem5 for each configuration. Recompiling rebuilds the object files so that the
changes made to the file take effect. Do not modify the Functional Units for this problem;
keep them at the default configuration. Run the following in your gem5-cse-ca directory:
> scons build/ARM/gem5.opt
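Once the runs finish, CPI and IPC can be compared across the three folders. The following is a
minimal sketch of reading a stats.txt file and computing CPI = cycles / instructions and
IPC = instructions / cycles. The statistic names system.cpu.numCycles and
system.cpu.committedInsts are assumptions that may differ between gem5 versions, and the
numbers in SAMPLE_STATS are placeholders, not simulation results.

```python
def parse_stats(text):
    """Map each statistic name to its first numeric value."""
    stats = {}
    for line in text.splitlines():
        # Drop the trailing '# description' and split 'name value ...'.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        try:
            value = float(fields[1])
        except ValueError:
            continue  # banner lines such as '---------- Begin ...'
        stats.setdefault(fields[0], value)
    return stats


def cpi_ipc(cycles, instructions):
    """Return (CPI, IPC) for the given cycle and instruction counts."""
    return cycles / instructions, instructions / cycles


# Placeholder text in the usual 'name value # comment' layout.
SAMPLE_STATS = """
---------- Begin Simulation Statistics ----------
system.cpu.numCycles            2000      # placeholder value
system.cpu.committedInsts       1000      # placeholder value
---------- End Simulation Statistics   ----------
"""

if __name__ == "__main__":
    s = parse_stats(SAMPLE_STATS)
    cpi, ipc = cpi_ipc(s["system.cpu.numCycles"], s["system.cpu.committedInsts"])
    print(f"cpi = {cpi}, ipc = {ipc}")  # prints: cpi = 2.0, ipc = 0.5
```

To use it on a real run, pass the contents of, for example, dijkstra-Conf1/stats.txt to
parse_stats and look up the statistic names that your gem5 version actually reports.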
Problem 2 [5 Points]: Impact of Execution Units.
Before starting, make sure the widths from Problem 1 are set back to the default
configuration.
There are multiple execution units in the out-of-order processor in gem5, such as IntALU (for
integer computation), IntMultDiv (for integer multiply and divide), FP_ALU (for floating-point
computation), FP_MultDiv (for floating-point multiply and divide), and SIMD. You can find
these in the following file:
> vi src/cpu/o3/FuncUnitConfig.py
Run the dijkstra benchmark on gem5 with the following execution unit configurations, using
the DerivO3CPU model.
Configuration #   IntALU count   IntMultDiv count   FP_ALU count   FP_MultDiv count
1 (default)       6              2                  4              2
2                 4              1                  4              1
3                 8              4                  6              4
Generate three folders for the above configurations and name them dijkstra-Conf1,
dijkstra-Conf2, and dijkstra-Conf3 respectively.
Note: After modifying the FuncUnitConfig.py file and before running the dijkstra benchmark,
you need to recompile gem5 for each configuration. Recompiling rebuilds the object files so
that the changes made to the file take effect. Run the following in your gem5-cse-ca
directory:
> scons build/ARM/gem5.opt
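After each run, it is worth confirming that config.ini reflects the intended unit counts,
since that file is what the rubric checks. The following is a minimal sketch using Python's
configparser. It assumes that each functional unit appears as a section whose type is FUDesc
with a count key, which should be verified against your own config.ini; the section names in
SAMPLE_CONFIG are placeholders, and which section corresponds to which unit has to be read
from the file itself.

```python
import configparser


def read_fu_counts(config_text):
    """Return {section name: count} for every section of type FUDesc."""
    # interpolation=None keeps '%' characters literal; strict=False tolerates
    # repeated keys instead of raising.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep key names case-sensitive
    parser.read_string(config_text)
    counts = {}
    for section in parser.sections():
        if parser.get(section, "type", fallback="") == "FUDesc":
            counts[section] = parser.getint(section, "count")
    return counts


# Placeholder excerpt in the assumed layout, not copied from a real run.
SAMPLE_CONFIG = """
[system.cpu]
type=DerivO3CPU
fetchWidth=8

[system.cpu.fuPool.FUList0]
type=FUDesc
count=6

[system.cpu.fuPool.FUList1]
type=FUDesc
count=2
"""

if __name__ == "__main__":
    for name, count in read_fu_counts(SAMPLE_CONFIG).items():
        print(name, count)
```

The same approach can be used for Problem 1 by reading the width keys from the CPU section
instead of the count keys.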
Project Deliverables
1. Your project submission should contain the following:
a. Two directories, Problem1 and Problem2.
b. The Problem1 and Problem2 directories should each contain three directories:
dijkstra-Conf1, dijkstra-Conf2, and dijkstra-Conf3.
c. Each of these configuration directories should contain the config.ini, config.json, and
stats.txt files. The directory structure should be as follows:
\---Project1-Submission
    +---Problem1
    |   +---dijkstra-Conf1
    |   |       config.ini
    |   |       config.json
    |   |       stats.txt
    |   |
    |   +---dijkstra-Conf2
    |   |       config.ini
    |   |       config.json
    |   |       stats.txt
    |   |
    |   \---dijkstra-Conf3
    |           config.ini
    |           config.json
    |           stats.txt
    |
    \---Problem2
        +---dijkstra-Conf1
        |       config.ini
        |       config.json
        |       stats.txt
        |
        +---dijkstra-Conf2
        |       config.ini
        |       config.json
        |       stats.txt
        |
        \---dijkstra-Conf3
                config.ini
                config.json
                stats.txt
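Before zipping, the layout above can be checked automatically. The following is a minimal
sketch that lists any required file missing from a submission folder; the function name
missing_files is hypothetical, and the directory and file names are the ones listed above.

```python
from pathlib import Path

PROBLEMS = ("Problem1", "Problem2")
CONFIGS = ("dijkstra-Conf1", "dijkstra-Conf2", "dijkstra-Conf3")
FILES = ("config.ini", "config.json", "stats.txt")


def missing_files(root):
    """Return the relative paths of required files that are absent under root."""
    root = Path(root)
    return [
        str(Path(problem, config, name))
        for problem in PROBLEMS
        for config in CONFIGS
        for name in FILES
        if not (root / problem / config / name).is_file()
    ]


if __name__ == "__main__":
    missing = missing_files("Project1-Submission")
    if missing:
        print("Missing files:")
        for path in missing:
            print("  " + path)
    else:
        print("All 18 required files are present.")
```

An empty result means all 2 x 3 x 3 = 18 files exist; it does not check that their contents
match the required configurations.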
Submission Instructions
1) Submit a single zip file containing all the files. Please follow the naming conventions
exactly.
2) If you are doing the project in a group, only one group member should make the submission
through Blackboard. All members of the group will receive the same score.
Rubric
1. Problem 1 [5 points]
a. The configuration in the config.ini file matches the table. [2 points]
b. Correct stats in the stats.txt file. [3 points]
2. Problem 2 [5 points]
a. The configuration in the config.ini file matches the table. [2 points]
b. Correct stats in the stats.txt file. [3 points]