Computer Science homework

Write an assembler for the following instructions:

add, addi, lw, sw, beq, bgt, j.

Assume that: all registers are addressed by $register-name (like $s1, $t2, $a0, $sp; see your textbook for the range of valid register names); the end of the program is recognized by the .end instruction; the program is loaded into memory at location 0; each label has 1 to 4 characters; one or more spaces separate the symbols in the program.
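As an illustration of the register assumption, the following Python sketch maps register names to their 5-bit numbers. It assumes the usual MIPS naming convention ($zero is 0, $sp is 29, and so on); the names REGISTER_NAMES, REGISTERS and register_number are made up for this example, and the list should be checked against your textbook.

```python
# Conventional MIPS register names in numeric order (0..31).
# Assumption: the textbook uses this standard ordering.
REGISTER_NAMES = [
    "zero", "at", "v0", "v1", "a0", "a1", "a2", "a3",
    "t0", "t1", "t2", "t3", "t4", "t5", "t6", "t7",
    "s0", "s1", "s2", "s3", "s4", "s5", "s6", "s7",
    "t8", "t9", "k0", "k1", "gp", "sp", "fp", "ra",
]

# Map "$name" -> register number.
REGISTERS = {"$" + name: number for number, name in enumerate(REGISTER_NAMES)}


def register_number(token):
    """Return the register number for a token such as '$s1'."""
    if token not in REGISTERS:
        raise ValueError(f"unknown register: {token!r}")
    return REGISTERS[token]
```

With this table, register_number("$s1") gives 17 and register_number("$sp") gives 29; an unknown name such as $x9 raises an error that the assembler can report.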

The input to your assembler is a text file consisting of 1 to 50 assembly instructions. The last instruction is .end. You can write your assembler in any language as long as we are able to test it in the department.

The output of your program is a text file that contains the object code. The object code is represented in both binary and hexadecimal format.
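One possible way to produce both representations is sketched below in Python. It assumes each instruction is a 32-bit word; the function name format_word is hypothetical, and the exact layout of the output file is left to you.

```python
def format_word(word):
    """Return (binary, hex) strings for one 32-bit instruction word."""
    word &= 0xFFFFFFFF  # keep only the low 32 bits
    return f"{word:032b}", f"{word:08X}"
```

For example, format_word(0x02538820) returns ("00000010010100111000100000100000", "02538820").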

Your assembler should consist of two passes.

First Pass,

During the first pass, the assembler generates a table that correlates all user-defined address symbols with their decimal equivalent values. The binary translation is done during the second pass. The PC (Program Counter) holds the address of the memory location assigned to the instruction or operand presently being processed. The assembler sets this counter to 0 initially. Each line of symbolic code is analyzed to determine whether it has a label (indicated by the presence of a colon). If the line contains a label, the label is stored in the address symbol table together with its decimal address, as given by the current content of PC. PC is then incremented by 4 and the next line of code is processed.
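The first pass described above might be sketched in Python as follows. This is a minimal illustration, not a complete solution: the function name first_pass is hypothetical, a plain dict stands in for the address symbol table, blank lines are skipped (an assumption), and every other line before .end advances PC by 4 exactly as the description says, even if the line holds only a label.

```python
def first_pass(lines):
    """Build {label: address} from the source lines (first pass only)."""
    symbols = {}
    pc = 0  # the program is loaded at location 0
    for line in lines:
        text = line.strip()
        if not text:
            continue  # assumption: blank lines occupy no memory
        if text == ".end":
            break  # end of program
        if ":" in text:
            # Everything before the colon is the label.
            label = text.split(":", 1)[0].strip()
            symbols[label] = pc
        pc += 4  # each line of code advances PC by 4
    return symbols
```

If a pseudoinstruction is expanded into more than one machine instruction, the increment for that line would have to be adjusted accordingly.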

Second Pass,

Instructions are translated during the second pass by means of table lookup or other procedures. There are three tables:

1. Pseudoinstruction table.

2. Instruction table.

3. Address symbol table.

The entries of the pseudoinstruction table are for pseudoinstructions such as bgt. Each entry refers the assembler to a subroutine that processes the pseudoinstruction when it is encountered in the program. The instruction table contains the symbols for the remaining instructions and their related information.
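A pseudoinstruction table of this kind can be sketched in Python as a mapping from mnemonic to handler subroutine. The expansion used here for bgt (slt into $at followed by bne) is the conventional MIPS one and is an assumption, since the assignment does not specify it; slt and bne are not in the required instruction list, so confirm the intended expansion before relying on it. The names expand_bgt, PSEUDOINSTRUCTIONS and expand are hypothetical.

```python
def expand_bgt(operands):
    """Expand 'bgt rs, rt, label' into real instructions.

    Assumed expansion: slt $at, rt, rs ; bne $at, $zero, label
    ($at is 1 exactly when rt < rs, i.e. when rs > rt).
    """
    rs, rt, label = operands
    return [
        ("slt", ["$at", rt, rs]),
        ("bne", ["$at", "$zero", label]),
    ]


# Pseudoinstruction table: mnemonic -> subroutine that processes it.
PSEUDOINSTRUCTIONS = {"bgt": expand_bgt}


def expand(mnemonic, operands):
    """Return the list of real instructions for one source instruction."""
    handler = PSEUDOINSTRUCTIONS.get(mnemonic)
    if handler is None:
        return [(mnemonic, operands)]  # not a pseudoinstruction
    return handler(operands)
```

Under this assumption bgt occupies two words, so the address bookkeeping in both passes would need to advance PC by 8 for each bgt.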

PC is initially set to 0. Lines of code are then analyzed one at a time. Labels are ignored during the second pass, so the assembler goes immediately to the instruction field and checks the first symbol encountered. It first checks the pseudoinstruction table. A match with an entry sends the assembler to the corresponding subroutine. If the symbol is not a pseudoinstruction, the assembler refers to the instruction table. If a match occurs, the instruction is converted to its equivalent machine code, using the address symbol table if needed. The implementation of Tables 1 and 2 is optional. You must use a hash function for the implementation of Table 3.
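To illustrate the hash-function requirement for Table 3, here is a minimal Python sketch of an address symbol table built on an explicit hash function with chaining. The class name SymbolTable, the bucket count of 64, and the multiplier 31 are all arbitrary choices for this example, not requirements of the assignment.

```python
class SymbolTable:
    """Address symbol table using an explicit hash function and chaining."""

    def __init__(self, size=64):
        # Each bucket is a list of (label, address) pairs.
        self.buckets = [[] for _ in range(size)]

    def _hash(self, label):
        # Simple polynomial hash over the label's characters.
        h = 0
        for ch in label:
            h = (h * 31 + ord(ch)) % len(self.buckets)
        return h

    def insert(self, label, address):
        bucket = self.buckets[self._hash(label)]
        for name, _ in bucket:
            if name == label:
                raise ValueError(f"duplicate label: {label}")
        bucket.append((label, address))

    def lookup(self, label):
        for name, address in self.buckets[self._hash(label)]:
            if name == label:
                return address
        raise KeyError(label)
```

Rejecting duplicate labels on insert and raising on a failed lookup gives the assembler two natural places to report errors.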

Error Diagnostics,

The assembler should check for possible errors in the symbolic program and, if there is an error, print out an appropriate error message.
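The assignment does not list which errors to detect, so the Python sketch below only illustrates a few plausible checks: label length, unknown instruction, wrong operand count, and undefined branch or jump target. The function name check_line, the OPERAND_COUNTS table and the message wording are hypothetical.

```python
import re

# Operand counts after splitting on commas, spaces and parentheses,
# so "lw $s4, 0($s1)" counts as three operands.
OPERAND_COUNTS = {"add": 3, "addi": 3, "lw": 3, "sw": 3,
                  "beq": 3, "bgt": 3, "j": 1}


def check_line(line_no, line, symbols):
    """Return a list of error messages for one source line."""
    errors = []
    text = line.strip()
    if ":" in text:
        label, text = text.split(":", 1)
        label, text = label.strip(), text.strip()
        if not 1 <= len(label) <= 4:
            errors.append(f"line {line_no}: label '{label}' must have 1 to 4 characters")
    if not text or text == ".end":
        return errors
    parts = [tok for tok in re.split(r"[,\s()]+", text) if tok]
    mnemonic, operands = parts[0], parts[1:]
    if mnemonic not in OPERAND_COUNTS:
        errors.append(f"line {line_no}: unknown instruction '{mnemonic}'")
        return errors
    if len(operands) != OPERAND_COUNTS[mnemonic]:
        errors.append(f"line {line_no}: '{mnemonic}' expects "
                      f"{OPERAND_COUNTS[mnemonic]} operand(s), got {len(operands)}")
        return errors
    if mnemonic in ("beq", "bgt", "j") and operands[-1] not in symbols:
        errors.append(f"line {line_no}: undefined label '{operands[-1]}'")
    return errors
```

Here symbols is any container of the labels found in the first pass. Collecting messages in a list, rather than stopping at the first problem, lets the assembler report every error in one run.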

Sample Program

main:
add $s1, $s2, $s3
loop: addi $s1,$zero, 4
sw $s3, 0($s2)
next: addi $a0, $zero, 0
lw $s4,0($s1)
j loop
addi $s4, $s1, 4
beq $s4, $s3, loop
bgt $s6, $s7, loop
.end
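As a worked illustration of how instructions like those in the sample program could be translated, the Python sketch below encodes add, addi, lw, sw, beq and j into 32-bit words. It assumes the standard MIPS32 formats and opcode/funct values (check them against your textbook), takes a label-free instruction, and expects the caller to supply the instruction's address and a label-to-address mapping. bgt is omitted because it is a pseudoinstruction. All function names are hypothetical.

```python
import re

NAMES = ("zero at v0 v1 a0 a1 a2 a3 t0 t1 t2 t3 t4 t5 t6 t7 "
         "s0 s1 s2 s3 s4 s5 s6 s7 t8 t9 k0 k1 gp sp fp ra").split()
REG = {"$" + name: number for number, name in enumerate(NAMES)}


def r_type(rs, rt, rd, funct):
    # opcode 0 | rs | rt | rd | shamt 0 | funct
    return (rs << 21) | (rt << 16) | (rd << 11) | funct


def i_type(opcode, rs, rt, imm):
    # opcode | rs | rt | 16-bit immediate (two's complement)
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def j_type(opcode, address):
    # opcode | 26-bit word address
    return (opcode << 26) | ((address >> 2) & 0x3FFFFFF)


def encode(line, pc, symbols):
    """Encode one label-free instruction located at byte address pc."""
    mnemonic, rest = line.split(None, 1)
    ops = [tok for tok in re.split(r"[,\s()]+", rest) if tok]
    if mnemonic == "add":                       # add rd, rs, rt
        return r_type(REG[ops[1]], REG[ops[2]], REG[ops[0]], 0x20)
    if mnemonic == "addi":                      # addi rt, rs, imm
        return i_type(0x08, REG[ops[1]], REG[ops[0]], int(ops[2]))
    if mnemonic in ("lw", "sw"):                # lw/sw rt, offset(rs)
        opcode = 0x23 if mnemonic == "lw" else 0x2B
        return i_type(opcode, REG[ops[2]], REG[ops[0]], int(ops[1]))
    if mnemonic == "beq":                       # beq rs, rt, label
        # Branch offset is counted in words from the next instruction.
        offset = (symbols[ops[2]] - (pc + 4)) // 4
        return i_type(0x04, REG[ops[0]], REG[ops[1]], offset)
    if mnemonic == "j":                         # j label
        return j_type(0x02, symbols[ops[0]])
    raise ValueError(f"unsupported instruction: {mnemonic}")
```

For instance, encode("add $s1, $s2, $s3", 0, {}) gives 0x02538820, and if loop is assumed to be at address 4, encode("j loop", 20, {"loop": 4}) gives 0x08000001. The addresses here are illustrative only; the real ones come from the first pass.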