Bits, Bytes, Branches

Part 2: assembly language

PGSS Computer Science Core Slides

When last we met, we learned how computers represent data.

Today we examine how computers are controlled.

We'll do this by studying HYMN, a HYpothetical MachiNe. HYMN has sixteen-bit words.

Memory

The computer's memory is outside the CPU. It holds all information about the current state of a running program. For example, for a word processing program the memory would hold

HYMN has 256 bytes of memory, addressed 00000000 (0 base 10) to 11111111 (255 base 10).
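
As a rough illustration, this memory can be modeled in Python as an array of 256 bytes indexed by an eight-bit address. The helper names store_word and load_word are hypothetical, and the byte order of a sixteen-bit word (high byte first) is an assumption made for the sketch; the slides do not specify it.

```python
MEMORY_SIZE = 256  # HYMN has 256 bytes, addresses 0 through 255

memory = bytearray(MEMORY_SIZE)


def store_word(address, value):
    """Store a 16-bit value in two consecutive bytes (assumed: high byte first)."""
    memory[address] = (value >> 8) & 0xFF
    memory[address + 1] = value & 0xFF


def load_word(address):
    """Read back the 16-bit value that starts at the given address."""
    return (memory[address] << 8) | memory[address + 1]


store_word(0b10000000, 97)    # address 128 in base 10
print(load_word(0b10000000))  # 97
print(0b11111111)             # 255, the highest address
```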

Registers

On the CPU are registers. These are basically very fast memory locations, few in number.

Registers store intermediate results that we don't need to keep for long. (Consider the remainder of n divided by i in Prime-Test-All.)

HYMN has eight 16-bit registers. We label these R0, R1, R2,..., R7.

Instructions

A program is a sequence of instructions. An instruction is a coded bit sequence giving direction to the CPU.

Say we want to place 2 in R0. On HYMN the following instruction does this:

  00000 000 00000010
The first five zeroes are the op code. This tells the nature of the instruction, in this case to change a register. The next three zeroes tell which register to change (R0). The final eight bits specify the number to put into the register (2 in binary).
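
The three fields can be packed and unpacked with shifts and masks. The following Python sketch assumes only what is stated above (a five-bit op code of 00000 for this instruction, a three-bit register number, an eight-bit value); the function names encode_lr and decode are hypothetical.

```python
def encode_lr(register, number):
    """Pack op code 00000, a 3-bit register number, and an 8-bit value."""
    if not 0 <= register <= 7:
        raise ValueError("register must be R0 through R7")
    if not 0 <= number <= 255:
        raise ValueError("number must fit in eight bits")
    op_code = 0b00000
    return (op_code << 11) | (register << 8) | number


def decode(word):
    """Split a 16-bit instruction into (op code, register, number)."""
    return (word >> 11) & 0b11111, (word >> 8) & 0b111, word & 0xFF


word = encode_lr(0, 2)
bits = format(word, "016b")
print(bits[:5], bits[5:8], bits[8:])  # 00000 000 00000010
print(decode(word))                   # (0, 0, 2)
```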

This is machine language.

Assembly language

For humans, machine language is a pain to use. Mnemonic symbols are much easier to read and write, and assembly language lets us use them.

In HYMN's assembly language we write

  LR R0, 2
to load 2 into R0. LR stands for Load to Register.

Each line of an assembly language program corresponds to one machine language instruction. An assembler performs this translation automatically.
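
To give a feel for what an assembler does, here is a minimal Python sketch that translates a single LR line into its bit pattern. It handles only LR, because that is the only op code the slides give; the function name assemble_lr and the exact text format it accepts are assumptions.

```python
import re

LR_OP_CODE = "00000"  # from the slides; other op codes are not given


def assemble_lr(line):
    """Translate 'LR Rd, n' into a spaced 16-bit machine language string."""
    match = re.fullmatch(r"\s*LR\s+R([0-7])\s*,\s*(\d+)\s*", line)
    if match is None:
        raise ValueError(f"not an LR instruction: {line!r}")
    register = int(match.group(1))
    number = int(match.group(2))
    if number > 255:
        raise ValueError("number does not fit in eight bits")
    return f"{LR_OP_CODE} {register:03b} {number:08b}"


print(assemble_lr("LR R0, 2"))  # 00000 000 00000010
```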

HYMN's assembly language

LR Rd, n
loads binary representation of n into Rd.
LRA Rd, a
loads from memory address a to Rd.
STORE a, Ri
stores Ri's contents at memory address a.
ADD Rd, Ri, Rj
puts sum of Ri and Rj into Rd.
SUB Rd, Ri, Rj
puts difference Ri - Rj into Rd.
MULT Rd, Ri, Rj
puts product of Ri and Rj into Rd.
DIV Rd, Re, Ri, Rj
puts quotient of Ri divided by Rj into Rd and remainder into Re.
B a
loads next instruction from (branches to) memory address a.
BEQUAL Ri, a
branches to a if Ri is zero.
BLESS Ri, a
branches to a if Ri is negative.
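
The instruction list above can be turned into a small interpreter. The Python sketch below is only an illustration under several assumptions that the slides do not state: instructions are given as tuples rather than bit patterns, each instruction occupies two bytes, the machine stops when the next address holds no instruction, and sixteen-bit overflow is ignored. The function name run is hypothetical.

```python
def run(program, memory, start=0):
    """Execute HYMN-style instructions and return the final registers.

    program: dict mapping address -> tuple such as ("ADD", 0, 0, 1)
    memory:  dict mapping address -> integer data word
    """
    regs = [0] * 8  # R0 through R7
    pc = start      # address of the next instruction
    while pc in program:
        op, *args = program[pc]
        pc += 2  # assumed: one 16-bit instruction takes two bytes
        if op == "LR":
            d, n = args
            regs[d] = n
        elif op == "LRA":
            d, a = args
            regs[d] = memory[a]
        elif op == "STORE":
            a, i = args
            memory[a] = regs[i]
        elif op == "ADD":
            d, i, j = args
            regs[d] = regs[i] + regs[j]
        elif op == "SUB":
            d, i, j = args
            regs[d] = regs[i] - regs[j]
        elif op == "MULT":
            d, i, j = args
            regs[d] = regs[i] * regs[j]
        elif op == "DIV":
            d, e, i, j = args
            # quotient to Rd, remainder to Re (operands assumed non-negative)
            regs[d], regs[e] = divmod(regs[i], regs[j])
        elif op == "B":
            (pc,) = args
        elif op == "BEQUAL":
            i, a = args
            if regs[i] == 0:
                pc = a
        elif op == "BLESS":
            i, a = args
            if regs[i] < 0:
                pc = a
        else:
            raise ValueError(f"unknown instruction: {op}")
    return regs


# Add 3 and 4 and store the sum at address 128.
program = {
    0: ("LR", 0, 3),
    2: ("LR", 1, 4),
    4: ("ADD", 2, 0, 1),
    6: ("STORE", 128, 2),
}
data = {}
run(program, data)
print(data[128])  # 7
```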

Prime-Test-All on HYMN

00000000            LR R0, 2               // R0 is i
00000010            LR R1, 1
00000100            LRA R2, 10000000       // R2 is n
00000110            SUB R3, R2, R0
00001000            BLESS R3, (NOTPRIME)   // n < 2
00001010            BEQUAL R3, (ISPRIME)   // n == 2
00001100 (NEXTITER) DIV R3, R4, R2, R0
00001110            BEQUAL R4, (NOTPRIME)
00010000            ADD R0, R0, R1         // step i
00010010            MULT R3, R0, R0
00010100            SUB R3, R3, R2
00010110            BLESS R3, (NEXTITER)   // i * i < n
00011000            BEQUAL R3, (NOTPRIME)  // i * i == n
00011010 (ISPRIME)  LR R0, 1
00011100            B (ENDPROG)
00011110 (NOTPRIME) LR R0, 0
00100000 (ENDPROG)  STORE 10000010, R0     // store ans.
  :
10000000 // here n is kept
10000010 // here the answer goes
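
To check the control flow of the listing, it can help to mirror it line by line in a language that runs today. The Python sketch below uses one variable per register and turns each branch into an if, continue, or return. The function name prime_test_all_hymn is hypothetical, and the return value stands in for the final STORE to address 10000010.

```python
def prime_test_all_hymn(n):
    """Follow the HYMN listing register by register; returns 1 or 0."""
    r0 = 2                        # LR R0, 2            (i)
    r1 = 1                        # LR R1, 1
    r2 = n                        # LRA R2, 10000000    (n)
    r3 = r2 - r0                  # SUB R3, R2, R0
    if r3 < 0:                    # BLESS R3, (NOTPRIME)   n < 2
        return 0
    if r3 == 0:                   # BEQUAL R3, (ISPRIME)   n == 2
        return 1
    while True:                   # (NEXTITER)
        r3, r4 = divmod(r2, r0)   # DIV R3, R4, R2, R0
        if r4 == 0:               # BEQUAL R4, (NOTPRIME)
            return 0
        r0 = r0 + r1              # ADD R0, R0, R1         step i
        r3 = r0 * r0              # MULT R3, R0, R0
        r3 = r3 - r2              # SUB R3, R3, R2
        if r3 < 0:                # BLESS R3, (NEXTITER)   i * i < n
            continue
        if r3 == 0:               # BEQUAL R3, (NOTPRIME)  i * i == n
            return 0
        return 1                  # falls through to (ISPRIME)


print([n for n in range(20) if prime_test_all_hymn(n)])
# [2, 3, 5, 7, 11, 13, 17, 19]
```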

High-level languages

Assembly language code, though an improvement, is still a pain. We would prefer something closer to the pseudocode we saw yesterday for Prime-Test-All.

This is what high-level languages give us. In one high-level language (Ada) Prime-Test-All would be

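  with Ada.Numerics.Elementary_Functions;
  use Ada.Numerics.Elementary_Functions;

  function Prime_Test_All(n : integer) return integer is
  begin
    -- Sqrt comes from Ada.Numerics.Elementary_Functions and works on Float.
    for i in 2 .. integer(Sqrt(Float(n))) loop
      if n mod i = 0 then
        return 0;
      end if;
    end loop;
    return 1;
  end Prime_Test_All;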

The compiling process

A compiler translates a high-level language to machine language.

Here, then, is pseudocode for programming: