|
|
Saturday, February 23, 2002 |
|
Question:
Do we need to know how to do problem 6 on the Spring '01 exam?
Reading over it, it seemed a little bit unfamiliar, but I wante
Answer:
Nah, last spring had a different schedule than this spring. This question
was all about linking and what linkers have to change such that compiled
code works. Anyway, I don't think we've even gotten to that in lecture,
so don't worry about it.
|
Saturday, February 16, 2002 |
|
Question:
Is it possible to do an arbitrary JMP command? In other words, can I
give the assembler an explicit location in memory to jump to? All of the
code that is generated by gcc uses relative JMPs (i.e. jump forward e0
bytes).
Answer:
Yes, it is -- and you've already seen it, but probably didn't notice it!
Think back to how we translated loops. We always converted them to a form
where they jumped to a label, such as "jmp .L1". That ".L1" is just a
human-friendly representation of an address. After the code is assembled,
it is converted into a hard address.
Question:
When an item is pushed onto the stack, when is it "used?"
Answer:
Data on the stack can be used in any number of ways. It might be read off
of the stack and removed by a pop operation.
Or, it might be accessed on the stack without removing it. For example,
it is very common for a function to read its parameters using something
like "movl 16(%ebp), %ebx". Remember that %ebp is the "base pointer" also
called the "frame pointer". It holds the address of the bottom
(beginning, least recently pushed, high address) of a function's stack
frame. Above the base pointer, a function can find its arguments,
which were pushed by the caller. So, "movl 16(%ebp), %ebx" says, "read the
'double word' (4 bytes) beginning 16 bytes above the base pointer (in the
caller's frame) into the %ebx register." This makes the parameter
available in %ebx, while leaving it on the stack. (Although we can peek
into the stack, we can't take things out of the inside, only from the
top by popping).
Values on the stack can also be removed by the "ret" (return) operation.
"call" pushes the return address onto the stack before jumping. In a
complementary way, "ret" pops this address off of the stack and jumps to
it.
(Aside: If you are wondering why a "movl" (move long), moves only
1 word (4 bytes) and "movw" (move word) moves only 1/2 word (2 bytes), it
is for backward compatibility. Although the IA32 architecture uses a
32-bit word, these are called "double words" for backward compatibility
with the old 16-bit architecture, which had 2 byte words.)
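If it helps to see the push/peek/pop distinction in C, here is a rough
sketch of a toy stack that grows toward lower addresses. Everything in it
(ToyStack, toy_push, toy_pop, toy_peek, the 0x1234 return address) is made
up for illustration. It is only a model of the idea, not how the hardware
is built.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 64   /* size of the toy stack, in 4-byte slots */

typedef struct {
    uint32_t mem[STACK_WORDS]; /* backing storage, one slot per double word */
    uint32_t esp;              /* byte offset of the current top of stack   */
    uint32_t ebp;              /* byte offset of the current frame's base   */
} ToyStack;

/* Empty stack: esp sits just past the highest slot. */
void toy_init(ToyStack *s) {
    s->esp = STACK_WORDS * 4;
    s->ebp = s->esp;
}

/* Like pushl: move esp down by 4 bytes, then store. */
void toy_push(ToyStack *s, uint32_t v) {
    s->esp -= 4;
    s->mem[s->esp / 4] = v;
}

/* Like popl: load from the top, then move esp up (the item is removed). */
uint32_t toy_pop(ToyStack *s) {
    uint32_t v = s->mem[s->esp / 4];
    s->esp += 4;
    return v;
}

/* Like "movl off(%ebp), reg": read relative to ebp, remove nothing. */
uint32_t toy_peek(const ToyStack *s, int32_t off) {
    return s->mem[(s->ebp + (uint32_t)off) / 4];
}

/* Walk through one call; call this from main() to try it. */
void check_toy_stack(void) {
    ToyStack s;
    toy_init(&s);
    toy_push(&s, 30);       /* caller pushes arguments right to left */
    toy_push(&s, 20);
    toy_push(&s, 10);
    toy_push(&s, 0x1234);   /* "call" pushes a (made-up) return address */
    toy_push(&s, s.ebp);    /* callee: pushl %ebp */
    s.ebp = s.esp;          /* callee: movl %esp, %ebp */

    assert(toy_peek(&s, 16) == 30);  /* movl 16(%ebp): third argument */
    assert(toy_peek(&s, 8) == 10);   /* movl 8(%ebp): first argument  */
    assert(s.esp == s.ebp);          /* peeking removed nothing       */

    s.ebp = toy_pop(&s);             /* popl %ebp restores the old base */
    assert(toy_pop(&s) == 0x1234);   /* "ret" pops the return address   */
}
```

Notice that toy_peek never touches esp, while toy_pop does. That is the
whole difference between reading a value in place and removing it.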
Question:
What is the significance of this assembly instruction?
cmpl $0x0,0xfffffffc(%ebp)
I understand what it means to add '8' or some smaller number to
ebp: you're looking up something in the stack. But what does it
mean when you add something like this that will almost certainly
cause an overflow?
Answer:
Yes. The instruction will look up some location in the stack.
Adding 0xfffffffc to %ebp is the same as adding -4 to the value of
%ebp, because 32-bit addition wraps around. Remember that %ebp
points to the upper bound of the current procedure's stack frame,
so this instruction accesses a memory location within the current
stack frame. Find out the structure of the stack frame and you
will know what lives in that memory location.
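Here is a small C sketch of why the "overflow" is harmless. The helper
names and the sample %ebp value are made up for illustration. Unsigned
32-bit addition wraps modulo 2^32, so adding 0xfffffffc lands exactly 4
bytes below where you started.

```c
#include <assert.h>
#include <stdint.h>

/* Address computed for disp(%ebp): a 32-bit add that wraps modulo 2^32. */
uint32_t effective_address(uint32_t ebp, uint32_t disp) {
    return ebp + disp;
}

/* Read a 32-bit pattern as a two's complement signed value. */
int32_t as_signed(uint32_t disp) {
    if (disp >= 0x80000000u)
        return -(int32_t)(~disp) - 1;   /* MSB set: value is negative */
    return (int32_t)disp;
}

/* Call this from main() to try it. */
void check_negative_displacement(void) {
    uint32_t ebp = 0xbffff6a8u;   /* made-up frame pointer value */
    assert(as_signed(0xfffffffcu) == -4);
    assert(effective_address(ebp, 0xfffffffcu) == ebp - 4);
}
```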
Question:
What do the _init function and .init section do? Are they created by the
compiler, or are they actual user-defined functions?
Answer:
Executable programs can be packaged in any number of ways. One of these is
known as the Executable and Linking Format (ELF). Linux follows this
standard. ELF files are composed of separate pieces, or sections. One of
these sections is called ".init" and another ".fini".
".init" contains code that should be executed before main() is called and
".fini" contains code that should be executed after main returns, if the
program exits normally (it isn't killed and doesn't crash).
You can think of the code within these two sections as the constructor and
destructor for the program, itself. For example, the code that loads
shared libraries (library functions that bind at runtime, such as DLLs in
Windows and .so's in UNIX) is located within the ".init" section.
In C++, these sections are also used to call global constructors and
global destructors, which are, themselves, typically stored in separate
sections, ".ctors" and ".dtors".
The _init() function is compiler generated. It lives in the ".init"
section and is responsible for initialization discussed above. The
complement is true for _fini(), which lives in the ".fini" section and
does the tear-down.
I don't get quite this "down and dirty" very often, so I don't
have a tremendous amount of experience in this area. But, my understanding
is that both _init() and _fini() are called by a function within the
".text" area, the program's executable code, called _start(). The big
picture is that _start() basically does a little bit of preparation, then
calls _init(), then main(), then _fini().
But, let me add a big footnote here that says that ELF only defines the
organization of the executable file. I don't think it actually specifies
all of the details of a program's preamble. As a result, I wouldn't be
surprised to see several different compilers, each of which employs ELF,
generating slightly different preamble code -- even where the differences
aren't necessarily mandated by the underlying hardware architectures.
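If you want to watch code run before and after main() without writing any
assembly, GCC and Clang offer the "constructor" and "destructor" function
attributes. This is a compiler extension, not standard C. Whether the
toolchain wires these functions in through ".init"/".fini",
".ctors"/".dtors", or some other section varies from system to system, so
treat this as a sketch of the idea. The names my_setup, my_teardown, and
setup_ran are made up.

```c
#include <assert.h>
#include <stdio.h>

/* Set by the constructor below; stays 0 if it never ran. */
int setup_ran = 0;

/* GCC/Clang extension: this function runs before main() is entered. */
__attribute__((constructor))
void my_setup(void) {
    setup_ran = 1;
}

/* GCC/Clang extension: runs after main() returns or exit() is called. */
__attribute__((destructor))
void my_teardown(void) {
    fputs("teardown: running after main\n", stderr);
}

/* Call this from main(): by then the constructor has already run. */
void check_setup_ran(void) {
    assert(setup_ran == 1);
}
```

If the program is killed or calls abort(), the destructor does not run,
which matches the "exits normally" condition described above.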
Question:
How many conditional flags does a computer have? Basically, all
flags hold 0, and when a certain condition is met, the flag
changes to 1? Is this what "set flag" means?
Answer:
There are 6 conditional flags (status flags). They are part
of the flags register, which is 32 bits wide, although not every
bit is a defined flag. Most of these flags aren't really of
interest to us. We're basically concerned with four flags: CF, SF,
ZF, and OF. These flags are typically used with instructions like
testl and cmpl to control the flow of execution through
conditionals, such as if-else.
Well, each of these flags is 1 bit, so it has a value of either
0 or 1, depending on the result of the last arithmetic or
logical operation. Each arithmetic or logical operation will
update these flags.
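To make the four flags concrete, here is a C sketch that computes what
they would hold after a compare. The Flags struct and function name are
made up for illustration. The arguments are in AT&T order, so
flags_after_cmpl(b, a) models "cmpl b, a", which computes a - b and throws
the result away.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    int cf;   /* carry: unsigned borrow out of the subtraction */
    int zf;   /* zero: result was 0                            */
    int sf;   /* sign: most significant bit of the result      */
    int of;   /* overflow: signed result did not fit           */
} Flags;

/* Flags after "cmpl b, a" (computes a - b, keeps only the flags). */
Flags flags_after_cmpl(uint32_t b, uint32_t a) {
    uint32_t r = a - b;
    Flags f;
    f.cf = (a < b);
    f.zf = (r == 0);
    f.sf = (int)((r >> 31) & 1);
    /* Signed overflow: a and b differ in sign, and r's sign differs from a's. */
    f.of = (int)((((a ^ b) & (a ^ r)) >> 31) & 1);
    return f;
}

/* Call this from main() to try it. */
void check_flags(void) {
    Flags eq = flags_after_cmpl(5, 5);            /* 5 - 5 */
    assert(eq.zf == 1 && eq.cf == 0 && eq.sf == 0 && eq.of == 0);

    Flags lt = flags_after_cmpl(5, 3);            /* 3 - 5 */
    assert(lt.zf == 0 && lt.cf == 1 && lt.sf == 1 && lt.of == 0);

    Flags ov = flags_after_cmpl(1, 0x80000000u);  /* TMin - 1 */
    assert(ov.of == 1 && ov.sf == 0 && ov.cf == 0);
}
```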
Question:
Instruction "setns" checks condition ~SF. If ~SF is true, does it
mean that SF has "1"? The "setns" sets a single byte based on the
condition code. So, "setns %al" writes "0" to the low-order byte of
the destination register when the condition code is false?
Answer:
No, it is the other way around: ~SF is true when SF is "0".
"setns" is interested in the sign flag, SF. If SF is 1, it sets
the byte value to 0, otherwise it sets it to 1. So, yes, "setns
%al" writes 0 to the low-order byte when the condition is false.
"setne" does exactly the same thing, except that it inspects the
zero flag (ZF), not the sign flag.
Question:
The program uses the "movzbl" instruction to fill the high-order
bytes of the register after the "set" instruction. What is the
meaning of changing the bit pattern of the destination register
according to the condition code? Does it mean that the
destination register will have either 0x00000000 or 0x000000ff?
Answer:
Since the set instruction only writes the low-order byte of the
4-byte register (a 0 or a 1), the other bytes keep whatever they
held before. The "movzbl" instruction moves 1 byte and zeros
everything else. As a result, it can be used to clear the other
bytes of the 4-byte destination register, which ends up holding
either 0x00000000 or 0x00000001.
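Here is the set-then-zero-extend pair modeled in C, as a sketch. The
function names are made up, and 0xdeadbeef just stands in for whatever
junk happened to be in %eax beforehand.

```c
#include <assert.h>
#include <stdint.h>

/* Models "setns %al": the low byte becomes 1 if SF == 0, else 0.
 * The upper 24 bits of %eax are left exactly as they were. */
uint32_t setns_al(uint32_t eax, int sf) {
    uint32_t byte = sf ? 0u : 1u;
    return (eax & 0xffffff00u) | byte;
}

/* Models "movzbl %al, %eax": keep the low byte, zero the other three. */
uint32_t movzbl_al_eax(uint32_t eax) {
    return eax & 0xffu;
}

/* Call this from main() to try it. */
void check_set_and_extend(void) {
    uint32_t eax = 0xdeadbeefu;                    /* leftover junk */
    assert(setns_al(eax, 0) == 0xdeadbe01u);       /* SF clear: byte = 1 */
    assert(setns_al(eax, 1) == 0xdeadbe00u);       /* SF set:   byte = 0 */
    assert(movzbl_al_eax(setns_al(eax, 0)) == 1u); /* clean 0x00000001   */
    assert(movzbl_al_eax(setns_al(eax, 1)) == 0u); /* clean 0x00000000   */
}
```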
|
Thursday, January 30, 2002 |
|
Question:
In leal operator, it has two operands - first is the form of memory
reference and second is the name of register.
Answer:
Yes, this sounds correct. But, let me repeat it back to you, just to make
double-sure.
- The first operand is the source. The second operand is the
destination.
- The first operand takes the form of a "scaled, indexed
operand", Displacement(Base, Index, Scale),
where the result is equal to (Base + (Scale * Index) +
Displacement).
- This can be used, for example, to compute the address of some
element within an array. For example, if we have an array of
integers beginning at address Arr, the address of the element at
index 5 (counting from 0) could be named (Arr, 5, 4). Since the
size of an integer is 4 bytes, that element is located 20 bytes
past Arr, or (Arr + 4*5).
- Since this instruction just generates a number, but does not
dereference it as an address, it can actually be used for simple
computation. Any expression of the form
(Base + (Scale * Index) + Displacement) can be computed
this way. The result is simply stored in the destination register.
- Please note that a displacement of 0 is assumed, if it is not
specified. Similarly, a scale of 1 is assumed, if not specified.
The scale must be 1, 2, 4, or 8.
- In other words, it works just like a movl, except it doesn't
dereference the result. Instead, it sticks the result directly
into the destination register.
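The arithmetic that leal performs can be written as a one-line C function.
This is only a sketch: the function name and the sample values are made
up, and the parameters stand for the numbers held in the registers, not
for the registers themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Models "leal disp(base, index, scale), dest":
 * dest = base + scale * index + disp. Memory is never read. */
uint32_t leal(uint32_t disp, uint32_t base, uint32_t index, uint32_t scale) {
    return base + scale * index + disp;
}

/* Call this from main() to try it. */
void check_leal(void) {
    /* Address of the int at index 5 of an array starting at 0x1000. */
    assert(leal(0, 0x1000, 5, 4) == 0x1014);
    /* Plain addition, like "leal (%edx, %ebx), %eax" with 7 and 8. */
    assert(leal(0, 7, 8, 1) == 15);
    /* Multiply by 5 in one instruction: x + 4*x, here with x = 6. */
    assert(leal(0, 6, 6, 4) == 30);
}
```

The last case is why compilers like leal so much: it gets a small multiply
and an add done without touching memory.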
Question:
It means that in the leal operand, (%ebp) means a address of memory?
Answer:
With leal, the second operand must be the name of a register, for example,
"%eax". It is not legal for the second operand, the destination, to be an
indirect reference to memory via a register. For example, "(%eax)" is not
a legal destination for leal.
For the first operand, something like "(%eax)" is legal, but it doesn't do
what you might expect -- at least if you're expecting similar semantics to
movl. Instead, it just uses the value of "%eax" -- it does not dereference
it to get the value at the named memory location. Remember, this
instruction loads an address, not memory.
Question:
Given, "leal (%edx, %ebx), %eax", %edx and %ebx have values and the
values are memory address?
Answer:
Maybe, maybe not. The leal instruction doesn't care. Instead, it just
"crunches the numbers". This instruction was designed to play with
addresses, but in practice, it is much more flexible. In the example
above, where "%edx" holds the value of "a" and "%ebx" holds the value of
"b", we don't know the type of the result. If "a" and "b" are pointers,
the result will be an address. If "a" and "b" are ints, the result will be
an "int". (Although, note that, unlike addl, leal does not set the
condition flags, so you can't check for overflow afterward.)
Question:
Given, "leal (%edx, %ebx), %eax", %eax will have a memory address decided
by %edx + %ebx ?
Answer:
"%eax" will hold the result of adding "%edx" and "%ebx". The type of the
result, be it a pointer (address), int, etc., depends on the types of the
operands. Assembly isn't strongly typed -- it does what you ask with
whatever you give it. In this case, "%eax = %edx + %ebx".
Question:
How does "leal (%edx, %ebx)" do actual arithmetic calculation with only
memory addresses?
Answer:
Keep in mind here that assembly is not really typed. "leal" just sees the
bits in the registers. Adding an int is much the same as adding an
address. It just crunches the numbers. It is completely up to the
programmer to interpret the meaning of the result.
|
Thursday, January 24, 2002 |
|
Question:
Regarding unsigned and two's complement integers, why is
UMax = 2 * TMax + 1?
Answer:
In two's complement encoding, for positive numbers the MSB must be a
zero since it acts as the sign bit. Therefore you lose the 2^(w-1)
extra values that an unsigned number can represent using the MSB.
That's why, since TMax = 2^(w-1) - 1:
UMax = TMax + 2^(w-1)
= TMax + (TMax + 1)
= 2 * TMax + 1
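You can check the identity numerically for any word size w. The sketch
below uses made-up helper names and 64-bit arithmetic so that nothing
overflows for w up to 32.

```c
#include <assert.h>
#include <stdint.h>

/* Largest w-bit two's complement value: 2^(w-1) - 1. Valid for 1 <= w <= 63. */
uint64_t tmax_w(unsigned w) {
    return (UINT64_C(1) << (w - 1)) - 1;
}

/* Largest w-bit unsigned value: 2^w - 1. Valid for 1 <= w <= 63. */
uint64_t umax_w(unsigned w) {
    return (UINT64_C(1) << w) - 1;
}

/* Call this from main(): checks UMax = 2 * TMax + 1 for w = 1..32. */
void check_umax_identity(void) {
    for (unsigned w = 1; w <= 32; w++) {
        assert(umax_w(w) == tmax_w(w) + (UINT64_C(1) << (w - 1)));
        assert(umax_w(w) == 2 * tmax_w(w) + 1);
    }
}
```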
Question:
I am not sure how casting works. For an unsigned integer, is the sign
extension bit 0 regardless of the MSB?
Answer:
When extending a number from a short to an int, or from an int to a long,
the key is to preserve the value. For unsigned numbers, we may just use
a 0 as the extension bit regardless of the MSB. However, for signed
numbers, the sign extension bit is the MSB. Therefore, for negative
numbers, where the MSB is 1, the sign extension bit is 1. Otherwise,
consider what would happen if you just extended a short negative number
to an int with a 0 sign extension bit. Then our new MSB would be 0, and
we would have converted a negative number to a positive number in
extending it.
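C's own conversions follow exactly these rules, so you can see both kinds
of extension with a few casts. The function names below are made up, and
the fixed-width types from stdint.h stand in for short and int.

```c
#include <assert.h>
#include <stdint.h>

/* Signed 16 -> 32 bits: the compiler sign-extends, so the value is kept. */
int32_t extend_signed(int16_t x) {
    return x;
}

/* Unsigned 16 -> 32 bits: the compiler zero-extends, so the value is kept. */
uint32_t extend_unsigned(uint16_t x) {
    return x;
}

/* The mistake from the answer above: zero-extending a signed value's bits. */
int32_t wrong_zero_extend(int16_t x) {
    return (int32_t)(uint16_t)x;
}

/* Call this from main() to try it. */
void check_extension(void) {
    assert(extend_signed(-2) == -2);                       /* still -2      */
    assert((uint32_t)extend_signed(-2) == 0xfffffffeu);    /* filled with 1s */
    assert(extend_unsigned(0xfffeu) == 0x0000fffeu);       /* filled with 0s */
    assert(wrong_zero_extend(-2) == 65534);                /* now positive!  */
}
```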
|
Saturday, January 19, 2002 |
|
Question:
I'm currently registered for one recitation and would like to
attend another. I have a course conflict. What should I do?
Answer:
We have a big space crunch right now. All of the recitations
are full with long waiting lists. We are going to try to load balance
these and admit as many people as we can during the first recitation
-- one week from Monday.
We will also talk about this at our Monday evening staff meeting.
Ideally, we'll develop an electronic way for you to communicate
your recitation preference to us prior to Monday's recitation.
For now, please contact the instructor of the section that you
want to attend.
|
Tuesday, January 14, 2002 |
|
Question:
Can we use last semester's textbook?
Answer:
No. Please buy a new copy at the bookstore. Your textbook is excellent
and very nearly final. But, the last unit was recently revised. The
revision was not just cosmetic -- the last unit has been expanded
and reorganized.
The good news is that since the textbook is currently being beta
tested, you get a really, really good price.
Question:
What are the differences between the Program Counter, a Register, and
the Register File?
In the textbook, when it says "register", does it mean one of the
entries in the register file? How is the PC updated to point to the
next instruction?
Answer:
Registers are small, named, pieces of storage within the processor. You
can think of them as variables that are implemented in hardware.
Typically, the program manages them using loads and stores. "Load value X
from memory into register Y", and "Store the value from register Y into
main memory at address Z". When the processor performs operations, it
generally does so on values stored within registers, because they operate
at the same speed as the processor -- no delay.
The Program Counter (PC) is a special-purpose register. It tells the
processor which instruction to execute next. It can be set, by the
program, like any other register. This is how loops, &c are implemented.
Instead of setting the value using load, it is typically done with a
"jump" instruction. The other thing that is special about this register is
that it is automatically incremented with each executing instruction -- so
the processor will execute the next instruction next.
Although each register is conceptually separate, they are usually best
implemented as, in effect, one small memory module. This memory module is
called the register file. At the most basic level, it is word addressed,
just like normal memory. Most registers are one word long (some are two).
When you write assembly code, you refer to registers by name. But, at
the machine level, these names are offsets into (addresses within)
the register file.
But, please don't get bogged down with the details at this point. The
important things to remember are these. Registers are small, fast, named
places to store things within the processor. Some are general purpose and
are used by the compiler to manipulate values, whereas others are special
purpose and have very specific, predetermined roles. The register file is
nothing more than the name we give to the collection of all of the
registers within the processor.
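If a picture in code helps, here is a toy model in C: the register file is
just a small array, register "names" are indices into it, and the PC
advances on its own unless a jump overwrites it. The instruction set
(OP_ADDI, OP_JMP, OP_HALT) and all of the names are invented for this
sketch, and the PC counts whole instructions rather than bytes, which is
simpler than any real machine.

```c
#include <assert.h>
#include <stdint.h>

enum { NUM_REGS = 8 };
enum { OP_ADDI, OP_JMP, OP_HALT };   /* made-up opcodes */

typedef struct {
    int      op;    /* which operation to perform           */
    uint32_t reg;   /* register number (index into regs[])  */
    uint32_t imm;   /* constant to add, or jump target      */
} ToyInsn;

typedef struct {
    uint32_t regs[NUM_REGS];  /* the register file                     */
    uint32_t pc;              /* index of the next instruction to run  */
} ToyCpu;

/* Run one instruction. Returns 0 once the program halts, 1 otherwise. */
int toy_step(ToyCpu *cpu, const ToyInsn *prog) {
    ToyInsn in = prog[cpu->pc];
    cpu->pc += 1;                       /* automatic advance */
    switch (in.op) {
    case OP_ADDI: cpu->regs[in.reg] += in.imm; return 1;
    case OP_JMP:  cpu->pc = in.imm;            return 1;  /* set the PC */
    default:      return 0;
    }
}

/* Call this from main(): the jump skips over the "add 100". */
void check_toy_cpu(void) {
    ToyInsn prog[] = {
        { OP_ADDI, 0, 1 },     /* 0: r0 += 1                 */
        { OP_JMP,  0, 3 },     /* 1: jump to instruction 3   */
        { OP_ADDI, 0, 100 },   /* 2: skipped                 */
        { OP_HALT, 0, 0 },     /* 3: stop                    */
    };
    ToyCpu cpu = { {0}, 0 };
    while (toy_step(&cpu, prog)) { }
    assert(cpu.regs[0] == 1);
    assert(cpu.pc == 4);
}
```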
|