[Understanding Computer Systems in Depth] Part 7: The AVR Instruction Set Architecture
7.1. Types of Instruction Sets
The set of instructions that a microprocessor can execute, together with their structure, is generically called its instruction set architecture or ISA. The ISA description contains all the details about the execution of each instruction and its effects on the components of the microprocessor. A microprocessor may have a small number of instruction types but a large number of total possible instructions because of the variations allowed in the operands. For example, an instruction to add two numbers may have variations for signed and unsigned numbers, for an operand stored in memory, or for a result stored in one of the general purpose registers.
Most of the information in this chapter is contained in the document 8-bit AVR Instruction Set or its equivalent HTML version.
The decision about which instructions a microprocessor can execute is one of the most important in its design and has a profound effect on performance when executing programs. Furthermore, the set of instructions and the processor architecture are tightly coupled. For example, the operands for most of the instructions are stored in the register file; thus, it must allow these operands to be accessed efficiently. The AVR architecture, for example, allows various combinations when accessing the operands in the stage Register Operand Fetch.
Another criterion that influences the type of instructions supported by a microprocessor is the complexity of the implementation. Ultimately, the instructions need to be sequenced by a digital circuit. The higher the complexity of the tasks carried out by an instruction, the higher the complexity of the design. This complexity usually translates into a larger number of gates, which require more physical space (a larger chip) and more power.
Consider the following example. Should a processor include as part of its machine language an instruction that, given a real number and the coefficients of a second order polynomial, obtains its corresponding value? Suppose we call this instruction SDPE (second degree polynomial evaluation). One possible format of this instruction could be:
SDPE a, b, c, n, dest
The instruction would evaluate the polynomial using the three coefficients and the additional value, and store the result in location dest. The evaluation would calculate the value with the formula:
dest = a·n² + b·n + c
Although such an instruction could be useful, a processor without it could still calculate the same result by executing a sequence of instructions that perform the required additions and multiplications. Thus the trade-off to explore when designing a microprocessor is between the complexity of the architecture and the performance obtained by including certain operations in the ISA. A microprocessor with a reduced set of instructions may require more operations to perform sophisticated calculations, but each of them executes faster due to its simplicity. Conversely, a microprocessor with a large number of instructions may require much shorter instruction sequences, but they may take longer to execute due to their complexity.
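As a rough illustration of this trade-off, the following sketch (written in Python only as an executable model, not as AVR code) computes what the hypothetical SDPE instruction would produce using one simple multiplication or addition per step, the way a processor without that instruction would have to. It assumes the polynomial has the form a·n² + b·n + c; the function name sdpe_by_steps is made up for this example.

```python
def sdpe_by_steps(a, b, c, n):
    """Evaluate a*n*n + b*n + c using one simple operation per step."""
    t = a * n   # multiply: t = a*n
    t = t + b   # add:      t = a*n + b
    t = t * n   # multiply: t = (a*n + b)*n
    t = t + c   # add:      t = a*n*n + b*n + c
    return t


print(sdpe_by_steps(2, 3, 4, 5))  # 2*25 + 3*5 + 4 = 69
```

Four simple instructions replace the single complex one, which is the kind of sequence a reduced instruction set would rely on.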
Over the years microprocessor manufacturers have explored this trade-off and created two categories of systems depending on the complexity of the machine languages.
7.1.1. Complex Instruction Set Computers (CISC)
The processors that provide a rich and complex set of instructions are called Complex Instruction Set Computers or CISC. These instructions typically use several operands and may require multiple memory accesses. There are various examples of CISC microprocessors, but perhaps the most popular architectures of this type are those of Intel's x86 family: the original 16-bit x86, IA-32, and its 64-bit extension (known as x86_64, AMD64 or Intel 64; not to be confused with IA-64, the unrelated Itanium architecture). The x86 architecture originated with the Intel 8086 in the late 1970s and was adopted by personal computers in the early 1980s. The architecture then evolved into IA-32 (32-bit registers and addresses), and then into the 64-bit version. These processors are present in numerous desktop computers, servers, and laptops. Other examples of CISC architectures are those in systems such as IBM's System/360, DEC PDP-11, DEC VAX, or Motorola 68k.
The machine instructions in a CISC computer allow handling multiple operands, some or all of them in memory, and accessed with address manipulations that may require certain arithmetic operations. For example, the following instruction is part of the IA-32 architecture:
ADD $4, 14(%eax, %ebx, 8)
and adds the value 4 to the data stored in memory at the address obtained by adding the value 14, the content of register %eax, and the content of register %ebx multiplied by 8. As you can see, the execution of this instruction requires more than one arithmetic operation. The addition denoted by the instruction code (ADD) can be performed only once the operands are obtained, and additional additions and even a multiplication are needed to obtain the address of the second operand.
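The address arithmetic hidden inside that single instruction can be sketched as follows. This is only a Python model of the calculation described above; the register contents are hypothetical values chosen for illustration, and effective_address is a made-up helper name.

```python
def effective_address(displacement, base, index, scale):
    """Address denoted by an operand written as displacement(base, index, scale)."""
    return displacement + base + index * scale


# Hypothetical register contents, for illustration only.
eax = 0x1000
ebx = 3

addr = effective_address(14, eax, ebx, 8)
print(hex(addr))  # 0x1000 + 14 + 3*8 = 0x1026
```

Two additions and one multiplication are needed before the ADD itself can even read its memory operand.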
7.1.2. Reduced Instruction Set Computers (RISC)
RISC architectures appeared as an alternative to existing CISC microprocessors. The philosophy is the opposite: microprocessors have a very reduced set of instructions, each performing a very simple operation. The idea is to translate this simplicity in functionality into faster execution and a simpler structure in the microprocessor. A reduced execution time per instruction does not necessarily translate into a processor that is faster than a CISC architecture. A simple set of instructions means that complex operations will require sequences of instructions, some of them very long.
Some examples of current RISC architectures are:
- MIPS (Microprocessor without Interlocked Pipeline Stages): used in some routers, Nintendo consoles, original PlayStation, PlayStation 2 and PSP.
- ARM: present in numerous computers and personal devices such as digital cameras, mobile phones, iPod, etc.
- SPARC (Scalable Processor ARChitecture): powered the systems sold by Sun Microsystems (now owned by Oracle).
- PowerPC (Performance OPtimization with Enhanced RISC - Performance Computing): It was created by a consortium of Apple, IBM and Motorola in the early 1990s to be used in personal computers. Not widely used these days.
- AVR architecture. This architecture is now included in multiple microcontrollers used in embedded systems. Some of the systems in the Arduino family use microprocessors with this architecture.
The comparison between RISC and CISC architectures cannot be done solely in terms of the number of instructions executed per unit of time. With that measure, RISC processors are clear winners. A more exact comparison is to execute the same high level task in two processors and measure the amount of time it takes to finish the task, instead of the number of instructions executed. Very likely, the CISC architecture will execute a smaller number of instructions, but each of them will take longer than the RISC architecture.
7.1.3. Fixed vs Variable Size Instruction Encoding
Another important feature of a microprocessor that can be used to divide them into two categories is the format in which the instructions are encoded. In processors that use a fixed length format, every instruction has exactly the same size. This feature has numerous consequences in the design stage. Typically, a small number of operands are allowed in every instruction. A large number of operands would make the format longer, and perhaps not fully used by all the instructions. The number of different instructions is also reduced, as a larger number of instructions translates into a larger number of possible cases to encode, and eventually into a larger format. The main advantage of these instructions is in the decoding stage. That is, when the instruction has been loaded in the instruction register, the processor needs to identify which one it is and the type of operands that are required. A fixed format greatly simplifies this stage as the bits encoding the different elements of the instruction are located in the same positions.
The processors with a variable length format, on the other hand, allow instructions to be encoded with different numbers of bits. Thus, an instruction may have an arbitrary number of operands as they will be encoded with additional bits. The main disadvantage of this approach appears during the decoding stage, that is, when the processor needs to identify the type of instruction to execute and obtain all its operands. In these processors, the decoding stage may require loading additional bytes from memory while the instruction is being decoded.
7.2. Instruction Format of AVR Architecture
The AVR architecture has a fixed length format. All instructions (with only a few exceptions) are encoded with 16 bits. When describing the different components of an instruction, we will be using the following convention for registers and operands:
- Rd: A register in the register file that will be the destination of the result derived from the instruction.
- Rr: A register in the register file which will provide one of the operands for the instruction.
- R: Result of the instruction after its execution.
- K: A constant value.
- k: A constant memory address.
- b: A bit in a register in the register file or an input/output register.
- s: One of the bits of the status register.
- X, Y, Z: 16 bit registers obtained by combining two registers in the register file (X=R27:R26, Y=R29:R28, Z=R31:R30).
The following figure shows some of the instruction encoding schemes used by the AVR architecture.
Some of the Instruction Encoding Schemes in the AVR Architecture
Every instruction must have some bits to encode the type of operation that is required. These bits are called the operation code. The AVR architecture uses a variable number of bits to encode the operation. In its minimum version, the left-most four bits are used.
Instructions for which the operands are among the 32 general purpose registers require five bits per operand (2⁵ = 32 combinations, one per register). These bits are not necessarily in contiguous positions in the instruction, but this fact makes no difference when decoding it. For example, the following instruction:
ADD R0, R31
is encoded as shown in the following figure:
Encoding of the ADD instruction
The instruction implements the operation R0 = R0 + R31. Following the convention to describe the AVR ISA, the ADD operation would be denoted by ADD Rd, Rr. Note that the first operand is both a source operand and the destination in which the result is going to be stored. As a consequence, the content of R0 before the execution of the instruction is lost. The top part of the previous figure shows the encoding of these instructions for any register number, whereas the bottom part shows the exact encoding for that instruction. As you can see, the five bits in the format with the letter r (source operand) have value 11111, referring to register R31. Analogously, the five bits denoted by d (destination operand) have value 00000, referring to register R0. The hexadecimal representation of this instruction is 0x0E0F.
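The bit placement just described can be checked with a small sketch. It is written in Python purely as a calculator and assumes the ADD layout 0000 11rd dddd rrrr given in the instruction set manual (the high bit of r sits apart from its other four bits); encode_add is a made-up helper name.

```python
def encode_add(d, r):
    """Encode ADD Rd, Rr as a 16-bit word with layout 0000 11rd dddd rrrr."""
    if not (0 <= d <= 31 and 0 <= r <= 31):
        raise ValueError("register index must be between 0 and 31")
    opcode = 0b000011 << 10      # six opcode bits at the left
    r_high = (r >> 4) << 9       # bit 4 of r goes to bit 9
    d_bits = d << 4              # the five bits of d occupy bits 8..4
    r_low = r & 0x0F             # low four bits of r occupy bits 3..0
    return opcode | r_high | d_bits | r_low


print(format(encode_add(0, 31), "04X"))  # 0E0F
```

The call reproduces the value 0x0E0F for ADD R0, R31 even though the five bits of r are not contiguous in the word.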
Some instructions allow a number as operand instead of a register. These numbers that appear literally in instructions are called immediate operands. The reason for this name will become apparent when the addressing modes are described. These operands require the number to be encoded with bits that are part of the instruction format. For example, the instruction
CPI R16, 255
is encoded as shown in the following figure:
Encoding of the CPI instruction
This type of instruction is generically represented by CPI Rd, K. This encoding shows some consequences of trying to keep all instructions the same size. The operation code consists of the four left-most bits, with value 0011, which encode the CPI part of the instruction. CPI stands for compare with immediate, and the instruction compares the immediate operand (the number 255) with the content of register R16. Internally, the microprocessor will carry out the operation R16 − 255 and update the status register to reflect the conditions of the result. In other words, instead of storing the value of the operation, only certain conditions are captured in the status register. The hexadecimal representation of this instruction is 0x3F0F.
Since the instruction is encoded with 16 bits, and four of them are used for the operation code, there are only 12 bits remaining to encode one register and a number. In principle, if we allowed any of the 32 general purpose registers to be specified (five bits), there would be only 7 bits remaining to encode the immediate operand, restricting K to the range 0≤K≤127. However, the designers decided to expand the range of the immediate operand to 0≤K≤255, requiring eight bits to encode that operand. This leaves only four bits to encode the register. As a consequence, the instruction CPI can only use 16 of the 32 general purpose registers. More precisely, this instruction can specify as first operand a register with index d in the range 16≤d≤31. This is an example of the special requirements derived from a fixed length instruction format.
For instance, the instruction CPI R7, 255 would be ruled incorrect because it does not correspond to the encoding in the AVR instruction format.
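A sketch of this encoding, again in Python as an executable model, shows where the register restriction comes from. It assumes the CPI layout 0011 KKKK dddd KKKK from the instruction set manual, in which the four register bits hold d − 16; encode_cpi is a made-up helper name.

```python
def encode_cpi(d, k):
    """Encode CPI Rd, K as a 16-bit word with layout 0011 KKKK dddd KKKK."""
    if not 16 <= d <= 31:
        raise ValueError("CPI only accepts registers R16 to R31")
    if not 0 <= k <= 255:
        raise ValueError("the immediate operand must fit in 8 bits")
    opcode = 0b0011 << 12        # four opcode bits
    k_high = (k >> 4) << 8       # high nibble of K in bits 11..8
    d_bits = (d - 16) << 4       # register index minus 16 in bits 7..4
    k_low = k & 0x0F             # low nibble of K in bits 3..0
    return opcode | k_high | d_bits | k_low


print(format(encode_cpi(16, 255), "04X"))  # 3F0F
```

Calling encode_cpi(7, 255) raises an error in this model, mirroring the way an assembler would reject CPI R7, 255.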
The third type of operands allowed in some instructions are values that must be used as memory addresses (not as simple numeric values). As in the case of the immediate operands, these values need to be encoded using the bits of the instruction, which poses a challenge for the fixed format. For example, the instruction
LDS R12, 12565
loads the content of memory in position 12565 into register R12. Its binary encoding must include the operation code, the register R12 and the memory address 12565. Memory addresses in the AVR architecture have 16 bits. How is it possible to encode such an instruction while remaining within the fixed format of 16 bits per instruction? There is no other solution than to create an exception in the encoding. The LDS instruction is one of the cases in the AVR architecture that uses 32 bits, as shown in the following figure:
Encoding of the LDS instruction
The operation code in this case is represented by 11 bits: the leftmost seven bits plus the rightmost four of the first 16 bits. The second 16 bits encode the memory address. These instructions are generically described as LDS Rd, k. The hexadecimal representation of this instruction is 0x90C03115.
The set of steps taken by the microprocessor to execute this instruction is different from the previous ones. In this case, after obtaining the first 16 bits (hexadecimal 0x90C0), the decoding stage detects that it is an LDS instruction and therefore a second memory word needs to be loaded with the address operand. This double access to memory, to fetch the instruction and to obtain the additional operand, will delay its execution time.
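The two-word encoding and the decoder's check can be sketched as follows. This Python model assumes the LDS layout from the instruction set manual, 1001 000d dddd 0000 followed by the 16-bit address; under that layout, R12 and address 12565 give the words 0x90C0 and 0x3115. The helper names are made up for this example.

```python
def encode_lds(d, k):
    """Encode LDS Rd, k as two 16-bit words: 1001 000d dddd 0000, then k."""
    if not 0 <= d <= 31:
        raise ValueError("register index must be between 0 and 31")
    if not 0 <= k <= 0xFFFF:
        raise ValueError("the address must fit in 16 bits")
    first = 0x9000 | (d << 4)    # opcode bits plus the five register bits
    return first, k


def is_lds(first_word):
    """Decoder check: the 11 opcode bits identify a two-word LDS."""
    return (first_word & 0xFE0F) == 0x9000


first, second = encode_lds(12, 12565)
print(format(first, "04X"), format(second, "04X"))  # 90C0 3115
print(is_lds(first))                                # True
```

Only after is_lds succeeds on the first word does the processor know that a second word must be fetched, which is the extra memory access mentioned above.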
7.3. The Assembly Language
To execute the instructions allowed by the architecture, all of them must be encoded with zeros and ones. How to perform this encoding for each instruction is described in the document 8-bit AVR Instruction Set. But writing programs directly in this binary encoding is extremely inconvenient.
The solution for this problem is to define a language with the same instructions, operands and formats as the machine language but, instead of using binary values, to use combinations of letters and numbers that a programmer can read. This language is called assembly language and is the notation shown in the previous sections when describing the instructions.
Thus, the assembly language can be defined as a direct alphanumeric representation of the instructions executed by a microprocessor as part of its machine language. The translation between assembly language and a machine instruction is a straightforward application of simple encoding rules to transform each operand into a set of bits of the instruction format.
Consider again the assembly language instruction encoded as 0x0E0F. These 16 bits correspond to the symbolic form ADD R0, R31, and the translation process between them is straightforward, as shown in the previous section.
In the rest of this section we describe the syntax to write programs in assembly language for the AVR architecture. Each microprocessor has its own set of instructions and formats; thus, the assembly code is highly dependent on the architecture. However, there are several syntactical features that are common to almost all microprocessors. For example, instructions always start with the instruction mnemonic (ADD, CPI, LDS, etc.) followed by the operands separated by commas. The AVR assembly language has the following additional rules:
- The first operand after the instruction mnemonic is the destination operand.
- The remaining operands are source operands.
- Registers are referred to by their names, which are made of a number between 0 and 31 with the prefix R or r.
- The assembly instructions are case insensitive. Both r1 and R1 refer to the same register, and add and ADD refer to the same instruction.
- Numeric constants in an instruction are simply represented as numbers, with no prefix. For example CPI R16, 255.
- Memory addresses are typically referred to by a label, which must be previously defined in the data section of an assembly program.
7.3.1. Detailed Description of the AVR Instructions
Writing programs in assembly language requires a detailed description of all the instructions executed by the microprocessor. Such a description must include all possible formats, operand combinations, and how the operations are executed. This information is typically included in documents known as programming manuals or instruction set architecture manuals.
In the case of the AVR architecture, this detailed description is included in the 152-page document titled Atmel AVR 8-bit Instruction Set.
As an example of the information included in this document, the description of the instruction ADD (add without carry) includes the following details (see page 16 of the instruction set manual):
Example:
6. ADD - Add without Carry
6.1 Description
Adds two registers without the C Flag and places the result in the destination register Rd.
Operation: Rd ⟵ Rd + Rr
Syntax: ADD Rd, Rr
Operands: 0≤d≤31, 0≤r≤31
Program Counter: PC ⟵ PC + 1
16-bit Opcode: 0000 11rd dddd rrrr
6.2 Status Register (SREG) and Boolean Formula
In the formulas below, a trailing ¯ denotes the complement of the preceding bit.
H: Rd3⋅Rr3 + Rr3⋅R3¯ + R3¯⋅Rd3. Set if there was a carry from bit 3; cleared otherwise.
V: Rd7⋅Rr7⋅R7¯ + Rd7¯⋅Rr7¯⋅R7. Set if two's complement overflow resulted from the operation; cleared otherwise.
N: R7. Set if MSB of the result is set; cleared otherwise.
Z: R7¯⋅R6¯⋅R5¯⋅R4¯⋅R3¯⋅R2¯⋅R1¯⋅R0¯. Set if the result is $00; cleared otherwise.
C: Rd7⋅Rr7 + Rr7⋅R7¯ + R7¯⋅Rd7. Set if there was carry from the MSB of the result; cleared otherwise.
R (Result) equals Rd after the operation.
Example:
add r1, r2 ; Add r2 to r1 (r1 = r1 + r2)
add r28, r28 ; Add r28 to itself (r28 = r28 + r28)
Words: 1 (2 bytes)
Cycles: 1
As you can see, this page includes comprehensive information about the operation carried out by the instruction, its syntax, binary encoding, and the effect of its execution over the status register. The description also includes the size of its encoding, and the time it takes to execute, in this case, 1 cycle. All of the 124 instructions of the set are described with this level of detail.
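To see how the Boolean formulas in that description translate into actual flag values, the following Python sketch evaluates them bit by bit for an 8-bit addition. It is only a model of the H, V, N, Z and C formulas quoted above (Rd and Rr are the operand values before the addition, R is the result); the helper names are made up for this example.

```python
def bit(value, i):
    """Return bit i of value as 0 or 1."""
    return (value >> i) & 1


def add_flags(rd, rr):
    """Return (result, flags) for the 8-bit operation Rd <- Rd + Rr."""
    r = (rd + rr) & 0xFF
    not_r3 = 1 - bit(r, 3)   # complement of R3
    not_r7 = 1 - bit(r, 7)   # complement of R7
    flags = {
        "H": (bit(rd, 3) & bit(rr, 3)) | (bit(rr, 3) & not_r3) | (not_r3 & bit(rd, 3)),
        "V": (bit(rd, 7) & bit(rr, 7) & not_r7)
             | ((1 - bit(rd, 7)) & (1 - bit(rr, 7)) & bit(r, 7)),
        "N": bit(r, 7),
        "Z": int(r == 0),
        "C": (bit(rd, 7) & bit(rr, 7)) | (bit(rr, 7) & not_r7) | (not_r7 & bit(rd, 7)),
    }
    return r, flags


# 0x7F + 0x01 = 0x80: signed overflow (V=1), negative result (N=1), no carry out.
print(add_flags(0x7F, 0x01))
# 0xFF + 0x01 = 0x00: zero result (Z=1) with a carry out of bit 7 (C=1).
print(add_flags(0xFF, 0x01))
```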
A sequence of instructions in an assembly program is simply written with one instruction per line as shown in the following example
Although the rules to write programs in assembly code are all contained in the description of its instruction set, there is some additional functionality that is required to define and declare data, variables and functions. These definitions can then be used as symbols within a program, as can be seen in the use of the labels msg and printf in the previous example, denoting respectively a variable and a function.
7.4. Subset of Instructions of the AVR Architecture
This section includes the description of a subset of the instructions of the AVR architecture. A description of all the instructions can be found in the document 8-bit AVR Instruction Set. The instructions have been divided into eight categories: data transfer, arithmetic, logic, shift/rotate, compare, jump/branch, subroutine call/return, and input/output.
7.4.1. Instruction to transfer data
These are instructions to move data from one location to another. The possible locations are memory or general purpose registers.
MOV: Move values between registers
Instruction that makes a copy of one register into another.
Syntax: MOV Rd, Rr
where 0≤d≤31, 0≤r≤31
Operations: Rd ⟵ Rr
No bits in the status register are modified.
Example:
MOV R1, R2 ;; Copy the value of R2 in R1
mov r1, r2 ;; The instruction is case insensitive
LDI: Load an immediate (constant) in a register
Instruction that loads an eight bit constant value in one of the general purpose registers R16 to R31.
Syntax: LDI Rd, K
where 16≤d≤31, 0≤K≤255
Operations: Rd ⟵ K
No bits in the status register are modified.
Example:
LDI r16, 34 ;; Load 34 in r16
LD: Load Indirect from Data Space to Register
Instruction that loads one byte into a register from the data space address contained in one of the 16 bit registers X, Y or Z. The pointer register can be left unchanged, post-incremented, or pre-decremented. The pointer register must hold the appropriate address before the instruction is executed.
Syntax:
LD Rd, X; LD Rd, Y; LD Rd, Z
LD Rd, X+; LD Rd, Y+; LD Rd, Z+
LD Rd, -X; LD Rd, -Y; LD Rd, -Z
where 0≤d≤31
Operations:
- Rd⟵(X); Rd⟵(Y); Rd⟵(Z)
- Rd⟵(X), X⟵X+1; Rd⟵(Y), Y⟵Y+1; Rd⟵(Z), Z⟵Z+1
- X⟵X−1, Rd⟵(X); Y⟵Y−1, Rd⟵(Y); Z⟵Z−1, Rd⟵(Z)
No bits in the status register are modified.
Example:
ld r23, x
ld r24, y+
ld r25, -z
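The difference between the three variants lies in when the pointer register changes relative to the memory access. The following Python sketch models that ordering: post-increment loads first and then adds one, pre-decrement subtracts one first and then loads. The memory contents and helper names are hypothetical, chosen only for illustration.

```python
def ld(mem, ptr):
    """LD Rd, X: load from the address in the pointer; pointer unchanged."""
    return mem[ptr], ptr


def ld_post_inc(mem, ptr):
    """LD Rd, X+: load first, then add one to the pointer."""
    return mem[ptr], (ptr + 1) & 0xFFFF


def ld_pre_dec(mem, ptr):
    """LD Rd, -X: subtract one from the pointer first, then load."""
    ptr = (ptr - 1) & 0xFFFF
    return mem[ptr], ptr


mem = [10, 20, 30, 40]        # hypothetical data space contents
x = 1
rd, x = ld_post_inc(mem, x)   # rd = 20, x = 2
rd, x = ld_post_inc(mem, x)   # rd = 30, x = 3
rd, x = ld_pre_dec(mem, x)    # x = 2 first, then rd = 30
print(rd, x)
```

Post-increment is convenient for walking forward through an array, while pre-decrement walks backward through it.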
LDS: Load Directly from the Data Space
Instruction that loads one byte from the data space directly from a given address. The address is typically given as a symbol (label) which has been previously defined in the data section of a program.
Syntax:
LDS Rd, k
where 0≤d≤31, 0≤k≤65535
Operations: Rd ⟵ (k)
No bits in the status register are modified.
Example:
lds r3, maximum ;; Load the byte in label maximum into r3
LDD: Load Indirect + Displacement from Data Space to Register
Instruction that loads one byte from a memory address in the data space into a register. The address is computed by adding the pointer register (Y or Z) and the displacement, expressed as a natural number q such that 0≤q≤63.
Syntax:
LDD Rd, Y+q; LDD Rd, Z+q
where 0≤d≤31, 0≤q≤63
Operations:
- Rd⟵(Y+q)
- Rd⟵(Z+q)
No bits in the status register are modified.
Example:
ldd r25, y+12 ;; Load r25 with the value pointed by y + 12
ldd r31, z+63 ;; Load r31 with the value pointed by z + 63
ST: Store byte
Instruction that stores the byte in the second operand in the address pointed to by the register X, Y, or Z. The pointer register can be left unchanged, post-incremented, or pre-decremented. The pointer register must hold the appropriate address before the instruction is executed.
Syntax:
ST X, Rr; ST Y, Rr; ST Z, Rr
ST X+, Rr; ST Y+, Rr; ST Z+, Rr
ST -X, Rr; ST -Y, Rr; ST -Z, Rr
where 0≤r≤31
Operations:
- (X)⟵Rr; (Y)⟵Rr; (Z)⟵Rr
- (X)⟵Rr, X⟵X+1; (Y)⟵Rr, Y⟵Y+1; (Z)⟵Rr, Z⟵Z+1
- X⟵X−1, (X)⟵Rr; Y⟵Y−1, (Y)⟵Rr; Z⟵Z−1, (Z)⟵Rr
No bits in the status register are modified.
Example:
st x, r25
st y+, r25
st -z, r25
STD: Store from a Register into Indirect + Displacement Data Space
Instruction that stores one byte from the given register in a memory address in the data space. The memory address is computed by adding the pointer register (Y or Z) and the displacement, expressed as a natural number q such that 0≤q≤63.
Syntax:
STD Y+q, Rr; STD Z+q, Rr
where 0≤r≤31, 0≤q≤63
Operations:
- (Y+q)⟵Rr
- (Z+q)⟵Rr
No bits in the status register are modified.
Example:
std y+12, r25 ;; Store r25 in the address pointed by y + 12
std z+63, r31 ;; Store r31 in the address pointed by z + 63
STS: Store a register directly in memory
Instruction that stores one byte from a register into the data memory in an address directly included in the instruction. The address is typically a symbol that has been previously defined.
Syntax:
STS k, Rr
where 0≤r≤31, 0≤k≤65535
Operations: (k) ⟵ Rr
No bits in the status register are modified.
Example:
STS maximum, R2 ;; Store R2 in label maximum in memory.
PUSH: Store the register value in the stack
Instruction that stores the value of the given register on the stack and updates the value of the stack pointer (SP).
Syntax: PUSH Rr
where 0≤r≤31
Operations: (SP) ⟵ Rr, SP ⟵ SP − 1 (the stack pointer is decremented after the store)
No bits in the status register are modified.
Example:
PUSH R14 ;; Store R14 at the top of the stack
POP: Load the register with the value at the top of the stack
Instruction that loads the value at the top of the stack into the given register. The stack pointer (SP) is updated first so that it points to that value.
Syntax: POP Rd
where 0≤d≤31
Operations: SP ⟵ SP + 1, Rd ⟵ (SP) (the stack pointer is incremented before the load)
No bits in the status register are modified.
Example:
POP R14 ;; Restore the value of R14 from the stack
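The interplay between the two instructions can be modelled with a short Python sketch. It assumes the behaviour documented in the AVR instruction set manual, where PUSH decrements SP after storing and POP increments SP before loading, so SP always points to the next free byte. The class name, RAM size and initial SP value are hypothetical choices for this example.

```python
class StackModel:
    """Tiny model of the AVR stack growing toward lower addresses."""

    def __init__(self, ram_size=256):
        self.mem = bytearray(ram_size)
        self.sp = ram_size - 1        # assume SP starts at the last RAM address

    def push(self, value):
        self.mem[self.sp] = value & 0xFF   # store at the address in SP
        self.sp -= 1                       # then decrement SP

    def pop(self):
        self.sp += 1                       # increment SP first
        return self.mem[self.sp]           # then load from that address


stack = StackModel()
stack.push(0x14)
stack.push(0x2A)
print(hex(stack.pop()), hex(stack.pop()))  # values come back in reverse order
```

Values are restored in the reverse order in which they were saved, which is why a routine that pushes R14 and then R15 must pop R15 before R14.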
7.4.2. Arithmetic Instructions
ADD: Add two registers
Instruction that adds two registers and places the result in the destination one.
Syntax: ADD Rd, Rr
where 0≤d≤31, 0≤r≤31
Operations: Rd ⟵ Rd + Rr
The bits H, S, V, N, Z and C of the status register are modified.
Example:
ADD R1, R2 ;; Add R1 and R2 and leave the result in R1
SUB: Subtract two registers
Instruction that subtracts the second register from the first and leaves the result in the first.
Syntax: SUB Rd, Rr
where 0≤d≤31, 0≤r≤31
Operations: Rd ⟵ Rd − Rr
The bits H, S, V, N, Z and C of the status register are modified.
Example:
sub r3, r4 ;; Store in R3, the result of R3 - R4
SUBI: Subtract an 8-bit constant from a register
Instruction that subtracts the second operand, an 8-bit constant, from the register and leaves the result in the register.
Syntax: SUBI Rd, K
where 16≤d≤31, 0≤K≤255
Operations: Rd ⟵ Rd − K
The bits H, S, V, N, Z and C of the status register are modified.
Example:
subi r23, 14 ;; Store in R23, the result of R23 - 14
INC: Increment
Instruction that adds one to the content of the given register and places the result in the same register.
Syntax: INC Rd
where 0≤d≤31
Operations: Rd ⟵ Rd + 1
The bits S, V, N and Z of the status register are updated; the C flag is not modified.
Example:
INC R26
DEC: Decrement
Instruction that subtracts one from the content of the given register and leaves the result in the same register.
Syntax: DEC Rd
where 0≤d≤31
Operations: Rd ⟵ Rd − 1
The bits S, V, N and Z of the status register are updated; the C flag is not modified.
Example:
DEC R5
NEG: Negate the value in a register
Instruction that changes the sign of the value in a register (two's complement).
Syntax: NEG Rd
where 0≤d≤31
Operations: Rd ⟵ 0 − Rd
The bits H, S, V, N, Z, and C are updated.
Example:
NEG R22 ;; Change the sign of the value in register R22.
MUL: Unsigned Multiplication
Instruction that performs the multiplication of two unsigned 8-bit values. The 16-bit unsigned product is always stored in registers R1 (the high byte) and R0 (the low byte).
Syntax: MUL Rd, Rr
where 0≤d≤31, 0≤r≤31
Operations: R1:R0 ⟵ Rd × Rr
The bits Z and C of the status register are updated.
Example:
MUL R5, R7 ;; The 16 bit product is stored in R1:R0
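The way the product is split across the two result registers can be sketched in Python. The model assumes, following the instruction set manual, that C receives bit 15 of the product and Z is set when the product is zero; mul is a made-up helper name.

```python
def mul(rd, rr):
    """Model of MUL Rd, Rr: return (R1, R0, C, Z) for unsigned 8-bit operands."""
    product = (rd & 0xFF) * (rr & 0xFF)   # at most 255*255, fits in 16 bits
    r1 = product >> 8                     # high byte goes to R1
    r0 = product & 0xFF                   # low byte goes to R0
    c = product >> 15                     # C is bit 15 of the product
    z = int(product == 0)                 # Z is set when the product is zero
    return r1, r0, c, z


print(mul(5, 7))      # small product: fits entirely in R0
print(mul(200, 200))  # 40000 = 0x9C40: needs both R1 and R0
```

Because R1 and R0 are always overwritten, any values kept there must be saved elsewhere before the multiplication.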
7.4.3. Logic Instructions
Logic instructions perform what is known as bit-wise logic operations. In other words, the operation is performed on all 8 bits of the operands by pairing the bits in the same position of both operands (when appropriate) and applying the basic operation to each pair.
AND: Logical Conjunction
Instruction that performs the logical AND between the content of the two given registers.
Syntax:
AND Rd, Rr
where 0≤d≤31, 0≤r≤31
Operations: Rd ⟵ Rd ⋅ Rr
The bits S, N and Z of the status register are updated, and the bit V is set to zero. The C flag is not modified.
Example:
AND R16, R7 ;; Compute the conjunction and leave result in R16
OR: Logical Disjunction
Instruction that performs the logical OR between the content of the two given registers.
Syntax:
OR Rd, Rr
where 0≤d≤31, 0≤r≤31
Operations: Rd ⟵ Rd ∨ Rr
The bits S, N and Z of the status register are updated, and the bit V is set to zero. The C flag is not modified.
Example:
OR R16, R7 ;; Compute the disjunction and leave result in R16
EOR: Exclusive OR
Instruction that performs the logical exclusive OR between the content of the two given registers.
Syntax:
EOR Rd, Rr
where 0≤d≤31, 0≤r≤31
Operations: Rd ⟵ Rd ⊕ Rr
The bits S, N and Z of the status register are updated, and the bit V is set to zero. The C flag is not modified.
Example:
EOR R16, R7 ;; Compute the exclusive or and leave result in R16
7.4.4. Shift and Rotate Instructions
Shift and rotate instructions are used to access the bits inside a byte independently. By shifting and rotating their positions, the bits are prepared to be processed by other instructions.
LSR/LSL: Logical Shift Right/Left
Instructions that shift all bits in the given register by one place to the right/left. The instruction LSR clears bit 7 of the register, and bit 0 is loaded into the C (carry) flag of the status register. This operation is equivalent to dividing the unsigned value in the register by two. The instruction LSL clears bit 0 of the register, and bit 7 is loaded into the C flag of the status register. This operation is the same as multiplying signed and unsigned values by two.
Syntax:
LSR Rd
LSL Rd
where 0≤d≤31
Operations:
- LSR: C ⟵ Rd0, Rd6..Rd0 ⟵ Rd7..Rd1, Rd7 ⟵ 0
- LSL: C ⟵ Rd7, Rd7..Rd1 ⟵ Rd6..Rd0, Rd0 ⟵ 0
The bits S, V, N, Z and C of the status register are updated; LSL also updates H.
Example:
LSL R12 ;; Shift R12 to the left (multiply by 2)
LSR R15 ;; Shift R15 to the right (divide by 2)
ROR/ROL: Rotate Right/Left through Carry
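Instructions that treat their operand as a circular register by concatenating it with the carry bit of the status register. ROR shifts all bits one place to the right: the C flag moves into bit 7 and bit 0 moves into the C flag. ROL shifts all bits one place to the left: the C flag moves into bit 0 and bit 7 moves into the C flag. ROL can be used to multiply multi-byte numbers by two.
Syntax:
ROR Rd
ROL Rd
where 0≤d≤31
Operations:
- ROR: Rd7 ⟵ C, Rd6..Rd0 ⟵ Rd7..Rd1, C ⟵ Rd0
- ROL: Rd0 ⟵ C, Rd7..Rd1 ⟵ Rd6..Rd0, C ⟵ Rd7
The bits S, V, N, Z and C of the status register are updated; ROL also updates H.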
Example:
ROL R12 ;; Rotate R12 to the left
ROR R15 ;; Rotate R15 to the right
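The multi-byte use of these instructions can be sketched as follows: to double a 16-bit value held in two registers, the low byte is shifted with LSL and the bit that falls out travels through the C flag into the high byte with ROL. This is a Python model of that sequence; the helper names are made up for this example.

```python
def lsl(reg):
    """LSL: return (new_reg, carry). Bit 7 goes to C and bit 0 becomes 0."""
    return (reg << 1) & 0xFF, (reg >> 7) & 1


def rol(reg, carry_in):
    """ROL: return (new_reg, carry). C enters bit 0 and bit 7 goes to C."""
    return ((reg << 1) & 0xFF) | carry_in, (reg >> 7) & 1


def double_16bit(high, low):
    """Multiply the 16-bit value high:low by two, as LSL low then ROL high."""
    low, carry = lsl(low)           # LSL on the low byte
    high, carry = rol(high, carry)  # ROL on the high byte picks up the carry
    return high, low, carry


high, low, carry = double_16bit(0x12, 0x80)
print(format(high, "02X"), format(low, "02X"), carry)  # 0x1280 * 2 = 0x2500
```

The same pattern extends to wider values by applying ROL to each further byte in turn.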
ASR: Arithmetic Shift Right
Shifts all bits in Rd one place to the right. Bit 7 is held constant, and bit 0 is loaded into the C flag in the status register. This instruction is equivalent to dividing a signed number by two while maintaining its sign.
Syntax:
ASR Rd
where 0≤d≤31
Operations: C ⟵ Rd0, Rd6..Rd0 ⟵ Rd7..Rd1, Rd7 unchanged
The bits S, V, N, Z and C of the status register are updated.
Example:
ASR R31 ;; Shift to the right (divide by 2 and preserve the sign)
7.4.5. Compare Instructions
The following two instructions are used when a comparison is needed between two numbers. This type of operation is very common in assembly programs. For example, when evaluating the condition of an iterative structure (a loop), the comparison is used to decide if a new iteration is required. Two comparison instructions are presented: one that compares two registers, and another that compares one register with a constant.
CP: Compare
Instruction that performs a comparison between two registers. The processor performs a subtraction between the two operands, but the result is not stored in a general purpose register; it only modifies the flags of the status register.
Syntax: CP Rd, Rr
where 0≤d≤31, 0≤r≤31
Operations: Rd − Rr
The bits H, S, V, N, Z and C of the status register are modified.
Example:
CP R1, R5 ;; R1 - R5 is calculated and reflected in the status register
CPI: Compare with Immediate
Instruction that performs a comparison between a register and an 8-bit constant. The result is not stored in a general purpose register; it only modifies the flags of the status register.
Syntax: CPI Rd, K
where 16≤d≤31, 0≤K≤255
Operations: Rd − K
The bits H, S, V, N, Z and C of the status register are modified.
Example:
CPI R16, 5 ;; Compute R16 - 5 and update the status register
7.4.6. Jump and Branch Instructions
The set of instructions to jump and branch is one of the most important in the instruction set architecture. They are used to change the sequence of instruction execution and are the building blocks to implement high level constructions such as conditionals and loops. The difference between the jump and branch instructions is that the first ones always jump to a new location in the code, whereas the second ones change the sequence only if a specific condition is met. The branch instructions are also known as conditional branch or conditional jump instructions. Processors typically include a large set of these instructions to cover as many conditions as possible.
JMP: Jump
Instruction that jumps to an address k in the program memory.
Syntax: JMP k
where k is an address in the program memory.
Operations:
PC ⟵ k
No bits in the status register are modified.
Example:
JMP dest ;; Jump to location with label dest
....
....
dest: ADD R1, R2 ;; Destination of the jump
BREQ/BRNE: Branch if Equal/Not Equal
Instructions that jump to the specified location in the code depending on whether the previous comparison or operation produced a zero result. The instructions check the value of the Z flag in the status register: BREQ branches when Z is one (the result was zero, so the compared operands were equal), and BRNE branches when Z is zero. The destination of the jump must be at a short distance from this instruction, because the offset k is limited to the range −64 to 63.
Syntax:
BREQ k
BRNE k
where −64 ≤ k ≤ 63.
Operations:
- BREQ: if (Z = 1) then PC ⟵ PC + k + 1, else PC ⟵ PC + 1
- BRNE: if (Z = 0) then PC ⟵ PC + k + 1, else PC ⟵ PC + 1
No bits in the status register are modified.
Example:
CPI R27, 5 ;; Compare R27 with number 5
BRNE done ;; Branch if R27 is different from 5
...
...
done: ADD ... ;; Destination of the branch
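The relation between the offset k and the destination of a branch can be checked with a short calculation. The following Python sketch is only a model of the rule PC ⟵ PC + k + 1; the function names offset_for and next_pc are hypothetical, and addresses are assumed to count instruction words.

```python
def offset_for(branch_address: int, dest_address: int) -> int:
    """Offset k that makes a branch located at branch_address
    land on dest_address, following PC <- PC + k + 1."""
    k = dest_address - (branch_address + 1)
    if not -64 <= k <= 63:
        raise ValueError("destination too far for a conditional branch")
    return k


def next_pc(pc: int, k: int, condition_met: bool) -> int:
    """Value of the program counter after executing a branch at pc."""
    return pc + k + 1 if condition_met else pc + 1


if __name__ == "__main__":
    k = offset_for(0x20, 0x25)      # branch at 0x20, label at 0x25
    print(k)                        # 4
    print(next_pc(0x20, k, True))   # 37, that is 0x25
    print(next_pc(0x20, k, False))  # 33, that is 0x21
```

An offset of −1 makes the branch jump to itself, which is why the offset is measured from the instruction that follows the branch.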
BRSH/BRLO: Branch if same or higher/lower (Unsigned)
Instructions that branch to a specific location if the previous operation has set the carry flag (C) to zero (BRSH) or to one (BRLO). If preceded by a comparison operation, these instructions branch if the first operand of the comparison satisfies the condition specified in the branch with respect to the second operand, both interpreted as unsigned integers. For example, in the sequence:
CPI R23, 15
BRSH dest ;; Branch if R23 is the same as or higher than 15
the program will branch if R23 (as an unsigned integer) is the same as or higher than 15.
Syntax:
BRSH k
BRLO k
where −64 ≤ k ≤ 63.
Operations:
- BRSH: if (C = 0) then PC ⟵ PC + k + 1, else PC ⟵ PC + 1
- BRLO: if (C = 1) then PC ⟵ PC + k + 1, else PC ⟵ PC + 1
No bits in the status register are modified.
Example:
CPI R23, 15
BRLO dest ;; Branch if R23 is lower than 15 as unsigned
...
...
...
dest: ADD ...
BRGE/BRLT: Branch if greater or equal/lower (Signed)
Instructions that branch to a specific location if the previous operation has set the sign flag (S) to zero (BRGE) or to one (BRLT). If preceded by a comparison operation, these instructions branch if the first operand of the comparison satisfies the condition specified in the branch with respect to the second operand, both interpreted as signed integers. For example, in the sequence:
CPI R23, -15
BRGE dest ;; Branch if R23 is greater than or equal to -15
the program will branch if R23 (as a signed integer) is greater than or equal to -15.
Syntax:
BRGE k
BRLT k
where −64 ≤ k ≤ 63.
Operations:
- BRGE: if (S = 0) then PC ⟵ PC + k + 1, else PC ⟵ PC + 1
- BRLT: if (S = 1) then PC ⟵ PC + k + 1, else PC ⟵ PC + 1
No bits in the status register are modified.
Example:
CPI R23, -15
BRLT dest ;; Branch if R23 is less than -15 as signed int.
...
...
...
dest: ADD ...
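The same bit pattern can give different branch decisions depending on whether the unsigned (BRSH/BRLO) or the signed (BRGE/BRLT) instructions are used. The Python sketch below models this; the helper names carry_and_sign and taken are hypothetical, and the S flag is assumed to be computed as N exclusive-or V after an 8-bit subtraction.

```python
def carry_and_sign(rd: int, rr: int) -> tuple:
    """Return the flags (C, S) left by comparing Rd with Rr on 8 bits."""
    rd &= 0xFF
    rr &= 0xFF
    result = (rd - rr) & 0xFF
    c = int(rr > rd)                                  # unsigned borrow
    n = (result >> 7) & 1                             # sign bit of the result
    v = int(((rd ^ rr) & (rd ^ result) & 0x80) != 0)  # signed overflow
    return c, n ^ v


def taken(branch: str, rd: int, rr: int) -> bool:
    """True if the given branch is taken after comparing Rd with Rr."""
    c, s = carry_and_sign(rd, rr)
    return {
        "BRSH": c == 0,
        "BRLO": c == 1,
        "BRGE": s == 0,
        "BRLT": s == 1,
    }[branch]


if __name__ == "__main__":
    r23 = 0xF0  # 240 as unsigned, -16 as signed
    print(taken("BRSH", r23, 15))   # True: 240 is higher than 15
    print(taken("BRGE", r23, 15))   # False: -16 is less than 15
    print(taken("BRLT", r23, -15))  # True: -16 is less than -15
```

Choosing the wrong family of branches is a common source of errors, since the comparison instruction itself is identical in both cases.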
7.4.7. Compare and Branch Sequences
As mentioned before, the sequence of two instructions in which the first one compares two operands and the second is a conditional branch instruction is executed very frequently by the processor. These two-instruction sequences are the building blocks used to implement the change in execution flow in high level constructions such as if-then-else (conditionals) or loops.
It is very important to understand how conditions are calculated, reduced to a comparison, and the result used to decide whether a branch is taken. When writing assembly code, there is a simple rule to remember how these instructions are interpreted. Assume that two operands A and B are compared using one of the compare and test instructions followed by a conditional branch such that:
CPI A, B
BR[CONDITION] destination
the code will branch to destination if the condition A CONDITION B is satisfied. For example, if the branch instruction is BRGE, then the processor will branch if A is greater than or equal to B.
For example, the high level programming fragment:
if (x < 12) {
[Code]
}
can be implemented in assembly code using the compare and branch instructions as follows (assuming that the value x is an integer and has been previously loaded in register R24). The branch tests the opposite condition, x ≥ 12, because its purpose is to skip the block when the condition of the if is false:
CPI R24, 12
BRGE done ;; Skip the block if R24 is greater than or equal to 12
...
... ;; Instructions implementing the [Code] block
...
done: ... ;; Instructions following the if-then construction
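The translation of an if construction follows a mechanical pattern: compare, then branch around the block using the negated condition. The Python sketch below generates the two instructions for the branches covered in this chapter; the table SKIP_BRANCH and the function compile_if are hypothetical names introduced only for illustration, and only the conditions <, >=, == and != are handled.

```python
# Branch used to SKIP the block: the negation of the high level condition.
SKIP_BRANCH = {
    ("<", "signed"): "BRGE",
    (">=", "signed"): "BRLT",
    ("<", "unsigned"): "BRSH",
    (">=", "unsigned"): "BRLO",
    ("==", None): "BRNE",
    ("!=", None): "BREQ",
}


def compile_if(reg, op, const, kind="signed", label="done"):
    """Return the assembly lines for: if (reg op const) { [Code] }"""
    # Equality tests do not depend on the signed/unsigned interpretation.
    key = (op, None) if op in ("==", "!=") else (op, kind)
    branch = SKIP_BRANCH[key]
    return [
        f"CPI {reg}, {const}",
        f"{branch} {label}",
        "... ;; [Code] block",
        f"{label}: ...",
    ]


if __name__ == "__main__":
    for line in compile_if("R24", "<", 12):
        print(line)
```

With the arguments shown, the first two lines produced are CPI R24, 12 and BRGE done, which is the sequence used for the fragment if (x < 12).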
7.4.8. Subroutine Call and Return
The instructions to call and return from a subroutine are fundamental to implement function calls. Both instructions use the stack to store the address from which the call is made, which is where the subroutine must return upon termination. The implicit assumption is that function calls are perfectly nested, that is, a subroutine may call another subroutine, which upon termination returns to the point where the first one was executing.
The reason to use the stack to store the return address derives from the existence of recursive functions. A program may have a single routine, method, or function, but during execution the function may call itself, which requires keeping an arbitrary number of return addresses (as many as function invocations are currently executing). A fixed location to store the return address would not allow functions to call themselves. The stack solves this problem by storing as many return addresses as needed and retrieving them in the reverse order in which they were stored.
CALL: Call a subroutine
Instruction that calls a subroutine located at any other position of the program memory. The return address (the address of the instruction following the call) is stored in the stack (STACK). The stack pointer (SP) is decremented to point to the return address.
Syntax: CALL k
where k is the address of the subroutine in the program memory.
Operations:
PC ⟵ k, SP ⟵ SP − 2, STACK ⟵ PC + 2
No bits in the status register are modified.
Example:
CALL funct ;; Call the function
...
...
funct: ... ;; First instruction of the function
RET: Return from subroutine
Instruction that returns to the instruction following the most recently executed CALL. The instruction has no operands, as they are implicit. The return address is assumed to be stored at the top of the stack. The instruction removes this value from the stack (STACK) and adjusts the stack pointer (SP).
Syntax: RET
Operations:
PC ⟵ STACK, SP ⟵ SP + 2
No bits in the status register are modified.
Example:
CALL funct ;; Call the function
... ;; Instruction to which RET returns
...
funct: ... ;; First instruction of the function
...
RET ;; Return to the point after funct was invoked
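The way the stack keeps one return address per active invocation can be seen in a small simulation. The Python sketch below follows the operations given for CALL and RET; the class Machine, the initial stack pointer 0x08FF and the addresses used are arbitrary assumptions, and each return address is stored as a single entry rather than as two separate bytes.

```python
class Machine:
    """Minimal model of PC, SP and the stack for CALL and RET."""

    def __init__(self, sp=0x08FF):
        self.pc = 0
        self.sp = sp
        self.stack = {}  # stack address -> stored return address

    def call(self, k):
        return_address = self.pc + 2   # CALL occupies two words
        self.sp -= 2                   # SP <- SP - 2
        self.stack[self.sp] = return_address
        self.pc = k                    # PC <- k

    def ret(self):
        self.pc = self.stack.pop(self.sp)  # PC <- STACK
        self.sp += 2                       # SP <- SP + 2


if __name__ == "__main__":
    m = Machine()
    m.pc = 0x10
    m.call(0x100)      # first invocation, return address 0x12
    m.pc = 0x104       # the subroutine reaches a CALL to itself
    m.call(0x100)      # recursive invocation, return address 0x106
    m.ret()
    print(hex(m.pc))   # 0x106, back inside the first invocation
    m.ret()
    print(hex(m.pc))   # 0x12, back in the original caller
```

Both invocations of the same subroutine return to the correct place because each CALL stores its own return address and each RET removes the most recent one.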
7.4.9. Input and Output Instructions
The input and output instructions are similar to the load and store instructions, with the difference that data is transferred from and to the input/output ports. These ports are identified by their address.
IN: Load (read) data from I/O space into a register
Instruction that reads the value of the input port at the location given as second parameter, and stores the value in the register given as first parameter.
Syntax: IN Rd, A
where A is the address of an input port.
Operations:
Rd ⟵ I/O[A]
No bits in the status register are modified.
Example:
in r12, 0x3E ;; Load the data in port 0x3E in r12
OUT: Store (write) data from a register in I/O space
Instruction that writes the data in the register given as second parameter to the output location given as first parameter.
Syntax: OUT A, Rd
where A is the address of an output port.
Operations:
I/O[A] ⟵ Rd
No bits in the status register are modified.
Example:
out 0x3D, r13 ;; Write the data in r13 to port 0x3D
7.5. Summary of Instructions
The subset of instructions considered in this document is summarized in the following tables. You may download a PDF with these tables for your convenience.
Mnemonic | Operands | Description | Operation | Notes |
---|---|---|---|---|
MOV | Rd, Rr | Move Rr to Rd | Rd ← Rr | |
LDI | Rd, K | Load K in Rd | Rd ← K | Rd must be from R16 to R31, 0 ⩽ K ⩽ 255 |
LD | Rd, X | Load (X) in Rd | Rd ← (X) | All three registers X, Y and Z can be used (applies to the three forms of LD) |
LD | Rd, X+ | Load (X) in Rd, X is incremented | Rd ← (X), X ← X + 1 | |
LD | Rd, -X | X is decremented, load (X) in Rd | X ← X - 1, Rd ← (X) | |
LDD | Rd, Y+q | Rd is loaded with (Y + q) | Rd ← (Y + q) | Register Z can also be used. Register X cannot be used. q has to be between 0 and 63 |
LDS | Rd, k | Load Rd with data in position k in memory | Rd ← (k) | k must satisfy 0 ⩽ k ⩽ 65535 |
ST | X, Rr | Store Rr in (X) | (X) ← Rr | All three registers X, Y and Z can be used (applies to the three forms of ST) |
ST | X+, Rr | Store Rr in (X), X is incremented | (X) ← Rr, X ← X + 1 | |
ST | -X, Rr | X is decremented, store Rr in (X) | X ← X - 1, (X) ← Rr | |
STD | Y+q, Rr | (Y + q) stores Rr | (Y + q) ← Rr | Register Z can also be used. Register X cannot be used. q has to be between 0 and 63 |
STS | k, Rr | Store Rr in position k in memory | (k) ← Rr | k must satisfy 0 ⩽ k ⩽ 65535 |
PUSH | Rr | Push Rr to stack | SP ← SP - 1, (SP) ← Rr | |
POP | Rd | Pop Rd from stack | Rd ← (SP), SP ← SP + 1 | |

Mnemonic | Operands | Description | Operation | Notes |
---|---|---|---|---|
ADD | Rd, Rr | Add Rr to Rd | Rd ← Rd + Rr | |
SUB | Rd, Rr | Subtract Rr from Rd | Rd ← Rd - Rr | |
SUBI | Rd, K | Subtract K from Rd | Rd ← Rd - K | K must be between 0 and 255 |
INC | Rd | Increment Rd | Rd ← Rd + 1 | |
DEC | Rd | Decrement Rd | Rd ← Rd - 1 | |
NEG | Rd | Change sign of Rd | Rd ← 0 - Rd | |
MUL | Rd, Rr | Unsigned multiply | R1:R0 ← Rd * Rr | |

Mnemonic | Operands | Description | Operation |
---|---|---|---|
AND | Rd, Rr | Bitwise conjunction (AND) of Rd and Rr | Rd ← Rd ∧ Rr |
OR | Rd, Rr | Bitwise disjunction (OR) of Rd and Rr | Rd ← Rd ∨ Rr |
EOR | Rd, Rr | Bitwise exclusive OR of Rd and Rr | Rd ← Rd ⊕ Rr |

Mnemonic | Operands | Description | Operation |
---|---|---|---|
LSL/LSR | Rd | Shift left/right | Shift into C flag |
ROL/ROR | Rd | Rotate left/right | Rotate with C flag |
ASR | Rd | Arithmetic shift right | Sign extended shift |

Mnemonic | Operands | Description | Operation | Notes |
---|---|---|---|---|
CP | Rd, Rr | Subtract and update flags | Rd - Rr | |
CPI | Rd, K | Subtract K and update flags | Rd - K | K must be between 0 and 255 |

Mnemonic | Operands | Description | Operation | Notes |
---|---|---|---|---|
JMP | k | Go to instruction in position k | PC ← k | |
BREQ/BRNE | k | Branch if equal/not equal to the instruction at offset k | If (Z = 1/0) then PC ← PC + k + 1 otherwise PC ← PC + 1 | k must satisfy -64 ⩽ k ⩽ 63 |
BRSH/BRLO | k | Branch if same or higher/lower (unsigned) to the instruction at offset k | If (C = 0/1) then PC ← PC + k + 1 otherwise PC ← PC + 1 | k must satisfy -64 ⩽ k ⩽ 63 |
BRGE/BRLT | k | Branch if greater or equal/lower (signed) to the instruction at offset k | If (S = 0/1) then PC ← PC + k + 1 otherwise PC ← PC + 1 | k must satisfy -64 ⩽ k ⩽ 63 |

Mnemonic | Operands | Description | Operation |
---|---|---|---|
CALL | k | Call subroutine in position k | PC ← k, SP ← SP - 2, (SP) ← PC + 2 |
RET | | Return from subroutine | PC ← (SP), SP ← SP + 2 |

Mnemonic | Operands | Description | Operation |
---|---|---|---|
IN | Rd, A | Load I/O data from address A to register Rd | Rd ← I/O[A] |
OUT | A, Rd | Store value in register Rd in I/O address A | I/O[A] ← Rd |