5.1 Elements of Assembly Language

MikroElektronika

Assembly language is basically like any other language, which means that it has its words, rules and syntax. The basic elements of assembly language are:

  • Labels;
  • Instructions;
  • Directives; and
  • Comments.


Syntax of Assembly language

When writing a program in assembly language, it is necessary to observe specific rules so that the program can be compiled into executable “HEX-code” without errors. These compulsory rules are called syntax and there are only a few of them:

  • Every program line may consist of a maximum of 255 characters;
  • Every program line to be compiled must start with a symbol, label, mnemonic or directive;
  • Text following the mark “;” in a program line represents a comment ignored (not compiled) by the assembler; and
  • All the elements of one program line (labels, instructions etc.) must be separated by at least one space character. For better readability, the TAB key is commonly used instead, so that columns with labels, instructions, directives etc. are easy to tell apart in a program.
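The rules above can be illustrated by a short program laid out in columns. This is only a sketch, assuming a generic Intel-style 8051 assembler; the label names START and LOOP are chosen for illustration.

```asm
; label   mnemonic  operands     comment
START:    MOV       A,#25        ; load the accumulator with 25
          ADD       A,#10        ; add 10, the accumulator now holds 35
LOOP:     SJMP      LOOP         ; stay here forever
          END                    ; directive marking the end of the program
```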

Numbers

Disregarding the octal number system, which is considered obsolete, assembly language allows numbers to be written in one of three number systems:

Decimal Numbers

If not stated otherwise, the assembly language considers all numbers to be decimal. All ten digits are used (0, 1, 2, 3, 4, 5, 6, 7, 8, 9). Since at most 2 bytes are used for saving them in the microcontroller, the largest decimal number that can be written in assembly language is 65535. If it is necessary to state explicitly that a number is in decimal format, it is followed by the letter “D”. For example 1234D.

Hexadecimal Numbers

Hexadecimal numbers are commonly used in programming. There are 16 digits in the hexadecimal number system (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F). The largest hexadecimal number that can be written in assembly language is FFFF. It corresponds to the decimal number 65535. In order to distinguish hexadecimal numbers from decimal ones, they are followed by the letter “h” (either in upper- or lowercase). For example 54h.

Binary Numbers

Binary numbers are often used when the value of each individual bit of a register is important, since each binary digit represents one bit. There are only two digits in use (0 and 1). The largest binary number that can be written in assembly language is 1111111111111111. In order to distinguish binary numbers from other numbers, they are followed by the letter “b” (either in upper- or lowercase). For example 01100101B.
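As an illustration, the same value (101 decimal) can be written in all three number systems. The sketch assumes a generic Intel-style 8051 assembler; the last line relies on the common convention, also visible in constants such as 0AADDH, that a hexadecimal number beginning with a letter is written with a leading zero so that it is not mistaken for a symbol.

```asm
        MOV A,#101          ; decimal, no suffix needed
        MOV A,#101D         ; decimal, explicit suffix
        MOV A,#65h          ; hexadecimal, same value
        MOV A,#01100101B    ; binary, same value
        MOV A,#0FFh         ; hexadecimal number starting with a letter, leading 0 added
```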

Operators

Some assembly instructions use logical and mathematical expressions instead of symbols having specific values. For example:
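A minimal sketch of such expressions is given here, assuming a generic Intel-style 8051 assembler. The symbols BUFFER_START and BUFFER_SIZE are hypothetical names introduced only for this illustration.

```asm
BUFFER_START EQU 30h                        ; hypothetical start address of a buffer
BUFFER_SIZE  EQU 16                         ; hypothetical buffer length in bytes

        MOV R0,#BUFFER_START+BUFFER_SIZE-1  ; assembler computes 30h+16-1 = 3Fh
        MOV A,#BUFFER_SIZE*2                ; assembler computes 32 (20h)
        MOV A,#HIGH(0AADDh)                 ; upper byte of the constant, 0AAh
        MOV A,#LOW(0AADDh)                  ; lower byte of the constant, 0DDh
```

All of these values are computed during assembly, so the microcontroller only ever sees the resulting constants.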

The assembler is capable of computing such values during assembly and including the results in the program code. The following mathematical and logical operations can be used:

NAME      OPERATION                                 EXAMPLE                RESULT
+         Addition                                  10+5                   15
-         Subtraction                               25-17                  8
*         Multiplication                            7*4                    28
/         Integer division (remainder discarded)    7/4                    1
MOD       Remainder of division                     7 MOD 4                3
SHR       Shift bits to the right                   1000B SHR 2            0010B
SHL       Shift bits to the left                    1010B SHL 2            101000B
NOT       Negation (one's complement)               NOT 1                  1111111111111110B
AND       Logical AND                               1101B AND 0101B        0101B
OR        Logical OR                                1101B OR 0101B         1101B
XOR       Exclusive OR                              1101B XOR 0101B        1000B
LOW       8 least significant bits                  LOW(0AADDH)            0DDH
HIGH      8 most significant bits                   HIGH(0AADDH)           0AAH
EQ, =     Equal                                     7 EQ 4 or 7=4          0 (false)
NE, <>    Not equal                                 7 NE 4 or 7<>4         0FFFFH (true)
GT, >     Greater than                              7 GT 4 or 7>4          0FFFFH (true)
GE, >=    Greater or equal                          7 GE 4 or 7>=4         0FFFFH (true)
LT, <     Less than                                 7 LT 4 or 7<4          0 (false)
LE, <=    Less or equal                             7 LE 4 or 7<=4         0 (false)

Symbols

Every register, constant, address or subroutine can be assigned a specific symbol in assembly language, which considerably facilitates the process of writing a program. For example, if the P0.3 input pin is connected to a push button used to stop some process manually (push button STOP), the process of writing a program will be much simpler if the P0.3 bit is assigned the same name as the push button, i.e. “pushbutton_STOP”. Of course, like in any other language, there are specific rules to be observed as well:

  • Symbols may be written using all letters of the alphabet (A-Z, a-z), decimal digits (0-9) and two special characters (“?” and “_”). Assembly language is not case sensitive.

For example, the following symbols will be considered identical:
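A short sketch of this rule follows, assuming a generic Intel-style 8051 assembler. The symbol name and the address 30h are chosen only for illustration.

```asm
Serial_Port_Buffer DATA 30h           ; symbol defined in mixed case

        MOV A,SERIAL_PORT_BUFFER      ; refers to the same symbol
        MOV A,serial_port_buffer      ; so does this line
```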

  • In order to distinguish symbols from constants (numbers), every symbol starts with a letter or one of two special characters (? or _).
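  • A symbol may consist of a maximum of 255 characters, but only the first 32 are taken into account. For example, the symbols START_ADDRESS_OF_TABLE_AND_CONSTANTS_1 and START_ADDRESS_OF_TABLE_AND_CONSTANTS_2 will be considered duplicates (error) because their first 32 characters are identical, while the symbols TABLE_OF_CONSTANTS_1_START_ADDRESS and TABLE_OF_CONSTANTS_2_START_ADDRESS will be considered different.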

  • Some symbols cannot be used when writing a program in assembly language because they are already part of instructions or assembly directives. Thus, for example, a register or subroutine cannot be assigned the name “A” or “DPTR” because there are registers having the same name.

Here is a list of symbols not allowed to be used during programming in assembly language:

A AB ACALL ADD
ADDC AJMP AND ANL
AR0 AR1 AR2 AR3
AR4 AR5 AR6 AR7
BIT BSEG C CALL
CJNE CLR CODE CPL
CSEG DA DATA DB
DBIT DEC DIV DJNZ
DPTR DS DSEG DW
END EQ EQU GE
GT HIGH IDATA INC
ISEG JB JBC JC
JMP JNB JNC JNZ
JZ LCALL LE LJMP
LOW LT MOD MOV
MOVC MOVX MUL NE
NOP NOT OR ORG
ORL PC POP PUSH
R0 R1 R2 R3
R4 R5 R6 R7
RET RETI RL RLC
RR RRC SET SETB
SHL SHR SJMP SUBB
SWAP USING XCH XCHD
XDATA XOR XRL XSEG

Labels

A label is a special type of symbol used to represent an address in ROM or RAM memory in textual form. Labels are always placed at the beginning of a program line. Without them it is very complicated to call a subroutine or execute jump or branch instructions. They are easy to use:

  • A symbol (label) with an easily recognizable name should be written at the beginning of the program line where a subroutine starts or to which a jump should be executed.
  • In instructions calling a subroutine or executing a jump, it is sufficient to enter the name of the label instead of the address in the form of a 16-bit number.

During the process of compiling, the assembler automatically replaces such symbols with appropriate addresses.
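The use of labels can be sketched as follows, assuming a generic Intel-style 8051 assembler. The names MAIN, DELAY and WAIT are illustrative.

```asm
        ORG 0
MAIN:   LCALL DELAY       ; the assembler substitutes the address of DELAY
        SJMP MAIN         ; jump back to the line labelled MAIN

DELAY:  MOV R7,#200       ; the subroutine starts at this label
WAIT:   DJNZ R7,WAIT      ; decrement R7 and loop until it reaches zero
        RET
        END
```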

Directives

Unlike instructions, which are compiled and written to the chip program memory, directives are commands of the assembly language itself and have no influence on the operation of the microcontroller. Some of them are an obligatory part of every program, while others are used only to facilitate or speed up programming.
Directives are written in the column reserved for instructions. Only one directive is allowed per program line.

EQU directive

The EQU directive is used to replace a number by a symbol. For example:
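A minimal sketch, assuming a generic Intel-style 8051 assembler:

```asm
MAXIMUM EQU 99            ; the symbol MAXIMUM stands for the number 99

        MOV A,#MAXIMUM    ; assembled exactly as MOV A,#99
```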

After using this directive, every appearance of the symbol “MAXIMUM” in the program will be interpreted by the assembler as the number 99 (MAXIMUM = 99). A symbol may be defined this way only once in the program. For this reason, the EQU directive is mostly used at the beginning of the program.

SET directive

The SET directive is also used to replace a number by a symbol. The significant difference compared to the EQU directive is that the SET directive can be used an unlimited number of times:
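A minimal sketch, assuming a generic Intel-style 8051 assembler. The symbol SPEED and its values are illustrative.

```asm
SPEED SET 45              ; first definition
SPEED SET 46              ; redefinition is allowed with SET
SPEED SET 57              ; the most recent value is the one in effect

        MOV A,#SPEED      ; at this point assembled as MOV A,#57
```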

BIT directive

The BIT directive is used to replace a bit address by a symbol. The bit address must be in the range of 0 to 255. For example:
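A sketch under the assumption of a generic Intel-style 8051 assembler follows. Numeric bit addresses are used so that no SFR definition file is needed; the symbol names are illustrative.

```asm
PUSHBUTTON_STOP BIT 83h       ; bit P0.3 (port P0 is at 80h, so bit 3 is 83h)
CARRY_FLAG      BIT 0D7h      ; bit 7 of the PSW register (PSW is at 0D0h)
USER_FLAG       BIT 0         ; bit address 0, i.e. bit 0 of RAM byte 20h

        SETB USER_FLAG        ; set the user flag
        CLR  CARRY_FLAG       ; clear the carry bit
        JNB  PUSHBUTTON_STOP,$ ; wait here until the pin P0.3 reads 1
```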

CODE directive

The CODE directive is used to assign a symbol to a program memory address. Since the maximum capacity of program memory is 64K, the address must be in the range of 0 to 65535. For example:
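A minimal sketch, assuming a generic Intel-style 8051 assembler. The symbol names and addresses are illustrative.

```asm
RESET_VECTOR CODE 0           ; program memory address 0 is named RESET_VECTOR
TABLE        CODE 1024        ; program memory address 1024 is named TABLE

        MOV DPTR,#TABLE       ; DPTR now points to address 1024 in program memory
```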

DATA directive

The DATA directive is used to assign a symbol to an address within internal RAM. The address must be in the range of 0 to 255. It is possible to change or assign a new name to any register. For example:
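A minimal sketch, assuming a generic Intel-style 8051 assembler. The symbol names are illustrative.

```asm
TEMP23   DATA 23              ; the RAM byte at address 23 is named TEMP23
STATUS_R DATA 0D0h            ; the PSW register (address 0D0h) gets a second name

        MOV TEMP23,#0         ; clear the byte at address 23
        MOV A,STATUS_R        ; copy the PSW register to the accumulator
```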

IDATA directive

The IDATA directive is used to change or assign a new name to an indirectly addressed register. For example:
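A sketch under two assumptions: a generic Intel-style 8051 assembler, and a microcontroller with 256 bytes of internal RAM, so that address 0F0h exists and can be reached only by indirect addressing. The symbol name BUFFER_TOP is illustrative.

```asm
BUFFER_TOP IDATA 0F0h         ; indirectly addressed RAM location 0F0h

        MOV R0,#BUFFER_TOP    ; R0 holds the address
        MOV A,@R0             ; read the location through R0 (indirect addressing)
```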

XDATA directive

The XDATA directive is used to assign a name to registers within external (additional) RAM memory. The addresses of these registers cannot be larger than 65535. For example:
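A minimal sketch, assuming a generic Intel-style 8051 assembler and a system with external RAM fitted. The symbol name and address are illustrative.

```asm
TABLE_1 XDATA 2048            ; external RAM address 2048 is named TABLE_1

        MOV DPTR,#TABLE_1     ; point DPTR at the external RAM location
        MOVX A,@DPTR          ; read one byte from external RAM
```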

ORG directive

The ORG directive is used to specify the location in program memory where the code following the directive is to be placed. For example:
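A sketch of a program laid out with two ORG directives, assuming a generic Intel-style 8051 assembler. The labels and table contents are illustrative.

```asm
        ORG 100               ; the program is placed from location 100
START:  MOV DPTR,#TABLE       ; point DPTR at the table
        CLR A
        MOVC A,@A+DPTR        ; read the first table entry (11h)
STOP:   SJMP STOP

        ORG 1024              ; the table is placed from location 1024 (400h)
TABLE:  DB 11h,22h,33h
        END
```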

This program starts at location 100. The table containing data is to be stored at location 1024 (400h).

USING directive

The USING directive is used to define which register bank (registers R0-R7) is to be used in the program.
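A sketch of how this is typically used, assuming an Intel-style 8051 assembler in which USING only informs the assembler of the active bank (so that the symbols AR0-AR7 resolve to the right addresses) and the bank itself is still selected at run time through the PSW bits. Numeric bit addresses are used so that no SFR definition file is needed.

```asm
        USING 1               ; the assembler assumes bank 1 (R0-R7 at 08h-0Fh)
        SETB 0D3h             ; RS0 = 1 (bit 3 of PSW)
        CLR  0D4h             ; RS1 = 0 (bit 4 of PSW), bank 1 is now selected
        PUSH AR2              ; assembled as PUSH 0Ah, the address of R2 in bank 1
```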

END directive

The END directive is used at the end of every program. The assembler will stop compiling once it encounters this directive. For example:
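A minimal sketch, assuming a generic Intel-style 8051 assembler:

```asm
        ORG 0
HERE:   SJMP HERE             ; last instruction of the program
        END                   ; the assembler stops compiling here
```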

Directives used for selecting memory segments

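There are five directives used for selecting one of the five memory segments in the microcontroller:

  • CSEG selects program memory;
  • BSEG selects the bit-addressable part of RAM;
  • DSEG selects the part of internal RAM accessed by direct addressing;
  • ISEG selects the part of internal RAM accessed by indirect addressing; and
  • XSEG selects external RAM memory.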

The CSEG segment is active by default when the assembler starts and remains active until another segment directive is specified. Each of these memory segments has its own internal address counter which is cleared every time the assembler is started. Its value can be changed by specifying a value after the keyword AT. The value can be a number, an arithmetical expression or a symbol. For example:
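A sketch of segment selection, assuming a generic Intel-style 8051 assembler. The symbol EXT_BASE and all addresses are illustrative.

```asm
EXT_BASE EQU 2000h            ; hypothetical base address in external RAM

        DSEG                  ; directly addressed RAM, counter keeps its current value
        BSEG AT 32            ; bit-addressable RAM, counter set to 32 (a number)
        XSEG AT EXT_BASE+10h  ; external RAM, counter set by an expression (2010h)
        CSEG AT 100h          ; back to program memory, counter set to 100h
```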

The dollar symbol “$” denotes the current value of the address counter in the currently active segment. The following two examples illustrate how this value can be used in practice:

Example 1:
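A sketch of a wait loop, assuming a generic Intel-style 8051 assembler. The bit symbol FLAG is hypothetical; since “$” stands for the address of the current instruction, the jump leads back to the same instruction.

```asm
FLAG    BIT 0                 ; hypothetical flag at bit address 0

        JNB FLAG,$            ; repeat this instruction until FLAG becomes 1
```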

Example 2:
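A sketch under the assumption of a generic Intel-style 8051 assembler. The symbol MSG_LENGTH is a name introduced for this illustration; it must be defined immediately after the message so that “$” points just past the last character.

```asm
MESSAGE:    DB 'ALARM turn off engine'   ; 21 characters stored in program memory
MSG_LENGTH  EQU $-MESSAGE                ; current address minus start address = 21
```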

These two program lines can be used to compute the exact number of characters in the message “ALARM turn off engine”, which is defined at the address assigned the name “MESSAGE”.

DS directive

The DS directive is used to reserve memory space expressed in bytes. It is used when one of the segments ISEG, DSEG or XSEG is currently active. For example:

Example 1:
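A sketch of reserving two buffers in directly addressed RAM, assuming a generic Intel-style 8051 assembler. The buffer names and the start address 30h are illustrative.

```asm
        DSEG AT 30h           ; directly addressed RAM, counter set to 30h
SP_BUFF: DS 16                ; 16 bytes reserved, addresses 30h-3Fh
IO_BUFF: DS 8                 ; 8 bytes reserved, addresses 40h-47h
```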

Example 2:
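A sketch showing how DS advances the address counter, assuming a generic Intel-style 8051 assembler. The label LAB and the addresses are illustrative.

```asm
        XSEG AT 100           ; external RAM, counter set to 100
        DS 8                  ; 8 bytes skipped, counter is now 108
LAB:    DS 1                  ; LAB represents address 108
```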

DBIT directive

The DBIT directive is used to reserve space within bit-addressable part of RAM. The memory size is expressed in bits. It can be used only if the BSEG segment is active. For example:
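A minimal sketch, assuming a generic Intel-style 8051 assembler. The names IO_MAP and READY are illustrative.

```asm
        BSEG                  ; the bit-addressable part of RAM is selected
IO_MAP: DBIT 32               ; 32 bits reserved, IO_MAP names the first one
READY:  DBIT 1                ; one further bit reserved right after them
```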

DB directive

The DB directive is used for writing specified values into program memory. If several values are specified, they are separated by commas. If an ASCII string is specified, it should be enclosed within single quotation marks. This directive can be used only if the CSEG segment is active. For example:
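A minimal sketch, assuming a generic Intel-style 8051 assembler. The label NUMBERS is illustrative.

```asm
        CSEG
NUMBERS: DB 22,33,'Alarm',44  ; eight bytes: 22, 33, 'A', 'l', 'a', 'r', 'm', 44
```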

If this directive is preceded by a label, then the label will point to the first element of the array. It is the number 22 in this example.

DW directive

The DW directive is similar to the DB directive. It is used for writing a two-byte value into program memory. The higher byte is written first, then the lower one.
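A minimal sketch, assuming a generic Intel-style 8051 assembler. The label WORDS is illustrative.

```asm
        CSEG
WORDS:  DW 1234h,99           ; stored as the bytes 12h, 34h, 00h, 63h
```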

IF, ENDIF and ELSE directives

These directives are used to create so-called conditional blocks in the program. Each of these blocks starts with the directive IF and ends with the directive ENDIF or ELSE. The statement or symbol (in parentheses) following the IF directive represents a condition which specifies the part of the program to be compiled:

  • If the statement is true or if the symbol is equal to one, the program will include all instructions up to the directive ELSE or ENDIF.
  • If the statement is false or if the symbol value is equal to zero, these instructions are ignored, i.e. not compiled, and the program continues with the instructions following the directive ELSE or ENDIF.

Example 1:
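A sketch of a conditional block, assuming a generic Intel-style 8051 assembler that accepts IF and ENDIF in this form. The value of VERSION and the empty subroutine bodies are placeholders added so that the fragment is complete.

```asm
VERSION EQU 4                 ; assumed version number

        ORG 0
IF (VERSION>3)
        LCALL Table_2         ; compiled only if VERSION is greater than 3
        LCALL Addition        ; compiled only if VERSION is greater than 3
ENDIF
STOP:   SJMP STOP

Table_2:  RET                 ; placeholder subroutine
Addition: RET                 ; placeholder subroutine
        END
```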

If the program version is later than 3 (the statement is true), the two instructions calling the subroutines “Table 2” and “Addition” will be compiled. If the statement in parentheses is false (VERSION is 3 or lower), these two instructions will not be compiled.

Example 2:
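A sketch of a block with an ELSE branch, assuming a generic Intel-style 8051 assembler. The symbols BUFFER and EXT_BUFFER and their values are hypothetical.

```asm
Model      EQU 1              ; 1 = data kept in internal RAM, 0 = in external RAM
BUFFER     EQU 30h            ; hypothetical internal RAM address
EXT_BUFFER EQU 0              ; hypothetical external RAM address

IF (Model)
        MOV R0,#BUFFER        ; compiled when Model = 1
        MOV A,@R0             ; read from internal RAM
ELSE
        MOV R0,#EXT_BUFFER    ; compiled when Model = 0
        MOVX A,@R0            ; read from external RAM
ENDIF
```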

If the value of the symbol called “Model” is equal to one, the first two instructions following directive IF will be compiled and the program continues with instructions following directive ENDIF (all instructions between ELSE and ENDIF are ignored). Otherwise, if Model=0, instructions between IF and ELSE are ignored and the assembler compiles only instructions following directive ELSE.

Control directives

Control directives start with a dollar symbol $. They are used to determine which files are to be used by the assembler during compilation, where the executable file is to be stored, and the final layout of the compiled program, called the listing. There are many control directives, but only a few of them are important:

$INCLUDE directive

This directive enables the assembler to use data stored in other files during compilation. For example:
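A minimal sketch, assuming an assembler that accepts the file name in parentheses. TABLE.ASM is a hypothetical file name.

```asm
$INCLUDE(TABLE.ASM)           ; the contents of TABLE.ASM are compiled at this point
```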

$MOD8253 directive

The $MOD8253 directive refers to a file containing the names and addresses of all SFRs of the 8253 microcontrollers. By means of this file and the directive having the same name, the assembler can compile a program on the basis of register names. If they are not used, it is necessary to specify the name and address of every SFR to be used at the beginning of the program.
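A sketch of the difference, assuming that the MOD8253 definition file is installed with the assembler. The commented line shows what would otherwise have to be written by hand for port P1, which is located at address 90h.

```asm
$MOD8253                      ; SFR names such as P1 become known to the assembler

; P1 DATA 90h                 ; manual definition needed if $MOD8253 is not used

        MOV P1,#0FFh          ; the register is referred to by its name
```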