Introduction
The Intel 8086 microprocessor is the first member of the x86 family of processors. Advertised as "source-code compatible" with the Intel 8080 and Intel 8085 processors, the 8086 was not object-code compatible with them. The 8086 is a complete 16-bit architecture: 16-bit internal registers, a 16-bit data bus, and a 20-bit address bus. Because the processor has 16-bit index registers and memory pointers, it can directly address only 64 KB of memory. To address memory beyond 64 KB, the CPU uses segment registers. These registers specify the memory locations of the code, stack, data, and extra data segments, each up to 64 KB in size. The CPU forms each 20-bit physical address by shifting the 16-bit segment value left by four bits and adding the 16-bit offset. The segments can be positioned anywhere in memory, and user programs can change their position if necessary. This addressing method has one big advantage: it is very easy to write memory-independent code when the code, stack, and data are each smaller than 64 KB. The complexity of the code and of the programming increases, sometimes significantly, when the stack, data, or code is larger than 64 KB. To support the variations of this awkward memory addressing scheme, many 8086 compilers include six different memory models: tiny, small, compact, medium, large, and huge. The 64 KB direct addressing limitation was eliminated with the introduction of 32-bit protected mode.
The Intel 8086 instruction set includes a few very powerful string instructions. When these instructions are prefixed by the REP (repeat) prefix, the CPU performs block operations: moving a block of data, comparing data blocks, setting a data block to a certain value, and so on. That is, one 8086 string instruction with a REP prefix can do as much as a 4-5 instruction loop on some other processors.
Objective
The 8086 String Instructions
How the 8086 String Instructions Operate?
The string instructions operate on blocks (contiguous linear arrays) of memory. For example, the movs instruction moves a sequence of bytes from one memory location to another. The cmps instruction compares two blocks of memory. The scas instruction scans a block of memory for a particular value. These string instructions often require three operands: a destination block address, a source block address, and (optionally) an element count. For example, when using the movs instruction to copy a string, you need a source address, a destination address, and a count (the number of string elements to move).
Unlike other instructions that operate on memory, the string instructions are single-byte instructions that don't have any explicit operands. The operands for the string instructions include the si (source index) register, the di (destination index) register, the cx (count) register, the accumulator (al, ax, or eax), and the direction flag.
For example, one variant of the movs (move string) instruction copies a string of length cx from the source address specified by ds:si to the destination address specified by es:di. Likewise, the cmps instruction compares the string of length cx pointed at by ds:si to the string pointed at by es:di.
The REP/REPE/REPZ and REPNZ/REPNE Prefixes
The string instructions by themselves do not operate on strings of data. The movs instruction, for example, will move a single byte, word, or double word (the double-word forms require an 80386 or later processor). When executed by itself, the movs instruction ignores the value in the cx register. The repeat prefixes tell the 80x86 to do a multi-byte string operation. The syntax for the repeat prefix is:
Field:
Label repeat mnemonic operand ;comment
For MOVS:
rep movs {operands}
For CMPS:
repe cmps {operands}
repz cmps {operands}
repne cmps {operands}
repnz cmps {operands}
For SCAS:
repe scas {operands}
repz scas {operands}
repne scas {operands}
repnz scas {operands}
For STOS:
rep stos {operands}
You don't normally use the repeat prefixes with the lods instruction.
As you can see, the presence of the repeat prefixes introduces a new field in the source line: the repeat prefix field. This field appears only on source lines containing string instructions. In your source file:
When a repeat prefix precedes a string instruction, the string instruction repeats cx times. Without the repeat prefix, the instruction operates on only a single byte, word, or double word.
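The effect of the rep prefix on movs can be sketched as follows. This is a minimal example, assuming an emu8086-style .COM program (so ds and es already point at the same segment); the labels src and dst and the five-byte string are hypothetical.

```asm
ORG 100h                ; .COM program: CS = DS = ES = SS on entry

    CLD                 ; direction flag = 0, so SI and DI count upward
    LEA SI, src         ; DS:SI -> source block
    LEA DI, dst         ; ES:DI -> destination block
    MOV CX, 5           ; element count: five bytes
    REP MOVSB           ; copy CX bytes from DS:SI to ES:DI
    RET                 ; return to operating system

src DB 'Hello'          ; hypothetical source data
dst DB 5 DUP(0)         ; hypothetical destination buffer

END
```

When the rep movsb finishes, cx is zero and si and di each point one byte past the end of their blocks. Removing the rep prefix would copy only the first byte and leave cx untouched.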
The CMPS Instruction
The cmps instruction compares two strings. The CPU compares the string referenced by es:di to the string pointed at by ds:si. The cx register contains the length of the two strings (when using the repe or repne prefix). As with the movs instruction, the MASM assembler allows several different forms of this instruction:
{REPE} CMPSB
{REPE} CMPSW
{REPE} CMPS
(compare memory with memory)
To compare two strings to see whether they are equal, you must compare corresponding elements of the strings until they don't match. Consider the following strings:
"String1"
"String1"
The only way to determine that these two strings are equal is to compare each character in the first string to the corresponding character in the second. After all, the second string could have been "String2", which definitely is not equal to "String1". Of course, once you encounter a character in the destination string that doesn't equal the corresponding character in the source string, the comparison can stop. You needn't compare any other characters in the two strings.
For character strings, use the cmps instruction in the following manner: clear the direction flag, load cx with the length of the shorter string, and use the cmpsb instruction with the repe prefix.
After the execution of the cmps instruction, if the two strings were equal, their lengths must be compared in order to finish the comparison.
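A character-string comparison along these lines might look like the sketch below. It assumes an emu8086-style .COM program in which ds and es are equal; the labels str1, str2, and differ, and the result codes left in al, are hypothetical and chosen only for illustration.

```asm
ORG 100h                ; .COM program: DS = ES on entry

    CLD                 ; compare from low addresses upward
    LEA SI, str1        ; DS:SI -> first (source) string
    LEA DI, str2        ; ES:DI -> second (destination) string
    MOV CX, 7           ; length of the shorter string, in bytes
    REPE CMPSB          ; repeat while the elements match and CX <> 0
    JNE differ          ; ZF = 0: a mismatch stopped the comparison
    MOV AL, 1           ; all 7 bytes matched (hypothetical result code)
    RET
differ:
    MOV AL, 0           ; the strings differ (hypothetical result code)
    RET

str1 DB 'String1'
str2 DB 'String2'

END
```

With these two strings the seventh byte differs, so the jump to differ is taken. Both strings here have the same length, so no separate length check is needed; for strings of different lengths, a match over the shorter length would still have to be followed by a comparison of the lengths.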
You can also use the cmps instruction to compare multi-word integer values (that is, extended-precision integer values). Because of the amount of setup required for a string comparison, this isn't practical for integer values less than three or four words in length, but for large integer values it's an excellent way to compare such values. Unlike character strings, integer strings cannot be compared using a lexicographical ordering: the 80x86 stores the low-order word at the lowest address, so the comparison must start with the high-order word and work downward.
One last thing to keep in mind when using the cmps instruction: the value in the cx register determines the number of elements to process, not the number of bytes. Therefore, when using cmpsw, cx specifies the number of words to compare. This, of course, is half the number of bytes to compare.
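As an illustration of both points, the following sketch compares two unsigned four-word (64-bit) integers with cmpsw. It assumes an emu8086-style .COM program with ds equal to es; the labels num1, num2, greater, and less, the data values, and the result codes in al are all hypothetical.

```asm
ORG 100h                ; .COM program: DS = ES on entry

    STD                 ; direction flag = 1: work from high addresses down
    LEA SI, num1
    ADD SI, 6           ; DS:SI -> high-order word of num1
    LEA DI, num2
    ADD DI, 6           ; ES:DI -> high-order word of num2
    MOV CX, 4           ; four words (not eight bytes)
    REPE CMPSW          ; stop at the first pair of words that differ
    CLD                 ; restore the usual direction; flags stay intact
    JA  greater         ; unsigned: num1 > num2
    JB  less            ; unsigned: num1 < num2
    MOV AL, 0           ; equal (hypothetical result code)
    RET
greater:
    MOV AL, 1           ; hypothetical result code
    RET
less:
    MOV AL, 0FFh        ; hypothetical result code
    RET

num1 DW 0FFFFh, 0001h, 0000h, 0002h   ; low-order word first
num2 DW 0000h, 0002h, 0000h, 0002h

END
```

The flags after cmpsw reflect the word at ds:si minus the word at es:di, so ja and jb give the unsigned ordering of num1 relative to num2. Signed values would need the signed jumps instead.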
The SCAS Instruction
The cmps instruction compares two strings against one another. You cannot use it to search for a particular element within a string. For example, you could not use the cmps instruction to quickly scan for a zero throughout some other string. You can use the scas (scan string) instruction for this task.
Unlike the movs and cmps instructions, the scas instruction requires only a destination string (es:di) rather than both a source and a destination string. The source operand is the value in the al (scasb), ax (scasw), or eax (scasd) register.
The scas instruction, by itself, compares the value in the accumulator (al, ax, or eax) against the value pointed at by es:di and then increments (or decrements) di by one, two, or four. The CPU sets the flags according to the result of the comparison. While this might be useful on occasion, scas is a lot more useful when used with the repe and repne prefixes.
When the repe prefix (repeat while equal) is present, scas scans the string searching for an element that does not match the value in the accumulator. When using the repne prefix (repeat while not equal), scas scans the string searching for the first string element that is equal to the value in the accumulator.
The scas instruction takes the following forms:
{REPE} SCASB
{REPE} SCASW
{REPE} SCAS dest
{REPNE} SCASB
{REPNE} SCASW
{REPNE} SCAS dest
As with the cmps and movs instructions, the value in the cx register specifies the number of elements to process, not the number of bytes.
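Scanning for a zero byte, the task mentioned above, could be sketched like this. The example assumes an emu8086-style .COM program with es equal to ds; the label msg, the buffer contents, and the 0FFFFh "not found" marker are hypothetical.

```asm
ORG 100h                ; .COM program: DS = ES on entry

    CLD                 ; scan from low addresses upward
    LEA DI, msg         ; ES:DI -> string to scan
    MOV AL, 0           ; value to search for
    MOV CX, 6           ; number of elements (bytes) in the buffer
    REPNE SCASB         ; repeat while [ES:DI] <> AL and CX <> 0
    JNE not_found       ; ZF = 0: CX ran out without a match
    DEC DI              ; SCASB already advanced DI; step back to the match
    RET
not_found:
    MOV DI, 0FFFFh      ; hypothetical "not found" marker
    RET

msg DB 'Hello', 0

END
```

Testing the zero flag after the scan, rather than cx, matters because cx also reaches zero when the match is on the very last element.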
The STOS Instruction
The stos instruction stores the value in the accumulator at the location specified by es:di. After storing the value, the CPU increments or decrements di depending upon the state of the direction flag. Although the stos instruction has many uses, its primary use is to initialize arrays and strings to a constant value.
The stos instruction takes four forms. They are:
{REP} STOSB
{REP} STOSW
{REP} STOSD
{REP} STOS dest
The stosb instruction stores the value in the al register into the specified memory location(s), the stosw instruction stores the ax register into the specified memory location(s), and the stosd instruction stores eax into the specified location(s). The stos instruction is either an stosb, stosw, or stosd instruction, depending upon the size of the specified operand.
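Initializing an array to a constant value might be sketched as follows, assuming an emu8086-style .COM program with es equal to ds. The label buffer, its size, and its initial contents are hypothetical.

```asm
ORG 100h                ; .COM program: DS = ES on entry

    CLD                 ; fill from low addresses upward
    LEA DI, buffer      ; ES:DI -> array to initialize
    MOV AX, 0           ; fill value
    MOV CX, 8           ; eight words (sixteen bytes)
    REP STOSW           ; store AX at ES:DI, CX times
    RET                 ; return to operating system

buffer DW 8 DUP(1234h)  ; hypothetical array with non-zero contents

END
```

Using stosw rather than stosb halves the number of repetitions for the same sixteen bytes, at the cost of requiring an even byte count.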
The LODS Instruction
The lods instruction is unique among the string instructions. You will rarely, if ever, use a repeat prefix with this instruction. The lods instruction copies the byte, word, or double word pointed at by ds:si into the al, ax, or eax register, after which it increments or decrements the si register by one, two, or four. Repeating this instruction via the repeat prefix would serve no purpose whatsoever, since the accumulator register is overwritten each time the lods instruction repeats. At the end of the repeat operation, the accumulator contains only the last value read from memory.
Like the stos instruction, the lods instruction takes four forms:
{REP} LODSB
{REP} LODSW
{REP} LODSD
{REP} LODS dest
As mentioned earlier, you'll rarely, if ever, use the rep prefixes with these instructions. The 80x86 increments or decrements si by one, two, or four, depending on the direction flag and on whether you're using the lodsb, lodsw, or lodsd instruction.
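Instead of a repeat prefix, lods is normally placed inside an explicit loop, often paired with stos. The sketch below converts a string to upper case in place; it assumes an emu8086-style .COM program with es equal to ds, and the labels msg, next_char, and store are hypothetical.

```asm
ORG 100h                ; .COM program: DS = ES on entry

    CLD                 ; process from low addresses upward
    LEA SI, msg         ; DS:SI -> string to read
    MOV DI, SI          ; ES:DI -> same string (convert in place)
    MOV CX, 5           ; number of characters
next_char:
    LODSB               ; AL = [DS:SI], then SI = SI + 1
    CMP AL, 'a'
    JB  store           ; below 'a': leave unchanged
    CMP AL, 'z'
    JA  store           ; above 'z': leave unchanged
    SUB AL, 20h         ; lower case -> upper case
store:
    STOSB               ; [ES:DI] = AL, then DI = DI + 1
    LOOP next_char      ; CX = CX - 1, repeat while CX <> 0
    RET

msg DB 'hello'

END
```

Each pass uses the accumulator between the load and the store, which is exactly the work a bare rep lodsb would have no room for.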
Procedures
A procedure is a part of code that can be called from your program in order to perform some specific task. Procedures make a program more structured and easier to understand. Generally, a procedure returns to the same point from which it was called.
The syntax for procedure declaration:
name PROC
; here goes the code
; of the procedure ...
RET
name ENDP
name is the procedure name. The same name should appear at the top and at the bottom; this is used to check that procedures are closed correctly.
You probably already know that the RET instruction is used to return to the operating system. The same instruction is used to return from a procedure (actually, the operating system sees your program as a special procedure).
PROC and ENDP are compiler directives, so they are not assembled into any real machine code. The compiler just remembers the address of the procedure.
The CALL instruction is used to call a procedure.
Here is an example:
ORG 100h
CALL m1
MOV AX, 2
RET ; return to operating system.
m1 PROC
MOV BX, 5
RET ; return to caller.
m1 ENDP
END
The above example calls procedure m1, does MOV BX, 5, and returns to the next instruction after CALL: MOV AX, 2.
Macros
Macros look like procedures, but they exist only until your code is compiled. After compilation, all macros are replaced with real instructions. If you declare a macro and never use it in your code, the compiler simply ignores it. Emu8086.inc is a good example of how macros can be used; this file contains several macros that make coding easier.
Macro definition:
name MACRO [parameters,...]
<instructions>
ENDM
Unlike procedures, macros should be defined above the code that uses them, for example:
MyMacro MACRO p1, p2, p3
MOV AX, p1
MOV BX, p2
MOV CX, p3
ENDM
ORG 100h
MyMacro 1, 2, 3
MyMacro 4, 5, DX
The above code is expanded into:
MOV AX, 00001h
MOV BX, 00002h
MOV CX, 00003h
MOV AX, 00004h
MOV BX, 00005h
MOV CX, DX
Some important facts about macros and procedures: