Assembly Language Programming - Lesson 3 - Codes

Robert M · November 25, 2003

In Lesson 1 we learned what a bit is. In Lesson 2 we learned how to use bits to enumerate lists of items. In this lesson we are going to learn how to use bits to encode information.

DEFINITIONS:

Before we study codes, however, we need to take a detour and learn some new terminology. When we enumerated, we saw that with 1 bit we can enumerate 2 items: 0 and 1. With 2 bits we can enumerate up to 4 items: 00, 01, 10 and 11. And so on, such that given N bits we can enumerate up to 2^N items. As you can guess, it is a very common practice to combine bits together for the purpose of enumeration. Some combinations are used so frequently in programming that they have been given special names:

1 bit = a bit

3 bits = an Octet -> Since it can enumerate 8 items. (This is the usage in this course; elsewhere an octet usually means 8 bits.)

4 bits = a nybble

8 bits = a byte

16 bits = a word

I will be using these terms in all future lessons, so get comfortable with them now. For example, the Atari 2600 has 128 bytes of RAM. How many bits is that? ANSWER: 128 bytes * 8 bits/byte = 1024 bits. What is RAM? Don't worry, I will explain that in a later lesson.

If you are sharp-eyed you may have noticed something about the sizes of the bit strings above. Except for the octet, each one is a power of 2! 2^0=1 (bit), 2^1=2 (no name), 2^2=4 (nybble), 2^3=8 (byte), 2^4=16 (word). This is no accident. Computers are based on bits and manipulate bits, hence powers of two are a natural occurrence in digital computers, and these numbers appear very often in programming. As a programmer you will find there are advantages to using powers of 2 in your programming. The odd Octet will become clear in Lesson 4.

INTRODUCTION TO CODES:

All enumerations are codes, but not all codes are enumerations. What does that mean? It means that enumerations are one type of binary code.

In Lesson 2, we enCODEd the type of fruit (apple, orange, banana, cherry) using bits. What makes enumerations special codes is that they exactly match the binary numbering system used in computers for arithmetic, so: apple = 00 = zero, orange = 01 = one, banana = 10 = two, cherry = 11 = three. We don't have to encode our types of fruit that way. We could encode them as apple = 10110, orange = 10000, banana = 10111, cherry = 11000, but this is now a code and not an enumeration.
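As a rough illustration of the difference, here is a small Python sketch (Python is used only for illustration; it is not part of the course). The helper name is_enumeration is made up for this example, and it assumes "enumeration" means the code values, read as binary numbers, are exactly 0, 1, 2, ... with no gaps.

```python
# Enumeration: the codes are simply 0, 1, 2, 3 written in binary.
fruits = ["apple", "orange", "banana", "cherry"]
enumeration = {name: format(i, "02b") for i, name in enumerate(fruits)}

# Arbitrary code: the 5-bit patterns from the lesson text.
arbitrary_code = {
    "apple": "10110",
    "orange": "10000",
    "banana": "10111",
    "cherry": "11000",
}


def is_enumeration(code):
    """True when the code values, read as binary numbers, are exactly 0..len-1.

    Hypothetical helper: this is one way to state the lesson's definition.
    """
    values = sorted(int(bits, 2) for bits in code.values())
    return values == list(range(len(code)))


print(enumeration)
print(is_enumeration(enumeration), is_enumeration(arbitrary_code))
```

Both dictionaries are valid codes, since each fruit gets its own bit pattern. Only the first one counts up from zero without skipping any values.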

Operation Codes:

One of the most important codes you will become familiar with is Operation Codes. Every microprocessor (CPU) has what is called an instruction set, or a set of operation codes. "Operation code" is often abbreviated as "opcode".

Operation codes are the executable (as opposed to pure information) part of your program. The hardware of the microprocessor reads each opcode in the sequence of the program and performs the action demanded. Later in this course we will explore all of the opcodes in the 6507 microprocessor (the processor in the Atari 2600) in detail. In the 6507 instruction set each opcode is 8 bits long, or 1 byte. The opcodes are not an enumeration. They take all sorts of values within a byte, often skipping many of the bit combinations that an enumeration would use. The bits set in each opcode were chosen because they simplified the work of the engineers who built the logic circuits in the microprocessor.
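To make the "not an enumeration" point concrete, here is a short Python sketch listing a handful of 6502/6507 opcodes and their bit patterns. The dictionary name OPCODES and the choice of which opcodes to show are just for this example; the full instruction set is covered later in the course.

```python
# A few 6502/6507 opcodes: mnemonic (and addressing mode) -> byte value.
# Only a small sample, picked for illustration.
OPCODES = {
    "BRK": 0x00,
    "JMP absolute": 0x4C,
    "RTS": 0x60,
    "STA zero page": 0x85,
    "LDY immediate": 0xA0,
    "LDX immediate": 0xA2,
    "LDA immediate": 0xA9,
    "NOP": 0xEA,
}

# Print each opcode as 8 bits and as a hexadecimal byte.
for name, value in OPCODES.items():
    print(f"{name:<14} = {value:08b} (${value:02X})")
```

Notice that the values jump around within the byte instead of counting 0, 1, 2, 3 and so on.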

Gray Codes:

A gray code is a special kind of binary code of N bits. Gray codes are used for counting 0, 1, 2, 3, etc. Gray codes are special in that each time you add or subtract 1 from the code, only 1 bit will change. Here is an example of a 2-bit gray code:

00 = zero

01 = one

11 = two

10 = three

00 = zero (pattern is repeating...)

You can see that only one bit changes as you count up or down through the 4 combinations. Gray codes are handy in situations where you want to minimize the amount of hardware needed to implement a counting circuit in a computer. In the Atari 2600, the driving controllers (Indy 500) use a 2-bit gray code to encode the direction the paddle is being turned. The speed at which the code changes indicates how fast the paddle is turning.
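The 2-bit table above can be reproduced with a well-known trick: the gray code for a count n is n XOR (n shifted right by one bit). The Python sketch below uses that trick and checks the one-bit-change property; the function names to_gray and bits_changed are made up for this example.

```python
def to_gray(n):
    """Return the (reflected binary) gray code for the count n."""
    return n ^ (n >> 1)


def bits_changed(a, b):
    """Count how many bit positions differ between a and b."""
    return bin(a ^ b).count("1")


codes = [to_gray(n) for n in range(4)]
print([format(c, "02b") for c in codes])  # ['00', '01', '11', '10']

# Step through the sequence, including the wrap from three back to zero.
for i in range(4):
    current, following = codes[i], codes[(i + 1) % 4]
    print(format(current, "02b"), "->", format(following, "02b"),
          "bits changed:", bits_changed(current, following))
```

This produces only one of the possible gray codes. Other orderings with the same one-bit-change property exist.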

Binary Coded Decimal (BCD):

Binary Coded Decimal, or BCD, is a method for storing decimal numbers in an easy (sometimes) to use format within a computer. You are already aware of decimal numbers. You use them to count all the time: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, etc. In BCD each decimal digit is encoded into a separate nybble = 4 bits.

decimal = binary

0 = 0000

1 = 0001

2 = 0010

3 = 0011

4 = 0100

5 = 0101

6 = 0110

7 = 0111

8 = 1000

9 = 1001

Each byte contains 2 nybbles, so each byte can hold 2 BCD digits (00 to 99 decimal). This is an important code for you as a programmer because the 6507 processor has built-in support for adding and subtracting BCD numbers. The big advantage of BCD numbers is that each digit is confined to its own nybble. It is therefore easier to isolate the individual digits for drawing them onto the screen. So scores are good candidates for being stored in BCD, since you want to draw them on the screen as decimal digits. The big disadvantage of BCD is that you are wasting bits: each BCD digit takes 4 bits, which if used completely could store 16 values, 6 more than the 10 they are being used for.
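Here is a minimal Python sketch of the idea, assuming the usual packed layout with the tens digit in the high nybble and the ones digit in the low nybble. The function names to_bcd, pack_bcd_byte and unpack_bcd_byte are invented for this example.

```python
def to_bcd(number):
    """Encode a non-negative integer as BCD text, one nybble per decimal digit."""
    return " ".join(format(int(digit), "04b") for digit in str(number))


def pack_bcd_byte(value):
    """Pack a value from 0 to 99 into one byte: tens in the high nybble, ones in the low."""
    if not 0 <= value <= 99:
        raise ValueError("one BCD byte holds only 00 to 99")
    tens, ones = divmod(value, 10)
    return (tens << 4) | ones


def unpack_bcd_byte(byte):
    """Split a BCD byte back into its (tens, ones) digits."""
    return byte >> 4, byte & 0x0F


print(to_bcd(42))                 # 0100 0010
print(hex(pack_bcd_byte(42)))     # 0x42
print(unpack_bcd_byte(0x42))      # (4, 2)
```

Notice that the BCD byte for 42, written in hexadecimal, reads as 42. Isolating a digit takes only a shift or a mask, which is why BCD suits values such as scores that must be drawn digit by digit.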

ALPHANUMERIC Codes:

Programmers use alphanumeric codes to store text information used by their programs. Each letter, digit, and punctuation symbol is assigned a binary code. The minimum number of bits in the code depends on the number of different characters used in your text strings. If you only want capital letters, digits and 4 punctuation marks, then you need 26+10+4 = 40 symbols, which requires 6 bits (from Lesson 2) per character in the string. Text manipulation is not a frequent activity for Atari 2600 games since the resources are so limited. Some games do display text. If your game will, then you need to decide between storing the strings as characters and then expanding them into the graphics to display on the screen, or storing them in the cartridge already expanded and ready to display. It's a trade-off between speed and storage space. We will explore such trade-offs much, much later in the class.
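The "how many bits for this many symbols" calculation can be sketched in Python as follows. The function name bits_needed is made up for this example; it finds the smallest N such that 2^N is at least the number of symbols.

```python
def bits_needed(symbol_count):
    """Smallest number of bits N such that 2**N >= symbol_count."""
    if symbol_count < 1:
        raise ValueError("need at least one symbol")
    # (symbol_count - 1) is the largest code value when counting from zero.
    return max(1, (symbol_count - 1).bit_length())


print(bits_needed(26 + 10 + 4))  # 6, since 2**5 = 32 < 40 <= 64 = 2**6
print(bits_needed(4))            # 2
```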

There are 2 commonly used alphanumeric codes for all computers. The first one is called ASCII. ASCII codes are 7 bits long, so there are 128 symbols in ASCII. Many documents on the internet contain ASCII codes. You can recognize these files in Windows as the files with a .txt file type extension. When you write your assembly code programs, the program you use to store your code files will most likely store them as ASCII codes. The other common alphanumeric code is Unicode. Unicode was originally designed as a 16-bit code and has since grown beyond 16 bits. It is intended to contain the letters and symbols needed to display the written languages of the world. ASCII has only the letters needed for English.
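For a quick look at ASCII in practice, the Python sketch below prints the 7-bit code for each character of a sample string. The function name ascii_codes and the sample text are made up for this example.

```python
def ascii_codes(text):
    """Return each character's ASCII code as a 7-bit binary string."""
    # encode("ascii") raises an error for any character outside the 128 ASCII symbols.
    return [format(byte, "07b") for byte in text.encode("ascii")]


sample = "ATARI 2600"
for character, bits in zip(sample, ascii_codes(sample)):
    print(repr(character), ord(character), bits)
```

For instance, the letter A has the code 65, which is 1000001 in binary.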

Summary:

The examples above are just a few codes. An infinite number of codes is possible because, as we learned in Lesson 1, the meaning of the bits is entirely up to the programmer writing the program. Please try the exercises below to cement these new ideas.

-----------------------------------------

Exercises:

1. Convert the following decimal numbers to BCD format:

a. 10

b. 253

c. 7689

d. 4

2. Give an example of a 3 bit Gray code. NOTE: There is more than 1 correct answer.

3. How many nybbles are there in a word?

4. How many bits are in 512 bytes?

5. How many octets are in 72 nybbles?

6. You wish to store strings in your program. The strings will contain only capital letters A-Z, spaces, periods, question marks, and a special character that marks the end of the string. How many bits are needed to store each character in a string? By packing the character codes together how many characters could fit into 8 bytes?

Answers will be posted within 24 hours.

EricBall · November 26, 2003

One word of caution: although 16 bits is often called a word; a word is not always 16 bits.

One of the ways each generation of computers gets more powerful is to process more bits at once. So the 6502 is an 8 bit processor because it works with 8 bits of data at once. The Pentium is a 32 bit processor, and 64 bit (and higher) processors exist. The number of bits in a word is often related to the processor.

Byte and nybble (or nibble) can sometimes be different than 8 and 4 bits, but only on processors with oddball (10 or 18) word sizes which are very uncommon these days.

Thomas Jentzsch · November 26, 2003

One word of caution: although 16 bits is often called a word; a word is not always 16 bits.

Trying to confuse some newbies?

:idea: A word on the 650x is always 16 bit!

RCorcoran · November 27, 2003

Long time Starmaster 2600 World Record holder :-)

I love that!

Robert M · November 27, 2003

Sorry, the holiday is taking much of my time. I will post the answers on Friday evening, unless someone else wants to take a shot at posting them.

Cheers!

EricBall · December 1, 2003

One word of caution: although 16 bits is often called a word; a word is not always 16 bits.

Trying to confuse some newbies?
A word on the 650x is always 16 bit!

I agree that on 650x (or any other 8 bit CPU) words are 16 bits. But, the same cannot be said if you are talking about the x86, or 68xxx, or some obscure processor like the PDP-1 or the GI CP1600 used in the Intellivision.

Anyway. For the purposes of this tutorial, word = 16 bits and 16 bits = word.

Tom · December 1, 2003

I agree that on 650x (or any other 8 bit CPU) words are 16 bits. But, the same cannot be said if you are talking about the x86...

to make the confusion complete:

actually on the x86 16 bits are called a word as well, even on the newer cpus that have 32 bit registers.

for arm cpu's, a word is 32 bits, and 16 bits are called a halfword.

sorry, couldn't resist =)

Robert M · December 2, 2003

Sorry for the delay!

Exercise ANSWERS:

1. Convert the following decimal numbers to BCD format:

a. 10

b. 253

c. 7689

d. 4

a. 10 = 0001 0000

b. 253 = 0010 0101 0011

c. 7689 = 0111 0110 1000 1001

d. 4 = 0100

2. Give an example of a 3 bit Gray code. NOTE: There is more than 1 correct answer.

Here's a way to make a gray code of any length of bits. Start with a gray code for 1 bit =

0

1

To get a gray code of n+1 bits from a gray code of n bits, simply repeat the n-bit code with each entry followed (or preceded) by a single 0, then repeat the n-bit code again in reverse order with each entry followed by a single 1.

So from the above one bit code we get a two bit code as

0+0 = 00

1+0 = 10

Repeat 1 bit code in reverse.

1+1 = 11

0+1 = 01

Simply repeat this process to get a 3 bit gray code:

00+0 = 000

10+0 = 100

11+0 = 110

01+0 = 010

Repeat 2 bit code in reverse...

01+1 = 011

11+1 = 111

10+1 = 101

00+1 = 001
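The construction above can be written as a short Python sketch. The function name build_gray_code is made up for this example, and it uses the "followed by" variant, appending the new bit on the right as in the worked steps.

```python
def build_gray_code(bits):
    """Build a gray code of the given width by repeating and reversing."""
    if bits < 1:
        raise ValueError("need at least 1 bit")
    codes = ["0", "1"]  # the 1-bit gray code
    for _ in range(bits - 1):
        # The n-bit code followed by 0, then the n-bit code in reverse followed by 1.
        codes = [c + "0" for c in codes] + [c + "1" for c in reversed(codes)]
    return codes


def differs_by_one_bit(a, b):
    """True when the two bit strings differ in exactly one position."""
    return sum(x != y for x, y in zip(a, b)) == 1


three_bit = build_gray_code(3)
print(three_bit)
# ['000', '100', '110', '010', '011', '111', '101', '001']

# Check every step, including the wrap from the last code back to the first.
print(all(differs_by_one_bit(three_bit[i], three_bit[(i + 1) % 8]) for i in range(8)))
```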

3. How many nybbles are there in a word?

We know how many bits are in a nybble and in a word, so we can convert from one to the other by converting words to bits and then bits to nybbles.

1 word = 16 bits

1 nybble = 4 bits

nybbles per word = 16/4 = 4 nybbles.

4. How many bits are in 512 bytes?

By our definitions there are 8 bits in a byte. So there will be:

512 * 8 = 4096 bits in 512 bytes. {edited: after error was pointed out by Nukey}

5. How many octets are in 72 nybbles?

Again we will convert from one unit to bits, and then from bits to the other unit.

72 nybbles * 4 (bits per nybble) = 288 bits

288 bits / 3 bits per octet = 96 octets.
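Exercises 3 to 5 all follow the same pattern: convert to bits, then convert bits to the target unit. A small Python sketch of that pattern is below. The names BITS_PER_UNIT and convert are invented for this example, and the table uses this course's definitions (including octet = 3 bits and word = 16 bits).

```python
# Unit sizes as defined in this course.
BITS_PER_UNIT = {"bit": 1, "octet": 3, "nybble": 4, "byte": 8, "word": 16}


def convert(count, from_unit, to_unit):
    """Convert a count of one unit to another by going through bits."""
    total_bits = count * BITS_PER_UNIT[from_unit]
    return total_bits / BITS_PER_UNIT[to_unit]


print(convert(1, "word", "nybble"))     # 4.0
print(convert(512, "byte", "bit"))      # 4096.0
print(convert(72, "nybble", "octet"))   # 96.0
```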

6. You wish to store strings in your program. The strings will contain only capital letters A-Z, spaces, periods, question marks, and a special character that marks the end of the string. How many bits are needed to store each character in a string? By packing the character codes together how many characters could fit into 8 bytes?

First we count the total number of symbols possible for each character. A-Z is 26 symbols. Space, period, question mark, and termination are 4 more symbols. 26 + 4 = 30 total possible symbols.

We can enumerate the symbols as we learned in lesson 2. The number of bits needed is

log(base2) 30 rounded up = 5 bits are needed per symbol in the string!

Now the second part of the question asked how many symbols will fit into 8 bytes worth of bits.

total bits = 8 bits per byte * 8 bytes = 64 bits.

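Number of symbols in 8 bytes = 64 bits / 5 bits per symbol = 12.8 symbols

Not quite 13 symbols will fit into 8 bytes of space. Since you cannot store part of a symbol, 12 complete symbols fit (12 * 5 = 60 bits), with 4 bits left over.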
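To show what "packing the character codes together" could look like, here is a Python sketch that packs 5-bit codes into bytes and unpacks them again. Everything in it is an assumption made for illustration: the ordering of ALPHABET, the use of code 29 as the end marker, and the function names pack and unpack. The exercise does not specify any of these.

```python
# Hypothetical 5-bit code: A-Z = 0..25, space = 26, period = 27, question mark = 28.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ .?"
END = 29   # assumed code for the end-of-string marker
BITS = 5   # bits per character, from the answer above


def pack(text):
    """Pack text plus an end marker into bytes, 5 bits per character."""
    codes = [ALPHABET.index(ch) for ch in text] + [END]
    value = 0
    for code in codes:
        value = (value << BITS) | code
    total_bits = BITS * len(codes)
    padding = -total_bits % 8          # unused bits needed to fill the last byte
    value <<= padding
    return value.to_bytes((total_bits + padding) // 8, "big")


def unpack(data):
    """Read 5-bit codes from the packed bytes until the end marker."""
    total_bits = len(data) * 8
    value = int.from_bytes(data, "big")
    chars = []
    for i in range(total_bits // BITS):
        shift = total_bits - BITS * (i + 1)
        code = (value >> shift) & 0b11111
        if code == END:
            break
        chars.append(ALPHABET[code])
    return "".join(chars)


packed = pack("HELLO WORLD")           # 11 characters + end marker = 12 codes
print(len(packed))                     # 8
print(unpack(packed))                  # HELLO WORLD
```

The 12 codes take 60 bits, so they fit in 8 bytes with 4 bits unused, matching the arithmetic in the answer.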

Robert M · December 2, 2003

Eric Ball and the two Toms are correct that the definitions of byte and word are not written in stone. Their definitions can vary from computer system to computer system. For the purpose of this course, however, a byte is 8 bits and a word is 16 bits.

Cheers!

Nukey Shay · December 2, 2003

Um...512 * 8 = 4096 :ponder:

Robert M · December 2, 2003

Um...512 * 8 = 4096

Ah! I was just seeing if anyone was paying attention! Yes, that's it! That's the ticket!

Thanks Nukey, I'll edit the previous post!

Nukey Shay · December 2, 2003

No prob. Kudos on the tutorial

BTW, although 5 bits could be used in that manner to generate text, you probably won't see that method being used, since it's easier to use a full byte for each alphanumeric character. If space becomes an issue, it would be more effective to use a compression routine rather than creating each character code "on the fly".
