Wednesday, October 21, 2009

ASCII one oh one, Lee Bond N7KC

October 21, 2009 Educational Radio Net, PSRG 74th Session

Have you ever wondered about the relationship between computers and data entry? What really happens when you hit a key on your computer and the corresponding character pops up on the monitor? If you are old enough to remember the MITS Altair or IMSAI 8-bit microcomputers then you will have no problem answering these two lead-in questions.

Surely one of the oldest schemes for encoding alphabetical characters and numerals is the code developed by Samuel Morse, known as Morse code. Early amateur radio operators had no choice but to learn the code in order to communicate with their fellow hams. The railroad telegraphers' code is a variant of the Morse code, and both the radio and railroad schemes were the backbone of the early communications industry.

As telecommunications techniques improved, the mechanical tele-printer based on the 5 level Baudot code was introduced. Communication was much easier when typing replaced the telegraph key and the distant output was in easily read text. Since only 5 bits were used, the number of possible combinations was limited to 32. If you count all the letters of the alphabet, both lower and upper case, and the numerals 0 through 9, and various punctuation characters, it is clear that even a shifted 32 will not do the job.

During the late 50’s, when the early mainframe computers were evolving, it was obvious that something had to be done to improve the encoding of alpha-numerical character information that would be fed to the room sized digital monsters that glowed in the dark.

Enter ASCII, pronounced ass-key, the acronym for American Standard Code for Information Interchange. ASCII is a 7 bit code hence 128 unique bit pattern combinations are possible. Enough combinations to represent all the upper and lower case alphabetical characters plus numbers plus punctuation plus special formatting characters in use at the time. The first meeting of the committee which adopted this code was in 1960. The code later became known as US ASCII since it was only good for English systems.

During this time frame IBM developed a proprietary code known as EBCDIC which was based on all possible combinations of the 8 bits in a byte. There are 256 possible unique combinations of 8 bits so EBCDIC offered twice as many possibilities as did ASCII. EBCDIC was used in the large mainframe computers that IBM produced and ASCII became the standard coding scheme for the micro-computing industry.
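The arithmetic behind these counts is simply 2 raised to the number of bits. Here is a minimal Python sketch of it. The helper name pattern_count is made up for illustration, and the tally of letters and numerals leaves punctuation out entirely.

```python
import string

def pattern_count(bits: int) -> int:
    """Number of distinct patterns that a given number of bits can form."""
    return 2 ** bits

# Upper and lower case letters plus the numerals, before any punctuation.
letters_and_digits = len(string.ascii_letters) + len(string.digits)

for name, bits in (("Baudot", 5), ("ASCII", 7), ("EBCDIC", 8)):
    print(f"{name}: {bits} bits -> {pattern_count(bits)} patterns")

print("letters and numerals alone:", letters_and_digits)
```

Letters and numerals alone come to 62, which is why 32 patterns fall short and 128 leave room to spare.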

From the 60’s until recently US ASCII dominated the computer industry, microcomputers in particular, and information exchange on the Internet. Today ASCII is taking a back seat to Unicode in its various implementations as UTF-8, UTF-16, or UTF-32, where UTF stands for Unicode Transformation Format. Using 16 bits, or a word, offers over 65 thousand unique possibilities, and the full Unicode code space is larger still at more than a million code points, so numerous languages can be represented in addition to English. This makes the encoding truly universal and a natural for Internet use.
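To see how the same character looks under each scheme, here is a small Python sketch using the language's built-in codecs. The helper name encoded_hex and the choice of test characters are assumptions made for illustration. The big-endian variants are used so that no byte order mark shows up in the output.

```python
ENCODINGS = ("ascii", "utf-8", "utf-16-be", "utf-32-be")

def encoded_hex(text: str) -> dict[str, str]:
    """Return the bytes of text, as hex, under each encoding."""
    result = {}
    for name in ENCODINGS:
        try:
            result[name] = text.encode(name).hex()
        except UnicodeEncodeError:
            # 7 bit ASCII has no pattern for this character.
            result[name] = "not representable"
    return result

print(encoded_hex("H"))       # a plain ASCII letter
print(encoded_hex("\u00e9"))  # e with acute accent, outside 7 bit ASCII
```

For the letter "H" the ASCII and UTF-8 results are the identical single byte, 48 hex, which is a large part of why UTF-8 slid so easily into systems built around ASCII.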

Historically ASCII is a 7 bit code. Always has been and always will be. However, that inviting 8th bit in the byte could be put to use and double the number of coding possibilities. Implementations of ASCII using all 8 bits became known as "extended" ASCII and found much use for formatting characters and such when word processing came to the fore. Other names were "upper" level ASCII in contrast to the original "lower" level scheme.

Let’s look at 7 bit ASCII in some detail to see how it is structured.

Referring to Bits, Nybles, Bytes, and Words (presentation 71) from a few weeks ago, we know that computers like to operate with patterns of 4 bits (a nyble), 8 bits (a byte), or 16 bits (a word). If ASCII requires 7 bits then the best choice for a pattern would be the next larger size, the byte, consisting of an assemblage of 8 bits. The extra 8th bit became useful as a rudimentary error detecting bit and was called the parity bit. More on this later.

A table of the ASCII characters is generally shown as 16 rows and 8 columns. The product of 16 x 8 is 128, so this array matches all the combination possibilities of 7 bits. The first two columns from the left contain all of the so-called formatting (control) characters, the third column from the left is where you find the space and various punctuation and math symbols, the fourth column from the left lists the numerals along with a few more punctuation symbols, and the last four columns hold the alphabetical symbols: upper case in the fifth and sixth columns and lower case in the seventh and eighth.
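The layout just described can be printed with a few lines of Python. This is only a sketch: the function names cell and ascii_table are made up for illustration, and showing the control codes as two-digit hex (and the space as SP) is a display choice of this example, not part of the standard.

```python
def cell(code: int) -> str:
    """Text to show in one box of the table."""
    if code == 32:
        return "SP"                # the space character
    if 33 <= code <= 126:
        return chr(code)           # printable characters show as themselves
    return f"{code:02X}"           # control codes (and DEL) show as hex

def ascii_table() -> list[list[str]]:
    """16 rows by 8 columns; the code in each box is column * 16 + row."""
    return [[cell(col * 16 + row) for col in range(8)] for row in range(16)]

for row in ascii_table():
    print(" ".join(f"{box:>2}" for box in row))
```

Counting rows and columns from zero, the numeral 0 sits at the top of the fourth column and capital H is in row 8 of the fifth column.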

To identify any one of 16 rows requires 4 bits, and the lower 4 bits of the 7 bits are used as the row identifier, where bit 1 is least significant. To identify any one of the 8 columns requires 3 bits, and bits 5, 6, and 7 (most significant) are used for this purpose. So, to recap… bits 1, 2, 3, and 4 are used to identify in which row a character is located and bits 5, 6, and 7 are used to identify in which column a character is located. Every box in the array is uniquely identified by a bit pattern. Bit 8 is always a zero unless the optional parity possibility is in play.
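Splitting a code into its row and column is a matter of masking and shifting. Here is a minimal Python sketch, assuming rows and columns are both counted from zero. The function name row_and_column is made up for illustration.

```python
def row_and_column(code: int) -> tuple[int, int]:
    """Split a 7 bit ASCII code into (row, column), both counted from zero."""
    if not 0 <= code < 128:
        raise ValueError("7 bit ASCII codes run from 0 to 127")
    row = code & 0b1111            # bits 1 through 4
    column = (code >> 4) & 0b111   # bits 5 through 7
    return row, column

print(row_and_column(ord("H")))   # (8, 4)
print(row_and_column(ord("a")))   # (1, 6)
```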

The data entry and identification process goes like this. Suppose you depress the capital or upper case "H" key on your keyboard. Bit 7 is forced to a "1" and 6 and 5 remain at zero. Bit 4 is forced to a "1" and bits 3, 2, and 1 remain at zero. Finally the "H" is coded as 01001000 or 48h in shorthand hexadecimal notation. This bit pattern is sent to the computer where it is checked, pattern by pattern, against a table looking for a match. When the match is found the computer proper knows which key was pressed.
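The "H" example can be checked in Python, which already knows the ASCII table. This sketch only covers the bit pattern itself, not what any particular keyboard hardware does. The function names key_to_pattern and pattern_to_key are made up for illustration.

```python
def key_to_pattern(char: str) -> str:
    """8 bit pattern for one ASCII character, with bit 8 left at zero."""
    code = ord(char)
    if code > 127:
        raise ValueError("not a 7 bit ASCII character")
    return format(code, "08b")

def pattern_to_key(pattern: str) -> str:
    """Character matching a pattern, ignoring bit 8."""
    return chr(int(pattern, 2) & 0x7F)

pattern = key_to_pattern("H")
print(pattern)                        # 01001000
print(f"{int(pattern, 2):02X}h")      # 48h
print(pattern_to_key(pattern))        # H
```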

Any error in this transmission process is serious since the computer will interpret your intended bit pattern incorrectly. The 8th bit or parity bit allows a low level error check as follows. Parity can be defined as even or odd. If you choose even, then if the count of 1’s among the 7 bits is odd you force bit 8 to a 1 so that the overall number of 1 bits is even. If the count of 1 bits is already even then you leave the 8th bit at zero. Odd parity works the same way except that bit 8 is chosen to make the overall number of 1 bits odd. On the receiving end the system will check the count to determine if parity is correct. This scheme will detect any single bit error, or any odd number of bit errors, but an even number of flipped bits slips through undetected.
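The parity rule is short enough to sketch in Python. The function names add_parity and parity_ok are made up for illustration, and bit 8 is taken to be the most significant bit of the byte, as in the discussion above.

```python
def add_parity(code: int, even: bool = True) -> int:
    """Return the 8 bit byte for a 7 bit code with the parity bit set."""
    if not 0 <= code < 128:
        raise ValueError("7 bit ASCII codes run from 0 to 127")
    ones = bin(code).count("1")
    # Even parity: set bit 8 when the count of 1 bits is odd.
    # Odd parity: set bit 8 when the count of 1 bits is even.
    parity_bit = ones % 2 if even else 1 - ones % 2
    return code | (parity_bit << 7)

def parity_ok(byte: int, even: bool = True) -> bool:
    """Receiver's check: does the total count of 1 bits match the parity?"""
    ones = bin(byte).count("1")
    return ones % 2 == (0 if even else 1)

sent = add_parity(ord("H"))                 # "H" has two 1 bits, bit 8 stays 0
print(format(sent, "08b"))                  # 01001000
print(parity_ok(sent))                      # True
print(parity_ok(sent ^ 0b00000001))         # False, one flipped bit is caught
print(parity_ok(sent ^ 0b00000011))         # True, two flipped bits slip by
```

The last line shows the weakness of the scheme: two errors in the same byte cancel each other out as far as the count is concerned.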

In summary, bits 5, 6, and 7 determine a column and indicate whether the symbol is a formatting character, punctuation, a number, or an alphabetical character. Bits 1, 2, 3, and 4 determine which character of the alphabet or which number or which formatting symbol is to be used.

This concludes the set up discussion for ASCII one oh one. Are there any questions or comments with regard to tonight's discussion topic?

This is N7KC for the Wednesday night Educational Radio Net
