OS/2 eZine - http://www.os2ezine.com
September 16, 2004
 
Keith Merrington has been in computing since computers were programmed with punch cards and built from discrete components and transistors. He built his first computer, soldering the chips in by hand, back in the 80s before the first PC was born. It was based on a Signetics 2650 8-bit microprocessor with a massive 1K of memory and two 8-inch floppy drives. He still builds (assembles) his own PCs and has been using OS/2 since Warp 3 came out. He is married and lives in the Netherlands but was born in London, England.





Keyboard Input from A to Ž - Part 1

Computers only store information as numbers, so in order to store text various encoding schemes such as ASCII (American Standard Code for Information Interchange) and EBCDIC (Extended Binary Coded Decimal Interchange Code - generally used by IBM) were devised. The ASCII table is relatively straightforward and is an encoding scheme which originally used only 7 bits. In an 8-bit system the MSB (most significant bit) is always zero. The characters that could be encoded using this scheme were the standard Latin alphabet plus numbers, punctuation marks, and some symbols, as shown in Fig 1.


Fig 1 - ASCII
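As a small illustration of the 7-bit point, the following Python sketch prints the code for A and its 8-bit pattern. The helper name ascii_bits is hypothetical and only serves this example; it is not part of any OS/2 tool.

```python
# Illustrative sketch only: ascii_bits is a made-up helper name.
def ascii_bits(ch: str) -> str:
    """Return the 8-bit binary pattern of a 7-bit ASCII character."""
    code = ord(ch)
    if code > 127:
        raise ValueError(f"{ch!r} is not a 7-bit ASCII character")
    return format(code, "08b")


print(ord("A"))         # decimal code of A
print(ascii_bits("A"))  # leftmost digit is the MSB, which stays 0
```

Any character from Fig 1 can be passed in; the leftmost digit of the result is the most significant bit.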

This was great for those of us who didn't have any other characters in our alphabet, but for those who did it was a problem. In order to overcome the limitations of ASCII (only 128 characters) Extended ASCII was developed. Extended ASCII used the eighth bit, extending the character set with another 128 characters and typographical symbols (see Fig 2). It was this character set which was used in the PC. Unfortunately not all the characters that were required by various languages could fit into the extra 128 character positions, so various versions of the Extended ASCII character set came into existence.


Fig 2 - Extended ASCII

So characters entered using the keyboard are converted into some internal coding, but are you aware that there is more than one way to enter data via the keyboard? Apart from the usual method of entering a character with a single keystroke, the most common method is to use the numeric keypad in combination with the ALT key to enter the character code directly. The ALT key (preferably the one on the left, as the right-hand one could be the AltGr key discussed under Keyboard Redefinition) is held down and the decimal value of the code used to represent the required character is entered one digit at a time on the numeric keypad. Obviously NumLock should be on so that these keys produce digits rather than cursor movements. Also, you cannot use the number keys on the alphanumeric part of the keyboard!

Let's take something simple. The decimal code for the character A is 65 (see Fig 1). So using the E editor you hold down the ALT key and then enter first a 6 and then a 5. After releasing the ALT key the A is displayed. It is also possible to enter the code 065 instead of 65.

Now since most people can enter an A directly from the keyboard, generally only those characters which are not directly available on the keyboard would be entered using the ALT key method. For example, you need to enter the yen sign ¥, but this key is not available on your keyboard and you know the code for this character is 157 (see Fig 2). So using the E editor you hold down the ALT key and enter 1, then 5, then 7. Lo and behold, you now have the yen sign. Or is it something else? It could also be a Ø or a blank. The character that is displayed is dependent on which codepage your computer is using.

Codepages

As I said earlier, various versions of the Extended ASCII character set were in existence; the trouble was knowing which version you were using. The codepage concept clarified this by assigning a number to each particular character set. With codepage 437, if I enter 157 using the ALT key method then I will see ¥, with codepage 850 a Ø and with codepage 1004 nothing. There are about 49 different codepages currently supported by OS/2, and you can find the list using help on codepage or by referring to your documentation. The format of the CONFIG.SYS statement is CODEPAGE XXX, (YYY) where XXX is the initial codepage and YYY (which is optional) is the alternate codepage. But even codepages are subject to change: the introduction of the Euro brought a new currency symbol, and since all the character positions were already taken it replaced an existing character.
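The effect of the codepage on code 157 can be reproduced outside OS/2. The sketch below uses Python's built-in cp437 and cp850 codecs; the function name alt_code is hypothetical, and codepage 1004 is left out because this sketch does not assume a codec for it exists.

```python
import unicodedata


# Illustrative sketch: alt_code is a made-up name standing in for
# "hold ALT, type the decimal code on the numeric keypad".
def alt_code(code: int, codepage: str) -> str:
    """Return the character a one-byte code stands for in a codepage."""
    if not 0 <= code <= 255:
        raise ValueError("an ALT code must fit in one byte (0-255)")
    return bytes([code]).decode(codepage)


for cp in ("cp437", "cp850"):
    ch = alt_code(157, cp)
    # Print the Unicode name so the output is plain ASCII.
    print(cp, unicodedata.name(ch))
```

The same byte value comes back as the yen sign under 437 and as the capital O with stroke under 850, while a code below 128 such as 65 gives the same letter in both.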

To see which codepage is in use you can simply type CHCP (Change Codepage) without any parameters at an OS/2 or DOS prompt. To change to a particular codepage, entering CHCP with the codepage number as parameter is all that is necessary. NOTE: Only one codepage is active at a time; it applies to that session, and all programs started from that session inherit that session's codepage. It is, however, only possible to switch between the codepages specified in CONFIG.SYS. The most common codepages used in OS/2 are 437, which is the codepage for the United States, 850, which is Latin1 - Multilingual, and 1004, Windows Extended. Which codepage you should use depends first on your locale and then on what files you are using. I use 850 as the default codepage and 1004 as the alternate codepage so that I can switch to files from Windows when necessary. In Fig 3 the three codepages are listed together with the Unicode designation. Colours are used to indicate where the decimal code is the same but the characters differ.




Fig 3 - Codepages
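A rough equivalent of the colour marking in Fig 3 can be computed for two of the three codepages. This is only a sketch using Python's cp437 and cp850 codecs; the function name differing_codes is hypothetical, and codepage 1004 is not covered.

```python
# Illustrative sketch: differing_codes is a made-up helper name.
def differing_codes(cp_a: str, cp_b: str) -> list:
    """List the byte values that decode to different characters."""
    diffs = []
    for code in range(256):
        raw = bytes([code])
        if raw.decode(cp_a) != raw.decode(cp_b):
            diffs.append(code)
    return diffs


diffs = differing_codes("cp437", "cp850")
print(len(diffs), "codes differ between 437 and 850")
print("157 differs:", 157 in diffs)
```

All the differences fall in the upper half of the table, which is why plain ASCII text survives a codepage switch unchanged.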

But what if you want to change the codepage of a PM program that is already running? One solution is to use Code Page Pal (cppal30.exe). It can be downloaded from Hobbes.

Code Page Pal

This program will allow you to change both the PM message queue code page and the process code page to any of the available pages. It also allows you to query the code page of any PM program. It should be said that this program is still a beta, although I found it to function perfectly correctly in use. I will only deal with this program briefly as it has been written up in OS/2 e-Zine before. Its use is simplicity itself. To query a PM window just drag the get icon, using the right-hand mouse button, from the CpPal window (shown in Fig 4) to a PM window. Your mouse pointer will change during this operation to an arrow with a question mark.


Fig 4 - Code Page Pal

Releasing the mouse will result in the following information being displayed on the line Query in CpPal.

  • Pid - The Process ID of the PM program being queried.
  • Cp - The process code page being used by the PM program being queried.
  • Tid - The thread ID of the PM program being queried.
  • PM - The message queue code page of the PM program being queried.
Setting a code page is done by first choosing the PM code page from any one of a hundred different code pages, and the process code page from one of the two code pages defined in CONFIG.SYS, and then dragging the set icon to the PM program to be changed. The shell process (pmshell #1), the WPS, and CpPal itself are protected by CpPal from having their codepages changed. In these cases the mouse pointer changes to an invalid symbol (which looks like a Ө but slightly rotated) plus an arrow. Since version 0.30 was released in November 2001 there have not been any updates, although the author indicated that IF THERE IS SUFFICIENT INTEREST he would continue with its development. So it's up to all of us to encourage Rich Walsh to continue with this program.

Control Codes and CP/M


Fig 5 - ASCII Control Codes

If you think that you can now freely use the ALT key to enter characters, taking the codepage into consideration, I am afraid I will have to disappoint you. There are still a few limitations. The most obvious is that in order to display the character you enter, the character set itself must support (have) the character. The next is that some programs interpret these codes as a command. For example, in Open Office for OS/2 the ALT-key code 4 results in moving the cursor one position to the left. This should not be confused with the shortcut keys where, for example, in order to open the File Menu, first ALT is pressed and then the F. The other limitation has to do with the DOS and OS/2 prompt.

The DOS and OS/2 prompt

You can see that there is a difference between the ASCII character set in Fig 1 and that in Fig 5 on the left, which also lists the ASCII character set but only from decimal 00 to 31. There are no characters displayed, only the control codes and their abbreviations. You are probably wondering why this list is here. The reason is that entering any of these values at a DOS or OS/2 prompt has a different effect than in a PM program. Some of the codes shown are relatively clear as they directly relate to keyboard keys such as Escape, TAB, and Backspace, while others relate to functions originating in the pre-history of computing such as Start of Text, BELL, Linefeed, Carriage Return etc. Other codes such as Acknowledge and Negative Acknowledge are still used today in data communication.

You will notice in this table (Fig 5) that there is also a column which lists the Control codes. These codes can be entered simply by using the Control (Ctrl) key. This is yet another way of entering data. Holding down the Ctrl key has the effect of zeroing the 3 most significant bits. So an A, which is hex 41, becomes 01, the same as ALT 1. In most cases the display shows the Ctrl code in the form of a caret (^) followed by the character, where the caret represents Ctrl. Entering Ctrl-A will be displayed as ^A.
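The bit arithmetic behind the Ctrl key can be checked with a few lines of Python. This is a sketch of the rule as described here (clear the top three bits of an 8-bit code), not of any keyboard driver; the function names ctrl_code and caret_notation are hypothetical.

```python
# Illustrative sketch: both function names are made up for this example.
def ctrl_code(key: str) -> int:
    """Clear the 3 most significant bits of the key's 8-bit code."""
    return ord(key.upper()) & 0x1F


def caret_notation(code: int) -> str:
    """Show a control code (1-31) in ^X form, as the prompt echoes it."""
    if not 1 <= code <= 31:
        raise ValueError("caret notation here covers codes 1 to 31 only")
    return "^" + chr(code | 0x40)


for key in "ACHIMZ[":
    code = ctrl_code(key)
    print(f"Ctrl-{key} -> {code:2d} (hex {code:02X}) shown as {caret_notation(code)}")
```

Run over the keys listed in this article, it gives 8 for Ctrl-H (Backspace), 9 for Ctrl-I (TAB), 13 for Ctrl-M (Carriage Return), 26 for Ctrl-Z and 27 for Ctrl-[ (Escape).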

When codes in the range 1 to 31 (decimal) are entered (it is not possible to enter 0), some of them will have the following effect or action:

Ctrl-C this character is used to terminate a running program. This can be a batch (BAT) program, a command (CMD) program or a VIO application. Termination of a running program in a VDM is dependent on the DOS properties setting “DOS_BREAK”. If it is off then it is only possible to terminate the program when there is either keyboard input or screen output. A simple test demonstrates this with the following two-line batch program called test.bat:

echo off
test
First check in the properties of this program that, in the tab Session -> DOS properties, DOS_BREAK is set to OFF. Run the program. Entering Ctrl-C has no effect. If you remove the echo statement you will now be able to terminate the program, as there is now screen output. If DOS_BREAK is on then Ctrl-C always works. Ctrl-C is always active in an OS/2 VIO program.

Ctrl-H this character has the same function as the backspace key.

Ctrl-I this character has the same function as the TAB key.

Ctrl-M this character has the same function as the Enter key. The operating system, however, translates this into two actions: line feed and carriage return.

Ctrl-W this character has the same function as the Backspace key.

Ctrl-Z this character is used to indicate end of input or end of file. To enter the batch program above, without using an editor, the simplest way is to type the following at a DOS or OS/2 prompt: (don't forget to press ENTER at every new line)

COPY CON test.bat
echo off
test
Ctrl-Z
It is sometimes quicker to use this method than any editor for just a few lines of text. Correcting typos is limited to the current line, in the same way you would correct any command line input.

Ctrl-[ this character has the same function as the Esc key.

There is one other Control code that I would like to bring to your attention. It is a leftover from CP/M (which by the way was created by Gary Kildall, who founded Digital Research). CP/M was used in the pre-PC era on microcomputers such as the Commodore 128 and ZX Spectrum. In 1981 the original version of MS-DOS that was used on the first IBM PCs (it was a renamed version of QDOS) was purchased by an upstart company called Microsoft for less than $100,000 from Seattle Computer Products!

Ctrl-P (This only applies to DOS command prompts!)

This character turns the printer echo function on or off. When the printer echo function is turned on, data displayed on the screen is also output to the printer. The printer echo function is turned on by pressing Ctrl-P once, and is turned off when Ctrl-P is pressed a second time. Before using this command make sure the printer is properly connected to your parallel port, is on-line and has paper, as otherwise the session can hang without any error message. This can be handy if you want to record keyboard input (assuming it is echoed) and screen output, for example in a batch program whose output otherwise flashes past so fast on the screen that it is impossible to follow. Most printers nowadays only react after either the complete page has been filled or a new page command is given. To give a new page command simply send an FF (Form Feed/New Page - Ctrl-L) as follows:

COPY CON LPT1
Ctrl-L
Ctrl-Z

There is no way of checking the state of the Ctrl-P toggle. Fortunately it exists only in the DOS session in which it was set. This is again one of the few characters that is not echoed on the screen. Although it has no function in an OS/2 VIO session, it is not echoed there either.

WYSI NOT WYG

Sometimes what you input is not what is output. This can be demonstrated by entering Ctrl-D, Ctrl-E, Ctrl-F, and Ctrl-G into a file. Terminate the input with Ctrl-Z <ENTER>.

On the screen you will see ^D^E^F^G, which you would expect to be displayed as ♦♣♠•

Now using the TYPE command to display the file just created to the screen you will see only the following:

♦♣♠

and hear a beep. The code Ctrl-G is the bell code, which when displayed in this fashion results in a beep being sounded. Only a limited number of codes can be displayed as characters, and when displayed they will have the same character as listed in Fig 1.
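The behaviour just described can be modelled in a few lines. This is only a toy model of what TYPE shows for these four codes, based on the description above, not a reimplementation of the command; the name simulate_type and the small glyph table (codes 4 to 6 of codepage 437) are assumptions made for the example.

```python
from typing import Tuple

# Glyphs that codepage 437 shows for codes 4, 5 and 6.
# Only the codes used in this example are listed.
CP437_LOW_GLYPHS = {4: "\u2666", 5: "\u2663", 6: "\u2660"}  # diamond, club, spade
BEL = 7  # Ctrl-G, the bell code


# Illustrative sketch: simulate_type is a made-up name for a toy model.
def simulate_type(data: bytes) -> Tuple[str, int]:
    """Return (visible text, number of beeps) for a run of bytes."""
    shown = []
    beeps = 0
    for value in data:
        if value == BEL:
            beeps += 1            # sounded, not drawn
        elif value in CP437_LOW_GLYPHS:
            shown.append(CP437_LOW_GLYPHS[value])
        else:
            shown.append(chr(value))
    return "".join(shown), beeps


visible, beeps = simulate_type(bytes([4, 5, 6, 7]))
print(len(visible), "glyphs shown,", beeps, "beep")
```

Feeding it the four bytes entered above yields three visible glyphs and one beep, matching what appears on screen.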

Unicode

Within OS/2, characters are translated from one character set to another using codepages and translation tables. While CONFIG.SYS is being processed you may have noticed the following message displayed as one of the last lines before OS/2 switches from VIO to the PM:

Unicode translate table for CP XXX loaded

(where XXX is the initial codepage specified in the CODEPAGE statement)

So what is Unicode (for more info see the Unicode web site) and why is it in OS/2? Well, as shown above, a number can represent many different characters depending on the codepage, which is a problem. This is different in Unicode, which is based on ASCII but assigns a unique number (called a code point) to each and every character. It is also a standard, is cross-platform and is supported by a number of major players such as Apple, HP, IBM, Microsoft and Sun. So by having data in Unicode, ambiguity is removed. In OS/2 various programs use Unicode. For example JFS (Journalled File System) uses Unicode to store names, as does VFAT. NOTE: If you are using VFAT and change codepages you might need to rerun CACHEF32.EXE, as this loads a codepage to Unicode translate table for long names! Code conversion is used to convert Unicode to your codepage. This is accomplished with the help of the CONFIG.SYS line:

DEVICE=X:\OS2\BOOT\UNICODE.SYS
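The idea of using Unicode as the unambiguous middle step can be sketched in Python. This says nothing about how UNICODE.SYS works internally; it only illustrates converting a byte from one codepage to another via its code point. The function name transcode is hypothetical.

```python
# Illustrative sketch: transcode is a made-up helper name.
def transcode(data: bytes, source_cp: str, target_cp: str) -> bytes:
    """Convert bytes between codepages with Unicode as the pivot."""
    text = data.decode(source_cp)   # codepage -> Unicode code points
    return text.encode(target_cp)   # Unicode code points -> codepage


raw = bytes([157])
print("437 code point: U+%04X" % ord(raw.decode("cp437")))
print("850 code point: U+%04X" % ord(raw.decode("cp850")))
print("yen sign in 850 is code", transcode(raw, "cp437", "cp850")[0])
```

Byte 157 maps to two different code points depending on the codepage, and the yen sign entered under 437 ends up at a different decimal code in 850. Going the other way with byte 157 from 850 would fail, since 437 has no Ø.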

Character Map/2

Sometimes it can be handy to enter a character using the methods above, but if you can't remember the code there is a handy freeware program called Character Map/2 (see Fig 6). It can be found together with the source code at the author's web site. As shown in Fig 6 it is possible to select both the font and the codepage, and then the character to be copied to the clipboard by double clicking it. Pressing the “Specs” button opens a secondary window giving details about the selected font. The only minor disadvantage I find with this program is that every time I use it I have to reselect the font and the codepage, as it automatically reverts to the first alphabetically available font and to codepage 850. The author, Dmitry Steklenev, indicated that he would add this change to the next version of Character Map/2. A quick selection of a font by typing the first letter of the required font in the Font selection window is however possible. The first entry starting with that character is then selected. Moving up or down the list is then a question of using either the cursor up or down key or the pull-down menu. Don't forget to thank Dmitry if you use his program.


Fig 6

In the concluding part of this article I will be looking at ways to redefine your keyboard and how I made use of those additional multimedia keys.




Copyright (C) 2004. All Rights Reserved.