The ASCII Code and the POSIX Character Classes



Binary

00000000 nul  00000001 soh  00000010 stx  00000011 etx  00000100 eot  00000101 enq  00000110 ack  00000111 bel
00001000 bs   00001001 tab  00001010 nl   00001011 vt   00001100 ff   00001101 cr   00001110 so   00001111 si
00010000 dle  00010001 dc1  00010010 dc2  00010011 dc3  00010100 dc4  00010101 nak  00010110 syn  00010111 etb
00011000 can  00011001 em   00011010 sub  00011011 esc  00011100 fs   00011101 gs   00011110 rs   00011111 us
00100000 sp   00100001 !    00100010 "    00100011 #    00100100 $    00100101 %    00100110 &    00100111 '
00101000 (    00101001 )    00101010 *    00101011 +    00101100 ,    00101101 -    00101110 .    00101111 /
00110000 0    00110001 1    00110010 2    00110011 3    00110100 4    00110101 5    00110110 6    00110111 7
00111000 8    00111001 9    00111010 :    00111011 ;    00111100 <    00111101 =    00111110 >    00111111 ?
01000000 @    01000001 A    01000010 B    01000011 C    01000100 D    01000101 E    01000110 F    01000111 G
01001000 H    01001001 I    01001010 J    01001011 K    01001100 L    01001101 M    01001110 N    01001111 O
01010000 P    01010001 Q    01010010 R    01010011 S    01010100 T    01010101 U    01010110 V    01010111 W
01011000 X    01011001 Y    01011010 Z    01011011 [    01011100 \    01011101 ]    01011110 ^    01011111 _
01100000 `    01100001 a    01100010 b    01100011 c    01100100 d    01100101 e    01100110 f    01100111 g
01101000 h    01101001 i    01101010 j    01101011 k    01101100 l    01101101 m    01101110 n    01101111 o
01110000 p    01110001 q    01110010 r    01110011 s    01110100 t    01110101 u    01110110 v    01110111 w
01111000 x    01111001 y    01111010 z    01111011 {    01111100 |    01111101 }    01111110 ~    01111111 del


Octal

000 nul  001 soh  002 stx  003 etx  004 eot  005 enq  006 ack  007 bel
010 bs   011 tab  012 nl   013 vt   014 ff   015 cr   016 so   017 si
020 dle  021 dc1  022 dc2  023 dc3  024 dc4  025 nak  026 syn  027 etb
030 can  031 em   032 sub  033 esc  034 fs   035 gs   036 rs   037 us
040 sp   041 !    042 "    043 #    044 $    045 %    046 &    047 '
050 (    051 )    052 *    053 +    054 ,    055 -    056 .    057 /
060 0    061 1    062 2    063 3    064 4    065 5    066 6    067 7
070 8    071 9    072 :    073 ;    074 <    075 =    076 >    077 ?
100 @    101 A    102 B    103 C    104 D    105 E    106 F    107 G
110 H    111 I    112 J    113 K    114 L    115 M    116 N    117 O
120 P    121 Q    122 R    123 S    124 T    125 U    126 V    127 W
130 X    131 Y    132 Z    133 [    134 \    135 ]    136 ^    137 _
140 `    141 a    142 b    143 c    144 d    145 e    146 f    147 g
150 h    151 i    152 j    153 k    154 l    155 m    156 n    157 o
160 p    161 q    162 r    163 s    164 t    165 u    166 v    167 w
170 x    171 y    172 z    173 {    174 |    175 }    176 ~    177 del


Decimal

00 nul   01 soh   02 stx   03 etx   04 eot   05 enq   06 ack   07 bel
08 bs    09 tab   10 nl    11 vt    12 ff    13 cr    14 so    15 si
16 dle   17 dc1   18 dc2   19 dc3   20 dc4   21 nak   22 syn   23 etb
24 can   25 em    26 sub   27 esc   28 fs    29 gs    30 rs    31 us
32 sp    33 !     34 "     35 #     36 $     37 %     38 &     39 '
40 (     41 )     42 *     43 +     44 ,     45 -     46 .     47 /
48 0     49 1     50 2     51 3     52 4     53 5     54 6     55 7
56 8     57 9     58 :     59 ;     60 <     61 =     62 >     63 ?
64 @     65 A     66 B     67 C     68 D     69 E     70 F     71 G
72 H     73 I     74 J     75 K     76 L     77 M     78 N     79 O
80 P     81 Q     82 R     83 S     84 T     85 U     86 V     87 W
88 X     89 Y     90 Z     91 [     92 \     93 ]     94 ^     95 _
96 `     97 a     98 b     99 c     100 d    101 e    102 f    103 g
104 h    105 i    106 j    107 k    108 l    109 m    110 n    111 o
112 p    113 q    114 r    115 s    116 t    117 u    118 v    119 w
120 x    121 y    122 z    123 {    124 |    125 }    126 ~    127 del


Hexadecimal

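00 nul  01 soh  02 stx  03 etx  04 eot  05 enq  06 ack  07 bel
08 bs   09 tab  0a nl   0b vt   0c ff   0d cr   0e so   0f si
10 dle  11 dc1  12 dc2  13 dc3  14 dc4  15 nak  16 syn  17 etb
18 can  19 em   1a sub  1b esc  1c fs   1d gs   1e rs   1f us
20 sp   21 !    22 "    23 #    24 $    25 %    26 &    27 '
28 (    29 )    2a *    2b +    2c ,    2d -    2e .    2f /
30 0    31 1    32 2    33 3    34 4    35 5    36 6    37 7
38 8    39 9    3a :    3b ;    3c <    3d =    3e >    3f ?
40 @    41 A    42 B    43 C    44 D    45 E    46 F    47 G
48 H    49 I    4a J    4b K    4c L    4d M    4e N    4f O
50 P    51 Q    52 R    53 S    54 T    55 U    56 V    57 W
58 X    59 Y    5a Z    5b [    5c \    5d ]    5e ^    5f _
60 `    61 a    62 b    63 c    64 d    65 e    66 f    67 g
68 h    69 i    6a j    6b k    6c l    6d m    6e n    6f o
70 p    71 q    72 r    73 s    74 t    75 u    76 v    77 w
78 x    79 y    7a z    7b {    7c |    7d }    7e ~    7f del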
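
The four tables list the same 128 codes in different bases. As a minimal sketch of how any entry can be reproduced, the following uses Python's built-in `ord` and `format`; the helper name `ascii_row` is hypothetical and chosen only for illustration.

```python
def ascii_row(ch):
    """Return the code of one ASCII character in the four bases tabulated above."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not an ASCII character")
    return {
        "binary": format(code, "08b"),   # eight binary digits
        "octal": format(code, "03o"),    # three octal digits
        "decimal": format(code, "02d"),  # at least two decimal digits
        "hex": format(code, "02x"),      # two lower-case hex digits
    }


print(ascii_row("A"))
# {'binary': '01000001', 'octal': '101', 'decimal': '65', 'hex': '41'}
```

The format widths are picked to match the zero-padding used in the tables.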


The POSIX Classes

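The basic POSIX character classes are shown by color-coding as follows, with the hexadecimal codes each one covers:

Control Characters    [:cntrl:]    0x00-0x1F, 0x7F
Space                              0x20
Punctuation           [:punct:]    0x21-0x2F, 0x3A-0x40, 0x5B-0x60, 0x7B-0x7E
Digits                [:digit:]    0x30-0x39
Upper Case Letters    [:upper:]    0x41-0x5A
Lower Case Letters    [:lower:]    0x61-0x7A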

Notice that the space character stands on its own and is not included in any basic class.

Most of the control characters should not appear in normal text. The ones that are likely to are:

0x09    TAB    horizontal tab
0x0A    NL     newline/linefeed
0x0D    CR     carriage return

The usual derived classes are as follows.

Class         Definition
[:alpha:]     [:upper:] ∪ [:lower:]
[:alnum:]     [:alpha:] ∪ [:digit:]
[:xdigit:]    [:digit:] ∪ [AaBbCcDdEeFf]
[:graph:]     [:alnum:] ∪ [:punct:]
[:print:]     [:graph:] ∪ Space
[:blank:]     Space ∪ Tab
[:space:]     [:blank:] ∪ [NL VT FF CR]
[:word:]      [:alnum:] ∪ Underscore

All but [:word:] are defined in the POSIX standard. [:word:] is not a POSIX class (pace the bash manual) but reflects the fact that in quite a few programming languages the characters in this class are those permitted in identifiers.
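
The definitions above translate directly into set operations. The following is a rough sketch for the ASCII range only, not a locale-aware implementation. It assumes punctuation is whatever remains once the control characters, space, digits and letters are removed, which matches the statement that space is the only character outside the basic classes. The names `CLASSES` and `classes_of` are hypothetical.

```python
# Basic classes as sets of ASCII codes.
cntrl = set(range(0x00, 0x20)) | {0x7F}
space_char = {0x20}
digit = set(range(ord("0"), ord("9") + 1))
upper = set(range(ord("A"), ord("Z") + 1))
lower = set(range(ord("a"), ord("z") + 1))
# Assumption: punctuation is everything left over in the ASCII range.
punct = set(range(128)) - cntrl - space_char - digit - upper - lower

# Derived classes, following the table of definitions.
alpha = upper | lower
alnum = alpha | digit
xdigit = digit | {ord(c) for c in "AaBbCcDdEeFf"}
graph = alnum | punct
print_ = graph | space_char
blank = space_char | {0x09}                   # space and tab
space = blank | {0x0A, 0x0B, 0x0C, 0x0D}      # plus NL, VT, FF, CR
word = alnum | {ord("_")}                     # not a POSIX class

CLASSES = {
    "cntrl": cntrl, "punct": punct, "digit": digit, "upper": upper,
    "lower": lower, "alpha": alpha, "alnum": alnum, "xdigit": xdigit,
    "graph": graph, "print": print_, "blank": blank, "space": space,
    "word": word,
}


def classes_of(ch):
    """Return the sorted names of every class that contains the character."""
    return sorted(name for name, members in CLASSES.items() if ord(ch) in members)


print(classes_of("a"))   # ['alnum', 'alpha', 'graph', 'lower', 'print', 'word', 'xdigit']
print(classes_of("_"))   # ['graph', 'print', 'punct', 'word']
print(classes_of("\t"))  # ['blank', 'cntrl', 'space']
```

Because the basic classes are disjoint, each character belongs to at most one of `cntrl`, `punct`, `digit`, `upper` and `lower`; all other memberships come from the derived classes.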

The principle governing the classification of characters outside the ASCII range is that the structure of the system as applied to ASCII must be maintained, except that additional classes may be created. The rules for the derived classes must continue to hold, and the basic classes must remain disjoint.


The Control Characters

The first 32 characters (0-31 decimal, 000-037 octal, 0x00-0x1F hex) together with the last character DEL (decimal 127, octal 177, hex 0x7F) are the "control characters". These were originally used to control teletype machines. Only a few of them are meaningful to most devices in use today. However, they are often used for other purposes, for example as commands to programs. When you press the control key on a keyboard at the same time as one of the letters, the code sent to the computer is the corresponding control code. That is, CTRL-A sends 001, CTRL-B sends 002, CTRL-C sends 003, etc.

Control characters are sometimes referred to by names like "Control-A", also written "Ctrl-A" or "^A". The correspondence is as follows: the null character, 0x00, is designated "Control-@". 0x01 is "Control-A", 0x02 is "Control-B", and so on through 0x1A, which is "Control-Z". 0x1B is "Control-[", 0x1C "Control-\", 0x1D "Control-]", 0x1E "Control-^", and 0x1F "Control-_". In other words, the control characters 0x00-0x1F are regarded as "Control" versions of the range 0x40-0x5F: each control code is the code of the corresponding character minus 0x40. DEL (0x7F) lies outside this range and is conventionally written "^?".
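
The control-key correspondence is simple arithmetic: a control code is the code of a character in the range 0x40-0x5F minus 0x40. The sketch below illustrates this under two assumptions that are not taken from the text: lower-case letters are folded to upper case, and DEL is rendered as "^?" following common caret notation. The function names `ctrl` and `caret` are hypothetical.

```python
def ctrl(ch):
    """Return the control code produced by Control plus the given character."""
    code = ord(ch.upper()) - 0x40   # assumption: lower-case letters are folded to upper case
    if not 0x00 <= code <= 0x1F:
        raise ValueError("no control code for %r" % ch)
    return code


def caret(code):
    """Return the caret notation (for example '^A') for a control code."""
    if code == 0x7F:
        return "^?"                 # assumption: common convention for DEL
    if not 0x00 <= code <= 0x1F:
        raise ValueError("not a control character")
    return "^" + chr(code + 0x40)


print(ctrl("A"), ctrl("c"), ctrl("["))   # 1 3 27
print(caret(0x09), caret(0x1B))          # ^I ^[
```

So TAB (0x09) is Control-I and ESC (0x1B) is Control-[, in agreement with the tables.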

The original meanings of the control characters are as follows:

ACK    acknowledge
BEL    bell - rings the bell
BS     backspace - moves the cursor or print head back one space
CAN    cancel
CR     carriage return - moves the cursor or print head back to the beginning of the line
DC1    device control 1
DC2    device control 2
DC3    device control 3
DC4    device control 4
DLE    data link escape
EM     end of medium
ENQ    enquiry
EOT    end of transmission
ESC    escape
ETB    end of transmission block
ETX    end of text
FF     form feed - advances the paper to the top of the next page
FS     file separator
GS     group separator
NAK    negative acknowledge
NL     newline, also known as LF "line feed" - originally moved the print head or cursor to the next line
NUL    null
RS     record separator
SI     shift in - switches output device back to default character set
SO     shift out - switches output device to alternate character set
SOH    start of heading
STX    start of text
SUB    substitute
SYN    synchronous idle
TAB    horizontal tab
US     unit separator
VT     vertical tab


