ASCII (American Standard Code for Information Interchange)

ASCII (American Standard Code for Information Interchange) is a character encoding standard that assigns numeric codes to the characters used in computers and electronic communication. Here's a brief overview of ASCII characters:

1. Printable Characters: These are the standard characters that can be printed and displayed. They include uppercase and lowercase letters, digits, punctuation marks, and special symbols such as @, #, $, %, etc.

2. Control Characters: These are non-printable characters used to control peripheral devices such as printers and terminals. Examples include carriage return (CR), line feed (LF), tab (TAB), and escape (ESC).

3. Extended ASCII Characters: Extended ASCII characters are additional characters beyond the standard ASCII set, typically used for specific languages or symbols. These include accented letters, currency symbols, and graphical characters.

Each ASCII character is represented by a 7-bit binary number, which allows for a total of 128 (2^7) possible characters. The ASCII standard has been extended to include additional characters and variations, leading to standards such as ISO 8859 and UTF-8, which support a wider range of characters and languages.
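As a small illustration of the 7-bit range, the following Python sketch prints the code and 7-bit binary form of a few characters. The helper name `is_ascii` is made up for this example and is not part of any standard discussed here.

```python
# Every ASCII code fits in 7 bits, so valid codes run from 0 to 127.
for ch in ["A", "a", "0", "~"]:
    code = ord(ch)                      # numeric code of the character
    print(ch, code, format(code, "07b"))  # zero-padded 7-bit binary


def is_ascii(text: str) -> bool:
    """Return True if every character has a code below 128 (2^7)."""
    return all(ord(c) < 128 for c in text)


print(is_ascii("Hello"))  # plain English text stays within 7 bits
print(is_ascii("café"))   # the accented letter falls outside ASCII
```

A check of this kind can be handy before writing text to a system that accepts only 7-bit data.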

Here are some examples of ASCII characters:

- Uppercase letters: A, B, C, …, Z
- Lowercase letters: a, b, c, …, z
- Digits: 0, 1, 2, …, 9
- Punctuation marks: ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~
- Control characters: CR (carriage return), LF (line feed), TAB (tab), ESC (escape), etc.

It's important to note that ASCII characters are limited to the English alphabet and basic symbols. For languages with different character sets, other encoding standards like Unicode are used.

DEC OCT HEX BIN Symbol Description
0 0 0 0 NUL Null character
1 1 1 1 SOH Start of Heading
2 2 2 10 STX Start of Text
3 3 3 11 ETX End of Text
4 4 4 100 EOT End of Transmission
5 5 5 101 ENQ Enquiry
6 6 6 110 ACK Acknowledge
7 7 7 111 BEL Bell, Alert
8 10 8 1000 BS Backspace
9 11 9 1001 HT Horizontal Tab
10 12 0A 1010 LF Line Feed
11 13 0B 1011 VT Vertical Tabulation
12 14 0C 1100 FF Form Feed
13 15 0D 1101 CR Carriage Return
14 16 0E 1110 SO Shift Out
15 17 0F 1111 SI Shift In
16 20 10 10000 DLE Data Link Escape
17 21 11 10001 DC1 Device Control One (XON)
18 22 12 10010 DC2 Device Control Two
19 23 13 10011 DC3 Device Control Three (XOFF)
20 24 14 10100 DC4 Device Control Four
21 25 15 10101 NAK Negative Acknowledge
22 26 16 10110 SYN Synchronous Idle
23 27 17 10111 ETB End of Transmission Block
24 30 18 11000 CAN Cancel
25 31 19 11001 EM End of Medium
26 32 1A 11010 SUB Substitute
27 33 1B 11011 ESC Escape
28 34 1C 11100 FS File Separator
29 35 1D 11101 GS Group Separator
30 36 1E 11110 RS Record Separator
31 37 1F 11111 US Unit Separator
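
The columns of the table above can be regenerated from the decimal code alone. The sketch below is one way to do it in Python; the `NAMES` list and the `control_row` helper are illustrative, and the zero-padded column widths are a formatting choice rather than part of the standard.

```python
# Abbreviations for the 32 control characters with codes 0 to 31.
NAMES = [
    "NUL", "SOH", "STX", "ETX", "EOT", "ENQ", "ACK", "BEL",
    "BS", "HT", "LF", "VT", "FF", "CR", "SO", "SI",
    "DLE", "DC1", "DC2", "DC3", "DC4", "NAK", "SYN", "ETB",
    "CAN", "EM", "SUB", "ESC", "FS", "GS", "RS", "US",
]


def control_row(code: int) -> str:
    """Format one row: decimal, octal, hex, 7-bit binary, abbreviation."""
    return f"{code:3d} {code:03o} {code:02X} {code:07b} {NAMES[code]}"


print("DEC OCT HEX BIN     Symbol")
for code in range(32):
    print(control_row(code))
```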

The special characters serve various purposes in computing, communication, and text formatting. Here's an explanation of some of the most commonly encountered special ASCII characters:

1. Control Characters:

  1. Null (NUL): Represents the null character, often used as a string terminator.
  2. Start of Heading (SOH): Used to mark the beginning of a transmission header.
  3. Start of Text (STX): Marks the start of a text transmission.
  4. End of Text (ETX): Marks the end of a text transmission.
  5. End of Transmission (EOT): Indicates the end of a transmission.
  6. Enquiry (ENQ): A request for acknowledgment from the receiver.
  7. Acknowledgment (ACK): Confirms successful receipt of a transmission.
  8. Bell (BEL): Produces an audible or visual alert, often a beep sound.
  9. Backspace (BS): Moves the cursor back one position, typically used for text editing.
  10. Horizontal Tab (HT): Moves the cursor to the next tab stop.
  11. Line Feed (LF): Moves the cursor to the next line.
  12. Vertical Tab (VT): Similar to a line feed but moves the cursor to the next vertical tab stop.
  13. Form Feed (FF): Advances the paper to the next page or form feed position in printers.
  14. Carriage Return (CR): Moves the cursor to the beginning of the line.
  15. Shift Out (SO): Switches to an alternate character set.
  16. Shift In (SI): Returns to the default character set after Shift Out.

2. Escape Character:

  1. Escape (ESC): Used to initiate escape sequences for controlling terminal behavior, formatting, and special functions.

3. Punctuation and Symbols:

  1. Exclamation Mark (!): Used for emphasis, factorial notation, and logical negation.
  2. Quotation Marks ("): Encloses text to indicate speech or quotation.
  3. Number Sign or Hash (#): Used for numbering, as a comment indicator, and in programming languages.
  4. Dollar Sign ($) and Percent Sign (%): Commonly used in financial contexts and mathematical expressions.
  5. Ampersand (&): Represents the “and” conjunction.
  6. Apostrophe ('): Indicates possession or contraction in English.
  7. Parentheses (()) : Used for grouping, enclosures, and mathematical expressions.
  8. Asterisk (*): Multiplication operator, wildcard character, and used for emphasis.
  9. Plus (+) and Minus (-): Addition and subtraction operators respectively.
  10. Comma (,): Separates items in a list or indicates a pause in writing.
  11. Period or Full Stop (.): Marks the end of a sentence or abbreviation.
  12. Colon (:): Used in punctuation and to denote ratios, time, and divisions in text.
  13. Semicolon (;): Used to separate clauses in a sentence or items in a list.
  14. Question Mark (?): Indicates a question or uncertainty.
  15. Slash (/): Used in fractions, URLs, directory paths, and division.
  16. Backslash (\): Used to escape characters or denote file paths in computing.
  17. Vertical Bar or Pipe (|): Used as a logical OR operator and to separate items.
  18. Tilde (~): Represents approximation, home directory in Unix-like systems, and negation in logic.

4. Other Characters:

  1. Space: Represents a space between words or characters.
  2. At Sign (@): Commonly used in email addresses and to indicate online presence or usernames.
  3. Underscore (_) and Hyphen (-): Used in naming conventions, as word separators, and in URLs.
  4. Curly Braces ({}) and Square Brackets ([]): Used for grouping, data structures, and denoting scope in programming.
  5. Less Than (<) and Greater Than (>): Comparison operators and used in HTML tags.
  6. Angle Brackets (<>): The less-than and greater-than characters used as a pair, for example to enclose HTML tags; individually they also serve as file redirection operators in command shells.

These special ASCII characters have various applications in computer programming, text processing, communication protocols, and data representation. Understanding their meanings and uses is essential for effective communication and programming.
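
To make the groupings above concrete, here is a rough Python sketch that sorts characters into broad classes by code. The function name `classify` and its category labels are invented for this example; they loosely follow the groups described above rather than any official classification.

```python
def classify(ch: str) -> str:
    """Place a single character into a broad ASCII category."""
    code = ord(ch)
    if code > 127:
        return "non-ASCII"
    if code < 32 or code == 127:   # codes 0-31 plus DEL (127)
        return "control"
    if ch == " ":
        return "space"
    if ch.isalnum():               # letters A-Z, a-z and digits 0-9
        return "alphanumeric"
    return "punctuation"           # everything else that is printable


# A short sample containing a tab, a CR/LF pair, and an escape character.
sample = "Hi!\tOK\r\n\x1b"
for ch in sample:
    print(repr(ch), ord(ch), classify(ch))
```

Using `repr` when printing makes the control characters visible as escape sequences such as `'\t'` and `'\x1b'` instead of acting on the terminal.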

Extended ASCII refers to ASCII character sets that use 8 bits (1 byte) instead of the original 7 bits, thus allowing for 256 (2^8) possible characters. The standard ASCII character set only utilizes 7 bits, which allows for 128 characters, including control characters, alphanumeric characters, punctuation marks, and symbols.

With extended ASCII, the additional bit provides room for an additional 128 characters, enabling the inclusion of accented characters, additional symbols, and other special characters beyond the standard ASCII set.

There are several variations of extended ASCII character sets, with different characters mapped to the extra 128 values. However, there is no single standard for extended ASCII, leading to compatibility issues between different systems and languages.
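
The compatibility problem can be seen by decoding the same byte under different 8-bit encodings. The sketch below uses three codecs that ship with Python (`latin-1`, `cp437`, `cp1252`); the choice of byte 0xE9 and the helper name `decode_with` are only for illustration.

```python
def decode_with(raw: bytes, codecs: tuple) -> dict:
    """Decode the same bytes under each codec and collect the results."""
    return {name: raw.decode(name) for name in codecs}


raw = bytes([0xE9])  # one byte above the 7-bit ASCII range
results = decode_with(raw, ("latin-1", "cp437", "cp1252"))
for name, text in results.items():
    print(name, repr(text))
```

The byte is identical in every case; only the table used to interpret it differs, which is why text exchanged without a declared encoding can display incorrectly.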

One common extended ASCII character set is ISO 8859-1 (also known as Latin-1), which is widely used for Western European languages. ISO 8859-1 includes characters such as accented letters (e.g., á, é, í, ó, ú), special symbols (e.g., ©, ®, £), and additional punctuation marks. It does not include the euro sign (€), which was added later in ISO 8859-15 and in Windows-1252.

Here's a brief overview of the ISO 8859-1 (Latin-1) extended ASCII character set:

- Characters 0-127: Same as standard ASCII (basic Latin characters).
- Characters 128-159: Reserved for control characters (the C1 set); no printable characters are assigned here.
- Characters 160-255: Printable characters, including accented letters, currency symbols, and other special characters.

It's important to note that extended ASCII character sets are limited in their support for different languages and scripts. For more comprehensive character encoding and support for a wider range of languages, Unicode has become the standard. Unicode assigns a code point to characters from virtually all writing systems, far beyond the scope of extended ASCII, and defines several encodings for storing those code points. UTF-8, a variable-length Unicode encoding that uses one to four bytes per character, has become particularly popular because it is backward compatible with ASCII and stores ASCII text in one byte per character.
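
The ASCII compatibility of UTF-8 can be checked directly. In this minimal Python sketch, the helper name `utf8_bytes` is made up for the example; the three sample characters are chosen to show one-, two-, and three-byte encodings.

```python
def utf8_bytes(ch: str) -> bytes:
    """Return the UTF-8 encoding of a single character."""
    return ch.encode("utf-8")


for ch in ["A", "é", "€"]:
    encoded = utf8_bytes(ch)
    print(ch, len(encoded), encoded.hex())

# Pure ASCII text produces identical bytes under both encodings.
text = "plain ASCII"
print(text.encode("ascii") == text.encode("utf-8"))
```

Because ASCII characters keep their single-byte codes, any valid ASCII file is also a valid UTF-8 file.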

Extended ASCII character sets vary depending on the specific encoding scheme being used. ISO 8859-1 (Latin-1) reserves positions 128-159 for control characters, so it assigns no printable characters there. Windows-1252 (CP1252), a widely used code page that otherwise matches ISO 8859-1, fills most of these positions with printable characters. Here's a list of the Windows-1252 characters at positions 128-159 along with a brief description of each:

1. Character 128 (€): Euro Sign

  1. Represents the currency symbol for the Euro.

2. Character 129 (): Not Used

3. Character 130 (‚): Single Low-9 Quotation Mark

  1. Used as a typographic quotation mark in some languages.

4. Character 131 (ƒ): Latin Small Letter F with Hook

  1. Represents the Latin letter “f” with a hook or flourish.

5. Character 132 („): Double Low-9 Quotation Mark

  1. Used as a typographic quotation mark in some languages.

6. Character 133 (…): Horizontal Ellipsis

  1. Represents an ellipsis, indicating an omission or continuation of text.

7. Character 134 (†): Dagger

  1. Used as a typographic symbol, often indicating a footnote or reference.

8. Character 135 (‡): Double Dagger

  1. Used like the dagger, typically to mark a further footnote or reference after the dagger has been used.

9. Character 136 (ˆ): Modifier Letter Circumflex Accent

  1. Used as a diacritic mark in some languages.

10. Character 137 (‰): Per Mille Sign

  1. Represents the symbol for parts per thousand.

11. Character 138 (Š): Latin Capital Letter S with Caron

  1. Represents the Latin letter “S” with a caron, used in some European languages.

12. Character 139 (‹): Single Left-Pointing Angle Quotation Mark

  1. Used as a typographic quotation mark in some languages.

13. Character 140 (Œ): Latin Capital Ligature OE

  1. Represents the ligature of the Latin letters “O” and “E” as a single character.

14. Character 141 (): Not Used

15. Character 142 (Ž): Latin Capital Letter Z with Caron

  1. Represents the Latin letter “Z” with a caron, used in some European languages.

16. Character 143 (): Not Used

17. Character 144 (): Not Used

18. Character 145 (‘): Left Single Quotation Mark

  1. Used as a typographic quotation mark in some languages.

19. Character 146 (’): Right Single Quotation Mark

  1. Used as a typographic quotation mark in some languages.

20. Character 147 (“): Left Double Quotation Mark

  1. Used as a typographic quotation mark in some languages.

21. Character 148 (”): Right Double Quotation Mark

  1. Used as a typographic quotation mark in some languages.

22. Character 149 (•): Bullet

  1. Represents a bullet point or list item marker.

23. Character 150 (–): En Dash

  1. Used to indicate a range or relationship, often shorter than an em dash.

24. Character 151 (—): Em Dash

  1. Used to indicate a break in thought or emphasis, often longer than an en dash.

25. Character 152 (˜): Small Tilde

  1. Represents a diacritic mark or accent in some languages.

26. Character 153 (™): Trade Mark Sign

  1. Represents the symbol for a trademark.

27. Character 154 (š): Latin Small Letter S with Caron

  1. Represents the Latin letter “s” with a caron, used in some European languages.

28. Character 155 (›): Single Right-Pointing Angle Quotation Mark

  1. Used as a typographic quotation mark in some languages.

29. Character 156 (œ): Latin Small Ligature OE

  1. Represents the ligature of the Latin letters “o” and “e” as a single character.

30. Character 157 (): Not Used

31. Character 158 (ž): Latin Small Letter Z with Caron

  1. Represents the Latin letter “z” with a caron, used in some European languages.

32. Character 159 (Ÿ): Latin Capital Letter Y with Diaeresis

  1. Represents the Latin letter “Y” with a diaeresis or umlaut.

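Note that the characters listed above at positions 128-159 are the assignments of Windows-1252 (CP1252), which is often confused with ISO 8859-1 (Latin-1); ISO 8859-1 itself reserves these positions for control characters. Other extended ASCII character sets may include different characters, especially those designed for specific languages or regions.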
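
The printable characters listed above for positions 128-159 correspond to what Python's `cp1252` codec (Windows-1252) produces, while its `latin-1` codec maps the same bytes to control characters. The sketch below compares the two; the helper name `decode_byte` is illustrative, and returning `None` for unassigned positions is a choice made for this example.

```python
def decode_byte(value: int, codec: str):
    """Decode a single byte, returning None if the codec leaves it unassigned."""
    try:
        return bytes([value]).decode(codec)
    except UnicodeDecodeError:
        return None


# Compare a few positions from the 128-159 range under both codecs.
for value in (0x80, 0x81, 0x85, 0x99, 0x9F):
    cp1252 = decode_byte(value, "cp1252")
    latin1 = decode_byte(value, "latin-1")
    print(value, repr(cp1252), repr(latin1))
```

Positions marked "Not Used" in the list, such as 129, come back as `None` under `cp1252` in this sketch because that codec rejects them.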

products/ict/communications/coding/ascii.txt · Last modified: 2024/03/27 17:01 by wikiadmin