ASCII (American Standard Code for Information Interchange)

ASCII (American Standard Code for Information Interchange) is a character encoding standard that assigns a numeric code to each character used in computers and electronic communication. Here's a brief overview of the kinds of characters involved:

1. Printable Characters: These are the standard characters that can be printed and displayed. They include uppercase and lowercase letters, digits, punctuation marks, and special symbols such as @, #, $, %, etc.

2. Control Characters: These are non-printable characters used to control peripheral devices such as printers and terminals. Examples include carriage return (CR), line feed (LF), tab (TAB), and escape (ESC).

3. Extended ASCII Characters: These are additional characters beyond the standard ASCII set. They are defined by 8-bit extensions rather than by ASCII itself and are typically used for specific languages or symbols. They include accented letters, currency symbols, and graphical characters.

Each ASCII character is represented by a 7-bit binary number, which allows for a total of 128 (2^7) possible characters, numbered 0 to 127. Later standards such as ISO 8859 and UTF-8 keep ASCII as their first 128 codes and add further characters to support a wider range of languages.
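The mapping between characters and 7-bit codes can be explored with a short Python sketch. The helper name `is_ascii_code` and the list `ascii_table` are illustrative choices, not part of any standard.

```python
# Show the numeric code and 7-bit pattern of a few ASCII characters.
for ch in ("A", "a", "0", " "):
    code = ord(ch)  # character -> numeric code
    print(f"{ch!r}: decimal {code:3d}, binary {code:07b}")


def is_ascii_code(code: int) -> bool:
    """Return True if the code fits in 7 bits, i.e. lies in 0..127."""
    return 0 <= code < 2 ** 7


# chr() goes the other way: numeric code -> character.
ascii_table = [chr(code) for code in range(2 ** 7)]
print(len(ascii_table))  # 128
```

Here `ord` and `chr` are Python built-ins. They operate on Unicode code points, which coincide with ASCII codes for the values 0 to 127.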

Here are some examples of ASCII characters:

- Uppercase letters: A, B, C, …, Z
- Lowercase letters: a, b, c, …, z
- Digits: 0, 1, 2, …, 9
- Punctuation marks: ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~
- Control characters: CR (carriage return), LF (line feed), TAB (tab), ESC (escape), etc.
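As a rough illustration of these groups, the following sketch sorts all 128 ASCII codes into categories using the constants in Python's standard `string` module. The function name `classify` and the category labels are assumptions made for this example.

```python
import string
from collections import Counter


def classify(ch: str) -> str:
    """Return a coarse category label for a single character."""
    code = ord(ch)
    if code > 127:
        return "non-ASCII"
    if code < 32 or code == 127:
        return "control"
    if ch == " ":
        return "space"
    if ch in string.ascii_uppercase:
        return "uppercase"
    if ch in string.ascii_lowercase:
        return "lowercase"
    if ch in string.digits:
        return "digit"
    return "punctuation"


# Count how many of the 128 ASCII codes fall into each category.
counts = Counter(classify(chr(code)) for code in range(128))
for category, n in counts.items():
    print(f"{category:12s} {n}")
```

The counts add up to 128: 33 control characters, the space, 26 + 26 letters, 10 digits, and 32 punctuation marks and symbols.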

It's important to note that ASCII characters are limited to the English alphabet and basic symbols. For languages with different character sets, other encoding standards like Unicode are used.

The special characters serve various purposes in computing, communication, and text formatting. Here's an explanation of some of the most commonly encountered special ASCII characters:

1. Control Characters:

Null (NUL): Represents the null character, often used as a string terminator.
Start of Heading (SOH): Used to mark the beginning of a transmission header.
Start of Text (STX): Marks the start of a text transmission.
End of Text (ETX): Marks the end of a text transmission.
End of Transmission (EOT): Indicates the end of a transmission.
Enquiry (ENQ): A request for acknowledgment from the receiver.
Acknowledgment (ACK): Confirms successful receipt of a transmission.
Bell (BEL): Produces an audible or visual alert, often a beep sound.
Backspace (BS): Moves the cursor back one position, typically used for text editing.
Horizontal Tab (HT): Moves the cursor to the next tab stop.
Line Feed (LF): Moves the cursor to the next line.
Vertical Tab (VT): Similar to a line feed but moves the cursor to the next vertical tab stop.
Form Feed (FF): Advances the paper to the next page or form feed position in printers.
Carriage Return (CR): Moves the cursor to the beginning of the line.
Shift Out (SO): Switches to an alternate character set.
Shift In (SI): Returns to the default character set after Shift Out.
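The sixteen control characters listed above occupy codes 0 through 15, in the order given. The sketch below builds that table and checks it against the escape sequences Python string literals provide for some of them. The names `CONTROL_NAMES`, `CONTROL_CODES`, and `PYTHON_ESCAPES` are illustrative.

```python
# Abbreviations in code order: NUL is 0, SOH is 1, ..., SI is 15.
CONTROL_NAMES = [
    "NUL", "SOH", "STX", "ETX", "EOT", "ENQ", "ACK", "BEL",
    "BS", "HT", "LF", "VT", "FF", "CR", "SO", "SI",
]
CONTROL_CODES = {name: code for code, name in enumerate(CONTROL_NAMES)}

# Several of these characters have dedicated escapes in Python literals.
PYTHON_ESCAPES = {
    "NUL": "\0", "BEL": "\a", "BS": "\b", "HT": "\t",
    "LF": "\n", "VT": "\v", "FF": "\f", "CR": "\r",
}

for name, ch in PYTHON_ESCAPES.items():
    # Each escape yields exactly the character with the tabulated code.
    assert ord(ch) == CONTROL_CODES[name]
    print(f"{name:3s} = {CONTROL_CODES[name]:2d} = 0x{CONTROL_CODES[name]:02X} = {ch!r}")
```

Control characters without a dedicated escape can still be written numerically, for example `"\x0e"` for Shift Out.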

2. Escape Character:

Escape (ESC): Used to initiate escape sequences for controlling terminal behavior, formatting, and special functions.
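One widespread use of ESC is in ANSI/ECMA-48 terminal control sequences, where ESC followed by `[` introduces a command. The sketch below builds "Select Graphic Rendition" sequences for bold red text. It assumes the output terminal understands ANSI sequences, which not every environment does, and the helper name `sgr` is made up for this example.

```python
ESC = "\x1b"  # ASCII code 27


def sgr(*params: int) -> str:
    """Build an SGR escape sequence such as ESC [ 1 ; 31 m."""
    return f"{ESC}[{';'.join(str(p) for p in params)}m"


BOLD_RED = sgr(1, 31)  # 1 = bold, 31 = red foreground
RESET = sgr(0)         # 0 = reset all attributes

# On an ANSI-capable terminal, "warning" appears in bold red.
print(f"{BOLD_RED}warning{RESET} followed by plain text")
```

On a terminal without ANSI support, the raw sequence characters are printed instead of being interpreted.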

3. Punctuation and Symbols:

Exclamation Mark (!): Used for emphasis, factorial notation, and logical negation.
Quotation Marks ("): Encloses text to indicate speech or quotation.
Number Sign or Hash (#): Used for numbering, as a comment indicator, and in programming languages.
Dollar Sign ($) and Percent Sign (%): Commonly used in financial contexts and mathematical expressions.
Ampersand (&): Represents the “and” conjunction.
Apostrophe ('): Indicates possession or contraction in English.
Parentheses (()): Used for grouping, enclosures, and mathematical expressions.
Asterisk (*): Multiplication operator, wildcard character, and used for emphasis.
Plus (+) and Minus (-): Addition and subtraction operators respectively.
Comma (,): Separates items in a list or indicates a pause in writing.
Period or Full Stop (.): Marks the end of a sentence or abbreviation.
Colon (:): Used in punctuation and to denote ratios, time, and divisions in text.
Semicolon (;): Used to separate clauses in a sentence or items in a list.
Question Mark (?): Indicates a question or uncertainty.
Slash (/): Used in fractions, URLs, directory paths, and division.
Backslash (\): Used to escape characters or denote file paths in computing.
Vertical Bar or Pipe (|): Used as a logical OR operator and to separate items.
Tilde (~): Represents approximation, home directory in Unix-like systems, and negation in logic.

4. Other Characters:

Space: Represents a space between words or characters.
At Sign (@): Commonly used in email addresses and to indicate online presence or usernames.
Underscore (_) and Hyphen (-): Used in naming conventions, as word separators, and in URLs.
Curly Braces ({}) and Square Brackets ([]): Used for grouping, data structures, and denoting scope in programming.
Less Than (<) and Greater Than (>): Comparison operators and used in HTML tags.
Angle Brackets (<>): The less-than and greater-than signs used as a pair, for example to delimit HTML tags. Singly, they also serve as file redirection operators in command shells.

These special ASCII characters have various applications in computer programming, text processing, communication protocols, and data representation. Understanding their meanings and uses is essential for effective communication and programming.

Extended ASCII refers to character sets that use 8 bits (1 byte) per character instead of ASCII's original 7 bits, thus allowing for 256 (2^8) possible characters. The standard ASCII character set uses only 7 bits, which allows for 128 characters, including control characters, alphanumeric characters, punctuation marks, and symbols.

With extended ASCII, the additional bit provides room for an additional 128 characters, enabling the inclusion of accented characters, additional symbols, and other special characters beyond the standard ASCII set.
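Because the extra characters all have the eighth (highest) bit set, a byte sequence can be checked for extended characters by testing that bit. This is a minimal sketch; `uses_high_bit` is a hypothetical helper name, and Latin-1 is used here only as one example of an 8-bit encoding.

```python
def uses_high_bit(data: bytes) -> bool:
    """Return True if any byte has bit 7 set, i.e. a value of 128..255."""
    return any(byte & 0x80 for byte in data)


plain = "cafe".encode("ascii")
accented = "caf\u00e9".encode("latin-1")  # the accented e becomes byte 0xE9

print(uses_high_bit(plain))     # False: every byte is below 128
print(uses_high_bit(accented))  # True: 0xE9 has the high bit set
```

Python 3.7 and later also offer the built-in `bytes.isascii()` method, which answers the opposite question directly.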

There are several variations of extended ASCII character sets, with different characters mapped to the extra 128 values. However, there is no single standard for extended ASCII, leading to compatibility issues between different systems and languages.
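The compatibility problem can be demonstrated by decoding one and the same byte under several 8-bit encodings. The three codecs chosen here (IBM code page 437, Windows-1252, and Latin-1) are examples picked for this sketch; many other extended ASCII variants exist.

```python
raw = bytes([0x80])  # a single byte with value 128

# Decode the same byte with three different 8-bit character sets.
decoded = {enc: raw.decode(enc) for enc in ("cp437", "cp1252", "latin-1")}

for enc, ch in decoded.items():
    print(f"{enc:8s} -> {ch!r} (U+{ord(ch):04X})")
```

The byte comes out as Ç under code page 437, as € under Windows-1252, and as an invisible control character under Latin-1, so text written with one variant is garbled when read with another.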

One common extended ASCII character set is ISO 8859-1 (also known as Latin-1), which was designed for Western European languages. ISO 8859-1 includes characters such as accented letters (e.g., á, é, í, ó, ú), special symbols (e.g., ©, ®, £), and additional punctuation marks. It does not include the euro sign (€), which was added later in ISO 8859-15.

Here's a brief overview of the ISO 8859-1 (Latin-1) extended ASCII character set:

- Characters 0-127: Same as standard ASCII (basic Latin characters).
- Characters 128-159: No printable characters; this range is reserved for additional (C1) control codes. Some vendor encodings, such as Windows-1252, place printable symbols here instead.
- Characters 160-255: Printable extended characters, including accented letters, currency symbols, and other special characters.
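The three ranges can be inspected with Python's `latin-1` codec, which maps each byte value N to the Unicode code point with the same number. The helper `latin1_region` and its labels are illustrative only.

```python
import unicodedata

# Decode every possible byte value as Latin-1.
all_bytes = bytes(range(256))
text = all_bytes.decode("latin-1")

# Byte N always becomes code point N under this codec.
assert all(ord(ch) == byte for ch, byte in zip(text, all_bytes))


def latin1_region(byte: int) -> str:
    """Name the range of the overview that a byte value falls into."""
    if not 0 <= byte <= 255:
        raise ValueError("not a single byte value")
    if byte <= 127:
        return "0-127"
    if byte <= 159:
        return "128-159"
    return "160-255"


# Print one sample from each range plus its Unicode general category.
for byte in (0x41, 0x85, 0xA9, 0xE9):
    ch = text[byte]
    print(f"{byte:3d}  {latin1_region(byte):8s}  {unicodedata.category(ch)}  {ch!r}")
```

In the output, category `Cc` marks a control character, while `Lu`, `Ll`, and `So` mark uppercase letters, lowercase letters, and symbols.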

It's important to note that extended ASCII character sets are limited in their support for different languages and scripts. For more comprehensive character encoding and support for a wider range of languages, Unicode has become the standard. Unicode assigns a unique code point to characters from virtually all writing systems and defines well over 100,000 characters, far beyond the scope of extended ASCII. UTF-8, a variable-length Unicode encoding that uses one to four bytes per character, has become particularly popular because ASCII text is already valid UTF-8 and because it stores such text compactly.
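A short sketch makes the relationship between UTF-8 and ASCII concrete: ASCII characters keep their single-byte codes, while other characters take more bytes. The sample characters are arbitrary choices for illustration.

```python
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]  # A, e-acute, euro sign, an emoji

# Number of bytes each character occupies in UTF-8.
lengths = {ch: len(ch.encode("utf-8")) for ch in samples}
for ch, n in lengths.items():
    print(f"U+{ord(ch):04X} takes {n} byte(s) in UTF-8: {ch.encode('utf-8').hex(' ')}")

# Pure ASCII text produces identical bytes under both encodings.
message = "Hello, ASCII!"
same_bytes = message.encode("utf-8") == message.encode("ascii")
print(same_bytes)  # True
```

The four samples occupy 1, 2, 3, and 4 bytes respectively, which is why UTF-8 is compact for English text yet can still represent every Unicode character.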
