HTML Data Types

In HTML, a "data type" is the kind of data permitted in the content of an element or in the value of an attribute.

In this section, we will learn about the data types classified under the following three specifications:

  • HTML basic data types
  • Data types defined by the RFC and IANA documentation
  • Data types defined by W3C specifications

Now, let's go through a concise explanation of the data types in each of these three groups.

HTML Basic Data Types

In HTML, the most commonly used data types are classified under "basic data types." There are the following four basic data types in HTML:

  • Character data type: Stores a single character of alphanumeric text, which may be a letter, a number, a symbol, a space, or a punctuation mark.
  • Text data type: Stores a string with a maximum length of 2,147,483,647 printable characters.
  • Name data type: Refers to the name given to a particular datum (the singular of data), function, or unit of a program in a programming language.
  • Number data type: Stores a number in the range of 1E-323 to 1.79E+308 (positive or negative) with an accuracy of about 15 digits. Arithmetic operations can be performed on values of the number data type.

Different types of alphanumeric text

The character data type covers several kinds of alphanumeric text. The following table lists the most commonly used kinds:

Type          Characters
Letters       A...Z and a...z
Numbers       0...9
Symbols       @ , # , $ , % , ^ , & , * , () , _ , - , + , = , \ , | , {} , [] , ~
Punctuation   Comma, full stop, and exclamation mark

Note: Arithmetic operations cannot be performed on the numbers included under the character data type.

Data types defined by the RFC and IANA documentation

An RFC is a memorandum that describes methods, behaviors, or research relating to the workings of the Internet. IANA is the body that oversees global IP address allocation, media types, and other Internet protocol-related assignments.

Note: The RFC stands for "Request for Comments," and the IANA stands for "Internet Assigned Numbers Authority."

According to the RFC and IANA documentation, there are the following four basic data types:

  • Uniform Resource Identifier (URI)
  • Content-Type
  • Language Code
  • Character Set

Let's go over each of these four data types one by one.

HTML Uniform Resource Identifier (URI)

A uniform resource identifier (URI) is a string of characters that identifies or locates a specific resource on the web. It is a compact, extensible way of naming an online resource, and it can be broken down into several components, as the following example demonstrates.

http://william@codescracker.com:80/over/there/index.dtb;type=anime1?name=ferret#nose

In the above example,

URI Component                        Description
http                                 scheme name
william                              user information (also known as userinfo)
codescracker.com                     host name
80                                   port
william@codescracker.com:80          authority
/over/there/index.dtb;type=anime1    path
index                                file name
dtb                                  extension
type=anime1                          parameter
name=ferret                          query
nose                                 fragment

Let me briefly define each of these URI components or terms used in the above example using the following table:

Component          Description
Scheme             refers to the specification for assigning an identifier. Schemes used in URIs include the Hypertext Transfer Protocol (HTTP), the File Transfer Protocol (FTP), mailto, Uniform Resource Name (URN), tel, the Real Time Streaming Protocol (RTSP), and file.
User Information   refers to personal information, such as a user name and password, that is used to access websites or resources.
Authority          refers to the part that consists of optional user information terminated with @, a host name, and an optional port number preceded by a colon.
Host name          identifies the host on the Internet that holds the resource. It is usually a name registered in the Domain Name System (DNS); reusing that registration saves the cost of deploying another one.
Port               refers to the optional decimal number that follows the host after a colon. Each scheme also defines its default port number. For example, http has 80 as its default port number.
Path               consists of a sequence of text segments that are separated by a forward slash (/).
File Name          refers to the name given to the targeted file.
Extension          refers to a code, typically three or four characters long, that follows the file name and a dot (.). It indicates the kind of information contained in the file. The .html extension signifies that the file contains an HTML document, and the .jpg extension signifies that the file is an image file.
Query              starts with a question mark (?) and is used when the URI requests a program to run rather than a file to be accessed. The query carries the parameters to be passed to the server-side program.
Fragment           follows a hash sign (#) and refers to a particular point in the accessed file.
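
As a minimal sketch of where URIs appear in practice, the following document uses them as attribute values. The first link reuses the sample URI dissected above; the mailto address and the image path are hypothetical placeholders.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <title>URI examples</title>
</head>
<body>
  <!-- Absolute URI: scheme, authority, path, query, and fragment -->
  <a href="http://william@codescracker.com:80/over/there/index.dtb;type=anime1?name=ferret#nose">Full URI</a>

  <!-- Fragment only: jumps to the element whose id is "nose" in this document -->
  <a href="#nose">Jump to nose</a>

  <!-- A different scheme: mailto (hypothetical address) -->
  <a href="mailto:someone@example.com">Send mail</a>

  <!-- Relative URI: resolved against the location of the current document -->
  <img src="images/ferret.jpg" alt="A ferret">

  <p id="nose">Target of the fragment link.</p>
</body>
</html>
```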

HTML Content-Type

The Content-Type (also known as the media type or MIME type) represents the type of content used in an embedded or linked resource. For instance, the Content-Type can be plain text or a JPEG image. It is not case-sensitive. Its syntax is divided into two parts, a top-level type and a subtype, separated by a slash (/). Following are some examples of the Content-Type:

  • text/plain: represents plain text.
  • image/jpeg: represents a JPEG-compressed image file.
  • audio/basic: represents a basic audio file.
  • video/mpeg: represents an MPEG-compressed video file.
  • application/octet-stream: represents arbitrary binary data.
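
The type attribute is one place where a Content-Type shows up in markup. The sketch below is illustrative only, and every file name in it is hypothetical.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <title>Content-Type examples</title>
  <!-- text/css: the linked resource is a CSS style sheet -->
  <link rel="stylesheet" type="text/css" href="style.css">
</head>
<body>
  <!-- image/jpeg: the embedded resource is a JPEG image -->
  <object data="photo.jpg" type="image/jpeg">A photo</object>

  <!-- video/mpeg: hints at the type of the linked video before it is fetched -->
  <a href="clip.mpg" type="video/mpeg">Watch the clip</a>

  <!-- application/octet-stream: generic binary data, usually offered as a download -->
  <a href="tool.bin" type="application/octet-stream">Download the file</a>
</body>
</html>
```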

HTML Language Code

A language code identifies the natural (human) language in which the content of an HTML document is written. It is not case-sensitive and is specified with the lang attribute.

The implementation of the language code is shown in the following example:

<html lang="en">
   ...
   ...
   ...
</html>

The following table lists some commonly used language codes.

Language   Language Code
English    en
French     fr
German     de
Japanese   ja
Arabic     ar
Chinese    zh
Dutch      nl
Italian    it
Korean     ko
Russian    ru
Spanish    es
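
The lang attribute is not limited to the html element. As a small sketch with made-up sample sentences, codes from the table can also mark individual passages written in another language:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <title>Language codes</title>
</head>
<body>
  <!-- Inherits lang="en" from the html element -->
  <p>Good morning.</p>

  <!-- Each paragraph below overrides the document language for itself only -->
  <p lang="fr">Bonjour.</p>
  <p lang="de">Guten Morgen.</p>
  <p lang="es">Buenos días.</p>
</body>
</html>
```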

HTML Character Set

A character set is a set of standard characters drawn from languages and scripts around the world, each represented by a unique code point. A code point is the unique integer, paired with a unique name, that is assigned to a character in the set to identify it.

Following are some examples of characters found in a character set:

  • dollar symbol
  • yen symbol
  • lower case letters
  • upper case letters
  • delta
  • omega
  • exclamation mark
  • quotation marks
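
To show how code points are used, the sketch below declares a character encoding and then writes several of the characters listed above as numeric character references, where the number is the character's decimal code point. UTF-8 is an assumed choice of encoding here.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <!-- Declares the character encoding of the document (assumed: UTF-8) -->
  <meta charset="utf-8">
  <title>Character references</title>
</head>
<body>
  <p>Dollar symbol: &#36;</p>       <!-- code point 36 -->
  <p>Yen symbol: &#165;</p>         <!-- code point 165, also &yen; -->
  <p>Delta: &#916;</p>              <!-- code point 916, also &Delta; -->
  <p>Omega: &#937;</p>              <!-- code point 937, also &Omega; -->
  <p>Exclamation mark: &#33;</p>    <!-- code point 33 -->
  <p>Quotation mark: &#34;</p>      <!-- code point 34, also &quot; -->
</body>
</html>
```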

Data types defined by W3C Specifications

W3C is an international community that develops standards to ensure the long-term growth of the Web. W3C stands for "World Wide Web Consortium."

The W3C specifies the following five additional data types for HTML:

  • DateTime Format
  • RGB Triplet
  • Color Names
  • Link Types
  • Media Types

HTML DateTime Format

DateTime uses the ISO date format (ISO 8601), that is, "YYYY-MM-DDThh:mm:ssTZD". The components of this format are described in the following table:

Component   Description
YYYY        represents the year in four-digit format, for example, 2022.
MM          represents the month as a two-digit number (01 through 12).
DD          represents the day of the month (01 through 31).
T           separates the date from the time; it must be written as a capital letter.
hh          represents the hour (00 through 23).
mm          represents the minutes (00 through 59).
ss          represents the seconds (00 through 59).
TZD         stands for Time Zone Designator (Z or +hh:mm or -hh:mm).
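
In markup, this format is used by the datetime attribute of the ins and del elements. The dates and text in the following sketch are invented for illustration.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <title>DateTime examples</title>
</head>
<body>
  <!-- 15 March 2022 at 14:30:00, in Coordinated Universal Time (Z) -->
  <p>The meeting is in <ins datetime="2022-03-15T14:30:00Z">Room 4</ins>.</p>

  <!-- The same instant written with a time zone offset of +05:30 -->
  <p>The price is <del datetime="2022-03-15T20:00:00+05:30">$10</del> $8.</p>
</body>
</html>
```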

HTML RGB Triplet

The RGB triplet denotes three standard colors: red, green, and blue. Any color that can be displayed is created by combining these three colors in various intensities.

A color is represented by a six-digit hexadecimal number of the form #xxyyzz, where each pair of digits ranges from 00 (none) to FF (full intensity):

  • The first two digits (xx) give the intensity of red. A pure red is written as #xx0000.
  • The next two digits (yy) give the intensity of green. A pure green is written as #00yy00.
  • The last two digits (zz) give the intensity of blue. A pure blue is written as #0000zz.
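
As a worked example, the sketch below applies a few RGB triplets through the CSS color property. Inline styles are used only to keep the example short.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <title>RGB triplets</title>
</head>
<body>
  <!-- xx = FF, yy = 00, zz = 00: full red only -->
  <p style="color: #FF0000">Red</p>

  <!-- xx = 00, yy = FF, zz = 00: full green only -->
  <p style="color: #00FF00">Green</p>

  <!-- xx = 00, yy = 00, zz = FF: full blue only -->
  <p style="color: #0000FF">Blue</p>

  <!-- xx = FF, yy = 80, zz = 00: full red plus about half green gives orange -->
  <p style="color: #FF8000">Orange</p>
</body>
</html>
```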

HTML Color Names

In HTML, 16 colors can be called directly by their names rather than their hexadecimal values. This feature makes it easy for the users of HTML to call a color by its name if they are unaware of its hexadecimal number or the concept of the RGB triplet.

The following table represents 16 color names along with their hexadecimal values:

Color Name   Hexadecimal Value
Black        #000000
Silver       #C0C0C0
Gray         #808080
White        #FFFFFF
Maroon       #800000
Green        #008000
Lime         #00FF00
Olive        #808000
Yellow       #FFFF00
Red          #FF0000
Purple       #800080
Fuchsia      #FF00FF
Navy         #000080
Blue         #0000FF
Teal         #008080
Aqua         #00FFFF
HTML Link Types

Link types describe the relationship between the current document and a linked resource. User agents and search engines may interpret the recognized link types in a variety of ways. Link types are not case-sensitive, which means you can write a link type in lower-case or upper-case characters.

There are the following link types available in HTML:

  • Alternate: refers to a substitute version of the document in which the link occurs. When used with the lang attribute, the alternate link type represents a translated version of the current document. When combined with the media attribute, it represents a version intended for a different medium.
  • Stylesheet: refers to an external style sheet. You can offer a choice of alternate style sheets by using the stylesheet and alternate link types together.
  • Start: refers to the first document in a collection of documents. It tells search engines which document the author considers the starting point of the collection.
  • Next: refers to the next document in a linear sequence of documents.
  • Prev: refers to the previous document in a linear sequence of documents.
  • Contents: refers to a document that serves as a table of contents.
  • Index: refers to a document that provides an index for the current document.
  • Glossary: refers to a document that provides a glossary of terms.
  • Copyright: refers to a document that contains a copyright statement.
  • Chapter: refers to a document that serves as a chapter in a collection of documents.
  • Section: refers to a document that serves as a section in a collection of documents.
  • Subsection: refers to a document that serves as a subsection in a collection of documents.
  • Appendix: refers to a document that serves as an appendix in a collection of documents.
  • Help: refers to a document that offers help.
  • Bookmark: refers to a bookmark, that is, a link to a key entry point within an extended document.
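
Link types are written in the rel attribute of the link and a elements. The sketch below shows several of them together; all file names are hypothetical, and hreflang is used here to state the language of the linked translation.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <title>Chapter 2</title>

  <!-- The default external style sheet -->
  <link rel="stylesheet" href="default.css">

  <!-- An alternate style sheet the reader may select; it needs a title -->
  <link rel="alternate stylesheet" title="High contrast" href="contrast.css">

  <!-- A French translation of this document -->
  <link rel="alternate" hreflang="fr" href="chapter2.fr.html">

  <!-- Position of this document within a linear sequence -->
  <link rel="prev" href="chapter1.html">
  <link rel="next" href="chapter3.html">

  <!-- Supporting documents -->
  <link rel="contents" href="toc.html">
  <link rel="help" href="help.html">
</head>
<body>
  <p>Chapter text goes here.</p>
</body>
</html>
```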

HTML Media Types

Media types let us specify how an HTML document is presented on different media, such as paper, a computer screen, or an aural browser. Combined with CSS, they allow the text of an HTML page to be displayed in different fonts, colors, and sizes according to the type of media being used.

There are the following available media types:

  • screen: represents a computer screen.
  • tty: represents media using a fixed-pitch character grid with limited display capabilities, such as teletypes and terminals.
  • tv: represents television-type devices.
  • projection: represents projectors.
  • handheld: represents hand-held devices, which typically have a small screen and limited bandwidth.
  • print: represents paged, printed material and documents viewed on screen in print preview mode.
  • aural: represents speech synthesizers.
  • all: represents the media type that is suitable for all devices.
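
A media type is usually given in the media attribute of a style or link element. The following sketch, with arbitrary example styles, applies one set of rules everywhere, another on screen, and another when printing:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <title>Media types</title>

  <!-- Applies on every device -->
  <style media="all">
    body { font-family: sans-serif; }
  </style>

  <!-- Applies only on a computer screen -->
  <style media="screen">
    body { font-size: 16px; color: navy; }
  </style>

  <!-- Applies only to printed pages and print preview -->
  <style media="print">
    body { font-size: 12pt; color: black; }
  </style>
</head>
<body>
  <p>This text is navy on screen and black on paper.</p>
</body>
</html>
```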
