Postgresql Data Type

In this section, we will discuss PostgreSQL's data types. Data types are what we set for each field when creating a table.

Benefits of setting data types:

PostgreSQL provides a rich set of data types. Users can create new data types in the database using the CREATE TYPE command. PostgreSQL has many data types, which we will explain in detail below.

Numeric Types

Numeric types consist of 2-byte, 4-byte, or 8-byte integers, as well as 4-byte or 8-byte floating-point numbers and decimals with optional precision.

The following table lists the available numeric types.

Name	Storage Size	Description	Range
smallint	2 bytes	Small-range integer	-32768 to +32767
integer	4 bytes	Typical choice for integer	-2147483648 to +2147483647
bigint	8 bytes	Large-range integer	-9223372036854775808 to +9223372036854775807
decimal	variable	User-specified precision, exact	Up to 131072 digits before the decimal point; up to 16383 digits after
numeric	variable	User-specified precision, exact	Up to 131072 digits before the decimal point; up to 16383 digits after
real	4 bytes	Variable precision, inexact	6 decimal digits precision
double precision	8 bytes	Variable precision, inexact	15 decimal digits precision
smallserial	2 bytes	Auto-incrementing small-range integer	1 to 32767
serial	4 bytes	Auto-incrementing integer	1 to 2147483647
bigserial	8 bytes	Auto-incrementing large-range integer	1 to 9223372036854775807

Monetary Types

The money type stores monetary amounts with a fixed fractional precision.

Values of numeric, int, and bigint types can be converted to money. It is not recommended to use floating-point numbers to handle monetary types due to the possibility of rounding errors.

Name	Storage Size	Description	Range
money	8 bytes	Monetary amount	-92233720368547758.08 to +92233720368547758.07

Character Types

The following table lists the character types supported by PostgreSQL:

No.	Name & Description
1	character varying(n), varchar(n) Variable-length with length limit
2	character(n), char(n) Fixed-length, padded with spaces
3	text Variable-length, no limit

Date/Time Types

The following table lists the date and time types supported by PostgreSQL.

Name	Storage Size	Description	Lowest Value	Highest Value	Resolution
timestamp [ (p) ]	8 bytes	Date and time (without time zone)	4713 BC	294276 AD	1 microsecond / 14 digits
timestamp [ (p) ] with time zone	8 bytes	Date and time, with time zone	4713 BC	294276 AD	1 microsecond / 14 digits
date	4 bytes	Date (no time of day)	4713 BC	5874897 AD	1 day
time [ (p) ]	8 bytes	Time of day (without time zone)	00:00:00	24:00:00	1 microsecond / 14 digits
time [ (p) ] with time zone	12 bytes	Time of day, with time zone	00:00:00+1459	24:00:00-1459	1 microsecond / 14 digits
interval [ fields ] [ (p) ]	12 bytes	Time span	-178000000 years	178000000 years	1 microsecond / 14 digits

Boolean Type

PostgreSQL supports the standard boolean data type.

boolean has two states: "true" or "false". A third state, "unknown", is represented by NULL.

Name	Storage Format	Description
boolean	1 byte	true/false

Enumerated Types

An enumerated (enum) type is a data type consisting of a static, ordered set of values.

Enum types in PostgreSQL are similar to enum types in the C language.

Unlike other types, enum types must be created using the CREATE TYPE command.

CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');

Creating days of the week, as shown below:

CREATE TYPE week AS ENUM ('Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun');

Like other types, once created, enum types can be used in table and function definitions.

CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');

CREATE TABLE person (
    name text,
    current_mood mood
);

INSERT INTO person VALUES ('Moe', 'happy');

SELECT * FROM person WHERE current_mood = 'happy';
 name | current_mood
------+--------------
 Moe | happy
(1 row)

Geometric Types

Geometric data types represent two-dimensional plane objects.

The following table lists the geometric types supported by PostgreSQL.

The most fundamental type: the point. It is the basis for other types.

Name	Storage Size	Description	Representation
point	16 bytes	Point on a plane	(x,y)
line	32 bytes	Infinite line (not fully implemented)	((x1,y1),(x2,y2))
lseg	32 bytes	Finite line segment	((x1,y1),(x2,y2))
box	32 bytes	Rectangle	((x1,y1),(x2,y2))
path	16+16n bytes	Closed path (similar to polygon)	((x1,y1),...)
path	16+16n bytes	Open path	[(x1,y1),...]
polygon	40+16n bytes	Polygon (similar to closed path)	((x1,y1),...)
circle	24 bytes	Circle	<(x,y),r> (center and radius)

Network Address Types

PostgreSQL provides data types for storing IPv4, IPv6, and MAC addresses.

Storing network addresses with these data types is better than using plain text types because these types provide input error checking and special operators and functions.

Name	Storage Size	Description
cidr	7 or 19 bytes	IPv4 or IPv6 network
inet	7 or 19 bytes	IPv4 or IPv6 host and network
macaddr	6 bytes	MAC address

When sorting inet or cidr data types, IPv4 addresses are always sorted before IPv6 addresses, including those encapsulated or mapped in IPv6 addresses, such as ::10.2.3.4 or ::ffff:10.4.3.2.

Bit String Types

Bit strings are strings of 1s and 0s. They can be used to store and visualize bitmasks. We have two SQL bit types: bit(n) and bit varying(n), where n is a positive integer.

Data of type bit must match the length n exactly; attempting to store shorter or longer data is an error. Data of type bit varying is of variable length up to length n; longer strings are rejected. Writing bit without a length is equivalent to bit(1), and bit varying without a length means no length limit.

Text Search Types

Full-text search is the technique of finding documents from a collection of natural language documents that match a query.

PostgreSQL provides two data types to support full-text search:

No.	Name & Description
1	tsvector A `tsvector` value is a sorted list of distinct lexemes, which are the normalized forms of different variants of the same word.
2	tsquery A `tsquery` stores the lexemes to be searched for, and uses boolean operators `&` (AND), `\|` (OR), and `!` (NOT) to combine them. Parentheses can be used to enforce operator grouping.

UUID Type

The uuid data type is used to store Universally Unique Identifiers (UUID) as defined by RFC 4122, ISO/IEC 9834-8:2005, and related standards. (Some systems refer to this data type as a globally unique identifier, or GUID.) This identifier is a 128-bit identifier generated by an algorithm, making it impossible for it to be identical to identifiers generated by other means in modules known to use the same algorithm. Therefore, for distributed systems, this type of identifier provides better uniqueness guarantees than sequences, which can only guarantee uniqueness within a single database.

UUIDs are written as a sequence of lowercase hexadecimal digits, separated into groups by hyphens, specifically: a group of 8 digits + 3 groups of 4 digits + a group of 12 digits, totaling 32 digits representing 128 bits. An example of a standard UUID is:

a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11

XML Type

The xml data type can be used to store XML data. The advantage of storing XML data in the text type is that it can check input values for well-formedness and also supports functions for type-safe checking. To use this data type, you must compile with configure --with-libxml.

xml can store well-formed "documents" as defined by the XML standard, as well as "content" fragments defined by XMLDecl? content in the XML standard. Roughly, this means that content fragments can have multiple top-level elements or character nodes. The expression xmlvalue IS DOCUMENT can be used to determine whether a specific xml value is a complete document or a content fragment.

Creating XML Values

Use the function xmlparse to produce an xml type value from character data:

XMLPARSE (DOCUMENT '<?xml version="1.0"?><book><title>Manual</title><chapter>...</chapter></book>')
XMLPARSE (CONTENT 'abc<foo>bar</foo><bar>foo</bar>')

JSON Type

The json data type can be used to store JSON (JavaScript Object Notation) data. Such data can also be stored as text, but the json data type is more convenient for checking that each stored value is a valid JSON value.

There are also related functions for processing json data:

Example	Result
array_to_json('{{1,5},{99,100}}'::int[])	[[1,5],[99,100]]
row_to_json(row(1,'foo'))	{"f1":1,"f2":"foo"}

Array Types

PostgreSQL allows defining fields as variable-length multidimensional arrays.

Array types can be of any base type, user-defined type, enum type, or composite type.

Declaring Arrays

When creating a table, we can declare arrays as follows:

CREATE TABLE sal_emp (
    name text,
    pay_by_quarter integer[],
    schedule text[][]
);

pay_by_quarter is a one-dimensional integer array, and schedule is a two-dimensional text array.

We can also use the ARRAY keyword, as shown below:

CREATE TABLE sal_emp (
    name text,
    pay_by_quarter integer ARRAY,
    schedule text[][]
);

Inserting Values

Insert values using curly braces {}, with elements separated by commas within {}:

INSERT INTO sal_emp VALUES ('Bill', '{10000, 10000, 10000, 10000}', '{{"meeting", "lunch"}, {"training", "presentation"}}');
INSERT INTO sal_emp VALUES ('Carol', '{20000, 25000, 25000, 25000}', '{{"breakfast", "consulting"}, {"meeting", "lunch"}}');

Accessing Arrays

Now we can run some queries on this table.

First, we demonstrate how to access an element of an array. This query retrieves the names of employees whose salary changed in the second quarter:

SELECT name FROM sal_emp WHERE pay_by_quarter <> pay_by_quarter;
 name
-------
 Carol
(1 row)

The array subscript numbers are written in square brackets.

Modifying Arrays

We can modify the values of an array:

UPDATE sal_emp SET pay_by_quarter = '{25000,25000,27000,27000}' WHERE name = 'Carol';

Or use the ARRAY constructor syntax:

UPDATE sal_emp SET pay_by_quarter = ARRAY[25000,25000,27000,27000] WHERE name = 'Carol';

Searching in Arrays

To search for a value in an array, you must check each value of the array.

For example:

SELECT * FROM sal_emp WHERE pay_by_quarter = 10000 OR pay_by_quarter = 10000 OR pay_by_quarter = 10000 OR pay_by_quarter = 10000;

Alternatively, you can use the following statement to find rows where all elements in the array are equal to 10000:

SELECT * FROM sal_emp WHERE 10000 = ALL (pay_by_quarter);

Or, you can use the generate_subscripts function. For example:

SELECT * FROM (SELECT pay_by_quarter, generate_subscripts(pay_by_quarter, 1) AS s FROM sal_emp) AS foo WHERE pay_by_quarter = 10000;

Composite Types

Composite types represent the structure of a row or record; they are essentially just a list of field names and their data types. PostgreSQL allows using composite types like simple data types. For example, a field in a table can be declared as a composite type.

Declaring Composite Types

Here are two simple examples of defining composite types:

CREATE TYPE complex AS (
    r double precision,
    i double precision
);

CREATE TYPE inventory_item AS (
    name text,
    supplier_id integer,
    price numeric
);

The syntax is similar to CREATE TABLE, except that only field names and types can be declared here.

Once the type is defined, we can use it to create a table:

CREATE TABLE on_hand (
    item inventory_item,
    count integer
);

INSERT INTO on_hand VALUES (ROW('fuzzy dice', 42, 1.99), 1000);

Composite Type Value Input

To write a composite type value as a text constant, enclose the field values in parentheses and separate them with commas. You can put double quotes around any field value; if the value itself contains commas or parentheses, you must enclose it in double quotes.

The general format of a composite type constant is:

'( val1 , val2 , ... )'

An example is:

'("fuzzy dice",42,1.99)'

Accessing Composite Types

To access a field of a composite type, we write a dot and the field name, much like selecting a field from a table name. In fact, because it is so similar to selecting a field from a table name, we often need to use parentheses to avoid parser confusion. For example, you might need to select a subfield from the on_hand example table like this:

SELECT item.name FROM on_hand WHERE item.price > 9.99;

This will not work because, according to SQL syntax, item is being selected from a table name, not a field name. You must write it like this:

SELECT (item).name FROM on_hand WHERE (item).price > 9.99;

Or if you also need to use the table name (for example, in a multi-table query), write it like this:

SELECT (on_hand.item).name FROM on_hand WHERE (on_hand.item).price > 9.99;

Now the parenthesized object is correctly parsed as a reference to the item field, from which the subfield can be selected.

Range Types

Range data types represent a range of values of a certain element type.

For example, a timestamp range could be used to represent the time range during which a meeting room is reserved.

Built-in range types in PostgreSQL include:

int4range — range of integer
int8range — range of bigint
numrange — range of numeric
tsrange — range of timestamp without time zone
tstzrange — range of timestamp with time zone
daterange — range of date

In addition, you can define your own range types.

CREATE TABLE reservation (room int, during tsrange);
INSERT INTO reservation VALUES (1108, '[2010-01-01 14:30, 2010-01-01 15:30)');
-- Contains
SELECT int4range(10, 20) @> 3;
-- Overlaps
SELECT numrange(11.1, 22.2) && numrange(20.0, 30.0);
-- Extract upper bound
SELECT upper(int8range(15, 25));
-- Compute intersection
SELECT int4range(10, 20) * int4range(15, 25);
-- Is range empty?
SELECT isempty(numrange(1, 5));

Range values must be input according to the following format:

(lower,upper)
(lower,upper]
[lower,upper)
[lower,upper]
empty

Parentheses or brackets indicate whether the lower and upper bounds are exclusive or inclusive. Note that the last format is empty, representing an empty range (a range containing no values).

-- Includes 3, excludes 7, and includes all points in between
SELECT '[3,7)'::int4range;
-- Excludes both 3 and 7, but includes all points in between
SELECT '(3,7)'::int4range;
-- Includes only the single value 4
SELECT '[4,4]'::int4range;
-- Excludes the point (normalized to 'empty')
SELECT '[4,4)'::int4range;

Object Identifier Types

PostgreSQL uses object identifiers (OIDs) internally as primary keys for various system tables.

At the same time, the system does not add an OID system field to user-created tables (unless WITH OIDS is specified during table creation or the configuration parameter default_with_oids is set to enabled). The oid type represents an object identifier. Besides oid, there are several aliases: regproc, regprocedure, regoper, regoperator, regclass, regtype, regconfig, and regdictionary.

Name	References	Description	Example Values
oid	Any	Numeric object identifier	564182
regproc	pg_proc	Function name	sum
regprocedure	pg_proc	Function with argument types	sum(int4)
regoper	pg_operator	Operator name	+
regoperator	pg_operator	Operator with argument types	*(integer,integer) or -(NONE,integer)
regclass	pg_class	Relation name	pg_type
regtype	pg_type	Data type name	integer
regconfig	pg_ts_config	Text search configuration	english
regdictionary	pg_ts_dict	Text search dictionary	simple

Pseudo-Types

The PostgreSQL type system contains a set of special-purpose entries, which are collectively called pseudo-types. Pseudo-types cannot be used as data types for columns, but they can be used to declare the argument or result types of a function. Pseudo-types are useful when a function does not simply accept and return a specific SQL data type.

The following table lists all pseudo-types:

Name	Description
any	Indicates that a function accepts any input data type.
anyelement	Indicates that a function accepts any data type.
anyarray	Indicates that a function accepts any array data type.
anynonarray	Indicates that a function accepts any non-array data type.
anyenum	Indicates that a function accepts any enum data type.
anyrange	Indicates that a function accepts any range data type.
cstring	Indicates that a function accepts or returns a null-terminated C string.
internal	Indicates that a function accepts or returns a server-internal data type.
language_handler	A procedural language call handler is declared to return `language_handler`.
fdw_handler	A foreign-data wrapper handler is declared to return `fdw_handler`.
record	Identifies that a function returns an unspecified row type.
trigger	A trigger function is declared to return `trigger`.
void	Indicates that a function returns no value.
opaque	An obsolete type, previously used for all the above purposes.

For more information, refer to: PostgreSQL Data Types

YouTip