YouTip LogoYouTip

Postgresql Data Type

In this section, we will discuss PostgreSQL's data types. Data types are what we set for each field when creating a table.

Benefits of setting data types:

PostgreSQL provides a rich set of data types. Users can create new data types in the database using the CREATE TYPE command. PostgreSQL has many data types, which we will explain in detail below.


Numeric Types

Numeric types consist of 2-byte, 4-byte, or 8-byte integers, as well as 4-byte or 8-byte floating-point numbers and decimals with optional precision.

The following table lists the available numeric types.

Name Storage Size Description Range
smallint 2 bytes Small-range integer -32768 to +32767
integer 4 bytes Typical choice for integer -2147483648 to +2147483647
bigint 8 bytes Large-range integer -9223372036854775808 to +9223372036854775807
decimal variable User-specified precision, exact Up to 131072 digits before the decimal point; up to 16383 digits after
numeric variable User-specified precision, exact Up to 131072 digits before the decimal point; up to 16383 digits after
real 4 bytes Variable precision, inexact 6 decimal digits precision
double precision 8 bytes Variable precision, inexact 15 decimal digits precision
smallserial 2 bytes Auto-incrementing small-range integer 1 to 32767
serial 4 bytes Auto-incrementing integer 1 to 2147483647
bigserial 8 bytes Auto-incrementing large-range integer 1 to 9223372036854775807

Monetary Types

The money type stores monetary amounts with a fixed fractional precision.

Values of numeric, int, and bigint types can be converted to money. It is not recommended to use floating-point numbers to handle monetary types due to the possibility of rounding errors.

Name Storage Size Description Range
money 8 bytes Monetary amount -92233720368547758.08 to +92233720368547758.07

Character Types

The following table lists the character types supported by PostgreSQL:

No. Name & Description
1 character varying(n), varchar(n) Variable-length with length limit
2 character(n), char(n) Fixed-length, padded with spaces
3 text Variable-length, no limit

Date/Time Types

The following table lists the date and time types supported by PostgreSQL.

Name Storage Size Description Lowest Value Highest Value Resolution
timestamp [ (p) ] 8 bytes Date and time (without time zone) 4713 BC 294276 AD 1 microsecond / 14 digits
timestamp [ (p) ] with time zone 8 bytes Date and time, with time zone 4713 BC 294276 AD 1 microsecond / 14 digits
date 4 bytes Date (no time of day) 4713 BC 5874897 AD 1 day
time [ (p) ] 8 bytes Time of day (without time zone) 00:00:00 24:00:00 1 microsecond / 14 digits
time [ (p) ] with time zone 12 bytes Time of day, with time zone 00:00:00+1459 24:00:00-1459 1 microsecond / 14 digits
interval [ fields ] [ (p) ] 12 bytes Time span -178000000 years 178000000 years 1 microsecond / 14 digits

Boolean Type

PostgreSQL supports the standard boolean data type.

boolean has two states: "true" or "false". A third state, "unknown", is represented by NULL.

Name Storage Format Description
boolean 1 byte true/false

Enumerated Types

An enumerated (enum) type is a data type consisting of a static, ordered set of values.

Enum types in PostgreSQL are similar to enum types in the C language.

Unlike other types, enum types must be created using the CREATE TYPE command.

CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');

Creating days of the week, as shown below:

CREATE TYPE week AS ENUM ('Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun');

Like other types, once created, enum types can be used in table and function definitions.

CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');

CREATE TABLE person (
    name text,
    current_mood mood
);

INSERT INTO person VALUES ('Moe', 'happy');

SELECT * FROM person WHERE current_mood = 'happy';
 name | current_mood
------+--------------
 Moe | happy
(1 row)

Geometric Types

Geometric data types represent two-dimensional plane objects.

The following table lists the geometric types supported by PostgreSQL.

The most fundamental type: the point. It is the basis for other types.

Name Storage Size Description Representation
point 16 bytes Point on a plane (x,y)
line 32 bytes Infinite line (not fully implemented) ((x1,y1),(x2,y2))
lseg 32 bytes Finite line segment ((x1,y1),(x2,y2))
box 32 bytes Rectangle ((x1,y1),(x2,y2))
path 16+16n bytes Closed path (similar to polygon) ((x1,y1),...)
path 16+16n bytes Open path [(x1,y1),...]
polygon 40+16n bytes Polygon (similar to closed path) ((x1,y1),...)
circle 24 bytes Circle <(x,y),r> (center and radius)

Network Address Types

PostgreSQL provides data types for storing IPv4, IPv6, and MAC addresses.

Storing network addresses with these data types is better than using plain text types because these types provide input error checking and special operators and functions.

Name Storage Size Description
cidr 7 or 19 bytes IPv4 or IPv6 network
inet 7 or 19 bytes IPv4 or IPv6 host and network
macaddr 6 bytes MAC address

When sorting inet or cidr data types, IPv4 addresses are always sorted before IPv6 addresses, including those encapsulated or mapped in IPv6 addresses, such as ::10.2.3.4 or ::ffff:10.4.3.2.


Bit String Types

Bit strings are strings of 1s and 0s. They can be used to store and visualize bitmasks. We have two SQL bit types: bit(n) and bit varying(n), where n is a positive integer.

Data of type bit must match the length n exactly; attempting to store shorter or longer data is an error. Data of type bit varying is of variable length up to length n; longer strings are rejected. Writing bit without a length is equivalent to bit(1), and bit varying without a length means no length limit.


Text Search Types

Full-text search is the technique of finding documents from a collection of natural language documents that match a query.

PostgreSQL provides two data types to support full-text search:

No. Name & Description
1 tsvector A tsvector value is a sorted list of distinct lexemes, which are the normalized forms of different variants of the same word.
2 tsquery A tsquery stores the lexemes to be searched for, and uses boolean operators & (AND), | (OR), and ! (NOT) to combine them. Parentheses can be used to enforce operator grouping.

UUID Type

The uuid data type is used to store Universally Unique Identifiers (UUID) as defined by RFC 4122, ISO/IEC 9834-8:2005, and related standards. (Some systems refer to this data type as a globally unique identifier, or GUID.) This identifier is a 128-bit identifier generated by an algorithm, making it impossible for it to be identical to identifiers generated by other means in modules known to use the same algorithm. Therefore, for distributed systems, this type of identifier provides better uniqueness guarantees than sequences, which can only guarantee uniqueness within a single database.

UUIDs are written as a sequence of lowercase hexadecimal digits, separated into groups by hyphens, specifically: a group of 8 digits + 3 groups of 4 digits + a group of 12 digits, totaling 32 digits representing 128 bits. An example of a standard UUID is:

a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11

XML Type

The xml data type can be used to store XML data. The advantage of storing XML data in the text type is that it can check input values for well-formedness and also supports functions for type-safe checking. To use this data type, you must compile with configure --with-libxml.

xml can store well-formed "documents" as defined by the XML standard, as well as "content" fragments defined by XMLDecl? content in the XML standard. Roughly, this means that content fragments can have multiple top-level elements or character nodes. The expression xmlvalue IS DOCUMENT can be used to determine whether a specific xml value is a complete document or a content fragment.

Creating XML Values

Use the function xmlparse to produce an xml type value from character data:

XMLPARSE (DOCUMENT '<?xml version="1.0"?><book><title>Manual</title><chapter>...</chapter></book>')
XMLPARSE (CONTENT 'abc<foo>bar</foo><bar>foo</bar>')

JSON Type

The json data type can be used to store JSON (JavaScript Object Notation) data. Such data can also be stored as text, but the json data type is more convenient for checking that each stored value is a valid JSON value.

There are also related functions for processing json data:

Example Result
array_to_json('{{1,5},{99,100}}'::int[]) [[1,5],[99,100]]
row_to_json(row(1,'foo')) {"f1":1,"f2":"foo"}

Array Types

PostgreSQL allows defining fields as variable-length multidimensional arrays.

Array types can be of any base type, user-defined type, enum type, or composite type.

Declaring Arrays

When creating a table, we can declare arrays as follows:

CREATE TABLE sal_emp (
    name text,
    pay_by_quarter integer[],
    schedule text[][]
);

pay_by_quarter is a one-dimensional integer array, and schedule is a two-dimensional text array.

We can also use the ARRAY keyword, as shown below:

CREATE TABLE sal_emp (
    name text,
    pay_by_quarter integer ARRAY,
    schedule text[][]
);

Inserting Values

Insert values using curly braces {}, with elements separated by commas within {}:

INSERT INTO sal_emp VALUES ('Bill', '{10000, 10000, 10000, 10000}', '{{"meeting", "lunch"}, {"training", "presentation"}}');
INSERT INTO sal_emp VALUES ('Carol', '{20000, 25000, 25000, 25000}', '{{"breakfast", "consulting"}, {"meeting", "lunch"}}');

Accessing Arrays

Now we can run some queries on this table.

First, we demonstrate how to access an element of an array. This query retrieves the names of employees whose salary changed in the second quarter:

SELECT name FROM sal_emp WHERE pay_by_quarter <> pay_by_quarter;
 name
-------
 Carol
(1 row)

The array subscript numbers are written in square brackets.

Modifying Arrays

We can modify the values of an array:

UPDATE sal_emp SET pay_by_quarter = '{25000,25000,27000,27000}' WHERE name = 'Carol';

Or use the ARRAY constructor syntax:

UPDATE sal_emp SET pay_by_quarter = ARRAY[25000,25000,27000,27000] WHERE name = 'Carol';

Searching in Arrays

To search for a value in an array, you must check each value of the array.

For example:

SELECT * FROM sal_emp WHERE pay_by_quarter = 10000 OR pay_by_quarter = 10000 OR pay_by_quarter = 10000 OR pay_by_quarter = 10000;

Alternatively, you can use the following statement to find rows where all elements in the array are equal to 10000:

SELECT * FROM sal_emp WHERE 10000 = ALL (pay_by_quarter);

Or, you can use the generate_subscripts function. For example:

SELECT * FROM (SELECT pay_by_quarter, generate_subscripts(pay_by_quarter, 1) AS s FROM sal_emp) AS foo WHERE pay_by_quarter = 10000;

Composite Types

Composite types represent the structure of a row or record; they are essentially just a list of field names and their data types. PostgreSQL allows using composite types like simple data types. For example, a field in a table can be declared as a composite type.

Declaring Composite Types

Here are two simple examples of defining composite types:

CREATE TYPE complex AS (
    r double precision,
    i double precision
);

CREATE TYPE inventory_item AS (
    name text,
    supplier_id integer,
    price numeric
);

The syntax is similar to CREATE TABLE, except that only field names and types can be declared here.

Once the type is defined, we can use it to create a table:

CREATE TABLE on_hand (
    item inventory_item,
    count integer
);

INSERT INTO on_hand VALUES (ROW('fuzzy dice', 42, 1.99), 1000);

Composite Type Value Input

To write a composite type value as a text constant, enclose the field values in parentheses and separate them with commas. You can put double quotes around any field value; if the value itself contains commas or parentheses, you must enclose it in double quotes.

The general format of a composite type constant is:

'( val1 , val2 , ... )'

An example is:

'("fuzzy dice",42,1.99)'

Accessing Composite Types

To access a field of a composite type, we write a dot and the field name, much like selecting a field from a table name. In fact, because it is so similar to selecting a field from a table name, we often need to use parentheses to avoid parser confusion. For example, you might need to select a subfield from the on_hand example table like this:

SELECT item.name FROM on_hand WHERE item.price > 9.99;

This will not work because, according to SQL syntax, item is being selected from a table name, not a field name. You must write it like this:

SELECT (item).name FROM on_hand WHERE (item).price > 9.99;

Or if you also need to use the table name (for example, in a multi-table query), write it like this:

SELECT (on_hand.item).name FROM on_hand WHERE (on_hand.item).price > 9.99;

Now the parenthesized object is correctly parsed as a reference to the item field, from which the subfield can be selected.


Range Types

Range data types represent a range of values of a certain element type.

For example, a timestamp range could be used to represent the time range during which a meeting room is reserved.

Built-in range types in PostgreSQL include:

  • int4range β€” range of integer
  • int8range β€” range of bigint
  • numrange β€” range of numeric
  • tsrange β€” range of timestamp without time zone
  • tstzrange β€” range of timestamp with time zone
  • daterange β€” range of date

In addition, you can define your own range types.

CREATE TABLE reservation (room int, during tsrange);
INSERT INTO reservation VALUES (1108, '[2010-01-01 14:30, 2010-01-01 15:30)');
-- Contains
SELECT int4range(10, 20) @> 3;
-- Overlaps
SELECT numrange(11.1, 22.2) && numrange(20.0, 30.0);
-- Extract upper bound
SELECT upper(int8range(15, 25));
-- Compute intersection
SELECT int4range(10, 20) * int4range(15, 25);
-- Is range empty?
SELECT isempty(numrange(1, 5));

Range values must be input according to the following format:

(lower,upper)
(lower,upper]
[lower,upper)
[lower,upper]
empty

Parentheses or brackets indicate whether the lower and upper bounds are exclusive or inclusive. Note that the last format is empty, representing an empty range (a range containing no values).

-- Includes 3, excludes 7, and includes all points in between
SELECT '[3,7)'::int4range;
-- Excludes both 3 and 7, but includes all points in between
SELECT '(3,7)'::int4range;
-- Includes only the single value 4
SELECT '[4,4]'::int4range;
-- Excludes the point (normalized to 'empty')
SELECT '[4,4)'::int4range;

Object Identifier Types

PostgreSQL uses object identifiers (OIDs) internally as primary keys for various system tables.

At the same time, the system does not add an OID system field to user-created tables (unless WITH OIDS is specified during table creation or the configuration parameter default_with_oids is set to enabled). The oid type represents an object identifier. Besides oid, there are several aliases: regproc, regprocedure, regoper, regoperator, regclass, regtype, regconfig, and regdictionary.

Name References Description Example Values
oid Any Numeric object identifier 564182
regproc pg_proc Function name sum
regprocedure pg_proc Function with argument types sum(int4)
regoper pg_operator Operator name +
regoperator pg_operator Operator with argument types *(integer,integer) or -(NONE,integer)
regclass pg_class Relation name pg_type
regtype pg_type Data type name integer
regconfig pg_ts_config Text search configuration english
regdictionary pg_ts_dict Text search dictionary simple

Pseudo-Types

The PostgreSQL type system contains a set of special-purpose entries, which are collectively called pseudo-types. Pseudo-types cannot be used as data types for columns, but they can be used to declare the argument or result types of a function. Pseudo-types are useful when a function does not simply accept and return a specific SQL data type.

The following table lists all pseudo-types:

Name Description
any Indicates that a function accepts any input data type.
anyelement Indicates that a function accepts any data type.
anyarray Indicates that a function accepts any array data type.
anynonarray Indicates that a function accepts any non-array data type.
anyenum Indicates that a function accepts any enum data type.
anyrange Indicates that a function accepts any range data type.
cstring Indicates that a function accepts or returns a null-terminated C string.
internal Indicates that a function accepts or returns a server-internal data type.
language_handler A procedural language call handler is declared to return language_handler.
fdw_handler A foreign-data wrapper handler is declared to return fdw_handler.
record Identifies that a function returns an unspecified row type.
trigger A trigger function is declared to return trigger.
void Indicates that a function returns no value.
opaque An obsolete type, previously used for all the above purposes.

For more information, refer to: PostgreSQL Data Types

← Data ElementDir Parent β†’