Creating a standard B-tree index in PostgreSQL is nothing special and usually we do it without thinking about what happens in the background. For the standard data types like e.g. int or text we do not need to know anything about operators or operator classes. In those cases the standard operator class for the data type is applied and all is fine. But sometimes we need to know about the possibilities PostgreSQL comes with, and this is the topic of this post.
As usual, let’s start with a simple setup:
postgres=# create table t1 ( a int, b text ); CREATE TABLE postgres=# insert into t1 select i, md5(i::text) from generate_series(1,1000000) i; INSERT 0 1000000 postgres=# select * from t1 limit 5; a | b ---+---------------------------------- 1 | c4ca4238a0b923820dcc509a6f75849b 2 | c81e728d9d4c2f636f067f89cc14862c 3 | eccbc87e4b5ce2fe28308fd9f2a7baf3 4 | a87ff679a2f3e71d9181a67b7542122c 5 | e4da3b7fbbce2345d7772b0674a318d5 (5 rows)
We now have a simple table with two columns and can create standard B-tree indexes like this:
postgres=# create index i1 on t1(a); CREATE INDEX postgres=# create index i2 on t1(b); CREATE INDEX postgres=# \d t1 Table "public.t1" Column | Type | Collation | Nullable | Default --------+---------+-----------+----------+--------- a | integer | | | b | text | | | Indexes: "i1" btree (a) "i2" btree (b)
These indexes work perfectly fine, when you ask for rows like e.g. this:
postgres=# explain select * from t1 where a = 5; QUERY PLAN -------------------------------------------------------------- Index Scan using i1 on t1 (cost=0.42..8.44 rows=1 width=37) Index Cond: (a = 5) (2 rows) postgres=# explain select * from t1 where b = 'e4da3b7fbbce2345d7772b0674a318d5'; QUERY PLAN -------------------------------------------------------------- Index Scan using i2 on t1 (cost=0.42..8.44 rows=1 width=37) Index Cond: (b = 'e4da3b7fbbce2345d7772b0674a318d5'::text) (2 rows)
But how can we know which operators are supported by a specific index and data type? This can easily be answered by asking the PostgreSQL catalog:
postgres=# select am.amname AS index_method , opc.opcname AS opclass_name , opc.opcintype::regtype AS indexed_type , opc.opcdefault AS is_default from pg_am am , pg_opclass opc where opc.opcmethod = am.oid and am.amname = 'btree' order by index_method, opclass_name; index_method | opclass_name | indexed_type | is_default --------------+---------------------+-----------------------------+------------ btree | array_ops | anyarray | t btree | bit_ops | bit | t btree | bool_ops | boolean | t btree | bpchar_ops | character | t btree | bpchar_pattern_ops | character | f btree | bytea_ops | bytea | t btree | char_ops | "char" | t btree | cidr_ops | inet | f btree | date_ops | date | t btree | enum_ops | anyenum | t btree | float4_ops | real | t btree | float8_ops | double precision | t btree | inet_ops | inet | t btree | int2_ops | smallint | t btree | int4_ops | integer | t btree | int8_ops | bigint | t btree | interval_ops | interval | t btree | jsonb_ops | jsonb | t btree | macaddr8_ops | macaddr8 | t btree | macaddr_ops | macaddr | t btree | money_ops | money | t btree | name_ops | name | t btree | numeric_ops | numeric | t btree | oid_ops | oid | t btree | oidvector_ops | oidvector | t btree | pg_lsn_ops | pg_lsn | t btree | range_ops | anyrange | t btree | record_image_ops | record | f btree | record_ops | record | t btree | text_ops | text | t btree | text_pattern_ops | text | f btree | tid_ops | tid | t btree | time_ops | time without time zone | t btree | timestamp_ops | timestamp without time zone | t btree | timestamptz_ops | timestamp with time zone | t btree | timetz_ops | time with time zone | t btree | tsquery_ops | tsquery | t btree | tsvector_ops | tsvector | t btree | uuid_ops | uuid | t btree | varbit_ops | bit varying | t btree | varchar_ops | text | f btree | varchar_pattern_ops | text | f btree | xid8_ops | xid8 | t (43 rows)
A B-tree index, as you can see above, supports all the data types listed in the “indexed_type” column. If you want to know which operators are supported by a operator family you can also ask the PostgreSQL catalog, e.g. for integer and text:
postgres=# select am.amname as index_method , opf.opfname as opfamily_name , amop.amopopr::regoperator as opfamily_operator from pg_am am , pg_opfamily opf , pg_amop amop where opf.opfmethod = am.oid and amop.amopfamily = opf.oid and am.amname = 'btree' and opf.opfname in ('integer_ops','text_ops') order by 1,2,3; index_method | opfamily_name | opfamily_operator --------------+---------------+----------------------- btree | integer_ops | =(integer,bigint) btree | integer_ops | (integer,bigint) btree | integer_ops | =(integer,bigint) btree | integer_ops | =(smallint,smallint) btree | integer_ops | <(smallint,smallint) btree | integer_ops | =(integer,integer) btree | integer_ops | <(integer,integer) btree | integer_ops | =(bigint,bigint) btree | integer_ops | (bigint,bigint) btree | integer_ops | =(bigint,bigint) btree | integer_ops | =(bigint,integer) btree | integer_ops | (bigint,integer) btree | integer_ops | =(bigint,integer) btree | integer_ops | >(smallint,smallint) btree | integer_ops | >(integer,integer) btree | integer_ops | <=(smallint,smallint) btree | integer_ops | =(smallint,smallint) btree | integer_ops | >=(integer,integer) btree | integer_ops | =(smallint,integer) btree | integer_ops | =(integer,smallint) btree | integer_ops | <(smallint,integer) btree | integer_ops | (smallint,integer) btree | integer_ops | >(integer,smallint) btree | integer_ops | <=(smallint,integer) btree | integer_ops | =(smallint,integer) btree | integer_ops | >=(integer,smallint) btree | integer_ops | =(smallint,bigint) btree | integer_ops | (smallint,bigint) btree | integer_ops | =(smallint,bigint) btree | integer_ops | =(bigint,smallint) btree | integer_ops | (bigint,smallint) btree | integer_ops | =(bigint,smallint) btree | text_ops | =(name,name) btree | text_ops | =(text,text) btree | text_ops | =(name,text) btree | text_ops | <(name,text) btree | text_ops | =(name,text) btree | text_ops | >(name,text) btree | text_ops | =(text,name) btree | text_ops | <(text,name) btree | text_ops | =(text,name) btree | text_ops | >(text,name) btree | text_ops | <(name,name) btree | text_ops | (name,name) btree | text_ops | >=(name,name) btree | text_ops | <(text,text) btree | text_ops | (text,text) btree | text_ops | >=(text,text) (65 rows)
Using a B-tree index for these data types the index will support “>”, “=”, “<=" and "=". This can be annoying if you want to query a text column like this, as the index cannot support this:
postgres=# explain select * from t1 where b like 'e4da3b7fbbce2345d77%'; QUERY PLAN ----------------------------------------------------------------------- Gather (cost=1000.00..14552.33 rows=100 width=37) Workers Planned: 2 -> Parallel Seq Scan on t1 (cost=0.00..13542.33 rows=42 width=37) Filter: (b ~~ 'e4da3b7fbbce2345d77%'::text) (4 rows)
For “text” PostgreSQL comes with another build-in operator class you can use exactly for this purpose:
postgres=# select am.amname AS index_method , opc.opcname AS opclass_name , opc.opcintype::regtype AS indexed_type , opc.opcdefault AS is_default from pg_am am , pg_opclass opc where opc.opcmethod = am.oid and am.amname = 'btree' and opc.opcintype::regtype::text = 'text' order by index_method, opclass_name; index_method | opclass_name | indexed_type | is_default --------------+---------------------+--------------+------------ btree | text_ops | text | t btree | text_pattern_ops | text | f btree | varchar_ops | text | f btree | varchar_pattern_ops | text | f (4 rows)
The default for “text ” is “text_ops”, but there is also “text_pattern_ops”, which you can use like this:
postgres=# select am.amname as index_method , opf.opfname as opfamily_name , amop.amopopr::regoperator as opfamily_operator from pg_am am , pg_opfamily opf , pg_amop amop where opf.opfmethod = am.oid and amop.amopfamily = opf.oid and am.amname = 'btree' and opf.opfname = 'text_pattern_ops' order by 1,2,3; index_method | opfamily_name | opfamily_operator --------------+------------------+------------------- btree | text_pattern_ops | =(text,text) btree | text_pattern_ops | ~<~(text,text) btree | text_pattern_ops | ~=~(text,text) btree | text_pattern_ops | ~>~(text,text) (5 rows) postgres=# create index i3 on t1(b text_pattern_ops); CREATE INDEX postgres=# explain select * from t1 where b like 'e4da3b7fbbce2345d77%'; QUERY PLAN ---------------------------------------------------------------------------------------------- Index Scan using i3 on t1 (cost=0.42..8.45 rows=100 width=37) Index Cond: ((b ~>=~ 'e4da3b7fbbce2345d77'::text) AND (b ~<~ 'e4da3b7fbbce2345d78'::text)) Filter: (b ~~ 'e4da3b7fbbce2345d77%'::text) (3 rows)
Here is more detailed description in the documentation. Finally, if you know what you need, you can create your own operators, operator families and operator classes:
postgres=# \h create operator Command: CREATE OPERATOR Description: define a new operator Syntax: CREATE OPERATOR name ( {FUNCTION|PROCEDURE} = function_name [, LEFTARG = left_type ] [, RIGHTARG = right_type ] [, COMMUTATOR = com_op ] [, NEGATOR = neg_op ] [, RESTRICT = res_proc ] [, JOIN = join_proc ] [, HASHES ] [, MERGES ] ) URL: https://www.postgresql.org/docs/13/sql-createoperator.html postgres=# \h create operator family Command: CREATE OPERATOR FAMILY Description: define a new operator family Syntax: CREATE OPERATOR FAMILY name USING index_method URL: https://www.postgresql.org/docs/13/sql-createopfamily.html postgres=# \h create operator class Command: CREATE OPERATOR CLASS Description: define a new operator class Syntax: CREATE OPERATOR CLASS name [ DEFAULT ] FOR TYPE data_type USING index_method [ FAMILY family_name ] AS { OPERATOR strategy_number operator_name [ ( op_type, op_type ) ] [ FOR SEARCH | FOR ORDER BY sort_family_name ] | FUNCTION support_number [ ( op_type [ , op_type ] ) ] function_name ( argument_type [, ...] ) | STORAGE storage_type } [, ... ] URL: https://www.postgresql.org/docs/13/sql-createopclass.html
Cet article PostgreSQL indexes and operators est apparu en premier sur Blog dbi services.