Yesterday we talked about attaching and detaching partitions. Today we will look at indexes and constraints on partitioned tables. If you missed the previous posts, here they are again:
- PostgreSQL partitioning (1): Preparing the data set
- PostgreSQL partitioning (2): Range partitioning
- PostgreSQL partitioning (3): List partitioning
- PostgreSQL partitioning (4): Hash partitioning
- PostgreSQL partitioning (5): Partition pruning
- PostgreSQL partitioning (6): Attaching and detaching partitions
When declarative partitioning was introduced in PostgreSQL 10 there were quite a few limitations. For example, if you wanted to create a primary key on a partitioned table, that simply failed and PostgreSQL told you that it is not supported. Things have improved a lot since then, and today you can do many things with partitioned tables that did not work initially (see the overview of the improvements that came in PostgreSQL 11).
This time we will use the list partitioned table, and this is what it currently looks like:
postgres=# \d+ traffic_violations_p_list
                                   Partitioned table "public.traffic_violations_p_list"
         Column          |          Type          | Collation | Nullable | Default | Storage  | Stats target | Description
-------------------------+------------------------+-----------+----------+---------+----------+--------------+-------------
 seqid                   | text                   |           |          |         | extended |              |
 date_of_stop            | date                   |           |          |         | plain    |              |
 time_of_stop            | time without time zone |           |          |         | plain    |              |
 agency                  | text                   |           |          |         | extended |              |
 subagency               | text                   |           |          |         | extended |              |
 description             | text                   |           |          |         | extended |              |
 location                | text                   |           |          |         | extended |              |
 latitude                | numeric                |           |          |         | main     |              |
 longitude               | numeric                |           |          |         | main     |              |
 accident                | text                   |           |          |         | extended |              |
 belts                   | boolean                |           |          |         | plain    |              |
 personal_injury         | boolean                |           |          |         | plain    |              |
 property_damage         | boolean                |           |          |         | plain    |              |
 fatal                   | boolean                |           |          |         | plain    |              |
 commercial_license      | boolean                |           |          |         | plain    |              |
 hazmat                  | boolean                |           |          |         | plain    |              |
 commercial_vehicle      | boolean                |           |          |         | plain    |              |
 alcohol                 | boolean                |           |          |         | plain    |              |
 workzone                | boolean                |           |          |         | plain    |              |
 state                   | text                   |           |          |         | extended |              |
 vehicletype             | text                   |           |          |         | extended |              |
 year                    | smallint               |           |          |         | plain    |              |
 make                    | text                   |           |          |         | extended |              |
 model                   | text                   |           |          |         | extended |              |
 color                   | text                   |           |          |         | extended |              |
 violation_type          | text                   |           |          |         | extended |              |
 charge                  | text                   |           |          |         | extended |              |
 article                 | text                   |           |          |         | extended |              |
 contributed_to_accident | boolean                |           |          |         | plain    |              |
 race                    | text                   |           |          |         | extended |              |
 gender                  | text                   |           |          |         | extended |              |
 driver_city             | text                   |           |          |         | extended |              |
 driver_state            | text                   |           |          |         | extended |              |
 dl_state                | text                   |           |          |         | extended |              |
 arrest_type             | text                   |           |          |         | extended |              |
 geolocation             | point                  |           |          |         | plain    |              |
 council_districts       | smallint               |           |          |         | plain    |              |
 councils                | smallint               |           |          |         | plain    |              |
 communities             | smallint               |           |          |         | plain    |              |
 zip_codes               | smallint               |           |          |         | plain    |              |
 municipalities          | smallint               |           |          |         | plain    |              |
Partition key: LIST (violation_type)
Partitions: traffic_violations_p_list_citation FOR VALUES IN ('Citation'),
            traffic_violations_p_list_esero FOR VALUES IN ('ESERO'),
            traffic_violations_p_list_sero FOR VALUES IN ('SERO'),
            traffic_violations_p_list_warning FOR VALUES IN ('Warning'),
            traffic_violations_p_list_default DEFAULT
There is not a single constraint or index, and the same is true for the partitions (only the first one is shown here, but it is the same for all of them):
postgres=# \d traffic_violations_p_list_citation
                 Table "public.traffic_violations_p_list_citation"
         Column          |          Type          | Collation | Nullable | Default
-------------------------+------------------------+-----------+----------+---------
 seqid                   | text                   |           |          |
 date_of_stop            | date                   |           |          |
 time_of_stop            | time without time zone |           |          |
 agency                  | text                   |           |          |
 subagency               | text                   |           |          |
 description             | text                   |           |          |
 location                | text                   |           |          |
 latitude                | numeric                |           |          |
 longitude               | numeric                |           |          |
 accident                | text                   |           |          |
 belts                   | boolean                |           |          |
 personal_injury         | boolean                |           |          |
 property_damage         | boolean                |           |          |
 fatal                   | boolean                |           |          |
 commercial_license      | boolean                |           |          |
 hazmat                  | boolean                |           |          |
 commercial_vehicle      | boolean                |           |          |
 alcohol                 | boolean                |           |          |
 workzone                | boolean                |           |          |
 state                   | text                   |           |          |
 vehicletype             | text                   |           |          |
 year                    | smallint               |           |          |
 make                    | text                   |           |          |
 model                   | text                   |           |          |
 color                   | text                   |           |          |
 violation_type          | text                   |           |          |
 charge                  | text                   |           |          |
 article                 | text                   |           |          |
 contributed_to_accident | boolean                |           |          |
 race                    | text                   |           |          |
 gender                  | text                   |           |          |
 driver_city             | text                   |           |          |
 driver_state            | text                   |           |          |
 dl_state                | text                   |           |          |
 arrest_type             | text                   |           |          |
 geolocation             | point                  |           |          |
 council_districts       | smallint               |           |          |
 councils                | smallint               |           |          |
 communities             | smallint               |           |          |
 zip_codes               | smallint               |           |          |
 municipalities          | smallint               |           |          |
Partition of: traffic_violations_p_list FOR VALUES IN ('Citation')
As already mentioned in one of the previous posts, we cannot create a primary key or a unique index because there are duplicate rows in the partitioned table. We can, however, create a standard btree index:
postgres=# create index i1 on traffic_violations_p_list ( model );
CREATE INDEX
postgres=# \d+ traffic_violations_p_list
                                   Partitioned table "public.traffic_violations_p_list"
         Column          |          Type          | Collation | Nullable | Default | Storage  | Stats target | Description
-------------------------+------------------------+-----------+----------+---------+----------+--------------+-------------
 seqid                   | text                   |           |          |         | extended |              |
 date_of_stop            | date                   |           |          |         | plain    |              |
 time_of_stop            | time without time zone |           |          |         | plain    |              |
 agency                  | text                   |           |          |         | extended |              |
 subagency               | text                   |           |          |         | extended |              |
 description             | text                   |           |          |         | extended |              |
 location                | text                   |           |          |         | extended |              |
 latitude                | numeric                |           |          |         | main     |              |
 longitude               | numeric                |           |          |         | main     |              |
 accident                | text                   |           |          |         | extended |              |
 belts                   | boolean                |           |          |         | plain    |              |
 personal_injury         | boolean                |           |          |         | plain    |              |
 property_damage         | boolean                |           |          |         | plain    |              |
 fatal                   | boolean                |           |          |         | plain    |              |
 commercial_license      | boolean                |           |          |         | plain    |              |
 hazmat                  | boolean                |           |          |         | plain    |              |
 commercial_vehicle      | boolean                |           |          |         | plain    |              |
 alcohol                 | boolean                |           |          |         | plain    |              |
 workzone                | boolean                |           |          |         | plain    |              |
 state                   | text                   |           |          |         | extended |              |
 vehicletype             | text                   |           |          |         | extended |              |
 year                    | smallint               |           |          |         | plain    |              |
 make                    | text                   |           |          |         | extended |              |
 model                   | text                   |           |          |         | extended |              |
 color                   | text                   |           |          |         | extended |              |
 violation_type          | text                   |           |          |         | extended |              |
 charge                  | text                   |           |          |         | extended |              |
 article                 | text                   |           |          |         | extended |              |
 contributed_to_accident | boolean                |           |          |         | plain    |              |
 race                    | text                   |           |          |         | extended |              |
 gender                  | text                   |           |          |         | extended |              |
 driver_city             | text                   |           |          |         | extended |              |
 driver_state            | text                   |           |          |         | extended |              |
 dl_state                | text                   |           |          |         | extended |              |
 arrest_type             | text                   |           |          |         | extended |              |
 geolocation             | point                  |           |          |         | plain    |              |
 council_districts       | smallint               |           |          |         | plain    |              |
 councils                | smallint               |           |          |         | plain    |              |
 communities             | smallint               |           |          |         | plain    |              |
 zip_codes               | smallint               |           |          |         | plain    |              |
 municipalities          | smallint               |           |          |         | plain    |              |
Partition key: LIST (violation_type)
Indexes:
    "i1" btree (model)
Partitions: traffic_violations_p_list_citation FOR VALUES IN ('Citation'),
            traffic_violations_p_list_esero FOR VALUES IN ('ESERO'),
            traffic_violations_p_list_sero FOR VALUES IN ('SERO'),
            traffic_violations_p_list_warning FOR VALUES IN ('Warning'),
            traffic_violations_p_list_default DEFAULT
This is a so-called partitioned index, and you can verify that with:
postgres=# select * from pg_partition_tree('i1');
                    relid                     | parentrelid | isleaf | level
----------------------------------------------+-------------+--------+-------
 i1                                           |             | f      |     0
 traffic_violations_p_list_citation_model_idx | i1          | t      |     1
 traffic_violations_p_list_esero_model_idx    | i1          | t      |     1
 traffic_violations_p_list_sero_model_idx     | i1          | t      |     1
 traffic_violations_p_list_warning_model_idx  | i1          | t      |     1
 traffic_violations_p_list_default_model_idx  | i1          | t      |     1
Indeed the index cascaded down to all the partitions:
postgres=# \d traffic_violations_p_list_citation
                 Table "public.traffic_violations_p_list_citation"
         Column          |          Type          | Collation | Nullable | Default
-------------------------+------------------------+-----------+----------+---------
 seqid                   | text                   |           |          |
 date_of_stop            | date                   |           |          |
 time_of_stop            | time without time zone |           |          |
 agency                  | text                   |           |          |
 subagency               | text                   |           |          |
 description             | text                   |           |          |
 location                | text                   |           |          |
 latitude                | numeric                |           |          |
 longitude               | numeric                |           |          |
 accident                | text                   |           |          |
 belts                   | boolean                |           |          |
 personal_injury         | boolean                |           |          |
 property_damage         | boolean                |           |          |
 fatal                   | boolean                |           |          |
 commercial_license      | boolean                |           |          |
 hazmat                  | boolean                |           |          |
 commercial_vehicle      | boolean                |           |          |
 alcohol                 | boolean                |           |          |
 workzone                | boolean                |           |          |
 state                   | text                   |           |          |
 vehicletype             | text                   |           |          |
 year                    | smallint               |           |          |
 make                    | text                   |           |          |
 model                   | text                   |           |          |
 color                   | text                   |           |          |
 violation_type          | text                   |           |          |
 charge                  | text                   |           |          |
 article                 | text                   |           |          |
 contributed_to_accident | boolean                |           |          |
 race                    | text                   |           |          |
 gender                  | text                   |           |          |
 driver_city             | text                   |           |          |
 driver_state            | text                   |           |          |
 dl_state                | text                   |           |          |
 arrest_type             | text                   |           |          |
 geolocation             | point                  |           |          |
 council_districts       | smallint               |           |          |
 councils                | smallint               |           |          |
 communities             | smallint               |           |          |
 zip_codes               | smallint               |           |          |
 municipalities          | smallint               |           |          |
Partition of: traffic_violations_p_list FOR VALUES IN ('Citation')
Indexes:
    "traffic_violations_p_list_citation_model_idx" btree (model)
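Instead of describing every partition one by one, the catalogs can list all child indexes in one go. The following is only a sketch: it assumes the partitioned index is still called i1 as above, and it relies on the fact that index partitions are recorded in pg_inherits just like table partitions. The column aliases are made up for readability.

```sql
-- List every index attached to the partitioned index i1,
-- together with the partition it belongs to and its validity flag.
select i.inhrelid::regclass as child_index,
       x.indrelid::regclass as partition_name,
       x.indisvalid
  from pg_inherits i
  join pg_index x on x.indexrelid = i.inhrelid
 where i.inhparent = 'i1'::regclass
 order by i.inhrelid::regclass::text;
```

This should return one row per partition, which makes it easy to spot a partition that is missing its index.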
As soon as you add another partition it will be indexed automatically:
postgres=# create table traffic_violations_p_list_demo
postgres-# partition of traffic_violations_p_list
postgres-# for values in ('demo');
CREATE TABLE
postgres=# \d traffic_violations_p_list_demo
                   Table "public.traffic_violations_p_list_demo"
         Column          |          Type          | Collation | Nullable | Default
-------------------------+------------------------+-----------+----------+---------
 seqid                   | text                   |           | not null |
 date_of_stop            | date                   |           |          |
 time_of_stop            | time without time zone |           |          |
 agency                  | text                   |           |          |
 subagency               | text                   |           |          |
 description             | text                   |           |          |
 location                | text                   |           |          |
 latitude                | numeric                |           |          |
 longitude               | numeric                |           |          |
 accident                | text                   |           |          |
 belts                   | boolean                |           |          |
 personal_injury         | boolean                |           |          |
 property_damage         | boolean                |           |          |
 fatal                   | boolean                |           |          |
 commercial_license      | boolean                |           |          |
 hazmat                  | boolean                |           |          |
 commercial_vehicle      | boolean                |           |          |
 alcohol                 | boolean                |           |          |
 workzone                | boolean                |           |          |
 state                   | text                   |           |          |
 vehicletype             | text                   |           |          |
 year                    | smallint               |           |          |
 make                    | text                   |           |          |
 model                   | text                   |           |          |
 color                   | text                   |           |          |
 violation_type          | text                   |           |          |
 charge                  | text                   |           |          |
 article                 | text                   |           |          |
 contributed_to_accident | boolean                |           |          |
 race                    | text                   |           |          |
 gender                  | text                   |           |          |
 driver_city             | text                   |           |          |
 driver_state            | text                   |           |          |
 dl_state                | text                   |           |          |
 arrest_type             | text                   |           |          |
 geolocation             | point                  |           |          |
 council_districts       | smallint               |           |          |
 councils                | smallint               |           |          |
 communities             | smallint               |           |          |
 zip_codes               | smallint               |           |          |
 municipalities          | smallint               |           |          |
Partition of: traffic_violations_p_list FOR VALUES IN ('demo')
Indexes:
    "traffic_violations_p_list_demo_model_idx" btree (model)
Check constraints:
    "chk_make" CHECK (length(seqid) > 1)
You can also create an index on a specific partition only (maybe because you know that the application is searching on a specific column in that partition):
postgres=# create index i2 on traffic_violations_p_list_citation (make);
CREATE INDEX
postgres=# \d traffic_violations_p_list_citation
                 Table "public.traffic_violations_p_list_citation"
         Column          |          Type          | Collation | Nullable | Default
-------------------------+------------------------+-----------+----------+---------
 seqid                   | text                   |           |          |
 date_of_stop            | date                   |           |          |
 time_of_stop            | time without time zone |           |          |
 agency                  | text                   |           |          |
 subagency               | text                   |           |          |
 description             | text                   |           |          |
 location                | text                   |           |          |
 latitude                | numeric                |           |          |
 longitude               | numeric                |           |          |
 accident                | text                   |           |          |
 belts                   | boolean                |           |          |
 personal_injury         | boolean                |           |          |
 property_damage         | boolean                |           |          |
 fatal                   | boolean                |           |          |
 commercial_license      | boolean                |           |          |
 hazmat                  | boolean                |           |          |
 commercial_vehicle      | boolean                |           |          |
 alcohol                 | boolean                |           |          |
 workzone                | boolean                |           |          |
 state                   | text                   |           |          |
 vehicletype             | text                   |           |          |
 year                    | smallint               |           |          |
 make                    | text                   |           |          |
 model                   | text                   |           |          |
 color                   | text                   |           |          |
 violation_type          | text                   |           |          |
 charge                  | text                   |           |          |
 article                 | text                   |           |          |
 contributed_to_accident | boolean                |           |          |
 race                    | text                   |           |          |
 gender                  | text                   |           |          |
 driver_city             | text                   |           |          |
 driver_state            | text                   |           |          |
 dl_state                | text                   |           |          |
 arrest_type             | text                   |           |          |
 geolocation             | point                  |           |          |
 council_districts       | smallint               |           |          |
 councils                | smallint               |           |          |
 communities             | smallint               |           |          |
 zip_codes               | smallint               |           |          |
 municipalities          | smallint               |           |          |
Partition of: traffic_violations_p_list FOR VALUES IN ('Citation')
Indexes:
    "i2" btree (make)
    "traffic_violations_p_list_citation_model_idx" btree (model)
What does not work right now is creating a partitioned index concurrently:
postgres=# create index CONCURRENTLY i_con on traffic_violations_p_list (zip_codes);
psql: ERROR:  cannot create index on partitioned table "traffic_violations_p_list" concurrently
This implies that there will be locking when you create a partitioned index. Indeed, if you create the index in one session:
postgres=# create index i_mun on traffic_violations_p_list (municipalities);
CREATE INDEX
… and at the same time insert something in another session, the insert blocks until the index has been created in the first session:
postgres=# insert into traffic_violations_p_list ( seqid, date_of_stop ) values ( 'xxxxx', date('01.01.2023'));
-- blocks until index above is created
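To watch the blocking from the outside, a third session can ask PostgreSQL which backends are currently waiting on others. This is just a sketch using pg_stat_activity and pg_blocking_pids() (available since PostgreSQL 9.6); the column aliases and the 60 character cut-off are arbitrary choices.

```sql
-- Run in a third session while the insert is waiting:
-- shows each blocked backend and the PIDs it is waiting for.
select pid,
       wait_event_type,
       wait_event,
       pg_blocking_pids(pid) as blocked_by,
       left(query, 60)       as query
  from pg_stat_activity
 where cardinality(pg_blocking_pids(pid)) > 0;
```

While the index build is running, the inserting session should show up here with the PID of the session that creates the index in blocked_by.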
You can limit the locking time by creating the index on the partitioned table only, without cascading down to the partitions:
postgres=# create index i_demo on only traffic_violations_p_list (accident);
CREATE INDEX
This will leave the index in an invalid state:
postgres=# select indisvalid from pg_index where indexrelid = 'i_demo'::regclass;
 indisvalid
------------
 f
(1 row)
Now you can create the index concurrently on all the partitions:
postgres=# create index concurrently i_demo_citation on traffic_violations_p_list_citation (accident);
CREATE INDEX
postgres=# create index concurrently i_demo_demo on traffic_violations_p_list_demo (accident);
CREATE INDEX
postgres=# create index concurrently i_demo_esero on traffic_violations_p_list_esero(accident);
CREATE INDEX
postgres=# create index concurrently i_demo_sero on traffic_violations_p_list_sero(accident);
CREATE INDEX
postgres=# create index concurrently i_demo_warning on traffic_violations_p_list_warning(accident);
CREATE INDEX
postgres=# create index concurrently i_demo_default on traffic_violations_p_list_default(accident);
CREATE INDEX
Once you have that you can attach all the indexes to the partitioned index:
postgres=# alter index i_demo attach partition i_demo_citation;
ALTER INDEX
postgres=# alter index i_demo attach partition i_demo_demo;
ALTER INDEX
postgres=# alter index i_demo attach partition i_demo_esero;
ALTER INDEX
postgres=# alter index i_demo attach partition i_demo_sero;
ALTER INDEX
postgres=# alter index i_demo attach partition i_demo_warning;
ALTER INDEX
postgres=# alter index i_demo attach partition i_demo_default;
ALTER INDEX
This makes the partitioned index valid automatically:
postgres=# select indisvalid from pg_index where indexrelid = 'i_demo'::regclass;
 indisvalid
------------
 t
(1 row)
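With more than a handful of partitions, typing these statements by hand gets tedious. One possible approach, shown here only as a sketch, is to generate them from pg_inherits and let psql execute each generated row with \gexec. The assumptions: you run this in psql with autocommit on (create index concurrently cannot run inside a transaction block), the partitioned index i_demo has already been created with "on only", there is just one level of partitioning, and the child indexes follow the i_demo_<partition suffix> naming used above. None of the child indexes may exist yet, so for the walkthrough above you would use a different index name.

```sql
-- Step 1: one "create index concurrently" per partition.
-- replace() turns e.g. traffic_violations_p_list_citation into i_demo_citation.
select format('create index concurrently %I on %s (accident)',
              replace(c.relname, 'traffic_violations_p_list', 'i_demo'),
              c.oid::regclass)
  from pg_inherits i
  join pg_class c on c.oid = i.inhrelid
 where i.inhparent = 'traffic_violations_p_list'::regclass
\gexec

-- Step 2: attach every child index to the partitioned index.
select format('alter index i_demo attach partition %I',
              replace(c.relname, 'traffic_violations_p_list', 'i_demo'))
  from pg_inherits i
  join pg_class c on c.oid = i.inhrelid
 where i.inhparent = 'traffic_violations_p_list'::regclass
\gexec
```

Once the last child index is attached, the indisvalid check from above should return t for the partitioned index.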
The very same is true for constraints: you can create them on the partitioned table or on specific partitions. On the partitioned table:
postgres=# alter table traffic_violations_p_list add constraint chk_make check (length(seqid)>1);
ALTER TABLE
postgres=# \d+ traffic_violations_p_list
                                   Partitioned table "public.traffic_violations_p_list"
         Column          |          Type          | Collation | Nullable | Default | Storage  | Stats target | Description
-------------------------+------------------------+-----------+----------+---------+----------+--------------+-------------
 seqid                   | text                   |           |          |         | extended |              |
 date_of_stop            | date                   |           |          |         | plain    |              |
 time_of_stop            | time without time zone |           |          |         | plain    |              |
 agency                  | text                   |           |          |         | extended |              |
 subagency               | text                   |           |          |         | extended |              |
 description             | text                   |           |          |         | extended |              |
 location                | text                   |           |          |         | extended |              |
 latitude                | numeric                |           |          |         | main     |              |
 longitude               | numeric                |           |          |         | main     |              |
 accident                | text                   |           |          |         | extended |              |
 belts                   | boolean                |           |          |         | plain    |              |
 personal_injury         | boolean                |           |          |         | plain    |              |
 property_damage         | boolean                |           |          |         | plain    |              |
 fatal                   | boolean                |           |          |         | plain    |              |
 commercial_license      | boolean                |           |          |         | plain    |              |
 hazmat                  | boolean                |           |          |         | plain    |              |
 commercial_vehicle      | boolean                |           |          |         | plain    |              |
 alcohol                 | boolean                |           |          |         | plain    |              |
 workzone                | boolean                |           |          |         | plain    |              |
 state                   | text                   |           |          |         | extended |              |
 vehicletype             | text                   |           |          |         | extended |              |
 year                    | smallint               |           |          |         | plain    |              |
 make                    | text                   |           |          |         | extended |              |
 model                   | text                   |           |          |         | extended |              |
 color                   | text                   |           |          |         | extended |              |
 violation_type          | text                   |           |          |         | extended |              |
 charge                  | text                   |           |          |         | extended |              |
 article                 | text                   |           |          |         | extended |              |
 contributed_to_accident | boolean                |           |          |         | plain    |              |
 race                    | text                   |           |          |         | extended |              |
 gender                  | text                   |           |          |         | extended |              |
 driver_city             | text                   |           |          |         | extended |              |
 driver_state            | text                   |           |          |         | extended |              |
 dl_state                | text                   |           |          |         | extended |              |
 arrest_type             | text                   |           |          |         | extended |              |
 geolocation             | point                  |           |          |         | plain    |              |
 council_districts       | smallint               |           |          |         | plain    |              |
 councils                | smallint               |           |          |         | plain    |              |
 communities             | smallint               |           |          |         | plain    |              |
 zip_codes               | smallint               |           |          |         | plain    |              |
 municipalities          | smallint               |           |          |         | plain    |              |
Partition key: LIST (violation_type)
Indexes:
    "i1" btree (model)
Check constraints:
    "chk_make" CHECK (length(seqid) > 1)
Partitions: traffic_violations_p_list_citation FOR VALUES IN ('Citation'),
            traffic_violations_p_list_esero FOR VALUES IN ('ESERO'),
            traffic_violations_p_list_sero FOR VALUES IN ('SERO'),
            traffic_violations_p_list_warning FOR VALUES IN ('Warning'),
            traffic_violations_p_list_default DEFAULT
postgres=# \d+ traffic_violations_p_list_citation
                                     Table "public.traffic_violations_p_list_citation"
         Column          |          Type          | Collation | Nullable | Default | Storage  | Stats target | Description
-------------------------+------------------------+-----------+----------+---------+----------+--------------+-------------
 seqid                   | text                   |           |          |         | extended |              |
 date_of_stop            | date                   |           |          |         | plain    |              |
 time_of_stop            | time without time zone |           |          |         | plain    |              |
 agency                  | text                   |           |          |         | extended |              |
 subagency               | text                   |           |          |         | extended |              |
 description             | text                   |           |          |         | extended |              |
 location                | text                   |           |          |         | extended |              |
 latitude                | numeric                |           |          |         | main     |              |
 longitude               | numeric                |           |          |         | main     |              |
 accident                | text                   |           |          |         | extended |              |
 belts                   | boolean                |           |          |         | plain    |              |
 personal_injury         | boolean                |           |          |         | plain    |              |
 property_damage         | boolean                |           |          |         | plain    |              |
 fatal                   | boolean                |           |          |         | plain    |              |
 commercial_license      | boolean                |           |          |         | plain    |              |
 hazmat                  | boolean                |           |          |         | plain    |              |
 commercial_vehicle      | boolean                |           |          |         | plain    |              |
 alcohol                 | boolean                |           |          |         | plain    |              |
 workzone                | boolean                |           |          |         | plain    |              |
 state                   | text                   |           |          |         | extended |              |
 vehicletype             | text                   |           |          |         | extended |              |
 year                    | smallint               |           |          |         | plain    |              |
 make                    | text                   |           |          |         | extended |              |
 model                   | text                   |           |          |         | extended |              |
 color                   | text                   |           |          |         | extended |              |
 violation_type          | text                   |           |          |         | extended |              |
 charge                  | text                   |           |          |         | extended |              |
 article                 | text                   |           |          |         | extended |              |
 contributed_to_accident | boolean                |           |          |         | plain    |              |
 race                    | text                   |           |          |         | extended |              |
 gender                  | text                   |           |          |         | extended |              |
 driver_city             | text                   |           |          |         | extended |              |
 driver_state            | text                   |           |          |         | extended |              |
 dl_state                | text                   |           |          |         | extended |              |
 arrest_type             | text                   |           |          |         | extended |              |
 geolocation             | point                  |           |          |         | plain    |              |
 council_districts       | smallint               |           |          |         | plain    |              |
 councils                | smallint               |           |          |         | plain    |              |
 communities             | smallint               |           |          |         | plain    |              |
 zip_codes               | smallint               |           |          |         | plain    |              |
 municipalities          | smallint               |           |          |         | plain    |              |
Partition of: traffic_violations_p_list FOR VALUES IN ('Citation')
Partition constraint: ((violation_type IS NOT NULL) AND (violation_type = 'Citation'::text))
Indexes:
    "i2" btree (make)
    "traffic_violations_p_list_citation_model_idx" btree (model)
Check constraints:
    "chk_make" CHECK (length(seqid) > 1)
Access method: heap
For a specific partition only:
postgres=# alter table traffic_violations_p_list_citation add constraint chk_state check (state is not null);
ALTER TABLE
postgres=# \d traffic_violations_p_list_citation
                 Table "public.traffic_violations_p_list_citation"
         Column          |          Type          | Collation | Nullable | Default
-------------------------+------------------------+-----------+----------+---------
 seqid                   | text                   |           |          |
 date_of_stop            | date                   |           |          |
 time_of_stop            | time without time zone |           |          |
 agency                  | text                   |           |          |
 subagency               | text                   |           |          |
 description             | text                   |           |          |
 location                | text                   |           |          |
 latitude                | numeric                |           |          |
 longitude               | numeric                |           |          |
 accident                | text                   |           |          |
 belts                   | boolean                |           |          |
 personal_injury         | boolean                |           |          |
 property_damage         | boolean                |           |          |
 fatal                   | boolean                |           |          |
 commercial_license      | boolean                |           |          |
 hazmat                  | boolean                |           |          |
 commercial_vehicle      | boolean                |           |          |
 alcohol                 | boolean                |           |          |
 workzone                | boolean                |           |          |
 state                   | text                   |           |          |
 vehicletype             | text                   |           |          |
 year                    | smallint               |           |          |
 make                    | text                   |           |          |
 model                   | text                   |           |          |
 color                   | text                   |           |          |
 violation_type          | text                   |           |          |
 charge                  | text                   |           |          |
 article                 | text                   |           |          |
 contributed_to_accident | boolean                |           |          |
 race                    | text                   |           |          |
 gender                  | text                   |           |          |
 driver_city             | text                   |           |          |
 driver_state            | text                   |           |          |
 dl_state                | text                   |           |          |
 arrest_type             | text                   |           |          |
 geolocation             | point                  |           |          |
 council_districts       | smallint               |           |          |
 councils                | smallint               |           |          |
 communities             | smallint               |           |          |
 zip_codes               | smallint               |           |          |
 municipalities          | smallint               |           |          |
Partition of: traffic_violations_p_list FOR VALUES IN ('Citation')
Indexes:
    "i2" btree (make)
    "traffic_violations_p_list_citation_model_idx" btree (model)
Check constraints:
    "chk_make" CHECK (length(seqid) > 1)
    "chk_state" CHECK (state IS NOT NULL)
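To tell inherited constraints apart from the ones defined directly on a partition, pg_constraint can be queried instead of reading through the \d output. A minimal sketch, assuming the constraint names chk_make and chk_state from above; conislocal and coninhcount are regular pg_constraint columns, and the aliases are arbitrary.

```sql
-- conislocal  = true : the constraint was defined directly on this table
-- coninhcount > 0    : the constraint is inherited from a parent
select conrelid::regclass as table_name,
       conname,
       conislocal,
       coninhcount
  from pg_constraint
 where conname in ('chk_make', 'chk_state')
   and contype = 'c'
 order by conrelid::regclass::text, conname;
```

chk_make should be listed for the partitioned table and for every partition, while chk_state should appear for traffic_violations_p_list_citation only.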
Changing the properties of a column works the same way, either on the partitioned table level or for a specific partition only:
postgres=# alter table traffic_violations_p_list alter column seqid set not null;
ALTER TABLE
postgres=# alter table traffic_violations_p_list_citation alter column time_of_stop set not null;
ALTER TABLE
postgres=# \d+ traffic_violations_p_list
                                   Partitioned table "public.traffic_violations_p_list"
         Column          |          Type          | Collation | Nullable | Default | Storage  | Stats target | Description
-------------------------+------------------------+-----------+----------+---------+----------+--------------+-------------
 seqid                   | text                   |           | not null |         | extended |              |
 date_of_stop            | date                   |           |          |         | plain    |              |
 time_of_stop            | time without time zone |           |          |         | plain    |              |
 agency                  | text                   |           |          |         | extended |              |
 subagency               | text                   |           |          |         | extended |              |
 description             | text                   |           |          |         | extended |              |
 location                | text                   |           |          |         | extended |              |
 latitude                | numeric                |           |          |         | main     |              |
 longitude               | numeric                |           |          |         | main     |              |
 accident                | text                   |           |          |         | extended |              |
 belts                   | boolean                |           |          |         | plain    |              |
 personal_injury         | boolean                |           |          |         | plain    |              |
 property_damage         | boolean                |           |          |         | plain    |              |
 fatal                   | boolean                |           |          |         | plain    |              |
 commercial_license      | boolean                |           |          |         | plain    |              |
 hazmat                  | boolean                |           |          |         | plain    |              |
 commercial_vehicle      | boolean                |           |          |         | plain    |              |
 alcohol                 | boolean                |           |          |         | plain    |              |
 workzone                | boolean                |           |          |         | plain    |              |
 state                   | text                   |           |          |         | extended |              |
 vehicletype             | text                   |           |          |         | extended |              |
 year                    | smallint               |           |          |         | plain    |              |
 make                    | text                   |           |          |         | extended |              |
 model                   | text                   |           |          |         | extended |              |
 color                   | text                   |           |          |         | extended |              |
 violation_type          | text                   |           |          |         | extended |              |
 charge                  | text                   |           |          |         | extended |              |
 article                 | text                   |           |          |         | extended |              |
 contributed_to_accident | boolean                |           |          |         | plain    |              |
 race                    | text                   |           |          |         | extended |              |
 gender                  | text                   |           |          |         | extended |              |
 driver_city             | text                   |           |          |         | extended |              |
 driver_state            | text                   |           |          |         | extended |              |
 dl_state                | text                   |           |          |         | extended |              |
 arrest_type             | text                   |           |          |         | extended |              |
 geolocation             | point                  |           |          |         | plain    |              |
 council_districts       | smallint               |           |          |         | plain    |              |
 councils                | smallint               |           |          |         | plain    |              |
 communities             | smallint               |           |          |         | plain    |              |
 zip_codes               | smallint               |           |          |         | plain    |              |
 municipalities          | smallint               |           |          |         | plain    |              |
Partition key: LIST (violation_type)
Indexes:
    "i1" btree (model)
Check constraints:
    "chk_make" CHECK (length(seqid) > 1)
Partitions: traffic_violations_p_list_citation FOR VALUES IN ('Citation'),
            traffic_violations_p_list_esero FOR VALUES IN ('ESERO'),
            traffic_violations_p_list_sero FOR VALUES IN ('SERO'),
            traffic_violations_p_list_warning FOR VALUES IN ('Warning'),
            traffic_violations_p_list_default DEFAULT
postgres=# \d traffic_violations_p_list_citation
                 Table "public.traffic_violations_p_list_citation"
         Column          |          Type          | Collation | Nullable | Default
-------------------------+------------------------+-----------+----------+---------
 seqid                   | text                   |           | not null |
 date_of_stop            | date                   |           |          |
 time_of_stop            | time without time zone |           | not null |
 agency                  | text                   |           |          |
 subagency               | text                   |           |          |
 description             | text                   |           |          |
 location                | text                   |           |          |
 latitude                | numeric                |           |          |
 longitude               | numeric                |           |          |
 accident                | text                   |           |          |
 belts                   | boolean                |           |          |
 personal_injury         | boolean                |           |          |
 property_damage         | boolean                |           |          |
 fatal                   | boolean                |           |          |
 commercial_license      | boolean                |           |          |
 hazmat                  | boolean                |           |          |
 commercial_vehicle      | boolean                |           |          |
 alcohol                 | boolean                |           |          |
 workzone                | boolean                |           |          |
 state                   | text                   |           |          |
 vehicletype             | text                   |           |          |
 year                    | smallint               |           |          |
 make                    | text                   |           |          |
 model                   | text                   |           |          |
 color                   | text                   |           |          |
 violation_type          | text                   |           |          |
 charge                  | text                   |           |          |
 article                 | text                   |           |          |
 contributed_to_accident | boolean                |           |          |
 race                    | text                   |           |          |
 gender                  | text                   |           |          |
 driver_city             | text                   |           |          |
 driver_state            | text                   |           |          |
 dl_state                | text                   |           |          |
 arrest_type             | text                   |           |          |
 geolocation             | point                  |           |          |
 council_districts       | smallint               |           |          |
 councils                | smallint               |           |          |
 communities             | smallint               |           |          |
 zip_codes               | smallint               |           |          |
 municipalities          | smallint               |           |          |
Partition of: traffic_violations_p_list FOR VALUES IN ('Citation')
Indexes:
    "i2" btree (make)
    "traffic_violations_p_list_citation_model_idx" btree (model)
Check constraints:
    "chk_make" CHECK (length(seqid) > 1)
    "chk_state" CHECK (state IS NOT NULL)
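If you want to compare the not null settings across all partitions without describing each of them, pg_attribute gives a compact overview. Again just a sketch, assuming a single level of partitioning and the two columns changed above; the alias table_name is arbitrary.

```sql
-- Show the not null flag of seqid and time_of_stop for the
-- partitioned table and all of its direct partitions.
select a.attrelid::regclass as table_name,
       a.attname,
       a.attnotnull
  from pg_attribute a
 where a.attname in ('seqid', 'time_of_stop')
   and (   a.attrelid = 'traffic_violations_p_list'::regclass
        or a.attrelid in (select inhrelid
                            from pg_inherits
                           where inhparent = 'traffic_violations_p_list'::regclass))
 order by a.attrelid::regclass::text, a.attname;
```

seqid should be flagged everywhere, time_of_stop only for traffic_violations_p_list_citation.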
This was indexing and constraints with partitioned tables. In the next post we will have a look at sub-partitioning.
The post PostgreSQL partitioning (7): Indexing and constraints first appeared on Blog dbi services.