If you followed Getting started with Exasol – Setting up an environment and Getting started with Exasol – Loading data from PostgreSQL you should have an Exasol test system up and running and a virtual schema pointing to a PostgreSQL schema. What we’ll be doing in this post is to load the data from PostgreSQL into Exasol and then have a look at how transactions work in Exasol and how Exasol is handling indexes.
As a quick reminder the virtual schema we’ve created in the previous post looks like this:
SQL_EXA> select table_name from exa_dba_tables where TABLE_SCHEMA = 'POSTGRESQL_REMOTE'; EXA: select table_name from exa_dba_tables where TABLE_SCHEMA = 'POSTGRESQL... TABLE_NAME -------------------------------------------------------------------------------------------------------------------------------- PGBENCH_ACCOUNTS PGBENCH_BRANCHES PGBENCH_HISTORY PGBENCH_TELLERS 4 rows in resultset.
As these table are actually in PostgreSQL we’re going to load them locally into an Exasol schema:
SQL_EXA> create schema demo; EXA: create schema demo; Rows affected: 0 SQL_EXA> open schema demo; EXA: open schema demo; Rows affected: 0 SQL_EXA> create table PGBENCH_ACCOUNTS as select * from POSTGRESQL_REMOTE.PGBENCH_ACCOUNTS; EXA: create table PGBENCH_ACCOUNTS as select * from POSTGRESQL_REMOTE.PGBEN... Rows affected: 10000000
An alternative method to do the same is to use SELECT INTO statement in Exasol:
SQL_EXA> select * into table PGBENCH_BRANCHES from POSTGRESQL_REMOTE.PGBENCH_BRANCHES; EXA: select * into table PGBENCH_BRANCHES from POSTGRESQL_REMOTE.PGBENCH_BR... Rows affected: 100 SQL_EXA> select * into table PGBENCH_HISTORY from POSTGRESQL_REMOTE.PGBENCH_HISTORY; EXA: select * into table PGBENCH_HISTORY from POSTGRESQL_REMOTE.PGBENCH_HIS... Rows affected: 0 SQL_EXA> select * into table PGBENCH_TELLERS from POSTGRESQL_REMOTE.PGBENCH_TELLERS; EXA: select * into table PGBENCH_TELLERS from POSTGRESQL_REMOTE.PGBENCH_TEL... Rows affected: 1000
Of course we lost all primary keys, foreign keys and check constraints by doing this:
SQL_EXA> select count(*) from EXA_DBA_CONSTRAINTS where CONSTRAINT_SCHEMA = 'DEMO'; EXA: select count(*) from EXA_DBA_CONSTRAINTS where CONSTRAINT_SCHEMA = 'DE... COUNT(*) --------------------- 0 1 row in resultset.
Let’s add the same constraints as we have them on the PostgreSQL side:
SQL_EXA> alter table PGBENCH_ACCOUNTS add constraint PGBENCH_ACCOUNTS_PK primary key (AID); EXA: alter table PGBENCH_ACCOUNTS add constraint PGBENCH_ACCOUNTS_PK primar... Rows affected: 0 SQL_EXA> alter table PGBENCH_BRANCHES add constraint PGBENCH_BRANCHES_PK primary key (BID); EXA: alter table PGBENCH_BRANCHES add constraint PGBENCH_BRANCHES_PK primar... Rows affected: 0 SQL_EXA> alter table PGBENCH_TELLERS add constraint PGBENCH_TELLERS_PK primary key (TID); EXA: alter table PGBENCH_TELLERS add constraint PGBENCH_TELLERS_PK primary ... Rows affected: 0 SQL_EXA> select count(*) from EXA_DBA_CONSTRAINTS where CONSTRAINT_SCHEMA = 'DEMO'; EXA: select count(*) from EXA_DBA_CONSTRAINTS where CONSTRAINT_SCHEMA = 'DE... COUNT(*) --------------------- 3 1 row in resultset.
Now we have exactly the same setup as in PostgreSQL. Creating the primary keys should also have created some indexes in the background:
SQL_EXA> col index_schema for a30; COLUMN index_schema ON FORMAT a30 SQL_EXA> col index_table for a30; COLUMN index_table ON FORMAT a30 SQL_EXA> col index_type for a15; COLUMN index_type ON FORMAT a15 SQL_EXA> select INDEX_SCHEMA,INDEX_TABLE,INDEX_TYPE from EXA_DBA_INDICES where INDEX_SCHEMA = 'DEMO'; EXA: select INDEX_SCHEMA,INDEX_TABLE,INDEX_TYPE from EXA_DBA_INDICES where ... INDEX_SCHEMA INDEX_TABLE INDEX_TYPE ------------------------------ ------------------------------ --------------- DEMO PGBENCH_ACCOUNTS GLOBAL DEMO PGBENCH_BRANCHES GLOBAL DEMO PGBENCH_TELLERS GLOBAL 3 rows in resultset. SQL_EXA>
If you’re looking for an index name, there is none:
SQL_EXA> col column_name for a30; COLUMN column_name ON FORMAT a30 SQL_EXA> col column_comment for a50; COLUMN column_comment ON FORMAT a50 SQL_EXA> desc full EXA_DBA_INDICES; EXA: desc full EXA_DBA_INDICES; COLUMN_NAME SQL_TYPE NULLABLE DISTRIBUTION_KEY PARTITION_KEY COLUMN_COMMENT ------------------------------ ---------------------------------------- -------- ---------------- ---------------- -------------------------------------------------- INDEX_SCHEMA VARCHAR(128) UTF8 Schema of the index INDEX_TABLE VARCHAR(128) UTF8 Table of the index INDEX_OWNER VARCHAR(128) UTF8 Owner of the index INDEX_OBJECT_ID DECIMAL(18,0) ID of the index INDEX_TYPE VARCHAR(20) UTF8 Index type IS_GEOMETRY BOOLEAN States whether this is a geospatial index MEM_OBJECT_SIZE DECIMAL(18,0) Index size in bytes (at last COMMIT) CREATED TIMESTAMP Timestamp of when the index was created LAST_COMMIT TIMESTAMP Last time the object was changed in the DB REMARKS VARCHAR(100000) UTF8 Additional information about the index 10 rows in resultset. SQL_EXA> tree (bi
There isn’t even a CREATE INDEX statement in Exasol. A “Global” index exists on all nodes of the cluster while a “Local” index only exits on the given node. Indexes are created and maintained automatically based on the queries the system is processing. We can easily see this when joining two simple tables:
SQL_EXA> create table demo1 ( a int, b int, c int ); EXA: create table demo1 ( a int, b int, c int ); Rows affected: 0 SQL_EXA> insert into demo1 select 1,1,1 from pgbench_accounts where aid < 1000; EXA: insert into demo1 select 1,1,1 from pgbench_accounts where aid create table demo2 ( a int, b int, c int ); SQL_EXA> create table demo2 ( a int, b int, c int ); EXA: create table demo2 ( a int, b int, c int ); Rows affected: 0 SQL_EXA> insert into demo2 select * from demo1; EXA: insert into demo2 select * from demo1; Rows affected: 999 SQL_EXA> select count(*) from demo1 a, demo2 b where a.a = b.a; EXA: select count(*) from demo1 a, demo2 b where a.a = b.a; COUNT(*) --------------------- 998001 1 row in resultset. SQL_EXA> select INDEX_SCHEMA,INDEX_TABLE,INDEX_TYPE from EXA_DBA_INDICES where INDEX_SCHEMA = 'DEMO'; EXA: select INDEX_SCHEMA,INDEX_TABLE,INDEX_TYPE from EXA_DBA_INDICES where ... INDEX_SCHEMA INDEX_TABLE INDEX_TYPE -------------------------------------------- ------------------------------ -------------------- DEMO PGBENCH_ACCOUNTS GLOBAL DEMO PGBENCH_BRANCHES GLOBAL DEMO PGBENCH_TELLERS GLOBAL DEMO DEMO1 LOCAL
If an index is not used for 5 weeks it will be dropped automatically.
Coming to transaction handling. First of all you need to know the auto commit is enabled by default (the same as in PostgreSQL) in the clients:
SQL_EXA> show autocommit; AUTOCOMMIT = "ON"
Exasol supports the serializable transaction isolation level. This is the strongest level but also comes with some downsides. It can happen that transaction need to wait or even abort when the order of the transactions can not be guaranteed. DDL is transactional as well:
-- session 1 SQL_EXA> set AUTOCOMMIT off; SQL_EXA> create table t1 ( a int ); EXA: create table t1 ( a int ); -- session 2 SQL_EXA> set AUTOCOMMIT off; SQL_EXA> SQL_EXA> select * from t1; EXA: select * from t1; Error: [42000] object T1 not found [line 1, column 15] (Session: 1679358615345561600) -- session 1 SQL_EXA> commit; EXA: commit; -- session 2 SQL_EXA> select * from t1; EXA: select * from t1; A --------------------- 0 rows in resultset. SQL_EXA>
The recommendation is to go with the default and let auto commit on to keep transactions as short as possible. Otherwise this can happen:
-- session 1 SQL_EXA> set AUTOCOMMIT off; SQL_EXA> insert into t1 values(1); EXA: insert into t1 values(1); Rows affected: 1 SQL_EXA> insert into t1 values(2); EXA: insert into t1 values(2); Rows affected: 1 SQL_EXA> insert into t1 values(3); EXA: insert into t1 values(3); Rows affected: 1 SQL_EXA> commit; EXA: commit; Rows affected: 0 SQL_EXA> update t1 set a = 1 where a = 2; EXA: update t1 set a = 1 where a = 2; Rows affected: 1 SQL_EXA> -- session 2 SQL_EXA> set AUTOCOMMIT off; SQL_EXA> insert into t1 values (1); EXA: insert into t1 values (1); --- must wait for session on to either commit or rollback
… or even this:
-- session 2 SQL_EXA> select * from t1; EXA: select * from t1; A --------------------- 1 1 3 1 4 rows in resultset. -- session 1 SQL_EXA> update t1 set a = 1 where a = 2; EXA: update t1 set a = 1 where a = 2; Rows affected: 0 SQL_EXA> -- session 2 SQL_EXA> insert into t1 values (5); EXA: insert into t1 values (5); Error: [40001] GlobalTransactionRollback msg: Transaction collision: automatic transaction rollback. (Session: 1679360319468797952)
These kind of conflict are recorded in the catalog and you can check them:
SQL_EXA> col conflict_objects for a30; COLUMN conflict_objects ON FORMAT a30 SQL_EXA> select * from EXA_DBA_TRANSACTION_CONFLICTS; EXA: select * from EXA_DBA_TRANSACTION_CONFLICTS; SESSION_ID CONFLICT_SESSION_ID START_TIME STOP_TIME CONFLICT_TYPE CONFLICT_OBJECTS CONFLICT_INFO --------------------- --------------------- -------------------------- -------------------------- -------------------- ------------------------------ ---------------------- 1679358615345561600 2020-10-01 14:13:12.871000 2020-10-01 14:13:12.871000 TRANSACTION ROLLBACK DEMO.T1 intern merged sessions 1679358615345561600 1679357953088880640 2020-10-01 14:23:31.669000 2020-10-01 14:27:35.941000 WAIT FOR COMMIT DEMO.T1 1679359718013206528 1679357953088880640 2020-10-01 14:24:05.772000 2020-10-01 14:27:35.941000 WAIT FOR COMMIT DEMO.T1 1679359718013206528 1679357953088880640 2020-10-01 14:29:05.514000 2020-10-01 14:32:06.309000 WAIT FOR COMMIT DEMO.T1 1679360160561037312 1679357953088880640 2020-10-01 14:31:11.577000 2020-10-01 14:32:06.309000 WAIT FOR COMMIT DEMO.T1 1679360185071960064 1679357953088880640 2020-10-01 14:31:59.416000 2020-10-01 14:32:06.309000 WAIT FOR COMMIT DEMO.T1 1679360185071960064 1679359718013206528 2020-10-01 14:32:06.352000 2020-10-01 14:32:12.255000 WAIT FOR COMMIT DEMO.T1 1679359718013206528 1679357953088880640 2020-10-01 14:32:51.164000 2020-10-01 14:32:51.164000 TRANSACTION ROLLBACK DEMO.T1 8 rows in resultset.
As soon as you write to an object you are working on your own (temporary copy) of the object, Exasol calls this Multi-Copy. When a transaction commits that version of the objects becomes the valid one. Transactions that started before will still use the previous version of the object.
The recommendation therefore is to always have auto commit turned on in the client and for the cases where you need to turn it off: Keep the transactions as short as possible. That’s it for the introduction of indexes and transactions in Exasol. In the next post we’ll look at database sessions and auditing.
Cet article Getting started with Exasol – Some words about indexes and transactions est apparu en premier sur Blog dbi services.