Channel: Blog dbi services

Oracle ASH SQL_PLAN_LINE_ID in adaptive plans


There are several methods to find out where time is spent in the execution plan of a query running in an Oracle database: classical methods like SQL Trace followed by a formatter tool like tkprof on the raw trace, or newer methods like SQL Monitor (when the Tuning Pack has been licensed) or running a query with the GATHER_PLAN_STATISTICS hint (or with statistics_level=all set in the session) and then using DBMS_XPLAN.DISPLAY_CURSOR(format=>'ALLSTATS LAST'). However, what I often use is the SQL_PLAN_LINE_ID information in Active Session History (ASH), i.e. the one-second samples of the recent past in V$ACTIVE_SESSION_HISTORY or the AWR history in DBA_HIST_ACTIVE_SESS_HISTORY. You have to be careful, however, when interpreting the ASH SQL_PLAN_LINE_ID in an adaptive plan.

Here’s an example:

REMARK: I artificially slowed down the processing with the method described by Chris Antognini here: https://antognini.ch/2013/12/adaptive-plans-in-active-session-history/

SQL> @test

DEPARTMENT_NAME
------------------------------
Executive
IT
Finance
Purchasing
Shipping
Sales
Administration
Marketing
Human Resources
Public Relations
Accounting

11 rows selected.

Elapsed: 00:00:27.83
SQL> select * from table(dbms_xplan.display_cursor(format=>'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------
SQL_ID  8nuct789bh45m, child number 0
-------------------------------------
 select distinct DEPARTMENT_NAME from DEPARTMENTS d   join EMPLOYEES e
using(DEPARTMENT_ID)   where d.LOCATION_ID like '%0' and e.SALARY>2000
burn_cpu(e.employee_id/e.employee_id/4) = 1

Plan hash value: 1057942366

------------------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |             |      1 |        |     11 |00:00:27.82 |      12 |       |       |          |
|   1 |  HASH UNIQUE        |             |      1 |      1 |     11 |00:00:27.82 |      12 |  1422K|  1422K|61440  (0)|
|*  2 |   HASH JOIN         |             |      1 |      1 |    106 |00:00:27.82 |      12 |  1572K|  1572K| 1592K (0)|
|*  3 |    TABLE ACCESS FULL| DEPARTMENTS |      1 |      1 |     27 |00:00:00.01 |       6 |       |       |          |
|*  4 |    TABLE ACCESS FULL| EMPLOYEES   |      1 |      1 |    107 |00:00:27.82 |       6 |       |       |          |
------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("D"."DEPARTMENT_ID"="E"."DEPARTMENT_ID")
   3 - filter(TO_CHAR("D"."LOCATION_ID") LIKE '%0')
   4 - filter(("E"."SALARY">2000 AND "BURN_CPU"("E"."EMPLOYEE_ID"/"E"."EMPLOYEE_ID"/4)=1))

Note
-----
   - this is an adaptive plan


30 rows selected.

So I’d expect to see SQL_PLAN_LINE_ID = 4 in ASH for the 28 seconds of wait time here. But it’s different:

SQL> select sql_plan_line_id, count(*) from v$active_session_history 
   2 where sql_id='8nuct789bh45m'and sql_plan_hash_value=1057942366 group by sql_plan_line_id;

SQL_PLAN_LINE_ID   COUNT(*)
---------------- ----------
               9         28

1 row selected.

The 28 seconds are spent on SQL_PLAN_LINE_ID 9. Why is that?

The reason is that ASH reports the SQL_PLAN_LINE_ID in adaptive plans according to the full adaptive plan, i.e. including the inactive rows:

SQL> select * from table(dbms_xplan.display_cursor('8nuct789bh45m',format=>'+ADAPTIVE'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------
SQL_ID  8nuct789bh45m, child number 0
-------------------------------------
 select distinct DEPARTMENT_NAME from DEPARTMENTS d   join EMPLOYEES e
using(DEPARTMENT_ID)   where d.LOCATION_ID like '%0' and e.SALARY>2000
burn_cpu(e.employee_id/e.employee_id/4) = 1

Plan hash value: 1057942366

------------------------------------------------------------------------------------------------------
|   Id  | Operation                      | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------
|     0 | SELECT STATEMENT               |                   |       |       |     5 (100)|          |
|     1 |  HASH UNIQUE                   |                   |     1 |    30 |     5  (20)| 00:00:01 |
|  *  2 |   HASH JOIN                    |                   |     1 |    30 |     4   (0)| 00:00:01 |
|-    3 |    NESTED LOOPS                |                   |     1 |    30 |     4   (0)| 00:00:01 |
|-    4 |     NESTED LOOPS               |                   |    10 |    30 |     4   (0)| 00:00:01 |
|-    5 |      STATISTICS COLLECTOR      |                   |       |       |            |          |
|  *  6 |       TABLE ACCESS FULL        | DEPARTMENTS       |     1 |    19 |     3   (0)| 00:00:01 |
|- *  7 |      INDEX RANGE SCAN          | EMP_DEPARTMENT_IX |    10 |       |     0   (0)|          |
|- *  8 |     TABLE ACCESS BY INDEX ROWID| EMPLOYEES         |     1 |    11 |     1   (0)| 00:00:01 |
|  *  9 |    TABLE ACCESS FULL           | EMPLOYEES         |     1 |    11 |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("D"."DEPARTMENT_ID"="E"."DEPARTMENT_ID")
   6 - filter(TO_CHAR("D"."LOCATION_ID") LIKE '%0')
   7 - access("D"."DEPARTMENT_ID"="E"."DEPARTMENT_ID")
   8 - filter(("E"."SALARY">2000 AND "BURN_CPU"("E"."EMPLOYEE_ID"/"E"."EMPLOYEE_ID"/4)=1))
   9 - filter(("E"."SALARY">2000 AND "BURN_CPU"("E"."EMPLOYEE_ID"/"E"."EMPLOYEE_ID"/4)=1))

Note
-----
   - this is an adaptive plan (rows marked '-' are inactive)


37 rows selected.

So here we see the correct line 9. It’s actually a good idea to use

format=>'+ADAPTIVE'

when gathering plan statistics (through the GATHER_PLAN_STATISTICS hint or STATISTICS_LEVEL=ALL):

SQL> @test

DEPARTMENT_NAME
------------------------------
Executive
IT
Finance
Purchasing
Shipping
Sales
Administration
Marketing
Human Resources
Public Relations
Accounting

11 rows selected.

Elapsed: 00:00:27.96
SQL> select * from table(dbms_xplan.display_cursor(format=>'ALLSTATS LAST +ADAPTIVE'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------
SQL_ID  8nuct789bh45m, child number 0
-------------------------------------
 select distinct DEPARTMENT_NAME from DEPARTMENTS d   join EMPLOYEES e
using(DEPARTMENT_ID)   where d.LOCATION_ID like '%0' and e.SALARY>2000
burn_cpu(e.employee_id/e.employee_id/4) = 1

Plan hash value: 1057942366

-------------------------------------------------------------------------------------------------------------------------------------------
|   Id  | Operation                      | Name              | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-------------------------------------------------------------------------------------------------------------------------------------------
|     0 | SELECT STATEMENT               |                   |      1 |        |     11 |00:00:27.85 |    1216 |       |       |          |
|     1 |  HASH UNIQUE                   |                   |      1 |      1 |     11 |00:00:27.85 |    1216 |  1422K|  1422K|73728  (0)|
|  *  2 |   HASH JOIN                    |                   |      1 |      1 |    106 |00:00:27.85 |    1216 |  1572K|  1572K| 1612K (0)|
|-    3 |    NESTED LOOPS                |                   |      1 |      1 |     27 |00:00:00.01 |       6 |       |       |          |
|-    4 |     NESTED LOOPS               |                   |      1 |     10 |     27 |00:00:00.01 |       6 |       |       |          |
|-    5 |      STATISTICS COLLECTOR      |                   |      1 |        |     27 |00:00:00.01 |       6 |       |       |          |
|  *  6 |       TABLE ACCESS FULL        | DEPARTMENTS       |      1 |      1 |     27 |00:00:00.01 |       6 |       |       |          |
|- *  7 |      INDEX RANGE SCAN          | EMP_DEPARTMENT_IX |      0 |     10 |      0 |00:00:00.01 |       0 |       |       |          |
|- *  8 |     TABLE ACCESS BY INDEX ROWID| EMPLOYEES         |      0 |      1 |      0 |00:00:00.01 |       0 |       |       |          |
|  *  9 |    TABLE ACCESS FULL           | EMPLOYEES         |      1 |      1 |    107 |00:00:27.85 |    1210 |       |       |          |
-------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("D"."DEPARTMENT_ID"="E"."DEPARTMENT_ID")
   6 - filter(TO_CHAR("D"."LOCATION_ID") LIKE '%0')
   7 - access("D"."DEPARTMENT_ID"="E"."DEPARTMENT_ID")
   8 - filter(("E"."SALARY">2000 AND "BURN_CPU"("E"."EMPLOYEE_ID"/"E"."EMPLOYEE_ID"/4)=1))
   9 - filter(("E"."SALARY">2000 AND "BURN_CPU"("E"."EMPLOYEE_ID"/"E"."EMPLOYEE_ID"/4)=1))

Note
-----
   - this is an adaptive plan (rows marked '-' are inactive)


37 rows selected.

Summary: Be careful when checking SQL_PLAN_LINE_IDs in adaptive plans. This seems obvious in a simple plan like the one above, but you may be sent in the wrong direction in plans with hundreds of lines, and your analysis may be wrong if you are looking at the wrong plan step. For further validation, you may also query SQL_PLAN_OPERATION and SQL_PLAN_OPTIONS from V$ACTIVE_SESSION_HISTORY.
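
For example, here is a minimal validation query along those lines (reusing the SQL_ID and plan hash value from the test above); it shows the plan operation and options next to each ASH plan line, so a mismatch with the non-adaptive DBMS_XPLAN output becomes obvious:

SQL> select sql_plan_line_id, sql_plan_operation, sql_plan_options, count(*)
   2 from v$active_session_history
   3 where sql_id='8nuct789bh45m' and sql_plan_hash_value=1057942366
   4 group by sql_plan_line_id, sql_plan_operation, sql_plan_options
   5 order by sql_plan_line_id;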



The myth of NoSQL (vs. RDBMS) “a simpler API to bound resources”


By Franck Pachot

NoSQL provides an API that is much simpler than SQL, and one advantage of it is that users cannot exceed a defined amount of resources in one call. You can read this in Alex DeBrie's article https://www.alexdebrie.com/posts/dynamodb-no-bad-queries/#relational-queries-are-unbounded, which I take as a basis for some of my "Myth of NoSQL vs RDBMS" posts because he explains very well how SQL and NoSQL are perceived by users. But this idea of a simpler API to limit what users can do is quite common, precedes the NoSQL era, and is still valid with some SQL databases. Here I'm demonstrating that some RDBMS provide a powerful API and can still bound what users do. Oracle Database has had a Resource Manager for a long time, which can, among other things, define resource limits on a per-service basis, and those features are very simple to use in the Oracle Autonomous Database – the managed database in the Oracle Cloud.

I am using the example schema from the ATP database in the Free Tier, so that anyone can play with this. As usual, what I show on 1 million rows and one thread can scale to multiple vCPUs and nodes. Once you understand the algorithms (the execution plan), you know how it scales.


06:36:08 SQL> set echo on serveroutput on time on timing on

06:36:14 SQL> select count(*) ,sum(s.amount_sold),sum(p.prod_list_price) 
              from sh.sales s join sh.products p using(prod_id);


   COUNT(*)    SUM(S.AMOUNT_SOLD)    SUM(P.PROD_LIST_PRICE)
___________ _____________________ _________________________
     918843           98205831.21               86564235.57

Elapsed: 00:00:00.092

I have scanned nearly one million rows from the SALES table, joined them to the PRODUCTS table, and aggregated the data to show sums over columns from both tables. That takes 92 milliseconds here (including the network roundtrip). You are not surprised to get a fast response with a join because you have read The myth of NoSQL (vs. RDBMS) "joins don't scale" 😉

Ok, now let's say that a developer who never learned about SQL joins wants to do the same with a simpler scan/query API:


06:36:14 SQL> declare
    l_count_sales    number:=0;
    l_amount_sold    number:=0;
    l_sum_list_price number:=0;
   begin
    -- scan SALES
    for s in (select * from sh.sales) loop
     -- query PRODUCTS
     for p in (select * from sh.products where prod_id=s.prod_id) loop
      -- aggregate SUM and COUNT
      l_count_sales:=l_count_sales+1;
      l_amount_sold:=l_amount_sold+s.amount_sold;
      l_sum_list_price:=l_sum_list_price+p.prod_list_price;
     end loop;
    end loop;
    dbms_output.put_line('count_sales='||l_count_sales||' amount_sold='||l_amount_sold||' sum_list_price='||l_sum_list_price);
   end;
   /

PL/SQL procedure successfully completed.

Elapsed: 00:02:00.374

I have run this within the database with PL/SQL because I don't want to add network roundtrips and process switches to this bad design. You can see that it takes 2 minutes here. Why? Because the risk, when providing an API that doesn't support joins, is that the developer will do the join in procedural code. Without SQL, the developer has no efficient and agile way to do this GROUP BY and SUM that was a one-liner in SQL: he will either loop on this simple scan/get API, or she will add a lot of code to initialize and maintain an aggregate derived from this table.

So, what can I do to prevent a user from running this kind of query, which takes a lot of CPU and I/O resources? A simpler API would not solve the problem, as the user would work around it with many small queries. In the Oracle Autonomous Database, the admin can set some limits per service:

This says: when connected to the 'TP' service (which is the one for transactional processing with high concurrency), a user query cannot use more than 5 seconds of elapsed time, or the query is canceled.

Now if I run the statement again:


Error starting at line : 54 File @ /home/opc/demo/tmp/atp-resource-mgmt-rules.sql
In command -
declare
 l_count_sales    number:=0;
 l_amount_sold    number:=0;
 l_sum_list_price number:=0;
begin
 -- scan SALES
 for s in (select * from sh.sales) loop
  -- query PRODUCTS
  for p in (select * from sh.products where prod_id=s.prod_id) loop
   -- aggregate SUM and COUNT
   l_count_sales:=l_count_sales+1;
   l_amount_sold:=l_amount_sold+s.amount_sold;
   l_sum_list_price:=l_sum_list_price+p.prod_list_price;
  end loop;
 end loop;
 dbms_output.put_line('count_sales='||l_count_sales||' amount_sold='||l_amount_sold||' sum_list_price='||l_sum_list_price);
end;
Error report -
ORA-56735: elapsed time limit exceeded - call aborted
ORA-06512: at line 9
ORA-06512: at line 9
56735. 00000 -  "elapsed time limit exceeded - call aborted"
*Cause:    The Resource Manager SWITCH_ELAPSED_TIME limit was exceeded.
*Action:   Reduce the complexity of the update or query, or contact your
           database administrator for more information.

Elapsed: 00:00:05.930

I get a message saying that I exceeded the limit. I hope that, from the message "Action: Reduce the complexity", the user will understand something like "please use SQL to process data sets" and will write a query with the join.

Of course, if the developer is thick-headed, he will run his loop from the application code and run one million short queries that do not exceed the time limit per execution. And it will be even worse because of the roundtrips between the application and the database. The "Set Resource Management Rules" dialog has another tab besides "Run-away criteria", namely "CPU/IO shares", so that one service can be throttled when the overall resources are saturated. With this, we can give higher priority to critical services. But I prefer to address the root cause and show the developer that when you need to join data, the most efficient way is a SQL JOIN, and when you need to aggregate data, the most efficient way is SQL GROUP BY. Of course, we can also re-design the tables to pre-join data when it is ingested (materialized views in SQL or single-table design in DynamoDB, for example), but that's another topic.
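
To sketch that pre-join idea (my illustration, not part of the original demo, and the view name is made up), an Oracle materialized view can maintain the aggregate at ingestion time so that reads become a simple lookup. With materialized view logs and a few additional COUNT columns it could even be refreshed fast on commit:

create materialized view sales_totals_mv
  build immediate
  refresh on demand
as
select count(*)               as count_sales,
       sum(s.amount_sold)     as amount_sold,
       sum(p.prod_list_price) as sum_list_price
from sh.sales s join sh.products p using(prod_id);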

In the autonomous database, the GUI makes it simple, but you can query V$ views to monitor it. For example:


06:38:20 SQL> select sid,current_consumer_group_id,state,active,yields,sql_canceled,last_action,last_action_reason,last_action_time,current_active_time,active_time,current_consumed_cpu_time,consumed_cpu_time 
              from v$rsrc_session_info where sid=sys_context('userenv','sid');


     SID    CURRENT_CONSUMER_GROUP_ID      STATE    ACTIVE    YIELDS    SQL_CANCELED    LAST_ACTION     LAST_ACTION_REASON       LAST_ACTION_TIME    CURRENT_ACTIVE_TIME    ACTIVE_TIME    CURRENT_CONSUMED_CPU_TIME    CONSUMED_CPU_TIME
________ ____________________________ __________ _________ _________ _______________ ______________ ______________________ ______________________ ______________________ ______________ ____________________________ ____________________
   41150                        30407 RUNNING    TRUE             21               1 CANCEL_SQL     SWITCH_ELAPSED_TIME    2020-07-21 06:38:21                       168           5731                         5731                 5731


06:39:02 SQL> select id,name,cpu_wait_time,cpu_waits,consumed_cpu_time,yields,sql_canceled 
              from v$rsrc_consumer_group;


      ID            NAME    CPU_WAIT_TIME    CPU_WAITS    CONSUMED_CPU_TIME    YIELDS    SQL_CANCELED
________ _______________ ________________ ____________ ____________________ _________ _______________
   30409 MEDIUM                         0            0                    0         0               0
   30406 TPURGENT                       0            0                    0         0               0
   30407 TP                           286           21                 5764        21               1
   30408 HIGH                           0            0                    0         0               0
   30410 LOW                            0            0                    0         0               0
   19515 OTHER_GROUPS                 324           33                18320        33               0

You can see one canceled SQL statement here in the TP consumer group, and my session had consumed 5.7 seconds of CPU time.

I could have set the same programmatically with:


exec cs_resource_manager.update_plan_directive(consumer_group => 'TP', elapsed_time_limit => 5);

So, rather than limiting the API, it is better to offer full SQL possibilities and limit the resources used per service: it makes sense to accept only short queries from the transaction processing services (TP/TPURGENT) and to allow more time, but fewer shares, for the reporting ones (LOW/MEDIUM/HIGH).


Java and InfluxDB http api


InfluxDB

InfluxDB is a powerful open source time series database (TSDB) developed by InfluxData. It is a database optimized for time-stamped or time series data. Time series data are simply measurements or events that are tracked over time.
It is particularly interesting for metric tracking, like server CPU, RAM, application performance and so on.

It is similar to SQL databases but different in many ways. The key here is time. The database engine is optimized for high ingest rates and data compression. It is written in Go and compiles into one single binary without any dependencies.
We can use it through a CLI, but the most interesting part is its HTTP API, which allows us to perform actions on the measurements without any third-party API.
It can be installed on a single server or in the cloud. It is highly compatible with Telegraf, a time series data collector, but we will discuss that in another blog.

Installation

I did the installation on CentOS, but it can be installed on other Linux distributions, OS X and Windows. You can download the rpm/zip/tar from the download page, or you can install it directly as follows:

Add the repo:

cat <<EOF | sudo tee /etc/yum.repos.d/influxdb.repo
[influxdb]
name = InfluxDB Repository - RHEL \$releasever
baseurl = https://repos.influxdata.com/rhel/\$releasever/\$basearch/stable
enabled = 1
gpgcheck = 1
gpgkey = https://repos.influxdata.com/influxdb.key
EOF

Ensure the cache is up to date:

sudo yum makecache fast

Install the database (we also install vim and curl, the latter to test the HTTP API):

sudo yum -y install influxdb vim curl

We enable the HTTP API:

sudo vim /etc/influxdb/influxdb.conf
[http]
  enabled = true
  bind-address = ":8086"
  auth-enabled = true
  log-enabled = true
  write-tracing = false
  pprof-enabled = true
  pprof-auth-enabled = true
  debug-pprof-enabled = false
  ping-auth-enabled = true

We start and enable the service:

sudo systemctl start influxdb && sudo systemctl enable influxdb

We need to open the port used by InfluxDB:

sudo firewall-cmd --add-port=8086/tcp --permanent
sudo firewall-cmd --reload

Now we create an admin account:

curl -XPOST "http://localhost:8086/query" --data-urlencode "q=CREATE USER username WITH PASSWORD 'password' WITH ALL PRIVILEGES"

If the query executed correctly, you shouldn't get any response.

You can now test your connection with the cli:

influx -username 'username' -password 'password'

You can list the databases with:

SHOW DATABASES

To create a database do the following:

CREATE DATABASE mymetrics

And to switch to your newly created DB:

USE mymetrics

Once you have some entries, you will be able to list them like this:

SHOW MEASUREMENTS

And then you can execute basic SQL-like queries (only once you have measurements):

SELECT * FROM metric_cpu_load
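
InfluxQL also supports aggregation functions with time-based grouping, which is typically what you need for dashboards. An illustrative query on the metric_cpu_load measurement used later in this post could look like this:

SELECT MEAN("value") FROM "metric_cpu_load" WHERE time > now() - 1h GROUP BY time(10m)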

 

Measurements via Java

Now that we have our DB set up and a user to connect with, we will generate measurements with Java. But you can do it with any language able to send HTTP requests. There is a Java API to talk to the database directly (here), but I wanted to do it through the HTTP API.

You will need httpclient.jar and httpcore.jar from the Apache HttpComponents project to use the latest HTTP API. The versions I used were:

  • httpclient5-5.0.1.jar
  • httpcore5-5.0.1.jar

// Assumed imports for the Apache HttpComponents 5.x client used below
import java.net.URI;

import org.apache.hc.client5.http.classic.HttpClient;
import org.apache.hc.client5.http.classic.methods.HttpPost;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.core5.http.ContentType;
import org.apache.hc.core5.http.io.entity.StringEntity;
import org.apache.hc.core5.net.URIBuilder;

String data = "metric_cpu_load value=0.5"; // Line protocol: measurement name and field value
// We create a default HTTP client
HttpClient httpclient = HttpClients.createDefault();
// We create the URI with all the information
URI uri = new URIBuilder()
        .setScheme("http") // Can be https if ssl is enabled
        .setHost("localhost") // The hostname of the database
        .setPort(8086) // Default InfluxDB port
        .setPath("/write") // We call /write to put data into the database
        .setParameter("db", "mymetrics") // The name of the database we created
        .setParameter("u", "username") // The username of the account we created
        .setParameter("p", "password") // The password of the account
        .build();
HttpPost httppost = new HttpPost(uri); // We create a POST request in order to add data
StringEntity entity = new StringEntity(data, ContentType.create("text/plain")); // The metric name and value are set as the body of the post request
httppost.setEntity(entity); // We link the entity to the post request
httpclient.execute(httppost); // Executes the post request; if the write went well, the response code is 204

InfluxDB stores measurements in "tables": one measurement type is one table. You can specify several columns in the table, but note that the "value" column is mandatory. Here is the explanation of the body of the post request:

"table_name,col1=value1,col2=value2 value=value0 timestamp"

  • table_name is the name of the metric; if the table doesn't exist it will be created, and if it exists, a new row will be added to this "table"
  • colx are optional; they are tags or info that further qualify this row, and some rows can have these options while others don't
  • value is the mandatory field; it must be specified or the post request will fail
  • timestamp is the timestamp of the row; you can specify it or omit it, and if not specified, it will be set to the time when the row was added

Example

Here is a simple example.

We register several values:

[POST] http://localhost:8086/write?db=mymetrics&u=username&p=password
BODY: "metric_cpu_load,hostname=localhost value=0.1"

[POST] http://localhost:8086/write?db=mymetrics&u=username&p=password
BODY: "metric_cpu_load,hostname=localhost  value=0.4"

[POST] http://localhost:8086/write?db=mymetrics&u=username&p=password
BODY: "metric_cpu_load,hostname=localhost  value=0.8"

[POST] http://localhost:8086/write?db=mymetrics&u=username&p=password
BODY: "metric_cpu_load,hostname=localhost  value=0.5"

Or with curl, creating the DB first if it doesn't exist:

curl -i -XPOST 'http://localhost:8086/query?u=username&p=password' --data-urlencode "q=CREATE DATABASE mymetrics" -> returns code 200 OK
curl -i -XPOST 'http://localhost:8086/write?db=mymetrics&u=username&p=password' --data-binary 'metric_cpu_load,hostname=localhost value=0.1' -> returns 204 NO CONTENT
curl -i -XPOST 'http://localhost:8086/write?db=mymetrics&u=username&p=password' --data-binary 'metric_cpu_load,hostname=localhost value=0.4'
curl -i -XPOST 'http://localhost:8086/write?db=mymetrics&u=username&p=password' --data-binary 'metric_cpu_load,hostname=localhost value=0.8'
curl -i -XPOST 'http://localhost:8086/write?db=mymetrics&u=username&p=password' --data-binary 'metric_cpu_load,hostname=localhost value=0.5'

Now you can see the result from the cli:

influx -username 'username' -password 'password'

Connected to http://localhost:8086 version 1.8.1
InfluxDB shell version: 1.8.1

> SHOW DATABASES
name: databases
name
----
_internal
mymetrics

> USE mymetrics
Using database mymetrics

> SHOW MEASUREMENTS
name: measurements
name
----
metric_cpu_load

> SELECT * FROM metric_cpu_load
name: metric_cpu_load
time                hostname  value
----                --------  -----
1595318363624096578 localhost 0.1
1595318430152497136 localhost 0.4
1595318433209434527 localhost 0.8
1595318436384650878 localhost 0.5

We can also get it over HTTP:

http://localhost:8086/query?db=mymetrics&u=username&p=password&q=select * from metric_cpu_load
{"results":
  [{
    "statement_id":0,
    "series":[{
       "name":"metric_cpu_load",
       "columns":["time","hostname","value"],
       "values":[
           ["2020-07-21T07:59:23.624096578Z","localhost",0.1],
           ["2020-07-21T08:00:30.152497136Z","localhost",0.4],
           ["2020-07-21T08:00:33.209434527Z","localhost",0.8],
           ["2020-07-21T08:00:36.384650878Z","localhost",0.5]]}]}]}

Next steps

And now what? We have a time series database where we can list and add measurements through Java and curl; what can we do with this?

Next time we will see how to display the results in shiny graphs thanks to Grafana.

 


Control-M/EM sending alert to SNMP


Hello everybody, today we will see how to send Control-M alerts to a central monitoring tool.

Introduction

The aim is to send alerts and logs from Control-M to an event management system such as Patrol/BMC TrueSight or Nagios.
We will see together how to link Control-M to a central monitoring tool, and for that, the Control-M admin and the monitoring admin must work hand in hand 😊. So let's start!

Copy the Control-M/EM MIB file to your SNMP server:

First, we must copy the MIB file to the SNMP server (send it to the monitoring application admin, as he will have to use it in his monitoring tool's configuration):
<Control-M EM_DIR>\Data\BMC-CONTROLMEM-MIB.txt
The BMC-CONTROLMEM-MIB.txt file contains the SNMP variables and the trap format that Control-M/EM uses to send alerts to an event management system via SNMP.
For more details follow this link:
http://documents.bmc.com/supportu/9.0.19/help/Main_help/en-US/index.htm#45731.htm

Configure the Control-M/EM system parameters
Reminder on Alerts and Xalerts:
Alerts and Xalerts can be sent to a central monitoring tool, trigger a script and pass information as parameters, and of course be configured to send notifications when a problem occurs (more details on the BMC support site).

Alerts:
-Alerts can receive shouts from jobs (must be configured on the job)
-Alerts can display communication messages such as agent availability or Control-M/Server communication errors

Exception alerts (Xalerts):
-Generated by the EM server for internal errors from components
-We can view these alerts under the Manage/Exception panel
From the CCM, define the following system parameters, as described in Defining Control-M/EM system parameters:

  • SNMPHost: Define the hostname of the SNMP server where the alerts are sent.
  • SNMPSendActive: Change the value to 1 to generate SNMP messages for Active Alerts.
  • SendSNMP: Change the value to 0 to send alerts to SNMP server only.
  • SendAlertNotesSnmp: Change the value to 1 if you want to send the NOTES field to the SNMP server.
  • XAlertsEnableSending: Change the value to 1 to enable xAlert sending. Note that default value is 1.
  • XAlertsSnmpHosts: Define the hostname of the SNMP server where the xAlerts are sent.
  • XalertsSendSnmp: Change the value to 1 to send xAlerts to SNMP server only (you can also define it to send Xalerts to a script or both.) For more information about Alert parameters, see Control-M/EM general parameters.

To proceed, we will use the Control-M Configuration Manager and configure it from the Control-M/EM system parameters section:

The search bar allows you to display every parameter linked to SNMP and alert sending.

The most important fields are:
Name: indicates the name of the parameter
Description: indicates the parameter's role
Refresh method: indicates whether the Control-M component should be restarted (recycled) for changes to take effect
Following this information, we will activate SNMP trap sending and define the host corresponding to the monitoring machine.
In particular, we will focus on configuring and sending SNMP traps to the central monitoring tool by adding its address in the appropriate field.

Important
Note that the refresh method is a critical point for changes to take effect:

Automatic: no action needed for the change to take effect
Manual: an action is needed for the change to take effect
Recycle: cycle the related component for the change to take effect

Recycle the GATEWAY
Here we will have to recycle components for the changes to take effect:
select the "recycle" option on the gateway.

Once the gateway indicates a connected status, the changes are taken into account.

Test sending an alert to the central monitoring tool
As the SNMP sending configuration is done on the Control-M side, the final step is to test it by sending an alert to the Alerts window, with a failing test job for example.
Check whether the central monitoring server receives the trap (synchronize with the monitoring application administrator). If your administrator confirms that the trap is received, then everything is OK.

Conclusion

Now you know how to send traps to an external monitoring tool. For more information and further configuration, please go to the BMC site and check the documentation.
You can also go to the help section of your Control-M workload client; following this link, for example, you can find the SNMP trap format:
http://documents.bmc.com/supportu/9.0.19/help/Main_help/en-US/index.htm#45731.htm
Feel free to check the dbi services bloggers and share with us your experience and tips on BMC Control-M!


Changes that I like in the new MySQL 8.0.21


A new release of MySQL, 8.0.21, came out on July 13.
Among all the changes, there are some that I have already tested and that I really appreciate.

Who stopped the MySQL server?

In previous releases, I could already see in the error log file who stopped the MySQL server, provided this was done through the SHUTDOWN statement.
In MySQL 8.0.20:

mysql> shutdown;
Query OK, 0 rows affected (0.00 sec)
mysql> exit;
Bye
# tail -2 /u01/app/mysql/admin/mysqld2/log/mysqld2.err
2020-07-23T20:39:08.049064+02:00 9 [System] [MY-013172] [Server] Received SHUTDOWN from user root. Shutting down mysqld (Version: 8.0.20).
2020-07-23T20:39:09.796726+02:00 0 [System] [MY-010910] [Server] /u01/app/mysql/product/mysql-8.0.20/bin/mysqld: Shutdown complete (mysqld 8.0.20)  MySQL Community Server - GPL.

In MySQL 8.0.21:

mysql> shutdown;
Query OK, 0 rows affected (0.00 sec)
mysql> exit;
Bye
# tail -2 /u01/app/mysql/admin/mysqld1/log/mysqld1.err
2020-07-23T20:39:15.725223+02:00 9 [System] [MY-013172] [Server] Received SHUTDOWN from user root. Shutting down mysqld (Version: 8.0.21).
2020-07-23T20:39:17.029262+02:00 0 [System] [MY-010910] [Server] /u01/app/mysql/product/mysql-8.0.21/bin/mysqld: Shutdown complete (mysqld 8.0.21)  MySQL Community Server - GPL.

But when the server was stopped through the systemctl command, this information was lost:

# systemctl stop mysqld
# tail -2 /u01/app/mysql/admin/mysqld2/log/mysqld2.err
2020-07-23T20:56:23.488468+02:00 0 [ERROR] [MY-010119] [Server] Aborting
2020-07-23T20:56:23.489010+02:00 0 [System] [MY-010910] [Server] /u01/app/mysql/product/mysql-8.0.20/bin/mysqld: Shutdown complete (mysqld 8.0.20)  MySQL Community Server - GPL.

As of MySQL 8.0.21, the error log file traces who stopped the MySQL server in any case (except if you are using the kill -9 command):

# systemctl stop mysqld
# tail -2 /u01/app/mysql/admin/mysqld1/log/mysqld1.err
2020-07-23T21:01:00.731589+02:00 0 [System] [MY-013172] [Server] Received SHUTDOWN from user . Shutting down mysqld (Version: 8.0.21).
2020-07-23T21:01:01.560078+02:00 0 [System] [MY-010910] [Server] /u01/app/mysql/product/mysql-8.0.21/bin/mysqld: Shutdown complete (mysqld 8.0.21)  MySQL Community Server - GPL.

How can I enable/disable redo logging?

This is a nice but dangerous feature. Disabling redo logging can be interesting when creating a new instance, to avoid wasting time on redo log writes and doublewrite buffer operations. But otherwise, don't do that in your production environment!
Check if redo logging is enabled:

mysql> SHOW GLOBAL STATUS LIKE 'Innodb_redo_log_enabled';
+-------------------------+-------+
| Variable_name           | Value |
+-------------------------+-------+
| Innodb_redo_log_enabled | ON    |
+-------------------------+-------+

Create a user account with the permission to execute these operations:

mysql> GRANT INNODB_REDO_LOG_ENABLE ON *.* to 'load'@'localhost';

Connect to the MySQL server with the load user and disable redo logging:

mysql> ALTER INSTANCE DISABLE INNODB REDO_LOG;
mysql> SHOW GLOBAL STATUS LIKE 'Innodb_redo_log_enabled';
+-------------------------+-------+
| Variable_name           | Value |
+-------------------------+-------+
| Innodb_redo_log_enabled | OFF   |
+-------------------------+-------+

You can now load your data into the system, then enable redo logging again:

mysql> ALTER INSTANCE ENABLE INNODB REDO_LOG;
mysql> SHOW GLOBAL STATUS LIKE 'Innodb_redo_log_enabled';
+-------------------------+-------+
| Variable_name           | Value |
+-------------------------+-------+
| Innodb_redo_log_enabled | ON    |
+-------------------------+-------+

What if we add attributes and comments to MySQL user accounts?

As of MySQL 8.0.21, we can add attributes (as a JSON object) and comments when creating MySQL user accounts.
A comment:

mysql> CREATE USER 'backup'@'localhost' identified by 'Supercal1frag1l1st1cexp1al1doc10us!' COMMENT 'This is the user account used to backup the MySQL server';
mysql> select User, Host, User_attributes from mysql.user
    -> where User='backup';
+--------+-----------+---------------------------------------------------------------------------------------+
| User   | Host      | User_attributes                                                                       |
+--------+-----------+---------------------------------------------------------------------------------------+
| backup | localhost | {"metadata": {"comment": "This is the user account used to backup the MySQL server"}} |
+--------+-----------+---------------------------------------------------------------------------------------+
mysql> select * from INFORMATION_SCHEMA.USER_ATTRIBUTES 
   -> where USER='backup';
+--------+-----------+-------------------------------------------------------------------------+
| USER   | HOST      | ATTRIBUTE                                                               |
+--------+-----------+-------------------------------------------------------------------------+
| backup | localhost | {"comment": "This is the user account used to backup the MySQL server"} |
+--------+-----------+-------------------------------------------------------------------------+

An attribute:

mysql> CREATE USER 'elisa'@'localhost' identified by 'CatchMe1fY0uCan!' ATTRIBUTE '{"LastName": "Usai", "FirstName": "Elisa", "Email": "elisa.usai\@dbi\-services\.com", "Department": "DBA Team"}';
mysql> select User, Host, User_attributes from mysql.user
    -> where User='elisa';
+-------+-----------+----------------------------------------------------------------------------------------------------------------------------+
| User  | Host      | User_attributes                                                                                                            |
+-------+-----------+----------------------------------------------------------------------------------------------------------------------------+
| elisa | localhost | {"metadata": {"Email": "elisa.usai@dbi-services.com", "LastName": "Usai", "FirstName": "Elisa", "Department": "DBA Team"}} |
+-------+-----------+----------------------------------------------------------------------------------------------------------------------------+
mysql> select * from INFORMATION_SCHEMA.USER_ATTRIBUTES
   -> where User='elisa';
+-------+-----------+--------------------------------------------------------------------------------------------------------------+
| USER  | HOST      | ATTRIBUTE                                                                                                    |
+-------+-----------+--------------------------------------------------------------------------------------------------------------+
| elisa | localhost | {"Email": "elisa.usai@dbi-services.com", "LastName": "Usai", "FirstName": "Elisa", "Department": "DBA Team"} |
+-------+-----------+--------------------------------------------------------------------------------------------------------------+
mysql> select user as User,
    -> host as Host,
    -> concat(attribute->>"$.LastName"," ",attribute->>"$.FirstName") as 'Name',
    -> attribute->>"$.Department" as Department,
    -> attribute->>"$.Email" as Email
    -> from INFORMATION_SCHEMA.USER_ATTRIBUTES
    -> where user='elisa';
+-------+-----------+------------+------------+-----------------------------+
| User  | Host      | Name       | Department | Email                       |
+-------+-----------+------------+------------+-----------------------------+
| elisa | localhost | Usai Elisa | DBA Team   | elisa.usai@dbi-services.com |
+-------+-----------+------------+------------+-----------------------------+
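
Comments and attributes are not limited to CREATE USER: the same release also extends ALTER USER, so they can be adjusted afterwards. A quick sketch (the values here are just examples):

mysql> ALTER USER 'elisa'@'localhost' ATTRIBUTE '{"Department": "Consulting"}';
mysql> ALTER USER 'backup'@'localhost' COMMENT 'Backup account, reviewed in July 2020';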

Check your backup executions!

With the MySQL 8.0.21 release, your backup through mysqldump could fail with the following error:

 
# mysqldump --all-databases --user=backup --password > test.sql
Enter password:
mysqldump: Error: 'Access denied; you need (at least one of) the PROCESS privilege(s) for this operation' when trying to dump tablespaces
mysql> show grants for backup@localhost;
+------------------------------------------------------------------------------+
| Grants for backup@localhost                                                  |
+------------------------------------------------------------------------------+
| GRANT SELECT, LOCK TABLES, SHOW VIEW, TRIGGER ON *.* TO `backup`@`localhost` |
+------------------------------------------------------------------------------+

Why? Actually, in the new release the INFORMATION_SCHEMA.FILES table now requires the PROCESS privilege, and this change has an impact on mysqldump operations:

 
mysql> grant PROCESS ON *.* TO `backup`@`localhost`;
Query OK, 0 rows affected (0.02 sec)
# mysqldump --all-databases --user=backup --password > test.sql
Enter password:

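As a side note, if tablespace information is not needed in the dump, the MySQL 8.0.21 release notes also mention the mysqldump --no-tablespaces option as a way to avoid this new PROCESS privilege requirement.
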
And as usual, stay tuned with MySQL! 🙂


Git collaboration: quick start



If you want to keep the Git repository of your project clean and predictable (which is highly recommended), here is a simple workflow to follow.
For the sake of this article, we are going to join the MongoDB project.

First step: work safely on your branch

When joining the project, the first step is to get a local copy on your laptop:


$ cd my_working_directory
$ git clone git@github.com:nico/mongo.git

Once you do that, you have a local copy of the source code, checked out on the default branch, which is called the master:


$ cd ./mongo
$ ll
total 340
drwxrwxr-x 13 nico nico 4096 Jun 25 08:59 ./
drwxrwxr-x 3 nico nico 4096 Jun 25 08:56 ../
-rw-rw-r-- 1 nico nico 6709 Jun 25 08:59 .clang-format
-rw-rw-r-- 1 nico nico 29 Jun 25 08:59 .eslintignore
-rw-rw-r-- 1 nico nico 795 Jun 25 08:59 .eslintrc.yml
-rw-rw-r-- 1 nico nico 308 Jun 25 08:59 .gdbinit
drwxrwxr-x 8 nico nico 4096 Jun 25 08:59 .git/
-rw-rw-r-- 1 nico nico 74 Jun 25 08:59 .gitattributes
-rw-rw-r-- 1 nico nico 2145 Jun 25 08:59 .gitignore
-rw-rw-r-- 1 nico nico 337 Jun 25 08:59 .lldbinit
-rw-rw-r-- 1 nico nico 588 Jun 25 08:59 .pydocstyle
-rw-rw-r-- 1 nico nico 1703 Jun 25 08:59 .pylintrc
-rw-rw-r-- 1 nico nico 193 Jun 25 08:59 .style.yapf
-rw-rw-r-- 1 nico nico 366 Jun 25 08:59 CONTRIBUTING.rst
-rw-rw-r-- 1 nico nico 13460 Jun 25 08:59 CreativeCommons.txt
-rw-rw-r-- 1 nico nico 30608 Jun 25 08:59 LICENSE-Community.txt
-rw-rw-r-- 1 nico nico 1987 Jun 25 08:59 README
-rw-rw-r-- 1 nico nico 9933 Jun 25 08:59 README.third_party.md
-rw-rw-r-- 1 nico nico 173224 Jun 25 08:59 SConstruct
drwxrwxr-x 17 nico nico 4096 Jun 25 08:59 buildscripts/
drwxrwxr-x 2 nico nico 4096 Jun 25 08:59 debian/
drwxrwxr-x 2 nico nico 4096 Jun 25 08:59 distsrc/
drwxrwxr-x 2 nico nico 4096 Jun 25 08:59 docs/
drwxrwxr-x 4 nico nico 4096 Jun 25 08:59 etc/
drwxrwxr-x 33 nico nico 4096 Jun 25 08:59 jstests/
-rw-rw-r-- 1 nico nico 570 Jun 25 08:59 mypy.ini
drwxrwxr-x 2 nico nico 4096 Jun 25 08:59 pytests/
drwxrwxr-x 2 nico nico 4096 Jun 25 08:59 rpm/
drwxrwxr-x 5 nico nico 4096 Jun 25 08:59 site_scons/
drwxrwxr-x 4 nico nico 4096 Jun 25 08:59 src/

Now, in a big project like this, you shouldn't work directly on the master. It is often not even permitted to push your changes to the master.
What we should do instead is work on a separate copy that is commonly called a branch.

There should be a branch naming convention, depending on the project. But as a rule of thumb, the branch name should be short and self-describing.
Preferably, it will refer to a task (Jira task, GitLab issue, GitHub issue, etc.).

For our example, I’m going to change the MongoDB project documentation, so I call my branch “remove_obsolete_doc_part”.


$ git branch remove_obsolete_doc_part

You can see my local branches with that command:


$ git branch
* master ------------------------> The star here tells you that you are currently in that branch
remove_obsolete_doc_part

And you can see all branches (local on your laptop and remote from the project) with the next command.
You'll see two master branches, one called "remotes/origin/master" and one called "master". "master" is your local copy of
the master, while "remotes/origin/master" is the remote one. "origin" refers to the repository you cloned the code from (with "git clone").


$ git branch -a
* master
remove_obsolete_doc_part
remotes/origin/HEAD -> origin/master
remotes/origin/count-trace-events
remotes/origin/f1b99df5
remotes/origin/master
remotes/origin/r3.4.14
...
remotes/origin/v3.4
remotes/origin/v3.6
remotes/origin/v3.6.9-dbaas-testing
remotes/origin/v4.0
remotes/origin/v4.2
remotes/origin/v4.2.1-dbaas-testing
remotes/origin/v4.4

As you can see above, there are many remote branches from the origin repository, but only two local ones on your laptop.

Then, we need to switch the local files to our branch “remove_obsolete_doc_part”:


$ git checkout remove_obsolete_doc_part
$ git branch
master
* remove_obsolete_doc_part

Now I have made my changes to the code, and I can see the state of my changes:


$ git status
On branch remove_obsolete_doc_part
Changes not staged for commit:
(use "git add/rm ..." to update what will be committed)
(use "git restore ..." to discard changes in working directory)
modified: README
deleted: docs/vpat.md

I want to commit those changes:


$ git add -A # ----------------> I add all my changes (-A) to the Git staging area (which means they will be committed at the next commit).
$ git commit -m "Remove obsolete documentation" # ---------------> I commit with a clear and short message of what I did.
[remove_obsolete_doc_part 1445bc4c6a] Remove obsolete documentation
2 files changed, 141 deletions(-)
delete mode 100644 docs/vpat.md

We can see the current state of our local Git repo with that command:


$ git status
On branch remove_obsolete_doc_part
nothing to commit, working tree clean

And we can review the previous commits with that one:


$ git log
commit 1445bc4c6a80a6dad5b285da4d5a81e9fa3a98cc (HEAD -> remove_obsolete_doc_part)
Author: nico
Date: Thu Jun 25 09:11:51 2020 +0000
Remove obsolete documentation

Then we are ready to push our branch to the remote repository:


$ git push origin remove_obsolete_doc_part
Enumerating objects: 7, done.
Counting objects: 100% (7/7), done.
Delta compression using up to 2 threads
Compressing objects: 100% (4/4), done.
Writing objects: 100% (4/4), 356 bytes | 178.00 KiB/s, done.
Total 4 (delta 3), reused 0 (delta 0)
remote: Resolving deltas: 100% (3/3), completed with 3 local objects.
remote:
remote: Create a pull request for 'remove_obsolete_doc_part' on GitHub by visiting:
remote: https://github.com/nico/mongo/pull/new/remove_obsolete_doc_part
remote:
To github.com:nico/mongo.git
* [new branch] remove_obsolete_doc_part -> remove_obsolete_doc_part

Second step: publish your contribution

Once you like what you did, you can merge your code into the master branch.
This can be done in different ways, and depending on the Git service provider, the philosophy can change.
For this article, I’m going to merge with the Git command-line tool.

Prior to merging your changes from the branch "remove_obsolete_doc_part" into the master branch, you need to
synchronize (pull) the remote master branch with your local one:


$ git checkout master
Switched to branch 'master'
Your branch is up to date with 'origin/master'.

$ git pull #-------------------------> merge the remote code into your local code.

$ git fetch -p #-------------------------> optional but handy: it fetches all remote branches and prunes (-p) the remote-tracking references to branches that were deleted on the remote.

Then, you can merge your local branch “remove_obsolete_doc_part” into the local master branch:


$ git merge remove_obsolete_doc_part
Updating 8b0f5d5e1d..1445bc4c6a
Fast-forward
README | 5 -----
docs/vpat.md | 136 ---------------------------------------------------------------------------------------------------------------------------------------
2 files changed, 141 deletions(-)
delete mode 100644 docs/vpat.md

At this stage, you may have conflicts because some parts of the code have been changed in the remote master copy while you were working locally. That's perfectly fine: Git will show you the conflicting files with `git status`. Just review the files, decide which part of the code you want to keep, and commit.

The last operation consists in pushing your local copy of the master branch to the remote master so that everyone can enjoy
your changes:


$ git push
Total 0 (delta 0), reused 0 (delta 0)
To github.com:nico/mongo.git
8b0f5d5e1d..1445bc4c6a master -> master

Conclusion

Those are the minimum good practices you should follow if you plan to work on a project involving a few people.
They guarantee that you can back up your code regularly without impacting the main project.

For more significant projects, you’ll see other concepts like the “pull request” or “merge request”.
Usually, you won't be able to merge your changes directly into the remote master. Instead, you'll have to create a request to push your changes into the remote master, and someone else will decide whether to approve the merge, reject it, or comment on it if more changes are required before merging.

Notes

Keep your working branch in sync with the remote master as much as possible. Ideally, execute a full sync of your local copy before working on it every day:


git checkout master # Go to the master branch
git pull # Get remote master changes into the local master branch
git checkout remove_obsolete_doc_part # Go back to your working branch
git merge master # Synchronize your working branch


SQL Server: Change Availability Group Endpoint Ownership


I'm doing some cleaning on my customer's instances.
I want to delete the login of a previous DBA for 2 reasons: this person does not work for my customer's company anymore, and all DBAs are members of a group that is granted the required permissions on the instances. I don't want to see any DBA's personal login on SQL Server instances.
When I try to delete the login, I receive the following error:

Msg 15173, Level 16, State 1, Line 4
Server principal 'MyDomain\AccountName' has granted one or more permission(s). 
Revoke the permission(s) before dropping the server principal.

There are no permissions set at the instance level or at the database level for this login.
I get this error because the login is the owner of the Database Mirroring Endpoint.

Every object in SQL Server has an owner. Here, the endpoint ownership went to the DBA's login when the AlwaysOn Availability Group was created.
The following query shows the owner of the Database Mirroring endpoint. Running this query on dozens of instances shows several personal DBA accounts as owners of the Database Mirroring endpoint.

select name, SUSER_NAME(principal_id) AS OwnerName, type_desc
from sys.endpoints
where name = 'Hadr_endpoint'

As described in the documentation, the principal_id here is the "ID of the server principal that created and owns this endpoint".
The endpoint owner can be changed using the ALTER AUTHORIZATION command like this:

ALTER AUTHORIZATION ON ENDPOINT::Hadr_endpoint TO sa;

 

Before doing so, as mentioned in the error message, we need to be careful about the permissions granted by the account we want to remove.
The endpoint owner granted the CONNECT permission on the endpoint to the SQL Server service account.
Running ALTER AUTHORIZATION will drop this permission, which will disconnect your AlwaysOn replicas.

We can verify the CONNECT permission using the following query.

select e.name
	, p.state_desc, SUSER_NAME(p.grantor_principal_id) AS Grantor
	, p.permission_name, SUSER_NAME(p.grantee_principal_id) AS Grantee
from sys.endpoints AS e
	join sys.server_permissions AS p
		on e.endpoint_id = p.major_id
where name = 'Hadr_endpoint'

The DBA login is indeed the grantor of the CONNECT permission on the endpoint for the service account.
After running the ALTER AUTHORIZATION command, we need to grant the permission again.

Here are the commands to change the ownership on the AlwaysOn endpoint and grant CONNECT back to the service account:

ALTER AUTHORIZATION ON ENDPOINT::Hadr_endpoint TO sa
GRANT CONNECT ON ENDPOINT::Hadr_endpoint TO [MyDomain\svc_sql]

Now, running the 2 previous queries gives the following result.

I can now delete the DBA login from the instance.
In my opinion, changing the AlwaysOn endpoint owner to the “sa” login could be a best practice to apply after setting up a new Availability group.
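
As a follow-up idea (just a sketch reusing the catalog views shown above), a small query like this can list endpoints that are still owned by a personal login instead of sa:

select e.name, SUSER_NAME(e.principal_id) AS OwnerName, e.type_desc
from sys.endpoints AS e
where SUSER_NAME(e.principal_id) <> 'sa';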

 

PS: Another error linked to AlwaysOn that you can face when dropping a login is the following.

The server principal owns one or more availability group(s) and cannot be dropped.

This one is about the owner of the Availability Group, which can be identified and modified with the following queries.

select g.name AS GroupName, p.name AS OwnerName
from sys.availability_groups as g
	join sys.availability_replicas AS r
		on g.group_id = r.group_id
	join sys.server_principals AS p
		on r.owner_sid = p.sid
ALTER AUTHORIZATION ON AVAILABILITY GROUP::[MyAGName] TO sa;

 

 

 


ODA: odacli now supports Data Guard in 19.8


Introduction

I've been dreaming of this kind of feature, mainly because most ODA configurations now include Disaster Recovery capabilities, through Data Guard or Dbvisit Standby. While Dbvisit will obviously never be integrated into odacli, the lack of Data Guard features is now addressed by the very latest 19.8 ODA software appliance kit.

How was Data Guard implemented before 19.8?

Those who have been using ODA and Data Guard for a while know that ODA was not aware of a Data Guard configuration. With odacli, you can create databases, mostly primaries, and you can also create an instance, which is a database without any files, just to get the database item in the repository and a started instance. Creating a standby was done with the standard tools: RMAN for the database duplication, and possibly system commands to edit and copy the pfile to the standby server. Configuring Data Guard was done with the Data Guard broker dgmgrl, nothing specific to ODA. Quite a lot of steps were needed: the standby log creation, the unique naming of the databases, the configuration of standby_file_management, … Actually quite a lot of operations, not so easy to fully automate.

What are the promises of this 19.8?

As 19.8 is brand new, I picked this up from the documentation. I am quite impatient to test these features as soon as possible.

First, odacli is now aware of Data Guard and can manage a Data Guard configuration. A Data Guard configuration is now an object in the repository, and you can manage multiple Data Guard configurations, linking primary and standby databases together. You can also do the SWITCHOVER and FAILOVER operations with odacli. The use of dgmgrl no longer seems mandatory here.

The goal of odacli’s Data Guard integration is to simplify everything, as this is the purpose of an appliance.

What are the prerequisites?

For using Data Guard on ODA you will need:
– at least 2 ODAs
– at least 2 different sites (reliable Disaster Recovery absolutely requires different sites)
– a similar ODA configuration: mixing lite and HA ODAs is not supported
– Enterprise Edition on your ODAs: Data Guard is embedded in Enterprise Edition and does not exist in Standard Edition 2
– twice the database size in disk space (never forget that)
– a similar database configuration (shape and settings) and storage configuration; mixing ASM and ACFS is not supported. Actually, these are best practices.
– all ODAs deployed or upgraded to 19.8 or later, all with the same version
– OS customizations, if any, should be the same on all the ODAs
– an ODA backup configuration should exist, to OCI Object Storage or to an NFS server (odacli create-backupconfig and odacli modify-database)
– your primary database should already exist

What are the steps for creating a Data Guard configuration?

First, create a backup of the primary with:
odacli create-backup
Once done, save the backup report to a text file with:
odacli describe-backupreport
Copy this backup report to the standby ODA and do the restore with:
odacli irestore-database
You will need to restore this database with the STANDBY type (this basically flags the controlfile as a standby controlfile).

From now on, you have two nearly identical databases. You can now create the Data Guard configuration with:
odacli configure-dataguard, run from the primary ODA. This command will prompt you for various parameters, like the standby ODA IP address, the name of the Data Guard configuration you want, the network to use for Data Guard, the transport type, the listener ports, the protection mode, and so on. What's interesting here is that you can also provide a json file with all these parameters, the same way you do when you deploy the appliance. Far more convenient, and much faster:
odacli configure-dataguard -r my_DG_config.json

You'll also have to manage the TrustStore passwords, as described in the documentation. I don't know if this is new, but I never had to manage it before.

How to manage Data Guard through odacli?

To have an overview of all your Data Guard configurations, you can use:

odacli list-dataguardstatus

As you can imagine, each Data Guard configuration is identified by an id, and with this id you can get a detailed view of the configuration:

odacli describe-dataguardstatus -i xxx

odacli is even able to do the switchover and failover operations you were doing with dgmgrl before:

odacli switchover-dataguard -i xxx
odacli failover-dataguard -i xxx

In case of a failover, you probably know that the old primary needs to be reinstated (its current SCN is most likely ahead of the SCN at which the standby was activated). You can do the reinstate with odacli too:

odacli reinstate-dataguard -i xxx

Other features

With ODA, you can quite simply migrate a database from one home to another. This feature now supports a Data Guard configuration, but you will have to manually disable transport and apply during the migration with dgmgrl. It’s quite nice to be able to keep the configuration and not have to rebuild it when migrating a database.

If you need to remove a Data Guard configuration, you can do that through odacli: odacli deconfigure-dataguard -i xxx

If you want to use a dedicated network for Data Guard, this is also possible with the network management option of odacli.

Conclusion

Data Guard management was missing on ODA, and this new version seems to bring nearly all the needed features. Let’s try it on the next projects!

The article ODA: odacli now supports Data Guard in 19.8 appeared first on Blog dbi services.


SQL Server Installation Wizard error : Failed to retrieve data for this request


Today I faced a strange issue when I tried to install a new SQL Server instance: "Failed to retrieve data for this request".
This error occurred just after clicking on “New SQL Server stand-alone installation…”

The error message is not helpful at all. So, the first step to troubleshoot an issue is to look at the error logs.

The default location is: C:\Program Files\Microsoft SQL Server\140\Setup Bootstrap\Log
There is a summary.txt file and many subfolders created every time the Installation wizard is started. In those subfolders, we can find a Detail.txt file.

The error looks like this:

Exception type: Microsoft.SqlServer.Management.Sdk.Sfc.EnumeratorException
    Message: 
        Failed to retrieve data for this request.
    HResult : 0x80131500
    Data: 
      HelpLink.ProdName = Microsoft SQL Server
      HelpLink.BaseHelpUrl = http://go.microsoft.com/fwlink
      HelpLink.LinkId = 20476
      HelpLink.EvtType = 0xE8A0C283@0xAC7B1A58@1233@53
      DisableWatson = true
    Stack: 
        at Microsoft.SqlServer.Discovery.SqlDiscoveryDatastoreInterface.LoadData(IEnumerable`1 machineNames, String discoveryDocRootPath, String clusterDiscoveryDocRootPath)
        at Microsoft.SqlServer.Discovery.SqlDiscoveryProvider.DiscoverMachines(ServiceContainer context, Boolean runRemoteDetection, String discoveryDocRootPath, String clusterDiscoveryDocRootPath)
        at Microsoft.SqlServer.Configuration.SetupExtension.RunDiscoveryAction.ExecuteAction(String actionId)
        at Microsoft.SqlServer.Chainer.Infrastructure.Action.Execute(String actionId, TextWriter errorStream)
        at Microsoft.SqlServer.Setup.Chainer.Workflow.ActionInvocation.<>c__DisplayClasse.<ExecuteActionWithRetryHelper>b__b()
        at Microsoft.SqlServer.Setup.Chainer.Workflow.ActionInvocation.ExecuteActionHelper(ActionWorker workerDelegate)
    Inner exception type: Microsoft.SqlServer.Configuration.Sco.SqlRegistryException
        Message: 
                The network path was not found.
                
        HResult : 0x84d10035
                FacilityCode : 1233 (4d1)
                ErrorCode : 53 (0035)

The useful information here is the message about network: The network path was not found.
I am trying to install the instance on a server that is a member of a Failover Cluster, so this gives me a hint.
I need to check any connectivity issue between the 2 nodes of my cluster.

The 2 nodes can ping each other and the DNS resolution is fine.
Out of curiosity, I tried the administrative shares: they were not reachable from my current server to the other node, while the other way around worked fine. This was surprising, so I tried to fix it.
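To check the same network path from the command line before digging into the adapter settings, a quick test like this can be used (NODE2 stands for the other cluster node, adapt the name to your environment):

:: From the node where the setup wizard fails, try to reach the administrative share of the other node
net use \\NODE2\admin$
:: "System error 53 has occurred. The network path was not found." matches the ErrorCode 53 seen in Detail.txt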

Without further ado, I will show you where the issue comes from.

Someone had disabled File and Printer Sharing for Microsoft Networks on the Ethernet Adapter of the other cluster node.

Enabling it made both the administrative shares and the SQL Server Wizard behave as expected.

I hope this was helpful if you encountered the same error message when trying to install an SQL Server instance in a Failover Cluster context.

The article SQL Server Installation Wizard error : Failed to retrieve data for this request appeared first on Blog dbi services.

A lesson from NoSQL (vs. RDBMS): listen to your users


By Franck Pachot

I have written a few blog posts about some NoSQL (vs. RDBMS) myths (“joins dont scale”, “agility: adding attributes” and “simpler API to bound resources”). And I’ll continue on other points that are claimed by some NoSQL vendors and are, in my opinion, misleading by lack of knowledge and facts about RDBMS databases. But here I’m sharing an opposite opinion: SQL being user-friendly is now a myth.
Yes, that was the initial goal of SQL: design a relational language that would be more accessible to users without formal training in mathematics or computer programming. (This is quoted from “Early History of SQL” by Donald D. Chamberlin, on the history behind “SEQUEL: A STRUCTURED ENGLISH QUERY LANGUAGE”.)

However, it seems that this language became too complex for doing simple things. What was designed to be an end-user language is now, most of the time, generated by software: by BI tools for reporting and analytics, or by ORM frameworks for OLTP. And the generated SQL is often far from optimal (we have all seen many bad queries generated by Tableau, or by Hibernate, for example), not because the tool that generates it is bad, but because no tool can compensate for the lack of understanding of the data model.

Then, because the SQL generated was bad, people came with the idea that SQL is slow. Rather than understanding why an index is not used in a BI query (example), or why an OLTP request doesn’t scale (example), they went to Hadoop for their BI analytics (when you read too much, better read it faster) and to NoSQL for their OLTP (when you use the database as an ISAM row store, better optimize it for hashed key-value tables).

And then there are two reactions from database vendors. The legacy ones improve their RDBMS to cope with those bad queries (more transformations in the optimizer/query planner, adaptive optimization features,…).
DynamoDB Simple API
And the newcomers build something new for them (a limited Get/Set API in key-value stores, like the PutItem/GetItem/Query/Scan calls of DynamoDB). And each camp has its advocates. The RDBMS team tries to explain how to write SQL correctly (just look at the number of search hits for “bind variables” or “row-by-row” in http://asktom.oracle.com/). The NoSQL team claims that SQL is dead and explains how to build complex data models on a key-value store (see Rick Houlihan’s single-table design https://youtu.be/HaEPXoXVf2k?t=2964).

Who wins? It depends

And then who wins? It depends on the user population. Those who built a complex ERP on Oracle, SQL Server, PostgreSQL,… will continue with SQL because they know that ACID, stored procedures, SQL joins and aggregations, a logical structure independent from the physical storage,… made their life easier (security, agility, performance). The oldest DBAs and developers already had this debate, CODASYL vs. relational (thinking, like David J. DeWitt and Michael Stonebraker, that it would be a major step backwards to say “No” to the relational view).

But architects in a modern development context (with very short release cycles, multiple frameworks and polyglot code, and large teams of full-stack developers rather than a few specialized experts) tend to favour the NoSQL approach. Their developer teams already know procedural languages, objects, loops, HTTP calls, JSON,… and they can persist their objects into a NoSQL database without learning something new. Of course, there’s something wrong in the idea that you don’t have to learn anything when going from manipulating transient objects in memory to storing persistent and shared data. When data will be queried by multiple users, for years to come and for new use cases, you need a specific design and implementation that you don’t need for an application server, which you can stop and start from scratch on multiple nodes.

Whatever the architecture you choose, you will have to learn. It can be about ORMs (for example Vlad Mihalcea on Hibernate https://vladmihalcea.com/tutorials/hibernate/), about NoSQL (for example Alex DeBrie on DynamoDB https://www.dynamodbbook.com/), as well as about SQL (like Markus Winand https://sql-performance-explained.com/). When you look at this content you will see that there are no shortcuts: you need to learn, understand and design. And while I’m referencing many nice people who share knowledge and tools, you should have a look at Lukas Eder’s JDBC abstraction framework https://www.jooq.org/, which is a nice intermediate between procedural code and the database query language. Because you may understand the power of SQL (and the flaws of top-down object-to-relational generators) but refuse to write queries as plain text character strings, and prefer to write them in a Java DSL.

Both approaches need some learning and require good design. Then why do NoSQL (or ORM before, or GraphQL now, or any new API that replaces or hides SQL) appear easier to developers? I think the reason is that the NoSQL vendors listen to their users better than the SQL vendors. Look at MongoDB marketing: they propose the exact API that application developers are looking for: insert and find items, from a data store, directly mapped to the Java objects. Yes, that’s appealing and easily adopted. However, you cannot manipulate shared and persistent data in the same way as in-memory transient objects that are private; priority was given to the user API over consistency and reliability. The ORM answer was complex mapping (the “object-relational impedance mismatch”), finally too complex for generating optimal queries. MongoDB, listening to their users, just keeps it simple: persistence and sharing are best effort only, not the priority: eventual consistency. This lack of feature is actually sold as a feature: the users complain about transactions, normal forms,… let’s tell them that they don’t need them. It is interesting to read the propaganda of Mark Porter, the new MongoDB CTO, in “Next Chapter in Delighting Customers”:
Normalized data, mathematically pure or not, is agonizing for humans to program against; it’s just not how we think. […] And while SQL might look pretty in an editor, from a programming point of view, it’s close to the hardest way to query information you could think of. Mark Porter, who knows RDBMS very well, is adopting the MongoDB language: we hear you, you don’t want to learn SQL, you want a simple API, we have it. And, on the opposite side, the RDBMS vendors, rather than listening and saying “yes” to this appetite for new cool stuff, are more authoritarian and say: “no, there’s no compromise with consistency, you need to learn the relational model and SQL concepts because that is inescapable to build reliable and efficient databases”.

I’ll throw a silly analogy here. Silly because most of my readers have a scientific approach, but… did you ever go to a Homeopathic Doctor? I’m not giving any opinion about Homeopathy cures here. But you may realize that Homeopathic Doctors spend a lot of time listening to you, to your symptoms, to your general health and mood, before giving any recommendation. That’s their strength, in my opinion. When you go to an allopathic specialist, you may feel that he gives you a solution before fully listening to your questions. Because he knows, because he is the specialist, because he has statistics on large population with the same symptoms. Similarly, I think this is where RDBMS and SQL experts missed the point. It goes beyond the DBA-Dev lack of communication. If developers think they need a PhD to understand SQL, that’s because the SQL advocates failed in their task and came with all their science, and maybe their ego, rather than listening to users.

Listen to the problems

Ok, sorry for this long introduction. I wanted to throw in some links and thoughts to give multiple angles on the subject.

Here is where I got the idea for this blog post:

Felix Geisendörfer is someone I follow, so I know his high technical level. To his simple question (the order of execution for a function in a SELECT with an ORDER BY) I didn’t just answer “you can’t” but tried to explain the reason. And then, without realizing it, I was giving the kind of “answer” that I hate to see in technical forums, like “what you do is wrong”, “your question is not valid”, “you shouldn’t do that”… My intention was to explain something larger than the question, but finally, I didn’t answer the question.

When people ask a question, they have a problem to solve and may not want to think about all the concepts behind it. I like to learn the database concepts because that’s the foundation of my job. Taking the time to understand the concepts helps me to answer hundreds of future questions. And, as a consultant, I need to explain the reasons. Because the recommendations I give to a customer are valid only within a specific context. If I don’t give the “How” and “Why” with my recommendations, they will make no sense in the long term. But DBAs and SQL theoreticians should understand that developers have different concerns. They have a software release to deliver before midnight, and they have a problem to fix. They are looking for a solution, and not for an academic course. This is why Stackoverflow is popular: you can copy-paste a solution that works immediately (at least one that worked for others). And this is why ORMs and NoSQL are appealing and popular: they provide a quick way to persist an object without going through the relational theory and the SQL syntax.

Listen to the users

I’m convinced that understanding the concepts is mandatory for the long term, and that ACID, SQL and relational database is a huge evolution over eventual consistency, procedural languages and key-value hierarchical models. But in those technologies, we forget to listen to the users. I explained how joins can scale by showing an execution plan. But each RDBMS has a different way to display the execution plan, and you need some background knowledge to understand it. A lot, actually: access paths, join methods, RDBMS specific terms and metrics. If the execution plan were displayed as a sequence diagram or a procedural pseudo-code, it would be immediately understandable by developers who are not database specialists. But it is not the case and reading an execution plan is more and more difficult.
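As an illustration of what I mean (my own sketch, not the output of any existing tool), an index-driven join plan could be rendered for developers as procedural pseudo-code, here in Python-like form with hypothetical helper functions:

# Hypothetical rendering of a "NESTED LOOPS + INDEX RANGE SCAN" plan on USERS/ORDERS
def join_users_orders(user_ids):
    for user in index_range_scan("USERS_PK", user_ids):              # outer row source: one index descent per key
        for order in index_range_scan("ORDERS_USER_IDX", user.id):   # inner probe for each outer row
            yield (user, order)                                      # rows are streamed as soon as they match

A developer who has never read an execution plan immediately sees the loops and where the time can go.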
NoSQL makes it simple:

All depends on the point of view. I admit that RDBMS execution plans are not easy to read. But I don’t find NoSQL easier. Here is an example of myself trying to understand the metrics from a DynamoDB Scan and match it with CloudWatch metrics:
https://twitter.com/FranckPachot/status/1287773195031007234?s=20

If I stick to my point of view (as a SQL developer, database architect, DBA, consultant,…) I’m convinced that SQL is user-friendly. But when I listen to some developers, I realize that it is not. And that is not new: CRUD, ORM, NoSQL,… all those APIs were created because SQL is not easy. My point of view is also biased by the database engines I have been working with. A few years of DB2 at the beginning of my career. And 20 years mostly with Oracle Database. This commercial database is very powerful and has nothing to envy NoSQL for when it comes to scalability: RAC, hash partitioning, Parallel Query,… But when you look at the papers about MongoDB or DynamoDB, the comparisons are always with MySQL. I even tend to think that the NoSQL movement started as a “NoMySQL” rant at a time when MySQL had very limited features and people ignored the other databases. We need to listen to our users and, if we think that an RDBMS is still a solution for modern applications, we need to listen to the developers.

If we don’t learn and take lessons from the past, we will keep making the same mistake. When CRUD APIs were developed, the SQL advocates answered with their science: CRUD is bad, we need to work with sets of rows. When Hibernate was adopted by the Java developers, the relational database administrators answered again: ORM is bad, you need to learn SQL. And the same happens now with NoSQL. But we need to open our eyes: developers need those simple APIs. The Hibernate authors listened to them, and Hibernate is popular. MongoDB listened to them and is popular. DynamoDB listened to them and is popular. And SQLAlchemy for Python developers. And GraphQL to federate multiple sources. Yes, they lack a lot of features that we have in RDBMS, and they need the same level of learning and design, but the most important thing is that they offer the APIs that the users are looking for. Forty years ago, SQL was invented to match what the users wanted to do. Because users were, at that time, what we call today “Data Scientists”: they needed a simple, forgiving API for ad-hoc queries. However, it looks like SQL became too complex for current developers and missed the point on the integration with procedural languages: mapping Java objects to relational rowsets through SQL is not easy. And even if the SQL standard evolved, the RDBMS vendors forgot to listen to the developer experience. Look, even case-insensitivity is a problem for Java programmers:


Using the same naming conventions for procedural code and database objects is a valid requirement. The SQL standard has evolved for this (SQL-92 defines case-sensitive quoted identifiers) but actually, only a few RDBMS took the effort to be compliant with it (just play with all databases in this db<>fiddle). And even on those databases which implement the SQL evolution correctly (Oracle, DB2 and Firebird – paradoxically the oldest ones and without ‘SQL’ in their name), using quoted identifiers will probably break some old-school DBA scripts which do not correctly handle case-sensitive identifiers.
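A minimal example of what those Java developers run into (behaviour defined by the SQL standard; the folding of unquoted names differs per engine, for example Oracle folds to upper case and PostgreSQL to lower case):

-- Unquoted identifiers are case-insensitive: the engine folds them to its canonical case
create table CustomerOrders (id int);
select * from customerorders;         -- works: same object as CUSTOMERORDERS / customerorders

-- Quoted identifiers are case-sensitive and must be quoted the same way everywhere
create table "CustomerOrders2" (id int);
select * from CustomerOrders2;        -- fails: the object is "CustomerOrders2", not the folded name
select * from "CustomerOrders2";      -- works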

The lack of a simple API is not only about the SQL requests sent to the database. In all RDBMS, understanding how an execution plan can scale requires a lot of knowledge. I’ll expand on that in a next post. I’ll continue to write about the NoSQL myths, but that’s not sufficient to get developers adopting SQL again like their parents did 40 years ago. We need an easier API. Not for data scientists but for industrial coding factories. Developers should not have to learn normal forms and should just think about business entities. They should not have to write SQL text strings in their Java code. They should see execution plans like sequence diagrams or procedural pseudo-code to understand the scalability.

That’s what DBAs and RDBMS advocates should learn from NoSQL, because they didn’t take the lesson with ORM: listen to your users, and improve the developer experience. Or we will end up again with an (N+1)th attempt to abstract the relational model, rowset data manipulation, and stateful transaction consistency, which can scale only with massive hardware and IT resources. I hope to see interesting discussions on this blog or on Twitter.

The article A lesson from NoSQL (vs. RDBMS): listen to your users appeared first on Blog dbi services.

RDBMS (vs. NoSQL) scales the algorithm before the hardware


By Franck Pachot

In The myth of NoSQL (vs. RDBMS) “joins dont scale” I explained that joins actually scale very well, with O(log N) on the input table size thanks to B*Tree index access, and can even be bounded by hash partitioning with a local index, like in a DynamoDB single-table design. Jonathan Lewis added a comment that, given the names of the tables (USERS and ORDERS), we should expect an increasing number of rows returned by the join.

In this post I’ll focus on this: how does it scale when index lookup has to read more and more rows. I’ll still use DynamoDB for the NoSQL example, and this time I’ll do the same in Oracle for the RDBMS example.

NoSQL: DynamoDB


aws dynamodb create-table \
 --attribute-definitions \
  AttributeName=MyKeyPart,AttributeType=S \
  AttributeName=MyKeySort,AttributeType=S \
 --key-schema \
  AttributeName=MyKeyPart,KeyType=HASH \
  AttributeName=MyKeySort,KeyType=RANGE \
 --billing-mode PROVISIONED \
 --provisioned-throughput ReadCapacityUnits=25,WriteCapacityUnits=25 \
 --table-name Demo

This creates a Demo table with MyKeyPart as the hash key and MyKeySort as the sort key. I’m on the AWS Free Tier with limited capacity units, so don’t look at the timings. I’ll measure the efficiency from the CloudWatch metrics and the consumed capacity units.


while aws --output json dynamodb describe-table --table-name Demo | grep '"TableStatus": "CREATING"' ; do sleep 1 ; done 2>/dev/null | awk '{printf "."}END{print}'

This is how I wait for the creation to complete.

I’ll store 1999000 items, as 0+1+2+3+…+1999, to query them later. Each hash key value has a different number of items.
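A quick sanity check on that total, just the arithmetic, nothing DynamoDB-specific:

# K-00000000 gets 0 items, K-00000001 gets 1, ..., K-00001999 gets 1999
print(sum(range(2000)))   # 1999000, the total number of items loaded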


import boto3, time, datetime
from botocore.config import Config
dynamodb = boto3.resource('dynamodb',config=Config(retries={'mode':'adaptive','total_max_attempts': 10}))
n=0 ; t1=time.time()
try:
 for k in range(0,2000):
  for s in range(1,k+1):
     r=dynamodb.Table('Demo').put_item(Item={'MyKeyPart':f"K-{k:08}",'MyKeySort':f"S-{s:08}",'seconds':int(time.time()-t1),'timestamp':datetime.datetime.now().isoformat()})
     time.sleep(0.05);
     n=n+1
except Exception as e:
 print(str(e))
t2=time.time()
print(f"Last: %s\n\n===> Total: %d seconds, %d keys %d items/second\n"%(r,(t2-t1),k,n/(t2-t1)))

The outer loop iterates 2000 times to generate 2000 values for MyKeyPart, so that when I read one value of MyKeyPart I read only one hash partition. The inner loop generates from 0 to 1999 values for MyKeySort, which gives 1999000 rows in total. K-00000001 has one item, K-00000002 has two items, K-00000042 has 42 items,… until K-00001999 with 1999 items.

I’ll query for each value of MyKeyPart. The query will read from one partition and return the items. The goal is to show how it scales with an increasing number of items.


for i in {100000000..100001999}
do
aws dynamodb query \
 --key-condition-expression "MyKeyPart = :k" \
 --expression-attribute-values  '{":k":{"S":"'K-${i#1}'"}}' \
 --return-consumed-capacity TABLE \
 --return-consumed-capacity INDEXES \
 --select ALL_ATTRIBUTES \
 --table-name Demo
done

This is a simple Query with the partition key value from K-00000000 to K-00001999. I’m returning the consumed capacity.

For example, the last query for K-00001999 returned:


...
        },
        {
            "seconds": {
                "N": "117618"
            },
            "MyKeyPart": {
                "S": "K-00001999"
            },
            "MyKeySort": {
                "S": "S-00001999"
            },
            "timestamp": {
                "S": "2020-08-02T17:05:53.549532"
            }
        }
    ],
    "ScannedCount": 1999,
    "ConsumedCapacity": {
        "CapacityUnits": 20.5,
        "TableName": "Demo",
        "Table": {
            "CapacityUnits": 20.5
        }
    }
}

This query returned 1999 items using 20.5 RCU, which means about 49 items per 0.5 RCU. Let’s do some math: an eventually consistent read gets 4KB per 0.5 RCU, and my items are about 90 bytes each (in DynamoDB the attribute names are stored with each item):

$ wc -c <<<"117618 K-00001999 S-00001999 2020-08-02T17:05:53.549532 seconds MyKeyPart MyKeySort timestamp"
94

So 49 items like this one add up to roughly 4KB… we are in the right ballpark.
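The same back-of-the-envelope calculation as a small sketch (the real consumption depends on the stored item size, which is slightly different from this text-based estimate, hence the gap with the 20.5 RCU reported):

import math

item_size = 94                               # bytes per item, from the wc -c estimate above
items = 1999                                 # items returned by the Query on K-00001999
total_bytes = items * item_size              # 187906 bytes, about 183 KB
rcu = 0.5 * math.ceil(total_bytes / 4096)    # eventually consistent read: 0.5 RCU per 4KB
print(total_bytes, rcu)                      # 187906 23.0 -- same order of magnitude as the 20.5 measured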

Here are some CloudWatch statistics which shows that everything scales more or less linearly with the number of items retrieved:

The CloudWatch metrics, gathered every minute, are not very precise, and I benefit from some burst capacity. Here is the graph I’ve made from the ScannedCount and CapacityUnits returned by the queries:

No doubt, that’s the big advantage of NoSQL: simplicity. Each time you read 4KB with eventual consistency you consume 0.5 RCU. The more items you have to read, the more RCU. And because the cost (in time and money) is proportional to the RCU, you can clearly see that it scales linearly. It increases in small steps because each 0.5 RCU holds many items (49 on average).

This is how NoSQL scales: more work needs more resources, in a simple linear way: reading 500 items consumes 10x the resources needed to read 50 items. This is acceptable in the cloud because the underlying resources are elastic, provisioned with auto-scaling and billed on demand. But can we do better? Yes. For larger reads, there may be faster access paths. With DynamoDB there’s only one: the RCU, which depends on the 4KB reads. But SQL databases have multiple read paths and an optimizer to choose the best one for each query.

RDBMS SQL: Oracle Autonomous Database

For the SQL example, I’ve run a similar workload on the Autonomous Database in the Oracle Free Tier.


create table DEMO (
 MyKeyPart varchar2(10)
,MyKeySort varchar2(10)
,"seconds" number
,"timestamp" timestamp
,constraint DEMO_PK primary key(MyKeyPart,MyKeySort) using index local compress)
partition by hash(MyKeyPart) partitions 8;

This is a table very similar to the DynamoDB one: hash partition on MyKeyPart and local index on MyKeySort.


insert /*+ append */ into DEMO
select
 'K-'||to_char(k,'FM09999999') K,
 'S-'||to_char(s,'FM09999999') S,
 0 "seconds",
 current_timestamp "timestamp"
from
 (select rownum-1 k from xmltable('1 to 2000'))
,lateral  (select rownum s from xmltable('1 to 2000') where rownum<=k) x
order by k,s
/

This is similar to the Python loops I used to fill the DynamoDB table. I use XMLTABLE as a row generator, and a lateral join as the inner loop. The select defines the rows and the insert loads them directly, without going through application loops.


SQL> select count(*) from DEMO;

    COUNT(*)
____________
   1,999,000

I have my 1999000 rows here.


commit;

When you are ok with your changes, don’t forget to commit, to make them visible and durable as a whole. This takes no time.

In order to do something similar to the DynamoDB queries, I’ve generated a command file like:


select * from DEMO where K='K-00000000';
select * from dbms_xplan.display_cursor(format=>'allstats last +cost');
select * from DEMO where K='K-00000001';
select * from dbms_xplan.display_cursor(format=>'allstats last +cost');
select * from DEMO where K='K-00000002';
select * from dbms_xplan.display_cursor(format=>'allstats last +cost');
...
select * from DEMO where K='K-00001998';
select * from dbms_xplan.display_cursor(format=>'allstats last +cost');
select * from DEMO where K='K-00001999';
select * from dbms_xplan.display_cursor(format=>'allstats last +cost');

For each one, I queried the execution plan, which is much more detailed than the --return-consumed-capacity output.

I have built the following graph from the results of the execution plans. Rather than showing the optimizer’s estimated cost, I used the execution statistics about buffer reads, as they are the closest equivalent to the DynamoDB RCU. However, I have two kinds of execution plans: index access, which reads 8KB blocks, and full table scan, which is optimized with multiblock reads. In the following, I have normalized this metric with the ratio of multiblock to single-block reads, in order to show both of them:

  • The Plan hash value: 3867611502, in orange, is PARTITION HASH SINGLE + INDEX RANGE SCAN + TABLE ACCESS BY LOCAL INDEX ROWID which is very similar to the DynamoDB query. The cost is proportional to the number of rows returned.
  • The Plan hash value: 388464122, in blue, is PARTITION HASH SINGLE + TABLE FULL SCAN, which scans a partition with multi-block I/O and direct-path reads (and even storage indexes in this case, as the Autonomous Database runs on Exadata). Thanks to those RDBMS optimizations, this access path is fast even when we don’t read all rows. In this example, I never read more than 1% of a partition (1999 rows from a total of 1999000 distributed over 8 partitions). What is awesome is that the cost here does not depend on the size of the result but is constant: index access is faster for a few rows, but as soon as you read many, the cost is capped by the full table scan.

Of course, the table can grow, and then a full table scan becomes more expensive. But we can also increase the number of partitions in order to keep the FULL TABLE SCAN within the response time requirements, because it actually reads only one partition. And the beauty is that, thanks to SQL, we don’t have to change any line of code. The RDBMS has an optimizer that estimates the number of rows to be retrieved and chooses the right access path. Here, when retrieving between 500 and 800 rows, the cost of both is similar and the optimizer chooses one or the other, probably because I’m touching partitions with small differences in the distribution. Below a few hundred rows, the index access is clearly the best. Above a thousand, scanning the whole partition has a constant cost even when the result set increases.
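For example, with a hash-partitioned table like this one, adding partitions is a simple DDL operation. This is only a sketch (the rows of one existing partition are redistributed, and the affected local index partitions may need attention; check the partition maintenance documentation for your version):

-- Add a hash partition so that each partition, and thus each partition full scan, stays small.
-- No application code changes: the optimizer keeps choosing between index access and partition scan.
alter table DEMO add partition update indexes;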

Here are the execution plans I’ve built the graph from. This one is index access for few rows (I picked up the one for 42 rows):


DEMO@atp1_tp> select * from DEMO where K='K-00000042';

    MyKeyPart     MyKeySort    seconds                          timestamp
_____________ _____________ __________ __________________________________
K-00000042    S-00000001             0 02-AUG-20 06.27.46.757178000 PM
K-00000042    S-00000002             0 02-AUG-20 06.27.46.757178000 PM
K-00000042    S-00000003             0 02-AUG-20 06.27.46.757178000 PM
...
K-00000042    S-00000042             0 02-AUG-20 06.27.46.757178000 PM

42 rows selected.

DEMO@atp1_tp> select * from dbms_xplan.display_cursor(format=>'allstats last +cost');

                                                                                                                        PLAN_TABLE_OUTPUT
_________________________________________________________________________________________________________________________________________
SQL_ID  04mmtv49jk782, child number 0
-------------------------------------
select * from DEMO where K='K-00000042'

Plan hash value: 3867611502

--------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name    | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  |
--------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |         |      1 |        |     6 (100)|     42 |00:00:00.01 |       5 |      1 |
|   1 |  PARTITION HASH SINGLE                     |         |      1 |     42 |     6   (0)|     42 |00:00:00.01 |       5 |      1 |
|   2 |   TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| DEMO    |      1 |     42 |     6   (0)|     42 |00:00:00.01 |       5 |      1 |
|*  3 |    INDEX RANGE SCAN                        | DEMO_PK |      1 |     42 |     3   (0)|     42 |00:00:00.01 |       3 |      0 |
--------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("K"='K-00000042')

3 buffers to read from the index and 2 additional buffers to read from the table (one of them was not in the cache and was a physical read). It is a single partition. The more rows in the result, the more buffers have to be read from the table. I have about one thousand rows per 8KB buffer here (column names are in the dictionary and not in each block, and I used optimal datatypes, like timestamp for the timestamp column).

Here I take the last query returning 1999 rows:


DEMO@atp1_tp> select * from DEMO where K='K-00001999';

            K             S    seconds                          timestamp
_____________ _____________ __________ __________________________________
K-00001999    S-00000001             0 02-AUG-20 06.27.46.757178000 PM
K-00001999    S-00000002             0 02-AUG-20 06.27.46.757178000 PM
...
1,999 rows selected.

DEMO@atp1_tp> select * from dbms_xplan.display_cursor(format=>'allstats last +cost');

                                                                                            PLAN_TABLE_OUTPUT
_____________________________________________________________________________________________________________
SQL_ID  g62zuthccp7fu, child number 0
-------------------------------------
select * from DEMO where K='K-00001999'

Plan hash value: 388464122

----------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |      |      1 |        |    21 (100)|   1999 |00:00:00.01 |    1604 |
|   1 |  PARTITION HASH SINGLE     |      |      1 |   2181 |    21  (10)|   1999 |00:00:00.01 |    1604 |
|*  2 |   TABLE ACCESS STORAGE FULL| DEMO |      1 |   2181 |    21  (10)|   1999 |00:00:00.01 |    1604 |
----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - storage("K"='K-00001999')
       filter("K"='K-00001999')

One partition is fully scanned. The execution plan shows 1604 buffers, but there are many optimizations to read them faster, and this is why the cost is finally not very high (cost=21 here means that reading those sequential blocks is estimated to take the same time as reading 21 random blocks). One major optimization is reading 128 blocks with one I/O call (1MB multiblock read), another one is reading them bypassing the shared buffers (direct-path read), and here there’s even some offloading to STORAGE, where rows are filtered even before reaching the database instance.

I voluntarily didn’t get into the details, like why the cost of the full table scan has some variations. This depends on the hash distribution and the optimizer statistics (I used dynamic sampling here). You can read more about the inflection point where a full table scan is better than index access in a previous post: https://blog.dbi-services.com/oracle-12c-adaptive-plan-inflexion-point/ as this also applies to joins and scaling the algorithm can even happen after the SQL query compilation – at execution time – in some RDBMS (Oracle and SQL Server for example). As usual, the point is not that you take the numbers from this small example but just understand the behavior: linear increase and then constant cost. NoSQL DynamoDB is optimized for key-value access. If you have queries reading many keys, you should stream the data into another database service. SQL databases like Oracle are optimized for the data and you can run multiple use cases without changing your application code. You just need to define the data access structures (index, partitioning) and the optimizer will choose the right algorithm.

The article RDBMS (vs. NoSQL) scales the algorithm before the hardware appeared first on Blog dbi services.

AWS DynamoDB Local: running NoSQL on SQLite


By Franck Pachot

DynamoDB is a cloud-native, managed, key-value proprietary database designed by AWS to handle massive throughput for large volume and high concurrency with a simple API.

  • simple API: Get, Put, Query, Scan on a table without joins, optimizer, transparent indexes,…
  • high concurrency: queries are directed to one shard with a hash function
  • massive throughput: you can just add partitions to increase the IOPS
  • large volume: it is a shared-nothing architecture where all tables are hash partitioned
  • key-value: you access to any data with a key that is hashed to go to the right partition and the right range within it
  • managed: you have zero administration to do. Just define the key and the throughput you desire and use it
  • cloud-native: it was designed from the beginning to run in the AWS cloud

One problem with a cloud-native solution is that you need access to the service while developing your application. This is not a major cost issue because DynamoDB is available on the Free Tier (with limited throughput, but that’s sufficient for development). But users may want to develop offline, on their laptop, without a reliable internet connection. And this is possible because Amazon provides a downloadable version of this database: DynamoDB Local.

Difference

The most important thing is that the API is the same as with the cloud version. For sure, all the partitioning stuff is missing in the local version. And I have no idea if the underlying data format is similar or not:


However, this is more out of curiosity. The local version just needs a compatible API. You will not measure performance on it.

Install


[oracle@cloud DynamoDBLocal]$ cat /etc/oracle-release
Oracle Linux Server release 7.7
[oracle@cloud DynamoDBLocal]$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.7 (Maipo)

I am doing this installation on OEL 7.7 which is similar to RHEL 7.7 or CentOS 7.7


[oracle@cloud DynamoDBLocal]$ java -version
java version "1.8.0_231"
Java(TM) SE Runtime Environment (build 1.8.0_231-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.231-b11, mixed mode)

I have a JRE installed


mkdir -p /var/tmp/DynamoDBLocal && cd $_

I’m installing everything in a local temporary directory.

All is documented in: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/DynamoDBLocal.DownloadingAndRunning.html#download-locally


curl https://s3.eu-central-1.amazonaws.com/dynamodb-local-frankfurt/dynamodb_local_latest.tar.gz | tar -xvzf -

This simply downloads and extracts the DynamoDB Local distribution

Run


java -Djava.library.path=/var/tmp/DynamoDBLocal/DynamoDBLocal_lib -jar /var/tmp/DynamoDBLocal/DynamoDBLocal.jar -sharedDb -dbPath /var/tmp/DynamoDBLocal &

This will use a persistent file (you can run it in memory only with -inMemory instead) in the directory mentioned by -dbPath, and -sharedDb will use the following file name:


[oracle@cloud ~]$ ls -l /var/tmp/DynamoDBLocal/shared-local-instance.db
-rw-r--r-- 1 oracle oinstall 12346368 Aug  6 12:20 /var/tmp/DynamoDBLocal/shared-local-instance.db

I’ll tell you more about this file later.

so, when started it displays on which port it listens:



[oracle@cloud ~]$ pkill -f -- '-jar DynamoDBLocal.jar -sharedDb'

[oracle@cloud ~]$ java -Djava.library.path=/var/tmp/DynamoDBLocal/DynamoDBLocal_lib -jar /var/tmp/DynamoDBLocal/DynamoDBLocal.jar -sharedDb -dbPath /var/tmp/DynamoDBLocal &
[1] 33294
[oracle@cloud ~]$ Initializing DynamoDB Local with the following configuration:
Port:   8000
InMemory:       false
DbPath: /var/tmp/DynamoDBLocal
SharedDb:       true
shouldDelayTransientStatuses:   false
CorsParams:     *

Another port can be defined with -port.
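For example, a throw-away instance keeping everything in memory (no file on disk) and listening on another port could be started like this, reusing the same jar with different options:

java -Djava.library.path=/var/tmp/DynamoDBLocal/DynamoDBLocal_lib \
 -jar /var/tmp/DynamoDBLocal/DynamoDBLocal.jar -inMemory -port 8001

Note that -inMemory and -dbPath are mutually exclusive.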

AWS CLI

I use the AWS command line interface; here is how to install it:


wget --continue https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip
unzip -oq awscli-exe-linux-x86_64.zip
sudo ./aws/install
aws configure

For the configuration, as you are local, you can put anything you want for the access key and region:


[oracle@cloud ~]$ aws configure
AWS Access Key ID [****************chot]: @FranckPachot
AWS Secret Access Key [****************chot]: @FranckPachot
Default region name [Lausanne]: Lausanne
Default output format [table]:

Because this information is not used, I’ll need to specify the endpoint, "--endpoint-url http://localhost:8000", with each call.

Create table


aws dynamodb --endpoint-url http://localhost:8000 create-table \
 --attribute-definitions \
  AttributeName=MyKeyPart,AttributeType=S \
  AttributeName=MyKeySort,AttributeType=S \
 --key-schema \
  AttributeName=MyKeyPart,KeyType=HASH \
  AttributeName=MyKeySort,KeyType=RANGE \
 --billing-mode PROVISIONED \
 --provisioned-throughput ReadCapacityUnits=25,WriteCapacityUnits=25 \
 --table-name Demo

I mentioned some provisioned capacity, ready for my test on the Free Tier, but it is actually ignored by DynamoDB Local.


[oracle@cloud ~]$ aws dynamodb --endpoint-url http://localhost:8000 create-table \
>  --attribute-definitions \
>   AttributeName=MyKeyPart,AttributeType=S \
>   AttributeName=MyKeySort,AttributeType=S \
>  --key-schema \
>   AttributeName=MyKeyPart,KeyType=HASH \
>   AttributeName=MyKeySort,KeyType=RANGE \
>  --billing-mode PROVISIONED \
>  --provisioned-throughput ReadCapacityUnits=25,WriteCapacityUnits=25 \
>  --table-name Demo
--------------------------------------------------------------------------------------------------------------------------------------------------------
|                                                                      CreateTable                                                                     |
+------------------------------------------------------------------------------------------------------------------------------------------------------+
||                                                                  TableDescription                                                                  ||
|+----------------------------------+------------+-----------------------------------------------------+------------+-----------------+---------------+|
||         CreationDateTime         | ItemCount  |                      TableArn                       | TableName  | TableSizeBytes  |  TableStatus  ||
|+----------------------------------+------------+-----------------------------------------------------+------------+-----------------+---------------+|
||  2020-08-06T12:42:23.669000+00:00|  0         |  arn:aws:dynamodb:ddblocal:000000000000:table/Demo  |  Demo      |  0              |  ACTIVE       ||
|+----------------------------------+------------+-----------------------------------------------------+------------+-----------------+---------------+|
|||                                                               AttributeDefinitions                                                               |||
||+------------------------------------------------------------------------+-------------------------------------------------------------------------+||
|||                              AttributeName                             |                              AttributeType                              |||
||+------------------------------------------------------------------------+-------------------------------------------------------------------------+||
|||  MyKeyPart                                                             |  S                                                                      |||
|||  MyKeySort                                                             |  S                                                                      |||
||+------------------------------------------------------------------------+-------------------------------------------------------------------------+||
|||                                                                     KeySchema                                                                    |||
||+----------------------------------------------------------------------------------------+---------------------------------------------------------+||
|||                                      AttributeName                                     |                         KeyType                         |||
||+----------------------------------------------------------------------------------------+---------------------------------------------------------+||
|||  MyKeyPart                                                                             |  HASH                                                   |||
|||  MyKeySort                                                                             |  RANGE                                                  |||
||+----------------------------------------------------------------------------------------+---------------------------------------------------------+||
|||                                                               ProvisionedThroughput                                                              |||
||+--------------------------------+---------------------------------+-----------------------------+-----------------------+-------------------------+||
|||      LastDecreaseDateTime      |      LastIncreaseDateTime       |   NumberOfDecreasesToday    |   ReadCapacityUnits   |   WriteCapacityUnits    |||
||+--------------------------------+---------------------------------+-----------------------------+-----------------------+-------------------------+||
|||  1970-01-01T00:00:00+00:00     |  1970-01-01T00:00:00+00:00      |  0                          |  25                   |  25                     |||
||+--------------------------------+---------------------------------+-----------------------------+-----------------------+-------------------------+||

Another difference with the cloud version is that this command returns immediately (no “CREATING” status).

Python

I’ll put some items with Python, so I need to install it.


yum install -y python3
pip3 install boto3

boto3 is the AWS SDK for Python

Insert some items

Here is my demo.py program:


import boto3, time, datetime
from botocore.config import Config
dynamodb = boto3.resource('dynamodb',config=Config(retries={'mode':'adaptive','total_max_attempts': 10}),endpoint_url='http://localhost:8000')
n=0 ; t1=time.time()
try:
 for k in range(0,10):
  for s in range(1,k+1):
     r=dynamodb.Table('Demo').put_item(Item={'MyKeyPart':f"K-{k:08}",'MyKeySort':f"S-{s:08}",'seconds':int(time.time()-t1),'timestamp':datetime.datetime.now().isoformat()})
     time.sleep(0.05);
     n=n+1
except Exception as e:
 print(str(e))
t2=time.time()
print(f"Last: %s\n\n===> Total: %d seconds, %d keys %d items/second\n"%(r,(t2-t1),k,n/(t2-t1)))

I just fill each collection with an increasing number of items.


[oracle@cloud DynamoDBLocal]$ python3 demo.py
Last: {'ResponseMetadata': {'RequestId': '6b23dcd2-dbb0-404e-bf5d-57e7a9426c9b', 'HTTPStatusCode': 200, 'HTTPHeaders': {'content-type': 'application/x-amz-json-1.0', 'x-amz-crc32': '2745614147', 'x-amzn-requestid': '6b23dcd2-dbb0-404e-bf5d-57e7a9426c9b', 'content-length': '2', 'server': 'Jetty(8.1.12.v20130726)'}, 'RetryAttempts': 0}}

===> Total: 3 seconds, 9 keys 14 items/second

[oracle@cloud DynamoDBLocal]$

count items


[oracle@cloud DynamoDBLocal]$ aws dynamodb --endpoint-url http://localhost:8000 scan --table-name Demo --select=COUNT --return-consumed-capacity TOTAL
----------------------------------
|              Scan              |
+----------+---------------------+
|   Count  |    ScannedCount     |
+----------+---------------------+
|  45      |  45                 |
+----------+---------------------+
||       ConsumedCapacity       ||
|+----------------+-------------+|
||  CapacityUnits |  TableName  ||
|+----------------+-------------+|
||  0.5           |  Demo       ||
|+----------------+-------------+|

The nice thing here is that you can see the ConsumedCapacity, which gives you an idea of how it scales. Here, I read 45 items of 81 bytes each, which is less than 4KB in total. The cost is then 0.5 RCU for an eventually consistent read.

shared-local-instance.db

You know how curious I am. If you had to build a local NoSQL database, which storage engine would you use?


[oracle@cloud DynamoDBLocal]$ cd /var/tmp/DynamoDBLocal
[oracle@cloud DynamoDBLocal]$ file shared-local-instance.db
shared-local-instance.db: SQLite 3.x database

Yes, this NoSQL database is actually stored in a SQL database!

They use SQLite for this DynamoDB Local engine, embedded in Java.


[oracle@cloud DynamoDBLocal]$ sudo yum install sqlite
Loaded plugins: ulninfo, versionlock
Excluding 247 updates due to versionlock (use "yum versionlock status" to show them)
Package sqlite-3.7.17-8.el7_7.1.x86_64 already installed and latest version
Nothing to do

I have SQLite installed here, and can then look at what is inside with my preferred data API: SQL.


[oracle@cloud DynamoDBLocal]$ sqlite3 /var/tmp/DynamoDBLocal/shared-local-instance.db
SQLite version 3.7.17 2013-05-20 00:56:22
Enter ".help" for instructions
Enter SQL statements terminated with a ";"


sqlite> .databases
seq  name             file
---  ---------------  ----------------------------------------------------------
0    main             /var/tmp/DynamoDBLocal/shared-local-instance.db

sqlite> .tables
Demo  cf    dm    sm    ss    tr    us

Here is my Demo table, accompanied by some internal tables.
Let’s look at the fixed tables there (which I would call the catalog or the dictionary if DynamoDB were not a NoSQL database)


sqlite> .headers on
sqlite> .mode column
sqlite> select * from cf;
version
----------
v2.4.0
sqlite>

That looks like the version of the database


sqlite> select * from dm;
TableName   CreationDateTime  LastDecreaseDate  LastIncreaseDate  NumberOfDecreasesToday  ReadCapacityUnits  WriteCapacityUnits  TableInfo                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    BillingMode  PayPerRequestDateTime
----------  ----------------  ----------------  ----------------  ----------------------  -----------------  ------------------  ---------------------------------------------------------------------------------------------                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                -----------  ---------------------
Demo        1596718271246     0                 0                 0                       25                 25                  {"Attributes":[{"AttributeName":"MyKeyPart","AttributeType":"S"},{"AttributeName":"MyKeySort","AttributeType":"S"}],"GSIList":[],"GSIDescList":[],"SQLiteIndex":{"":[{"DynamoDBAttribute":{"AttributeName":"MyKeyPart","AttributeType":"S"},"KeyType":"HASH","SQLiteColumnName":"hashKey","SQLiteDataType":"TEXT"},{"DynamoDBAttribute":{"AttributeName":"MyKeySort","AttributeType":"S"},"KeyType":"RANGE","SQLiteColumnName":"rangeKey","SQLiteDataType":"TEXT"}]},"UniqueIndexes":[{"DynamoDBAttribute":{"AttributeName":"MyKeyPart","AttributeType":"S"},"KeyType":"HASH","SQLiteColumnName":"hashKey","SQLiteDataType":"TEXT"},{"DynamoDBAttribute":{"AttributeName":"MyKeySort","AttributeType":"S"},"KeyType":"RANGE","SQLiteColumnName":"rangeKey","SQLiteDataType":"TEXT"}],"UniqueGSIIndexes":[]}  0            0
sqlite>

Here is the metadata about my table: the DynamoDB definitions, like “AttributeName” and “AttributeType”, and their mapping to the SQLite side, “SQLiteColumnName”, “SQLiteDataType”,…

The tables ss, tr and us are empty; they are related to Streams and Transactions, and I may have a look at them in a future post.

Now the most interesting one: my Demo table. For this one, I’ve opened it in DBeaver:


I have one SQLite table per DynamoDB table (global secondary indexes are just indexes on the table) and one SQLite row per DynamoDB item. The keys (the HASH for partitioning and the RANGE for sorting within the partition), for which I used strings, are stored as TEXT in SQLite, but containing their ASCII hexadecimal codes (hashKey and rangeKey). Those are the columns of the SQLite primary key. They are also stored in an even larger binary form (hashValue and rangeValue, where hashValue is indexed), probably the result of a hash function applied to them. And finally, the full item is stored as JSON in a BLOB. The itemSize is interesting because that’s what counts for Capacity Units (the sum of the attribute names and attribute values).
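To look at the same structure from the command line, the .schema command of the sqlite3 shell prints the CREATE TABLE and CREATE INDEX statements that DynamoDB Local generated for this table (the exact DDL may vary with the version):

sqlite3 /var/tmp/DynamoDBLocal/shared-local-instance.db ".schema Demo"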

The power of SQL to verify the NoSQL database

Actually, there’s a big advantage to having this NoSQL engine backed by a SQL database. During the development phase, you don’t only need a database to run your code. You also have to verify the integrity of the data, even after some race conditions. For example, I’ve inserted more items by increasing the ‘k’ loop in my demo.py and letting it run for 6 hours:


[oracle@cloud aws]$ time aws dynamodb --endpoint-url http://localhost:8000 scan --table-name Demo --select=COUNT --return-consumed-capacity TOTAL
----------------------------------
|              Scan              |
+-----------+--------------------+
|   Count   |   ScannedCount     |
+-----------+--------------------+
|  338498   |  338498            |
+-----------+--------------------+
||       ConsumedCapacity       ||
|+----------------+-------------+|
||  CapacityUnits |  TableName  ||
|+----------------+-------------+|
||  128.5         |  Demo       ||
|+----------------+-------------+|

real    0m50.385s
user    0m0.743s
sys     0m0.092s

The DynamoDB scan is long here: almost a minute for a small table (338498 items). This API is designed for the cloud, where a huge number of disks can provide high throughput for many concurrent requests. There’s no optimization when scanning all items, as I described in a previous post: RDBMS (vs. NoSQL) scales the algorithm before the hardware. SQL databases have optimizations for full table scans, and the database for those 338498 rows is really small:


[oracle@cloud aws]$ du -h /var/tmp/DynamoDBLocal/shared-local-instance.db
106M    /var/tmp/DynamoDBLocal/shared-local-instance.db

Counting the rows is faster from SQLite directly:


[oracle@cloud aws]$ time sqlite3 /var/tmp/DynamoDBLocal/shared-local-instance.db "select count(*) from Demo;"
338498

real    0m0.045s
user    0m0.015s
sys     0m0.029s

But be careful: SQLite is not a multi-user database. Query it only when DynamoDB Local is stopped.
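To reduce the risk of writing to the file by accident, you can also open it read-only, if your sqlite3 build supports the -readonly option (this only protects against accidental writes, it does not make concurrent access with a running DynamoDB Local safe):

sqlite3 -readonly /var/tmp/DynamoDBLocal/shared-local-instance.db "select count(*) from Demo;"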

And with the power of SQL it is easy to analyze the data beyond the API provided by DynamoDB:


[oracle@cloud aws]$ sqlite3 /var/tmp/DynamoDBLocal/shared-local-instance.db
SQLite version 3.32.3 2020-06-18 14:00:33
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .mode column
sqlite> .header on
sqlite> .timer on

sqlite> select count(distinct hashKey),count(distinct hashKey),count(distinct rangeKey),count(distinct rangeValue) from Demo;
count(distinct hashKey)  count(distinct hashKey)  count(distinct rangeKey)  count(distinct rangeValue)
-----------------------  -----------------------  ------------------------  --------------------------
823                      823                      822                       822

CPU Time: user 0.570834 sys 0.168966

This simple query confirms that I have as many distinct hash/range keys as values.


sqlite> select cast(hashKey as varchar),json_extract(ObjectJSON,'$.MyKeyPart')
   ...> ,count(rangeKey),count(distinct rangeKey)
   ...> from Demo group by hashKey order by count(rangeKey) desc limit 10;

cast(hashKey as varchar)  json_extract(ObjectJSON,'$.MyKeyPart')  count(rangeKey)  count(distinct rangeKey)
------------------------  --------------------------------------  ---------------  ------------------------
K-00000823                {"S":"K-00000823"}                      245              245
K-00000822                {"S":"K-00000822"}                      822              822
K-00000821                {"S":"K-00000821"}                      821              821
K-00000820                {"S":"K-00000820"}                      820              820
K-00000819                {"S":"K-00000819"}                      819              819
K-00000818                {"S":"K-00000818"}                      818              818
K-00000817                {"S":"K-00000817"}                      817              817
K-00000816                {"S":"K-00000816"}                      816              816
K-00000815                {"S":"K-00000815"}                      815              815
K-00000814                {"S":"K-00000814"}                      814              814

Run Time: real 0.297 user 0.253256 sys 0.042886

There I checked how many distinct range keys I have for the 10 partition keys (LIMIT 10) with the highest count (ORDER BY count(rangeKey) DESC), converted the hexadecimal key into a string (CAST), and also compared it with what is in the JSON column (JSON_EXTRACT). Yes, many relational databases can easily manipulate semi-structured JSON with SQL.


sqlite> select
   ...>  round(timestamp_as_seconds-lag(timestamp_as_seconds)over(order by timestamp)) seconds
   ...>  ,MyKeyPart,MyKeySort,MyKeySort_First,MyKeySort_Last,timestamp
   ...> from (
   ...>  select
   ...>    MyKeyPart,MyKeySort
   ...>   ,first_value(MyKeySort)over(partition by MyKeyPart) MyKeySort_First
   ...>   ,last_value(MyKeySort)over(partition by MyKeyPart) MyKeySort_Last
   ...>   ,timestamp,timestamp_as_seconds
   ...>  from (
   ...>   select
   ...>     json_extract(ObjectJSON,'$.MyKeyPart.S') MyKeyPart,json_extract(ObjectJSON,'$.MyKeySort.S') MyKeySort
   ...>    ,json_extract(ObjectJSON,'$.timestamp.S') timestamp
   ...>    ,julianday(datetime(json_extract(ObjectJSON,'$.timestamp.S')))*24*60*60 timestamp_as_seconds
   ...>   from Demo
   ...>  )
   ...> )
   ...> where MyKeySort=MyKeySort_Last
   ...> order by timestamp desc limit 5
   ...> ;

seconds     MyKeyPart   MyKeySort   MyKeySort_First  MyKeySort_Last  timestamp
----------  ----------  ----------  ---------------  --------------  --------------------------
 16.0       K-00000823  S-00000245  S-00000001       S-00000245      2020-08-07T04:19:55.470202
 54.0       K-00000822  S-00000822  S-00000001       S-00000822      2020-08-07T04:19:39.388729
111.0       K-00000821  S-00000821  S-00000001       S-00000821      2020-08-07T04:18:45.306205
 53.0       K-00000820  S-00000820  S-00000001       S-00000820      2020-08-07T04:16:54.977931
 54.0       K-00000819  S-00000819  S-00000001       S-00000819      2020-08-07T04:16:01.003016

Run Time: real 3.367 user 2.948707 sys 0.414206
sqlite>

Here is how I checked the time taken by the inserts. My Python code added a timestamp, which I convert to seconds (JULIANDAY) to get the difference from the previous row (LAG). I actually did that only for the last item of each collection (LAST_VALUE).

Those are just examples: you can play with your NoSQL data and improve your SQL skills at the same time. SQLite is one of the databases with the best documentation: https://www.sqlite.org/lang.html. And it is not only about learning: during development and UAT you need to verify the quality of the data, and this often goes beyond the application API (especially when the goal is to verify that the application API is correct).

That’s all for this post. You know how to run DynamoDB locally, and can even access it with SQL for powerful queries 😉

Cet article AWS DynamoDB Local: running NoSQL on SQLite est apparu en premier sur Blog dbi services.

Merge-Statement crashes with ORA-7445 [kdu_close] caused by Real Time Statistics?


In a recent project we migrated an Oracle database previously running on 12.1.0.2 on an Oracle Database Appliance to an Exadata X8 with DB version 19.7. Shortly after the migration a merge-statement (upsert) failed with an

ORA-07445: exception encountered: core dump [kdu_close()+107] [SIGSEGV] [ADDR:0xE0] [PC:0x1276AE6B] [Address not mapped to object] [] 

The stack looked as follows:

kdu_close - updThreePhaseExe - upsexe - opiexe - kpoal8 - opiodr - ttcpip - opitsk - opiino - opiodr - opidrv - sou2o - opimai_real - ssthrdmain - main - __libc_start_main - _start

As experienced Oracle DBAs know, an ORA-7445 error is usually caused by an Oracle bug (defect). Searching My Oracle Support didn’t reveal much for module “kdu_close” and the associated error stack. Working on a Service Request (SR) with Oracle Support hasn’t provided a solution or workaround so far either. Checking Orafun also didn’t provide much insight about kdu_close, other than the fact that we are in the area of the code about kernel data update (kdu).

As the merge crashed at the end of its processing (from earlier successful executions we knew how long the statement usually takes), I set up the hypothesis that this issue might be related to the 19c new feature Real Time Statistics on Exadata. To verify the hypothesis, I first did some tests with Real Time Statistics and merge-statements in my environment to see if they work as expected and if we can disable them with a hint:

1.) Enable Exadata Features

alter system set "_exadata_feature_on"=TRUE scope=spfile;
shutdown immediate
startup

2.) Test if a merge-statement triggers real time statistics

I set up tables tab1 and tab2 similar to the setup on Oracle-Base and ran a merge statement, which actually updates 1000 rows.
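The setup itself is not shown here; a minimal sketch of what it could look like (the real Oracle-Base script may differ) is:


create table tab1 (id number primary key, description varchar2(100));
create table tab2 (id number primary key, description varchar2(100));

insert into tab1 select level, 'Description '||level from dual connect by level <= 1000;
insert into tab2 select level, 'New description '||level from dual connect by level <= 1000;
commit;

exec dbms_stats.gather_table_stats(user, 'TAB1')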

Initially we just have statistics on tab1 from dbms_stats.gather_table_stats. Here e.g. the columns:

testuser1@orcl@orcl> select column_name, last_analyzed, notes from user_tab_col_statistics where table_name='TAB1';

COLUMN_NAME      LAST_ANALYZED       NOTES
---------------- ------------------- ----------------------------------------------------------------
ID               07.08.2020 17:29:37
DESCRIPTION      07.08.2020 17:29:37

Then I ran the merge:

testuser1@orcl@orcl> merge
  2  into	tab1
  3  using	tab2
  4  on	(tab1.id = tab2.id)
  5  when matched then
  6  	     update set tab1.description = tab2.description
  7  WHEN NOT MATCHED THEN
  8  	 INSERT (  id, description )
  9  	 VALUES ( tab2.id, tab2.description )
 10  ;

1000 rows merged.

testuser1@orcl@orcl> commit;

Commit complete.

testuser1@orcl@orcl> exec dbms_stats.flush_database_monitoring_info;

PL/SQL procedure successfully completed.

testuser1@orcl@orcl> select column_name, last_analyzed, notes from user_tab_col_statistics where table_name='TAB1';

COLUMN_NAME      LAST_ANALYZED       NOTES
---------------- ------------------- ----------------------------------------------------------------
ID               07.08.2020 17:29:37
DESCRIPTION      07.08.2020 17:29:37
ID               07.08.2020 17:37:34 STATS_ON_CONVENTIONAL_DML
DESCRIPTION      07.08.2020 17:37:34 STATS_ON_CONVENTIONAL_DML

So obviously Real Time Statistics gathering was triggered.

After the verification that merge statements trigger statistics to be gathered in real time I disabled Real Time Statistics on that specific merge-statement by adding the hint

/*+ NO_GATHER_OPTIMIZER_STATISTICS */

to it.

testuser1@orcl@orcl> select column_name, last_analyzed, notes from user_tab_col_statistics where table_name='TAB1';

COLUMN_NAME      LAST_ANALYZED       NOTES
---------------- ------------------- ----------------------------------------------------------------
ID               07.08.2020 17:46:38
DESCRIPTION      07.08.2020 17:46:38

testuser1@orcl@orcl> merge /*+ NO_GATHER_OPTIMIZER_STATISTICS */
  2  into	tab1
  3  using	tab2
  4  on	(tab1.id = tab2.id)
  5  when matched then
  6  	     update set tab1.description = tab2.description
  7  WHEN NOT MATCHED THEN
  8  	 INSERT (  id, description )
  9  	 VALUES ( tab2.id, tab2.description )
 10  ;

1000 rows merged.

testuser1@orcl@orcl> commit;

Commit complete.

testuser1@orcl@orcl> exec dbms_stats.flush_database_monitoring_info;

PL/SQL procedure successfully completed.

testuser1@orcl@orcl> select column_name, last_analyzed, notes from user_tab_col_statistics where table_name='TAB1';

COLUMN_NAME      LAST_ANALYZED       NOTES
---------------- ------------------- ----------------------------------------------------------------
ID               07.08.2020 17:46:38
DESCRIPTION      07.08.2020 17:46:38

So the hint works as expected.

The statement of the real application was generated and could not be modified, so I had to create a SQL Patch to add the hint to it at parse time:

var rv varchar2(32);
begin
   :rv:=dbms_sqldiag.create_sql_patch(sql_id=>'13szq2g6xbsg5',
                                      hint_text=>'NO_GATHER_OPTIMIZER_STATISTICS',
                                      name=>'disable_real_time_stats_on_merge',
                                      description=>'disable real time stats');
end;
/
print rv

REMARK: If a statement is no longer in the shared pool but is available in the AWR history, you may use the method below to create the SQL patch:

var rv varchar2(32);
declare
   v_sql CLOB;
begin
   select sql_text into v_sql from dba_hist_sqltext where sql_id='13szq2g6xbsg5';
   :rv:=dbms_sqldiag.create_sql_patch(
             sql_text  => v_sql,
             hint_text=>'NO_GATHER_OPTIMIZER_STATISTICS',
             name=>'disable_real_time_stats_on_merge',
             description=>'disable real time stats');
end;
/
print rv
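If you want to verify that the patch is picked up, you can query DBA_SQL_PATCHES and, after the next execution of the statement, the SQL_PATCH column in V$SQL (the Note section of DBMS_XPLAN.DISPLAY_CURSOR also mentions an applied SQL patch). A quick sketch:


select name, status from dba_sql_patches where name = 'disable_real_time_stats_on_merge';

select sql_id, child_number, sql_patch from v$sql where sql_id = '13szq2g6xbsg5';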

It turned out that disabling Real Time Statistics actually worked around the ORA-7445 issue. It might be a coincidence and positive side effect that disabling Real Time Statistics worked around the issue, but for the moment we can cope with it and hope that this information helps to resolve the opened SR so that we get a permanent fix from Oracle for this defect.

Cet article Merge-Statement crashes with ORA-7445 [kdu_close] caused by Real Time Statistics? est apparu en premier sur Blog dbi services.

Installing MariaDB Server 10.5.5 on CentOS 8


In the following blog, you will learn how to install the latest version of the MariaDB Server 10.5.5 on CentOS 8 and how to use the mariadb user & binaries instead of mysql.

As a reminder, MariaDB Server is highly scalable and can be used either standalone, as a master/slave setup or as a Galera-based MariaDB Cluster.
MariaDB Server 10.5.5 was released on 10-Aug-2020 with many new features and some notable changes, such as the S3 storage engine, which allows archiving MariaDB tables in Amazon S3, and the binaries, which now all begin with mariadb (mariadb-admin, mariadb-dump, …).
It is the very latest release at the time of this writing.

Preparation

To keep our Linux box up to date, it is good practice to always update the Linux software packages before installing anything new.

[root@mariadb-01 ~]# dnf update -y;
Last metadata expiration check: 0:00:44 ago on Tue 11 Aug 2020 11:44:19 AM CEST.
Dependencies resolved.
====================================================================================================================================================
Package                        Architecture              Version                                  Repository             Size
====================================================================================================================================================
Installing:
Installed:
kernel-4.18.0-193.14.2.el8_2.x86_64 kernel-core-4.18.0-193.14.2.el8_2.x86_64 kernel-modules-4.18.0-193.14.2.el8_2.x86_64 grub2-tools-efi-1:2.02-87.el8_2.x86_64 libzstd-1.4.2-2.el8.x86_64 python3-nftables-1:0.9.3-12.el8.x86_64

Removed:
dnf-plugin-spacewalk-2.8.5-11.module_el8.1.0+211+ad6c0bc7.noarch python3-dnf-plugin-spacewalk-2.8.5-11.module_el8.1.0+211+ad6c0bc7.noarch python3-newt-0.52.20-9.el8.x86_64 python3-rhn-client-tools-2.8.16-13.module_el8.1.0+211+ad6c0bc7.x86_64
rhn-client-tools-2.8.16-13.module_el8.1.0+211+ad6c0bc7.x86_64

Complete!

To customize your own MariaDB YUM repository, go to https://mariadb.org, click on Download, then on the MariaDB Repositories link.
Just choose your distribution and MariaDB Server version. It will then generate the content needed.

You just have to copy and paste it into a file under /etc/yum.repos.d/ called mariadb.repo
[root@CentOs8 yum.repos.d]# vi /etc/yum.repos.d/mariadb.repo
# MariaDB 10.5 [Stable] CentOS repository list - created 2020-08-11 13:57 UTC
# https://mariadb.org/download-test/
[mariadb]
name = MariaDB
baseurl = http://mirror.mva-n.net/mariadb/yum/10.5/centos8-amd64
module_hotfixes=1
gpgkey=http://mirror.mva-n.net/mariadb/yum/RPM-GPG-KEY-MariaDB
gpgcheck=1

After the file is in place, we download and cache metadata for all known repos.

[root@CentOs8 ~]#dnf makecache
CentOS-8 - AppStream 9.9 kB/s | 4.3 kB 00:00
CentOS-8 - Base 27 kB/s | 3.9 kB 00:00
CentOS-8 - Extras 10 kB/s | 1.5 kB 00:00
MariaDB

Installation

[root@CentOs8 ~]# dnf search MariaDB-server
Last metadata expiration check: 0:11:48 ago on Tue 11 Aug 2020 04:07:47 PM CEST.
================================================= Name Exactly Matched: MariaDB-server =================================================
MariaDB-server.x86_64 : MariaDB: a very fast and robust SQL database server     
================================================= Name & Summary Matched: MariaDB-server =============================================== 
MariaDB-server-debuginfo.x86_64 : Debug information for package MariaDB-server
================================================= Name Matched: MariaDB-server =========================================================
mariadb-server.x86_64 : The MariaDB server and related files
mariadb-server-utils.x86_64 : Non-essential server utilities for MariaDB/MySQL applications
mariadb-server-galera.x86_64 : The configuration files and scripts for galera replication

[root@CentOs8 ~]# dnf install mariadb-server
Verifying        : MariaDB-client-10.5.5-1.el8.x86_64    50/54                                                                                                                                                                                                 
Verifying        : MariaDB-common-10.5.5-1.el8.x86_64    51/54                                                                                                                                                                                                              
Verifying        : MariaDB-server-10.5.5-1.el8.x86_64    52/54                                                                                                                                                                                                              
Verifying        : MariaDB-shared-10.5.5-1.el8.x86_64    53/54                                                                                                                                                                                                             
Verifying        : galera-4-26.4.5-1.el8.x86_64

The installation is successful. We can now enable and start the MariaDB service and check the version:

[root@CentOs8 ~]# systemctl enable --now mariadb.service
Created symlink /etc/systemd/system/multi-user.target.wants/mariadb.service → /usr/lib/systemd/system/mariadb.service.
[root@CentOs8 ~]# sudo systemctl start mariadb
[root@CentOs8 ~]# systemctl status mariadb.service
● mariadb.service - MariaDB 10.5.5 database server
   Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/mariadb.service.d
           └─migrated-from-my.cnf-settings.conf
   Active: active (running) since Tue 2020-08-11 16:27:38 CEST; 18s ago
[root@CentOs8 ~]# mariadb -V
mariadb  Ver 15.1 Distrib 10.5.5-MariaDB, for Linux (x86_64) using readline 5.1

We now secure the MariaDB instance:

[root@CentOs8 ~]# mariadb-secure-installation
NOTE: RUNNING ALL PARTS OF THIS SCRIPT IS RECOMMENDED FOR ALL MariaDB
      SERVERS IN PRODUCTION USE!  PLEASE READ EACH STEP CAREFULLY!
In order to log into MariaDB to secure it, we'll need the current
password for the root user. If you've just installed MariaDB, and
haven't set the root password yet, you should just press enter here.
Enter current password for root (enter for none):
Setting the root password or using the unix_socket ensures that nobody
can log into the MariaDB root user without the proper authorisation.
You already have your root account protected, so you can safely answer 'n'.
Switch to unix_socket authentication [Y/n] n
 ... skipping.
 Remove anonymous users? [Y/n] Y
  ... Success!
Disallow root login remotely? [Y/n] Y
 ... Success!
 Remove test database and access to it? [Y/n] Y
  - Dropping test database...
  ... Success!
  - Removing privileges on test database...
  ... Success!
  Reload privilege tables now? [Y/n] Y
   ... Success!
  Cleaning up...
  All done!  If you've completed all of the above steps, your MariaDB
  installation should now be secure.
  Thanks for using MariaDB!

We now need a mariadb OS user and group:

[root@CentOs8 ~]# groupadd mariadb
[root@CentOs8 ~]# useradd -d /home/mariadb -m -g mariadb mariadb 
[root@CentOs8 ~]# passwd mariadb

We connect to the MariaDB instance as root and create the user mariadb, authenticated via the OS (unix_socket):

[mysql@CentOs8 ~]$ mariadb
  Welcome to the MariaDB monitor.  Commands end with ; or \g.
  Your MariaDB connection id is 4
  Server version: 10.5.5-MariaDB MariaDB Server
  Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
  Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
  MariaDB [(none)]> CREATE USER mariadb@localhost IDENTIFIED VIA unix_socket;
  Query OK, 0 rows affected (0.002 sec)
  MariaDB [(none)]> flush privileges;
  Query OK, 0 rows affected (0.000 sec)
  MariaDB [(none)]> select user,host,password from mysql.user;
  +-------------+-----------+----------+
  | User        | Host      | Password |
  +-------------+-----------+----------+
  | mariadb.sys | localhost |          |
  | root        | localhost | invalid  |
  | mysql       | localhost | invalid  |
  | mariadb     | localhost |          |
  +-------------+-----------+----------+

We switch to the mariadb user and try the connection:

[root@CentOs8 ~]# su - mariadb
Last login: Tue Aug 11 17:19:09 CEST 2020 on pts/0
[mariadb@CentOs8 ~]$ mariadb
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 22
Server version: 10.5.5-MariaDB MariaDB Server
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
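Back at the OS level, note that all the classic client utilities also follow the new naming, prefixed with mariadb. For example (run here as the OS root user, which is authenticated through unix_socket):


[root@CentOs8 ~]# mariadb-admin version
[root@CentOs8 ~]# mariadb-dump --version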

Conclusion

As you can see, the installation on CentOS 8 is really fast and easy.
Using the mariadb user and binaries instead of mysql’s allows you to clearly differentiate it from MySQL and avoid confusion,
since they are now two different products.

Cet article Installing MariaDB Server 10.5.5 on CentOS 8 est apparu en premier sur Blog dbi services.

AWS DynamoDB: the cost of indexes


By Franck Pachot

.
This is common to any data structure, whether in an RDBMS or NoSQL: indexes are good to accelerate reads but they slow down writes. This post explains the consequences of adding indexes in DynamoDB.

Secondary Indexes

What we call an index in DynamoDB is different from an index in an RDBMS. They have the same goal: store your data with some redundancy in order to have it physically partitioned, sorted, and clustered differently than the table, to optimize the performance of specific access patterns. It can be full redundancy (covering indexes), so that there is no need to look at the table, or partial redundancy (only the key values and the values sufficient for accessing the table efficiently). The indexes are maintained automatically: when the table is updated, the index entries are maintained by the database engine. This can be synchronous, or asynchronous if eventual consistency is accepted. The major difference is that a relational database separates the logical and physical implementation (Codd Rule 8: Physical Data Independence) for better agility: there is no change to make in the application code to access through one index or another. An RDBMS has an optimizer (query planner) that selects the best access path for the query predicates. That was the topic of the previous post. But following the NoSQL spirit, AWS DynamoDB delegates this responsibility to the application code: an index will be used only when you explicitly query it.

Because DynamoDB tables are physically organized by the primary key (hash partitioning with local index when a sort key is defined) this KeySchema can be considered the primary index. Then any additional index is a secondary index. It can be local, prefixed by the hash key, or global, prefixed by another hash key than the table.

Table with no indexes


aws dynamodb create-table --table-name Demo \
 --billing-mode PROVISIONED --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5 \
 --attribute-definitions AttributeName=P,AttributeType=S AttributeName=S,AttributeType=S \
 --key-schema AttributeName=P,KeyType=HASH AttributeName=S,KeyType=RANGE

I have created a HASH/RANGE table that is perfect to access with a single value for the attribute P, the partition key, and a single value or a range for S, the sort key.


{
    "TableDescription": {
        "TableArn": "arn:aws:dynamodb:eu-central-1:802756008554:table/Demo",
        "AttributeDefinitions": [
            {
                "AttributeName": "P",
                "AttributeType": "S"
            },
            {
                "AttributeName": "S",
                "AttributeType": "S"
            }
        ],
        "ProvisionedThroughput": {
            "NumberOfDecreasesToday": 0,
            "WriteCapacityUnits": 5,
            "ReadCapacityUnits": 5
        },
        "TableSizeBytes": 0,
        "TableName": "Demo",
        "TableStatus": "CREATING",
        "TableId": "b2a97f98-611d-451d-99ee-c3aab1129b30",
        "KeySchema": [
            {
                "KeyType": "HASH",
                "AttributeName": "P"
            },
            {
                "KeyType": "RANGE",
                "AttributeName": "S"
            }
        ],
        "ItemCount": 0,
        "CreationDateTime": 1597052214.276
    }
}

This is the output of the create-table command. I use a small reserved capacity in my blog posts so that you can run the tests on the AWS Free Tier without risk. I look at the metrics for better understanding, not at the response time, which depends on many other factors (network latency, bursting, throttling, …). But of course, you would get the same on larger data sets.

In this table, I’ll insert items by batch from the following JSON (values are randomly generated for each call):


{
 "Demo": [
  {"PutRequest":{"Item":{"P":{"S":"4707"},"S":{"S":"23535"},"A0":{"S":"18781"}
,"A01":{"S":"10065"} ,"A02":{"S":"2614"} ,"A03":{"S":"7777"} ,"A04":{"S":"19950"} ,"A05":{"S":"30864"} ,"A06":{"S":"24176"} ,"A07":{"S":"22257"} ,"A08":{"S":"11549"} ,"A09":{"S":"28368"} ,"A10":{"S":"29095"} ,"A11":{"S":"23060"} ,"A12":{
"S":"3321"} ,"A13":{"S":"30588"} ,"A14":{"S":"16039"} ,"A15":{"S":"31388"} ,"A16":{"S":"21811"} ,"A17":{"S":"10593"} ,"A18":{"S":"18914"} ,"A19":{"S":"23120"} ,"A20":{"S":"25238"} }}},
  {"PutRequest":{"Item":{"P":{"S":"4106"},"S":{"S":"15829"},"A0":{"S":"28144"}
,"A01":{"S":"9051"} ,"A02":{"S":"26834"} ,"A03":{"S":"1614"} ,"A04":{"S":"6458"} ,"A05":{"S":"1721"} ,"A06":{"S":"8022"} ,"A07":{"S":"49"} ,"A08":{"S":"23158"} ,"A09":{"S":"6588"} ,"A10":{"S":"17560"} ,"A11":{"S":"4330"} ,"A12":{"S":"175
78"} ,"A13":{"S":"8548"} ,"A14":{"S":"57"} ,"A15":{"S":"27601"} ,"A16":{"S":"8766"} ,"A17":{"S":"24400"} ,"A18":{"S":"18881"} ,"A19":{"S":"28418"} ,"A20":{"S":"14915"} }}},
... 
  {"PutRequest":{"Item":{"P":{"S":"27274"},"S":{"S":"8548"},"A0":{"S":"11557"}
,"A01":{"S":"28758"} ,"A02":{"S":"17212"} ,"A03":{"S":"17658"} ,"A04":{"S":"10456"} ,"A05":{"S":"8488"} ,"A06":{"S":"28852"} ,"A07":{"S":"22763"} ,"A08":{"S":"21667"} ,"A09":{"S":"15240"} ,"A10":{"S":"12092"} ,"A11":{"S":"25045"} ,"A12":{"S":"9156"} ,"A13":{"S":"27596"} ,"A14":{"S":"27305"} ,"A15":{"S":"22214"} ,"A16":{"S":"13384"} ,"A17":{"S":"12300"} ,"A18":{"S":"12913"} ,"A19":{"S":"20121"} ,"A20":{"S":"20224"} }}}
 ]
}

In addition to the primary key, I have attributes from A0 to A20. I put 25 items per call (that’s the maximum for a DynamoDB batch write), and my goal is to have many attributes that I can index later.


aws dynamodb batch-write-item --request-items file://batch-write.json \
 --return-consumed-capacity INDEXES --return-item-collection-metrics SIZE

This is the simple call for this batch insert, returning the consumed capacity on table and indexes:


aws dynamodb batch-write-item --request-items file://batch-write139.json --return-consumed-capacity INDEXES --return-item-collection-metrics SIZE
...
{
    "UnprocessedItems": {},
    "ItemCollectionMetrics": {},
    "ConsumedCapacity": [
        {
            "CapacityUnits": 25.0,
            "TableName": "Demo",
            "Table": {
                "CapacityUnits": 25.0
            }
        }
    ]
}

25 Write Capacity Units for 25 items: each item smaller than 1KB consumes 1 WCU. My items are about 170 bytes on average here, so each fits in 1 WCU. And batching doesn’t help there: it is batched for the network call only, but the items all go to different places and then require one WCU each. There is nothing like preparing full blocks with many items (like RDBMS direct-path inserts, fast load, or insert append…). DynamoDB is there to scale small transactions by scattering data through multiple partitions.
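As a rule of thumb you can estimate this write cost yourself. Here is a very rough sketch in Python (it assumes standard, non-transactional writes and an ALL projection, so the full item size counts once for the table and once per index):


import math

def estimated_wcu(item_size_bytes, indexes=0):
    # 1 WCU per started KB, for the table and for each index storing the item
    per_structure = math.ceil(item_size_bytes / 1024)
    return per_structure * (1 + indexes)

print(estimated_wcu(170))      # 1  -> table only
print(estimated_wcu(170, 5))   # 6  -> table + 5 local indexes
print(estimated_wcu(170, 20))  # 21 -> table + 20 global indexes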

Table with 5 local indexes

Here is the same create statement but with 5 Local Secondary Indexes declared:


aws dynamodb create-table --table-name Demo --billing-mode PROVISIONED --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5 \
 --attribute-definitions AttributeName=P,AttributeType=S AttributeName=S,AttributeType=S \
  AttributeName=A01,AttributeType=S AttributeName=A02,AttributeType=S AttributeName=A03,AttributeType=S AttributeName=A04,AttributeType=S AttributeName=A05,AttributeType=S \
 --key-schema AttributeName=P,KeyType=HASH AttributeName=S,KeyType=RANGE \
 --local-secondary-indexes \
  'IndexName=LSI01,KeySchema=[{AttributeName=P,KeyType=HASH},{AttributeName=A01,KeyType=RANGE}],Projection={ProjectionType=ALL}' \
  'IndexName=LSI02,KeySchema=[{AttributeName=P,KeyType=HASH},{AttributeName=A02,KeyType=RANGE}],Projection={ProjectionType=ALL}' \
  'IndexName=LSI03,KeySchema=[{AttributeName=P,KeyType=HASH},{AttributeName=A03,KeyType=RANGE}],Projection={ProjectionType=ALL}' \
  'IndexName=LSI04,KeySchema=[{AttributeName=P,KeyType=HASH},{AttributeName=A04,KeyType=RANGE}],Projection={ProjectionType=ALL}' \
  'IndexName=LSI05,KeySchema=[{AttributeName=P,KeyType=HASH},{AttributeName=A05,KeyType=RANGE}],Projection={ProjectionType=ALL}'

This recreates the table, adding the definitions for 5 local indexes on the same partition key but with different sort keys. I had to add the attribute definitions for them as I reference them in the index definitions.


...
        },
        "TableSizeBytes": 0,
        "TableName": "Demo",
        "TableStatus": "CREATING",
        "TableId": "84fc745b-66c5-4c75-bcf4-7686b2daeacb",
        "KeySchema": [
            {
                "KeyType": "HASH",
                "AttributeName": "P"
            },
            {
                "KeyType": "RANGE",
                "AttributeName": "S"
            }
        ],
        "ItemCount": 0,
        "CreationDateTime": 1597054018.546
    }
}

The hash partition size in DynamoDB is fixed at 10GB, and because the local indexes are stored within each partition, the total size of an item plus all its index entries cannot exceed this limit. Here, I’m far from the limit, which will often be the case: if your key-value store is a document store, you will not project the document into all local indexes. In that case, use KEYS_ONLY for the projection type rather than the ALL one I used here. And anyway, 5 local indexes is the maximum you can create in DynamoDB.
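For example, limiting the projection of an LSI to the keys only uses the same syntax as above, just changing the projection type (a sketch to adapt to your own index definition):


  'IndexName=LSI01,KeySchema=[{AttributeName=P,KeyType=HASH},{AttributeName=A01,KeyType=RANGE}],Projection={ProjectionType=KEYS_ONLY}'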


aws dynamodb batch-write-item --request-items file://batch-write139.json --return-consumed-capacity INDEXES --return-item-collection-metrics SIZE
...
   "ConsumedCapacity": [
        {
            "CapacityUnits": 150.0,
            "TableName": "Demo",
            "LocalSecondaryIndexes": {
                "LSI01": {
                    "CapacityUnits": 25.0
                },
                "LSI03": {
                    "CapacityUnits": 25.0
                },
                "LSI02": {
                    "CapacityUnits": 25.0
                },
                "LSI05": {
                    "CapacityUnits": 25.0
                },
                "LSI04": {
                    "CapacityUnits": 25.0
                }
            },
            "Table": {
                "CapacityUnits": 25.0
            }
        }
    ]
}

Here we are at 150 WCU in total: the same 25 WCU as before for the 25 items put in the table, and each local index accounts for an additional 25 WCU. Note that in a few runs of this test I’ve seen 26 WCU instead of 25 for the indexes, and I don’t really know why.

Table with 20 global indexes

Now, without any local indexes but the maximum global indexes we can have here: 20 Global Secondary Indexes (GSI)


aws dynamodb create-table --table-name Demo --billing-mode PROVISIONED --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5 \
 --attribute-definitions AttributeName=P,AttributeType=S AttributeName=S,AttributeType=S \
AttributeName=A01,AttributeType=S AttributeName=A02,AttributeType=S AttributeName=A03,AttributeType=S AttributeName=A04,AttributeType=S AttributeName=A05,AttributeType=S AttributeName=A06,AttributeType=S AttributeName=A07,AttributeType=S AttributeName=A08,AttributeType=S AttributeName=A09,AttributeType=S AttributeName=A10,AttributeType=S AttributeName=A11,AttributeType=S AttributeName=A12,AttributeType=S AttributeName=A13,AttributeType=S AttributeName=A14,AttributeType=S AttributeName=A15,AttributeType=S AttributeName=A16,AttributeType=S AttributeName=A17,AttributeType=S AttributeName=A18,AttributeType=S AttributeName=A19,AttributeType=S AttributeName=A20,AttributeType=S \
 --key-schema AttributeName=P,KeyType=HASH AttributeName=S,KeyType=RANGE \
 --global-secondary-indexes \
  'IndexName=GSI01,KeySchema=[{AttributeName=A01,KeyType=HASH},{AttributeName=S,KeyType=RANGE}],Projection={ProjectionType=ALL},ProvisionedThroughput={ReadCapacityUnits=1,WriteCapacityUnits=1}' \
  'IndexName=GSI02,KeySchema=[{AttributeName=A02,KeyType=HASH},{AttributeName=S,KeyType=RANGE}],Projection={ProjectionType=ALL},ProvisionedThroughput={ReadCapacityUnits=1,WriteCapacityUnits=1}' \
  'IndexName=GSI03,KeySchema=[{AttributeName=A03,KeyType=HASH},{AttributeName=S,KeyType=RANGE}],Projection={ProjectionType=ALL},ProvisionedThroughput={ReadCapacityUnits=1,WriteCapacityUnits=1}' \
  'IndexName=GSI04,KeySchema=[{AttributeName=A04,KeyType=HASH},{AttributeName=S,KeyType=RANGE}],Projection={ProjectionType=ALL},ProvisionedThroughput={ReadCapacityUnits=1,WriteCapacityUnits=1}' \
  'IndexName=GSI05,KeySchema=[{AttributeName=A05,KeyType=HASH},{AttributeName=S,KeyType=RANGE}],Projection={ProjectionType=ALL},ProvisionedThroughput={ReadCapacityUnits=1,WriteCapacityUnits=1}' \
  'IndexName=GSI06,KeySchema=[{AttributeName=A06,KeyType=HASH},{AttributeName=S,KeyType=RANGE}],Projection={ProjectionType=ALL},ProvisionedThroughput={ReadCapacityUnits=1,WriteCapacityUnits=1}' \
  'IndexName=GSI07,KeySchema=[{AttributeName=A07,KeyType=HASH},{AttributeName=S,KeyType=RANGE}],Projection={ProjectionType=ALL},ProvisionedThroughput={ReadCapacityUnits=1,WriteCapacityUnits=1}' \
  'IndexName=GSI08,KeySchema=[{AttributeName=A08,KeyType=HASH},{AttributeName=S,KeyType=RANGE}],Projection={ProjectionType=ALL},ProvisionedThroughput={ReadCapacityUnits=1,WriteCapacityUnits=1}' \
  'IndexName=GSI09,KeySchema=[{AttributeName=A09,KeyType=HASH},{AttributeName=S,KeyType=RANGE}],Projection={ProjectionType=ALL},ProvisionedThroughput={ReadCapacityUnits=1,WriteCapacityUnits=1}' \
  'IndexName=GSI10,KeySchema=[{AttributeName=A10,KeyType=HASH},{AttributeName=S,KeyType=RANGE}],Projection={ProjectionType=ALL},ProvisionedThroughput={ReadCapacityUnits=1,WriteCapacityUnits=1}' \
  'IndexName=GSI11,KeySchema=[{AttributeName=A11,KeyType=HASH},{AttributeName=S,KeyType=RANGE}],Projection={ProjectionType=ALL},ProvisionedThroughput={ReadCapacityUnits=1,WriteCapacityUnits=1}' \
  'IndexName=GSI12,KeySchema=[{AttributeName=A12,KeyType=HASH},{AttributeName=S,KeyType=RANGE}],Projection={ProjectionType=ALL},ProvisionedThroughput={ReadCapacityUnits=1,WriteCapacityUnits=1}' \
  'IndexName=GSI13,KeySchema=[{AttributeName=A13,KeyType=HASH},{AttributeName=S,KeyType=RANGE}],Projection={ProjectionType=ALL},ProvisionedThroughput={ReadCapacityUnits=1,WriteCapacityUnits=1}' \
  'IndexName=GSI14,KeySchema=[{AttributeName=A14,KeyType=HASH},{AttributeName=S,KeyType=RANGE}],Projection={ProjectionType=ALL},ProvisionedThroughput={ReadCapacityUnits=1,WriteCapacityUnits=1}' \
  'IndexName=GSI15,KeySchema=[{AttributeName=A15,KeyType=HASH},{AttributeName=S,KeyType=RANGE}],Projection={ProjectionType=ALL},ProvisionedThroughput={ReadCapacityUnits=1,WriteCapacityUnits=1}' \
  'IndexName=GSI16,KeySchema=[{AttributeName=A16,KeyType=HASH},{AttributeName=S,KeyType=RANGE}],Projection={ProjectionType=ALL},ProvisionedThroughput={ReadCapacityUnits=1,WriteCapacityUnits=1}' \
  'IndexName=GSI17,KeySchema=[{AttributeName=A17,KeyType=HASH},{AttributeName=S,KeyType=RANGE}],Projection={ProjectionType=ALL},ProvisionedThroughput={ReadCapacityUnits=1,WriteCapacityUnits=1}' \
  'IndexName=GSI18,KeySchema=[{AttributeName=A18,KeyType=HASH},{AttributeName=S,KeyType=RANGE}],Projection={ProjectionType=ALL},ProvisionedThroughput={ReadCapacityUnits=1,WriteCapacityUnits=1}' \
  'IndexName=GSI19,KeySchema=[{AttributeName=A19,KeyType=HASH},{AttributeName=S,KeyType=RANGE}],Projection={ProjectionType=ALL},ProvisionedThroughput={ReadCapacityUnits=1,WriteCapacityUnits=1}' \
  'IndexName=GSI20,KeySchema=[{AttributeName=A20,KeyType=HASH},{AttributeName=S,KeyType=RANGE}],Projection={ProjectionType=ALL},ProvisionedThroughput={ReadCapacityUnits=1,WriteCapacityUnits=1}'

This takes much longer to create because global indexes are actually like other tables that are maintained asynchronously (there’s only eventual consistency when you read them).


aws dynamodb batch-write-item --request-items file://batch-write139.json --return-consumed-capacity INDEXES --return-item-collection-metrics SIZE
...
{
    "UnprocessedItems": {},
    "ItemCollectionMetrics": {},
    "ConsumedCapacity": [
        {
            "CapacityUnits": 525.0,
            "GlobalSecondaryIndexes": {
                "GSI06": {
                    "CapacityUnits": 25.0
                },
                "GSI07": {
                    "CapacityUnits": 25.0
                },
                "GSI05": {
                    "CapacityUnits": 25.0
                },
 ...
                "GSI08": {
                    "CapacityUnits": 25.0
                },
                "GSI03": {
                    "CapacityUnits": 25.0
                }
            },
            "TableName": "Demo",
            "Table": {
                "CapacityUnits": 25.0
            }
        }
    ]
}

The cost is the same as with local indexes: one capacity unit per item per index in addition to the table.
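And remember that, contrary to an RDBMS, those indexes are read only when you name them explicitly. As a sketch (the key value is just one from the sample batch above), a query through one of the global indexes looks like this:


aws dynamodb query --table-name Demo --index-name GSI01 \
 --key-condition-expression "A01 = :v" \
 --expression-attribute-values '{":v":{"S":"9051"}}' \
 --return-consumed-capacity INDEXES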

So this post is simply there to draw your attention to the fact that adding indexes slows down writes, in NoSQL as in any database. In DynamoDB this is measured in Write Capacity Units, and you can get the whole detail (how many WCU for the table, for the LSIs and for the GSIs) with “ReturnConsumedCapacity”. What is important is that this capacity can scale: you will probably not see a difference in the response time, except of course if you go beyond the provisioned capacity, and then you can increase it (at a cost, of course). How does it scale? Because DynamoDB only allows us to do things that scale. Maintaining global indexes requires cross-node synchronization in a distributed database, and this cannot scale, so DynamoDB does it asynchronously (reads on a GSI are eventually consistent) and the number of GSIs is limited to 20. Maintaining local indexes does not involve cross-partition latency, so they are maintained synchronously, but to limit the overhead you can create at most 5 LSIs. Within those limits, local and global indexes are useful to keep item access fast (see previous posts on covering GSI and LSI).

Cet article AWS DynamoDB: the cost of indexes est apparu en premier sur Blog dbi services.


Oracle Database Appliance and CPU speed


Introduction

A complaint I have heard from customers about the ODA is the low core speed of the Intel Xeon processor embedded in the X8-2 servers: Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz. 2.30GHz only? Because of its comfortable number of cores (16 per processor), the cruise speed of each core is limited. Is it a problem compared to a home-made server with fewer cores?

Why is clock speed important?

As you may know, the faster a core runs, the less time it takes to complete a task. Single-core clock speed is still an important parameter for Oracle databases. The software architecture of Oracle is brilliant: automatic parallelism can dramatically reduce the time needed for some statements to complete, but the vast majority of them will be processed on a single thread. Regarding Standard Edition 2, parallelism does not exist in this edition, thus each statement is limited to a single thread.

Is the ODA X8-2 processor really limited to 2.3GHz?

Don’t be afraid of this low CPU speed: this is actually the lowest speed the cores are guaranteed to operate at. The speed of the cores can be increased by the system, depending on various parameters, and the fastest speed is 3.9GHz for this kind of CPU, which is nearly twice the base frequency. This Xeon processor, like most of its predecessors, features Turbo Boost technology, a kind of intelligent automatic overclocking.

Turbo boost technology?

As far as I know, the whole Xeon family has Turbo Boost technology. If you need more MHz than normal from time to time, your CPU speed can greatly increase, to something like 180% of its nominal speed, which is quite amazing. But why is this speed not the default speed of the cores? Simply because running all the cores at full speed has a thermal impact on the CPU itself and on the complete system. As a consequence, heating can exceed the cooling capacity and damage the hardware. To manage speed and thermal efficiency, Intel’s processors dynamically distribute Turbo bins, which are basically slices of MHz increase. For each CPU model, a defined number of Turbo bins is available and will be given to the cores. The rule is that each core receives the same number of Turbo bins at the same time. What’s most interesting on the ODA is that this is related to the enabled cores: the fewer cores are enabled on the CPU, the more Turbo bins are available for each single core.

Turbo bins and a limited number of cores

With a limited number of cores, the heating of your CPU will be quite low in normal conditions, and still low under heavy load because the heatsink and the fans are sized for using all the cores. As a result, most of the time the Turbo bins will be allocated to your cores, and if you’re lucky, you’ll be running at full throttle, meaning that, for example, instead of a 16-core CPU running at 2.3GHz, you’ll have a 4-core CPU running at 3.9GHz. Quite nice, isn’t it?
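If you want to observe this on your own ODA, the current core frequencies can be read from the OS while your workload is running, for example with standard Linux commands (nothing ODA-specific here):


# current frequency of each core (changes with load and Turbo boost)
grep "cpu MHz" /proc/cpuinfo

# or watch it live, refreshed every second
watch -n 1 'grep "cpu MHz" /proc/cpuinfo'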

With Enterprise Edition

One of the main features of the ODA is the ability to configure the number of cores you need, and only pay the license for these enabled cores. Most customers are only using a few cores, and that’s nice for single-threaded performance. You can expect full speed at least for 2 and 4 enabled cores.

What about Standard Edition 2?

With Standard Edition 2, you don’t need to decrease the number of cores on your server because your license is related to the socket, not to the cores. But nothing prevents you from decreasing the number of cores. There should be a limit below which fewer but faster cores will benefit all of your databases. If you only have a few databases on your ODA (let’s say fewer than 10 on an X8-2M), there is no question about decreasing the number of cores: it will most probably bring you more performance. If you have many more databases, the overall performance will probably be better with all the cores running at the lower speed.

And when using old software/hardware?

Turbo Boost was also available on the X7-2, but old software releases (18.x) do not seem to let the cores go faster than the normal speed. Maybe it’s due to the Linux version: the jump from Linux 6 to Linux 7 starting with the earlier versions of 19.x probably has something to do with that. Patching to 19.x is highly recommended on X7 for a new reason: better performance.

Conclusion

If you’re using Standard Edition 2, don’t hesitate to decrease the number of enabled cores on your ODA; it will probably bring you a nice speed bump. If you’re using Enterprise Edition and don’t plan to use all the cores on your ODA, you will benefit from very fast cores and leverage your licenses at best. Take this with a grain of salt, as it will depend on the environment, both physical and logical, and as these conclusions come from a quite limited number of systems. Definitely, with its fast NVMe disks and these Xeon CPUs, the ODA is the perfect choice for most of us.

Cet article Oracle Database Appliance and CPU speed est apparu en premier sur Blog dbi services.

Oracle ADB: rename the service_name connect_data


By Franck Pachot

.
Since Aug. 4, 2020 we have had the possibility to rename an Autonomous Database (ATP, ADW or AJD – the latest one, the JSON database) on shared Exadata infrastructure (what was called ‘serverless’ last year, which is a PDB in a public CDB). As the PDB name is internal, we reference the ADB by its database name, which is actually part of the service name.

I have an ATP database that I created in the Oracle Cloud Free Tier a few months ago.
I have downloaded the region and instance wallets to be used by client connections:


SQL> host grep _high /var/tmp/Wallet_DB202005052234_instance/tnsnames.ora

db202005052234_high = (description= (retry_count=20)(retry_delay=3)(address=(protocol=tcps)(port=1522)(host=adb.eu-frankfurt-1.oraclecloud.com))(connect_data=(service_name=jgp1nyc204pdpjc_db202005052234_high.atp.oraclecloud.com))(security=(ssl_server_cert_dn="CN=adwc.eucom-central-1.oraclecloud.com,OU=Oracle BMCS FRANKFURT,O=Oracle Corporation,L=Redwood City,ST=California,C=US")))

This is the instance wallet which references only this database (db202005052234)


SQL> host grep _high /var/tmp/Wallet_DB202005052234_region/tnsnames.ora

db202005052234_high = (description= (retry_count=20)(retry_delay=3)(address=(protocol=tcps)(port=1522)(host=adb.eu-frankfurt-1.oraclecloud.com))(connect_data=(service_name=jgp1nyc204pdpjc_db202005052234_high.atp.oraclecloud.com))(security=(ssl_server_cert_dn="CN=adwc.eucom-central-1.oraclecloud.com,OU=Oracle BMCS FRANKFURT,O=Oracle Corporation,L=Redwood City,ST=California,C=US")))
db202003061855_high = (description= (retry_count=20)(retry_delay=3)(address=(protocol=tcps)(port=1522)(host=adb.eu-frankfurt-1.oraclecloud.com))(connect_data=(service_name=jgp1nyc204pdpjc_db202003061855_high.adwc.oraclecloud.com))(security=(ssl_server_cert_dn="CN=adwc.eucom-central-1.oraclecloud.com,OU=Oracle BMCS FRANKFURT,O=Oracle Corporation,L=Redwood City,ST=California,C=US")))

This one also contains the other database service that I have in the same region.

I connect using this wallet:


SQL> connect admin/"TheAnswer:=42"@DB202005052234_tp?TNS_ADMIN=/var/tmp/Wallet_DB202005052234_instance
Connected.

SQL> select name,network_name,creation_date,pdb from v$services;

                                                          NAME                                                   NETWORK_NAME          CREATION_DATE                               PDB
______________________________________________________________ ______________________________________________________________ ______________________ _________________________________
JGP1NYC204PDPJC_DB202005052234_high.atp.oraclecloud.com        JGP1NYC204PDPJC_DB202005052234_high.atp.oraclecloud.com        2019-05-17 20:53:03    JGP1NYC204PDPJC_DB202005052234
JGP1NYC204PDPJC_DB202005052234_tpurgent.atp.oraclecloud.com    JGP1NYC204PDPJC_DB202005052234_tpurgent.atp.oraclecloud.com    2019-05-17 20:53:03    JGP1NYC204PDPJC_DB202005052234
JGP1NYC204PDPJC_DB202005052234_low.atp.oraclecloud.com         JGP1NYC204PDPJC_DB202005052234_low.atp.oraclecloud.com         2019-05-17 20:53:03    JGP1NYC204PDPJC_DB202005052234
JGP1NYC204PDPJC_DB202005052234_tp.atp.oraclecloud.com          JGP1NYC204PDPJC_DB202005052234_tp.atp.oraclecloud.com          2019-05-17 20:53:03    JGP1NYC204PDPJC_DB202005052234
jgp1nyc204pdpjc_db202005052234                                 jgp1nyc204pdpjc_db202005052234                                 2020-08-13 09:02:02    JGP1NYC204PDPJC_DB202005052234
JGP1NYC204PDPJC_DB202005052234_medium.atp.oraclecloud.com      JGP1NYC204PDPJC_DB202005052234_medium.atp.oraclecloud.com      2019-05-17 20:53:03    JGP1NYC204PDPJC_DB202005052234

Here are all the registered services: LOW/MEDIUM/HIGH/TP/TP_URGENT for my connections, plus the one named after the PDB.

Now from the Cloud Console I rename the database:

You can see that the “display name” (DB 202008131439) didn’t change but the “Database name” has been renamed (from “DB202008131439” to “FRANCK”).


SQL> select name,network_name,creation_date,pdb from v$services;

Error starting at line : 1 in command -
select name,network_name,creation_date,pdb from v$services
Error at Command Line : 1 Column : 1
Error report -
SQL Error: No more data to read from socket
SQL>

My connection has been canceled. I need to connect again.


SQL> connect admin/"TheAnswer:=42"@DB202005052234_tp?TNS_ADMIN=/var/tmp/Wallet_DB202005052234_instance
Aug 13, 2020 10:13:41 AM oracle.net.resolver.EZConnectResolver parseExtendedProperties
SEVERE: Extended settings parsing failed.
java.lang.RuntimeException: Unable to parse url "/var/tmp/Wallet_DB202005052234_instance:1521/DB202005052234_tp?TNS_ADMIN"
        at oracle.net.resolver.EZConnectResolver.parseExtendedProperties(EZConnectResolver.java:408)
        at oracle.net.resolver.EZConnectResolver.parseExtendedSettings(EZConnectResolver.java:366)
        at oracle.net.resolver.EZConnectResolver.parse(EZConnectResolver.java:171)
        at oracle.net.resolver.EZConnectResolver.(EZConnectResolver.java:130)
        at oracle.net.resolver.EZConnectResolver.newInstance(EZConnectResolver.java:139)
        at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:669)
        at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:562)
        at java.sql.DriverManager.getConnection(DriverManager.java:664)
        at java.sql.DriverManager.getConnection(DriverManager.java:208)
        at oracle.dbtools.raptor.newscriptrunner.SQLPLUS.connect(SQLPLUS.java:5324)
        at oracle.dbtools.raptor.newscriptrunner.SQLPLUS.logConnectionURL(SQLPLUS.java:5418)
        at oracle.dbtools.raptor.newscriptrunner.SQLPLUS.logConnectionURL(SQLPLUS.java:5342)
        at oracle.dbtools.raptor.newscriptrunner.SQLPLUS.getConnection(SQLPLUS.java:5154)
        at oracle.dbtools.raptor.newscriptrunner.SQLPLUS.runConnect(SQLPLUS.java:2414)
        at oracle.dbtools.raptor.newscriptrunner.SQLPLUS.run(SQLPLUS.java:220)
        at oracle.dbtools.raptor.newscriptrunner.ScriptRunner.runSQLPLUS(ScriptRunner.java:425)
        at oracle.dbtools.raptor.newscriptrunner.ScriptRunner.run(ScriptRunner.java:262)
        at oracle.dbtools.raptor.newscriptrunner.ScriptExecutor.run(ScriptExecutor.java:344)
        at oracle.dbtools.raptor.newscriptrunner.ScriptExecutor.run(ScriptExecutor.java:227)
        at oracle.dbtools.raptor.scriptrunner.cmdline.SqlCli.process(SqlCli.java:410)
        at oracle.dbtools.raptor.scriptrunner.cmdline.SqlCli.processLine(SqlCli.java:421)
        at oracle.dbtools.raptor.scriptrunner.cmdline.SqlCli.startSQLPlus(SqlCli.java:1179)
        at oracle.dbtools.raptor.scriptrunner.cmdline.SqlCli.main(SqlCli.java:502)

  USER          = admin
  URL           = jdbc:oracle:thin:@DB202005052234_tp?TNS_ADMIN=/var/tmp/Wallet_DB202005052234_instance
  Error Message = Listener refused the connection with the following error:
ORA-12514, TNS:listener does not currently know of service requested in connect descriptor
  USER          = admin
  URL           = jdbc:oracle:thin:@DB202005052234_tp?TNS_ADMIN=/var/tmp/Wallet_DB202005052234_instance:1521/DB202005052234_tp?TNS_ADMIN=/var/tmp/Wallet_DB202005052234_instance
  Error Message = IO Error: Invalid connection string format, a valid format is: "host:port:sid"

Warning: You are no longer connected to ORACLE.
SQL>

The service is not known, which makes sense because the rename of the database is actually a rename of the services.

The Oracle documentation says that we have to download the wallet again after a rename of the database. But that’s not very agile. Let’s just rename the service in the tnsnames.ora:


SQL> host sed -ie s/_db202005052234/_FRANCK/g /var/tmp/Wallet_DB202005052234_instance/tnsnames.ora

This changes only the SERVICE_NAME in the CONNECT_DATA but not the tnsnames.ora alias, so I can use the same connection string.
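You can double-check the modified entry with the same grep as at the beginning of this post:


SQL> host grep _high /var/tmp/Wallet_DB202005052234_instance/tnsnames.ora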


SQL> connect admin/"TheAnswer:=42"@DB202005052234_tp?TNS_ADMIN=/var/tmp/Wallet_DB202005052234_instance

SQL> select name,network_name,creation_date,pdb from v$services;

                                                  NAME                                           NETWORK_NAME          CREATION_DATE                       PDB
______________________________________________________ ______________________________________________________ ______________________ _________________________
JGP1NYC204PDPJC_FRANCK_high.atp.oraclecloud.com        JGP1NYC204PDPJC_FRANCK_high.atp.oraclecloud.com        2019-05-17 20:53:03    JGP1NYC204PDPJC_FRANCK
JGP1NYC204PDPJC_FRANCK_tp.atp.oraclecloud.com          JGP1NYC204PDPJC_FRANCK_tp.atp.oraclecloud.com          2019-05-17 20:53:03    JGP1NYC204PDPJC_FRANCK
JGP1NYC204PDPJC_FRANCK_medium.atp.oraclecloud.com      JGP1NYC204PDPJC_FRANCK_medium.atp.oraclecloud.com      2019-05-17 20:53:03    JGP1NYC204PDPJC_FRANCK
jgp1nyc204pdpjc_franck                                 jgp1nyc204pdpjc_franck                                 2020-08-13 10:05:58    JGP1NYC204PDPJC_FRANCK
JGP1NYC204PDPJC_FRANCK_low.atp.oraclecloud.com         JGP1NYC204PDPJC_FRANCK_low.atp.oraclecloud.com         2019-05-17 20:53:03    JGP1NYC204PDPJC_FRANCK
JGP1NYC204PDPJC_FRANCK_tpurgent.atp.oraclecloud.com    JGP1NYC204PDPJC_FRANCK_tpurgent.atp.oraclecloud.com    2019-05-17 20:53:03    JGP1NYC204PDPJC_FRANCK

Using the new SERVICE_NAME is sufficient. As you can see above, some autonomous magic remains: the new services still have the old creation date.

Note that you should follow the documentation: download the wallet again and change your connection string. There is probably a reason behind this. But autonomous or not, I like to understand what I do, and I don’t see any reason to change everything when renaming a service.

Cet article Oracle ADB: rename the service_name connect_data est apparu en premier sur Blog dbi services.

SQL Server: High SQLCONNECTIONPOOL Memory Clerk consumption


In this blog post, I will show you what I did to troubleshoot an interesting problem with Memory on SQL Server.

It all started with a job performing DBCC CHECKDB on all databases taking hours to complete instead of 10 minutes.
So the job ran outside of its maintenance window and was still running in the morning when users came back to the office. They immediately complained about poor application performance.

While running the CHECKDB we could see many sessions in a SUSPENDED state with SELECT queries waiting on “RESOURCE_SEMAPHORE”.

The instance seemed to be starving for memory.
We tried to increase "Max Server Memory" by 2GB. This solved the issue. Temporarily.
Around a week later the same issue occurred again, with users complaining of very bad performance. The CHECKDB job was running for hours again.

I did more analysis of the memory usage for this instance. Identifying the biggest memory consumers is key to proceeding with investigation.
The amount of memory allocated to each memory clerk can be found using the sys.dm_os_memory_clerks DMV.
We can notice a very high value for the Memory clerk “SQLCONNECTIONPOOL”.

The “Max Server Memory” value of this instance is configured to 14GB. So half of it is allocated to the SQLCONNECTIONPOOL Memory Clerk. This is obviously not a normal situation.

We decided to perform a failover of the Availability Group to clear all the memory on the instance and be able to run the CHECKDB.
From there I created a job monitoring the memory clerk usage. The query is from one of Glenn Berry’s diagnostic queries.
Here is the SQL:

-- The monitoring Table
use dbi_tools
go
create table monitoring.memoryClerkType (
	mct_id int identity not null primary key
	, mct_logdate datetime2 default getdate()
	, mct_MemoryClerkType nvarchar(256)	not null
	, mct_memoryUsageMB DECIMAL(15,2)
);

-- The query inside an SQL Server Agent Job
insert into dbi_tools.monitoring.memoryClerkType(mct_MemoryClerkType, mct_memoryUsageMB)
	SELECT TOP(10) mc.[type] AS [Memory Clerk Type], 
		   CAST((SUM(mc.pages_kb)/1024.0) AS DECIMAL (15,2)) AS [Memory Usage (MB)] 
	FROM sys.dm_os_memory_clerks AS mc WITH (NOLOCK)
	GROUP BY mc.[type]  
	ORDER BY SUM(mc.pages_kb) DESC OPTION (RECOMPILE);

This allowed me to confirm that the Memory allocated to SQLCONNECTIONPOOL is increasing over time.
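The trend can then be followed with a simple query on this monitoring table, for example (the clerk type is stored as returned by the DMV, so I filter with a LIKE to match whatever exact name appears in your data):


SELECT CAST(mct_logdate AS date) AS LogDay
     , MAX(mct_memoryUsageMB)    AS MaxMemoryUsageMB
FROM dbi_tools.monitoring.memoryClerkType
WHERE mct_MemoryClerkType LIKE '%SQLCONNECTIONPOOL%'
GROUP BY CAST(mct_logdate AS date)
ORDER BY LogDay;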

This causes internal memory pressure on the instance. As more and more memory is allocated to this memory clerk, the memory available for the Buffer Pool slowly decreases. Any event that requires a lot of memory, like a DBCC CHECKDB, will flush memory from the Buffer Pool, but the memory allocated to SQLCONNECTIONPOOL is not flushed. This is a real issue.

This issue is described in an SQL Server CAT article: Watch out those prepared SQL statements

Basically, some application server is calling the sp_prepare system procedure through an ODBC driver and does not call sp_unprepare.
This seems to be a bug in the ODBC driver. There’s not much to do on the SQL Server side.
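A related symptom you can check on your own instance (an additional check, not part of the original troubleshooting) is the weight of ‘Prepared’ plans in the plan cache:


SELECT objtype, COUNT(*) AS plans,
       SUM(CAST(size_in_bytes AS bigint))/1024/1024 AS size_mb
FROM sys.dm_exec_cached_plans
GROUP BY objtype
ORDER BY size_mb DESC;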

My instance has dozens of databases for different applications. I need to identify which database, and so which application server, needs an ODBC driver update.
To do so I created an Extended Events session. There are events for sp_prepare and sp_unprepare in the “execution” category.

Here is the T-SQL for this XE session:

CREATE EVENT SESSION [dbi_sna_memory] ON SERVER 
ADD EVENT sqlserver.prepare_sql(
    ACTION(sqlserver.client_app_name,sqlserver.client_hostname
		,sqlserver.database_name,sqlserver.session_id,sqlserver.username)),
ADD EVENT sqlserver.unprepare_sql(
    ACTION(sqlserver.client_app_name,sqlserver.client_hostname
		,sqlserver.database_name,sqlserver.session_id,sqlserver.username))
ADD TARGET package0.event_file(SET filename=N'dbi_sna_memory',max_file_size=(500))
WITH (STARTUP_STATE=ON)
GO

Looking at the live data I could see a lot of sp_prepare without any sp_unprepare for one of the databases.

Sessions to the database shown in blue don’t seem to call the sp_unprepare system procedure.
Looking at the Extended Events data with SQL gives a better view of the situation.

Based on a sample of approximately 1 million events, it’s obvious that the first database is the one issuing the most sp_prepare events. It has only 1% of sp_unprepare calls, which is clearly abnormal compared to the other databases, where the expected value is 99%-100%.
Just for information here is the query I did to get the result above:

select 
	SUM(IIF(eventName='prepare_sql', 1, 0)) AS Prepare
	, SUM(IIF(eventName='unprepare_sql', 1, 0)) AS Unprepare
	, ROUND(CAST(SUM(IIF(eventName='unprepare_sql', 1, 0)) AS FLOAT)/SUM(IIF(eventName='prepare_sql', 1, 0))*100, 2) AS unPreparePct
	, dbName
from dbi_tools.dbo.xe_Memory_data -- Table with data from the XE session file
group by dbName
order by 1 desc
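
For completeness, the xe_Memory_data table referenced above can be populated from the session’s .xel files with a query along these lines (a sketch; the file path pattern and target table name are assumptions based on the session definition above):

SELECT
	x.event_data.value('(event/@name)[1]', 'nvarchar(60)') AS eventName
	, x.event_data.value('(event/action[@name="database_name"]/value)[1]', 'nvarchar(128)') AS dbName
INTO dbi_tools.dbo.xe_Memory_data
FROM (
	SELECT CAST(event_data AS xml) AS event_data
	FROM sys.fn_xe_file_target_read_file(N'dbi_sna_memory*.xel', NULL, NULL, NULL)
) AS x;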

The culprit is now identified. We can update the client layer on the application server and look for improvement in memory usage.
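
Once the analysis is done, the Extended Event session can be stopped and dropped so it does not keep writing event files (a short cleanup sketch):

ALTER EVENT SESSION [dbi_sna_memory] ON SERVER STATE = STOP;
DROP EVENT SESSION [dbi_sna_memory] ON SERVER;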

The article SQL Server: High SQLCONNECTIONPOOL Memory Clerk consumption first appeared on Blog dbi services.

Installing MySQL Server on Oracle Cloud Infrastructure Compute


If you are thinking about moving your MySQL databases to the Cloud but you are still hesitant, you can use the “Oracle Cloud Free Tier” offer to test it.

Oracle Cloud Free Tier offers you 2 Oracle Autonomous Databases and 2 Oracle Cloud Infrastructure Compute VMs as Always Free services, plus a 30-day Free Trial with US$300 in free credits.
On these VM instances (provisioned and managed by Oracle Cloud Infrastructure Compute) you can install your MySQL server. Let’s see how…

Oracle Cloud account creation

Connect to the Oracle Cloud page and fill in your credentials:

Define your account type, your Cloud account name, your region and some other details:

Enter your Cloud account password:

Last step is to insert your payment information (this will be just used for verification, you won’t have to pay anything):

And your account will be created:

Instance creation

Some minutes after the account creation, you can sign in to your Oracle Cloud environment:

Once connected, you can directly see that you are in a free trial and that you can upgrade to a paid account at any time:

You can now create your VM instance in the “Quick Actions” section in the main page or under “Menu > Compute > Instances > Create Instance”. For example:

You can define the instance name, which image you want to use (Oracle Linux, CentOS, …) as operating system and some other network and storage options:

You also need to define the keys to connect to the VM. If you don’t have them yet, you can generate RSA keys via PuTTYgen for example. You just have to generate the key, assign it a passphrase and save your public and private keys:

You can now add your public key in the “Add SSH keys” section:

and finally launch the instance creation.
The VM creation is in progress in the region that you selected (Zürich in my case):

You can check the status of the operation in the “Work Requests” section:

The instance has been provisioned (it took 2 minutes in my case) and is now running; you can get the public IP address of the new VM:

Connection to the new VM instance

You can now connect to the new VM via PuTTY, using the given IP address and adding your private key for the authentication in the PuTTY configuration under “Connection > SSH > Auth”:

After entering your passphrase, you are now connected to your VM:

login as: opc
Authenticating with public key "rsa-key-20200813"
Passphrase for key "rsa-key-20200813":
Last login: Fri Aug 14 06:25:38 2020 from 213.55.220.190
[opc@instance-20200730-1509 ~]$

MySQL Server installation

You can now start the installation of MySQL Server.
You have to pay attention to the fact that on Oracle Linux the default RDBMS that will be installed is MariaDB, and we don’t want that:

$ sudo yum install mysql
Loaded plugins: langpacks, ulninfo
ol7_UEKR5                                                                                                                                                             | 2.8 kB  00:00:00
ol7_addons                                                                                                                                                            | 2.8 kB  00:00:00
ol7_developer                                                                                                                                                         | 2.8 kB  00:00:00
ol7_developer_EPEL                                                                                                                                                    | 3.4 kB  00:00:00
ol7_ksplice                                                                                                                                                           | 2.8 kB  00:00:00
ol7_latest                                                                                                                                                            | 3.4 kB  00:00:00
ol7_oci_included                                                                                                                                                      | 2.9 kB  00:00:00
ol7_optional_latest                                                                                                                                                   | 2.8 kB  00:00:00
ol7_software_collections                                                                                                                                              | 2.8 kB  00:00:00
(1/19): ol7_UEKR5/x86_64/updateinfo                                                                                                                                   |  70 kB  00:00:00
(2/19): ol7_developer/x86_64/primary_db                                                                                                                               | 560 kB  00:00:00
(3/19): ol7_addons/x86_64/updateinfo                                                                                                                                  |  91 kB  00:00:00
(4/19): ol7_addons/x86_64/primary_db                                                                                                                                  | 157 kB  00:00:00
(5/19): ol7_UEKR5/x86_64/primary_db                                                                                                                                   | 8.4 MB  00:00:00
(6/19): ol7_developer_EPEL/x86_64/group_gz                                                                                                                            |  87 kB  00:00:00
(7/19): ol7_developer/x86_64/updateinfo                                                                                                                               | 7.2 kB  00:00:00
(8/19): ol7_latest/x86_64/group_gz                                                                                                                                    | 134 kB  00:00:00
(9/19): ol7_latest/x86_64/updateinfo                                                                                                                                  | 2.9 MB  00:00:00
(10/19): ol7_developer_EPEL/x86_64/updateinfo                                                                                                                         | 6.3 kB  00:00:00
(11/19): ol7_oci_included/x86_64/primary_db                                                                                                                           | 260 kB  00:00:00
(12/19): ol7_optional_latest/x86_64/updateinfo                                                                                                                        | 1.0 MB  00:00:00
(13/19): ol7_developer_EPEL/x86_64/primary_db                                                                                                                         |  12 MB  00:00:00
(14/19): ol7_software_collections/x86_64/updateinfo                                                                                                                   | 8.7 kB  00:00:00
(15/19): ol7_software_collections/x86_64/primary_db                                                                                                                   | 5.1 MB  00:00:00
(16/19): ol7_optional_latest/x86_64/primary_db                                                                                                                        | 4.8 MB  00:00:00
(17/19): ol7_ksplice/primary_db                                                                                                                                       | 1.0 MB  00:00:01
(18/19): ol7_latest/x86_64/primary_db                                                                                                                                 |  24 MB  00:00:01
(19/19): ol7_ksplice/updateinfo                                                                                                                                       | 5.3 kB  00:00:01
Resolving Dependencies
--> Running transaction check
---> Package mariadb.x86_64 1:5.5.65-1.el7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

=============================================================================================================================================================================================
 Package                                    Arch                                      Version                                            Repository                                     Size
=============================================================================================================================================================================================
Installing:
 mariadb                                    x86_64                                    1:5.5.65-1.el7                                     ol7_latest                                    8.7 M

Transaction Summary
=============================================================================================================================================================================================
Install  1 Package

Total download size: 8.7 M
Installed size: 49 M
Is this ok [y/d/N]: N
Exiting on user command
Your transaction was saved, rerun it with:
 yum load-transaction /tmp/yum_save_tx.2020-07-30.15-02.2RFWFI.yumtx

$ yum list installed | grep -i -e maria
mariadb-libs.x86_64                   1:5.5.65-1.el7              @anaconda/7.8
$ sudo yum remove mariadb-libs.x86_64
Loaded plugins: langpacks, ulninfo
Resolving Dependencies
--> Running transaction check
---> Package mariadb-libs.x86_64 1:5.5.65-1.el7 will be erased
--> Processing Dependency: libmysqlclient.so.18()(64bit) for package: 2:postfix-2.10.1-9.el7.x86_64
--> Processing Dependency: libmysqlclient.so.18(libmysqlclient_18)(64bit) for package: 2:postfix-2.10.1-9.el7.x86_64
--> Running transaction check
---> Package postfix.x86_64 2:2.10.1-9.el7 will be erased
--> Finished Dependency Resolution

Dependencies Resolved

=============================================================================================================================================================================================
 Package                                       Arch                                    Version                                          Repository                                      Size
=============================================================================================================================================================================================
Removing:
 mariadb-libs                                  x86_64                                  1:5.5.65-1.el7                                   @anaconda/7.8                                  4.4 M
Removing for dependencies:
 postfix                                       x86_64                                  2:2.10.1-9.el7                                   @anaconda/7.8                                   12 M

Transaction Summary
=============================================================================================================================================================================================
Remove  1 Package (+1 Dependent package)

Installed size: 17 M
Is this ok [y/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Erasing    : 2:postfix-2.10.1-9.el7.x86_64                                                                                                                                             1/2
warning: /etc/postfix/main.cf saved as /etc/postfix/main.cf.rpmsave
  Erasing    : 1:mariadb-libs-5.5.65-1.el7.x86_64                                                                                                                                        2/2
  Verifying  : 1:mariadb-libs-5.5.65-1.el7.x86_64                                                                                                                                        1/2
  Verifying  : 2:postfix-2.10.1-9.el7.x86_64                                                                                                                                             2/2

Removed:
  mariadb-libs.x86_64 1:5.5.65-1.el7

Dependency Removed:
  postfix.x86_64 2:2.10.1-9.el7

Complete!

So you need to specify the exact MySQL packages that you want to install (in my case I will use the MySQL 8.0 Community Edition):

$ sudo yum install https://dev.mysql.com/get/mysql-community-common-8.0.21-1.el7.x86_64.rpm
Loaded plugins: langpacks, ulninfo
mysql-community-common-8.0.21-1.el7.x86_64.rpm                                                                                                                        | 617 kB  00:00:00
Examining /var/tmp/yum-root-c8T9EW/mysql-community-common-8.0.21-1.el7.x86_64.rpm: mysql-community-common-8.0.21-1.el7.x86_64
Marking /var/tmp/yum-root-c8T9EW/mysql-community-common-8.0.21-1.el7.x86_64.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package mysql-community-common.x86_64 0:8.0.21-1.el7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

=============================================================================================================================================================================================
 Package                                       Arch                          Version                                Repository                                                          Size
=============================================================================================================================================================================================
Installing:
 mysql-community-common                        x86_64                        8.0.21-1.el7                           /mysql-community-common-8.0.21-1.el7.x86_64                        8.8 M

Transaction Summary
=============================================================================================================================================================================================
Install  1 Package

Total size: 8.8 M
Installed size: 8.8 M
Is this ok [y/d/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : mysql-community-common-8.0.21-1.el7.x86_64                                                                                                                                1/1
  Verifying  : mysql-community-common-8.0.21-1.el7.x86_64                                                                                                                                1/1

Installed:
  mysql-community-common.x86_64 0:8.0.21-1.el7

Complete!

$ sudo yum install https://dev.mysql.com/get/mysql-community-libs-8.0.21-1.el7.x86_64.rpm
Loaded plugins: langpacks, ulninfo
mysql-community-libs-8.0.21-1.el7.x86_64.rpm                                                                                                                          | 4.5 MB  00:00:00
Examining /var/tmp/yum-root-c8T9EW/mysql-community-libs-8.0.21-1.el7.x86_64.rpm: mysql-community-libs-8.0.21-1.el7.x86_64
Marking /var/tmp/yum-root-c8T9EW/mysql-community-libs-8.0.21-1.el7.x86_64.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package mysql-community-libs.x86_64 0:8.0.21-1.el7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

=============================================================================================================================================================================================
 Package                                      Arch                           Version                                 Repository                                                         Size
=============================================================================================================================================================================================
Installing:
 mysql-community-libs                         x86_64                         8.0.21-1.el7                            /mysql-community-libs-8.0.21-1.el7.x86_64                          22 M

Transaction Summary
=============================================================================================================================================================================================
Install  1 Package

Total size: 22 M
Installed size: 22 M
Is this ok [y/d/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : mysql-community-libs-8.0.21-1.el7.x86_64                                                                                                                                  1/1
  Verifying  : mysql-community-libs-8.0.21-1.el7.x86_64                                                                                                                                  1/1

Installed:
  mysql-community-libs.x86_64 0:8.0.21-1.el7

Complete!

$ sudo yum install https://dev.mysql.com/get/mysql-community-client-8.0.21-1.el7.x86_64.rpm
Loaded plugins: langpacks, ulninfo
mysql-community-client-8.0.21-1.el7.x86_64.rpm                                                                                                                        |  48 MB  00:00:08
Examining /var/tmp/yum-root-c8T9EW/mysql-community-client-8.0.21-1.el7.x86_64.rpm: mysql-community-client-8.0.21-1.el7.x86_64
Marking /var/tmp/yum-root-c8T9EW/mysql-community-client-8.0.21-1.el7.x86_64.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package mysql-community-client.x86_64 0:8.0.21-1.el7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

=============================================================================================================================================================================================
 Package                                       Arch                          Version                                Repository                                                          Size
=============================================================================================================================================================================================
Installing:
 mysql-community-client                        x86_64                        8.0.21-1.el7                           /mysql-community-client-8.0.21-1.el7.x86_64                        231 M

Transaction Summary
=============================================================================================================================================================================================
Install  1 Package

Total size: 231 M
Installed size: 231 M
Is this ok [y/d/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : mysql-community-client-8.0.21-1.el7.x86_64                                                                                                                                1/1
  Verifying  : mysql-community-client-8.0.21-1.el7.x86_64                                                                                                                                1/1

Installed:
  mysql-community-client.x86_64 0:8.0.21-1.el7

Complete!

$ sudo yum install https://dev.mysql.com/get/mysql-community-server-8.0.21-1.el7.x86_64.rpm
Loaded plugins: langpacks, ulninfo
mysql-community-server-8.0.21-1.el7.x86_64.rpm                                                                                                                        | 499 MB  00:01:32
Examining /var/tmp/yum-root-c8T9EW/mysql-community-server-8.0.21-1.el7.x86_64.rpm: mysql-community-server-8.0.21-1.el7.x86_64
Marking /var/tmp/yum-root-c8T9EW/mysql-community-server-8.0.21-1.el7.x86_64.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package mysql-community-server.x86_64 0:8.0.21-1.el7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

=============================================================================================================================================================================================
 Package                                       Arch                          Version                                Repository                                                          Size
=============================================================================================================================================================================================
Installing:
 mysql-community-server                        x86_64                        8.0.21-1.el7                           /mysql-community-server-8.0.21-1.el7.x86_64                        2.3 G

Transaction Summary
=============================================================================================================================================================================================
Install  1 Package

Total size: 2.3 G
Installed size: 2.3 G
Is this ok [y/d/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : mysql-community-server-8.0.21-1.el7.x86_64                                                                                                                                1/1
  Verifying  : mysql-community-server-8.0.21-1.el7.x86_64                                                                                                                                1/1

Installed:
  mysql-community-server.x86_64 0:8.0.21-1.el7

Complete!

$ sudo yum install https://dev.mysql.com/get/mysql-shell-8.0.21-1.el7.x86_64.rpm
Loaded plugins: langpacks, ulninfo
mysql-shell-8.0.21-1.el7.x86_64.rpm                                                                                                                                   |  31 MB  00:00:05
Examining /var/tmp/yum-root-c8T9EW/mysql-shell-8.0.21-1.el7.x86_64.rpm: mysql-shell-8.0.21-1.el7.x86_64
Marking /var/tmp/yum-root-c8T9EW/mysql-shell-8.0.21-1.el7.x86_64.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package mysql-shell.x86_64 0:8.0.21-1.el7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

=============================================================================================================================================================================================
 Package                                  Arch                                Version                                    Repository                                                     Size
=============================================================================================================================================================================================
Installing:
 mysql-shell                              x86_64                              8.0.21-1.el7                               /mysql-shell-8.0.21-1.el7.x86_64                              106 M

Transaction Summary
=============================================================================================================================================================================================
Install  1 Package

Total size: 106 M
Installed size: 106 M
Is this ok [y/d/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : mysql-shell-8.0.21-1.el7.x86_64                                                                                                                                           1/1
  Verifying  : mysql-shell-8.0.21-1.el7.x86_64                                                                                                                                           1/1

Installed:
  mysql-shell.x86_64 0:8.0.21-1.el7

Complete!

Packages are now installed and you can start the MySQL Server service:

$ sudo service mysqld start
Redirecting to /bin/systemctl start mysqld.service

$ ps -eaf|grep mysqld
mysql      600     1 12 16:04 ?        00:00:01 /usr/sbin/mysqld
opc        656  2746  0 16:04 pts/0    00:00:00 grep --color=auto mysqld

and secure your MySQL installation:

$ sudo grep 'temporary password' /var/log/mysqld.log
2020-07-30T16:04:02.227475Z 6 [Note] [MY-010454] [Server] A temporary password is generated for root@localhost: -:NQql*u3/IS

$ sudo mysql_secure_installation

Securing the MySQL server deployment.

Enter password for user root:

The existing password for the user account root has expired. Please set a new password.

New password:

Re-enter new password:
The 'validate_password' component is installed on the server.
The subsequent steps will run with the existing configuration
of the component.
Using existing password for root.

Estimated strength of the password: 100
Change the password for root ? ((Press y|Y for Yes, any other key for No) : No

 ... skipping.
By default, a MySQL installation has an anonymous user,
allowing anyone to log into MySQL without having to have
a user account created for them. This is intended only for
testing, and to make the installation go a bit smoother.
You should remove them before moving into a production
environment.

Remove anonymous users? (Press y|Y for Yes, any other key for No) : y
Success.


Normally, root should only be allowed to connect from
'localhost'. This ensures that someone cannot guess at
the root password from the network.

Disallow root login remotely? (Press y|Y for Yes, any other key for No) : y
Success.

By default, MySQL comes with a database named 'test' that
anyone can access. This is also intended only for testing,
and should be removed before moving into a production
environment.


Remove test database and access to it? (Press y|Y for Yes, any other key for No) : y
 - Dropping test database...
Success.

 - Removing privileges on test database...
Success.

Reloading the privilege tables will ensure that all changes
made so far will take effect immediately.

Reload privilege tables now? (Press y|Y for Yes, any other key for No) : y
Success.

All done!

Everything is fine, you now have access to a MySQL Server on Oracle IaaS:

$ mysqlsh root@localhost --sql
Please provide the password for 'root@localhost': *************
Save password for 'root@localhost'? [Y]es/[N]o/Ne[v]er (default No): N
MySQL Shell 8.0.21

Copyright (c) 2016, 2020, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.

Type '\help' or '\?' for help; '\quit' to exit.
Creating a session to 'root@localhost'
Fetching schema names for autocompletion... Press ^C to stop.
Your MySQL connection id is 15 (X protocol)
Server version: 8.0.21 MySQL Community Server - GPL
No default schema selected; type \use  to set one.
 MySQL  localhost:33060+ ssl  SQL >
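
From this SQL prompt you can run your first statements, for example creating a dedicated schema and application user instead of working with root (a sketch; the database name, user name and password below are just placeholders to adapt):

-- Placeholders: adapt the names and use a password that satisfies validate_password
CREATE DATABASE appdb;
CREATE USER 'app_user'@'%' IDENTIFIED BY 'MyStr0ng_Pass!2020';
GRANT ALL PRIVILEGES ON appdb.* TO 'app_user'@'%';
SELECT VERSION();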

Conclusion

In this blog post we saw how to install MySQL Server Community Edition on an Oracle Cloud Infrastructure Compute instance using the “Oracle Cloud Free Tier” offer. This lets you practice in a Cloud environment.
The idea would be, once tested, to switch to the new MySQL Database Service (MDS) built on the latest MySQL 8.0 Enterprise Edition and powered by Oracle Gen 2 Cloud Infrastructure.
I will give you more information soon, so stay tuned as usual and enjoy MySQL 😉
by Elisa Usai

The article Installing MySQL Server on Oracle Cloud Infrastructure Compute first appeared on Blog dbi services.

Oracle Data Pump Integration for Table instantiation with Oracle Golden Gate


From Oracle GoldenGate (OGG) version 12.2 onwards, there is a transparent integration of OGG with Oracle Data Pump, as explained in Document ID 1276058.1.

The CSN for each table is captured during an Oracle Data Pump export. On import, the CSN is then applied to system tables and views on the target database. These views and system tables are referenced by Replicat when applying data to the target database.

With this 12.2 feature, administrators no longer need to know which CSN Replicat should be started with: Replicat handles it automatically when the Replicat parameter DBOPTIONS ENABLE_INSTANTIATION_FILTERING is enabled. It also avoids having to specify an individual MAP for each imported table with the @FILTER(@GETENV(‘TRANSACTION’,’CSN’)) or HANDLECOLLISIONS clause.
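
For context, with the pre-12.2 approach you first had to capture the SCN on the source yourself, typically right before (or as part of) the export, and then reuse it in the Replicat MAP ... FILTER clause, for example (a sketch):

-- SCN captured on the source, to be reused in the MAP ... @FILTER clause
-- (or passed to expdp via the FLASHBACK_SCN parameter)
SELECT current_scn FROM v$database;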

Let’s see how it works :

Create a new schema DEMO and a new table in the source database :

oracle@ora-gg-s-2: [DB1] sqlplus / as sysdba
SQL> grant create session to DEMO identified by toto;

Grant succeeded.

SQL> grant resource to DEMO;

Grant succeeded.

SQL> alter user demo quota unlimited on users;

User altered.

SQL> create table DEMO.ADDRESSES as select * from SOE.ADDRESSES;

Table created.

Stop the Extract process :

oracle@ora-gg-s-2:/u10/app/goldengate/product/19.1.0.0.4/gg_1/ [DB1] ./ggsci

Oracle GoldenGate Command Interpreter for Oracle
Version 19.1.0.0.200414 OGGCORE_19.1.0.0.0OGGBP_PLATFORMS_200427.2331_FBO
Linux, x64, 64bit (optimized), Oracle 19c on Apr 28 2020 17:41:48
Operating system character set identified as UTF-8.

Copyright (C) 1995, 2019, Oracle and/or its affiliates. All rights reserved.



GGSCI (ora-gg-s-2) 1> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     RUNNING     EXTRSOE     00:00:00      00:00:08
EXTRACT     RUNNING     PUMPSOE     00:00:00      00:00:01


GGSCI (ora-gg-s-2) 2> stop extract *

Sending STOP request to EXTRACT EXTRSOE ...
Request processed.

Sending STOP request to EXTRACT PUMPSOE ...
Request processed.

GGSCI (ora-gg-s-2) 4> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     STOPPED     EXTRSOE     00:00:00      00:00:07
EXTRACT     STOPPED     PUMPSOE     00:00:00      00:00:07


GGSCI (ora-gg-s-2) 5>

Stop the Replicat process :

GGSCI (ora-gg-t-2) 4> stop replicat replsoe

Sending STOP request to REPLICAT REPLSOE ...

GGSCI (ora-gg-t-2) 5> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
REPLICAT    STOPPED     REPLSOE     00:00:00      00:00:00


GGSCI (ora-gg-t-2) 9>

Edit the Extract and Pump parameter files and add the new table :

GGSCI (ora-gg-s-2) 1> edit params EXTRSOE
Table DEMO.ADDRESSES;
GGSCI (ora-gg-s-2) 1> edit params PUMPSOE
Table DEMO.ADDRESSES;

Add schematrandata for the schema DEMO:

GGSCI (ora-gg-s-2) 5> dblogin useridalias ggadmin
Successfully logged into database.

GGSCI (ora-gg-s-2 as ggadmin@DB1) 6> add schematrandata DEMO

2020-08-19 21:25:47  INFO    OGG-01788  SCHEMATRANDATA has been added on schema "DEMO".

2020-08-19 21:25:47  INFO    OGG-01976  SCHEMATRANDATA for scheduling columns has been added on schema "DEMO".

2020-08-19 21:25:47  INFO    OGG-10154  Schema level PREPARECSN set to mode NOWAIT on schema "DEMO".

2020-08-19 21:25:49  INFO    OGG-10471  ***** Oracle Goldengate support information on table DEMO.ADDRESSES *****
Oracle Goldengate support native capture on table DEMO.ADDRESSES.
Oracle Goldengate marked following column as key columns on table DEMO.ADDRESSES: ADDRESS_ID, CUSTOMER_ID, DATE_CREATED, HOUSE_NO_OR_NAME, STREET_NAME, TOWN, COUNTY, COUNTRY, POST_CODE, ZIP_CODE
No unique key is defined for table DEMO.ADDRESSES.

GGSCI (ora-gg-s-2 as ggadmin@DB1) 7> info schematrandata DEMO

2020-08-19 21:25:54  INFO    OGG-06480  Schema level supplemental logging, excluding non-validated keys, is enabled on schema "DEMO".

2020-08-19 21:25:54  INFO    OGG-01980  Schema level supplemental logging is enabled on schema "DEMO" for all scheduling columns.

2020-08-19 21:25:54  INFO    OGG-10462  Schema "DEMO" have 1 prepared tables for instantiation.

GGSCI (ora-gg-s-2 as ggadmin@DB1) 8>

Source system tables are automatically prepared when issuing the ADD TRANDATA / ADD SCHEMATRANDATA command.

Start and check the extract :

GGSCI (ora-gg-s-2 as ggadmin@DB1) 8> start extract *

Sending START request to MANAGER ...
EXTRACT EXTRSOE starting

Sending START request to MANAGER ...
EXTRACT PUMPSOE starting


GGSCI (ora-gg-s-2 as ggadmin@DB1) 9> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     RUNNING     EXTRSOE     00:00:00      00:19:51
EXTRACT     RUNNING     PUMPSOE     00:00:00      00:19:51


GGSCI (ora-gg-s-2 as ggadmin@DB1) 10> !
info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     RUNNING     EXTRSOE     00:00:00      00:00:00
EXTRACT     RUNNING     PUMPSOE     00:00:00      00:00:01

Let’s do an update to the source table DEMO.ADDRESSES :

oracle@ora-gg-s-2:/u10/app/goldengate/product/19.1.0.0.4/gg_1/ [DB1] sqlplus / as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Wed Aug 19 21:34:19 2020
Version 19.4.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.4.0.0.0

SQL> update DEMO.ADDRESSES set STREET_NAME= 'Demo Street is open' where ADDRESS_ID=1000;

1 row updated.

SQL> commit;

Commit complete.

Let’s export the DEMO schema :

oracle@ora-gg-s-2:/u10/app/goldengate/product/19.1.0.0.4/gg_1/ [DB1] expdp "'/ as sysdba'" dumpfile=export_tables_DEMO.dmp \
> logfile=export_tables_DEMO.log \
> schemas=demo \
>

Export: Release 19.0.0.0.0 - Production on Wed Aug 19 21:37:09 2020
Version 19.4.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.
Password:

Connected to: Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
FLASHBACK automatically enabled to preserve database integrity.
Starting "SYS"."SYS_EXPORT_SCHEMA_01":  "/******** AS SYSDBA" dumpfile=export_tables_DEMO.dmp logfile=export_tables_DEMO.log schemas=demo
Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA
Processing object type SCHEMA_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
Processing object type SCHEMA_EXPORT/STATISTICS/MARKER
Processing object type SCHEMA_EXPORT/USER
Processing object type SCHEMA_EXPORT/SYSTEM_GRANT
Processing object type SCHEMA_EXPORT/ROLE_GRANT
Processing object type SCHEMA_EXPORT/DEFAULT_ROLE
Processing object type SCHEMA_EXPORT/TABLESPACE_QUOTA
Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA
Processing object type SCHEMA_EXPORT/TABLE/PROCACT_INSTANCE
Processing object type SCHEMA_EXPORT/TABLE/TABLE
. . exported "DEMO"."ADDRESSES"                          35.24 MB  479277 rows
Master table "SYS"."SYS_EXPORT_SCHEMA_01" successfully loaded/unloaded
******************************************************************************
Dump file set for SYS.SYS_EXPORT_SCHEMA_01 is:
  /u01/app/oracle/admin/DB1/dpdump/export_tables_DEMO.dmp
Job "SYS"."SYS_EXPORT_SCHEMA_01" successfully completed at Wed Aug 19 21:37:43 2020 elapsed 0 00:00:28

oracle@ora-gg-s-2:/u10/app/goldengate/product/19.1.0.0.4/gg_1/ [DB1]

The DBA_CAPTURE_PREPARED_TABLES view does not get populated until the first export of the tables. The SCN shown is the smallest system change number (SCN) for which the table can be instantiated; it is not the export SCN.

SQL> select table_name, scn from dba_capture_prepared_tables where table_owner = 'DEMO' ;

TABLE_NAME   SCN
--------------------
ADDRESSES    2989419

Let’s copy the dump file to the target database :

oracle@ora-gg-s-2:/u10/app/goldengate/product/19.1.0.0.4/gg_1/ [DB1] scp \
> /u01/app/oracle/admin/DB1/dpdump/export_tables_DEMO.dmp \
> oracle@ora-gg-t-2:/u01/app/oracle/admin/DB2/dpdump
oracle@ora-gg-t-2's password:
export_tables_DEMO.dmp                                                            100%   36MB 120.8MB/s   00:00
oracle@ora-gg-s-2:/u10/app/goldengate/product/19.1.0.0.4/gg_1/ [DB1]

Let’s import the new table into the target database :

oracle@ora-gg-t-2:/u10/app/goldengate/product/19.1.0.0.4/gg_1/ [DB2] impdp system/manager \
> dumpfile=export_tables_DEMO.dmp \
> logfile=impdemo_tables.log \
>

Import: Release 19.0.0.0.0 - Production on Wed Aug 19 21:45:18 2020
Version 19.4.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Master table "SYSTEM"."SYS_IMPORT_FULL_01" successfully loaded/unloaded
Starting "SYSTEM"."SYS_IMPORT_FULL_01":  system/******** dumpfile=export_tables_DEMO.dmp logfile=impdemo_tables.log
Processing object type SCHEMA_EXPORT/USER
Processing object type SCHEMA_EXPORT/SYSTEM_GRANT
Processing object type SCHEMA_EXPORT/ROLE_GRANT
Processing object type SCHEMA_EXPORT/DEFAULT_ROLE
Processing object type SCHEMA_EXPORT/TABLESPACE_QUOTA
Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA
Processing object type SCHEMA_EXPORT/TABLE/PROCACT_INSTANCE
Processing object type SCHEMA_EXPORT/TABLE/TABLE
Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA
. . imported "DEMO"."ADDRESSES"                          35.24 MB  479277 rows
Processing object type SCHEMA_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
Processing object type SCHEMA_EXPORT/STATISTICS/MARKER
Job "SYSTEM"."SYS_IMPORT_FULL_01" successfully completed at Wed Aug 19 21:45:46 2020 elapsed 0 00:00:26

oracle@ora-gg-t-2:/u10/app/goldengate/product/19.1.0.0.4/gg_1/ [DB2]

The Data Pump import will populate system tables and views with the instantiation CSNs :

SQL> select source_object_name, instantiation_scn, ignore_scn from dba_apply_instantiated_objects where source_object_owner = 'DEMO' ;

SOURCE_OBJECT_NAME INSTANTIATION_SCN IGNORE_SCN
-----------------------------------------------
ADDRESSES          2995590

Let’s update the source table :

oracle@ora-gg-s-2:/u10/app/goldengate/product/19.1.0.0.4/gg_1/ [DB1] sqlplus / as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Wed Aug 19 21:48:33 2020
Version 19.4.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.4.0.0.0

SQL> update DEMO.ADDRESSES set STREET_NAME= 'Demo Street is open' where ADDRESS_ID=1001;

1 row updated.

SQL> commit;

Commit complete.

Let’s check the transactions that occurred on the source table DEMO.ADDRESSES :

oracle@ora-gg-s-2:/u10/app/goldengate/product/19.1.0.0.4/gg_1/ [DB1] ./ggsci

Oracle GoldenGate Command Interpreter for Oracle
Version 19.1.0.0.200414 OGGCORE_19.1.0.0.0OGGBP_PLATFORMS_200427.2331_FBO
Linux, x64, 64bit (optimized), Oracle 19c on Apr 28 2020 17:41:48
Operating system character set identified as UTF-8.

Copyright (C) 1995, 2019, Oracle and/or its affiliates. All rights reserved.

GGSCI (ora-gg-s-2) 1> stats extract extrsoe table DEMO.ADDRESSES

Sending STATS request to EXTRACT EXTRSOE ...

Start of Statistics at 2020-08-19 21:50:15.

DDL replication statistics (for all trails):

*** Total statistics since extract started     ***
        Operations                                         0.00
        Mapped operations                                  0.00
        Unmapped operations                                0.00
        Other operations                                   0.00
        Excluded operations                                0.00

Output to /u11/app/goldengate/data/DB1/es:

Extracting from DEMO.ADDRESSES to DEMO.ADDRESSES:

*** Total statistics since 2020-08-19 21:34:35 ***
        Total inserts                                      0.00
        Total updates                                      2.00
        Total deletes                                      0.00
        Total upserts                                      0.00
        Total discards                                     0.00
        Total operations                                   2.00

*** Daily statistics since 2020-08-19 21:34:35 ***
        Total inserts                                      0.00
        Total updates                                      2.00
        Total deletes                                      0.00
        Total upserts                                      0.00
        Total discards                                     0.00
        Total operations                                   2.00

*** Hourly statistics since 2020-08-19 21:34:35 ***
        Total inserts                                      0.00
        Total updates                                      2.00
        Total deletes                                      0.00
        Total upserts                                      0.00
        Total discards                                     0.00
        Total operations                                   2.00

*** Latest statistics since 2020-08-19 21:34:35 ***
        Total inserts                                      0.00
        Total updates                                      2.00
        Total deletes                                      0.00
        Total upserts                                      0.00
        Total discards                                     0.00
        Total operations                                   2.00

Let’s modify the Replicat parameter file to add a MAP statement for the new table DEMO.ADDRESSES plus the parameter DBOPTIONS ENABLE_INSTANTIATION_FILTERING.

Replicat REPLSOE
DBOPTIONS INTEGRATEDPARAMS ( parallelism 6 )
DISCARDFILE /u10/app/goldengate/product/19.1.0.0.4/gg_1/dirrpt/REPLSOE_discard.txt, append, megabytes 10
DBOPTIONS ENABLE_INSTANTIATION_FILTERING
USERIDALIAS ggadmin
MAP SOE.*, TARGET SOE.* ;
--MAP DEMO.ADDRESSES ,TARGET DEMO.ADDRESSES,FILTER ( @GETENV ('TRANSACTION', 'CSN') > 2908627) ;
MAP DEMO.ADDRESSES, TARGET DEMO.ADDRESSES;

I commented out the old method, where we had to specify the CSN of the export. Now, with DBOPTIONS ENABLE_INSTANTIATION_FILTERING, there is no need to mention the CSN.

Let’s start the replicat process :

GGSCI (ora-gg-t-2) 2> start replicat REPLSOE

Sending START request to MANAGER ...
REPLICAT REPLSOE starting


GGSCI (ora-gg-t-2) 3> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
REPLICAT    RUNNING     REPLSOE     00:00:00      00:41:37


GGSCI (ora-gg-t-2) 4> !
info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
REPLICAT    RUNNING     REPLSOE     00:00:00      00:00:01

 

Once started, the Replicat queries the instantiation CSN for any new mapping and filters records accordingly: DDL and DML records below each table’s instantiation CSN are filtered out. The report file shows the table name and the CSN from which the Replicat will start applying data :

2020-08-19 21:56:05 INFO OGG-10155 Instantiation CSN filtering is enabled on table DEMO.ADDRESSES at CSN 2,995,590.

Let’s wait until the lag is resolved and check the transactions that occurred on the table DEMO.ADDRESSES:

GGSCI (ora-gg-t-2) 6> stats replicat REPLSOE ,table demo.addresses

Sending STATS request to REPLICAT REPLSOE ...

Start of Statistics at 2020-08-19 21:58:32.


Integrated Replicat Statistics:

        Total transactions                                 1.00
        Redirected                                         0.00
        Replicated procedures                              0.00
        DDL operations                                     0.00
        Stored procedures                                  0.00
        Datatype functionality                             0.00
        Operation type functionality                       0.00
        Event actions                                      0.00
        Direct transactions ratio                          0.00%

Replicating from DEMO.ADDRESSES to DEMO.ADDRESSES:

*** Total statistics since 2020-08-19 21:56:05 ***
        Total inserts                                      0.00
        Total updates                                      1.00
        Total deletes                                      0.00
        Total upserts                                      0.00
        Total discards                                     0.00
        Total operations                                   1.00

*** Daily statistics since 2020-08-19 21:56:05 ***
        Total inserts                                      0.00
        Total updates                                      1.00
        Total deletes                                      0.00
        Total upserts                                      0.00
        Total discards                                     0.00
        Total operations                                   1.00

*** Hourly statistics since 2020-08-19 21:56:05 ***
        Total inserts                                      0.00
        Total updates                                      1.00
        Total deletes                                      0.00
        Total upserts                                      0.00
        Total discards                                     0.00
        Total operations                                   1.00

*** Latest statistics since 2020-08-19 21:56:05 ***
        Total inserts                                      0.00
        Total updates                                      1.00
        Total deletes                                      0.00
        Total upserts                                      0.00
        Total discards                                     0.00
        Total operations                                   1.00

End of Statistics.

Let’s check if the data is synchronized :

On the source :

oracle@ora-gg-s-2:/u10/app/goldengate/product/19.1.0.0.4/gg_1/ [DB1] sqh

SQL*Plus: Release 19.0.0.0.0 - Production on Wed Aug 19 22:00:52 2020
Version 19.4.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.4.0.0.0

SQL> select street_name from demo.ADDRESSES where ADDRESS_ID=1000;

STREET_NAME
------------------------------------------------------------
Demo Street is open

SQL> select street_name from demo.ADDRESSES where ADDRESS_ID=1001;

STREET_NAME
------------------------------------------------------------
Demo Street is open

On the target :

oracle@ora-gg-t-2:/u10/app/goldengate/product/19.1.0.0.4/gg_1/ [DB2] sqh

SQL*Plus: Release 19.0.0.0.0 - Production on Wed Aug 19 22:02:44 2020
Version 19.4.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.4.0.0.0

SQL> select street_name from demo.ADDRESSES where ADDRESS_ID=1000;

STREET_NAME
------------------------------------------------------------
Demo Street is open

SQL> select street_name from demo.ADDRESSES where ADDRESS_ID=1001;

STREET_NAME
------------------------------------------------------------
Demo Street is open

The table DEMO.ADDRESSES now holds identical data on the source and target databases.

Once the instantiation is done, DBOPTIONS ENABLE_INSTANTIATION_FILTERING is no longer required and can be removed from the Replicat parameter file:

GGSCI (ora-gg-t-2)> edit params REPLSOE
... 
MAP demo.addresses ,TARGET demo.addresses;

Restart the replicat :

GGSCI (ora-gg-t-2) 9> stop replicat replsoe

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
REPLICAT    STOPPED     REPLSOE     00:00:00      00:00:01

GGSCI (ora-gg-t-2) 10> start replicat replsoe

Sending START request to MANAGER ...
REPLICAT REPLSOE starting


GGSCI (ora-gg-t-2) 11> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
REPLICAT    RUNNING     REPLSOE     00:00:00      00:00:00

Let’s do a last test :
Update a row in the source table :

SQL> update DEMO.ADDRESSES set STREET_NAME= 'test 1' where ADDRESS_ID=800;

1 row updated.

SQL> commit;

Commit complete.

Let’s check the target database :

oracle@ora-gg-t-2:/u10/app/goldengate/product/19.1.0.0.4/gg_1/ [DB2] sqlplus / as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Wed Aug 19 22:11:24 2020
Version 19.4.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.4.0.0.0

SQL> select street_name from demo.ADDRESSES where ADDRESS_ID=800;

STREET_NAME
------------------------------------------------------------
test 1

Conclusion :

  • The parameter DBOPTIONS ENABLE_INSTANTIATION_FILTERING frees the GoldenGate administrator from having to find the CSN used for the initial load.

The article Oracle Data Pump Integration for Table instantiation with Oracle Golden Gate first appeared on Blog dbi services.
