
Terraform and Oracle Cloud Infrastructure


Introduction

When you learn a cloud technology like OCI, the one from Oracle, you start building your demo infrastructure with the web interface and numerous clicks. It’s convenient and easy to handle, even more so if you’re used to infrastructure basics: network, routing, firewalling, servers, etc. But when it comes to building complex infrastructures with multiple servers, subnets, rules and databases, it takes more than a few clicks. And rebuilding a clone of an infrastructure (for example for testing purposes) can be a nightmare.

Is it possible to script an infrastructure?

Yes, for sure: most cloud providers have a command line interface, and actually all the clouds are based on a command line interface with a web console on top of it. But scripting all those commands by hand is not very digestible.

Infrastructure as Code

Why couldn’t we manage an infrastructure as if it were a piece of software? That’s the purpose of “Infrastructure as Code”. The benefits seem obvious: faster deployment, reusable code, automation of infrastructure deployment, scalability, reduced cost with an embedded “drop infrastructure” feature, …

There are multiple tools for IaC, but Oracle recommends Terraform. And it looks like the best solution for now.

What is Terraform?

The goal of Terraform is to help infrastructure administrators model and provision large and complex infrastructures. It’s not dedicated to OCI, as it supports multiple providers, so if you think about an infrastructure based on OCI and Microsoft Azure (as they are becoming friends), it makes even more sense.

Terraform uses a specific language called HCL, the HashiCorp Configuration Language. Obviously, it’s compatible with code repositories like Git. Templates are available to ease your job when you’re a beginner.

The main steps for terraforming an infrastructure are:
1) write your HCL code (describe)
2) preview the execution by reading the configuration (plan)
3) build the infrastructure (apply)
4) eventually delete the infrastructure (destroy)
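
To give an idea of step 1, here is a minimal, hypothetical HCL sketch describing a single OCI resource (the variable name, CIDR block and display name are assumptions, not taken from a real configuration):

# Input variable: the compartment where the resource will be created
variable "compartment_ocid" {}

# A small virtual cloud network, declared rather than scripted
resource "oci_core_vcn" "demo_vcn" {
  compartment_id = var.compartment_ocid
  cidr_block     = "10.0.0.0/16"
  display_name   = "demo-vcn"
}

Running plan against such a file previews the resources to be created, and apply actually builds them.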

3 ways of using Terraform with OCI

You can use Terraform by copying the binary onto your computer (Windows/Mac/Linux); it’s quite easy to set up and use (no installation, only one binary). Terraform can also run from a VM already in the cloud.

Terraform is also available in SaaS mode, just sign up on terraform.io website and you will be able to work with Terraform without installing anything.

You can also use Terraform through Oracle Resource Manager (ORM) inside OCI. ORM is a free service provided by Oracle and based on the Terraform language. ORM manages stacks, each stack being a set of Terraform files you bring to OCI as a zip file. From these stacks, ORM lets you perform the actions you would have done in Terraform: plan, apply and destroy.

Typical use cases

Terraform is quite handy for cross-platform deployments, building demos, giving people the ability to build an infrastructure as a self-service, making a Proof of Concept, …

Terraform can also be targeted at DevOps engineers, giving them the ability to deploy a staging environment, fix the issues and then deploy production environments by reusing the Terraform configuration.

How does it work?

A Terraform configuration is actually a directory with one or multiple .tf files (depending on your preferences). As HCL is a declarative language and not a scripting language, the blocks in the file(s) do not describe any execution order.

During the various steps previously described, special items appear in the configuration directory: the *tfstate* file holding the current state of the infrastructure, and .terraform, a kind of cache for providers and modules.

If you need to script your infrastructure deployment, you can use Python, Bash, Powershell or other tools to call the binary.
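
As a hedged example, a wrapper like the following Bash sketch (paths are assumptions) is often enough to chain the plan and apply steps non-interactively:

#!/bin/bash
# Hypothetical wrapper around the Terraform binary
set -e
cd /opt/terraform/demo-infra      # assumed directory holding the .tf files
terraform init -input=false       # download providers, prepare the .terraform cache
terraform plan -input=false -out=tfplan
terraform apply -input=false tfplan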

To authorize your Terraform binary to create resources in the cloud, you’ll have to provide the API key of an OCI user with enough privileges.
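
For the OCI provider, this typically translates into a provider block similar to the sketch below (the values are placeholders; the API key pair must have been uploaded for the OCI user beforehand):

provider "oci" {
  tenancy_ocid     = var.tenancy_ocid
  user_ocid        = var.user_ocid
  fingerprint      = var.fingerprint              # fingerprint of the uploaded API public key
  private_key_path = "~/.oci/oci_api_key.pem"     # assumed path to the private key
  region           = "eu-frankfurt-1"
}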

As cloud providers push updates quite often, the Terraform plugin for your cloud provider is also updated regularly (it is downloaded during initialization).

Terraform also manages dependencies (for example a VM depending on another one): independent tasks are run in parallel to speed up the infrastructure deployment, while dependent resources are created in the right order.

Some variables can be provided as input (most often through environment variables), for example for naming the compartment. Imagine you want to deploy several test infrastructures isolated from each other.
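
A minimal sketch of such an input variable (the variable name is an assumption):

variable "environment_name" {
  description = "Prefix used to isolate each test infrastructure"
  default     = "test1"
}

# Terraform picks up any TF_VAR_<name> environment variable automatically:
#   export TF_VAR_environment_name=test2
#   terraform apply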

Conclusion

Terraform is a great tool to leverage cloud benefits, even for a simple infrastructure. Don’t miss that point!



The myth of NoSQL (vs. RDBMS) agility: adding attributes


By Franck Pachot

There are good reasons for NoSQL and semi-structured databases. And there are also many mistakes and myths. If people move from RDBMS to NoSQL for the wrong reasons, they will have a bad experience, and that ultimately harms NoSQL’s reputation. Those myths were spread by database newbies who never learned SQL and relational databases. And, rather than learning the basics of data modeling and the capabilities of SQL for processing data sets, they thought they had invented the next generation of persistence… when they actually came back to what existed before the invention of RDBMS: a hierarchical, semi-structured data model. And now they are encountering the same problems that the relational database solved 40 years ago. This blog post is about one of those myths.

Myth: adding a column has to scan and update the whole table

I have read and heard that too many times. Ideas like: RDBMS and SQL are not agile enough to follow the evolution of the data domain. Or: NoSQL data stores, because they are loose on the data structure, make it easier to add new attributes. The wrong, but unfortunately common, idea is that adding a new column to a SQL table is an expensive operation because all rows must be updated. Here are some examples (just random examples to show how widely this idea is spread, even among smart experts and reputable forums):

A comment on twitter: “add KVs to JSON is so dramatically easier than altering an RDBMS table, especially a large one, to add a new column”

A question on StackOverflow: “Is ‘column-adding’ (schema modification) a key advantage of a NoSQL (mongodb) database over a RDBMS like MySQL” https://stackoverflow.com/questions/17117294/is-column-adding-schema-modification-a-key-advantage-of-a-nosql-mongodb-da/17118853. They are talking about months for this operation!

An article on Medium: “A migration which would add a new column in RDBMS doesn’t require this Scan and Update style migration for DynamoDB” https://medium.com/serverless-transformation/how-to-remain-agile-with-dynamodb-eca44ff9817.

Those are just examples. People hear it. People repeat it. People believe it. And they don’t test. And they don’t learn. They do not crosscheck with the documentation. They do not test with their current database. Even though it is so easy to do.

Adding a column in SQL

Actually, adding a column is a fast operation in the major modern relational databases. I’ll create a table and check its size. Then add a nullable column without a default and check the size. Then add a column with a default value and check the size again. If the size stays the same, no rows were updated. Of course, you can test further: look at the elapsed time on a large table, the amount of reads, the redo/WAL generated… You will see nothing in the major current RDBMS. Then actually update all rows and compare. There you will see the size, the time, the reads and the writes, and understand that with an explicit UPDATE the rows are actually updated. But not with the DDL that adds a column.

PostgreSQL

Here is the example in PostgreSQL 12 in dbfiddle:
https://dbfiddle.uk/?rdbms=postgres_12&fiddle=9acf5fcc62f0ff1edd0c41aafae91b05
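
For reference, a minimal sketch along the lines of that fiddle (table name and row count are arbitrary) simply compares the table size before and after the DDL:

-- one million rows
create table demo as select generate_series(1,1000000) x;
select pg_size_pretty(pg_table_size('demo'));

alter table demo add column b numeric;
alter table demo add column c numeric default 42 not null;
select pg_size_pretty(pg_table_size('demo'));   -- same size: no row was rewritten

update demo set c=42;
select pg_size_pretty(pg_table_size('demo'));   -- the table grew: all rows were rewritten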

Another example where I show the WAL size:

Oracle Database

Here is the example in Oracle Database 18c in dbfiddle:
https://dbfiddle.uk/?rdbms=oracle_18&fiddle=b3a2d41636daeca5f8e9ea1d771bbd23

Another example:

Yes, I even tested in Oracle7 where, at that time, adding a not null column with a default value actually scanned the table. The workaround is easy with a view. Adding a nullable column (which is what you do in NoSQL) was already a fast operation, and that’s 40 years ago!

MySQL

Here is the example in MySQL 8 in dbfiddle:
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=8c14e107e1f335b505565a0bde85f6ec

Microsoft SQL Server

It seems that the table I use is too large for dbfiddle but I’ve run the same on my laptop:


1> set statistics time on;
2> go

1> create table demo (x numeric);
2> go
SQL Server parse and compile time:
   CPU time = 0 ms, elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 2 ms,  elapsed time = 2 ms.

1> with q as (select 42 x union all select 42)
2> insert into demo
3> select 42 from q a cross join q b cross join q c cross join q d cross join q e cross join q f cross join q g cross join q h cross join q i cross join q j cross join q k cross join q l cross join q m cross join q n cross join q o cross join q p cross join q r cross join q s cross join q t cross join q u;
4> go
SQL Server parse and compile time:
   CPU time = 11 ms, elapsed time = 12 ms.

 SQL Server Execution Times:
   CPU time = 2374 ms,  elapsed time = 2148 ms.

(1048576 rows affected)

1> alter table demo add b numeric ;
2> go
SQL Server parse and compile time:
   CPU time = 0 ms, elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 3 ms.

1> alter table demo add c numeric default 42 not null;
2> go
SQL Server parse and compile time:
   CPU time = 0 ms, elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 1 ms,  elapsed time = 2 ms.

2> go
SQL Server parse and compile time:
   CPU time = 0 ms, elapsed time = 0 ms.
x                    b                    c
-------------------- -------------------- --------------------
                  42                 NULL                   42
                  42                 NULL                   42
                  42                 NULL                   42

(3 rows affected)

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

3> go
SQL Server parse and compile time:
   CPU time = 0 ms, elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 3768 ms,  elapsed time = 3826 ms.

(1048576 rows affected)

2 milliseconds for adding a column with a value, visible on all those million rows (and it can be more).

YugaByte DB

In a distributed database, metadata must be updated in all nodes, but this is still in milliseconds whatever the table size is:

I didn’t show the test with not null and a default value as I encountered an issue (adding the column is fast but the default value is not selected). I don’t have the latest version (YugaByte DB is open source and in very active development) and this is probably an issue that is going to be fixed.

Tibero

Tibero is a database with very high compatibility with Oracle. I’ve run the same SQL. But this version 6 seems to be compatible with Oracle 11 where adding a non null column with default had to update all rows:

You can test on any other databases with a code similar to this one:


-- CTAS CTE Cross Join is the most cross-RDBMS I've found to create one million rows
create table demo as with q as (select 42 x union all select 42)
 select 42 x from q a cross join q b cross join q c cross join q d cross join q e
 cross join q f cross join q g cross join q h cross join q i cross join q j
 cross join q k cross join q l cross join q m cross join q n cross join q o
 cross join q p cross join q r cross join q s cross join q t cross join q u;
-- check the time to add a column
alter table demo add b numeric ;
-- check the time for a column with a value set for all existing rows
alter table demo add c numeric default 42 not null;
-- check that all rows show this value
select * from demo order by x fetch first 3 rows only;
-- compare with the time to really update all rows
update demo set c=42;

and please don’t hesitate to comment on this blog post or on the following tweet with your results:

NoSQL semi-structured

The myth comes from old versions of some databases that did not implement the ALTER TABLE .. ADD in an optimal way. And the NoSQL inventors probably knew only MySQL, which was late in this area. Who said that MySQL evolution suffered from its acquisition by Oracle? They have reduced the gap with other databases, as with this column-adding optimisation.

If you stay with this outdated knowledge, you may think that NoSQL with semi-structured collections is more agile, right? Yes, of course, you can add a new attribute when inserting a new item. It has zero cost and you don’t have to declare it to anyone. But what about the second case I tested in all those SQL databases, where you want to define a value for the existing rows as well? As we have seen, SQL allows that with a DEFAULT clause. In NoSQL you have to scan and update all items. Or you need to implement some logic in the application, like “if null then value”. That is not agile at all: as a side effect of a new feature, you need to change all the data or all the code.
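
To make the difference concrete, here is a small sketch reusing the demo table from above (the fallback query is only an illustration of the “if null then value” logic you would otherwise push into every reader):

-- SQL: one declarative change, the default applies to existing rows as well
alter table demo add c numeric default 42 not null;

-- Without a declared default, every consumer must carry the fallback logic, e.g.:
select coalesce(c, 42) as c from demo;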

Relational databases encapsulate the physical storage with a logical view. And in addition to that, this logical view protects the existing application code when the model evolves. This is E.F. Codd’s rule number 9: Logical Data Independence. You can deliver declarative changes to your structures without modifying any procedural code or stored data. Now, who is agile?

Structured data have metadata: performance and agility

How does it work? The RDBMS dictionary holds information about the structure of the rows, and this goes beyond a simple column name and datatype. The default value is defined there, which is why the ADD column was immediate. It is just an update of metadata. It doesn’t touch any existing data: performance. It exposes a virtual view to the application: agility. With Oracle, you can even version those views and deliver them to the application without interruption. This is called Edition-Based Redefinition.

There are other smart things in the RDBMS dictionary. For example, when I add a column with the NOT NULL attribute, this assertion is guaranteed. I don’t need any code to check whether the value is set or not. Same with constraints: one declaration in a central dictionary makes all code safer and simpler because the assertion is guaranteed without additional testing. No need to check for data quality as it is enforced by design. Without it, how many sanity checks do you need to add in your code to ensure that erroneous data will not corrupt everything around it? We have seen adding a column, but think about something even simpler. Naming things is the most important thing in IT. Allow yourself to realize you made a mistake, or that some business concepts changed, and modify the name of a column to a more meaningful one. That can be done easily, even with a view to keep compatibility with previous code. Changing an attribute name in a large collection of JSON items is not so easy.
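
As a sketch (table and column names are hypothetical), a rename plus a compatibility view looks like this:

-- give the column a meaningful name
alter table customers rename column cust_nm to customer_name;

-- keep the old code working through a view exposing the old name
create or replace view customers_legacy as
  select customer_id, customer_name as cust_nm
  from customers;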

Relational databases have been invented for agility

Let me quote the reason why CERN decided to use Oracle in 1982 for the LEP – the ancestor of the LHC: Oracle The Database Management System For LEP: “Relational systems transform complex data structures into simple two-dimensional tables which are easy to visualize. These systems are intended for applications where preplanning is difficult…”

Preplanning not needed… isn’t that the definition of Agility with 20th century words?

Another good read to clear up some myths: Relational Database: A Practical Foundation for Productivity by E.F. Codd. Some problems that are solved by relational databases are the lack of distinction between “the programmer’s (logical) view of the data and the (physical) representation of data in storage”, the “subsequent changes in data description” forcing code changes, and the fact that “programmers were forced to think and code in terms of iterative loops” because of the lack of set processing. Who says that SQL and joins are slow? Are your iterative loops smarter than hash joins, nested loops and sort merge joins?

Never say No

I’m not saying that NoSQL is good or bad or better or worse. It is bad only when the decision is based on myths rather than facts. If you want agility on your data domain structures, stay relational. If you want to allow any future query pattern, stay relational. However, there are also some use cases that can fit in a relational database but may also benefit from another engine with optimal performance in key-value lookups. I have seen tables full of session state with a PRIMARY KEY (user or session ID) and a RAW column containing some data meaningful only for one application module (login service) and without durable purpose. They are acceptable in a SQL table if you take care of the physical model (you don’t want to cluster those rows in a context with many concurrent logins). But a key-value store may be more suitable. We still see Oracle tables with LONG datatypes. If you like that, you probably need a key-value NoSQL store. Databases can store documents, but that’s a luxury: they benefit from consistent backups and HA, but at the price of operating a very large and growing database. Time series, or graphs, are not easy to store in relational tables. NoSQL databases like AWS DynamoDB are very efficient for those specific use cases. But this is when all access patterns are known at design time. If you know your data structure and cannot anticipate all queries, then relational database systems (which means more than a simple data store) and SQL (the 4th generation declarative language to manipulate data by sets) are still the best choice for agility.


Control-M/EM : job in wait hosts status


Hello everybody,

 

Introduction:

Today we will try to debug a classic scheduler case: the wait host status.

 

Context:

In the monitoring domain, one or more jobs are not executing and are displayed in blue.

Example:

The aim is to know what is happening to this job and fix it.

Analysis:

First of all, we will right-click on the job and select “waiting info” to understand why the job is still not in execution.

Result:

The job is in “wait host” status (you can also see it in the run information section located under the right panel).
We can also read that no machine is available.

Next step is to check if the machine is still alive by making a ping on it:

Result:

Ping is OK, the machine is alive (of course there can be some other parameters involved, like firewall rules or services not activated).

If the machine is available, maybe the issue is from the agent, so we will check it.

1) By using CCM

We note that the agent is unavailable, to have more info we will connect to the machine where the agent is installed

2) By Connecting on the agent’s machine

By connecting to the machine with the Control-M agent user, we see that the agent is not running, which should be the cause of our trouble.
Below is the result of the ag_diag_comm command (used to check the agent status):

 

If we start it again it can solve the issue.

a) Starting the Control-M/Agent:

[root@CTMSRVCENTOS scripts]# /home/controlm/ctm_agent/ctm/scripts/start-ag

 Enter Control-M/Agent UNIX username [controlm]:

 Enter Control-M/Agent Process Name <AG|AT|AR|ALL> [ALL]:


Starting the agent as 'root' user

Control-M/Agent Listener started. pid: 23534
Control-M/Agent Tracker started. pid: 23603

Control-M/Agent started successfully.

b) Check agent status

Result:

Agent is running successfully, so let’s go back to our blue job in the monitoring domain:
Now we can see that the job has been executed correctly:
As soon as we started the agent again, the job could be executed.

To finish, we can check the agent status on the CCM:

Agent is OK on CCM and jobs can now be executed.

 

Conclusion:

Now we have the steps to unlock this “wait hosts” status, so you will be able to fix it quickly 😊
Note:

If you have the “no machine available” message and your ping is not OK, please check with your system admin to get the server where the agent is installed up again. Then check whether the agent can be started successfully.

Feel free to check our dbi bloggers’ posts for more tips and tricks, and you can also have a look at the BMC site.

 

See you in the next blog, and don’t forget to share and comment!


The evolution of MySQL authentication mechanism


Authentication, the first level of security for any IT system, is the stage where the user’s identity is verified through the basic username and password scheme. It is crucial to have a mechanism to protect and secure password storage and transmission over the network.

In MySQL, there are plenty of different authentication methods available, and the last versions have improved the security of this mechanism.

MySQL authentication

At the beginning, the mechanism, called mysql_old_password, was pretty insecure: it was based on a broken hashing function and the password hash was 16 bytes long. It was not so complex for attackers to find the plaintext password from the hash stored in the password column of the mysql.user table. It has been removed in MySQL 5.7.5.

A new method was introduced in MySQL 4.1 and it became the mysql_native_password plugin as of MySQL 5.5, enabled by default. It’s based on the SHA-1 hashing algorithm and the password hash is 41 bytes long. On one side, it’s more secure than mysql_old_password because the stored hashes cannot be used directly to authenticate. But on the other side, it still has weaknesses, especially for “too simple” passwords, because the same password always produces the same hash. Again, it’s not so complicated to search for stolen hashes in rainbow tables to obtain the corresponding plaintext password.

There have been other improvements in MySQL 5.6: the sha256_password plugin adds a random salt during hash generation, so each hash is unique and rainbow tables are useless. In MySQL 8.0, the default authentication remains strong (same SHA-256 password hashing mechanism) but in addition it becomes faster: a cache was added on the server side to enable faster re-authentication for accounts that have connected previously. It’s called caching_sha2_password and it’s now the default plugin.

This evolution of MySQL authentication must be seriously considered by DBAs during upgrade processes, and the following aspects must not be underestimated:

  • Read MySQL official documentation to prepare your upgrade procedure
  • Don’t forget to run mysql_upgrade after importing your dump: it not only examines user tables to find possible incompatibilities, but also upgrades the system tables in the mysql, performance_schema and sys databases
  • Try to keep your MySQL server up to date in order to avoid surprises or difficult procedures to put in place due to a too large delta (as an example, you can find here the procedure to migrate away from pre-4.1 password hashing method – good luck 😉 )
  • As of MySQL 5.7.6, the password column was removed and values are now stored in the authentication_string column of mysql.user table
  • As of MySQL 5.5, the plugin column has been added to mysql.user table and as of MySQL 5.7 the server doesn’t enable any accounts with an empty plugin value
  • As of MySQL 8.0, you can choose one of the 3 following values for the default_authentication_plugin system variable: mysql_native_password, sha256_password and caching_sha2_password. You also have the possibility to set a different authentication plugin for a specific account using the following syntax: CREATE | ALTER USER … IDENTIFIED WITH <auth_plugin> […]; (a short example follows below)
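
As a short illustration (account names and passwords are placeholders):

-- choose the authentication plugin per account
CREATE USER 'app_legacy'@'%' IDENTIFIED WITH mysql_native_password BY 'StrongPassword#1';
ALTER USER 'app_modern'@'%' IDENTIFIED WITH caching_sha2_password BY 'StrongPassword#2';

-- check which plugin each account uses
SELECT user, host, plugin FROM mysql.user;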

I would like to share with you three last considerations before closing this short blog post:

  • Once again, keep your MySQL server up to date
  • MySQL is getting better and better in terms of security
  • “Security is always excessive until it’s not enough.” (Robbie Sinclair, Head of Security, Country Energy, NSW Australia) 😉


APEX Connect 2020 – Day 1


This year the APEX Connect conference goes virtual online, like all other major IT events, due to the pandemic. Unfortunately it spans only two days, with mixed topics around APEX like JavaScript, PL/SQL and much more. After the welcome speech and the very interesting keynote about “APEX 20.1 and beyond: News from APEX Development” by Carsten Czarski, I decided to attend presentations on the following topics:
– The Basics of Deep Learning
– “Make it faster”: Myths about SQL performance
– Using RESTful Services and Remote SQL
– The Ultimate Guide to APEX Plug-ins
– Game of Fraud Detection with SQL and Machine Learning

APEX 20.1 and beyond: News from APEX Development

Carsten Czarski from the APEX development team shared the evolution of the latest APEX releases up to 20.1, released on April 23rd.
Since APEX 18.1 there are 2 releases per year. There are no major nor minor releases, all are managed at the same level.
Besides those releases, bundled PSEs are provided to fix critical issues.
Among the recent features, a couple have caught my attention:
– Faceted search
– Wider integration of Oracle TEXT
– Application backups
– Session timeout warnings
– New URL
And more to come with next releases like:
– Native PDF export
– New download formats
– External data sources
A lot to test and enjoy!

The Basics of Deep Learning

Artificial Intelligence (AI) is now part of our life, mostly without us noticing it. Machine Learning (ML) is part of AI and Deep Learning (DL) is a specific sub-part of ML.
ML is used in different sectors, for example in:

  • SPAM filters
  • Data Analytics
  • Medical Diagnosis
  • Image recognition

and much more…
DL integrates automated feature extraction, which makes it suited for:

  • Natural Language processing
  • Speech recognition
  • Text to Speech
  • Machine translation
  • Referencing to Text

You can find an example of a text generator based on DL with Talk to Transformer.
It is also heavily used in visual recognition (feature-based recognition). ML is dependent on the datasets and preset models used, so it is key to have a large set of data to cover a wide range of possibilities. DL made a big step forward with Convolutional Neural Networks (by Yann LeCun).
DL is based on complex mathematical models in neural networks at different levels, which use activation functions, model design, hyperparameters, backpropagation, loss functions and optimizers.
You can learn how it is implemented in image recognition at pyimagesearch.com
Another nice example of DL with reinforcement learning is AI Learns to Park

“Make it faster”: Myths about SQL performance

The performance of the database is a hot topic when it comes to data centric application development like with APEX.
The pillars of DB performance are the following:
– Performance planning
– Instance tuning
– SQL tuning
To be efficient, performance must be considered at every stage of a project.
A recurring statement is: “Index is GOOD, full table scan is BAD”.
But when is an index better than a full table scan? As a rule of thumb, you can consider an index when the selectivity is less than 5%.
To improve performance there are also options like:
– KIWI (Kill It With Iron), where more hardware should solve the performance issue
– Hints, where you cut branches of the optimizer decision tree to force its choice (left alone, the optimizer always picks the plan with the lowest estimated cost)
Unfortunately there is no golden hint able to improve performance wherever it is used.
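
Just to illustrate what a hint looks like (table and index names are hypothetical, not a recommendation):

SELECT /*+ INDEX(o orders_status_ix) */ *
FROM   orders o
WHERE  o.status = 'OPEN';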

Using RESTful Services and Remote SQL

REST web services are based on URIs returning different types of data such as HTML, XML, CSV or JSON.
Those web services are based on request methods:

  • POST to insert data
  • PUT to update/replace data
  • GET to read data
  • DELETE to delete data
  • PATCH to update/modify data

The integration of web services in APEX makes it possible to use data outside of the Oracle database and to connect to services like:
– Jira
– GitHub
– Online Accounting Services
– Google services
– …
The web services module on an ORDS instance provides extensions on top of plain REST which support APEX out of the box but also enable Remote SQL.
Thanks to that, a SQL statement can be sent over REST to the Oracle Database and executed remotely, returning the data formatted as per REST standards.
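
As a hedged sketch of the AutoREST side (assuming the ORDS PL/SQL package is installed in the database; schema and table names are placeholders), REST-enabling a schema and a table looks like this:

BEGIN
  ORDS.ENABLE_SCHEMA(p_enabled => TRUE, p_schema => 'HR');
  ORDS.ENABLE_OBJECT(p_enabled      => TRUE,
                     p_schema       => 'HR',
                     p_object       => 'EMPLOYEES',
                     p_object_type  => 'TABLE',
                     p_object_alias => 'employees');
  COMMIT;
END;
/

The table is then typically reachable under a URL like …/ords/hr/employees/ and returns JSON.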

The Ultimate Guide to APEX Plug-ins

Even though APEX plug-ins are not trivial to build, they have benefits like:
– Introduction of new functionality
– Modularity
– Reusability
which make them very interesting.
There are already a lot of plug-ins available, which can be found on apex.world or from professional providers like FOEX.
What is important to look at with plug-ins is support, quality, security and updates.
The main elements of a plug-in are:
– Name
– Type
– Callbacks
– Standard attributes
– Custom attributes
– Files (CSS, JS, …)
– Events
– Information
– Help Text
Plug-ins are also a way to provide tools that improve the APEX developer experience, like APEX Nitro or the APEX Builder extension by FOS.

Game of Fraud Detection with SQL and Machine Learning

Using the example of some banking fraud, the investigation method based on deterministic SQL was compared to the method based on probabilistic ML.
Even though the results were statistically close, supervised machine learning (looking for patterns to identify the solutions) gave more accurate ones. In fact, the combination of both methods gave even better results.
The challenge is to gain acceptance from the business on results produced with the help of ML, as they are not based on fully explainable rules.
The Oracle database has embedded ML for free, with specific packages like DBMS_DATA_MINING, for several years now.

The day ended with the most awaited session: Virtual beer!


Documentum Upgrade – dm_DataDictionaryPublisher with wrong target server


As part of the same migration & upgrade project I talked about in previous blogs already (corrupt lockbox & duplicate objects), I have seen a strange but consistent behavior in each upgrade from Documentum 7.x to 16.x versions. During this process, just before the upgrade, a cloning is done to a staging environment, to keep the source untouched.

 

As part of this cloning, the target_server value of the dm_job objects is updated to use the new host, obviously. Therefore, once the cloning is done, I’m sure that all the jobs, without exception, use the correct target_server according to our best practices. This means most of the jobs target ANY running server, while distributed jobs have some specific configurations. As you probably already understood from the title of this blog, one of the issues I faced in every single upgrade from 7.x to 16.x is that the Documentum upgrade installer changes the target_server of some jobs, and for one of them it actually causes an issue.

 

Checking the target_server after the upgrade showed that this parameter had been updated for several jobs, which therefore no longer followed our best practices. However, the real problem was the dm_DataDictionaryPublisher job:

[dmadmin@stg_cs ~]$ repo="REPO1"
[dmadmin@stg_cs ~]$ iapi ${repo} -Udmadmin -Pxxx << EOC
?,c,SELECT r_object_id, object_name, r_creation_date, r_modify_date, target_server FROM dm_job WHERE object_name='dm_DataDictionaryPublisher';
exit
EOC

        OpenText Documentum iapi - Interactive API interface
        Copyright (c) 2018. OpenText Corporation
        All rights reserved.
        Client Library Release 16.4.0200.0080

Connecting to Server using docbase REPO1
[DM_SESSION_I_SESSION_START]info:  "Session 010f123450373995 started for user dmadmin."

Connected to OpenText Documentum Server running Release 16.4.0200.0256  Linux64.Oracle
Session id is s0
API> r_object_id       object_name                 r_creation_date     r_modify_date       target_server
----------------  --------------------------  ------------------  ------------------  ----------------------------------------------------------------------------------------
080f123450000381  dm_DataDictionaryPublisher  11/8/2018 07:04:37  2/10/2020 17:13:54  REPO1.REPO1.REPO1@stg_cs
(1 row affected)

API> Bye
[dmadmin@stg_cs ~]$

 

Assuming a repository name of REPO1, you can connect to it using just “${repo_name}”. However, you can also connect to a specific Content Server using “${repo_name}.${dm_server_config_name}”. For a Primary CS, it is therefore usually something like “REPO1.REPO1”. I assume that, somewhere in the code, instead of using “REPO1” as the repository name and appending “.REPO1” to it so that the job runs on the Primary CS, they actually used a repository name which already targets the Primary CS, and therefore you end up with “REPO1.REPO1.REPO1@…” in the target_server, which is wrong.

 

During the upgrade process, there are many things happening, as you can expect… One of these is the execution of the toolset.ebs script. This script configures the target_server on several OOTB Documentum jobs. It creates (or rather overwrites, in case of an upgrade) the log file “$DOCUMENTUM/dba/config/REPO1/sysadmin.trc”. Looking into this file for the changes done to the target_server parameters, you can see the following:

[dmadmin@stg_cs ~]$ cd $DOCUMENTUM/dba/config/${repo}/
[dmadmin@stg_cs REPO1]$ 
[dmadmin@stg_cs REPO1]$ grep -E "dm_DataDictionaryPublisher|target_server" sysadmin.trc | sed 's,.*\.[sg]et(,(,'
("set,c,080f123450000359,target_server","REPO1.REPO1@stg_cs") ==> true
("set,c,080f12345000035a,target_server","REPO1.REPO1@stg_cs") ==> true
("set,c,080f12345000035b,target_server","REPO1.REPO1@stg_cs") ==> true
("set,c,080f12345000035c,target_server","REPO1.REPO1@stg_cs") ==> true
("set,c,080f12345000035d,target_server","REPO1.REPO1@stg_cs") ==> true
("set,c,080f12345000035e,target_server","REPO1.REPO1@stg_cs") ==> true
("set,c,080f12345000035f,target_server","REPO1.REPO1@stg_cs") ==> true
("set,c,080f123450000360,target_server","REPO1.REPO1@stg_cs") ==> true
("set,c,080f123450000361,target_server","REPO1.REPO1@stg_cs") ==> true
("set,c,080f123450000362,target_server","REPO1.REPO1@stg_cs") ==> true
("set,c,080f12345000036d,target_server","REPO1.REPO1@stg_cs") ==> true
("set,c,080f12345000036e,target_server","REPO1.REPO1@stg_cs") ==> true
("set,c,080f12345000036f,target_server","REPO1.REPO1@stg_cs") ==> true
("set,c,080f123450000370,target_server","REPO1.REPO1@stg_cs") ==> true
("set,c,080f123450000371,target_server","REPO1.REPO1@stg_cs") ==> true
("set,c,080f123450000372,target_server","REPO1.REPO1@stg_cs") ==> true
("set,c,080f123450000373,target_server","REPO1.REPO1@stg_cs") ==> true
("set,c,080f123450000374,target_server","REPO1.REPO1@stg_cs") ==> true
("set,c,080f123450000375,target_server","REPO1.REPO1@stg_cs") ==> true
("set,c,080f123450000376,target_server","REPO1.REPO1@stg_cs") ==> true
("retrieve,c,dm_job where object_name = 'dm_DataDictionaryPublisher'") ==> "080f123450000381"
("set,c,080f123450000381,target_server","dummy") ==> true
("retrieve,c,dm_method where object_name = 'dm_DataDictionaryPublisher'") ==> "100f123450000380"
("set,c,080f123450000382,target_server","REPO1.REPO1@stg_cs") ==> true
("set,c,080f123450000383,target_server","REPO1.REPO1@stg_cs") ==> true
("set,c,080f123450000384,target_server","REPO1.REPO1@stg_cs") ==> true
[dmadmin@stg_cs REPO1]$

 

During the upgrade process, the target_server is set to “dummy” by this script for the dm_DataDictionaryPublisher job. As you know, that’s definitively not a correct value for a target_server, but this is expected in this case, most probably to prevent it from running (we deactivate all the jobs before the migration anyway)… Later, as part of the step (from the logs) “[main] com.documentum.install.server.installanywhere.actions.cfs.DiWAServerProcessingScripts2 – Executing the Java Method Server Configuration Setup script”, the script named “dm_defaultevent_set.ebs” is executed and creates the log file “$DOCUMENTUM/dba/config/REPO1/dm_defaultevent_set.out”. Looking inside this script, you can see the following:

[dmadmin@stg_cs REPO1]$ grep -C5 "retrieve.*dm_DataDictionary" $DM_HOME/install/admin/dm_defaultevent_set.ebs
  TargetServer = docbase & "." & _
  dmAPIGet("get,s0,serverconfig,object_name") & _
  "@" & dmAPIGet("get,s0,serverconfig,r_host_name")


  ID = dmAPIGet("retrieve," & sess & ",dm_job where object_name = 'dm_DataDictionaryPublisher'")
  If ID <> "" Then
    buff = "set," & sess & "," & ID & ",target_server"
    ret% = dmAPISet(buff, TargetServer)
    buff = "set," & sess & "," & ID & ",run_now"
    ret% = dmAPISet(buff, "T")
[dmadmin@stg_cs REPO1]$

 

Based on the above, it looks like the script is correct: it does set the target_server to “${repo_name}.${dm_server_config_name}@${dm_server_config_hostname}”. Therefore, as I suspected and mentioned above, it must be a problem caused by the fact that the “${repo_name}” used to execute this script isn’t “REPO1” but is instead “REPO1.REPO1”. To confirm that, simply compare the definition of the “docbase” variable from the script with the generated log file:

[dmadmin@stg_cs REPO1]$ grep "Start running" $DM_HOME/install/admin/dm_defaultevent_set.ebs
    Print "Start running dm_defaultevent_set.ebs script on docbase " & docbase
[dmadmin@stg_cs REPO1]$
[dmadmin@stg_cs REPO1]$ cat dm_defaultevent_set.out
Connect to docbase REPO1.REPO1
Start running dm_defaultevent_set.ebs script on docbase REPO1.REPO1
Audited the dm_docbase_config for event dm_default_set
Setting the Data Dictionary job state to active
Successfully updated job object for DataDictionaryPublisher
Finished running dm_defaultevent_set.ebs...
Disconnect from the docbase.
[dmadmin@stg_cs REPO1]$

 

The above confirms that the ebs script is indeed executed with a repository name of “REPO1.REPO1”, which causes the wrong target_server in the end. Looking at the other log files created by the upgrade shows that there are apparently some discrepancies. The following log files seem to be OK: dm_crypto_upgrade_52.out, headstart.out, replicate_bootstrap.out, sysadmin.trc, and so on. While these ones apparently might not be (it would need some debugging to validate): dm_gwm_install.out?, dm_java_methods.out?, dm_defaultevent_set.out, and so on.

 

I guess I will open a Service Request with the OpenText Support to get this looked at. It might not be a big issue if it’s just about the target_server of the dm_DataDictionaryPublisher job, but I’m afraid there could be something else hidden in another script or object that I haven’t seen yet… The previous blog I wrote about duplicate JMS/ACS objects (link at the top of this blog) might also be a consequence. Anyway, while waiting for a feedback from OpenText, we can just fix the issue for this job by re-applying the correct configuration (running on ANY):

[dmadmin@stg_cs REPO1]$ iapi ${repo} -Udmadmin -Pxxx << EOC
?,c,UPDATE dm_job OBJECTS SET target_server=' ' WHERE object_name='dm_DataDictionaryPublisher';
?,c,SELECT r_object_id, object_name, r_creation_date, r_modify_date, target_server FROM dm_job WHERE object_name='dm_DataDictionaryPublisher';
exit
EOC

        OpenText Documentum iapi - Interactive API interface
        Copyright (c) 2018. OpenText Corporation
        All rights reserved.
        Client Library Release 16.4.0200.0080

Connecting to Server using docbase REPO1
[DM_SESSION_I_SESSION_START]info:  "Session 010f123450373a42 started for user dmadmin."

Connected to OpenText Documentum Server running Release 16.4.0200.0256  Linux64.Oracle
Session id is s0
API> objects_updated
---------------
              1
(1 row affected)
[DM_QUERY_I_NUM_UPDATE]info:  "1 objects were affected by your UPDATE statement."

API> ?,c,SELECT r_object_id, object_name, r_creation_date, r_modify_date, target_server FROM dm_job WHERE object_name='dm_DataDictionaryPublisher';
r_object_id       object_name                 r_creation_date     r_modify_date       target_server
----------------  --------------------------  ------------------  ------------------  --------------
080f123450000381  dm_DataDictionaryPublisher  11/8/2018 07:04:37  2/11/2020 07:04:45  
(1 row affected)

API> Bye
[dmadmin@stg_cs REPO1]$

 

If I receive some important information from the OpenText Service Request, I will try to update this blog to add the details (like other potential issues linked).

 

Edit: I opened the SR#4485076 and OpenText confirmed the bug. They apparently already knew about the issue for dm_DataDictionaryPublisher since it has (normally) been fixed in 16.4 P23, released on 30-March-2020. They are still looking into potential issues with the other scripts.

 


Handle DB-Links after Cloning an Oracle Database


By Clemens Bleile

After cloning e.g. a production database into a database for development or testing purposes, the DBA has to make sure that no activities in the cloned database have an impact on data in other production databases, because after cloning, jobs may still try to modify production data through e.g. db-links. I.e. scheduled database jobs must not start in the cloned DB and applications connecting to the cloned database must not modify remote production data. Most people are aware of this issue and a first measure is to start the cloned database with the DB parameter

job_queue_processes=0

That ensures that no database job will start in the cloned database. However, before enabling scheduler jobs again, you have to make sure that no remote production data can be modified. Remote data is usually accessed through db-links, so the second step is to handle the db-links in the cloned DB.

In a recent project we decided to be strict and drop all database links in the cloned database.
REMARK: Testers and/or developers should later create the needed db-links again, pointing to non-production data.
But how to do that, given that private db-links can only be dropped by the owner of the db-link? I.e. even a connection with SYSDBA rights cannot drop private database links:

sys@orcl@orcl> connect / as sysdba
Connected.
sys@orcl@orcl> select db_link from dba_db_links where owner='CBLEILE';
 
DB_LINK
--------------------------------
CBLEILE_DB1
PDB1
 
sys@orcl@orcl> drop database link cbleile.cbleile_db1;
drop database link cbleile.cbleile_db1
                   *
ERROR at line 1:
ORA-02024: database link not found
 
sys@orcl@orcl> alter session set current_schema=cbleile;
 
Session altered.
 
sys@orcl@orcl> drop database link cbleile_db1;
drop database link cbleile_db1
*
ERROR at line 1:
ORA-01031: insufficient privileges

We’ll see later on how to drop the db-links. Before doing that we make a backup of the db-links. That can be achieved with expdp:

Backup of db-links with expdp:

1.) create a directory to store the dump-file:

create directory prod_db_links as '<directory-path>';

2.) create the param-file expdp_db_links.param with the following content:

full=y
INCLUDE=DB_LINK:"IN(SELECT db_link FROM dba_db_links)"

3.) expdp all DB-Links

expdp dumpfile=prod_db_links.dmp logfile=prod_db_links.log directory=prod_db_links parfile=expdp_db_links.param
Username: <user with DATAPUMP_EXP_FULL_DATABASE right>

REMARK: Private db-links owned by SYS are not exported by the command above. But SYS must not own user-objects anyway.

In case the DB-Links have to be restored you can do the following:

impdp dumpfile=prod_db_links.dmp logfile=prod_db_links_imp.log directory=prod_db_links
Username: <user with DATAPUMP_IMP_FULL_DATABASE right>

You may also create a script prod_db_links.sql with all ddl (passwords are not visible in the created script):

impdp dumpfile=prod_db_links.dmp directory=prod_db_links sqlfile=prod_db_links.sql
Username: <user with DATAPUMP_IMP_FULL_DATABASE right>

Finally drop the directory again:

drop directory prod_db_links;

Now that we have a backup, we can drop all db-links. As mentioned earlier, private db-links cannot be dropped by another user, but you can use the following method to drop them:

As procedures run with definer rights by default, we can create a procedure under the owner of the db-link and drop the db-link inside the procedure. SYS has the privileges to execute the procedure. The following example will drop the db-link cbleile.cbleile_db1:

select db_link from dba_db_links where owner='CBLEILE';
 
DB_LINK
--------------------------------
CBLEILE_DB1
PDB1

create or replace procedure CBLEILE.drop_DB_LINK as begin
execute immediate 'drop database link CBLEILE_DB1';
end;
/
 
exec CBLEILE.drop_DB_LINK;
 
select db_link from dba_db_links where owner='CBLEILE';
 
DB_LINK
--------------------------------
PDB1

I.e. the db-link CBLEILE_DB1 has been dropped.
REMARK: Using a proxy-user would also be a possibility to connect as the owner of the db-link, but that cannot be automated in a script that easily.

As we have a method to drop private db-links we can go ahead and automate creating the drop db-link commands with the following sql-script drop_all_db_links.sql:

set lines 200 pages 999 trimspool on heading off feed off verify off
set serveroutput on size unlimited
column dt new_val X
select to_char(sysdate,'yyyymmdd_hh24miss') dt from dual;
spool drop_db_links_&&X..sql
select 'set echo on feed on verify on heading on' from dual;
select 'spool drop_db_links_&&X..log' from dual;
select 'select count(*) from dba_objects where status='''||'INVALID'||''''||';' from dual;
REM Generate all commands to drop public db-links
select 'drop public database link '||db_link||';' from dba_db_links where owner='PUBLIC';
REM Generate all commands to drop db-links owned by SYS (except SYS_HUB, which is oracle maintained)
select 'drop database link '||db_link||';' from dba_db_links where owner='SYS' and db_link not like 'SYS_HUB%';
PROMPT
REM Generate create procedure commands to drop private db-link, generate the execute and the drop of it.
declare
   current_owner varchar2(32);
begin
   for o in (select distinct owner from dba_db_links where owner not in ('PUBLIC','SYS')) loop
      dbms_output.put_line('create or replace procedure '||o.owner||'.drop_DB_LINK as begin');
      for i in (select db_link from dba_db_links where owner=o.owner) loop
         dbms_output.put_line('execute immediate '''||'drop database link '||i.db_link||''''||';');
      end loop;
      dbms_output.put_line('end;');
      dbms_output.put_line('/');
      dbms_output.put_line('exec '||o.owner||'.drop_DB_LINK;');
      dbms_output.put_line('drop procedure '||o.owner||'.drop_DB_LINK;');
      dbms_output.put_line('-- Seperator -- ');
   end loop;
end;
/
select 'select count(*) from dba_objects where status='''||'INVALID'||''''||';' from dual;
select 'set echo off' from dual;
select 'spool off' from dual;
spool off
 
PROMPT
PROMPT A script drop_db_links_&&X..sql has been created. Check it and then run it to drop all DB-Links.
PROMPT

Running above script generates a sql-script drop_db_links_<yyyymmdd_hh24miss>.sql, which contains all drop db-link commands.

sys@orcl@orcl> @drop_all_db_links
...
A script drop_db_links_20200509_234906.sql has been created. Check it and then run it to drop all DB-Links.
 
sys@orcl@orcl> !cat drop_db_links_20200509_234906.sql
 
set echo on feed on verify on heading on
 
spool drop_db_links_20200509_234906.log
 
select count(*) from dba_objects where status='INVALID';
 
drop public database link DB1;
drop public database link PDB2;
 
create or replace procedure CBLEILE.drop_DB_LINK as begin
execute immediate 'drop database link CBLEILE_DB1';
execute immediate 'drop database link PDB1';
end;
/
exec CBLEILE.drop_DB_LINK;
drop procedure CBLEILE.drop_DB_LINK;
-- Seperator --
create or replace procedure CBLEILE1.drop_DB_LINK as begin
execute immediate 'drop database link PDB3';
end;
/
exec CBLEILE1.drop_DB_LINK;
drop procedure CBLEILE1.drop_DB_LINK;
-- Seperator --
 
select count(*) from dba_objects where status='INVALID';
 
set echo off
 
spool off
 
sys@orcl@orcl>

After checking the file drop_db_links_20200509_234906.sql I can run it:

sys@orcl@orcl> @drop_db_links_20200509_234906.sql
sys@orcl@orcl> 
sys@orcl@orcl> spool drop_db_links_20200509_234906.log
sys@orcl@orcl> 
sys@orcl@orcl> select count(*) from dba_objects where status='INVALID';
 
  COUNT(*)
----------
   1
 
1 row selected.
 
sys@orcl@orcl> 
sys@orcl@orcl> drop public database link DB1;
 
Database link dropped.
 
sys@orcl@orcl> drop public database link PDB2;
 
Database link dropped.
 
sys@orcl@orcl> 
sys@orcl@orcl> create or replace procedure CBLEILE.drop_DB_LINK as begin
  2  execute immediate 'drop database link CBLEILE_DB1';
  3  execute immediate 'drop database link PDB1';
  4  end;
  5  /
 
Procedure created.
 
sys@orcl@orcl> exec CBLEILE.drop_DB_LINK;
 
PL/SQL procedure successfully completed.
 
sys@orcl@orcl> drop procedure CBLEILE.drop_DB_LINK;
 
Procedure dropped.
 
sys@orcl@orcl> -- Seperator --
sys@orcl@orcl> create or replace procedure CBLEILE1.drop_DB_LINK as begin
  2  execute immediate 'drop database link PDB3';
  3  end;
  4  /
 
Procedure created.
 
sys@orcl@orcl> exec CBLEILE1.drop_DB_LINK;
 
PL/SQL procedure successfully completed.
 
sys@orcl@orcl> drop procedure CBLEILE1.drop_DB_LINK;
 
Procedure dropped.
 
sys@orcl@orcl> -- Seperator --
sys@orcl@orcl> 
sys@orcl@orcl> select count(*) from dba_objects where status='INVALID';
 
  COUNT(*)
----------
   1
 
1 row selected.
 
sys@orcl@orcl> 
sys@orcl@orcl> set echo off
sys@orcl@orcl> 
sys@orcl@orcl> select owner, db_link from dba_db_links;

OWNER				 DB_LINK
-------------------------------- --------------------------------
SYS				 SYS_HUB

1 row selected.

A log-file drop_db_links_20200509_234906.log has been produced as well.

After dropping all db-links you may do the following checks as well before releasing the cloned database for the testers or the developers:

  • disable all jobs owned by not Oracle maintained users. You may use the following SQL to generate the commands in sqlplus:

select 'exec dbms_scheduler.disable('||''''||owner||'.'||job_name||''''||');' from dba_scheduler_jobs where enabled='TRUE' and owner not in (select username from dba_users where oracle_maintained='Y');
  • check all directories in the DB and make sure the directory-paths do not point to shared production folders

column owner format a32
column directory_name format a32
column directory_path format a64
select owner, directory_name, directory_path from dba_directories order by 1;
  • mask sensitive data, which should not be visible to testers and/or developers.

At that point you are quite sure to not affect production data with your cloned database and you can set
job_queue_processes>0
again and provide access to the cloned database to the testers and/or developers.


APEX Connect 2020 – Day 2


For the second and last virtual conference day, I decided to attend presentations on following topics:
– Universal Theme new features
– Oracle APEX Source Code Management and Release Lifecycle
– Why Google Hates My APEX App
– We ain’t got no time! Pragmatic testing with utPLSQL
– Why APEX developers should know FLASHBACK
– ORDS – Behind the scenes … and more!
and the day ended with a keynote from Kellyn Pot’Vin-Gorman about “Becoming – A Technical Leadership Story”

Universal Theme new features

What is the Universal Theme (UT)?
It is the user interface of APEX, integrated since APEX version 5.0 and also known as Theme 42 (“Answer to the Ultimate Question of Life, the Universe, and Everything” – The Hitchhiker’s Guide to the Galaxy by Douglas Adams)
New features introduced with UT:
– Template options
– Font APEX
– Full modal dialog page
– Responsive design
– Mobile support
With APEX 20.1, released in April, comes a new version 1.5 of the UT. With that new version, various other components related to it, like the jQuery libraries, OracleJET and Font APEX, have changed, so check the release notes.
One of the most relevant new features is the Mega Menu, introducing a new navigation style useful if you need to maximize the display of your application pages. You can check the UT sample app embedded in APEX to test it.
Some other changes are:
– Theme Roller enhancement
– Application Builder Redwood UI
– Interactive Grid with Control Break editing
– Friendly URL
Note also that Font Awesome is no longer natively supported (since APEX 19.2) so consider moving to Font APEX.
You can find more about UT online with the dedicated page.

Oracle APEX Source Code Management and Release Lifecycle

Source code management with APEX is always a challenging question for developers used to working with other programming languages and source code version control systems like GitHub.
There are different aspects to be considered like:
– 1 central instance for all developers or 1 local instance for each developer?
– Export full application or export pages individually?
– How to best automate application exports?
There is no universal answers to them. This must be considered based on the size of the development team and the size of the project.
There are different tools provided by APEX to manage export of the applications:
– ApexExport java classes
– Page UI
– APEX_EXPORT package
– SQLcl
But you need to be careful about workspace and application IDs when you run multiple instances.
Don’t forget that merging changes is not supported in APEX!
You should have a look at the Oracle APEX life cycle management White Paper for further insight.
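
As a small example of the SQLcl option (the application ID is a placeholder and option names may vary between SQLcl versions):

-- connected with SQLcl to the parsing schema of the workspace
apex export -applicationid 100 -split

The -split option writes each page and shared component into its own file, which is friendlier for version control.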

Why Google Hates My APEX App

When publishing a public web site, Google provides different tools to help you get more out of it, based on:
– Statistics
– Promotion
– Search
  • Ads (to get money back)
When checking Google Analytics for the statistics of an APEX application, you realize that the outcome doesn’t really reflect the content of the APEX application, especially in terms of pages. This is mainly due to the way APEX manages page parameters in the f?p= procedure call. That call is much different from standard URLs, where parameters are given by “&” (which the Google tools are looking for) and not “:”.
Let’s hope this is going to improve with the new Friendly URL feature introduced in APEX 20.1.

We ain’t got no time! Pragmatic testing with utPLSQL

Unit testing should be considered right from the beginning while developing new PL/SQL packages.
utPLSQL is an open source PL/SQL framework that can help you unit test your code. Tests created as part of the development process deliver value during implementation.
What are the criteria of choice for test automation?
– Risk
– Value
– Cost efficiency
– Change probability
Unit testing can be integrated into test automation which is of great value in the validation of your application.
If you want to know more about test automation you can visit the page of Angie Jones.
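
A minimal utPLSQL v3 sketch, with hypothetical package and function names, looks like this:

create or replace package test_pricing as
  --%suite(Pricing rules)

  --%test(Standard discount is applied)
  procedure standard_discount;
end;
/
create or replace package body test_pricing as
  procedure standard_discount is
  begin
    -- pricing.discounted_price is the (assumed) code under test
    ut.expect(pricing.discounted_price(100)).to_equal(90);
  end;
end;
/
-- run the suite
exec ut.run('test_pricing')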

Why APEX developers should know FLASHBACK

For most people Flashback is an emergency procedure, but it is much more than that!
APEX developers know about flashback thanks to the “restore as of” functionality on pages in the App Builder.
Flashback is provided at different levels:

  1. Flashback query: allows to restore data associated to a specific query based on the SCN. This can be useful for unit testing.
  2. Flashback session: allows to flashback all queries of the session. By default up to 900 seconds in the past (undo retention parameter).
  3. Flashback transaction: allows to rollback committed transaction thanks to transaction ID (XID) with dbms_transaction.undo_transaction
  4. Flashback drop: allows to recover dropped objects thanks to the user recycle bin. Deleted objects are kept until their space is needed again (advice: keep 20% of free space). BEWARE! This does not work for truncated objects.
  5. Flashback table: allows to recover a table to a given point in time. Only applicable for data and cannot help in case of DDL or drop.
  6. Flashback database: allows to restore the database to a given point in time based on restore points. This is only for DBAs. This can be useful to rollback an APEX application deployment as a lot of objects are changed. As it works with pluggable databases it can be used to produce copies to be distributed to individual XE instances for multiple developers.
  7. Data archive: allows to recover based on audit history. It’s secure and efficient and can be imported from existing application audits. It’s now FREE (unless using compression option).

The different flashback options can be used to roll back mistakes, but not only that: they can also be used for unit testing or for reproducing issues. Nevertheless you should always be careful when using commands like DROP and even more so TRUNCATE.
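
For instance, items 1 and 5 above look like this in SQL (table name and timestamps are arbitrary):

-- Flashback query: read the data as it was 10 minutes ago
select * from emp as of timestamp systimestamp - interval '10' minute;

-- Flashback table (requires row movement to be enabled)
alter table emp enable row movement;
flashback table emp to timestamp systimestamp - interval '10' minute;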

ORDS – Behind the scenes … and more!

ORDS provides multiple functionalities:
– RESTful services for the DB
– Web Listener for APEX
– Web Client for the DB
– DB Management REST API
– Mongo style API for the DB
Regarding APEX Web Listener, EPG and mod_plsql are deprecated so ORDS is the only option for the future.
ORDS integrates into different architectures allowing to provide isolation like:
– APEX application isolation
– REST isolation
– LB whitelists
With APEX there are 2 options to use RESTful services:
– Auto REST
– ORDS RESTful services
Developers can choose the best-suited one according to their needs.
The most powerful feature is REST enabled SQL.

Becoming – A Technical Leadership Story

Being a leader is defined by different streams:
-Influencing others
-Leadership satisfaction
-Technical leadership
and more…
A couple of thoughts to be kept:
– Leaders are not always managers.
– Mentors are really important because they talk to you, whereas sponsors talk about you.
– Communication is more than speaking
But what is most important from my point of view is caring about others. How about you?

Thanks to the virtualization of the conference all the presentations have been recorded, so stay tuned on DOAG and you will be able to see those and much more! Take some time and watch as much as possible because everything is precious learning. Thanks a lot to the community.
Keep sharing and enjoy APEX!

Cet article APEX Connect 2020 – Day 2 est apparu en premier sur Blog dbi services.


Oracle Materialized View Refresh : Fast or Complete ?


Contrary to views, materialized views avoid executing the SQL query for every access by storing the result set of the query.

When a master table is modified, the related materialized view becomes stale and a refresh is necessary to have the materialized view up to date.

I will not present the materialized view concepts here; the Oracle Data Warehousing Guide is perfect for that.

I will show you, from a real user case, all the steps you have to follow to investigate and tune your materialized view refresh.

And, as very often in performance and tuning tasks, most of the performance issues come from the way the SQL is written and designed (here the SQL statement loading the materialized view).

 

First of all, I’m saying that spending almost 50 mins (20% of my DWH load) refreshing materialized views is too much:

The first step is to check which materialized view has the highest refresh time :

SELECT * 
FROM (
      SELECT OWNER,
             MVIEW_NAME,
             CONTAINER_NAME,
             REFRESH_MODE,
             REFRESH_METHOD,
             LAST_REFRESH_TYPE,
             STALENESS,
             ROUND((LAST_REFRESH_END_TIME-LAST_REFRESH_DATE)*24*60,2) as REFRESH_TIME_MINS
       FROM ALL_MVIEWS
       WHERE LAST_REFRESH_TYPE IN ('FAST','COMPLETE')
      )
ORDER BY REFRESH_TIME_MINS DESC;

OWNER   MVIEW_NAME                      CONTAINER_NAME                  REFRESH_MODE  REFRESH_METHOD LAST_REFRESH_TYPE STALENESS      REFRESH_TIME_MINS
------- ------------------------------- ------------------------------- ------------- -------------- ----------------- --------------------------------
SCORE   MV$SCORE_ST_SI_MESSAGE_HISTORY   MV$SCORE_ST_SI_MESSAGE_HISTORY DEMAND        FAST           FAST              FRESH          32.52
SCORE   MV$SCORE_ST_SI_MESSAGE           MV$SCORE_ST_SI_MESSAGE         DEMAND        FAST           FAST              FRESH          16.38
SCORE   MV$SC1_MYHIST2_STOP              MV$SC1_MYHIST2_STOP            DEMAND        FORCE          COMPLETE          NEEDS_COMPILE  .03
SCORE   MV$SC1_MYHIST2_START             MV$SC1_MYHIST2_START           DEMAND        FORCE          COMPLETE          NEEDS_COMPILE  .03
SCORE   MV$SC1_RWQ_FG_TOPO               MV$SC1_RWQ_FG_TOPO             DEMAND        FORCE          COMPLETE          NEEDS_COMPILE  .02

Almost all of the refresh time comes from two mviews: MV$SCORE_ST_SI_MESSAGE_HISTORY and MV$SCORE_ST_SI_MESSAGE.

Thanks to the columns ALL_MVIEWS.LAST_REFRESH_DATE and ALL_MVIEWS.LAST_REFRESH_END_TIME, we get the SQL statements and the execution plans related to the refresh operation:

The first operation is a “Delete“:

The second operation is an “Insert“:

Let’s extract the PL/SQL procedure doing the refresh used by the ETL tool :

dbms_mview.refresh('SCORE.'||l_mview||'','?',atomic_refresh=>FALSE);
--'?' = Force : If possible, a fast refresh is attempted, otherwise a complete refresh.

Given that, here are all the questions which come to my mind:

  1. My materialized view can be fast-refreshed, so why does it take more than 48 mins to refresh?
  2. With atomic_refresh set to FALSE, Oracle normally optimizes the refresh by using parallel DML and truncate DDL, so why is a “Delete” operation done instead of a much faster “Truncate”?

To answer the first point and be sure that my materialized view can be fast refreshed, we can also use the explain_mview procedure and check the capability_name called “REFRESH_FAST”:

SQL> truncate table mv_capabilities_table;

Table truncated.

SQL> exec dbms_mview.explain_mview('MV$SCORE_ST_SI_MESSAGE_HISTORY');

PL/SQL procedure successfully completed.

SQL> select capability_name,possible,related_text,msgtxt from mv_capabilities_table;

CAPABILITY_NAME                POSSIBLE             RELATED_TEXT            MSGTXT
------------------------------ -------------------- ----------------------- ----------------------------------------------------------------------------
PCT                            N
REFRESH_COMPLETE               Y
REFRESH_FAST                   Y
REWRITE                        Y
PCT_TABLE                      N                    ST_SI_MESSAGE_HISTORY_H relation is not a partitioned table
PCT_TABLE                      N                    ST_SI_MESSAGE_HISTORY_V relation is not a partitioned table
PCT_TABLE                      N                    DWH_CODE                relation is not a partitioned table
REFRESH_FAST_AFTER_INSERT      Y
REFRESH_FAST_AFTER_ONETAB_DML  Y
REFRESH_FAST_AFTER_ANY_DML     Y
REFRESH_FAST_PCT               N											PCT is not possible on any of the detail tables in the materialized view
REWRITE_FULL_TEXT_MATCH        Y
REWRITE_PARTIAL_TEXT_MATCH     Y
REWRITE_GENERAL                Y
REWRITE_PCT                    N											general rewrite is not possible or PCT is not possible on any of the detail tables
PCT_TABLE_REWRITE              N                    ST_SI_MESSAGE_HISTORY_H relation is not a partitioned table
PCT_TABLE_REWRITE              N                    ST_SI_MESSAGE_HISTORY_V relation is not a partitioned table
PCT_TABLE_REWRITE              N                    DWH_CODE				relation is not a partitioned table

18 rows selected.

Let’s try to force a complete refresh with atomic_refresh set to FALSE in order to check if the “Delete” operation is replaced by a “Truncate” operation:

Now we have a “Truncate“:

--c = complete refresh
 dbms_mview.refresh('SCORE.'||l_mview||'','C',atomic_refresh=>FALSE);

Plus an “Insert” :

Let’s check now the refresh time :

SELECT * 
FROM ( SELECT OWNER, 
			  MVIEW_NAME, 
			  CONTAINER_NAME, 
			  REFRESH_MODE, 
			  LAST_REFRESH_TYPE, 
			  STALENESS, 
			  round((LAST_REFRESH_END_TIME-LAST_REFRESH_DATE)*24*60,2) as REFRESH_TIME_MINS 
	   FROM ALL_MVIEWS 
	   WHERE LAST_REFRESH_TYPE IN ('FAST','COMPLETE')
	 ) 
ORDER BY REFRESH_TIME_MINS DESC;

OWNER   MVIEW_NAME                       CONTAINER_NAME                   REFRESH_MODE LAST_REFRESH_TYPE STALENESS           REFRESH_TIME_MINS
------- -------------------------------- -------------------------------- ------------ ----------------- -------------------------------------
SCORE   MV$SCORE_ST_SI_MESSAGE           MV$SCORE_ST_SI_MESSAGE           FAST         COMPLETE 	 FRESH                            6.75
SCORE   MV$SCORE_ST_SI_MESSAGE_HISTORY   MV$SCORE_ST_SI_MESSAGE_HISTORY   FAST         COMPLETE 	 FRESH                               1

 

Conclusion (for my environment) :

  • The “Complete” refresh (7.75 mins) is much faster than the “Fast” refresh (48.9 mins),
  • The parameter “atomic_refresh=FALSE” works only with a “complete” refresh, so a “truncate” is only possible with “complete“.
  • It’s not a surprise to have “Complete” faster than “Fast” since the materialized views are truncated instead of being deleted.

Now, I want to understand why the “Fast” refresh is so long (48.9 mins).

In order to be fast refreshed, a materialized view requires materialized view logs storing the modifications propagated from the base tables to the container table (a regular table with the same name as the materialized view, which stores the result set returned by the query).
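For reference, a materialized view log is created on each base table with a statement like the following sketch (the listed columns are illustrative; the real log must record the columns referenced by the materialized view):

CREATE MATERIALIZED VIEW LOG ON score.st_si_message_history_h
  WITH ROWID, SEQUENCE (message_id, message_status)
  INCLUDING NEW VALUES;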

Let’s check the base tables used in the SQL statement loading the materialized view:

Focus on the table names after the “FROM” clause:

  • ST_SI_MESSAGE_HISTORY_H
  • ST_SI_MESSAGE_HISTORY_V
  • DWH_CODE

Let’s check the number of rows in each source table:

SQL> SELECT 'DWH_CODE' as TABLE_NAME,NUM_ROWS FROM USER_TABLES WHERE TABLE_NAME = 'DWH_CODE'
  2  UNION ALL
  3  SELECT 'ST_SI_MESSAGE_HISTORY_H' as TABLE_NAME,NUM_ROWS FROM USER_TABLES WHERE TABLE_NAME = 'ST_SI_MESSAGE_HISTORY_H'
  4  UNION ALL
  5  SELECT 'ST_SI_MESSAGE_HISTORY_V' as TABLE_NAME,NUM_ROWS FROM USER_TABLES WHERE TABLE_NAME = 'ST_SI_MESSAGE_HISTORY_V';

TABLE_NAME                NUM_ROWS
----------------------- ----------
DWH_CODE                         1
ST_SI_MESSAGE_HISTORY_H    4801733
ST_SI_MESSAGE_HISTORY_V    5081578

 

To be fast refreshed, the MV$SCORE_ST_SI_MESSAGE_HISTORY materialized view requires materialized logs on the ST_SI_MESSAGE_HISTORY_H, ST_SI_MESSAGE_HISTORY_V and DWH_CODE tables:

SQL> SELECT LOG_OWNER,MASTER,LOG_TABLE
  2  FROM all_mview_logs
  3  WHERE MASTER IN ('DWH_CODE','ST_SI_MESSAGE_H','ST_SI_MESSAGE_V');

LOG_OWNER   MASTER                 LOG_TABLE
----------- ------------------ ----------------------------
SCORE       ST_SI_MESSAGE_V        MLOG$_ST_SI_MESSAGE_V
SCORE       ST_SI_MESSAGE_H        MLOG$_ST_SI_MESSAGE_H
SCORE       DWH_CODE               MLOG$_DWH_CODE

 

As the materialized view logs contain only the modifications coming from the base tables, let’s check their contents (number of modified rows) just before executing the fast refresh:

SQL> SELECT
  2              owner,
  3              mview_name,
  4              container_name,
  5              refresh_mode,
  6              last_refresh_type,
  7              staleness
  8          FROM
  9              all_mviews
 10          WHERE
 11              last_refresh_type IN (
 12                  'FAST',
 13                  'COMPLETE'
 14              )
 15  ;

OWNER   MVIEW_NAME                     CONTAINER_NAME                 REFRESH_MODE LAST_REFRESH_TYPE STALENESS
------- ------------------------------ ---------------------------------------------------------------------------
SCORE   MV$SCORE_ST_SI_MESSAGE         MV$SCORE_ST_SI_MESSAGE         DEMAND       COMPLETE          NEEDS_COMPILE
SCORE   MV$SCORE_ST_SI_MESSAGE_HISTORY MV$SCORE_ST_SI_MESSAGE_HISTORY DEMAND       COMPLETE 	     NEEDS_COMPILE

 

STALENESS = NEEDS_COMPILE means the materialized view needs to be refreshed because the base tables have been modified. That is expected since we stopped the ETL process just before the execution of the refresh mview procedure, in order to see the content of the mview logs.

The contents of materialized view logs are :

SQL> SELECT * FROM "SCORE"."MLOG$_DWH_CODE";

M_ROW$$               SNAPTIME$ DMLTYPE$$ OLD_NEW$$ CHANGE_VECTOR$$  XID$$
--------------------- --------- --------- --------- ---------------- ----------------
AAAbvUAAUAAABtjAAA    01-JAN-00 U 		  U 		02   1125921382632021
AAAbvUAAUAAABtjAAA    01-JAN-00 U 		  N 		02   1125921382632021

SQL> SELECT * FROM "SCORE"."MLOG$_ST_SI_MESSAGE_V";

no rows selected

SQL> SELECT * FROM "SCORE"."MLOG$_ST_SI_MESSAGE_H";
no rows selected 
SQL>

I’m a little bit surprised because:

  • Given my refresh time, I expected a lot of modifications coming from the big tables ST_SI_MESSAGE_V (5081578 rows) and ST_SI_MESSAGE_H (4801733 rows) instead of DWH_CODE (1 row).

After analyzing the ETL process, it appears that only this table (DWH_CODE) is modified every day with the sysdate. This table is a metadata table which contains only one row identifying the loading date.

If we check the SQL statement loading the materialized view, this table is used to populate the column DWH_PIT_DATE (see print screen above).

But since this table is joined with ST_SI_MESSAGE_H and ST_SI_MESSAGE_V, the Oracle optimizer must do a full scan on the materialized view MV$SCORE_ST_SI_MESSAGE_HISTORY (more than 500K rows) to populate each row with exactly the same value:

SQL> select distinct dwh_pit_date from score.mv$score_st_si_message_history;

DWH_PIT_DATE
------------
09-MAY-20

It makes no sense to have a column that always holds the same value: we definitely have a materialized view design problem here. Whatever refresh mode is used, “Complete” or “Fast”, every row of the materialized view has to be rewritten just to populate the column DWH_PIT_DATE.

To solve this issue, let’s check the materialized view dependencies:

SQL> SELECT DISTINCT NAME
  2  FROM ALL_DEPENDENCIES
  3  WHERE TYPE = 'VIEW'
  4  AND REFERENCED_OWNER = 'DWH_LOAD'
  5  AND REFERENCED_NAME IN ('MV$SCORE_ST_SI_MESSAGE','MV$SCORE_ST_SI_MESSAGE_HISTORY')
  6  ORDER BY NAME;

NAME
--------------------------------------------------------------------------------
V$LOAD_CC_DMSG
V$LOAD_CC_FMSG
V$LOAD_CC_MSG_ACK
V$LOAD_CC_MSG_BOOKEDIN_BY
V$LOAD_CC_MSG_PUSHEDBACK
V$LOAD_CC_MSG_SCENARIO
V$LOAD_CC_MSG_SOURCE
V$LOAD_CC_MSG_SRC
V$LOAD_CC_MSG_TRIAGED_BY

9 rows selected.

SQL>

In my environment, only these objects (Oracle views) use the materialized views, so I can safely remove the column DWH_CODE.DWH_PIT_DATE (the column, not the join with the table DWH_CODE) from the materialized views and move it to the dependent objects.
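As an illustration of the change, here is a simplified, hypothetical sketch of one dependent view after the modification, assuming DWH_CODE exposes the loading date in a column named DWH_PIT_DATE (the real views obviously join more tables):

-- The point-in-time date is now added at query time by the dependent view,
-- instead of being stored in every row of the materialized view
CREATE OR REPLACE VIEW v$load_cc_dmsg AS
SELECT m.*,
       c.dwh_pit_date
FROM   mv$score_st_si_message_history m
       CROSS JOIN dwh_code c;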

After this design modification, let’s execute the refresh and check the refresh time:

SQL> SELECT *
  2  FROM ( SELECT OWNER,
  3    MVIEW_NAME,
  4    CONTAINER_NAME,
  5    REFRESH_MODE,
  6    REFRESH_METHOD,
  7    LAST_REFRESH_TYPE,
  8    STALENESS,
  9    ROUND((LAST_REFRESH_END_TIME-LAST_REFRESH_DATE)*24*60,2) as REFRESH_TIME_MINS
 10     FROM ALL_MVIEWS
 11     WHERE LAST_REFRESH_TYPE IN ('FAST','COMPLETE')
 12   )
 13  ORDER BY REFRESH_TIME_MINS DESC;

OWNER     MVIEW_NAME                            CONTAINER_NAME                       REFRES REFRESH_ LAST_REF STALENESS           REFRESH_TIME_MINS
--------- --------------------------------- ------------------------------------ ------ -------- -------- ------------------- ---------------------
SCORE     MV$SCORE_ST_SI_MESSAGE                MV$SCORE_ST_SI_MESSAGE               DEMAND FAST     COMPLETE FRESH                            1.58
SCORE     MV$SCORE_ST_SI_MESSAGE_HISTORY        MV$SCORE_ST_SI_MESSAGE_HISTORY       DEMAND FAST     COMPLETE FRESH                             .28

 

The refresh time (1.86 mins) is faster than the previous one (7.75 mins) and now the Oracle optimizer no longer full scans the materialized view to populate each row with the same value (DWH_CODE.DWH_PIT_DATE).

Conclusion :

  • We have reduced the refresh time from 50 mins to 1.86 mins.
  • A fast refresh is not always faster than a complete refresh; it depends on the SQL statement loading the view and on the number of rows propagated from the base tables to the container tables through the materialized view logs.
  • To decrease the refresh time, acting only on the refresh options (fast, complete, index, etc.) is not enough; we also have to analyze and modify the SQL statement loading the materialized view.
  • If you have a design problem, never be afraid to modify the SQL statement and even some parts of your architecture (like the dependent objects here). Of course you have to know very well the impact on your application and on your ETL process.

Cet article Oracle Materialized View Refresh : Fast or Complete ? est apparu en premier sur Blog dbi services.

How to install MABS (Microsoft Azure Backup Server) and use it to protect on-premise virtual machines


In my last blog posts I tested the MARS (Microsoft Azure Recovery Services) agent through Windows Admin Center and showed how to use Azure Backup to recover files/folders or the System State of protected servers.
To continue my overview of Azure protection possibilities, I will today install a MABS on one of my on-premises virtual servers and use it to protect my other on-premises virtual servers.

To do so, go to your Azure Recovery Services vault. A vault is a place in Azure where backups will be stored. The Recovery Services vault gives the possibility to manage backup servers, monitor backups, manage access to those backups and also specify how the vault will be replicated.
In your Recovery Services vault just go to Getting Started, Backup and select where your workload is running, here on-premises, and what you want to back up, here Hyper-V Virtual Machines and Microsoft SQL Server. This selection can be changed later when you protect your virtual machines. Press the Prepare infrastructure button:

Once done, you have to download the installation files for the MABS and also the credentials to connect your MABS to your Azure Recovery Services vault:

The next step is to copy the downloaded files to your future MABS server and start the installation.
This server should meet some minimum requirements:

  • Two cores and 8GB RAM
  • Windows server 2016 or 2019 64 bits
  • Part of a domain
  • .NET 3.5 has to be installed if you want to install SQL Server (will be required during the installation)

If requirements are fulfilled you can start the installation:

After a prerequisite check, the wizard asks you if you want to install a new SQL Server 2017 instance which will be used with MABS (Reporting Services will also be installed for reporting capabilities) or use an existing one. Here we will install a new one:

Installation settings will be displayed afterwards with space requirements and the possibility to change the file locations for the MABS. A password will also be asked for the SQL Server service accounts created during the SQL Server installation. A summary of the future installation is displayed and the install can start by first checking that the necessary Windows features are installed:

The next step is to connect your MABS to your Recovery Services vault; you have to use the vault credential file downloaded before:

Backups are encrypted for security reasons: a passphrase has to be typed or generated and has to be saved in a secure location like a local KeePass or Azure Key Vault. Once done, the server is registered with Microsoft Azure Backup and the installation can start:

At the end all should be green. I had an issue when I tried to use an existing SQL Server instance during another installation: the problem came from the fact that my instance didn’t have the SQL client tools installed, so if you use an existing SQL instance check this beforehand, otherwise you will have to redo the complete installation. But here the installation succeeded:

After installation you should find a shortcut on your desktop to open your MABS, let’s open it and go to the Management part in order to add new on-premise servers to your MABS to back up their workload:

After you click on the Add button a wizard appears. You can select the type of server you want to protect, here Windows Servers and the deployment method you will use to deploy the agent, here agent installation:

A list of servers located in the same domain as your MABS is displayed and you can add the servers where you want to deploy the agent:

Next you have to specify administrator credentials for the servers where you want to install the agent. DPM (Data Protection Manager, in fact your MABS) will use those credentials to install the agent on the servers. A restart method is also requested in case a restart is necessary after the agent installation. Once those steps are finished, the protection agents are installed on the servers:

When the installation completes successfully, you can see your selected servers in your MABS:

Those servers are marked as unprotected because for the moment you haven’t created any rules to protect the workload running on them.
It will be the goal of my next blog-post.
See you soon!

Cet article How to install MABS (Microsoft Azure Backup Server) and use it to protect on-premise virtual machines est apparu en premier sur Blog dbi services.

SQL Server 2019: Copy files through SQL Server


Three new interesting extended stored procedures come with SQL Server 2019.
I was very interested to discover these new stored procedures:

  • Sys.xp_copy_file copies a specific file from one folder to another
    • Syntax: exec master.sys.xp_copy_file ‘source folder+file’, ‘destination folder+file’
    • Example:
      exec master.sys.xp_copy_file 'C:\Temp\Sources\File1.txt', 'C:\Temp\Destinations\File1.txt'

Before the command:

After the command:


As you can see in my test, you get these 2 pieces of information which indicate the success of the query:
Commands completed successfully.
Time: 0.315s

  • Sys.xp_copy_files copies multiple files, through the wildcard character, from one folder to another
    • Syntax: exec master.sys.xp_copy_files ‘source folder+files’, ‘destination folder’
    • Example:
      exec master.sys.xp_copy_files 'C:\Temp\Sources\File*.txt', 'c:\Temp\Destinations\'

Before the command:

After the command:

  • Sys.xp_delete_files deletes a specific file or multiple files, through the wildcard character, in a folder
    • Syntax: exec master.sys.xp_delete_files ‘source folder+files’
    • Example:
      exec master.sys.xp_delete_files 'c:\Temp\Destinations\File*.txt'

Before the command:

After the command:

I went a little further to see how it reacts if the file already exists or when we update the source.
I first copy the file 'C:\Temp\Sources\file1.txt' to 'C:\Temp\Destinations\file1.txt'.
It has one line: “first Line”.

I add a second line, “second line”, to the source file.
I copy the file again and here is the result…

As you can see, the second line is in the destination.
Without warning or question, the file is overwritten.

The funny part is that when you create a file manually (file6.txt) and want to delete it through SQL Server, it’s not possible.

Only files where the SQL Server account is the owner will be deleted…

Before SQL Server 2019, to do these operations, we used xp_cmdshell. Now with this new version, we can easily copy or delete files in a folder.
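For comparison, the pre-2019 way of doing the same copy typically relied on xp_cmdshell (a sketch, assuming xp_cmdshell is enabled on the instance):

-- Old approach: shell out to the OS copy command through xp_cmdshell
EXEC master.sys.xp_cmdshell 'copy C:\Temp\Sources\File1.txt C:\Temp\Destinations\File1.txt';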
It’s very good and useful! A customer already asked me to implement it…
See you!

Cet article SQL Server 2019: Copy files through SQL Server est apparu en premier sur Blog dbi services.

20c: AWR now stores explain plan predicates


By Franck Pachot

.
In a previous post https://blog.dbi-services.com/awr-dont-store-explain-plan-predicates/ I explained this limitation in gathering filter and access predicates by Statspack and then AWR, because of old bugs about reverse parsing of predicates. Oracle listens to its customers through support (enhancement requests), through the community (votes on database ideas), and through the product managers who participate in user groups and the ACE program. And here it is: in 20c the predicates are collected by AWR and visible with DBMS_XPLAN and AWRSQRPT reports.

I’ll test with a very simple query:


set feedback on sql_id echo on pagesize 1000

SQL> select * from dual where ascii(dummy)=42;

no rows selected

SQL_ID: g4gx2zqbkjwh1

I used the “FEEDBACK ON SQL” feature to get the SQL_ID.

Because this query is fast, it will not be gathered by AWR except if I ‘color’ it:


SQL> exec dbms_workload_repository.add_colored_sql('g4gx2zqbkjwh1');

PL/SQL procedure successfully completed.

Coloring a statement is the AWR feature to use when you want a statement to be always gathered, for example when you have optimized it and want to compare the statistics.

Now running the statement between two snapshots:


SQL> exec dbms_workload_repository.create_snapshot;

PL/SQL procedure successfully completed.

SQL> select * from dual where ascii(dummy)=42;

no rows selected

SQL> exec dbms_workload_repository.create_snapshot;

PL/SQL procedure successfully completed.

Here, I’m sure it has been gathered.

Now checking the execution plan:


SQL> select * from dbms_xplan.display_awr('g4gx2zqbkjwh1');

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID g4gx2zqbkjwh1
--------------------
select * from dual where ascii(dummy)=42

Plan hash value: 272002086

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |     2 (100)|          |
|*  1 |  TABLE ACCESS FULL| DUAL |     1 |     2 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ASCII("DUMMY")=42)


18 rows selected.

Here I have the predicate. This is a silly example but the predicate information is very important when looking at a large execution plan trying to understand the cardinality estimation or the reason why an index is not used.

Of course, this is also visible from the ?/rdbms/admin/awrsqrpt report:
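For reference, the report is generated interactively from SQL*Plus, roughly like this (the script prompts for the snapshot range and the SQL_ID):

SQL> @?/rdbms/admin/awrsqrpt
-- The script then asks for the report type (html or text), the number of days of
-- snapshots to list, the begin and end snapshot IDs, the SQL_ID (here g4gx2zqbkjwh1)
-- and the report file name, and generates the report.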

What if you upgrade?

AWR gathers the SQL plan only when it is not already there. Then, when we upgrade to 20c, only the new plans will get the predicates. Here is an example where I simulate the pre-20c behaviour with “_cursor_plan_unparse_enabled”=false:


SQL> alter session set "_cursor_plan_unparse_enabled"=false;

Session altered.

SQL> exec dbms_workload_repository.add_colored_sql('g4gx2zqbkjwh1');

PL/SQL procedure successfully completed.

SQL> exec dbms_workload_repository.create_snapshot;

PL/SQL procedure successfully completed.

SQL> select * from dual where ascii(dummy)=42;

no rows selected

SQL> exec dbms_workload_repository.create_snapshot;

PL/SQL procedure successfully completed.

SQL> select * from dbms_xplan.display_awr('g4gx2zqbkjwh1');

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID g4gx2zqbkjwh1
--------------------
select * from dual where ascii(dummy)=42

Plan hash value: 272002086

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |     2 (100)|          |
|   1 |  TABLE ACCESS FULL| DUAL |     1 |     2 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

13 rows selected.

No predicate here. Even if I re-connect to reset the “_cursor_plan_unparse_enabled”:


SQL> connect / as sysdba
Connected.
SQL> exec dbms_workload_repository.create_snapshot;

PL/SQL procedure successfully completed.

SQL> select * from dual where ascii(dummy)=42;

no rows selected

SQL> exec dbms_workload_repository.create_snapshot;

PL/SQL procedure successfully completed.

SQL> select * from dbms_xplan.display_awr('g4gx2zqbkjwh1');

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID g4gx2zqbkjwh1
--------------------
select * from dual where ascii(dummy)=42

Plan hash value: 272002086

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |     2 (100)|          |
|   1 |  TABLE ACCESS FULL| DUAL |     1 |     2 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

13 rows selected.

This will be the situation after upgrade.

If you want to re-gather all sql_plans, you need to purge the AWR repository:


SQL> execute dbms_workload_repository.drop_snapshot_range(1,1e22);

PL/SQL procedure successfully completed.

SQL> execute dbms_workload_repository.purge_sql_details();

PL/SQL procedure successfully completed.

SQL> commit;

This clears everything, so I do not recommend doing that at the same time as the upgrade, as you may want to compare some performance figures with the past. Anyway, we have time and maybe this fix will be backported to 19c.

There is very little chance that this fix is ported to Statspack, but you can do it yourself as I mentioned in http://viewer.zmags.com/publication/dd9ed62b#/dd9ed62b/36 (“on Improving Statspack Experience”) with something like:


sed -i -e 's/ 0 -- should be//' -e 's/[(]2254299[)]/--&/' $ORACLE_HOME/rdbms/admin/spcpkg.sql

Cet article 20c: AWR now stores explain plan predicates est apparu en premier sur Blog dbi services.

Oracle Database Appliance: which storage capacity to choose?


Introduction

If you’re considering ODA for your next platform, you surely already appreciate the simplicity of the offer. 3 models with few options, this is definitely easy to choose from.

One of the other benefits is the 5-year hardware support, and combined with software updates generally available for ODAs up to 7 years old, you can keep your ODA running even longer for non-critical databases and/or if you have a strong Disaster Recovery solution (including Data Guard or Dbvisit standby). Some of my customers are still using X4-2 models and are confident in their ODAs because they have been quite reliable across the years.

Models and storage limits

One of the main drawbacks of the ODA: it doesn’t have unlimited storage. Disks are local NVMe SSDs (or in a dedicated enclosure), and it’s not possible (technically possible but not recommended) to add storage through a NAS connection.

3 ODA models are available: X8-2S and X8-2M are one-node ODAs, and X8-2HA is a two-node ODA with DAS storage including SSD and/or HDD (High Performance or High Capacity version).

Please refer to my previous blog post for more information about the current generation.

Storage on ODA is always dedicated to database related files: datafiles, redologs, controlfiles, archivelogs, flashback logs, backups (if you do it locally on ODA), etc. Linux system, Oracle products (Grid Infrastructure and Oracle database engines), home folders and so on reside on internal M2 SSD disks large enough for a normal use.

X8-2S/X8-2M storage limit

ODA X8-2S is the entry-level ODA. It only has one CPU, but with 16 powerful cores and 192GB of memory it’s anything but a low-end server. 10 empty storage slots are available in the front panel but don’t expect to extend the storage: this ODA is delivered with 2 disks and doesn’t support adding more. That’s it. With the two 6.4TB disks, you’ll have a RAW capacity of 12.8TB.

ODA X8-2M is much more capable than its little brother. Physically identical to the X8-2S, it has two CPUs and twice the amount of RAM. This 32-core server fitted with 384GB of RAM is a serious player. It’s still delivered with two 6.4TB disks but unlike the S version, all the 10 empty storage slots can be populated to reach a stunning 76.8TB of RAW storage. This is still not unlimited, but the limit is actually quite high. Disks can be added in pairs, so you can have 2-4-6-8-10-12 disks for various configurations and a maximum of 76.8TB RAW capacity. Only disks dedicated to the ODA are suitable, and don’t expect to put in bigger disks as it only supports the same 6.4TB disks as those delivered with the base server.

RAW capacity means without redundancy, and you will lose half of the capacity with ASM redundancy. It’s not possible to run an ODA without redundancy, in case you were thinking about that. ASM redundancy is the only way to secure data, as there is no RAID controller inside the server. You already know that disk capacity and real capacity always differ, so Oracle included several years ago in the documentation the usable capacity depending on your configuration. The usable capacity includes reserved space for a single disk failure (15% starting from 4 disks).

On base ODAs (X8-2S and X8-2M with 2 disks only), the usable storage capacity is actually 5.8TB and no space is reserved for a disk failure. If a disk fails, there is no way to rebuild redundancy as only one disk survives.

Usable storage is not database storage, don’t miss that point. You’ll need to split this usable storage between DATA area and RECO area (actually ASM diskgroups). Most often, RECO is sized between 10% and 30% of usable storage.
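On a deployed ODA you can check the raw size and the remaining usable space of the ASM diskgroups with a query like this sketch, run on the ASM instance (USABLE_FILE_MB already takes redundancy and the disk-failure reserve into account):

SELECT name,
       ROUND(total_mb/1024/1024, 2)       AS raw_tb,
       ROUND(usable_file_mb/1024/1024, 2) AS usable_free_tb
FROM   v$asm_diskgroup;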

Here is a table with various configurations. Note that I didn’t include ASM high redundancy configurations here, I’ll explain that later.

Nb disks Disk size TB RAW cap. TB Official cap. TB DATA ratio DATA TB RECO TB
2 6.4 12.8 5.8 90% 5.22 0.58
2 6.4 12.8 5.8 80% 4.64 1.16
2 6.4 12.8 5.8 70% 4.06 1.74
4 6.4 25.6 9.9 90% 8.91 0.99
4 6.4 25.6 9.9 80% 7.92 1.98
4 6.4 25.6 9.9 70% 6.93 2.97
6 6.4 38.4 14.8 90% 13.32 1.48
6 6.4 38.4 14.8 80% 11.84 2.96
6 6.4 38.4 14.8 70% 10.36 4.44
8 6.4 51.2 19.8 90% 17.82 1.98
8 6.4 51.2 19.8 80% 15.84 3.96
8 6.4 51.2 19.8 70% 13.86 5.94
10 6.4 64 24.7 90% 22.23 2.47
10 6.4 64 24.7 80% 19.76 4.94
10 6.4 64 24.7 70% 17.29 7.41
12 6.4 76.8 29.7 90% 26.73 2.97
12 6.4 76.8 29.7 80% 23.76 5.94
12 6.4 76.8 29.7 70% 20.79 8.91

X8-2HA storage limit

Storage is more complex on X8-2HA. If you’re looking for complete information about its storage, review the ODA documentation for all the possibilities.

Briefly, the X8-2HA is available in two flavors: High Performance, the one I highly recommend, or High Capacity, which is nice if you have really big databases you want to store on only one ODA. But this High Capacity version will make use of spinning disks to achieve such an amount of TB, definitely not the best solution for performance. The 2 nodes of this ODA are empty, no disk in the front panel, just empty space. All data disks are in a separate enclosure connected to both nodes with SAS cables. Depending on your configuration, you’ll have 6 to 24 SSDs (HP) or a mix of 6 SSDs and 18 HDDs (HC). When your first enclosure is filled with disks, you can also add another storage enclosure of the same kind to eventually double the total capacity. Usable storage ranges from 17.8TB to 142.5TB for HP, and from 114.8TB to 230.6TB for HC.

Best practice for storage usage

First you should consider that ODA storage is high-performance storage for high database throughput. Thus, storing backups on ODA is nonsense. Backups are files written once and mostly destined to be erased without ever being used. Don’t lose precious TB for that. Moreover, if backups are done in the FRA, they are actually located on the same disks as DATA. That’s why most configurations will be done with 10% to 20% of RECO, not more: we definitely won’t put backups on the same disks as DATA. 10% for RECO is a minimum, I wouldn’t recommend setting less than that, the Fast Recovery Area always being a problem if too small.

During deployment you’ll have to choose between NORMAL and HIGH redundancy. NORMAL is quite similar to RAID1, but at the block level and without requiring disk parity (you need 2 or more disks). HIGH is available starting from 3 disks and makes each block exist 3 times on 3 different disks. HIGH seems to be better, but you lose even more precious space, and it doesn’t protect you from other failures like a disaster in your datacenter or user errors. Most of the failure protection systems embedded in the servers actually just double the components: power supplies, network interfaces, system disks, and so on. So increasing the security of block redundancy without increasing the security of other components is not necessary in my opinion. The real solution for increased failure protection is Data Guard or Dbvisit: 2 ODAs, in 2 different geographical regions, with databases replicated from 1 site to the other.

Estimate your storage needs for the next 5 years, and even more

Are you able to do that? It’s not that simple. Most of the time you can estimate for the next 2-3 years, but more than that is highly uncertain. Maybe a new project will start and will require much more storage? Maybe you will have to provision more databases for testing purpose? Maybe your main software will leave Oracle to go to MS SQL or PostgreSQL in 2 years? Maybe a new CTO will arrive and decide that Oracle is too expensive and will build a plan to get rid of Oracle. We never know what’s going to happen in such a long time. But at least you can provide an estimation with all the information you have now and your own margin.

Which margin should I choose?

You probably plan to monitor the free space on your ODA. Based on the classic thresholds, more than 85% of disk usage is something you should not reach, because you may not have a solution for expanding storage. 75% is in my opinion a disk usage you shouldn’t exceed on an ODA. So consider 25% less usable space than available when you do your calculations.

Get bigger to last longer

I don’t like wasting money or resources on things that don’t need it, but in this particular case, I mean on ODA, after years working on X3-2, X4-2 and newer versions, I strongly advise choosing the maximum number of extensions you can. Maybe not 76TB on an ODA X8-2M if you only need 10TB, but 50TB is definitely more secure for 5 years and more. Buying new extensions could be challenging after 3 or 4 years, because you have no guarantee that these extensions will still be available. You can live with memory or CPU contention, but without enough disk space, it’s much more difficult. Order your ODA fully loaded to make sure no extension will be needed.

The more disks you get, the faster and more secure you are

Last but not least, having more disks on your ODA maximizes the throughput, because ASM is mirroring and striping blocks across all the disks. For sure, on NVMe disks you probably won’t use all that bandwidth. More disks also add more security for your data. Losing one disk in a 4-disk ODA requires the rebalancing of 25% of your data to the 3 safe disks, and rebalancing is not immediate. Losing one disk in an 8-disk ODA requires the rebalancing of much less data, actually 12.5%, assuming you have the same amount of data on the 2 configurations.

A simple example

You need a single-node ODA with expandable storage. So ODA X8-2M seems fine.

You have an overview of your databases’ growth trend and plan to double the size in 5 years. Starting from 6TB, you plan to reach 12TB at a maximum. As you are aware of the threshold you shouldn’t reach, you know that you’ll need 16TB of usable space for DATA (a maximum of 75% of disk space used). You want to make sure you have enough FRA so you plan to set the DATA/RECO ratio to 80%/20%: your RECO should be set to 4TB. Your ODA disk configuration should therefore have at least 20TB of usable disk space. An 8-disk ODA is 19.8TB of usable space, not enough. A 10-disk ODA is 24.7TB of usable space, for 19.76TB of DATA and 4.94TB of RECO, 23% more than needed, a comfortable additional margin. And don’t hesitate to take a 12-disk ODA (1 more extension) if you want to secure your choice and be ready for unplanned changes.

Conclusion

Storage on ODA is quite expensive, but don’t forget that you may not find a solution for an ODA with insufficient storage. Take the time to make your calculation, keep a strong margin, and think long-term. Being long-term is definitely the purpose of an ODA.

Cet article Oracle Database Appliance: which storage capacity to choose? est apparu en premier sur Blog dbi services.

Always free / always up tmux in the Oracle Cloud with KSplice updates


By Franck Pachot

.
I used to have many VirtualBox VMs on my laptop. But now, most of my labs are in the Cloud. Easy access from everywhere.

GCP

There’s the Google Cloud free VM which is not limited in time (I still have the 11g XE I’ve created 2 years ago running there) being able to use 40% of CPU with 2GB of RAM:


top - 21:53:10 up 16 min,  4 users,  load average: 9.39, 4.93, 2.09
Tasks:  58 total,   2 running,  56 sleeping,   0 stopped,   0 zombie
%Cpu(s): 12.9 us,  8.2 sy,  0.0 ni, 12.6 id,  0.0 wa,  0.0 hi,  0.3 si, 66.0 st
GiB Mem :    1.949 total,    0.072 free,    0.797 used,    1.080 buff/cache
GiB Swap:    0.750 total,    0.660 free,    0.090 used.    0.855 avail Mem

This is cool, always free but I cannot ssh to it: only accessible with Cloud Shell and it may take a few minutes to start.

AWS

I also use an AWS free tier but this one is limited in time to 1 year.


top - 19:56:11 up 2 days, 13:09,  2 users,  load average: 1.00, 1.00, 1.00
Tasks: 110 total,   2 running,  72 sleeping,   0 stopped,   0 zombie
%Cpu(s): 12.4 us,  0.2 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si, 87.4 st
GiB Mem :      1.0 total,      0.1 free,      0.7 used,      0.2 buff/cache
GiB Swap:      0.0 total,      0.0 free,      0.0 used.      0.2 avail Mem

A bit more CPU throttled by the hypervisor and only 1GB of RAM. However, I can create it with a public IP and access it through SSH. This is especially interesting to run other AWS services as the AWS CLI is installed in this Amazon Linux. But it is limited in time and in credits.

OCI

Not limited in time, the Oracle Cloud OCI free tier allows 2 VMs with 1GB RAM and 2 vCPUs throttled to 1/8:


top - 20:01:37 up 54 days,  6:47,  1 user,  load average: 0.91, 0.64, 0.43
Tasks: 113 total,   4 running,  58 sleeping,   0 stopped,   0 zombie
%Cpu(s): 24.8 us,  0.2 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.2 si, 74.8 st
KiB Mem :   994500 total,   253108 free,   363664 used,   377728 buff/cache
KiB Swap:  8388604 total,  8336892 free,    51712 used.   396424 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
18666 opc       20   0  112124    748    660 R  25.2  0.1   6:20.18 yes
18927 opc       20   0  112124    700    612 R  24.8  0.1   0:06.53 yes

This is what I definitely choose as a bastion host for my labs.

tmux

I run mainly one thing here: TMUX. I have windows and panes for all my work and it stays there in this always free VM that never stops. I just ssh to it and ‘tmux attach’ and I’m back to all my terminals opened.

Ksplice

This, always up, with ssh opened to the internet, must be secured. And as I have all my TMUX windows opened, I don’t want to reboot. No problem: I can update the kernel with the latest security patches without reboot. This is Oracle Linux and it includes Ksplice.
My VM has been up since February:


[opc@b ~]$ uptime
 20:15:09 up 84 days, 15:29,  1 user,  load average: 0.16, 0.10, 0.03

The kernel is from October:


[opc@b ~]$ uname -r
4.14.35-1902.6.6.el7uek.x86_64

But I’m actually running a newer kernel:


[opc@b ~]$ sudo uptrack-show --available
Available updates:
None

Effective kernel version is 4.14.35-1902.301.1.el7uek
[opc@b ~]$

This kernel is from April. Yes, 2 months more recent than the last reboot. I have simply updated it with uptrack, which downloads and installs the latest Ksplice rebootless kernel updates:

Because there’s no time limit, no credit limit, it is always up, ready to ssh to, and running the latest security patches without rebooting or even quitting my tmux sessions, the Oracle Autonomous Linux is the free tier compute instance I use every day. You need to open a free trial to get it (https://www.oracle.com/cloud/free/) but it is easy to create a compute instance which is flagged ‘always free’.

Cet article Always free / always up tmux in the Oracle Cloud with KSplice updates est apparu en premier sur Blog dbi services.

Use MABS to protect Hyper-V on-premises VMs workload


During my last blog post I explained how to install a MABS. Now that your infrastructure is set up, you will be able to protect the workload of your Hyper-V virtual machines.

You have first to create a new protection group. This group will contain one or more VMs which will share the same backup configuration for retention, synchronization and recovery.
Protection is managed as follows:

  • DPM creates a copy of the selected data on the DPM server
  • Those copies are synchronized with the data source and recovery points are created on a regular basis
  • Double backup protection via DPM:
    • disk based: data replicas are stored on disk and periodic full backups are created
    • online protection: regular backups of the protected data are done to Azure Backup

Go to the Protection tab and select New. Once done, select Servers as you want to back up file and application servers where you already installed the DPM protection agent. Those servers should be online during this protection process so that you can select the resources you want to back up:

Here I will select for my two virtual machines a BMR (Bare Metal Recovery) & a System State backup to have all the necessary files to recover my VMs in case of failure. I also select on my SQL Server instance thor1\vienne the workload of the AdventureWorksDW2014 database and on instance thor2\vienne the complete SQL Server instance:

Next step is to give a name to your protection group and define the method to store your backup:

  • locally with short term protection using disk (mandatory if you select also online protection)
  • Online with Azure Backup

You can select both here to have your backups protected also on Azure:

Once done, a retention period has to be defined for disk-based backups with a synchronization frequency, here 3 retention days on disk and an incremental backup every fifteen minutes. You can also schedule an express full backup for applications like SQL Server, here every day at 8PM. It will provide recovery points for applications, either based on incremental backups or on express full backups:

Based on your prior selections DPM gives an overview of the disk space needed for your protection and lists the available disks:

You need to select the replica creation method you want to use to create the initial copy of your protected data to the MABS. By default, it will be over the network. If you use this method take care to do it during off-peak hours. Otherwise select manually to perform this transfer using removable media:

Select the way you want DPM to run consistency checks over the replica copy. By default, it is done automatically but you can also schedule it. Same as before, take care to do it during off-peak hours as it could require CPU and disk resources:

Once done you can select the data source which will be added to online protection within Azure, here I select all:

Next step is to specify how often incremental backups to Azure will occur. For each backup a recovery point is created in Azure from the copy of the backup data located on DPM disk.

A retention policy must be defined. It will configure how much time backups must be retained in Azure:

An initial copy of the protected data must be created in the Azure Backup storage. By default, it’s done over the network but in case of a really big amount of data it could be done offline using the Azure Import feature:

DPM is ready to create the protection group. You can review the information and validate the creation:

You can review the creation of the protection group progress under Status:

Now that your workloads are protected, you are able to restore them in case of an issue.
For example, if you drop a table by mistake in one of your protected databases, you can restore the database from your Azure Recovery Services vault.
Go to the Recovery tab, browse the protection group down to the database you want to restore, select the recovery date and time you would like (here an online restore) and finally right-click on the database name and select Recover:

Review the information of the items you want to recover and select which type of recovery you want. Here I don’t want to overwrite my database so I will restore it to the same instance but with another name. I also need to specify where my database files will be located; you cannot specify here a different location for mdf and ldf files:

You can also specify some recovery options:

  • throttle the network bandwidth depending on the time the recovery is running
  • enable SAN based recovery using hardware snapshots
  • send Email notification when the recovery is finished

After that you can review your recovery and start it:

It fails… because I hadn’t previously set a staging folder to store my Azure backup during a recovery. Indeed, before the Azure backup is restored to its final destination, it is first saved in a staging folder which will be cleaned at the end of the process.

To define this folder you need to go to the Management Tab, select the Online option and click the configure icon. In the Recovery Folder Settings option select a folder with enough space to manage your future recoveries:

Enter a passphrase to encrypt all backups from the server, a PIN code (which has to be generated from your Azure Service Vault) and it’s done.

Once done, you can go back to your recovery and now the recovery process is running. You can close the wizard and go to the Monitoring Tab to check the status of the job if needed:

Once the recovery job is finished successfully, I have my restored database in my instance and I can restore the table dropped by mistake:

Of course, the restored database is consistent and can be used by an application.

Microsoft Azure Backup Server is a really good solution to back up SQL Server instances. If you have disk space issues in your data center and want to get rid of backup space usage, which can consume a lot of space, or of your slow and ancient tape backups, it’s a good option.
Moreover, by default the Recovery Services vault uses geo-redundant storage. It means that your data is copied synchronously 3 times locally via LRS in the primary region, then copied asynchronously to a secondary region where LRS creates 3 copies locally.
No way to lose any data with all those replicas 😉

 

Cet article Use MABS to protect Hyper-V on-premises VMs workload est apparu en premier sur Blog dbi services.


Oracle Text : Using and Indexing – the CONTEXT Index


Everybody has already faced performance problems with Oracle CLOB columns.

The aim of this blog is to show you (again from a real user case) how to use one of the Oracle Text indexes (the CONTEXT index) to solve a performance problem with a CLOB column.

The complete Oracle Text documentation is here: Text Application Developer’s Guide

Let’s start with the following SQL query, which takes more than 6 minutes (6:18) to execute:

SQL> set timing on
SQL> set autotrace traceonly
SQL> select * from v_sc_case_pat_hist pat_hist where upper(pat_hist.note) LIKE '%FC IV%';

168 rows selected.

Elapsed: 00:06:18.09

Execution Plan
----------------------------------------------------------
Plan hash value: 1557300260

--------------------------------------------------------------------------------
---------------------

| Id  | Operation                           | Name          | Rows  | Bytes | Co
st (%CPU)| Time     |

--------------------------------------------------------------------------------
---------------------

|   0 | SELECT STATEMENT                    |               |   521 |  9455K| 24
285   (1)| 00:00:01 |

|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| DWH_CODE      |     1 |    28 |
  2   (0)| 00:00:01 |

|*  2 |   INDEX SKIP SCAN                   | PK_DWH_CODE   |     1 |       |
  1   (0)| 00:00:01 |

|*  3 |  VIEW                               |               |   521 |  9455K| 24
285   (1)| 00:00:01 |

|*  4 |   TABLE ACCESS FULL                 | CASE_PAT_HIST |   521 |   241K| 24
283   (1)| 00:00:01 |

--------------------------------------------------------------------------------
---------------------


Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("DWH_CODE_NAME"='DWH_PIT_DATE')
       filter("DWH_CODE_NAME"='DWH_PIT_DATE')
   3 - filter("DWH_VALID_FROM"<=TO_NUMBER("PT"."DWH_PIT_DATE") AND
              "DWH_VALID_TO">TO_NUMBER("PT"."DWH_PIT_DATE"))
   4 - filter(UPPER("NOTE") LIKE '%FC IV%')

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - 1 Sql Plan Directive used for this statement


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
     134922  consistent gets
     106327  physical reads
        124  redo size
      94346  bytes sent via SQL*Net to client
      37657  bytes received via SQL*Net from client
        338  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        168  rows processed

SQL>

 

Checking the execution plan, the Oracle optimizer does a Full Scan to access the table CASE_PAT_HIST.

As the SQL function upper(pat_hist.note) is used, let’s try to create a function-based index:

CREATE INDEX IDX_FBI_I1 ON SCORE.CASE_PAT_HIST (UPPER(note))
                                                *
ERROR at line 1:
ORA-02327: cannot create index on expression with datatype LOB

 

The message is clear: we cannot create an index on a column with datatype LOB.

Moreover :

  1. Even if my column datatype were not a CLOB, the search criterion LIKE ‘%FC IV%’ prevents the use of any index since the Oracle optimizer has no idea with which letter the string starts, so it will scan the whole table.
  2. Indeed, only the following search criteria will use the index :
    1. LIKE ‘%FC IV’
    2. LIKE ‘FC IV%’
    3. LIKE ‘FC%IV’

So to improve the performance of my SQL query and to index my CLOB column, the solution is to create an Oracle Text Index :

In the Oracle 12.1 release, three different Oracle Text index types exist:

  • CONTEXT: Suited for indexing collections or large coherent documents.
  • CTXCAT: Suited for small documents or text fragments.
  • CTXRULE: Used to build a document classification or routing application.

So, let’s try to create an Oracle Text index of type CONTEXT:

SQL> CREATE INDEX IDX_CPH_I3 ON SCORE.CASE_PAT_HIST LOB(note) INDEXTYPE IS CTXSYS.CONTEXT;

Index created.

Elapsed: 00:00:51.76
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS('SCORE','CASE_PAT_HIST');

PL/SQL procedure successfully completed.

Elapsed: 00:00:25.20

 

Now we have to change the query’s WHERE clause in order to query it with the CONTAINS operator:

SQL> SELECT * from v_sc_case_pat_hist pat_hist WHERE CONTAINS(note, '%FC IV%', 1) > 0;

170 rows selected.

Elapsed: 00:00:00.82

Execution Plan
----------------------------------------------------------
Plan hash value: 768870586

--------------------------------------------------------------------------------
---------------------

| Id  | Operation                           | Name          | Rows  | Bytes | Co
st (%CPU)| Time     |

--------------------------------------------------------------------------------
---------------------

|   0 | SELECT STATEMENT                    |               |  2770 |    49M|  3
355   (1)| 00:00:01 |

|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| DWH_CODE      |     1 |    28 |
  2   (0)| 00:00:01 |

|*  2 |   INDEX SKIP SCAN                   | PK_DWH_CODE   |     1 |       |
  1   (0)| 00:00:01 |

|*  3 |  VIEW                               |               |  2770 |    49M|  3
355   (1)| 00:00:01 |

|   4 |   TABLE ACCESS BY INDEX ROWID       | CASE_PAT_HIST |  2770 |  1284K|  3
353   (1)| 00:00:01 |

|*  5 |    DOMAIN INDEX                     | IDX_CPH_I3    |       |       |
483   (0)| 00:00:01 |

--------------------------------------------------------------------------------
---------------------


Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("DWH_CODE_NAME"='DWH_PIT_DATE')
       filter("DWH_CODE_NAME"='DWH_PIT_DATE')
   3 - filter("DWH_VALID_FROM"<=TO_NUMBER("PT"."DWH_PIT_DATE") AND
              "DWH_VALID_TO">TO_NUMBER("PT"."DWH_PIT_DATE"))
   5 - access("CTXSYS"."CONTAINS"("NOTE",'%FC IV%',1)>0)

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - 1 Sql Plan Directive used for this statement


Statistics
----------------------------------------------------------
         59  recursive calls
          0  db block gets
       3175  consistent gets
        417  physical reads
        176  redo size
      95406  bytes sent via SQL*Net to client
      38098  bytes received via SQL*Net from client
        342  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        170  rows processed

 

Now the query executes in less than a second.

By checking the execution plan, we note an access to the table SCORE.CASE_PAT_HIST within a DOMAIN INDEX represented by our Context Index (IDX_CPH_I3).

Now let’s compare the results given by the LIKE and the CONTAINS operators:

SQL> SELECT count(*) from v_sc_case_pat_hist pat_hist WHERE upper(note) LIKE '%FC IV%'
  2  UNION ALL
  3  SELECT count(*) from v_sc_case_pat_hist pat_hist WHERE CONTAINS(note, '%FC IV%', 1) > 1;

  COUNT(*)
----------
       168
       170

 

The CONTAINS clause returns 2 more rows, let’s check:

SQL> select note from v_sc_case_pat_hist pat_hist WHERE CONTAINS(note, '%FC IV%', 1) > 1 and case_id in (1,2);

NOTE
--------------------------------------------------------------------------------
Text Before , functional class (FC) IV
Text Before , WHO FC-IV

 

For the LIKE clause, the wildcard %FC IV% returns :

  • Text Before FC IV
  • FC IV Text After
  • Text Before FC IV Text After

For the CONTAINS clause, the wildcard %FC IV% returns :

  • Text Before FC IV
  • FC IV Text After
  • Text Before FC IV Text After
  • Text Before FC%IV
  • FC%IV Text After
  • Text Before FC%IV Text After

So, in terms of functionality, the LIKE and CONTAINS clauses are not exactly equivalent since the former returns less data than the latter.

If we translate the CONTAINS clause into a LIKE clause, we should write: “LIKE ‘%FC%IV%'”

For my customer’s case, the CONTAINS clause is correct: the business confirmed that these rows must be returned.

The CONTEXT Oracle Text index has some limitations:

  • You cannot combine several CONTAINS predicates with the OR / AND operators; you will face the Oracle error “ORA-29907: found duplicate labels in primary invocations”
  • SQL> SELECT count(*) from v_sc_case_pat_hist pat_hist WHERE CONTAINS(note, '%FC IV%', 1) > 1 or CONTAINS(note, '%BE TZ%', 1) > 1;
    SELECT count(*) from v_sc_case_pat_hist pat_hist WHERE CONTAINS(note, '%FC IV%', 1) > 1 or CONTAINS(note, '%BE TZ%', 1) > 1
                                                                                               *
    ERROR at line 1:
    ORA-29907: found duplicate labels in primary invocations

 

To solve this issue, let’s rewrite the SQL with a UNION clause:

SQL> SELECT count(*) from v_sc_case_pat_hist pat_hist WHERE CONTAINS(note, '%FC IV%', 1) > 1
  2  UNION
  3  SELECT count(*) from v_sc_case_pat_hist pat_hist WHERE CONTAINS(note, '%BE TZ%', 1) > 1;

  COUNT(*)
----------
       170
       112

 

Conclusion:

  • The benefits of creating an Oracle Text index (a CONTEXT index in our case) include fast response time for text queries with the CONTAINS, CATSEARCH and MATCHES Oracle Text operators. We decreased the response time from 6.18 minutes to less than a second.
  • CATSEARCH and MATCHES are respectively the operators used for the CTXCAT and CTXRULE indexes, which I will present in a next blog post.
  • Columns encrypted with Transparent Data Encryption do not support Oracle Text indexes.
  • Always check that the data returned by the CONTAINS clause corresponds to your business needs.

This article Oracle Text : Using and Indexing – the CONTEXT Index appeared first on Blog dbi services.

Install & configure a Nagios Server


What is Nagios ?

“Nagios is a powerful monitoring system that enables organizations to identify and resolve IT infrastructure problems before they affect critical business processes.” https://www.nagios.org/

In simple words, you can monitor your servers (Linux, Windows, etc …) and databases (Oracle, SQL Server, PostgreSQL, MySQL, MariaDB, etc …) with Nagios.

Nagios architecture

 

We use the free version !!! 😀

 

VM configuration :

OS     : CentOS Linux 7 
CPU    : Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz 
Memory : 3GB 
Disk   : 15GB

What we need :

  1. We need Nagios Core. This is the brain of our Nagios server. All the configuration will be done in this part (set the contacts, set the notification messages, etc …)
  2. We must install Nagios plugins. Plugins are standalone extensions to Nagios Core that make it possible to monitor anything and everything with Core. Plugins process command-line arguments, perform a specific check, and then return the results to Nagios Core
  3. The NRPE addon is designed to allow you to execute Nagios plugins on remote Linux/Unix machines. The main reason for doing this is to allow Nagios to monitor “local” resources (like CPU load, memory usage,etc.) on remote machines. Since these public resources are not usually exposed to external machines, an agent like NRPE must be installed on the remote Linux/Unix machines.

Preconditions

All below installation are as root user.

Installation steps

  1. Install Nagios Core (I will not explain it. Because their documentation is complete)
    support.nagios.com/kb/article/nagios-core-installing-nagios-core-from-source-96.html#CentOS
  2. Install Nagios Plugins

    support.nagios.com/kb/article/nagios-core-installing-nagios-core-from-source-96.html#CentOS

  3. Install NRPE
    https://support.nagios.com/kb/article.php?id=515
  4. Now you must decide which type of database you want to monitor. Then install the check_health plugin for it (you can install all of them if you want)
    Here we install the Oracle and SQL Server check_health
Install and configure Oracle Check_health

You need an Oracle client to communicate with an Oracle instance.

  • Download and install check_oracle_health (https://labs.consol.de/nagios/check_oracle_health/index.html)
    wget https://labs.consol.de/assets/downloads/nagios/check_oracle_health-3.2.1.2.tar.gz 
    tar xvfz check_oracle_health-3.2.1.2.tar.gz 
    cd check_oracle_health-3.2.1.2 
    ./configure 
    make 
    make install
  • Download and install Oracle Client (here we installed 12c version – https://www.oracle.com/database/technologies/oracle12c-linux-12201-downloads.html)
  • Create check_oracle_health_wrapped file in /usr/local/nagios/libexec
  • Set parameters+variables needed to start the plugin check_oracle_health
    #!/bin/sh
    ### -------------------------------------------------------------------------------- ###
    ### We set some environment variable before to start the check_oracle_health script.
    ### -------------------------------------------------------------------------------- ###
    ### Set parameters+variables needed to start the plugin check_oracle_health:
    
    export ORACLE_HOME=/u01/app/oracle/product/12.2.0/client_1
    export LD_LIBRARY_PATH=$ORACLE_HOME/lib
    export TNS_ADMIN=/usr/local/nagios/tns
    export PATH=$PATH:$ORACLE_HOME/bin
    
    export ARGS="$*"
    
    ### start the plugin check_oracle_health with the arguments of the Nagios Service:
    
    /usr/local/nagios/libexec/check_oracle_health $ARGS
  • Create the tns folder in /usr/local/nagios and then make a tnsnames.ora file and add a tns
    DBTEST =
      (DESCRIPTION =
        (ADDRESS = (PROTOCOL = TCP)(HOST = 172.22.10.2)(PORT = 1521))
        (CONNECT_DATA =
          (SERVER = DEDICATED)
          (SID = DBTEST)
        )
      )
  • Test the connection (a minimal Nagios command definition follows below)
    [nagios@vmnagios objects]$ check_oracle_health_wrapped --connect DBTEST --mode tnsping
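
Once the connection test works, the wrapper can be plugged into Nagios Core. The following is only a sketch: it assumes the standard $USER1$ macro pointing to /usr/local/nagios/libexec, a hypothetical host named dbserver01 and the generic-service template from the sample configuration, and it only passes the --connect and --mode options used above.

# Hypothetical example in /usr/local/nagios/etc/objects/commands.cfg
define command{
        command_name    check_oracle_health_wrapped
        command_line    $USER1$/check_oracle_health_wrapped --connect $ARG1$ --mode $ARG2$
        }

# Hypothetical service definition using it (dbserver01 is an assumed host)
define service{
        use                     generic-service
        host_name               dbserver01
        service_description     Oracle DBTEST tnsping
        check_command           check_oracle_health_wrapped!DBTEST!tnsping
        }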

 

Install and configure MSSQL Check_health
  • Download and install the check_mssql_health
    wget https://labs.consol.de/assets/downloads/nagios/check_mssql_health-2.6.4.16.tar.gz
    tar xvfz check_mssql_health-2.6.4.16.tar.gz
    cd check_mssql_health-2.6.4.16
    ./configure 
    make 
    make install
  • Download and install freetds (www.freetds.org/software.html)
    wget ftp://ftp.freetds.org/pub/freetds/stable/freetds-1.1.20.tar.gz
    tar xvfz freetds-1.1.20.tar.gz
    cd freetds-1.1.20
    ./configure --prefix=/usr/local/freetds
    make
    make install
    yum install freetds freetds-devel gcc make perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker
  • Download and install DBD-Sybase
    wget http://search.cpan.org/CPAN/authors/id/M/ME/MEWP/DBD-Sybase-1.15.tar.gz
    tar xvfz DBD-Sybase-1.15.tar.gz
    cd DBD-Sybase-1.15
    export SYBASE=/usr/local/freetds
    perl Makefile.PL
    make
    make install
  • Add your instance information in /usr/local/freetds/etc/freetds.conf (a minimal example follows below)
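
Below is only a sketch of such an entry; the server name, host, port and TDS version are assumptions to adapt to your environment:

    # Hypothetical entry in /usr/local/freetds/etc/freetds.conf
    [MSSQLTEST]
            host = 172.22.10.3
            port = 1433
            tds version = 7.4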

Configuration steps

  1. Set your domain name
    [root@vmnagios ~]# cat /etc/resolv.conf
    # Generated by NetworkManager
    search exemple.ads
    nameserver 10.175.222.10
  2. Set the SMTP relay for Postfix
    [root@vmnagios ~]# cat /etc/postfix/main.cf
    ### ----------------- added by dbi-services ---------------- ###
    relayhost = smtp.exemple.net
    smtp_generic_maps = hash:/etc/postfix/generic
    sender_canonical_maps = hash:/etc/postfix/canonical
    ### --------------------------------------------------------- ###
  3. Configure Postfix
    [root@vmnagios ~]# cat /etc/postfix/generic
    @localdomain.local dba@exemple.com
    @.exemple.ads dba@exemple.com
    
    [root@vmnagios ~]# cat /etc/postfix/canonical
    root dba@exemple.com
    nagios dba@exemple.com

    Then we need to generate generic.db and canonical.db:
    postmap /etc/postfix/generic
    postmap /etc/postfix/canonical

 

Done ! 🙂

Now you have a new Nagios server. All that is left to do is to configure your clients and create the corresponding configuration files on your brand new Nagios server.

This article Install & configure a Nagios Server appeared first on Blog dbi services.

SQL Server: Synchronize logins on AlwaysOn replicas with dbatools


The SQL Server environment  I worked with today has dozens of SQL Server instances using AlwaysOn Availability Groups for High Availability.
When a login is created on the Primary replica of an Availability Group it is not synchronized automatically on secondary replicas. This might cause some issues after a failover (Failed logins).

Since this is not done automatically by SQL Server out of the box the DBA has to perform this task.
To avoid doing this with T-SQL (sp_help_revlogin) or SSMS, I use the magical dbatools and perform the following tasks once a week.

  1. Get the number of logins on each instance.
$primary = Get-DbaLogin -SqlInstance 'SQL01\APPX'
$primary.Count

$secondary = Get-DbaLogin -SqlInstance 'SQL02\APPX'
$secondary.Count
PS C:\> $primary.Count
41
PS C:\> $secondary.Count
40
  2. If numbers don’t match, I use the Copy-DbaLogin function to synchronize the missing logins.
Copy-DbaLogin -Source 'SQL01\APPX' -Destination 'SQL02\APPX' `
	-Login (($primary | Where-Object Name -notin $secondary.Name).Name)

PS C:\>
Type             Name         Status     Notes
----             ----         ------     -----
Login - SqlLogin loginName    Successful

Obviously, there are many drawbacks to this process;

  • Having the same number of logins doesn’t mean they are actually the same.
    Logins can be missing on both sides while both instances still show the same number of logins.
  • I need to know which instance is the current primary replica (-Source in Copy-DbaLogin)
  • This is a manual process I do on every pair of instances using AlwaysOn.
  • I want a script that can manage any number of secondary replicas.

So I decided to write a new script that would automatically synchronize logins from the primary replica to all secondary replicas. The only parameter I want to use as input for this script is the name of the listener.

Here is the simplest version of this script I could write;

$lsn = 'APP01-LSTN'

$primaryReplica =    Get-DbaAgReplica -SqlInstance $lsn | Where Role -eq Primary
$secondaryReplicas = Get-DbaAgReplica -SqlInstance $lsn | Where Role -eq Secondary

# primary replica logins
$primaryLogins = (Get-DbaLogin -SqlInstance $primaryReplica.Name)

$secondaryReplicas | ForEach-Object {
    # secondary replica logins
    $secondaryLogins = (Get-DbaLogin -SqlInstance $_.Name)

    $diff = $primaryLogins | Where-Object Name -notin ($secondaryLogins.Name)
    if($diff) {
        Copy-DbaLogin -Source $primaryReplica.Name -Destination $_.Name -Login $diff.Name
    }   
}

Using just the listener name with Get-DbaAgReplica I can get all the replicas by Role, either Primary or Secondary.
Then I just need to loop through the secondary replicas and call Copy-DbaLogin.

I use a Central Management Server as an inventory for my SQL servers. I have groups containing only listeners.

CMS

The list of listeners can be easily retrieved from the CMS with Get-DbaRegisteredServer.

$Listeners= Get-DbaRegisteredServer -SqlInstance 'MyCmsInstance' | Where-Object {$_.Group -Like '*Prod*Listener'};

Now, looping through each listener I can sync dozens of secondary replicas in my SQL Server Prod environment with a single script run.
I had some issues with instances having multiple availability groups so I added: “Sort-Object -Unique”.
Notice I also filtered out some logins I don’t want to synchronize.

$Listeners = Get-DbaRegisteredServer -SqlInstance 'MyCmsInstance' | Where-Object {$_.Group -Like '*Prod*Listener*'};

foreach ($lsn in $Listeners) {

    $primaryReplica =    Get-DbaAgReplica -SqlInstance $lsn.ServerName | Where Role -eq Primary | Sort-Object Name -Unique
    $secondaryReplicas = Get-DbaAgReplica -SqlInstance $lsn.ServerName | Where Role -eq Secondary | Sort-Object Name -Unique
    <#
    Some instances have more than 1 AvailabilityGroup
        => Added Sort-Object -Unique
    #>

    # primary replica logins
    $primaryLogins = (Get-DbaLogin -SqlInstance $primaryReplica.Name -ExcludeFilter '##*','NT *','BUILTIN*', '*$')
    
    $secondaryReplicas | ForEach-Object {
        # secondary replica logins
        $secondaryLogins = (Get-DbaLogin -SqlInstance $_.Name -ExcludeFilter '##*','NT *','BUILTIN*', '*$')
        
        $diff = $primaryLogins | Where-Object Name -notin ($secondaryLogins.Name)
        if($diff) {
            Copy-DbaLogin -Source $primaryReplica.Name -Destination $_.Name -Login ($diff.Name) -WhatIf
        } 
    }  
}

Do not test this script in Production. Try it in a safe environment first, then remove the “-WhatIf” switch.
The next step for me might be to run this script on a schedule. Or even better, trigger the execution after an Availability Group failover?
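
As a sketch of the scheduling idea only (the task name, the script path C:\scripts\Sync-AgLogins.ps1 and the 06:00 schedule are assumptions, and the account running the task needs the appropriate permissions on all instances), the Windows ScheduledTasks module could be used like this:

# Hypothetical scheduled task running the synchronization script every morning
$action  = New-ScheduledTaskAction -Execute 'powershell.exe' -Argument '-NoProfile -File C:\scripts\Sync-AgLogins.ps1'
$trigger = New-ScheduledTaskTrigger -Daily -At 6am
Register-ScheduledTask -TaskName 'Sync-AgLogins' -Action $action -Trigger $trigger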

Copy-DbaLogin is one of the many dbatools commands that can be very useful to synchronize objects between instances.

This article SQL Server: Synchronize logins on AlwaysOn replicas with dbatools appeared first on Blog dbi services.

Oracle Standard Edition on AWS ☁ socket arithmetic


By Franck Pachot

Note that I’ve written about Oracle Standard Edition 2 licensing before, but a few rules have changed. This is written in May 2020.
TL;DR: 4 vCPU count for 1 socket and 2 sockets count for 1 server, whether hyper-threading is enabled or not.

The SE2 rules

I think the Standard Edition rules are quite clear now: maximum server capacity, cluster limit, minimum NUP, and processor metric. Oracle has them in the Database Licensing guideline.

2 socket capacity per server

Oracle Database Standard Edition 2 may only be licensed on servers that have a maximum capacity of 2 sockets.

We are talking about capacity, which means that even when you remove a processor from a 4-socket server, it is still a 4-socket server. You cannot run Standard Edition if the server has the capacity for more than 2 sockets, whether there is a processor in each socket or not.

2 socket used per cluster

When used with Oracle Real Application Clusters, Oracle Database Standard Edition 2 may only be licensed on a maximum of 2 one-socket servers

This one is not about capacity. You can remove a processor from a bi-socket server to make it a one-socket server, and then build a cluster running RAC in Standard Edition with 2 of those nodes. The good thing is that you can even use an Oracle hypervisor (OVM or KVM), LPAR or Zones to pin one socket only for the usage of Oracle, and use the other for something else. The bad thing is that, as of 19c, RAC with Standard Edition is not possible anymore. You can run the new SE HA, which allows more on one node (up to the 2-socket rule) because the other node is stopped (the 10-days rule).

At least 10 NUP per server

The minimum when licensing by Named User Plus (NUP) metric is 10 NUP licenses per server.

Even when you didn’t choose the processor metric you need to count the servers. For example, if your vSphere cluster runs on 4 bi-socket servers, you need to buy 40 NUP licenses even if you can count a smaller population of users.

Processor metric

When licensing Oracle programs with … Standard Edition in the product name, a processor is counted equivalent to a socket; however, in the case of multi-chip modules, each chip in the multi-chip module is counted as one occupied socket.

A socket is a physical slot where you can put a processor. This is what counts for the “2 socket capacity per server” rule. An occupied socket is one with a processor, physically present or pinned with an accepted hard partitioning hypervisor method (Solaris Zones, IBM LPAR, Oracle OVM or KVM,…). This is what counts for the “2 sockets occupied per cluster” rule. Intel is not concerned by the multi-chip module exception.

What about the cloud?

So, the rules mention servers, sockets and processors. How does this apply to modern computing where you provision a number of vCPU without knowing anything about the underlying hardware? In the AWS shared responsibility model you are responsible for the Oracle Licences (BYOL – Bring Your Own Licences) but they are responsible for the physical servers.

Oracle established the rules (which may or may not be referenced by your contract) in the Licensing Oracle Software in the Cloud Computing Environment (for educational purposes only – refer to your contract if you want the legal interpretation).

This document is only for AWS and Azure. There’s no agreement with Google Cloud, so you cannot run Oracle software under license there. Same with your local cloud provider: you are reduced to hosting on physical servers. The Oracle Public Cloud has its own rules and you can license Standard Edition on a compute instance with up to 16 OCPU and one processor license covers 4 OCPU (which is 2 hyper-threaded Intel cores).

Oracle authorizes to run on those 2 competitor public clouds. But they generally double the licenses required on competitor platforms in order to be cheaper on their own. They did that on-premises a long time ago for IBM processors and they do that now for Amazon AWS and Microsoft Azure.

So, the arithmetic is based on the following idea: 4 vCPU count for 1 socket and 2 sockets count for 1 server.

Note that there was a time when it was 1 socket = 2 cores, which meant 4 vCPU when hyper-threading is enabled but 2 vCPU when not. They have changed the document and we now count vCPU without looking at cores or threads. Needless to say, for an optimal performance/price ratio in SE you should disable hyper-threading in AWS in order to have your processes running on full cores. And use instance caging to limit the user sessions in order to leave a core available for the background processes.
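
Only as a minimal sketch (the plan name and the CPU count below are assumptions to adapt; instance caging needs both an active resource manager plan and a cpu_count setting):

-- Hypothetical instance caging on a 4 vCPU shape: keep 1 CPU for the background processes
ALTER SYSTEM SET resource_manager_plan = 'DEFAULT_PLAN' SCOPE=BOTH;
ALTER SYSTEM SET cpu_count = 3 SCOPE=BOTH;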

Here are the rules:

  • 2 socket capacity per server: maximum 8 vCPU
  • 2 socket occupied per cluster: forget about RAC in Standard Edition and RAC in AWS
  • Minimum NUP: 10 NUP are ok to cover the maximum allowed 8 vCPU
  • Processor metric: 1 license covers 4 vCPU

Example

The maximum you can use for one database:
2 SE2 processor licences = 1 server = 2 sockets = 8 AWS vCPU
2 SE2 processor licences = 8 cores = 16 OCPU in Oracle Cloud

The cheaper option means smaller capacity:
1 SE2 processor licence = 1 socket = 4 AWS vCPU
1 SE2 processor licence = 4 cores = 8 OCPU in Oracle Cloud

As you can see, the difference between Standard and Enterprise Edition in the clouds is much smaller than on-premises, where a socket can run more and more cores. The per-socket licensing was made at a time when processors had only a few cores. With the evolution, Oracle realized that SE was too cheap. They caged the SE2 usage to 16 threads per database and limit it further on their competitors’ clouds. Those limits are not technical but governed by revenue management: they provide a lot of features in SE but also need to ensure that large companies still require EE.

But…

… there’s always an exception. It seems that Amazon has a special deal to allow Oracle Standard Edition on AWS RDS with EC2 instances up to 16 vCPU:

You know that I always try to test what I’m writing in a blog post. So, at least as of the publishing date and with the tested versions, it gets some verified facts.
I started an AWS RDS Oracle database on db.m4.4xlarge which is 16 vCPU. I’ve installed the instant client in my bastion console to access it:


sudo yum localinstall http://yum.oracle.com/repo/OracleLinux/OL7/oracle/instantclient/x86_64/getPackage/oracle-instantclient19.5-basic-19.5.0.0.0-1.x86_64.rpm
sudo yum localinstall http://yum.oracle.com/repo/OracleLinux/OL7/oracle/instantclient/x86_64/getPackage/oracle-instantclient19.5-sqlplus-19.5.0.0.0-1.x86_64.rpm

This is Standard Edition 2:


[ec2-user@ip-10-0-2-28 ~]$ sqlplus admin/FranckPachot@//database-1.ce45l0qjpoax.us-east-1.rds.amazonaws.com/ORCL

SQL*Plus: Release 19.0.0.0.0 - Production on Tue May 19 21:32:47 2020
Version 19.5.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

Last Successful login time: Tue May 19 2020 21:32:38 +00:00

Connected to:
Oracle Database 19c Standard Edition 2 Release 19.0.0.0.0 - Production
Version 19.7.0.0.0

On 16 vCPU:


SQL> show parameter cpu_count

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
cpu_count                            integer     16

On AWS:

SQL> host curl http://169.254.169.254/latest/meta-data/services/domain
amazonaws.com

With more than 16 threads in CPU:

SQL> @ snapper.sql ash 10 1 all
Sampling SID all with interval 10 seconds, taking 1 snapshots...

-- Session Snapper v4.31 - by Tanel Poder ( http://blog.tanelpoder.com/snapper ) - Enjoy the Most Advanced Oracle Troubleshooting Script on the Planet! :)


---------------------------------------------------------------------------------------------------------------
  ActSes   %Thread | INST | SQL_ID          | SQL_CHILD | EVENT                               | WAIT_CLASS
---------------------------------------------------------------------------------------------------------------
   19.29   (1929%) |    1 | 3zkr1jbq4ufuk   | 0         | ON CPU                              | ON CPU
    2.71    (271%) |    1 | 3zkr1jbq4ufuk   | 0         | resmgr:cpu quantum                  | Scheduler
     .06      (6%) |    1 |                 | 0         | ON CPU                              | ON CPU

--  End of ASH snap 1, end=2020-05-19 21:34:00, seconds=10, samples_taken=49, AAS=22.1

PL/SQL procedure successfully completed.

I also checked on CloudWatch (the AWS monitoring from the hypervisor) that I am running 100% on CPU.

I tested this on a very limited-time free lab environment (this configuration is expensive) and didn’t check whether hyper-threading was enabled or not (my guess is: disabled), and I didn’t test if setting CPU_COUNT would enable instance caging (SE2 is supposed to be internally caged at 16 CPUs but I see more sessions on CPU there).

Of course, I shared my surprise on Twitter (follow me if you like this kind of short info about databases – I don’t really look at the numbers but it seems I may reach 5000 followers soon, so I’ll continue at the same rate), and I’ll update this post when I have more info about this.

This article Oracle Standard Edition on AWS ☁ socket arithmetic appeared first on Blog dbi services.

PostgreSQL Shared Buffers vs free RAM


PostgreSQL, like all other database engines, modifies the table and index blocks in shared buffers. People think that the main goal of buffered reads is to act as a cache to avoid reading from disk. But that’s not the main reason, as this is not mandatory. For example, PostgreSQL expects that the filesystem cache is used. The primary goal of shared buffers is simply to share them, because multiple sessions may want to read and write the same blocks, and concurrent access is managed at block level in memory. Without shared buffers, you would need to lock a whole table. Most of the database engines use the shared buffers for caching. Allocating more memory can keep the frequently used blocks in memory rather than accessing disk. And because they manage the cache with methods suited to the database (performance and reliability), they don’t need another level of cache and recommend direct I/O to the database files. But not PostgreSQL. In order to keep the database engine simple and portable, PostgreSQL relies on the filesystem cache. For example, no multiblock read is implemented. Reading contiguous blocks should be optimized by the filesystem read-ahead.
But this simplicity of code implies a complexity for configuring a PostgreSQL database system. How much to set for the shared_buffers? And how much to keep free in the OS to be used for the filesystem cache?

I am not giving any answer here. And I think there is, unfortunately, no answer. The documentation is vague. It defines the recommended value as a percentage of the available RAM. That makes no sense to me. The cache is there to keep frequently used blocks and that depends on your application workload, not on the available RAM. There is also this idea that, because there is double buffering, you should allocate the same size to both caches. But that makes no sense either. If you keep blocks in the shared buffers, they will not be frequently accessed from the filesystem and will not stay in the filesystem cache. Finally, don’t think the defaults are good. The default shared_buffers is always too small and the reason is that a small value eases the installation on Linux without having to think about SHM size. But you need more, and you will then need to configure SHM and huge pages for it.

My only recommendation is to understand your workload: what is the set of data that is frequently used (lookup tables, index branches, active customers, last orders,…)? This should fit in the shared buffers. And the second recommendation is to understand how it works. For the set of data that is frequently read only, this is not so difficult. You need to have an idea about the Postgres cache algorithm (LRU/clock sweep in the shared buffers, which have a static size) and the filesystem cache (LRU with the variable size of free memory). For the set of data that is modified, this is much more complex.

Let’s start with the reads. As an example, I have run a workload that reads at random within a small set of 150MB. I used pgio from Kevin Closson for that. I’ve executed multiple runs, varying the shared_buffers from smaller than my working set (50MB) to 2x larger (300MB): 50MB, 100MB, 150MB, 200MB, 250MB, 300MB. Then, for each size of shared buffers, I’ve run with variations in the available RAM (which is used by the filesystem cache): 50MB, 100MB, 150MB, 200MB, 250MB, 300MB, 350MB, 400MB. I ensured that I have enough physical memory so that the system does not swap.

I determined the free space in my system:


[ec2-user@ip-172-31-37-67 postgresql]$ sync ; sync ; sync ; free -hw
              total        used        free      shared     buffers       cache   available
Mem:           983M         63M        835M        436K          0B         84M        807M
Swap:            0B          0B          0B

I also use https://medium.com/@FranckPachot/proc-meminfo-formatted-for-humans-350c6bebc380


[ec2-user@ip-172-31-37-67 postgresql]$ awk '/Hugepagesize:/{p=$2} / 0 /{next} / kB$/{v[sprintf("%9d MB %-s",int($2/1024),$0)]=$2;next} {h[$0]=$2} /HugePages_Total/{hpt=$2} /HugePages_Free/{hpf=$2} {h["HugePages Used (Total-Free)"]=hpt-hpf} END{for(k in v) print sprintf("%-60s %10d",k,v[k]/p); for (k in h) print sprintf("%9d MB %-s",p*h[k]/1024,k)}' /proc/meminfo|sort -nr|grep --color=auto -iE "^|( HugePage)[^:]*" #awk #meminfo
 33554431 MB VmallocTotal:   34359738367 kB                    16777215
      983 MB MemTotal:        1006964 kB                            491
      946 MB DirectMap2M:      968704 kB                            473
      835 MB MemFree:          855628 kB                            417
      808 MB MemAvailable:     827392 kB                            404
      491 MB CommitLimit:      503480 kB                            245
      200 MB Committed_AS:     205416 kB                            100
       78 MB DirectMap4k:       79872 kB                             39
       69 MB Cached:            71436 kB                             34
       55 MB Active:            56640 kB                             27
       43 MB Inactive:          44036 kB                             21
       42 MB Inactive(file):    43612 kB                             21
       29 MB Slab:              30000 kB                             14
       28 MB AnonPages:         29260 kB                             14
       28 MB Active(anon):      29252 kB                             14
       26 MB Active(file):      27388 kB                             13
       14 MB SUnreclaim:        14876 kB                              7
       14 MB SReclaimable:      15124 kB                              7
       13 MB Mapped:            14212 kB                              6
        4 MB PageTables:         4940 kB                              2
        2 MB Hugepagesize:       2048 kB                              1
        1 MB KernelStack:        1952 kB                              0
        0 MB Shmem:               436 kB                              0
        0 MB Inactive(anon):      424 kB                              0
        0 MB HugePages Used (Total-Free)
        0 MB HugePages_Total:       0
        0 MB HugePages_Surp:        0
        0 MB HugePages_Rsvd:        0
        0 MB HugePages_Free:        0
        0 MB Dirty:                 4 kB                              0

Then in order to control how much free RAM I want to set ($fs_MB) I allocate the remaining as huge pages:


sudo bash -c "echo 'vm.nr_hugepages = $(( (835 - $fs_MB) / 2 ))' > /etc/sysctl.d/42-hugepages.conf ; sysctl --system"

This limits the RAM available for the filesystem cache because huge pages cannot be used for it. And the huge pages can be used for the Postgres shared buffers:


sed -i -e "/shared_buffers/s/.*/shared_buffers = ${pg_MB}MB/" -e "/huge_pages =/s/.*/huge_pages = on/" $PGDATA/postgresql.conf
grep -E "(shared_buffers|huge_pages).*=" $PGDATA/postgresql.conf
/usr/local/pgsql/bin/pg_ctl -l $PGDATA/logfile restart

Note that I actually stopped Postgres first, to be sure that no huge pages are in use when resizing them:


for pg_MB in 50 100 150 200 250 300 350 400
do
for fs_MB in 400 350 300 250 200 150 100 50
do
/usr/local/pgsql/bin/pg_ctl -l $PGDATA/logfile stop
# set huge pages so that only $fs_MB MB of RAM remains available for the filesystem cache
sudo bash -c "echo 'vm.nr_hugepages = $(( (835 - $fs_MB) / 2 ))' > /etc/sysctl.d/42-hugepages.conf ; sysctl --system"
# set shared_buffers to $pg_MB MB, allocated in huge pages
sed -i -e "/shared_buffers/s/.*/shared_buffers = ${pg_MB}MB/" -e "/huge_pages =/s/.*/huge_pages = on/" $PGDATA/postgresql.conf
/usr/local/pgsql/bin/pg_ctl -l $PGDATA/logfile restart
# run the pgio workload
sh ./runit.sh
done
done

100% Read only workload on 150MB

Here is the result. Each slice on the z-axis is a size of shared_buffers allocated by Postgres. The x-axis is the size of the available RAM that can be used for the filesystem cache by the Linux kernel. The y-axis is the number of tuples read during the run.

You will never get optimal performance when the frequently read set doesn’t fit in the shared buffers. When the read set is larger than the shared buffers, you need more free RAM for the filesystem cache and still get lower performance. The frequently read set of data should fit in the shared buffers.

50% updates on 150MB

Here is the same run where I only changed PCT_UPDATE to 50 in pgio.conf

This looks similar but there are two main differences, one visible here and another that is not represented in this graphic because I aggregated several runs.

First, increasing the shared buffers above the set of frequently manipulated data still improves performance, which was not the case with reads. As soon as the shared buffers are above the working set of 150MB, the buffer cache hit ratio is at 100%. But that’s for reads. Updates generate a new version of data and both versions will have to be vacuumed and checkpointed.

Here is a graph about blks_read/s which shows that for a read-only workload we do not do any physical reads (I/O calls from the database to the filesystem) as soon as the working set fits in shared buffers. When we write, the read calls still improve when we increase the shared buffers a bit above the modified set. And the physical read efficiency is the best when there is as much free RAM as shared buffers.

Second point about the write workload: performance is not homogeneous at all. Vacuum and checkpoint happen at regular intervals and make the performance unpredictable. When showing the tuples/second for the 50% write workload, I aggregated many runs to display the maximum throughput achieved. Having the modified set of data fit in free RAM helps to lower this variation as it avoids immediate physical write latency. The balance between shared buffers and free RAM is then a balance between high performance and predictability of performance: keep free RAM as a performance “buffer” between userspace access and disk reads. There are also many parameters to “tune” this, like the frequency of vacuum and checkpoints. And this makes memory tuning very complex. Note that I changed only the shared_buffers here. When increasing shared_buffers for high write throughput, there are other structures to increase, like the WAL segments.

The filesystem cache adds another level of unpredictable performance. For example, you may run a backup that reads the whole database, flooding the least-recently-used cache with those blocks. And I didn’t do any sequential scans here; they benefit from filesystem buffers with read-ahead. All these make any quick recommendation incorrect. Buffer cache hit ratios make no sense to estimate the response time improvement as they are not aware of the filesystem cache hits. But looking at them per relation, table or index may help to estimate which relations are frequently accessed. Because that’s what matters: not the size of the database, not the size of your server RAM, not the general buffer cache hit ratio, but the size of the data that is read and written frequently by your application.
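
To illustrate that last point, here is only a sketch (assuming the standard pg_statio_user_tables view, and keeping in mind that heap_blks_read counts blocks requested from the OS, which may be filesystem cache hits rather than physical disk reads):

-- Which tables are accessed the most, and how often their blocks were already in shared buffers
SELECT relname,
       heap_blks_read,   -- blocks requested from the OS (filesystem cache or disk)
       heap_blks_hit,    -- blocks found in shared buffers
       round(100.0 * heap_blks_hit / nullif(heap_blks_hit + heap_blks_read, 0), 1) AS hit_pct
FROM pg_statio_user_tables
ORDER BY heap_blks_hit + heap_blks_read DESC
LIMIT 10;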

This article PostgreSQL Shared Buffers vs free RAM appeared first on Blog dbi services.
