
SQLNET.EXPIRE_TIME and ENABLE=BROKEN


By Franck Pachot

Those parameters, SQLNET.EXPIRE_TIME in sqlnet.ora and ENABLE=BROKEN in a connection description, have existed for a long time but their behavior may have changed. They are both related to detecting dead TCP connections with keep-alive probes: the former from the server, the latter from the client.

The change in 12c is described in the following MOS note: Oracle Net 12c: Changes to the Functionality of Dead Connection Detection (DCD) (Doc ID 1591874.1). Basically, instead of sending a TNS packet for the keep-alive, the server-side Dead Connection Detection now relies on the TCP keep-alive feature when available. The note mentions that it may be required to set (ENABLE=BROKEN) in the connection string “in some circumstances” - which is not very precise. This “ENABLE=BROKEN” was used in the past for transparent failover when we had no VIP (virtual IP), in order to detect a lost connection to the server.

I don’t like those statements like “on some platform”, “in some circumstances”, “with some drivers”, “it may be necessary”… so there’s only one solution: test it in your context.

My listener is on (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=localhost)(PORT=1521))) and I will connect to it and keep my connection idle (no user call to the server). I trace the server (through the forks of the listener, found by pgrep with the name of the listener associated with this TCP address) and color it in green (GREP_COLORS=’ms=01;32′):

pkill strace ; strace -fyye trace=socket,setsockopt -p $(pgrep -f "tnslsnr $(lsnrctl status "(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=localhost)(PORT=1521)))" | awk '/^Alias/{print $2}') ") 2>&1 | GREP_COLORS='ms=01;32' grep --color=auto -E '^|.*sock.*|^=*' &

I trace the client and color it in yellow (GREP_COLORS=’ms=01;33′):

strace -fyye trace=socket,setsockopt sqlplus demo/demo@"(DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=PDB1))(ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=1521)))" <<<quit 2>&1 | GREP_COLORS='ms=01;33' grep --color=auto -E '^|.*sock.*|^=*'

I’m mainly interested in the setsockopt() calls here because this is how TCP keep-alive is enabled.

(ENABLE=BROKEN) on the client

My first test is without enabling DCD on the server: I have nothing defined in sqlnet.ora on the server side. I connect from the client without mentioning “ENABLE=BROKEN”:


The server (green) has set SO_KEEPALIVE but not the client.

Now I run the same scenario but adding (ENABLE=BROKEN) in the description:

strace -fyye trace=socket,setsockopt sqlplus demo/demo@"(DESCRIPTION=(ENABLE=BROKEN)(CONNECT_DATA=(SERVICE_NAME=PDB1))(ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=1521)))" <<<quit 2>&1 | GREP_COLORS='ms=01;33' grep --color=auto -E '^|.*sock.*|^=*'

The client (yellow) has now a call to set keep-alive:

setsockopt(9<TCP:[1810151]>, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0

However, as I’ll show later, this uses the TCP defaults:

[oracle@db195 tmp]$ tail /proc/sys/net/ipv4/tcp_keepalive*
==> /proc/sys/net/ipv4/tcp_keepalive_intvl <== 
75
==> /proc/sys/net/ipv4/tcp_keepalive_probes <== 
9
==> /proc/sys/net/ipv4/tcp_keepalive_time <== 
7200

After 2 hours (7200 seconds) of idle connection, the client will send a probe up to 9 times, every 75 seconds. If you want to reduce this, you must change it in the client system settings. If you don’t add “(ENABLE=BROKEN)”, the dead connection will not be detected before the next user call, which then hangs for the default TCP timeout (15 minutes).
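
If lowering those defaults on the whole client system is acceptable, here is a minimal sketch of how it could be done (the values are only an example and the sysctl.d file name is arbitrary; validate both for your environment):

# probe after 10 minutes idle, then every 30 seconds, give up after 5 unanswered probes
sudo sysctl -w net.ipv4.tcp_keepalive_time=600
sudo sysctl -w net.ipv4.tcp_keepalive_intvl=30
sudo sysctl -w net.ipv4.tcp_keepalive_probes=5
# persist across reboots
printf 'net.ipv4.tcp_keepalive_time=600\nnet.ipv4.tcp_keepalive_intvl=30\nnet.ipv4.tcp_keepalive_probes=5\n' | sudo tee /etc/sysctl.d/99-tcp-keepalive.conf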

That’s only from the client when its connection to the server is lost.

SQLNET.EXPIRE_TIME on the server

On the server side, we have seen that SO_KEEPALIVE is set - using the TCP defaults. But, there, it may be important to detect dead connections faster because a session may hold some locks. You can (and should) set a lower value in sqlnet.ora with SQLNET.EXPIRE_TIME. Before 12c this parameter was used to send TNS packets as keep-alive probes, but now that SO_KEEPALIVE is set, this parameter controls the keep-alive idle time (using TCP_KEEPIDLE instead of the default /proc/sys/net/ipv4/tcp_keepalive_time).
Here is the same as my first test (without the client ENABLE=BROKEN) but after having set SQLNET.EXPIRE_TIME=42 in $ORACLE_HOME/network/admin/sqlnet.ora.
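
For reference, nothing more than this single line is needed in the server-side sqlnet.ora for this test:

$ cat $ORACLE_HOME/network/admin/sqlnet.ora
SQLNET.EXPIRE_TIME=42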

Side note: I get the “do we need to restart the listener?” question very often about changes in sqlnet.ora, and the answer is clearly “no”. This file is read for each new connection to the database. The listener forks the server (aka shadow) process and it is this forked process that reads sqlnet.ora, as we can see here: I “strace -f” the listener, but it is the forked process that sets up the socket.

Here is the new setsockopt() from the server process:

[pid  5507] setsockopt(16<TCP:[127.0.0.1:1521->127.0.0.1:31374]>, SOL_TCP, TCP_KEEPIDLE, [2520], 4) = 0
[pid  5507] setsockopt(16<TCP:[127.0.0.1:1521->127.0.0.1:31374]>, SOL_TCP, TCP_KEEPINTVL, [6], 4) = 0
[pid  5507] setsockopt(16<TCP:[127.0.0.1:1521->127.0.0.1:31374]>, SOL_TCP, TCP_KEEPCNT, [10], 4) = 0

This means that the server waits for 42 minutes of inactivity (the EXPIRE_TIME that I’ve set, here TCP_KEEPIDLE=2520 seconds) and then sends a probe. Without an answer (ACK) it re-probes every 6 seconds during one minute (the 6-second interval is defined by TCP_KEEPINTVL, and TCP_KEEPCNT sets the retries to 10). We control the idle time with SQLNET.EXPIRE_TIME and can then expect that a dead connection is closed after one additional minute of retries.
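
The arithmetic behind those values, as observed here (the exact mapping of EXPIRE_TIME to socket options is an implementation detail and may change):

TCP_KEEPIDLE                = SQLNET.EXPIRE_TIME * 60 = 42 * 60 = 2520 seconds of idle time
TCP_KEEPINTVL * TCP_KEEPCNT = 6 * 10                  = 60 seconds of retries
worst-case detection time   = 2520 + 60               = 2580 seconds, i.e. 42 minutes + 1 minute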

Here is a combination of SQLNET.EXPIRE_TIME (server detecting a dead connection in 42+1 minutes) and ENABLE=BROKEN (client detecting a dead connection after the default of 2 hours):

tcpdump and iptable drop

The above, with strace, shows the translation of Oracle settings to Linux settings. Now I’ll look at the actual behavior by tracing the TCP packets exchanged, with tcpdump:

sqlplus demo/demo@"(DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=PDB1))(ADDRESS=(PROTOCOL=tcp)(HOST=localhost)(PORT=1521)))"
host cat $ORACLE_HOME/network/admin/sqlnet.ora
host sudo netstat -np  | grep sqlplus
host sudo netstat -np  | grep 36316
set time on escape on
host sudo tcpdump -vvnni lo port 36316 \&

“netstat -np | grep sqlplus” finds the client connection in order to get the port and “netstat -np | grep $port” shows both connections (“sqlplus” for the client and “oracleSID” for the server).

I have set SQLNET.EXPIRE_TIME=3 here and I can see that the server sends a 0-length packet every 3 minutes (connection at 14:43:39, then idle, 1st probe: 14:46:42, 2nd probe: 14:49:42…). Each time, the client replies with an ACK and the server then knows that the connection is still alive:

Now I simulate a client that doesn’t answer, by blocking the input packets:

host sudo iptables -I INPUT 1 -p tcp --dport 36316 -j DROP
host sudo netstat -np  | grep 36316

Here I see the next probe 3 minutes after the last one (14:55:42) and then, as there is no reply, the 10 probes every 6 seconds:

At the end, I checked the TCP connections and the server one has disappeared. But the client side remains. That is exactly what DCD does: when a session is idle for a while it tests if the connection is dead and closes it to release all resources.
If I continue from there and try to run a query, the server cannot be reached and the call will hang for the default TCP timeout of 15 minutes. If I try to cancel, I get “ORA-12152: TNS:unable to send break message” as it tries to send an out-of-band break. SQLNET.EXPIRE_TIME is only for the server side. The client detects nothing until it tries to send something.

For the next test, I remove my iptables rule to stop blocking the packets:

host sudo iptables -D INPUT 1

And I’m now running the same but with (ENABLE=BROKEN)

connect demo/demo@(DESCRIPTION=(ENABLE=BROKEN)(CONNECT_DATA=(SERVICE_NAME=PDB1))(ADDRESS=(PROTOCOL=tcp)(HOST=localhost)(PORT=1521)))
host sudo netstat -np  | grep sqlplus
host sudo netstat -np  | grep 37064
host sudo tcpdump -vvnni lo port 37064 \&
host sudo iptables -I INPUT 1 -p tcp --dport 37064 -j DROP
host sudo netstat -np  | grep 37064
host sudo iptables -D INPUT 1
host sudo netstat -np  | grep 37064

Here is the same as before: DCD after 3 minutes idle, and 10 probes that fail because I’ve blocked again with iptables:

As with the previous test, the server connection (the oracleSID one) has been closed and only the client one remains. As SO_KEEPALIVE has been enabled thanks to (ENABLE=BROKEN), the client will detect the dead connection:

17:52:48 is 2 hours after the last activity, and the client then probes 9 times, every 75 seconds (1’15), according to the system defaults:

[oracle@db195 tmp]$ tail /proc/sys/net/ipv4/tcp_keepalive*
==> /proc/sys/net/ipv4/tcp_keepalive_intvl <==    TCP_KEEPINTVL
75
==> /proc/sys/net/ipv4/tcp_keepalive_probes <==     TCP_KEEPCNT
9
==> /proc/sys/net/ipv4/tcp_keepalive_time <==      TCP_KEEPIDLE
7200

It took a long time (but you can change those defaults on the client), but finally the client connection is cleaned up (sqlplus is no longer there in the last netstat).
Now, an attempt to run a user call fails immediately with the famous ORA-03113 because the client knows that the connection is closed:

Just a little additional test to show ORA-03135. When the server has detected and closed the dead connection, but the client has not yet detected it, we have seen that we wait for a 15-minute timeout. But that’s because the iptables rule was still there to drop the packets. If I remove the rule before attempting a user call, the server can be reached (so no wait and no timeout) and it immediately detects that there’s no endpoint anymore. This raises ORA-03135 “connection lost contact”.

In summary:

  • On the server, the keep-alive is always enabled and SQLNET.EXPIRE_TIME is used to reduce the tcp_keepalive_time defined by the system, because it is probably too long.
  • On the client, the keep-alive is enabled only when (ENABLE=BROKEN) is in the connection description, and uses the tcp_keepalive_time from the system. Without it, the broken connection will be detected only when attempting a user call.

Setting SQLNET.EXPIRE_TIME to a few minutes (like 10) is a good idea because you don’t want to keep resources and locks on the server when a simple probe can confirm that the connection is lost and we have to roll back. If we don’t, the dead connections may disappear only after 2 hours and 12 minutes (the idle time plus the probes). On the client side, it is also a good idea to add (ENABLE=BROKEN) so that idle sessions that have lost contact have a chance to know it before trying to use them. This is a performance gain if it helps to avoid sending a “select 1 from dual” each time you grab a connection from the pool.
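
Putting it together, here is a minimal sketch of both settings (the MYAPP alias, the host name and the values are only examples to adapt to your environment):

# server side: $ORACLE_HOME/network/admin/sqlnet.ora
SQLNET.EXPIRE_TIME=10

# client side: tnsnames.ora entry with (ENABLE=BROKEN) in the description
MYAPP=(DESCRIPTION=(ENABLE=BROKEN)
  (ADDRESS=(PROTOCOL=tcp)(HOST=myhost)(PORT=1521))
  (CONNECT_DATA=(SERVICE_NAME=PDB1)))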

And, most important: the documentation is imprecise, which means that the behavior can change without notice. This is a test on a specific OS, a specific driver, a specific version… Do not just take the results from this post: now you know how to check in your environment.



Oracle 20c : The new PREPARE DATABASE FOR DATA GUARD


As you may know, Oracle 20c is in the cloud with new features. The one I have tested is the PREPARE DATABASE FOR DATA GUARD.
This command configures a database for use as a primary database in a Data Guard broker configuration. Database initialization parameters are set to recommended values.
Let’s see what this command will do for us.
The db_unique_name of the primary database is prod20; in the Data Guard configuration I will build, it will be changed to prod20_site1.

SQL> show parameter db_unique_name

NAME				     TYPE	 VALUE
------------------------------------ ----------- ------------------------------
db_unique_name			     string	 prod20
SQL> 

Now let’s connect to the broker and run help to see the syntax:

[oracle@oraadserver ~]$ dgmgrl
DGMGRL for Linux: Release 20.0.0.0.0 - Production on Tue Feb 18 21:36:39 2020
Version 20.2.0.0.0

Copyright (c) 1982, 2020, Oracle and/or its affiliates.  All rights reserved.

Welcome to DGMGRL, type "help" for information.
DGMGRL> connect /
Connected to "prod20_site1"
Connected as SYSDG.
DGMGRL> 
 
DGMGRL> help prepare    

Prepare a primary database for a Data Guard environment.

Syntax:

  PREPARE DATABASE FOR DATA GUARD
    [WITH [DB_UNIQUE_NAME IS <value>]
          [DB_RECOVERY_FILE_DEST IS <value>]
          [DB_RECOVERY_FILE_DEST_SIZE IS <value>]
          [BROKER_CONFIG_FILE_1 IS <value>]
          [BROKER_CONFIG_FILE_2 IS <value>]];

And then run the command

DGMGRL> PREPARE DATABASE FOR DATA GUARD with DB_UNIQUE_NAME is prod20_site1;
Preparing database "prod20" for Data Guard.
Initialization parameter DB_UNIQUE_NAME set to 'prod20_site1'.
Initialization parameter DB_FILES set to 1024.
Initialization parameter LOG_BUFFER set to 268435456.
Primary database must be restarted after setting static initialization parameters.
Shutting down database "prod20_site1".
Database closed.
Database dismounted.
ORACLE instance shut down.
Starting database "prod20_site1" to mounted mode.
ORACLE instance started.
Database mounted.
Initialization parameter DB_FLASHBACK_RETENTION_TARGET set to 120.
Initialization parameter DB_LOST_WRITE_PROTECT set to 'TYPICAL'.
RMAN configuration archivelog deletion policy set to SHIPPED TO ALL STANDBY.
Adding standby log group size 209715200 and assigning it to thread 1.
Adding standby log group size 209715200 and assigning it to thread 1.
Adding standby log group size 209715200 and assigning it to thread 1.
Initialization parameter STANDBY_FILE_MANAGEMENT set to 'AUTO'.
Initialization parameter DG_BROKER_START set to TRUE.
Database set to FORCE LOGGING.
Database set to FLASHBACK ON.
Database opened.
DGMGRL> 

The output shows the changes done by the PREPARE command. We can do some checks:

SQL> show parameter db_unique_name

NAME				     TYPE	 VALUE
------------------------------------ ----------- ------------------------------
db_unique_name			     string	 prod20_site1
SQL> select flashback_on,force_logging from v$database;

FLASHBACK_ON	   FORCE_LOGGING
------------------ ---------------------------------------
YES		   YES

SQL> 

SQL> show parameter standby_file

NAME				     TYPE	 VALUE
------------------------------------ ----------- ------------------------------
standby_file_management 	     string	 AUTO
SQL> 

But here I can see that I only have 3 standby redo log groups instead of the recommended 4 (as I have 3 online redo log groups):

SQL> select bytes,group# from v$log;

     BYTES     GROUP#
---------- ----------
 209715200	    1
 209715200	    2
 209715200	    3

SQL> 


SQL> select group#,bytes from v$standby_log;

    GROUP#	BYTES
---------- ----------
	 4  209715200
	 5  209715200
	 6  209715200

SQL> 
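
If you want to follow the usual recommendation of one more standby redo log group than online redo log groups, a minimal sketch of adding the missing group (taking the size from the existing ones) would be:

SQL> alter database add standby logfile thread 1 size 209715200;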

After building the Data Guard configuration I did some checks (note that the steps are not shown here, but they are the same as in other versions).
For the configuration

DGMGRL> show configuration verbose;

Configuration - prod20

  Protection Mode: MaxPerformance
  Members:
  prod20_site1 - Primary database
    prod20_site2 - Physical standby database 

  Properties:
    FastStartFailoverThreshold      = '30'
    OperationTimeout                = '30'
    TraceLevel                      = 'USER'
    FastStartFailoverLagLimit       = '30'
    CommunicationTimeout            = '180'
    ObserverReconnect               = '0'
    FastStartFailoverAutoReinstate  = 'TRUE'
    FastStartFailoverPmyShutdown    = 'TRUE'
    BystandersFollowRoleChange      = 'ALL'
    ObserverOverride                = 'FALSE'
    ExternalDestination1            = ''
    ExternalDestination2            = ''
    PrimaryLostWriteAction          = 'CONTINUE'
    ConfigurationWideServiceName    = 'prod20_CFG'
    ConfigurationSimpleName         = 'prod20'

Fast-Start Failover:  Disabled

Configuration Status:
SUCCESS

For the primary database

DGMGRL> show database verbose 'prod20_site1';

Database - prod20_site1

  Role:                PRIMARY
  Intended State:      TRANSPORT-ON
  Instance(s):
    prod20

  Properties:
    DGConnectIdentifier             = 'prod20_site1'
    ObserverConnectIdentifier       = ''
    FastStartFailoverTarget         = ''
    PreferredObserverHosts          = ''
    LogShipping                     = 'ON'
    RedoRoutes                      = ''
    LogXptMode                      = 'ASYNC'
    DelayMins                       = '0'
    Binding                         = 'optional'
    MaxFailure                      = '0'
    ReopenSecs                      = '300'
    NetTimeout                      = '30'
    RedoCompression                 = 'DISABLE'
    PreferredApplyInstance          = ''
    ApplyInstanceTimeout            = '0'
    ApplyLagThreshold               = '30'
    TransportLagThreshold           = '30'
    TransportDisconnectedThreshold  = '30'
    ApplyParallel                   = 'AUTO'
    ApplyInstances                  = '0'
    ArchiveLocation                 = ''
    AlternateLocation               = ''
    StandbyArchiveLocation          = ''
    StandbyAlternateLocation        = ''
    InconsistentProperties          = '(monitor)'
    InconsistentLogXptProps         = '(monitor)'
    LogXptStatus                    = '(monitor)'
    SendQEntries                    = '(monitor)'
    RecvQEntries                    = '(monitor)'
    HostName                        = 'oraadserver'
    StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=oraadserver)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=prod20_site1_DGMGRL)(INSTANCE_NAME=prod20)(SERVER=DEDICATED)))'
    TopWaitEvents                   = '(monitor)'
    SidName                         = '(monitor)'

  Log file locations:
    Alert log               : /u01/app/oracle/diag/rdbms/prod20_site1/prod20/trace/alert_prod20.log
    Data Guard Broker log   : /u01/app/oracle/diag/rdbms/prod20_site1/prod20/trace/drcprod20.log

Database Status:
SUCCESS

DGMGRL> 

For the standby database

DGMGRL> show database verbose 'prod20_site2';

Database - prod20_site2

  Role:                PHYSICAL STANDBY
  Intended State:      APPLY-ON
  Transport Lag:       0 seconds (computed 1 second ago)
  Apply Lag:           0 seconds (computed 1 second ago)
  Average Apply Rate:  2.00 KByte/s
  Active Apply Rate:   0 Byte/s
  Maximum Apply Rate:  0 Byte/s
  Real Time Query:     OFF
  Instance(s):
    prod20

  Properties:
    DGConnectIdentifier             = 'prod20_site2'
    ObserverConnectIdentifier       = ''
    FastStartFailoverTarget         = ''
    PreferredObserverHosts          = ''
    LogShipping                     = 'ON'
    RedoRoutes                      = ''
    LogXptMode                      = 'ASYNC'
    DelayMins                       = '0'
    Binding                         = 'optional'
    MaxFailure                      = '0'
    ReopenSecs                      = '300'
    NetTimeout                      = '30'
    RedoCompression                 = 'DISABLE'
    PreferredApplyInstance          = ''
    ApplyInstanceTimeout            = '0'
    ApplyLagThreshold               = '30'
    TransportLagThreshold           = '30'
    TransportDisconnectedThreshold  = '30'
    ApplyParallel                   = 'AUTO'
    ApplyInstances                  = '0'
    ArchiveLocation                 = ''
    AlternateLocation               = ''
    StandbyArchiveLocation          = ''
    StandbyAlternateLocation        = ''
    InconsistentProperties          = '(monitor)'
    InconsistentLogXptProps         = '(monitor)'
    LogXptStatus                    = '(monitor)'
    SendQEntries                    = '(monitor)'
    RecvQEntries                    = '(monitor)'
    HostName                        = 'oraadserver2'
    StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=oraadserver2)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=PROD20_SITE2_DGMGRL)(INSTANCE_NAME=prod20)(SERVER=DEDICATED)))'
    TopWaitEvents                   = '(monitor)'
    SidName                         = '(monitor)'

  Log file locations:
    Alert log               : /u01/app/oracle/diag/rdbms/prod20_site2/prod20/trace/alert_prod20.log
    Data Guard Broker log   : /u01/app/oracle/diag/rdbms/prod20_site2/prod20/trace/drcprod20.log

Database Status:
SUCCESS

DGMGRL> 

Conclusion

I am sure that you will adopt this nice command.


Oracle 20c Data Guard : Validating a Fast Start Failover Configuration


In Oracle 20c, we can now validate a Fast Start Failover configuration with the new command VALIDATE FAST_START FAILOVER. This command helps identify issues in the configuration. I tested this new feature.
The Fast Start Failover is configured and the observer is running fine as we can see below.

DGMGRL> show configuration verbose

Configuration - prod20

  Protection Mode: MaxPerformance
  Members:
  prod20_site1 - Primary database
    prod20_site2 - (*) Physical standby database 

  (*) Fast-Start Failover target
  Properties:
    FastStartFailoverThreshold      = '30'
    OperationTimeout                = '30'
    TraceLevel                      = 'USER'
    FastStartFailoverLagLimit       = '30'
    CommunicationTimeout            = '180'
    ObserverReconnect               = '0'
    FastStartFailoverAutoReinstate  = 'TRUE'
    FastStartFailoverPmyShutdown    = 'TRUE'
    BystandersFollowRoleChange      = 'ALL'
    ObserverOverride                = 'FALSE'
    ExternalDestination1            = ''
    ExternalDestination2            = ''
    PrimaryLostWriteAction          = 'CONTINUE'
    ConfigurationWideServiceName    = 'prod20_CFG'
    ConfigurationSimpleName         = 'prod20'

Fast-Start Failover: Enabled in Potential Data Loss Mode
  Lag Limit:          30 seconds
  Threshold:          30 seconds
  Active Target:      prod20_site2
  Potential Targets:  "prod20_site2"
    prod20_site2 valid
  Observer:           oraadserver2
  Shutdown Primary:   TRUE
  Auto-reinstate:     TRUE
  Observer Reconnect: (none)
  Observer Override:  FALSE

Configuration Status:
SUCCESS

DGMGRL> 

If we run the command, we can see that everything is working and that a failover can happen if needed:

DGMGRL> VALIDATE FAST_START FAILOVER;
  Fast-Start Failover:  Enabled in Potential Data Loss Mode
  Protection Mode:      MaxPerformance
  Primary:              prod20_site1
  Active Target:        prod20_site2

DGMGRL>

Now let’s stop the observer

DGMGRL> stop observer

Observer stopped.

And if we run the Validate command again, we have the following message

DGMGRL> VALIDATE FAST_START FAILOVER;

  Fast-Start Failover:  Enabled in Potential Data Loss Mode
  Protection Mode:      MaxPerformance
  Primary:              prod20_site1
  Active Target:        prod20_site2

Fast-start failover not possible:
  Fast-start failover observer not started.

DGMGRL> 
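
To get back to a valid state, restarting the observer and re-running the validation should be enough. A minimal sketch (the exact options, such as running it in the background or naming the observer, depend on your setup):

DGMGRL> START OBSERVER;
DGMGRL> VALIDATE FAST_START FAILOVER;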


Speed up datapump export for migrating big databases


Introduction

Big Oracle databases (several TB) are still tough to migrate to another version on a new server. For most of them, you’ll probably use RMAN restore or Data Guard, but datapump is always a cleaner way to migrate. With datapump, you can easily migrate to a new filesystem (ASM for example), rethink your tablespace organization, reorganize all the segments, exclude unneeded components, etc. All of these tasks in one operation. But a datapump export can take hours and hours to complete. This blog post describes a method I used on several projects: it helped me a lot to optimize the migration time.

Why does datapump export take so much time?

First of all, exporting data with datapump actually extracts all the objects from the database, so it’s easy to understand why it’s much slower than copying datafiles. Datapump speed mainly depends on the speed of the disks where the datafiles reside, and on the parallelism level. Increasing parallelism does not always speed up the export, simply because on mechanical disks it’s slower to read multiple objects on the same disks than to read them serially. So there is some kind of limit, and for big databases the export can last hours. Another problem is that a long-lasting export needs more undo data: if your datapump export lasts 10 hours, you’ll need 10 hours of undo_retention (if you need a consistent dump – at least when testing the migration while the application is running). You also risk DDL changes on the database, and undo_retention cannot do anything about that. Be careful, because an incomplete dump is perfectly usable to import data, but you’ll miss several objects – not the goal I presume.
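
To illustrate the consistency point, here is a hedged sketch of what a consistent export of a live source would require (user, directory, file names and values are examples only): enough undo retention to cover the whole export, and a flashback_time on the expdp command.

-- give undo enough retention to cover the export duration (needs enough undo space)
SQL> alter system set undo_retention=36000;

$ expdp system full=y directory=DUMP_DIR dumpfile=expfull_%U.dmp parallel=8 flashback_time=systimestamp logfile=expfull.log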

The solution is to reduce the time needed for the datapump export in order to avoid such problems.

SSD is the solution

SSD is probably the best choice for today’s databases. No more I/O bottleneck, that’s all we were waiting for. But your source database, an old 11gR2 or 12cR1, probably doesn’t run on SSD, especially if it’s a big database: SSDs were quite small and expensive several years ago. So what? You probably didn’t plan an SSD migration on the source server as you will decommission it as soon as the migration is finished.

The solution is to use a temporary server fitted with fast SSDs. You don’t need a real server with a fully redundant configuration. You don’t even need RAID at all to protect your data because this server is only for a single use: JBOD is OK.

How to configure this server?

This server will have:

  • exactly the same OS, or something really similar compared to source server
  • the exact same Oracle version
  • the same configuration of the filesystems
  • enough free space to restore the source database
  • SSD-only storage for datafiles without redundancy
  • enough cores to maximise the parallelism level
  • a shared folder to put the dump, this shared folder would also be mounted on target server
  • a shared folder to pick up the latest backups from source database
  • enough bandwidth for the shared folders. A 1Gbps network is only about 100MB/s, so don’t expect very high speed with that kind of network
  • you don’t need a listener
  • you’ll never use this database for your application
  • if you’re reusing a server, make sure it will be dedicated for this purpose (no other running processes)

And regarding the license?

As you may know, this server would need a license. But you also know that during the migration project, you’ll have twice the licenses used on your environment for several weeks: still using the old servers, and already using the new servers for the migrated databases. To avoid any problem, you can use a server previously running Oracle databases and already decommissioned. Tweak it with SSDs and it will be fine. And please make sure to be fully compliant with the Oracle license on your target environment.

How to proceed?

We won’t use this server as a one-shot path for migration: we first need to check that the method is good enough and also find the best settings for datapump.

To proceed, the steps are:

  • declare the database in /etc/oratab
  • create a pfile on source server and copy it to $ORACLE_HOME/dbs on the temporary server
  • edit the parameters to remove references to the source environment, for example local_listener, remote_listener and Data Guard settings. The goal is to make sure that starting this database will have no impact on production
  • startup the instance on this pfile
  • restore the controlfile from the very latest controlfile autobackup
  • restore the database
  • recover the database and check the SCN
  • take a new archivelog backup on the source database (to simulate the real scenario)
  • catalog the backup folder on the temporary database with RMAN
  • do another recover database on temporary database, it should apply the archivelogs of the day, then check again the SCN
  • open the database in resetlogs mode
  • create the target directory for datapump on the database
  • do the datapump export with maximum parallelism level (2 times the number of cores available on your server – it will be too many at the beginning, but not enough at the end)

You can try various parallelism levels to adjust to the best value. Once you’ve found the best value, you can schedule the real migration.

Production migration

Now that you have mastered the method, let’s imagine that you planned to migrate the production tonight at 18:00.

09:00 – have a cup of coffee first, you’ll need it!
09:15 – remove all the datafiles on the temporary server, also remove redologs and controlfiles, and empty the FRA. Only keep the pfile.
09:30 – startup force your temporary database, it should stop in nomount mode
09:45 – restore the latest controlfile autobackup on temporary database. Make sure no datafile will be added today on production
10:00 – restore the database on the temporary server. During the restore, production is still available on source server. At the end of the restore, do a first recover but DON’T open your database with resetlogs now
18:00 – your restore should be finished now, you can disconnect everyone from source database, and take the very latest archivelog backup on source database. From now your application should be down.
18:20 – on your temporary database, catalog the backup folder with RMAN. It will discover the latest archivelog backups.
18:30 – do a recover of your temporary database again. It should apply the latest archivelogs (generated during the day). If you want to make sure that everything is OK, check the current_scn on source database, it should be nearly the same as your temporary database
18:45 – open the temporary database with RESETLOGS
19:00 – do the datapump export with your optimal settings

Once done, you now have to do the datapump import on your target database. Parallelism will depend on the cores available on the target server, and on the resources you want to preserve for the other databases already running on this server.
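
A hedged sketch of the import side (directory, dump file name and parallelism are examples to adapt):

$ impdp system full=y directory=MIGRATION dumpfile=expfull_%U.dmp parallel=8 logfile=impfull.log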

Benefits and drawbacks

The obvious benefit is that it probably takes less than 30 minutes to apply the archivelogs of the day on the temporary database. And the total duration of the export can be cut by several hours.

The first drawback is that you’ll need a server of this kind, or you’ll need to build one. The second drawback is if you’re using Standard Edition: don’t expect to save that many hours as it has no parallelism at all. Big databases are not very well served by Standard Edition, as you may know.

Real world example

This is a recent case. The source database is 12.1, about 2TB on mechanical disks. Datapump export was not working correctly: it lasted more than 19 hours with lots of errors. One of the big problems of this database is a bigfile tablespace of 1.8TB. Who came up with this kind of configuration?

The temporary server is an already decommissioned DEV server running the same version of Oracle and the same Linux kernel. This server is fitted with enough TB of SSD; the mount path was changed to match the source database filesystems.

On source server:

su - oracle
. oraenv <<< BP3
sqlplus / as sysdba
create pfile='/tmp/initBP3.ora' from spfile;
exit
scp /tmp/initBP3.ora oracle@db32-test:/tmp

On temporary server:
su - oracle
cp /tmp/initBP3.ora /opt/orasapq/oracle/product/12.1.0.2/dbs/
echo "BP3:/opt/orasapq/oracle/product/12.1.0.2:N" >> /etc/oratab
. oraenv <<< BP3
vi $ORACLE_HOME/dbs/initBP3.ora
remove db_unique_name, dg_broker_start, fal_server, local_listener, log_archive_config, log_archive_dest_2, log_archive_dest_state_2, service_names from this pfile
sqlplus / as sysdba
startup force nomount;
exit
ls -lrt /backup/db42-prod/BP3/autobackup | tail -n 1
/backup/db42-prod/BP3/autobackup/c-2226533455-20200219-01
rman target /
restore controlfile from '/backup/db42-prod/BP3/autobackup/c-2226533455-20200219-01';
alter database mount;
CONFIGURE DEVICE TYPE DISK PARALLELISM 8 BACKUP TYPE TO BACKUPSET;
restore database;
...
recover database;
exit;

On source server:
Take a last backup of archivelogs with your own script: the one used in scheduled tasks.

On temporary server:
su - oracle
. oraenv <<< BP3
rman target /
select current_scn from v$database;
CURRENT_SCN
-----------
11089172427
catalog start with '/backup/db42-prod/BP3/backupset/';
recover database;
select current_scn from v$database;
CURRENT_SCN
-----------
11089175474
alter database open resetlogs;
exit;
sqlplus / as sysdba
create or replace directory migration as '/backup/dumps/';
expdp \'/ as sysdba\' full=y directory=migration dumpfile=expfull_BP3_`date +%Y%m%d_%H%M`_%U.dmp parallel=24 logfile=expfull_BP3_`date +%Y%m%d_%H%M`.log

The export was done in less than 5 hours, 4 times less than on the source database. The database migration can now fit in one night. Much better, isn’t it?

Other solutions

If you’re used to Data Guard, you can create a standby on this temporary server that would be dedicated to this purpose. No need to manually apply the latest archivelog backup of the day because it’s already in sync. Just convert this standby to primary without impacting the source database, or do a simple switchover then do the datapump export.
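
One hedged way to open such a dedicated standby read-write for the export, without touching the source (assuming a broker configuration, flashback logging and enough FRA space; the database name is hypothetical), is to convert it temporarily to a snapshot standby:

DGMGRL> convert database 'bp3_tmp' to snapshot standby;
(run the datapump export from the read-write snapshot standby)
DGMGRL> convert database 'bp3_tmp' to physical standby;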

Transportable tablespace is a mixed solution where datafiles are copied to destination database, only metadata being exported and imported. But don’t expect any kind of reorganisation here.

If you cannot afford a downtime of several hours for the migration, you should think about logical replication. Solutions like GoldenGate are perfect for keeping the application running. But as you probably know, it comes at a cost.

Conclusion

If several hours of downtime is acceptable, datapump is still a good option for migration. Downtime is all about disk speed and parallelism.


Control-M – Agent Upgrade – Communication Issue


At a customer, we are upgrading Control-M Agents running on Windows/Unix from 9.0.00.400 to 9.0.19.200 using Agent Deployment. The upgrade of one agent fails with an error about a communication issue between the Control-M/Server and the Control-M/Agent:
Testing Communication with Control-M/Server using new Control-M/Agent binaries failed, Control-M/Agent Upgrade aborted, please check Log and agent installation requirements

On the other hand, the agent seems to be Available, as shown as “Connected” in the above screenshot, and all jobs running on it are executing and Ended OK. So why does this communication test fail? I found the answer 😉

Analysis

As said above, I checked this agent in the CCM and it was Available:

Hmm, maybe the CCM is not telling the truth; let’s ask ctm_diag_comm:

lds00% ctm_diag_comm

Enter Agent platform node ID: lds10.dbi

This procedure runs for about 30 seconds. Please wait

Time Stamp :
-------------
Fri Feb 21 18:54:18 CET 2020

Executing ctmping lds10.dbi as Regular Agent .

CONTROL-M/Server to CONTROL-M/Agent Communication Diagnostic Report
-------------------------------------------------------------------

 CTMS User Name                    : cmsv
 CTMS Directory                    : /pk/cntrlm/server/ctm_server
 CTMS Platform Architecture        : Linux-x86_64
 CTMS Installed Version            : PACTV.9.0.19.200
 CTMS Local IP Host Interface Name : ctlm9.dbi
 Server-to-Agent Port Number       : 7008
 Agent-to-Server Port Number       : 7005
 Server-Agent Comm. Protocol       : SSL
 Server-Agent Protocol Version     : 11
 Server-Agent Connection mode      : Transient
 Agent Platform Name               : lds10.dbi
 Agent Status                      : Available
 Agent known Type                  : Regular
 UNIX ping to Agent or Remote host : Succeeded
 CTMS ping to Agent or Remote host : Succeeded
 Agent IP Address                  : 10.25.35.1

 CTMS Ping lds10.dbi as Regular Agent
 ================================================
 Agent [lds10.dbi] is available

So, it is available… But why does the communication fail during the upgrade?

Let’s check on the agent side: I execute ctmagcfg and then choose 7 (Advanced parameters):

 cmag@lds10[~]> ctmagcfg

                Agent Configuration Utility

1)      Agent-to-Server Port Number . . . : [7005]
2)      Server-to-Agent Port Number . . . : [7008]
        For items 3 and 4 do not use IP address
3)      Primary Control-M/Server Host . . : [ctlm9.dbi]
4)      Authorized Control-M/Server Hosts : [ctlm9.dbi]
5)      Diagnostic Level. . . . . . . . . : [0]
6)      Comm Trace. . . . . .(0-OFF|1-ON) : [0]
7)      Advanced parameters

s)      Save
q)      Quit

Please enter your choice:7


                Agent Configuration Utility - advanced menu

1)      Days To Retain Log Files. . . . . : [1]
2)      Daily Log File Enabled. . .  (Y|N): [Y]
3)      Tracker Event Port. . . . . . . . : [29204]
4)      Logical Agent Name. . . . . . . . : [lds10]
5)      Persistent Connection . . . . . . : [N]
6)      Allow Comm Init. . .  . . . . . . : [Y]
7)      Foreign Language Support. . . . . : [LATIN-1]
8)      Locale (LATIN-1 mode only) . . .. : [C]
9)      Protocol Version. . . . . . . . . : [11]
10)     AutoEdit Inline. . . . . . . (Y|N): [Y]
11)     Listen to Network Interface . . . : [*ANY]
12)     CTMS Address Mode . . . . . (IP|) : []
13)     Timeout for Agent utilities . . . : [120]
14)     TCP/IP Timeout. . . . . . . . . . : [60]
15)     Tracker Polling Interval. . . . . : [60]

r)      Return to main menu

OK, the Logical Agent Name does not seem to be exactly the same as the one configured in the Control-M/Server; let’s change that:

Please enter your choice:4

Please enter the value:lds10.dbi

                Agent Configuration Utility - advanced menu

1)      Days To Retain Log Files. . . . . : [1]
2)      Daily Log File Enabled. . .  (Y|N): [Y]
3)      Tracker Event Port. . . . . . . . : [29204]
4)      Logical Agent Name. . . . . . . . : [lds10.dbi]
5)      Persistent Connection . . . . . . : [N]
6)      Allow Comm Init. . .  . . . . . . : [Y]
7)      Foreign Language Support. . . . . : [LATIN-1]
8)      Locale (LATIN-1 mode only) . . .. : [C]
9)      Protocol Version. . . . . . . . . : [11]
10)     AutoEdit Inline. . . . . . . (Y|N): [Y]
11)     Listen to Network Interface . . . : [*ANY]
12)     CTMS Address Mode . . . . . (IP|) : []
13)     Timeout for Agent utilities . . . : [120]
14)     TCP/IP Timeout. . . . . . . . . . : [60]
15)     Tracker Polling Interval. . . . . : [60]

Return to menu and save the change!

I retried the upgrade of this agent, and it completed successfully:

The agent is still Available in the CCM with the new version after upgrade:

Summary

Issue description

Upgrading Control-M/Agent running on Windows/Unix using Agent Deployment fails with error: “Testing Communication with Control-M/Server using new Control-M/Agent binaries failed, Control-M/Agent Upgrade aborted, please check Log and agent installation requirements”.

Cause

The agent’s Logical Agent Name parameter is not exactly the same as the Control-M/Agent’s host name that appears in the Control-M CCM.

Solution

Set the Control-M/Agent’s “Logical Agent Name” parameter to EXACTLY the same host name that appears in the Control-M Configuration Manager for this agent (follow above steps).

I hope that this blog helps you solve this issue quickly. If you have any question or doubt, don’t hesitate to comment; you can also share other issues encountered during upgrades 😉


Control-M/EM : Unlock and Take ownership of a folder


Introduction :

Definition of a workspace:

It is a board letting you create and update folders and jobs; once your work is finished and saved into the database, the workspace can be deleted.

Sometimes when you want to work on a folder and edit it (Check in) you can be stuck by someone else “working” on it:

But how can you proceed if this user is no longer here? For example, if he has taken 5 weeks of holidays 😀

Instead of waiting for him all this time, you have 2 solutions:

1. Join him for 5 weeks to enjoy the holidays

2. Take ownership of the folder

As we only live for job scheduling, we will select the second option:

Problem:

We need to update a folder, for example the one named FOLDERTOCHANGE:

The fact is that once you have selected it and try to update it, you face the following issue:

You have no way to modify the jobs in the folder:

Note:

It is absolutely normal: it is a BMC Control-M security mechanism to prevent 2 or more persons from working on the same folder at the same time, which would be a bit messy and could lose data. The user has exclusivity.

Solution:

As we can see, the user NOT_HERE_ON_HOLIDAYS is locking the workspace and seems to be working on it:

Select the “used by” icon (the man in a suit) to display the locked folder.

We can now click on the workspace held by user NOT_HERE…quite a long nickname 😊

Then it is possible to take ownership of the folder.

Of course, you must ensure that the user is not working on it anymore and that he only forgot to close his workspace.

Once you have taken ownership, a confirmation pop-up is displayed, reminding you that you must contact the owner before kicking him out:

Congratulations, you are now the workspace’s owner:

You are now able to modify the folder:

And as you are also the exclusive owner of the workspace, don’t forget to free it once you have finished your update (even if we have this emergency solution 😊).

If you have no more actions to do on the workspace and it is still owned by you, you can also delete the workspace to unlock it for the other users.

Select the corresponding workspace, click on the cross, and then confirm the deletion.

Then you will have a confirmation:

As a result you will see, when selecting the folder, that no user is assigned to it:

Conclusion:

You are now able to take ownership of a folder and update it as you want, but remember: you must do it only if you are sure that the owner is not working on it anymore; otherwise you have to warn him first.

You can also consult the BMC site and videos to go further: what about taking ownership on the web server??? 🙂

I also invite you to read my dbi services colleagues’ blogs on Control-M to learn more ways to use this flexible software 😎.

Next blogs coming: mass update, shouts, and even more… Stay tuned!


Oracle 20c SQL Macros: a scalar example to join agility and performance


By Franck Pachot

Let’s say you have a PEOPLE table with FIRST_NAME and LAST_NAME and you want, in many places of your application, to display the full name. Usually my name will be displayed as ‘Franck Pachot’ and I can simply add a virtual column to my table, or view, as: initcap(FIRST_NAME)||’ ‘||initcap(LAST_NAME). Those are simple SQL functions. No need for procedural code there, right? But, one day, the business will come with new requirements. In some countries (I’ve heard about Hungary but there are others), my name may be displayed with last name… first, like: ‘Pachot Franck’. And in some context, it may have a comma like: ‘Pachot, Franck’.

There comes a religious debate between Dev and Ops:

  • Developer: We need a function for that, so that the code can evolve without changing all SQL queries or views
  • DBA: That’s the worst you can do. Calling a function for each row is a context switch between SQL and PL/SQL engine. Not scalable.
  • Developer: Ok, let’s put all that business logic in the application so that we don’t have to argue with the DBA…
  • DBA: Oh, that’s even worse. The database cannot perform correctly with all those row-by-row calls!
  • Developer: No worry, we will put the database on Kubernetes, shard and distribute it, and scale as far as we need for acceptable throughput

And this is where we arrive in an unsustainable situation. Because we didn’t find a tradeoff between code maintainability and application performance, we get the worst from each of them: crazy resource usage for medium performance.

However, in Oracle 20c, we have a solution for that. Did you ever code C programs where you replaced functions by pre-processor macros, so that your code was readable and maintainable like when using modules and functions, but compiled as if those functions had been merged into the calling code at compile time? What was common in those 3rd-generation languages is now possible in a 4th-generation declarative language: Oracle SQL.

Let’s take an example. I’m building a PEOPLE table using the Linux /usr/share/dict word list:


create or replace directory "/usr/share/dict" as '/usr/share/dict';
create table people as
with w as (
select *
 from external((word varchar2(60))
 type oracle_loader default directory "/usr/share/dict" access parameters (nologfile) location('linux.words'))
) select upper(w1.word) first_name , upper(w2.word) last_name
from w w1,w w2 where w1.word like 'ora%' and w2.word like 'aut%'
order by ora_hash(w1.word||w2.word)
/

I have a table of about 110,000 rows here with first and last names.
Here is a sample:


SQL> select count(*) from people;

  COUNT(*)
----------
    110320

SQL> select * from people where rownum<=10;

FIRST_NAME                     LAST_NAME
------------------------------ ------------------------------
ORACULUM                       AUTOMAN
ORANGITE                       AUTOCALL
ORANGUTANG                     AUTHIGENOUS
ORAL                           AUTOPHOBIA
ORANGUTANG                     AUTOGENEAL
ORATORIAN                      AUTOCORRELATION
ORANGS                         AUTOGRAPHICAL
ORATORIES                      AUTOCALL
ORACULOUSLY                    AUTOPHOBY
ORATRICES                      AUTOCRATICAL

PL/SQL function

Here is my function that displays the full name, with the Hungarian specificity as an example but, as it is a function, it can evolve further:


create or replace function f_full_name(p_first_name varchar2,p_last_name varchar2)
return varchar2
as
 territory varchar2(64);
begin
 select value into territory from nls_session_parameters
 where parameter='NLS_TERRITORY';
 case (territory)
 when 'HUNGARY'then return initcap(p_last_name)||' '||initcap(p_first_name);
 else               return initcap(p_first_name)||' '||initcap(p_last_name);
 end case;
end;
/
show errors

The functional result depends on my session settings:


SQL> select f_full_name(p_first_name=>first_name,p_last_name=>last_name) from people
     where rownum<=10;

F_FULL_NAME(P_FIRST_NAME=>FIRST_NAME,P_LAST_NAME=>LAST_NAME)
------------------------------------------------------------------------------------------------
Oraculum Automan
Orangite Autocall
Orangutang Authigenous
Oral Autophobia
Orangutang Autogeneal
Oratorian Autocorrelation
Orangs Autographical
Oratories Autocall
Oraculously Autophoby
Oratrices Autocratical

10 rows selected.

But let’s run it on many rows, like using this function in the where clause, with autotrace:


SQL> set timing on autotrace on
select f_full_name(first_name,last_name) from people
where f_full_name(p_first_name=>first_name,p_last_name=>last_name) like 'Oracle Autonomous';

F_FULL_NAME(FIRST_NAME,LAST_NAME)
------------------------------------------------------------------------------------------------------
Oracle Autonomous

Elapsed: 00:00:03.47

Execution Plan
----------------------------------------------------------
Plan hash value: 2528372185

----------------------------------------------------------------------------
| Id  | Operation         | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |        |  1103 | 25369 |   129   (8)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| PEOPLE |  1103 | 25369 |   129   (8)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("F_FULL_NAME"("P_FIRST_NAME"=>"FIRST_NAME","P_LAST_NAME"=>
              "LAST_NAME")='Oracle Autonomous')


Statistics
----------------------------------------------------------
     110361  recursive calls
          0  db block gets
        426  consistent gets
          0  physical reads
          0  redo size
        608  bytes sent via SQL*Net to client
        506  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

More than 110,000 recursive calls. That is bad and not scalable. The time spent in context switches from the SQL to the PL/SQL engine is a waste of CPU cycles.

Note that this is difficult to improve because we cannot create an index for that predicate:


SQL> create index people_full_name on people(f_full_name(first_name,last_name));
create index people_full_name on people(f_full_name(first_name,last_name))
                                        *
ERROR at line 1:
ORA-30553: The function is not deterministic

Yes, this function cannot be deterministic because it depends on many other parameters (like the territory in this example, in order to check if I am in Hungary)

SQL Macro

The solution in 20c, currently available in the Oracle Cloud, is very easy. I create a new function, M_FULL_NAME, where the only differences with F_FULL_NAME are:

  1. I add the SQL_MACRO(SCALAR) keyword and change the return type to varchar2 (if not already)
  2. I enclose the return expression value in quotes (using q'[ … ]’ for better readability) to return it as a varchar2 containing the expression string where variable names are just placeholders (no bind variables here!)

create or replace function m_full_name(p_first_name varchar2,p_last_name varchar2)
return varchar2 SQL_MACRO(SCALAR)
as
 territory varchar2(64);
begin
 select value into territory from nls_session_parameters
 where parameter='NLS_TERRITORY';
 case (territory)
 when 'HUNGARY'then return q'[initcap(p_last_name)||' '||initcap(p_first_name)]';
 else               return q'[initcap(p_first_name)||' '||initcap(p_last_name)]';
 end case;
end;
/

Here is the difference if I call both of them:


SQL> set serveroutput on
SQL> exec dbms_output.put_line(f_full_name('AAA','BBB'));
Aaa Bbb

PL/SQL procedure successfully completed.

SQL> exec dbms_output.put_line(m_full_name('AAA','BBB'));
initcap(p_first_name)||' '||initcap(p_last_name)

PL/SQL procedure successfully completed.

SQL> select m_full_name('AAA','BBB') from dual;

M_FULL_
-------
Aaa Bbb

One returns the function value, the other returns the expression that can be used to return the value. It is a SQL Macro that can be applied to a SQL text to replace part of it – a scalar expression in this case as I mentioned SQL_MACRO(SCALAR)

The result is the same as with the previous function:


SQL> select m_full_name(p_first_name=>first_name,p_last_name=>last_name) from people
     where rownum<=10;

M_FULL_NAME(P_FIRST_NAME=>FIRST_NAME,P_LAST_NAME=>LAST_NAME)
-------------------------------------------------------------------------------------------------------
Oraculum Automan
Orangite Autocall
Orangutang Authigenous
Oral Autophobia
Orangutang Autogeneal
Oratorian Autocorrelation
Orangs Autographical
Oratories Autocall
Oraculously Autophoby
Oratrices Autocratical

10 rows selected.

And now let’s look at the query using this as a predicate:


SQL> set timing on autotrace on
SQL> select m_full_name(first_name,last_name) from people
     where m_full_name(p_first_name=>first_name,p_last_name=>last_name) like 'Oracle Autonomous';

M_FULL_NAME(FIRST_NAME,LAST_NAME)
-----------------------------------------------------------------------------------------------------
Oracle Autonomous

Elapsed: 00:00:00.06

Execution Plan
----------------------------------------------------------
Plan hash value: 2528372185

----------------------------------------------------------------------------
| Id  | Operation         | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |        |  1103 | 25369 |   122   (3)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| PEOPLE |  1103 | 25369 |   122   (3)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(INITCAP("FIRST_NAME")||' '||INITCAP("LAST_NAME")='Oracle
              Autonomous')


Statistics
----------------------------------------------------------
         40  recursive calls
          4  db block gets
        502  consistent gets
          0  physical reads
          0  redo size
        608  bytes sent via SQL*Net to client
        506  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

I don’t have all those row-by-row recursive calls. And the difference is easy to see in the execution plan predicate section: there’s no call to my PL/SQL function there. It was called only at parse time to transform the SQL statement, which now only uses the string returned by the macro, with parameter substitution.

That was my goal: stay in the SQL engine for the execution, calling only standard SQL functions. But while we are in the execution plan, can we do something to avoid the full table scan? My function is not deterministic but it has a small number of variations, two in my case. So I can create an index for each one:


 
SQL>
SQL> create index people_full_name_first_last on people(initcap(first_name)||' '||initcap(last_name));
Index created.

SQL> create index people_full_name_first_first on people(initcap(last_name)||' '||initcap(first_name));
Index created.

And run my query again:


SQL> select m_full_name(first_name,last_name) from people
     where m_full_name(p_first_name=>first_name,p_last_name=>last_name) like 'Autonomous Oracle';

no rows selected

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 1341595178

------------------------------------------------------------------------------------------------
| Id  | Operation        | Name                        | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                             |  1103 | 25369 |   118   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| PEOPLE_FULL_NAME_FIRST_LAST |   441 |       |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access(INITCAP("FIRST_NAME")||' '||INITCAP("LAST_NAME")='Autonomous Oracle')

Performance and agility

Now we are ready to bring the business logic back into the database so that it is co-located with the data and runs within the same process. Thanks to SQL Macros, we can even run it within the same engine, SQL, calling the PL/SQL one only at compile time to resolve the macro. And we keep full code maintainability as the logic is defined in a function that can evolve and be used in many places without duplicating the code.


DevOpsDays 2020 Geneva – Day 1


It is a pleasure to come back to Geneva after 5 years to attend my first DevOps event. I’m very excited to share my first-day feedback with you, but before that I would like to thank dbi services for allowing me to attend this event and to continuously improve my knowledge in this growing field.

The first day started with the Welcome Speech about the event in general, and especially about the program and the sponsors (by the way, dbi services is a Silver sponsor!). I liked the energy and the motivation that filled the room at the Geneva School of Business Administration.

Of course, at such an event we would like to attend all talks, sessions and workshops 😀 which is impossible because two parallel streams were organized. Below I list the talks and workshops I attended today:

Now that my delivery pipeline is in place, what should I do with my organization?

Good question. Joseph Glorieux and Mathieu Brun explained different approaches:

  1. Do DevOps yourself by hiring or train your actual team, no dedicated DevOps guy
  2. Create a dedicated DevOps Team
  3. Having one Team per product, from Dev to Deployment to Run: one Team should do everything
  4. The Google experience with “Site Reliability Engineering”: forget the notion of a Team doing everything! According to Ben Treynor, the founder of Google’s Site Reliability Team, SRE is “what happens when a software engineer is tasked with what used to be called operations”; in fact, the status of the application tells you who should manage it…

We saw 8 cases and possibilities, but in the end, as I predicted, they confirmed that there is no magic solution for the organization; in fact, it depends on your actual structure, teams, applications, needs, and so on. At dbi services we do our best to advise our clients at this level!

Travel to the orchestrators world

Thomas Cottier spoke about his story with containers and schedulers over the last few years. I would say this is the journey of all companies and teams that started working on orchestrators at the beginning; some of my colleagues at dbi services have already briefed me 😉 Today, these orchestrators have reached a certain maturity, but there is still a wide area for improvement!

The solution of merge hell in monorepo

In fact, a monorepo is a development strategy where code for many projects is stored in the same repository.
Advantages
There are a number of potential advantages to a monorepo over individual repositories:

  • Ease of code reuse : Similar functionality or communication protocols can be abstracted into shared libraries and directly included by projects
  • Simplified dependency management : In a multiple repository environment where multiple projects depend on a third-party dependency, that dependency might be downloaded or built multiple times. In a monorepo the build can be easily optimized, as referenced dependencies all exist in the same codebase
  • Atomic commits : When projects that work together are contained in separate repositories, releases need to sync which versions of one project work with the other. And in large enough projects, managing compatible versions between dependencies can become dependency hell. In a monorepo this problem can be negated, since developers may change multiple projects atomically
  • Large-scale code refactoring : Since developers have access to the entire project, refactors can ensure that every piece of the project continues to function after a refactor
  • Collaboration across teams : In a monorepo that uses source dependencies (dependencies that are compiled from source),[6] teams can improve projects being worked on by other teams. This leads to flexible code ownership

Disadvantages

  • Loss of version information : Although not required, some monorepo builds use one version number across all projects in the repository. This leads to a loss of per-project semantic versioning
  • Lack of per-project security – With split repositories, access to a repository can be granted based upon need. A monorepo allows read access to all software in the project, possibly presenting new security issues
  • More storage needed : With split repositories, you can fetch only the project you are interested in. With a monorepo, you might need to fetch all projects. Note that this depends on the versioning system. This is not an issue if you use e.g. SVN in which you can download any part of the repo

At one of our customers, we are actively thinking about and preparing a monorepo solution.

A multi cloud Service Mesh deployment in action

The term service mesh is used to describe the network of microservices that make up such applications and the interactions between them. As a service mesh grows in size and complexity, it can become harder to understand and manage. Its requirements can include discovery, load balancing, failure recovery, metrics, and monitoring.
We saw Istio and how it lets you connect, secure, control, and observe services. In fact, Istio makes it easy to create a network of deployed services with load balancing, service-to-service authentication, monitoring, and more, with few or no code changes in service code.

Tomorrow will be the second and last day; I already know which sessions I will attend and will keep you posted with quick feedback. The next step for me will be to dive deeper into this world, which fascinates me.

Cet article DevOpsDays 2020 Geneva – Day 1 est apparu en premier sur Blog dbi services.


Java 1.8 Utility classes: XML, ZIP, BufferedImage and Download


When I write code, I usually need some utility classes to ease the development. So here I will share my most used classes. Hope this will help you as well!

XML store and load

I strongly use XML for storing configurations or even data, so I made a helper which can store and load any class (with annotations) as XML, using generics.

import java.io.File;

import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Marshaller;
import javax.xml.bind.Unmarshaller;

/**
 * Utility class to provide some handy method about xml management
 * 
 * @author sbi
 *
 */
public class XmlHelper {

	/**
	 * Generic method to store xml files based on classes with XML anotations
	 * 
	 * @param element - The object to store as xml
	 * @param file - The file path where to store the xml
	 * @throws JAXBException
	 */
	public static <T> void storeXml(T element, File file) throws JAXBException {
		JAXBContext jaxbContext = JAXBContext.newInstance(element.getClass());
		Marshaller jaxbMarshaller = jaxbContext.createMarshaller();
		jaxbMarshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
		jaxbMarshaller.marshal(element, file);
	}
	
	/**
	 * Generic method to load xml and transform it into an object which was declared with annotations
	 * @param clazz - The class of the object in which we want the xml to be casted
	 * @param file - The XML file located on the file system
	 * @return An object converted from XML
	 * @throws JAXBException
	 */
	@SuppressWarnings("unchecked")
	public static <T> T loadXml(Class<T> clazz, File file) throws JAXBException {
		JAXBContext jaxbContext = JAXBContext.newInstance(clazz);
		Unmarshaller jaxbUnmarshaller = jaxbContext.createUnmarshaller();
		return (T) jaxbUnmarshaller.unmarshal(file);
	}
}

And here is an example of a basic class that can be written as XML. I’ve added 4 kinds of annotations:

  • XMLRootElement: It marks this class as XML enabled. It is mandatory.
  • XMLElement: Basic XML element which will create the hierarchy
  • XMLAttribute: Attributes will be located inside the declaration of a new item
  • XMLElementWrapper: It allows storing lists of objects. You can even add your own object, but then you will have to add annotations to the underlying object as well, like you did for this class.
import java.util.ArrayList;
import javax.xml.bind.annotation.XmlAttribute;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlElementWrapper;
import javax.xml.bind.annotation.XmlRootElement;

@XmlRootElement(name="XMLClass")
public class XMLClass {

	private String name;
	private int id;
	private ArrayList<String> xmlElements;
	
	public ArrayList<String> getXmlElements() {
		return xmlElements;
	}
	
	@XmlElementWrapper(name="ElementList")
	@XmlElement(name="Element")
	public void setXmlElements(ArrayList<String> xmlElements) {
		this.xmlElements = xmlElements;
	}
	
	public int getId() {
		return id;
	}
	
	@XmlAttribute(name="ID")
	public void setId(int id) {
		this.id = id;
	}
	
	public String getName() {
		return name;
	}
	
	@XmlElement(name="Name")
	public void setName(String name) {
		this.name = name;
	}
}

An example of usage:

XMLClass cl = new XMLClass();
cl.setId(1234);
cl.setName("MyXMLExample");
cl.setXmlElements(new ArrayList<String>());
cl.getXmlElements().add("Element1");
cl.getXmlElements().add("Element2");
cl.getXmlElements().add("Element3");
cl.getXmlElements().add("Element4");
try {
	XmlHelper.storeXml(cl, new File("xmlexample.xml"));
} catch (JAXBException e) {
	e.printStackTrace();
}

Result:

<XMLClass ID="1234">
    <Name>MyXMLExample</Name>
    <ElementList>
        <Element>Element1</Element>
        <Element>Element2</Element>
        <Element>Element3</Element>
        <Element>Element4</Element>
    </ElementList>
</XMLClass>
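
To complete the round trip, here is a minimal usage sketch (assuming the xmlexample.xml file generated above and the XMLClass shown earlier) that loads the document back with the loadXml method:

try {
	// Load the previously stored file back into an XMLClass instance
	XMLClass loaded = XmlHelper.loadXml(XMLClass.class, new File("xmlexample.xml"));
	System.out.println(loaded.getId() + " - " + loaded.getName());
	// Print the elements stored under <ElementList>
	for (String element : loaded.getXmlElements()) {
		System.out.println(element);
	}
} catch (JAXBException e) {
	e.printStackTrace();
}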

Zipping and Unzipping

Another handy class to manage zips in Java. I will not cover this one in detail as it is self-explanatory. Just call the zip function with a source path and the target path (e.g. my_path/myZip.zip).

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
/**
 * Utility class to provide some handy method about zip management
 * Credit goes to: https://www.baeldung.com/java-compress-and-uncompress
 * 
 * @author sbi
 *
 */
public class ZipHelper {
	
	/**
	 * Zips a folder. The folder will be contained inside the zip.
	 * 
	 * @param sourceFile - The path to the source folder to zip
	 * @param target - The name and path to the target zipped file. e.g: path_to_file/myZip.zip
	 * @throws IOException - If the zipping failed
	 */
	public static void zipFolder(String sourceFile, String target) throws IOException {
        FileOutputStream fos = new FileOutputStream(target);
        ZipOutputStream zipOut = new ZipOutputStream(fos);
        File fileToZip = new File(sourceFile);
 
        zipFile(fileToZip, fileToZip.getName(), zipOut);
        zipOut.close();
        fos.close();
    }
 
	/**
	 * Internal zip method to zip a specific file into the folder
	 * 
	 * @param fileToZip - The file to zip
	 * @param fileName - The name of the file
	 * @param zipOut - The outputstream of the current zipping process
	 * @throws IOException - If the zipping failed
	 */
    private static void zipFile(File fileToZip, String fileName, ZipOutputStream zipOut) throws IOException {
        if (fileToZip.isHidden()) {
            return;
        }
        if (fileToZip.isDirectory()) {
            if (fileName.endsWith("/")) {
                zipOut.putNextEntry(new ZipEntry(fileName));
                zipOut.closeEntry();
            } else {
                zipOut.putNextEntry(new ZipEntry(fileName + "/"));
                zipOut.closeEntry();
            }
            File[] children = fileToZip.listFiles();
            for (File childFile : children) {
                zipFile(childFile, fileName + "/" + childFile.getName(), zipOut);
            }
            return;
        }
        FileInputStream fis = new FileInputStream(fileToZip);
        ZipEntry zipEntry = new ZipEntry(fileName);
        zipOut.putNextEntry(zipEntry);
        byte[] bytes = new byte[1024];
        int length;
        while ((length = fis.read(bytes)) >= 0) {
            zipOut.write(bytes, 0, length);
        }
        fis.close();
    }
    
    /**
     * Unzips a zip file
     * 
     * @param fileZip - The zip file to unzip
     * @param destDir - The path to where the zip file will be unzipped
     * @throws IOException - If the zipping failed
     */
    public static void unzipFolder(String fileZip, String destDir) throws IOException {
        byte[] buffer = new byte[1024];
        ZipInputStream zis = new ZipInputStream(new FileInputStream(fileZip));
        ZipEntry zipEntry = zis.getNextEntry();
        while (zipEntry != null) {
            File newFile = newFile(new File(destDir), zipEntry);
            if (zipEntry.isDirectory()) {
                // Directory entries (created by zipFolder) only need to be created on disk
                newFile.mkdirs();
            } else {
                // Make sure the parent directories exist before writing the file content
                newFile.getParentFile().mkdirs();
                FileOutputStream fos = new FileOutputStream(newFile);
                int len;
                while ((len = zis.read(buffer)) > 0) {
                    fos.write(buffer, 0, len);
                }
                fos.close();
            }
            zipEntry = zis.getNextEntry();
        }
        zis.closeEntry();
        zis.close();
    }
    
    /**
     * Internal unzip method used to unzip a specific file
     * 
     * @param destinationDir
     * @param zipEntry
     * @return
     * @throws IOException
     */
    private static File newFile(File destinationDir, ZipEntry zipEntry) throws IOException {
        File destFile = new File(destinationDir, zipEntry.getName());
         
        String destDirPath = destinationDir.getCanonicalPath();
        String destFilePath = destFile.getCanonicalPath();
         
        if (!destFilePath.startsWith(destDirPath + File.separator)) {
            throw new IOException("Entry is outside of the target dir: " + zipEntry.getName());
        }
         
        return destFile;
    }
}
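
A quick usage sketch (the folder and file names below are just placeholders):

try {
	// Zip the folder "my_folder" into myZip.zip, then extract it again
	ZipHelper.zipFolder("my_folder", "myZip.zip");
	ZipHelper.unzipFolder("myZip.zip", "my_unzipped_folder");
} catch (IOException e) {
	e.printStackTrace();
}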

Storing and loading BufferedImage as string

I came across a point where I had to manage images inside a program. I wanted to store images in a handier way than regular image files: as a String, so they can be stored inside an XML file and transformed back into images on the fly at runtime. So here are 2 simple functions to store and load PNG images. You can adapt them for other formats of course:

import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Base64;
import javax.imageio.ImageIO;

public static String imageToString(BufferedImage img) throws IOException {
	final ByteArrayOutputStream os = new ByteArrayOutputStream();
	ImageIO.write(img, "png", os);
	return Base64.getEncoder().encodeToString(os.toByteArray());
}

public static BufferedImage stringToImage(String text) throws IOException {
	byte[] imageData = Base64.getDecoder().decode(text);
	ByteArrayInputStream bais = new ByteArrayInputStream(imageData);
	return ImageIO.read(bais);
}

I use Base64 to avoid strange characters in the result; it is then stored as a one-line String, which is nicer. If you store it without Base64 in an XML document, it will be difficult to load it again, as XML doesn’t support such characters. And it seems to be a bit more condensed, resulting in a smaller size footprint.
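
As a quick illustration, here is a hypothetical round trip (logo.png and logo_copy.png are placeholder file names):

try {
	// Read a PNG, serialize it to a Base64 string, then restore and save it again
	BufferedImage original = ImageIO.read(new File("logo.png"));
	String encoded = imageToString(original);
	BufferedImage restored = stringToImage(encoded);
	ImageIO.write(restored, "png", new File("logo_copy.png"));
} catch (IOException e) {
	e.printStackTrace();
}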

Download file from URL

This one is for downloading a file from a URL, and it is pretty easy to use. Provide the URL as the source and specify a target file.

import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.net.URL;

public static void downloadFileTo(String source, String target) throws IOException {
	BufferedInputStream inputStream = new BufferedInputStream(new URL(source).openStream());
	File file = new File(target);
	file.getParentFile().mkdirs();
	file.createNewFile();
	FileOutputStream fileOS = new FileOutputStream(file, false);
	byte[] data = new byte[1024];
	int byteContent;
	while ((byteContent = inputStream.read(data, 0, 1024)) != -1) {
		fileOS.write(data, 0, byteContent);
	}
	fileOS.close();
	inputStream.close();
}
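
A minimal usage sketch (the URL and target path below are placeholders):

try {
	downloadFileTo("https://example.com/somefile.zip", "downloads/somefile.zip");
} catch (IOException e) {
	e.printStackTrace();
}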

Log4j2 external file configuration

I came to the point where I had to configure log4j2 to use a specific file outside the generated jar file. I don’t really like embedded configuration files as you cannot edit them on the fly. So here is a function to specify the location of the log4j2.xml file. You will have to call it at the start of your program so you can use logging as soon as possible:

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.apache.logging.log4j.core.config.ConfigurationSource;
import org.apache.logging.log4j.core.config.Configurator;

private static Logger log;

private void configureLogging(String location) throws FileNotFoundException, IOException {
	// Set configuration file for log4j2
	ConfigurationSource source = new ConfigurationSource(new FileInputStream(location));
	Configurator.initialize(null, source);
	log = LogManager.getLogger(YourClass.class);
}
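
A minimal usage sketch, assuming the method above lives in YourClass and that /app/config/log4j2.xml is the path to your external configuration file (both names are placeholders):

public static void main(String[] args) {
	try {
		// Configure logging first, so it is available as early as possible
		new YourClass().configureLogging("/app/config/log4j2.xml");
		log.info("Logging configured from external file");
	} catch (IOException e) {
		e.printStackTrace();
	}
}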

Cet article Java 1.8 Utility classes: XML, ZIP, BufferedImage and Download est apparu en premier sur Blog dbi services.

WebLogic 12.2.1.4 software installation or upgrade installs and configures the Coherence cache in the domains


We are using a silent installation to install the WebLogic Server software based on a response file with “INSTALL_TYPE=WebLogic Server”, but it looks like the Coherence server is installed because, when we start the WebLogic Server, the following can be seen in the WebLogic log files:

Oracle Coherence Version 12.2.1.4.0 Build 74888
Grid Edition: Development mode
Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.
2020-01-03 14:18:45.334/42.467 Oracle Coherence GE 12.2.1.4.0  (thread=[ACTIVE] ExecuteThread: '5' for queue: 'weblogic.kernel.Default (self-tuning)', member=n/a): Configured versioned, multi-cluster Management over ReST

We have the same behavior when installing WebLogic Software using the GUI and selecting WebLogic only.

This was not the case with WebLogic 12.2.1.3.

As the Oracle WebLogic Server 12.2.1.4 Release Notes state, there is no option to deselect Coherence.

Installing the Oracle WebLogic Server and Coherence Software

About the Coherence Installation Type
For the WebLogic Server and Coherence standard installation topology, select the WebLogic Server installation type. When you select this installation type and use instructions in this guide, the standard installation topology includes a Coherence cluster that contains storage-enabled Managed Coherence Servers.

And below in the “Table 2-1 Oracle WebLogic Server and Coherence Installation Screens” we can see:

To create the standard installation topology for WebLogic Server and Coherence, select WebLogic Server.

Note:The Coherence gets installed with the WebLogic Server Installation and there is no option to deselect it under WebLogic Server.

I opened a ticket at Oracle Support concerning my licensing concerns, as we do have a few customers with WebLogic Basic licensing that does not include Coherence. Here is the answer:

In an FMW 12c 12.2.1.x Infrastructure installation, there are no options for a Custom Installation and by default Coherence binaries will be installed during installation. A Bug/ER 25889430 has been created for this, and determined there was no action necessary. The Coherence binaries being installed by default it should not create any licensing issues from a Support perspective when installing and using the FMW 12c Infrastructure as documented. Oracle may install Coherence with an Infrastructure (including WebLogic Server Management Framework), but that does not mean a Coherence Enterprise Edition license needs to be purchased.

In Weblogic 12c Installation, there’s no option for Custom Install Type under which the installation of Coherence can be deselected with Weblogic Installation. By default, Coherence binaries will be installed during Weblogic 12c installation (unlike 11g, where there was an option to deselect Coherence under Custom Install Type) . The Coherence binaries shouldn’t create any problems until they are used. If desired, one can delete “Coherence” directory, once the installation is complete.

Oracle Support published two KM notes in the meantime:

Is it Possible to De-Select Coherence During the Oracle Fusion Middleware 12c Infrastructure Installation? ( Doc ID 2254754.1 )
Coherence Gets Installed With Weblogic Server 12c – Can This Be Deselected? ( Doc ID 2310687.1 )

Cet article WebLogic 12.2.1.4 software installation or upgrade installs and configures the Coherence cache in the domains est apparu en premier sur Blog dbi services.

After login as administrator to the WebLogic console, impossible to run administration action


For a few weeks I was faced with a problem that I had never encountered before. On a Windows Server, browsing to the WebLogic Administration console, I was not able to do any modification or even check the monitoring screens. I always got a message that an error occurred, without more explanation. After a new start of the WebLogic Administration Server, the error disappeared but came back after a day or a few hours.

      <An exception [java.lang.InternalError: Unexpected CryptoAPI failure generating seed] was thrown while rendering the content at [/jsp/contentheader/ContentMenu.jsp].
javax.servlet.ServletException: java.lang.InternalError: Unexpected CryptoAPI failure generating seed
at weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:341)
at weblogic.servlet.internal.ServletStubImpl.onAddToMapException(ServletStubImpl.java:416)
at weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:326)
at weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:183)

Note the highlighted line above; the issue was that the random number generator did not have a wide enough range.

After this finding, the solution was to add the “urandom” generator to the Java options of the Administration Server start command. This was done using the setUserOverrides.cmd file to avoid those changes being lost at the next upgrade.

@REM Set the urandom for the Administration Server

if "%SERVER_NAME%"=="AdminServer" (
    set JAVA_OPTIONS=%JAVA_OPTIONS% -Djava.security.egd=file:/dev/./urandom
)

Of course, a restart of the WebLogic Administration Server is required for those Java options to be taken into account.

Cet article After login as administrator to the WebLogic console, impossible to run administration action est apparu en premier sur Blog dbi services.

DevOpsDays 2020 Geneva – Day 2


As promised in my previous blog, I will give you short feedback on the second day here at DevOpsDays 2020 Geneva. In fact, I prepared a very interesting program of sessions to attend and people to meet. I must say that the open-spaces concept applied on the second day allowed participants to discuss different topics without limits.

I started the day by preparing our stand with my colleagues Pierre-Yves Brehier and Arnaud Berbier; as usual we were the first to arrive and the first to be ready 😉

The next step was attending very interesting sessions:

GitOps as a way to manage enterprise K8s and virtual machines

The speakers highlighted the importance of having the entire system described in a declarative way, which means that if you read your configuration files you are able to understand what your system looks like. The second point is that Git is the single source of truth.
They also explained pull-based deployments and how they should be triggered by the build pipeline, pushing container images into the registry, and so on: all the steps of the Continuous Integration and Continuous Delivery pipelines. This was an interesting presentation with an interesting approach; see the image below as an example (without comment).

Building a scalable logging platform

How to manage your logs and alerts within K8s: this was interesting for me as I am currently working on a similar project for one of our customers.
On our side we are using Filebeat, Elasticsearch and Kibana, and it works fine with both K8s and non-K8s environments!

The ROI of Mental Health: Building Happier, More Profitable Companies

We are, and we feel that everyone is, excited by new technologies and DevOps; that is why I think this topic deserves its place in this kind of event. Vinciane de Pape spoke about mental health and the risks of working more than 11 hours a day. I must confess that at dbi services everything is done to prevent this from happening! That is why we are a Great Place to Work 😉

At the end, I had the honor to write my name there, hoping to come back next year 🙂

Cet article DevOpsDays 2020 Geneva – Day 2 est apparu en premier sur Blog dbi services.

WebLogic Server process takes 100% CPU


During some monitoring work I noticed that top showed the process 12013 using 800% CPU (we have a 12 CPU machine) and I had to find out what was happening!

I used top to check the process ID and then the threads taking the CPU

ps -ef |grep 12013

weblogic 12013 11836 23 Apr03 ?        03:38:31 /app/weblogic/Java/jdk/bin/java -server -Xms2048m -Xmx2048m -XX:MaxMetaspaceSize=512m -Dweblogic.Name=msD2-02 -Djava.security.policy=/app/weblogic/Middleware/wlserver/server/lib/weblogic.policy -Dweblogic.ProductionModeEnabled=true -Dweblogic.system.BootIdentityFile=/app/weblogic/domains/myDomain/servers/msD2-02/data/nodemanager/boot.properties -Dweblogic.nodemanager.ServiceEnabled=true -Dweblogic.nmservice.RotationEnabled=true -Dweblogic.security.SSL.ignoreHostnameVerification=false -Dweblogic.ReverseDNSAllowed=false -Xms2048m -Xmx2048m -XX:MaxMetaspaceSize=512m -Dcom.sun.xml.ws.api.streaming.XMLStreamReaderFactory.woodstox=true -Dcom.sun.xml.ws.api.streaming.XMLStreamWriterFactory.woodstox=true -Djava.io.tmpdir=/app/weblogic/tmp/myDomain/msD2-02 -Ddomain.home=/app/weblogic/domains/myDomain -Dweblogic.nodemanager.ServiceEnabled=true -Dweblogic.security.SSL.protocolVersion=TLS1 -Dweblogic.security.disableNullCipher=true -Djava.security.egd=file:///dev/./urandom -Dweblogic.security.allowCryptoJDefaultJCEVerification=true -Dweblogic.nodemanager.ServiceEnabled=true -Djava.endorsed.dirs=/app/weblogic/Java/jdk1.8.0_45/jre/lib/endorsed:/app/weblogic/Middleware/wlserver/../oracle_common/modules/endorsed -da -Dwls.home=/app/weblogic/Middleware/wlserver/server -Dweblogic.home=/app/weblogic/Middleware/wlserver/server -Dweblogic.management.server=https://10.10.1.1:7001 -Dweblogic.utils.cmm.lowertier.ServiceDisabled=true weblogic.Server

top -H -b -p 12013

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
12624 weblogic  20   0 10.5g 3.6g  30m S 83.1 11.5  18053:38 java
 2540 weblogic  20   0 10.5g 3.6g  30m S 79.4 11.5   3:11.62 java
 2591 weblogic  20   0 10.5g 3.6g  30m S 79.4 11.5   3:08.02 java
12436 weblogic  20   0 10.5g 3.6g  30m S  9.2 11.5   3:14.59 java

Now that I have the Unix thread PIDs, I will get a thread dump of the WebLogic Server process and try to figure out the culprits. To generate the thread dump I used the jstack Java tool.

jstack -l 12013 >  $HOME/jstack_100_CPU.txt

The thread dump identifies the threads with their ID in hexadecimal, thus the IDs reported by top need to be converted:

[myDomain]$ printf '%x\n' 12624
3150
[myDomain]$  printf '%x\n' 2540
9ec
[myDomain]$  printf '%x\n' 2591
a1f
[myDomain]$
"LDAPConnThread-1172 ldaps://ldapserver.dbi-lab.com:3636" #167119 daemon prio=5 os_prio=0 tid=0x00007f5444087800 nid=0xa1f runnable [0x00007f5409e88000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
        - locked  (a sun.nio.ch.Util$2)
        - locked  (a java.util.Collections$UnmodifiableSet)
        - locked  (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
        at weblogic.socket.NIOSocketMuxer$NIOInputStream.readInternal(NIOSocketMuxer.java:802)
        at weblogic.socket.NIOSocketMuxer$NIOInputStream.read(NIOSocketMuxer.java:746)
        at weblogic.socket.NIOSocketMuxer$NIOInputStream.read(NIOSocketMuxer.java:729)
        at weblogic.socket.JSSEFilterImpl.readFromNetwork(JSSEFilterImpl.java:462)
        at weblogic.socket.JSSEFilterImpl.read(JSSEFilterImpl.java:424)
        at weblogic.socket.JSSESocket$JSSEInputStream.read(JSSESocket.java:98)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
        - locked  (a java.io.BufferedInputStream)
        at netscape.ldap.ber.stream.BERElement.getElement(Unknown Source)
        at netscape.ldap.LDAPConnThread.run(Unknown Source)
        at java.lang.Thread.run(Thread.java:745)

 
"[STUCK] ExecuteThread: '302' for queue: 'weblogic.kernel.Default (self-tuning)'" #326 daemon prio=1 os_prio=0 tid=0x00007f5560376000 nid=0x3150 runnable [0x00007f543636e000]
   java.lang.Thread.State: RUNNABLE
        at java.io.InputStream.skip(InputStream.java:224)
        at weblogic.utils.http.HttpChunkInputStream.skip(HttpChunkInputStream.java:287)
        at weblogic.utils.http.HttpChunkInputStream.skipAllChunk(HttpChunkInputStream.java:497)
        at weblogic.servlet.internal.ServletInputStreamImpl.ensureChunkedConsumed(ServletInputStreamImpl.java:51)
        at weblogic.servlet.internal.ServletRequestImpl.skipUnreadBody(ServletRequestImpl.java:217)
        at weblogic.servlet.internal.ServletRequestImpl.reset(ServletRequestImpl.java:169)
        at weblogic.servlet.internal.HttpConnectionHandler.prepareRequestForReuse(HttpConnectionHandler.java:258)
        at weblogic.servlet.internal.HttpConnectionHandler.requeue(HttpConnectionHandler.java:665)
        at weblogic.servlet.internal.VirtualConnection.requeue(VirtualConnection.java:332)
        at weblogic.servlet.internal.ServletResponseImpl.send(ServletResponseImpl.java:1657)
        at weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1582)
        at weblogic.servlet.provider.ContainerSupportProviderImpl$WlsRequestExecutor.run(ContainerSupportProviderImpl.java:255)
        at weblogic.work.ExecuteThread.execute(ExecuteThread.java:311)
        at weblogic.work.ExecuteThread.run(ExecuteThread.java:263)

"LDAPConnThread-1171 ldaps://ldapserver.dbi-lab.com:3636"" #167118 daemon prio=5 os_prio=0 tid=0x00007f5544048000 nid=0x9ec runnable [0x00007f540a28c000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.interrupt(Native Method)
        at sun.nio.ch.EPollArrayWrapper.interrupt(EPollArrayWrapper.java:317)
        at sun.nio.ch.EPollSelectorImpl.wakeup(EPollSelectorImpl.java:193)
        - locked  (a java.lang.Object)
        at java.nio.channels.spi.AbstractSelector$1.interrupt(AbstractSelector.java:213)
        at java.nio.channels.spi.AbstractSelector.begin(AbstractSelector.java:219)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:78)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
        - locked  (a sun.nio.ch.Util$2)
        - locked  (a java.util.Collections$UnmodifiableSet)
        - locked  (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
        at weblogic.socket.NIOSocketMuxer$NIOInputStream.readInternal(NIOSocketMuxer.java:802)
        at weblogic.socket.NIOSocketMuxer$NIOInputStream.read(NIOSocketMuxer.java:746)
        at weblogic.socket.NIOSocketMuxer$NIOInputStream.read(NIOSocketMuxer.java:729)
        at weblogic.socket.JSSEFilterImpl.readFromNetwork(JSSEFilterImpl.java:462)
        at weblogic.socket.JSSEFilterImpl.read(JSSEFilterImpl.java:424)
        at weblogic.socket.JSSESocket$JSSEInputStream.read(JSSESocket.java:98)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
        - locked  (a java.io.BufferedInputStream)
        at netscape.ldap.ber.stream.BERElement.getElement(Unknown Source)
        at netscape.ldap.LDAPConnThread.run(Unknown Source)
        at java.lang.Thread.run(Thread.java:745)

This looks to be an issue with the LDAP server or the network.

The WebLogic Java threads loop on the LDAP connection to get data. This loop should fail with a timeout depending on the Authentication provider LDAP connection timeout settings. But it looks like there is an issue with the JDK that reinitiates the connection in the sun.nio.ch.EPollArrayWrapper class.

Once the LDAP connection was back the WebLogic Domain could be restarted properly and the service was back.

A possible workaround could be to not use the weblogic.socket.NIOSocketMuxer but the weblogic.socket.PosixSocketMuxer. This workaround is described in Oracle Support note 2128032.1.
Or a better solution is to upgrade to the latest WebLogic Software version and apply the latest PSU where this issue is fixed.

Cet article WebLogic Server process takes 100% CPU est apparu en premier sur Blog dbi services.

DevOpsDays 2020 at HEG in Geneva


This 2020 new year began with a really exciting DevOps event in Geneva. Kubernetes, Helm, Cloud Native, CNCF, CI/CD, Ansible, Terraform… So many topics around DevOps that were on everyone’s lips. This confirms that DevOps is a good choice for any retraining 😉 It is effectively the next generation in the IT world.

For this second edition, around 300 people attended the event. Thanks to the organizers!

During the 1st day, I attended the workshop “Terraform best practices with examples and arguments” given by Anton Babenko. Anton is an AWS Community Hero and is specialized in Infrastructure as Code. Do not hesitate to have a look at his website dedicated to best practices:

https://www.terraform-best-practices.com

After the workshop, I went to an interesting comparison of git repository organization mechanisms: monorepo and polyrepo.

Maria Guseva provided examples, gave the advantages and drawbacks of each, and showed us how they manage their source code monorepo at Yandex.

Another session that caught my attention was the one given by Paolo Kreth, “Persistence layers for microservices – the converge database approach”. It began very well with a well-structured introduction. I was excited, as he was providing information about how to architect an application with a persistence layer and microservices.

Then, even after putting a disclaimer that his presentation was not for marketing purposes, he totally changed and talked a lot about the Oracle Database product. Mainly saying that with the latest versions “from 12c”, the Pluggable Database is comparable to a container “the container database” where we can easily patch and upgrade with near zero downtime. He added that with the autonomous features, there is less stuff to take care of compared to PostgreSQL for example. The speech concentrated on comparing with a failed POC made at Mobiliar in a previous job. As an Oracle employee, we can understand.

By the way, he also reminded us of the updated licensing model: as of December 5th 2019, the Machine Learning and the Spatial & Graph feature options no longer require a specific license. These are included in all editions of the Oracle Database “EE and SE2”.

Let me share a screenshot of the speech that I most appreciated… It was from “Jan De Vries”: Becoming antifragile is more important than ever in disruptive times.

I enjoyed these two days, and thanks again to the organisers.

Cet article DevOpsDays 2020 at HEG in Geneva est apparu en premier sur Blog dbi services.

Refactoring procedural to SQL – an example with MySQL Sakila


By Franck Pachot

What I want to show in this blog post is that, as in mathematics where you have to apply some algebra rules to transform an equation to an equivalent one, the database developer must translate the business specification to an equivalent that is optimized (in performance, reliability and readability) for the data model.

I was looking at the Sakila sample database provided with MySQL. It simulates a DVD rental store. For my younger readers who wonder what this is, you can imagine a pre-Netflix generation where, when you want to watch a movie, you read it from a storage medium that you bring home, rather than streaming it through the internet from a distant data center. I’ll write someday about how this Netflix approach, while being more agile (you choose a movie and can watch it immediately – on demand) is a terrible resource consumption design which, in my opinion, is not sustainable. This DVD vs. Netflix example may be good to introduce data gravity, and processing data in the database. But that’s not the topic here. Or maybe it is, because I’ll talk about optimization, and stored procedures…

I have installed MySQL in my Oracle Cloud Free Tier instance (Oracle Linux 7) with the following:


sudo yum install -y https://yum.oracle.com/repo/OracleLinux/OL7/MySQL80_community/x86_64/getPackage/mysql-community-devel-8.0.19-1.el7.x86_64.rpm
sudo systemctl start mysqld.service
mysql -uroot -p
mysql --connect-expired-password --user=root \
--password=$(sudo awk '/temporary password/{print $NF}' /var/log/mysqld.log) \
-e "alter user 'root'@'localhost' identified by '2020 @FranckPachot';"

Then I have downloaded and run the Sakila sample database installation:


curl https://downloads.mysql.com/docs/sakila-db.tar.gz | tar -xzvf - -C /var/tmp
mysql --user=root --password='2020 @FranckPachot' < /var/tmp/sakila-db/sakila-schema.sql
mysql --user=root --password='2020 @FranckPachot' < /var/tmp/sakila-db/sakila-data.sql

And I was looking at the example in the documentation: https://dev.mysql.com/doc/sakila/en/sakila-usage.html#sakila-usage-rent-a-dvd which starts the “Rent a DVD” use-case by checking if the DVD is available in stock:


mysql> SELECT inventory_in_stock(10);
+------------------------+
| inventory_in_stock(10) |
+------------------------+
|                      1 |
+------------------------+

This function is defined as:


DELIMITER $$
CREATE FUNCTION inventory_in_stock(p_inventory_id INT) RETURNS BOOLEAN
READS SQL DATA
BEGIN
    DECLARE v_rentals INT;
    DECLARE v_out     INT;

    #AN ITEM IS IN-STOCK IF THERE ARE EITHER NO ROWS IN THE rental TABLE
    #FOR THE ITEM OR ALL ROWS HAVE return_date POPULATED

    SELECT COUNT(*) INTO v_rentals
    FROM rental
    WHERE inventory_id = p_inventory_id;

    IF v_rentals = 0 THEN
      RETURN TRUE;
    END IF;

    SELECT COUNT(rental_id) INTO v_out
    FROM inventory LEFT JOIN rental USING(inventory_id)
    WHERE inventory.inventory_id = p_inventory_id
    AND rental.return_date IS NULL;

    IF v_out > 0 THEN
      RETURN FALSE;
    ELSE
      RETURN TRUE;
    END IF;
END $$

DELIMITER ;

 
I was really surprised by the complexity of this: 2 queries in a procedural function for something that I expect to be just a simple SQL query. I guess the purpose was to show what can be done within a procedure but it is a very bad example for people starting with SQL. At least the comment is clear:


    #AN ITEM IS IN-STOCK IF THERE ARE EITHER NO ROWS IN THE rental TABLE
    #FOR THE ITEM OR ALL ROWS HAVE return_date POPULATED

But the code seems to be a one-to-one translation of this sentence to procedural code. And as I mentioned in the introduction, this may have to be worded differently to be optimized. And we must be sure, of course, that the transformation is a functional equivalent. This is called “refactoring”.

Prefer SQL to procedural language

There are several reasons we should prefer one SQL query, when possible, to procedural code. First, because different languages are usually executed in a different engine, and that means context switch and passing parameters. Second, when you run everything as a SQL query, your modules (should I say microservices?) can be merged as a subquery in the calling statement and the query planner can go further to optimize it as a whole. And then you have the advantage of both worlds: the performance of monoliths with the agility of micro-services.
In addition to that, you may think that procedural code is easier to read and evolve, but it is actually the opposite. This impression comes probably from the way IT is learned at school. Students learn a lot of 3rd generation languages (procedural), as interpreted scripts or compiled programs. When I was at university I also learned some 4th generation (declarative) languages like Prolog and, of course, SQL. But this is, in my opinion, neglected today. The language is not important but the logic is. A declarative language is like a math formula: probably difficult to start with, because of the high level of abstraction. But then, once the logic is understood, it becomes obvious and error-safe. In today’s developer life, this means: more errors encountered before arriving at a solution, but fewer bugs and side effects once the solution is found.

There’s another reason to prefer one SQL statement rather than many in a procedure. The latter may have its behavior depending on the isolation level, and you may have to think about exception handling. A SQL statement is always atomic and consistent.

Static analysis

I’ll re-write this function as a simple query and, before looking at the dynamic of the procedural language, I’m looking at bounding the scope: the tables and columns used. This will help to define the tests that should cover all possibilities.
The function reads two tables: the INVENTORY (one row per DVD) and RENTAL (one row per rent transaction). On INVENTORY we check only the presence of the INVENTORY_ID. On RENTAL we check the presence of the INVENTORY_ID and we also check the presence of a RETURN_DATE to know if it is back to the stock.

Define the tests

When I refactor an existing function, the improved performance is only the secondary goal. The first goal is to be sure that my new proposal is functionally equivalent to the existing one. Then, it is critical to build the tests to validate this.

I’ll use the same item, inventory_id=10, and update the tables to get all variations. I’ll rollback between each and then I must disable autocommit:

--------------
set autocommit=0
--------------

This DVD, in the initial state of the sample database, has been rented 3 times and is now back in the stock (all rentals have a return date):

--------------
select * from inventory where inventory_id=10
--------------

+--------------+---------+----------+---------------------+
| inventory_id | film_id | store_id | last_update         |
+--------------+---------+----------+---------------------+
|           10 |       2 |        2 | 2006-02-15 05:09:17 |
+--------------+---------+----------+---------------------+
--------------
select * from rental where inventory_id=10
--------------

+-----------+---------------------+--------------+-------------+---------------------+----------+---------------------+
| rental_id | rental_date         | inventory_id | customer_id | return_date         | staff_id | last_update         |
+-----------+---------------------+--------------+-------------+---------------------+----------+---------------------+
|      4364 | 2005-07-07 19:46:51 |           10 |         145 | 2005-07-08 21:55:51 |        1 | 2006-02-15 21:30:53 |
|      7733 | 2005-07-28 05:04:47 |           10 |          82 | 2005-08-05 05:12:47 |        2 | 2006-02-15 21:30:53 |
|     15218 | 2005-08-22 16:59:05 |           10 |         139 | 2005-08-30 17:01:05 |        1 | 2006-02-15 21:30:53 |
+-----------+---------------------+--------------+-------------+---------------------+----------+---------------------+

The INVENTORY_IN_STOCK function returns “true” in this case:

--------------
SELECT inventory_in_stock(10)
--------------

+------------------------+
| inventory_in_stock(10) |
+------------------------+
|                      1 |
+------------------------+

My second test simulates a rental that is not yet returned, by setting the RETURN_DATE of the last rental to null:

--------------
update rental set return_date=null where inventory_id=10 and rental_id=15218
--------------

--------------
select * from inventory where inventory_id=10
--------------

+--------------+---------+----------+---------------------+
| inventory_id | film_id | store_id | last_update         |
+--------------+---------+----------+---------------------+
|           10 |       2 |        2 | 2006-02-15 05:09:17 |
+--------------+---------+----------+---------------------+
--------------
select * from rental where inventory_id=10
--------------

+-----------+---------------------+--------------+-------------+---------------------+----------+---------------------+
| rental_id | rental_date         | inventory_id | customer_id | return_date         | staff_id | last_update         |
+-----------+---------------------+--------------+-------------+---------------------+----------+---------------------+
|      4364 | 2005-07-07 19:46:51 |           10 |         145 | 2005-07-08 21:55:51 |        1 | 2006-02-15 21:30:53 |
|      7733 | 2005-07-28 05:04:47 |           10 |          82 | 2005-08-05 05:12:47 |        2 | 2006-02-15 21:30:53 |
|     15218 | 2005-08-22 16:59:05 |           10 |         139 | NULL                |        1 | 2020-02-29 22:27:15 |
+-----------+---------------------+--------------+-------------+---------------------+----------+---------------------+

There, the INVENTORY_IN_STOCK returns “false” because the DVD is out:

--------------
SELECT inventory_in_stock(10)
--------------

+------------------------+
| inventory_in_stock(10) |
+------------------------+
|                      0 |
+------------------------+

My third test simulates a case that should not happen: multiple rentals not returned. If this exists in the database, it can be considered as corrupted data because we should not be able to rent a DVD that was not returned from the previous rental. But when I replace a function with a new version, I want to be sure that the behavior is the same even in case of a bug.

--------------
rollback
--------------

--------------
update rental set return_date=null where inventory_id=10
--------------

--------------
select * from inventory where inventory_id=10
--------------

+--------------+---------+----------+---------------------+
| inventory_id | film_id | store_id | last_update         |
+--------------+---------+----------+---------------------+
|           10 |       2 |        2 | 2006-02-15 05:09:17 |
+--------------+---------+----------+---------------------+
--------------
select * from rental where inventory_id=10
--------------

+-----------+---------------------+--------------+-------------+-------------+----------+---------------------+
| rental_id | rental_date         | inventory_id | customer_id | return_date | staff_id | last_update         |
+-----------+---------------------+--------------+-------------+-------------+----------+---------------------+
|      4364 | 2005-07-07 19:46:51 |           10 |         145 | NULL        |        1 | 2020-02-29 22:27:16 |
|      7733 | 2005-07-28 05:04:47 |           10 |          82 | NULL        |        2 | 2020-02-29 22:27:16 |
|     15218 | 2005-08-22 16:59:05 |           10 |         139 | NULL        |        1 | 2020-02-29 22:27:16 |
+-----------+---------------------+--------------+-------------+-------------+----------+---------------------+
--------------
SELECT inventory_in_stock(10)
--------------

+------------------------+
| inventory_in_stock(10) |
+------------------------+
|                      0 |
+------------------------+

This returns “false” for the presence in the stock and I want to return the same in my new function.

I will not test the case where rentals exist but the DVD is not in the inventory because this situation cannot happen, thanks to referential integrity constraints:


--------------
delete from inventory
--------------

ERROR 1451 (23000) at line 130 in file: 'sakila-mysql.sql': Cannot delete or update a parent row: a foreign key constraint fails (`sakila`.`rental`, CONSTRAINT `fk_rental_inventory` FOREIGN KEY (`inventory_id`) REFERENCES `inventory` (`inventory_id`) ON DELETE RESTRICT ON UPDATE CASCADE)

The next test is about an item which never had any rental:

--------------
rollback
--------------

--------------
delete from rental where inventory_id=10
--------------

--------------
select * from inventory where inventory_id=10
--------------

+--------------+---------+----------+---------------------+
| inventory_id | film_id | store_id | last_update         |
+--------------+---------+----------+---------------------+
|           10 |       2 |        2 | 2006-02-15 05:09:17 |
+--------------+---------+----------+---------------------+
--------------
select * from rental where inventory_id=10
--------------

This means that the DVD is available in the stock:


--------------
SELECT inventory_in_stock(10)
--------------

+------------------------+
| inventory_in_stock(10) |
+------------------------+
|                      1 |
+------------------------+

One situation remains: the item is not in the inventory and has no rows in RENTAL:

--------------
delete from inventory where inventory_id=10
--------------

--------------
select * from inventory where inventory_id=10
--------------

--------------
select * from rental where inventory_id=10
--------------

--------------
SELECT inventory_in_stock(10)
--------------

+------------------------+
| inventory_in_stock(10) |
+------------------------+
|                      1 |
+------------------------+

For this one, the INVENTORY_IN_STOCK defined in the Sakila sample database returns “true” as if we can rent a DVD that is not in the inventory. Probably the situation has a low probability of being encountered because the application cannot get an INVENTORY_ID if the row is not in the inventory. But a function should be correct in all cases without any guess about the context where it is called. Are you sure that some concurrent modification cannot encounter this case: one user removing a DVD and another accepting to rent it at the same time? That’s the problem with procedural code: it is very hard to ensure that we cover all cases that can happen with various loops and branches. With declarative code like SQL, because it follows a declarative logic, the risk is lower.
For this refactoring, I’ll make sure that I have the same result even if I think this is a bug. Anyway, it is easy to add a join to INVENTORY if we want to fix this.

--------------
rollback
--------------

Single SQL for the same logic

The function returns “true”, meaning that the item is in-stock, when there are no rows in RENTAL (never rented) or if all rows in RENTAL have the RETURN_DATE populated (meaning that it is back to stock).

The Sakila author has implemented those two situations as two queries but that is not needed. I’ll show a few alternatives, all in one SQL query.

NOT EXISTS

“All rows in RENTAL have RETURN_DATE not null” can be translated to “no row in RENTAL where RETURN_DATE is null” and that can be implemented with a NOT EXISTS subquery. This also covers the case where there’s no row at all in RENTAL:

select not exists (
 select 1 from rental where rental.inventory_id=10
 and return_date is null
);

This returns the same true/false value as the INVENTORY_IN_STOCK function. But the parameter for inventory_id must be present in the query (the value 10 here) and this still needs to call a procedure because MySQL has no parameterized views.

NOT IN

A similar query can use NOT IN to return “false” as soon as there is a rental that was not returned:

select (10) not in (
 select inventory_id from rental where return_date is null
);

The advantage here is that the parameter (the value 10 here) is outside of the subquery.

VIEW TO ANTI-JOIN

Then, the subquery can be defined as a view where the logic (the return_date being null) is encapsulated and usable in many places:


create or replace view v_out_of_stock as
 select inventory_id from rental where return_date is null;

And we can use it from another query (probably a query that gets the INVENTORY_ID by its name) with a simple anti-join:


select (10) not in (
 select inventory_id from v_out_of_stock
);

VIEW TO JOIN

If you prefer a view to join to (rather than anti-join), you can add the INVENTORY table into the view. But then, you will have a different result in the case where the inventory_id does not exist (where I think the Sakila function is not correct).


create or replace view v_inventory_instock as
select inventory_id from inventory
where inventory_id not in (
 select inventory_id from rental where return_date is null
);

Then if you are coming from INVENTORY you can simply join to this view:


select '=== not same in view',(10) in (
 select inventory_id from inventory natural join v_inventory_instock
);

I used a natural join because I know there is only one column, but be careful with that. If in doubt, just join with a USING clause.

Of course this may read the INVENTORY table two times because the MySQL optimizer does not detect that this join can be eliminated:


explain
select * from inventory natural join v_inventory_instock where inventory_id=10
--------------

+----+-------------+-----------+------------+-------+---------------------+---------------------+---------+-------+------+----------+-------------------------+
| id | select_type | table     | partitions | type  | possible_keys       | key                 | key_len | ref   | rows | filtered | Extra                   |
+----+-------------+-----------+------------+-------+---------------------+---------------------+---------+-------+------+----------+-------------------------+
|  1 | SIMPLE      | inventory | NULL       | const | PRIMARY             | PRIMARY             | 3       | const |    1 |   100.00 | NULL                    |
|  1 | SIMPLE      | inventory | NULL       | const | PRIMARY             | PRIMARY             | 3       | const |    1 |   100.00 | Using index             |
|  1 | SIMPLE      | rental    | NULL       | ref   | idx_fk_inventory_id | idx_fk_inventory_id | 3       | const |    3 |   100.00 | Using where; Not exists |
+----+-------------+-----------+------------+-------+---------------------+---------------------+---------+-------+------+----------+-------------------------+

However, this is still cheaper than calling a function. And you may prefer to create the in-stock view with all the columns from INVENTORY so that you then don’t have to join to it.

COUNT

If you prefer a COUNT rather than an EXISTS like what was done in the Sakila function, you can compare the number of rentals with the number of returns:


select count(rental_date)=count(return_date) from rental where inventory_id=10 ;

This works as well with no rentals, as the count is at zero for both. However, it leaves fewer possibilities to the optimizer. With a ‘NOT IN’ the scan of all rentals can stop as soon as a row is encountered. With a COUNT all rows will be read.

Performance

As the Sakila function is designed, taking an INVENTORY_ID as input, there are good chances that the application calls this function after a visit to the INVENTORY table.

You can test this with the original Sakila sample database with just the following additional view, which lists each inventory_id with its in-stock status:


create or replace view v_inventory_stock_status as
select inventory_id, inventory_id not in (
 select inventory_id from rental where return_date is null
) inventory_in_stock
from inventory
;

I have run the following to get the execution time when checking the in-stock status for each inventory item:


time mysql --user=root --password='2020 @FranckPachot'  -bve "
use sakila;
select count(*),inventory_in_stock(inventory_id) 
from inventory 
group by inventory_in_stock(inventory_id);"
--------------
select count(*),inventory_in_stock(inventory_id) from inventory group by inventory_in_stock(inventory_id)
--------------

+----------+----------------------------------+
| count(*) | inventory_in_stock(inventory_id) |
+----------+----------------------------------+
|     4398 |                                1 |
|      183 |                                0 |
+----------+----------------------------------+

real    0m2.272s
user    0m0.005s
sys     0m0.002s

2 seconds for those 4581 calls to the inventory_in_stock() function, which itself has executed one or two queries.

Now here is the same group by using an outer join to my view:


time mysql --user=root --password='2020 @FranckPachot'  -bve "
use sakila;
select count(inventory_id),inventory_in_stock
from inventory
natural left outer join v_inventory_stock_status
group by inventory_in_stock;"

select count(inventory_id),inventory_in_stock
from inventory
natural left outer join v_inventory_stock_status
group by inventory_in_stock
--------------

+---------------------+--------------------+
| count(inventory_id) | inventory_in_stock |
+---------------------+--------------------+
|                4398 |                  1 |
|                 183 |                  0 |
+---------------------+--------------------+

real    0m0.107s
user    0m0.005s
sys     0m0.001s

Of course, the timings here are there only to show the idea.

You can test the same in your environment, and profile it to understand the overhead behind the recursive calls to the procedure and to its SQL statements. Stored procedures and functions are nice to provide an API to your database. It is OK to execute one per user call. But when you get the parameters from a SQL statement (INVENTORY_ID here) and call a function which itself executes some other SQL statements, you end up with poor performance and a non-scalable design. When you can refactor the functions to replace them with views, you keep the same maintainability with reusable modules, and you also give the SQL optimizer a subquery that can be merged in the calling SQL statement. A join will always be faster than calling a function that executes another SQL statement.

The idea here is common to all databases, but other database engines may have better solutions. PostgreSQL can inline functions that are defined in “language SQL” and this can be used as a parameterized view. Oracle 20c introduces SQL Macros which are like a SQL preprocessor where the function returns a SQL clause rather than computing the result.

Cet article Refactoring procedural to SQL – an example with MySQL Sakila est apparu en premier sur Blog dbi services.


Control M/EM : Shout destination tables graphic and cmd mode


Introduction :

Today we will check how to use notifications with a shout destination table; indeed, it can sometimes be useful to be notified about our jobs’ status (long execution, late submission, failure, etc.).

For that kind of need, we can use job notifications through the shout method. We will explain it below:

  • We will use a shout destination table to manage notifications
  • We can also use it to call scripts
  • We will define a shout destination and test it with jobs.

Configuration used:

For my tests I am using Control M 9.0.18 running on CentOS 8.

Create a shout destination table:

 

1)Using CCM GUI :

  • As a first step we will use the CCM graphical interface to create our shout destinations:

 

Select SYSTEM

  • Now we can create a new destination for our job notifications

Be careful of the maximum number of characters: as you can see on the screen, the logical name allows a maximum of 16 characters:

  • Let's rename it and select our destination; we will select "Server" in the destination's drop-down panel.

  • Now we will check the “destination” which is the most interesting part of this shout configuration because it offers many choices and many ways to use the shout destination manager:

We will come back to these options later, especially the "command" option.

For the moment as we want to send a mail notification, we will select the “Mail” option:

Warning:

The maximum number of characters for mail addresses is 98 (but later we will do an interesting test when sending our job notification).

In your configuration you have to separate your mail addresses with a "," for them to be taken into account.

2)Using cmd mode CTMSYS utility:

You can also use the ctmsys utility by connecting to your Control-M server with the controlm user:

serveurctmv919% ctmsys
+------------------------------------------------+
|      CONTROL-M System Maintenance Utility      |
|                  Main Menu                     |
+------------------------------------------------+
 
 1)   Shout Destination Tables
 2)   System Parameters
 q) Quit
 
Enter Option:1
Shout Destination Tables Menu
-----------------------------
 
Active Shout Destination Table: SYSTEM
 
1)    Create/Modify a Table
2)    Set Active Table
3)    List Tables
4)    Delete Table
 q) Quit and return to main menu
 
Enter Option:1
 
Shout Destination Tables
------------------------
SYSTEM
 
Table to create/modify or 'q' to quit [SYSTEM]:1
 
Shout Destination Table '1'
---------------------------
 
#    Destination Type Adr Logical Name       Physical Name
---  ---------------- --- ----------------- ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
 q) Quit e#) Edit entry # n) New entry  d#) Delete entry #
 
Enter Option:n
 
Dest. Type: (U)ser, (M)ail, (T)erminal, c(O)nsole, (L)og, (P)rogram, CONTROL-M/(E)M:M
Address Type: (S)erver or (A)gent:S
Logical Name:SHOUT_TO_CTM_TEAM
WARNING: we have the same issue as in the CCM when we defined the logical name:

Logical Name will be truncated to 16 characters.
Physical Name:nabil.controlm@mymailfromdbi-services.com
 
add completed successfully
 
You can also edit the shout table by using the edit entry option: enter "e" followed by the number of the entry you want to edit (here entry 4, for example):
 
Shout Destination Table 'SYSTEM'
--------------------------------
 
#    Destination Type Adr Logical Name       Physical Name
---  ---------------- --- ----------------- ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
  1  O                S   CONSOLE                                                                                                                           
 
  2  E                S   EM                                                                                                                                 
 
  3  L                S   IOALOG                                                                                                                            
 
  4  M                S   SHOUT_TO_CTM      nabilcontrolm@dbi-services.com,saoualcontrolm@gmail.com                                                            
 
 q) Quit e#) Edit entry # n) New entry  d#) Delete entry #
 
Enter Option:e4
 
Dest. Type: (U)ser, (M)ail, (T)erminal, c(O)nsole, (L)og, (P)rogram, CONTROL-M/(E)M [M ]:m
Address Type: (S)erver or (A)gent [S ]:s
Physical Name [nabilcontrolm@dbi-services.com,saoualcontrolm@gmail.com]:nabil.saoual@ctm.com
  • Once we have completed the shout definition, we can use it in a job:

  • Go to the Actions tab: we will configure the job to generate a notification based on a statement or a condition (for example a late execution; in our case the notification will be sent after one minute of execution)

We will update the job with some parameters to call the command

Notice that you have to enter the SHOUT NAME manually the first time for it to be taken into account (you cannot see the destination "SHOUT_TO_CTM" until we add it manually).

 

When checking the log we can see that the shout is not performed, even though the job's configuration is correct.

But why?

As you know, the mail shout destination relies on the SMTP protocol, so let's have a look at it.

  • We will check the Control-M SMTP configuration:

First you have to open ctm_menu with the controlm user, select choice 4, then run a test with choice 6:

CTMSRVCENTOS% ctm_menu
CONTROL-M Main Menu
-----------------------------
Select one of the following menus:
1 - CONTROL-M Manager
2 - Database Menu
3 - Security Authorization
4 - Parameter Customization
5 - Host Group
6 - View HostID details
7 - Agent Status
8 - Troubleshooting
q - Quit

Enter option number ---> []:4
Parameter Customization Menu
-----------------------------
Select one of the following options:
1 - Basic Communication and Operational Parameters
2 - Advanced Communication and Operational Parameters
3 - System Parameters and Shout Destination Tables
4 - Default Parameters for Communicating with Agent Platforms
5 - Parameters for Communicating with Specific Agent Platforms
6 - Simple Mail Transfer Protocol Parameters
q - Quit
Enter option number ---> []:6
Simple Mail Transfer Protocol Parameters Menu
----------------------------------------------
Select one of the following options:
1 - SMTP Server (Relay) Name : localhost.localdomain
2 - Sender Email : CONTROL@M
3 - Port Number : 25
4 - Sender Friendly Name : NabilfromCTMSRV
5 - Reply-To Email : nabilcontrolm@dbi-services.com
6 - Test SMTP Settings
s - Save Parameters
q - Quit
Enter option number ---> []:6
Testing SMTP Settings...
Enter To-Email []:nabilcontrolm@dbi-services.com
Test SMTP Setting Failed.

The SMTP test failed 🙁 . The fact is that the sendmail service was not installed and configured! (Fresh machine; hopefully in your environment you are already up to date!)

  • So let's install sendmail!
[root@CTMSRVCENTOS ~]# yum install sendmail*
Dernière vérification de l’expiration des métadonnées effectuée il y a 0:05:23 le jeu. 27 févr. 2020 16:52:04 CET.
Dépendances résolues.
=============================================================================================================================================================
Paquet Architecture Version Dépôt Taille
=============================================================================================================================================================
Installing:
sendmail x86_64 8.15.2-32.el8 AppStream 773 k
sendmail-cf noarch 8.15.2-32.el8 AppStream 198 k
sendmail-doc noarch 8.15.2-32.el8 AppStream 581 k
sendmail-milter x86_64 8.15.2-32.el8 AppStream 82 k
Installation des dépendances:
procmail x86_64 3.22-47.el8 AppStream 180 k
Résumé de la transaction
=============================================================================================================================================================
Installer 5 Paquets
Taille totale des téléchargements : 1.8 M
Taille des paquets installés : 4.7 M
Voulez-vous continuer ? [o/N] : o
Téléchargement des paquets :
(1/5): procmail-3.22-47.el8.x86_64.rpm 62 kB/s | 180 kB 00:02
(2/5): sendmail-cf-8.15.2-32.el8.noarch.rpm 58 kB/s | 198 kB 00:03
(3/5): sendmail-milter-8.15.2-32.el8.x86_64.rpm 247 kB/s | 82 kB 00:00
(4/5): sendmail-8.15.2-32.el8.x86_64.rpm 168 kB/s | 773 kB 00:04
(5/5): sendmail-doc-8.15.2-32.el8.noarch.rpm 343 kB/s | 581 kB 00:01
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Total 279 kB/s | 1.8 MB 00:06
Test de la transaction en cours
La vérification de la transaction a réussi.
Lancement de la transaction de test
Transaction de test réussie.
Exécution de la transaction
Préparation en cours : 1/1
Installing : procmail-3.22-47.el8.x86_64 1/5
Exécution du scriptlet: sendmail-8.15.2-32.el8.x86_64 2/5
Installing : sendmail-8.15.2-32.el8.x86_64 2/5
Exécution du scriptlet: sendmail-8.15.2-32.el8.x86_64 2/5
Installing : sendmail-cf-8.15.2-32.el8.noarch 3/5
Installing : sendmail-doc-8.15.2-32.el8.noarch 4/5
Installing : sendmail-milter-8.15.2-32.el8.x86_64 5/5
Exécution du scriptlet: sendmail-milter-8.15.2-32.el8.x86_64 5/5
Vérification de : procmail-3.22-47.el8.x86_64 1/5
Vérification de : sendmail-8.15.2-32.el8.x86_64 2/5
Vérification de : sendmail-cf-8.15.2-32.el8.noarch 3/5
Vérification de : sendmail-doc-8.15.2-32.el8.noarch 4/5
Vérification de : sendmail-milter-8.15.2-32.el8.x86_64 5/5
Installé:
sendmail-8.15.2-32.el8.x86_64 sendmail-cf-8.15.2-32.el8.noarch sendmail-doc-8.15.2-32.el8.noarch sendmail-milter-8.15.2-32.el8.x86_64
procmail-3.22-47.el8.x86_64
Terminé !
  • After the sendmail installation completes, you have to configure it.

You can follow this link to make your own configuration:

Configure Sendmail MTA on CentOS 8 to work as SMTP Relay

Once the sendmail configuration step is done, we run a quick check of the Control-M SMTP parameters again:

Simple Mail Transfer Protocol Parameters Menu
----------------------------------------------
 
Select one of the following options:
 
1 - SMTP Server (Relay) Name : localhost.localdomain
2 - Sender Email             : CONTROL@M
3 - Port Number              : 25
4 - Sender Friendly Name     : NabilfromCTMSRV
5 - Reply-To Email           : nabilcontrolm@dbi-services.com
6 - Test SMTP Settings
 
s - Save Parameters
 
q - Quit
 
 Enter option number --->   [6]:6
 
Testing SMTP Settings...
Enter To-Email [nabil.controlm@dbi-services.com]:nabil.controlm@dbi-services.com
 
Test SMTP Setting completed successfully.
 
Press Enter to continue

The test is now OK: we are able to send mails for our notifications.

In the job's log we can see that the shout is performed:

And on the email destination defined in ctmsys we have the notification:

 

 

Using a script with shout destination table

How about killing a job that has been executing for too long, using the shout destination table?

We can use the ctmkilljob utility in association with the shout destination table to trigger it on a job through a notification:

We have to add the order ID after the command in order to kill the job:

Let's try the command line to kill a job:

[ctmag900@CTMSRVCENTOS exe_9.0.18.200]$ /home/controlm/ctm_agent/ctm/exe_9.0.18.200/ctmkilljob -ORDERID 0001C
Output:
Job was killed. Result: Success.

You have to add the order ID (reminder: it is the unique identifier of a job for its order date) after the script to have your job killed in the workflow.

Let's have a look at the Control-M workflow:

The job is killed as expected.

  • So the next step is to call this command through ctmsys (shout destination table).

To make things more interesting we will write our own script and configure the jobs to be killed based on the conditions we want (late submission, too long execution, etc.).

Create the script and add it to the ctmsys shout destination table:

  • To kill a job running for too long using CTMSYS, we will create a script in a folder of your choice and give it the correct permissions to be executed by the controlm user, for example:
CTMSRVCENTOS% cat /home/controlm/APP/kill_you
#!/bin/sh
ctmkilljob -ORDERID $2 &
CTMSRVCENTOS% chmod 755 /home/controlm/APP/kill_you
CTMSRVCENTOS% id
uid=1001(controlm) gid=3110(controlm) groups=3110(controlm)
CTMSRVCENTOS% ls -lart /home/controlm/APP/kill_you
-rwxr-xr-x. 1 controlm controlm 35 Feb 28 08:19 /home/controlm/APP/kill_you

 

  • Once the script is prepared, we have to add it in ctmsys as below:

Select "Program" in the destination drop-down menu.

Add the name of your script with the complete path:

  • Be careful to add the ORDERID variable in the message so that it can be used by the script called by CTMSYS:

 

  • After this preparation we can now run our job and see if it sends a late-execution mail and also if the shout kills it after 1 minute:

  • After running the job we can see in the job's log that it was killed after one minute, as required:

 

Conclusion:

There are many ways to use ctmsys; we have only used a small part of its possibilities in our examples, so you can adapt this tool to your needs. Don't hesitate to give feedback about what you tried on your side!

Once again, you can consult the BMC support site to get more information, and also check this video:

 

https://www.youtube.com/watch?v=e2U7wJcMutg

 

Be sure to check the blog updates on dbi-services, especially on Control-M 😀, and enjoy trying other actions with the ctmsys utility!

 

Next time we will see how to use mass update, with some useful examples.

Cet article Control M/EM : Shout destination tables graphic and cmd mode est apparu en premier sur Blog dbi services.

SQL Server: Collect Page Split events using Extended Event session


Earlier this week someone tried to show me how to capture page split events using Extended Events (XE) but unfortunately, the demo failed. This is a good opportunity for me to refresh my knowledge about page splits and set up a simple demo about this. Hopefully, this one will work.

It's not necessarily a bad thing when a page split occurs. It's a totally fine behavior when we INSERT a row in a table with a clustered index (e.g. a column with the identity property). SQL Server will create a page to the right-hand side of the index (at the leaf level). This event is a kind of page split: SPLIT_FOR_INSERT.

In this post, I will focus on the SPLIT_FOR_UPDATE page split which occurs when we UPDATE a row and the updated row size doesn't fit anymore in its current page. A new page is allocated by SQL Server and the row is moved to this new page before being written to disk and to the transaction log. Moreover, the pages in all the indexes pointing to the data pages are updated. This type of page split can be problematic.

Let's start this demo. I'll use the WideWorldImporters database on SQL Server 2019.

The page_split Extended Event

The Extended Event page_split provides a splitOperation element. The value 3 stands for SPLIT_FOR_UPDATE.
This event did not provide such detailed information before SQL Server 2012.

CREATE EVENT SESSION [PageSplit] ON SERVER
ADD EVENT sqlserver.page_split(
    WHERE (
             [splitOperation]=   3
             AND [database_id]=  9      -- Change to your database_id
       )
)
ADD TARGET package0.ring_buffer
GO
ALTER EVENT SESSION [PageSplit] ON SERVER STATE=START;
GO

Page split demo

EXEC sys.sp_configure N'show advanced options', N'1' RECONFIGURE WITH OVERRIDE
GO
EXEC sys.sp_configure N'fill factor (%)', N'0' RECONFIGURE WITH OVERRIDE
GO
EXEC sys.sp_configure N'show advanced options', N'0' RECONFIGURE WITH OVERRIDE
GO
use WideWorldImporters
go
create sequence Sequences.PageSplitOdd start with 1 INCREMENT by 2;
create sequence Sequences.PageSplitEven start with 2 INCREMENT by 2;

create table DemoPageSplit (
    id int not null primary key,
    col varchar(8000) null
);

First, I make sure the Fill Factor is set to 100% (value 0) and for the sake of the demo I use sequences to insert only odd numbers in my primary key.
An SQL Server page is 8KB. With this table structure, 2 rows can easily fit on the same page if the varchar column is not fully used.

insert into DemoPageSplit(id, col)
    select next value for [Sequences].PageSplitOdd, replicate('a', 4000)
go 10

I just inserted 10 rows with only 4000 bytes of data each for the varchar column.
Let’s have a look at the clustered index leaf-level pages stats.

SELECT index_type_desc, alloc_unit_type_desc, index_depth, avg_page_space_used_in_percent
    , page_count, record_count, avg_record_size_in_bytes
FROM sys.dm_db_index_physical_stats(
    DB_ID('WideWorldImporters')
    , OBJECT_ID('DemoPageSplit'), NULL, NULL , 'DETAILED')
where index_level = 0 -- 0 for index leaf levels

We can see 5 pages at the leaf-level of the index. Pages are almost 100% full and can fit 2 rows each as expected.

I can now UPDATE one of the rows to trigger a page split.

update DemoPageSplit set col = CONCAT(col, replicate('c', 1000)) where id = 5

The SPLIT_FOR_UPDATE event has been generated successfully.

Now, when I run the previous query about index stats again, the result shows that a new page has been created while the number of rows is still 10.

Querying the XE data with sys.dm_db_page_info

Using the new SQL Server 2019 DMV in a CROSS APPLY I can directly see in SQL some in-depth details about the pages changed by the page split.

;with xe AS (
    select
        xed.event_data.value('(data[@name="database_id"]/value)[1]', 'int') as databaseId
        , xed.event_data.value('(data[@name="file_id"]/value)[1]', 'int') as fileId
        , xed.event_data.value('(data[@name="new_page_page_id"]/value)[1]', 'int') as new_page
        , xed.event_data.value('(data[@name="page_id"]/value)[1]', 'int') as pageId
        , xed.event_data.value('(data[@name="splitOperation"]/value)[1]', 'varchar') as SplitOperation
    FROM (
        SELECT CAST(target_data as XML) target_data
        FROM sys.dm_xe_sessions AS s
            JOIN sys.dm_xe_session_targets t
                ON s.address = t.event_session_address
        WHERE s.name = 'PageSplit'
          AND t.target_name = 'ring_buffer' 
    ) as tab
        CROSS APPLY target_data.nodes('RingBufferTarget/event') as xed(event_data)
)
select p.page_id, p.page_type_desc, p.free_bytes, p.free_bytes_offset
from xe
    cross apply sys.dm_db_page_info(
        xe.databaseId, xe.fileId, xe.new_page, 'DETAILED'
    ) AS p
union
select p.page_id, p.page_type_desc, p.free_bytes, p.free_bytes_offset
from xe
    cross apply sys.dm_db_page_info(
       xe.databaseId, xe.fileId, xe.pageId, 'DETAILED'
    ) AS p

The page with id 8114 has about 3KB of free space. It is the new page produced by the page split. The page 8118 contains the row we updated to a 4KB varchar.
In addition to this demo, if you want to go further I suggest you remove the filter on splitOperation from the XE session and run the following batches:

insert into DemoPageSplit(id, col)
    select next value for [Sequences].PageSplitEven, replicate('a', 4000)
go 10
insert into DemoPageSplit(id, col)
    select next value for [Sequences].PageSplitEven, replicate('a', 8000)
go 10

What kind of page split operation is produced, and what is the consequence on the index page count?

In this blog post, we've seen how to easily trigger page split events of type UPDATE and capture them using an Extended Event session. We also saw that such XE session data can be used with the new DMV sys.dm_db_page_info.
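If you want to clean up after playing with the demo, a minimal sketch to stop and drop the session created above:

ALTER EVENT SESSION [PageSplit] ON SERVER STATE = STOP;
GO
DROP EVENT SESSION [PageSplit] ON SERVER;
GO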

Cet article SQL Server: Collect Page Split events using Extended Event session est apparu en premier sur Blog dbi services.

Control M /EM : Mass update,some examples on how to use it


Introduction:

 

We are back today to write about a powerful tool that we can use to update our folders.

 

Question:

Suppose you have to update two jobs in your folder; the task should be easy.

But what happens when you must update 50 jobs within a tight deadline?

Solution:

 

To make this easier and smarter we will use the find and update tool:

  • To find specific jobs
  • To find specific commands
  • To find specific users
  • To update jobs with new parameters

Before going further, let's list the methods to find information in Control-M folders:

We have many ways to get information about our folders:

  1. By using SQL queries
  2. By using the XML export
  3. By using the find and update method

Depending on the need, you can use any of the three methods, knowing that in the end they all rely on SQL queries.

I. Using the find and update option:

First of all, we have to connect to the CTM GUI and go to the planning panel to check the job definitions.

We will select the folder where we will have to perform the task

Then we will use the find and update button :

We will check how many jobs contain the word TEST in their name:

Result:

We found 12 jobs corresponding to these criteria.

Now suppose we want to list and update only a certain type of job, for example OS jobs, excluding MFT jobs and dummy jobs.

We select Job/Folder Type in the large drop-down panel and enter the requested type (here *OS* for OS job).

Note:

As explained at the beginning of this blog, the search tool is based on SQL queries; it only gives us a more "friendly" way to run them.
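To give an idea, a rough SQL equivalent of the search above could look like the following (a sketch; def_ver_job is the definition table used in section III below, and the folder filter is omitted):

select job_name, application from def_ver_job where job_name like '%TEST%';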

When we launch the search we get 10 jobs corresponding to OS jobs and containing TEST in their name.

Now that you know the principle, we can proceed to any update of these jobs depending on the request:

For example, let’s try to update the jobs by assigning them a Quantitative resource:

The first step is to make a "backup" of our folder.

The easiest way to do that is to make an XML export file (which is also a good way to search for information about your folder).

Once your save is done, we can proceed to adding the QR:

We will assign a QR to each job containing TEST in its name.

Important:

Ensure you have clicked on the checkout button to update your jobs:

In the update part, select the field you want to assign or update and enter the character string you want to apply to your selected jobs.

After the jobs are updated you can check them to see if the update corresponds to what you expect.

If that is not the case, you can perform a Rollback Updates command.

Let's open a job and see if the QR update worked.

The update worked well, as we can see in the Prerequisites tab. We can also see that neither a quantity nor a total amount is defined (these must be defined in the QR definitions tab; maybe a topic for a future blog).

So what about adding this quantity to each job? 😀

After that we have to check if jobs are configured as expected:

Watch out !

If you want to add a QR by mass update, be careful to add your quantity in the same command to avoid adding the same QR twice.

Example:

If you repeat the update above, it will be applied again, so you will find two identical QRs in the same job and you would get this kind of result:

So make sure you double-check your updates and, if possible, do the update in a single operation like below:

After your update don’t forget to do your check-in 😉

Tip: Save and load presets

Last time I built a tricky query, but I forgot how I had achieved the result I was expecting… So instead of searching for hours like me, what about saving presets of our favorite or more complicated requests?

Select the Presets button, then save your preset (the drop-down panel also shows the other saved presets).

That's it, you will no longer blame your old age for having forgotten how to sort your jobs…

 

II. Update jobs by editing XML file or Jobs & Folders file Editor

 

1)Update from XML file:

 

You can also update your jobs by exporting the corresponding xml file (as for our previous folder backup):

In the planning tab you select the folder that you want to export, then once exported you can edit it with your favorite file editor:

It’s a quick way to identify jobs and other components of your folder such as hostnames, conditions or variables.

Once your fields are updated, you import the file in the planning and then you can upload it:

But there is a constraint:

You can't keep the modification history of your folder if you upload it from an XML file, as Control-M will consider it a brand-new folder.

 

2)Update with Jobs & Folders file Editor tool:

 

You can also use the Jobs & Folders file Editor which is a tool mixing XML file and graphical interface for your jobs:

Load the XML file you want to update:

After your modifications you can save it:

Then you can upload the XML file.

This tool lets you have a better translation and understanding of your XML file in a Control-M folder display format.

Reminder:

The result is the same as updating with the XML file; in other words, you will not be able to keep your folder's history.

III. Using SQL queries :

This is the most basic method, and a modification done directly in the database doesn't need a "checkout / check-in" process:

In the following examples we will check how to update jobs through SQL queries and see the result on the planning.

We will use the table below to update our jobs and check if the changes are taken into account in the Control-M planning:

You can get the Control-M DB schema from this link to build your queries:

ftp://ftp.bmc.com/pub/control-m/opensystem/DB_Schemas/918_ERDs/Control-M_9.0.18_DB_Ports_Diagram.zip

Connect with the controlm user and type SQL to get the prompt:

Constraint:

You need to know the name of the table to do your update and of course the associated command.

If you have no SQL skills, it could be better to use the find and update graphical tool.

The example below shows how to update a job name with a PostgreSQL command:

  • List the jobs by SQL query:

First we will list the jobs defined in the folder:

ctmem=> select job_name,application from def_ver_job where application like 'MFTAPPLICATION' ;
              job_name              |  application
------------------------------------+----------------
 JOB_TEST_CR3                       | MFTAPPLICATION
 JOB_TEST_PGADMIN_re                | MFTAPPLICATION
 JOB_TEST_CR2                       | MFTAPPLICATION
 JOB_TEST_CR6                       | MFTAPPLICATION
 FileWatcher_Job                    | MFTAPPLICATION
 JOB_TEST_SHOUT                     | MFTAPPLICATION
 FileWatcher_Job1                   | MFTAPPLICATION
 JOB_TEST_CR5                       | MFTAPPLICATION
 JOB_TEST_CR4                       | MFTAPPLICATION
 JOB_TEST                           | MFTAPPLICATION
 JOB_TEST_SHOUT2                    | MFTAPPLICATION
 MFT_TRANSFERT_CENT_TO_SRVCTM3      | MFTAPPLICATION
 MFT_TRANSFERT_CENT_TO_SRVCTM4      | MFTAPPLICATION
 MFT_TRANSFERT_CENT_TO_SRVCTM1      | MFTAPPLICATION
 JOB_TEST_CR1                       | MFTAPPLICATION
 JOB_TEST_PGADMIN_updated           | MFTAPPLICATION
 MFT_TRANSFERT_TEST_CENT_TO_SRVCTM2 | MFTAPPLICATION
 TEST_DUMMY                         | MFTAPPLICATION
(18 rows)
  • Update and verification on PostgreSQL :

We will update the jobname and check if the update worked:

 

ctmem=>
ctmem=> update def_ver_job set job_name='JOB_TEST_PGADMIN_updated' where job_name like 'JOB_TEST_PGADMIN_re' ;
UPDATE 2
ctmem=> select job_name from def_ver_job where job_name like 'JOB_TEST%' and application like 'MFT%' ;
         job_name
--------------------------
 JOB_TEST_CR3
 JOB_TEST_PGADMIN_updated
 JOB_TEST_CR2
 JOB_TEST_CR6
 JOB_TEST_SHOUT
 JOB_TEST_CR5
 JOB_TEST_CR4
 JOB_TEST
 JOB_TEST_SHOUT2
 JOB_TEST_CR1
 JOB_TEST_PGADMIN_updated
(11 rows)
 
ctmem=>
  • Verification on the control M GUI planning panel

It’s taken in account in the planning (you must quit your workspace and reload it , no need to log off):

Now we want to repeat this action on many jobs:

Let’s try to update the ‘run as’ part of every job running as controlm user and compare it to the find and update graphical method:

First we have to identify the column corresponding to the ‘run as’ in dbschema , it matches with ‘owner’ parameter :

ctmem=> select owner,job_name,application from def_ver_job where application like 'MFTAPPLICATION' ;
       owner        |              job_name              |  application
--------------------+------------------------------------+----------------
 controlm           | JOB_TEST_CR3                       | MFTAPPLICATION
 controlm           | JOB_TEST_PGADMIN_updated           | MFTAPPLICATION
 controlm           | JOB_TEST_CR2                       | MFTAPPLICATION
 controlm           | JOB_TEST_CR6                       | MFTAPPLICATION
 controlm           | FileWatcher_Job                    | MFTAPPLICATION
 controlm           | JOB_TEST_SHOUT                     | MFTAPPLICATION
 controlm           | FileWatcher_Job1                   | MFTAPPLICATION
 controlm           | JOB_TEST_CR5                       | MFTAPPLICATION
 controlm           | JOB_TEST_CR4                       | MFTAPPLICATION
 NABIL_MFT_2        | MFT_TRANSFERT_CENT_TO_SRVCTM3      | MFTAPPLICATION
 CMT_NABIL_EXPORTED | MFT_TRANSFERT_CENT_TO_SRVCTM4      | MFTAPPLICATION
 controlm           | JOB_TEST                           | MFTAPPLICATION
 controlm           | JOB_TEST_SHOUT2                    | MFTAPPLICATION
 NABIL_MFT_2        | MFT_TRANSFERT_CENT_TO_SRVCTM1      | MFTAPPLICATION
 controlm           | JOB_TEST_CR1                       | MFTAPPLICATION
 controlm           | JOB_TEST_PGADMIN_updated           | MFTAPPLICATION
 CMT_NABIL_EXPORTED | MFT_TRANSFERT_TEST_CENT_TO_SRVCTM2 | MFTAPPLICATION
 DUMMYUSR           | TEST_DUMMY                         | MFTAPPLICATION
(18 rows)

Now that we have listed all the users of the folder, we can update the 'run as' parameter to change the 'run as' user from controlm to emuser:

1) Find the jobs to modify in the Control-M GUI:

We do a find and update to list the jobs running with the controlm user, then we will update them with a PostgreSQL query (don't use the update button: we will do this update through the PostgreSQL query just below to get the same result 😉):

2) Mass update the jobs from SQL query:

We will perform the update by SQL query and check if it is taken into account in the Control-M GUI:

ctmem=> update def_ver_job set owner='emuser' where owner like 'controlm' and application like 'MFTAPPLICATION' ;
UPDATE 13
ctmem=>

Then we can check again whether the 'run as' update worked:

1)Checking with PostgreSQL query:

ctmem=> select owner,job_name,application from def_ver_job where application like 'MFTAPPLICATION' ;
       owner        |              job_name              |  application
--------------------+------------------------------------+----------------
 emuser             | JOB_TEST_CR3                       | MFTAPPLICATION
 emuser             | JOB_TEST_PGADMIN_updated           | MFTAPPLICATION
 emuser             | JOB_TEST_CR2                       | MFTAPPLICATION
 emuser             | JOB_TEST_CR6                       | MFTAPPLICATION
 emuser             | FileWatcher_Job                    | MFTAPPLICATION
 emuser             | JOB_TEST_SHOUT                     | MFTAPPLICATION
 emuser             | FileWatcher_Job1                   | MFTAPPLICATION
 emuser             | JOB_TEST_CR5                       | MFTAPPLICATION
 emuser             | JOB_TEST_CR4                       | MFTAPPLICATION
 emuser             | JOB_TEST                           | MFTAPPLICATION
 emuser             | JOB_TEST_SHOUT2                    | MFTAPPLICATION
 NABIL_MFT_2        | MFT_TRANSFERT_CENT_TO_SRVCTM3      | MFTAPPLICATION
 CMT_NABIL_EXPORTED | MFT_TRANSFERT_CENT_TO_SRVCTM4      | MFTAPPLICATION
 NABIL_MFT_2        | MFT_TRANSFERT_CENT_TO_SRVCTM1      | MFTAPPLICATION
 emuser             | JOB_TEST_CR1                       | MFTAPPLICATION
 emuser             | JOB_TEST_PGADMIN_updated           | MFTAPPLICATION
 CMT_NABIL_EXPORTED | MFT_TRANSFERT_TEST_CENT_TO_SRVCTM2 | MFTAPPLICATION
 DUMMYUSR           | TEST_DUMMY                         | MFTAPPLICATION
(18 rows)
 
ctmem=>

2)Checking from GUI find and update method:

Checking in the GUI, we see that the 'run as' user is now updated: the owner emuser is now assigned as 'run as' to the 12 jobs that were updated, and we no longer see any controlm user in the search:

As a result, we can see that the search for 'run as' user controlm returns no match.

 

Using SQL queries is a powerful way to do a mass update, but the find and update tool is more user-friendly: it is easier to roll back an update and to find the rows you want to update without consulting the Control-M DB schema, which can be a bit less intuitive.
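That said, if you do update directly in SQL, a simple safety net is to wrap the statement in a transaction and check the reported row count before committing; a sketch reusing the table from the examples above:

begin;
update def_ver_job set owner='emuser' where owner like 'controlm' and application like 'MFTAPPLICATION';
-- check the reported row count (and optionally re-run the select) before deciding
commit;   -- or rollback; to undo the change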

To review the main solutions, here's a quick summary of the good and bad points of each of them:

And you, which method would you choose?

 

Conclusion:

 

Now you know how to perform updates on your Control-M folders; naturally, you can choose the most suitable solution depending on your needs.

Always keep a backup of your folders before doing an update, and feel free to share other tips or tricks in the comments.

Once again you can check my dbi colleagues' blogs about Control-M and other technologies, and of course see you in a next session 😀

You can also check the BMC site and their excellent videos.

By the way, to avoid any influence I prefer to write my topic before checking whether anything similar exists on this site or elsewhere; then I can compare whether we have the same methodology and way of using Control-M :D.

Also feel free to check my other blogs and don't hesitate to share your advice on them ;).

 

Another example of how to use mass update; I told you it's a really cool tool!

 

Cet article Control M /EM : Mass update,some examples on how to use it est apparu en premier sur Blog dbi services.

How to automate a kill of one or several executing jobs in AJF at specific Date/Time at once


Introduction

We had a request to automate the kill of some executing jobs in the AJF at a specific date and time.

For example:
A Control-M job runs a program all day so that other programs can connect to it, but that program has to be stopped at night.

There are several ways to manage that, such as:

  • In the Actions tab, use a notification before job completion, then run a shout to a script which kills the job with its %%ORDERID as parameter
  • Run a separate job which runs a script that performs a Unix kill -9 on the running process
  • This new way, which also consists of a separate job running a script, but one that uses the native kill feature of Control-M (the ctmkilljob utility) and can kill one or several jobs at once, based on a list of job names
    => This last method is described here

For that we need:

  • a Unix script which runs the ctmkilljob utility for one or several job names that are in Executing state
  • a Control-M job which launches the script, passing the list of jobs to kill as parameters;
    that job is scheduled at the desired date and time
  • in some contexts, attention to a few prerequisites (job name, type of agent, Control-M security)

Job definition

For this use case, the following jobs are defined:

  1. Job_Killer


  2. In PARM#, define the list of job names to kill (see the command-line sketch after this list)

  3. Two jobs to kill: Job_To_Kill_1 and Job_To_Kill_2
  4. The jobs running in the AJF
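For illustration, the Job_Killer command line could look like the following (a sketch based on the script path and job names used in this post; in practice the job names come from PARM#):

$HOME/My_Scripts/kill_job_by_jobname.sh Job_To_Kill_1 Job_To_Kill_2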

Unix script

$HOME/My_Scripts/kill_job_by_jobname.sh

# Description: kill one or several executing Control-M jobs by using the ctmpsm and ctmkilljob Control-M utilities
#  Parameters: 1 or several job names, from PARM#
#
echo The number of jobs to kill is $#
echo jobname $1
if [ -z "$1" ]
then
   echo A jobname is required as first parameter
   exit 101
else
  echo Jobname : $1
fi

# for each parameter
until [ $# = 0 ]
do
  echo Parameter value : $1
### When using the ctmpsm utility of a Control-M agent, the separator is a pipe "|" for awk
  export OrderId=`ctmpsm -LISTALL | grep $1 | grep Executin |  awk -F"|" '{print $1}'`
#
### When using the ctmpsm utility of the Control-M server, the separator is a space " " for awk
#  export OrderId=`ctmpsm -LISTALL | grep $1 | grep Executin |  awk -F" " '{print $1}'`
  echo The orderId for $1 job is : $OrderId

  # Get the number of order id in result
  num_OrderId=`echo $OrderId | awk  '{print NF}'`
  echo The number of order id is : $num_OrderId

  # If no order id
  if [ -z "$OrderId" ]
  then
    echo No OrderId found. No kill will happen for $1.
    #echo No OrderId found. EXIT 102.
    #exit 102

  # If more than one order id
  elif [ $num_OrderId -gt 1 ]
  then
    echo There is more than one order id : $num_OrderId. No kill will happen for $1.
    #echo There is more than one order id : $num_OrderId. EXIT 103.
    #exit 103

  # Else display and kill the job
  else
    echo Parameter value is : $1.
    echo The order number is : $OrderId
    ctmkilljob -ORDERID $OrderId
  fi
  shift
done



Pre-requisites

  1. Job name issues
    • Length
      It must be at most 18 characters, because ctmpsm displays only 18 characters
    • Similar name
      There should be no other running job whose name starts with the same string, otherwise that job will also be taken into account
      For example:
      Job executing to kill : Job_123
      Another job executing : Job_123_To_no_Kill
      => That would be confusing because both jobs match the pattern; the script then finds more than one order id and skips the kill
  2. Control-M Security

    If Control-M Full Security (ctmsys) is set to Y, since the agent is the one running the ctmkilljob utility, ensure that the agent is defined in ctmsec as a user and that it has the permission in the AUTHORIZED AJF tab, with the Kill option set to Yes.

    Otherwise you’ll get such error in the Output

  3. ctmpsm utility

    This utility is used to list all the jobs in the AJF.
    It exists in the Control-M/Server installation and also in the Control-M/Agent installation, but the output is slightly different, which has an impact on the awk command within the script:

    • ctmpsm of Control-M Server

      The separator is a space “ “

    • ctmpsm of Control-M Agent

      The separator is a pipe "|"

=> In the script, with the awk command, indicate the appropriate separator :
awk -F” “ or awk -F”|”

Cet article How to automate a kill of one or several executing jobs in AJF at specific Date/Time at once est apparu en premier sur Blog dbi services.

Restore S3 Object with AWSPOWERSHELL


AWS S3 offers different storage classes, allowing, among other things, to optimize cost.
For instance, some classes are used for archiving purposes: S3 Glacier and S3 Glacier Deep Archive. This means the storage cost is the lowest you can get, but your data is not available immediately and the access cost is higher.

In the case of the S3 archive classes, retrieving the data is not cost-effective because this is clearly not what they are aimed at. They are meant for data you want to keep for some reason (legal, insurance…) but need to access very rarely.
Database backups are clearly one scenario these storage classes are designed for.

But what happens if I need this data? How do I proceed? We will answer these questions using the AWSPOWERSHELL module.
Of course, the AWS CLI is another possible and well-documented approach. But, in my opinion, it is less reusable (integration in a custom module is less convenient, for instance) and less in line with the PowerShell "philosophy".
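Before running the examples below, the module must be loaded and credentials configured. A minimal setup sketch (the access keys, profile name and region are placeholders for your own values):

# One-time installation, then import in the session
Install-Module -Name AWSPowerShell -Scope CurrentUser
Import-Module AWSPowerShell

# Store credentials and set a default region
Set-AWSCredential -AccessKey 'AKIA...' -SecretKey '...' -StoreAs 'default'
Set-DefaultAWSRegion -Region 'eu-central-1'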

1- Select your S3 object

First of all, you need to find the object you have to retrieve. To do so, several pieces of information are necessary:

  • the bucket name where the object resides (mandatory)
  • the key (optional): returns the object matching the exact key
  • the key prefix (optional): returns all objects with a key starting with this prefix

The cmdlet you need for this is called Get-S3Object. Here are some examples of usage:

# Retrieve object from a specific key
Get-S3Object -BucketName $BucketName -Key $Key

# Retrieve objects from a key prefix
Get-S3Object -BucketName $BucketName -KeyPrefix $KeyPrefix

It is not possible with this cmdlet to retrieve an object by its name only: you need to know the key or the beginning of the key (key prefix).
Of course, a search inside PowerShell is possible, but you will need to retrieve ALL the objects in a bucket before doing the search… so you are dependent on the number of objects in your bucket.

Moreover, to retrieve information regarding the restore status, you need to look into the metadata with the cmdlet Get-S3ObjectMetadata.

To make the search simple and return the desired information, I created a custom function that accepts a partial S3 object name as input and customizes the output:

Function Get-dbiS3Object(){
    param(
        [Parameter(Mandatory=$true)]
        [String]
        $BucketName,
        [String]
        $Key,
        [String]
        $KeyPrefix = '',
        [String]
        $Name = ''
    )

    $Command = 'Get-S3Object -BucketName ' + '"' + $BucketName + '"';

    If ($KeyPrefix){
        $Command += ' -KeyPrefix ' + '"' + $KeyPrefix + '"';
    }
    If ($Key){
        $Command += ' -Key ' + '"' + $Key + '"';
    }
    If ($Name){
        $Command += ' | Where-Object Key -Match ' + '"' + $Name + '"';
    }

    $Objects = Invoke-Expression $Command;


    If ($Objects){
        @($Objects) | ForEach-Object -Begin {`
                        [System.Collections.ArrayList] $S3CustomObjects = @();}`
                  -Process {`
                           $Metadata = $_ | Get-S3ObjectMetadata;`
                           $S3CustomObj = [PSCustomObject]@{`
                                         BucketName = "$($_.BucketName)";`
                                         Key = "$($_.Key)";`
                                         StorageClass = "$($_.StorageClass)";`
                                         LastModified = "$($_.LastModified)";`
                                         SizeInB = "$($_.Size)";`
                                         RestoreExpirationUtc = "$($Metadata.RestoreExpiration)";`
                                         RestoreInProgress = "$($Metadata.RestoreInProgress)";`
                                         ExpirationRule = "$($Metadata.Expiration.RuleId)";`
                                         ExpiryDateUtc= "$($Metadata.Expiration.ExpiryDateUtc)";`
                           };`
                           $Null = $S3CustomObjects.Add($S3CustomObj);`
                  };
    }

  return $S3CustomObjects;
}
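A possible call, with placeholder bucket and name values, to search for a backup file by a fragment of its name:

Get-dbiS3Object -BucketName 'my-backup-bucket' -KeyPrefix 'mssql/backups/' -Name 'FULL_20200301'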

2- Restore the S3 Object

Once you have selected your objects, you have to create a request to make them accessible. Indeed, in Glacier, your objects are not accessible until a request is performed: they are archived (as if "frozen").
For Glacier, there are 3 archive retrieval options:

  • Expedited: 1-5 minutes for the highest cost
  • Standard: 3-5 hours for a lower cost
  • Bulk: 5-12 hours for the lowest cost

So after this request you will have to wait, depending on the archive retrieval option you chose.

This request is performed with the cmdlet Restore-S3Object.
Here is an example of usage:

# CopyLifetimeInDays is the number of days the object remains accessible before it is frozen again
Restore-S3Object -BucketName $element.BucketName -Key $element.Key -CopyLifetimeInDays $CopyLifetimeInDays -Tier $TierType

By using our previous custom cmdlet called Get-dbiS3Object, we can also build a new custom cmdlet to simplify the process:

Function Restore-dbiS3Object (){
    param(
        $CustomS3Objects,
        [String]
        $Key,
        [String]
        $KeyPrefix,
        [String]
        $BucketName,
        [Amazon.S3.GlacierJobTier]
        $Tier='Bulk',              # Default archive retrieval option if nothing specified
        [int]
        $CopyLifetimeInDays = 5    # Default number of days if nothing specified
    )

    If ($CustomS3Objects){
        @($CustomS3Objects) | Foreach-Object -Process {`
            If ( (-not ($_.RestoreExpirationUtc) -and (-not ($_.RestoreInProgress) -and ($_.StorageClass -eq 'Glacier') -and ($_.SizeInB -gt 0)))) {`
                Restore-S3Object -BucketName $_.BucketName -Key $_.Key -CopyLifetimeInDays $CopyLifetimeInDays -Tier $Tier;`
            }`
        }
    }
    elseif ($Key -and $BucketName){
        $Objects = Get-dbiS3Object -BucketName $BucketName -Key $Key;
        Restore-dbiS3Object -CustomS3Objects $Objects;
    }
    elseif ($KeyPrefix -and $BucketName){
        $Objects = Get-dbiS3Object -BucketName $BucketName -KeyPrefix $KeyPrefix;
        Restore-dbiS3Object -CustomS3Objects $Objects;
    }
}

To check if the retrieval is finished and if the object is accessible for download, you can obtain this information with the cmdlet Get-dbiS3Object.
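For instance, a simple polling loop based on the custom function above could look like this (a sketch: the bucket and prefix names are placeholders and the 30-minute interval is arbitrary):

# Wait until none of the selected objects is marked as "restore in progress" anymore
do {
    Start-Sleep -Seconds 1800;
    $Objects = Get-dbiS3Object -BucketName 'my-backup-bucket' -KeyPrefix 'mssql/backups/';
    $Pending = @($Objects | Where-Object { $_.RestoreInProgress -eq 'True' });
    Write-Output "$($Pending.Count) object(s) still being restored...";
} while ($Pending.Count -gt 0)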

Of course, these 2 custom functions can be improved and customized differently. The goal of this blog is mostly to introduce the potential of this PowerShell module and to give examples of how to integrate it into a custom PowerShell module to make daily life easier 🙂

Cet article Restore S3 Object with AWSPOWERSHELL est apparu en premier sur Blog dbi services.
