In this post, I would like to share some personal thoughts. On September 28th, 2021, Oracle released the eleventh generation of the Exadata Database Machine: "X9M-2" (2 CPU sockets), X9M-8 (8 sockets) and ZDLRA X9M. Exadata is a computing platform to run Oracle RDBMS; the Zero Data Loss Recovery Appliance (ZDLRA) is a platform to back up Oracle RDBMS and is based on Exadata hardware.
On changes and use case
For technically interested readers, I recommend having a look at the documentation. See here, for example, for hardware details allowing you to compare old and new. Unfortunately, the documentation is not yet up to date on the X9M ZDLRA and its new functionality.
Oracle Exadata is undoubtedly an interesting platform to run Oracle RDBMS on. It's the combination of hardware and software that sets it apart from other platforms. Maybe you remember the release of the Apple iPhone that redefined the smartphone market? I would not say Exadata (even in its cloud or Cloud@Customer version) matches the iPhone in usability. That comparison is unfair, though, as Exadata has a different, much smaller audience, and Oracle RDBMS is complex to run and understand.
On performance and tuning
Please find many numbers on a single webpage in the documentation. If you prefer to read whitepapers, here we go:
The performance increases between the X8M and X9M platforms according to Oracle's tests (we don't know how Oracle tests and whether the tests are consistent) are impressive:
Compute nodes
| Metric | # CPU sockets | X8M | X9M | Increase % |
|---|---|---|---|---|
| 8K database read I/Os per second | 2 (X?M-2) | 1'500'000 | 2'800'000 | 87 % |
| | 8 (X?M-8) | 5'000'000 | 5'000'000 | 0 % |
| 8K flash write I/Os per second | 2 (X?M-2) | 980'000 | 2'000'000 | 104 % |
| | 8 (X?M-8) | 3'000'000 | 3'000'000 | 0 % |
| Max memory in GB | 2 (X?M-2) | 1536 | 2048 | 33 % |
| | 8 (X?M-8) | 6144 | 6144 | 0 % |
| CPU cores | 2 (X?M-2) | 48, Intel Xeon | 64, Intel Xeon 8358, 2.6 GHz | 33 % |
| | 8 (X?M-8) | 184, Intel Xeon | 184, Intel Xeon 8268, 2.9 GHz | 0 % |
No increase on the X?M-8 compute nodes, as the hardware is the same on both versions. No average latency is given for the IOPS figures, so the picture is not complete.
Storage nodes
| Metric | Model | X8M | X9M | Increase % |
|---|---|---|---|---|
| 8K database read I/Os per second | Extreme Flash (EF) | 1'500'000 | 2'300'000 | 53 % |
| | High Capacity (HC) | 1'500'000 | 2'300'000 | 53 % |
| | Extended (XT) | n/a | n/a | n/a |
| 8K flash write I/Os per second | Extreme Flash (EF) | 470'000 | 614'000 | 31 % |
| | High Capacity (HC) | 470'000 | 614'000 | 31 % |
| | Extended (XT) | n/a | n/a | n/a |
| Flash raw capacity in TB | Extreme Flash (EF) | 51.2, PCIe 3.0 | 51.2, PCIe 4.0 | 0 % |
| | High Capacity (HC) | 25.6, PCIe 3.0 | 25.6, PCIe 4.0 | 0 % |
| | Extended (XT) | 0 | 1 | |
| Disk raw capacity in TB | Extreme Flash (EF) | 0 | 0 | 0 % |
| | High Capacity (HC) | 168 | 216 | 28.6 % |
| | Extended (XT) | 168 | 216 | 28.6 % |
| Persistent memory raw capacity in TB | Extreme Flash (EF) | 1.5, Series 100 | 1.5, Series 200 | 0 % (32 % bandwidth) |
| | High Capacity (HC) | 1.5, Series 100 | 1.5, Series 200 | 0 % (32 % bandwidth) |
| Total CPU cores | Extreme Flash (EF) | 32, Intel Xeon 5218 (2.3 GHz) | 32, Intel Xeon 8352Y (2.2 GHz) | 0 % |
| | High Capacity (HC) | 32, Intel Xeon 5218 (2.3 GHz) | 32, Intel Xeon 8352Y (2.2 GHz) | 0 % |
No increase in flash storage and persistent memory capacity. No average latency is given for the IOPS figures, so the picture is not complete.
The PMEM bandwidth increase is according to Intel.
Full rack high capacity
| Metric | X8M | X9M | Increase % |
|---|---|---|---|
| 8K database read I/Os per second | 12'000'000 | 22'400'000 | 86.7 % |
| 8K flash write I/Os per second | 5'640'000 | 8'596'000 | 52.4 % |
| Database servers / total cores | 8 / 400 | 8 / 512 | 0 % / 28 % |
| Storage servers / total cores | 12 / 384 | 14 / 448 | 16.7 % / 16.7 % |
| Disk raw capacity in TB | 2016 | 3024 | 50 % |
| Flash raw capacity in TB | 307.2 | 358.4 | 16.7 % |
| Persistent memory raw capacity in TB | 18 | 21 | 16.7 % |
14 instead of 12 storage nodes, and more CPU cores on both compute and storage nodes. No average latency is given for the IOPS figures, so the picture is not complete.
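The increase columns in the tables above are simple relative growth figures. As a minimal sketch of the arithmetic, using the full rack numbers from the table above:

```python
def increase_pct(old: float, new: float) -> float:
    """Relative increase in percent between an X8M and an X9M figure."""
    return (new - old) / old * 100

# Full rack high capacity figures from the table above
print(round(increase_pct(12_000_000, 22_400_000), 1))  # 8K read IOPS -> 86.7
print(round(increase_pct(5_640_000, 8_596_000), 1))    # 8K flash write IOPS -> 52.4
print(round(increase_pct(2016, 3024), 1))              # disk raw TB -> 50.0
```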
Please be advised: the performance increase is not only due to hardware changes (Ice Lake CPUs, PCIe 4.0 instead of 3.0, series 200 persistent memory instead of series 100, … you name it), but also to software, which may be used on both the X8M and X9M platforms.
Furthermore, I believe hardware tuning is the easiest of all methods to tune Oracle RDBMS, but not the one with the biggest effect. There can be a bigger impact from tuning at the SQL level or from simply keeping the active data sets in databases small (e.g. by using partitioning and offloading data on a regular basis). Both SQL tuning and data archiving mean that a tuning specialist needs to understand how end users use a service, and how the service is meant to be used. Unfortunately, there is a difference between the two… Therefore tuning is individual in every company, even among companies using the exact same software. Writing this reminded me of the 2009 German movie "Same Same But Different"…
To sum it up: tuning is hard work, a true craft, sometimes even art… It's advisable to tune with a scientific approach (knowing the effects before implementing them) and to end tuning efforts when the impact on performance from an end-user perspective becomes too low. So get the end users into the tuning boat.
What can be learned from cloud
Speaking of performance tuning, one may think of services that cannot run on public clouds because of network latency. Still, in my opinion, one can learn from cloud services. I see at least two advantages:
- Automation: saves time and ensures quality. DBAs end up becoming developers. Running services autonomously by mining the data dictionary (or simply using "statistics", although that term is not as modern as "data mining") is also part of automation.
- Standardisation: many database workloads fit into a few standards. Maybe you start to define them like T-shirt sizes (S, M, L, XL) for your Oracle RDBMS offerings? No rule without exception; there are and always will be services demanding more or less. But don't underestimate the power of standardisation.
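To illustrate the automation point: a toy sketch of "mining" workload statistics to decide where attention is needed. In practice the rows would come from the Oracle data dictionary (e.g. V$ or AWR views); here they are hard-coded sample data, and the database names and threshold are purely illustrative.

```python
# Sample rows standing in for data-dictionary output; values are invented.
sample_stats = [
    {"db": "sales", "avg_active_sessions": 9.5, "cpu_cores_used": 14},
    {"db": "hr",    "avg_active_sessions": 0.4, "cpu_cores_used": 1},
    {"db": "dwh",   "avg_active_sessions": 3.2, "cpu_cores_used": 6},
]

def needs_attention(stats, aas_threshold=8.0):
    """Return the databases whose average active sessions exceed a threshold."""
    return [row["db"] for row in stats if row["avg_active_sessions"] > aas_threshold]

print(needs_attention(sample_stats))  # ['sales']
```

The same loop-over-statistics pattern scales from a one-off report to a scheduled job that opens tickets or triggers resource changes automatically.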
T-shirt sizes are useful to define limits when consolidating a large number of databases on the same hardware. Exadata offers Resource Manager to limit CPU, memory and disk usage.
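Such T-shirt sizes could be expressed as simple resource caps. A hypothetical sketch follows: the size names, limit values and helper function are illustrative only; in a real setup the limits would be enforced, for example, via Oracle Resource Manager.

```python
# Hypothetical T-shirt sizes for database services; limit values are
# illustrative and would be enforced e.g. by Oracle Resource Manager.
TSHIRT_SIZES = {
    "S":  {"cpu_cores": 2,  "memory_gb": 16,  "disk_tb": 0.5},
    "M":  {"cpu_cores": 4,  "memory_gb": 64,  "disk_tb": 2},
    "L":  {"cpu_cores": 8,  "memory_gb": 128, "disk_tb": 8},
    "XL": {"cpu_cores": 16, "memory_gb": 512, "disk_tb": 32},
}

def pick_size(cpu_cores_needed: int, memory_gb_needed: int) -> str:
    """Pick the smallest standard size that covers the requested resources."""
    for name, limits in TSHIRT_SIZES.items():  # dicts keep insertion order
        if (limits["cpu_cores"] >= cpu_cores_needed
                and limits["memory_gb"] >= memory_gb_needed):
            return name
    raise ValueError("request exceeds the largest standard size")

print(pick_size(3, 32))  # 'M'
```

Requests that do not fit the largest size are exactly the exceptions mentioned above, and they should stay rare.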
To me it's important to understand that both automation and standardisation can also be achieved when running services on premises, but maybe at different costs and time efforts. Time and agility are among the main reasons to move workloads to the cloud.
On trust
In the end, trust between customer and supplier is an important factor when choosing a solution. Trust is a cultural thing, and not everybody acts the same. This makes life quite interesting, don't you think? Regarding the Exadata Cloud@Customer service, customers trust Oracle Corporation to run their workloads and handle their data even though the hardware stays in the customer's data center. Not every supplier is capable of protecting data, as many incidents have shown in the past, and the risks will increase with standardisation (fewer variants to choose from) and monopolisation (fewer vendors). There would be an interesting experiment with Cloud@Customer services: what happens if one caps internet access to the Exadata hardware? Will the databases continue to run? Legally, you may not be allowed to try.
On appliances
A vendor may know its solution's capabilities and limits best, so it's a logical step to also offer appliances. The more complex a solution, the truer this becomes. Oracle RDBMS is a complex piece of software. Appliances and services have to fit into an existing environment. My experience is: the bigger the company, the bigger the environment, the more vendors and silos there are, and the fewer standards are used. Quite an impressive number of people are busy migrating data and services from one platform to another. Not always successfully, and legacy platforms stay around longer than one thinks.
Time to refactor?
On the other hand: if only the vendor is able to build meaningful appliances, did the vendor reach or miss the point in time to refactor its software products? Oracle RDBMS was first commercially available in 1979. Honestly, I have no idea how much code and how many ideas from older times still exist in today's versions. It's hard for developers to keep up with what hardware and software offer today. Modern programming languages, remote direct memory access (RDMA) and persistent memory are just some examples. One goal is that hardware interacts with hardware in the most direct way, using the least amount of software (and main CPU cycles), so overall performance improves. With Oracle RDBMS, licence costs that are CPU based may be lowered this way.
On future
But maybe, in the end, the most fundamental step is to release software under an open source licence, work constantly on simplicity and closeness to end customers, identify and reduce technical debt before implementing new features, and monetise not by selling licences but services.
Thinking out loud. What is your opinion?
The article Exadata X9M release first appeared on Blog dbi services.