The default block size in Oracle is 8k, and it is by far the most common. Sometimes people create a database with a 16k block size for data warehouses. You can also find some 32k block sizes, but that is less common, which also means less tested and a higher risk of bugs. In our tuning workshop we recommend the default, 8k, for most cases. I'll show 3 tests here that demonstrate what a different block size changes for full table scans and index access.
Test case
I have defined a cache size for the non-default block sizes I want to use:
SQL> show parameter db%_cache_size

NAME                                 TYPE        VALUE
------------------------------------ ----------- -----
db_cache_size                        big integer 0
db_2k_cache_size                     big integer 0
db_4k_cache_size                     big integer 0
db_8k_cache_size                     big integer 0
db_16k_cache_size                    big integer 112M
db_32k_cache_size                    big integer 112M
db_flash_cache_size                  big integer 0
db_keep_cache_size                   big integer 0
db_recycle_cache_size                big integer 0

I'm creating 3 tablespaces with 8k, 16k and 32k block sizes. I create them as uniform with a 1M extent size because large block sizes are supposed to be used for large tables, and I don't want the side effects of smaller first extents with autoallocate.
SQL> create tablespace DEMO08K datafile '/oracle/u01/oradata/DEMO08K.dbf' size 1024M extent management local uniform size 1M blocksize 8k;

Tablespace created.

SQL> create tablespace DEMO16K datafile '/oracle/u01/oradata/DEMO16K.dbf' size 1024M extent management local uniform size 1M blocksize 16k;

Tablespace created.

SQL> create tablespace DEMO32K datafile '/oracle/u01/oradata/DEMO32K.dbf' size 1024M extent management local uniform size 1M blocksize 32k;

Tablespace created.

and then create 3 identical tables, one in each tablespace:
SQL> create table TAB08K (id constraint PK_08K primary key,n,x) tablespace DEMO08K as
     select rownum id , ora_hash(rownum,10) n , lpad(ora_hash(rownum,100),20,'0') x
     from xmltable('1 to 10000000');

Table created.

SQL> create table TAB16K (id constraint PK_16K primary key,n,x) tablespace DEMO16K as
     select rownum id , ora_hash(rownum,10) n , lpad(ora_hash(rownum,100),20,'0') x
     from xmltable('1 to 10000000');

Table created.

SQL> create table TAB32K (id constraint PK_32K primary key,n,x) tablespace DEMO32K as
     select rownum id , ora_hash(rownum,10) n , lpad(ora_hash(rownum,100),20,'0') x
     from xmltable('1 to 10000000');

Table created.

My tables have 10 million rows, two number columns and one larger varchar2:
SQL> select table_name,num_rows,avg_row_len,blocks from user_tables where table_name like 'TAB__K';

TABLE_NAME   NUM_ROWS AVG_ROW_LEN     BLOCKS
---------- ---------- ----------- ----------
TAB08K       10000000          30      48459
TAB16K       10000000          30      23997
TAB32K       10000000          30      11933

Of course, a larger block size needs fewer blocks, but the total size is roughly the same. My rows are small, so this is where the fixed size of the block header can make the most difference.
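As a quick sanity check on that (my own addition, not from the original test), you can compute the on-disk footprint by joining to USER_TABLESPACES for each tablespace's block size:

-- Rough on-disk size per table: blocks times the block size of its tablespace
select t.table_name, ts.block_size, t.blocks,
       round(t.blocks * ts.block_size / 1024 / 1024) mb
from   user_tables t
join   user_tablespaces ts on ts.tablespace_name = t.tablespace_name
where  t.table_name like 'TAB__K'
order  by t.table_name;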
Full Table Scan
So, the common idea is that a larger block size helps by doing larger i/o calls for full table scans...
SQL> set timing on arraysize 5000 autotrace trace
SQL> select * from TAB08K;

10000000 rows selected.

Elapsed: 00:00:32.53

Execution Plan
----------------------------------------------------------
Plan hash value: 1209268626

-------------------------------------------------------------------------
| Id  | Operation                 | Name   | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |        |    10M|   286M|  8462   (1)|
|   1 |  TABLE ACCESS STORAGE FULL| TAB08K |    10M|   286M|  8462   (1)|
-------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          5  recursive calls
          0  db block gets
      50174  consistent gets
      48175  physical reads
          0  redo size
  348174002  bytes sent via SQL*Net to client
      22489  bytes received via SQL*Net from client
       2001  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
   10000000  rows processed

SQL> set autotrace off
SQL> select name,value from v$mystat join v$statname using(statistic#) where name like 'physical read%' and value>0;

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
physical read total IO requests                                         403
physical read total multi block requests                                379
physical read total bytes                                         394821632
physical reads                                                        48196
physical reads cache                                                     23
physical reads direct                                                 48173
physical read IO requests                                               403
physical read bytes                                               394821632
physical reads direct temporary tablespace                                1

I've read 48175 8k blocks with 403 i/o calls.
Now doing the same from the table stored in the 16k blocksize tablespace:
SQL> select * from TAB16K;

10000000 rows selected.

Elapsed: 00:00:31.04

Execution Plan
----------------------------------------------------------
Plan hash value: 2288178481

-------------------------------------------------------------------------
| Id  | Operation                 | Name   | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |        |    10M|   286M|  4378   (2)|
|   1 |  TABLE ACCESS STORAGE FULL| TAB16K |    10M|   286M|  4378   (2)|
-------------------------------------------------------------------------

SQL> select name,value from v$mystat join v$statname using(statistic#) where name like 'physical read%' and value>0;

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
physical read total IO requests                                         397
physical read total multi block requests                                375
physical read total bytes                                         391012352
physical reads                                                        23876
physical reads cache                                                     21
physical reads direct                                                 23855
physical read IO requests                                               397
physical read bytes                                               391012352
physical reads direct temporary tablespace                                1

I've read 23855 16k blocks with 397 i/o calls. It's not a lot better.
SQL> select * from TAB32K;

10000000 rows selected.

Elapsed: 00:00:29.61

Execution Plan
----------------------------------------------------------
Plan hash value: 1240330363

-------------------------------------------------------------------------
| Id  | Operation                 | Name   | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |        |    10M|   286M|  2364   (3)|
|   1 |  TABLE ACCESS STORAGE FULL| TAB32K |    10M|   286M|  2364   (3)|
-------------------------------------------------------------------------

SQL> select name,value from v$mystat join v$statname using(statistic#) where name like 'physical read%' and value>0;

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
physical read total IO requests                                         398
physical read total multi block requests                                373
physical read total bytes                                         388890624
physical reads                                                        11886
physical reads cache                                                     24
physical reads direct                                                 11862
physical read IO requests                                               398
physical read bytes                                               388890624
physical reads direct temporary tablespace                                1

I've read 11862 32k blocks with 398 i/o calls.
Conclusion: we do roughly the same amount of i/o for a full table scan, whatever the block size. This is because Oracle reads multiple blocks per i/o call. Note that db_file_multiblock_read_count is defined as a number of blocks of the default block size, but the i/o size is adapted for tablespaces with a non-default block size. For example, db_file_multiblock_read_count=128 with an 8k default block size will always read 1MB, whatever the tablespace block size.
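To make that arithmetic concrete, here is a quick check (my own illustration, not output from the test): a 1MB multiblock read is 128 blocks at 8k, 64 blocks at 16k, or 32 blocks at 32k.

-- 1MB i/o whatever the block size: the multiblock read count is adapted
select 128 * 8192  / 1024 / 1024 as mb_8k,
       64  * 16384 / 1024 / 1024 as mb_16k,
       32  * 32768 / 1024 / 1024 as mb_32k
from   dual;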
Indexes
I already have an index on the primary key. Let's add some more indexes:
SQL> create index ID_08K on TAB08K(x) tablespace DEMO08K;
Index created.

SQL> create index ID_16K on TAB16K(x) tablespace DEMO16K;
Index created.

SQL> create index ID_32K on TAB32K(x) tablespace DEMO32K;
Index created.

SQL> create bitmap index BI_08K on TAB08K(n) tablespace DEMO08K;
Index created.

SQL> create bitmap index BI_16K on TAB16K(n) tablespace DEMO16K;
Index created.

SQL> create bitmap index BI_32K on TAB32K(n) tablespace DEMO32K;
Index created.

and check their size:
SQL> select index_name,num_rows,blevel,leaf_blocks from user_indexes where table_name like 'TAB__K' order by 1;

INDEX_NAME   NUM_ROWS     BLEVEL LEAF_BLOCKS
---------- ---------- ---------- -----------
BI_08K           3211          2        1606
BI_16K           1562          1         781
BI_32K            759          1         380
ID_08K       10000000          2       44643
ID_16K       10000000          2       22027
ID_32K       10000000          2       10929
PK_08K       10000000          2       22132
PK_16K       10000000          2       10921
PK_32K       10000000          2        5425

Of course the number of blocks is lower when the block size is bigger. And because branches are larger, you may get a smaller depth. But look: on my 10,000,000-row table the depth is the same for the regular indexes: 2 branch levels. Only the bitmap indexes, because they are very small, need one branch level less here.
But think about it. Index depth mostly matters for OLTP, where you get rows by their primary key. Yet people say that smaller blocks are better for OLTP... Data warehouses often have bitmap indexes, but do you really care about having smaller bitmap indexes?
Index access (many rows, good clustering factor)
Anyway, let's test a large range scan:
SQL> select * from TAB08K where id between 1 and 100000;

100000 rows selected.

Elapsed: 00:00:00.44

Execution Plan
----------------------------------------------------------
Plan hash value: 2790916815

-----------------------------------------------------------------------------------
| Id  | Operation                           | Name   | Rows  | Bytes | Cost (%CPU)|
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |        |   100K|  2929K|   707   (1)|
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| TAB08K |   100K|  2929K|   707   (1)|
|*  2 |   INDEX RANGE SCAN                  | PK_08K |   100K|       |   225   (1)|
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID">=1 AND "ID"<=100000)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        878  consistent gets
        679  physical reads
          0  redo size
    3389860  bytes sent via SQL*Net to client
       1589  bytes received via SQL*Net from client
        101  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     100000  rows processed

SQL> set autotrace off
SQL> select name,value from v$mystat join v$statname using(statistic#) where name like 'physical read%' and value>0;

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
physical read total IO requests                                         705
physical read total bytes                                           5775360
physical reads                                                          705
physical reads cache                                                    705
physical read IO requests                                               705
physical read bytes                                                 5775360

We have read 100000 rows through the index. The index is very well clustered. I've done 705 i/o calls to get those rows from 8k blocks.
Now with 16k blocks:
SQL> select * from TAB16K where id between 1 and 100000;

100000 rows selected.

Elapsed: 00:00:00.37

Execution Plan
----------------------------------------------------------
Plan hash value: 1432239150

-----------------------------------------------------------------------------------
| Id  | Operation                           | Name   | Rows  | Bytes | Cost (%CPU)|
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |        |   100K|  2929K|   352   (1)|
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| TAB16K |   100K|  2929K|   352   (1)|
|*  2 |   INDEX RANGE SCAN                  | PK_16K |   100K|       |   113   (1)|
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID">=1 AND "ID"<=100000)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        537  consistent gets
        337  physical reads
          0  redo size
    3389860  bytes sent via SQL*Net to client
       1589  bytes received via SQL*Net from client
        101  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     100000  rows processed

SQL> set autotrace off
SQL> select name,value from v$mystat join v$statname using(statistic#) where name like 'physical read%' and value>0;

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
physical read total IO requests                                         363
physical read total bytes                                           5734400
physical reads                                                          363
physical reads cache                                                    363
physical read IO requests                                               363
physical read bytes                                                 5734400

The number of i/o calls has been divided by two.
SQL> select * from TAB32K where id between 1 and 100000;

100000 rows selected.

Elapsed: 00:00:00.35

Execution Plan
----------------------------------------------------------
Plan hash value: 3074346038

-----------------------------------------------------------------------------------
| Id  | Operation                           | Name   | Rows  | Bytes | Cost (%CPU)|
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |        |   100K|  2929K|   177   (1)|
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| TAB32K |   100K|  2929K|   177   (1)|
|*  2 |   INDEX RANGE SCAN                  | PK_32K |   100K|       |    58   (2)|
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID">=1 AND "ID"<=100000)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        369  consistent gets
        169  physical reads
          0  redo size
    3389860  bytes sent via SQL*Net to client
       1589  bytes received via SQL*Net from client
        101  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     100000  rows processed

SQL> set autotrace off
SQL> select name,value from v$mystat join v$statname using(statistic#) where name like 'physical read%' and value>0;

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
physical read total IO requests                                         195
physical read total bytes                                           5750784
physical reads                                                          195
physical reads cache                                                    195
physical read IO requests                                               195
physical read bytes                                                 5750784

With 32k blocks, it's once again divided by two.
Conclusion: when doing single-block reads from a well-clustered index, we do fewer i/o calls with larger blocks. This is because we need contiguous rows (we are following a well-clustered index), and larger blocks make more rows physically contiguous.
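If you want to check whether your own indexes are in this situation, the clustering factor is exposed in USER_INDEXES; comparing it to the table's block count is a quick way to see it (a sketch, using the same views queried above):

-- Close to table BLOCKS = well clustered, close to NUM_ROWS = badly clustered
select i.index_name, i.clustering_factor, t.blocks, i.num_rows
from   user_indexes i
join   user_tables  t on t.table_name = i.table_name
where  i.table_name like 'TAB__K'
order  by i.index_name;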
Index access (few rows, bad clustering factor)
Here is a query with WHERE X='00000000000000000000'. The index on X - which I populated with a hash value of rownum - has a bad clustering factor. I fetch only 30 rows.
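The statement itself is not reproduced here; given the COUNT STOPKEY in the plan below, it probably had this shape (my reconstruction, the exact text may have differed):

-- Reconstructed query shape: the COUNT STOPKEY suggests a rownum filter
select * from TAB08K where x = '00000000000000000000' and rownum <= 30;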
----------------------------------------------------------------------------
| Id  | Operation                    | Name   | Rows  | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |    30 |   900 |    16   (0)|
|   1 |  COUNT STOPKEY               |        |       |       |            |
|   2 |   TABLE ACCESS BY INDEX ROWID| TAB08K |    30 |   900 |    16   (0)|
|*  3 |    INDEX RANGE SCAN          | ID_08K | 99010 |       |     3   (0)|
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("X"='00000000000000000000')

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
physical read total IO requests                                          16
physical read total bytes                                            131072
physical reads                                                           16
physical reads cache                                                     16
physical read IO requests                                                16
physical read bytes                                                  131072

The query returned about 30 rows. Because of the bad clustering factor we had to read one block for every two rows on average. Let's see the same with 32k blocks.
---------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |    30 |   900 |     7   (0)| 00:00:01 |
|   1 |  COUNT STOPKEY               |        |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| TAB32K |    30 |   900 |     7   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | ID_32K | 99010 |       |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("X"='00000000000000000000')

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
physical read total IO requests                                          33
physical read total bytes                                            442368
physical reads                                                           33
physical reads cache                                                     33
physical read IO requests                                                33
physical read bytes                                                  442368

More i/o calls here, and more bytes read, for the same 30 rows.
Conclusion: a larger block size is bad when we need only a few rows, especially from a badly clustered index: more i/o calls, a larger i/o size, and the large blocks take more space in the buffer cache.
So, which block size suits your workload?
The question is: do you get better performance in a data warehouse with a larger block size? For full table scans, no: it's the same, and it has always been the same. For index range scans retrieving a large number of rows, then yes, the access by index may be faster. But think about it. Retrieving lots of rows by index is always bad, whatever the block size. If you want to do something about it, look at the design: define better indexes (covering all selected columns, as sketched below), partition the table, compress the table, use Parallel Query... There is a lot of tuning to do before thinking about block size.
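For example, on the test table a covering index like this one (a hypothetical sketch, not part of the test) would let the range scan above return all selected columns without touching the table:

-- Hypothetical covering index: all selected columns are in the index,
-- so the TABLE ACCESS BY INDEX ROWID step disappears from the plan
create index IDC_08K on TAB08K (id, n, x) tablespace DEMO08K;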
With a smaller block size the optimizer will favor a full table scan for those cases, and today the gap between full table scan and index access is not so large. There are Exadata SmartScan and In-Memory. Even without those options, multiblock read has been faster since 11g thanks to serial direct reads. All of that makes full table scans faster. And index access has also been improved: batched access by rowid compensates for a bad clustering factor.
And if you think about using a non-default block size for some tablespaces, are you sure that the gain you expect will not be ruined by bad sizing of the buffer cache? With non-default blocksize tablespaces, you have to size their buffer caches manually.
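As a reminder, this is the kind of manual sizing involved, using the parameters shown in the setup (syntax only; the values are the ones from my test):

-- Non-default block size caches are sized manually and are not
-- auto-tuned by Automatic Shared Memory Management
alter system set db_16k_cache_size = 112M scope=both;
alter system set db_32k_cache_size = 112M scope=both;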
Other considerations
A larger block size may have been recommended for very large tablespaces in order to avoid having too many datafiles (their maximum size is expressed in number of blocks), but today you can create bigfile tablespaces for them, so this is not a reason anymore.
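Creating one is a single keyword away (a sketch, with a hypothetical name and a path following the ones used above):

-- One datafile per bigfile tablespace, with a much higher size limit
-- (around 32TB with 8k blocks), so no need for a larger block size here
create bigfile tablespace DEMOBIG
   datafile '/oracle/u01/oradata/DEMOBIG.dbf' size 10G autoextend on;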
There is one reason to have a larger block size: when you have very large rows, you can avoid row chaining. But once again, are you sure it is a problem (i.e. do you often select the columns at the end of the row)? And maybe you should review the design first.
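And before changing the block size for that, measure it: the 'table fetch continued row' statistic counts the extra block visits caused by chained or migrated rows (a quick check in the style of the queries above):

-- Incremented each time a fetch has to follow a row into another block
select name, value
from   v$mystat join v$statname using(statistic#)
where  name = 'table fetch continued row';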
There was another reason to have a larger block size: tablespaces containing large LOBs (I know the 'L' already stands for 'Large', but I mean LOBs larger than the default block size). Today you should use SecureFiles, and we get better performance with them. But that's for another blog post.
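For reference, the SecureFiles storage clause looks like this (a sketch with a hypothetical table; the performance comparison is for that other blog post):

-- SecureFiles LOB storage, the modern replacement for BasicFiles
create table DOCS ( id number, doc clob )
   lob (doc) store as securefile (tablespace DEMO08K);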
Conclusion
When people come with those kinds of rules of thumb, I usually try to see whether it is something they have thought about, or just a blind idea. For example, when they want to rebuild indexes, I ask them which PCTFREE they set for that. Rebuilding without knowing the PCTFREE you want to achieve is pointless.
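For example, a rebuild that states its goal explicitly looks like this (an illustration on one of the test indexes):

-- The PCTFREE is a conscious choice here, not an accident of the rebuild
alter index PK_08K rebuild pctfree 10;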
And when they want to create a tablespace with a non-default block size, I ask them how they have calculated the buffer cache to allocate for that block size. The whole size of the table? Then why do you want to do fewer i/o calls if everything is in cache? A small size? Then are you sure the lower number of i/o calls will compensate for the cache you could otherwise have had in the default buffer pool?
In most cases, just keep the default block size; you probably have a lot of other things to tune. Let me now tell you a short story. I was a junior not-yet-DBA in 1998, implementing banking software (COBOL, Tuxedo, HP-UX, Oracle 7). The application design had a lot to review: things like comments stored in CHAR(2000), for example. Then I saw a senior consultant recommend 'The' solution: increase the block size. The customer accepted, we did it, and everything was immediately better. Of course, in order to do that you have to export/import the database, so everything was reorganized and a lot of design problems were hidden for a few days. After a while, the performance issues came back, and we had to continue the optimization work. Being on a larger block size did not change anything about that. This is where I learned exactly which kind of consultant I don't want to be.