Recently, I'mOracle Database 23aiOnTPC-H 100GBTested and sorted out complete implementation steps and optimization experience. If you also want to evaluate the performance of Oracle database in decision support scenarios, you can refer to my steps to get started quickly.
1. Environmental preparation
Before starting the test, let’s introduce the basic situation of this test environment:
- Database version:Oracle Database 23ai
- storage:500GBAvailable hard disk space (storage temporary files, data, index table space, etc.)
- Memory:32GB
- CPU: 4-core processor (optimized parallel query)
- operating system: Oracle Linux 8.10 (or RHEL compatible version)
First, you must make sure that the 23ai database has been successfully installed in this environment. If the specific steps to install the database are not clear, please refer to "Quickly silently install 23ai database on OCI》。
It should be noted that 23ai only supports multi-tenant architecture, so my test here isALFRED
Under the tenant, create a new one in this tenanttpch
Test the user, the user's default tablespace is specified asTBS_ALFRED
, In addition, create a tablespace that specifically stores indexesTBS_ALFRED_INDEX
Used for subsequent optimization.
alter session set container=ALFRED;
--TABLESPACE:tbs_alfred,120G
create tablespace tbs_alfred datafile '/u01/app/oracle/oradata/DEMO/alfred/tbs_alfred01.dbf' size 30G;
-- An error occurred when adding data files. I found that the default creation of 23ai is a large file table space. Then I directly resize 150G (avoid the pit, be generous here. I gave 120G before and found that it was not enough...)
ALTER DATABASE DATAFILE '/u01/app/oracle/oradata/DEMO/alfred/tbs_alfred01.dbf' RESIZE 150G;
--TABLESPACE:tbs_alfred_index,60G, direct size specified
create tablespace tbs_alfred_index datafile '/u01/app/oracle/oradata/DEMO/alfred/tbs_alfred_index01.dbf' size 60G;
--Create the test user tpch, and specify the default table space as tbs_alfred, assign basic permissions
create user tpch identified by tpch default tablespace tbs_alfred;
grant connect, resource to tpch;
ALTER USER TPCH QUOTA UNLIMITED ON TBS_ALFRED;
ALTER USER TPCH QUOTA UNLIMITED ON TBS_ALFRED_INDEX;
When creating the data file here, the monitoring also shows that the IO capability of this test environment is not strong, and the limit is more than 200 M writes per second.
To facilitate subsequent testing, sqlplus can be configured to connect directly to PDB:
# Configure PDB connection, test the connection OK after update
vi /u01/app/oracle/product/23.0.0/db_1/network/admin/
sqlplus tpch/tpch@alfred
2. Download and install the TPC-H tool
Officially provided by TPCTPC-H Tools,include:
-
DBGEN
:Generate test data(8 tables.tbl
Data file) -
QGEN
:Generate SQL Query(TPC-H 22 query statements)
(1) Download the TPC-H tool
Download from TPC official websiteTPC-H Toolkit:
- /
- accessTPC-H Download Page
- chooseTPC-H Tools,download
TPC-H_Tools_v3.0.
- tpc-h-3.0.
You need to register here for free. The system will send you a download link to the email address you leave. Note that this link can only be downloaded once and has a time limit:
- After successful download, unzip: (The oracle user is used to decompress it, which is convenient for testing. For pressure testing tools, personal habit is to place it in the media directory /u01/media)
unzip TPC-H_Tools_v3.0. cd TPC-H V3.0.1
(2) Compile DBGEN and QGEN
TPC-H requires manual configuration, to specifyDatabase TypeandSystem environment。
Modify Makefile
Edit the dbgen directory, just search and modify the following line:
CC=gcc
DATABASE=ORACLE
MACHINE=LINUX
WORKLOAD=TPCH
Compilation
Execute the make command to compile:
make -f
After success, the current directory will be generated:
-
dbgen
(Used to generate data) -
qgen
(Used to generate query SQL)
3. Generate 100GB of TPC-H data
TPC-H allows data sets of different sizes, I choose here100GB(Scale Factor = 100):
# vi
time ./dbgen -s 100 -vf
100G is not small anymore, so I am worried about the time being too long, so I record the time and put it in the background to execute to prevent interruptions during the period:
nohup sh &
The generated data file includes commands that can be used.ls -lrth *.tbl
Check it out and understand each table size intuitively:
[oracle@dbtest dbgen]$ ls -lrth *.tbl
-rw-r--r--. 1 oracle oinstall 137M Mar 6 09:31
-rw-r--r--. 1 oracle oinstall 389 Mar 6 09:31
-rw-r--r--. 1 oracle oinstall 2.3G Mar 6 09:31
-rw-r--r--. 1 oracle oinstall 12G Mar 6 09:31
-rw-r--r--. 1 oracle oinstall 17G Mar 6 09:31
-rw-r--r--. 1 oracle oinstall 2.2K Mar 6 09:31
-rw-r--r--. 1 oracle oinstall 75G Mar 6 09:31
-rw-r--r--. 1 oracle oinstall 2.3G Mar 6 09:31
Follow-up optimization suggestions:
- You can consider splitting files. dbgen supports splitting files, which can facilitate parallel imports at the same time. The test volume of this 100G is OK, so let's do it for now.
4. Create a TPC-H table in Oracle Database 23ai
TPC-H provides, can be created directly8 tables。
In this way, the basic test basically does not need to change the content, the default table creation will beTBS_ALFRED
in tablespace.
The first round of tests will not be optimized and adjusted.
In addition, the field type definition in the script, for exampleINTEGER
、VARCHAR
These do not need to be changed manuallyNUMBER
andVARCHAR2
Type, Oracle will do this work by itself, just execute the creation directly, and after execution, you can desc check the following table structure.
Follow-up optimization suggestions:
- Evaluate usePartition tableCan query performance be improved, such as for the largest tables: lineitem, orders, and partssupp, and is there a suitable partitioning method?
5. Use SQL*Loader to load data in batches
TPC-H generated.tbl
The files need to be imported into the Oracle database. I useSQL*LoaderPerform efficient batch import.
-
The control files required to create SQLDR import:
The corresponding 8 tables tested, each form has a control file. Here I deliberately capitalize the name of the control file, which makes it easier to distinguish:
--1. vi
LOAD DATA
INFILE ''
INTO TABLE NATION
TRUNCATE
FIELDS TERMINATED BY '|'
(
N_NATIONKEY INTEGER EXTERNAL,
N_NAME CHAR(25),
N_REGIONKEY INTEGER EXTERNAL,
N_COMMENT CHAR(152)
)
--2. vi
LOAD DATA
INFILE ''
INTO TABLE REGION
TRUNCATE
FIELDS TERMINATED BY '|'
(
R_REGIONKEY INTEGER EXTERNAL,
R_NAME CHAR(25),
R_COMMENT CHAR(152)
)
--3. vi
LOAD DATA
INFILE ''
INTO TABLE SUPPLIER
TRUNCATE
FIELDS TERMINATED BY '|'
(
S_SUPPKEY INTEGER EXTERNAL,
S_NAME CHAR(25),
S_ADDRESS CHAR(40),
S_NATIONKEY INTEGER EXTERNAL,
S_PHONE CHAR(15),
S_ACCTBAL DECIMAL EXTERNAL,
S_COMMENT CHAR(101)
)
--4. vi
LOAD DATA
INFILE ''
INTO TABLE PART
TRUNCATE
FIELDS TERMINATED BY '|'
(
P_PARTKEY INTEGER EXTERNAL,
P_NAME CHAR(55),
P_MFGR CHAR(25),
P_BRAND CHAR(10),
P_TYPE CHAR(25),
P_SIZE INTEGER EXTERNAL,
P_CONTAINER CHAR(10),
P_RETAILPRICE DECIMAL EXTERNAL,
P_COMMENT CHAR(23)
)
--5. vi
LOAD DATA
INFILE ''
INTO TABLE PARTSUPP
TRUNCATE
FIELDS TERMINATED BY '|'
(
PS_PARTKEY INTEGER EXTERNAL,
PS_SUPPKEY INTEGER EXTERNAL,
PS_AVAILQTY INTEGER EXTERNAL,
PS_SUPPLYCOST DECIMAL EXTERNAL,
PS_COMMENT CHAR(199)
)
--6. vi
LOAD DATA
INFILE ''
INTO TABLE CUSTOMER
TRUNCATE
FIELDS TERMINATED BY '|'
(
C_CUSTKEY INTEGER EXTERNAL,
C_NAME CHAR(25),
C_ADDRESS CHAR(40),
C_NATIONKEY INTEGER EXTERNAL,
C_PHONE CHAR(15),
C_ACCTBAL DECIMAL EXTERNAL,
C_MKTSEGMENT CHAR(10),
C_COMMENT CHAR(117)
)
--7. vi
LOAD DATA
INFILE ''
INTO TABLE ORDERS
TRUNCATE
FIELDS TERMINATED BY '|'
(
O_ORDERKEY INTEGER EXTERNAL,
O_CUSTKEY INTEGER EXTERNAL,
O_ORDERSTATUS CHAR(1),
O_TOTALPRICE DECIMAL EXTERNAL,
O_ORDERDATE DATE "YYYY-MM-DD",
O_ORDERPRIORITY CHAR(15),
O_CLERK CHAR(15),
O_SHIPPRIORITY INTEGER EXTERNAL,
O_COMMENT CHAR(79)
)
--8. vi
LOAD DATA
INFILE ''
INTO TABLE LINEITEM
TRUNCATE
FIELDS TERMINATED BY '|'
(
L_ORDERKEY INTEGER EXTERNAL,
L_PARTKEY INTEGER EXTERNAL,
L_SUPPKEY INTEGER EXTERNAL,
L_LINENUMBER INTEGER EXTERNAL,
L_QUANTITY DECIMAL EXTERNAL,
L_EXTENDEDPRICE DECIMAL EXTERNAL,
L_DISCOUNT DECIMAL EXTERNAL,
L_TAX DECIMAL EXTERNAL,
L_RETURNFLAG CHAR(1),
L_LINESTATUS CHAR(1),
L_SHIPDATE DATE "YYYY-MM-DD",
L_COMMITDATE DATE "YYYY-MM-DD",
L_RECEIPTDATE DATE "YYYY-MM-DD",
L_SHIPINSTRUCT CHAR(25),
L_SHIPMODE CHAR(10),
L_COMMENT CHAR(44)
)
- Run SQL*Loader:
#
sqlldr userid=tpch/tpch@alfred control= log= bad= direct=true
#
sqlldr userid=tpch/tpch@alfred control= log= bad= direct=true
#
sqlldr userid=tpch/tpch@alfred control= log= bad= direct=true
#
sqlldr userid=tpch/tpch@alfred control= log= bad= direct=true
#
sqlldr userid=tpch/tpch@alfred control= log= bad= direct=true
#
sqlldr userid=tpch/tpch@alfred control= log= bad= direct=true
#
sqlldr userid=tpch/tpch@alfred control= log= bad= direct=true
#
sqlldr userid=tpch/tpch@alfred control= log= bad= direct=true
# OPTION1: Because I only gave 120G table space before, the space was not enough, so I continued to load after adjustment. Also, please note that change to append, otherwise the previous record will be cleared
sqlldr userid=tpch/tpch@alfred control= log= bad= discard= direct=true skip=582854696
# OPTION2: Because no primary key unique constraint has been established, I saw that the test data is not an integer, and I mistakenly thought that there was more data, so I simply imported it completely, modified it to truncate mode again, and repeated step 8, and found that it was still line 600037902:
sqlldr userid=tpch/tpch@alfred control= log= bad= direct=true
6. Run TPC-H 22 queries
TPC-H is predefined22 SQL queries, can be usedqgen
Generate SQL statements:
I encountered a problem here, briefly record it:
[oracle@dbtest dbgen]$ ./qgen -s 100 1 >
Open failed for ./ at :170
This error wasted a little time, and finally found that it was necessary to set the lower variable, and then execute it again and it was OK:
[oracle@dbtest dbgen]$ export DSS_QUERY=./queries
[oracle@dbtest dbgen]$ ./qgen -s 100 1 >
After confirming that it is OK, use the following command to quickly generate these 22 queries:
for i in $(seq 1 22); do
./qgen -s 100 $i > query$
echo "Generated query$"
done
Then you can choose to run in the Oracle database and record the execution time:
SQL>
@query1
@query2
...
@query22
At this point, the preparation of the Oracle 23ai TPC-H test environment has been completed. Note that there is no optimization at this time, and even the default generation syntax of some SQL is not compatible with Oracle. Subsequent articles will record these detailed adjustment methods and some optimization techniques. Stay tuned!