Location>code7788 >text

Oracle 23ai TPC-H Test Environment Deployment

Popularity:465 ℃/2025-03-08 00:37:45

Recently, I'mOracle Database 23aiOnTPC-H 100GBTested and sorted out complete implementation steps and optimization experience. If you also want to evaluate the performance of Oracle database in decision support scenarios, you can refer to my steps to get started quickly.


1. Environmental preparation

Before starting the test, let’s introduce the basic situation of this test environment:

  • Database version:Oracle Database 23ai
  • storage500GBAvailable hard disk space (storage temporary files, data, index table space, etc.)
  • Memory32GB
  • CPU: 4-core processor (optimized parallel query)
  • operating system: Oracle Linux 8.10 (or RHEL compatible version)

First, you must make sure that the 23ai database has been successfully installed in this environment. If the specific steps to install the database are not clear, please refer to "Quickly silently install 23ai database on OCI》。

It should be noted that 23ai only supports multi-tenant architecture, so my test here isALFREDUnder the tenant, create a new one in this tenanttpchTest the user, the user's default tablespace is specified asTBS_ALFRED, In addition, create a tablespace that specifically stores indexesTBS_ALFRED_INDEXUsed for subsequent optimization.

alter session set container=ALFRED;

 --TABLESPACE:tbs_alfred,120G
 create tablespace tbs_alfred datafile '/u01/app/oracle/oradata/DEMO/alfred/tbs_alfred01.dbf' size 30G;
 -- An error occurred when adding data files. I found that the default creation of 23ai is a large file table space. Then I directly resize 150G (avoid the pit, be generous here. I gave 120G before and found that it was not enough...)
 ALTER DATABASE DATAFILE '/u01/app/oracle/oradata/DEMO/alfred/tbs_alfred01.dbf' RESIZE 150G;

 --TABLESPACE:tbs_alfred_index,60G, direct size specified
 create tablespace tbs_alfred_index datafile '/u01/app/oracle/oradata/DEMO/alfred/tbs_alfred_index01.dbf' size 60G;

 --Create the test user tpch, and specify the default table space as tbs_alfred, assign basic permissions
 create user tpch identified by tpch default tablespace tbs_alfred;
 grant connect, resource to tpch;
 ALTER USER TPCH QUOTA UNLIMITED ON TBS_ALFRED;
 ALTER USER TPCH QUOTA UNLIMITED ON TBS_ALFRED_INDEX;

When creating the data file here, the monitoring also shows that the IO capability of this test environment is not strong, and the limit is more than 200 M writes per second.

To facilitate subsequent testing, sqlplus can be configured to connect directly to PDB:

# Configure PDB connection, test the connection OK after update
 vi /u01/app/oracle/product/23.0.0/db_1/network/admin/
 sqlplus tpch/tpch@alfred

2. Download and install the TPC-H tool

Officially provided by TPCTPC-H Tools,include:

  • DBGENGenerate test data(8 tables.tblData file)
  • QGENGenerate SQL Query(TPC-H 22 query statements)

(1) Download the TPC-H tool

Download from TPC official websiteTPC-H Toolkit

  • /
  1. accessTPC-H Download Page
  2. chooseTPC-H Tools,downloadTPC-H_Tools_v3.0.
  • tpc-h-3.0.
    You need to register here for free. The system will send you a download link to the email address you leave. Note that this link can only be downloaded once and has a time limit:
  1. After successful download, unzip: (The oracle user is used to decompress it, which is convenient for testing. For pressure testing tools, personal habit is to place it in the media directory /u01/media)
    unzip TPC-H_Tools_v3.0.
    cd TPC-H V3.0.1
    

(2) Compile DBGEN and QGEN

TPC-H requires manual configuration, to specifyDatabase TypeandSystem environment

Modify Makefile

Edit the dbgen directory, just search and modify the following line:

CC=gcc
DATABASE=ORACLE
MACHINE=LINUX
WORKLOAD=TPCH

Compilation

Execute the make command to compile:

make -f 

After success, the current directory will be generated:

  • dbgen(Used to generate data)
  • qgen(Used to generate query SQL)

3. Generate 100GB of TPC-H data

TPC-H allows data sets of different sizes, I choose here100GB(Scale Factor = 100)

# vi 
time ./dbgen -s 100 -vf

100G is not small anymore, so I am worried about the time being too long, so I record the time and put it in the background to execute to prevent interruptions during the period:

nohup sh  &

The generated data file includes commands that can be used.ls -lrth *.tblCheck it out and understand each table size intuitively:

[oracle@dbtest dbgen]$ ls -lrth *.tbl
-rw-r--r--. 1 oracle oinstall 137M Mar  6 09:31 
-rw-r--r--. 1 oracle oinstall  389 Mar  6 09:31 
-rw-r--r--. 1 oracle oinstall 2.3G Mar  6 09:31 
-rw-r--r--. 1 oracle oinstall  12G Mar  6 09:31 
-rw-r--r--. 1 oracle oinstall  17G Mar  6 09:31 
-rw-r--r--. 1 oracle oinstall 2.2K Mar  6 09:31 
-rw-r--r--. 1 oracle oinstall  75G Mar  6 09:31 
-rw-r--r--. 1 oracle oinstall 2.3G Mar  6 09:31 

Follow-up optimization suggestions:

  • You can consider splitting files. dbgen supports splitting files, which can facilitate parallel imports at the same time. The test volume of this 100G is OK, so let's do it for now.

4. Create a TPC-H table in Oracle Database 23ai

TPC-H provides, can be created directly8 tables

In this way, the basic test basically does not need to change the content, the default table creation will beTBS_ALFREDin tablespace.

The first round of tests will not be optimized and adjusted.

In addition, the field type definition in the script, for exampleINTEGERVARCHARThese do not need to be changed manuallyNUMBERandVARCHAR2Type, Oracle will do this work by itself, just execute the creation directly, and after execution, you can desc check the following table structure.

Follow-up optimization suggestions

  • Evaluate usePartition tableCan query performance be improved, such as for the largest tables: lineitem, orders, and partssupp, and is there a suitable partitioning method?

5. Use SQL*Loader to load data in batches

TPC-H generated.tblThe files need to be imported into the Oracle database. I useSQL*LoaderPerform efficient batch import.

  1. The control files required to create SQLDR import
    The corresponding 8 tables tested, each form has a control file. Here I deliberately capitalize the name of the control file, which makes it easier to distinguish:
--1. vi 
LOAD DATA
INFILE ''
INTO TABLE NATION
TRUNCATE
FIELDS TERMINATED BY '|'
(
    N_NATIONKEY    INTEGER EXTERNAL,
    N_NAME         CHAR(25),
    N_REGIONKEY    INTEGER EXTERNAL,
    N_COMMENT      CHAR(152)
)

--2. vi 
LOAD DATA
INFILE ''
INTO TABLE REGION
TRUNCATE
FIELDS TERMINATED BY '|'
(
    R_REGIONKEY    INTEGER EXTERNAL,
    R_NAME         CHAR(25),
    R_COMMENT      CHAR(152)
)

--3. vi 
LOAD DATA
INFILE ''
INTO TABLE SUPPLIER
TRUNCATE
FIELDS TERMINATED BY '|'
(
    S_SUPPKEY     INTEGER EXTERNAL,
    S_NAME        CHAR(25),
    S_ADDRESS     CHAR(40),
    S_NATIONKEY   INTEGER EXTERNAL,
    S_PHONE       CHAR(15),
    S_ACCTBAL     DECIMAL EXTERNAL,
    S_COMMENT     CHAR(101)
)

--4. vi 
LOAD DATA
INFILE ''
INTO TABLE PART
TRUNCATE
FIELDS TERMINATED BY '|'
(
    P_PARTKEY     INTEGER EXTERNAL,
    P_NAME        CHAR(55),
    P_MFGR        CHAR(25),
    P_BRAND       CHAR(10),
    P_TYPE        CHAR(25),
    P_SIZE        INTEGER EXTERNAL,
    P_CONTAINER   CHAR(10),
    P_RETAILPRICE DECIMAL EXTERNAL,
    P_COMMENT     CHAR(23)
)

--5. vi 
LOAD DATA
INFILE ''
INTO TABLE PARTSUPP
TRUNCATE
FIELDS TERMINATED BY '|'
(
    PS_PARTKEY    INTEGER EXTERNAL,
    PS_SUPPKEY    INTEGER EXTERNAL,
    PS_AVAILQTY   INTEGER EXTERNAL,
    PS_SUPPLYCOST DECIMAL EXTERNAL,
    PS_COMMENT    CHAR(199)
)

--6. vi 
LOAD DATA
INFILE ''
INTO TABLE CUSTOMER
TRUNCATE
FIELDS TERMINATED BY '|'
(
    C_CUSTKEY     INTEGER EXTERNAL,
    C_NAME        CHAR(25),
    C_ADDRESS     CHAR(40),
    C_NATIONKEY   INTEGER EXTERNAL,
    C_PHONE       CHAR(15),
    C_ACCTBAL     DECIMAL EXTERNAL,
    C_MKTSEGMENT  CHAR(10),
    C_COMMENT     CHAR(117)
)

--7. vi 
LOAD DATA
INFILE ''
INTO TABLE ORDERS
TRUNCATE
FIELDS TERMINATED BY '|'
(
    O_ORDERKEY      INTEGER EXTERNAL,
    O_CUSTKEY       INTEGER EXTERNAL,
    O_ORDERSTATUS   CHAR(1),
    O_TOTALPRICE    DECIMAL EXTERNAL,
    O_ORDERDATE     DATE "YYYY-MM-DD",
    O_ORDERPRIORITY CHAR(15),
    O_CLERK        CHAR(15),
    O_SHIPPRIORITY INTEGER EXTERNAL,
    O_COMMENT      CHAR(79)
)

--8. vi 
LOAD DATA
INFILE ''
INTO TABLE LINEITEM
TRUNCATE
FIELDS TERMINATED BY '|'
(
    L_ORDERKEY        INTEGER EXTERNAL,
    L_PARTKEY         INTEGER EXTERNAL,
    L_SUPPKEY         INTEGER EXTERNAL,
    L_LINENUMBER      INTEGER EXTERNAL,
    L_QUANTITY        DECIMAL EXTERNAL,
    L_EXTENDEDPRICE   DECIMAL EXTERNAL,
    L_DISCOUNT        DECIMAL EXTERNAL,
    L_TAX             DECIMAL EXTERNAL,
    L_RETURNFLAG      CHAR(1),
    L_LINESTATUS      CHAR(1),
    L_SHIPDATE        DATE "YYYY-MM-DD",
    L_COMMITDATE      DATE "YYYY-MM-DD",
    L_RECEIPTDATE     DATE "YYYY-MM-DD",
    L_SHIPINSTRUCT    CHAR(25),
    L_SHIPMODE        CHAR(10),
    L_COMMENT         CHAR(44)
)
  1. Run SQL*Loader
#
 sqlldr userid=tpch/tpch@alfred control= log= bad= direct=true

 #
 sqlldr userid=tpch/tpch@alfred control= log= bad= direct=true

 #
 sqlldr userid=tpch/tpch@alfred control= log= bad= direct=true

 #
 sqlldr userid=tpch/tpch@alfred control= log= bad= direct=true

 #
 sqlldr userid=tpch/tpch@alfred control= log= bad= direct=true

 #
 sqlldr userid=tpch/tpch@alfred control= log= bad= direct=true

 #
 sqlldr userid=tpch/tpch@alfred control= log= bad= direct=true

 #
 sqlldr userid=tpch/tpch@alfred control= log= bad= direct=true

 # OPTION1: Because I only gave 120G table space before, the space was not enough, so I continued to load after adjustment.  Also, please note that change to append, otherwise the previous record will be cleared
 sqlldr userid=tpch/tpch@alfred control= log= bad= discard= direct=true skip=582854696

 # OPTION2: Because no primary key unique constraint has been established, I saw that the test data is not an integer, and I mistakenly thought that there was more data, so I simply imported it completely, modified it to truncate mode again, and repeated step 8, and found that it was still line 600037902:
 sqlldr userid=tpch/tpch@alfred control= log= bad= direct=true

6. Run TPC-H 22 queries

TPC-H is predefined22 SQL queries, can be usedqgenGenerate SQL statements:

I encountered a problem here, briefly record it:

[oracle@dbtest dbgen]$ ./qgen -s 100 1 > 
Open failed for ./ at :170

This error wasted a little time, and finally found that it was necessary to set the lower variable, and then execute it again and it was OK:

[oracle@dbtest dbgen]$ export DSS_QUERY=./queries
[oracle@dbtest dbgen]$ ./qgen -s 100 1 > 

After confirming that it is OK, use the following command to quickly generate these 22 queries:

for i in $(seq 1 22); do
  ./qgen -s 100 $i > query$
  echo "Generated query$"
done

Then you can choose to run in the Oracle database and record the execution time:

SQL>
@query1
@query2
...
@query22

At this point, the preparation of the Oracle 23ai TPC-H test environment has been completed. Note that there is no optimization at this time, and even the default generation syntax of some SQL is not compatible with Oracle. Subsequent articles will record these detailed adjustment methods and some optimization techniques. Stay tuned!