Location>code7788 >text

ORACLE data mining MSET-SPRT

Popularity:703 ℃/2025-03-18 23:45:03

Although I am a familiar Oracle database, I didn’t understand the knowledge about machine learning and data mining at first. The MSET related settings in the article are derived from the sample provided by my colleagues. During the test, I learned to absorb the data while searching for the data, so I saw a different Oracle.

Oracle's MSET-SPRT is mainly used for high-precision anomaly detection and predictive maintenance, and is especially suitable for critical business systems such as databases, storage, industrial control systems and data center operation environments.

MSET-SPRT is translated into Chinese as: multivariate state estimation technology - sequence probability ratio test.

I personally feel that this noun is a bit confusing when translated... I will use the English abbreviation to replace it.

In the official ORACLE document, MSET-SPRT is described as follows:

The Multivariate State Estimation Technique - Sequential Probability Ratio Test (MSET-SPRT) algorithm monitors critical processes and detects subtle anomalies.

The following records the test process of MSET, which is used for testing directly before reusing.TPC-H environmentTo verify:

  • 1. Create a new test table and configuration table
  • 2. Create a stored procedure for constructing test data
  • 3. Simulate test data of different scales and insert them
  • 4. Create a data mining model
  • 5. Query prediction results
  • 6. Other test suggestions

1. Create a new test table and configuration table

Here we select only fields with practical significance and build the TBL_IOT table:

  CREATE TABLE "TPCH"."TBL_IOT" 
   ("B1_T1" NUMBER(38,0), 
	"B1_T2" NUMBER(38,0), 
	"B1_T3" NUMBER(38,0), 
	"B2_T1" NUMBER(38,0), 
	"B2_T2" NUMBER(38,0), 
	"B2_T3" NUMBER(38,0), 
	"B3_T1" NUMBER(38,0), 
	"B3_T2" NUMBER(38,0), 
	"B3_T3" NUMBER(38,0), 
	"B4_T1" NUMBER(38,0), 
	"B4_T2" NUMBER(38,0), 
	"B4_T3" NUMBER(38,0), 
	"B5_T1" NUMBER(38,0), 
	"B5_T2" NUMBER(38,0), 
	"B5_T3" NUMBER(38,0), 
	"B6_T1" NUMBER(38,0), 
	"B6_T2" NUMBER(38,0), 
	"B6_T3" NUMBER(38,0), 
	"B7_T1" NUMBER(38,0), 
	"B7_T2" NUMBER(38,0), 
	"B7_T3" NUMBER(38,0), 
	"B8_T1" NUMBER(38,0), 
	"B8_T2" NUMBER(38,0), 
	"B8_T3" NUMBER(38,0), 
	"B9_T1" NUMBER(38,0), 
	"B9_T2" NUMBER(38,0), 
	"B9_T3" NUMBER(38,0), 
	"B10_T1" NUMBER(38,0), 
	"B10_T2" NUMBER(38,0), 
	"B10_T3" NUMBER(38,0), 
	"B11_T1" NUMBER(38,0), 
	"B11_T2" NUMBER(38,0), 
	"B11_T3" NUMBER(38,0), 
	"B12_T1" NUMBER(38,0), 
	"B12_T2" NUMBER(38,0), 
	"B12_T3" NUMBER(38,0), 
	"S" NUMBER(38,1), 
	"K_TS" TIMESTAMP (6)
   );

Create the MSET_IOT_SETTINGS table to store the configuration of the data mining algorithm, insert the key parameters of the MSET-SPRT algorithm:

  • Select the MSET-SPRT algorithm
  • Turn on automatic data preparation
  • Set vector storage size
  • Set the probability of alpha (false positive)
  • Set the exception alarm threshold (number of times & window size)
-- Create setting table        
CREATE TABLE MSET_IOT_SETTINGS(SETTING_NAME VARCHAR2(30), 
                             SETTING_VALUE VARCHAR2(128));
-- Populate setting table
BEGIN
  -- Select MSET-SPRT as the algorithm
  INSERT INTO MSET_IOT_SETTINGS
         VALUES(DBMS_DATA_MINING.ALGO_NAME,
                DBMS_DATA_MINING.ALGO_MSET_SPRT);
  -- Turn on automatic data preparation   
  INSERT INTO MSET_IOT_SETTINGS
         VALUES(DBMS_DATA_MINING.PREP_AUTO,
                DBMS_DATA_MINING.PREP_AUTO_ON);
  -- Set memory vector
  INSERT INTO MSET_IOT_SETTINGS
    VALUES(DBMS_DATA_MINING.MSET_MEMORY_VECTORS, 100);
  -- Set alpha
  INSERT INTO MSET_IOT_SETTINGS
    VALUES(DBMS_DATA_MINING.MSET_ALPHA_PROB, 0.1);
  -- Set alert count
  INSERT INTO MSET_IOT_SETTINGS
    VALUES(DBMS_DATA_MINING.MSET_ALERT_COUNT, 3);
  -- Set alert window
  INSERT INTO MSET_IOT_SETTINGS
    VALUES(DBMS_DATA_MINING.MSET_ALERT_WINDOW, 5);  
  -- Examples of other possible settings are:
  -- (dbms_data_mining.mset_beta_prob, 0.1)
  -- (dbms_data_mining.mset_adb_height, 0.01)
  -- (dbms_data_mining.mset_std_tolerance, 3)
  -- (dbms_data_mining.mset_heldaside, 500)
  COMMIT;  
END;
/

2. Create a stored procedure for constructing test data

Create a stored procedure specifically, passing in the number of rows as parameters, so as to facilitate inserting data of different sizes at any time.
Here, IOT scenarios are simulated, requiring one data per second.

The table needs to be manually cleared before inserting data. Because I prevent misoperation, I did not write the high-risk commands of the truncate table directly into the stored procedure. Before executing the stored procedure, the truncate table operation is manually executed. The stored procedure used to insert data is as follows:

CREATE OR REPLACE PROCEDURE INSERT_IOT_DATA(p_total_rows IN NUMBER) AS
    v_start_time TIMESTAMP := TO_TIMESTAMP('2025-03-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS');
BEGIN
    FOR i IN 1..p_total_rows LOOP
        INSERT INTO TBL_IOT (
            B1_T1, B1_T2, B1_T3, B2_T1, B2_T2, B2_T3, B3_T1, B3_T2, B3_T3, 
            B4_T1, B4_T2, B4_T3, B5_T1, B5_T2, B5_T3, B6_T1, B6_T2, B6_T3, 
            B7_T1, B7_T2, B7_T3, B8_T1, B8_T2, B8_T3, B9_T1, B9_T2, B9_T3, 
            B10_T1, B10_T2, B10_T3, B11_T1, B11_T2, B11_T3, B12_T1, B12_T2, B12_T3, 
            S, K_TS
        ) VALUES (
            DBMS_RANDOM.VALUE(50, 55), DBMS_RANDOM.VALUE(50, 55), DBMS_RANDOM.VALUE(50, 55), 
            DBMS_RANDOM.VALUE(50, 55), DBMS_RANDOM.VALUE(50, 55), DBMS_RANDOM.VALUE(50, 55), 
            DBMS_RANDOM.VALUE(50, 55), DBMS_RANDOM.VALUE(50, 55), DBMS_RANDOM.VALUE(50, 55), 
            DBMS_RANDOM.VALUE(50, 55), DBMS_RANDOM.VALUE(50, 55), DBMS_RANDOM.VALUE(50, 55), 
            DBMS_RANDOM.VALUE(50, 55), DBMS_RANDOM.VALUE(50, 55), DBMS_RANDOM.VALUE(50, 55), 
            DBMS_RANDOM.VALUE(50, 55), DBMS_RANDOM.VALUE(50, 55), DBMS_RANDOM.VALUE(50, 55), 
            DBMS_RANDOM.VALUE(50, 55), DBMS_RANDOM.VALUE(50, 55), DBMS_RANDOM.VALUE(50, 55), 
            DBMS_RANDOM.VALUE(50, 55), DBMS_RANDOM.VALUE(50, 55), DBMS_RANDOM.VALUE(50, 55), 
            DBMS_RANDOM.VALUE(50, 55), DBMS_RANDOM.VALUE(50, 55), DBMS_RANDOM.VALUE(50, 55), 
            DBMS_RANDOM.VALUE(50, 55), DBMS_RANDOM.VALUE(50, 55), DBMS_RANDOM.VALUE(50, 55), 
            DBMS_RANDOM.VALUE(50, 55), DBMS_RANDOM.VALUE(50, 55), DBMS_RANDOM.VALUE(50, 55), 
            DBMS_RANDOM.VALUE(50, 55), DBMS_RANDOM.VALUE(50, 55), DBMS_RANDOM.VALUE(50, 55),
            DBMS_RANDOM.VALUE(2, 5), 
            v_start_time + INTERVAL '1' SECOND * i
        );
        
        IF MOD(i, 1000) = 0 THEN
            COMMIT;
        END IF;
    END LOOP;
    
    COMMIT;
END;
/

Note: When building the data here I expect to insert random integers between 50 and 55, but using DBMS_RANDOM.VALUE(50,55), the actual randomness is decimal, but the result is as expected. This is because when inserting data with decimals into columns of type NUMBER(38,0), Oracle does not report an error, but will automatically truncate the decimal part and only retain the integer part. You can manually verify that the data meets expectations.

3. Simulate test data of different scales and insert them

The following needs to test different data sizes: 86w, 130w, 260w.
Before inserting data of different sizes, you need to manually truncate the table:

-- Before inserting data, make sure that the table data has been cleared and high-risk operations are manually performed:
 truncate table TBL_IOT;

 -- Insert data of different sizes according to requirements:
 -- Insert 860,000 rows
 EXEC INSERT_IOT_DATA(860000);

 -- Insert 1.3 million rows
 EXEC INSERT_IOT_DATA(1300000);

 -- Insert 2.6 million rows
 EXEC INSERT_IOT_DATA(2600000);

 -- The query table takes up space
 select SEGMENT_NAME, BYTES / 1024 / 1024 "MB" from dba_segments where SEGMENT_NAME = 'TBL_IOT';

4. Create a data mining model

Create a data mining modelIOTMSET_MODEL, if it exists before, delete it and create it:

-- Delete the data mining model IOTMSET_MODEL
 exec dbms_data_mining.drop_model('IOTMSET_MODEL');

 -- Create a data mining model IOTMSET_MODEL
 BEGIN
   dbms_data_mining.create_model(model_name => 'IOTMSET_MODEL',
                mining_function => 'CLASSIFICATION',
                data_table_name => 'TBL_IOT', --table name
                case_id_column_name => 'K_TS', --timestamp column name
                target_column_name => '',
                settings_table_name => 'mset_iot_settings'); --Configure table name
 END;
 /

After creating the model, you will find that the object under the current user has an additional series of tables and views named with the DM$ prefix. This is a model storage object automatically created by Oracle Data Mining (ODM). They are used to store relevant data of the MSET-SPRT model.

5. Query prediction results

Query the results of the predicted test dataset:

SELECT rownum, K_TS, pred FROM (SELECT K_TS, prediction(IOTMSET_MODEL using *)
   over (ORDER BY K_TS) pred FROM tbl_iot)
   where pred < 1 --Exception
   order by 2, 1 ;

6. Other test suggestions

It is recommended to create a snapshot of awr at the beginning and end of the test:

  • exec DBMS_WORKLOAD_REPOSITORY.create_snapshot();

Get the AWR report after the test is completed:

  • @?/rdbms/admin/awrrpt
  • awrrpt_1_220_221.html

The test results are omitted.