
SAP HANA Quickly Export Import with Command Line

Published: 2025-04-17 20:43:42

Prologue

I spent nearly an entire afternoon today working out how to export data backups with the command-line tool that ships with SAP HANA.

SAP HANA (hereafter "Hana") is an in-memory database whose performance is second to none.

SAP HANA provides Hana Studio, a desktop tool meant to make operations and DBA work easier, and the official tutorials demonstrate everything with Hana Studio. But that approach has several drawbacks:

  • Hana runs on the server, while Hana Studio runs as a client on the operator's computer. Hana is typically used precisely when data volumes are large and queries are slow. In that situation, do you really want to park a complete, uncompressed backup on the ops computer?

  • Cross-host export and import are limited by bandwidth and switch performance; won't the transfer be painfully slow?

  • Ordinary office computers are not highly specced; their CPUs and network cards may not be fast, so slow exports and imports inevitably cost the operator a lot of time.

  • Hana Studio is built on Eclipse. It is essentially a Java program that consumes a lot of memory. What if Hana Studio runs out of memory mid-export or mid-import because the dataset is too large? The whole operation would have to start over!

  • Moreover, while the transfer is running, that computer is essentially unusable for anything else.

So is there a way to keep the load off the ops/DBA machine's network and disk, and speed things up severalfold at the same time?

  • Yes, my friends, there is.
  • Run the operation from the command line, using some of the database server's own CPU and disk to perform the backup.
  • Another advantage of the command line is that the data can be compressed locally on the server before transfer, greatly reducing disk usage and transfer pressure.

This article uses SAP HANA's command-line tool hdbsql. Since information on this is hard to find online (and via AI Q&A), it took some time to pull this together.

On the one hand, this fits my current DevOps role.

On the other hand, it frees me from being at the mercy of Hana Studio. Who wants to install that clunky thing? Its workflow is downright user-hostile!

I will include the relevant parts of the official documentation in this article for readers to consult, and at the end I will give the procedure I settled on.

Enough talk; let's get started.

Environment

The company currently deploys its test database from the official Hana Express Edition 2.00.20240701.1 image.

The commands should also apply to other 2.x releases (not yet tested on other major versions; readers can try for themselves).

In theory, the design of command line tools should be consistent across versions.

In this article, the SYSTEM user's password is assumed to be abcdefg, and the ordinary user GCREPORT_PRD's password is assumed to be 12345678.

The data will be migrated and imported into the ordinary user GCREPORT_TEST, whose password is assumed to be 87654321.

The database user and group IDs are 12000:79.

Official Documentation

Refer to the data import and export statements section of "SAP HANA Database – SQL Reference Manual".

EXPORT: the export command

Syntax:

EXPORT <object_name_list> AS <export_format> INTO <path> [WITH <export_option_list>]

Syntax elements:

WITH <export_option_list>: You can use the WITH clause to pass in the EXPORT option. 

 <object_name_list> ::= <OBJECT_NAME>,... | ALL

 <export_format> ::= BINARY | CSV

 <path> ::= 'FULL_PATH'

 <export_option_list> ::= <export_option> | <export_option_list> <export_option>

 <export_option> ::=

 REPLACE |

 CATALOG ONLY |

 NO DEPENDENCIES |

 SCRAMBLE [BY <password>] |

 THREADS <number_of_threads>

Description:

EXPORT

Exports tables, views, column views, synonyms, sequences, or stored procedures in the specified format, BINARY or CSV. Data in temporary tables and "no logging" tables cannot be exported with EXPORT.

OBJECT_NAME

The SQL name of the object to be exported. To export all objects in all schemas, use the ALL keyword. To export the objects of one specific schema, use the schema name plus an asterisk, e.g. "SYSTEM"."*".

BINARY

The table data is exported in the internal BINARY format. Exporting this way is orders of magnitude faster than CSV. Only column tables can be exported in binary format; row tables are always exported as CSV, even if BINARY is specified.

CSV

The table data is exported in CSV format. Data exported this way can be imported into other databases. Note that the row order of the exported data is not guaranteed. Both column and row tables can be exported as CSV.

FULL_PATH

The server-side path to export to.

Note: On a distributed system, FULL_PATH must point to a shared disk. For security reasons, the path must not contain symbolic links and must not point into the DB instance's folder, except for the 'backup' and 'work' subfolders. Examples of valid paths (assuming the database instance is located in /usr/sap/HDB/HDB00):

'/tmp' 
'/usr/sap/HDB/HDB00/backup' 
'/usr/sap/HDB/HDB00/work' 

REPLACE

With the REPLACE option, previously exported data is deleted and the latest export is saved in its place. Without REPLACE, an error is thrown if previously exported data exists in the specified directory.

CATALOG ONLY

With the CATALOG ONLY option, only the database catalog is exported, without any data.

NO DEPENDENCIES

With the NO DEPENDENCIES option, objects that the exported object depends on are not exported.

SCRAMBLE

When exporting in CSV format, SCRAMBLE [BY '<password>'] can be used to scramble sensitive customer data. If no password is given, the database uses the default scramble password. Only string data is scrambled. Once scrambled data is imported, it stays scrambled: end users cannot read it, and it cannot be restored to its original state.

THREADS

Represents the number of threads used for parallel export.

The number given after THREADS specifies how many objects are exported in parallel; the default is 1. Raising it may shorten the export, but it can also affect system performance.

The following should be considered:

  • For a single table, THREADS has no effect.

  • For views or stored procedures, use 2 or more threads (at most, the number of objects).

  • For an entire schema, consider using more than 10 threads (at most, the number of system cores).

  • For the thousands of tables of a whole BW/ERP system (ALL keyword), a large number of threads is reasonable (up to 256).

System and monitoring views:

You can monitor the progress of export using the system view M_EXPORT_BINARY_STATUS.

Using the session ID from that view, you can cancel an export session with the following statement.

ALTER SYSTEM CANCEL [WORK IN] SESSION 'sessionId'

The exported detailed results are stored in the local session temporary table #EXPORT_RESULT.
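For example, a long-running export can be watched from a second hdbsql session. This is only a sketch; the exact columns of the monitoring view vary by version, so SELECT * is used here:

```sql
-- watch export progress from another session
SELECT * FROM M_EXPORT_BINARY_STATUS;

-- per-object results, visible only in the session that ran the EXPORT
SELECT * FROM #EXPORT_RESULT;
```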

Example:

EXPORT "SCHEMA"."*" AS CSV INTO '/tmp' WITH REPLACE SCRAMBLE THREADS 10

IMPORT: the import command

Syntax:

IMPORT <object_name_list> [AS <import_format>] FROM <path> [WITH <import_option_list>]

Syntax elements:

WITH <import_option_list>:

 You can use the WITH clause to pass in the IMPORT option. 

 <object_name_list> ::= <object_name>,... | ALL

 <import_format> ::= BINARY | CSV

 <path> ::= 'FULL_PATH'

 <import_option_list> ::= <import_option> | <import_option_list> <import_option>

 <import_option> ::=

 REPLACE |

 CATALOG ONLY |

 NO DEPENDENCIES |

 THREADS <number_of_threads>

Description:

The IMPORT command imports tables, views, column views, synonyms, sequences, or stored procedures. Data in temporary tables and "no logging" tables cannot be imported with IMPORT.

OBJECT_NAME

The SQL name of the object to be imported. To import all objects in the path, use the ALL keyword. To import the objects of one specific schema, use the schema name plus an asterisk, e.g. "SYSTEM"."*".

BINARY | CSV

The import may ignore the specified format, because the format is detected automatically during import: data is imported in the same format it was exported in.

FULL_PATH

Import from this server path.

Note: On a distributed system, FULL_PATH must point to a shared disk. If the REPLACE option is not specified, an error is thrown when the objects to be imported already exist.

CATALOG ONLY

With the CATALOG ONLY option, only the database catalog is imported, without any data.

NO DEPENDENCIES

With the NO DEPENDENCIES option, objects that the imported object depends on are not imported.

THREADS

Represents the number of threads used for parallel imports.

The number given after THREADS specifies how many objects are imported in parallel; the default is 1. Raising it may shorten the import, but it can also affect system performance.

The following should be considered:

  • For a single table, THREADS has no effect.
  • For views or stored procedures, use 2 or more threads (at most, the number of objects).
  • For an entire schema, consider using more than 10 threads (at most, the number of system cores).
  • For the thousands of tables of a whole BW/ERP system (ALL keyword), a large number of threads is reasonable (up to 256).

System and monitoring views:

You can monitor the progress of imports using the system view M_IMPORT_BINARY_STATUS.

Using the session ID from that view, you can cancel an import session with the following statement.

ALTER SYSTEM CANCEL [WORK IN] SESSION 'sessionId'

The detailed results imported are stored in the local session temporary table #IMPORT_RESULT.
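As with export, a second session can watch progress and, if needed, cancel the job. A sketch follows; the session ID is a made-up placeholder that you would take from the monitoring view:

```sql
-- watch import progress from another session
SELECT * FROM M_IMPORT_BINARY_STATUS;

-- cancel a runaway import by its session ID ('412345' is a placeholder)
ALTER SYSTEM CANCEL WORK IN SESSION '412345';
```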


IMPORT FROM: import data from an external CSV file into an existing table

Syntax:

IMPORT FROM [<file_type>] <file_path> [INTO <table_name>] [WITH <import_from_option_list>]

Syntax elements:

WITH <import_from_option_list>:

 You can use the WITH clause to pass in the IMPORT FROM option. 

 <file_path> ::= '<character>...'

 <table_name> ::= [<schema_name>.]<identifier>

 <import_from_option_list> ::= <import_from_option> | <import_from_option_list> <import_from_option>

 <import_from_option> ::=

 THREADS <number_of_threads> |

 BATCH <number_of_records_of_each_commit> |

 TABLE LOCK |

 NO TYPE CHECK |

 SKIP FIRST <number_of_rows_to_skip> ROW |

 COLUMN LIST IN FIRST ROW |

 COLUMN LIST ( <column_name_list> ) |

 RECORD DELIMITED BY '<string_for_record_delimiter>' |

 FIELD DELIMITED BY '<string_for_field_delimiter>' |

 OPTIONALLY ENCLOSED BY '<character_for_optional_enclosure>' |

 DATE FORMAT '<string_for_date_format>' |

 TIME FORMAT '<string_for_time_format>' |

 TIMESTAMP FORMAT '<string_for_timestamp_format>'

Description:

The IMPORT FROM statement imports data from an external CSV file into an existing table.

THREADS: the number of threads used for parallel import. The default is 1 and the maximum is 256.

BATCH: the number of records inserted per commit.

THREADS and BATCH enable high-performance loading by parallelizing the load and committing many records at once. As a rule of thumb, for column tables, 10 parallel load threads with a commit size of 10,000 records is a good setting.
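The rule of thumb above can be sketched as follows; the file path and target table are hypothetical, made up for illustration:

```sql
-- parallel CSV load: 10 threads, commit every 10,000 records
IMPORT FROM CSV FILE '/tmp/sales.csv' INTO "MYSCHEMA"."SALES"
  WITH THREADS 10 BATCH 10000;
```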

TABLE LOCK: locks the table so data can be imported into column tables faster. NO TYPE CHECK: if specified, the type of each field is not checked when records are inserted.

SKIP FIRST <int> ROW: skips the first n records before inserting.

COLUMN LIST IN FIRST ROW: indicates that the first row of the CSV file contains the column list.

COLUMN LIST ( <column_name_list> ): specifies the list of columns to insert into.

RECORD DELIMITED BY '<string>': Represents the record separator in the CSV file.

FIELD DELIMITED BY '<string>': Represents the field separator in the CSV file.

OPTIONALLY ENCLOSED BY '<character>': specifies an optional enclosure character around field data.

DATE FORMAT '<string>': the date format string. If the CSV file contains date-type data, the specified format is used for date-type fields.

TIME FORMAT '<string>': the time format string. If the CSV file contains time-type data, the specified format is used for time-type fields.

TIMESTAMP FORMAT '<string>': the timestamp format string. If the CSV file contains timestamp-type data, the specified format is used for timestamp-type fields.

Example:

IMPORT FROM CSV FILE '/data/' INTO "MYSCHEMA"."MYTABLE" WITH RECORD DELIMITED BY '\n' FIELD DELIMITED BY ',';
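A fuller sketch combining several of the options above; the file name, table, and column names are hypothetical:

```sql
-- skip the header row, name the target columns, and parse dates as YYYY-MM-DD
IMPORT FROM CSV FILE '/data/orders.csv' INTO "MYSCHEMA"."ORDERS"
  WITH SKIP FIRST 1 ROW
       COLUMN LIST ("ORDER_ID", "ORDER_DATE", "AMOUNT")
       FIELD DELIMITED BY ','
       DATE FORMAT 'YYYY-MM-DD';
```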

Summary of the pitfalls

Pit 1: The SYSTEM user in the Hana image does not actually have backup permissions, only import permissions.

Pit 2: Although the SYSTEM user can grant some permissions, Hana does not allow a user to grant permissions to itself, so the export can only be performed by an ordinary user.

Pit 3: Unlike traditional database SQL, after logging in with hdbsql you can type both single and double quotes, but the two are not interchangeable!

The following commands are executed in the environment where the database server resides.

Export

Grant backup and export permissions to the user (only needed the first time); here that is the GCREPORT_PRD user.

hdbsql -i 90 -d HXE -u SYSTEM -p 'abcdefg'
GRANT BACKUP ADMIN,EXPORT TO GCREPORT_PRD;
\q
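To confirm the grant took effect, you can query the system view SYS.GRANTED_PRIVILEGES from an hdbsql session. This is a sketch; check your version's documentation for the exact column names:

```sql
-- list the privileges granted to GCREPORT_PRD and who granted them
SELECT PRIVILEGE, GRANTOR
  FROM SYS.GRANTED_PRIVILEGES
 WHERE GRANTEE = 'GCREPORT_PRD';
```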

Log in as the GCREPORT_PRD user and run the export with 4 threads.

hdbsql -i 90 -d HXE -u GCREPORT_PRD -p '12345678'
EXPORT GCREPORT_PRD."*" AS BINARY INTO '/hana/mounts/GCREPORT_PRD' WITH REPLACE THREADS 4;
\q

Note: the "*" must be wrapped in double quotes, and the schema name before it may also be wrapped in double quotes. Single quotes will not work!

Compression, migration and decompression

cd /hana/mounts
tar zcf GCREPORT_PRD. GCREPORT_PRD

Assume GCREPORT_PRD. has been moved to the other server; decompress it:

cd /hana/mounts
tar zxf GCREPORT_PRD.

Fix file ownership (for reference only)

chown 12000:79 -R GCREPORT_PRD

Import

The import can simply be done as the SYSTEM user. The schema is switched here: the data is imported from GCREPORT_PRD into GCREPORT_TEST.

hdbsql -i 90 -d HXE -u SYSTEM -p 'abcdefg'
IMPORT ALL AS BINARY FROM '/hana/mounts/GCREPORT_PRD' WITH REPLACE RENAME SCHEMA GCREPORT_PRD TO GCREPORT_TEST THREADS 4;
\q
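As a sanity check after the import, you can count the tables that landed in the new schema via the SYS.TABLES system view (a sketch; compare the count against the source schema):

```sql
-- how many tables ended up in the renamed schema?
SELECT COUNT(*) FROM SYS.TABLES WHERE SCHEMA_NAME = 'GCREPORT_TEST';
```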

That's all for today. If this article helped you, give it a like and a follow!

I'm Hellxz, see you next time!