
Java EasyExcel export reported memory overflow how to solve the problem

Published: 2024-10-28 15:13:52

Hello everyone, I am V brother. Exporting large amounts of data with EasyExcel can easily cause a memory overflow, especially when exporting millions of rows. Have you ever run into this? Below are some common solutions I have put together to share with you; discussion is welcome.

EasyExcel large data volume export common methods

1. Batch writing

  • EasyExcel supports writing data in batches: load one page of data at a time and write it out, instead of loading the entire data set into memory at once.
  • Sample code:
     String fileName = "large_data.xlsx";
     ExcelWriter excelWriter = EasyExcel.write(fileName, Data.class).build();
     WriteSheet writeSheet = EasyExcel.writerSheet("Sheet1").build();

     // Assume 10,000 records are written per batch
     int batchSize = 10000;
     List<Data> dataList;
     int pageIndex = 0;
     do {
         // Fetch one page of data
         dataList = getDataByPage(pageIndex++, batchSize);
         excelWriter.write(dataList, writeSheet);
     } while (dataList.size() == batchSize);

     // Close the writer to flush remaining data and release resources
     excelWriter.finish();
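The loop above assumes a paged data source. A minimal, self-contained sketch of what `getDataByPage` could look like (here backed by a simulated in-memory store; a real implementation would run a LIMIT/OFFSET query against the database, and the `Data` record here is a stand-in for the exported row class):

```java
import java.util.ArrayList;
import java.util.List;

public class PagedSource {
    // Hypothetical row type standing in for the real export class
    record Data(int id, String name) {}

    // Simulated total row count; in practice the data lives in a database
    private static final int TOTAL = 25_000;

    // Returns one page of data; an empty or short page signals the last batch
    static List<Data> getDataByPage(int pageIndex, int batchSize) {
        List<Data> page = new ArrayList<>(batchSize);
        int start = pageIndex * batchSize;
        for (int i = start; i < Math.min(start + batchSize, TOTAL); i++) {
            page.add(new Data(i, "Name" + i));
        }
        return page;
    }

    public static void main(String[] args) {
        int batchSize = 10_000;
        int pageIndex = 0;
        int written = 0;
        List<Data> dataList;
        do {
            dataList = getDataByPage(pageIndex++, batchSize);
            written += dataList.size(); // stand-in for excelWriter.write(dataList, writeSheet)
        } while (dataList.size() == batchSize);
        System.out.println("Exported " + written + " rows");
    }
}
```

The key property is that only one page (`batchSize` rows) is ever alive in the heap at a time; the `while` condition stops the loop as soon as a short page comes back.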

2. Set the appropriate JVM memory

  • For big-data export scenarios, you can try increasing the JVM's memory allocation, for example:
     java -Xms512M -Xmx4G -jar your-app.jar
  • Explanation:
    • -Xms512M: sets the initial heap size to 512 MB.
    • -Xmx4G: sets the maximum heap size to 4 GB.
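To confirm the flags actually took effect, you can ask the running JVM for its heap limits, e.g. with a small check like this:

```java
public class HeapInfo {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // Maximum heap the JVM may grow to (governed by -Xmx), in MB
        long maxMb = rt.maxMemory() / (1024 * 1024);
        // Heap currently reserved from the OS (starts near -Xms), in MB
        long totalMb = rt.totalMemory() / (1024 * 1024);
        System.out.println("max heap: " + maxMb + " MB, current heap: " + totalMb + " MB");
    }
}
```

Running it under `java -Xms512M -Xmx4G HeapInfo` should report a max heap close to 4096 MB (the exact figure varies slightly by JVM).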

3. Reducing the complexity of data objects

  • When exporting, keep the data object as simple as possible: avoid unnecessary nesting and redundant fields so that each row occupies less memory.
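As an illustration, instead of exporting a rich domain object with nested associations, map it to a flat row class that carries only the columns being exported. The `Order`/`Customer`/`OrderRow` names below are made up for the example:

```java
import java.math.BigDecimal;

public class FlatDto {
    // Rich domain object: nested association plus fields the export does not need
    static class Customer {
        final String name;
        final String address; // not exported
        Customer(String name, String address) { this.name = name; this.address = address; }
    }
    static class Order {
        final long id;
        final Customer customer;
        final BigDecimal total;
        final String internalNotes; // not exported
        Order(long id, Customer customer, BigDecimal total, String internalNotes) {
            this.id = id; this.customer = customer; this.total = total; this.internalNotes = internalNotes;
        }
    }

    // Flat export row: only the columns the sheet needs, no nesting
    static class OrderRow {
        final long id;
        final String customerName;
        final BigDecimal total;
        OrderRow(Order o) {
            this.id = o.id;
            this.customerName = o.customer.name; // flatten the nested Customer
            this.total = o.total;                // address and internalNotes are dropped
        }
    }

    public static void main(String[] args) {
        Order order = new Order(1L, new Customer("Alice", "1 Main St"), new BigDecimal("9.99"), "vip");
        OrderRow row = new OrderRow(order);
        System.out.println(row.id + "," + row.customerName + "," + row.total);
    }
}
```

A million instances of `OrderRow` hold three small fields each, whereas a million `Order` graphs would drag the full `Customer` objects and unused strings into the heap with them.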

4. Disable automatic column width setting

  • EasyExcel's automatic column-width strategy measures cell contents and can consume a lot of memory on large exports. Using a fixed column width instead saves memory.
  • Sample code:
     EasyExcel.write(fileName, Data.class)
             .registerWriteHandler(new SimpleColumnWidthStyleStrategy(20)) // fixed width instead of auto-sizing
             .sheet("Sheet1")
             .doWrite(dataList);

5. Export using Stream (suitable for big data)

  • Write data in batches through an OutputStream to reduce memory consumption; wrapping it in a BufferedOutputStream further improves performance.
  • Sample code:
     try (OutputStream out = new BufferedOutputStream(new FileOutputStream(fileName))) {
         ExcelWriter excelWriter = EasyExcel.write(out, Data.class).build();
         WriteSheet writeSheet = EasyExcel.writerSheet("Sheet1").build();
         int batchSize = 10000;
         int pageIndex = 0;
         List<Data> dataList;
         do {
             dataList = getDataByPage(pageIndex++, batchSize);
             excelWriter.write(dataList, writeSheet);
         } while (dataList.size() == batchSize);
         excelWriter.finish();
     } catch (IOException e) {
         e.printStackTrace();
     }

6. Selection of an appropriate data export tool

  • If the data volume is very large, consider switching to a higher-performance export tool such as Apache POI's SXSSFWorkbook, which is suited to million-row exports but is more complex to configure and use.

Here's the kicker: so how do you use POI's SXSSFWorkbook to export a million rows?

Exporting a million rows with Apache POI's SXSSFWorkbook

Apache POI's SXSSFWorkbook can handle very large Excel exports because it writes in streaming fashion: instead of keeping all rows in memory, it caches them in temporary files, which significantly reduces memory consumption and makes it suitable for million-row exports. Let's look at a full implementation example below.

The code is as follows

import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.streaming.SXSSFSheet;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;

import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class LargeDataExportExample {

    public static void main(String[] args) {
        // File Output Path
        String filePath = "vg_large_data_export.xlsx";
        
        // Export millions of data
        exportLargeData(filePath);
    }

    private static void exportLargeData(String filePath) {
        // Batch size per write
        final int batchSize = 10000;
        // Total number of data entries
        final int totalRows = 1_000_000;

        // Create the SXSSFWorkbook; only the last 100 rows are kept in memory,
        // older rows are flushed to a temporary file
        SXSSFWorkbook workbook = new SXSSFWorkbook(100);
        workbook.setCompressTempFiles(true); // Enable temporary file compression

        // Create the sheet
        Sheet sheet = workbook.createSheet("Large Data");

        // Create the header row
        Row headerRow = sheet.createRow(0);
        String[] headers = {"ID", "Name", "Age"};
        for (int i = 0; i < headers.length; i++) {
            Cell cell = headerRow.createCell(i);
            cell.setCellValue(headers[i]);
        }

        int rowNum = 1; // Row index where the data starts

        try {
            // Write data batch by batch
            for (int i = 0; i < totalRows / batchSize; i++) {
                // Simulate fetching one batch of data
                List<Data> dataList = getDataBatch(rowNum, batchSize);

                // Write the batch into the Excel sheet
                for (Data data : dataList) {
                    Row row = sheet.createRow(rowNum++);
                    row.createCell(0).setCellValue(data.getId());
                    row.createCell(1).setCellValue(data.getName());
                    row.createCell(2).setCellValue(data.getAge());
                }

                // After each batch, flush the written rows to disk to prevent memory overflow
                ((SXSSFSheet) sheet).flushRows(batchSize); // Clear the cache of rows already written
            }

            // Write the workbook to the output file
            try (FileOutputStream fos = new FileOutputStream(filePath)) {
                workbook.write(fos);
            }
            System.out.println("Data export completed: " + filePath);

        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            // Dispose of the workbook and delete its temporary files
            workbook.dispose();
        }
    }
    }

    /**
     * Simulate paging to get data
     */
    private static List<Data> getDataBatch(int startId, int batchSize) {
        List<Data> dataList = new ArrayList<>(batchSize);
        for (int i = 0; i < batchSize; i++) {
            dataList.add(new Data(startId + i, "Name" + (startId + i), 20 + (startId + i) % 50));
        }
        return dataList;
    }

    // Data row type
    static class Data {
        private final int id;
        private final String name;
        private final int age;

        public Data(int id, String name, int age) {
            this.id = id;
            this.name = name;
            this.age = age;
        }

        public int getId() {
            return id;
        }

        public String getName() {
            return name;
        }

        public int getAge() {
            return age;
        }
    }
}

Let's walk through the code.

  1. SXSSFWorkbook: new SXSSFWorkbook(100) keeps at most 100 rows in memory; anything beyond that is written to a temporary file to save memory.
  2. Batch writing: batchSize controls how many records are written per batch to minimize memory consumption; totalRows set to 1,000,000 means one million records are exported.
  3. Simulated data generation: the getDataBatch method simulates paged data access, returning one batch per call.
  4. Flushing cached rows: after each batch is written, flushRows(batchSize) clears the cached rows from memory to keep memory usage under control.
  5. Temporary file compression: setCompressTempFiles(true) compresses the temporary files to further reduce disk space usage.

Things to watch out for

  • Temporary files: SXSSFWorkbook writes temporary files to the system temp directory; make sure there is enough disk space.
  • Resource release: call workbook.dispose() after the export completes to clean up the temporary files.
  • Performance tuning: adjust batchSize and the SXSSFWorkbook row cache size to the machine's memory, to avoid both frequent flushing and memory overflow.
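Since SXSSFWorkbook spills rows to the system temp directory, it can be worth checking free space there before starting a large export. A simple sketch (the 1 GB threshold is an arbitrary example, not a POI requirement):

```java
import java.io.File;

public class TempSpaceCheck {
    public static void main(String[] args) {
        // SXSSFWorkbook places its temp files under java.io.tmpdir by default
        File tmpDir = new File(System.getProperty("java.io.tmpdir"));
        long freeMb = tmpDir.getUsableSpace() / (1024 * 1024);

        // Arbitrary safety threshold for a million-row export
        if (freeMb < 1024) {
            System.err.println("Warning: only " + freeMb + " MB free in " + tmpDir);
        } else {
            System.out.println("Temp dir " + tmpDir + " has " + freeMb + " MB free");
        }
    }
}
```

Running such a check up front turns a mid-export "no space left on device" failure into a clear warning before any rows are written.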