Happy Moment
I remember when I was a kid, my family lost money and I took it, but my parents insisted that my brother took it.
Dad beat my brother all over his body, and he squinted at me in anger
I didn't dare to look directly at my brother, my eyes turned to my dad and said, "Dad, look at him, it seems like he's still not convinced
Description of the problem
The project is based onPOI 4.1.2
generatingExcel 2007
File, has docked a lot of customers, but also stable operation for several years; just two days ago, docking a new customer, the generated Excel 2007 file imported into their system failed, prompting:
-700006004 The column was not found in the current Excel sheet column name.
To tell you the truth, this tip didn't work for me for a dime, so I had to ask the developers of their system; after half a day of troubleshooting, their developers came to the conclusion:
Your Excel 2007 files look like they are old and don't meet the new version standards
This answer makes me even more confused, touching my knowledge blind spot, do not until how to answer the words
Excel 2007 file there are standard and non-standard? Let's not dwell on this issue, in line with the principle of prioritizing the problem, try to try to upgrade the version of POI!
Why the first thing that comes to mind is to upgrade the POI version? Because it is the Excel 2007 file generated by POI (seems to be the same as not saying that).
Upgrade the POI version to5.3.0
The code is not adjusted in any way, the file is regenerated and sent to the customer, and the customer verifies that it can be imported normally; do you think that the matter has come to an end, and upgrading the version of the POI is fine, I can only say that you have a new love and forget about the old one, what about the customers who have already been docked? You dare to guarantee that after upgrading the POI generated Excel 2007 (2003 will also follow the impact) can still be imported into the system of these customers, so our ambitions can not be a little more ambitious: new love and old love to be!
It has no impact on existing customers, but also meets the requirements of new customers, which leads to the title of this article
How to generate Excel 2007 files compliant with the new version without upgrading the POI version
It's a zip.
through (a gap)Excel 2007
Beginning with Microsoft's adoption of a new file format called OpenXML
file format, which improves file and data management, data recovery, and interactivity; and Excel 2007 is a compressed package containing XML, images, and other files; let's focus only on XML for the moment, based on thePOI 4.1.2
<dependency>
<groupId></groupId>
<artifactId>poi</artifactId>
<version>4.1.2</version>
</dependency>
<dependency>
<groupId></groupId>
<artifactId>poi-ooxml</artifactId>
<version>4.1.2</version>
</dependency>
String filePath = "D:/POI_4_1_2.xlsx";
public void createExcel(String filePath) throws Exception {
try(SXSSFWorkbook wb = new SXSSFWorkbook();
OutputStream os = ((filePath))) {
SXSSFSheet sheetA = ("a");
SXSSFSheet sheetB = ("b");
SXSSFRow sheetA_row1 = (0);
sheetA_row1.createCell(0).setCellValue("hello world");
sheetA_row1.createCell(1).setCellValue("666");
SXSSFRow sheetA_row2 = (1);
sheetA_row2.createCell(0).setCellValue("888");
sheetA_row2.createCell(1).setCellValue("999");
SXSSFRow sheetB_row1 = (0);
sheetB_row1.createCell(0).setCellValue("qsl");
sheetB_row1.createCell(1).setCellValue("greenstone road");
(os);
();
}
}
Generate an old Excel 2007 file:POI_4_1_2.xlsx
The direct use of the7z
Perform the extraction (you can also just rename POI_4_1_2.xlsx to POI_4_1_2.zip and extract it)
The directory structure after decompression is as follows
All the files areXML
The POI will be upgraded to5.3.0
<dependency>
<groupId></groupId>
<artifactId>poi</artifactId>
<version>5.3.0</version>
</dependency>
<dependency>
<groupId></groupId>
<artifactId>poi-ooxml</artifactId>
<version>5.3.0</version>
</dependency>
String filePath = "D:/POI_5_3_0.xlsx";
public void createExcel(String filePath) throws Exception {
try(SXSSFWorkbook wb = new SXSSFWorkbook();
OutputStream os = ((filePath))) {
SXSSFSheet sheetA = ("a");
SXSSFSheet sheetB = ("b");
SXSSFRow sheetA_row1 = (0);
sheetA_row1.createCell(0).setCellValue("hello world");
sheetA_row1.createCell(1).setCellValue("666");
SXSSFRow sheetA_row2 = (1);
sheetA_row2.createCell(0).setCellValue("888");
sheetA_row2.createCell(1).setCellValue("999");
SXSSFRow sheetB_row1 = (0);
sheetB_row1.createCell(0).setCellValue("qsl");
sheetB_row1.createCell(1).setCellValue("greenstone road");
(os);
();
}
}
decompression (in digital technology)POI_5_3_0.xlsx
The directory structure of POI_4_1_2.xlsx is the same as that of POI_4_1_2.xlsx, with the same file names and number of files.
with respect to
The Excel 2007 file is a compressed package!
I'm sure you have no questions; let's compare the two catalogs
Although the discrepancy file is quite large, it can be grouped into two categories
-
standalone Difference
_rels\.rels docProps\ xl\_rels\ [Content_Types].xml
The differences between these four files are the same (all four files are on one line, I switched the same to the second line to highlight the differences)
The standalone value in the POI 4.1.2 generated xml is
no
and the standalone value in the xml generated by POI 5.3.0 isyes
There's just one difference.There is another discrepancy in the
It is normal for creation times to be different, and this difference can be ignored
-
dimension Difference
xl\worksheets
directory holds the sheet-related xml, but with the names sheet1 ~ sheetn, instead of the names specified in thea
、b
If there are as many sheets as there are xml files, we only need to look at the differences in one of the xml files, and the others are similar.There is one difference: in the sheet generated by POI 4.1.2, it is
<dimension ref="A1"/>
and in POI 5.3.0 it is<dimension ref="A1:B2"/>
So it seems that Excel 2007 files do have a standard and non-standard distinction
Back to the question.
How to generate Excel 2007 files compliant with the new version without upgrading the POI version
How would you handle it?
To ensure that it does not affect the already docked customers (the subtext is: neither replace the POI, nor upgrade the POI) at the same time, but also to be able to generate a standard version of the Excel 2007 file to meet the new customers, I feel that there is nothing to do ah, only to increase the configuration of the item:Whether to generate standard Excel 2007
, the default value is:clogged
, indicates that a non-standard Excel 2007 file is generated to ensure that customers who have been docked are not affected. configuration item value if yes:be
If you have a standard Excel 2007 file, then you can generate a standard Excel 2007 file; so here's the problem again!
How are standard Excel 2007 files generated?
Generated by the POI is certainly not, because it can not upgrade its version, the generation of non-standard Excel 2007 files, then how do we do, we can change a component Well, all roads lead to Rome, generate Excel 2007 components are certainly not the only POI, a different component to generate the standard Excel 2007 files on it!
Other components
aliEasyExcel
I'm sure you all know this, so use it to generate a standard Excel 2007 file, introduce the dependency
<dependency>
<groupId></groupId>
<artifactId>easyexcel</artifactId>
<version>4.0.2</version>
</dependency>
Let's look at its dependency tree
You should be able to read the boxed part; EasyExcel relies on the POI, but because POI 4.1.2 has a higher priority than 5.2.5, which EasyExcel relies on, it ends up relying on POI 4.1.2
For maven prioritization check out:maven pass dependencies and prioritization with examples, difficult to top also have to be on the yah!
Are you guys confused at this point?
Apparently it doesn't work with EasyExcel; I also tried thejxl
, found that it didn't work either (the directory structure was completely different after unzipping), and didn't bother trying the other components because I came up with a solution that felt feasible
repackaging
Remember the previous directory comparison, there are two types of diff files, the standalone diff is fixed at 4 files
_rels\.rels
docProps\
xl\_rels\
[Content_Types].xml
dimension Differences fixed as a category of documents
xl\worksheets\sheet*.xml
Except for these discrepancy files, all the other files are consistent, so can't we handle it like this
The Excel 2007 file is still generated based on POI 4.1.2, if the configuration item:
Whether to generate standard Excel 2007
Not configured or configured withclogged
, then file generation ends (with established logic) if the configuration item is configured:be
The following is done to the generated Excel 2007
- Unzip the generated Excel 2007 file
- Make changes to the variance file to change the corresponding variance items to standard values
- Repackage into an Excel 2007 file and replace the old Excel 2007 file with the previous one.
Does this fulfill the need? If we have a solution, let's try it.
-
decompression (in digital technology)
Just use the POI dependencies
commons-compress
Just unzip it/** * treat (sb a certain way) Excel 2007 Unzip the file * @param sourceFile rootExcel 2007file * @param unzipDir decompression directory * @throws IOException decompression anomaly * @author greenstone road */ private void unzip(File sourceFile, String unzipDir) throws IOException { try (ZipFile zipFile = new ZipFile(sourceFile)) { // (math.) ergodic ZIP file中的每个条目 Enumeration<ZipArchiveEntry> entries = (); while(()) { ZipArchiveEntry entry = (); // 创建输出file的路径 Path outputPath = (unzipDir, ()); if (!(())) { // Make sure the parent directory exists (()); } try (InputStream inputStream = (entry); FileOutputStream outputStream = new FileOutputStream(())) { (inputStream, outputStream); } } } }
-
modifications
standalone value modification
/** * modificationsxml (used form a nominal expression) standalone attribute value * @param filePath embody standalone 属性(used form a nominal expression)xmlfile * @throws IOException IOexceptions * @author greenstone road */ private void updateXmlStandalone(Path filePath) throws IOException { Path bakPath = (().toString(), () + "_bak"); try (BufferedReader reader = (filePath)) { String line = (); String replace = ("standalone=\"no\"", "standalone=\"yes\""); (bakPath, (StandardCharsets.UTF_8)); } (filePath); (bakPath, filePath); }
dimension modification, first we need to figure out
ref
Meaning of value// POI 4.1.2 // POI 5.3.0 In POI 4.1.2, the value of ref only represents the start coordinate, A represents the X coordinate value, and 1 represents the Y coordinate value, whereas in POI 5.3.0, the value of ref not only has the start coordinate, but also includes the end coordinate.
A1
denotes the starting coordinate, theB2
denotes the ending coordinate, where 2 denotes the number of data rows/** * modificationsxml (used form a nominal expression) dimension ref attribute value * @param sheetDir sheet xmlCatalog * @throws IOException IOexceptions * @author greenstone road */ private void updateSheetXmlDimension(Path sheetDir) throws IOException { // modifications第二行中(used form a nominal expression) <dimension ref="A1"/> try (Stream<Path> filePaths = (sheetDir)) { (filePath -> { // Get the number of columns and rows first,rows:Number of data lines,totalRows:Total lines of content AtomicInteger columns = new AtomicInteger(0); AtomicInteger rows = new AtomicInteger(0); try (Stream<String> lines = (filePath)) { (line -> { if (("</row>")) { (); } if (() == 1 && ("</row>")) { (("</c>").length - 1); } }); } catch (IOException e) { throw new RuntimeException(e); } // Excel coordinate of a column A ~ Z,AA ~ ZZ,... int circleTimes = () % 26 == 0 ? (() / 26 - 1) : (() / 26); StringBuilder sb = new StringBuilder(); for (int i = 0; i < circleTimes; i++) { ("A"); } ((char) ('A' + (() % 26 == 0 ? 25 : (() % 26 - 1)))); // <dimension ref="A1:B2"/> String objStr = "<dimension ref=\"A1:" + sb + (); try { Path bakPath = (().toString(), () + "_bak"); (bakPath); try (Stream<String> lines = (filePath)) { (line -> { try { if (("<dimension ref=\"A1")) { line = ("<dimension ref=\"A1", objStr); } if (!("</worksheet>")) { line = line + "\n"; } (bakPath, (StandardCharsets.UTF_8), ); } catch (IOException e) { throw new RuntimeException(e); } }); } (filePath); (bakPath, filePath); } catch (IOException e) { throw new RuntimeException(e); } }); }; }
This code is slightly more complex, but can be summarized in the following steps
-
Iterate through the contents of the sheet xml file to get the number of columns and rows
-
Going by the number of columns to derive the maximum column coordinate (B), and then the number of rows (2) to get the ending coordinate (B2), then the value of ref is also:
A1:B2
There is a small pitfall here, when there is only one row and one column of data, the value of the ref in the new version is the same as the value of the ref in the old version, both are
A1
But the above code getsA1:A1
, so it still needs to be adjusted for compatibility, I'll leave it up to you guys as to how to do that, I'm just hinting here to watch out for the pit!!!! -
Make a copy of the sheet xml data and use the
<dimension ref=\"A1:B2
replace<dimension ref=\"A1
The last thing you need to do is to replace the old sheet xml file with the new one.
-
-
ask for a doggy bag (at a restaurant)
Once the xml files that need to be changed have been modified, re-package them, and continue using the
commons-compress
/** * repackage xlsx * @param basePath Unzip the root directory([Content_Types].xmlCatalog) * @param oriFile rootExcel 2007file * @throws IOException * @author greenstone road */ private void repackage(String basePath, File oriFile) throws IOException { File newFile = new File(basePath + ".xlsx"); try (FileOutputStream fos = new FileOutputStream(newFile); ZipArchiveOutputStream zaos = new ZipArchiveOutputStream(fos)) { // 获取rootfile夹下的所有file和子file夹 File srcDir = new File(basePath); for (File f : (())) { addToZip(f, "", zaos); } } // 用新file覆盖原file Path oriPath = (); (oriPath); ((), oriPath); } private void addToZip(File file, String parentFolder, ZipArchiveOutputStream zaos) throws IOException { if (()) { // If it's a catalog,则遍历其中的file并递归调用 addToZip for (File childFile : (())) { addToZip(childFile, parentFolder + () + "/", zaos); } } else { // 如果是file,Then add it to the ZIP file中 try (FileInputStream fis = new FileInputStream(file)) { // Create a directory without the first level ZipArchiveEntry String entryName = parentFolder + (); if (("/")) { entryName = (1); } ZipArchiveEntry entry = new ZipArchiveEntry(entryName); (entry); (fis, zaos); (); } } }
It's nothing complicated. I'm sure you'll all understand.
-
establish ties or contact
String the above 3 steps together
/** * repackagingExcel2007file * @param ifExcel2007New Whether to repackage * @param xlsxFile xlsx源file * @throws IOException * @author greenstone road */ private void repackageExcel2007(boolean ifExcel2007New, File xlsxFile) throws IOException { if (!ifExcel2007New) { return; } Path unzipDir = (""); try { String basePath = ((), ().substring(0, ().lastIndexOf("."))).toString(); // decompression (in digital technology)xlsx unzip(xlsxFile, basePath); // modificationsxml updateXmlStandalone((basePath, "_rels", ".rels")); updateXmlStandalone((basePath, "docProps", "")); updateXmlStandalone((basePath, "xl", "_rels", "")); updateXmlStandalone((basePath, "[Content_Types].xml")); updateSheetXmlDimension((basePath, "xl", "worksheets")); // wrap upxlsx repackage(basePath, xlsxFile); } finally { // 删除临时file夹 try (Stream<Path> walk = (unzipDir)) { (()) .map(Path::toFile) .forEach(File::delete); } } }
At this point, the job is done! I have tried, after repackaging Excel 2007 files, with Windows Excel tools can be opened normally, WPS can also be opened normally, for new customers to test, can also be imported normally, simply perfect!
summarize
-
Excel 2007 files are compressed packages of xml, images and other files.
-
When introducing new functionality, it must not interfere with existing functionality.
I told you not to move. You had to make adjustments. You had a production accident.
-
You can modify the metadata of Excel 2007 files by unzipping, modifying, or packaging them
-
Unzip and pack with
commons-compress
, using something else might be a shock!
-