Garbled text when converting docx to HTML with docx4j

Original post
2018/01/21 20:54

Because the conversion result needs further processing, the method returns it as a string. My first attempt used:

ByteArrayOutputStream baos = new ByteArrayOutputStream();
String result = baos.toString();

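The result was garbled, presumably because toString() with no arguments decodes the bytes using the platform default charset, which need not be UTF-8.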
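To see this kind of garbling in isolation, here is a minimal sketch that does not use docx4j at all. It assumes the converter writes UTF-8 bytes and uses ISO-8859-1 as a stand-in for a non-UTF-8 platform default; the class and method names (`DefaultCharsetDemo`, `utf8Stream`, `decode`) are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DefaultCharsetDemo {

    // Builds a stream holding the UTF-8 encoding of the given text,
    // standing in for what an HTML converter might write (an assumption).
    static ByteArrayOutputStream utf8Stream(String text) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        baos.write(bytes, 0, bytes.length);
        return baos;
    }

    // Decodes the stream's bytes with an explicitly chosen charset.
    static String decode(ByteArrayOutputStream baos, Charset charset) {
        return new String(baos.toByteArray(), charset);
    }

    public static void main(String[] args) {
        // "\u4e2d\u6587" is the two-character string for "Chinese".
        ByteArrayOutputStream baos = utf8Stream("<p>\u4e2d\u6587</p>");
        // The charset that a no-argument toString() would rely on.
        System.out.println("default charset: " + Charset.defaultCharset());
        // Wrong charset: each UTF-8 byte becomes its own character.
        System.out.println("wrong: " + decode(baos, StandardCharsets.ISO_8859_1));
        // Matching charset: the original text comes back.
        System.out.println("right: " + decode(baos, StandardCharsets.UTF_8));
    }
}
```

If the first line printed is not UTF-8 on your machine, a no-argument `toString()` on UTF-8 bytes would be expected to garble non-ASCII text in a similar way.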

I then switched to the following:

byte[] lens = baos.toByteArray();
String result = new String(lens,"utf-8");

That fixed the garbled output.
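As a side note, `ByteArrayOutputStream` also has a `toString(String charsetName)` overload, which should give the same result without the intermediate byte array. The sketch below compares the two variants on a sample string; the class and method names (`Utf8ToStringDemo`, `viaByteArray`, `viaToString`) are illustrative and not from the original code.

```java
import java.io.ByteArrayOutputStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

class Utf8ToStringDemo {

    // Variant from the post: copy the bytes out, then decode them as UTF-8.
    static String viaByteArray(ByteArrayOutputStream baos) {
        return new String(baos.toByteArray(), StandardCharsets.UTF_8);
    }

    // Alternative: let the stream decode its own buffer with a named charset.
    static String viaToString(ByteArrayOutputStream baos) {
        try {
            return baos.toString("UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] html = "<p>\u4e2d\u6587</p>".getBytes(StandardCharsets.UTF_8);
        baos.write(html, 0, html.length);
        // Both variants decode the same bytes with the same charset.
        System.out.println(viaByteArray(baos).equals(viaToString(baos))); // prints true
    }
}
```

Either variant names the charset explicitly, which is the point of the fix: the decoding no longer depends on the machine the code runs on.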

 

Noting this down for future reference. The full method:

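import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import org.docx4j.Docx4J;
import org.docx4j.Docx4jProperties;
import org.docx4j.convert.out.HTMLSettings;
import org.docx4j.openpackaging.contenttype.ContentType;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;

public static String getHtmlFromDocx(InputStream fileInputStream, String imageFilePath) throws Exception {
    WordprocessingMLPackage wordMLPackage = Docx4J.load(fileInputStream);
    // Kept from the original; whether this affects the HTML output encoding is unverified.
    ContentType contentType = new ContentType("charset=utf-8");
    wordMLPackage.setContentType(contentType);
    // Images are written to imageFilePath and referenced under "images" in the HTML.
    HTMLSettings htmlSettings = Docx4J.createHTMLSettings();
    htmlSettings.setImageDirPath(imageFilePath);
    htmlSettings.setImageTargetUri("images");
    htmlSettings.setWmlPackage(wordMLPackage);

    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    Docx4jProperties.setProperty("docx4j.Convert.Out.HTML.OutputMethodXML", true);
    Docx4J.toHTML(htmlSettings, baos, Docx4J.FLAG_EXPORT_PREFER_XSL);
    // Decode explicitly as UTF-8 instead of relying on the platform default charset.
    byte[] htmlBytes = baos.toByteArray();
    return new String(htmlBytes, StandardCharsets.UTF_8);
}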