Converting a PDF file to images with PDFBox

Posted by koaltpgm on 2021-07-05 in Java

Can someone give me an example of how to use Apache PDFBox to convert a PDF file into separate images (one image for each page of the PDF)?

pdfbox

Source: https://stackoverflow.com/questions/63739989/how-to-take-a-screenshot-of-pdf-content-using-selenium-and-java

5 answers

unftdfkk1#

Solution for 1.8.* versions:

PDDocument document = PDDocument.loadNonSeq(new File(pdfFilename), null);
List<PDPage> pdPages = document.getDocumentCatalog().getAllPages();
int page = 0;
for (PDPage pdPage : pdPages)
{ 
    ++page;
    BufferedImage bim = pdPage.convertToImage(BufferedImage.TYPE_INT_RGB, 300);
    ImageIOUtil.writeImage(bim, pdfFilename + "-" + page + ".png", 300);
}
document.close();

Don't forget to read the 1.8 dependencies page before doing your build.
Solution for the 2.0 version:

PDDocument document = PDDocument.load(new File(pdfFilename));
PDFRenderer pdfRenderer = new PDFRenderer(document);
for (int page = 0; page < document.getNumberOfPages(); ++page)
{ 
    BufferedImage bim = pdfRenderer.renderImageWithDPI(page, 300, ImageType.RGB);

    // suffix in filename will be used as the file format
    ImageIOUtil.writeImage(bim, pdfFilename + "-" + (page+1) + ".png", 300);
}
document.close();

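The ImageIOUtil class is in a separate download/artifact (pdfbox-tools). Read the 2.0 dependencies page before doing your build: you will need extra jar files for PDFs with JBIG2 images, for saving to TIFF images, and for reading encrypted files.
Make sure to use the latest release of whatever JDK version you are on. For example, if you are using JDK 8, don't use 1.8.0_5; use 1.8.0_191 or whatever is the latest at the time you are reading this. Early releases were very slow.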
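
The 300 passed to the render and write calls above is a resolution in dots per inch. As a rough guide to how large the resulting images get, here is a minimal sketch of the arithmetic, assuming the default PDF unit of 72 points per inch and an A4-like page of 595 x 842 points. The class and method names are made up for illustration, and the rounding PDFBox itself applies may differ slightly.

```java
// Sketch: estimate the pixel size of a rendered page for a given DPI.
// RenderSizeSketch and toPixels are illustrative names, not part of PDFBox.
class RenderSizeSketch {
    // PDF user space uses 72 units ("points") per inch by default.
    static final float POINTS_PER_INCH = 72f;

    /** Converts a length in points to pixels at the given DPI, rounded down. */
    static int toPixels(float points, float dpi) {
        return (int) Math.floor(points * dpi / POINTS_PER_INCH);
    }

    public static void main(String[] args) {
        // Roughly A4; assumed here purely for illustration.
        float widthPt = 595f;
        float heightPt = 842f;
        for (float dpi : new float[] {72f, 150f, 300f}) {
            System.out.println(dpi + " DPI -> "
                    + toPixels(widthPt, dpi) + " x " + toPixels(heightPt, dpi) + " px");
        }
    }
}
```

At 300 DPI this page comes out at about 2479 x 3508 pixels, roughly 8.7 million pixels per page, so a lower DPI may be worth considering when the images are only needed as previews.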

Posted 2021-07-05

9fkzdhlc2#

Below is part of my code for converting a PDF from a multipart file into a JPG thumbnail. I save the image as a Base64 string. It uses PDFBox 2.0.21.

private static String generatePdfThumbnail(byte[] pdfBytes) throws IOException {
    // try-with-resources closes the document even if rendering fails
    try (PDDocument document = PDDocument.load(pdfBytes)) {
        PDFRenderer renderer = new PDFRenderer(document);
        // Render only the first page (index 0) at the default scale
        BufferedImage bufferedImage = renderer.renderImage(0);

        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        // ImageIO.write returns false when no writer for the format is found
        if (!ImageIO.write(bufferedImage, "jpg", baos)) {
            return "";
        }
        return Base64.getEncoder().encodeToString(baos.toByteArray());
    }
}
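
To check the Base64 step on its own, without PDFBox on the classpath, the following sketch encodes a blank image as JPEG and decodes it again. It uses only JDK classes; the blank 120 x 160 image stands in for a rendered page, and the class and method names are hypothetical.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Base64;
import javax.imageio.ImageIO;

// Sketch: JPEG + Base64 round trip using only the JDK (names are illustrative).
class ThumbnailBase64Sketch {
    /** Encodes an image as JPEG and returns it as Base64, or "" if no JPEG writer is found. */
    static String toBase64Jpeg(BufferedImage image) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        if (!ImageIO.write(image, "jpg", baos)) {
            return "";
        }
        return Base64.getEncoder().encodeToString(baos.toByteArray());
    }

    /** Decodes a Base64 string produced by toBase64Jpeg back into an image. */
    static BufferedImage fromBase64(String base64) throws IOException {
        byte[] bytes = Base64.getDecoder().decode(base64);
        return ImageIO.read(new ByteArrayInputStream(bytes));
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a rendered page: a blank RGB image.
        BufferedImage page = new BufferedImage(120, 160, BufferedImage.TYPE_INT_RGB);
        String encoded = toBase64Jpeg(page);
        BufferedImage decoded = fromBase64(encoded);
        System.out.println("Base64 length: " + encoded.length());
        System.out.println("Decoded size: " + decoded.getWidth() + " x " + decoded.getHeight());
    }
}
```

JPEG is lossy, so the decoded pixels are not guaranteed to match the original exactly; only the dimensions are expected to survive the round trip unchanged.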

Posted 2021-07-05

inb24sb23#

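Without any extra dependencies, you can simply use the PDFToImage class that is already included in PDFBox. (In 2.0.x it is in the org.apache.pdfbox.tools package, as the Javadoc link below shows.)
Kotlin:

PDFToImage.main(arrayOf<String>("-outputPrefix", "newImgFilenamePrefix", existingPdfFilename))

Other configuration options: https://pdfbox.apache.org/docs/2.0.8/javadocs/org/apache/pdfbox/tools/pdftoimage.html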

Posted 2021-07-05

xtfmy6hx4#

public class PDFtoJPGConverter {

    public List<File> convertPdfToImage(File file, String destination) throws Exception {

    File destinationFile = new File(destination);

    // Create the destination folder (including parents) if it is missing,
    // and report failure based on the result of mkdirs()
    if (destinationFile.exists()) {
        System.out.println("DESTINATION FOLDER ALREADY EXISTS");
    } else if (destinationFile.mkdirs()) {
        System.out.println("DESTINATION FOLDER CREATED -> " + destinationFile.getAbsolutePath());
    } else {
        System.out.println("DESTINATION FOLDER NOT CREATED!!!");
    }

    if (file.exists()) {
        List<File> fileList = new ArrayList<File>();
        String fileName = file.getName().replace(".pdf", "");
        System.out.println("CONVERTER START.....");

        // try-with-resources closes the document even if rendering fails
        try (PDDocument doc = PDDocument.load(file)) {
            PDFRenderer renderer = new PDFRenderer(doc);
            for (int i = 0; i < doc.getNumberOfPages(); i++) {
                // One image per page, written into the destination folder;
                // i is the zero-based page index
                File fileTemp = new File(destinationFile, fileName + "_" + i + ".jpg"); // jpg or png
                // 200 is the rendering resolution in dots per inch; change it if necessary
                BufferedImage image = renderer.renderImageWithDPI(i, 200);
                ImageIO.write(image, "JPEG", fileTemp); // JPEG or PNG
                fileList.add(fileTemp);
            }
        }
        System.out.println("CONVERTER STOPPED.....");
        System.out.println("IMAGE SAVED AT -> " + destinationFile.getAbsolutePath());
        return fileList;
    } else {
        System.err.println(file.getName() + " FILE DOES NOT EXIST");
    }
    return null;
    }

    public static void main(String[] args) {

    try {
        PDFtoJPGConverter converter = new PDFtoJPGConverter();
        Scanner sc = new Scanner(System.in);
        System.out.println("Enter the destination folder for the images:");
        // Destination = D:/PPL/;
        String destination = sc.nextLine();

        System.out.println("Enter the PDF file paths, separated by commas:");
        String sourcePathWithFileName = sc.nextLine();
        // Source Path = D:/PDF/ant.pdf,D:/PDF/abc.pdf,D:/PDF/xyz.pdf
        // Strings must be compared by content, and both conditions must hold
        if (sourcePathWithFileName != null && !sourcePathWithFileName.isEmpty()) {
            String[] files = sourcePathWithFileName.split(",");
            for (String file : files) {
                File pdf = new File(file.trim());
                System.out.println("FILE:>> " + pdf);
                converter.convertPdfToImage(pdf, destination);
            }
        }

    } catch (Exception ex) {
        ex.printStackTrace();
    }
    }
}

====================================
Here I use the Apache pdfbox-2.0.8, commons-logging-1.2, and fontbox-2.0.8 libraries.
Happy coding :)
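
One detail in the converter above is how the output file names are derived: String.replace(".pdf", "") removes every occurrence of ".pdf" in the name and misses upper-case extensions. A possible alternative, sketched here with hypothetical helper names and JDK classes only, strips just a trailing extension and numbers pages from 1 (the code above numbers them from 0).

```java
import java.io.File;

// Sketch: build per-page output file names (helper names are illustrative).
class PageFileNames {
    /** Strips a trailing ".pdf" in any letter case; other names are returned unchanged. */
    static String baseName(String fileName) {
        int start = fileName.length() - 4;
        // regionMatches returns false for a negative offset, so short names are safe
        if (fileName.regionMatches(true, start, ".pdf", 0, 4)) {
            return fileName.substring(0, start);
        }
        return fileName;
    }

    /** Builds the output file for one page, using a 1-based page number in the name. */
    static File pageFile(File destinationDir, File pdf, int pageIndex, String extension) {
        String name = baseName(pdf.getName()) + "_" + (pageIndex + 1) + "." + extension;
        return new File(destinationDir, name);
    }

    public static void main(String[] args) {
        File out = pageFile(new File("out"), new File("docs/report.v2.PDF"), 0, "png");
        System.out.println(out.getName());
    }
}
```

Passing the destination as a File parent, instead of concatenating strings, also means the folder path does not need a trailing slash.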

Posted 2021-07-05

0md85ypi5#

I tried this today with PDFBox 2.0.15.

import org.apache.pdfbox.pdmodel.*;
import org.apache.pdfbox.rendering.*;
import java.awt.image.*;
import java.io.*;
import javax.imageio.*;

public static void PDFtoJPG (String in, String out) throws Exception
{
    // try-with-resources closes the document once rendering is done
    try (PDDocument pd = PDDocument.load (new File (in)))
    {
        PDFRenderer pr = new PDFRenderer (pd);
        // Renders only the first page (index 0) at 300 DPI
        BufferedImage bi = pr.renderImageWithDPI (0, 300);
        ImageIO.write (bi, "JPEG", new File (out));
    }
}

Posted 2021-07-05
