Convert .docx Files to PDF With Java Without Losing Formatting

Last Updated:  June 16, 2022 | Published: September 16, 2018

Recently I had to convert generated .docx files to .pdf files for more convenient distribution. The Word documents contained custom formatting and embedded pictures. I tried several Java libraries for the job (Docx4j, XDocReport and Apache POI), but none of them could reproduce the output I got from manually converting the .docx files with Microsoft Word's native export functionality. On GitHub, I found a nice command-line tool for converting documents to PDF: OfficeToPDF. In this blog post, I'll show you a quick example of how to use this CLI tool to convert .docx to PDF from Java without losing formatting.

First off, there are some technical requirements you need to fulfill:

  • .NET Framework 4
  • Office 2016, 2013, 2010 or Office 2007

These requirements mean that the solution has to run on a Windows machine with Microsoft Office installed. To try the following example yourself, download the .exe from the GitHub project site and have a .docx file at hand for the conversion.

Calling the .exe from Java might look like the following (the CLI expects just two parameters: the path of the .docx file and the path of the generated .pdf):
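A minimal sketch of such a call, assuming OfficeToPDF.exe can be found in the working directory or on the PATH. The class name, method names and file names are illustrative placeholders:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class DocxToPdfConverter {

    // Assumption: the executable is on the PATH or in the working directory.
    private static final String CONVERTER = "OfficeToPDF.exe";

    /** Builds the command line: executable, input .docx path, output .pdf path. */
    static List<String> buildCommand(String converter, Path docx, Path pdf) {
        return List.of(converter, docx.toString(), pdf.toString());
    }

    /** Runs the converter, waits for it to exit and returns the PDF as bytes. */
    static byte[] convert(String converter, Path docx, Path pdf)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(converter, docx, pdf))
                .inheritIO() // forward the tool's console output to this JVM's console
                .start();

        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("Conversion failed with exit code " + exitCode);
        }
        return Files.readAllBytes(pdf);
    }

    public static void main(String[] args) throws Exception {
        // Placeholder file names; adjust them to your environment.
        byte[] pdf = convert(CONVERTER, Path.of("document.docx"), Path.of("document.pdf"));
        System.out.println("Generated PDF with " + pdf.length + " bytes");
    }
}
```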

The code above calls the CLI with its two parameters, waits until the process has finished, and reads the content of the generated file into a byte array.

This approach may be overkill if you plan to convert plain text-based .docx files, but it can help if your files contain custom formatting. For regular text-based files, I would prefer the conversion functionality of the Java libraries.

For further examples about file handling with Java, have a look at the following overview page.

Have fun converting docx to pdf with Java,

Phil.

    • Hey Singgih,

      I haven't tested multithreading for this example yet. But since a new OS process is spawned for every incoming request, it should work. For a more robust solution, you might limit the maximum number of parallel .docx conversions. Queueing the requests might be a good approach.

      Kind regards,
      Phil
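
The queueing idea from this reply could be sketched as follows. This is only an illustration, not something tested with OfficeToPDF: a fixed-size thread pool caps the number of conversions that run at once, and further requests wait in the pool's internal queue. The class name `ConversionQueue` is hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ConversionQueue {

    private final ExecutorService pool;

    ConversionQueue(int maxParallel) {
        // At most maxParallel tasks run concurrently; the rest are queued.
        this.pool = Executors.newFixedThreadPool(maxParallel);
    }

    /** Queues one conversion; the Future completes once it has run. */
    <T> Future<T> submit(Callable<T> conversion) {
        return pool.submit(conversion);
    }

    /** Stops accepting new work; already queued conversions still run. */
    void shutdown() {
        pool.shutdown();
    }
}
```

Each submitted `Callable` would start one converter process (for example via `ProcessBuilder`) and return the generated PDF, so the pool size directly limits how many converter processes exist at the same time.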

  • I tried your example, but I'm getting exit code 3 as the result, with no further explanation. Executing OfficeToPdf.exe directly on the command line works well.
    Is there a way to get some more debugging output?

    Here is the code snippet I used:

    try {
        // Derive the target name by replacing the file extension with "pdf"
        String pdffile = docxfile.substring(0, docxfile.lastIndexOf(".") + 1) + "pdf";
        Process process = new ProcessBuilder(converter, "/verbose", docxfile, pdffile).start();
        process.waitFor();

        if (debug != null) {
            Common.log("Result of processing : " + process.exitValue());
            Common.log("Conversion of " + docxfile + " to " + pdffile + " done.");
        }
    } catch (IOException | InterruptedException e) {
        Common.log("Conversion failed: " + e);
    }

    and the result:

    Result of processing : 3

    According to the documentation, this is a combination of:

    1 – Failure
    2 – Unknown Error
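
One way to get more debugging output from Java is to capture whatever the tool writes to its standard output and error streams, since the snippet above only looks at the exit code. The following is a minimal sketch under that assumption; `ProcessRunner` and `Result` are hypothetical helper names, and whether `/verbose` produces useful console output in this setup is not confirmed here.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.List;

class ProcessRunner {

    /** Exit code plus everything the process printed. */
    static final class Result {
        final int exitCode;
        final String output;

        Result(int exitCode, String output) {
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    static Result run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true) // merge stderr into stdout
                .start();

        // Drain the output before waitFor() so a full pipe buffer cannot block the child.
        // Assumption: the tool writes text in the platform's default charset.
        byte[] raw = process.getInputStream().readAllBytes();
        String output = new String(raw, Charset.defaultCharset());

        int exitCode = process.waitFor();
        return new Result(exitCode, output);
    }
}
```

Calling `ProcessRunner.run(List.of(converter, "/verbose", docxfile, pdffile))` and logging `result.output` next to `result.exitCode` would show any message the converter prints when it is started from Java rather than from the command line.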
