C++ / C#
Tesla4o, 2019-10-31 20:22:37

How to programmatically convert Office documents to PDF and HTML?

I've looked all over the internet and there isn't much on this subject.
LibreOffice is recommended everywhere, but for some reason it would not start for me when invoked like this:
"C:\Program Files (x86)\LibreOffice\program\soffice.exe" --headless --writer --convert-to html file.docx
Maybe there are some libraries for such cases?
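
For reference, here is a minimal C++ sketch of assembling that kind of command line with every path quoted, since the space in "Program Files" is an easy thing to get wrong. The helper names `quote` and `build_convert_command` are hypothetical and not part of LibreOffice. The sketch assumes that no path contains a double quote, and it leaves out `--writer` on the assumption that the document type is detected from the input file. It only builds the string; how the process is launched is up to the caller.

```cpp
#include <string>

// Wrap a path in double quotes so that spaces (e.g. "Program Files") survive.
// Assumption: the path itself contains no double quotes.
std::string quote(const std::string& s) {
    return "\"" + s + "\"";
}

// Build a headless LibreOffice conversion command line.
// Hypothetical helper: all four arguments are supplied by the caller,
// and nothing here checks that the files or directories exist.
std::string build_convert_command(const std::string& soffice,
                                  const std::string& format,
                                  const std::string& input,
                                  const std::string& outdir) {
    return quote(soffice) + " --headless --convert-to " + format +
           " --outdir " + quote(outdir) + " " + quote(input);
}
```

Calling `build_convert_command` once with `"pdf"` and once with `"html"` for the same input gives the two command lines needed for the two target formats.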

2 answer(s)
#, 2019-10-31
@mindtester

There are. I googled this quite a bit. All the free solutions I found came down to various wrappers around a binary tool whose name I no longer remember. It all seemed rather murky to me: everywhere the conversion takes several passes, going through HTML along the way. And my task was to get PDF not only from doc but also from rtf, in C#.
In the end, if you have MS Office on Windows, the easiest way turned out to be using Word itself. I don't think it will be hard to translate this to C++.

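// Requires a reference to Microsoft.Office.Interop.Word and an installed copy of Word.
using System;
using System.IO;
using Microsoft.Office.Interop.Word;

// Place this method inside a class of your own.
internal static bool wordAsConverter(string rtf, string pdf, bool verb = true, bool clean = true)
{
  Console.WriteLine("\t..try convert to pdf...");
  var res = false;
  var app = new Application(); // starts a Word instance through COM
  try
  {
    // Full paths are safer: Word may resolve relative paths against its own default folder.
    var doc = app.Documents.Open(Path.GetFullPath(rtf));
    doc.ExportAsFixedFormat(Path.GetFullPath(pdf), WdExportFormat.wdExportFormatPDF);
    doc.Close(false); // close without saving changes to the source
    res = true;
    var fn = Path.GetFileName(rtf);
    if (verb) Console.WriteLine($"\t{fn} converted to pdf");
    if (clean)
    {
      // Delete the source file only after a successful export.
      File.Delete(rtf);
      if (verb) Console.WriteLine($"\t{fn} deleted");
    }
  }
  catch (Exception e) { Console.WriteLine(e.Message); }
  finally { app.Quit(false); } // always shut Word down, even on failure
  return res;
}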

Radjah, 2019-11-01
@Radjah

In my own homemade tools I worked with MS Office through OLE. All the methods and parameters are described in the built-in help.
