PHP
Denis Tomsky, 2016-02-11 06:36:35

doc (pdf) to html converter script?

Good day! We need a script that converts a doc file to HTML, ideally with a file picker. It should preserve all links, tables, and images. Thanks in advance for any help.


2 answer(s)
krypt3r, 2016-02-11
@krypt3r

pandoc to the rescue. It turned out to be the best option for converting docx -> html (in my case, the next step is converting the HTML to PDF with wkhtmltopdf). You can attach your own CSS styles to the HTML.
Pandoc handles only simple tables well. It does not cope with cells that span several rows or columns (I had to use a hack to assign the required colspan to the cell). Links seem to work. I did not check images, since they are not needed in my document templates.
pandoc is picky about the source docx. Nested lists don't work. An extra line between the items of a numbered list breaks the numbering. Multi-level numbered lists are supported, as long as you set them up correctly in Word and write matching CSS.
Still, to repeat, this is the best option for converting docx -> html.
PS. There are PHP wrappers on GitHub for both pandoc and wkhtmltopdf.
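
For anyone who wants to call pandoc from a script rather than from the shell, here is a minimal Node.js sketch of the docx -> html step described above. It assumes pandoc is installed and on the PATH. The function names (buildPandocArgs, convertDocxToHtml) and the file names are made up for illustration and are not part of any wrapper library. The --extract-media flag is included because the images were not tested above; whether they come out correctly for a given document is something to verify yourself.

```javascript
// Sketch only: wraps the pandoc command line, assuming "pandoc" is on the PATH.
const { execFile } = require("child_process");

// Build the argument list for a docx -> html conversion.
// options.css      (optional) stylesheet linked from the generated HTML
// options.mediaDir (optional) directory where pandoc extracts embedded images
function buildPandocArgs(inputPath, outputPath, options = {}) {
  const args = [
    "--from=docx",
    "--to=html",
    "--standalone", // emit a full HTML document, not just a fragment
    `--output=${outputPath}`,
  ];
  if (options.css) {
    args.push(`--css=${options.css}`);
  }
  if (options.mediaDir) {
    args.push(`--extract-media=${options.mediaDir}`);
  }
  args.push(inputPath);
  return args;
}

// Run pandoc and resolve with the output path, or reject with pandoc's error text.
function convertDocxToHtml(inputPath, outputPath, options = {}) {
  return new Promise((resolve, reject) => {
    const args = buildPandocArgs(inputPath, outputPath, options);
    execFile("pandoc", args, (error, stdout, stderr) => {
      if (error) {
        reject(new Error(stderr || error.message));
      } else {
        resolve(outputPath);
      }
    });
  });
}
```

A call might look like convertDocxToHtml("template.docx", "template.html", { css: "style.css", mediaDir: "media" }), with the file names being placeholders. Passing the arguments as an array to execFile avoids going through a shell, so file names with spaces need no extra quoting.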

Denis Tomsky, 2016-02-11
@tomskiydenis

The script must be in JS or PHP.
