What open-source PDM/PLM works with PDF/scan + text layer?

PDF

Propieller, 2019-01-10 11:58:35

We want a system for storing and working with documentation.
At the moment the documentation is a dump of hundreds of GB of documents scanned to PDF, spread across various directories. For some documents, the same directory also holds scans of the individual pages that changed during design or operation. The root level is organized by system / subsystem / design task. The name of a directory or file may contain a textual description of its contents. Usually a file or directory name looks like some-text-XYYZ-more-text, where XYYZ is the document's code, by which it can be found from references in other documents. Inside are P&ID diagrams, wiring diagrams, algorithms, and text descriptions. Sometimes a scan of the cover letter listing the submitted documents is stored along with the document.
As a solution, we will probably need some kind of PDM (Product Data Management) system.
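Whatever system ends up being chosen, the document codes embedded in file and directory names can already serve as a lookup key. Below is a minimal sketch of that idea, not a finished tool. The code format (one uppercase letter followed by three digits) is purely an assumption for illustration, as are the names `CODE_PATTERN`, `extract_code`, and `build_index`; the real codification scheme would have to be substituted.

```python
import re
from collections import defaultdict
from pathlib import Path
from typing import Dict, List, Optional

# ASSUMPTION: a document code is one uppercase letter plus three digits
# (e.g. "A123"), not glued to other letters or digits. Replace this
# pattern with the real codification scheme.
CODE_PATTERN = re.compile(r"(?<![A-Za-z0-9])([A-Z]\d{3})(?![A-Za-z0-9])")


def extract_code(name: str) -> Optional[str]:
    """Return the first document code found in a file or directory name."""
    match = CODE_PATTERN.search(name)
    return match.group(1) if match else None


def build_index(root: Path) -> Dict[str, List[Path]]:
    """Map each document code to the PDFs that carry it.

    The code is looked up in the file name first and, failing that,
    in the name of the directory that directly contains the file.
    """
    index: Dict[str, List[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*.pdf")):
        code = extract_code(path.stem) or extract_code(path.parent.name)
        if code:
            index[code].append(path)
    return dict(index)
```

Calling `build_index(Path("/path/to/dump"))` would return a dictionary from code to file paths. Files whose names carry no recognizable code are simply left out, so they would need a separate manual pass.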
What is needed:
- for every file in the dump: run text recognition and add a text layer to the PDF (to enable text search)
- a database mapping each barcode to the pack of related documents on the intranet, so that the documents can be found and viewed quickly by photographing a document's barcode with a phone (over intranet Wi-Fi) or from any intranet computer
- the ability to see both the current version of a document and any point in its change history
- open source (1. because management will not pay to solve the staff's problems, 2. because we will most likely have to adapt it to our own needs)
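The first three requirements can be sketched with free components. What follows is a rough illustration under stated assumptions, not a recommendation of a specific PDM. It assumes the open-source OCRmyPDF command-line tool for the text layer (the command is only built here, not executed) and SQLite for the barcode and version bookkeeping. The table layout and the function names `ocr_command`, `open_archive`, `add_version`, `pack_by_barcode`, and `history` are hypothetical.

```python
import sqlite3
from typing import List, Tuple

# ASSUMPTION: one barcode may point to a pack of several documents,
# and every document keeps all of its revisions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS documents (
    code    TEXT PRIMARY KEY,   -- document code, e.g. taken from the file name
    barcode TEXT NOT NULL       -- barcode printed on the paper copy
);
CREATE INDEX IF NOT EXISTS idx_documents_barcode ON documents (barcode);
CREATE TABLE IF NOT EXISTS versions (
    code     TEXT NOT NULL REFERENCES documents (code),
    revision INTEGER NOT NULL,
    path     TEXT NOT NULL,     -- searchable PDF on the intranet share
    PRIMARY KEY (code, revision)
);
"""


def ocr_command(src: str, dst: str, languages: str = "eng") -> List[str]:
    """Build (but do not run) an OCRmyPDF call that adds a text layer.

    --skip-text leaves pages that already contain text untouched;
    -l takes Tesseract language codes, e.g. "rus+eng".
    """
    return ["ocrmypdf", "--skip-text", "-l", languages, src, dst]


def open_archive(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open the archive database and create the tables if they are missing."""
    conn = sqlite3.connect(db_path)
    conn.executescript(SCHEMA)
    return conn


def add_version(conn: sqlite3.Connection, code: str, barcode: str, path: str) -> int:
    """Register a new revision of a document and return its revision number."""
    # An already known code keeps the barcode it was first registered with.
    conn.execute(
        "INSERT OR IGNORE INTO documents (code, barcode) VALUES (?, ?)",
        (code, barcode),
    )
    revision = conn.execute(
        "SELECT COALESCE(MAX(revision), 0) + 1 FROM versions WHERE code = ?",
        (code,),
    ).fetchone()[0]
    conn.execute(
        "INSERT INTO versions (code, revision, path) VALUES (?, ?, ?)",
        (code, revision, path),
    )
    conn.commit()
    return revision


def pack_by_barcode(conn: sqlite3.Connection, barcode: str) -> List[Tuple[str, int, str]]:
    """Return (code, revision, path) of the current version of every document in the pack."""
    return conn.execute(
        "SELECT v.code, v.revision, v.path FROM versions v "
        "JOIN documents d ON d.code = v.code "
        "WHERE d.barcode = ? AND v.revision = "
        "(SELECT MAX(revision) FROM versions WHERE code = v.code) "
        "ORDER BY v.code",
        (barcode,),
    ).fetchall()


def history(conn: sqlite3.Connection, code: str) -> List[Tuple[int, str]]:
    """Return every (revision, path) of one document, oldest first."""
    return conn.execute(
        "SELECT revision, path FROM versions WHERE code = ? ORDER BY revision",
        (code,),
    ).fetchall()
```

A small intranet web page could decode the photographed barcode and call `pack_by_barcode` to list the current PDFs, while `history` would back a per-document change log. The list returned by `ocr_command` can be passed to `subprocess.run` once OCRmyPDF is installed.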

1 answer(s)

Alejandro Esquire, 2019-01-15
@A1ejandro

Our task is slightly different: an electronic archive of scans of "human" documents. But perhaps it has features in common with your project. At first, apparently much like you, we simply had a heap of documents divided by accounting object, and, like you, had to look through everything whenever we needed to find something. Then we introduced a rigid codification of documents, which now lets us determine unambiguously whether a document of the required type exists in a particular case file and, if so, open it immediately. We use two main storage formats at once, PDF and JPEG. For the most part, case files are scanned and edited as JPEG and only then converted to PDF. Like you, we also wanted the whole project to be based on free software. On the whole, we have succeeded.