1C-Bitrix
SpeakLive91, 2018-10-13 11:51:15

How to index the contents of PDF files in the Bitrix system?

Good afternoon! I have searched all over the Internet but could not find how to set up indexing of the contents of uploaded PDF files in Bitrix. The site-wide search looks for keywords in the text of articles and news, but it does not search inside the PDF documents themselves.


3 answers
Andrey Nikolaev, 2018-10-13
@gromdron

It all depends on which edition of Bitrix you have.
For example, Bitrix24 has an Intranet module whose settings include "links to external programs" that are used for content indexing.
One of those commands is responsible for PDF; in it, #FILE_PATH# is the full path to the file to be indexed. In theory, for Site Management, you can write a handler for creating or editing the infoblock element to which files are uploaded, and have it append the file contents to SEARCH_CONTENT, that is, to the search index.
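
The placeholder approach described above can be sketched outside Bitrix. This is only an illustration in Python, not Bitrix code: the names `extract_text` and `build_search_content` are hypothetical, the command template is whatever converter is configured on the server, and the converter is assumed to print the extracted text to standard output. In a real installation the same logic would live in a PHP event handler.

```python
import shlex
import subprocess

PLACEHOLDER = "#FILE_PATH#"


def extract_text(command_template: str, file_path: str) -> str:
    """Run an external converter and return what it prints to stdout.

    command_template is a hypothetical setting of the form
    "some-converter #FILE_PATH#"; the placeholder is replaced by file_path.
    """
    # Split the template first and substitute afterwards, so a path that
    # contains spaces or quotes stays a single argument and never reaches
    # a shell.
    args = [part.replace(PLACEHOLDER, file_path)
            for part in shlex.split(command_template)]
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout


def build_search_content(element_text: str, file_texts: list[str]) -> str:
    """Append extracted file text to the element's own searchable text."""
    parts = [element_text] + [t.strip() for t in file_texts if t.strip()]
    return "\n".join(parts)
```

The result of `build_search_content` is what would be written to the element's searchable text, so that a search hit in the PDF leads to the element that carries the file.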

serginhold, 2018-10-13
@serginhold

And why on earth should it search inside PDFs? Bitrix has nothing to do with it. News is data in the database; text in a PDF is data in a file. These are different things.
In theory, when a PDF is uploaded to the site, you can read it and add its contents to the database, and then search for matches there.
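
A minimal sketch of that idea, assuming the PDF text can be obtained by some extractor function: the text is stored in a table at upload time and later searched with an ordinary query. SQLite and the names `pdf_index`, `create_index`, `index_pdf`, and `search` are illustrative choices, not part of Bitrix, which keeps its own search tables.

```python
import sqlite3
from typing import Callable


def create_index(conn: sqlite3.Connection) -> None:
    """Create the table that holds one row of extracted text per file."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pdf_index ("
        "path TEXT PRIMARY KEY, content TEXT NOT NULL)"
    )


def index_pdf(conn: sqlite3.Connection, path: str,
              extract: Callable[[str], str]) -> None:
    """Store the text of one uploaded PDF, replacing any earlier version."""
    conn.execute(
        "INSERT OR REPLACE INTO pdf_index (path, content) VALUES (?, ?)",
        (path, extract(path)),
    )
    conn.commit()


def search(conn: sqlite3.Connection, term: str) -> list[str]:
    """Return the paths of files whose stored text contains term."""
    # Escape LIKE wildcards so the user's input is matched literally.
    escaped = (term.replace("\\", "\\\\")
                   .replace("%", "\\%")
                   .replace("_", "\\_"))
    rows = conn.execute(
        "SELECT path FROM pdf_index "
        "WHERE content LIKE ? ESCAPE '\\' ORDER BY path",
        (f"%{escaped}%",),
    )
    return [row[0] for row in rows]
```

The `extract` argument is deliberately left open: it could wrap a command-line converter or a PDF library, and the storage and lookup code does not change either way. Substring matching with `LIKE` is the simplest option; a full-text index would scale better on large document sets.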

Eugene Pukha, 2018-10-13
@summoner2015

Out of the box, searching file contents can only work if Sphinx is installed on the server and enabled in the settings of the site's search module. Even then, I'm not sure it will be able to search the contents of PDF files. That will most likely require customization; this article has an example: https://habr.com/post/131089/.
If the site runs on bitrixenv, Sphinx does not need to be installed separately: just enable it in the server settings and reindex the site.
