Answer the question
In order to leave comments, you need to log in
How to avoid Segmentation Fault when trying to build an FTS index on SqLite big data?
Hello dear community members.
The situation is the following.
There is a SQLite database with about 3 billion records. Each entry is a field of type CHAR and a number of additional fields with different types. The file size is about 340 GB. The maximum content length in the doc field is 256 characters, the content is in Russian.
I'm trying to create a full text index and I'm getting a Segmentation Fault.
I tried to create it in various ways, both under Windows (using SQLite Expert and my software on .NET, including x64 architecture), and under Linux (both Ubuntu and Centos). I even compiled sqlite from source, including the necessary flags for FTS3 and FTS4 - the same error.
Tried two options
- contentless FTS4 - this is when the content is stored in a regular table, and only the index is stored in the FTS table (create virtual table docs_fts using fts4(content='docs'... )
- the second option is to create a full-fledged FTS table from a regular one - insert into docs_fts select doc... from docs;
SQLite runs for about 4 hours, after which it invariably throws a Segmentation Fault.
There are no NULL fields in the database.
Tried on three different versions of SQLite, including the latest one available on the site.
I created and filled the database shortly before I started trying to create its full-text index, no other work was done with the database, except for filling it with data. It has one docs table, guaranteed not to contain NULL values. I also thought that, perhaps, something with the base, I did Reindex and Vacuum - Reindex happens instantly, Vacuum happens, albeit for a long time, but successfully.
Roughly speaking, my actions are:
CREATE TABLE docs ([id] INTEGER PRIMARY KEY AUTOINCREMENT, [doc] CHAR... (other fields here)
(here are actions for filling the database with data - they are loaded by my own script from text files)
CREATE VIRTUAL TABLE docs_fts using fts4 (content='docs', doc... (other fields here)
(here an attempt to INSERT INTO docs_fts(docs_fts) VALUES ('rebuild') in order to rebuild the contentless FTS index)
or
CREATE VIRTUAL TABLE docs_fts using fts4 (doc... (other fields here)
(here an attempt to INSERT INTO docs_fts SELECT doc... from docs;)
or
CREATE VIRTUAL TABLE docs_fts using fts3 (doc... (other fields here)
(here is an attempt to INSERT INTO docs_fts SELECT doc... from docs;)
For each attempt, I made a new copy of the source file, because I suspected that the database could be broken after Segmentation Fault.I
even changed the memory bars to others - I was afraid that the memory might be buggy.After
all attempts, the result is the same - Segmentation Fault.
В баг-трекер SQLite инфу отправил, жду реакции. Однако, хотелось бы ответа и здесь - возможно, я что-то делаю неправильно (за исключением того, что, как минимум, странно такое количество информации хранить в SqLite))))
Спасибо.
Answer the question
In order to leave comments, you need to log in
Didn't find what you were looking for?
Ask your questionAsk a Question
731 491 924 answers to any question