Why does the trigram (or trigraph) concept use exactly 3 characters rather than 2 or 4+?
Example
https://www.postgresql.org/docs/current/pgtrgm.html
The text could just as well be split into pieces of 2 characters, or of 4 or more.
Intuition suggests the point is accuracy: it drops when the pieces are too long or too short. But is that really so?
The "golden mean" was found empirically. Natural languages differ: for English, with its typically short words, 2 might be fine, while for German 4+ would probably be better. Experiments showed that, on average across languages, 3 gives good results.
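If you want to get a feel for the trade-off yourself, here is a rough Python sketch. It is not pg_trgm's actual code: `ngrams` and `similarity` are names I made up, the padding (n-1 spaces in front, one space behind each word) is my own generalisation of the two-spaces-before, one-space-after rule the pg_trgm docs describe for trigrams, and the score is simply shared n-grams divided by all distinct n-grams.

```python
import re


def ngrams(text, n):
    """Return the set of character n-grams of every word in `text`.

    Assumption: each word is lowercased and padded with n-1 leading spaces
    and one trailing space, generalising the pg_trgm trigram rule to any n.
    """
    grams = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = " " * (n - 1) + word + " "
        for i in range(len(padded) - n + 1):
            grams.add(padded[i:i + n])
    return grams


def similarity(a, b, n):
    """Shared n-grams divided by all distinct n-grams of both strings."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga and not gb:
        return 0.0
    return len(ga & gb) / len(ga | gb)


if __name__ == "__main__":
    # Compare a typo pair and an unrelated pair for several n.
    for n in (2, 3, 4):
        typo = similarity("postgres", "postgers", n)
        unrelated = similarity("postgres", "elephant", n)
        print(f"n={n}: typo pair {typo:.2f}, unrelated pair {unrelated:.2f}")
```

Running it on your own word lists shows how the gap between "near match" and "no match" moves as n changes; the two word pairs above are only placeholders.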