How to add a character to the whitelist of a tokenizer?
I have a text field with a custom analyzer that uses the standard tokenizer. The tokenizer strips all punctuation, which is what I need, but it also splits numbers separated by a slash into two tokens: 23/45 becomes "23" and "45", whereas I need it kept as the single token "23/45". Otherwise the tokenizer's behavior suits me. How can I change this? I tried replacing / with a word using a filter, but then I can't turn it back into a slash afterwards. Thanks in advance.
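One possible approach, sketched here under assumptions: as far as I know the standard tokenizer has no whitelist setting, so instead a `pattern_replace` character filter swaps a slash between digits for a placeholder before tokenizing, and a `pattern_replace` token filter restores the slash afterwards. The names `slash_to_placeholder`, `placeholder_to_slash`, `slash_aware` and the placeholder `_slash_` are hypothetical. The sketch relies on the standard tokenizer keeping underscore-joined letters and digits as one token, so check the result with the `_analyze` API on your own cluster. The `simulate` function is only a rough local approximation of the three stages for simple ASCII input, not the real tokenizer.

```python
import json
import re

PLACEHOLDER = "_slash_"  # hypothetical marker; must not occur in real data


def build_analysis_settings():
    """Return index settings for a custom analyzer that keeps 23/45 whole."""
    return {
        "analysis": {
            "char_filter": {
                # Runs before tokenizing: digit/digit -> digit_slash_digit
                "slash_to_placeholder": {
                    "type": "pattern_replace",
                    "pattern": "(?<=\\d)/(?=\\d)",
                    "replacement": PLACEHOLDER,
                }
            },
            "filter": {
                # Runs after tokenizing: put the slash back into the token
                "placeholder_to_slash": {
                    "type": "pattern_replace",
                    "pattern": PLACEHOLDER,
                    "replacement": "/",
                }
            },
            "analyzer": {
                "slash_aware": {
                    "type": "custom",
                    "char_filter": ["slash_to_placeholder"],
                    "tokenizer": "standard",
                    # Add your existing token filters alongside this one
                    "filter": ["placeholder_to_slash"],
                }
            },
        }
    }


def simulate(text):
    """Rough local approximation of the pipeline (assumption: ASCII input)."""
    replaced = re.sub(r"(?<=\d)/(?=\d)", PLACEHOLDER, text)  # char filter
    tokens = re.findall(r"\w+", replaced)  # stand-in for the standard tokenizer
    return [t.replace(PLACEHOLDER, "/") for t in tokens]  # token filter


if __name__ == "__main__":
    print(json.dumps(build_analysis_settings(), indent=2))
    print(simulate("order 23/45, done."))
```

The printed JSON can be used as the `settings` body when creating the index. A slash that is not between two digits is left alone and is still dropped by the tokenizer like any other punctuation.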