Is it possible to use Apache Lucene to determine if a certain string from a set is included in the text?
Hello!
I have the following task: there is a list of words (for example, ["mother", "house", "family"]) and a set of texts ("I live with my mother", "It was cold in the house", etc.). I need to determine whether any word from the list occurs in each text, including in inflected forms. For example, the first text contains an inflected form of "mother" (in the original Russian, "мамы"); reduced to its base form, it is in the list. The same goes for "house" in the second sentence. Can Apache Lucene help with this? Or is there another Java library that can handle the task?
Here is a working example of a fuzzy match:
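import org.apache.lucene.analysis.ru.RussianAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.FuzzyQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class FuzzyMatchExample {
    public static void main(String[] args) throws Exception {
        String fieldName = "myField";

        // Build a test index
        Directory directory = new RAMDirectory(); // in a "real" system this should be FSDirectory.open(dir)
        RussianAnalyzer analyzer = new RussianAnalyzer(Version.LUCENE_46);
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_46, analyzer);
        IndexWriter writer = new IndexWriter(directory, config);
        writer.addDocument(createDocument(fieldName, "Я живу у мамы"));
        writer.addDocument(createDocument(fieldName, "В доме было холодно"));
        writer.commit();
        writer.close();

        // Search
        int startFrom = 0;
        int pageSize = 20;
        DirectoryReader reader = DirectoryReader.open(directory);
        IndexSearcher indexSearcher = new IndexSearcher(reader);
        // FuzzyQuery finds inexact matches (terms within a small edit distance)
        FuzzyQuery fuzzyQuery = new FuzzyQuery(new Term(fieldName, "мама"));
        TopDocs topDocs = indexSearcher.search(fuzzyQuery, startFrom + pageSize);
        ScoreDoc[] hits = topDocs.scoreDocs;
        // hits holds at most startFrom + pageSize entries, so bound the loop by its length
        for (int i = startFrom; i < hits.length; i++) {
            Document hitDoc = indexSearcher.doc(hits[i].doc);
            System.out.println(hitDoc.get(fieldName));
        }
        reader.close();
    }

    // Assumed helper: the original answer calls createDocument but does not show it
    private static Document createDocument(String fieldName, String text) {
        Document doc = new Document();
        doc.add(new TextField(fieldName, text, Field.Store.YES));
        return doc;
    }
}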
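The example above queries a single word. To check a whole list of words against each text, the same idea (accept a token if it is within a small edit distance of a list word) can be sketched in plain Java without Lucene. This is only an illustration, not Lucene's API: the class `WordListMatcher`, the methods `editDistance` and `findMatches`, and the threshold of 1 edit are all assumptions made for this sketch, and the list words are assumed to be lowercase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class for illustration; not part of Lucene.
final class WordListMatcher {

    // Classic Levenshtein distance (insert, delete, substitute) using two DP rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns the list words that have at least one token in the text
    // within maxEdits edits (maxEdits is an assumed tuning parameter).
    static List<String> findMatches(String text, List<String> words, int maxEdits) {
        // Lowercase the text and split on anything that is not a letter
        String[] tokens = text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+");
        List<String> found = new ArrayList<>();
        for (String word : words) {
            for (String token : tokens) {
                if (!token.isEmpty() && editDistance(word, token) <= maxEdits) {
                    found.add(word);
                    break; // one matching token is enough for this word
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("mother", "house", "family");
        System.out.println(findMatches("I live with my mother", words, 1));    // prints [mother]
        System.out.println(findMatches("It was cold in the house", words, 1)); // prints [house]
    }
}
```

Edit distance is only a rough stand-in for reducing words to their base form: "mothers" is 1 edit away from "mother", but "families" is 3 edits away from "family". A stemming analyzer, such as the RussianAnalyzer used in the answer, is the more reliable way to normalize inflected forms before comparing.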