Quick-and-dirty natural language processing: how do I extract a city, street, or district from text?
Good day.
I would be grateful if you could tell me how to write a quick-and-dirty script that extracts location information from the text of an ad: city, district, street, avenue, village, metro station.
Regex does not work well here, because in many texts the name is not preceded by a marker word, as in "city {NAME}".
Your task is called named-entity recognition (NER). There are a number of libraries that solve this problem (spaCy, NLTK). Most ready-made solutions target English, but I think there are examples for Russian as well.
It’s easier to use ready-made services like https://dadata.ru/
If you want to do everything yourself, you need to compile a database of all cities along with their synonyms and abbreviations (Saint Petersburg, St. Petersburg, SPb, etc.) and match the text against it. Then add fuzzy search and error correction.
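A minimal sketch of that do-it-yourself approach, in case it helps. Everything here is an assumption for illustration: the tiny `GAZETTEER` dictionary stands in for a real database of cities, `find_cities` is a made-up helper name, and the fuzzy search is done with `difflib` from the Python standard library at an arbitrary cutoff of 0.85. Short abbreviations are matched exactly only, since fuzzy matching on three-letter strings gives too many false hits.

```python
import difflib
import re

# Hypothetical gazetteer: canonical city name -> known spellings and abbreviations.
GAZETTEER = {
    "Saint Petersburg": ["Saint Petersburg", "St. Petersburg", "SPb", "Piter"],
    "Moscow": ["Moscow", "Msk"],
    "Kazan": ["Kazan"],
}


def normalize(text):
    """Lowercase the text and replace punctuation with single spaces."""
    return re.sub(r"[^\w]+", " ", text.lower()).strip()


# Flatten the gazetteer into a lookup: normalized alias -> canonical name.
ALIASES = {
    normalize(alias): city
    for city, aliases in GAZETTEER.items()
    for alias in aliases
}
# Only aliases of 5+ characters take part in fuzzy matching.
LONG_ALIASES = [alias for alias in ALIASES if len(alias) >= 5]
MAX_WORDS = max(len(alias.split()) for alias in ALIASES)


def find_cities(text, cutoff=0.85):
    """Return canonical city names mentioned in text, tolerating small typos."""
    words = normalize(text).split()
    found = []
    # Try longer word windows first so multi-word names are considered whole.
    for size in range(MAX_WORDS, 0, -1):
        for start in range(len(words) - size + 1):
            candidate = " ".join(words[start:start + size])
            city = ALIASES.get(candidate)  # exact hit, any alias length
            if city is None and len(candidate) >= 5:
                close = difflib.get_close_matches(
                    candidate, LONG_ALIASES, n=1, cutoff=cutoff
                )
                if close:
                    city = ALIASES[close[0]]
            if city is not None and city not in found:
                found.append(city)
    return found


if __name__ == "__main__":
    print(find_cities("Selling a flat in St. Petersbrug, near the metro"))
    print(find_cities("Office for rent, SPb, Nevsky avenue"))
```

The same idea extends to streets, districts and metro stations by adding separate dictionaries for each. For a real gazetteer with thousands of entries, scanning every word window against every alias gets slow, so you would want an index or a dedicated fuzzy-matching library instead.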
You can do it quick-and-dirty, or you can do it properly (since the question is tagged "neural networks").