Is there a publicly available regular expression for converting text numbers to numbers?
I need to convert text such as "twenty-five", "twenty-fifth", and "in the twenty-fifth" into the number 25, and to handle the first 40 numbers this way. I can't find ready-made regular expressions for this in the public domain (and I don't want to write them myself). I also need regular expressions that convert month names, in their various grammatical cases, into the month's number in the year.
For anyone interested: I ended up implementing it simply with Dialogflow.
This is not a simple problem, and regular expressions alone will not solve it. I have done something similar. The algorithm I would suggest is this:
1. Load a dictionary of your forty numbers into a search engine. (I used Yandex.Server, which is no longer available, but that is just an example; Sphinx might do, or look for something else.)
2. Take the words from the text one at a time.
3. Look each word up in the search engine's dictionary. (This is where the magic happens, because the search engine can match on morphology.) You must be able to tell a one-word number from a two-word number. You can split the search into two stages: first find and replace the two-word numbers, then the one-word numbers. There are always subtleties, though, such as telling these apart from a three-word number once hundreds are involved. Numbers that have two names, for example "one" - "first" or "two" - "second", also need a little extra handling.
4. Replace the matched words with numbers, since the dictionary already tells you which number each word corresponds to.
5. Profit.
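The steps above can be sketched roughly as follows. This is only an illustration under some loud assumptions: it is in English, it replaces the morphology-aware search engine with a hand-written table of word forms, and all names (`SINGLE`, `replace_number_words`, and so on) are made up for the example. It does the two-stage matching from step 3 in a single pass by trying the two-word form first.

```python
import re

# Hypothetical word tables. A real setup would get the word forms from a
# morphology-aware search index instead of listing them by hand.
UNITS = ["one", "two", "three", "four", "five", "six", "seven", "eight", "nine"]
UNIT_ORDINALS = ["first", "second", "third", "fourth", "fifth",
                 "sixth", "seventh", "eighth", "ninth"]
TEENS = ["ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
         "sixteen", "seventeen", "eighteen", "nineteen"]
TEEN_ORDINALS = ["tenth", "eleventh", "twelfth", "thirteenth", "fourteenth",
                 "fifteenth", "sixteenth", "seventeenth", "eighteenth",
                 "nineteenth"]
TENS = {"twenty": 20, "thirty": 30, "forty": 40}
TENS_ORDINALS = {"twentieth": 20, "thirtieth": 30, "fortieth": 40}

# Words that are a complete number on their own.
SINGLE = {}
for value, (cardinal, ordinal) in enumerate(zip(UNITS, UNIT_ORDINALS), start=1):
    SINGLE[cardinal] = SINGLE[ordinal] = value
for value, (cardinal, ordinal) in enumerate(zip(TEENS, TEEN_ORDINALS), start=10):
    SINGLE[cardinal] = SINGLE[ordinal] = value
SINGLE.update(TENS)
SINGLE.update(TENS_ORDINALS)

# Words that may follow "twenty", "thirty" or "forty" in a two-word number.
UNIT_VALUES = {word: SINGLE[word] for word in UNITS + UNIT_ORDINALS}

# Split into alternating runs of letters and non-letters so that joining
# the tokens again restores the original spacing and punctuation.
TOKEN = re.compile(r"[A-Za-z]+|[^A-Za-z]+")


def replace_number_words(text):
    tokens = TOKEN.findall(text)
    out = []
    i = 0
    while i < len(tokens):
        word = tokens[i].lower()
        # Two-word numbers first: "twenty-five", "twenty fifth".
        if (word in TENS and i + 2 < len(tokens)
                and tokens[i + 1] in ("-", " ")
                and tokens[i + 2].lower() in UNIT_VALUES):
            out.append(str(TENS[word] + UNIT_VALUES[tokens[i + 2].lower()]))
            i += 3
        # Then one-word numbers: "first", "twelve", "thirtieth".
        elif word in SINGLE:
            out.append(str(SINGLE[word]))
            i += 1
        # Everything else is copied through unchanged.
        else:
            out.append(tokens[i])
            i += 1
    return "".join(out)
```

Trying the two-word match before the one-word match is what keeps "twenty-five" from turning into "20-5". Hundreds and other languages would need more tables and more stages.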
For example, your dictionary:
one (1, kept in your head or in the program code)
two (2)
three (3)
Text:
"Pay the receipt on the first day"
pay - skip, because the search engine will not find it in the dictionary
receipt - skip, because the search engine will not find it in the dictionary
first - the search engine will find "first"; replace it with "1"
day - skip, because the search engine will not find it in the dictionary
After the replacement, you will receive:
"Pay the receipt on the 1st"
But you also need to be careful here, otherwise you will get a "translation" like this:
"Once I went hunting ..."
"1 I went hunting"
Well, maybe "once" does not need to be translated.
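One simple way to guard against this is a skip list of ambiguous words that are never replaced. The sketch below is only an illustration: the tiny `NUMBER_WORDS` table, the `AMBIGUOUS` set and the function name are all hypothetical, and mapping "once" to 1 is an assumption made just to reproduce the example above.

```python
import re

# Hypothetical table for illustration; "once" -> 1 is an assumption that
# mirrors the hunting example.
NUMBER_WORDS = {"one": 1, "once": 1, "first": 1,
                "two": 2, "second": 2,
                "three": 3, "third": 3}

# Words that are in the dictionary but usually are not meant as numbers.
AMBIGUOUS = {"once"}


def replace_numbers(text, skip=AMBIGUOUS):
    def swap(match):
        word = match.group(0)
        key = word.lower()
        # Leave the word alone if it is on the skip list or not a number word.
        if key in skip or key not in NUMBER_WORDS:
            return word
        return str(NUMBER_WORDS[key])

    return re.sub(r"[A-Za-z]+", swap, text)
```

With the default skip list, "Once I went hunting ..." comes back unchanged; with `skip=set()` it becomes "1 I went hunting ...", which is exactly the unwanted result. A skip list is crude, and deciding from context whether a word really is a number is a harder problem that this sketch does not attempt.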