Unicode
Fudo Tsukiko, 2017-11-04 19:42:21

Lua | Counting Russian characters?

I ran into a problem.
When I take the length of a Unicode string (for example, a four-letter Russian word, transliterated here as "yalaya"), I get 8 instead of 4. I know that a Russian character takes 2 bytes, so I tried dividing by 2. But that raises another problem: if the text also contains English letters or spaces, this approach immediately breaks down.


3 answer(s)
Sergey Lerg, 2017-11-04
@Lerg

https://github.com/starwing/luautf8

Roman Mirilaczvili, 2017-11-04
@2ord

To count Cyrillic characters, you need to count the characters whose code points fall in the range U+0410 to U+044F. Note that this range does not include ё (U+0451) or Ё (U+0401).
Also, the statement "I know that there are 2 bytes in a Russian character" is not always true, since it is a special case that depends on the choice of encoding: it holds in UTF-8, but single-byte encodings such as Windows-1251 or KOI8-R use 1 byte per Russian letter.
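
As a rough illustration of the above, here is a minimal pure-Lua sketch that assumes the string is valid UTF-8. The function names `utf8_len` and `count_cyrillic` are hypothetical, and counting ё and Ё as Cyrillic letters is my own assumption. The first function counts characters by skipping continuation bytes, and the second decodes two-byte sequences and checks the code point range:

```lua
-- Assumption: the input string is valid UTF-8.

-- Count characters: every UTF-8 character contains exactly one byte
-- that is NOT a continuation byte (0x80-0xBF, i.e. 128-191).
local function utf8_len(s)
  local _, count = s:gsub("[^\128-\191]", "")
  return count
end

-- Count Cyrillic letters in U+0410..U+044F, plus Ё (U+0401) and ё (U+0451).
-- All of these are encoded as two bytes in UTF-8.
local function count_cyrillic(s)
  local count = 0
  -- Lead byte of a two-byte sequence (194-223) followed by one continuation byte.
  for lead, cont in s:gmatch("([\194-\223])([\128-\191])") do
    local cp = (lead:byte() - 0xC0) * 64 + (cont:byte() - 0x80)
    if (cp >= 0x0410 and cp <= 0x044F) or cp == 0x0401 or cp == 0x0451 then
      count = count + 1
    end
  end
  return count
end

print(#"ялая", utf8_len("ялая"))
print(count_cyrillic("Привет, world"))
```

With a UTF-8 source file, `#"ялая"` reports the byte length, while `utf8_len` reports the character count regardless of how many Latin letters or spaces are mixed in.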

dollar, 2018-08-07
@dollar

You can use a UTF-8 extension for Lua.
For example: https://github.com/starwing/luautf8
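
A possible usage sketch, under assumptions: I believe the luautf8 module is loaded with `require("lua-utf8")` and provides a `len` function, but check the project's README to confirm. If the module is not installed, the sketch falls back to the `utf8` library built into Lua 5.3 and later. The name `char_count` is hypothetical.

```lua
-- Try luautf8 first (module name "lua-utf8" is an assumption, see its README);
-- otherwise fall back to the utf8 library built into Lua 5.3+.
local ok, lib = pcall(require, "lua-utf8")
local utf8lib = ok and lib or utf8
assert(utf8lib, "no UTF-8 library available (need luautf8 or Lua 5.3+)")

-- Number of characters (code points) in a UTF-8 string, not bytes.
local function char_count(s)
  return utf8lib.len(s)
end

print(char_count("ялая"))
```

Unlike dividing the byte length by 2, this keeps working when the text mixes Russian letters with English letters or spaces.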
