How do I correctly detect non-ASCII Unicode characters in text using regular expressions (or some other way) in .NET?
So far the code looks like this (text stands for the string being checked):
System.Text.RegularExpressions.Regex.IsMatch(text, "[" & ChrW(128) & "-" & ChrW(65535) & "]", System.Text.RegularExpressions.RegexOptions.IgnoreCase)
Managed to figure it out:
Achievement 1:
A similar pattern written the opposite way, as an exclusion, works fine: with it, "i" and "I" are no longer reported as falling in the range 128-65535.
Achievement 2:
A character can be given by its hexadecimal code with the \uXXXX escape, for example "[\u00FF-\uFFFF]". (Note that ChrW(128) corresponds to \u0080, not \u00FF.)
Achievement 3:
I had added System.Text.RegularExpressions.RegexOptions.IgnoreCase for no good reason. With this flag turned off, everything works as it should. Apparently "i" has at least three case variants in Unicode, at least one of which falls in the range "[\u00FF-\uFFFF]"
(although the reverse still doesn't work, so the question is not fully resolved yet)
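To make the finding above concrete, here is a minimal sketch of the check without the IgnoreCase flag, plus a regex-free variant for the "or otherwise" part of the question. The module and function names (NonAsciiCheck, HasNonAsciiRegex, HasNonAsciiLoop) are made up for illustration, and the range starts at \u0080 on the assumption that "Unicode characters" here means anything outside 7-bit ASCII.

```vbnet
Imports System
Imports System.Linq
Imports System.Text.RegularExpressions

Module NonAsciiCheck
    ' Matches any UTF-16 code unit from U+0080 upward.
    ' No IgnoreCase flag: case folding is irrelevant to a code-point range test.
    Private ReadOnly NonAscii As New Regex("[\u0080-\uFFFF]")

    ' Hypothetical helper: True if the string contains a non-ASCII character.
    Function HasNonAsciiRegex(value As String) As Boolean
        Return NonAscii.IsMatch(value)
    End Function

    ' Same check without a regular expression: compare each code unit to 127.
    Function HasNonAsciiLoop(value As String) As Boolean
        Return value.Any(Function(c) AscW(c) > 127)
    End Function

    Sub Main()
        Console.WriteLine(HasNonAsciiRegex("i and I"))  ' False
        Console.WriteLine(HasNonAsciiRegex("привет"))   ' True
        Console.WriteLine(HasNonAsciiLoop("i and I"))   ' False
        Console.WriteLine(HasNonAsciiLoop("ıİ"))        ' True
    End Sub
End Module
```

Characters outside the Basic Multilingual Plane are stored as surrogate pairs in the range U+D800-U+DFFF, so both variants report them as non-ASCII too.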
When I see reports of problems with the characters "i" and "I" while the IgnoreCase flag is on, I immediately suspect that the comparison is being done with Turkish culture rules. In Turkish, the lowercase "i" is uppercased to "İ", and the capital "I" (pronounced like the Russian "ы") is lowercased to "ı". To be honest, I didn't dig deeply into your problem, but maybe my comment will point you in the right direction.
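The Turkish case mapping described in this comment can be observed directly. The sketch below is only an illustration: the module name TurkishCasing is made up, and it assumes the tr-TR culture data is available on the machine (that is, the application is not running in invariant globalization mode). It prints code points rather than the characters themselves so the result does not depend on the console encoding.

```vbnet
Imports System
Imports System.Globalization

Module TurkishCasing
    Sub Main()
        Dim turkish As New CultureInfo("tr-TR")

        ' Turkish rules: i -> İ (U+0130), I -> ı (U+0131)
        Console.WriteLine(AscW("i".ToUpper(turkish)).ToString("X4"))   ' 0130
        Console.WriteLine(AscW("I".ToLower(turkish)).ToString("X4"))   ' 0131

        ' Invariant rules: i -> I (U+0049), I -> i (U+0069)
        Console.WriteLine(AscW("i".ToUpperInvariant()).ToString("X4")) ' 0049
        Console.WriteLine(AscW("I".ToLowerInvariant()).ToString("X4")) ' 0069
    End Sub
End Module
```

Both "İ" and "ı" lie above U+0080, which is consistent with the asker's guess that a case variant of "i" falls inside the tested range. If case-insensitive matching is really needed, combining RegexOptions.IgnoreCase with RegexOptions.CultureInvariant may be worth trying, though whether it changes the outcome for this particular range has not been verified here.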