redduckrobot, 2017-11-06 16:38:43
Django

How to sort with Ukrainian subset of Cyrillic characters in Django + SQLite?

Hello, I ran into a problem sorting Ukrainian characters, in particular names starting with the letter "І" (Ukrainian "I"). After sorting, these names appear first (or last, depending on the sort direction). The only database I can use is SQLite.
In Django I run the following query:
Employee.objects.all().order_by('name')
The equivalent in SQL:
SELECT name FROM employee ORDER BY name
The result is the same in both cases: names starting with "І" come first, and then everything else. Plain Python 3 does not sort correctly either (uppercase "І" goes to the beginning, lowercase "і" to the end):

>>> l = ['А', 'И', 'І', 'і', 'а', 'и', 'ч']
>>> sorted(l)
['І', 'А', 'И', 'а', 'и', 'ч', 'і']
>>>
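
The order above is consistent with Python comparing strings by Unicode code point rather than by alphabet position. A small check of that, as a sketch using only the built-in ord (the variable names are illustrative):

```python
# Print the Unicode code point of each letter from the list above.
letters = ['А', 'И', 'І', 'і', 'а', 'и', 'ч']

for ch in sorted(letters):
    print(ch, hex(ord(ch)))

# Sorting explicitly by code point gives the same order as plain sorted(),
# which is why 'І' lands before 'А' and 'і' lands after 'ч'.
by_code_point = sorted(letters, key=ord)
```

So the default sort is not broken; it simply knows nothing about the Ukrainian alphabet.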

Has anyone encountered similar problems? Please tell me possible solutions.

1 answer(s)
redduckrobot, 2017-11-06

I found a solution; maybe it will be useful to someone. It comes from the answer just below the accepted one in a Stack Overflow topic. The code is:

# python2.5 code below
# corpus is our unicode() strings collection as a list
corpus = [u"Art", u"Älg", u"Ved", u"Wasa"]

import locale
# this reads the environment and inits the right locale
locale.setlocale(locale.LC_ALL, "")
# alternatively, (but it's bad to hardcode)
# locale.setlocale(locale.LC_ALL, "sv_SE.UTF-8")

corpus.sort(cmp=locale.strcoll)

# in python2.x, locale.strxfrm is broken and does not work for unicode strings
# in python3.x however:
# corpus.sort(key=locale.strxfrm)

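For Django, the equivalent (after the same locale.setlocale call as above) is:
# fetch all rows, then sort them in Python using the active locale
emps = list(Employee.objects.all())
emps.sort(key=lambda emp: locale.strxfrm(emp.name))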
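
Since the sort above happens in Python after every row has been fetched, another option might be to teach SQLite the ordering itself. The following is only a minimal sketch and not part of the quoted answer: it registers a custom collation through the standard sqlite3 module's create_collation, using an explicit Ukrainian alphabet string instead of the system locale. The names UK_ALPHABET, RANK, uk_key, uk_collate, the collation name "ukrainian", and the sample surnames are all illustrative assumptions; case is ignored when comparing.

```python
import sqlite3

# Ukrainian alphabet in dictionary order (assumption: only these 33 letters
# need special handling; any other character sorts after them by code point).
UK_ALPHABET = "абвгґдеєжзиіїйклмнопрстуфхцчшщьюя"
RANK = {ch: i for i, ch in enumerate(UK_ALPHABET)}


def uk_key(text):
    """Build a sort key: alphabet position per character, ignoring case."""
    return [
        (0, RANK[ch]) if ch in RANK else (1, ord(ch))
        for ch in text.lower()
    ]


def uk_collate(a, b):
    """Comparison callback for SQLite: negative, zero, or positive."""
    ka, kb = uk_key(a), uk_key(b)
    return (ka > kb) - (ka < kb)


conn = sqlite3.connect(":memory:")
conn.create_collation("ukrainian", uk_collate)
conn.execute("CREATE TABLE employee (name TEXT)")
conn.executemany(
    "INSERT INTO employee (name) VALUES (?)",
    [("Чорновол",), ("Іваненко",), ("Андрієнко",), ("Їжак",), ("Ковальчук",)],
)
rows = conn.execute(
    "SELECT name FROM employee ORDER BY name COLLATE ukrainian"
).fetchall()
names = [name for (name,) in rows]
print(names)
```

The same uk_key function could also serve as the key for a plain Python sort, e.g. emps.sort(key=lambda emp: uk_key(emp.name)), when the locale is not installed on the machine. Registering the collation on the connection that Django opens is possible in principle, but that wiring is not shown or tested here.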
