Character encoding
ge, 2021-04-17 02:19:50

How to decode garbled Cyrillic filenames?

What is this encoding called, where filenames show up in the console like this? I just don't know what to search for.

# ls output
''$'\301\345\347''-'$'\350\354\345\355\350''-1-1-150x150.jpg'
''$'\301\345\347''-'$'\350\354\345\355\350''-1-1-300x215.jpg'
''$'\301\345\347''-'$'\350\354\345\355\350''-1-150x150.jpg'
''$'\361\342\340\344\374\341\340''-1-150x150.jpg'

Purely out of curiosity: is it possible to decode this mess?

The filenames were originally Cyrillic. I suspect the first file was called 'Без-имени-1-1-150x150.jpg' ("No name").

Naively treating the numbers as Unicode code points gets me nowhere: ĭřś ŞŢřţŞ (this sequence could not be converted). But the new "krakozyabry" (mojibake) repeat in a consistent way, so there is a pattern and the idea can probably be developed further.

Please don't suggest avoiding Cyrillic in filenames :)


1 answer(s)
galaxy, 2021-04-17
@gedev

There is not much to decode here: this is CP1251 (once you set aside the $ and quote characters from the shell quoting):

>>> s = b"""''$'\301\345\347''-'$'\350\354\345\355\350''-1-1-150x150.jpg'
''$'\301\345\347''-'$'\350\354\345\355\350''-1-1-300x215.jpg'
''$'\301\345\347''-'$'\350\354\345\355\350''-1-150x150.jpg'
''$'\361\342\340\344\374\341\340''-1-150x150.jpg'"""

>>> print(s.decode('cp1251'))
''$'Без''-'$'имени''-1-1-150x150.jpg'
''$'Без''-'$'имени''-1-1-300x215.jpg'
''$'Без''-'$'имени''-1-150x150.jpg'
''$'свадьба''-1-150x150.jpg'
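
To get clean names without the leftover quoting, the same idea can be wrapped in a small helper. This is only a sketch: `decode_ls_name` is a hypothetical name, and it assumes the real filenames contain no `'` or `$` characters and that every escape is a three-digit octal byte value (no higher than `\377`), as in the listing above. Note that the numbers are octal, which is likely why reading them as decimal Unicode code points produced `ĭřś`.

```python
import re

# Matches one octal escape such as \301 inside a $'...' segment.
_OCTAL = re.compile(rb"\\([0-7]{3})")


def decode_ls_name(quoted: bytes, encoding: str = "cp1251") -> str:
    """Undo the shell quoting shown by ls (simplified) and decode the bytes."""
    # Drop the $ markers and single quotes wrapped around each segment.
    # Assumption: the real name contains no ' or $ characters.
    raw = quoted.replace(b"$", b"").replace(b"'", b"")
    # Turn each \NNN octal escape into the single byte it stands for.
    raw = _OCTAL.sub(lambda m: bytes([int(m.group(1), 8)]), raw)
    return raw.decode(encoding)


name = rb"''$'\361\342\340\344\374\341\340''-1-150x150.jpg'"
print(decode_ls_name(name))

# \301 is octal for byte 0xC1; decimal 301 is an unrelated code point.
print(chr(301), bytes([0o301]).decode("cp1251"))
```

The first `print` should show the name without quotes, and the second contrasts the decimal reading (`ĭ`) with the octal byte decoded as CP1251 (`Б`).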
