PostgreSQL
FairyTaleComposer, 2022-03-05 16:29:48

How to fix garbled characters (mojibake) instead of Cyrillic in a PostgreSQL dump?

I understand that the encoding is wrong somewhere, but I don’t understand where exactly.

Important: the garbled characters appear only in the dump. The application that uses this database displays Cyrillic correctly.

The database to be transferred uses the UTF8 encoding and the locale russian_Russia.1251:
client_encoding - WIN1251
server_encoding - UTF8

I create a dump like this:

pg_dump -U postgres -W -E UTF8 -d dbname > dbname.sql


As an experiment, I created another database with the ru_RU.UTF8 locale.
Its dump contains the same garbled characters, like this:
h%*h%�%d%h%%h%�%d%h%

I tried running SET CLIENT_ENCODING TO UTF8, but the dump is still the same. I am transferring the database from Windows to Linux, and on Linux I checked the encoding of the resulting dump with file: it is UTF-16, even though I explicitly specify UTF8 when creating it.

I will add the system locale on my Linux machine:



$ locale
LANG=en_US.UTF-8
LANGUAGE=en_US
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC=ru_UA.UTF-8
LC_TIME=ru_UA.UTF-8
LC_COLLATE="en_US.UTF-8"
LC_MONETARY=ru_UA.UTF-8
LC_MESSAGES="en_US.UTF-8"
LC_PAPER=ru_UA.UTF-8
LC_NAME=ru_UA.UTF-8
LC_ADDRESS=ru_UA.UTF-8
LC_TELEPHONE=ru_UA.UTF-8
LC_MEASUREMENT=ru_UA.UTF-8
LC_IDENTIFICATION=ru_UA.UTF-8
LC_ALL=


The list of databases with their encodings and locales:
[screenshot: 6223b97fc16a7682238272.png]

What I did with the resulting dump:

Restoring data from the dump into a previously created database:
$ psql -U valen running_events < events.sql
ERROR: invalid byte sequence for encoding "UTF8": 0xff

Moreover, if I open the dump file in a code editor, I see:
SET client_encoding = 'UTF8';
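
The 0xff byte in the error is consistent with a UTF-16 little-endian byte order mark (FF FE) at the very start of the file, since 0xff never occurs in valid UTF-8. A minimal sketch of how the first bytes could be inspected is below. The sample file is a hypothetical stand-in built with printf, not the real events.sql.

```shell
# Create a tiny UTF-16LE file with a BOM as a stand-in for the Windows dump
# (hypothetical sample content; the real dump is events.sql).
sample=$(mktemp)
printf '\377\376S\000E\000T\000' > "$sample"

# Show the first two bytes in hex. "fffe" is the UTF-16LE byte order mark.
first_bytes=$(head -c 2 "$sample" | od -An -tx1 | tr -d ' \n')
echo "first bytes: $first_bytes"
```

Running the same head and od pipeline on the real dump should show whether it starts with a byte order mark.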

Check dump encoding:
$ file events.sql
events.sql: Little-endian UTF-16 Unicode text, with CRLF line terminators
Convert:
$ iconv -f UTF-16 -t UTF-8 events.sql -o events2.sql
file created

Second attempt to restore data from already converted dump:
$ psql -U valen running_events < events2.sql
data has been written

Checking the output of tables from the database:
running_events=# \dt
         List of relations
 Schema |  Name  | Type  |  Owner
--------+--------+-------+----------
 public | events | table | postgres
(1 row)

Indeed, everything was recorded. But:
running_events=# SELECT * FROM events WHERE id = 1;
[screenshot: 6223ba527ccfe421194528.png]

Again the Cyrillic text is displayed incorrectly, even though the running_events database has a Russian locale.
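
One possible explanation, which is an assumption and not something confirmed in this thread: the UTF-8 bytes written by pg_dump were decoded as a legacy console code page before being saved as UTF-16, so the damage is already baked into the text and survives the iconv conversion. The sketch below simulates that with CP866 as the assumed code page and shows that this particular kind of damage is reversible. The sample string is hypothetical.

```shell
# Sample Cyrillic text, stored as UTF-8 in this script (hypothetical data).
original='Привет'

# Simulate the suspected damage: treat the UTF-8 bytes as if they were CP866
# text (assumed console code page) and re-encode the result as UTF-8.
mojibake=$(printf '%s' "$original" | iconv -f CP866 -t UTF-8)

# Undo it: encoding the mojibake back to CP866 yields the original UTF-8 bytes.
restored=$(printf '%s' "$mojibake" | iconv -f UTF-8 -t CP866)

echo "damaged:  $mojibake"
echo "restored: $restored"
```

If the real dump was damaged in some other way, or with a different code page, this reversal will not apply as written.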

One more thing:
In Windows (i.e., in the same place where I do the dump), the same story happens. Data from the dump is not written to the database.

Another addition:
I decided to do the whole operation from start to finish in Ubuntu. That is, I created a database with UTF8 encoding and the ru_RU.UTF8 locale, inserted one record containing Russian text in one of the fields, and then dumped it. I opened the dump in the code editor: there are no incorrect characters at all, and the Cyrillic text is displayed properly.
I restored the data from the dump into a freshly created database, and everything was written on the first attempt. I checked in psql: a SELECT query returns that single record, and the Cyrillic text is displayed correctly.

And yet it is still not clear why Windows so stubbornly refuses to create a normal dump ...

Dump files on GitHub:
https://github.com/composercoder/running-events/tr...

There are three .sql files in this directory:
  • events_dump.sql - dump of the main database from Windows, which I am trying to transfer
  • events_dump_converted_to_utf8.sq - the same, converted to utf8
  • events_linux_dump.sql - dump of the same database created in ubuntu with the same table, but only one entry

1 answer(s)
galaxy, 2022-03-06
@FairyTaleComposer

Try dumping like this:

pg_dump -U postgres -W -E UTF8 -d dbname -f dbname.sql

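The problem may be in the output redirection (>): with -f, pg_dump writes the file itself instead of passing the data through the shell.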
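
After dumping with -f, the result could be checked on the Linux side before restoring. This is a minimal sketch that assumes iconv is available. The helper name is_valid_utf8 is made up for illustration and is not part of PostgreSQL.

```shell
# Succeeds (exit status 0) only if the whole file decodes as UTF-8.
# is_valid_utf8 is a hypothetical helper name used for this sketch.
is_valid_utf8() {
  iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Example use on the dump produced by the command above:
# is_valid_utf8 dbname.sql && echo "valid UTF-8" || echo "not UTF-8"
```

A file that passes this check can still contain baked-in mojibake, so it is worth also opening the dump and looking at the Cyrillic text.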