Python
Vlad_beg, 2017-11-24 14:19:35

How to convert the file name to utf-8 correctly?

I wrote a simple script that saves all file names in a folder to a text file, but it fails with an error when it encounters files with Cyrillic characters in their names. What is the correct way to use "encode" here so that the names are saved?

import os
import sys

path = "C:\\Users\\User\\Desktop"
file = open('testfile.txt', 'w')

for dirpath, dirnames, filenames in os.walk('C:\\Users\\User\\Documents\\Folder'):
    print('Current path: ', dirpath)
    print('Directories: ', dirnames)
    for fname in filenames:
        line = fname.encode('utf-8')
        file.write(line+ "\n")

Error:
Traceback (most recent call last):
  File "c:\Users\User\Desktop\Folder Compare.py", line 12, in <module>
    file.write(fname + "\n")
  File "C:\Users\User\AppData\Local\Programs\Python\Python36-32\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 6-11: character maps to <undefined>


1 answer(s)
Artem Sovetnikov, 2018-01-05
@Vlad_beg

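Your file is opened for writing without an explicit encoding, so Python falls back to the system default, which here is cp1252 (strange that it isn't cp1251...). On write, Python tries to encode the Unicode file names to cp1252, which has no Cyrillic characters, and that raises the error.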
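The failure can be reproduced without any files. The following is only a minimal sketch; the file name in it is a made-up example, not one from the question. It also shows why calling encode yourself does not help: the result is bytes, which cannot be concatenated with "\n" or written to a file opened in text mode.

```python
import locale

name = "отчёт.txt"  # hypothetical file name containing Cyrillic

# The locale's preferred encoding, which open() used as its default
# in the Python 3.6 shown in the traceback (platform dependent).
print(locale.getpreferredencoding(False))

# cp1252 has no Cyrillic letters, so encoding fails as in the traceback.
try:
    name.encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError:
    cp1252_ok = False

# UTF-8 can represent every Unicode character; the result is bytes, not str.
encoded = name.encode("utf-8")
print(cp1252_ok, type(encoded).__name__)
```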
Here is working code; the file encoding is specified when the file is opened:

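import os

path = "Q:\\Temp"

# Specify the encoding explicitly instead of relying on the system default.
with open('testfile.txt', 'w', encoding='utf8') as file:
    for dirpath, dirnames, filenames in os.walk(path):
        print('Current path: ', dirpath)
        print('Directories: ', dirnames)
        for fname in filenames:
            # In text mode '\n' is translated to the platform line ending;
            # os.linesep would produce '\r\r\n' on Windows.
            file.write(fname + '\n')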
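The same idea can be split into two small helpers so the walking and the writing can be checked separately. This is a sketch under assumptions: the names iter_file_names and write_names are hypothetical and do not appear in the original answer. It writes "\n" rather than os.linesep, because a file opened in text mode already translates "\n" to the platform line ending.

```python
import os


def iter_file_names(root):
    """Yield the name of every file found under root (hypothetical helper)."""
    for dirpath, dirnames, filenames in os.walk(root):
        for fname in filenames:
            yield fname


def write_names(names, out_path):
    """Write the names to out_path as UTF-8, one per line; return the count."""
    count = 0
    # An explicit encoding avoids the cp1252 default seen in the traceback.
    with open(out_path, "w", encoding="utf-8") as out:
        for name in names:
            out.write(name + "\n")
            count += 1
    return count
```

For the folder from the answer, the call would look like write_names(iter_file_names("Q:\\Temp"), "testfile.txt"). When reading the file back, pass encoding="utf-8" to open() as well.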
