8
8
891099838382016-02-18 04:34:26
Python
89109983838, 2016-02-18 04:34:26

Why does xlwt not save the entire file if there are Russian letters?

Good afternoon!
Tell me where to "dig", the essence of the problem is this:
With the help of xlrd and xlwt I'm trying to process a small price list of 1500 lines! - but only the first 130 - 200 lines appear in the final file, but more often nothing at all.
And when opening the target file new_price.xls EXCEL gives the message " In the book "new_price.xls" found content that could not be read. Try to restore the contents of the book? If you trust the source of this book, click Yes "
If you open through OpenOffice Calc = then after the above 130-250 (each time in a different way) in the line with the name goes "sets of all kinds of icons and symbols" and further in the article!
If you turn off the output to the price of the name (ws.write(num,1,v[0])) then everything is fine - both the article and prices are in place, and EXCEL does not swear when you open it.
If you process lines 150 - 200, then there are no problems with Russian letters either!
If you replace all the names in the source file with text purely in Russian, then there are no problems either! because in the original in the names and Russian and English words (+ numbers, brackets, quotes, etc. but they do not affect the problem)
Here is a simplified code:

#-*- coding: utf8 -*-
import xlrd, xlwt
import os
from trans import transliterate  # самописная функция транслитерации 
                                                  # благодаря ей понял что дело 
                                                  # не в спец. символах, а Русских символах!
file = '01.xls'     # исходный файл EXCEL
art = 0              # столбец с Артикулом
names = 1        # столбец с Названием товара
price = 2          # столбец с Цена товара
data = {}           # Словарь в него записываются данные для обработки 
                        # в формате {'Артикул':('Название', 'Цена')}
n = 0
directory = os.getcwd() # Определяем текущий путь

mybook = xlrd.open_workbook (directory + '\\' + file, on_demand=True) #encoding_override='utf8' # пробовал разные кодировки не помогает
list = mybook.sheet_names()[0]
sheet = mybook.sheet_by_name(list)
for s in range(sheet.nrows):
    if sheet.row_values(s)[names] == '': # Проверяем что в names и приводим к строке
        name = 'нет имени'
    else:
        if type(sheet.row_values(s)[names]) is str:
            name = sheet.row_values(s)[names]
        else:
            name = str(int(sheet.row_values(s)[names]))
    name = transliterate(name)   # обработал name через транслитерацию и понял что дело в русских буквах!!!
# для нижнего скрина в транслитерации "выключил" замену буквы б, 
# при полной замене всех русских символов английскими конечный прайс открывается нормально!

    data[str(int(sheet.row_values(s)[art]))] = (name, sheet.row_values(s)[price])  # Заполняем словарь с вложенным кортежем
    n = +=1    

mybook.release_resources() # после обработки выгружаем ресурсы -> '01.xls'

wb = xlwt.Workbook() #encoding='utf8'   # пробовал разные кодировки не помогает
ws = wb.add_sheet('Test')   #,cell_overwrite_ok=True)
   
num = 1
for k,v in data.items():
    nums = num
    k = k.replace(' ','')
    ws.write(num,0,k)
    ws.write(num,1,v[0])   # <- вот ТУТ чтото не так
    print (v[0])           # хотя печатает все 1500 позиций в консоли на родном русском, когда отключена транслитерация!
    ws.write(num,2,v[1])
    num += 1
ws.write(1501,1,'ПРИВЕТ')  # при наличии русских букв в словаре этой надписи в конечном файле (new_price.xls) нет!
wb.save('new_price.xls')   # сохраняем конечный результат
print ('ГОТОВО!')

Here is a screenshot of what happens in OpenOffice Calc when you open it - due to the presence of only one letter of the Russian 'b' in the final price list, all subsequent names are poured into the column with Articles (the first one), the prices of the most interesting remain as they should (seen by the fact that they are in float)
Pay attention to the 214th line in it in the name tur b onasadka correctly displays the letter b
bc85737a7ebd4ed18c63bd136a373210.jpg
Here is another screen (without using transliteration):
the impression is that the "dictionary" data is going crazy....
b0f8454db9b84ce180ecead31a295348.jpg

Answer the question

In order to leave comments, you need to log in

2 answer(s)
8
89109983838, 2016-02-26
@89109983838

The solution to the problem for the python 3 version is the transition from xlwt to xlwt3 - then there are no problems with Cyrillic encodings!
Only after installing xlwt3 if an error occurs:
Comment out the __init__ function in .........lib\site-packages\xlwt3\formula.py

D
Dimonchik, 2016-02-18
@dimonchik2013

look here and here
try alternatives
and check the classics by clearing characters 1 , 2 , 3

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question