Python
irina_leifijtijhiodu, 2021-08-23 22:58:03

Unable to load csv file with numpy, what's wrong?

Good afternoon!
I had the following task during my training:
Upload some data file (an Excel table from your computer, saved in csv format, or a table from the Internet).

I cannot load the csv file with numpy. The call
table = np.loadtxt('Dlya_ucheby.csv', delimiter=',', skiprows=1)
gives the following error:
UnicodeDecodeError                        Traceback (most recent call last)
in ()
----> 1 table = np.loadtxt('Dlya_ucheby.csv', delimiter = ',', skiprows=1)

1 frames
/usr/lib/python3.7/codecs.py in decode(self, input, final)
    320         # decode input (taking the buffer into account)
    321         data = self.buffer + input
--> 322         (result, consumed) = self._buffer_decode(data, self.errors, final)
    323         # keep undecoded input until the next call
    324         self.buffer = data[consumed:]

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xed in position 0: invalid continuation byte

Can you tell me what exactly went wrong?

I previously asked this question to the curator of the course I am taking. They advised me to use a semicolon as the separator and to specify the encoding, but that seems to be in another library, which I have not studied yet:
import pandas as pd
table = pd.read_csv('Dlya_ucheby.csv', encoding='cp1251', sep=';')
table.head()

How can I solve this problem using numpy?


1 answer(s)
galaxy, 2021-08-24
@irina_leifijtijhiodu

Numpy also supports the encoding parameter:

table = np.loadtxt('Dlya_ucheby.csv', delimiter=',', encoding='cp1251', skiprows=1)
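To illustrate why the encoding matters, here is a minimal sketch using only the standard library. It assumes the file starts with Cyrillic text saved by Excel in cp1251; the header word "номер" is hypothetical, since the thread does not show the actual first bytes of the file.

```python
# Hypothetical header word; the real first bytes of the file are not shown.
raw = "номер".encode("cp1251")
print(hex(raw[0]))  # the first byte of this sample is 0xed

reason = None
try:
    # The same bytes read as UTF-8 fail on the very first byte.
    raw.decode("utf-8")
except UnicodeDecodeError as err:
    reason = err.reason
    print(err)

# Decoding with the encoding the bytes were written in works.
decoded = raw.decode("cp1251")
print(decoded)
```

In cp1251 the byte 0xed is the letter "н", while in UTF-8 it can only start a multi-byte sequence, which is consistent with the "invalid continuation byte" message above.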

galaxy, I tried that, and a different error comes up:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
in ()
----> 1 table = np.loadtxt('Dlya_ucheby.csv', delimiter = ',', encoding = 'cp1251', skiprows=1)

3 frames
/usr/local/lib/python3.7/dist-packages/numpy/lib/npyio.py in floatconv(x)
    761         if '0x' in x:
    762             return float.fromhex(x)
--> 763         return float(x)
    764
    765     typ = dtype.type

ValueError: could not convert string to float: '207401;2013;41327;0'

Apparently the separator is wrong; it should be a semicolon:
delimiter = ';'
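A small sketch of what is going on, using only the row quoted in the error message. It mimics the split-then-float step in plain Python and is not the actual numpy implementation.

```python
row = "207401;2013;41327;0"

# Splitting on ',' finds no separator, so the whole row stays one token,
# and float() rejects it, much like the ValueError in the traceback.
tokens_comma = row.split(",")
try:
    float(tokens_comma[0])
    comma_ok = True
except ValueError:
    comma_ok = False

# Splitting on ';' yields four fields that each convert cleanly.
values = [float(tok) for tok in row.split(";")]
print(comma_ok, values)
```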
The problem still persists:
table = np.loadtxt('Dlya_ucheby.csv', delimiter=';', encoding='cp1251', skiprows=1)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
in ()
----> 1 table = np.loadtxt('Dlya_ucheby.csv', delimiter = ';', encoding = 'cp1251', skiprows=1)

3 frames
/usr/local/lib/python3.7/dist-packages/numpy/lib/npyio.py in floatconv(x)
    761         if '0x' in x:
    762             return float.fromhex(x)
--> 763         return float(x)
    764
    765     typ = dtype.type

ValueError: could not convert string to float: '0,58'
'207401;2013;41327;0'

?
This is a new problem, caused by the decimal separator: your file uses a comma, but a dot is expected. I don't see an easy way to configure this in Python, so re-save the CSV file with a different decimal separator (I don't remember whether this can be set in Excel itself or whether it is taken from the Windows system settings; I think the latter).
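If re-saving the file is inconvenient, one possible workaround (not from the answer above, just a sketch) is to read the file as text, replace the decimal commas with dots, and hand the result to np.loadtxt through io.StringIO. The sample file, its header names, and its values below are made up to stand in for 'Dlya_ucheby.csv'. The approach assumes that ',' never occurs as a field separator or inside text fields.

```python
import io
import os
import tempfile

import numpy as np

# Hypothetical stand-in for 'Dlya_ucheby.csv': cp1251 encoding, ';' between
# fields, ',' as the decimal mark. Header names and values are made up.
sample = "номер;год;сумма;доля\n207401;2013;41327;0,58\n207402;2014;39870;1,25\n"
path = os.path.join(tempfile.mkdtemp(), "sample_cp1251.csv")
with open(path, "w", encoding="cp1251") as f:
    f.write(sample)

# Read the file as text and swap decimal commas for dots.
# This is safe here only because ',' is not used as a field separator.
with open(path, encoding="cp1251") as f:
    text = f.read().replace(",", ".")

# np.loadtxt accepts a file-like object, so the cleaned text can be parsed directly.
table = np.loadtxt(io.StringIO(text), delimiter=";", skiprows=1)
print(table.shape)
print(table)
```

np.loadtxt also has a converters parameter that could do the replacement per column, and pandas read_csv accepts decimal=',', if using that library is acceptable for the task.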
