How do I properly parse a CSV file in pandas that contains many datasets of the form Timestamp0; Value0; Timestamp1; Value1; ...?
Hello,
I just started learning Python and immediately decided to try it on a pressing practical problem: creating interactive reports.
I sketched out a module that works with one specific CSV file. It produces the result I want, but I would like to do it properly.
The next step is to generalize it so it can import a file with an unknown number of datasets, and that raised the question of how best to process the source data.
The data in CSV has the format:
Timestamp0; Value0; Timestamp1; Value1; ... TimestampN; ValueN;
The main problem is understanding how to organize the indexing. When reading with pandas you can select a column or series as the index, but here each value column needs its own timestamp index. To plot with Plotly afterwards, I simply read the CSV several times, passing a different column to index_col each time. I understand this is very wrong, but I cannot figure out how to do it more correctly.
Could you tell me how to work with such data?
import plotly
import plotly.graph_objs as go
#from plotly.graph_objs import Scatter, Layout
import pandas
plotly.offline.init_notebook_mode(connected=True)  # initialize plotly for offline use
# read the CSV file
source="d:/TEMP/GAS.csv"
trend1 = 'Термопара BK1, °C'
trend2 = 'Термопара BK2, °C'
trend3 = 'Термопара BK3, °C'
trend1_name='A-A.BK1'
trend2_name='A-A.BK2'
trend3_name='A-A.BK3'
df = pandas.read_csv(source,
                     sep=';',
                     parse_dates=[trend1 + ' Time'],
                     dayfirst=True,
                     index_col=[trend1 + ' Time'],
                     decimal=',')
dfBK2 = pandas.read_csv(source,
                        sep=';',
                        parse_dates=[trend2 + ' Time'],
                        dayfirst=True,
                        index_col=[trend2 + ' Time'],
                        decimal=',')
dfBK3 = pandas.read_csv(source,
                        sep=';',
                        parse_dates=[trend3 + ' Time'],
                        dayfirst=True,
                        index_col=[trend3 + ' Time'],
                        decimal=',')
# define the traces
trace1 = go.Scatter(x=df.index,
                    y=df[trend1 + ' ValueY'],
                    name=trend1_name)
trace2 = go.Scatter(x=dfBK2.index,
                    y=dfBK2[trend2 + ' ValueY'],
                    name=trend2_name,
                    yaxis='y2')
trace3 = go.Scatter(x=dfBK3.index,
                    y=dfBK3[trend3 + ' ValueY'],
                    name=trend3_name,
                    yaxis='y3')
data = [trace1, trace2, trace3]
# define the plot layout
Width = 1
Height = 1
domainWidth = Width - 0.1
layout = dict(legend=dict(x=0, y=1),
              hovermode='x',
              xaxis=dict(domain=[0, domainWidth]),  # width of the plot area
              yaxis=dict(showgrid=True,
                         side='right',
                         title=trend1_name),
              yaxis2=dict(overlaying='y',
                          anchor='free',
                          side='right',
                          title=trend2_name,
                          position=domainWidth + 0.05),
              yaxis3=dict(overlaying='y',
                          anchor='free',
                          side='right',
                          title=trend3_name,
                          position=domainWidth + 0.1))
fig = dict(data=data, layout=layout)
plotly.offline.plot(fig)
You can simply split the table into several independent dataframes. There is no need to read the CSV multiple times.
For example, like this:
import pandas as pd
import numpy as np
def get_df(filename, parse_dates=None):
    """Build :class:`DataFrame` list from CSV file.

    Expected CSV file format::

        Timestamp0; Value0; Timestamp1; Value1; ... TimestampN; ValueN;

    Args:
        filename: CSV filename.
        parse_dates: List of columns with dates.

    Returns:
        List of DataFrames with 'Timestamp' as index and 'Value' as value
        column.

    Notes:
        :attr:`DataFrame._name` contains the name extracted
        from the 'TimestampX' column.
    """
    df_all = pd.read_csv(filename, sep=';', decimal=',',
                         parse_dates=parse_dates, header=0)
    assert len(df_all.columns) % 2 == 0
    lst = []
    columns = ['time', 'value']
    # Split the column index into chunks of 2 (timestamp + value).
    col_list = np.split(df_all.columns, len(df_all.columns) // 2)
    for cols in col_list:
        df = df_all[cols].copy()  # 2-column DataFrame; copy to avoid a view.
        df.columns = columns  # rename the columns.
        df = df.set_index('time')  # set index to timestamps.
        # Attach the name *after* set_index: set_index returns a new
        # object that would not keep a custom attribute.
        df._name = cols[0].split(',')[0]
        lst.append(df)
    return lst
df_list = get_df('GAS.csv', parse_dates=[0, 2, 4])
df = df_list[0]
print(df.index)
print(df._name)
...
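If the per-trend frames later need to share one common time axis (for a combined table, resampling, or a single plot), they can be joined back with `pd.concat`. A minimal sketch of the same split-then-join idea, using hypothetical sample data in the layout described in the question (column names `BK1`/`BK2` are made up for illustration):

```python
import io
import pandas as pd
import numpy as np

# Synthetic CSV in the same layout as GAS.csv (hypothetical sample data).
csv = (
    "BK1, Time;BK1, Value;BK2, Time;BK2, Value\n"
    "01.01.2020 00:00;1,5;01.01.2020 00:10;2,5\n"
    "01.01.2020 01:00;1,7;01.01.2020 01:10;2,7\n"
)
df_all = pd.read_csv(io.StringIO(csv), sep=';', decimal=',',
                     parse_dates=[0, 2], dayfirst=True)

# Split into per-trend frames, as get_df does.
frames = []
for cols in np.split(df_all.columns, len(df_all.columns) // 2):
    df = df_all[cols].copy()
    df.columns = ['time', 'value']
    frames.append(df.set_index('time'))

# Join back on a shared time axis; each trend keeps its own timestamps,
# and points missing from a trend become NaN.
wide = pd.concat(frames, axis=1, keys=['BK1', 'BK2'])
print(wide.shape)  # 4 distinct timestamps, 2 value columns
```

The `keys` argument labels each trend in a column MultiIndex, so `wide['BK1']` recovers one trend while the whole frame stays aligned on a single DatetimeIndex.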