How to quickly iterate through all rows in Python Pandas?
Good day, friends. I have a DataFrame with over 3 million rows. I need to take the first client, look at all of their transactions month by month, and write those transactions into another DataFrame, adding some data derived from them. Then I need to repeat this for every client. The 3 million rows cover all transactions for all clients.
It looks like you are creating a single-row DataFrame for every record here, but you could create a regular Series (pd.Series) instead.
datareport = pd.DataFrame({'ODATE': {0:0},
'TENOR': {0:0},
'VDATE': {0:0},
'R': {0:0},
'RC': {0:0},
'DE': {0:0},
'I': {0:0},
'DEB': {0:0},
'D': {0:0}})
datareport = pd.concat([datareport, pd.DataFrame({'ODATE': {0:row.odate},
'TENOR': {0:tenor(row.ltt)},
'VDATE': {0:row.vdate},
'R': {0:bucket_class_old},
'RC': {0:buckclass(row.odues)},
'DE': {0:row.ball},
'I': {0:0},
'DEB': {0:ball_old},
'D': {0:0}})])
datareport = pd.concat([datareport, pd.DataFrame({'ODATE': {0:row.odate},
'TENOR': {0:tenor(row.ltt)},
'VDATE': {0:row.vdate},
'R': {0:bucket_class_old},
'RC': {0:100},
'DE': {0:difference},
'I': {0:0},
'DEB': {0:ball_old},
'D': {0:0}})])
ser1 = pd.Series({'ODATE': 0,
'TENOR': 0,
'VDATE':0,
'R': 0,
'RC': 0,
'DE': 0,
'I': 0,
'DEB': 0,
'D': 0})
ser2 = pd.Series({'ODATE': 0,
'TENOR': 1,
'VDATE':2,
'R': 3,
'RC': 4,
'DE': 5,
'I': 6,
'DEB': 7,
'D': 8})
df = pd.concat([ser1, ser2], axis=1).T
datareport1 = pd.DataFrame({'ODATE': {0:0},
'TENOR': {0:0},
'VDATE': {0:0},
'R': {0:0},
'RC': {0:0},
'DE': {0:0},
'I': {0:0},
'DEB': {0:0},
'D': {0:0}})
datareport2 = pd.DataFrame({'ODATE': {0:1},
'TENOR': {0:2},
'VDATE': {0:3},
'R': {0:4},
'RC': {0:5},
'DE': {0:6},
'I': {0:7},
'DEB': {0:8},
'D': {0:9}})
df2 = pd.concat([datareport1, datareport2])
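One difference between the two variants above is worth checking on real data. In the examples every value is an integer, so both give the same result, but once a row mixes strings and numbers, a Series holds them all as `object`, and so does the frame built from it. A minimal sketch, using made-up values and only two of the columns:

```python
import pandas as pd

# Made-up values: a date string and a float in the same row.
ser1 = pd.Series({"ODATE": "2020-01-31", "DE": 100.0})
ser2 = pd.Series({"ODATE": "2020-02-29", "DE": 80.0})

# Series route: each Series is object-typed, so every column ends up object.
from_series = pd.concat([ser1, ser2], axis=1).T

# DataFrame route: each column keeps its own dtype.
from_frames = pd.concat(
    [
        pd.DataFrame({"ODATE": ["2020-01-31"], "DE": [100.0]}),
        pd.DataFrame({"ODATE": ["2020-02-29"], "DE": [80.0]}),
    ],
    ignore_index=True,
)

print(from_series.dtypes)
print(from_frames.dtypes)
```

If numeric columns come out as `object`, they can be converted afterwards, for example with `pd.to_numeric` on the affected columns.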
You could try a regular database, but keep the data in RAM.
Index it by client.
Search for "sqlite in memory".
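A rough sketch of what that could look like with the standard `sqlite3` module and pandas. The table name, the `client_id` column and the sample values are assumptions for illustration; only `odate` and `ball` appear in the code from the question, and the monthly aggregation is just one possible query.

```python
import sqlite3

import pandas as pd

# Hypothetical sample data; client_id is an assumed column name.
transactions = pd.DataFrame({
    "client_id": [1, 1, 2, 2, 2],
    "odate": ["2020-01-05", "2020-02-10", "2020-01-07", "2020-01-20", "2020-03-01"],
    "ball": [100.0, 80.0, 50.0, 40.0, 10.0],
})

# ":memory:" keeps the whole database in RAM.
conn = sqlite3.connect(":memory:")
transactions.to_sql("transactions", conn, index=False)

# Index by client so per-client lookups do not scan the whole table.
conn.execute("CREATE INDEX idx_client ON transactions (client_id)")

# Per-client, per-month summary computed inside the database.
monthly = pd.read_sql_query(
    """
    SELECT client_id,
           strftime('%Y-%m', odate) AS month,
           COUNT(*) AS n_transactions,
           SUM(ball) AS total_ball
    FROM transactions
    GROUP BY client_id, month
    ORDER BY client_id, month
    """,
    conn,
)
conn.close()
print(monthly)
```

Whether this beats staying in pandas depends on how much of the per-client logic can be expressed in SQL, so it is worth timing on a sample first.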
The only optimization that comes to mind right now is to drop the concatenation and instead collect the rows in a list, like this:
array_data = []
# ... the program code goes here ...
array_data.append([row['odate'], tenor(row['ltt']), row['vdate'], bucket_class_old, buckclass(row['odues']), row['ball'], 0, ball_old, 0])
array_data.append([row['odate'], tenor(row['ltt']), row['vdate'], bucket_class_old, 100, difference, 0, ball_old, 0])
# ... the rest of the program code goes here ...
# Build the DataFrame once, after the loop
datareport = pd.DataFrame(array_data, columns=['ODATE', 'TENOR', 'VDATE', 'R', 'RC', 'DE', 'I', 'DEB', 'D'])
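Here is a self-contained sketch of the same idea that can be run as is. The bodies of `tenor` and `buckclass` are not shown in the thread, so the versions below are hypothetical stand-ins, as are the sample data and the way `bucket_class_old` and `ball_old` are carried from row to row.

```python
import pandas as pd


# Hypothetical stand-ins for the asker's helpers; the real logic is unknown.
def tenor(ltt):
    return "short" if ltt <= 12 else "long"


def buckclass(odues):
    return 0 if odues == 0 else 1


# Made-up source rows using the column names from the question.
source = pd.DataFrame({
    "odate": ["2020-01-31", "2020-02-29"],
    "ltt": [6, 24],
    "vdate": ["2020-07-31", "2022-02-28"],
    "odues": [0, 45],
    "ball": [1000.0, 750.0],
})

columns = ["ODATE", "TENOR", "VDATE", "R", "RC", "DE", "I", "DEB", "D"]
rows = []
bucket_class_old, ball_old = 0, 0.0  # assumed starting state

# itertuples is usually much faster than iterrows for plain row access.
for row in source.itertuples(index=False):
    rows.append((row.odate, tenor(row.ltt), row.vdate, bucket_class_old,
                 buckclass(row.odues), row.ball, 0, ball_old, 0))
    # Assumed update rule: remember this row's values for the next one.
    bucket_class_old, ball_old = buckclass(row.odues), row.ball

# One DataFrame construction at the end instead of a concat per row.
datareport = pd.DataFrame(rows, columns=columns)
print(datareport)
```

Appending to a Python list is cheap, whereas each `pd.concat` inside a loop copies everything accumulated so far, which is why the loop with concatenation slows down as the result grows.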