Merging/joining tables (Python, pandas library)?
Please tell me how to properly merge these two DataFrames:
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': ['B0', 'B1', 'B2', 'B3'],
                    'C': ['C0', 'C1', 'C2', 'C3'],
                    'D': ['D0', 'D1', 'D2', 'D3']},
                   index=[0, 1, 2, 3])
df2 = pd.DataFrame({'A': ['A4', 'A5', 'A6', 'A7'],
                    'B': ['B4', 'B1', 'B2', 'B7'],
                    'C': ['C4', 'C1', 'C2', 'C7'],
                    'D': ['D4', 'D5', 'D6', 'D7']},
                   index=[0, 1, 2, 3])
You can do it head-on, "declaratively":
a) df_{a,b,c,d} = join of df2 with df1 over each of {a,b,c,d}
b) in each df_{a, ...}, project onto the required attributes from df2
c) take the indexes that did not end up in any df_{a,b,c,d} and put those rows in rest
d) concat(df_{a,b,c,d}, rest)
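Steps a) to d) could be sketched in pandas roughly like this. The question does not say what the merged result should look like, so this is only one possible reading: one inner join per column, keeping df2's rows and columns, with `rest` holding the df2 rows that no join matched. The names `columns`, `joined`, `parts`, `rest` and `result` are made up for the illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': ['B0', 'B1', 'B2', 'B3'],
                    'C': ['C0', 'C1', 'C2', 'C3'],
                    'D': ['D0', 'D1', 'D2', 'D3']},
                   index=[0, 1, 2, 3])
df2 = pd.DataFrame({'A': ['A4', 'A5', 'A6', 'A7'],
                    'B': ['B4', 'B1', 'B2', 'B7'],
                    'C': ['C4', 'C1', 'C2', 'C7'],
                    'D': ['D4', 'D5', 'D6', 'D7']},
                   index=[0, 1, 2, 3])

columns = ['A', 'B', 'C', 'D']

# Steps a) and b): one inner join per column. Only the join key is taken
# from df1, so the result keeps just df2's attributes (assumed projection).
joined = {}
for col in columns:
    keys = df1[[col]].drop_duplicates()
    matched = df2.reset_index().merge(keys, on=col, how='inner')
    # Restore df2's original index, which merge() would otherwise discard.
    joined[col] = matched.set_index('index').rename_axis(None)

# Step c): df2 rows whose index did not appear in any of the joins.
matched_idx = set().union(*(part.index for part in joined.values()))
rest = df2.loc[~df2.index.isin(list(matched_idx))]

# Step d): concatenate the non-empty joins with the leftover rows.
parts = [part for part in joined.values() if not part.empty]
result = pd.concat(parts + [rest])
```

On the sample data only B and C produce matches (rows 1 and 2 of df2), so those rows appear twice in `result`. If that is not wanted, something like `result[~result.index.duplicated()]` would keep the first copy of each.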
Or, more simply, imperatively:
loop over two iterators (DataFrame.iterrows()) and walk through both datasets.
Which will be easier to implement and faster to run? To be honest, it's not very obvious, and it may depend on the data, so you need to try both.
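A minimal sketch of the imperative variant, assuming the goal is to find which rows of df2 share a value with a row of df1 in the same column (the question does not spell this out). The `matches` list and its (df2 index, df1 index, column) layout are invented for the example.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': ['B0', 'B1', 'B2', 'B3'],
                    'C': ['C0', 'C1', 'C2', 'C3'],
                    'D': ['D0', 'D1', 'D2', 'D3']},
                   index=[0, 1, 2, 3])
df2 = pd.DataFrame({'A': ['A4', 'A5', 'A6', 'A7'],
                    'B': ['B4', 'B1', 'B2', 'B7'],
                    'C': ['C4', 'C1', 'C2', 'C7'],
                    'D': ['D4', 'D5', 'D6', 'D7']},
                   index=[0, 1, 2, 3])

# Walk through both datasets row by row and record every column
# in which a df2 row carries the same value as a df1 row.
matches = []  # (df2 index, df1 index, column) triples
for idx2, row2 in df2.iterrows():
    for idx1, row1 in df1.iterrows():
        for col in df2.columns:
            if row2[col] == row1[col]:
                matches.append((idx2, idx1, col))

# df2 rows that matched nothing, analogous to "rest" in the declarative steps.
matched_idx = {idx2 for idx2, _, _ in matches}
rest = df2.loc[[i for i in df2.index if i not in matched_idx]]
```

The nested loops compare every row of df2 with every row of df1, so the work grows with the product of the two row counts.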