Python
Headballz, 2020-08-05 19:45:48

How to merge rows that match in some columns and combine the differing values into one row?

I have a dataframe in which the values in the first two columns can repeat.
I need to merge rows whose first two columns match into a single row and write the differing values of the third column into one cell, separated by commas, for example.
Current data (columns: Маршрут = route, ТС = vehicle, Водитель = driver):

       Маршрут    ТС Водитель
0   route_name   car     john
1   route_name   car    boris
2  route_name2  car2    boris


Desired result:

       Маршрут    ТС    Водитель
0   route_name   car  john,boris
1  route_name2  car2       boris


1 answer(s)
Alexey Cheremisin, 2020-08-05
@Headballz

Build a dict where the key is a tuple of the first two columns and the value is a set of the third-column values (the drivers). dict.get((route, car), set()) works well here: it returns an empty set the first time a key is seen.
Something like this:

dataset = [
    ["route1","car1","alex"],
    ["route1","car1","boris"],
    ["route2","car1","alex"],
    ["route2","car1","boris"],
    ["route1","car1","john"],
    ["route3","car2","alex"],
    ["route1","car2","alex"],
    ["route1","car3","alex"],
    ["route1","car2","alex"],
    ["route1","car2","alex"],
    ["route3","car1","alex"],
]

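# Maps (route, car) -> set of drivers seen for that pair.
outdataset = {}

for route, car, driver in dataset:
    key = (route, car)
    # get() falls back to a new empty set the first time a key is seen.
    _d = outdataset.get(key, set())
    _d.add(driver)
    outdataset[key] = _d

# One line per (route, car) pair, drivers joined with commas.
for (route, car), drivers in outdataset.items():
    print(route, car, ", ".join(drivers))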

Output (a set is unordered, so the order of drivers within a line may vary):

route1 car1 john, alex, boris
route2 car1 alex, boris
route3 car2 alex
route1 car2 alex
route1 car3 alex
route3 car1 alex
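
If the order of the drivers matters (the question expects john,boris, in that order), a set will not guarantee it. A minimal sketch of one possible variant, assuming Python 3.7+ where a plain dict keeps insertion order: the inner dict is used as an ordered set. The names rows, merge_drivers and merged are made up for illustration, and the sample rows are hypothetical data in the shape shown in the question.

```python
# Hypothetical sample rows in (route, car, driver) form, as in the question.
rows = [
    ("route_name", "car", "john"),
    ("route_name", "car", "boris"),
    ("route_name2", "car2", "boris"),
    ("route_name", "car", "john"),  # repeated driver, should appear only once
]


def merge_drivers(rows):
    """Group drivers by (route, car), keeping first-seen order, no duplicates."""
    grouped = {}
    for route, car, driver in rows:
        # The inner dict acts as an ordered set: its keys keep insertion order.
        grouped.setdefault((route, car), {})[driver] = None
    return [
        (route, car, ",".join(drivers))
        for (route, car), drivers in grouped.items()
    ]


merged = merge_drivers(rows)
for route, car, drivers in merged:
    print(route, car, drivers)
# route_name car john,boris
# route_name2 car2 boris
```

If the data is already in a pandas DataFrame, grouping on the first two columns and joining the third, along the lines of df.groupby(["Маршрут", "ТС"])["Водитель"].agg(",".join), should give the same result without leaving pandas.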
