How to split one column into two in a dataframe?
I'm working in a Zeppelin notebook and have this dataframe:
splited_genres_df.show(20)
+-------+--------------------+---------+
|movieId|               title|   genres|
+-------+--------------------+---------+
|      1|    Toy Story (1995)|Adventure|
|      1|    Toy Story (1995)|Animation|
|      1|    Toy Story (1995)| Children|
|      1|    Toy Story (1995)|   Comedy|
|      1|    Toy Story (1995)|  Fantasy|
|      2|      Jumanji (1995)|Adventure|
|      2|      Jumanji (1995)| Children|
|      2|      Jumanji (1995)|  Fantasy|
|      3|Grumpier Old Men ...|   Comedy|
|      3|Grumpier Old Men ...|  Romance|
|      4|Waiting to Exhale...|   Comedy|
|      4|Waiting to Exhale...|    Drama|
|      4|Waiting to Exhale...|  Romance|
|      5|Father of the Bri...|   Comedy|
|      6|         Heat (1995)|   Action|
|      6|         Heat (1995)|    Crime|
|      6|         Heat (1995)| Thriller|
|      7|      Sabrina (1995)|   Comedy|
|      7|      Sabrina (1995)|  Romance|
|      8| Tom and Huck (1995)|Adventure|
+-------+--------------------+---------+
only showing top 20 rows
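# pandas: copy the digits inside the parentheses into a new 'year' column
splited_genres_df['year'] = splited_genres_df['title'].str.extract(r'\((\d+)\)', expand=False)
# keep only the text before "(year)" and drop the trailing space
splited_genres_df['title'] = splited_genres_df['title'].str.extract(r'(.+)\(\d+\)', expand=False).str.strip()
splited_genres_df.head()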
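If it helps, here is a small self-contained sketch of the same idea that can be run on its own. It uses one regular expression with two capture groups instead of two separate calls. The frame name `df`, the three sample rows (taken from the table in the question, with full titles) and the exact pattern are my assumptions for illustration. Also note that the `.str` accessor is pandas only. The table in the question looks like the output of Spark's `.show()`, so for a Spark DataFrame you would probably need `toPandas()` first, or `pyspark.sql.functions.regexp_extract` with the same pattern.

```python
import pandas as pd

# Sample rows based on the question's table (assumed full titles).
df = pd.DataFrame({
    "movieId": [1, 2, 6],
    "title": ["Toy Story (1995)", "Jumanji (1995)", "Heat (1995)"],
    "genres": ["Adventure", "Adventure", "Action"],
})

# One pattern, two capture groups:
#   group 0 -> everything before the year (lazy, trailing spaces excluded)
#   group 1 -> the digits inside the final pair of parentheses
# Anchoring with $ makes the match use the last "(digits)" in the title.
parts = df["title"].str.extract(r"^(.*?)\s*\((\d+)\)\s*$")

df["title"] = parts[0]
df["year"] = parts[1]  # still strings; cast to a numeric type if needed

print(df)
```

Rows whose title does not end with a number in parentheses would get NaN in both columns, so check for missing values before casting `year` to an integer type.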