How to randomly shuffle column data in a CSV file?
Task: randomly shuffle the values within each column of a CSV table. What are the ways to do this?
In more detail: there is a large array of parsed data. Say the first column holds first names, the second surnames, the third the street of residence, the fourth the house number, and the fifth the phone number. I need to shuffle the contents of the columns each time to get unique generated "identities".
The columns are large, 10,000 rows each. The sample is taken from the first 5 thousand lines. But those are just details.
1. Open the file in Excel.
2. Add a helper column and fill it with random numbers (for example with =RAND()).
3. Select the column you want to shuffle together with the helper column, and sort by the helper column.
4. Repeat for each column of the table (if each column needs to be shuffled independently).
5. Delete the helper column.
6. Profit.
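The same random-key idea from the steps above can be sketched outside Excel. This is only a minimal Python illustration, assuming one column has already been loaded into a list; the function name shuffle_by_random_key and the sample names are hypothetical, not part of the original answer:

```python
import random

def shuffle_by_random_key(values, rng=random):
    # Pair every value with a helper random number
    keyed = [(rng.random(), v) for v in values]
    # Sort by the helper number only, as the spreadsheet sort would
    keyed.sort(key=lambda pair: pair[0])
    # Drop the helper numbers and keep the reordered values
    return [v for _, v in keyed]

names = ["Anna", "Boris", "Clara", "Dmitri"]  # hypothetical sample data
shuffled_names = shuffle_by_random_key(names)
```

In plain Python, random.shuffle does the same job in one call; the sketch only mirrors the manual steps.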
In general, the point of the exercise is not clear. If all the data gets mixed up anyway, why not just fill the table with random values?
Without programming:
1) Install TextPad: https://www.textpad.com/
2) Press F9 (Sort). You can use all three sort keys at the same time and change their lengths a few times to vary the result.
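With programming (Python):
# -*- coding: utf-8 -*-
# Note: this shuffles whole lines (rows), including a header line if there is one;
# it does not shuffle each column separately.
import random

filename = 'c:/filename.csv'

# Read every line of the file into memory
with open(filename) as f:
    lines = f.readlines()

# Shuffle the lines in place, then overwrite the same file
random.shuffle(lines)
with open(filename, 'w') as f:
    f.writelines(lines)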
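The snippet above reorders whole lines, so a name stays attached to its own surname and phone number. For the task in the question, each column has to be shuffled on its own. A minimal sketch using only the standard library, assuming the file has a header row and every row has the same number of fields; the function name shuffle_columns and the inline sample data are hypothetical:

```python
import csv
import io
import random

def shuffle_columns(rows, rng=random):
    """Shuffle every column independently; rows[0] is kept as the header."""
    header, body = rows[0], rows[1:]
    # Transpose rows into columns so each column can be shuffled separately
    columns = [list(col) for col in zip(*body)]
    for col in columns:
        rng.shuffle(col)
    # Transpose back into rows
    return [header] + [list(row) for row in zip(*columns)]

# Hypothetical inline sample standing in for the real CSV file
sample = "name,surname,phone\nAnna,Ivanova,111\nBoris,Petrov,222\nClara,Sidorova,333\n"
rows = list(csv.reader(io.StringIO(sample)))
shuffled = shuffle_columns(rows)

# Write the result back out as CSV text
out = io.StringIO()
csv.writer(out).writerows(shuffled)
```

To work only with the first 5 thousand lines mentioned in the question, something like shuffle_columns(rows[:5001]) should do (header plus 5,000 data rows). Passing random.Random(some_seed) as rng makes a run repeatable.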
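import numpy as np
import pandas as pd

def shuffle(df):
    # Work on a copy so the caller's DataFrame is left untouched
    df = df.copy()
    for col in df.columns:
        # Permute each column independently of the others
        df[col] = np.random.permutation(df[col].to_numpy())
    return df

df = pd.read_csv('your_file.csv')
df = shuffle(df)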