Database
Viktor, 2016-02-08 23:52:30

How to randomly shuffle column data in csv file?

Task: randomly shuffle the values within each column of a CSV table, independently of the other columns. What approaches are there?
Details: for example, there is a large table of parsed data. Say the first column holds first names, the second surnames, the third the street of residence, the fourth the house number, and the fifth the phone number. I need to shuffle the contents of each column every time so that I get unique generated "identities".
The columns are large, 10,000 rows each. The sample is taken from the first 5,000 rows. Those are the details.

3 answer(s)
maaGames, 2016-02-09
@maaGames

1. Open in Excel
2. Add a row and fill it with random numbers
3. Sort the columns by the value of this row with random numbers.
4. Repeat for each row of the table (if each row needs to be sorted individually)
5. Remove the row with random numbers
6. Profit.
That said, the point of the exercise is not clear to me. If all the data ends up mixed anyway, why not simply fill the table with random values?
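
The Excel recipe above boils down to attaching a random number to every cell and sorting by it. A minimal Python sketch of the same idea, assuming the values to shuffle are already in a list (the function name `shuffle_by_random_key` and the sample names are made up for illustration):

```python
import random

def shuffle_by_random_key(values, rng=random):
    """Shuffle a list by pairing each item with a random number and sorting on it."""
    # Step 2 of the recipe: one random number per cell.
    keyed = [(rng.random(), value) for value in values]
    # Step 3: sort by the random number only.
    keyed.sort(key=lambda pair: pair[0])
    # Step 5: drop the helper numbers again.
    return [value for _, value in keyed]

names = ["Anna", "Boris", "Clara", "Dmitri"]  # hypothetical sample data
print(shuffle_by_random_key(names))
```

In practice `random.shuffle` does the same job in one call; the sketch only mirrors the manual steps.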

Dimonchik, 2016-02-09
@dimonchik2013

Without programming:
1) https://www.textpad.com/
2) Press F9 (Sort). You can use all three sort keys at once and change their lengths, repeating the sort several times.
With programming:

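# -*- coding: utf-8 -*-
import random

filename = 'c:/filename.csv'

# Read all lines into memory (fine for about 10,000 rows).
with open(filename) as f:
    lines = f.readlines()

# Shuffle whole lines in place: the row order changes,
# but the fields of each row stay together.
random.shuffle(lines)

# Overwrite the original file with the shuffled lines.
with open(filename, 'w') as f:
    f.writelines(lines)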
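
Note that the script above shuffles whole lines, so each name stays attached to its surname and phone. To shuffle every column independently, as the question asks, something like the following sketch could work. It assumes Python 3, a comma-separated file with a header row, rows of equal length, and data that fits in memory; `shuffle_columns`, `shuffle_csv`, and the `limit` parameter are illustrative names, not part of any library.

```python
import csv
import random

def shuffle_columns(rows, rng=random):
    """Return new rows in which every column has been shuffled independently."""
    # Transpose rows into columns (assumes every row has the same number of fields).
    columns = [list(col) for col in zip(*rows)]
    for col in columns:
        rng.shuffle(col)  # in-place shuffle of one column
    # Transpose back into rows.
    return [list(row) for row in zip(*columns)]

def shuffle_csv(src_path, dst_path, limit=None):
    """Read src_path, shuffle each column, write to dst_path; the header stays first."""
    with open(src_path, newline='') as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = list(reader)
    if limit is not None:
        rows = rows[:limit]  # e.g. keep only the first 5000 data rows
    with open(dst_path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(shuffle_columns(rows))
```

A call such as `shuffle_csv('in.csv', 'out.csv', limit=5000)` (placeholder paths) would build the shuffled "identities" from the first 5,000 rows only.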

Vladimir Olohtonov, 2016-02-09
@sgjurano

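import numpy as np
import pandas

def shuffle(df, n=1, axis=0):
    """Return a copy of df with each column (axis=0) or each row (axis=1) shuffled independently."""
    df = df.copy()
    for _ in range(n):
        # np.random.permutation returns a shuffled copy, so the result must be
        # assigned back; an in-place np.random.shuffle inside apply is not reliable.
        df = df.apply(
            lambda s: pandas.Series(np.random.permutation(s.values), index=s.index),
            axis=axis,
        )
    return df

df = pandas.read_csv('your_file.csv')
shuffled = shuffle(df)  # keep the returned frame; df itself is left unchanged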

From here: stackoverflow.com/a/15772356
