Python
Ilya Garbazhiy, 2020-05-12 09:26:15

Python: how to split a text file into multiple files by line length?

There is a huge text file called words_alpha.txt, in the following format:

Word1
Word2
Word3
etc.

It needs to be split into several files according to the length of each line (word): one-letter words go into the file words1, two-letter words into words2, and so on. The number of resulting files is not known in advance. Note that the words in the source file are not sorted by length, and words of the same length are not grouped together.


3 answer(s)
Vladimir Kuts, 2020-05-12
@WaterWalker

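f_handlers = {}  # maps output filename -> open file object
try:
    with open('words_alpha.txt', 'r') as inp_file:
        for line in inp_file:
            word = line.strip()
            if not word:   # skip zero-length words (blank lines)
                continue
            fn = f'words_{len(word)}.txt'
            # Open each output file only once: setdefault(fn, open(fn, 'w+'))
            # would re-open and truncate the file on every line.
            if fn not in f_handlers:
                f_handlers[fn] = open(fn, 'w')
            f_handlers[fn].write(word + '\n')
finally:
    # Close every output file, even if reading failed partway through
    for handler in f_handlers.values():
        handler.close()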

This creates files in the current folder with names of the form
words_<number>.txt

Sergey Tikhonov, 2020-05-12
@tumbler

Well, read the file line by line and write each line to a different file depending on its length.
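A minimal sketch of that read-and-dispatch idea, in case it helps. The function name split_by_length, the dest_dir parameter, and the UTF-8 encoding are my own assumptions, not something from the question; the output names follow the words_<number>.txt pattern used in the other answer.

```python
from contextlib import ExitStack
from pathlib import Path


def split_by_length(src, dest_dir="."):
    """Write each non-empty line of src to dest_dir/words_<length>.txt.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    outputs = {}  # word length -> open output file
    # ExitStack closes every registered output file on exit,
    # even if an error occurs while reading.
    with ExitStack() as stack, open(src, encoding="utf-8") as inp:
        for line in inp:
            word = line.strip()
            if not word:  # skip blank lines
                continue
            n = len(word)
            if n not in outputs:
                # Open each output file exactly once, on first use
                path = dest / f"words_{n}.txt"
                outputs[n] = stack.enter_context(
                    open(path, "w", encoding="utf-8")
                )
            outputs[n].write(word + "\n")
    return sorted(outputs)  # the word lengths that were found
```

Calling split_by_length('words_alpha.txt') should stream the file once, so it does not need to fit in memory. Only one file handle per distinct word length stays open, which should be a small number for a word list.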

Sergey Pankov, 2020-05-12
@trapwalker

py "('x'*random.randint(1, 10) for _ in range(100))" | \
py -x "open(f'tmp/path_to_dest_folder/words_len={len(x):05}.txt', 'a').write(x+'\n')"

The first line generates test words; in your case it would be cat big_file.txt instead.
The second line sorts the words into files by length.
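For anyone without the py command installed, here is a rough plain-Python sketch of what the second line of the pipeline does. The function name sort_words is hypothetical; the zero-padded file naming is copied from the one-liner. Unlike the one-liner, this version skips blank lines, which is my own assumption about what is wanted.

```python
from pathlib import Path


def sort_words(lines, dest_dir):
    """Append each non-empty word to a file named after its length.

    Hypothetical helper mirroring the shell one-liner; not a library API.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for line in lines:
        x = line.strip()
        if not x:  # assumption: blank lines are not wanted
            continue
        # Same zero-padded naming as the one-liner, e.g. words_len=00003.txt
        path = dest / f"words_len={len(x):05}.txt"
        # Append mode never truncates, so re-opening per word is safe,
        # though slower than keeping the files open.
        with open(path, "a", encoding="utf-8") as out:
            out.write(x + "\n")


# Example call (commented out so nothing runs on import):
# import sys
# sort_words(sys.stdin, "tmp/path_to_dest_folder")
```

Because the files are opened in append mode, running this twice on the same input writes every word twice, so the destination folder should be emptied before a re-run.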
