Mapping two lists and outputting blocks based on the number of values in one of them?

Python

DarkWood, 2017-12-13 13:45:04

Hello.
I suspect this is easier to solve than I am making it, but I am stuck on one approach and cannot see another way out. I hope you can suggest one.
There is an array of data presented as plain text. It holds values for several entities at once, and the entity names are kept in a separate list. The values for each entity come in blocks separated by a blank line, as in the example in the code below. The number of values per entity can vary.

dataset_names = ["moscow", "new-york"]

text = """2008	11 186
2009	11 281
2011	11 776
2012	11 856

2011	11 776
2012	11 856"""

def chunks(l, n=2):
    """Split the list (l) into pieces of size n."""
    for i in range(0, len(l), n):
        yield l[i:i + n]

prepare_text = text.replace(' ','').replace('\n','\t').split('\t\t')

data = [list(chunks(item.split('\t'))) for item in prepare_text]

What I'm doing now: removing the spaces (the destination does not accept them), splitting the text into blocks, and dividing each block into pairs.
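For reference, here is a minimal sketch of what those three steps produce: a nested list with one entry per block, each holding [year, value] pairs. The sample text is the one from the question, with the tabs written as \t escapes (an assumption that the separators are real tab characters) so they survive copy-paste.

```python
# Same sample as in the question; tabs and newlines written as escapes.
text = ("2008\t11 186\n2009\t11 281\n2011\t11 776\n2012\t11 856\n"
        "\n"
        "2011\t11 776\n2012\t11 856")

def chunks(l, n=2):
    """Split the list l into pieces of size n."""
    for i in range(0, len(l), n):
        yield l[i:i + n]

# Drop spaces, then turn newlines into tabs so a blank line becomes a double tab.
prepare_text = text.replace(' ', '').replace('\n', '\t').split('\t\t')

# One list of [value_1, value_2] pairs per block.
data = [list(chunks(item.split('\t'))) for item in prepare_text]

print(len(data))  # 2
print(data[1])    # [['2011', '11776'], ['2012', '11856']]
```

So data[0] belongs to the first name in dataset_names and data[1] to the second; the blocks and the names line up by position.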
And then, as I said, my brain goes in circles and keeps arriving at the same solution, which does not give the desired result. Matching the two lists and attaching the dataset names to the chunks does not seem hard, but how do I then pull the data out in the required format?
Ultimately I want something like this:
moscow
value_1 = 2008
value_2 = 11186
value_1 = 2009
value_2 = 11281
etc
new-york
value_1
value_2
value_1
value_2
etc
Simply put, for each entity in one list I want to display all the values corresponding to it from the other list, additionally split into pairs.
Please point in the right direction.

1 answer(s)

gill-sama, 2017-12-13

import re

dataset_names = ["moscow", "new-york"]

text = """2008	11 186
2009	11 281
2011	11 776
2012	11 856
2011	11 776

2012	11 856"""



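# Split the text into blocks on blank lines, one block per dataset name.
blocks = re.split(r'\n\s*\n', text.strip())

# Pair each name with its own block; each row becomes {'value_1': ..., 'value_2': ...}.
out = {
    name: [
        {'value_{}'.format(i + 1): v.replace(' ', '') for i, v in enumerate(row.split('\t'))}
        for row in block.splitlines()
    ]
    for name, block in zip(dataset_names, blocks)
}
print(out)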

Something like this?
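To get from a dict of that shape to the layout asked for in the question, two nested loops should be enough. This is only a sketch: the sample dict is hard-coded here with a few rows from the question, and render is a hypothetical helper name, not something from the code above.

```python
# Hypothetical sample shaped like `out`: name -> list of {'value_1', 'value_2'} dicts.
out = {
    "moscow": [{"value_1": "2008", "value_2": "11186"},
               {"value_1": "2009", "value_2": "11281"}],
    "new-york": [{"value_1": "2011", "value_2": "11776"}],
}

def render(result):
    """Return the report as a list of lines: a name, then 'key = value' per field."""
    lines = []
    for name, rows in result.items():
        lines.append(name)
        for row in rows:
            for key, value in row.items():
                lines.append("{} = {}".format(key, value))
    return lines

report = render(out)
print("\n".join(report))
```

The order of names and fields relies on dicts keeping insertion order, which is guaranteed from Python 3.7 on. On older versions, iterating over dataset_names instead of result.items() would keep the names in order.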