How to generate nested JSON from a structure?
I am reading a large binary file like this:
import struct

filepath = 'hits.dat'
with open(filepath, 'rb') as fp:
    while True:
        bytes = fp.read(28)
        if not bytes or len(bytes) != 28:
            break
        event_id, track_id, x, y, z = struct.unpack(">HHddd", bytes)
One unpacked record looks like this:

event_id: 256
track_id: 0
x: -7.942855253805373e-275
y: 6.303619193466582e-17
z: 8.500503648212859e-45
I want to group these records into nested JSON like this:

events = {
    'event_id': 1,
    'tracks': [
        {
            'track_id': 1,
            'coordinates': [
                {'x': 1, 'y': 2, 'z': 3},
                {'x': 4, 'y': 5, 'z': 6}
            ]
        }
    ]
}
If you really have a lot of events, I recommend not putting them all into one JSON document but using the JSON Lines format instead: jsonlines.org
It is a file of newline-separated lines, each containing one JSON object.
This format does not require holding the entire dataset in memory at once; it can be written and read as a stream.
However, the question does not say that all tracks of one event are stored consecutively, or that all coordinates of one track are consecutive, so you cannot count on that.
Clarify the problem and the solution can be made memory-efficient by switching to streaming output (see the sketch below).
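For illustration only, a minimal sketch of such a streaming variant, assuming the records in hits.dat are already grouped by event_id (which the question does not guarantee); the output name hits.jsonl and the helper read_records are my own:

import itertools
import json
import struct

RECORD = struct.Struct(">HHddd")  # 2 + 2 + 8 + 8 + 8 = 28 bytes per record

def read_records(filepath):
    # Yield (event_id, track_id, x, y, z) tuples one at a time.
    with open(filepath, 'rb') as fp:
        while True:
            buffer = fp.read(RECORD.size)
            if len(buffer) != RECORD.size:
                break
            yield RECORD.unpack(buffer)

with open('hits.jsonl', 'w') as out:
    # groupby only merges adjacent records, hence the "grouped by event_id" assumption
    for event_id, group in itertools.groupby(read_records('hits.dat'), key=lambda r: r[0]):
        tracks = {}
        for _, track_id, x, y, z in group:
            track = tracks.setdefault(track_id, dict(track_id=track_id, coordinates=[]))
            track['coordinates'].append(dict(x=x, y=y, z=z))
        event = dict(event_id=event_id, tracks=list(tracks.values()))
        out.write(json.dumps(event) + '\n')

Without that guarantee, the whole structure has to be built in memory first: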
import struct
import json

filepath = 'hits.dat'
events = {}

with open(filepath, 'rb') as fp:
    buffer = fp.read(28)
    while len(buffer) == 28:
        event_id, track_id, x, y, z = struct.unpack(">HHddd", buffer)
        buffer = fp.read(28)
        # create the event and track entries on first sight, then append the coordinates
        event = events.setdefault(event_id, dict(event_id=event_id, tracks={}))
        track = event['tracks'].setdefault(track_id, dict(track_id=track_id, coordinates=[]))
        track['coordinates'].append(dict(x=x, y=y, z=z))

with open('hits.json', 'w') as fp:
    # note: json.dump converts the integer event_id/track_id keys to strings
    json.dump(events, fp, indent=2)
After the loop, events has this shape, for example:

events = {
    1: {
        'event_id': 1,
        'tracks': {
            1: {
                'track_id': 1,
                'coordinates': [
                    {'x': 1, 'y': 2, 'z': 3},
                    {'x': 4, 'y': 5, 'z': 6}
                ]
            },
            2: {
                'track_id': 2,
                'coordinates': [
                    {'x': 12, 'y': 22, 'z': 33},
                    {'x': 44, 'y': 55, 'z': 66}
                ]
            }
        }
    }
}
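If you need exactly the list-based layout from the question (tracks as a list rather than a dict keyed by track_id), the structure can be converted before dumping; a minimal sketch, assuming the events dict built above:

# Hypothetical conversion step: turn the dicts of events/tracks into lists.
events_list = [
    dict(event_id=event['event_id'], tracks=list(event['tracks'].values()))
    for event in events.values()
]

with open('hits.json', 'w') as fp:
    json.dump(events_list, fp, indent=2)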