How to generate nested JSON from a structure?
I am reading a large binary file like this:
import struct

filepath = 'hits.dat'
with open(filepath, 'rb') as fp:
    while True:
        bytes = fp.read(28)
        if not bytes or len(bytes) != 28:
            break
        event_id, track_id, x, y, z = struct.unpack(">HHddd", bytes)
One unpacked record looks like this:

event_id: 256
track_id: 0
x: -7.942855253805373e-275
y: 6.303619193466582e-17
z: 8.500503648212859e-45
I want to group these records into nested JSON like this:

events = {
    'event_id': 1,
    'tracks': [
        {
            'track_id': 1,
            'coordinates': [
                {'x': 1, 'y': 2, 'z': 3},
                {'x': 4, 'y': 5, 'z': 6}
            ]
        }
    ]
}
If you really have a lot of events, I recommend not putting them all into one JSON document but using the JSON Lines format instead: jsonlines.org
It is a file of newline-separated lines, each containing one JSON object.
This format does not require holding the entire dataset in memory at once; it can be written and read as a stream.
However, the question does not say that all tracks of one event are stored consecutively, or that all coordinates of one track are consecutive, so you cannot count on that.
Clarify the problem and the solution can be made memory-efficient by switching to streaming output (see the sketch below).
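For illustration only, a minimal sketch of such a streaming variant, assuming the records in hits.dat are already grouped by event_id (which the question does not guarantee); the output name hits.jsonl and the helper read_records are my own:

import itertools
import json
import struct

RECORD = struct.Struct(">HHddd")  # 2 + 2 + 8 + 8 + 8 = 28 bytes per record

def read_records(filepath):
    # Yield (event_id, track_id, x, y, z) tuples one at a time.
    with open(filepath, 'rb') as fp:
        while True:
            buffer = fp.read(RECORD.size)
            if len(buffer) != RECORD.size:
                break
            yield RECORD.unpack(buffer)

with open('hits.jsonl', 'w') as out:
    # groupby only merges adjacent records, hence the "grouped by event_id" assumption
    for event_id, group in itertools.groupby(read_records('hits.dat'), key=lambda r: r[0]):
        tracks = {}
        for _, track_id, x, y, z in group:
            track = tracks.setdefault(track_id, dict(track_id=track_id, coordinates=[]))
            track['coordinates'].append(dict(x=x, y=y, z=z))
        event = dict(event_id=event_id, tracks=list(tracks.values()))
        out.write(json.dumps(event) + '\n')

Without that guarantee, the whole structure has to be built in memory first: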
import struct
import json

filepath = 'hits.dat'
events = {}

with open(filepath, 'rb') as fp:
    buffer = fp.read(28)
    while len(buffer) == 28:
        event_id, track_id, x, y, z = struct.unpack(">HHddd", buffer)
        buffer = fp.read(28)
        # create the event and track entries on first sight, then append the coordinates
        event = events.setdefault(event_id, dict(event_id=event_id, tracks={}))
        track = event['tracks'].setdefault(track_id, dict(track_id=track_id, coordinates=[]))
        track['coordinates'].append(dict(x=x, y=y, z=z))

with open('hits.json', 'w') as fp:
    # note: json.dump converts the integer event_id/track_id keys to strings
    json.dump(events, fp, indent=2)
After the loop, events has this shape, for example:

events = {
    1: {
        'event_id': 1,
        'tracks': {
            1: {
                'track_id': 1,
                'coordinates': [
                    {'x': 1, 'y': 2, 'z': 3},
                    {'x': 4, 'y': 5, 'z': 6}
                ]
            },
            2: {
                'track_id': 2,
                'coordinates': [
                    {'x': 12, 'y': 22, 'z': 33},
                    {'x': 44, 'y': 55, 'z': 66}
                ]
            }
        }
    }
}
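If you need exactly the list-based layout from the question (tracks as a list rather than a dict keyed by track_id), the structure can be converted before dumping; a minimal sketch, assuming the events dict built above:

# Hypothetical conversion step: turn the dicts of events/tracks into lists.
events_list = [
    dict(event_id=event['event_id'], tracks=list(event['tracks'].values()))
    for event in events.values()
]

with open('hits.json', 'w') as fp:
    json.dump(events_list, fp, indent=2)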