How to parse a string that is not quite valid JSON?
There is a string that looks like invalid JSON. The data is mostly written as key: value pairs, but sometimes there is only a key, or two keys and one value.
input_str="""name1: value1; name2: value2; name3; prefix: name4: value4;"""
output_dict={'name1': 'value1', 'name2': 'value2', 'name3':True, 'prefix name4': 'value4'}
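import json

input_str = """name1: value1; name2: value2; name3; prefix: name4: value4;"""
# Drop the trailing separator so the last entry is not followed by an empty one.
if input_str[-1] == ';':
    input_str = input_str[:-1]
# json.dumps adds the outer quotes; each entry is then rewritten by its colon count:
# 0 colons -> bare key, gets the value True; 1 -> kept as is; 2 -> first colon dropped.
god_str = '","'.join([{0: item + ': True', 1: item}.get(item.count(':'), item.replace(":", "", 1)) for item in json.dumps(input_str).split('; ')])
json_str = '{%s}' % god_str.replace(': ', '":"')
output_dict = json.loads(json_str)
# Note: name3 comes out as the string 'True', not the boolean True.
print(output_dict)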
{'prefix name4': 'value4', 'name1': 'value1', 'name2': 'value2', 'name3': 'True'}
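def tokenize(data):
    # Remove any leftover colon (the "prefix: name4" case) and surrounding whitespace.
    cleanup = lambda entry: entry.replace(':', '').strip()
    for entry in data.strip(';').split(';'):
        # Split on the last colon only, so everything before it becomes the key.
        entry = list(map(cleanup, entry.rsplit(':', 1)))
        if len(entry) == 1:
            # A bare key has no value; treat it as a flag.
            entry.append(True)
        yield entry

input = 'name1: value1; name2: value2; name3; prefix: name4: value4;'
print(dict(tokenize(input)))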
>>> import re
>>> dict(re.findall(r'\s*([\w\s:]+?)\s*(?::\s*([\w\s]*)\s*)?(?=[;$])', input))
{'prefix: name4': 'value4', 'name2': 'value2', 'name3': '', 'name1': 'value1'}
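The regex result above differs from the requested output in two places: the two-part key keeps its colon ('prefix: name4'), and the bare key gets an empty string instead of True. A minimal sketch of one way to normalize the matches follows. The helper name parse_pairs is hypothetical, and the sketch assumes that an empty value always means "bare key", so an entry written as "name3: ;" would also become True.

```python
import re

# Same pattern as in the answer above, written as a raw string.
PATTERN = r'\s*([\w\s:]+?)\s*(?::\s*([\w\s]*)\s*)?(?=[;$])'


def parse_pairs(text):
    """Hypothetical helper: turn the regex matches into the requested dict."""
    result = {}
    for key, value in re.findall(PATTERN, text):
        # Replace the colon inside a two-part key and collapse the whitespace.
        key = ' '.join(key.replace(':', ' ').split())
        value = value.strip()
        # Assumption: an empty value means the entry was a bare key.
        result[key] = value if value else True
    return result


print(parse_pairs('name1: value1; name2: value2; name3; prefix: name4: value4;'))
```

As in the original answer, the lookahead only matches before a semicolon (inside a character class, `$` is a literal dollar sign), so the input must end with ';' for the last entry to be picked up.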