Python
EvaSpence, 2021-09-06 15:36:42

How do I get int and float values instead of strings from JSON?

The code parses a CSV string from the log, converts it to JSON, and then loads the data into the database.

The question is how to make sure the data types are correct: right now everything is a String, while the values need to be int or float where appropriate.

JSON example

{
"FileName": "fio-example",
"terse_version_3": "3",
"fio_version": "fio-3.27-12-gd7a2",
"jobname": "oltp_read_uniform",
"read_runtime_ms": "120001",
"read_slat_min_us": "3",
"read_slat_max_us": "662",
"read_slat_mean_us": "7.105567",
"read_clat_pct03": "10.000000%=89",
"read_tlat_min_us": "50",
"read_lat_max_us": "23256",
"read_lat_mean_us": "114.506388",
"read_lat_dev_us": "60.999852",
"write_clat_min_us": "0",
"write_clat_max_us": "0",
"write_clat_mean_us": "0.000000",
"write_clat_dev_us": "0.000000",
"write_clat_pct01": "1.000000%=0",
"write_clat_pct18": "0%=0",
"write_clat_pct19": "0%=0",
"write_clat_pct20": "0%=0",
"write_tlat_min_us": "0",
"write_lat_max_us": "0",
"write_lat_mean_us": "0.000000",
"write_lat_dev_us": "0.000000",
"write_bw_min_kb": "0",
"cpu_user": "2.976875%",
"cpu_sys": "9.288333%",
"cpu_csw": "8295390",
"cpu_mjf": "0",
"cpu_minf": "381",
"iodepth_1": "100.0%",
"iodepth_2": "0.0%",
"lat_100us": "50.52%",
"lat_250us": "49.00%",
"lat_500us": "0.24%",
"lat_750us": "0.01%",
"lat_1000us": "0.19%"
}


Code excerpt:

MY_COL.PY ######################################################################

from sqlalchemy import create_engine, Column, Integer, String, DateTime, MetaData, Table
from migrate.changeset import *

def CreateTable(tblName, engine, cols):
    metadata = MetaData()

    # Every column is declared as String here, which is why all values
    # end up as strings in the database.
    data = Table(tblName, metadata,
                 Column('FileName', String))

    for coli in cols:
        col = Column(coli.replace("\n", ""), String)
        data.append_column(col)

    metadata.create_all(engine)
    return data
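Since every column above is created as `String`, the database can only ever hold strings. As an illustration (not part of the original question), here is a sketch of declaring proper SQLAlchemy types per column; the entries in the type map are assumptions based on the JSON sample and would need to cover your real columns:

```python
from sqlalchemy import Column, Float, Integer, MetaData, String, Table, create_engine

# Illustrative mapping of known column names to SQLAlchemy types;
# anything not listed falls back to String.
COLUMN_TYPES = {
    "read_runtime_ms": Integer,
    "read_slat_mean_us": Float,
    "cpu_csw": Integer,
}

def create_typed_table(tbl_name, engine, cols):
    metadata = MetaData()
    data = Table(tbl_name, metadata, Column('FileName', String))
    for coli in cols:
        name = coli.replace("\n", "")
        # Look up the declared type for this column, default to String.
        data.append_column(Column(name, COLUMN_TYPES.get(name, String)))
    metadata.create_all(engine)
    return data
```

With typed columns the database driver will reject or coerce bad values instead of silently storing everything as text.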

V1.PY ##########################################################################
def makeJSON(colnames, vals):
    r = vals.split(';')
    res = '{'
    for i, col in enumerate(colnames):
        try:
            res = res + '"' + col.name + '":"' + r[i] + '",'
        except IndexError:
            # Fewer values than columns: fill the rest with empty strings.
            res = res + '"' + col.name + '":"",'
    res = res[:-1] + '}'
    return res.replace("\n", "")

MAIN.PY ########################################################################
for pars_el in parsed_lst:
    try:
        js = v1.makeJSON(data.columns, tarshortname + pars_el)
        conn.execute(data.insert(), [json.loads(js)])
    except Exception as err:
        print(err)
        print('An error occurred')
        session.close()
        exit(-1)

session.commit()
session.close()
print('Files processed successfully')

1 answer(s)

Vindicar, 2021-09-06

When parsing the CSV, check which column the value belongs to and convert it to the appropriate type.
For example, using a dictionary:

transforms = {
    "write_bw_min_kb": int,
    "cpu_user": lambda s: float(s[:-1]),  # convert from "%" to a number
    # and so on
}
noop = lambda s: s  # for the fields that should stay strings

# then, while parsing the CSV:
transform = transforms.get(col.name, noop)  # pick the conversion function
value = transform(r[i])
# and use value from here on
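Putting that idea together, here is a minimal self-contained sketch: apply the transform dictionary to a parsed row before inserting it. The field names mirror the question's JSON sample; the particular set of transforms is an assumption to be extended for your real columns:

```python
# Illustrative transforms -- which fields are int/float/percent is an
# assumption based on the JSON sample in the question.
transforms = {
    "read_runtime_ms": int,
    "read_slat_mean_us": float,
    "cpu_user": lambda s: float(s.rstrip('%')),   # "2.976875%" -> 2.976875
    "iodepth_1": lambda s: float(s.rstrip('%')),
}
noop = lambda s: s  # leave unknown fields as strings

row = {
    "FileName": "fio-example",
    "read_runtime_ms": "120001",
    "read_slat_mean_us": "7.105567",
    "cpu_user": "2.976875%",
    "iodepth_1": "100.0%",
}

# Convert every value through its transform (or leave it as a string).
typed = {k: transforms.get(k, noop)(v) for k, v in row.items()}
print(typed)
```

After this, `typed` holds real `int` and `float` objects, so the database columns can be declared with matching types.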

But in general, building JSON by hand like this is baffling. It might make sense if you had gigabytes of data, yet you are shaping it via string concatenation, which is dreadfully inefficient and slow.
What is wrong with json.dumps()? Build the data structure the way you need it, then dump it.
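As a sketch of that suggestion, here is a `json.dumps()`-based replacement for the question's `makeJSON()`. The helper name `make_json` and the plain-string column names are illustrative; in the original code `colnames` holds SQLAlchemy columns, so you would use `col.name` instead of `col`:

```python
import json

def make_json(colnames, vals):
    # Split the CSV row and pad missing values with empty strings,
    # then let json.dumps() handle quoting and escaping.
    r = vals.replace("\n", "").split(';')
    row = {col: (r[i] if i < len(r) else "") for i, col in enumerate(colnames)}
    return json.dumps(row)

print(make_json(["a", "b", "c"], "1;2"))  # {"a": "1", "b": "2", "c": ""}
```

This also removes the need for the bare `except` and the trailing-comma trimming, and it cannot produce invalid JSON when a value contains quotes.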
