How to deal with different encodings in one JSON?
Good day, colleagues.
I ran into one problem that at first glance seemed funny to me, but now many hours have passed, and it's no longer funny.
The gist:
in the framework we are writing on our project (by the way, if anyone is interested in IoT, that's what we do: www.kaaproject.org/), Apache Avro is used to convert, generate, and validate schemas.
We have some "uuid" identifiers that are generated and written directly into these schemas to uniquely identify different records.
The problem is that Apache Avro writes these identifiers as binary data (byte[]) in Latin-1 encoding. So at the output we have schemas stored as JSON, which would be fine except that every "uuid" field looks like "uuid": "...some bin data in latin-1...". The result is a mixture of two encodings: UTF-8 and Latin-1.
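To make the mixture concrete, here is a minimal Java sketch of what happens to identifier bytes when they are turned into a Latin-1 string and that string is later written out as UTF-8. The class name `Latin1Mixing`, the helper `bytesToLatin1`, and the three sample bytes are made up for illustration; this is not Avro's actual code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class Latin1Mixing {
    // Hypothetical helper: map each byte to the char with the same
    // code point (U+0000..U+00FF), which is what Latin-1 decoding does.
    static String bytesToLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Made-up identifier bytes; two of them are outside the ASCII range.
        byte[] id = {(byte) 0x41, (byte) 0xE9, (byte) 0xFF};
        String asText = bytesToLatin1(id);

        // Encoding back as Latin-1 restores the original bytes.
        byte[] latin1 = asText.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(Arrays.equals(id, latin1)); // true

        // Encoding the same string as UTF-8 turns every char above
        // U+007F into two bytes, so the byte sequence changes.
        byte[] utf8 = asText.getBytes(StandardCharsets.UTF_8);
        System.out.println(utf8.length); // 5
    }
}
```

So the same "text" has two different byte representations, and which one ends up in the file depends on the charset used at the moment of writing.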
This leads to another problem: if we export a schema and then try to import it back, we need to parse this JSON to validate certain things. But the JSON is invalid, because the spec says JSON must be encoded exclusively in UTF-8.
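For reference, a strict check for whether a byte[] is well-formed UTF-8 can be sketched with the standard `CharsetDecoder`. The class name `Utf8Check` and the sample documents are hypothetical; the point is only that a raw Latin-1 byte above 0x7F inside a string is rejected by a strict decoder.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class Utf8Check {
    // Returns true only if the whole array decodes as UTF-8 without errors.
    static boolean isValidUtf8(byte[] data) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String doc = "{\"uuid\":\"\u00e9\"}"; // made-up document with one non-ASCII char

        // Written as Latin-1, the char becomes the single byte 0xE9,
        // which is not a complete UTF-8 sequence.
        System.out.println(isValidUtf8(doc.getBytes(StandardCharsets.ISO_8859_1))); // false

        // Written as UTF-8, the same document passes the check.
        System.out.println(isValidUtf8(doc.getBytes(StandardCharsets.UTF_8))); // true
    }
}
```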
Libraries that can parse JSON either fail with an exception saying the input is not UTF-8, or distort the Latin-1 bytes, which turns the identifier into a completely different one. My task is to take a byte[], parse it as JSON, walk the nodes, check some things, insert and delete, and return a byte[] again without distorting the existing Latin-1 identifiers in any way.
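One possible direction, sketched here under assumptions: since Latin-1 maps every byte value to exactly one char, decoding the whole byte[] as ISO-8859-1, editing the resulting String, and encoding it back as ISO-8859-1 leaves all untouched bytes as they were. The class `Latin1RoundTrip` and the method `editPreservingBytes` are hypothetical, and the `edit` function is a stand-in for "parse with a String-based JSON library, modify nodes, serialize". This only holds if that step emits non-ASCII chars as-is (no \u escaping) and introduces no chars above U+00FF.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.function.UnaryOperator;

final class Latin1RoundTrip {
    // Decode every byte to one char, apply the edit, encode every char
    // back to one byte. Chars above U+00FF would be replaced on encoding,
    // so the edit must not introduce any.
    static byte[] editPreservingBytes(byte[] json, UnaryOperator<String> edit) {
        String text = new String(json, StandardCharsets.ISO_8859_1);
        String edited = edit.apply(text);
        return edited.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Made-up input: an ASCII field plus a "uuid" holding raw bytes 0xE9 0xFF.
        byte[] input = "{\"name\":\"old\",\"uuid\":\"\u00e9\u00ff\"}"
                .getBytes(StandardCharsets.ISO_8859_1);

        // Placeholder edit; a real one would go through a JSON parser.
        byte[] output = editPreservingBytes(input,
                s -> s.replace("\"old\"", "\"new\""));

        byte[] expected = "{\"name\":\"new\",\"uuid\":\"\u00e9\u00ff\"}"
                .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(Arrays.equals(output, expected)); // true
    }
}
```

The trade-off is that any genuine UTF-8 text in the document shows up as garbled multi-char sequences while it is held as a String, even though its bytes come out unchanged at the end.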
I thought about going another way: writing a subclass of Avro's JsonDecoder (which is what creates the encoding problem), but its constructors turned out to be private or package-level. I thought about making a wrapper, but the class is used in such a way that I need some of its methods that I cannot override. And in some places I would need to implement some interfaces, which I'm sure would lead to a chain of dependencies ))
I would appreciate any ideas on how to solve this problem!
https://github.com/apache/avro
write in any format
What prevents you from changing the schema programmatically after it has been created?