How to convert XML to CSV in a functional style?
I'm learning an area that is new to me: functional programming. The principles are clear, and I have studied the basics of how it is done in Python, Erlang and Scala. But rewiring my thinking is going slowly. I'm too old for this bullshit © There are few examples online, and most of them are purely theoretical.
There is a simple real-world task: parse XML and generate CSV from its data. My gut tells me the functional approach fits well here, but I can't figure out exactly how to do it.
The XML structure is something like this:
<group name="">
  <entry name="">
    <item name="" attr1="" attr2="" />
    <item name="" attr1="" attr2="" />
    <item name="" attr1="" attr2="" />
  </entry>
</group>
xandox, I wanted something like this:
(ns su.gornostaev.rstatfr2csv
  (:require [clojure.java.io :refer [file writer]]
            [clojure.string :refer [index-of join split]]
            [clojure.xml :refer [parse]])
  (:import java.io.File))

;; The path to the input XML file is the first command-line argument.
(defn parse-cmd
  [args]
  (first args))

;; Parse the file into a tree of {:tag :attrs :content} maps; nil if no file was given.
(defn parse-file
  [file-name]
  (when file-name
    (parse (File. file-name))))

;; Reverse the dot-separated date parts and truncate the time to the hour.
(defn format-datetime
  [dt]
  (let [[date time] (split dt #"\s")]
    (str (join "." (reverse (split date #"\.")))
         " "
         (subs time 0 (index-of time \:))
         ":00")))

;; Group the counters by :dt, then sum :in and :out within each group.
(defn merge-counters
  [counters]
  (map
    (fn [[dt values]]
      (assoc (apply merge-with + values) :dt dt))
    (reduce
      (fn [acc {:keys [dt in out]}]
        (update acc dt (fnil conj []) {:in in :out out}))
      {}
      counters)))

;; Turn each counter element into {:dt :in :out} with numeric :in and :out.
(defn parse-counters
  [sensor]
  (map
    (fn [counter]
      (let [attrs (:attrs counter)]
        {:dt  (format-datetime (:datetime attrs))
         :in  (read-string (:realin attrs))
         :out (read-string (:realout attrs))}))
    sensor))

(defn parse-sensors
  [shop]
  (map
    (fn [sensor]
      {:name     (:name (:attrs sensor))
       :counters (merge-counters (parse-counters (:content sensor)))})
    shop))

(defn parse-shops
  [dom]
  (map
    (fn [shop]
      {:name    (:name (:attrs shop))
       :sensors (seq (parse-sensors (:content shop)))})
    dom))

;; The shops are the children of the root's first child element.
(defn parse-doc
  [doc]
  (parse-shops (->> doc :content first :content)))

;; Write one tab-separated line per shop, sensor and merged counter.
(defn save-csv
  [data]
  (with-open [w (writer (file "output.csv"))]
    (doseq [shop    data
            sensor  (:sensors shop)
            counter (:counters sensor)]
      (.write w (str (join "\t" (concat [(:name shop) (:name sensor)]
                                        (vals (into (sorted-map) counter))))
                     "\n")))))

(save-csv (parse-doc (parse-file (parse-cmd *command-line-args*))))
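For anyone more at home in Python (one of the languages mentioned in the question), here is a rough sketch of just the grouping step that merge-counters performs: fold the counters into per-hour totals with reduce, without mutating the accumulator. The names add_counter and merge_counters and the dict keys are my own choices for illustration, not a line-by-line translation of the Clojure above.

```python
from functools import reduce


def add_counter(totals, counter):
    """Return a new totals dict with this counter added to its 'dt' bucket."""
    prev = totals.get(counter["dt"], {"in": 0, "out": 0})
    summed = {
        "in": prev["in"] + counter["in"],
        "out": prev["out"] + counter["out"],
    }
    # Build a fresh dict instead of updating totals in place.
    return {**totals, counter["dt"]: summed}


def merge_counters(counters):
    """Group counters by 'dt' and sum 'in' and 'out' within each group."""
    totals = reduce(add_counter, counters, {})
    return [{"dt": dt, **sums} for dt, sums in totals.items()]
```

Copying the dict on every step keeps the fold pure at the cost of extra allocation; for large inputs a local mutable accumulator hidden inside the function may be the more practical compromise.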
preg_match... three of them, then loops that iterate over the matches.
But it's better to use a ready-made parser, for example SimpleXML.
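To illustrate the ready-made-parser route in a functional style, here is a minimal sketch in Python (chosen because the question mentions it; SimpleXML itself is a PHP library and is not used here). It assumes the group/entry/item structure from the question with a single group as the root element; the function names rows and to_csv and the column order are my own assumptions.

```python
import csv
import io
import xml.etree.ElementTree as ET


def rows(group):
    """Lazily produce one flat tuple per <item>, prefixed with group and entry names."""
    return (
        (
            group.get("name"),
            entry.get("name"),
            item.get("name"),
            item.get("attr1"),
            item.get("attr2"),
        )
        for entry in group.findall("entry")
        for item in entry.findall("item")
    )


def to_csv(xml_text):
    """Parse the whole document in memory and return its CSV text."""
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows(ET.fromstring(xml_text)))
    return out.getvalue()
```

The nested generator expression plays the same role as the nested map calls in the Clojure answer: each level only describes how to turn one node into rows, and nothing is written until the CSV writer consumes the sequence.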
Done properly, you just need to describe a lexer and a parser that read the XML in streaming mode and translate it into CSV. No intermediate storage of 100500 MB of data is needed.
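A rough sketch of the streaming idea in Python, assuming the group/entry/item structure from the question. Instead of a hand-written lexer and parser it swaps in the standard library's pull parser, xml.etree.ElementTree.iterparse; the names stream_rows and write_csv are made up for the example.

```python
import csv
import xml.etree.ElementTree as ET


def stream_rows(source):
    """Yield (group, entry, item, attr1, attr2) tuples while the XML is being read."""
    group = entry = None
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start" and elem.tag == "group":
            group = elem.get("name")  # attributes are already available on "start"
        elif event == "start" and elem.tag == "entry":
            entry = elem.get("name")
        elif event == "end" and elem.tag == "item":
            yield group, entry, elem.get("name"), elem.get("attr1"), elem.get("attr2")
        elif event == "end" and elem.tag == "entry":
            # Drop the finished entry's <item> children; the emptied <entry>
            # element itself stays attached to its parent.
            elem.clear()


def write_csv(source, out):
    """Pipe rows from an XML file (path or binary file object) into a CSV writer."""
    csv.writer(out, lineterminator="\n").writerows(stream_rows(source))
```

The generator keeps the current group and entry names as local state, so it is not purely functional, but callers only see a lazy sequence of rows and never hold more than one entry's items in memory.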
Take a look at Erlang: there is even an option to embed your XML into the source code and compile it as a module.