Erlang
Sergey Gornostaev, 2014-09-05 10:42:45

How to convert XML to CSV in a functional style?

I'm learning an area that is new to me: functional programming. The principles are clear, and I have studied the basics of how they are implemented in Python, Erlang and Scala, but rewiring my thinking is going slowly. I'm too old for this bullshit ©. There are few examples on the net, and most of them are purely theoretical.
I have a simple real-world task: parse XML and generate CSV from its data. My gut tells me a functional approach fits well here, but I can't figure out exactly how to do it.
The XML structure is roughly this:

<group name="">
  <entry name="">
    <item name="" attr1="" attr2="" />
    <item name="" attr1="" attr2="" />
    <item name="" attr1="" attr2="" />
  </entry>
</group>

As output, I would like to get a CSV made up of lines like "group-name; entry-name; attr1-val; attr2-val".
Can anyone help with an example?
UPD: I must have worded my question badly. I know how to parse XML. Right now I just read the XML, loop over the nodes, perform the necessary operations on the attribute values, save the intermediate results in a dictionary, and then iterate over the dictionary again, doing post-processing and writing the results to CSV. Imperative style in every decision. How would you do it in a functional style? To hell with loops and intermediate data.


4 answer(s)
Sergey Gornostaev, 2016-03-05
@sergey-gornostaev

xandox wanted something like this

(ns su.gornostaev.rstatfr2csv
  (:require [clojure.xml :refer [parse]]
            [clojure.string :refer [index-of join split]]
            [clojure.java.io :refer [file writer]])
  (:import java.io.File))

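(defn parse-cmd
  "Returns the first command-line argument (the input file name), or nil."
  [args]
  (first args))

(defn parse-file
  "Parses the XML file into a clojure.xml tree; returns nil when no file name is given."
  [file-name]
  (when file-name
    (parse (File. file-name))))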

(defn format-datetime
  [dt]
  (let [[date time] (split dt #"\s")]
    (str (join "." (reverse (split date #"\."))) " " (subs time 0 (index-of time \:)) ":00")))

(defn merge-counters
  "Groups counters by :dt and sums :in and :out within each group."
  [counters]
  (map
    (fn [[dt group]]
      (assoc (reduce (partial merge-with +) group) :dt dt))
    (reduce
      (fn [acc {:keys [dt in out]}]
        (update-in acc [dt] (fnil conj []) {:in in :out out}))
      {} counters)))

(defn parse-counters
  [sensor]
  (map
    (fn [counter]
      {:dt (format-datetime (:datetime (:attrs counter))) 
       :in (read-string (:realin (:attrs counter))) 
       :out (read-string (:realout (:attrs counter)))})
    sensor))

(defn parse-sensors
  [shop]
  (map
    (fn [sensor]
      {:name (:name (:attrs sensor)) :counters (merge-counters (parse-counters (:content sensor)))})
    shop))

(defn parse-shops
  [dom]
  (map
    (fn [shop]
      {:name (:name (:attrs shop)) :sensors (seq (parse-sensors (:content shop)))})
    dom))

(defn parse-doc
  [doc]
  (parse-shops
    (->> 
      doc
      :content
      first
      :content)))

(defn save-csv
  [data]
  (with-open [w (writer (file "output.csv"))]
    (doseq [shop    data
            sensor  (:sensors shop)
            counter (:counters sensor)]
      ;; Columns: shop, sensor, then :dt, :in, :out (sorted-map orders the keys).
      (.write w (str (join "\t" (concat [(:name shop) (:name sensor)]
                                        (vals (into (sorted-map) counter))))
                     "\n")))))

(save-csv (parse-doc (parse-file (parse-cmd *command-line-args*))))
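
For comparison, the same shape of solution applied to the XML structure from the question could look roughly like this in Python, one of the languages the question mentions. This is only a sketch: the `<root>` wrapper, the sample values, and the function names `rows` and `to_csv` are made up for illustration, and the separator is a bare ";" without the trailing space shown in the question. The loop over nodes becomes a lazy generator expression, and no intermediate dictionary is kept.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data following the structure from the question.
SAMPLE = """\
<root>
  <group name="g1">
    <entry name="e1">
      <item name="i1" attr1="a" attr2="b" />
      <item name="i2" attr1="c" attr2="d" />
    </entry>
  </group>
</root>
"""


def rows(root):
    """Lazily yield one tuple per <item>: (group, entry, attr1, attr2)."""
    return (
        (group.get("name"), entry.get("name"), item.get("attr1"), item.get("attr2"))
        # iter() also matches the root itself, so <group> may be the top element.
        for group in root.iter("group")
        for entry in group.iterfind("entry")
        for item in entry.iterfind("item")
    )


def to_csv(xml_text):
    """XML text in, CSV text out; nothing outside the function is modified."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter=";", lineterminator="\n")
    writer.writerows(rows(ET.fromstring(xml_text)))
    return out.getvalue()


print(to_csv(SAMPLE), end="")
```

Any per-value processing can be added as a plain function applied inside the generator expression, which keeps the parsing, the transformation, and the output as separate composable steps.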

Anton Solomonov, 2014-09-05
@Wendor

preg_match... three of them, then loops over the matches.
But it's better to use a ready-made parser, for example SimpleXML.

Vladimir, 2014-09-05
@rostel

In Erlang: www.erlang.org/doc/apps/xmerl/xmerl_ug.html

Sergey, 2014-09-05
@begemot_sun

Done properly, you just need to describe a lexer and a parser that receive the data from the XML in streaming mode and translate it into CSV. There is no need to store 100500 MB of intermediate data.
Look towards Erlang; there is even an option to embed your XML into source code and compile it as a module.
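
The streaming idea can be sketched without writing a lexer by hand. The example below is in Python rather than Erlang and uses the standard library's `xml.etree.ElementTree.iterparse` in place of a custom parser; the sample data and the names `stream_rows` and `write_csv` are assumptions made for illustration. The generator does keep two local variables for the current group and entry, so it is not purely functional, but that state never leaves the function.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data following the structure from the question.
SAMPLE = """\
<root>
  <group name="g1">
    <entry name="e1">
      <item name="i1" attr1="a" attr2="b" />
      <item name="i2" attr1="c" attr2="d" />
    </entry>
    <entry name="e2">
      <item name="i3" attr1="e" attr2="f" />
    </entry>
  </group>
</root>
"""


def stream_rows(source):
    """Yield (group, entry, attr1, attr2) per <item> as the parser reads them."""
    group = entry = None
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            # Attributes are already available when a start event is emitted.
            if elem.tag == "group":
                group = elem.get("name")
            elif elem.tag == "entry":
                entry = elem.get("name")
        elif elem.tag == "item":
            yield (group, entry, elem.get("attr1"), elem.get("attr2"))
        elif elem.tag == "entry":
            # Drop the children of a finished <entry> so they can be freed.
            elem.clear()


def write_csv(source, out):
    """Pipe rows straight from the parser into the CSV writer."""
    writer = csv.writer(out, delimiter=";", lineterminator="\n")
    writer.writerows(stream_rows(source))


result = io.StringIO()
write_csv(io.BytesIO(SAMPLE.encode("utf-8")), result)
print(result.getvalue(), end="")
```

In real use `source` would be a file name or an open file and `out` an open output file. Emptied `<entry>` elements still stay attached to their parent here, so memory use is reduced rather than strictly constant.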
