D
D
Daniil Kolesnichenko2015-06-19 07:40:34
Haskell
Daniil Kolesnichenko, 2015-06-19 07:40:34

How to recursively parse brackets?

I'm trying to decompose a view string "(+ 12 (* 3 4))"into a list of view tokens ["(", "+", "12", "(", "*", "3", "4", ")", ")"]
I wrote something in Haskell:

import Data.Char
import Data.List.Split
import Data.Maybe
import Text.Read

sep :: String -> [String]
sep s = concat $ map sepBr (splitOn " " s) where
    sepBr :: String -> [String]
    sepBr ""  = []
    sepBr " " = []
    sepBr (x:xs)
        | x `elem` ['(', ')'] = ([x] : sepBr xs)
        | otherwise           = ([x] : [xs])

Works almost correctly :)
Instead of the expected result, it returns ["(","+","","1","2","(","*","","3","","4","))"]. Empty lines could be cleaned out using filter , but I don’t know what to do with glued brackets and a sticky number.
PS I'm just starting to try to master FP, haskell, recursion and all this, while I'm still confused, don't throw your slippers too much)
UPD:
I kind of rewrote it, in theory it should work, but he swears at types:
import           Data.Char
import           Data.List.Split
import           Data.Maybe
import           Text.Read


sep :: String -> [String]
sep s = concat $ map sepBr (splitOn " " s) where
    sepBr :: String -> [String]
    sepBr ""  = []
    sepBr " " = []
    sepBr word
        | a `elem` brackets =  ++ sepBr bc
        | c `elem` brackets = sepBr ab ++ 
        | otherwise         = ([a : bc])
        where a        = if word == [] then [] else head word
              b        = if word == [] then [] else init $ tail word
              c        = if word == [] then [] else last word
              ab       = [a] ++ b
              bc       = b ++ [c]
              brackets = ['(', ')']

Mistake:
src/[email protected]:20-13:28 Couldn't match type Char with [t1]
Expected type: 
  Actual type: [Char] …
src/[email protected]:33-13:34 Couldn't match expected type Char with actual type [t2] …
src/[email protected]:20-14:28 Couldn't match type Char with [t3]
Expected type: 
  Actual type: [Char] …
src/[email protected]:37-14:39 Couldn't match type [t4] with Char
Expected type: String
  Actual type:  …
src/[email protected]:45-14:46 Couldn't match expected type Char with actual type [t5] …
src/[email protected]:33-15:34 Couldn't match expected type Char with actual type [t6] …
src/[email protected]:58-16:62 Couldn't match type Char with [t]
Expected type: 
  Actual type: String
Relevant bindings include
  a :: [t]
    (bound at /home/app/isolation-runner-work/projects/119295/session.207/src/src/Main.hs:16:15) …
src/[email protected]:58-18:62 Couldn't match type Char with [t]
Expected type: 
  Actual type: String
Relevant bindings include
  c :: [t]
    (bound at /home/app/isolation-runner-work/projects/119295/session.207/src/src/Main.hs:18:15) …
src/[email protected]:33-19:34 Couldn't match type Char with [t]
Expected type: 
  Actual type: [Char]
Relevant bindings include
  ab :: 
    (bound at /home/app/isolation-runner-work/projects/119295/session.207/src/src/Main.hs:19:15) …
src/[email protected]:32-20:33 Couldn't match expected type Char with actual type [t0] …

I don’t understand where it comes from if the type of the function is last :: [a] -> a. The same with the rest of the functions, why acan't it be a character (Char)?

Answer the question

In order to leave comments, you need to log in

1 answer(s)
A
Alexander Ruchkin, 2015-06-20
@KolesnichenkoDS

First, replace splitOn " "with words, which will eat up all spaces. Further, concat $ map ...is the same as concatMap .... Consider sepBr. It takes a string without spaces and splits it into chunks if there are operators, numbers, or parentheses. If the string is already empty, the result is an empty list. If the string is non-empty, then options are possible. If its first character is some kind of bracket, we separate this bracket, and divide the rest again using sepBr. Otherwise, we do this: we split the string so that only numbers go first: span isDigitand see what happens. If there are numbers, we separate them, and divide the rest again sepBr. If there are no numbers, then simply separate the first character. Here's what happened:

sep ∷ String → [String]
sep = concatMap sepBr . words where
  sepBr ∷ String → [String]
  sepBr "" = [] -- нечего делить
  sepBr s'@(x:xs)
    | x `elem` "()" = [x] : sepBr xs -- скобка
    | otherwise = case span isDigit s' of -- возьмём цифры
      ("", t:tl) → [t] : sepBr tl -- нет цифр, берём первый символ остатка
      (n, tl) → n : sepBr tl -- есть цифры, их отдельно

Regarding your second option.
Next, what is ab? This [a] ++ b, i.e. [head word] ++ init (tail word), i.e. it's the same as just init word. Likewise bc = tail word. Instead of taking separately headand separately tail, you can use pattern matching and record (a : bc) = word.
With this in mind, your version is rewritten to
sep2 ∷ String → [String]
sep2 = concatMap sepBr . words where
  sepBr ∷ String → [String]
  sepBr ""  = []
  sepBr " " = []
  sepBr word
    | a `elem` brackets =  ++ sepBr bc
    | c `elem` brackets = sepBr ab ++ 
    | otherwise = ([a : bc])
    where
      (a : bc) = word
      c = last word
      ab = init word
      brackets = ['(', ')']

And it works quite well.

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question