haskell - Setting begin and end of multi-line input in Parsec


I am new to Parsec and would appreciate pointers on the problem here. Say I have a CSV file with a fixed number of headers. Instead of parsing each line separately, I want to look for a token at the beginning of a line, and take all lines until the next line with a non-empty token. Example below:

    token,flag,values
    a,1,
    ,,a
    ,,f
    b,2,

The rule for valid input is: if the token field is filled in, take all lines until the next non-empty token field. So Parsec should take the multiple lines below as the first input (those multiple lines can then be parsed by another rule):

    a,1,
    ,,a
    ,,f

Then the process starts again on the next line with a non-empty token field (the last line in the example here). I am trying to figure out whether there is a simple way to specify such a rule in Parsec: take all lines that meet the rule, which are then handed off to another parser. Basically, it looks like a kind of lookahead rule to specify valid multi-line input. Did I get that right?

We can ignore the comma separator above for now, and just say that the input begins when a character is found at the beginning of a line, and ends when a character is next found at the beginning of a line.
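The simplified rule can be written down fairly directly with `lookAhead`. The following is only a minimal sketch: `wholeLine`, `blockEnd`, `block` and `blocks` are hypothetical names, and it assumes that a block starts with an alphanumeric character in the first column and that every line, including the last one, ends with a newline.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- | Hypothetical helper: one full line; the newline is consumed and dropped.
wholeLine :: Parser String
wholeLine = manyTill anyChar newline

-- | Succeeds without consuming input when the next line starts a new block
-- (alphanumeric character in the first column) or when the input is exhausted.
blockEnd :: Parser ()
blockEnd = lookAhead (eof <|> (alphaNum *> pure ()))

-- | A block: one line that starts with a token, followed by every line up to
-- (but not including) the next line that starts with a token.
block :: Parser [String]
block = (:) <$> headLine <*> manyTill wholeLine blockEnd
  where
    headLine = lookAhead alphaNum *> wholeLine

-- | All blocks in the input.
blocks :: Parser [[String]]
blocks = many block <* eof
```

With these definitions, `parse blocks "" "a,1,\n,,a\n,,f\nb,2,\n"` yields `Right [["a,1,",",,a",",,f"],["b,2,"]]`. The lookahead only peeks at the first character of the next line, so the line that opens the following block is left in the input for the next call to `block`.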

I solved the problem with the help of @user2407038, who suggested the basic outline in a comment. The solution and explanation are below (please see the comments after each function - they show how the function behaves on input):

    {-# LANGUAGE FlexibleContexts #-}
    import Control.Monad
    import Text.Parsec
    import Control.Applicative hiding ((<|>), many)

    -- | This one accepts anything until a newline, and discards the newline
    -- | This one is used as a building block in the functions below
    restOfLine :: Stream s m Char => ParsecT s u m [Char]
    restOfLine = many1 (satisfy (\x -> not $ x == '\n')) <* char '\n'

    -- | A token line is "many alphanumeric characters" followed by
    -- | any characters until a newline
    tokenLine :: Stream s m Char => ParsecT s u m [Char]
    tokenLine = (++) <$> many1 alphaNum <*> restOfLine

    -- | ghci test:
    -- | *Main Text.Parsec> parseTest tokenLine "a,1,,\n"
    -- | "a,1,,"
    -- | *Main Text.Parsec> parseTest tokenLine ",1,,\n"
    -- | parse error at (line 1, column 1):
    -- | unexpected ","
    -- | expecting letter or digit

    -- | A non-token line is a line that has any number of spaces followed
    -- | by ",", then any characters until a newline
    nonTokenLine :: Stream s m Char => ParsecT s u m [Char]
    nonTokenLine = (++) <$> (many space) <*> ((:) <$> char ',' <*> restOfLine)

    -- | ghci test:
    -- | *Main Text.Parsec> parseTest nonTokenLine ",1,,\n"
    -- | ",1,,"
    -- | *Main Text.Parsec> parseTest nonTokenLine "a,1,,\n"
    -- | parse error at (line 1, column 1):
    -- | unexpected "a"
    -- | expecting space or ","

    -- | One entry is a tokenLine followed by any number of nonTokenLine
    oneEntry :: Stream s m Char => ParsecT s u m [[Char]]
    oneEntry = (:) <$> tokenLine <*> (many nonTokenLine)

    -- | ghci test - please note that it drops the last line, as expected
    -- | *Main Text.Parsec> parseTest oneEntry "a,1,,\n,,a\n,,f\nb,2,,\n"
    -- | ["a,1,,",",,a",",,f"]

    -- | Add 'many' to oneEntry to parse the entire file and match multiple entries
    multiEntries :: Stream s m Char => ParsecT s u m [[String]]
    multiEntries = many oneEntry

    -- | ghci test - please note that it gets 2 entries, as expected
    -- | *Main Text.Parsec> parseTest multiEntries "a,1,,\n,,a\n,,f\nb,2,,\n"
    -- | [["a,1,,",",,a",",,f"],["b,2,,"]]
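Each entry produced above is still a list of raw lines. Since the question mentions handing the grouped lines off to another parser, a second stage might look like the sketch below. `fields` and `splitEntry` are hypothetical names that are not part of the solution, and the sketch assumes that fields never contain quoted or escaped commas.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- | Fields of one line: possibly empty runs of characters separated by commas.
fields :: Parser [String]
fields = many (noneOf ",\n") `sepBy` char ','

-- | Hypothetical second stage: split every line of an already grouped entry.
-- Each line is parsed on its own; the first failing line aborts the entry.
splitEntry :: [String] -> Either ParseError [[String]]
splitEntry = traverse (parse (fields <* eof) "entry")
```

For example, `splitEntry ["a,1,,", ",,a", ",,f"]` gives `Right [["a","1","",""],["","","a"],["","","f"]]`. Keeping the grouping and the field splitting in separate parsers means either rule can be changed without touching the other.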

The parse errors seen in the comments are expected on invalid inputs, and they can be handled. The above code is just a basic building block to get started.
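One possible way to handle such errors is to call `parse` instead of `parseTest`: it returns an `Either ParseError a` that the caller can inspect rather than printing the error. The sketch below repeats the building blocks so that it compiles on its own. `parseEntries` is a hypothetical wrapper, and the added `eof` is an assumption: without it, `many` would stop quietly at the first invalid line instead of reporting it.

```haskell
{-# LANGUAGE FlexibleContexts #-}
import Text.Parsec

-- Same building blocks as in the solution, repeated to keep this self-contained.
restOfLine :: Stream s m Char => ParsecT s u m String
restOfLine = many1 (satisfy (/= '\n')) <* char '\n'

tokenLine :: Stream s m Char => ParsecT s u m String
tokenLine = (++) <$> many1 alphaNum <*> restOfLine

nonTokenLine :: Stream s m Char => ParsecT s u m String
nonTokenLine = (++) <$> many space <*> ((:) <$> char ',' <*> restOfLine)

oneEntry :: Stream s m Char => ParsecT s u m [String]
oneEntry = (:) <$> tokenLine <*> many nonTokenLine

-- | Hypothetical wrapper: parse a whole input and turn a ParseError into a
-- plain message. 'eof' makes leftover, unparsable lines count as an error.
parseEntries :: String -> Either String [[String]]
parseEntries input =
  case parse (many oneEntry <* eof) "input" input of
    Left err      -> Left ("invalid input: " ++ show err)
    Right entries -> Right entries
```

Here `parseEntries "a,1,,\n,,a\n,,f\nb,2,,\n"` returns `Right [["a,1,,",",,a",",,f"],["b,2,,"]]`, while `parseEntries ",1,,\n"` returns a `Left` carrying the position and the reason for the failure.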

