Skip to content

afroisalreadyinu/real-world-haskell-chapter-10

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

22 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Real World Haskell Ch. 10, Explained

Real World Haskell is a great book for learning Haskell, but one chapter, Chapter 10, is notorious for giving beginners a hard time. It introduces important concepts in a very roundabout way, with convoluted code examples that in some cases don't even run. Also, it is full of leaky abstractions that don't work as expected (deceptive chaining of lambdas, newtype as a container of stateful behavior, fmaps that have to be doubled etc.) that are still important for the rest of the book. In fact, most of the weird things turn out to be concepts introduced in later chapters, such as the bind operator of monads. This post is attempted as an understanding aid for Chapter 10, for me and for Haskell beginners who get stuck in it.

Chapter 10 uses as example the parsing of PGM (portable grey map) files. A sample file in this ancient but simple format is included in this repo as test.pgm. Here is what you will see if you open this file in a text editor:

P5
640 480
255
\377\377...

The last row is binary data that specifies greyscale values for individual pixels. The representation of a parsed PGM image generated by the example parsers in this chapter is straightforward; a data type named Greymap with fields grepWidth, greyHeight, greyMax and greyData is used. In order to represent such an image properly on the console, the Show interface of a Greymap is implemented to print size and maximum greyscale information. Since the Greymap data type is used by all code examples, it is included in this repo in the file common.hs imported by the other files. Here is the Greymap definition from that file:

data Greymap = Greymap {
      greyWidth :: Int
    , greyHeight :: Int
    , greyMax :: Int
    , greyData :: L.ByteString
    } deriving (Eq)

instance Show Greymap where
    show (Greymap w h m _) = "Greymap " ++ show w ++ "x" ++ show h ++
                             " " ++ show m

Given a file in PGM format, how should we read it into memory? Since the file contains mostly binary data, we can't (or shouldn't) use the Prelude methods that read a file into a String. Instead, we should read the file contents into a Data.ByteString.Lazy.ByteString. The methods to do that, and also many other methods to deal with binary data, are in two modules: Data.ByteString.Lazy.Char8 and Data.ByteString.Lazy. These are imported as L8 and L, respectively, to avoid excessive typing.

Linear parsing

The PGM format is simple to parse; the parser can linearly go through the file, matching the next piece of data into whatever is expected, and stopping if something unexpected happens. In the first example for parsing, the authors do exactly this. A working version of their code can be found in pgm1.hs; you can compile this file with ghc pgm1.hs, and run it with ./pgm1 test.pgm. This will print the result of the show call on the Greymap parse result.

Let's have a look at how the parsing is carried out. The parser has to look for three kinds of data: header, integers, and the image bytes. These are done with the matchHeader, getNat (for natural number), and getBytes methods. Here's the matchHeader function:

matchHeader :: L.ByteString -> L.ByteString -> Maybe L.ByteString
matchHeader prefix str
    | prefix `L8.isPrefixOf` str
        = Just (L8.dropWhile isSpace (L.drop (L.length prefix) str))
    | otherwise
        = Nothing

Keep in mind that L.ByteString here is the Data.ByteString.Lazy.ByteString data type. The above function can be used to check whether a ByteString starts with the given prefix, returning Nothing if not. If it does, the prefix and any following whitespace is dropped (using Data.ByteString.Lazy.Char8.dropWhile and isSpace), and returned with a Just. Observe that the second argument of this function is also a ByteString; we will see later how a string can be packed into a ByteString so that it can be passed to this function appropriately.

The next function is for matching an integer:

getNat :: L.ByteString -> Maybe (Int, L.ByteString)
getNat s = case L8.readInt s of
             Nothing -> Nothing
             Just (num,rest)
                 | num <= 0    -> Nothing
                 | otherwise -> Just (fromIntegral num, rest)

This function does something very similar to the previous one; it picks the piece of interesting data from the start of the string, using the Data.ByteString.Lazy.Char8.readInt function. In the case of error (readInt returns Nothing if number is less than 0), Nothing is returned. In case of success, the parsed number and the rest of the ByteString are packed into a tuple and returned as a Just.

The last function that fetches bytes, getBytes, is different in that it takes as an argument how many bytes to read. This is computed from the size of the image parsed earlier:

getBytes :: Int -> L.ByteString
         -> Maybe (L.ByteString, L.ByteString)
getBytes n str = let count           = fromIntegral n
                     both@(prefix,_) = L.splitAt count str
                 in if L.length prefix < count
                    then Nothing
                    else Just both

getBytes splits the ByteString into a prefix of the given size and the rest using Data.ByteString.Lazy.splitAt. The size of the prefix (determined using Data.ByteString.Lazy.length) is compared to the count argument, and if it is smaller, which means that the ByteString was not big enough, Nothing is returned. Otherwise, the tuple of the prefix and rest data is returned.

The parsing function that uses these matchers is very straightforward; it's a bit long and does not need much explaining, so it's not included here, but you can inspect it in pgm1.hs, starting from line 31. It applies the matchers sequentially, returning Nothing if one of them returns Nothing, otherwise ending with the Greymap instance that packs the data that was parsed. Data.ByteString.Lazy.Char8 is used to turn the PGM prefix P5 into a ByteString, passing the result into matchHeader. If the return value is Nothing, parsing ends. Otherwise, the returned rest ByteString is destructured from the Just, and fed to getNat to fetch the width. The parsing goes on like this, as the ByteString is consumed piece by piece, and the parsed data is gathered in the nested function scopes. If we run the command ./pgm1 test.pgm on the command line, here's the output we should see:

Greymap 640x480 255

which is exactly what we should expect.

As straightforward as the parsing function is, it's not good code. It's very repetitive, would be difficult to change and adapt to any variations in file format, does not offer any abstraction, and looks plain ugly. The first step in refactoring the parsing function is factoring out the chaining, where Nothing from a matching function leads to exit, whereas a Just value leads to the contents getting passed on to the next step. This is done using a binary operator >>? that encapsulates this logic:

(>>?) :: Maybe a -> (a -> Maybe b) -> Maybe b
Nothing >>? _ = Nothing
Just v  >>? f = f v

This is the bind function of monads, as a student of Haskell will see right away once she come to the relevant chapter. The standard infix operator for it is >>=. Why the authors introduce it without the relevant context is beyond me.

Observe that >>? is an infix function, and it's left-associative like any other function, so a >>? b >>? c is equivalent to (a >>? b) >>? c. The parsing function that uses this combination operator looks like this:

parseP5_take2 :: L.ByteString -> Maybe (Greymap, L.ByteString)
parseP5_take2 s =
    matchHeader (L8.pack "P5") s      >>?
    \s -> skipSpace ((), s)           >>?
    (getNat . snd)                    >>?
    skipSpace                         >>?
    \(width, s) ->   getNat s         >>?
    skipSpace                         >>?
    \(height, s) ->  getNat s         >>?
    \(maxGrey, s) -> getBytes 1 s     >>?
    (getBytes (width * height) . snd) >>?
    \(bitmap, s) -> Just (Greymap width height maxGrey bitmap, s)

This is from the file pgm2.hs, which you can compile and run, as with pgm1.hs, on test.pgm. As the reader continues with the book, she will find out that this method of chaining monadic functions is very common, and there is syntactic support for it in Haskell in the form of do blocks.

A method skipSpace had to be added to account for the clean up done within the original parseP5 function:

skipSpace :: (a, L.ByteString) -> Maybe (a, L.ByteString)
skipSpace (a, s) = Just (a, L8.dropWhile isSpace s)

Also pay attention to the (getNat . snd) combination, which serves to pick the second element of the tuple returned by skipSpace and feed it to getNat. If there are no errors on the way, the execution of the chained functions built by the parseP5_take2 method follows from top to bottom in a wrapped fashion. That is, the last line is the outermost function application. There is one very tricky thing about parseP5_take2, though. In parseP5, the parsed integer values (width, height, and maxGrey) are stored in the closures of successive function calls. At first impression, the same thing should happen within parseP5_take2, since there is no explicit management of these variables, but the indentation tells us that the lambdas are what get chained. But these lambdas have separate function contexts, so how are they supposed to contribute to the common closure? They can't and they don't; it's not the lambdas that are getting chained, but the function calls within the lambdas. You can see this more clearly if you wrap one of the lambdas in paranthesis (e.g. the one on line 54) and try to compile again, which returns the following error:

$ ghc pgm2.hs
[1 of 1] Compiling Main             ( pgm2.hs, pgm2.o )

pgm2.hs:58:16: Not in scope: ‘width’

pgm2.hs:59:35: Not in scope: ‘width’

Each lambda is a wrapper around the rest of parseP5_take2. The correct (or rather more truthful) indentation would be the following:

parseP5_take2_truthful :: L.ByteString -> Maybe (Greymap, L.ByteString)
parseP5_take2_truthful s =
    matchHeader (L8.pack "P5") s      >>?
    (\s -> skipSpace ((), s)           >>?
      (getNat . snd)                    >>?
      skipSpace                         >>?
      \(width, s) ->   getNat s         >>?
        skipSpace                         >>?
        \(height, s) ->  getNat s         >>?
          \(maxGrey, s) -> getBytes 1 s     >>?
            (getBytes (width * height) . snd) >>?
            \(bitmap, s) -> Just (Greymap width height maxGrey bitmap, s)

which doesn't look as nice as the previous one.

Since the second variation does not really solve the problem of encapsulating parse state, and is also misguiding in the way it's built, time for a third variation. From this point on, things get a bit weird. The third example uses rather convoluted code to insinuate at certain ideas, which doesn't really work. This leads to code that's difficult to understand at a syntactic level, and even if you understand it, doesn't make sense at a functional level. In the third variation, the parse state is packed inside the type ParseState that contains the complete ByteString and the current offset:

data ParseState = ParseState {
      string :: L.ByteString
    , offset :: Int64           -- imported from Data.Int
    } deriving (Show)

The parsing function will take a ParseState with the ByteString to be parsed and the offset set to 0, returning an Either String (a, ParseState). The question is how to encapsulate the parsing steps. The authors use a newtype declaration to achieve this. This is a surprise move, as there was no example of putting a function inside a newtype constructor, and it is one of the things I don't really get. The role of this pattern in monadic Haskell code becomes clear only once similar monads such as State or Reader are introduced.

Here's a sample that you can copy-paste into ghci to help you at least get the basic idea:

newtype IntOperation a = IntOperation { process :: a -> Either String a }

let multiplier limit = IntOperation $ \arg -> if arg > limit then (Left "Int too big") else Right (arg * 10)

If you then run process (multiplier 2) 5 in ghci, you should see the output Left "Int too big". IntOperation is simply a wrapper around a lambda passed in with the process parameter. This lambda, together with the wrapping IntOperation, can be created using something like a factory method, in this case multiplier.

Here is the parsing function encapsulation from the book:

newtype Parse a = Parse {
      runParse :: ParseState -> Either String (a, ParseState)
    }

Haskell being a functional language, this same thing could have been achieved using a simple closure. The only advantage of this formulation is that new types can be created only using the Parse constructor; using a function, one would have to declare a function type such as ParseState -> Either String (Int, ParseState). Also, the destruction of the Parse constructor can be avoided by exporting only the type constructor. The big mystery for me here was what the heck the a argument here is -- one point where the Haskell pattern of single-letter argument names fails. The use of the Parse type in the parsers that follow in this chapter reveal that a stands for the output of the previous parsing step, such as an int or a byte.

A sample parser where we can see the Parser type in action is the identity parser which returns whatever it is given. Here are the relevant lines from parse_identity.hs:

identity :: a -> Parse a
identity a = Parse (\s -> Right (a, s))

So whatever ParseState it is given (s in the encapsulated lambda), identity will return whatever it was initialized with and the ParseState in a tuple. identity does not look particularly useful here, but we will see how it is put to use later, in combination with parser chaining and functors.

Parsing a file involves reading data from the ByteString and increasing the offset by the number of bytes read. Increasing the offset of a ParseState is done using the bracket notation on records to create new ones:

modifyOffset :: ParseState -> Int64 -> ParseState
modifyOffset initState newOffset =
    initState { offset = newOffset }

The output of this function is a new ParseState that is different only on the offset field from initState.

Another thing we need is a way to chain Parsers. That is, we need the capability to take one Parse, run it (remember, a Parse encapsulates parsing logic in a lambda), take the result, and put it into another Parse. If you look at the type definition of ==> from pgm3.hs, this is exactly what it does:

(==>) :: Parse a -> (a -> Parse b) -> Parse b

When used as a binary operator, the left argument will be a Parse a, where a is the type wrapped in Either returned by the lambda within Parse. The right will be a function that takes an a as an argument, and returns a Parse b. The definition of this operator is relatively straightforward:

firstParser ==> secondParser  =  Parse chainedParser
  where chainedParser initState   =
          case runParse firstParser initState of
            Left errMessage ->
                Left errMessage
            Right (firstResult, newState) ->
                runParse (secondParser firstResult) newState

Let's try to understand what's happening here, shall we? The result of the ==> operator is the chainedParser function that is created in the where clause, wrapped in a Parse. Before the runParse function is called on the end result and the unwrapped function is called, this function will not run, so we're essentially creating state wrapped in a lambda inside a newtype. chainedParser gets an initState as an argument, as it should since it's getting wrapped in a Parse. It then runs the firstParser on this state. If the result is an error message, the output of chainedParser is also an error message. If the result is Right (firstResult, newState), the secondParser is initialized with firstResult, and called with newState. The main difficulty in understanding this function stems from firstParser and secondParser having names that imply they are of the same kind, but they actually aren't. firstParser is Parse a, whereas secondParse is a -> Parse b, i.e. a factory that produces a Parser.

The ==> operator is yet another form of the bind function on monads, introduced in later chapters. Why yet another syntax is necessary for a topic whose time has not come is a decision best left unexplained.

Now let's look at the actual parser from the third example that uses the above tools:

bail :: String -> Parse a
bail err = Parse $ \s -> Left $
           "byte offset " ++ show (offset s) ++ ": " ++ err

parseByte :: Parse Word8
parseByte =
    getState ==> \initState ->
    case L.uncons (string initState) of
      Nothing ->
          bail "no more input"
      Just (byte,remainder) ->
          putState newState ==> \_ ->
          identity byte
        where newState = initState { string = remainder,
                                     offset = newOffset }
              newOffset = offset initState + 1

bail here is a very simple function to return an error. parseByte uses two functions we haven't seen yet, getState and putState. These are both a bit weird; getState is a Parse that puts the state it is passed in both parts of the returned Either:

getState :: Parse ParseState
getState = Parse (\s -> Right (s, s))

It is used in parseByte to pack the initial state for the next step. You might ask yourself the question "Why the #*!! is getState a Parse? It just creates a tuple!", and you would be right. The reason for this convoluted way of presenting a ParseState will be understandable when Parse gets to act as a functor, so that we can use instances including getState to encapsulate parsing logic and chain it with others. getState is also a primitive implementation of the return function of a monad, especially of the kind similar to State or Reader.

putState returns unity and the parse state as a result:

putState :: ParseState -> Parse ()
putState s = Parse (\_ -> Right ((), s))

Here is how parseByte works. The central piece is the ==> operator that chains a Parse with a Parse factory (a -> Parse b) to create another Parse. The chained elements are getState with the lambda that follows it, and then within the lambda, putState with a lambda that creates an identity Word8. When parseByte is called on a ParseState created from the contents of the test.pgm file (lines 67 and 68 in pgm3.hs), the combined parser created by the first ==> is fired. It runs getState, which returns the exact same initial ParseState in both slots of a tuple. The second argument to ==> in this first chain, the lambda starting at line 49 with the single argument initState, is run with the first slot of the tuple, the initial state. This lambda takes the ByteString and uncons's it using Data.ByteString.Lazy.uncons -- that is, splits head and tail. If there is no data, bail is called. Otherwise, another chain is created using ==>. This one involves putState and another lambda. As we already know, putState takes whatever it was given, and creates a Parse that returns a unity and that thing. When we couple this with an identity that receives as its first argument the byte that was parsed in the containing lambda, and returns that byte no matter what it is called with, what we get is the byte. To make this last step more understandable, here is a step-by-step evaluation:

(putState newState) ==> \_ -> identity byte
-- is equivalent to
Parse (\_ -> Right ((), newState) ==> \_ -> Parse (\s -> Right(byte, s))
-- is equivalent to
runParse ( Parse (\s -> Right(byte, s)) newState
-- is equivalent to
Right (byte, newState)

We end up with the remainder from the uncons operation and a new incremented offset in a ParseState, and the parsed byte. pgm3.hs tells me that the first byte of test.pgm is 80 when I run it. What is most relevant for the rest of the chapter is that identity, when used as the second argument of ==>, serves to pack the output of the first argument, as ==> passes it in opaquely through a closure. When another Parse is chained, it will get exactly the same arguments that were packed into identity.

There are a number of obvious problems with parseByte. First of all, getByte and putByte are totally senseless; we don't need them to pass a ParseState to the case statement they enclose. They are used just to demonstrate the ==> operator; it would be relatively easy (but still pointless) to write a parseByte alternative without any of these two functions. Also, what's the deal with the weird combination of putState and identity? They are essentially just swapping arguments and returning a value in a backhanded manner through a closure. parseByte is not how one should write Haskell.

Parsing with functors

The discussion of functors in Chapter 10 is rather straightforward and the examples do not need explanation, so we will skip that. The only thing I would like to note is the definition of a functor, which is a class with one method:

class Functor f where
    fmap :: (a -> b) -> f a -> f b

The type f here is referred to as the container type. fmap can be used as an inline operator either by putting ticks around it, or by importing <$> from Control.Applicative, and using that instead, which we will do frequently in the following. There are two rules functors must obey to be considered a proper implementation. The first is that a functor must preserve identity:

fmap id ==  id

The second is that functors must be composable, i.e. applying the composition of two functions to a functors should be the same as applying them separately:

fmap (f . g)  ==  fmap f . fmap g

And now for the fourth iteration of the PGM parser using functors. You can find the code for this iteration in pgm4.hs. The relevant bits from earlier iterations have been copied in for your convenience. For this iteration, Parse is turned into a functor, and then chained using a new combination operator. The functor instance definition is as follows:

instance Functor Parse where
    fmap f parser = parser ==> \result ->
                    identity (f result)

It's the same identity trick that we saw earlier. What does this combinator do? It returns a Parser that, when evaluated with runParser on a ParseState, returns Right ((f result), newState), provided that there are no errors. This definition of functor for Parse preserves identity, because Right ((id result), newState) == Right (result, newState). It is also composable because the Parse instance returned by fmap allows chaining more identity parsers with further application of fmap. identity simply passes these function applications on, preserving the ParseState (as explained above), and leading to the following equality:

parse ((f . g) <$> parseByte) input == parse (f <$> g <$> parseByte) input

Once we have the functor definition for Parse nailed, we can discover the possibilities offered by its application. For starters, instead of duplicating code, we can use parseByte to define a parseChar:

w2c :: Word8 -> Char
w2c = chr . fromIntegral

parseChar :: Parse Char
parseChar = w2c <$> parseByte

Parsing a character is simply turning a Word8 to its character value using Data.Char.chr. A more complicated application is peeking inside a Parser to read the first byte or character:

peekByte :: Parse (Maybe Word8)
peekByte = (fmap fst . L.uncons . string) <$> getState

peekChar :: Parse (Maybe Char)
peekChar = fmap w2c <$> peekByte

The definition of peekByte is a bit tricky because of precedence rules and the presence of a Maybe type. We have to start with getState, which would allow fmap'ing onto the ParseState, because we are concerned with the ByteString in there. Three functions are combined before getting fmap'ed on getState. These are fmap fst, L.uncons and string. That is, the fmap binding within the paranthesis is not on the combination of the three functions, but only on the first one, and then the combination occurs. This is also the reason both fmap and <$> are used; these have different precedences; fmap is a function, so it has the highest precedence, . follows it with 9, and <$> has precedence 4. Applied from right to left, string gets the ByteString of the ParseState, L.uncons splits this ByteString into head and rest, or Nothing if its length was 0, and fmap fst takes the first element. The reason we have fmap fst here instead of just fst is that he return value of Data.ByteString.Lazy.uncons is Maybe(Word8, BytString). Maybe being a functor, we can get the first of the two arguments by fmap'ing fst to it. peekChar is, as was the case with parseChar, a simple application of w2c to peekByte.

A function we will need later is a an alternative to takeWhile that creates a Parser which keeps on reading from a ByteString as long as a criterion p is met:

parseWhile :: (Word8 -> Bool) -> Parse [Word8]
parseWhile p = (fmap p <$> peekByte) ==> \mp ->
               if mp == Just True
               then parseByte ==> \b ->
                    (b:) <$> parseWhile p
               else identity []

parseWhileWith :: (Word8 -> a) -> (a -> Bool) -> Parse [a]
parseWhileWith f p = fmap f <$> parseWhile (p . f)

parseWhile should be relatively easy to understand: It applies a predicate to the peeked byte, using double fmap's once more. This results in a Parse, which is chained with ==> to a Parse factory that continues parsing if the predicate was true, and returns an identity that returns an empty list otherwise. The only trick is using the cons operator : with the parsed byte to create a function to be fmap'ed to a further parseWhile. parseWhileWith is a variation on parseWhile that takes another function f in addition to the predicate. This function is used to turn the peeked byte into another preferred type before applying the predicate, and is also mapped on the final result. The double fmap'ing is necessary this time around too, because the result of parseWhile is a list, and we can apply f to all elements of a list with fmap.

Using parseWhileWith, we can read an integer from the ByteString by collecting digits from it:

parseNat :: Parse Int
parseNat = parseWhileWith w2c isDigit ==> \digits ->
           if null digits
           then bail "no more input"
           else let n = read digits
                in if n < 0
                   then bail "integer overflow"
                   else identity n

The logic here is no different from earlier Parse instances. One thing to note is the hack of checking for integer overflow by looking at whether the value is negative. Int is a signed integer, and on most platforms, overflow causes it to become negative because the sign bit is modified. An interesting point is how the collected digits are converted to Int. The digits are characters, and a list of characters is a string, which can be parsed as an Int simply by reading it. We don't have to tell the compiler which type we want to have it as, because the context makes it clear.

With that, we can build the next version of our parser, which unfortunately doesn't look nice:

(==>&) :: Parse a -> Parse b -> Parse b
p ==>& f = p ==> \_ -> f

skipSpaces :: Parse ()
skipSpaces = parseWhileWith w2c isSpace ==>& identity ()

assert :: Bool -> String -> Parse ()
assert True  _   = identity ()
assert False err = bail err

parseBytes :: Int -> Parse L.ByteString
parseBytes n =
    getState ==> \st ->
    let n' = fromIntegral n
        (h, t) = L.splitAt n' (string st)
        st' = st { offset = offset st + L.length h, string = t }
    in putState st' ==>&
       assert (L.length h == n') "end of input" ==>&
       identity h

parseRawPGM =
    parseWhileWith w2c notWhite ==> \header -> skipSpaces ==>&
    assert (header == "P5") "invalid raw header" ==>&
    parseNat ==> \width -> skipSpaces ==>&
    parseNat ==> \height -> skipSpaces ==>&
    parseNat ==> \maxGrey ->
    parseByte ==>&
    parseBytes (width * height) ==> \bitmap ->
    identity (Greymap width height maxGrey bitmap)
  where notWhite = (`notElem` " \r\n\t")

To be perfectly honest, this is some of the ugliest and most incomprehensible code I have ever seen, and I don't understand how it could be an improvement over anything. But I made the promise to explain the whole chapter, so here it goes.

We have one more combination operator, ==>&, which looks a lot like a Perl regular expression to me. This operator uses the previous ==> but simply omits the result of the first Parse, feeding only the resulting ParseState. The second argument to the ==>& operator thus should be an initialized Parse and not a factory. It receives the ParseState that came out of the first one when run with runParse. Once more, this is a monadic function that will pop up in the relevant chapter; namely the >> function that ignores value of the first monad.

There are three more helper functions that either use this new combination operator or are arguments to it. The first, skipSpaces, keeps on reading from a ByteString as long as it's a space character. The result is dismissed, and the final ParseState is passed on to an identity whose result part is unity. assert is an assertion packed into a Parse, and does not deserve any further discussion.

The parseBytes function takes an n :: Int and returns a Parse that reads the first n bytes of a ByteString. The functionality is embedded in a lambda that has a let .. in .. statement which returns three chained Parse instances. Within the let part, Data.ByteString.Lazy.splitAt is used to split the ByteString. A new ParseState is created using the record notation. The Parse instances are then chained in the let part; these are a putState, an assert, and an identity that is initialized with the head part from splitting the ByteString. What will this identity contain when we run the Parse resulting from this function? The first item in the resulting tuple will be the head from splitAt; the second will be the new ParseState that was created in the line st' = st { offset = offset st + L.length h, string = t } and put into putState. The reason is that these three Parse instances are chained using ==>&, which omits any results and only passes on the ParseState.

With that, we come to the final parseRawPGM function. The steps are the same with the parsing functions in the earlier iterations; the difference is in how they are chained to each other. If the result of a parse step is valuable, such as the first step with parseWhileWith w2c notWhite which gets the header, the ==> operator is used to connect to the following lambda (it has to be a lambda because of the definition of this operator). This lambda gets as its only argument the result of the first parse step, and once more, although it's not obvious from the indentation, these results are preserved within the nested lambdas. If the result of a Parse is meaningless, such as with skipSpaces, the ==>& operator is used, which discards the result. We end up with a Greymap.

That was it! I hope this walk-through helps you with continuing with this great book.

About

A walk-through of Real World Haskell 10th Chapter

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published