This exercises section has numerous mistakes and typos in it, including incorrect numbering (at least, that's the print I have). The numbers I'm using here are the "right" ones, not the ones that appear in print.

#1. It's `foldl`, of course, because in each step we have to multiply everything we've seen so far by 10 -- and "everything we've seen so far" refers to stuff on the *left* of the currently-added digit.

Because the minus sign can only appear as the first character, we'll use a special pattern just for it, before folding.

import Data.Char asInt_fold :: String -> Int asInt_fold ('-':xs) = -(asInt_fold xs) asInt_fold xs = foldl step 0 xs where step x y = 10 * x + (digitToInt y)

#2. I can only assume the authors meant "" and "-" to be reported as errors. I also assume they wish us to use `error` directly, rather than rely on `digitToInt` doing the work for us. Finally, I assume we can't use `isDigit`, which wasn't introduced yet.

asInt_fold :: String -> Int asInt_fold [] = error "Contains no digits" asInt_fold ('-':xs) = -(asInt_fold xs) asInt_fold xs = foldl step 0 xs where step x y | (y >= '0') && (y <= '9') = 10 * x + (digitToInt y) | otherwise = error ("Not a digit '" ++ [y] ++ "'")

#3. (sigh) The signature of `asInt_either` is broken in the text. It should be:

type ErrorMessage = String asInt_either :: String -> Either ErrorMessage Int

And of course, `Either` wasn't presented yet...! To add insult to injury, the book's index doesn't have the correct pointer to information about `Either`: you can read about it in page 210 (Chapter 8); or you can use the GHC online docs. To keep things simple, I will not be using the `either` function, just the `Either` type, with patterns and `case` to tell `Left` from `Right`:

type ErrorMessage = String asInt_either :: String -> Either ErrorMessage Int asInt_either [] = Left "Contains no digits" asInt_either ('-':xs) = case inner of Left err -> Left err Right val -> Right (-val) where inner = (asInt_either xs) asInt_either xs = foldl step (Right 0) xs where step (Right x) y | (y >= '0') && (y <= '9') = Right (10 * x + (digitToInt y)) | otherwise = Left ("Not a digit '" ++ [y] ++ "'") step (Left err) _ = Left err

What makes the code so unattractive is the fact we have to deconstruct the answer at every folding step, and reconstruct it right back. If it was deconstructed to a `Left`, it stays a `Left` with the same error message; if it was deconstructed to a `Right`, we proceed normally in the step, but must wrap the result in a `Right` constructor again.

#4. Simply append the nested lists to each other:

myConcat :: [[a]] -> [a] myConcat xs = foldr (++) [] xs

#5. Recursive:

takeWhile_rec :: (a -> Bool) -> [a] -> [a] takeWhile_rec _ [] = [] takeWhile_rec pred (x:xs) | pred x = x : takeWhile_rec pred xs | otherwise = []

`takeWhile` is very similar to `filter`, except that if the predicate isn't met, we don't proceed (return an empty list instead of the accumulator):

takeWhile_foldr :: (a -> Bool) -> [a] -> [a] takeWhile_foldr pred xs = foldr stepr [] xs where stepr x acc | pred x = x : acc | otherwise = []

#6. What does `groupBy` do? The first argument is of type `a -> a -> Bool`, meaning it's a *relation* on the type `a` (for example, ">" is a relation on the integers; `==` is a relation on any comparable type; etc.) This, plus the function name, gives us a pretty strong hint. Let's check it out, in ghci:

Prelude>:module Data.ListPrelude Data.List>groupBy (>) [1,2,3,4,5,6,7][[1],[2],[3],[4],[5],[6],[7]] Prelude Data.List>groupBy (>) [4,3,2,1][[4,3,2,1]] Prelude Data.List>groupBy (>) [4,3,4,2,1][[4,3],[4,2,1]] Prelude Data.List>groupBy (==) [4,3,4,2,1][[4],[3],[4],[2],[1]] Prelude Data.List>groupBy (==) [1,1,1,2,1,1,2,2,2,1,2][[1,1,1],[2],[1,1],[2,2,2],[1],[2]]

Seems like it compares pairs of elements, and as long as each element fulfills the relation with the previous one, they are added to the same group. But then, something funny happens when you try other stuff:

*Main Data.List>groupBy (>) [4,2,3,1,3,5,2,3,2,1][[4,2,3,1,3],[5,2,3,2,1]]

Huh?

Well, apparently `groupBy` picks the first elements, and groups to it all elements that are in the relation with it; once an element not in the relationship *with the first one* is found, it is used to start a new group, and further elements will be compared to this head of the new group. And so on. So, again, each element is compared not with the preceding element, but with the first element (the head) of the group being constructed.

This only goes to show how important it is to use varied input sets when testing your code.

Now, let's see how this can be implemented using fold. Because each element is compared not to its direct neighbor, but to the first of its group, `foldr` makes little sense: seeing the rightmost element, we don't know what to compare it with! So, with `foldl`:

groupBy_foldl :: (a -> a -> Bool) -> [a] -> [[a]] groupBy_foldl p xs = foldl step [] xs where step [] x = [[x]] step acc x | p (head (last acc)) x = (init acc) ++ [((last acc) ++ [x])] | otherwise = acc ++ [[x]]

Here, the accumulator (`acc`) is a list of lists, and to get the first element of the last list we use `head (last acc)`. I assume `last` isn't a very efficient function, so it might be more efficient to keep the last element of `acc` on its own, and append it to `acc` only when moving to a new sublist (or when we're done). This yields the less-readable function:

groupBy_foldl2 :: (a -> a -> Bool) -> [a] -> [[a]] groupBy_foldl2 p xs = acc ++ [lastAcc] where (acc, lastAcc) = foldl step ([], []) xs step (acc, []) x = (acc, [x]) step (acc, lastAcc) x | p (head lastAcc) x = (acc, lastAcc ++ [x]) | otherwise = (acc ++ [lastAcc], [x])

... But while it's less readable, I'm not at all certain it's also faster. Besides, using fold is not the fastest way to implement `groupBy` anyway. My advice, as always, is to leave optimizations out of the game until you're certain that there actually is a performance problem.

#7. For ** any**, we can use either left or right folding; here's a naive attempt:

any_foldl p xs = foldl step False xs where step acc x = acc || (p x) any_foldr p xs = foldr step False xs where step x acc = acc || (p x)

Note that we are applying the predicate `p` to each `x`, and or-ing the results. So why not apply `p` using `map`?

any_foldl p xs = foldl (||) False (map p xs) any_foldr p xs = foldr (||) False (map p xs)

Remember that Haskell is lazy, so this does not imply that we first apply `p` to all `x`s; rather, it is applied on a per-need basis. And indeed, this version is both shorter, easier to understand, and faster. And finally, the `foldr` version is far superior to the `foldl` one. To see why, ask yourself which of them allows Haskell to use short-circuit evaluation for the "or" operator.

For ** cycle**... wait,

`cycle`wasn't introduced yet. And it involves infinite lists, a concept which wasn't introduced yet, either. So we'll skip it for now.

For ** words**, left and right versions both use an accumulator that includes the words collected so far, plus a boolean indicating if we're currently handling whitespace:

import Data.Char -- (for isSpace) words_foldl :: String -> [String] words_foldl ln = fst(foldl step ([],True) ln) where step (acc, True) x | isSpace x = (acc, True) | otherwise = (acc ++ [[x]], False) step (acc, False) x | isSpace x = (acc, True) | otherwise = ((init acc) ++ [(last acc) ++ [x]], False) words_foldr :: String -> [String] words_foldr ln = fst(foldr step ([],True) ln) where step x (acc, True) | isSpace x = (acc, True) | otherwise = ([x] : acc, False) step x (acc, False) | isSpace x = (acc, True) | otherwise = ((x : (head acc)) : tail acc, False)

Again, the right-folding version is better, this time since `head` and `tail` are preferred over `init` and `last`.

Finally, for ** unlines**:

unlines_foldl :: [String] -> String unlines_foldl lns = foldl step [] lns where step acc ln = acc ++ ln ++ "\n" unlines_foldr :: [String] -> String unlines_foldr lns = foldr step [] lns where step ln acc = ln ++ "\n" ++ acc

At this point the book recommends Graham Hutton's paper:

ReplyDeletehttp://www.cs.nott.ac.uk/~gmh/fold.pdf

I think the answer to the first exercise is wrong, since it accepts input like "--0".

ReplyDeleteHi freich,

ReplyDeleteYou're probably right. This can be solved, for example, by adding a helper function so that the first template calls the helper, and the helper doesn't further support a leading minus sign.

Your groupBy example is much is easier to understand but I didn't want to keep running init, head, tail on every element so I used a tuple to keep the accumulator, the current list and the list head (or comparison value). Not sure if its more efficient. I also wanted to mimic the alder32_try2 example.

ReplyDeletemyGroupBy :: (a -> a -> Bool) -> [a] -> [[a]]

myGroupBy _ [] = []

myGroupBy p (x:xs) = let (acc,ls,lsh) = foldl step ([],[x],x) xs

in acc ++ [ls]

where step (acc',ls',lsh') x'

| p lsh' x' = (acc' , ls'++[x'], lsh')

| otherwise = (acc'++[ls'], [x'] , x' )

In you "naive" any implementation, you should switch the arguments of ||, to allow short circuiting.

ReplyDelete