Haskell Folds Illustrated
Introduction
Now that I’ve submitted my PhD thesis for examination (finally!) I can spend some time fiddling around with things that are unrelated to it. After a hiatus of several months I’m finally back to playing with Haskell.
One of the things that make Haskell notoriously hard and non-intuitive to learn is the idea of functions that act on other functions (higher order functions). Folds were the first thing that really made things fit together in my mind and helped me understand the strengths of functional programming and why people rave about elegance in haskell.
Haskell has two types of folds: the left fold and the right fold with slightly different function signatures. Folds take a function that takes two arguments and returns a third (a -> b -> a)
, a value that can be considered a “seed” value, and a list. For the right fold, the seed has to be of the same type as the second argument of the function and the list elements have to have the same type as the first argument of the function. For the left fold, this is reversed. The seed has to be of the same type as the first argument to the function and the list elements have to be of the same type as the first argument of the function.
foldl :: (a -> b -> a) -> a -> [b] -> a
foldr :: (a -> b -> b) -> b -> [a] -> b
Folds work like knocking down an arrangement of dominoes. It takes the seed value and the first element in the list and applies the function you have it. Then it repeatedly applies the function to the result and the next item in the list until all items in the list are consumed.
The Left Fold - Illustrated
graph TD
BN["B[N]"] --- FN
FN[fun :: b -> a -> b] -.- |"B[N-1]"|F2
FN --- AN["A[N]"]
F2[fun :: b -> a -> b] --- |"B[1]"|F1
F2 --- A1["A[1]"]
F1 --- S["B[0]"]
F1[fun :: b -> a -> b] --- A0["A[0]"]
The Right Fold - Illustrated
graph TD
BN["B[N]"] --- FN
FN[fun :: b -> a -> b] -.- |"B[N-1]"|F2
FN --- AN["A[0]"]
F2[fun :: b -> a -> b] --- |"B[1]"|F1
F2 --- A1["A[N-1]"]
F1 --- S["B[0]"]
F1[fun :: b -> a -> b] --- A0["A[N]"]
Applications
Summing lists
A standard illustrative example is how a fold can be used to sum a list.
Following the diagram, foldl (+) 0 [1, 2, 3, 4, 5]
would evaluate as 15. The fold function applies the (+)
operator to 0 and the 1 (the first element). Then apply (+)
to the result and the second element in the list and so on until we get the final result.
Integration
A neat extension to summing lists is numerical integration. The simplest integral of an signal $x[t]$ (represnted by a list of values) with a sampling time $\Delta T$ can be expressed very simply as a fold (combined with a map).
The equation
$$ S = \sum_{i=0}^{N} x[i] \cdot \Delta T $$
Turns into
simpleIntegral :: Double -> [Double] -> Double
simpleIntegral dT fvals = foldl1 (+) (map (*dT) fvals)
Here, foldl1
is a version of fold that takes the first element of the list as the seed.
Scans and Filtering
From the illustrations of the left and right folds you can see that the folds produce a set of intermediate results that get fed back into the function. When having the intermediate results are useful, we have scan
. scan
is a fold that returns the intermediate results as a list rather than just the final result. This is useful in a lot of filtering operations; for instance to calculate the exponential moving average of stock prices.
$$\begin{equation*} S_t=\begin{cases} Y_1 \quad &\text{if} , t = 0 \\ \alpha Y_{t} + (1-\alpha) S_{t-1} &\text{if} , t \gt 0 \\ \end{cases} \end{equation*}$$
The EMA equation can be implemented as:
ema :: Double -> [Double] -> [Double]
ema alpha vals = scanl1 (\prev curr -> alpha*curr + (1-alpha)*prev) vals
Summing Up
Interestingly, the Complementary Filter - often used to measure tilts using IMU sensors follow a very similar format and folds can be used to implement these type of filters. This is the biggest strength of using higher order functions. They capture commonly used patterns of calculations in a very general way. If you use a fold, the Haskell compiler will produce highly optimized code to run your calculation. If you were to use hand-written loops - especially in programming languages like C/C++ - you might write code that is inefficient or segfaults. Higher order functions are also very general. Haskell’s foldl
and foldr
works on any data type at all. This is the reason the type signatures for these functions don’t mention any concrete types. While I’ve only applied these functions to Double
data types, you can use this pattern on Strings, integers and on random custom defined data types. Higher order functions are also elegant. Rather than writing an ugly loop, I could write a one liner to do integration or complementary filtering. This is actually quite similar to how MATLAB scripting lets me express complicated operations like a matrix multiplication with a single operator.