How do you balance writing expressive code with good correctness guarantees? Haskell enables me to do this easily, which is why it’s one of my favourite programming languages. I used it to solve problems from Advent of Code, a programming advent calendar challenge I took part in. Here’s how I solved Day 14’s problem using Haskell’s exhaustive pattern matching and immutable data structures.
Advent of Code is a fun collection of programming problems released daily from 1 December until Christmas: An advent calendar for programmers! Problems start out easy and become harder toward the end.
Some participants are very competitive, racing to complete problems before anyone else can, and a global leaderboard lets you see how you’ve compared. Others take a more relaxed approach, exploring the problems and enjoying the fun and witty story that develops alongside as the days pass until Christmas.
Advent of Code problems are excellent programming Katas, exercises for honing your skills and refining your technique. From time to time, I revisit and refine my solutions in the spirit of Kata. The solution I discuss here is based on my initial solution but has some refinements.
I chose to complete the Advent of Code problems in Haskell because it provides a good set of exercises to explore Haskell’s unique advantages when writing safe code.
In particular, Haskell’s features make it easy to write correct code because it has:
- Strong static types that allow the compiler to check that function arguments are valid.
- Exhaustive pattern matching to ensure you handle all possible function arguments, preventing runtime errors.
- Immutable values preventing bugs caused by accidentally modifying shared data.
- Algebraic data types provide a transparent way to model problems and provide a natural scheme for writing DSLs and interpreting them.
- Powerful facilities for parsing and a strong ecosystem of parsing libraries make it easy to make sense of complex user input.
Day 14: Pattern matching and folding
On day 14 of Advent of Code you’re racing to the bottom of the ocean to save Santa’s sleigh keys, but the underwater pressure is too much for your submarine. You need to reinforce it with exotic elements produced by volcanic caves. To do so requires solving a chemistry problem.
Day 14’s problem is interesting because it requires thinking about how a string repeatedly changes when applying substitution rules to substrings. Part one can be solved with a direct approach that constructs a new string at each step. However, the straightforward approach is infeasible for part two because the string would become too long.
You are given a starting polymer as a sequence of characters, for example:
NNCB
In addition, there are substitution rules which apply to pairs of characters in a polymer, such as:
NN -> C
CB -> H
If the pair of characters on the left-hand side of a rule matches a pair occurring in the polymer, then the character on the right-hand side of the rule is substituted between those characters in the polymer. All applicable substitutions are applied simultaneously to get the next polymer.
The problem requires you to calculate counts of the most common and least common elements in the polymer that result after some number of rounds of substitution. Part one uses a small number of substitution rounds, but part two requires 40 rounds of substitutions. Because substitutions cause an exponential growth in the size of the string, if actual substitutions are used in part two, the polymer would be enormous after that many substitutions.
Fortunately, it is not required to find the actual polymer. Instead, you can track the counts of each ordered pair of characters in the polymer. The number of distinct pairs remains small even though their individual frequencies grow large.
Exhaustive pattern matching to avoid runtime errors
There are two steps to solving this problem:
- Finding the counts of sequential pairs of characters in the initial polymer and
- Iteratively substituting characters and updating the counts.
Haskell enables me to solve these two steps by:
- Using exhaustive pattern matching, and
- Using folds to process immutable data structures incrementally.
To count the pairs, here’s how I used Haskell’s exhaustive pattern matching:
type PairCounts = M.Map (Char, Char) Integer
type Polymer = String
countPairs :: Polymer -> PairCounts
countPairs [] = M.empty
countPairs [_] = M.empty
countPairs (c : tl@(d : _)) = M.insertWith (+) (c, d) 1 $ countPairs tl
There are three cases to match a polymer against:
- An empty string,
- a string with one character, and
- a string with at least two characters.
If the string has at least two characters, then the frequency of the next pair of characters is incremented in the map. By using the -Wincomplete-
patterns option for the GHC Haskell compiler and making warnings into compiler errors (-Werror
) you can be sure that if you miss a case in the pattern match, the compiler will catch the problem before your program runs.
Compiler flags can be configured using the stack build system for Haskell or by directly using the compiler on the command line.
The update to the map is immutable, returning a new map, but is also efficient, requiring O(logN)
time.
After finding the counts of sequential pairs in the initial polymer, it’s time to substitute the characters and update the counts. To do this, I used folds.
Folding for elegant incremental processing of data structures
In Haskell, folds are a common idiom for processing data structures. Folding requires a function that accepts an element from a collection and an intermediate result. The function uses these to produce the next result. The fold passes each element of the collection to the function to arrive at a final result. Folds are appropriate when you need to process a data structure and incrementally construct a result. In this case, each substitution rule might require updating an existing key in the map, so incremental updates are necessary.
Most built-in data structures in Haskell are immutable because this helps write correct code. In this case, I used a fold to process an immutable map of pair counts.
type Rules = M.Map (Char, Char) Char
applySubstitutions :: PairCounts -> Rules -> PairCounts
applySubstitutions pcs rs = M.foldrWithKey (\pr@(a, b) cnt acc ->
case M.lookup pr rs of
Nothing -> M.insertWith (+) pr cnt acc
Just repl ->
M.insertWith (+) (a, repl) cnt $
M.insertWith (+) (repl, b) cnt acc) M.empty pcs
Because the map of pair counts is immutable, there is no concern about accidental modification of state kept between substitutions and Haskell once again helps me avoid errors.
Using Haskell’s exhaustive pattern matching and immutable data structures enabled me to solve this problem and to do so in a safe way. As mentioned above, using Haskell’s unique features are a great way to avoid bugs and is one of the reasons I love it so much!
If you enjoyed reading about how I used Haskell to solve Advent of Code problems, check out the next two articles in the series where I solve Problems 16 and 18 by writing an interpreter for a DSL using Haskell’s expressive data modelling and using structural recursion on immutable trees, respectively.
Henry is a Scala engineer at GrapheneDB where he uses functional programming to manage graph databases in the cloud. He loves functional programming because it makes code robust and easy to reason about.