Saurabh Nanda Personal Blog

Dhall: A Gateway Drug to Haskell

The problem with escape hatches

Haskell is hard. In fact, I have a 45-min talk from Functional Conf 2019 about why learning Haskell is hard and how to deal with its steep learning curve.

But recently, I’ve realized that learning typed functional programming the correct way is hard irrespective of the language. Take Kotlin, for example. It has so many escape hatches that someone new to functional programming can end up writing what is essentially an OOP/imperative-style Java program, merely in Kotlin syntax. One simply feels like they’re doing functional programming without actually doing it.

I have witnessed this first hand when I helped transition an Android development team from Java to Kotlin last year.

It’s just that Haskell gets a bad reputation because it has almost no escape hatches and forces you to do things the “right” way.

Learning Haskell directly can be a very frustrating experience for all but the most headstrong. Trying to directly introduce Haskell to a large team is almost guaranteed to fail.

However, there are a few languages, heavily inspired by Haskell, that have a much gentler learning curve. And these aren’t merely learning tools. They are production-grade languages that you can use to build applications and solve real-world problems. Once you grasp the fundamentally different concepts of typed functional programming in these other languages, approaching Haskell becomes simpler.

For the frontend, Elm is a good stepping stone towards Haskell.

On the server-side, I’ve recently found that Dhall hits a certain sweet spot as well.

Even though it’s in the “server-side” category, Dhall cannot be used to write microservices or API servers. It is a specialized language whose primary purpose is managing large and complex configuration files (which is what most backend/server-side developers need to deal with).

Why Dhall is useful irrespective of functional programming

Using a specialized tool to manage configuration files might seem like overkill to someone who has not dealt with this problem. I was in this camp a few years ago. [1]

Then I met Kubernetes.

Seriously. If you want to get a first-hand feel of “configuration explosion”, and its ensuing chaos, try writing Kubernetes config files for a set of services across multiple environments (e.g. dev, staging, and production).

I’m not saying that Dhall is used to manage only Kubernetes config files. It’s just that dealing with Kubernetes’ configuration mess helps you immediately understand the power and use-case of something like Dhall.

If you work in a mid- or large-sized company that has moved all its infrastructure to the cloud, there is a very high chance that you are already dealing with a configuration explosion, irrespective of whether you’re on Kubernetes or not.

Ensuring different config files agree with each other

One needs to ensure that configuration values across different configuration files agree with each other. For example:

  • the “topic” (or queue) to which service A is publishing should be the same one to which service B is subscribed.
  • the route/path on which your app is serving some sensitive content should be the same route/path that nginx is protecting via basic auth [2]
  • the port at which your service is listening should be the same port exposed by the container, which should, in turn, be the same backend port being used by your ingress (or load-balancer)
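
In Dhall, the usual way to get this kind of agreement is to define each value once and refer to it everywhere it is needed. A minimal sketch, assuming hypothetical field names (listenPort, containerPort, backendPort, basicAuthPath) that are not taken from any real schema:

```dhall
-- All names below are illustrative assumptions, not a real schema.
let servicePort = 8080

let adminPath = "/admin"

in  { app = { listenPort = servicePort, adminRoute = adminPath }
    , container = { containerPort = servicePort }
    , ingress = { backendPort = servicePort }
    , nginx = { basicAuthPath = adminPath }
    }
```

Because every consumer reads the same binding, changing servicePort in one place changes it for the app, the container, and the ingress together.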

Managing repetition within the same config file

Again, taking an example from the Kubernetes world: there is a huge amount of repetition within Kubernetes config files. One needs to ensure that a particular key has the same value across multiple stanzas. If not, your service won’t work properly at run time, i.e. your kubectl apply command will work but you’ll spend hours debugging where you forgot to copy-paste the same value.
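
A rough sketch of how a let binding removes this copy-pasting. The record below only imitates the shape of a Kubernetes Deployment; it is a simplified, hypothetical structure, not the real Kubernetes schema:

```dhall
-- Simplified, hypothetical shape loosely inspired by a Deployment.
let appName = "billing"

-- Defined once, reused in every stanza that must carry the same labels.
let labels = { app = appName }

in  { metadata = { name = appName, labels = labels }
    , spec =
      { selector = { matchLabels = labels }
      , template = { metadata = { labels = labels } }
      }
    }
```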

Managing duplication between config files of different environments

Take a diff between your development, staging, and production config files. What percentage of the file is duplicated? More than 70%? Even if the values are different, every single key is duplicated.

And what happens when you add a new key to the development config but forget to add it in the production config?
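
One way to approach this in Dhall is to keep a single base record and override only what differs per environment, using the `//` operator (which prefers fields from its right-hand side). A minimal sketch with made-up keys and hostnames:

```dhall
-- Keys and hostnames are hypothetical, for illustration only.
let base = { logLevel = "info", dbHost = "localhost", replicas = 1 }

in  { development = base
    , staging = base // { dbHost = "db.staging.internal" }
    , production = base // { dbHost = "db.prod.internal", replicas = 3 }
    }
```

A key added to base shows up in all three environments, so production cannot silently miss it.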

Dealing with malformed config files at run-time

If you’re writing YAML files by hand, small typos and indentation errors can cause the entire YAML to become invalid and your deployment to fail.

Same story with trailing commas and unterminated strings in JSON.

Granted, these types of errors can easily be caught by linters, but how many of us habitually set up such tooling around each and every config file that we manage? We usually realize our mistake only once our application refuses to start.

Dealing with structurally incorrect configuration files

Even if our config file is syntactically correct, it is very easy to end up with a config file that is structurally incorrect, i.e. it does not conform to the schema/shape that the application is expecting.

For example, adding a key at the incorrect level of nesting (a very easy mistake, especially with YAML). Or, adding a single scalar value instead of a list with a single element (if the application is expecting the value to be a list).

Again, you can write a separate schema for your config file (for example using json-schema) and validate the structure beforehand, but how many of us really set up such tooling?
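
In Dhall, a type annotation can play the role of that schema, and it is checked before any output is produced. A minimal sketch, assuming a hypothetical Config type:

```dhall
-- Hypothetical schema for an application's config.
let Config = { name : Text, allowedHosts : List Text }

let config
    : Config
    = { name = "api"
      -- Writing allowedHosts = "example.com" (a scalar instead of a list)
      -- would be rejected by the type checker.
      , allowedHosts = [ "example.com" ]
      }

in  config
```

A misplaced key is caught the same way, because the record would no longer match the Config type.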

Ensuring no surprises when promoting code + config artifacts to higher environments

Finally, if your config files are maintained completely separately for each environment, then you will be introducing a completely untested configuration file directly in the higher environment. That is to say, there is no way to ensure that if the combination of code + config has been tested in the lower environment, then it will largely work when promoted to a higher environment.

However, if all your configuration files are generated from the same single source of truth, then apart from obvious mistakes in values (e.g. incorrect IP address of the DB), large parts of your configuration would have already been tested in lower environments. This gives you very few surprises as you promote the combination of code + config to higher environments.

To summarize, Dhall ensures that your configuration is correct by construction, and is guaranteed to be valid and consistent, ahead-of-time (i.e. before your application reads the config file).

How is Dhall similar to Haskell?

Now, let’s quickly get a feel of what the same code looks like in Dhall and Haskell.

Here’s a Dhall code snippet to prepare two configuration files for two different applications that need to agree with each other. Firstly, both the services should be reading/writing to the same topic/queue. Secondly, both the services should log to the same place to allow for better debugging.
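
A minimal sketch of such a setup, using hypothetical service names, field names, and paths (an illustration of the idea, not the exact code from the talk), might look like this:

```dhall
-- Hypothetical names throughout; a sketch, not a definitive layout.
let Service = { name : Text, topic : Text, logFile : Text }

-- Defined once, so both services must agree on these values.
let sharedTopic = "user-signups"

let sharedLog = "/var/log/myapp/combined.log"

-- A function from a service name to a complete, type-checked config.
let mkService =
      \(name : Text) ->
        { name = name, topic = sharedTopic, logFile = sharedLog } : Service

in  { producer = mkService "producer", consumer = mkService "consumer" }
```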

Now, here’s the same code snippet in Haskell. Read through the inline comments to understand why, even though the Haskell code is conceptually the same, it would be much harder to write for a newcomer.

Why is Dhall a gateway drug to Haskell?

Based on my first-hand experience, it is not enough for a language to have opt-in functional programming features. Folks with years of OOP/imperative-programming experience will naturally gravitate towards solving problems using approaches that come naturally to them. Therefore, it is important to use an FP language that forces FP idioms upon you.

Haskell has those forcing functions. But it also has a very, very steep learning curve.

Dhall, on the other hand, forces you to start learning, applying, and practicing [3] functional programming concepts with a much gentler learning curve and a better developer experience, all while delivering something of business value (i.e. better configuration management).

What core/common FP concepts does Dhall have?

  • Immutability: This is the very first stumbling block of every imperative programmer. At the beginning, it’s almost inconceivable how it would be possible to achieve anything useful in a programming language that doesn’t allow variable values to be changed.
  • Explicit nulls: Another beginner stumbling block. A language without implicit nulls really forces you to start thinking about how to deal with values that may be missing and whether it’s really worth littering the internals of your code with null checks (versus rejecting null values at the boundary of your program).
  • Currying & lambdas: Compare the syntax of defining functions in Haskell with that of Dhall. While every function in Haskell is curried, by default, the language syntax does not make it obvious. And this also leads to some incomprehensible error messages. On the other hand, while Dhall’s syntax for defining functions may seem a little noisy at first, it makes it obvious to a newcomer what is going on.
  • Functional purity: This is one of those FP features that generates quite a lot of debate. A number of awesome features in languages like Haskell and Purescript are possible only because of functional purity. Even Dhall forces functional purity upon you, but in a more manageable way. In fact, there is very little I/O that you can do in Dhall in the first place.
  • Homogeneous lists & sum/union types: A stumbling block for folks coming from dynamically typed languages, where lists/arrays can be heterogeneous in nature. The constraint of having homogeneous lists is one of the reasons why people start to explore sum/union types and this paves the way to using a rich type system for other advantages, as well.
  • Everything is an expression: This is subtle and folks take some time to realize this. Sandeep has written a long-ish take about this over at his blog in the section titled “About Haskell Functions”.
  • Forced usage of the type-system: If you start following discussions about the design of the Dhall language over at the Dhall GitHub repo or listen to some talks by Gabriella, you’ll realize that they try very hard to design the language in such a way that developers are forced to do the “Right Thing” (tm). For example, Dhall doesn’t have any way to compare the equality of two String/Text values! The rationale being that this is an easy escape hatch to circumvent the strong type-system and fall back to “stringly” typed programming.
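
To make a few of these concepts concrete, here is a small Dhall sketch touching on explicit nulls, curried lambdas, and union types in a homogeneous list. The type and field names (Backend, mkUrl, timeoutSeconds) are made up for illustration:

```dhall
-- A union (sum) type: lets one homogeneous list hold different kinds of backends.
let Backend = < Postgres : { host : Text } | InMemory >

-- Explicit null: a possibly-missing value has to be declared as Optional.
let timeoutSeconds
    : Optional Natural
    = None Natural

-- Lambdas take one argument at a time, so the currying is visible in the syntax.
let mkUrl =
      \(host : Text) -> \(port : Natural) -> "${host}:${Natural/show port}"

in  { backends = [ Backend.Postgres { host = "db.internal" }, Backend.InMemory ]
    , timeoutSeconds = timeoutSeconds
    , localUrl = mkUrl "localhost" 5432
    }
```

Every let binding above is immutable, and the whole file is a single expression that evaluates to a record.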

Slides from Functional Conf 2022 on the same topic


Footnotes

  1. As can be clearly seen by this question that I asked on the Dhall GitHub repo.

  2. We’ve all been there. Don’t lie! 

  3. Most of these basic FP concepts are really about practicing and building muscle memory. If you don’t use them to solve real-world problems on a daily basis, your mind will not get re-trained to naturally solve problems using FP idioms.