Notes on Chapter 1
What is functional programming?
- Constructing programs from nothing but pure functions
What is a pure function?
- A function that does not access global variables
- A function that does not modify its input
- A function that does not throw exceptions or modify program execution
- A function that does not read user input or write to the file system
- A function that does not draw on the screen or play a sound
- ... in short:
- A function that always produces the same result for a given input and has no effect other than returning its result
What are the benefits of functional programming?
- Code that is highly modular
- Modularity is a measure of how self-contained a code unit is, and thus, how easy that unit is to combine with other units
- For example, Lego(TM) blocks are very modular because each block is entirely self-contained:
- They have a well-defined boundary defined by the physical extent of the plastic that makes up the block
- They have a mechanism for connecting to other blocks via the cylindrical protrusions on top
- They have a mechanism for receiving a connection from other blocks via the void space on bottom
- Pure functions only have access to the inputs they declare and are therefore entirely self-contained and maximally modular
- Pure functions behave exactly the same way as lego blocks:
- They have a well-defined boundary defined by the signature and body of the function
- They have a mechanism for connecting to other pure functions via return values and function evaluation
- They have a mechanism for receiving a connection from other pure functions via input values and function evaluation
- Modularity is a measure of how self-contained a code unit is, and thus, how easy that unit is to combine with other units
- Code that is easy to test
- Pure functions cannot access global variables and are therefore easier to test because they require no specific global context to be set up before the run
- Pure functions always produce the same result for a given input and are therefore easier to test because the tests only need to be run once
- Code that is easy to reuse
- Pure functions have no side-effects, always produce the same result for a given input, and do not access global variables so they are easy to safely and confidently reuse anywhere in the program
- Code that is easy to parallelize
- Pure functions cannot cause side-effects such as modifying their inputs, therefore they require no data locking mechanism when used in a parallel context, thus making it easier to write parallel algorithms
- Code that is easy to reason about
- Reasoning about code is the act of mentally evaluating the unit and predicting the resultant state of the program
- Pure functions cannot modify their inputs, they cannot access global variables, and they cannot change program state, the result of evaluation of a pure function is entirely contained within its return value, reducing the total number of things required to be held in memory during reasoning to a bare minimum
- To reason about a functional program, one can make use of the substitution model of program evaluation in which one simply replaces the name and arguments of a pure function with the result of the application of the function, allowing one to confidently forget the definition of the function after the substitution has been made
- When reasoning about code written in other programming paradigms, global state must be remembered and procedure calls or method calls can have arbitrary effects on program state, eliminating the possibility of using the simple substitution model and increasing the difficulty of reasoning about the code
- A procedure in Procedural Programming or a method in Object-Oriented Programming doesn't need to return a value at all, often intentionally mutating state in a higher context, thus requiring the programmer to hold many more things in mind while reasoning about the code
- Code that is obviously correct
- Functional programming eliminates entire classes of bugs due to its helpful restrictions on global state, immutable data, exceptions, and side-effects
- Because pure functions cannot access global state, they cannot fail due to the order in which functions are called
- Because pure functions cannot access global state, they cannot fail due to running in an incorrect context
- Because pure functions cannot mutate the value of their inputs, they cannot cause other functions to fail when composed together by passing the output of one into the input of another
- Because pure functions must always produce the same result given the same input, their behavior cannot change during the runtime of the program
- Because pure functions are easier to test, they are more likely to be correct
- Because pure functions cannot throw exceptions, the program cannot fail due to a runtime exception
For an example of an impure program, see the git repository
A formal definition of (pure, mathematical) functions
- A function has exactly one type of input and exactly one type of output
- A function relates every value of the input type to exactly one value of the output type
- A function's output value is determined solely by its input value
- Changes in the state of internal or external processes are irrelevant to the computation of a function's output value
From here onward, there will be a precise distinction between different types of computer code
- "Function" will be used to refer to the above definition of a function
- "Procedure" will be used to refer to parameterized code that may have side-effects
- "Method" may be used to refer to a procedure attached to a class instance in an OOP context
One interesting thing to note, in a language like TypeScript, is that you can create a function without the need for the function
or () => {}
syntax. Consider the following code with respect to the definition of a function:
const not = { [true]: false, [false]: true } const catsAreDogs = false const catsAreNotDogs = not[catsAreDogs] // => true
How about a slightly more complex example?
const and = { [true]: { [true]: true, [false]: false, }, [false]: { [true]: false, [false]: false, } } const fishCanSwim = true const catsHaveFur = true const fishCanSwimAndCatsHaveFur = and[fishCanSwim][catsHaveFur] // => true const dogsMeow = false const fishCanSwimAndDogsMeow = and[fishCanSwim][dogsMeow] // => false
Even more complex?
const plus = { 0: { 0: 0, 1: 1, 2: 2, }, 1: { 0: 1, 1: 2, 2: 3, }, 2: { 0: 2, 1: 3, 2: 4, } } const zero = plus[0][0] // => 0 const one = plus[1][0] // => 1 const two = plus[1][1] // => 2 const alsoTwo = plus[0][one] // => 2 const three = plus[one][alsoTwo] // => 3
Are you feeling it now Mr. Krabs??
How to produce useful programs without effects
While changing process state must be irrelevant to the computation of a function's output value, this does not preclude the possibility of side-effects if they remain irrelevant to the computation!
- The program itself is a function
- An irrelevant side-effect is unobservable from inside the program
- Unobservable side-effects do not break our definition of a function
This is how functional programming is able to produce useful programs while remaining mathematically pure!
Functional purity implies a concept known as referential transparency:
- An expression is referentially transparent if, for all possible uses of the expression, and for every occurrence of the expression in a given program, one can substitute the result of evaluating that expression in place without affecting the meaning of the code
Functional purity and referential transparency enables the substitution model of reasoning about code, also known as equational reasoning:
- Equational reasoning, in the context of a computer program, is the process of drawing conclusions about the program by substituting equal parts of the code
- "equa", "equi", is the latin root in both "equational" and "equal" meaning "equal; the same"
For example, consider a function shout
which takes a string and produces a string:
const shout = (s: string): string => s.toUpperCase()
Consider an application of this function to an arbitrary string in an arbitrary program:
const shout = (s: string): string => s.toUpperCase() const message = 'learn fp for the good of humanity' const shoutedMessage = shout(message); console.log(shoutedMessage)
Substitute equals for equals:
const shout = (s: string): string => s.toUpperCase() const shoutedMessage = shout('learn fp for the good of humanity'); console.log(shoutedMessage)
Substitute referentially transparent expressions:
const shoutedMessage = 'learn fp for the good of humanity'.toUpperCase() console.log(shoutedMessage)
Substitute referentially transparent expressions:
const shoutedMessage = 'LEARN FP FOR THE GOOD OF HUMANITY' console.log(shoutedMessage)
Substitute equals for equals:
console.log('LEARN FP FOR THE GOOD OF HUMANITY')
As you can see, referential transparency and equational reasoning, which are enabled by writing pure functional programs, makes the code very simple to reason about.
Now let's take a look at an example of a function that's not referentially transparent and see how it makes the program harder to reason about.
const strings = ['hello']; // => ['hello'] strings.push('world') const var1 = strings; // => ['hello', 'world'] strings.push('world'); const var2 = strings; // => ['hello', 'world', 'world']
In this example, you cannot simply substitute any occurrence of strings
for its definition. You need to mentally manage the side effects of each statement and the state of the program to determine what a variable's value might be at any given time. Likewise, equals symbols don't actually mean equality! What kind of madness is that?!
We have strings = ['hello']
and var1 = strings
and yet, var1
evaluates to ['hello', 'world']
! To put it bluntly, the program is lying to us. Imagine someone else reading your code, jumping through tens of procedure definitions, desperately trying to understand what's happening. There is a better way.
This is one of the reasons we choose functional programming. The substitution model is much simpler to reason about because we do not need to mentally simulate sequences of state updates to understand a block of code. A given function can be evaluated easily by substituting its arguments into its body.
The role of types in functional programming
Modularity and composition are two important concepts in functional programming. Modularity being the ability to separate and reuse a unit of code, and composition being the ability to combine a unit of code with another such that the output of one is the input of the other.
With type safety we are able to treat a function as a black box, meaning we don't need to know the implementation of the function to determine whether it results in a valid program. This enables fearless composition and increases the utility of modularity.
A type system is like a sort of compile-time unit testing framework. In common unit tests you might "arrange, act, and assert" to prove a theorem about a unit of code. We can do the same with a sufficiently advanced type system. In fact, a program is a proof, and the type for the program is the formula it proves.
To put it less formally, the return type of a function is analogous to a logical theorem, subject to hypotheses corresponding to the types of the argument values passed to the function; and the program to compute that function is analogous to a proof of that theorem. This is an observation known as the Curry-Howard Correspondence and it is the basis for all modern type theory.
A simple way to remember this is that whenever you see a function (remember the definition of a function), think "implies". Given a function, type F<A, B> = (a: A) => B
, think of the phrase "F states that A implies B"
Let's see how this works with a concrete example.
// The simplest theorem I can think of, simply stating that values of type T exist, meaning they can be constructed either manually via declaration, or as the result of evaluating an expression type Exists<T> = T; // Now here is function using this theorem, proving it correct for Boolean values: const booleansExist = (b: boolean): Exists<boolean> => b;
Given the Curry-Howard Correspondence, the fact that we are able to write a function which satisfies the return type means that we have proven the theorem it represents!
In other words, the function "booleansExist" proves the theorem that, given any boolean value (input), there exists a corresponding boolean value (output) related to it by taking the value and doing nothing to it (body).
While this may be a simple example, much more complex types can be created corresponding to much more complex theorems. We'll get into some truly mind-boggling examples in TypeScript later on.