Notes on Chapter 2
With pure functions, how do we write the simplest program?
How can we change our view of programs as a sequence of instructions, each of which cause some effect, into a view of programs as a composition of pure functions?
Should I implement the abs function from page 2/9?
Code on the left-hand side of an equals sign is referred to as its "Signature", code on the right-hand side of an equals sign is referred to as its "Definition".
Example: const signature: string = 'definition';
Despite type inference, it is generally good practice to declare the return type of functions you expect other people to use.
Functions that return void and have side-effects are often referred to as procedures to emphasize the fact that they are impure.
Functional Program State
Suppose we want to write a function to calculate the factorial of a given number:
const factorial = (n: number): number => { const loop = (n: number, state: number): number => n <= 0 ? state : loop(n - 1, n * state); return loop(n, 1); };
Notice that we do not use a mutable loop variable in this example, yet we're able to "loop" or iterate from number 1 to n. The key to functional iteration is recursion.
Another notable aspect of this code is the declaration of the inner function called loop
: in TypeScript we can define a function in any block of code, including inside another function. Scoping rules keep the identifier (loop
) local to the function (factorial
) just like a local variable. This means the declaration is not a side-effect, because it does not affect the global context of the program!
Already you might be starting to get an understanding of how it is possible to perform stateful calculations with pure functions. Indeed, the total number accumulated so far during the calculation of the factorial is a form of program state!
Aside: now would be a good time to talk about tail-call optimization, the ECMAScript specification, and browser support for tail-call optimization, however, this will be left as an exercise for the reader. The important thing to note is that recursion of this nature is not in danger of overflowing the maximum number of call stack frames if you're using a runtime following the ECMAScript specification, as recursion in the "tail-call position" is transformed into the same lower representation as a while loop.
Denotational vs Operational Semantics
There's an important distinction between functional and imperative code, specifically a distinction in the meaning of the source code itself:
- Imperative source code is composed of statements. The meaning of these statements are that the machine should perform the corresponding operations, e.g. "allocate memory for this variable", "place this value in this memory location", "jump to this memory location", etc. This is called operational semantics.
- Functional source code is composed of expressions. The meaning of these expressions are that the given expressions are true, that the given relationships exist, e.g. function signatures and function bodies:
- "factorial is a relationship between a number and a number"
- "the inner loop of the factorial is a relationship between two numbers and a number"
- "the factorial, for n to zero, is, given zero, the current accumulated state, or given a number greater than zero, the factorial of that number with the given state multiplied by the number"
- This is called denotational semantics.
Higher-order Functions
Functions are values. Functions can be assigned to variables. Functions can be passed to other functions. Functions that accept other functions as values are called Higher-order Functions (HOFs).
Suppose we have two functions that both perform a calculation and print the result to the console:
const addThreeAndPrint = (x: number) => { const result = x + 3; console.log(`The result is: ${result}`); }; addThreeAndPrint(2); // => 5 const multiplyTwoAndPrint = (x: number) => { const result = x * 2; console.log(`The result is: ${result}`); }; multiplyTwoAndPrint(2); // => 4
Noticing that the two functions are nearly identical, we can generalize the shared code as a function:
const calculateAndPrint = (n: number, f: (x: number) => number) => { const result = f(n); console.log(`The result is: ${result}`); };
Now our function calculateAndPrint
receives a function to perform a calculation and handles the duplicate responsibility in the previous examples, thus reducing the total code and promoting reuse! Let's see what the resulting code looks like:
const addThree = (x: number) => x + 3; const multiplyTwo = (x: number) => x * 2; const calculateAndPrint = (n: number, f: (x: number) => number) => { const result = f(n); console.log(`The result is: ${result}`); }; calculateAndPrint(2, addThree); // => 5 calculateAndPrint(2, multiplyTwo); // => 4
Excellent. Here, calculateAndPrint
is an example of a higher-order function: part of its definition is receiving another function! This is a way to compose functionality in a purely-functional way. Think of it as plugging a cord into a socket, or clicking a lego block into place.
Aside: you may be wondering why certain variable names are chosen, such as 'x', 'n', 'f', etc. These are just conventional names, mostly for convenience. The values they represent tend to be rather abstract and naming such abstract things tends to be hard to do in a concise manner.
Polymorphic Functions
- Monomorphic functions operate on specific types of data
- Polymorphic functions operate on generic types of data
Aside: polymorphism is a broad concept and has multiple meanings. Remember the meaning of the roots of the word in order to better remember the meanings in the different contexts. "Poly" means "many" and "morph" means "form", thus the meaning of "polymorphic" in a given context can usually be deduced by considering the pieces that can take many forms. In a functional context, with respect to the signature of a function, "polymorphic" refers to a parameter which does not have a single concrete type but can instead be one of many different types.
An example of a monomorphic function would be:
const findFirst = (strings: string[], key: string): number => { const loop = (n: number): number => { if (n >= strings.length) return -1; else if (strings[n] === key) return n; else return loop(n + 1); }; return loop(0); };
As you can see, the function takes specific types of data: a string array to search from, a string key to search for, and it gives back a number indicating the index of the found string (or -1 if not found).
Notice how we avoided imperative looping again.
Now image we wanted to be able to search for numbers in an array of numbers:
const findFirst = (numbers: number[], key: number): number => { const loop = (n: number): number => { if (n >= numbers.length) return -1; else if (numbers[n] === key) return n; else return loop(n + 1); }; return loop(0); };
As you can see, this is very much the same function. The only difference is in the type of the array and the type of the key we're searching for. This is a great use case for a polymorphic function which would allow us to search an array of any type for a value of that type!
Let's see what that looks like:
function findFirst<T>(items: T[], comparator: (item: T) => boolean) { const loop = (n: number): number => { if (n >= items.length) return -1; else if (comparator(items[n])) return n; else return loop(n + 1); }; return loop(0); }
The one important difference to note is that we can no longer use the standard TypeScript triple-equals operator to compare because we have to be able to handle arbitrary types. We change the second parameter into a function (making findFirst
a higher-order function!) which takes an item in the array and returns a boolean value indicating whether we've found the right one.
This is the correct way to preserve the semantics, or in other words the meaning or purpose, of the two original functions, otherwise it would behave differently for value types vs reference types. There's a much deeper philosophical consideration to be had of the notion of equality that will, for now, be left up to the reader but it will become important again later on.
For now, take notice of the fact that the universe of possible implementations is significantly reduced when implementing a polymorphic function. In fact, the only operations you can perform on the polymorphic type are the ones passed into the function as arguments![1] You may find that some type signatures constrain a function such that there is only one valid implementation.
[1] Except, of course, for the methods that are built-in to all types in TypeScript, but this is an implementation detail of the language that is not relevant in general to functional programming
In fact, why don't we take a look at such a function with only one valid implementation? We'll start with the type definition:
type Partial<A, B, C> = (a: A, f: (a: A, b: B) => C) => (b: B) => C;
This is a bit of a leap in complexity, considering we're dealing with three type parameters and a higher-order function, so let's unpack it:
- This is the type of a function, it has inputs and an output
- It has three type parameters, A, B, and C
- There are two inputs: a value of type A, and a function of type (A, B) => C
- There is one output: a function of type (B) => C
Interesting! Now let's consider what kind of function we could write that would conform to such a highly-generic signature. Where can we get a value of type (B) => C inside a function whose only inputs are a value of type A, and a value of type (A, B) => C? We aren't allowed to use global variables, so we can't reference one that already exists. We could try to declare a function that fits the signature and return that but that wouldn't work either because we have no means to construct an arbitrary type C! Well, maybe we can use the hints given to us by the type signature, considering that the name of the type is "Partial" and the arguments seem to be connected. Our first argument is of type A, and the second argument is a function which takes an argument of type A!
So let's try this, we'll declare a new function which takes an arbitrary type B, but we need a way to get a value of type C:
function partial<A, B, C>(a: A, f: (a: A, b: B) => C): (b: B) => C { const g = (b: B): C => ??? return g; }
Luckily, we have a function that produces a value of type C: the second parameter to partial
! All we need to do is take the function f
and use it inside our new function g
. When g
is called, it will have access to both the value of type A from partial
and the value of type B from g
! In fact, this is the only possible implementation[2].
function partial<A, B, C>(a: A, f: (a: A, b: B) => C): (b: B) => C { const g = (b: B): C => f(a, b); return g; }
Aside: when declaring a function inside a function, the inner function has access to all the arguments of the outer function. This is called "closure". If you think about it, it makes a lot of sense. The definition of the inner function is part of the body of the outer function. The inner function comes into existence because of the evaluation of the outer function so it seems almost obvious that the inner function would have access to anything in scope in the body of the outer function.
[2] Again, disregarding the TypeScript/JavaScript specific implementation details such as the prototype chain. We strive to learn generally-applicable techniques because the return on the investment of time and energy required to learn is much greater. We'll be able to write modular, testable, maintainable, parallelizable code in any language, rather than limiting ourselves to languages that support (for example) prototype-based inheritance
Let's look at another example. Again we'll start with the type signature and see if we can determine what the allowable implementation would be:
type Compose<A, B, C> = (f: (b: B) => C, g: (a: A) => B) => (a: A) => C;
Considering the type signature:
- This is the type of a function, it has two inputs and one output
- There are three generic type parameters,
A, B, C
- The inputs are also functions, a function
f
of typeB => C
and a functiong
of typeA => B
- The output is also a function, a function of type
A => C
We should note, again, that this function is highly generic, meaning the universe of implementations is going to be very narrow. Indeed, this is another example of a function with only a single valid implementation.
We need to return a function which takes a value of type A and returns a value of type C:
function compose<A, B, C>(f: (b: B) => C, g: (a: A) => B): (a: A) => C { const h = (a: A): C => ??? return h; }
There is no way to construct an arbitrary type C, we will need to make use of the functions passed in as parameters. Luckily, we have exactly what we need: f
takes a value of type B and creates a value of type C, and g
takes a value of type A and creates a value of type B. All we have to do is call function g
and pass the result to function f
! Then our return value will be the correct type!
function compose<A, B, C>(f: (b: B) => C, g: (a: A) => B): (a: A) => C { const h = (a: A): C => f(g(a)); return h; }
Let's go over one more example of highly generic functions that only seem to have one correctly implementation to really drive the point home. We'll start with the type:
type Curry<A, B, C> = (f: (a: A, b: B) => C) => (a: A) => (b: B) => C;
Considering the type signature:
- This is the type of a function, it has one input and one output
- Input:
(a: A, b: B) => C
- Output:
(a: A) => (b: B) => C
- Input:
- The output itself is a function, which also has one input and one output:
- Input:
(a: A)
- Output:
(b: B) => C
- Input:
- The output of that function, likewise, is a function with one input and one output:
- Input:
(b: B)
- Output:
C
- Input:
- There are three generic type parameters,
A, B, C
Given the highly generic type signature, we again assume the universe of implementations is highly limited. Let's give the implementation a try:
function curry<A, B, C>(f: (a: A, b: B) => C): (a: A) => (b: B) => C { const g = (a: A) => ??? return g }
So we know we must return a function which takes a value of type A
and returns a value of type B => C
. The question is, how do we get that value? We have no means of constructing an arbitrary generic primitive value (that would require that every type in our language has e.g. a new
operator a.k.a. constructor), so clearly our options are limited to functions.
Observing that our input is a function with a very relevant signature, there must be a way to implement the given type. The input function has the two parameters we need, but it expects them at the same time. Given the principles of functions-as-values and closure, it's clear we have an answer. We'll simply return a function which takes the first argument a: A
then passes back another function which takes the argument b: B
and applies our input function to get the needed output value C
!
function curry<A, B, C>(f: (a: A, b: B) => C): (a: A) => (b: B) => C { const g = (a: A) => { const h = (b: B) => { return f(a, b); }; return h; }; return g; }