WARNING: The text below provides a guidance for a Project Euler problem.
If you pride yourself in tackling problems entirely on your own, you are encouraged to exit this post right now.
If you are simply looking for a hint for the problem, please visit the »Project Euler’s official forum«. You may find some hints here, but if you don’t want your problem spoiled, scroll cautiously.
If you are looking for a readily available copy/paste answer, you will not find it here. All of the code snippets in the post have been stripped of some crucial lines (these are clearly marked). It is my aim that anyone able to replicate the answer based on such snippets will have to understand the solution conceptually. At that point, by virtue of learning something new, I believe he deserves it anyway.
Guidance for future problems will not be published before 100 people have solved the problem and there are at least two more recent problems.
The problem tasks us with counting the function ; the number of lambda-terms that can be written with at most symbols, up to -equivalence. Even though the problem statement involving the -equivalence may look appalling at the first glance, it actually brings major simplifications.
The first thing to realize is that any -equivalence class (which we are trying to count) has a representative lambda expression where all of the variables in abstractions are distinct (Visually, any symbol appearing in the expression is followed by a different variable). This follows straight from what -equivalence does: allowing us to rename any free variables.
The fact that any expression is only of three types (pure variable, application or abstraction) gives us a nudge that the problem could be solvable by dynamic programming, and it is indeed the case. We will consider the function giving the number of lambda-terms (up to -equivalence) of length , such that they may contain further distinct variables from any previous abstractions. Then, quite clearly:
and what remains is to find a suitable recursive relation for . Distinguishing by the cases, we find:
where the break-down in the last case comes from our expression being either an application or abstraction. If it is abstraction, it must be of the form , thus we have used 5 symbols, increasing the number of abstracted variables by . If it is an application, it must be of the form , where we have used 2 symbols (and thus the lengths of the two groups must add up to ) and didn’t change the number of previously abstracted variables.
To speed up our future implementation and to make fewer recursive calls, we can also make use of the symmetry in the sum , rewriting it as:
This cuts the number of recursive calls roughly in half. We now estimate the total number of operations to calculatte using this recurrence. There is at most values that we will need to calculate ( options for the number of symbols, options for the number of abstracted variables, as each abstraction takes 5 symbols). For each of these, we will make (at most) recursive calls, with the average about calls. Overall, for given by the problem, we thus expect an upper bound of around calls to take place. This turns out to be fast enough indeed.
The following implementation with memoization runs in about 2 seconds using PyPy3. The number of times that was called is 267268795, consistent with the prediction.
The same approach can be implemented very succintly in the Julia programming language, which gives the answer after approximately 4 minutes.