Memoization and life actuarial software

Memoization is storing the results of functions to avoid recomputing them later on. It is often used to speed up recursive calculations. Let's do a simple calculation to see how it works.

A person dies with probability t100\frac{t}{100} each year. What is the probability of being alive at the beginning of each year for 5 years?

This can be calculated without recursion, but we use recursion here to illustrate something.

Functions are named in a way that might be familiar to actuaries but here is a quick explanation:

  • q(t) is the mortality rate at time t, the percentage of people that die each year
    • 1-q(t) is the probability of not dying in year t
  • pols_if(t) is how many policies there are at time t
    • In this case we start with a single person
    • pols_if(0) == 1 by assumption
    • pols_if(1) == pols_if(0)*(1-q(0))
    • pols_if(2) == pols_if(1)*(1-q(1))
      • And so on

Recursion with no cache

def q(t):
    return t/100

def pols_if(t):
    print(f"pols_if({t})")
    if t == 0:
        return 1
    return (1 - q(t)) * pols_if(t-1)

Calculate probabilities:

[pols_if(t) for t in range(5)]

See that it is recomputing the same values over and over again.

pols_if(0)
pols_if(1)
pols_if(0)
pols_if(2)
pols_if(1)
pols_if(0)
pols_if(3)
pols_if(2)
pols_if(1)
pols_if(0)
pols_if(4)
pols_if(3)
pols_if(2)
pols_if(1)
pols_if(0)

[1, 0.99, 0.9702, 0.9410939999999999, 0.9034502399999998]

pols_if(0) is called 5 times, since the recursion must hit the base case before returning, and we call the function 5 times. pols_if(1) is called 4 times, pols_if(2) is called 3 times, and so on.

Overall the calculation calls pols_if 1+2+...+n1+2+...+n times, giving it a time complexity of O(n2)O(n^2). The space complexity is O(n)O(n) due to the stack space used in recursive calls.

Recursion with cache

The @cache decorator makes it so that if the value of a function call has been computed once, it will be stored and used again. So pols_if(1) will only call pols_if(0) once and then the next time pols_if(1) is called, the stored value will be used and pols_if(0) will not be called again.

from functools import cache

@cache
def pols_if_cache(t):
    print("pols_if", t)
    if t == 0:
        return 1
    return (1 - q(t)) * pols_if_cache(t-1)

Calculate probabilities:

[pols_if_cache(t) for t in range(5)]

pols_if_cache is only called once for each value of t.

pols_if 0
pols_if 1
pols_if 2
pols_if 3
pols_if 4

This gives our calculation a time complexity of O(n)O(n). The space complexity is O(n)O(n) due to the values in the cache.

Recursion with LRU cache

The @lru_cache decorator is similar to @cache, but it has a limit on the number of values it can store. If the limit is reached, the least recently used value is removed from the cache. This is useful if you don't want to store all the values in the cache.

from functools import lru_cache

@lru_cache(maxsize=1)
def pols_if_lru1(t):
    print(f"pols_if({t})")
    if t == 0:
        return 1
    return (1 - q(t)) * pols_if_lru1(t-1)

Calculate probabilities:

[pols_if_lru1(t) for t in range(5)]

See that pols_if_lru1 is only called once for each value of t.

pols_if(0)
pols_if(1)
pols_if(2)
pols_if(3)
pols_if(4)

[1, 0.99, 0.9702, 0.9410939999999999, 0.9034502399999998]

The time complexity is O(n)O(n), but the space complexity is O(1)O(1) since only one value is stored in the cache. Here is an animation showing the difference between the regular cache and the LRU cache.

For larger objects the benefits of the LRU cache are more apparent. If we had 1,000,000 64-bit floating point values then the LRU cache would use 1,000,000×64 bits=8 megabytes1,000,000 \times 64 \text{ bits} = 8 \text{ megabytes} no matter how many timesteps there were. But if we ran a monthly simulation for 100 years, then there are 1200 timesteps, and caching every value (no LRU) would use 1,000,000×1200×64 bits=9.6 gigabytes1,000,000 \times 1200 \times 64 \text{ bits} = 9.6 \text{ gigabytes}.

Issues related to this have come up in the past, see this post from the modelx blog.

Performance

%timeit [pols_if(t) for t in range(5)]
%timeit [pols_if_cache(t) for t in range(5)]
%timeit [pols_if_lru1(t) for t in range(5)]

Yields

1.7 µs ± 2.79 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
303 ns ± 0.811 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
987 ns ± 3.31 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

Is recursion with memoization really necessary here?

We could also do something like np.cumprod(1-q), but actuarial calculations are generally a bit more complex than the example given. We discuss recursion with memoization because it is a general approach that is easy to code. Which is probably why actuarial software has users define functions recursively.

Actuarial software

Life actuaries perform recursive calculations inside custom development with built-in memoization, one example of open source actuarial modeling software is modelx.

modelx is a numerical computing tool that enables you to use Python like a spreadsheet by quickly defining cached functions. modelx is best suited for implementing mathematical models expressed in a large system of recursive formulas, in such fields as actuarial science, quantitative finance and risk management. - modelx documentation

There is a modelx plugin for the Spyder IDE which provides features not normally seen in environments like VS Code, like a dependency graph of the function calls. The modelx blog has posts arguing that its system is better than just using Python for these types of calculations.

Several actuaries have made demonstrations of caching recursive functions outside of modelx:

Microsoft Excel and Recursion

Considering that "Use Python like a spreadsheet!" is the slogan of modelx, I thought it worth noting that some computer science professors advocate for teaching dynamic programming through the use of spreadsheets (Dynamic programming from an Excel perspective). For context, dynamic programming problems are often solved via recursion with memoization. People sometimes argue about what is/isn't dynamic programming so I've refrained from using the term to describe the previous probability calculations.

Here is a spreadsheet easily calculating pols_if from our example.

The spreadsheet is a bit simpler than the Python code, but generally actuaries want to do this calculation for a large number of policies and spreadsheets are not designed for this.

Thanks for reading

The computations surrounding the actuarial profession haven't received much attention from a computer science perspective. I think it's just that the actuarial profession is a small niche and there aren't many people who are interested in both actuarial science and computer science. Luckily, there is a LinkedIn group for the Actuarial Open Source Community.