Open In Colab

Fast Computation of the Lagged Fibonacci Generator¶

We define the sequence of a lagged Fibonacci generator as:

$$ S_n = \begin{cases} 1, & 0 \le n < D \\ S_{n-D} + S_{n - D+1} \pmod m, & n \ge D \end{cases}, $$

where $m \in \mathbb{N}^+$ is a modulus larger than 1.

Naive Implementation¶

Below we show a simple example with a naive implementation of the sequence:

In [5]:
D = 4
n = 50
m = 1234
In [6]:
from typing import Callable
import time

def naive_lfib(D: int, m: int) -> Callable[[int, bool], list[int] | int]:
    """
    Creates a Lagged Fibonacci Generator (LFG) function.

    Args:
        D (int): The lag period of the generator. Must be >= 2.
        m (int): The modulus to apply for generating the sequence. Must be > 0.

    Returns:
        Callable[[int, bool], list[int] | int]: A generator function that produces
        the sequence up to the nth element or the last element only.

        The generator function has the following parameters:
            n (int): The number of terms to generate in the sequence. Must be >= D.
            full_sequence (bool): Whether to return the full sequence or just the nth term.
                Defaults to True (return full sequence).

    Examples:
        >>> gen = naive_lfib(5, 100)
        >>> S = gen(10)
        >>> print(S)  # Full sequence
        >>> print(gen(10, full_sequence=False))  # Last term only
    """
    def generator(n: int, full_sequence: bool = True) -> list[int] | int:
        if D < 2:
            raise ValueError("D must be at least 2.")
        if m <= 0:
            raise ValueError("m must be greater than 0.")
        if n < D:
            raise ValueError("n must be greater than or equal to D.")

        # Initialize the sequence with D initial terms set to 1
        S: list[int] = [1] * D

        # Generate terms up to the nth term using the LFG formula
        for _ in range(n - D + 1):
            S.append((S[-D] + S[-D + 1]) % m)

        # Return the full sequence or just the last term based on the flag
        return S if full_sequence else S[-1]

    return generator

# Example usage
D = 5
m = 100
n = 30

start = time.time()
gen = naive_lfib(D, m)
S = gen(n)

print("Time needed (in seconds):", time.time() - start)

print(f"Solution for S_{n} (mod {m}) is {S[-1]}")
print("The whole sequence is:")
print(S)
Time needed (in seconds): 0.0001308917999267578
Solution for S_30 (mod 100) is 71
The whole sequence is:
[1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 4, 4, 4, 5, 7, 8, 8, 9, 12, 15, 16, 17, 21, 27, 31, 33, 38, 48, 58, 64, 71]

Faster Implementation¶

However, the above approach quickly becomes infeasible once the values of $n$ get larger. In this case, a more efficient implementation is necessary. Although it may not be the most efficient approach, we choose to compute $S_n$ using a binary & modular exponentiation algorithm, which can efficiently compute powers of a matrix $\mathbf Q$. The core idea is to obtain $S_n$ by means of a multiplication $\mathbf Q \cdot \mathbf s_{n-1}$, where $\mathbf Q \in \mathbb{N}^{D \times D}$ and $\mathbf s_{n-1} = \begin{bmatrix} S_{n-D} & \ldots & S_{n-2} & S_{n-1}\end{bmatrix}^\mathsf{T} \in \mathbb{N}^D$.

For example, for $D = 5$, we compute $S_n$ using the matrix $\mathbf Q$ and vector $\mathbf s_{n-1}$ as follows:

$$ \mathbf s_n = \left[\begin{matrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 1 & 1 & 0 & 0 & 0 \end{matrix}\right] \left[\begin{matrix} S_{n-5} \\ S_{n-4} \\ S_{n-3} \\ S_{n-2} \\ S_{n-1} \end{matrix} \right] = \left[\begin{matrix} S_{n-4} \\ S_{n-3} \\ S_{n-2} \\ S_{n-1} \\ S_{n-5} + S_{n-4} \end{matrix} \right], \quad n \geq D. $$

The two ones in the last row of $\mathbf Q$ correspond to the lagged sum $S_{n-D} + S_{n-D+1}$, while the diagonal of ones results in a delay operation. The trick now is that we can compute $\mathbf s_n$, $\mathbf s_{n+1}$, $\mathbf s_{n+2}$, etc., by repeatedly multiplying $\mathbf s_{n-1}$ with the matrix $\mathbf Q$:

$$ \begin{align} \mathbf s_{n} &= \mathbf Q \mathbf s_{n-1}, \\ \mathbf s_{n+1} &= \mathbf Q \mathbf s_{n} = \mathbf Q^2 \mathbf s_{n-1}, \\ \vdots \quad & \\ \mathbf s_{n+k} &= \mathbf Q \mathbf s_{n+k-1} = \mathbf Q^{k+1} \mathbf s_{n-1}, \end{align} $$

with $n \geq D, \, k \geq 0$.

Hence, we have:

$$ \mathbf s_{D} = \mathbf Q \cdot \mathbf 1, $$

since $\mathbf s_{D-1} = \mathbf 1$, where $\mathbf 1$ is a $D$-dimensional vector containing only the value 1 in each row.

Thus:

$$ \mathbf s_{n} = \mathbf Q^{n-D+1} \cdot \mathbf 1, \quad n \geq D. $$

In [25]:
import numpy
import cupy
import sympy
from typing import Callable, Tuple

def fast_lfib(D: int, m: int, xp=numpy) -> Tuple[Callable[[int], int], numpy.ndarray | cupy.ndarray]:
    """
    Creates a fast Lagged Fibonacci Generator (LFG) using matrix exponentiation.

    Args:
        D (int): The lag period of the generator. Must be >= 2.
        m (int): The modulus to apply for generating the sequence. Must be > 0.
        xp: The array library to use for computation (e.g., `numpy` or `cupy`).
            Defaults to `numpy`.

    Returns:
        Tuple[Callable[[int], int], numpy.ndarray | cupy.ndarray]:
            - A generator function that computes $S_n$ efficiently.
            - The transformation matrix $\mathbf{Q}$ used for matrix exponentiation.

    Examples:
        >>> generator, Q = fast_lfib(5, 100, xp=numpy)
        >>> S_n = generator(10)
        >>> print(S_n)
    """
    Q = xp.zeros((D, D), dtype=xp.int64)
    Q[-1, 0] = 1
    Q[-1, 1] = 1

    i, j = 0, 1
    while j < D:
        Q[i, j] = 1
        i, j = i + 1, j + 1

    def fast_pow(A: numpy.ndarray | cupy.ndarray, p: int, m: int) -> numpy.ndarray | cupy.ndarray:
        """
        Computes $A^p \mod m$ using binary exponentiation.

        Args:
            A (numpy.ndarray | cupy.ndarray): The input matrix.
            p (int): The power to raise the matrix to. Must be >= 0.
            m (int): The modulus to apply. Must be > 0.

        Returns:
            numpy.ndarray | cupy.ndarray: The result of $A^p \mod m$.
        """
        res = xp.eye(A.shape[0], dtype=xp.int64)
        while p > 0:
            if (p & 1) > 0:
                res = (res @ A) % m
            p //= 2
            A = (A @ A) % m
        return res

    def generator(n: int) -> int:
        """
        Computes the $n$th term in the Lagged Fibonacci sequence.

        Args:
            n (int): The index of the term to compute. Must be >= D.

        Returns:
            int: The value of $S_n \mod m$.
        """
        SS = fast_pow(Q, n - D + 1, m)
        return int(SS[-1, :].sum() % m)

    return generator, Q
In [26]:
D = 5
n = 2**10+5
m = 1234567891011

gen, Q = fast_lfib(D=D, m=m, xp=cupy)
print("Lagged Fibonacci Matrix Q:")
sympy.Matrix(Q.get()) # .get() only required for cupy
# sympy.print_latex(sympy.Matrix(Q.get()))
Lagged Fibonacci Matrix Q:
Out[26]:
$\displaystyle \left[\begin{matrix}0 & 1 & 0 & 0 & 0\\0 & 0 & 1 & 0 & 0\\0 & 0 & 0 & 1 & 0\\0 & 0 & 0 & 0 & 1\\1 & 1 & 0 & 0 & 0\end{matrix}\right]$
In [27]:
start = time.time()
S = gen(n)
print("Time needed (in seconds):", time.time() - start)
print(f"Solution for S_{n} (mod {m}) is {S}")
Time needed (in seconds): 0.0023157596588134766
Solution for S_1029 (mod 1234567891011) is 253512699721

Final Example: Compute $S_n$ with a large value of $n$ and $D$¶

In [36]:
D = 4000
n = 10**20
m = 1234567891011

gen, Q = fast_lfib(D=D, m=m, xp=cupy)

start = time.time()
S = gen(n)
print("Time needed (in seconds):", time.time() - start)
print(f"Solution for S_{n} (mod {m}) is {S}")
Time needed (in seconds): 14.027729272842407
Solution for S_100000000000000000000 (mod 1234567891011) is 838042074774