Fast Computation of the Lagged Fibonacci Generator¶
We define the sequence of a lagged Fibonacci generator as:
$$ S_n = \begin{cases} 1, & 0 \le n < D \\ S_{n-D} + S_{n - D+1} \pmod m, & n \ge D \end{cases}, $$
where $m \in \mathbb{N}^+$ is a modulus larger than 1.
Naive Implementation¶
Below we show a simple example with a naive implementation of the sequence:
D = 4
n = 50
m = 1234
from typing import Callable
import time
def naive_lfib(D: int, m: int) -> Callable[[int, bool], list[int] | int]:
"""
Creates a Lagged Fibonacci Generator (LFG) function.
Args:
D (int): The lag period of the generator. Must be >= 2.
m (int): The modulus to apply for generating the sequence. Must be > 0.
Returns:
Callable[[int, bool], list[int] | int]: A generator function that produces
the sequence up to the nth element or the last element only.
The generator function has the following parameters:
n (int): The number of terms to generate in the sequence. Must be >= D.
full_sequence (bool): Whether to return the full sequence or just the nth term.
Defaults to True (return full sequence).
Examples:
>>> gen = naive_lfib(5, 100)
>>> S = gen(10)
>>> print(S) # Full sequence
>>> print(gen(10, full_sequence=False)) # Last term only
"""
def generator(n: int, full_sequence: bool = True) -> list[int] | int:
if D < 2:
raise ValueError("D must be at least 2.")
if m <= 0:
raise ValueError("m must be greater than 0.")
if n < D:
raise ValueError("n must be greater than or equal to D.")
# Initialize the sequence with D initial terms set to 1
S: list[int] = [1] * D
# Generate terms up to the nth term using the LFG formula
for _ in range(n - D + 1):
S.append((S[-D] + S[-D + 1]) % m)
# Return the full sequence or just the last term based on the flag
return S if full_sequence else S[-1]
return generator
# Example usage
D = 5
m = 100
n = 30
start = time.time()
gen = naive_lfib(D, m)
S = gen(n)
print("Time needed (in seconds):", time.time() - start)
print(f"Solution for S_{n} (mod {m}) is {S[-1]}")
print("The whole sequence is:")
print(S)
Time needed (in seconds): 0.0001308917999267578 Solution for S_30 (mod 100) is 71 The whole sequence is: [1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 4, 4, 4, 5, 7, 8, 8, 9, 12, 15, 16, 17, 21, 27, 31, 33, 38, 48, 58, 64, 71]
Faster Implementation¶
However, the above approach quickly becomes infeasible once the values of $n$ get larger. In this case, a more efficient implementation is necessary. Although it may not be the most efficient approach, we choose to compute $S_n$ using a binary & modular exponentiation algorithm, which can efficiently compute powers of a matrix $\mathbf Q$. The core idea is to obtain $S_n$ by means of a multiplication $\mathbf Q \cdot \mathbf s_{n-1}$, where $\mathbf Q \in \mathbb{N}^{D \times D}$ and $\mathbf s_{n-1} = \begin{bmatrix} S_{n-D} & \ldots & S_{n-2} & S_{n-1}\end{bmatrix}^\mathsf{T} \in \mathbb{N}^D$.
For example, for $D = 5$, we compute $S_n$ using the matrix $\mathbf Q$ and vector $\mathbf s_{n-1}$ as follows:
$$ \mathbf s_n = \left[\begin{matrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 1 & 1 & 0 & 0 & 0 \end{matrix}\right] \left[\begin{matrix} S_{n-5} \\ S_{n-4} \\ S_{n-3} \\ S_{n-2} \\ S_{n-1} \end{matrix} \right] = \left[\begin{matrix} S_{n-4} \\ S_{n-3} \\ S_{n-2} \\ S_{n-1} \\ S_{n-5} + S_{n-4} \end{matrix} \right], \quad n \geq D. $$
The two ones in the last row of $\mathbf Q$ correspond to the lagged sum $S_{n-D} + S_{n-D+1}$, while the diagonal of ones results in a delay operation. The trick now is that we can compute $\mathbf s_n$, $\mathbf s_{n+1}$, $\mathbf s_{n+2}$, etc., by repeatedly multiplying $\mathbf s_{n-1}$ with the matrix $\mathbf Q$:
$$ \begin{align} \mathbf s_{n} &= \mathbf Q \mathbf s_{n-1}, \\ \mathbf s_{n+1} &= \mathbf Q \mathbf s_{n} = \mathbf Q^2 \mathbf s_{n-1}, \\ \vdots \quad & \\ \mathbf s_{n+k} &= \mathbf Q \mathbf s_{n+k-1} = \mathbf Q^{k+1} \mathbf s_{n-1}, \end{align} $$
with $n \geq D, \, k \geq 0$.
Hence, we have:
$$ \mathbf s_{D} = \mathbf Q \cdot \mathbf 1, $$
since $\mathbf s_{D-1} = \mathbf 1$, where $\mathbf 1$ is a $D$-dimensional vector containing only the value 1 in each row.
Thus:
$$ \mathbf s_{n} = \mathbf Q^{n-D+1} \cdot \mathbf 1, \quad n \geq D. $$
import numpy
import cupy
import sympy
from typing import Callable, Tuple
def fast_lfib(D: int, m: int, xp=numpy) -> Tuple[Callable[[int], int], numpy.ndarray | cupy.ndarray]:
"""
Creates a fast Lagged Fibonacci Generator (LFG) using matrix exponentiation.
Args:
D (int): The lag period of the generator. Must be >= 2.
m (int): The modulus to apply for generating the sequence. Must be > 0.
xp: The array library to use for computation (e.g., `numpy` or `cupy`).
Defaults to `numpy`.
Returns:
Tuple[Callable[[int], int], numpy.ndarray | cupy.ndarray]:
- A generator function that computes $S_n$ efficiently.
- The transformation matrix $\mathbf{Q}$ used for matrix exponentiation.
Examples:
>>> generator, Q = fast_lfib(5, 100, xp=numpy)
>>> S_n = generator(10)
>>> print(S_n)
"""
Q = xp.zeros((D, D), dtype=xp.int64)
Q[-1, 0] = 1
Q[-1, 1] = 1
i, j = 0, 1
while j < D:
Q[i, j] = 1
i, j = i + 1, j + 1
def fast_pow(A: numpy.ndarray | cupy.ndarray, p: int, m: int) -> numpy.ndarray | cupy.ndarray:
"""
Computes $A^p \mod m$ using binary exponentiation.
Args:
A (numpy.ndarray | cupy.ndarray): The input matrix.
p (int): The power to raise the matrix to. Must be >= 0.
m (int): The modulus to apply. Must be > 0.
Returns:
numpy.ndarray | cupy.ndarray: The result of $A^p \mod m$.
"""
res = xp.eye(A.shape[0], dtype=xp.int64)
while p > 0:
if (p & 1) > 0:
res = (res @ A) % m
p //= 2
A = (A @ A) % m
return res
def generator(n: int) -> int:
"""
Computes the $n$th term in the Lagged Fibonacci sequence.
Args:
n (int): The index of the term to compute. Must be >= D.
Returns:
int: The value of $S_n \mod m$.
"""
SS = fast_pow(Q, n - D + 1, m)
return int(SS[-1, :].sum() % m)
return generator, Q
D = 5
n = 2**10+5
m = 1234567891011
gen, Q = fast_lfib(D=D, m=m, xp=cupy)
print("Lagged Fibonacci Matrix Q:")
sympy.Matrix(Q.get()) # .get() only required for cupy
# sympy.print_latex(sympy.Matrix(Q.get()))
Lagged Fibonacci Matrix Q:
start = time.time()
S = gen(n)
print("Time needed (in seconds):", time.time() - start)
print(f"Solution for S_{n} (mod {m}) is {S}")
Time needed (in seconds): 0.0023157596588134766 Solution for S_1029 (mod 1234567891011) is 253512699721
Final Example: Compute $S_n$ with a large value of $n$ and $D$¶
D = 4000
n = 10**20
m = 1234567891011
gen, Q = fast_lfib(D=D, m=m, xp=cupy)
start = time.time()
S = gen(n)
print("Time needed (in seconds):", time.time() - start)
print(f"Solution for S_{n} (mod {m}) is {S}")
Time needed (in seconds): 14.027729272842407 Solution for S_100000000000000000000 (mod 1234567891011) is 838042074774