is a mathematical system where numbers cycle after reaching a value called the modulus. The system is often referred to as “clock arithmetic” due to its similarity to how analog 12-hour clocks represent time. This article provides a conceptual overview of modular arithmetic and explores practical use cases in data science.
Conceptual Overview
The Basics
Modular arithmetic defines a system of operations on integers based on a particular integer called the modulus. The expression x mod d is equivalent to the remainder obtained when x is divided by d. If r ≡ x mod d, then r is said to be congruent to x mod d. In other words, this means that r and x differ by a multiple of d, or that x – r is divisible by d. The symbol ‘≡’ (three horizontal lines) is used instead of ‘=’ in modular arithmetic to emphasize that we’re dealing with congruence rather than equality in the usual sense.
For example, in modulo 7, the number 10 is congruent to 3 because 10 divided by 7 leaves a remainder of 3. So, we can write 3 ≡ 10 mod 7. In the case of a 12-hour clock, 2 a.m. is congruent to 2 p.m. (which is 14 mod 12). In programming languages such as Python, the percent sign (‘%’) serves as the modulus operator (e.g., 10 % 7
would evaluate to 3).
Here is a video that explains these concepts in more detail:
Solving Linear Congruences
A linear congruence can be a modular expression of the form n ⋅ y ≡ x (mod d), where the coefficient n, target x, and modulus d are known integers, and the unknown integer y has a degree of one (i.e., it is not squared, cubed, and so on). The expression 2017 ⋅ y ≡ 2025 (mod 10000) is an example of a linear congruence; it states that when 2017 is multiplied by some integer y, the product leaves a remainder of 2025 when divided by 10000. To solve for y in the expression n ⋅ y ≡ x (mod d), follow these steps:
- Find the greatest common divisor (GCD) of the coefficient n and modulus d, also written as GCD(n, d), which is the greatest positive integer that is a divisor of both n and d. The Extended Euclidean Algorithm may be used to efficiently compute the GCD; this will also yield a candidate for n-1, the modular inverse of the coefficient n.
- Determine whether a solution exists. If the target x is not divisible by GCD(n, d), then the equation has no solution. This is because the congruence is only solvable when the GCD divides the target.
- Simplify the modular expression, if needed, by dividing the coefficient n, target x, and modulus d by GCD(n, d) to reduce the problem to a simpler equivalent form; let us call these simplified quantities n0, x0, and d0, respectively. This ensures that n0 and d0 are coprime (i.e., 1 is their only common divisor), which is necessary for finding a modular inverse.
- Compute the modular inverse n0-1 of n0 mod d0 (again, using the Extended Euclidean Algorithm).
- Find one solution for the unknown value y. To do this, multiply the modular inverse n0-1 by the reduced target x0 to obtain one valid solution for y mod d0.
- Finally, by building on the result of step 5, generate all possible solutions. Since the original equation was reduced by GCD(n, d), there are GCD(n, d) distinct solutions. These solutions are spaced evenly apart by the reduced modulus d0, and all are valid with respect to the original modulus d.
Following is a Python implementation of the above procedure:
def extended_euclidean_algorithm(a, b):
"""
Computes the greatest common divisor of positive integers a and b,
along with coefficients x and y such that: a*x + b*y = gcd(a, b)
"""
if b == 0:
return (a, 1, 0)
else:
gcd, x_prev, y_prev = extended_euclidean_algorithm(b, a % b)
x = y_prev
y = x_prev - (a // b) * y_prev
return (gcd, x, y)
def solve_linear_congruence(coefficient, target, modulus):
"""
Solves the linear congruence: coefficient * y ≡ target (mod modulus)
Returns all integer solutions for y with respect to the modulus.
"""
# Step 1: Compute the gcd
gcd, _, _ = extended_euclidean_algorithm(coefficient, modulus)
# Step 2: Check if a solution exists
if target % gcd != 0:
print("No solution exists: target is not divisible by gcd.")
return None
# Step 3: Reduce the equation by gcd
reduced_coefficient = coefficient // gcd
reduced_target = target // gcd
reduced_modulus = modulus // gcd
# Step 4: Find the modular inverse of reduced_coefficient with respect to the reduced_modulus
_, inverse_reduced, _ = extended_euclidean_algorithm(reduced_coefficient, reduced_modulus)
inverse_reduced = inverse_reduced % reduced_modulus
# Step 5: Compute one solution
base_solution = (inverse_reduced * reduced_target) % reduced_modulus
# Step 6: Generate all solutions modulo the original modulus
all_solutions = [(base_solution + i * reduced_modulus) % modulus for i in range(gcd)]
return all_solutions
Here are some example tests:
solutions = solve_linear_congruence(coefficient=2009, target=2025, modulus=10000)
print(f"Solutions for y: {solutions}")
solutions = solve_linear_congruence(coefficient=20, target=16, modulus=28)
print(f"Solutions for y: {solutions}")
Results:
Solutions for y: [225]
Solutions for y: [5, 12, 19, 26]
This video explains how to solve linear congruences in more detail:
Data Science Use Cases
Use Case 1: Feature Engineering
Modular arithmetic has a number of interesting use cases in data science. An intuitive one is in the context of feature engineering, for encoding cyclical features like hours of the day. Since time wraps around every 24 hours, treating hours as linear values can misrepresent relationships (e.g., 11 PM and 1 AM are numerically far apart but temporally close). By applying modular encoding (e.g., using sine and cosine transformations of the hour modulo 24), we can preserve the circular nature of time, allowing machine learning (ML) models to recognize patterns that occur during specific periods like nighttime. The following Python code shows how such an encoding can be implemented:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Example: List of incident hours (in 24-hour format)
incident_hours = [22, 23, 0, 1, 2] # 10 PM to 2 AM
# Convert to a DataFrame
df = pd.DataFrame({'hour': incident_hours})
# Encode using sine and cosine transformations
df['hour_sin'] = np.sin(2 * np.pi * df['hour'] / 24)
df['hour_cos'] = np.cos(2 * np.pi * df['hour'] / 24)
The resulting dataframe df
:
hour hour_sin hour_cos
0 22 -0.500000 0.866025
1 23 -0.258819 0.965926
2 0 0.000000 1.000000
3 1 0.258819 0.965926
4 2 0.500000 0.866025
Notice how the use of sine still differentiates between the hours before and after 12 (e.g., encoding 11 p.m. and 1 a.m. as -0.258819 and 0.258819, respectively), while the use of cosine does not (e.g., both 11 p.m. and 1 a.m. are mapped to the value 0.965926). The optimal choice of encoding will depend on the business context in which the ML model is to be deployed. Ultimately, the technique enhances feature engineering for tasks such as anomaly detection, forecasting, and classification where temporal proximity matters.
In the following sections, we will consider two larger data science use cases of linear congruence that involve solving for y in modular expressions of the form n ⋅ y ≡ x (mod d).
Use Case 2: Resharding in Distributed Database Systems
In distributed databases, data is often partitioned (or sharded) across multiple nodes using a hash function. When the number of shards changes — say, from d to d’ — we need to reshard the data efficiently without rehashing everything from scratch.
Suppose each data item is assigned to a shard as follows:
shard = hash(key) mod d
When redistributing items to a new set of d’ shards, we might want to map the old shard indices to the new ones in a way that preserves balance and minimizes data movement. This can lead to solving for y in the expression n ⋅ y ≡ x (mod d), where:
- x is the original shard index,
- d is the old number of shards,
- n is a scaling factor (or transformation coefficient),
- y is the new shard index that we are solving for
Using modular arithmetic in this context ensures consistent mapping between old and new shard layouts, minimizes reallocation, preserves data locality, and enables deterministic and reversible transformations during resharding.
Below is a Python implementation of this scenario:
def extended_euclidean_algorithm(a, b):
"""
Computes gcd(a, b) and coefficients x, y such that: a*x + b*y = gcd(a, b)
Used to find modular inverses.
"""
if b == 0:
return (a, 1, 0)
else:
gcd, x_prev, y_prev = extended_euclidean_algorithm(b, a % b)
x = y_prev
y = x_prev - (a // b) * y_prev
return (gcd, x, y)
def modular_inverse(a, m):
"""
Returns the modular inverse of a modulo m, if it exists.
"""
gcd, x, _ = extended_euclidean_algorithm(a, m)
if gcd != 1:
return None # Inverse doesn't exist if a and m are not coprime
return x % m
def reshard(old_shard_index, old_num_shards, new_num_shards):
"""
Maps an old shard index to a new one using modular arithmetic.
Solves: n * y ≡ x (mod d)
Where:
x = old_shard_index
d = old_num_shards
n = new_num_shards
y = new shard index (to solve for)
"""
x = old_shard_index
d = old_num_shards
n = new_num_shards
# Step 1: Check if modular inverse of n modulo d exists
inverse_n = modular_inverse(n, d)
if inverse_n is None:
print(f"No modular inverse exists for n = {n} mod d = {d}. Cannot reshard deterministically.")
return None
# Step 2: Solve for y using modular inverse
y = (inverse_n * x) % d
return y
Example test:
import hashlib
def custom_hash(key, num_shards):
hash_bytes = hashlib.sha256(key.encode('utf-8')).digest()
hash_int = int.from_bytes(hash_bytes, byteorder='big')
return hash_int % num_shards
# Example usage
old_num_shards = 10
new_num_shards = 7
# Simulate resharding for a few keys
keys = ['user_123', 'item_456', 'session_789']
for key in keys:
old_shard = custom_hash(key, old_num_shards)
new_shard = reshard(old_shard, old_num_shards, new_num_shards)
print(f"Key: {key} | Old Shard: {old_shard} | New Shard: {new_shard}")
Note that we are using a custom hash function that is deterministic with respect to key
and num_shards
to ensure reproducibility.
Results:
Key: user_123 | Old Shard: 9 | New Shard: 7
Key: item_456 | Old Shard: 7 | New Shard: 1
Key: session_789 | Old Shard: 2 | New Shard: 6
Use Case 3: Differential Privacy in Federated Learning
In federated learning, ML models are trained across decentralized devices while preserving user privacy. Differential privacy adds noise to gradient updates in order to obscure individual contributions across devices. Often, this noise is sampled from a discrete distribution and must be modulo-reduced to fit within bounded ranges.
Suppose a client sends an update x, and the server applies a transformation of the form n ⋅ (y + k) ≡ x (mod d), where:
- x is the noisy gradient update sent to the server,
- y is the original (or true) gradient update,
- k is the noise term (drawn at random from a range of integers),
- n is the encoding factor,
- d is the modulus (e.g., size of the finite field or quantization range in which all operations take place)
Due to the privacy-preserving nature of this setup, the server can only recover y + k, the noisy update, but not the true update y itself.
Below is the now-familiar Python setup:
def extended_euclidean_algorithm(a, b):
if b == 0:
return a, 1, 0
else:
gcd, x_prev, y_prev = extended_euclidean_algorithm(b, a % b)
x = y_prev
y = x_prev - (a // b) * y_prev
return gcd, x, y
def modular_inverse(a, m):
gcd, x, _ = extended_euclidean_algorithm(a, m)
if gcd != 1:
return None
return x % m
Example test simulating some clients:
import random
# Parameters
d = 97 # modulus (finite field)
noise_scale = 20 # controls magnitude of noise
# Simulated clients
clients = [
{"id": 1, "y": 12, "n": 17},
{"id": 2, "y": 23, "n": 29},
{"id": 3, "y": 34, "n": 41},
]
# Step 1: Clients add noise and mask their gradients
random.seed(10)
for client in clients:
noise = random.randint(-noise_scale, noise_scale)
client["noise"] = noise
noisy_y = client["y"] + noise
client["x"] = (client["n"] * noisy_y) % d
# Step 2: Server receives x, knows n, and recovers noisy gradients
for client in clients:
inv_n = modular_inverse(client["n"], d)
client["y_noisy"] = (client["x"] * inv_n) % d
# Output
print("Client-side masking with noise:")
for client in clients:
print(f"Client {client['id']}:")
print(f" True gradient y = {client['y']}")
print(f" Added noise = {client['noise']}")
print(f" Masked value x = {client['x']}")
print(f" Recovered y + noise = {client['y_noisy']}")
print()
Results:
Client-side masking with noise:
Client 1:
True gradient y = 12
Added noise = 16
Masked value x = 88
Recovered y + noise = 28
Client 2:
True gradient y = 23
Added noise = -18
Masked value x = 48
Recovered y + noise = 5
Client 3:
True gradient y = 34
Added noise = 7
Masked value x = 32
Recovered y + noise = 41
Notice that the server is only able to derive the noisy gradients rather than the original ones.
The Wrap
Modular arithmetic, with its elegant cyclical structure, offers far more than just a clever way to tell time — it underpins some of the most critical mechanisms in modern data science. By exploring modular transformations and linear congruences, we have seen how this mathematical framework becomes a powerful tool for solving real-world problems. In use cases as diverse as feature engineering, resharding in distributed databases, and safeguarding user privacy in federated learning through differential privacy, modular arithmetic provides both the abstraction and precision needed to build robust, scalable systems. As data science continues to evolve, the relevance of these modular techniques will likely grow, suggesting that sometimes, the key to innovation lies in the remainder.
Source link
#Modular #Arithmetic #Data #Science