Finite difference gradient approximation is a numerical technique for estimating the gradient of a function when an analytical derivative is unavailable or too costly to compute. The idea is simple: perturb the input slightly and observe the corresponding change in the function's output.
Unlike methods that rely on symbolic or automatic differentiation, this approach needs nothing more than function evaluations. It uses the central difference formula:
$$ f'(x) \approx \frac{f(x + h) - f(x - h)}{2h} $$
For smooth functions, the truncation error of this formula is of order $O(h^2)$.
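As a quick sanity check, here is a minimal sketch (using only the standard library; the test function sin and the step size are arbitrary choices for illustration) that compares the central difference estimate with a known analytical derivative:

import math

def central_difference(f, x, h=1e-5):
    # Central difference approximation of f'(x)
    return (f(x + h) - f(x - h)) / (2 * h)

# The derivative of sin is cos, so the approximation error can be measured directly.
x = 0.7
approx = central_difference(math.sin, x)
exact = math.cos(x)
print(f"approx = {approx:.10f}, exact = {exact:.10f}, error = {abs(approx - exact):.2e}")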
When a function is not easily differentiable, or is a black box whose derivative we cannot evaluate directly, this approximation becomes an invaluable tool.
In essence, the method "looks" to the left and right of the current point to determine which direction offers a more promising descent. There are a few factors to consider: the step size h trades off truncation error (if h is too large) against floating-point round-off error (if h is too small); every gradient estimate costs two function evaluations per input dimension; and, because the estimate relies on small differences of function values, it is sensitive to noise in the function being evaluated. The step size trade-off is illustrated in the sketch after the link below.
Have a look at this Wikipedia page:
https://en.wikipedia.org/wiki/Finite_difference
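To make the step size trade-off concrete, the following sketch (the test function exp and the range of h values are arbitrary choices for illustration) sweeps h over several orders of magnitude and prints the resulting error:

import math

def central_difference(f, x, h):
    return (f(x + h) - f(x - h)) / (2 * h)

# The derivative of exp at x = 1.0 is e itself, so the exact value is known.
x = 1.0
exact = math.e

# As h shrinks, the error first decreases (less truncation error) and then
# grows again once floating-point cancellation dominates.
for exponent in range(1, 13):
    h = 10.0 ** (-exponent)
    approx = central_difference(math.exp, x, h)
    print(f"h = 1e-{exponent:02d}  error = {abs(approx - exact):.3e}")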
import torch
import matplotlib.pyplot as plt


class FiniteDifferenceSolver:
    def __init__(self, func, h=1e-4):
        self.func = func
        self.h = h

    def gradient(self, x):
        # Ensure x is a plain float before perturbing it
        x = float(x)
        # Evaluate the function just to the right and left of x.
        # Note: torch.tensor(...) defaults to float32, which limits how small
        # h can usefully be before round-off error dominates.
        f_plus = self.func(torch.tensor(x + self.h))
        f_minus = self.func(torch.tensor(x - self.h))
        # Central difference approximation of f'(x)
        grad = (f_plus - f_minus) / (2 * self.h)
        return grad.item()

    def step(self, x, learning_rate):
        # One gradient descent step using the approximated gradient
        grad = self.gradient(x)
        x_new = x - learning_rate * grad
        f_new = self.func(torch.tensor(x_new)).item()
        return x_new, f_new, grad


# Example function to minimize: a cubic with a local minimum at x = 3
def f(x):
    return x**3 - 6*x**2 + 9*x + 1


# Create an instance of FiniteDifferenceSolver
solver = FiniteDifferenceSolver(f, h=1e-4)

# Parameters for optimization. The start lies to the right of the local
# maximum at x = 1, so gradient descent heads toward the local minimum at
# x = 3; further to the left, the cubic is unbounded below and descent diverges.
x_start = 5.0
learning_rate = 0.002
max_iter = 50
tolerance = 1e-6

# Lists to store the optimization path for visualization
x_values = [x_start]
f_values = [f(torch.tensor(x_start)).item()]
x_current = x_start

# Optimization loop using the finite difference gradient approximation
for i in range(max_iter):
    x_new, f_new, grad = solver.step(x_current, learning_rate)
    x_values.append(x_new)
    f_values.append(f_new)

    # Print iteration details
    print(f"Iteration {i+1}: x = {x_new:0.6f}, f(x) = {f_new:0.6f}, grad = {grad:0.6f}")

    # Check for convergence using the gradient magnitude
    if abs(grad) < tolerance:
        print("Convergence achieved.")
        break

    x_current = x_new

# Plot the function and the optimization path
x_plot = torch.linspace(min(x_values) - 1, max(x_values) + 1, 400)
f_plot = f(x_plot)
plt.figure(figsize=(12, 3))
plt.plot(x_plot.numpy(), f_plot.numpy(), label='f(x)')
plt.scatter(x_values, f_values, color='red', label='Optimization Path')
plt.title("Finite Difference Gradient Descent")
plt.xlabel("x")
plt.ylabel("f(x)")
plt.legend()
plt.grid(True)
plt.show()
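Because the example cubic is in fact differentiable, the solver can be sanity-checked against the analytic derivative f'(x) = 3x^2 - 12x + 9. Here is a minimal check, reusing the solver and f defined above (the helper f_prime and the sample points are illustrative choices):

# Analytic derivative of the example cubic, used here only for comparison
def f_prime(x):
    return 3*x**2 - 12*x + 9

# Compare the finite difference estimate with the exact derivative at a few points
for x in [0.5, 2.0, 3.5, 5.0]:
    approx = solver.gradient(x)
    exact = f_prime(x)
    print(f"x = {x:4.1f}  approx = {approx:10.6f}  exact = {exact:10.6f}  error = {abs(approx - exact):.2e}")

Since torch.tensor defaults to float32, expect agreement only to a few decimal places; constructing the tensors with dtype=torch.float64 (or using a somewhat larger h) tightens the estimate considerably.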