Finite difference gradient approximation is a numerical technique for estimating the gradient of a function when an analytical derivative is not available or is too costly to compute. It works by perturbing the input slightly and observing the corresponding change in the function's output.

Unlike methods that rely on symbolic or automatic differentiation, this approach uses the central difference formula:

$$ f'(x) \approx \frac{f(x + h) - f(x - h)}{2h} $$
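
As a quick illustration of the formula (the function, point, and step size below are arbitrary choices for demonstration, not part of the solver developed later):

def g(x):
    return x**2  # exact derivative is 2*x

x, h = 3.0, 1e-4
approx = (g(x + h) - g(x - h)) / (2 * h)
print(approx)  # close to 6.0, the exact derivative of x**2 at x = 3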

Finite Difference Gradient Approximation Design

When dealing with functions that are not easily differentiable, or that behave as black boxes whose derivatives we cannot evaluate directly, gradient approximation becomes an invaluable tool.

In essence, the method "looks" to the left and right of the current point to determine which direction offers a more promising descent. There are a few factors to consider: the step size h trades truncation error (h too large) against floating-point cancellation (h too small), and each gradient estimate costs two function evaluations. The sketch below illustrates the step-size trade-off.
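
A minimal sketch of that trade-off, assuming the test function sin(x) with known derivative cos(x); the specific step sizes are illustrative choices, not recommendations:

import math

def central_difference(func, x, h):
    # Central difference approximation of the derivative at x
    return (func(x + h) - func(x - h)) / (2 * h)

x0 = 1.0
exact = math.cos(x0)  # exact derivative of sin at x0

for h in (1e-1, 1e-4, 1e-8, 1e-12):
    approx = central_difference(math.sin, x0, h)
    # Large h: truncation error dominates; tiny h: rounding/cancellation dominates
    print(f"h = {h:.0e}, abs error = {abs(approx - exact):.2e}")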

Have a look at this Wikipedia page:

https://en.wikipedia.org/wiki/Finite_difference

Code

import torch
import matplotlib.pyplot as plt

class FiniteDifferenceSolver:
    def __init__(self, func, h=1e-4):
        self.func = func  # scalar function to differentiate
        self.h = h        # finite difference step size

    def gradient(self, x):
        # Central difference approximation of f'(x)
        # Ensure x is a float
        x = float(x)
        f_plus = self.func(torch.tensor(x + self.h))
        f_minus = self.func(torch.tensor(x - self.h))
        grad = (f_plus - f_minus) / (2 * self.h)
        return grad.item()

    def step(self, x, learning_rate):
        # One gradient descent update using the approximate gradient
        grad = self.gradient(x)
        x_new = x - learning_rate * grad
        return x_new, self.func(torch.tensor(x_new)), grad

# Example function to minimize
def f(x):
    # Example: A cubic function
    return x**3 - 6*x**2 + 9*x + 1

# Create an instance of FiniteDifferenceSolver
solver = FiniteDifferenceSolver(f, h=1e-4)

# Parameters for optimization
x_start = 5.0   # start to the right of the local maximum so descent heads toward the minimum at x = 3
learning_rate = 0.002
max_iter = 50
tolerance = 1e-6

# Lists to store optimization path for visualization
x_values = [x_start]
f_values = [f(torch.tensor(x_start)).item()]

x_current = x_start

# Optimization loop using finite difference gradient approximation
for i in range(max_iter):
    x_new, f_new, grad = solver.step(x_current, learning_rate)
    x_values.append(x_new)
    f_values.append(f_new.item())
    
    # Print iteration details
    print(f"Iteration {i+1}: x = {x_new:0.6f}, f(x) = {f_new.item():0.6f}, grad = {grad:0.6f}")
    
    # Check for convergence using the gradient magnitude
    if abs(grad) < tolerance:
        print("Convergence achieved.")
        break
    
    x_current = x_new

# Plotting the function and optimization path
x_plot = torch.linspace(min(x_values)-1, max(x_values)+1, 400)
f_plot = f(x_plot)  # evaluate the cubic element-wise on the tensor

plt.figure(figsize=(12, 3))
plt.plot(x_plot.numpy(), f_plot.numpy(), label='f(x)')
plt.scatter(x_values, f_values, color='red', label='Optimization Path')
plt.title("Finite Difference Gradient Descent")
plt.xlabel("x")
plt.ylabel("f(x)")
plt.legend()
plt.grid(True)
plt.show()
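
The solver above handles a single scalar variable. The same idea extends to multivariate functions by perturbing one coordinate at a time; the sketch below is a minimal illustration under that assumption (a 1-D torch tensor input and a scalar-valued function), not part of the FiniteDifferenceSolver class.

import torch

def finite_difference_gradient(func, x, h=1e-4):
    # x: 1-D torch tensor; func: maps a tensor to a scalar
    grad = torch.zeros_like(x)
    for i in range(x.numel()):
        e = torch.zeros_like(x)
        e[i] = h
        # Central difference along coordinate i
        grad[i] = (func(x + e) - func(x - e)) / (2 * h)
    return grad

# Example: f(x, y) = x**2 + 3*y**2 has gradient (2x, 6y)
g = finite_difference_gradient(lambda v: v[0]**2 + 3 * v[1]**2, torch.tensor([1.0, 2.0]))
print(g)  # approximately tensor([2., 12.])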
