1. Introduction
The previous article, 《最小二乘问题详解4:非线性最小二乘》 (Least Squares Problems Explained 4: Nonlinear Least Squares), introduced the basic definition of nonlinear least squares problems, the general solution strategy, and the core Gauss-Newton algorithm, emphasizing that a nonlinear problem is solved by local linearization, which reduces it to a sequence of linear least-squares subproblems. Since nonlinear least squares is considerably more involved than the linear case, this article works through a concrete example, fitting the curve \(y = \exp(a x^2 + b x + c)\), to deepen the understanding of nonlinear least squares.
2. The Jacobian Matrix
The trickiest part is computing the Jacobian matrix. Let the parameter vector be:
\[\boldsymbol{\theta} = \begin{bmatrix} a \\ b \\ c \end{bmatrix}\]
The model function is:
\[f(x; \boldsymbol{\theta}) = \exp(a x^2 + b x + c)\]
Introduce the intermediate variable:
\[u = a x^2 + b x + c \quad \Rightarrow \quad f = e^u\]
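Since \(f\) depends on the parameters only through \(u\), the chain rule reduces every partial derivative to a common pattern:
\[\frac{\partial f}{\partial \theta_j} = e^{u} \cdot \frac{\partial u}{\partial \theta_j}\]
so only the (linear) derivatives of \(u\) remain to be computed.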
By definition, the Jacobian matrix collects the partial derivatives of the model output with respect to the parameters:
\[\mathbf{J}_{i,j} = \frac{\partial f(x_i; \boldsymbol{\theta})}{\partial \theta_j}\]
Each row is the gradient of the model with respect to the parameters, evaluated at the \(i\)-th sample. The analytic form of each partial derivative is:
- With respect to $a$:
\[\frac{\partial f}{\partial a} = \frac{\partial}{\partial a} e^{a x^2 + b x + c}= e^{a x^2 + b x + c} \cdot \frac{\partial}{\partial a}(a x^2 + b x + c)= e^{u} \cdot x^2 = f(x; \boldsymbol{\theta}) \cdot x^2\]
- With respect to $b$:
\[\frac{\partial f}{\partial b} = e^{u} \cdot \frac{\partial}{\partial b}(a x^2 + b x + c)= e^{u} \cdot x = f(x; \boldsymbol{\theta}) \cdot x\]
- With respect to $c$:
\[\frac{\partial f}{\partial c} = e^{u} \cdot \frac{\partial}{\partial c}(a x^2 + b x + c)= e^{u} \cdot 1 = f(x; \boldsymbol{\theta})\]
Therefore, for $N$ data points $x_1, x_2, \dots, x_N$, the Jacobian matrix is:
\[\mathbf{J} =\begin{bmatrix}\frac{\partial f(x_1)}{\partial a} & \frac{\partial f(x_1)}{\partial b} & \frac{\partial f(x_1)}{\partial c} \\\frac{\partial f(x_2)}{\partial a} & \frac{\partial f(x_2)}{\partial b} & \frac{\partial f(x_2)}{\partial c} \\\vdots & \vdots & \vdots \\\frac{\partial f(x_N)}{\partial a} & \frac{\partial f(x_N)}{\partial b} & \frac{\partial f(x_N)}{\partial c}\end{bmatrix}=\begin{bmatrix}f(x_1) \cdot x_1^2 & f(x_1) \cdot x_1 & f(x_1) \\f(x_2) \cdot x_2^2 & f(x_2) \cdot x_2 & f(x_2) \\\vdots & \vdots & \vdots \\f(x_N) \cdot x_N^2 & f(x_N) \cdot x_N & f(x_N)\end{bmatrix}\]
or, written out explicitly:
\[\boxed{\mathbf{J} =\begin{bmatrix}e^{a x_1^2 + b x_1 + c} \cdot x_1^2 & e^{a x_1^2 + b x_1 + c} \cdot x_1 & e^{a x_1^2 + b x_1 + c} \\e^{a x_2^2 + b x_2 + c} \cdot x_2^2 & e^{a x_2^2 + b x_2 + c} \cdot x_2 & e^{a x_2^2 + b x_2 + c} \\\vdots & \vdots & \vdots \\e^{a x_N^2 + b x_N + c} \cdot x_N^2 & e^{a x_N^2 + b x_N + c} \cdot x_N & e^{a x_N^2 + b x_N + c}\end{bmatrix}}\]
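With the Jacobian in hand, each Gauss-Newton iteration is straightforward. Defining the residuals as \(r_i = y_i - f(x_i; \boldsymbol{\theta})\) gives \(\partial r_i / \partial \theta_j = -\mathbf{J}_{i,j}\); the two sign flips cancel in the normal equations, so the update step solves
\[(\mathbf{J}^\top \mathbf{J})\,\Delta\boldsymbol{\theta} = \mathbf{J}^\top \mathbf{r}, \qquad \boldsymbol{\theta} \leftarrow \boldsymbol{\theta} + \Delta\boldsymbol{\theta}\]
and the process repeats until \(\Delta\boldsymbol{\theta}\) is sufficiently small. This is exactly the loop implemented in the example below.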
3. Example
In practice, nonlinear least squares problems are usually solved with an off-the-shelf library such as Ceres Solver. The goal here, however, is to understand the solution process itself, especially the Gauss-Newton method, so this example is implemented directly with the basic matrix library Eigen:
```cpp
#include <iostream>
#include <vector>
#include <cmath>
#include <random>
#include <Eigen/Dense>

using namespace std;
using namespace Eigen;

// Model function: y = exp(a*x^2 + b*x + c)
double model(double x, const Vector3d& theta) {
    double a = theta(0);
    double b = theta(1);
    double c = theta(2);
    double exponent = a * x * x + b * x + c;
    // Guard against overflow/underflow of exp()
    if (exponent > 300) return exp(300);
    if (exponent < -300) return exp(-300);
    return exp(exponent);
}

// Compute the Jacobian matrix (analytic derivatives via the chain rule)
MatrixXd computeJacobian(const vector<double>& x_data, const Vector3d& theta) {
    int N = static_cast<int>(x_data.size());
    MatrixXd J(N, 3); // one row per point; columns: df/da, df/db, df/dc
    for (int i = 0; i < N; ++i) {
        double x = x_data[i];
        double f = model(x, theta); // current prediction, equals exp(u)
        // Analytic derivatives (chain rule)
        double df_de = f;       // d(exp(u))/du = exp(u)
        double de_da = x * x;
        double de_db = x;
        double de_dc = 1.0;
        J(i, 0) = df_de * de_da; // df/da
        J(i, 1) = df_de * de_db; // df/db
        J(i, 2) = df_de * de_dc; // df/dc
    }
    return J;
}

// Compute the residual vector r_i = y_i - f(x_i; theta)
VectorXd computeResiduals(const vector<double>& x_data,
                          const vector<double>& y_data,
                          const Vector3d& theta) {
    int N = static_cast<int>(x_data.size());
    VectorXd residuals(N);
    for (int i = 0; i < N; ++i) {
        double pred = model(x_data[i], theta);
        residuals(i) = y_data[i] - pred;
    }
    return residuals;
}

int main() {
    // ========================
    // 1. Set the true parameters
    // ========================
    // NOTE: illustrative values (assumed); any well-conditioned choice works.
    Vector3d true_params;
    true_params << 0.5, -1.0, 0.5;

    // ========================
    // 2. Generate noisy synthetic data
    // ========================
    default_random_engine rng(42);
    normal_distribution<double> noise(0.0, 0.01);
    vector<double> x_data, y_data;
    const int N = 100;
    for (int i = 0; i < N; ++i) {
        double x = 2.0 * i / (N - 1); // x in [0, 2]
        x_data.push_back(x);
        y_data.push_back(model(x, true_params) + noise(rng));
    }

    // ========================
    // 3. Gauss-Newton iterations
    // ========================
    Vector3d theta(0.0, 0.0, 0.0); // initial guess
    for (int iter = 0; iter < 100; ++iter) {
        MatrixXd J = computeJacobian(x_data, theta);
        VectorXd r = computeResiduals(x_data, y_data, theta);
        // Solve the normal equations (J^T J) * delta = J^T r
        Vector3d delta = (J.transpose() * J).ldlt().solve(J.transpose() * r);
        theta += delta;
        if (delta.norm() < 1e-10) break; // converged
    }

    cout << "Estimated parameters: " << theta.transpose() << endl;
    cout << "True parameters:      " << true_params.transpose() << endl;
    return 0;
}
```
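Assuming Eigen's headers are installed (the include path varies by system), the program can be built with something like `g++ -std=c++17 -O2 -I/usr/include/eigen3 main.cpp -o fit`. As a sanity check on the derivation in Section 2, the analytic Jacobian can also be compared against central finite differences. The helper below, `numericalJacobian`, is a hypothetical addition (not part of the original listing) that reuses the `model` function defined above:

```cpp
// Hypothetical helper (not in the listing above): approximate the Jacobian
// with central finite differences, perturbing one parameter at a time.
MatrixXd numericalJacobian(const vector<double>& x_data,
                           const Vector3d& theta, double h = 1e-6) {
    int N = static_cast<int>(x_data.size());
    MatrixXd J(N, 3);
    for (int j = 0; j < 3; ++j) {
        Vector3d tp = theta, tm = theta;
        tp(j) += h; // theta with the j-th parameter nudged up
        tm(j) -= h; // ... and nudged down
        for (int i = 0; i < N; ++i)
            J(i, j) = (model(x_data[i], tp) - model(x_data[i], tm)) / (2.0 * h);
    }
    return J;
}
```

The norm of the difference, `(computeJacobian(x_data, theta) - numericalJacobian(x_data, theta)).norm()`, should be tiny (the central-difference error is roughly \(O(h^2)\) per entry); a large value would indicate a sign or indexing mistake in the analytic derivatives.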