For small problems, Linear Regression can be solved in closed form using the normal equation method.
Consider \(d\)-dimensional features and \(n\) samples of data. Remember, including the dummy feature, we have a design matrix \(X \in \mathbb{R}^{n \times \qty(d+1)}\) and a target vector \(y \in \mathbb{R}^{n}\).
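As a quick sketch of this setup (the data here is randomly generated for illustration), the dummy feature is just a column of ones prepended to the raw feature matrix:

```python
import numpy as np

# Hypothetical data: n = 5 samples, d = 2 raw features.
n, d = 5, 2
rng = np.random.default_rng(0)
features = rng.standard_normal((n, d))

# Prepend the dummy feature (a column of ones),
# so X has shape (n, d + 1) as in the text.
X = np.hstack([np.ones((n, 1)), features])
print(X.shape)  # (5, 3)
```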
Notice:
\begin{equation} J\qty(\theta) = \frac{1}{2} \sum_{i=1}^{n} \qty(h_{\theta} \qty(x^{(i)}) - y^{(i)})^{2} \end{equation}
and since the vector of predictions is \(h = X \theta\), we can write:
\begin{equation} J(\theta) = \frac{1}{2} \qty(X \theta - y)^{T} \qty(X \theta - y) \end{equation}
Taking the gradient with respect to \(\theta\):
\begin{equation} \nabla_{\theta} J\qty(\theta) = X^{T} X \theta - X^{T} y \end{equation}
Setting this to \(0\) gives the normal equation \(X^{T} X \theta = X^{T} y\); assuming \(X^{T} X\) is invertible, solving for \(\theta\) yields:
\begin{equation} \theta = \qty(X^{T}X)^{-1} X^{T}y \end{equation}
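A minimal sketch of the normal equation in NumPy, on synthetic noiseless data (the parameter values below are arbitrary, chosen only to check recovery); note that solving the linear system \(X^{T} X \theta = X^{T} y\) directly is numerically preferable to forming the inverse:

```python
import numpy as np

# Synthetic problem: n = 100 samples, d = 3 features plus the dummy feature.
rng = np.random.default_rng(1)
n, d = 100, 3
X = np.hstack([np.ones((n, 1)), rng.standard_normal((n, d))])

# Arbitrary "true" parameters; y is noiseless, so recovery should be exact.
true_theta = np.array([2.0, -1.0, 0.5, 3.0])
y = X @ true_theta

# Normal equation: solve (X^T X) theta = X^T y
# rather than computing (X^T X)^{-1} explicitly.
theta = np.linalg.solve(X.T @ X, X.T @ y)
print(theta)
```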