Cross-entropy and MSE principles
Duke_Ryan · 2025. 3. 30. 19:58

Mean Square Error: `L_(MSE) = 1/n sum_i (y_i - q_i)^2`
Cross-Entropy: `L_(CE) = -sum_i y_i log(q_i)`
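A minimal NumPy sketch of both losses for a single example (not from the original post; the variable names `y`, `q`, and the small `eps` constant are my own assumptions, chosen to mirror the formulas above):

```python
import numpy as np

def mse_loss(y, q):
    """Mean Square Error: L_MSE = 1/n * sum_i (y_i - q_i)^2."""
    return np.mean((y - q) ** 2)

def cross_entropy_loss(y, q, eps=1e-12):
    """Cross-Entropy: L_CE = -sum_i y_i * log(q_i); eps guards against log(0)."""
    return -np.sum(y * np.log(q + eps))

y = np.array([0.0, 1.0, 0.0])  # one-hot target: class 1 is the correct class
q = np.array([0.1, 0.7, 0.2])  # predicted probabilities (e.g. a softmax output)

print(mse_loss(y, q))            # ~0.0467
print(cross_entropy_loss(y, q))  # -log(0.7) ~ 0.357
```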
MSE is based on Euclidean distance (the L2 norm).
Cross-Entropy is geometrically grounded in information geometry.
Simply put, MSE operates in flat Euclidean space, where only distance matters and direction is trivial.
In contrast, Cross-Entropy operates on a curved manifold of probability distributions, where direction is crucial.
The key insight comes from One-hot encoding:
- With MSE, we unnecessarily calculate errors for incorrect classes: `(0-q_text(incorrect))^2`
- With CE, we focus only on the correct class probability: `-log(q_text(correct))`
This highlights the difference between MSE's additive treatment of every class and CE's multiplicative masking by the one-hot target, which zeroes out every term except the correct one.
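A small illustration of this point (my own sketch, reusing the hypothetical `y` and `q` vectors from above): every class contributes an MSE term, while multiplying by the one-hot `y_i` leaves only the correct-class term in CE.

```python
import numpy as np

y = np.array([0.0, 1.0, 0.0])   # one-hot target, class 1 is correct
q = np.array([0.1, 0.7, 0.2])   # predicted probabilities

mse_terms = (y - q) ** 2        # every class contributes: [0.01, 0.09, 0.04]
ce_terms = -y * np.log(q)       # one-hot y_i zeroes the rest: [0, -log(0.7), 0]

print(mse_terms)  # incorrect classes still add error under MSE
print(ce_terms)   # only -log(q_correct) ~ 0.357 survives under CE
```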
In CE, as the predicted probability of the correct class approaches 1 during training,
the `L_(CE)` value approaches 0 because of the log function: log(1) = 0.
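A quick numeric illustration of that behavior (my own sketch, not from the post):

```python
import numpy as np

# -log(q_correct) shrinks toward 0 as the correct-class probability approaches 1
for q_correct in [0.5, 0.9, 0.99, 0.999, 1.0]:
    print(q_correct, -np.log(q_correct))
# 0.5 -> ~0.693, 0.9 -> ~0.105, 0.99 -> ~0.010; at 1.0 the loss is 0, since log(1) = 0
```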