
Cross-entropy and MSE principles

Duke_Ryan 2025. 3. 30. 19:58

Mean Squared Error: `L_(MSE) = 1/n sum_i (y_i - q_i)^2`

Cross-Entropy: `L_(CE) = -sum_i y_i log(q_i)`
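As a quick sanity check, both losses can be computed directly from these definitions. This is a minimal NumPy sketch of my own; the target `y` and prediction `q` are made-up example values.

```python
import numpy as np

# Hypothetical 3-class example: one-hot target y, predicted distribution q.
y = np.array([0.0, 1.0, 0.0])
q = np.array([0.2, 0.7, 0.1])

mse = np.mean((y - q) ** 2)    # L_MSE = 1/n * sum_i (y_i - q_i)^2
ce = -np.sum(y * np.log(q))    # L_CE  = -sum_i y_i * log(q_i)

print(mse)  # ≈ 0.0467 = (0.04 + 0.09 + 0.01) / 3
print(ce)   # ≈ 0.3567 = -log(0.7)
```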

 

MSE is based on Euclidean distance (the L2 norm).

Cross-Entropy, by contrast, is grounded in information geometry: for a one-hot target it coincides with the KL divergence between the target and the predicted distribution.

 

Simply put, MSE operates in flat Euclidean space, where all that matters is the straight-line distance between prediction and target.

In contrast, Cross-Entropy operates on a curved manifold of probability distributions where direction is crucial.
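One way to make this contrast concrete is to compare gradients. The sketch below is my own illustration with made-up logits, not from the original post: with a softmax output layer, the CE gradient with respect to the logits is the well-known `q - y`, which always points from the prediction toward the target, while the MSE gradient passes through the softmax Jacobian and can nearly vanish when the network is confidently wrong.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical example: class 0 is correct, but the logits confidently pick class 1.
y = np.array([1.0, 0.0, 0.0])
z = np.array([-4.0, 6.0, 0.0])
q = softmax(z)

grad_ce = q - y                          # softmax + CE gradient w.r.t. logits
J = np.diag(q) - np.outer(q, q)          # softmax Jacobian dq_i/dz_j
grad_mse = J @ (2.0 * (q - y) / len(q))  # MSE gradient w.r.t. logits (chain rule)

print(grad_ce)   # ≈ [-1.0, 1.0, 0.002]: a strong, well-directed correction
print(grad_mse)  # ≈ [-6e-5, 2e-3, -2e-3]: softmax saturation has crushed the signal
```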

 

The key insight comes from One-hot encoding:

- With MSE, we unnecessarily calculate errors for incorrect classes: `(0-q_text(incorrect))^2`

- With CE, we focus only on the correct class's probability: `-log(q_text(correct))` (see the sketch below)
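A per-term breakdown makes this visible. Again a sketch of my own, reusing the same made-up `y` and `q` as above:

```python
import numpy as np

y = np.array([0.0, 1.0, 0.0])   # one-hot: class 1 is correct
q = np.array([0.2, 0.7, 0.1])

mse_terms = (y - q) ** 2        # every class contributes an error term
ce_terms = y * -np.log(q)       # y_i = 0 zeroes out every incorrect class

print(mse_terms)  # [0.04  0.09  0.01]   -- incorrect classes still penalized
print(ce_terms)   # [0.    0.3567  0.]   -- collapses to -log(q_correct)
```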

 

This highlights the difference between additive error calculations (MSE) and multiplicative ones (CE with one-hot targets): CE is the negative log of a product of predicted likelihoods, and the log turns that product into a sum in which one-hot encoding keeps only the correct-class term.

In CE, as the predicted probability of the correct class approaches 1 during training, the `L_(CE)` value approaches 0, because `log(1) = 0`.
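For instance, with the natural log, `-log(0.5) ~~ 0.693`, `-log(0.9) ~~ 0.105`, `-log(0.99) ~~ 0.010`, and `-log(1) = 0`: the loss decays smoothly to zero as the prediction becomes certain.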

 
