Cross-entropy and MSE principles
Duke_Ryan · 2025. 3. 30. 19:58

Mean Square Error: `L_(MSE) = 1/n sum_i (y_i - q_i)^2`
Cross-Entropy: `L_(CE) = -sum_i y_i log(q_i)`
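A minimal NumPy sketch of both losses for a single example (not from the original post; the variable names `y`, `q`, and the small `eps` constant are my own assumptions, chosen to mirror the formulas above):

```python
import numpy as np

def mse_loss(y, q):
    """Mean Square Error: L_MSE = 1/n * sum_i (y_i - q_i)^2."""
    return np.mean((y - q) ** 2)

def cross_entropy_loss(y, q, eps=1e-12):
    """Cross-Entropy: L_CE = -sum_i y_i * log(q_i); eps guards against log(0)."""
    return -np.sum(y * np.log(q + eps))

y = np.array([0.0, 1.0, 0.0])  # one-hot target: class 1 is the correct class
q = np.array([0.1, 0.7, 0.2])  # predicted probabilities (e.g. a softmax output)

print(mse_loss(y, q))            # ~0.0467
print(cross_entropy_loss(y, q))  # -log(0.7) ~ 0.357
```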
MSE is based on Euclidean distance (the L2 norm).
Cross-Entropy is geometrically grounded in information geometry.
Simply put, MSE operates in flat Euclidean space, where only distance matters and direction is trivial.
In contrast, Cross-Entropy operates on a curved manifold of probability distributions, where direction is crucial.
The key insight comes from One-hot encoding:
- With MSE, we unnecessarily calculate errors for incorrect classes: `(0-q_text(incorrect))^2`
- With CE, we focus only on the correct class probability: `-log(q_text(correct))`
This highlights the difference between MSE's additive treatment of every class and CE's multiplicative masking by the one-hot target, which zeroes out every term except the correct one.
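A small illustration of this point (my own sketch, reusing the hypothetical `y` and `q` vectors from above): every class contributes an MSE term, while multiplying by the one-hot `y_i` leaves only the correct-class term in CE.

```python
import numpy as np

y = np.array([0.0, 1.0, 0.0])   # one-hot target, class 1 is correct
q = np.array([0.1, 0.7, 0.2])   # predicted probabilities

mse_terms = (y - q) ** 2        # every class contributes: [0.01, 0.09, 0.04]
ce_terms = -y * np.log(q)       # one-hot y_i zeroes the rest: [0, -log(0.7), 0]

print(mse_terms)  # incorrect classes still add error under MSE
print(ce_terms)   # only -log(q_correct) ~ 0.357 survives under CE
```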
In CE, as the predicted probability of the correct class approaches 1 during training,
the `L_(CE)` value approaches 0 because of the log function: log(1) = 0.
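A quick numeric illustration of that behavior (my own sketch, not from the post):

```python
import numpy as np

# -log(q_correct) shrinks toward 0 as the correct-class probability approaches 1
for q_correct in [0.5, 0.9, 0.99, 0.999, 1.0]:
    print(q_correct, -np.log(q_correct))
# 0.5 -> ~0.693, 0.9 -> ~0.105, 0.99 -> ~0.010; at 1.0 the loss is 0, since log(1) = 0
```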