Item metadata
dc.contributor.author | Connor, Richard | |
dc.contributor.author | Dearle, Al | |
dc.contributor.author | Claydon, Ben | |
dc.contributor.author | Vadicamo, Lucia | |
dc.date.accessioned | 2024-06-03T13:30:01Z | |
dc.date.available | 2024-06-03T13:30:01Z | |
dc.date.issued | 2024-05-30 | |
dc.identifier | 302500976 | |
dc.identifier | 705b9e5e-94e6-49bb-acbb-5c5c7bdf1583 | |
dc.identifier.citation | Connor, R, Dearle, A, Claydon, B & Vadicamo, L 2024, 'Correlations of cross-entropy loss in machine learning', Entropy, vol. 26, no. 6, 491. https://doi.org/10.3390/e26060491 | en |
dc.identifier.issn | 1099-4300 | |
dc.identifier.uri | https://hdl.handle.net/10023/29980 | |
dc.description.abstract | Cross-entropy loss is crucial in training many deep neural networks. In this context, we show a number of novel and strong correlations among various related divergence functions. In particular, we demonstrate that, in some circumstances, (a) cross-entropy is almost perfectly correlated with the little-known triangular divergence, and (b) cross-entropy is strongly correlated with the Euclidean distance over the logits from which the softmax is derived. The consequences of these observations are as follows. First, triangular divergence may be used as a cheaper alternative to cross-entropy. Second, logits can be used as features in a Euclidean space which is strongly synergistic with the classification process. This justifies the use of Euclidean distance over logits as a measure of similarity, in cases where the network is trained using softmax and cross-entropy. We establish these correlations via empirical observation, supported by a mathematical explanation encompassing a number of strongly related divergence functions. | |
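The abstract's claim can be sanity-checked numerically. The sketch below is illustrative only and is not taken from the paper: it draws random logits, applies softmax, and compares cross-entropy against a one-hot target with the triangular divergence TD(p, q) = Σᵢ (pᵢ − qᵢ)² / (pᵢ + qᵢ), then reports their Pearson correlation across samples. The sample count, class count, and logit distribution are arbitrary assumptions chosen for the demonstration.

```python
import math
import random

random.seed(0)
n, k = 1000, 10  # assumed: 1000 samples, 10 classes

def softmax(z):
    m = max(z)  # subtract max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

ces, tds = [], []
for _ in range(n):
    z = [random.gauss(0, 1) for _ in range(k)]  # random logits
    p = softmax(z)
    t = random.randrange(k)                     # random target class
    q = [1.0 if i == t else 0.0 for i in range(k)]
    # Cross-entropy with a one-hot target reduces to -log p[t].
    ces.append(-math.log(p[t]))
    # Triangular divergence; denominators are positive since softmax p > 0.
    tds.append(sum((pi - qi) ** 2 / (pi + qi) for pi, qi in zip(p, q)))

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return sxy / (sx * sy)

r = pearson(ces, tds)
print(f"Pearson correlation of CE and TD: {r:.3f}")
```

With one-hot targets both quantities are monotone in the probability assigned to the true class, so a strong positive correlation is expected; the paper's near-perfect correlation concerns the trained-network setting rather than this toy draw.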
dc.format.extent | 16 | |
dc.format.extent | 2747131 | |
dc.language.iso | eng | |
dc.relation.ispartof | Entropy | en |
dc.subject | Softmax | en |
dc.subject | Cross-entropy | en |
dc.subject | f-divergence | en |
dc.subject | Kullback-Leibler divergence | en |
dc.subject | Jensen-Shannon divergence | en |
dc.subject | Triangular divergence | en |
dc.subject | QA75 Electronic computers. Computer science | en |
dc.subject | T-NDAS | en |
dc.subject.lcc | QA75 | en |
dc.title | Correlations of cross-entropy loss in machine learning | en |
dc.type | Journal article | en |
dc.contributor.institution | University of St Andrews. School of Computer Science | en |
dc.identifier.doi | 10.3390/e26060491 | |
dc.description.status | Peer reviewed | en |
dc.identifier.url | https://www.mdpi.com/1099-4300/26/6/491 | en |