Principled and data efficient support vector machine training using the minimum description length principle, with application in breast cancer

Singh, Harsh; Arandelovic, Oggie

Files in this item

Name: 2022_ICASSP_paper1.pdf
Size: 694.6 KB
Format: PDF

Item metadata

dc.contributor.author	Singh, Harsh
dc.contributor.author	Arandelovic, Oggie
dc.date.accessioned	2022-04-07T15:35:02Z
dc.date.available	2022-04-07T15:35:02Z
dc.date.issued	2022-01-21
dc.identifier	278373605
dc.identifier	2f3dbaed-dc8d-4f92-a7c6-351a0a9b634b
dc.identifier	85131259594
dc.identifier	000864187901127
dc.identifier.citation	Singh, H & Arandelovic, O 2022, 'Principled and data efficient support vector machine training using the minimum description length principle, with application in breast cancer', Paper presented at AAAI 2022 Workshop, 1/03/22 - 1/03/22. https://doi.org/10.1109/icassp43922.2022.9747649	en
dc.identifier.citation	conference	en
dc.identifier.uri	https://hdl.handle.net/10023/25158
dc.description.abstract	Support vector machines (SVMs) are established as highly successful classifiers in a broad range of applications, including numerous medical ones. Nevertheless, their current employment is restricted by a limitation in the manner in which they are trained, most often the training-validation-test or k-fold cross-validation approaches, which are wasteful both in their use of the available data and in computational resources. This is a particularly important consideration in many medical problems, in which data availability is low (be it because of the inherent difficulty in obtaining sufficient data, or because of practical reasons, e.g. pertaining to privacy and data sharing). In this paper we propose a novel approach to training SVMs which does not suffer from the aforementioned limitation, and which is at the same time much more rigorous in nature, being built upon solid information-theoretic grounds. Specifically, we show how the training process, that is, the process of hyperparameter inference, can be formulated as a search for the optimal model under the minimum description length (MDL) criterion, allowing for theory-driven rather than empirically driven selection and removing the need for validation data. The effectiveness and superiority of our approach are demonstrated on the Wisconsin Diagnostic Breast Cancer Data Set.
dc.format.extent	5
dc.format.extent	711348
dc.language.iso	eng
dc.subject	QA75 Electronic computers. Computer science	en
dc.subject	RC0254 Neoplasms. Tumors. Oncology (including Cancer)	en
dc.subject	SDG 3 - Good Health and Well-being	en
dc.subject.lcc	QA75	en
dc.subject.lcc	RC0254	en
dc.title	Principled and data efficient support vector machine training using the minimum description length principle, with application in breast cancer	en
dc.type	Conference paper	en
dc.contributor.institution	University of St Andrews. School of Computer Science	en
dc.identifier.doi	https://doi.org/10.1109/icassp43922.2022.9747649
dc.description.status	Peer reviewed	en
dc.identifier.url	https://taih21.github.io/pages/Accepted%20Paper.html	en

This item appears in the following Collection(s)

University of St Andrews Research