
Item metadata

dc.contributor.author: Singh, Harsh
dc.contributor.author: Arandelovic, Oggie
dc.identifier.citation: Singh, H & Arandelovic, O 2022, 'Principled and data efficient support vector machine training using the minimum description length principle, with application in breast cancer', Paper presented at AAAI 2022 Workshop, 1/03/22 - 1/03/22.
dc.identifier.other: PURE: 278373605
dc.identifier.other: PURE UUID: 2f3dbaed-dc8d-4f92-a7c6-351a0a9b634b
dc.identifier.other: Scopus: 85131259594
dc.description.abstract: Support vector machines (SVMs) are established as highly successful classifiers in a broad range of applications, including numerous medical ones. Nevertheless, their current employment is restricted by a limitation in the manner in which they are trained, most often the training-validation-test or k-fold cross-validation approaches, which are wasteful both in terms of the use of the available data as well as computational resources. This is a particularly important consideration in many medical problems, in which data availability is low (be it because of the inherent difficulty in obtaining sufficient data, or because of practical reasons, e.g. pertaining to privacy and data sharing). In this paper we propose a novel approach to training SVMs which does not suffer from the aforementioned limitation, which is at the same time much more rigorous in nature, being built upon solid information theoretic grounds. Specifically, we show how the training process, that is the process of hyperparameter inference, can be formulated as a search for the optimal model under the minimum description length (MDL) criterion, allowing for theory rather than empiricism driven selection and removing the need for validation data. The effectiveness and superiority of our approach are demonstrated on the Wisconsin Diagnostic Breast Cancer Data Set.
dc.rights: Copyright © 2022 Association for the Advancement of Artificial Intelligence. All rights reserved. This work has been made available online in accordance with publisher policies or with permission. Permission for further reuse of this content should be sought from the publisher or the rights holder. This is the author created accepted manuscript following peer review and may differ slightly from the final published version. The accepted version of this work is available at
dc.subject: QA75 Electronic computers. Computer science [en]
dc.subject: RC0254 Neoplasms. Tumors. Oncology (including Cancer) [en]
dc.title: Principled and data efficient support vector machine training using the minimum description length principle, with application in breast cancer [en]
dc.type: Conference paper [en]
dc.contributor.institution: University of St Andrews. School of Computer Science [en]
dc.description.status: Peer reviewed [en]
