Verifying the fully “Laplacianised” posterior Naïve Bayesian approach and more

Mussa, Hamse Yussuf; Marcus, David; Mitchell, John B. O.; Glen, Robert

Files in this item

Name: Mitchell_2015_JCi_Verifying_CC.pdf
Size: 1.198Mb
Format: PDF

Item metadata

dc.contributor.author	Mussa, Hamse Yussuf
dc.contributor.author	Marcus, David
dc.contributor.author	Mitchell, John B. O.
dc.contributor.author	Glen, Robert
dc.date.accessioned	2015-06-12T09:40:03Z
dc.date.available	2015-06-12T09:40:03Z
dc.date.issued	2015-06-12
dc.identifier	194612577
dc.identifier	2c6e740e-8b73-4665-9b81-f789487f332c
dc.identifier	84930944620
dc.identifier	000355976200001
dc.identifier.citation	Mussa, H Y, Marcus, D, Mitchell, J B O & Glen, R 2015, 'Verifying the fully “Laplacianised” posterior Naïve Bayesian approach and more', Journal of Cheminformatics, vol. 7, no. 27. https://doi.org/10.1186/s13321-015-0075-5	en
dc.identifier.issn	1758-2946
dc.identifier.other	ORCID: /0000-0002-0379-6097/work/34033386
dc.identifier.uri	https://hdl.handle.net/10023/6813
dc.description	Mussa and Glen would like to thank Unilever for financial support, whereas Mussa and Mitchell thank the BBSRC for funding this research through grant BB/I00596X/1. Mitchell thanks the Scottish Universities Life Sciences Alliance (SULSA) for financial support.	en
dc.description.abstract	Background: In a recent paper, Mussa, Mitchell and Glen (MMG) demonstrated mathematically that the “Laplacian Corrected Modified Naïve Bayes” (LCMNB) algorithm can be viewed as a variant of the so-called Standard Naïve Bayes (SNB) scheme in which the role played by the absence of compound features in assigning a compound to its appropriate class is ignored. MMG also proffered guidelines regarding the conditions under which this omission may hold. Utilising three data sets, the present paper examines the validity of these guidelines in practice. The paper also extends MMG’s work and introduces a new version of the SNB classifier: “Tapered Naïve Bayes” (TNB). TNB does not discard the role of the absence of a feature out of hand, nor does it fully consider its role. Hence, TNB encapsulates both SNB and LCMNB.
Results: LCMNB, SNB and TNB performed differently on classifying 4,658, 5,031 and 1,149 ligands (all chosen from the ChEMBL Database) distributed over 31 enzymes, 23 membrane receptors, and one ion-channel, four transporters and one transcription factor as their target proteins. When the number of features utilised was equal to or smaller than the “optimal” number of features for a given data set, SNB classifiers systematically gave better classification results than those yielded by LCMNB classifiers. The opposite was true when the number of features employed was markedly larger than the “optimal” number of features for this data set. Nonetheless, these LCMNB performances were worse than the classification performance achieved by SNB when the “optimal” number of features for the data set was utilised. TNB classifiers systematically outperformed both SNB and LCMNB classifiers.
Conclusions: The classification results obtained in this study concur with the mathematically based guidelines given in MMG’s paper; that is, ignoring the role of the absence of a feature out of hand does not necessarily improve the classification performance of the SNB approach; if anything, it could make the performance of the SNB method worse. The results obtained also lend support to the rationale on which the TNB algorithm rests: handled judiciously, taking into account the absence of features can enhance (not impair) the discriminatory classification power of the SNB approach.
dc.format.extent	1257006
dc.language.iso	eng
dc.relation.ispartof	Journal of Cheminformatics	en
dc.subject	Classification	en
dc.subject	Naïve Bayes	en
dc.subject	Tapering	en
dc.subject	Features	en
dc.subject	3rd-DAS	en
dc.title	Verifying the fully “Laplacianised” posterior Naïve Bayesian approach and more	en
dc.type	Journal article	en
dc.contributor.sponsor	BBSRC	en
dc.contributor.institution	University of St Andrews. School of Chemistry	en
dc.contributor.institution	University of St Andrews. Biomedical Sciences Research Complex	en
dc.contributor.institution	University of St Andrews. EaSTCHEM	en
dc.identifier.doi	https://doi.org/10.1186/s13321-015-0075-5
dc.description.status	Peer reviewed	en
dc.identifier.url	http://www.jcheminf.com/content/7/1/27	en
dc.identifier.url	http://www.ncbi.nlm.nih.gov/pubmed/26075027	en
dc.identifier.grantnumber	BB/I00596X/1	en
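
The abstract in this record contrasts classifiers that differ in how much weight they give to features that are absent from a compound. The record does not reproduce the paper's formulas, so the following Python sketch is only an illustration under stated assumptions: it uses a Bernoulli Naïve Bayes log-score and a hypothetical weight `w` on the absence terms, where `w = 1.0` counts absent features in full (in the spirit of SNB), `w = 0.0` ignores them (a presence-only scheme, not the actual LCMNB formula with its Laplacian correction), and intermediate values partially count them. The names `log_score` and `classify`, the weight `w`, and all numbers are made up for illustration and are not taken from the paper.

```python
import math

def log_score(x, p, w=1.0):
    """Log-likelihood of a binary fingerprint x given per-feature
    presence probabilities p for one class.
    w is a hypothetical weight applied to the absent-feature terms."""
    total = 0.0
    for xi, pi in zip(x, p):
        if xi:
            total += math.log(pi)            # feature present
        else:
            total += w * math.log(1.0 - pi)  # feature absent, scaled by w
    return total

def classify(x, class_params, priors, w=1.0):
    """Return the class label with the highest log-posterior score."""
    scores = {
        c: math.log(priors[c]) + log_score(x, class_params[c], w)
        for c in class_params
    }
    return max(scores, key=scores.get)

# Toy data (assumed): two classes, three binary features.
class_params = {"A": [0.6, 0.9, 0.9], "B": [0.5, 0.1, 0.1]}
priors = {"A": 0.5, "B": 0.5}
x = [1, 0, 0]  # only the first feature is present

presence_only = classify(x, class_params, priors, w=0.0)
full_absence = classify(x, class_params, priors, w=1.0)
partial = classify(x, class_params, priors, w=0.5)
```

In this toy case the presence-only score picks class A, because only the first feature is compared, whereas counting the two absent features (fully or at half weight) picks class B, since class A expects those features to be present. It illustrates why discarding absence information can change, and possibly worsen, a classification.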

This item appears in the following Collection(s)

University of St Andrews Research
