Computers in Biology and Medicine 167 (2023) 107573

Contents lists available at ScienceDirect
Computers in Biology and Medicine
journal homepage: www.elsevier.com/locate/compbiomed

Localization and phenotyping of tuberculosis bacteria using a combination of deep learning and SVMs

Marios Zachariou a,∗, Ognjen Arandjelović a, Evelin Dombay b, Wilber Sabiiti b, Bariki Mtafya c, Nyanda Elias Ntinginya c, Derek J. Sloan b
a School of Computer Science, University of St Andrews, St Andrews, KY16 9SX, United Kingdom
b School of Medicine, University of St Andrews, St Andrews, KY16 9AJ, United Kingdom
c Mbeya Medical Research Center, Mbeya, Tanzania

Keywords: Microscopy; Machine learning; Fluorescence; Feature descriptors; MSVR; Regression; Deep learning; Treatment monitoring; Mycobacterium tuberculosis

Abstract: Successful treatment of pulmonary tuberculosis (TB) depends on early diagnosis and careful monitoring of treatment response. Identification of acid-fast bacilli by fluorescence microscopy of sputum smears is a common tool for both tasks. Microscopy-based analysis of the intracellular lipid content and dimensions of individual Mycobacterium tuberculosis (Mtb) cells also describes phenotypic changes which may improve our biological understanding of antibiotic therapy for TB. However, fluorescence microscopy is a challenging, time-consuming and subjective procedure. In this work, we automate the examination of fields of view (FOVs) from microscopy images to determine the lipid content and dimensions (length and width) of Mtb cells. We introduce an adapted variation of the UNet model to efficiently localise bacteria within FOVs stained by two fluorescence dyes: auramine O to identify Mtb and LipidTox Red to identify intracellular lipids. Thereafter, we propose a feature extractor in conjunction with feature descriptors to feed a representation into a support vector multi-regressor and estimate the length and width of each bacterium.
Using a real-world data corpus from Tanzania, the proposed method i) outperformed previous methods for bacterial detection with an 8% improvement (Dice coefficient) and ii) estimated the cell length and width with a root mean square error of less than 0.01%. Our network can be used to examine phenotypic characteristics of Mtb cells visualised by fluorescence microscopy, improving consistency and time efficiency of this procedure compared to manual methods.

∗ Corresponding author. E-mail address: marios.zachariou@hotmail.com (M. Zachariou). https://doi.org/10.1016/j.compbiomed.2023.107573
Received 7 June 2023; Received in revised form 9 September 2023; Accepted 11 October 2023; Available online 13 October 2023
0010-4825/© 2023 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

1. Introduction

Tuberculosis (TB), the leading infectious cause of death worldwide, is mainly caused by Mycobacterium tuberculosis (Mtb), a bacterial species which is transmitted by coughing droplets and aerosols. 85% of TB disease is pulmonary, affecting the lungs. The World Health Organisation (WHO) report up to 10 million cases of active disease per year, with almost 2 million deaths [1]. The greatest burden of morbidity and mortality from TB occurs in low- and middle-income countries with fewer healthcare resources [2]. Since the 1940s, TB has been curable by antibiotic treatment, but the long duration of therapy (commonly at least 6 months) is challenging, both for patients and public health programmes.

In 2015, WHO developed the ‘‘End TB Strategy’’, aiming to eliminate TB as a public health problem by 2050 [3]. However, this will require major advances in biological tools to improve our understanding of the effect of antibiotic treatment on Mtb. This paper will use sputum smear microscopy images to show how automated approaches for measurement of individual Mtb lipid content and cell dimensions could contribute to this effort.

Traditionally, sputum smear microscopy has been important for diagnosis and treatment monitoring of pulmonary TB. The technique involves heat-fixing a small (10–20 μm) aliquot of sputum from symptomatic patients onto microscopy slides and staining them using procedures that selectively detect acid-fast bacilli (AFB) such as Mtb cells. The contemporary approach to this uses Auramine-O based fluorescence staining to label AFB yellow-green on a black background (usually at ×400 magnification) [4]. Sputum smear grading scales describing the number of AFB seen in each sample are able to triage disease severity at the start of treatment and describe changes in bacterial load over time [5]. In recent years, many centres worldwide have shifted their focus from smear microscopy to molecular technologies (such as the Xpert MTB/RIF test) for TB diagnosis [6]. However, current molecular tools are not recommended for on-treatment monitoring as results stay ‘positive’ for months even when treatment is progressing well [7]. Therefore, smear microscopy remains important for this purpose [8]. Smear microscopy can also study changes within individual Mtb cells during therapy. This is important because findings from recent microbiological research suggest that the physical morphology of each organism offers phenotypic information on its physiological behaviour in relation to antibiotic susceptibility. For example, some Mtb cells accumulate nonpolar lipids intracellularly, allowing them to be classed as lipid-rich (LR) rather than lipid-poor (LP) [9–13].
In vitro microbiological data suggest that LR bacteria are antibiotic tolerant (less easy to kill by the first-line drugs used to treat TB) [12,13] and may play a role in poor patient outcomes (treatment failure or post-treatment relapse) [14]. Modification of the Auramine-O fluorescence staining method, by incorporating a LipidTox Red (LTR) dye to show intracellular lipids within AFB, allows discrimination between LR and LP cells [11,14]. Additionally, in vitro microscopy has previously demonstrated that Mtb cells grow asymmetrically, creating variation in cell length over time [15,16]. Cells of different sizes with different growth poles have variable susceptibility to individual antibiotics [16,17]. Preliminary clinical data suggest that the median length of persistent Mtb cells may be associated with worse disease severity and increases after antibiotic exposure [18,19]. To understand whether changes in intracellular lipid content or dimensions of Mtb cells really are useful characteristics for the study of TB treatment response, larger scale laboratory and clinical studies are required. However, smear microscopy is time-intensive and subjective, which makes this work difficult to perform at scale [20]. Each slide must be examined in discrete fields of view (FOV) that are inspected sequentially. This process is tiring, which can introduce errors [20]. Some slides are challenging to evaluate because AFB might have odd appearances or because non-bacterial components (artefacts) inside the sputum matrix mimic Mtb cells. A possible strategy for tackling these issues is to apply contemporary artificial intelligence techniques [21]. Recent studies have demonstrated significant accomplishments in the realm of automated diagnosis, treatment monitoring, and the potential prevention of other medical conditions (e.g., cardiovascular and gynaecological pathology) [22,23].
In this paper we aim to advance computer-based approaches to TB microscopy by developing methods to:

• Locate Mtb cells within given FOVs on Auramine-O and LTR stained fluorescence microscopy images, with performance evaluation by two established metrics (Jaccard index and Dice coefficient).
• Co-localise the same Mtb cells on paired Auramine-O and LTR stained images of an FOV, in order to assess the proportion of LR bacteria, achieving a maximum error of less than 1.5 pixels difference between ground truth and predicted FOVs.
• Estimate the length and width of Mtb cells in FOV patches from sputum smears collected at 0, 2 and 6 months of therapy with error of 2% or less across 3 regression metrics.

2. Related work

Most research on automating TB microscopy has focussed on enhancing diagnosis. We were unable to find any other work that was directly comparable to our approach of developing morphological phenotypes of Mtb cells which may be relevant to treatment response. However, some previous literature describes use of deep learning tools for morphological phenotypic evaluation of other cell types in order to understand the pathophysiology of infectious diseases and bacterial response to antibiotics.

In the realm of mycobacteria, Bao et al. used light microscopy and convolutional neural networks (CNN) to classify morphological alterations of macrophages infected with Mycobacterium marinum, a surrogate model for Mtb, to show the role of the essential virulence factor EsxA [24]. Whilst this work focussed on identification of phenotypic changes in host cells rather than bacteria and did not fall under the purview of treatment monitoring, it still demonstrates the capacity of automated image analysis to detect changes in the appearance of individual cells which enhance our understanding of bacterial pathophysiology. In the domain of antibiotic response, Yu et al.
assessed susceptibility of Escherichia coli bacteria in urine to five relevant antibiotics using deep learning video microscopy [25]. Whilst conventional procedures for antimicrobial susceptibility testing can take several days and delay clinical decision making, the authors described a technique that used a 7 layer CNN to evaluate footage of freely moving bacterial cells in real time. Inhibition (or not) by antibiotics was reported by learning several phenotypic characteristics of the cell without requiring the definition and quantification of each characteristic. Antibiotic susceptibility was reported with mean accuracy of 91.8% within 30 min. Similarly, Zahir et al. used high throughput screening and deep learning to describe phenotypic ‘bulging’ in E. coli which is associated with resistance and tolerance to 𝛽-lactam antibiotics [26].

To the best of our knowledge there are three published algorithms for the detection of bacteria from fluorescence microscopy slides. Mithra and Emmanuel proposed a methodology consisting of three sequential stages: segmentation, feature extraction, and classification [27]. The initial step in detecting bacteria is to perform a colour space transformation on the input microscope image to better separate bacteria from the background. Thresholding is then applied to the transformed image to segment potential bacteria regions based on colour intensity for further analysis. Data on length, density, area, and histogram characteristics are gathered for the purpose of classifying contours using a fuzzy Hyco-Entropy Decision Tree classifier as: low-bacilli, non-bacilli, and overlapping bacilli. Diaz-Huerta et al. proposed a method that focuses solely on the segmentation stage and implements a Bayesian classifier, based on a Gaussian mixture model, to differentiate bacteria from background [28]. The latest technique in tuberculosis bacterial detection utilises Cycle-GANs in an image-to-image translation approach.
The objective of this method is to learn how to transfer bounding boxes around possible regions of interest from labelled fields of view (FOVs) to unlabelled ones [29].

3. Proposed method

The method proposed in this paper consists of three stages: (i) bacterial detection from microscopy FOVs, (ii) paired detection of bacterial locations from two images of each FOV (one captured to show auramine-O staining of Mtb cells, and one captured to show LTR staining of intracellular lipid; collectively these allow inference of the proportion of LR bacteria in the FOV), and finally (iii) estimation of individual bacterial dimensions (length and width) from cropped patches of FOVs containing one or more Mtb cells. Segmentation techniques are used for stages (i) and (ii), and regression is used for stage (iii). These methods are designed and evaluated separately, with distinct objectives and evaluation criteria. Although we describe them as operating independently, they could be used sequentially with a future goal of pipeline integration.

3.1. Convolutional neural networks: A brief introduction

CNNs are a class of artificial neural networks that have proven highly effective for visual analysis tasks. CNNs take advantage of the inherent grid structure of image data by employing convolutional layers as building blocks. These convolutional layers consist of learnable filters that are convolved across the input image to extract spatially-correlated features. During training, the CNN learns values for the filter weights that activate on specific visual patterns, such as edges, textures, or higher-level concepts. The convolutional filters are slid across the image, computing dot products between the filter and local regions of the input at each location.
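This sliding dot product can be made concrete with a minimal NumPy sketch (illustrative only, not the paper's implementation; a real CNN layer adds learned multi-channel filters, bias terms and a nonlinearity):

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` and take a dot product at each
    location ('valid' cross-correlation, stride 1, no padding)."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(patch * kernel)  # dot product with the filter
    return out

# A hand-crafted vertical-edge filter responds strongly where intensity
# changes from left to right; learned CNN filters behave analogously.
image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)
edge_filter = np.array([[-1.0, 1.0],
                        [-1.0, 1.0]])
response = conv2d_valid(image, edge_filter)  # peaks along the 0->1 boundary
```

Stacking many such filters, each followed by an activation, yields the feature maps described above.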
Multiple convolutional layers are stacked to extract hierarchical feature representations, from low-level features like edges in early layers to high-level semantic features like objects in later layers. CNNs also contain pooling layers to gradually reduce spatial dimensions and provide translational invariance. This hierarchical architecture loosely mimics the organisation of neurons in the visual cortex. After several convolutional and pooling layers, CNNs generally use fully-connected layers to condense feature representations and perform classification. Moreover, CNNs are also suitable for regression and segmentation tasks. The entire network, including filter values (with the exception of pooling layers), is trained using backpropagation to minimise a loss function through gradient descent. This provides an efficient way to tune the filters to extract optimal features tailored to the training data.

CNNs have achieved state-of-the-art results on many computer vision tasks, including image classification, object detection and semantic segmentation. Their automatic feature extraction, ease of training, and layered feature learning have made CNNs the dominant approach for nearly all vision problems. Careful CNN architecture design, regularisation, and hyperparameter tuning are crucial to ensure robust generalisation and avoid overfitting. Overall, CNNs provide a flexible yet efficient deep learning framework well-suited for diverse computer vision applications.

3.1.1. UNet: segmentation-based CNN

As explained previously, CNNs leverage convolutional layers to extract hierarchical features from images. A UNet model builds on these CNN principles to create an encoder–decoder segmentation architecture [30]. The encoder portion of UNet uses repeated blocks of convolution, activation, and max pooling layers, similarly to a typical CNN.
This encodes the input image into high-level feature representations while downsampling spatially. The decoder pathway then upsamples these features back to the original input resolution using transpose convolutions. A key difference from a CNN is the introduction of skip connections that concatenate encoder features with the upsampled decoder features. These skips provide the decoder with both contextual information from the encoder (information recall) as well as fine-grained localisation from the upsampled features. Finally, the decoded features are fed into a convolution layer to generate a pixel-wise probability map for semantic segmentation. So UNet leverages a CNN encoder to analyse contextual features, but adds a decoding path with skip connections to localise and precisely segment input images in an end-to-end manner. The model is trained via backpropagation just like ordinary CNNs. This architecture remains popular for segmentation tasks, especially in medical imaging where precision is critical. In summary, UNet extends CNNs into an efficient encoder–decoder structure specialised for precise pixel-level segmentation while retaining automated feature extraction capabilities. Fig. 1 provides a visual example of the UNet architecture.

3.2. Bacteria detection

As described in Section 1, lipid content within Mtb cells is calculated as the proportion of total bacteria detected in an FOV stained with Auramine-O (green channel) which are also detected in the same FOV stained with LTR (red channel) at the same location. As each FOV is represented as two RGB images, we set the red and blue channels to 0 to make bacteria visible in the green channel only. Similarly, setting the green and blue channels to 0 makes bacteria visible in the red channel only. If a bacterium is localised in the green channel and co-localised at the same spot in the red channel, it is LR.
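The channel-suppression and co-localisation logic can be sketched with NumPy masks (a minimal illustration with hypothetical detection masks and helper names; in the paper the masks come from the segmentation network, not from thresholding):

```python
import numpy as np

def suppress_to_green(rgb):
    """Zero the red and blue channels so only the green signal remains."""
    out = np.zeros_like(rgb)
    out[..., 1] = rgb[..., 1]
    return out

def lipid_content(green_mask, red_mask):
    """Proportion of bacteria detected in the green channel that are also
    detected (co-localised) at the same pixels in the red channel.

    `green_mask` is an integer-labelled image: 0 = background, k = bacterium k;
    `red_mask` is a boolean detection mask for the LTR-stained image."""
    labels = [k for k in np.unique(green_mask) if k != 0]
    lr = sum(1 for k in labels if np.any(red_mask[green_mask == k]))
    return lr / len(labels) if labels else 0.0

# Hypothetical FOV: five green-channel bacteria, three overlapping the red mask.
green = np.zeros((10, 10), dtype=int)
red = np.zeros((10, 10), dtype=bool)
for k in range(1, 6):
    green[k, 2:5] = k                            # five single-row "bacteria"
red[1, 2:5] = red[2, 2:5] = red[3, 2:5] = True   # first three are lipid-rich
ratio = lipid_content(green, red)                # 3 of 5 -> 0.6
```

Note the direction of the comparison: only green-channel detections are tested against the red mask, matching the unidirectional workflow of microscopists.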
If the bacterium is localised in the green channel without co-localisation in the red channel, it is LP. For example, when examining paired images of an FOV, if 5 bacterial locations are found in the green channel and 3 of these are co-localised in the red channel, 5 Mtb cells have been identified in total, 3 (60% lipid content) of which are LR. Any objects resembling bacteria that appear in the red channel only are insignificant, as microbiologists typically address this task in a unidirectional manner rather than bidirectionally, i.e. they first detect bacteria in the green channel and then check the red one.

The key component in this analysis is not the actual colour intensity, but rather the intensity of the object in contrast to the image background when the other two channels of an RGB image are set to a value of 0, i.e. suppressed. We convert each FOV to greyscale in order to avoid separate training for each coloured FOV and also to reduce complexity, as we go from three dimensions to one. In addition, to make FOVs less susceptible to noise, we use the image enhancement technique described by Zachariou et al. to make the approach more robust and effective [29,31]. Fig. 2 presents an example of our preprocessing procedure, as well as the segmentation ground truths.

The first objective is to binarise the FOV, which means that the resulting image has a black background and the objects of interest (bacteria) are white areas. We adapt and apply UNet [30] since it is an effective choice for learning to collect important information about objects of interest and generate a binarised image. We replace the first layer of the UNet with one that has one input channel rather than three, and 32 output channels rather than the 64 required by the original implementation.
Therefore, the input and output channels of subsequent layers are adjusted to align with the original UNet implementation, whereby the number of channels in each layer is doubled compared to the previous layer. Consequently, the proposed network architecture exhibits a reduction in the number of channels at the bottleneck level from 1024 to 512. Kernel sizes and padding for the convolutional layers are not changed. In addition, the max pooling layers in the model have a stride of 1, as opposed to the original UNet that used a stride of 2, while the kernel size remains the same. These modifications of the layers are driven by the fact that bacteria do not have complicated shapes. As the form of a bacterium is relatively concise, the first layer requires less deductive reasoning; therefore, higher channel layers may cause the model to overfit on the training data and acquire extraneous features. As this is supervised learning, an experienced mycobacterial microscopist manually examines and highlights bacterial outlines in each FOV, which are then converted into a binary image and used as ground truth for both the UNet and the proposed network training.

3.2.1. Training of segmentation networks

The training is carried out in an end-to-end fashion; there is no use of transfer learning. Because there is no transfer learning, we train the network for over 1000 epochs. We employ AdaBelief, a novel optimiser that has been demonstrated to converge as rapidly as adaptive optimisers (such as Adam [32]) and to generalise better than Stochastic Gradient Descent (SGD) [33] in intricate models such as GANs [34]. A circular scheduler with a step size equal to five times the size of the dataset (which in turn is dependent on the batch size) is used in conjunction with a learning rate of 0.0001, which was the default setting [35]. The base learning rate and the upper learning rate are set to their respective default values of 0.00001 and 0.0004.
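For illustration, a cyclical schedule between these base and upper rates can be written down compactly. The sketch below uses the common triangular formulation with the rates quoted above; the concrete `step_size` value is hypothetical (in the text it equals five times the dataset size):

```python
def cyclical_lr(step, base_lr=1e-5, max_lr=4e-4, step_size=5000):
    """Triangular cyclical learning rate: rises linearly from base_lr to
    max_lr over `step_size` steps, falls back over the next `step_size`
    steps, and then repeats."""
    cycle = step // (2 * step_size)
    x = abs(step / step_size - 2 * cycle - 1)  # 1 at cycle edges, 0 at peak
    return base_lr + (max_lr - base_lr) * (1.0 - x)

# The rate starts at base_lr, peaks mid-cycle and returns to base_lr.
rates = [cyclical_lr(s) for s in (0, 2500, 5000, 10000)]
```

In practice a framework scheduler (e.g. a cyclic scheduler stepped once per batch) implements the same waveform.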
We use Dice loss [36] (also known as F1-score) as the loss function to train the model. To increase the robustness and generalisability of the learning process, we augmented real data with synthetic data. To achieve this we synthesise images randomly rotated by ±25° and mirrored around the vertical or horizontal axis; this increases the quantity of training data by roughly 50% [37]. Note that this type of enhancement is particularly well suited to the task at hand because, unlike natural images, in which there is an inherent asymmetry in directions (e.g., the horizontal and vertical directions are objectively defined and cannot be swapped), in the microscopy slides of interest all directions are interchangeable and therefore equivalent. Furthermore, input images are resized to 256 × 256 pixels using bicubic interpolation [38].

Fig. 1. Flow of information of the UNet architecture.

3.2.2. Minimising false positives with bacterial morphological features

The existing literature describes that detection of bacteria using gradient-based methods alone is not always successful [39,40]. Specificity can be compromised by false positive misinterpretation of artefacts as bacteria, on the basis of similar colour intensity. To reduce detection of these false positives, our method includes heuristic morphological characteristics (area, perimeter, number of edges, and Fourier descriptors). Using the Douglas–Peucker [41] technique, we compute the area, perimeter and approximate form of each contour. Essential parameters for a detected shape to be identified as a bacterium are: the area must be between 80 and 1200 pixels, the perimeter must be between 40 and 300 pixels, and the approximate form must have between 9 and 20 edges. In the last step of this process, we calculate the elliptic Fourier descriptors for each contour of the ground truth labels using the 20th harmonic.
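A sketch of this morphological filter is given below. The thresholds are taken from the text; the polygon helpers are simplified stand-ins for the OpenCV-style routines a real implementation would likely use, and the elliptic Fourier descriptor follows the standard Kuhl–Giardina formulation:

```python
import numpy as np

def shoelace_area(poly):
    """Area of a closed polygon given as an (N, 2) array of vertices."""
    x, y = poly[:, 0], poly[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def perimeter(poly):
    d = np.diff(np.vstack([poly, poly[:1]]), axis=0)  # close the contour
    return float(np.sum(np.hypot(d[:, 0], d[:, 1])))

def rdp(points, eps):
    """Ramer-Douglas-Peucker simplification of a polyline."""
    if len(points) < 3:
        return list(points)
    start = np.asarray(points[0], float)
    seg = np.asarray(points[-1], float) - start
    norm = np.hypot(seg[0], seg[1])
    dists = []
    for q in points[1:-1]:
        q = np.asarray(q, float)
        if norm == 0.0:
            dists.append(np.hypot(*(q - start)))
        else:  # perpendicular distance to the start-end chord
            dists.append(abs(seg[0] * (q[1] - start[1])
                             - seg[1] * (q[0] - start[0])) / norm)
    i = int(np.argmax(dists)) + 1
    if dists[i - 1] > eps:
        return rdp(points[:i + 1], eps)[:-1] + rdp(points[i:], eps)
    return [points[0], points[-1]]

def looks_like_bacterium(contour, eps=2.0):
    """Gate a contour on the area, perimeter and edge-count criteria."""
    edges = len(rdp(list(contour), eps)) - 1   # edges of the simplified form
    return bool(80 <= shoelace_area(contour) <= 1200
                and 40 <= perimeter(contour) <= 300
                and 9 <= edges <= 20)

def elliptic_fourier_descriptors(contour, order=20):
    """Kuhl-Giardina elliptic Fourier coefficients, an (order, 4) matrix
    of (a_n, b_n, c_n, d_n); assumes no duplicate consecutive points."""
    d = np.diff(np.vstack([contour, contour[:1]]), axis=0)
    dt = np.hypot(d[:, 0], d[:, 1])
    t = np.concatenate([[0.0], np.cumsum(dt)])
    phi = 2.0 * np.pi * t / t[-1]
    coeffs = np.zeros((order, 4))
    for n in range(1, order + 1):
        c = t[-1] / (2.0 * n ** 2 * np.pi ** 2)
        dcos = np.cos(n * phi[1:]) - np.cos(n * phi[:-1])
        dsin = np.sin(n * phi[1:]) - np.sin(n * phi[:-1])
        coeffs[n - 1] = c * np.array([np.sum(d[:, 0] / dt * dcos),
                                      np.sum(d[:, 0] / dt * dsin),
                                      np.sum(d[:, 1] / dt * dcos),
                                      np.sum(d[:, 1] / dt * dsin)])
    return coeffs
```

For a circular contour of radius r the first harmonic recovers (a1, d1) ≈ (r, r) with the remaining coefficients near zero, which is why a small number of harmonics suffices for smooth, bacterium-like outlines.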
Representation with the 20th harmonic yields approximate coefficients that capture well the morphology of a majority of randomly chosen bacteria specimens. Higher numbers of harmonics result in an overfitted outline of the current contour. Once each Fourier descriptor for every contour has been computed, the resulting matrix has the dimensions 𝑛 × 20 × 4, where n is the total number of contours. The last dimension, 4, reflects the returned coefficients of the Fourier series representation of the contour. The final 20 × 4 matrix is created by averaging the Fourier descriptors from all calculated contours. Furthermore, the Fourier descriptors of each predicted contour are calculated. These descriptors are then used to compute the Euclidean distance to the average descriptors derived from the ground truth labels. To be considered a valid bacterial shape, the Euclidean distance between the predicted contour's Fourier descriptors and the average descriptors must be between 14 and 18 pixels.

3.3. Estimating cell length and width

For the last step of this process, any bacterium/contour in the green channel images that fits the requirements given in Section 3.2.2 is utilised as the test set. Firstly, a medical microscopist manually crops patches containing one or more bacteria that overlap and annotates the cells with straight lines down their entire length. Multiple straight lines are needed for bacteria with curved or angular forms. Since the cell width across all cells is very similar (typically 5–6 pixels), width is averaged per patch; thus a patch with three cells is represented by a scalar for its width. Furthermore, since we observe that the maximum number of bacteria per patch is four (n.b. for our dataset), the size of the vector acting as the ground truth label during training is five.
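This label encoding can be assembled as in the following sketch (a hypothetical helper written for illustration; each bacterium contributes the sum of the lengths of its annotated straight-line segments):

```python
def make_label(width_avg, segment_lengths, max_bacteria=4):
    """Build the regression target: [mean width, len(b1), ..., len(b4)].

    `segment_lengths` holds one list of annotated line-segment lengths per
    bacterium; a curved cell is traced with several straight segments
    whose lengths are summed. Unused slots are padded with zeros."""
    if len(segment_lengths) > max_bacteria:
        raise ValueError("more bacteria than the label can encode")
    label = [float(width_avg)] + [float(sum(s)) for s in segment_lengths]
    label += [0.0] * (1 + max_bacteria - len(label))
    return label

# Two bacteria: one straight cell, one curved cell traced with two segments.
label = make_label(5.5, [[30.0], [12.0, 9.5]])  # [5.5, 30.0, 21.5, 0.0, 0.0]
```

Counting the nonzero tail entries of such a label also recovers the number of bacteria in the patch.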
If a patch contains two bacteria, for example, the first entry is the average width, the second entry is the sum of the segment lengths of the first bacterium, and the third entry is the corresponding sum for the second bacterium. The remaining entries are all 0. Evidently, an additional benefit of our approach is its ability to count the number of bacteria present in a patch, similarly to previous works [21,29].

We utilise these labels to train a second CNN model using regression, i.e. the final linear output layer does not contain a sigmoid activation. The trained model is stored and later used as a pre-trained model with its linear output layer removed, transforming it into a feature extraction encoder producing a 128-dimensional vector. Additionally, several feature descriptors are applied to extract a supplementary 128-dimensional vector of features from the input patches. These are: RootSIFT [42], Multiple Kernel local descriptors [43], HardNet [44], HardNet8 [45], HyNet [46], TFeat [47], SOSNet [46], Histogram of Oriented Gradients [48] and Local Binary Patterns [49,50]. The two vectors, one from the CNN and the other from the feature descriptor, are then concatenated, creating a 256-dimensional feature vector. This vector serves as input to a multi-output support vector regressor (MSVR) [51] aiming to predict the same 5-dimensional ground truth. Fig. 3 shows an overview of our method's information flow.

Fig. 2. Examples of paired images from the same FOV from our dataset. In the top row, (a) has all colour channels except green suppressed and (b) has all colour channels except red suppressed. In the second row, all images are converted to greyscale using the aforementioned image enhancement technique. In the bottom row, manual ground-truth labelling of cells in the two images is shown; the two separate fluorescence labels visible on the same FOV display different information, necessitating a different ground truth label. At this phase, we train our model to detect as many Mtb-shaped objects as possible in both green and red channel images, even though objects which are only detected in the red image are ultimately discarded.

Fig. 3. Following an encoding procedure from both the CNN and the feature descriptor, the MSVR outputs the final predictions. Since 1 × 1 convolutional filters may be used to modify the dimensionality of the filter space while maintaining linear activation of pixel values, the kernel size of all CNN layers is set to 1. This is due to the small dimensions of the input images; thus, we want to capture bacterial characteristics without losing spatial information.

3.3.1. Training setup

As with the training for segmentation, no transfer learning is performed in this instance, and the model is trained from scratch for 1000 epochs. For the optimiser we use Adam for its straightforward implementation, computational efficiency, and low memory requirements. The hyper-parameters 𝛽1 and 𝛽2 are set to 0.5 and 0.999 respectively, the learning rate to 0.001, and the cosine annealing learning rate scheduler is employed [52]. This scheduler decreases the learning rate every 20 iterations until it reaches 0.0001 before starting over. Finally, the loss function used is the Least Absolute Deviation (L1), since the dataset contains many outliers which would be emphasised by squared differences. Considering that we have 1000 patches available for training in this stage (80% for training and 20% for testing), no data augmentation is performed. Like the previous CNN, the input patches are made square before being scaled to 80 × 80 pixels using bicubic interpolation. Following grid search hyperparameter tuning, the following are used: a radial basis function (RBF) kernel, 𝐶 = 1, 𝜖 = 0.001, and 𝛾 = 0.01. Fig. 4 presents a graphical summary.

4. Experimental evaluation

4.1.
Dataset

The images used in this work are from the dataset of TB patients described in previous work [29,31]. Briefly, 46 patients with pulmonary TB were recruited at clinical facilities affiliated to NIMR-Mbeya Medical Research Centre (NIMR-MMRC). Microscopy smears were made from sputum samples collected pre-treatment, and after 2 and 5–6 months of TB therapy. These were stained according to standard Auramine-O/LTR protocols and viewed at ×1000 using an oil immersion lens of a Leica DM5500 microscope with a DFC 300G camera attachment. Paired FOVs containing Mtb were photographed at manual microscopy, using an N3 filter cube (excitation and emission spectra of 546/12 and 600/40 nm) to assess Auramine-O staining and a TX2 filter cube (excitation and emission spectra of 560/40 and 645/75 nm) to assess LTR staining.

Altogether, 1000 FOVs were selected at random from the Tanzanian corpus [29,31]. To confirm that the automated image analysis approach under development is unaffected by changes in the morphology of Mtb cells during or after TB treatment, images were selected from all sample collection time periods. To create ground-truth data for the segmentation analysis, a microscopist who was independent of the original Tanzanian project re-examined these images, labelling objects of interest in both green and red channel images. 80% of the FOVs were utilised for training, while the remaining 20% were used for testing and assessment.

4.2. Semantic segmentation of bacteria detection

As outlined in Section 3.2, bacterial detection and estimation of lipid content must be done in combination. Therefore, evaluating the performance of these tasks should also be done together. However, distinct techniques are required to assess the separate processes of semantic segmentation on green and red channel images of an FOV, and distance-based evaluation of whether the same objects have been localised on both images.
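The two overlap metrics used for the segmentation assessment have standard definitions on binary masks; a minimal NumPy version:

```python
import numpy as np

def jaccard(pred, gt):
    """Jaccard index: intersection over union of two boolean masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 1.0

def dice(pred, gt):
    """Dice coefficient: 2|A n B| / (|A| + |B|); equals the F1-score."""
    inter = np.logical_and(pred, gt).sum()
    total = pred.sum() + gt.sum()
    return 2.0 * inter / total if total else 1.0

# Two overlapping 4x4 squares: 16 pixels each, 9 pixels in common.
pred = np.zeros((8, 8), dtype=bool); pred[2:6, 2:6] = True
gt = np.zeros((8, 8), dtype=bool); gt[3:7, 3:7] = True
```

Since Dice = 2J/(1 + J), the Dice value on a given mask pair never falls below the Jaccard value.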
For example, a detection may be a true positive in terms of detection, yet not be deemed accurate for the lipid content estimation. This is also the primary reason why these two stages of our work require two distinct assessment techniques.

The evaluation metrics used in the assessment of semantic segmentation are the Dice coefficient [54] and the Jaccard index [53]. When only Auramine-O stained FOVs are included, these are 97.00% and 96.06% respectively. The value of the Dice coefficient and Jaccard index exceeds that achieved by earlier efforts [27–29]. All three works employ the same evaluation metrics, which facilitates direct comparison with our method. However, when LTR stained images are included, the percentages decrease to 92.03% and 85.84%, respectively. As seen in Fig. 5, Nile red stained FOVs often result in false positives, which motivated our subsequent use of morphology. Table 1 presents a comprehensive overview of the outcomes obtained from the comparison of the two UNet models. Following the application of morphological criteria and Fourier descriptors to the FOVs of both dyes, the final percentages are 95.47% and 91.33%. Considering that it is very difficult to match precise bacterial outlines by manual or automated labelling, it is unrealistic to anticipate that the form of the predicted contour would precisely match the shape outline of the ground-truth contour. Therefore, even if the model accurately predicted a contour, the errors in the reference used as the ground truth itself may penalise it slightly.

Fig. 4. The diagram presented illustrates the proposed approach. The first method involves the training of the UNet model and the proposed network. Afterwards, bacterial patches are manually cropped from ground truth labels in the green channel that were employed in the segmentation method.
The decision stage determines whether it is necessary to pre-train the CNN model for the final stage or, alternatively, to use the same CNN with a regression layer to predict the vector of cell length and width directly. If the pre-trained CNN is used, the features extracted from it are combined with the output of a feature descriptor. This concatenated feature representation is then fed into an MSVR, which produces a prediction vector analogous to the output of the CNN's regression layer.

Fig. 5. (a) and (b) are separate examples of Auramine-O stained FOVs, together with the prediction image from the segmentation model and the labels applied by a microscopist to the corresponding ground truth images. (c) is an example of a different LipidTox Red stained FOV (not paired with (a) or (b)). The prediction in (c) has localised three false positive objects, which are likely due to noise or artefacts and are subsequently rejected using our morphology-based approach.

Table 1
A comparison of segmentation results between the original UNet and our proposed network. In the training phase, a composite of both stained FOVs was used, whereas in the testing phase, both models were first evaluated using only green FOVs and then using both types of FOV. The LTR dye stained more artefacts, making it more difficult to detect Mtb cells precisely on the red images. Although the original UNet performed better in training, the proposed network performed better on unseen test data.

Model            Test FOVs    Training Dice  Training Jaccard  Test Dice  Test Jaccard
UNet (baseline)  Green FOVs   99.53%         99.07%            96.10%     92.49%
UNet (baseline)  All FOVs     –              –                 91.29%     83.97%
Our network      Green FOVs   99.04%         98.11%            97.00%     96.06%
Our network      All FOVs     –              –                 92.03%     85.24%

Fig. 6. Plots (a) and (b): length and width samples versus their respective percentage error rates.
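A minimal sketch of the descriptor-plus-MSVR path described above, assuming scikit-image and scikit-learn are available. Since scikit-learn has no native MSVR, the multi-output SVR of [51] is approximated here by independent per-output SVRs; the array shapes, hyperparameters, and random stand-in data are illustrative only:

```python
import numpy as np
from skimage.feature import hog
from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR

def describe_patch(patch, cnn_features):
    """Concatenate a CNN embedding with a HOG descriptor of the same patch."""
    hog_vec = hog(patch, orientations=9, pixels_per_cell=(8, 8),
                  cells_per_block=(2, 2), feature_vector=True)
    return np.concatenate([cnn_features, hog_vec])

rng = np.random.default_rng(0)
patches = rng.random((20, 64, 64))         # stand-ins for cropped bacterial patches
cnn_feats = rng.random((20, 128))          # stand-ins for CNN embeddings
targets = rng.random((20, 2)) * [100, 10]  # stand-in [length, width] targets in pixels

X = np.stack([describe_patch(p, f) for p, f in zip(patches, cnn_feats)])
reg = MultiOutputRegressor(SVR(kernel="rbf", C=10.0)).fit(X, targets)
pred = reg.predict(X[:1])                  # one [length, width] prediction per patch
```

In the actual pipeline the 128-dimensional vectors would come from the pre-trained CNN backbone rather than a random generator, and HOG is only one of the descriptor choices compared in Table 2.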
Due to the large number of samples in both the training and test sets, the graph is simplified by averaging samples within a 0.01% error difference.

Table 2
Performance evaluation metrics for both training and test sets, covering all combinations of the model and feature descriptors. The quantitative results demonstrate that our approach has learnt and generalised the problem. The CNN model used throughout the evaluation phase was the pre-trained model specifically designed for this purpose.

Model            Training RMSE  Training MAPE  Training MAE  Test RMSE  Test MAPE  Test MAE
CNN              1.9840         0.0213         0.5366        2.4746     0.1111     1.7442
CNN & HOG        0.0161         0.0046         0.0212        0.0815     0.0112     0.1004
CNN & SIFT       0.6727         0.0350         0.6753        0.8357     0.0533     1.0778
CNN & MKD        0.5374         0.0290         0.5469        0.6915     0.0431     0.8732
CNN & HardNet    0.4651         0.0246         0.4646        0.5307     0.0339     0.6628
CNN & HardNet 8  0.7747         0.0393         0.7599        0.9575     0.0615     1.2421
CNN & HyNet      0.5034         0.0263         0.5162        0.6476     0.0402     0.8076
CNN & TFeat      0.1322         0.0077         0.1572        0.1563     0.0096     0.2068
CNN & SOSNet     0.4718         0.0256         0.4948        0.6248     0.0394     0.8007
CNN & LBPs       0.7684         0.0396         0.7624        0.9737     0.0626     1.2495

4.3. Distance-based evaluation

Next, we evaluated the ability of our network to recognise the same bacteria at the same location in both images of each FOV. We use the L1, L2, and L-infinity (L∞) norms in a manner similar to that described by Zachariou et al. [29]. Instead of comparing ground truth contours to predicted contours, contours from the green FOV and the red FOV are compared in this paper. Essentially, we attempt to match the centroids of bacteria in the green FOV with the centroids of bacteria in the red FOV. The pairing was determined by the minimum Euclidean distance between centroid positions, with a 15-pixel threshold beyond which an apparent bacterium in one image was deemed to have no partner in the other.
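This matching rule can be sketched as follows; greedy one-to-one nearest-centroid assignment is our assumption of the exact strategy, and the coordinates are toy values:

```python
import numpy as np

def pair_centroids(green, red, threshold=15.0):
    """Greedily pair each green-FOV centroid with its nearest red-FOV centroid.

    Returns the list of paired distances; a green contour with no red partner
    within `threshold` pixels is discarded as irrelevant.
    """
    red = list(red)
    distances = []
    for g in green:
        if not red:
            break
        d = [np.hypot(g[0] - r[0], g[1] - r[1]) for r in red]
        i = int(np.argmin(d))
        if d[i] <= threshold:
            distances.append(d[i])
            red.pop(i)          # each red contour may be used only once
    return distances

green = [(10.0, 10.0), (50.0, 40.0), (90.0, 90.0)]
red = [(12.0, 11.0), (53.0, 44.0), (300.0, 300.0)]   # third bacterium unmatched
d = pair_centroids(green, red)
l1, l2, linf = sum(d), float(np.sqrt(np.sum(np.square(d)))), max(d)
```

The resulting distance vector is exactly what the L1, L2, and L∞ norms reported below are computed over.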
If no suitable contour is found in the red FOV, the contour from the green FOV is discarded, since it is deemed irrelevant. The paired distances constitute a vector which is subsequently used for the norm calculations. Additionally, we provide the counts of paired contours from each category, namely green and red ground truth FOVs, along with their corresponding predicted counterparts. The L1, L2, and L∞ norms for the ground truth FOVs measured 1010.77, 49.17, and 8.54 pixels, respectively, with a total of 572 pairs. For the predicted FOVs, the norms were 1067.7, 56.12, and 9.85 pixels, with 577 pairs. The close agreement of the norms between the two sets of FOV pairings underscores the accuracy of our technique in predicting bacterial locations in both scenarios. Specifically, the L∞ norm, representing the maximum absolute distance, differs by less than 2 pixels, and the sums of all distances are within 70 pixels of each other. Considering that the average length of a bacterium can range from 20 to 100 pixels, these numbers suggest that the predicted pairings closely align with the ground truth ones.

4.4. Bacterial length and width

As described in Section 3.3, we use regression to estimate the individual length and average width of bacteria. We therefore apply regression evaluation metrics, comprising the root mean square error (RMSE), mean absolute percentage error (MAPE), and mean absolute error (MAE). The rationale for incorporating both MAPE and MAE lies in the dissimilar scaling of length and width. As depicted in Fig. 6, the scaling of length and width differs significantly, implying that an error in length does not have an effect equivalent to the same error in width.
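The three metrics can be sketched as follows; the toy numbers illustrate why both absolute and percentage errors are reported when lengths (tens of pixels) and widths (a few pixels) share one target vector:

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return (RMSE, MAPE, MAE) for arrays of targets and predictions."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    err = y_pred - y_true
    rmse = float(np.sqrt(np.mean(err ** 2)))
    mape = float(np.mean(np.abs(err) / np.abs(y_true)))  # assumes no zero targets
    mae = float(np.mean(np.abs(err)))
    return rmse, mape, mae

# A 2-pixel error on a ~50 px length is small in percentage terms, while a
# 1-pixel error on a ~5 px width is large - hence reporting MAPE alongside MAE.
rmse, mape, mae = regression_metrics([50.0, 5.0], [52.0, 4.0])
# rmse = sqrt((4 + 1) / 2) ~= 1.581, mape = (0.04 + 0.2) / 2 = 0.12, mae = 1.5
```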
This figure also indicates that MAE is a more suitable loss function than MSE for our dataset: the outliers, represented by the two tails of the distribution, exhibit a smaller deviation from 0% error, while the majority of errors occur on the average samples. Altogether the results are promising and all model combinations performed well, with the CNN & HOG combination consistently performing best according to all the criteria. Fig. 7 shows two examples of cell dimension measurements using the best model. Table 2 summarises all training and test set metrics. Two additional plots, depicted in Figs. 8 and 9 and derived from the test set, provide supplementary evidence that the model has performed strongly and has acquired the ability to generalise to novel data.

Fig. 7. Examples of patches to illustrate the labelling procedure for cell dimensions, with examples of ground truth and predicted distances. The lengths of bacteria are shown by the blue straight lines, while widths are depicted by green straight lines. When the length of a curved or angular bacterium requires several blue lines for full coverage, its total length is calculated as the sum of all the blue lines within it. Distances written next to individual cells in blue are ground truth lengths in pixels, while those in red are predicted lengths. The width value is the average of all green lines in each patch and is written in the bottom left corner of each image.

Fig. 8. The histogram of residuals shows a concentration of residuals around 0, indicating that the model's residuals are predominantly distributed in close proximity to the origin. Patches consisting of 3 or 4 cells are infrequent; as a result, the third length is typically 0, which aligns with the model's accurate prediction. For clarity, the fourth length is not displayed.

Fig. 9. The residual plot indicates a dispersion of residuals close to zero. This provides further evidence for the selection of MAE as the more appropriate loss function, given that the outliers in the test set lie closer to zero than the average sample.

5. Conclusion

The majority of machine learning and deep learning research on automating sputum smear microscopy has focussed on its long-established role as a frontline diagnostic test for pulmonary TB. As molecular tools, such as Xpert MTB/RIF, replace this function, a key contribution of microscopy may become its ability to report on phenotypic characteristics of individual Mtb cells for treatment monitoring and to improve our biological understanding of therapeutic response. The work we publish here is the first demonstration of artificial intelligence approaches for this application. We have pioneered a new method for semantic segmentation of Mtb bacteria on fluorescence microscopy FOVs which performs well according to established evaluation metrics. Our method is robust for use with multiple fluorescence stains, so that paired images of the same FOV can be used to report on bacterial detection and the presence of important intracellular structures such as lipid content. Finally, a significant contribution of our work is that our models accurately predict the dimensions (length and width) of cells in original ground truth images, which will improve the ability of clinical researchers and microbiologists to investigate the relevance of heterogeneous bacterial appearances in biological samples.

Next steps for this work will include: (i) interdisciplinary collaboration between Infectious Disease and Computer Science researchers to deploy these tools on more microscopy image-sets to assess their real-world application, and (ii) optimisation of methods for automated reading of whole slides, so that the manual labour required to identify FOVs and patches before deep learning techniques can be used is also eliminated.

Overall, the information compiled in this work argues that microscopy-based treatment monitoring and Mtb cell phenotyping research is important, and we have shown that automated deep learning techniques make these activities possible.

Declaration of competing interest

The authors declare no conflict of interest.

Acknowledgements

Supported by a Wellcome Trust Institutional Strategic Support Fund award to the University of St Andrews, grant code 204821/Z/16/Z.

References

[1] World Health Organization, Global Tuberculosis Report, Tech. Rep., 2022, URL https://www.who.int/teams/global-tuberculosis-programme/tb-reports/global-tuberculosis-report-2022.
[2] D.P. Spence, J. Hotchkiss, C.S. Williams, P.D. Davies, Tuberculosis and poverty, Br. Med. J. 307 (6907) (1993) 759–761.
[3] World Health Organization, End TB strategy, 2015, URL https://apps.who.int/iris/bitstream/handle/10665/331326/WHO-HTM-TB-2015.19-eng.pdf?sequence=1&isAllowed=y.
[4] K.R. Steingart, M. Henry, V. Ng, P.C. Hopewell, A. Ramsay, J. Cunningham, R. Urbanczik, M. Perkins, M.A. Aziz, M. Pai, Fluorescence versus conventional sputum smear microscopy for tuberculosis: a systematic review, Lancet Infect. Dis. 9 (6) (2006) 570–581.
[5] Stop TB
Partnership, Global Laboratory Initiative Advancing TB Diagnosis: Mycobacteriology Laboratory Manual, Stop TB Partnership, Geneva, 2014, URL https://stoptb.org/wg/gli/assets/documents/gli_mycobacteriology_lab_manual_web.pdf.
[6] C.C. Boehme, P. Nabeta, D. Hillemann, M.P. Nicol, S. Shenai, F. Krapp, J. Allen, R. Tahirli, R. Blakemore, R. Rustomjee, A. Milovic, M. Jones, S.M. O'Brien, D.H. Persing, S. Ruesch-Gerdes, E. Gotuzzo, C. Rodrigues, D. Alland, M.D. Perkins, Rapid molecular detection of tuberculosis and rifampin resistance, N. Engl. J. Med. 363 (11) (2010) 1005–1015.
[7] S.O. Friedrich, A. Rachow, E. Saathoff, K. Singh, C.D. Mangu, R. Dawson, P.P. Phillips, A. Venter, A. Bateson, C.C. Boehme, N. Heinrich, R.D. Hunt, M.J. Boeree, A. Zumla, T.D. McHugh, S.H. Gillespie, A.H. Diacon, M. Hoelscher, Assessment of the sensitivity and specificity of Xpert MTB/RIF assay as an early sputum biomarker of response to tuberculosis treatment, Lancet Respir. Med. 1 (6) (2013) 462–470.
[8] World Health Organization, WHO Operational Handbook on Tuberculosis. Module 4: Treatment. Drug-Susceptible Tuberculosis Treatment, World Health Organization, Geneva, 2022, URL https://www.who.int/publications/i/item/9789240050761.
[9] R.J.H. Hammond, F. Kloprogge, O.D. Pasqua, S.H. Gillespie, Implications of drug-induced phenotypical resistance: Is isoniazid radicalizing M. tuberculosis? Front. Antibiot. 1 (2022) 928365.
[10] J. Daniel, H. Maamar, C. Deb, T.D. Sirakova, P.E. Kolattukudy, Mycobacterium tuberculosis uses host triacylglycerol to accumulate lipid droplets and acquires a dormancy-like phenotype in lipid-loaded macrophages, PLoS Pathog. 7 (6) (2011) e1002093.
[11] N.J. Garton, S.J. Waddell, A.L. Sherratt, S.-M. Lee, R.J. Smith, C. Senner, J. Hinds, K. Rajakumar, R.A. Adegbola, G.S. Besra, P.D. Butcher, M.R. Barer, Cytological and transcript analyses reveal fat and lazy persister-like bacilli in tuberculous sputum, PLoS Med. 5 (4) (2008) e75.
[12] R.J.H. Hammond, V.O.
Baron, K. Oravcova, S. Lipworth, S.H. Gillespie, Phenotypic resistance in mycobacteria: is it because I am old or fat that I resist you? J. Antimicrob. Chemother. 70 (10) (2015) 2823–2827.
[13] C. Deb, C.M. Lee, V.S. Dubey, J. Daniel, B. Abomoelak, T.D. Sirakova, S. Pawar, L. Rogers, P.E. Kolattukudy, A novel in vitro multiple-stress dormancy model for Mycobacterium tuberculosis generates a lipid-loaded, drug-tolerant, dormant pathogen, PLoS One 4 (6) (2009) e6077.
[14] D.J. Sloan, H.C. Mwandumba, N.J. Garton, S.H. Khoo, A.E. Butterworth, T.J. Allain, R.S. Heyderman, E.L. Corbett, M.R. Barer, G.R. Davies, Pharmacodynamic modeling of bacillary elimination rates and detection of bacterial lipid bodies in sputum to predict and understand outcomes in treatment of pulmonary tuberculosis, Clin. Infect. Dis. 61 (1) (2015) 1–8.
[15] E.S. Chung, W.C. Johnson, B.B. Aldridge, Types and functions of heterogeneity in mycobacteria, Nat. Rev. Microbiol. 20 (9) (2022) 529–541.
[16] B.B. Aldridge, M. Fernandez-Suarez, D. Heller, V. Ambravaneswaran, D. Irimia, M. Toner, S.M. Fortune, Asymmetry and aging of mycobacterial cells lead to variable growth and antibiotic susceptibility, Science 335 (6064) (2012) 100–104.
[17] K. Richardson, O.T. Bennion, S. Tan, A.N. Hoang, M. Cokol, B.B. Aldridge, Temporal and intrinsic factors of rifampicin tolerance in mycobacteria, Proc. Natl. Acad. Sci. 113 (29) (2016) 8302–8307.
[18] S. Vijay, D.N. Vinh, H.T. Hai, V.T.N. Ha, V.T.M. Dung, T.D. Dinh, H.N. Nhung, T.T.B. Tram, B.B. Aldridge, N.T. Hanh, D.D.A. Thu, N.H. Phu, G.E. Thwaites, N.T.T. Thuong, Influence of stress and antibiotic resistance on cell-length distribution in Mycobacterium tuberculosis clinical isolates, Front. Microbiol. 8 (2017) 2296.
[19] D.A. Barr, C. Schutz, A. Balfour, M. Shey, M. Kamariza, C.R. Bertozzi, T.J. de Wet, R. Dinkele, A. Ward, K.A. Haigh, Serial measurement of M.
tuberculosis in blood from critically-ill patients with HIV-associated tuberculosis, EBioMedicine 78 (2022).
[20] H.L. Rieder, A. Van Deun, K. Man Kam, S. Jae Kim, T.M. Chonde, A. Trebucq, R. Urbanczik, Priorities for Tuberculosis Bacteriology Services in Low-Income Countries, Bull. Int. Union Tuberc. Lung Dis. (2007).
[21] D. Vente, O. Arandjelović, V.O. Baron, E. Dombay, S.H. Gillespie, Using machine learning for automatic estimation of M. smegmatis cell count from fluorescence microscopy images, in: International Workshop on Health Intelligence, 2019, pp. 57–68.
[22] B. Yesilkaya, M. Perc, Y. Isler, Manifold learning methods for the diagnosis of ovarian cancer, J. Comput. Sci. 63 (2022) 101775.
[23] M. Surucu, Y. Isler, M. Perc, R. Kara, Convolutional neural networks predict the onset of paroxysmal atrial fibrillation: Theory and applications, Chaos 31 (11) (2021).
[24] Y. Bao, X. Zhao, L. Wang, W. Qian, J. Sun, Morphology-based classification of mycobacteria-infected macrophages with convolutional neural network: reveal EsxA-induced morphologic changes indistinguishable by naked eyes, Transl. Res. 212 (2019) 1–13.
[25] H. Yu, W. Jing, R. Iriya, Y. Yang, K. Syal, M. Mo, T.E. Grys, S.E. Haydel, S. Wang, N. Tao, Phenotypic antimicrobial susceptibility testing with deep learning video microscopy, Anal. Chem. 90 (10) (2018) 6314–6322.
[26] T. Zahir, R. Camacho, R. Vitale, C. Ruckebusch, J. Hofkens, M. Fauvart, J. Michiels, High-throughput time-resolved morphology screening in bacteria reveals phenotypic responses to antibiotics, Commun. Biol. 2 (1) (2019) 1–13.
[27] K.S. Mithra, W.R. Sam Emmanuel, FHDT: fuzzy and hyco-entropy-based decision tree classifier for tuberculosis diagnosis from sputum images, Sādhanā 43 (8) (2018) 1–15.
[28] J.L. Díaz-Huerta, A.d.C. Téllez-Anguiano, M. Fraga-Aguilar, J.A. Gutierrez-Gnecchi, S. Arellano-Calderón, Image processing for AFB segmentation in bacilloscopies of pulmonary tuberculosis diagnosis, PLoS One 14 (7) (2019) e0218861.
[29] M. Zachariou, O. Arandjelović, W. Sabiiti, B. Mtafya, D. Sloan, Tuberculosis bacteria detection and counting in fluorescence microscopy images using a multi-stage deep learning pipeline, Information 13 (2) (2022) 96.
[30] O. Ronneberger, P. Fischer, T. Brox, U-Net: Convolutional networks for biomedical image segmentation, in: International Conference on Medical Image Computing and Computer-Assisted Intervention, 2015, pp. 234–241.
[31] M. Zachariou, O. Arandjelović, E. Dombay, W. Sabiiti, B. Mtafya, D. Sloan, Extracting and classifying salient fields of view from microscopy slides of tuberculosis bacteria, in: International Conference on Pattern Recognition and Artificial Intelligence, 2022, pp. 1–12.
[32] D.P. Kingma, J. Ba, Adam: A method for stochastic optimization, 2014, arXiv preprint arXiv:1412.6980.
[33] H. Robbins, S. Monro, A stochastic approximation method, Ann. Math. Stat. (1951) 400–407.
[34] J. Zhuang, T. Tang, Y. Ding, S. Tatikonda, N. Dvornek, X. Papademetris, J.S. Duncan, AdaBelief optimizer: Adapting stepsizes by the belief in observed gradients, Adv. Neural Inf. Process. Syst. 33 (2020) 18795–18806.
[35] L.N. Smith, Cyclical learning rates for training neural networks, in: IEEE Winter Conference on Applications of Computer Vision, 2017, pp. 464–472.
[36] X. Li, X. Sun, Y. Meng, J. Liang, F. Wu, J. Li, Dice loss for data-imbalanced NLP tasks, 2019, arXiv preprint arXiv:1911.02855.
[37] X. Yue, N. Dimitriou, O. Arandjelović, Colorectal cancer outcome prediction from H&E whole slide images using machine learning and automatically inferred phenotype profiles, in: 11th International Conference, Vol. 60, 2019, pp. 139–149.
[38] O. Arandjelović, Hallucinating optimal high-dimensional subspaces, Pattern Recognit. 47 (8) (2014) 2662–2672.
[39] R.O. Panicker, K.S. Kalmady, J. Rajan, M.K. Sabu, Automatic detection of tuberculosis bacilli from microscopic sputum smear images using deep learning methods, Biocybern. Biomed. Eng. 38 (3) (2018) 691–699.
[40] V. Makkapati, R. Agrawal, R. Acharya, Segmentation and classification of tuberculosis bacilli from ZN-stained sputum smear images, in: International Conference on Automation Science and Engineering, 2009, pp. 217–220.
[41] D.H. Douglas, T.K. Peucker, Algorithms for the reduction of the number of points required to represent a digitized line or its caricature, Cartogr. Int. J. Geogr. Inf. Geovisualization 10 (2) (1973) 112–122.
[42] R. Arandjelović, A. Zisserman, Three things everyone should know to improve object retrieval, in: IEEE Conference on Computer Vision and Pattern Recognition, 2012, pp. 2911–2918.
[43] A. Mukundan, G. Tolias, A. Bursuc, H. Jégou, O. Chum, Understanding and improving kernel local descriptors, Int. J. Comput. Vis. 127 (11) (2019) 1723–1737.
[44] A. Mishchuk, D. Mishkin, F. Radenovic, J. Matas, Working hard to know your neighbor's margins: Local descriptor learning loss, Adv. Neural Inf. Process. Syst. 30 (2017).
[45] M. Pultar, Improving the HardNet descriptor, 2020, arXiv preprint arXiv:2007.09699.
[46] Y. Tian, X. Yu, B. Fan, F. Wu, H. Heijnen, V. Balntas, SOSNet: Second order similarity regularization for local descriptor learning, in: IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 11016–11025.
[47] V. Balntas, E. Riba, D. Ponsa, K. Mikolajczyk, Learning local feature descriptors with triplets and shallow convolutional neural networks, in: BMVC, Vol. 1, No. 2, 2016, p. 3.
[48] N. Dalal, B. Triggs, Histograms of oriented gradients for human detection, in: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 1, IEEE, 2005, pp. 886–893.
[49] T. Ojala, M. Pietikainen, D. Harwood, Performance evaluation of texture measures with classification based on Kullback discrimination of distributions, in: 12th International Conference on Pattern Recognition, Vol. 1, 1994, pp. 582–585.
[50] J. Fan, O.
Arandjelović, Employing domain specific discriminative information to address inherent limitations of the LBP descriptor in face recognition, in: International Joint Conference on Neural Networks, IEEE, 2018, pp. 1–7.
[51] Y. Bao, T. Xiong, Z. Hu, Multi-step-ahead time series prediction using multiple-output support vector regression, Neurocomputing 129 (2014) 482–493.
[52] I. Loshchilov, F. Hutter, SGDR: Stochastic gradient descent with warm restarts, 2016, arXiv preprint arXiv:1608.03983.
[53] A. Beykikhoshk, O. Arandjelović, D. Phung, S. Venkatesh, Overcoming data scarcity of Twitter: using tweets as bootstrap with application to autism-related topic content analysis, in: International Conference on Advances in Social Networks Analysis and Mining, 2015, pp. 1354–1361.
[54] B. Guindon, Y. Zhang, Application of the Dice coefficient to accuracy assessment of object-based image classification, Can. J. Remote Sens. 43 (1) (2017) 48–61.