2.2. CHARACTERISTICS OF MACHINE LEARNING ALGORITHMS
Supervised machine learning techniques are applicable in numerous domains. In general, Support Vector Machines and neural networks tend to perform much better on multidimensional and continuous features (Agarwal & Sagar, 2019). On the other hand, logic-based systems tend to work better with discrete/categorical features. Neural network models and Support Vector Machines require a large sample size to achieve maximum prediction accuracy, while Bayesian networks may need only a relatively small data set.
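As an illustration of this sample-size contrast (not an experiment from this paper), the following minimal scikit-learn sketch trains an SVM and Gaussian Naive Bayes on growing slices of a synthetic data set; all sizes and figures are assumptions for demonstration only:

```python
# Illustrative sketch: accuracy vs. training-set size for SVM and Naive Bayes.
# Synthetic data; the specific numbers are not from the paper.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=0)

for n in (50, 200, 1000, len(X_train)):
    svm = SVC().fit(X_train[:n], y_train[:n])          # needs many samples
    nb = GaussianNB().fit(X_train[:n], y_train[:n])    # plateaus early
    print(f"n={n:4d}  SVM={svm.score(X_test, y_test):.3f}"
          f"  NB={nb.score(X_test, y_test):.3f}")
```

On such runs, Naive Bayes typically approaches its best accuracy with a small fraction of the data, whereas the SVM keeps improving as more samples are added.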
There is general agreement that the k-nearest neighbor (kNN) algorithm is very sensitive to irrelevant features: this sensitivity follows directly from the way the algorithm works, since every feature contributes equally to the distance computation. Likewise, the presence of irrelevant features can make the training of a neural network very inefficient, even impractical. Most decision tree algorithms cannot handle problems that require diagonal partitions (Sathya & Abraham, 2013): each split of the instance space is orthogonal to the axis of one variable and parallel to all other axes, so the resulting regions after partitioning are all hyper-rectangles. Artificial neural networks and support vector machines work well when multicollinearity is present and there is a non-linear relationship between the input and output features.
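The kNN sensitivity can be made concrete with a minimal sketch (scikit-learn assumed; the noise dimensions and scores are illustrative): appending pure-noise columns dilutes the distance metric and typically lowers cross-validated accuracy.

```python
# Illustrative sketch: kNN accuracy before and after adding irrelevant features.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.RandomState(0)
X, y = make_classification(n_samples=500, n_features=5, n_informative=5,
                           n_redundant=0, random_state=0)
X_noisy = np.hstack([X, rng.randn(500, 50)])  # 50 pure-noise features

knn = KNeighborsClassifier(n_neighbors=5)
print("informative features only:", cross_val_score(knn, X, y).mean())
print("with 50 noise features:   ", cross_val_score(knn, X_noisy, y).mean())
```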
Naive Bayes (NB) requires little storage space during both the training and classification stages: the strict minimum is the memory needed to store the prior and conditional probabilities. The basic kNN algorithm uses a large amount of storage space for the training phase (Cao et al., 2019), and its execution space is at least as large as its training space. On the contrary, for all non-lazy learners, execution space is usually much smaller than training space, since the resulting classifier is often a very condensed summary of the data. Moreover, Naive Bayes and kNN can easily be used as incremental learners, whereas rule-based algorithms cannot.
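As a sketch of the incremental-learning point (assuming scikit-learn's `partial_fit` interface; batch sizes are arbitrary), Naive Bayes can update its stored statistics batch by batch, so the full training set never has to be held in memory at once:

```python
# Illustrative sketch: incremental Naive Bayes via partial_fit.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=3000, n_features=10, random_state=0)
classes = np.unique(y)  # partial_fit needs the full label set on the first call

nb = GaussianNB()
for start in range(0, len(X), 500):   # feed the data in 500-row batches
    batch = slice(start, start + 500)
    nb.partial_fit(X[batch], y[batch], classes=classes)

print("accuracy on training data:", nb.score(X, y))
```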
Naive Bayes is naturally robust to missing values, since these are simply ignored in the probability calculations and therefore have no impact on the final decision. On the contrary, kNN and neural networks require complete records to do their job.
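The mechanism is easy to see in a hand-rolled Gaussian NB scorer (a sketch, not the paper's implementation, with toy parameters): a feature that is NaN is skipped in the sum of log-probabilities, so it contributes nothing to the decision.

```python
# Illustrative sketch: a NaN feature is simply skipped in the NB score.
import numpy as np

def nb_predict(x, priors, means, stds):
    """Gaussian NB for one sample; missing (NaN) features are ignored."""
    best_class, best_score = None, -np.inf
    for c in priors:
        score = np.log(priors[c])
        for j, v in enumerate(x):
            if np.isnan(v):           # missing value: skip this feature
                continue
            score += -0.5 * ((v - means[c][j]) / stds[c][j]) ** 2 \
                     - np.log(stds[c][j])
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# Toy parameters for two classes over three features (illustrative numbers).
priors = {0: 0.5, 1: 0.5}
means  = {0: [0.0, 0.0, 0.0], 1: [2.0, 2.0, 2.0]}
stds   = {0: [1.0, 1.0, 1.0], 1: [1.0, 1.0, 1.0]}

print(nb_predict([1.8, np.nan, 2.1], priors, means, stds))  # -> 1 despite the NaN
```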
Finally, decision trees and NB generally have different operational profiles: when one is very accurate, the other often is not, and vice versa. In contrast, decision trees and rule classifiers have a similar operational profile. SVM and ANN also have a similar operational profile.
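One practical consequence of such complementary profiles (the combination below is a sketch motivated by this observation, not a method from the paper) is that soft-voting over a decision tree and Naive Bayes lets each model compensate where the other is weak:

```python
# Illustrative sketch: combining classifiers with complementary profiles.
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
ensemble = VotingClassifier(
    estimators=[("dt", DecisionTreeClassifier(random_state=0)),
                ("nb", GaussianNB())],
    voting="soft",   # average the predicted class probabilities
)
print("ensemble CV accuracy:", cross_val_score(ensemble, X, y).mean())
```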