RECONSTRUCTION OF PHYSICAL DANCE
TEACHING CONTENT AND MOVEMENT
RECOGNITION BASED ON A MACHINE
LEARNING MODEL
Lei Li
Football College, Wuhan Sports University, Wuhan, Hubei, 430079, China
Tingting Yang*
School of Journalism and Communication, Wuhan Sports University, Wuhan,
Hubei, 430079, China
ytt342784081@163.com
Reception: 14/11/2022 Acceptance: 17/01/2023 Publication: 05/03/2023
Suggested citation:
L., Lei and Y., Tingting (2023). Reconstruction of physical dance teaching
content and movement recognition based on a machine learning model.
3C TIC. Cuadernos de desarrollo aplicados a las TIC, 12(1), 267-285.
https://doi.org/10.17993/3ctic.2023.121.267-285
https://doi.org/10.17993/3ctic.2023.121.267-285
3C TIC. Cuadernos de desarrollo aplicados a las TIC. ISSN: 2254-6529
Ed.42 | Iss.12 | N.1 January - March 2023
267
ABSTRACT
As movement recognition based on machine learning models advances, the content and
movements of physical dance teaching are also open to change and innovation. This
paper constructs a three-dimensional convolutional neural network (3D CNN)
recognition algorithm based on a machine learning model, covering the full process
from the collection to the recognition of sports dance movement data. Skeleton
information for typical physical dance movements is collected to build a
typical-movement dataset, which is recognized by the improved 3D convolutional
neural network recognition algorithm under the machine learning model, and the
method is also validated on public datasets. The experimental results show that the
3D CNN in this paper recognizes sports dance actions with high accuracy, which
verifies the feasibility of the 3D convolutional neural network action recognition
algorithm under the machine learning model for the process from acquisition to
recognition of sports dance actions. This suggests that machine learning models of
this form can open a new direction for physical dance education content.
KEYWORDS
machine learning model; sports dance movements; DDPG algorithm model; 3D
convolutional neural network movement recognition algorithm; movement skeleton
information dataset
PAPER INDEX
ABSTRACT
KEYWORDS
1. INTRODUCTION
2. MACHINE LEARNING MODELS
2.1. The DDPG algorithm model used for machine learning
2.2. Observed and rewarded values of machine learning model algorithms
2.3. Three-dimensional convolutional neural network
3. METHODS OF RECONSTRUCTING THE CONTENT AND MOVEMENT
IDENTIFICATION OF PHYSICAL DANCE TEACHING
3.1. Physical dance movement data pre-processing
3.2. Experimental data analysis methods
4. EXPERIMENTAL RESULTS AND ANALYSIS
5. CONCLUSION
DATA AVAILABILITY
CONFLICT OF INTEREST
REFERENCES
1. INTRODUCTION
With the boom of artificial intelligence, academics have begun to explore the
research use of machine learning models in various fields in the hope of reducing
human costs and improving output efficiency through these techniques [1-2]. In the
era of big data, machine learning models that can model relevant variables with
complex relationships will surely become the mainstream of scientific research in the
future [3]. Machine learning is a leading area of science and technology that studies
how computers can simulate or implement human learning behaviors [4]. Machine
learning models are trained to acquire new knowledge or skills and reorganize the
existing knowledge structure to continuously improve their performance [5]. With the
advent of the third wave of artificial intelligence, machine learning models, which are
the core of artificial intelligence, have started to appear frequently in the limelight [6].
Machine learning models are models that make predictions about unknown data
based on known data [7]. Machine learning models usually divide the original data set
into a training set and a test set. The data are then fitted and optimized several times
in the training set to build the model that best reflects the characteristics of the data,
and finally, the performance of the model is evaluated in the test set to verify the
generalization ability and reliability of the model [8-11].
In recent years, machine learning models have been applied to various fields with
remarkable results and their predictive reliability has been widely recognized by
researchers. In the literature [12], two sets of classification models were developed to
predict students' academic performance at graduation using individual course grades
and GPA, respectively. Logistic regression, random forest and naive Bayes
algorithms were used to build the prediction models for academic performance.
The literature [13-14] proposed that machine learning models predict students'
academic performance more accurately when multiple factors are combined than
when a single factor is used. The literature [15] used machine learning methods to
predict changes in major depressive and generalized anxiety disorder symptoms from
pre-treatment to 9-month follow-up. The literature [16] revealed the relationship
between Internet use behaviors and academic performance and used machine
learning models to predict the academic performance of college students with these
behavioral data. The literature [17] reviewed different machine learning
algorithms and their application in cardiovascular diseases and found that the
application of machine learning can increase the understanding of different types of
heart failure and congenital heart disease. The literature [18] proposed a
multidimensional approach based on GPS measurements and machine learning to
predict injuries in professional soccer. It also provided a simple and practical method
to assess and interpret the complex relationship between sports injury risk and
training performance in professional soccer. The literature [19] analyzed how to
predict and prevent financial losses through public news and historical prices in the
Brazilian stock market through a machine learning model. The literature [20] uses
Recurrent Neural Networks (RNN) for stock market forecasting. RNN is a machine
learning model dedicated to time series, which can take into account past correlation
series when predicting future trends, and the authors introduced the model to the
analysis of financial time series and tested it on the Nikkei index, proving the
usefulness of the approach. The literature [21] uses oil price movements, gold and
silver price movements, and foreign exchange movements as features to demonstrate
that the KSE-100 index can be predicted by machine learning algorithms and that the
multilayer perceptron outperforms the other algorithms among the machine learning
algorithms used. The literature [22] applied machine learning algorithms such as deep
learning, random forests, neural networks, and support vector regression machines to
stock index prediction in the UK stock market and found that deep learning gave the
best prediction results and support vector machines gave the second best results. The
literature [23] used the XGBoost algorithm for stock index futures forecasting and
compared it with LSTM and traditional autoregressive time series processing
methods, and the empirical results showed that the XGBoost algorithm was superior
in determining the ups and downs. Research on dance movements also abounds. The
literature [24] proposes C3D, a modern deep architecture that can learn on
large-scale datasets; the C3D method with a linear classifier outperforms or
approaches the then state-of-the-art methods in recognition accuracy on action
videos. The literature [25] uses convolutional neural networks (CNNs) for the
classification of Indian classical dance movements. Two hundred dance poses and
gestures were collected from online videos and offline recordings, respectively, and
the experiments were compared with the results of other classification algorithms on
the same dataset, finally obtaining a recognition rate of 93.33%. In [26], six Greek folk
dance movements were collected using a Kinect sensor, and four common classifiers
were used to directly classify the movements on the raw data to compare the
classification results, and the effect of different body joints on the recognition rate was
also investigated. The literature [27] studied motion data acquisition devices, among
which the Kinect depth vision sensor device has the advantages of high depth map
resolution, low cost, and the ability to directly track the human skeleton motion
trajectory.
In this paper, a typical-movement dataset of sports dance is constructed by
collecting skeleton information for typical sports dance movements, and the dataset
is recognized using a 3D convolutional neural network recognition algorithm based on
a machine learning model. The raw data recorded in the dataset are first checked for
missing skeleton points. Where skeleton point data were lost during recording, the
missing data of a frame are filled with the skeleton data of the previous frame, to
reduce the effect of missing data on how the subsequent machine learning model
judges different movements. The skeletal sequences of typical sports dance movements
are recorded at 30 fps. Because each movement in the dataset has a different
duration, the corresponding number of frames also differs. The sequence length is
therefore unified: the maximum number of frames among the samples in the dataset is
found, and every sample with fewer frames is padded by repeating its last frame until
it reaches that maximum, so that it can be input into the machine learning model for
training. After the dataset is completed, it is input
into the machine learning model, and the 3D convolutional neural network recognition
algorithm is trained on it. The experimental results show that the 3D CNN in this
paper achieves higher recognition accuracy than several other action recognition
algorithms, which can support the innovative development of physical dance teaching
content and offers another perspective on physical dance teaching methods. The
results verify that the 3D convolutional neural network action recognition algorithm
under the machine learning model is feasible for the process from acquisition to
recognition of physical dance actions. Machine learning models of this form can thus
open a new direction for physical dance education content and provide an optional
path for the reconstruction of physical dance teaching content and action
recognition.
2. MACHINE LEARNING MODELS
With the development of information technology, intelligent algorithms represented
by reinforcement learning are increasingly used in the field of robot control because
of their adaptive characteristics. In recent years, DQN algorithms, which combine
deep neural networks with reinforcement learning, have been proposed to solve the
problem of high-dimensional input to machine learning models. However, DQN is still
oriented to discrete control and handles continuous actions poorly. In the practical
control of machine learning models, the angular output of each joint is a continuous
value. If the range of each joint angle is discretized, the number of possible
behaviors grows exponentially with the number of degrees of freedom, and making the
discretization finer increases it further [28-29].
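The growth described above can be made concrete with a minimal sketch. It assumes, for illustration only, that every joint angle is discretized into the same number of levels; the function name and the example values are hypothetical and do not come from the paper.

```python
def count_discrete_actions(levels_per_joint: int, num_joints: int) -> int:
    """Number of joint-angle combinations when every joint is discretized
    into `levels_per_joint` values (an illustrative assumption)."""
    if levels_per_joint < 1 or num_joints < 0:
        raise ValueError("levels_per_joint must be >= 1 and num_joints >= 0")
    return levels_per_joint ** num_joints


if __name__ == "__main__":
    # Doubling the resolution from 10 to 20 levels multiplies the number of
    # behaviors of a 6-joint model by 2**6 = 64.
    for levels in (10, 20):
        print(levels, count_discrete_actions(levels, 6))
```

Under these assumptions a table-based or DQN-style output layer would need one entry per combination, which is the motivation for a continuous-action method such as DDPG.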
2.1. THE DDPG ALGORITHM MODEL USED FOR MACHINE
LEARNING
DDPG is an Actor-Critic based reinforcement learning algorithm. Its core idea is to
use the Actor network to generate the behavior policy of the agent, and the Critic
network to judge how good or bad the actions are and to guide the direction in which
the actions are updated.
Figure 1. DDPG algorithm structure
As shown in Figure 1, the DDPG algorithm structure contains an Actor network with
parameters $\theta^{\pi}$ and a Critic network with parameters $\theta^{Q}$, which
compute the deterministic policy and the action-value function, respectively. Since
the learning process of a single network is not stable, the Actor network and the
Critic network are each subdivided into a realistic (online) network and a target
network, drawing on the successful experience of DQN with fixed target networks. The
realistic network and the target network have the same structure, and the target
network parameters are softly updated from the realistic network parameters at a
certain frequency.
The loss function of the realistic (online) Critic network is:
$$J(\theta^{Q}) = \frac{1}{m}\sum_{j=1}^{m}\omega_j\left(y_j - Q(S_t^j, A_t^j, \theta^{Q})\right)^2 \tag{1}$$
where:
$$y_j = \begin{cases} R_j, & \text{end} \\ R_j + \gamma\, Q'(S_{t+1}^j, A_{t+1}^j, \theta^{Q'}), & \text{not end} \end{cases} \tag{2}$$
Here $m$ is the number of samples, $\omega_j$ is the weight of sample $j$,
$Q(S_t^j, A_t^j, \theta^{Q})$ is the action value calculated by the realistic Critic
network when sample $j$ takes action $A_t$ in state $S_t$, $y_j$ is the target action
value of sample $j$ derived from the target Critic network, $R_j$ is the immediate
reward obtained by sample $j$ for taking action $A_t$ in state $S_t$, and $\gamma$ is
the discount factor.
The loss function of the realistic Actor network is:
$$J(\theta^{\pi}) = -\frac{1}{m}\sum_{j=1}^{m} Q(S_t^j, A_t^j, \theta^{Q}) \tag{3}$$
Minimizing $J(\theta^{\pi})$ by gradient descent is equivalent to maximizing the
action value $Q(S_t^j, A_t^j, \theta^{Q})$.
The target Critic network and target Actor network parameters are updated in the
following way:
$$\theta^{Q'} \leftarrow \tau\theta^{Q} + (1-\tau)\theta^{Q'} \tag{4}$$
$$\theta^{\pi'} \leftarrow \tau\theta^{\pi} + (1-\tau)\theta^{\pi'} \tag{5}$$
where $\tau$ is the update coefficient, taken in the range 0.01 to 0.1 to avoid
excessive parameter changes.
When a machine learning model has many degrees of freedom, an additional exploration
strategy, maximum a posteriori policy optimization (MPO), is needed on top of the
DDPG algorithm model to improve the efficiency of the machine learning model. MPO
models the reinforcement learning problem as an inference problem from a
probabilistic point of view. Let $p_{\pi}(O=1)$ denote the probability of completing
the task; according to the inference problem, its logarithm satisfies:
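$$\log p_{\pi}(O=1) = \log \int p_{\pi}(\tau)\, p(O=1 \mid \tau)\, d\tau \geq \int q(\tau)\left[\log p(O=1 \mid \tau) + \log\frac{p_{\pi}(\tau)}{q(\tau)}\right] d\tau \tag{6}$$
Let the loss function be:
$$J(q, \pi) = E_{q}\left[\sum_{t} r_t/\alpha\right] - \mathrm{KL}\left(q(\tau)\,\|\,p_{\pi}(\tau)\right) \tag{7}$$
where $q(\tau)$ is the proposal distribution. The optimization problem can be solved
by the EM method, and the optimal proposal distribution $q(\tau)$ is obtained in the
E-step. The non-parametric solution in this step is:
$$q_i(a \mid s) \propto \pi(a \mid s, \theta_i)\exp\left(\frac{Q_{\theta_i}(s, a)}{\eta^{*}}\right) \tag{8}$$
where the optimal temperature term $\eta^{*}$ is obtained by minimizing equation (9)
with respect to $\eta$:
$$g(\eta) = \eta\varepsilon + \eta\int \mu(s)\log\int \pi(a \mid s, \theta_i)\exp\left(\frac{Q_{\theta_i}(s, a)}{\eta}\right) da\, ds \tag{9}$$
In the M-step, the optimal proposal distribution is used to update the neural network
policy:
$$\max_{\theta} J(q_i, \theta) = \max_{\theta} E_{\mu_q(s)}\left[E_{q(a \mid s)}\left[\log \pi(a \mid s, \theta)\right]\right] + \log p(\theta) \tag{10}$$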
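The DDPG quantities of equations (1) to (5) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: plain arrays stand in for the Actor and Critic networks, the function names are hypothetical, and the default values of `gamma` and `tau` are assumptions chosen only for the example (the paper states only that the update coefficient lies between 0.01 and 0.1).

```python
import numpy as np


def critic_targets(rewards, next_q, done, gamma=0.99):
    """Equation (2): y_j = R_j at episode end, else R_j + gamma * Q'(S', A')."""
    rewards = np.asarray(rewards, dtype=float)
    next_q = np.asarray(next_q, dtype=float)
    done = np.asarray(done, dtype=bool)
    return np.where(done, rewards, rewards + gamma * next_q)


def critic_loss(targets, q_values, weights=None):
    """Equation (1): weighted mean squared error over the m samples."""
    targets = np.asarray(targets, dtype=float)
    q_values = np.asarray(q_values, dtype=float)
    if weights is None:
        weights = np.ones_like(targets)  # equal sample weights (assumption)
    return float(np.mean(weights * (targets - q_values) ** 2))


def actor_loss(q_values):
    """Equation (3): minimizing -mean(Q) maximizes the action value."""
    return float(-np.mean(np.asarray(q_values, dtype=float)))


def soft_update(target_params, online_params, tau=0.01):
    """Equations (4)-(5): theta' <- tau * theta + (1 - tau) * theta'."""
    return [tau * p + (1.0 - tau) * tp
            for tp, p in zip(target_params, online_params)]
```

In a full implementation `next_q` would come from the target Critic evaluated at the action proposed by the target Actor, and the two losses would be minimized by an optimizer; the sketch only shows how the formulas fit together.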
2.2. OBSERVED AND REWARDED VALUES OF MACHINE
LEARNING MODEL ALGORITHMS
To give full play to the performance advantages of the DDPG algorithm, this paper
takes into account the motion state of the machine learning model, the processing
efficiency of the agent, the environment and other factors, and selects the
observations shown in Table 1.
Table 1. Learning process observations
Parameter                            Symbol
Machine learning model location      $[x, y, z]$
Machine learning model pose angle    $[roll, pitch, yaw]$
Machine learning model speed         $[v_x, v_y, v_z]$
Contact force                        $[F_{Nl}, F_{Nr}]$
Joint torque output                  $[t_{hipl}, t_{hipr}, t_{kneel}, t_{kneer}, t_{tirel}, t_{tirel}]$
Last joint moment output             $[t'_{hipl}, t'_{hipr}, t'_{kneel}, t'_{kneer}, t'_{tirel}, t'_{tirel}]$
The following weighted reward values $r$ are designed based on the observed values,
where $k_1$ to $k_9$ are the weights.
(1) The speed reward value $r_v$, which encourages the machine learning model to move
forward, is shown in equation (11):
$$r_v = k_1 v_x \tag{11}$$
where $v_x$ is the velocity of the machine learning model along the $x$ axis.
(2) The stability reward value $r_s$ rewards the machine learning model for
completing smooth motions in both instantaneous and global decisions, as shown in
equation (12):
$$r_s = -k_2 (y - y_{init})^2 - k_3 (z - z_{init})^2 - k_4\, roll^2 - k_5\, pitch^2 - k_6\, yaw^2 \tag{12}$$
where $y, z$ denote the current position of the machine learning model on the $y$ and
$z$ axes, $y_{init}, z_{init}$ denote its initial position on the $y$ and $z$ axes,
and $roll, pitch, yaw$ denote the attitude angles (roll angle, pitch angle, yaw
angle) of the machine learning model.
(3) The joint stability reward value $r_{js}$, which aims to improve the energy
utilization efficiency of the robot, is shown in equation (13):
$$r_{js} = -k_7 \sum_{i} (t_i - t'_i)^2 \tag{13}$$
where $t_i$ and $t'_i$ represent the torque output of joint $i$ and its torque output
at the previous time step.
(4) The touchdown reward value $r_F$ rewards the machine learning model for keeping
the contact forces between its two feet and the ground equal, which reduces the
probability that training generates singular motion poses, as shown in equation (14):
$$r_F = -k_8 (F_{Nl} - F_{Nr})^2 \tag{14}$$
where $F_{Nl}, F_{Nr}$ denote the contact forces between the left and right feet of
the machine learning model and the ground, respectively.
(5) The motion duration reward value $r_c$ encourages the machine learning model to
keep moving, as shown in equation (15):
$$r_c = k_9 T \tag{15}$$
where $T$ is a constant that may take any value.
In equations (12) to (14) the squared deviations enter with a negative sign, so that
larger deviations lower the reward.
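The five reward terms can be sketched as a small function over the observations of Table 1. This is an illustrative sketch rather than the authors' code: the dictionary keys and function names are hypothetical, the negative signs on the deviation terms are an assumption (deviations are treated as penalties), and summing the terms into one scalar is also an assumption, since the text only lists the individual weighted values.

```python
import numpy as np


def reward_terms(obs, k, y_init, z_init, T=1.0):
    """Return the five reward terms of equations (11)-(15) as a dict.

    obs: dict with keys 'position' [x, y, z], 'attitude' [roll, pitch, yaw],
         'velocity' [vx, vy, vz], 'contact' [F_Nl, F_Nr],
         'torque' and 'prev_torque' (one value per joint).
    k:   sequence of the nine weights k1..k9 (k[0] is k1).
    The minus signs below are an assumption: deviations are penalized.
    """
    _, y, z = obs["position"]
    roll, pitch, yaw = obs["attitude"]
    vx = obs["velocity"][0]
    f_left, f_right = obs["contact"]
    torque = np.asarray(obs["torque"], dtype=float)
    prev_torque = np.asarray(obs["prev_torque"], dtype=float)

    return {
        "speed": k[0] * vx,
        "stability": -(k[1] * (y - y_init) ** 2 + k[2] * (z - z_init) ** 2
                       + k[3] * roll ** 2 + k[4] * pitch ** 2 + k[5] * yaw ** 2),
        "joint": -k[6] * float(np.sum((torque - prev_torque) ** 2)),
        "contact": -k[7] * (f_left - f_right) ** 2,
        "duration": k[8] * T,
    }


def total_reward(obs, k, y_init, z_init, T=1.0):
    """Sum of the five terms (summing them is an assumption of this sketch)."""
    return float(sum(reward_terms(obs, k, y_init, z_init, T).values()))
```

Keeping the terms in a dictionary makes it easy to log each contribution separately while tuning the weights.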
2.3. THREE-DIMENSIONAL CONVOLUTIONAL NEURAL
NETWORK
Compared with other traditional deep learning methods, 3D convolutional neural
networks are not limited to 2D single-frame image input. They extract features along
both the temporal and spatial dimensions, and can therefore capture the motion
information of multiple consecutive frames after the reinforcement learning algorithm
of the machine learning model has been applied.
The three-dimensional deep convolutional neural network used in this paper consists
of four convolutional layers, two downsampling layers, two fully connected layers and
one Softmax classification layer. The downsampling layers use max pooling with a
kernel size of 3×3×3 and a stride of 1, as shown in Figure 2.
Figure 2. Three-dimensional convolutional neural network framework
To capture the sports dance movement information in multiple consecutive frames, the
features are computed over both the spatial and temporal dimensions. The value of the
cell at position $(x, y, z)$ in the $j$th feature map of the $i$th layer is given by
the following equation:
$$v_{ij}^{xyz} = f\left(b_{ij} + \sum_{r}\sum_{l=0}^{l_i-1}\sum_{m=0}^{m_i-1}\sum_{n=0}^{n_i-1}\omega_{ijr}^{lmn}\, v_{(i-1)r}^{(x+l)(y+m)(z+n)}\right) \tag{16}$$
where $f$ is the activation function, $b_{ij}$ is the bias of the feature map, $n_i$
is the size of the 3D convolution kernel along the time dimension, and
$\omega_{ijr}^{lmn}$ is the weight at position $(l, m, n)$ of the convolution kernel
connected to the $r$th feature map of the previous layer.
The ReLU function is the most commonly used activation function in deep machine
learning models. It keeps the input feature value unchanged when it is greater than 0
and sets it to 0 otherwise. This one-sided suppression makes the activations of the
network sparse, which reduces the risk of overfitting to some extent. In addition,
the derivative of ReLU is very simple to compute, which speeds up computation, and it
is always 1 when the input is positive, which effectively alleviates the vanishing
gradient problem. The ReLU activation function is defined as:
$$f(x) = \max(0, x) = \begin{cases} 0, & x \leq 0 \\ x, & x > 0 \end{cases} \tag{17}$$
Pooling is also known as downsampling. Unlike the processing of 2D images, pooling
here also takes into account the information of the video in the temporal dimension.
Pooling shrinks the feature map, which reduces the dimensionality of the data and the
amount of computation, making the network easier to train and improving accuracy:
$$V_{x,y,z} = \max_{0 \leq i \leq s_1,\ 0 \leq j \leq s_2,\ 0 \leq k \leq s_3}\left(\mu_{x \times s + i,\ y \times t + j,\ z \times r + k}\right) \tag{18}$$
where $\mu$ is the input to the pooling layer, $V$ is its output, $s$, $t$ and $r$
are the sampling strides in the three directions, and $s_1$, $s_2$ and $s_3$ bound
the pooling window.
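The three operations of this subsection, 3D convolution (16), ReLU (17) and 3D max pooling (18), can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the network used in the paper: it handles a single input and a single output feature map, uses no padding, and the function names and explicit loops are chosen for readability rather than speed.

```python
import numpy as np


def relu(x):
    """Equation (17): keep positive values, set the rest to 0."""
    return np.maximum(0.0, x)


def conv3d_single(volume, kernel, bias=0.0):
    """Equation (16) for one input and one output feature map
    (no padding, stride 1; both are assumptions of this sketch)."""
    volume = np.asarray(volume, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    kl, km, kn = kernel.shape
    out_shape = (volume.shape[0] - kl + 1,
                 volume.shape[1] - km + 1,
                 volume.shape[2] - kn + 1)
    out = np.empty(out_shape)
    for x in range(out_shape[0]):
        for y in range(out_shape[1]):
            for z in range(out_shape[2]):
                patch = volume[x:x + kl, y:y + km, z:z + kn]
                out[x, y, z] = bias + np.sum(kernel * patch)
    return relu(out)


def max_pool3d(volume, window=(3, 3, 3), stride=(1, 1, 1)):
    """Equation (18): maximum over a 3D window moved with the given strides."""
    volume = np.asarray(volume, dtype=float)
    out_shape = tuple((volume.shape[d] - window[d]) // stride[d] + 1
                      for d in range(3))
    out = np.empty(out_shape)
    for x in range(out_shape[0]):
        for y in range(out_shape[1]):
            for z in range(out_shape[2]):
                xs, ys, zs = x * stride[0], y * stride[1], z * stride[2]
                out[x, y, z] = volume[xs:xs + window[0],
                                      ys:ys + window[1],
                                      zs:zs + window[2]].max()
    return out
```

The default `window` and `stride` follow the 3×3×3 kernel and step size of 1 stated for the downsampling layers; a practical model would use a deep learning framework's 3D convolution and pooling layers instead of these loops.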
The Softmax function is often used in the last layer of a classification task to map
an n-dimensional vector x to a probability distribution such that the probability of the
correct category tends to 1 and the probability of the other categories tends to 0, and
the sum of the probabilities of all categories is 1.
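The mapping described here is $\mathrm{softmax}(x)_i = e^{x_i} / \sum_k e^{x_k}$. A minimal sketch follows; subtracting the maximum before exponentiation is a common numerical-stability measure and an addition of this sketch, not something stated in the paper.

```python
import numpy as np


def softmax(x):
    """Map a score vector to probabilities that are positive and sum to 1."""
    x = np.asarray(x, dtype=float)
    shifted = x - np.max(x)  # subtracting the max avoids overflow in exp
    exp = np.exp(shifted)
    return exp / np.sum(exp)
```

The predicted category is then the index of the largest probability, for example `int(np.argmax(softmax(scores)))`.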
The Dropout strategy temporarily removes neurons of the deep model from the network,
with a certain probability, while the model is being trained, so that the dropped
neurons take part neither in forward propagation nor in the weight updates of
backpropagation. Dropout can therefore be regarded as an ensemble method, i.e., one
that averages different model architectures over the training iterations, thus
significantly reducing the risk of overfitting.
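A minimal sketch of the Dropout step is given below. It uses the common "inverted dropout" variant, in which surviving activations are rescaled during training so that nothing needs to change at test time; this rescaling is an assumption of the sketch, as the paper does not say which variant is used.

```python
import numpy as np


def dropout(activations, drop_prob, rng, training=True):
    """Inverted dropout: zero each unit with probability drop_prob during
    training and rescale the survivors; identity at inference time."""
    activations = np.asarray(activations, dtype=float)
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if not training or drop_prob == 0.0:
        return activations
    keep_mask = rng.random(activations.shape) >= drop_prob
    return activations * keep_mask / (1.0 - drop_prob)
```

With `rng = np.random.default_rng(0)` and a ratio of 0.5, roughly half of the activations are set to zero in each training pass and the rest are doubled, so the expected activation is unchanged.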
3. METHODS OF RECONSTRUCTING THE CONTENT
AND MOVEMENT IDENTIFICATION OF PHYSICAL
DANCE TEACHING
At present, there are many action recognition methods based on deep learning.
Commonly used classical action recognition algorithms include the C3D algorithm, the
P3D ResNet algorithm and the ConvLSTM algorithm, all of which perform well on action
recognition. To recognize the movements in physical dance teaching content, this
paper first constructs a physical dance movement dataset that records the skeleton
information of dancers; recognition from skeleton information can be processed faster
and requires less storage space. Second, based on the machine learning model and the
3D convolutional neural network, an improved model is proposed for the recognition of
physical dance movements. The Dropout technique is also used to reduce overfitting,
and different Dropout ratios are compared experimentally.
The experiments use 5-fold cross-validation: the pre-processed dataset of typical
sports dance movements is randomly divided into 5 groups, and in each run 4 groups
are used as the training set and 1 group as the test set. The results of the 5 runs
are then averaged to give the recognition rate.
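The 5-fold procedure can be sketched as follows. The function names, the fixed random seed and the `evaluate_fold` callback are hypothetical; the callback stands in for training the model on the four training groups and returning its accuracy on the held-out group.

```python
import numpy as np


def k_fold_indices(num_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k random, disjoint folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(num_samples), k)
    for i in range(k):
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, folds[i]


def cross_validated_accuracy(evaluate_fold, num_samples, k=5, seed=0):
    """Average the accuracy returned by evaluate_fold(train_idx, test_idx)."""
    scores = [evaluate_fold(train_idx, test_idx)
              for train_idx, test_idx in k_fold_indices(num_samples, k, seed)]
    return float(np.mean(scores))
```

For a dataset of 640 samples, each of the five runs would train on 512 samples and test on the remaining 128, and every sample is used for testing exactly once.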
3.1. PHYSICAL DANCE MOVEMENT DATA PRE-PROCESSING
The recorded raw data were first checked for missing skeletal data. To reduce the
effect of missing data on how the subsequent machine learning model judges different
movements, the missing skeletal data of a frame were filled with the skeletal data of
the previous frame. To reduce redundant information, only one frame out of every five
frames of the original data was kept. The maximum number of frames for each action in
the dataset is shown in Table 2. The remaining action samples with fewer than the
maximum number of frames are padded by repeating their last frame until they reach
the maximum number of frames, for subsequent input into the machine learning model
for training and learning.
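The three pre-processing steps described above (filling missing skeleton points, keeping one frame in five, and padding to the maximum length) can be sketched as follows. The array layout `(frames, joints, 3)`, the use of NaN to mark lost skeleton points, and the function names are assumptions made for this illustration.

```python
import numpy as np


def fill_missing_points(sequence):
    """Fill missing skeleton points (NaN) with the values of the previous frame.
    sequence: array of shape (frames, joints, 3); NaN marks lost points."""
    filled = np.array(sequence, dtype=float, copy=True)
    for t in range(1, filled.shape[0]):
        missing = np.isnan(filled[t])
        filled[t][missing] = filled[t - 1][missing]
    return filled


def downsample(sequence, step=5):
    """Keep one frame out of every `step` frames."""
    return np.asarray(sequence)[::step]


def pad_to_length(sequence, max_frames):
    """Repeat the last frame until the sequence has max_frames frames."""
    sequence = np.asarray(sequence, dtype=float)
    missing = max_frames - sequence.shape[0]
    if missing <= 0:
        return sequence[:max_frames]  # guard only; no sample should exceed the maximum
    tail = np.repeat(sequence[-1:], missing, axis=0)
    return np.concatenate([sequence, tail], axis=0)
```

Applied in this order, every sample of a given movement ends up with the same number of frames and can be stacked into one input tensor for the network. The sketch leaves a missing first frame untouched, since it has no previous frame to copy from.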
Table 2. Sports dance motion dataset: motion names and maximum number of frames
Dance type     Serial number   Action name                     Maximum number of frames
Modern dance   1               Next to the T-step step         62
               2               Z-step                          37
               3               Pendulum                        60
               4               Side-click steps                30
               5               Single step                     33
Waltz          6               Step back                       50
               7               Lift the step                   40
               8               One-handed one-handed step      42
               9               Single step                     36
Tango          10              Cats wash their faces           25
               11              Alternate cover hands           30
               12              Flick your fingers              35
Latin dance    13              Flat step                       48
               14              Drag                            31
               15              Shrug                           40
               16              Crankulated arms                38
Samba          17              High shake hands                35
               18              Alternate waving hands          32
               19              Wave your hands                 50
               20              Draw circles with both hands    24
3.2. EXPERIMENTAL DATA ANALYSIS METHODS
Using the test set of the dataset, the recognition results of each method for the 20
typical sports dance movements are evaluated in terms of accuracy. First, the
recognition results of the 3D convolutional neural network recognition method under
the machine learning model of this paper are computed per movement, which clearly
expresses the categories and numbers of correct and incorrect recognitions of each
movement. Then the overall recognition accuracy of the recognition algorithm is
calculated on three datasets, namely the UTKinect dataset, the MSRAction3D dataset
and the TYYDance sports dance dataset of this paper. Finally, the method of this
paper is compared with the classical algorithms, and the corresponding recognition
results and conclusions are drawn.
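The per-movement counts of correct and incorrect recognitions and the overall accuracy can be computed from a confusion matrix, as in the following sketch. It assumes that movement categories are encoded as integer labels starting at 0; the function names are illustrative.

```python
import numpy as np


def confusion_matrix(true_labels, predicted_labels, num_classes):
    """Entry [i, j] counts samples of class i recognized as class j."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(true_labels, predicted_labels):
        matrix[t, p] += 1
    return matrix


def overall_accuracy(matrix):
    """Correct recognitions (the diagonal) divided by all samples."""
    return float(np.trace(matrix) / matrix.sum())


def per_class_accuracy(matrix):
    """Fraction of each class's samples that were recognized correctly."""
    totals = matrix.sum(axis=1)
    return np.divide(np.diag(matrix), totals,
                     out=np.zeros(len(totals)), where=totals > 0)
```

Off-diagonal entries show which movements are confused with one another, which is useful for spotting pairs of highly similar movements.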
4. EXPERIMENTAL RESULTS AND ANALYSIS
First, to alleviate overfitting, the Dropout technique was used. The Dropout ratio
was set to 0.2, 0.4, 0.5, 0.6 and 0.8 in turn, experiments were conducted on the
dataset of this paper, and the results on the validation set are shown in Figure 3.
After 50 iterations, the recognition accuracy with a Dropout ratio of 0.5 is slightly
higher than that with a ratio of 0.4.
Figure 3. Experimental results of different Dropout ratios
To verify the effectiveness of the proposed 3D convolutional neural network dance
movement recognition method based on the machine learning model, experiments were
conducted on the public UTKinect and MSRAction3D datasets and on the TYYDance sports
dance movement dataset of this paper. The UTKinect dataset contains 10 types of
movements with 220 movement samples, the MSRAction3D dataset contains 20 types of
movements with 540 movement samples, and the dance movement dataset in this paper
contains 20 types of dance movements with a total of 640 movement samples.
Figure 4 shows the recognition accuracy on the training set and Figure 5 shows the
recognition accuracy on the test set. An 82% recognition rate was obtained on the
UTKinect dataset, a 90% recognition rate on the MSRAction3D dataset, and a 96%
recognition rate on the TYYDance dataset. The reasons for the better recognition
results on the TYYDance dataset are as follows: (1) when the typical sports dance
movements were selected, movements with relatively large differences from one another
were chosen; (2) more movement samples were collected, so more data were available
for deep learning; (3) the collected dance movements last longer than general
movements, so more movement representations were obtained.
Figure 4. The training recognition rate of three data sets
Figure 5. The test recognition rate of three data sets
In the experimental process, the 3D convolutional neural network action recognition
algorithm under the machine learning model was trained on the training and validation
sets of all subjects, and the loss function curves were obtained as shown in Figure 6.
In the figure, the vertical axis is the loss value and the horizontal axis is the
number of iterations. Figure 6 shows that after 45 iterations the loss gradually
stabilizes at about 0.016, where the model reaches its best training effect.
Figure 6. The loss function of the training set and test set
Figure 7 shows the recognition accuracy curves of the training and validation sets
obtained experimentally. The recognition rate on the training set reaches 99.74%
after 45 iterations of model training, after which the curve remains stable, reaching
the best recognition effect of the model.
Figure 7. The recognition accuracy of the training set and validation set
Other classical algorithms also use skeleton data input, and compared with the 3D
convolutional neural network action recognition algorithm based on the machine
learning model, the results are shown in Table 3, the model used in this paper has
https://doi.org/10.17993/3ctic.2023.121.267-285
3C TIC. Cuadernos de desarrollo aplicados a las TIC. ISSN: 2254-6529
Ed.42 | Iss.12 | N.1 January - March 2023
281
higher recognition accuracy in the dataset of this paper, which is slightly higher than
P3D, about 6% higher than ConvLSTM, and about 3.5% higher than C3D(1net).
Table 3. Compare results with other methods
From the experimental results, we can see that using human skeleton information
can occupy less storage space to obtain good recognition results, so each model has
good recognition results in the dataset of this paper. Compared with the C3D
recognition algorithm model, this paper follows the convolutional kernel small and
large 3×3×
3, but there are 8 convolutional layers in C3D, and the data volume is too
large for the data set in this paper, so this paper adjusts the number of layers and the
number of convolutional kernels in the model to obtain better recognition results. This
is one of the reasons why the skeleton information of sports dance movements is
collected in this paper, and the good results obtained from the experiments in this
paper validate the experimental movement collection, as well as the data pre-
processing and the rationality of this data set.
5. CONCLUSION
Each kind of sports dance has its own unique culture and spirit. Dance is a way for
human beings to use body language to express emotions and convey feelings, and it is
a common language regardless of borders and race. In this paper, a complete process
from dance movement data collection to movement recognition is realized. A typical
movement dataset is constructed by collecting skeleton information for typical
movements of physical dance, which excludes interference from background, lighting
and other factors. The experimental results show that the 3D CNNs in this paper
produce satisfactory results for sports dance movement recognition and validate the
feasibility of the 3D convolutional neural network movement recognition algorithm
based on a machine learning model across the pipeline from acquisition to
recognition of sports dance movements. This can help open up the content of sports
dance education through machine learning models. However, the recognition results
still deviate for a few highly similar movements, and only some typical movement
fragments were collected as data samples for the dataset, so recognition over a
whole performance leaves considerable room for further research. Therefore, the next
research directions are to use more optimized deep learning algorithms to improve
the recognition of sports dance movements, to collect more sports dance movement
data to expand the dataset, and to recognize different dance movements in longer
dance performances.
Methods            Recognition accuracy
C3D (1 net)        91.27%
P3D ResNet         94.29%
ConvLSTM           90.13%
3D CNNs (ours)     96.91%
DATA AVAILABILITY
Data for this study are available from the authors upon request.
CONFLICT OF INTEREST
The authors declare that the research was conducted in the absence of any
commercial or financial relationships that could be construed as a potential conflict of
interest.
REFERENCES
(1) Lou, R., Lv, Z., Dang, S., Li, S., Zhang, Y., & Li, L. (2021). Application of
machine learning in ocean data. Multimedia Systems, 1-10.
(2) Guo, J. X., & Huang, C. (2020). Feasible roadmap for CCS retrofit of coal-
based power plants to reduce Chinese carbon emissions by 2050. Applied
Energy, 259, 114112.
(3) Zounemat-Kermani, M., Matta, E., Cominola, A., Giuliani, M., & Castelletti, A.
(2020). Neurocomputing in Surface Water Hydrology and Hydraulics: A
Review of Two Decades Retrospective, Current Status and Future
Prospects. Journal of Hydrology.
(4) Waring, J., Lindvall, C., & Umeton, R. (2020). Automated Machine Learning:
Review of the State-of-the-Art and Opportunities for Healthcare. Artificial
Intelligence in Medicine, 104, 101822.
(5) Bougie, N., & Ichise, R. (2020). Skill-based curiosity for intrinsically
motivated reinforcement learning. Machine Learning, 109(3), 493-512.
(6) Horton, M. B., Brady, C. J., Cavallerano, J., et al. (2020). Practice Guidelines
for Ocular Telehealth-Diabetic Retinopathy, Third Edition. Telemedicine and
e-Health, 26(4).
(7) Shuntaro, C., Kenjirowelq, L., Narin, S., et al. (2021). eSkip-Finder: a machine
learning-based web application and database to identify the optimal
sequences of antisense oligonucleotides for exon skipping. Nucleic Acids
Research.
(8) Mangold, C., Zoretic, S., Thallapureddy, K., et al. (2021). Machine Learning
Models for Predicting Neonatal Mortality: A Systematic Review.
Neonatology, 1-12.
(9) Hindson, J. (2022). Proteomics and machine-learning models for alcohol-
related liver disease biomarkers. Nature Reviews Gastroenterology &
Hepatology, 19(8), 488.
(10) Hasan, M. S., Kordijazi, A., Rohatgi, P. K., et al. (2022). Machine learning
models of the transition from solid to liquid lubricated friction and wear in
aluminum-graphite composites. Tribology International, 165.
(11) Sun, Z. Y., Herold, F., Cai, K. L., et al. (2022). Prediction of Outcomes in Mini-
Basketball Training Program for Preschool Children with Autism Using
Machine Learning Models. International Journal of Mental Health Promotion,
24(2), 143-158.
https://doi.org/10.17993/3ctic.2023.121.267-285
3C TIC. Cuadernos de desarrollo aplicados a las TIC. ISSN: 2254-6529
Ed.42 | Iss.12 | N.1 January - March 2023
283
(12) Tatar, A. E., & Distegor, D. (2020). Prediction of Academic Performance at
Undergraduate Graduation: Course Grades or Grade Point Average?
Applied Sciences, 10(14), 4967.
(13) Lau, E. T., Sun, L., & Yang, Q. (2019).
Modelling, prediction and classification
of student academic performance using artificial neural networks. SN
Applied Sciences, 1(9), 1-10.
(14) Oyedeji, A. O., Salami, A. M., Folorunsho, O., et al. (2020). Analysis and
prediction of student academic performance using machine learning.
Journal of Information Technology and Computer Engineering, 4(1), 10-5.
(15) Jacobson, N. C., & Nemesure, M. D. (2021). Using artificial intelligence to
predict change in depression and anxiety symptoms in a digital
intervention: Evidence from a transdiagnostic randomized controlled trial.
Psychiatry research, 259, 113618.
(16) Xu, X., Wang, J., Peng, H., et al. (2019). Prediction of academic performance
associated with internet usage behaviors using machine learning
algorithms. Computers in Human Behavior, 98, 166-173.
(17) Mathur, P., Srivastava, S., Xu, X., et al. (2020). Artificial intelligence, machine
learning, and cardiovascular disease. Clinical Medicine Insights: Cardiology,
14, 1179546820927404.
(18) Rossi, A., Pappalardo, L., Cintia, P., et al. (2018). Effective injury forecasting
in soccer with GPS training data and machine learning. PloS one, 13(7),
e0201264.
(19) Duarte, J. J., Gonzalez, S. M., & Jr JC C. (2020). Predicting stock price falls
using news data: Evidence from the Brazilian market. Computational
economics, 57(2), 311-340.
(20) Yoshihara, A., Fujikawa, K., Seki, K., et al. (2014). Predicting stock market
trends by recurrent deep neural networks. In Pacific Rim International
Conference on Artificial Intelligence (pp. 759-769). Springer, Cham.
(21) Usmani, M., Adil, S. H., Raza, K., et al. (2016). Stock market prediction using
machine learning techniques. In 2016 3rd International Conference on
Computer and Information Sciences (ICCOINS) (pp. 322-327). IEEE.
(22) Nikou, M., Mansourfar, G., & Bagherzadeh, J. (2019). Stock price prediction
using deep learning algorithm and its comparison with machine learning
algorithms. Intelligent Systems in Accounting, Finance and Management, 26(4),
164-174.
(23) Hah, D. W., Kim, Y. M., & Ahn, J. J. (2019). A study on KOSPI 200 direction
forecasting using XGBoost model. The Korean Data & Information Science
Society, 30(3), 655-669.
(24) Tran, D., Bourdev, L., Fergus, R., et al. (2015). Learning spatiotemporal
features with 3D convolutional networks. In Proc of IEEE International
Conference on Computer Vision (pp. 4489-4497). Washington DC: IEEE
Computer Society.
(25) Kishore, P. V. V., Kumar, K. V. V., Kiran Kumar, E., et al. (2018). Indian classical
dance action identification and classification with convolutional neural
networks. Advances in Multimedia, 2018.
https://doi.org/10.17993/3ctic.2023.121.267-285
3C TIC. Cuadernos de desarrollo aplicados a las TIC. ISSN: 2254-6529
Ed.42 | Iss.12 | N.1 January - March 2023
284
(26) Protopapadakis, E., Grammatikopoulos, A., Doulamis, A., et al. (2017). Folk
dance pattern recognition over depth images acquired via Kinect sensor.
The International Archives of Photogrammetry, Remote Sens.
(27) Li, G., & Li, C. (2020). Learning skeleton information for human action
analysis using Kinect. Signal Processing: Image Communication, 84, 115814.
(28) Frayssinet, M., Esenarro, D., Juárez, F. F., & Díaz, M. (2021). Methodology
based on the NIST cybersecurity framework as a proposal for
cybersecurity management in government organizations. 3C TIC.
Cuadernos de desarrollo aplicados a las TIC, 10(2), 123-141. https://doi.org/
10.17993/3ctic.2021.102.123-141
(29) Chen, S., & Ren, Y. (2021). Small amplitude periodic solution of Hopf
Bifurcation Theorem for fractional differential equations of balance point in
group competitive martial arts. Applied Mathematics and Nonlinear Sciences,
7(1), 207-214. https://doi.org/10.2478/AMNS.2021.2.00152
https://doi.org/10.17993/3ctic.2023.121.267-285
3C TIC. Cuadernos de desarrollo aplicados a las TIC. ISSN: 2254-6529
Ed.42 | Iss.12 | N.1 January - March 2023
285