A Preventive Model for Hamstring Injuries in Professional Soccer: Learning Algorithms

Francisco Ayala, Alejandro López-Valenciano, Jose Antonio Gámez Martín, Mark De Ste Croix, Francisco J Vera-Garcia, Maria Del Pilar García-Vaquero, Iñaki Ruiz-Pérez, Gregory D Myer

Abstract

Hamstring strain injury (HSI) is one of the most prevalent and severe injuries in professional soccer. The purpose of this study was to analyze and compare the predictive ability of a range of machine learning techniques in order to select the best-performing injury risk factor model for identifying professional soccer players at high risk of HSI. A total of 96 male professional soccer players underwent a pre-season screening evaluation that included a large number of individual, psychological and neuromuscular measurements. Injury surveillance was prospectively employed to capture all HSIs occurring in the 2013/2014 season. There were 18 HSIs. Injury distribution was 55.6% dominant leg and 44.4% non-dominant leg. The model generated by the SmooteBoostM1 technique with a cost-sensitive ADTree as the base classifier reported the best evaluation criteria (area under the receiver operating characteristic curve score=0.837, true positive rate=77.8%, true negative rate=83.8%) and hence was considered the best for predicting HSI. The prediction model showed moderate to high accuracy for identifying professional soccer players at risk of HSI during pre-season screenings. Therefore, the model developed might help coaches, physical trainers and medical practitioners in the decision-making process for injury prevention.

Conflict of interest statement

The authors declare no conflict of interest.

© Georg Thieme Verlag KG Stuttgart · New York.

Figures

Fig. 1
Graphical representation of testing procedure. The order of the different tests used to record the personal or individual, psychological and neuromuscular risk factors in the testing session is shown.
Fig. 2
Graphical representation of the first classifier. Prediction nodes are represented by ellipses and splitter nodes by rectangles. Each splitter node is associated with a real-valued number indicating the rule condition: if the feature represented by the node satisfies the condition value, the prediction path goes through the left child node; otherwise, it goes through the right child node. The numbers before the feature names in the prediction nodes indicate the order in which the different base rules were discovered. This ordering can, to some extent, indicate the relative importance of the base rules.
Classifier number 1 reports an initial score of −1.152 in its root node. It has a tree-shaped structure comprising 6 main branches whose parent nodes (first leaves) are the following: a) PT-QCON180-Dominant Leg, b) APT-HECC180-Dominant Leg, c) 45-UniRatio-H/QCONV240-Dominant Leg, d) YBalance-Ant-Non-Dominant Leg, e) APT-QECC30-Non-Dominant Leg and f) Sleep quality. All of the classifier’s main branches must be evaluated, and the scores obtained in each branch (resulting from the data input in the parent and, if necessary, child nodes) must be added to the score initially reported by the root node to obtain the classifier’s final vote for the player (yes = negative score [high risk of injury] or no = positive score [low risk of injury]).
Starting with the branch whose parent node is the feature PT-QCON180-Dominant Leg, the value reported by the soccer player (145 Nm) satisfies the condition in the node (> 136.9 Nm), and hence he obtains a score of −0.647 from the prediction node “Yes”. This leads to the child node represented by the feature PT-QECC60-Non-Dominant Leg. Here the player does not satisfy the condition; in other words, the value reported (208.4 Nm) is not higher than 211.45 Nm. The player therefore receives a score of −0.963 from the prediction node “No”. Consequently, the final result of this branch is the sum of −0.647 and −0.963, that is, −1.61 points. The pathway in the branch whose parent node is the feature APT-HECC180-Dominant Leg is shorter than the one previously described: the player recorded a value of 28°, which does not satisfy the established condition (> 35°). Consequently, in this second branch, the player obtains a score of 0.988 from the prediction node “No”. The third branch, whose parent node is 45-UniRatio-H/QCONV240-Dominant Leg, provides a total score of −1.412 (−0.198 + [−0.567] + [−0.647]), as the soccer player’s values do not satisfy the condition presented in either the parent or the child nodes. In the fourth branch, the soccer player does satisfy the condition of the parent node, UniRatio-H/QCON60-Dominant Leg, which provides a score of −0.291. Finally, for both the fifth and sixth branches, the player again satisfies the condition presented in their respective parent nodes (APT-QECC30-Non-Dominant Leg and Sleep quality, respectively), and hence the scores obtained were 0.416 and −0.358, respectively.
Summing the baseline score of the root node and the scores reported in each of the 6 branches of the classifier gives a total score of −3.419. This final score is negative, which implies a “Yes” vote with a weight of 2.29. The final classification is based on the combination of the votes of each individual classifier for each class (yes or no).
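The scoring arithmetic in the Fig. 2 walk-through can be checked with a minimal Python sketch. It only re-adds the scores quoted in the caption for the example player; it does not reproduce the authors' model or software. The names `ROOT_SCORE`, `BRANCH_SCORES`, `classifier_score` and `vote` are illustrative assumptions, and treating a score of exactly zero as "no" is also an assumption, since the caption only defines negative and positive scores.

```python
# Root-node score of classifier 1, as quoted in the Fig. 2 caption.
ROOT_SCORE = -1.152

# Scores collected by the example player along each of the 6 branches.
# Keys follow the feature names used in the caption's walk-through.
BRANCH_SCORES = {
    "PT-QCON180-Dominant Leg": [-0.647, -0.963],
    "APT-HECC180-Dominant Leg": [0.988],
    "45-UniRatio-H/QCONV240-Dominant Leg": [-0.198, -0.567, -0.647],
    "UniRatio-H/QCON60-Dominant Leg": [-0.291],
    "APT-QECC30-Non-Dominant Leg": [0.416],
    "Sleep quality": [-0.358],
}


def classifier_score(root_score, branch_scores):
    """Add the root score to the scores collected on every branch."""
    return root_score + sum(sum(scores) for scores in branch_scores.values())


def vote(score):
    """Negative score -> 'yes' (high risk); otherwise 'no' (low risk).

    Mapping a score of exactly 0 to 'no' is an assumption of this sketch.
    """
    return "yes" if score < 0 else "no"


total = classifier_score(ROOT_SCORE, BRANCH_SCORES)
print(round(total, 3), vote(total))  # -3.419 yes
```

The per-branch subtotals (−1.61 for the first branch and −1.412 for the third) and the overall total of −3.419 agree with the values stated in the caption.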

Source: PubMed
