Diving Sports Auxiliary Training Video Analysis Research

Liu, Zhiwu

ABSTRACT

This study develops a video analysis system for diving, designed around the characteristics of the sport. The system is mainly intended to help coaches instruct athletes more effectively and to help judges score dives correctly. It covers video analysis, 3D motion simulation, an editing interface and database management. The background and significance of the work are stated and the relationships and data flow among the main modules are analyzed. A tracking model and a virtual human motion model are built and the key techniques and innovations are described in detail. In practice, the system improves training efficiency and reduces repetitive training and the probability of misjudgment; it is also efficient and easy to operate.


INTRODUCTION

China has long been a world leader in diving, mainly because of its experienced coaching staff and its talented, hard-training athletes. In recent years, however, this dominance has been seriously challenged: rising stars from Canada, the US and Australia have had a large impact on China's gold-medal targets at all levels of diving competition. The main reason is that our sports science research is still relatively backward and the use of science and technology in diving training is almost zero. In sharp contrast, the United States, Australia and other countries without exception use an advanced computer-aided motion analysis system, CAMAS (Computer Assistant Motion Analysis System), in diving training. CAMAS integrates a variety of information technologies, drawing on image processing and analysis, computer vision, pattern recognition, artificial intelligence, computer graphics, mathematics, kinesiology and other fields. A three-dimensional simulation and video analysis system for training allows real-time, all-round observation, so the merits of a training session or competition can be analyzed and evaluated quantitatively. CAMAS therefore plays an increasingly important role.

Sports video analysis systems have progressed rapidly in recent years and related hardware and software have begun to be applied to training in a variety of sports. Kahol et al. (2004) used a layered behavioral segmentation method to segment dance gestures from dance motion sequences, so that dance training could be recognized and guided. To raise the competitive level of our athletes and promote the development of sport in China, the Institute of Computing Technology developed a three-dimensional human motion simulation and video analysis system for sports training. It digitizes three-dimensional human motion using computer simulation technology, combines human sports biomechanics data with real human motion data, provides graphically realistic three-dimensional simulation, design and analysis of technical movements and offers strong training guidance for competitive sports (Kun et al., 2005; Wang et al., 2005). However, the system needs further enhancement to become more useful in practice. For a diving action, for example, many quantities bear on the quality of its completion: the athlete's weight, height and body-segment proportions, the take-off angle, the acceleration of each joint, the timing of the somersault in the air, the timing of extending the arms before entry and the entry angle. How these data relate to one another, what factors they depend on and what laws govern them are questions that have not been fully resolved and need further study. Designing and developing a diving video motion analysis system is therefore of great help, and of real significance, in raising the competitive level of diving athletes.

DIVING MOTION VIDEO ANALYSIS SYSTEM FRAMEWORK

The diving motion video analysis system consists of three modules: a video analysis module, a 3D motion simulation and graphical editing interface module and a database management module. The main process is shown in Fig. 1.


Fig. 1:	Diving sport video analysis system process


Fig. 2:	Main interface

First, an existing diving video, or a real-time training video captured from the camera, is imported through the main interface (Fig. 2) and segmented. Motion tracking is then performed on the segmentation results. The purpose of tracking is to obtain the motion parameters of each of the athlete's joints. Regression analysis is then used to build a mathematical model relating the athlete's motion parameters and attitude parameters.

The motion parameters obtained by tracking drive a 3D virtual human, which simulates the athlete's dive through the whole process, so that coaches, athletes and referees can observe how the dive was completed comprehensively and from multiple angles. In addition, the 3D graphical editing interface allows actions to be modified, basic movements to be designed, movements to be combined and the actual performance to be compared with the virtual one.

Database management mainly covers athlete data management and user management. Some basic information about an athlete, such as height and weight, together with the degree of difficulty of the movement, plays an important role in evaluating training results. These data are relatively fixed, so they can be saved in the database, which reduces manual entry in each training session. To avoid abuse and misuse of system functions, users are created by the super administrator and different users are given different permissions.

KEY TECHNOLOGY AND RELATED ALGORITHMS

Video analysis: Diving is a very brief sport that requires athletes to complete a coherent set of actions in a short time. Video recording has been widely used in diving training and competition, but simple recording and playback cannot meet the needs of coaches and referees in evaluating dives. Video analysis aims to extract motion parameters from the original captured video, model these parameters and interpret them with behavior-understanding methods, so that the quality of the action the athlete completed can be analyzed quantitatively.

Video segmentation: When a diving video is shot, the camera generally moves in order to capture the athlete's full trajectory. The background motion caused by this camera motion is called global motion (Farin et al., 2004; Tsaig and Averbuch, 2002). Diving video typically contains global motion, so obtaining accurate global motion parameters is the key to, and the basis for, extracting the athlete's body from the video.

Taking the characteristics of diving video into account, we use the three-parameter camera motion model (Wang et al., 2006) given in Eq. 1:

(1)

where a = z_xy, a_x = f_x(p_x, z_xy) and a_y = f_y(p_y, z_xy). Here z_xy is the camera zoom (scaling) factor and (p_x, p_y) are the translation components along the x and y axes due to camera panning.
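As a rough illustration of how a three-parameter model of this kind can be applied and fitted, the sketch below assumes the common zoom-plus-translation form x' = a·x + a_x, y' = a·y + a_y. That form, the function names and the least-squares fit are assumptions made for illustration; they are not taken from Eq. 1 or from Wang et al. (2006).

```python
def apply_model(points, a, a_x, a_y):
    """Map (x, y) points with the ASSUMED model x' = a*x + a_x, y' = a*y + a_y."""
    return [(a * x + a_x, a * y + a_y) for x, y in points]


def fit_model(src, dst):
    """Least-squares estimate of (a, a_x, a_y) from matched point pairs.

    src and dst are equal-length lists of (x, y) tuples; src must contain
    at least two distinct points so that the zoom factor is determined.
    """
    n = len(src)
    mx = sum(x for x, _ in src) / n
    my = sum(y for _, y in src) / n
    mu = sum(u for u, _ in dst) / n
    mv = sum(v for _, v in dst) / n
    # Zoom factor: ratio of cross-covariance to source variance.
    num = sum((x - mx) * (u - mu) + (y - my) * (v - mv)
              for (x, y), (u, v) in zip(src, dst))
    den = sum((x - mx) ** 2 + (y - my) ** 2 for x, y in src)
    a = num / den
    # Translations follow from the centroids once the zoom is known.
    return a, mu - a * mx, mv - a * my
```

In practice the matched points would come from background blocks only, so that the athlete's own motion does not bias the estimate.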

To obtain the global motion parameters quickly and accurately, we propose a new global motion estimation algorithm based on a human skin model. First, because most of the athlete's body is bare and exposed in diving video, a skin model is established and the moving foreground is extracted. In the (Y, Cb, Cr) color space, statistics show that the human body region falls mainly within the area shown in Fig. 3.
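A minimal sketch of a skin test in the (Y, Cb, Cr) space is given below. The RGB-to-YCbCr conversion uses the standard full-range BT.601 coefficients; the Cb and Cr bounds are a commonly used illustrative box and are an assumption here, not the region plotted in Fig. 3.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to full-range YCbCr (ITU-R BT.601 coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


# ASSUMED chrominance bounds for skin; replace with the statistics of Fig. 3.
CB_RANGE = (77, 127)
CR_RANGE = (133, 173)


def is_skin(r, g, b):
    """Return True when the pixel's chrominance falls inside the skin box."""
    _, cb, cr = rgb_to_ycbcr(r, g, b)
    return (CB_RANGE[0] <= cb <= CB_RANGE[1]
            and CR_RANGE[0] <= cr <= CR_RANGE[1])


def skin_mask(image):
    """Build a boolean mask for an image given as rows of (r, g, b) tuples."""
    return [[is_skin(*pixel) for pixel in row] for row in image]
```

Luminance Y is ignored in this sketch so that the test is less sensitive to lighting; whether the original skin model does the same is not stated.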

Using the human skin model, the current frame I_t and the previous frame I_t-1 are each preprocessed and the athlete is extracted from both frames. A morphological closing operation is then applied to both images and the connected image between I_t and I_t-1 is obtained. Then, with most of the foreground excluded, macroblock matching is performed. To speed up macroblock matching, and inspired by the KMP string pattern matching algorithm (Cormen et al., 2001), we designed a no-backtracking image block matching algorithm (NBT).


Fig. 3:	Human skin color distribution

In general, assume that the main block is:


Sub-block:


The following inequality can be derived:

(2)

where P_{i, j} is the pixel in the ith column and jth row of the sub-block and d is a threshold value. Thus, when a mismatch occurs, matching simply continues by comparing the kth pixel of the sub-block with the jth pixel of the main block, without backtracking.

With the two steps above, the global motion estimation results can be obtained quickly and accurately.
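The NBT algorithm itself cannot be reproduced from the text, since its defining expressions are missing. As a stand-in, the sketch below shows a plain block matcher that uses the sum of absolute differences (SAD) and abandons a candidate position as soon as its running cost exceeds a threshold d. It is a different, simpler technique, shown only to illustrate threshold-based early termination; all names are hypothetical.

```python
def sad_with_cutoff(frame, block, top, left, cutoff):
    """Sum of absolute differences at (top, left); None once cutoff is exceeded."""
    total = 0
    for r, row in enumerate(block):
        for c, value in enumerate(row):
            total += abs(frame[top + r][left + c] - value)
            if total > cutoff:
                return None  # abandon this candidate early
    return total


def match_block(frame, block, d):
    """Find the position with the lowest SAD not exceeding the threshold d.

    frame and block are lists of rows of grey levels. Returns
    ((top, left), cost), or (None, None) when no position is within d.
    """
    bh, bw = len(block), len(block[0])
    best_pos, best_cost = None, d
    for top in range(len(frame) - bh + 1):
        for left in range(len(frame[0]) - bw + 1):
            cost = sad_with_cutoff(frame, block, top, left, best_cost)
            if cost is not None and (best_pos is None or cost < best_cost):
                best_pos, best_cost = (top, left), cost
    if best_pos is None:
        return None, None
    return best_pos, best_cost
```

Tightening the cutoff to the best cost found so far means later candidates are usually rejected after only a few pixels.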


Fig. 4(a-b):	Diving video segmentation results (a) Original video and (b) Segmentation results

Suppose a video sequence contains global motion and I_n-1, I_n, I_n+1 are three consecutive frames. With I_n as the reference frame, the global motion of the image frames is estimated and compensated using the above method and the backgrounds of I_n-1 and I_n+1 are mapped onto it, so that I_n-1, I_n, I_n+1 become three consecutive frames with a constant background. The moving targets are then extracted with the difference-and-intersection video object segmentation algorithm (Wang and Gu, 2004): the frame difference (DFD) is computed for each pair of consecutive frames, adaptive noise filtering and mathematical morphology filtering are applied to each of the two difference images and the intersection of the two difference images is taken. The contour of the moving target is obtained through morphological processing, which yields a binary segmentation mask of the moving object from which the video objects are finally extracted. Figure 4 shows the segmentation results for a diving test sequence.
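The core of the difference-and-intersection step can be sketched as follows, assuming the three frames have already been motion-compensated and are given as rows of grey levels. The fixed threshold and the function names are illustrative assumptions; the adaptive noise filtering and morphological filtering described above are omitted.

```python
def change_mask(frame_a, frame_b, threshold):
    """Mark pixels whose absolute grey-level difference exceeds the threshold."""
    return [[abs(a - b) > threshold for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]


def difference_intersection(prev, cur, nxt, threshold):
    """Intersect the two frame differences around the middle frame.

    A pixel is kept only if it changed both between prev and cur and
    between cur and nxt, which isolates the object in the middle frame.
    """
    backward = change_mask(cur, prev, threshold)
    forward = change_mask(nxt, cur, threshold)
    return [[b and f for b, f in zip(row_b, row_f)]
            for row_b, row_f in zip(backward, forward)]
```

Taking the intersection removes the "ghost" regions that a single frame difference leaves at the object's previous and next positions.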

MATERIALS AND METHODS

Model tracking: Video segmentation serves tracking, which recovers the motion parameters of each of the athlete's joints from the original video. The tracking methods in current use are block matching and optical flow. This study uses a feature-point tracking method based on optical flow, with the human skeleton model shown in Fig. 5.

A human motion model uses the state in the previous frame to predict the state in the current frame. Although divers perform different actions according to the requirements of the competition, the trend of the motion and the number of somersaults in the air generally follow certain rules. The diving process can be decomposed into three phases: take-off, airborne somersaulting and entry into the water.


Fig. 5:	2-dimensional node human skeleton model

The take-off can be treated as projectile (throwing) motion, the airborne somersault as circular motion about a center point that itself moves with constant acceleration and the entry as free fall. Common motion models include the uniform motion model and the polynomial motion model; more specialized models include B-splines and hidden Markov models. Based on the characteristics of diving, a second-order auto-regressive motion model is established:

(3)

In the motion tracking process, the diver is tracked with a particle filter. First, the particles are initialized. For accuracy and real-time performance, the skeleton model is marked manually in the first and second frames of the video. The key joints to be tracked (the head, shoulder, wrist and waist nodes, an inter-ministerial node and the ankle and foot nodes) are identified, each particle (discrete sample) is initialized and a matching template is created at the position of each key joint. The template should include color information and gradient information. The established motion model is then used to determine the position of each particle for each key joint in the next frame and the weight of each particle is determined from the matching template. If the weight is larger than a threshold value, the motion model is considered inaccurate and is adjusted. The tracked position of each key joint is finalized from the positions and weights of the individual particles and the particles and the matching templates are updated for tracking in the next frame.
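The tracking loop described above can be sketched for a single joint as follows. This is a simplified illustration under stated assumptions: the second-order auto-regressive step is taken to be x_t = 2·x_(t-1) − x_(t-2) + noise (a constant-velocity special case, not the coefficients of Eq. 3), and template matching is replaced by a Gaussian likelihood around a measured position. The function name and parameters are hypothetical.

```python
import math
import random


def track_joint(first, second, measurements, n_particles=200,
                process_sigma=1.0, measure_sigma=2.0, seed=0):
    """Track one joint; first and second are its hand-marked positions
    in the first two frames, measurements are (x, y) observations for
    the following frames. Returns one estimated position per measurement.
    """
    rng = random.Random(seed)
    # Each particle stores its (previous, current) position.
    particles = [(first, second)] * n_particles
    estimates = []
    for zx, zy in measurements:
        predicted, weights = [], []
        for (px, py), (cx, cy) in particles:
            # ASSUMED AR(2) prediction with Gaussian process noise.
            nx = 2 * cx - px + rng.gauss(0, process_sigma)
            ny = 2 * cy - py + rng.gauss(0, process_sigma)
            d2 = (nx - zx) ** 2 + (ny - zy) ** 2
            predicted.append(((cx, cy), (nx, ny)))
            # Stand-in for the color/gradient template match score.
            weights.append(math.exp(-d2 / (2 * measure_sigma ** 2)))
        total = sum(weights)
        if total == 0:  # all particles lost: fall back to uniform weights
            weights, total = [1.0] * n_particles, float(n_particles)
        ex = sum(w * p[1][0] for w, p in zip(weights, predicted)) / total
        ey = sum(w * p[1][1] for w, p in zip(weights, predicted)) / total
        estimates.append((ex, ey))
        # Resample in proportion to weight for the next frame.
        particles = rng.choices(predicted, weights=weights, k=n_particles)
    return estimates
```

Storing the previous and current position in each particle is what lets a second-order model be propagated through resampling.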

Calculating model: Motion parameters and attitude (pose) parameters are important data for analyzing gymnastics and diving; they describe the athlete's state of motion. The motion parameters of the joints include instantaneous velocity and acceleration. The pose parameters are mainly the height and direction angle of each part of the body and the geometric relationships between them. A calculation model is established from the extracted joint motion parameters and body pose parameters, so that both can be calculated for each joint at any point in time. To make this possible, the physiological characteristics of the human body and the principles of biological movement are studied in order to understand how each part of the body moves and the limits of that movement. On this basis, and in combination with the human motion tracking model, a computational model and calculation methods are developed for the motion and pose parameters of the joint points and body parts in diving.

Determining the reference coordinate systems: Two coordinate systems are used. The global (stationary) coordinate system is defined for the diver in the initial state under the H-Anim standard, with a joint as the coordinate origin. A local coordinate system (moving frame) is attached to each joint of the body, with its origin at the center of that joint. The positive Z-axis is perpendicular to the page, in accordance with the right-handed convention.

Calculation of each joint's displacement:

where x₁ is the local coordinate value of a key point in the previous frame and x₂ is the coordinate value of the joint in the current frame, expressed relative to the local coordinate system of the previous frame.

The global camera motion model between the two frames is:

(4)

where (x, y) are the coordinates of the key point in the current frame and (x', y') are the coordinates of the joint point in the previous frame.

Speed/acceleration calculation: The linear speed is:

where t₂ is the time of the current frame and t₁ is the time of the previous frame. v_y and v_z are obtained similarly.

The linear acceleration is:

a_y and a_z are obtained similarly.
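Since the displacement, speed and acceleration formulas are not reproduced in the text, the sketch below shows the usual finite-difference estimates that the surrounding definitions suggest: displacement x₂ − x₁, speed (x₂ − x₁)/(t₂ − t₁) and acceleration as the change in speed over the same interval. These forms and the constant frame interval dt are assumptions, not the paper's stated equations.

```python
def finite_differences(values, dt):
    """Differences between consecutive samples divided by the frame interval."""
    return [(b - a) / dt for a, b in zip(values, values[1:])]


def velocity_and_acceleration(positions, dt):
    """Estimate per-frame speed and acceleration along one axis.

    positions holds one coordinate of a joint in successive frames and
    dt is the ASSUMED constant time between frames, in seconds.
    """
    velocity = finite_differences(positions, dt)
    acceleration = finite_differences(velocity, dt)
    return velocity, acceleration
```

The same function would be called once per axis to obtain the y and z components.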

Rotation angle/angular velocity calculation: From the local coordinates of the key points in the tracking results, inverse kinematics is used to obtain the rotation angle of each joint. The kinematic relation is given by the following coordinate transformation equation:

(5)

where R is the rotation matrix between the two coordinate systems, P is the measurement of the joint in its current local coordinate system and P' is the measurement of the joint in its initial local coordinate system. The rotation angle of the joint can be obtained from Eq. 5.
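For the planar case, recovering a rotation angle from two measurements can be sketched as below. It assumes that P is obtained by rotating P' (P = R·P'), which is one plausible reading of Eq. 5, and works in two dimensions only; the function names are hypothetical.

```python
import math


def rotation_angle(p_initial, p_current):
    """Signed angle (radians) that rotates p_initial onto p_current in 2D."""
    x0, y0 = p_initial
    x1, y1 = p_current
    cross = x0 * y1 - y0 * x1  # proportional to sin(theta)
    dot = x0 * x1 + y0 * y1    # proportional to cos(theta)
    return math.atan2(cross, dot)


def angular_velocity(theta_prev, theta_cur, dt):
    """Angular velocity between two frames, with the difference wrapped
    into [-pi, pi] so that crossing the +/-pi boundary does not produce a jump.
    """
    delta = math.atan2(math.sin(theta_cur - theta_prev),
                       math.cos(theta_cur - theta_prev))
    return delta / dt
```

Using atan2 on the cross and dot products avoids the sign ambiguity of an arccosine and does not require the two vectors to have equal length.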

Behavior recognition and understanding: The motion trajectory, motion parameters and attitude parameters obtained above serve as the basis for recognition and analysis. Identifying an athlete's actions is also a semantic classification problem. A dive is decomposed into 13 basic action classes: the run and the arm stand (take-off stage); swivel, forward, reverse, backward and inward; the straight, pike and tuck body positions (air stage); somersault and twist; and entry into the water. These are then combined into dive actions, such as 207B (back 3½ somersaults in the pike position). This enables the machine to identify diving video automatically. The identification result is used for automatic real-time annotation of actions and as a search criterion for the action database.
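To make the relation between a dive code such as 207B and its components concrete, the sketch below decodes three-digit dive numbers under the standard numbering convention: the first digit is the group, a second digit of 1 marks a flying dive, the third digit counts half somersaults and the letter gives the body position. Twisting and armstand dives, which use other digit layouts, are deliberately left out; the function name is hypothetical.

```python
GROUPS = {1: "forward", 2: "back", 3: "reverse", 4: "inward"}
POSITIONS = {"A": "straight", "B": "pike", "C": "tuck", "D": "free"}


def parse_dive(code):
    """Decode a three-digit dive number followed by a position letter."""
    digits, letter = code[:-1], code[-1].upper()
    if len(digits) != 3 or not digits.isdigit() or letter not in POSITIONS:
        raise ValueError("unsupported dive code: " + code)
    group = int(digits[0])
    if group not in GROUPS:
        raise ValueError("only groups 1 to 4 are handled: " + code)
    return {
        "group": GROUPS[group],
        "flying": digits[1] == "1",
        "somersaults": int(digits[2]) / 2,  # third digit counts half turns
        "position": POSITIONS[letter],
    }
```

For example, `parse_dive("207B")` describes a back dive with 3.5 somersaults in the pike position.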

The ultimate goal of motion analysis is to identify the deficiencies in an athlete's actions in order to guide training. The athlete's actions are compared with the standard actions in the database to obtain numerical differences; a semantic mapping process based on predefined rules then converts these differences into natural language that describes how the action was completed. The description includes an overall evaluation of the completed action and a breakdown into four parts: run-up, take-off, flight and entry. For each part it covers how well the essentials of the basic movement were grasped, what should be improved and how to improve it.

In this study, the Conditional Random Fields (CRF) method (Steffens et al., 1998) is used for behavior recognition and understanding. A CRF is an undirected graphical model that computes the conditional probability of the specified output node values given the input node values. Its general model is shown in Fig. 6.

In Fig. 6, X is the set of observed input random variables, Y is the set of output random variables predicted by the model and the undirected edges indicate dependencies between variables.


Fig. 6:	CRF model schematic diagram


Fig. 7:	Training process

Behavior recognition is divided into two steps:

•	Training process: Manually segmented video sequences are used to learn the basic movements and the CRF models of the basic actions are trained. The training process framework is shown in Fig. 7
•	Identification process: A segmented video sequence is input, its value under each CRF model is calculated in turn and the sequence is assigned to the class that gives the maximum value. The identification process framework is shown in Fig. 8
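As an illustration of how a trained linear-chain CRF assigns labels, the sketch below runs Viterbi decoding over given per-frame label scores and label-to-label transition scores. The linear-chain structure, the toy scores and the function name are assumptions; the paper does not specify the graph structure or features it uses.

```python
def viterbi(emissions, transitions):
    """Most likely label sequence under a linear-chain scoring model.

    emissions[t][j] is the score of label j at frame t and
    transitions[i][j] is the score of moving from label i to label j.
    Returns (label_path, total_score).
    """
    n_labels = len(transitions)
    score = list(emissions[0])
    back = []
    for em in emissions[1:]:
        new_score, pointers = [], []
        for j in range(n_labels):
            # Best predecessor label for ending in label j at this frame.
            best_i = max(range(n_labels),
                         key=lambda i: score[i] + transitions[i][j])
            pointers.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j] + em[j])
        score = new_score
        back.append(pointers)
    last = max(range(n_labels), key=lambda j: score[j])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[last]
```

With one model per basic action, as in the identification step above, the total score returned for each model could be compared and the class with the largest score selected.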

3D motion simulation and graphics editing interfaces: The motion and attitude parameters of the diver's optimum state are obtained by inverse calculation. A new video then has to be reconstructed: a visual reproduction of the movement pattern on a three-dimensional human model that shows the course and trajectory of the dive in its best state. This helps athletes understand the process and themselves and adjust their training methods accordingly. To this end, we propose three-dimensional virtual human animation driven by key-frame video tracking data, together with a 3D graphical editing interface in which actions can be interactively modified, designed and choreographed.

The virtual human geometric model is built using the H-Anim standard in VRML (Badler et al., 1999; Humanoid Animation Working Group, 2003; Li et al., 1998; Bregler and Malik, 1998). When a virtual human file is imported, a Finite State Machine (FSM) is used to interpret the document. So that the virtual human can be displayed and controlled easily, its geometric data are represented by a tree data structure in memory.
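A tree representation of the joints can be sketched as follows. The class, its fields and the translation-only traversal are illustrative assumptions; the joint names in the usage note are placeholders and the offsets are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Joint:
    """One node of the skeleton tree (hypothetical structure)."""
    name: str
    offset: tuple  # translation relative to the parent joint
    children: list = field(default_factory=list)

    def add(self, child):
        """Attach a child joint and return it, so chains can be built."""
        self.children.append(child)
        return child


def world_positions(joint, origin=(0.0, 0.0, 0.0)):
    """Accumulate offsets from the root down to get each joint's position.

    Joint rotations are ignored in this sketch; only translations are
    composed along each branch of the tree.
    """
    position = tuple(o + d for o, d in zip(origin, joint.offset))
    result = {joint.name: position}
    for child in joint.children:
        result.update(world_positions(child, position))
    return result
```

For example, a root with a hip child and a knee grandchild gives the knee a position equal to the sum of the three offsets. Moving the root then moves every descendant, which is what makes the tree convenient for display and control.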


Fig. 8:	Recognition process

Local coordinate systems are established for the moving human body and its joints within the world coordinate system. Inverse kinematics is then used to calculate, for each video frame, the displacement vector and the rotation axis and angle of each body joint relative to the initial state.

Virtual human motion modeling: At a given time during the motion, the joints have the world coordinates J₁: (x₁, y₁, z₁); J₂: (x₂, y₂, z₂); ...; J₁₆: (x₁₆, y₁₆, z₁₆) (J₁ to J₁₆ are labeled in Fig. 5). The orientation of a joint is described by a rotation of angle α about the Z-axis, then a rotation of angle β about the Y-axis and finally a rotation of angle γ about the X-axis. The origin of the base (reference) coordinate system is set at joint J₁ and its orientation always coincides with that of the local coordinate system of J₁ in the body's initial state. The displacement vector of the root joint J₁ is the coordinate of J₁ in the world coordinate system.
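The rotation sequence just described can be composed into a single matrix as sketched below. The sketch assumes rotations about fixed axes applied to a column vector in the order Z, Y, X, so that R = Rx(γ)·Ry(β)·Rz(α); the paper does not state whether fixed or moving axes are meant, so this convention is an assumption.

```python
import math


def rot_x(g):
    c, s = math.cos(g), math.sin(g)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]


def rot_y(b):
    c, s = math.cos(b), math.sin(b)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]


def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def matmul(m, n):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(m[i][k] * n[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def joint_rotation(alpha, beta, gamma):
    """Rotate by alpha about Z, then beta about Y, then gamma about X
    (ASSUMED fixed-axis convention)."""
    return matmul(rot_x(gamma), matmul(rot_y(beta), rot_z(alpha)))


def rotate(matrix, vector):
    """Apply a 3x3 rotation matrix to a 3-vector."""
    return tuple(sum(matrix[i][k] * vector[k] for k in range(3))
                 for i in range(3))
```

Setting beta to zero gives the reduced case in which there is no rotational component about the Y-axis.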

Considering the role of torque in the three-dimensional data, the rotational component about the Y-axis can be taken as zero. By forward kinematics, the rotation matrix of the J₁ coordinate system with respect to the base coordinate system is then:

(6)

The transformation between the local coordinate system of joint J₂ and the world coordinate system is given by Eq. 7:

(7)

From the above equation, the following equations are obtained:


(8)

The solution is:

(9)

Virtual human animation experiment: The experimental prototype system uses a virtual human file that conforms to the H-Anim standard. The test data are derived from the joint coordinate sequences in the video tracking results, with the tracking results of a jumping video as the data source. In the prototype system, the global pose of the virtual human, such as whole-body rotation and scaling, can be changed with the mouse. Figure 9 shows a sequence of frames of the virtual human performing part of a jump.


Fig. 9:	Virtual human performing a jump


Fig. 10(a-d):	Diving video segmentation results (a, c) Original video sequence and (b, d) Segmentation results


Fig. 11:	Diving video tracking results

The system provides a rich set of functions and further potential demands continue to be explored. Trials show that using the system helps athletes master the technical essentials of an action as quickly as possible. Training efficiency is greatly improved, blind repetition in training is reduced and the possibility of injury is lowered. The results of running the system are shown in Fig. 10-11 (Hsu and Tsan, 2004; Dufaux and Konrad, 2000).

DISCUSSION

Human-board coordination: Taking athlete C as an example, correlation analysis is carried out between the video signal sampled over a single jump-board cycle and the athlete's take-off model (Shen et al., 1995a, b). Table 1 gives the correlation coefficient between each single springboard video signal A_i and the take-off model.

Table 1:	Correlation coefficients between athlete C's individual actions and the corresponding take-off video model

Table 2:	Correlation coefficients between the take-off models of different athletes

The data in Table 1 show clearly that the lower the difficulty of the movement, the better the athlete's stability. The average correlation coefficient reflects the athlete's overall stability in completing the action. In addition, the correlation coefficients between the take-off video model of action 301 and those of actions 303 and 305 are 0.968 and 0.956, respectively. This indicates that the take-off modes of different actions in the same group are very similar, so training on the simpler actions helps an athlete train the more difficult actions in the same group.

The correlation coefficients of the three take-off video models were determined by the same method for all four athletes and the same conclusions were obtained for each of them. For the same action, correlation analysis was also carried out between the take-off models of different athletes, which reflects how consistent the take-off mode is between individuals. Table 2 lists the correlation coefficients between the take-off models of each pair of the four athletes. The correlation coefficient is large for every pair of athletes. This shows that, when completing the same action on the same equipment, different trained athletes adopt the same mode of operation.
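The correlation analysis used in this section can be sketched as below. The Pearson coefficient is standard; treating the take-off model as the per-sample mean of repeated signals is an assumption made only for illustration, since the model of Shen et al. is not described here.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length signals."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def takeoff_template(signals):
    """ASSUMED take-off model: the sample-by-sample mean of repeated signals."""
    return [sum(column) / len(column) for column in zip(*signals)]
```

Each single signal A_i would then be compared with the template through `pearson(A_i, template)`, and the templates of two athletes could be compared in the same way.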

CONCLUSION AND OUTLOOK

The goal of sport is “higher, faster, stronger”. Athletes improve speed and strength by continually challenging themselves. Besides hard training, scientific training methods are an important and effective way to improve athletic performance. Sport today has reached a considerable level and further gains in performance increasingly depend on advanced science and technology.

This study examined video analysis, 3D motion simulation, an editing interface and database management. The relationships and data flow among the main modules of the diving auxiliary training system were analyzed. A tracking model and a virtual human motion model were built and the key techniques and innovations were described in detail. The system has been used in diving training in Hunan and the athletes trained with it have achieved good results in competition. The system improves training efficiency and reduces repetitive training and the probability of misjudgment; it is also efficient and easy to operate.

With this training video analysis and diving simulation system, diving training shifts from a traditional mode to a scientific one: from the coach's visual judgment and the athlete's repeated practice to precise video capture, human motion analysis and simulation. During competition, the system can assist referees in judging dives and help avoid misjudgments caused by subjective factors, which improves the objectivity, fairness and impartiality of refereeing.

The diving video analysis system integrates computer graphics, digital image processing, pattern recognition, biology, mathematics and human motion technology. As a specialized diving support system that starts from diving video and the characteristics of diving itself, it will have a significant impact on diving training. The system has been used officially for training and competition; in the future, the diving team will continue to try it and give feedback and its functions will be continually improved. How better graphics technology can be used in, or assist, sports training remains a topic worthy of in-depth study. We will continue development and strive for further breakthroughs in key technology research and in making the system practical.

REFERENCES

Badler, N., C.B. Phillips and B.L. Webber, 1999. Simulating Humans: Computer Graphics Animation and Control. Oxford University Press, London.
Bregler, C. and J. Malik, 1998. Tracking people with twists and exponential maps. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, June 23-25, 1998, Santa Barbara CA., pp: 8-15.
Cormen, T.H., C.E. Leiserson, R.L. Rivest and C. Stein, 2001. Introduction to Algorithms. 2nd Edn., The MIT Press, Cambridge, MA., USA., ISBN-13: 9780262032933, Pages: 1180.
Dufaux, F. and J. Konrad, 2000. Efficient, robust and fast global motion estimation for video coding. IEEE Trans. Image Process., 9: 497-501.
Farin, D., P.H.N. de With and W.A. Effelsberg, 2004. Video-object segmentation using multi-sprite background subtraction. Proceedings of the IEEE International Conference on Multimedia and Expo, Volume 1, June 30, 2004, Taipei, pp: 343-346.
Hsu, C.T. and Y.C. Tsan, 2004. Mosaics of video sequences with moving objects. Signal Process.: Image Commun., 19: 81-98.
Humanoid Animation Working Group, 2003. Specification for a standard humanoid. Version 1.1. http://h-anim.org/Specifications/H-Anim1.1/.
Kahol, K., P. Tripathi and S. Panchanathan, 2004. Automated gesture segmentation from dance sequences. Proceedings of the 6th IEEE International Conference on Automatic Face and Gesture Recognition, May 17-19, 2004, Seoul, Korea, pp: 883-888.
Kun, T., S. Wu, S. Lin and Y. Zhang, 2005. Research on panorama composition technique of sports video. J. Comput. Aided Des. Comput. Graph., 17: 2552-2557.
Li, Y., S. Ma and H. Lu, 1998. Human posture recognition using multi-scale morphological method and Kalman motion estimation. Proceedings of the 14th IEEE International Conference on Pattern Recognition, Volume 1, August 16-20, 1998, Brisbane Qld, Australia, pp: 175-177.
Steffens, J., E. Elagin and H. Neven, 1998. Person spotter-fast and robust system for human detection, tracking and recognition. Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition, April 14-16, 1998, Nara, Japan, pp: 516-521.
Tsaig, Y. and A. Averbuch, 2002. Automatic segmentation of moving objects in video sequences: A region labeling approach. IEEE Trans. Circuits Syst. Video Technol., 12: 597-612.
Wang, C. and G. Gu, 2004. Video object segmentation and tracking algorithm based on difference and intersection. Opt. Tech., 30: 564-566, 570.
Wang, J., H.F. Wang, Q.S. Liu and H.Q. Lu, 2006. Fast global motion estimation based on 3-parameter global motion model. Chinese J. Comput., 29: 920-927.
Wang, Z., Y. Zhang and S. Xia, 2005. 3D human motion simulation and a video analysis system for sports training. J. Comput. Res. Dev., 42: 344-352.
Shen, Y., J. Chunlin, J. Yong and H. Bao, 1995a. Human-board system model (I): Basic movement equation. Shandong Instit. Phys. Educ., 11: 26-29.
Shen, Y., H. Bao, J. Yong and J. Chunlin, 1995b. Human-board system model (II): Initial value and constraint conditions about human-board equation. Shandong Instit. Phys. Educ., 11: 30-33.

Information Technology Journal

Research Article