In humans, neck mobility degrades with aging, trauma, musculoskeletal disorders, or degenerative diseases. Cervical range of motion (CROM) measurement is a popular quantitative neck examination. Although radiography is considered the gold standard, it suffers from invasiveness, radiation exposure, and high cost.
Recently, vision-based methods have been applied to CROM measurement, but they yield large errors and require a depth camera. Deep neural networks, on the other hand, perform well on head pose estimation (HPE) from a single image and are therefore promising for medical CROM measurement.
We propose using a CNN backbone to extract pyramidal, multi-level image features, which are passed to cross-level attention modules for feature fusion, and then to a modified ASPP module for spatial-channel attention and a multi-bin classification/regression module for Euler angle prediction.
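To make the multi-bin idea concrete, the following PyTorch sketch shows one common way such a head is built (this is an illustration under assumptions, not the authors' exact implementation): each Euler angle is classified into discrete bins, and the softmax expectation over bin centers is converted back into a continuous angle. The class name, the 66-bin count, and the 3° bin width over [-99°, +99°] are HopeNet-style assumptions.

```python
import torch
import torch.nn as nn

class MultiBinHead(nn.Module):
    """Sketch of a multi-bin classification/regression head (assumed design).

    Each Euler angle (yaw, pitch, roll) gets its own classifier over
    `num_bins` bins; the expected bin center under the softmax
    distribution is the continuous angle prediction in degrees.
    """
    def __init__(self, in_features: int, num_bins: int = 66):
        super().__init__()
        # One linear classifier per Euler angle (assumption).
        self.fc_yaw = nn.Linear(in_features, num_bins)
        self.fc_pitch = nn.Linear(in_features, num_bins)
        self.fc_roll = nn.Linear(in_features, num_bins)
        # Bin centers covering roughly [-99°, +99°] in 3° steps (assumption).
        idx = torch.arange(num_bins, dtype=torch.float32)
        self.register_buffer("bin_centers", idx * 3 - 99)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        angles = []
        for fc in (self.fc_yaw, self.fc_pitch, self.fc_roll):
            logits = fc(feats)                    # (B, num_bins)
            probs = torch.softmax(logits, dim=1)  # bin classification
            # Expectation over bin centers -> continuous angle (regression).
            angles.append((probs * self.bin_centers).sum(dim=1))
        return torch.stack(angles, dim=1)         # (B, 3): yaw, pitch, roll
```

Training such a head typically combines a cross-entropy loss on the bin labels with a regression loss on the expected angle, which is what motivates the "classification/regression" naming.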
The proposed technique was evaluated on the public 300W_LP, AFLW2000, and BIWI datasets and outperformed state-of-the-art methods, with mean MAEs of 3.50°, 3.40°, and 2.31° under different experimental protocols. Our pre-trained model was also evaluated for CROM measurement on a dataset we collected at a hospital.
It achieved the lowest MAE (4.58°) among the compared methods and met a 5° medical standard for all but the pitch angle, whose MAE of 5.70° exceeds both the standard and the errors of the yaw (MAE = 3.60°) and roll (MAE = 4.44°) angles. In general, the HPE technique is feasible for CROM measurement and offers the advantages of speed, non-invasiveness, independence from anatomical landmarks, and low operating cost.
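The MAE values quoted above follow the standard per-angle definition. A minimal sketch, assuming predictions and ground truth are stored as (N, 3) arrays of (yaw, pitch, roll) in degrees (the function name and array layout are illustrative assumptions):

```python
import numpy as np

def per_angle_mae(pred: np.ndarray, gt: np.ndarray) -> dict:
    """Mean absolute error per Euler angle, in degrees.

    pred, gt: arrays of shape (N, 3) holding (yaw, pitch, roll).
    Plain absolute differences are used, as is standard for HPE
    benchmarks (no circular wraparound handling).
    """
    err = np.abs(pred - gt).mean(axis=0)  # per-angle MAE, shape (3,)
    return {"yaw": err[0], "pitch": err[1], "roll": err[2],
            "mean": err.mean()}           # overall mean MAE
```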