AI detects tongue movements for medical benefits, Mahidol University
Available Online 📝 Now available to read 🤓
“Encoder-decoder network with RMP for tongue segmentation” in Medical & Biological Engineering & Computing (2023), a journal of the International Federation for Medical and Biological Engineering.
✒️ Kusakunniran W., Borwarnginn P., Karnjanapreechakorn S., Thongkanchorn K., Ritthipravat P., Tuakta P., Benjapornlert P.
Abstract: The tongue and its movements can be used for several medical-related tasks, such as identifying a disease or tracking rehabilitation. To focus on the tongue region, tongue segmentation is needed to compute a region of interest for further analysis. This paper proposes an encoder-decoder CNN-based architecture for segmenting the tongue in an image.
The encoder module is mainly used for tongue feature extraction, while the decoder module reconstructs the segmented tongue from the extracted features based on training images. In addition, residual multi-kernel pooling (RMP) is applied in the proposed network to help encode the features at multiple scales.
The proposed method is evaluated on two publicly available datasets under a scenario of a front view and a single tongue posture. It is then tested on a newly collected dataset of five tongue postures. The reported performances show that the proposed method outperforms existing methods in the literature. In addition, the re-training process could improve the application of the trained model to an unseen dataset, which would be a necessary step for applying the trained model in real-world scenarios.
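The re-training mentioned in the abstract is essentially a transfer-learning step: the model already trained on the public datasets is trained further on samples from the new environment. Below is a minimal sketch of such a step, assuming a PyTorch segmentation model and a data loader for the newly collected images; the loss function, learning rate, and epoch count are illustrative assumptions, not values taken from the paper.

```python
import torch
import torch.nn as nn

def retrain(model, new_loader, epochs=10, lr=1e-4, device="cuda"):
    """Fine-tune a segmentation model on a new (unseen) dataset.

    `model` is assumed to be an nn.Module already trained on the public
    datasets; `new_loader` yields (image, mask) batches from the newly
    collected data, with masks as float tensors of 0s and 1s.
    """
    model.to(device).train()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # small LR for fine-tuning
    criterion = nn.BCEWithLogitsLoss()                       # binary tongue-vs-background loss
    for _ in range(epochs):
        for images, masks in new_loader:
            images, masks = images.to(device), masks.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), masks)
            loss.backward()
            optimizer.step()
    return model
```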
The research paper proposes a solution for tongue segmentation in images. The solution relies on a convolutional neural network: a deep U-Net built from encoder-decoder modules. The model is trained with an input resolution of 512 x 512 pixels.
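Below is a minimal sketch of this kind of architecture, assuming a PyTorch implementation: a shallow U-Net-style encoder-decoder with a residual multi-kernel pooling (RMP) block at the bottleneck. The depths, channel widths, pooling sizes, and exact placement of RMP are illustrative assumptions; the network described in the paper is considerably deeper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RMPBlock(nn.Module):
    """Residual multi-kernel pooling: pool the feature map at several
    scales, compress each pooled map with a 1x1 convolution, upsample
    back to the input size, and concatenate with the original features.
    Pool sizes here are illustrative assumptions."""

    def __init__(self, in_channels, pool_sizes=(2, 3, 5, 6)):
        super().__init__()
        self.pool_sizes = pool_sizes
        self.convs = nn.ModuleList(
            nn.Conv2d(in_channels, 1, kernel_size=1) for _ in pool_sizes
        )

    def forward(self, x):
        h, w = x.shape[2:]
        branches = [x]  # residual path keeps the original features
        for size, conv in zip(self.pool_sizes, self.convs):
            pooled = F.max_pool2d(x, kernel_size=size, stride=size)
            branches.append(F.interpolate(conv(pooled), size=(h, w),
                                          mode="bilinear", align_corners=False))
        return torch.cat(branches, dim=1)  # in_channels + len(pool_sizes) channels


def conv_block(in_ch, out_ch):
    """Two 3x3 conv + ReLU layers, as in a standard U-Net stage."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )


class TinyTongueSegNet(nn.Module):
    """Two-level U-Net-style encoder-decoder with an RMP bottleneck
    (a much smaller stand-in for the deeper network in the paper)."""

    def __init__(self, num_classes=1):
        super().__init__()
        self.enc1 = conv_block(3, 32)
        self.enc2 = conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(64, 128)
        self.rmp = RMPBlock(128)                        # adds 4 channels
        self.up2 = nn.ConvTranspose2d(128 + 4, 64, 2, stride=2)
        self.dec2 = conv_block(64 + 64, 64)
        self.up1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(32 + 32, 32)
        self.head = nn.Conv2d(32, num_classes, 1)       # per-pixel tongue logit

    def forward(self, x):                               # x: (N, 3, 512, 512)
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.rmp(self.bottleneck(self.pool(e2)))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)                            # logits at input resolution


model = TinyTongueSegNet()
logits = model(torch.randn(1, 3, 512, 512))             # -> (1, 1, 512, 512)
```

The RMP block pools the bottleneck features at several kernel sizes and concatenates the upsampled results back onto the original features, which is how multiple scales are encoded while adding only a few parameters.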
To enhance the segmentation performance of the trained model across recording environments, three main types of data augmentation are added to the training process: additive Gaussian noise, brightness multiplication and addition, and color temperature changes. These augmentations also help compensate for the limited number of samples in the available datasets.
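The three augmentation types named above correspond closely to augmenters available in the imgaug library, though the paper summary does not state which library or parameter ranges were used; the values below are illustrative assumptions.

```python
import imgaug.augmenters as iaa
from imgaug.augmentables.segmaps import SegmentationMapsOnImage

# Parameter ranges are assumptions for illustration, not values from the paper.
augmenter = iaa.Sequential([
    iaa.AdditiveGaussianNoise(scale=(0, 0.05 * 255)),               # sensor-like noise
    iaa.MultiplyAndAddToBrightness(mul=(0.7, 1.3), add=(-30, 30)),  # brightness jitter
    iaa.ChangeColorTemperature((3000, 9000)),                       # warm/cool lighting shifts
])

def augment_pair(image, mask):
    """Augment an RGB image together with its HxW integer tongue mask.

    The photometric augmenters above leave the mask untouched, but keeping
    the pair wrapped together stays correct if geometric augmenters are added.
    """
    segmap = SegmentationMapsOnImage(mask, shape=image.shape)
    image_aug, segmap_aug = augmenter(image=image, segmentation_maps=segmap)
    return image_aug, segmap_aug.get_arr()
```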
The proposed method is evaluated with four metrics: Dice coefficient, mean IoU, Jaccard distance, and accuracy. The model is trained on the publicly available datasets and then transferred and tested on the self-collected dataset from a real-world environment.
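For binary tongue masks, these metrics can be computed per image as in the sketch below; thresholding of the model output and the averaging over images that yields mean IoU are assumptions about the evaluation protocol, since the summary does not spell them out.

```python
import numpy as np

def segmentation_metrics(pred, target, eps=1e-7):
    """Dice coefficient, IoU, Jaccard distance, and pixel accuracy for
    binary masks. `pred` and `target` are 0/1 arrays of the same shape;
    `pred` is assumed to be an already-thresholded model output.
    """
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    dice = 2.0 * intersection / (pred.sum() + target.sum() + eps)
    iou = intersection / (union + eps)                  # Jaccard index; mean IoU averages this
    jaccard_distance = 1.0 - iou                        # distance form of the Jaccard index
    accuracy = (pred == target).mean()                  # fraction of correctly labelled pixels
    return {"dice": dice, "iou": iou,
            "jaccard_distance": jaccard_distance, "accuracy": accuracy}
```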