Hand Gesture Recognition with Fine-Tuned Residual Neural Network
Introduction
In a collaborative effort to improve communication for the deaf community, this project aims to evaluate various methods for hand gesture recognition using machine learning (ML) techniques. Sign language serves as a vital means of communication for the deaf, yet it remains a challenge for those unfamiliar with it. Automatic sign language recognition, powered by ML models, can potentially bridge this communication gap by enabling fast and accurate translation of hand signals into understandable language.
My Contribution:
Within this group project, my focus was on exploring the effectiveness of a fine-tuned Residual Neural Network (ResNet) architecture for hand gesture recognition. This involved adapting a ResNet50V2 model to the task at hand, comparing its performance with a pre-trained VGG16 model, and assessing the impact of fine-tuning on model accuracy and efficiency.
Methodology:
- Image Pre-processing:
- Before training, the images were pre-processed and augmented. The dataset consists of 28x28 grayscale images, so they were up-sampled and resized to meet the input requirements of the ResNet50V2 model.
- Model Architecture:
- The ResNet50V2 model, pre-trained on the ImageNet dataset, was fine-tuned for the hand gesture recognition task. This involved modifying the final layers of the network to adapt it to our specific classification task.
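The two methodology steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the helper names `upsample_to_rgb` and `build_finetune_model`, the up-sampling factor of 2 (giving 56x56 inputs), nearest-neighbour interpolation, the grayscale-to-3-channel replication, the frozen base, and the single softmax head are all assumptions made for the example. The number of classes is left as a parameter because the write-up does not state it.

```python
import numpy as np


def upsample_to_rgb(images, factor=2):
    """Up-sample (N, 28, 28) grayscale images and replicate them to 3 channels.

    Assumption: nearest-neighbour up-sampling by an integer factor. The
    project may have used a different interpolation or target size.
    """
    images = np.asarray(images, dtype=np.float32)
    # Repeat every pixel `factor` times along height and width.
    up = images.repeat(factor, axis=1).repeat(factor, axis=2)
    # ImageNet-pre-trained weights expect 3 input channels.
    rgb = np.stack([up, up, up], axis=-1)
    # Scale 0..255 to -1..1, the range used by Keras' resnet_v2.preprocess_input.
    return rgb / 127.5 - 1.0


def build_finetune_model(num_classes, input_size=56, weights="imagenet"):
    """Hypothetical fine-tuning setup: pre-trained base plus a new softmax head."""
    # Imported here so the pre-processing helper works without TensorFlow installed.
    import tensorflow as tf

    base = tf.keras.applications.ResNet50V2(
        include_top=False,            # drop the 1000-class ImageNet classifier
        weights=weights,
        input_shape=(input_size, input_size, 3),
        pooling="avg",                # global average pooling on the last feature map
    )
    base.trainable = False            # assumption: train only the new head at first

    inputs = tf.keras.Input(shape=(input_size, input_size, 3))
    features = base(inputs, training=False)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(features)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(1e-3),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

A common follow-up, once the new head has converged, is to unfreeze some of the later base layers and continue training with a much smaller learning rate. Whether the project did this is not stated here.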
Evaluation and Discussion:
- Model Performance:
- The fine-tuned ResNet50V2 achieved an accuracy of 98.581% on the test set and an F1-score of 0.97. These metrics were compared with those of a fine-tuned VGG16 model (96%) and a ResNet trained from scratch, and the comparison indicates that fine-tuning improves performance.
- Runtime Efficiency:
- Model training and evaluation were run on Google Colab's GPU runtime to reduce computation time.
- Challenges and Future Directions:
- Despite the overall success of the fine-tuned ResNet50V2 model, some challenges were identified, particularly in distinguishing subtle differences between similar hand gestures. Future iterations of the project could explore techniques to enhance the model's sensitivity to such nuances.
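As an illustration of how the metrics above can be computed, and of how confusion between similar gestures can be located, here is a small NumPy sketch. It is not the project's evaluation code: the function names are hypothetical, and macro-averaging of the F1-score is an assumption, since the write-up does not say which averaging was used.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    y_true = np.asarray(y_true, dtype=np.int64)
    y_pred = np.asarray(y_pred, dtype=np.int64)
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    # np.add.at accumulates correctly even when index pairs repeat.
    np.add.at(cm, (y_true, y_pred), 1)
    return cm


def accuracy(cm):
    """Fraction of samples on the diagonal (correct predictions)."""
    return np.trace(cm) / cm.sum()


def macro_f1(cm):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    denom = 2 * tp + fp + fn
    # Classes with no true or predicted samples get an F1 of 0.
    f1 = np.divide(2 * tp, denom, out=np.zeros_like(tp), where=denom > 0)
    return f1.mean()


def most_confused(cm, top_k=3):
    """Return up to top_k (true, predicted, count) pairs with the most errors."""
    off = cm.copy()
    np.fill_diagonal(off, 0)          # ignore correct predictions
    order = np.argsort(off, axis=None)[::-1][:top_k]
    pairs = []
    for idx in order:
        t, p = np.unravel_index(idx, off.shape)
        if off[t, p] > 0:
            pairs.append((int(t), int(p), int(off[t, p])))
    return pairs
```

Given integer class labels, `most_confused(confusion_matrix(y_true, y_pred, num_classes))` lists the gesture pairs the model mixes up most often, which is one starting point for targeted augmentation or extra training data.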
Conclusion:
In these experiments, the fine-tuned ResNet50V2 model emerged as a promising candidate for hand gesture recognition, with the potential to support communication for the deaf community. Further refinement and optimization could improve its accuracy and efficiency in real-world applications.
Because this is a university project, the code must remain in a private repository, but I can share a link upon request.