Capsule Vision Challenge - A deep learning solution using ResNetSE
This study presents an approach to classifying abnormalities in video capsule endoscopy
(VCE) frames using a modified ResNet101 architecture with Squeeze-and-Excitation (SE)
blocks. The aim was to build a generalized model capable of automatic abnormality
detection across ten classes: Angioectasia, Bleeding, Erosion, Erythema, Foreign
Body, Lymphangiectasia, Polyp, Ulcer, Worms, and Normal. Our approach involves
augmenting and balancing the dataset to address class imbalance, followed by training
and evaluating the model. The model achieved a mean AUC of 0.984, mean specificity of
0.990, mean average precision of 0.839, mean sensitivity of 0.759, and a balanced accuracy
of 0.780. These results demonstrate the potential of SE-ResNet101 for VCE abnormality
detection, with opportunities for further improvement in sensitivity and overall accuracy.
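
To make the gap between the reported mean specificity and mean sensitivity concrete, the following is a minimal sketch of how such class-averaged metrics can be computed from a multi-class confusion matrix. It assumes the common one-vs-rest, macro-averaged definitions; the exact definitions used for the reported figures are not stated here, and the function names and the three-class toy labels are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     n_classes: int) -> List[List[int]]:
    """Build a confusion matrix: rows are true classes, columns are predictions."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def macro_sensitivity_specificity(cm: List[List[int]]) -> Dict[str, float]:
    """One-vs-rest sensitivity and specificity per class, averaged over classes.

    Assumption: unweighted (macro) averaging, so every class counts equally
    regardless of how many samples it has.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    sensitivities, specificities = [], []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # class k predicted as something else
        fp = sum(cm[r][k] for r in range(n)) - tp   # other classes predicted as k
        tn = total - tp - fn - fp
        sensitivities.append(tp / (tp + fn) if tp + fn else 0.0)
        specificities.append(tn / (tn + fp) if tn + fp else 0.0)
    return {
        "mean_sensitivity": sum(sensitivities) / n,
        "mean_specificity": sum(specificities) / n,
    }


# Hypothetical toy labels for three classes (not data from the study).
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 0, 2, 2]
metrics = macro_sensitivity_specificity(confusion_matrix(y_true, y_pred, 3))
print(metrics)
```

Under these definitions, each class's true-negative count pools the samples of all other classes, so with ten classes specificity tends to be high even when a minority class is often missed; sensitivity depends only on that class's own samples and is therefore the more demanding figure.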