In 2023, the World Health Organization (WHO) reported that more than 2.2 billion people live with a vision impairment, and that about 1 billion of these cases could be prevented or treated.
The associated loss in global productivity is approximately US$411 billion per year.
Macular disease refers to various conditions affecting the macula, the central part of the retina. Examples include choroidal neovascularization (CNV), age-related macular degeneration (AMD), diabetic macular edema (DME), and drusen deposits. Other eye conditions, such as diabetic retinopathy (DR) and glaucoma, likewise threaten sight. Early diagnosis is critical, as these conditions can lead to irreversible vision loss. Optical coherence tomography (OCT) is widely used to image the retina, but manual interpretation of OCT images is time-consuming, labour-intensive, and susceptible to error. Computer-assisted diagnosis alongside OCT imaging can mitigate these challenges and support precise, efficient disease detection.
In this project, I developed an attention-based algorithm that automatically classifies OCT images as CNV, DME, or drusen, and distinguishes them from normal macular conditions.
The retinal OCT dataset (Kermany, Zhang and Goldbaum, 2018) consists of 108,309 training images and 1,000 test images. Two self-attention models, the Vision Transformer and the Swin Transformer, were trained to classify the OCT images.
Language: Python 3.10.
Libraries: TensorFlow, PyTorch, NumPy, Matplotlib, Transformers
GPU: NVIDIA Tesla T4
Environment: Google Colab
The Vision Transformer (ViT) architecture splits the image into patches and applies global self-attention across all of them, which lets it capture long-range dependencies across the image.
Its drawback is cost: every patch attends to every other patch, so memory use and computation grow quadratically with the number of patches.
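To make the cost of global self-attention concrete, the sketch below splits an image into patches and runs a single attention head over all of them in NumPy. It is a minimal illustration, not the project's training code: the 224 x 224 input size, the 16-pixel patch size, the 64-dimensional head, and the function names `patchify` and `global_self_attention` are all assumptions chosen for the example, and the class token, position embeddings, and multiple heads of a full ViT are omitted.

```python
import numpy as np

def patchify(image, patch_size):
    """Split an (H, W, C) image into flattened, non-overlapping patches."""
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    gh, gw = h // patch_size, w // patch_size
    patches = image.reshape(gh, patch_size, gw, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)  # (gh, gw, p, p, c)
    return patches.reshape(gh * gw, patch_size * patch_size * c)

def global_self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product attention over ALL tokens."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])       # (N, N): every token vs every token
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability for softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
image = rng.random((224, 224, 3))   # assumed input size (random stand-in image)
tokens = patchify(image, 16)        # assumed patch size -> 14 * 14 = 196 tokens
dim = 64                            # assumed head dimension
w_q, w_k, w_v = (rng.normal(scale=0.02, size=(tokens.shape[1], dim))
                 for _ in range(3))
out, weights = global_self_attention(tokens, w_q, w_k, w_v)
print(tokens.shape, weights.shape)  # (196, 768) (196, 196)
```

The attention matrix has N x N entries for N patches, so halving the patch size (four times as many patches) makes it sixteen times larger. That quadratic growth is the memory and compute cost described above.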
In contrast, the Swin Transformer partitions the image into non-overlapping windows and computes self-attention only within each window, which addresses the computational inefficiency of ViT. Successive layers alternate between regular windows and shifted windows (shifted window multi-head self-attention, SW-MSA), so information can still flow between neighbouring windows.
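The window partitioning and the shift can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the Swin implementation used in the project: the 56 x 56 token grid, the window size of 7, the 4 channels, and the helper names are hypothetical, and the attention mask that a full Swin layer applies after the cyclic shift (to stop tokens wrapped around from opposite edges attending to each other) is left out.

```python
import numpy as np

def window_partition(x, window_size):
    """(H, W, C) token grid -> (num_windows, window_size * window_size, C)."""
    h, w, c = x.shape
    assert h % window_size == 0 and w % window_size == 0
    x = x.reshape(h // window_size, window_size, w // window_size, window_size, c)
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(-1, window_size * window_size, c)

def shifted_window_partition(x, window_size):
    """Cyclically shift the grid by half a window, then partition it."""
    shift = window_size // 2
    shifted = np.roll(x, shift=(-shift, -shift), axis=(0, 1))
    return window_partition(shifted, window_size)

rng = np.random.default_rng(0)
fmap = rng.random((56, 56, 4))                    # assumed token grid and channels
windows = window_partition(fmap, 7)               # regular windows
shifted = shifted_window_partition(fmap, 7)       # shifted windows
print(windows.shape)                              # (64, 49, 4)

# Attention-matrix entries: global attention vs. attention inside each window.
num_tokens = fmap.shape[0] * fmap.shape[1]        # 3136
global_entries = num_tokens ** 2                  # 9834496
window_entries = windows.shape[0] * windows.shape[1] ** 2  # 64 * 49**2 = 153664
print(global_entries // window_entries)           # 64
```

With a fixed window size, the number of attention entries grows linearly with the number of tokens instead of quadratically. In this toy configuration the windowed attention matrices are 64 times smaller in total than one global matrix.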
The Vision Transformer achieved an accuracy of 89%, while the Swin Transformer achieved the higher accuracy of 96% in OCT classification, which may reflect its window-based attention concentrating on locally relevant features. The model classified each OCT image as CNV, DME, Drusen, or Normal.
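Overall accuracy can hide differences between the four classes, so a confusion matrix and per-class recall are a useful companion to the headline figure. The sketch below shows one way to compute them in plain Python. The label lists are made-up toy values for illustration only and are not the project's predictions; the helper names are hypothetical.

```python
CLASSES = ["CNV", "DME", "DRUSEN", "NORMAL"]

def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

def accuracy(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

def per_class_recall(matrix):
    """For each true class, the fraction of its samples predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(matrix)]

# Toy labels only (indices into CLASSES), not real model output.
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 0, 1, 3, 2, 0, 3, 3]

matrix = confusion_matrix(y_true, y_pred, len(CLASSES))
print(accuracy(matrix))  # 0.75
for name, recall in zip(CLASSES, per_class_recall(matrix)):
    print(name, recall)
```

Reading the rows of the matrix shows which diseases are confused with each other, for example a DME scan predicted as Normal in the toy data.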
References:
WHO (2023) Vision impairment and blindness. Available at: https://www.who.int/news-room/fact-sheets/detail/blindness-and-visual-impairment [Accessed: 13 June 2024].
Kermany, D., Zhang, K. and Goldbaum, M. (2018) ‘Large Dataset of Labeled Optical Coherence Tomography (OCT) and Chest X-Ray Images’, Mendeley Data, V3. Available at: https://doi.org/10.17632/rscbjbr9sj.3 [Accessed: 19 May 2024].