
Optical Character Recognition for Medical Records Digitization with Deep Learning

DOI: 10.60864/2p20-mz40
Citation Author(s):
Submitted by: Muhammad Ateequ...
Last updated: 17 November 2023 - 12:05pm
Document Type: Presentation Slides
Document Year: 2023
Event:
Presenters: Muhammad Ateeque Zaryab
Paper Code: 3136
 

Recent technological advances have made document digitization increasingly important, including in the medical field. Digitizing medical records plays a vital role in healthcare because it helps expedite emergency treatment. Because published studies and public German textual resources are scarce, a database of medical records with German handwriting was collected and digitized. In this study, digitization was accomplished using deep learning for region of interest (ROI) detection and optical character recognition (OCR) on a dataset of medical forms filled in with German and English characters. To find the best model for ROI detection, YOLOv5 and SSDResNet50 were compared; YOLOv5 produced the better mean average precision (mAP) of 0.91. OCR was then carried out on the YOLOv5 output with two methods, again for comparison: the Gated-CNN-BLSTM algorithm yielded a character error rate (CER) of 9%, while transformer-based OCR (TrOCR) achieved a CER of 6%. The proposed system could be implemented and further tested in local hospitals, and the OCR dictionary could be expanded to include other languages written in Roman characters.
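
For readers unfamiliar with the metric, the character error rate reported above is conventionally defined as the Levenshtein (edit) distance between the recognized text and the reference transcription, divided by the length of the reference. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function names and the sample strings are hypothetical and chosen only for illustration.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `hypothesis` into `reference`."""
    # prev[j] holds the distance between the first i-1 reference
    # characters and the first j hypothesis characters.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = prev[j - 1] + (0 if ref_char == hyp_char else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of characters in the reference."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical example: one wrong character in a nine-character word.
    cer = character_error_rate("Schmerzen", "Schmerzem")
    print(f"CER = {cer:.2%}")
```

Under this definition, a CER of 6% corresponds to roughly six character-level edits per hundred reference characters. Note that the value can exceed 100% when the recognized text is much longer than the reference.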
