The client was looking for a large dataset of text documents, including forms, dense documents, certificates, invoices, receipts, bank statements, bank cheques, legal documents, clinical notes, doctor’s prescriptions, and tax statements, in a range of formats: printed, handwritten, scanned copies, mobile captures, and more. The main purpose of this dataset was to develop a reliable text document segmentation and data extraction model.
Project Category
Achievement
AIW collected forms, reports, legal documents, clinical notes, samples of doctor’s handwriting, and other documents from different sources across various geographies. These documents were then scanned using different types of scanners, including mobile phone scanners. The resulting dataset was used by a large organization to train and build their data extraction model.