Bankruptcy Form Scraping
Activity: Bankruptcy Form Scraping
Client: Data Vendor
This project involved the bulk identification and classification of bankruptcy-related forms on the Public Access to Court Electronic Records (PACER) system. Once the bankruptcy files had been downloaded, they were unitized (split into individual documents) to extract Chapter 7, 11, 12, and 13 bankruptcy forms. Data was then automatically extracted from these forms using optical character recognition (OCR) and a combination of rules-based and machine learning algorithms. A quality control review was then performed, and its results were used to automatically update the underlying machine learning algorithm in real time.
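As an illustration only, the rules-based part of the classification step might look something like the sketch below. It is not the project's actual code: the regular expression, the function name classify_chapter, and the fallback hook (standing in for the machine learning model) are all assumptions made for this example, and the input is assumed to be text already produced by OCR.

```python
import re
from typing import Callable, Dict, Optional

# Assumed rule: a form's chapter is stated in its text as "Chapter 7", "Chapter 11", etc.
CHAPTER_PATTERN = re.compile(r"\bchapter\s+(7|11|12|13)\b", re.IGNORECASE)


def classify_chapter(
    ocr_text: str,
    fallback: Optional[Callable[[str], Optional[int]]] = None,
) -> Optional[int]:
    """Return the bankruptcy chapter mentioned most often in the OCR text.

    If no rule matches, defer to `fallback` (a hypothetical stand-in for a
    machine learning classifier). Return None if nothing can be decided.
    """
    counts: Dict[int, int] = {}
    for match in CHAPTER_PATTERN.finditer(ocr_text):
        chapter = int(match.group(1))
        counts[chapter] = counts.get(chapter, 0) + 1

    if counts:
        # Pick the most frequent chapter; on a tie, prefer the lower number.
        return max(counts, key=lambda c: (counts[c], -c))
    if fallback is not None:
        return fallback(ocr_text)
    return None


if __name__ == "__main__":
    sample = "Voluntary Petition for Individuals Filing under Chapter 13"
    print(classify_chapter(sample))
```

Applying the cheap rule first and calling the model only when no rule matches is one common way to combine the two approaches. In a setup like this, corrections from the quality control review would feed back into whatever sits behind fallback.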