Shaping the Future of Document Processing: Rossum Presents the Winners of the DocILE Competition

  • Team using GraphDoc-based solution from USTC and iFLYTEK Research Institute wins Rossum’s DocILE competition, showcasing groundbreaking techniques in document information extraction.

  • The competition underscores the synergy between computer vision and transformer architectures, emphasizing the need for a comprehensive strategy in processing complex business documents.

  • Rossum’s DocILE initiative sparks collaboration and innovation and sets a global benchmark for Intelligent Document Processing, reinforcing the company’s commitment to advancing research in this field.

Rossum, a leader in the Intelligent Document Processing industry, is thrilled to reveal the remarkable results of its groundbreaking DocILE (Document Information Localization and Extraction) competition. This global event, which kicked off in February 2022, has left an indelible mark on the field of document processing.

Rossum launched the DocILE initiative in 2022, granting access to a treasure trove of over 6,700 meticulously annotated business documents in addition to 100,000 synthetically generated documents.

This unprecedented benchmark dataset served as the litmus test for participants worldwide, enabling them to measure their solutions against established methodologies. Over the course of a year, diverse teams harnessed this dataset to sharpen their prowess in pinpointing critical data, such as VAT numbers and company addresses, within semi-structured business documents.

The competition concluded on May 24, 2023, and attracted a wide range of submissions. Participants showcased their innovations by creating varied approaches to tackle the complex challenges inherent in document information extraction.

A team from the University of Science and Technology of China and iFLYTEK Research Institute presented a method called ‘GraphDoc’ and took first place in both the Key Information Localization and Extraction (KILE) and Line Item Recognition (LIR) tasks, surpassing other participants by a significant margin.

Their success was driven by an innovative use of transformer architecture, which gave them a head start in the competition. They introduced a noteworthy technique that involved learning which words have to be combined to get the correct extracted value, and leveraged heuristics based on data trends to further enhance their results.

The competition saw a mix of different methods, with some relying on computer vision and others on transformer architectures, demonstrating the rising popularity of the latter in the field. More importantly, the competition demonstrated that it’s necessary to understand the document simultaneously as an image and as the text it contains since purely computer vision methods and traditional transformers working only with the text cannot achieve the same performance.

By combining these two approaches, participants were able to achieve a deeper and more accurate understanding of complex business documents, where computer vision addressed specific challenges while transformers handled different aspects. This emphasized the need for a comprehensive strategy that takes into account both the text and visual structure of documents for precise interpretation.

Štěpán Šimsa, Research Scientist at Rossum, expressed his enthusiasm for the competition’s impact, stating, “The DocILE initiative has not only spurred groundbreaking research but has also facilitated industry collaboration and innovation. By bridging methodological gaps, we’re empowering the Intelligent Document Processing community to develop solutions that revolutionize business operations.”

As part of the competition, participants had to open-source their code and publish a paper describing their method. The prize pool totaled $8,000, of which $6,000 went to the winning GraphDoc solution, which received both the first-place award and the ‘Best Paper Award’.

This competition embodies Rossum’s mission to accelerate the evolution of the Intelligent Document Processing field on a global scale, establishing a benchmark for document understanding. The initiative serves as a catalyst for novel techniques that enhance the precision and efficiency of document information extraction, a testament to Rossum’s core values of innovation and excellence.
