How We Solved a Major AI Challenge: Invoice Labeling Done Right

Overview

Our client, a fast-growing startup that has raised $20M in Series A funding from top investors, has revolutionized data extraction from invoices using artificial intelligence (AI).

They serve a global customer base, including clients paying over six figures monthly, and provide access to their AI system through an API. However, because every invoice differs in layout, format, and content, training the AI system required a scalable and accurate labeling solution.

Problem

Invoices from different countries, companies, and even industries vary significantly in layout and structure, making it difficult for AI systems to consistently extract the necessary data. This lack of uniformity slowed the client's processes, because properly training the AI system required a large amount of manual labeling. Additionally, the high cost of internal and external labelers, combined with the inability to scale quickly, made it difficult for the company to keep up with demand.

Our Solution

We stepped in to provide a scalable, high-quality labeling service that addresses these issues. Our dedicated team in the Philippines was trained to carefully analyze invoices and label them accurately, ensuring the AI system learns effectively.

To improve efficiency and maintain high quality, we developed a custom Chrome extension that integrates with the client’s website, offering several enhancements:

  • Improved interface: Making the labeling process easier and faster.
  • Keyboard shortcuts: Speeding up navigation and task execution.
  • GPT integration: Enabling labelers to ask questions about selected fields, reducing the need for external consultation.
  • In-line documentation: Providing instant access to guidelines without leaving the page.
  • Basic math operations: Allowing labelers to perform calculations directly on selected invoice fields.
  • Copy-paste functionality: Simplifying the transfer of information from the invoice.
  • VAT number search: Helping labelers find related invoices to assist in identifying consistent field placements.
  • Reviewer comments translation: Streamlining the process by translating feedback from Spanish into English for the labelers.
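
As a rough illustration of the basic math operations feature, the Python sketch below shows how amounts taken from selected invoice fields could be parsed, summed, and checked against a total. The function names (parse_amount, sum_fields, totals_match), the one-cent tolerance, and the separator heuristic are assumptions made for this example. The extension itself runs in the browser, and its actual implementation is not described here.

```python
from decimal import Decimal


def parse_amount(text: str) -> Decimal:
    """Parse an amount such as '1,234.56' or '1.234,56' into a Decimal.

    Assumption: the rightmost separator (comma or dot) is the decimal mark.
    A value with only a thousands separator (e.g. '1,234') is therefore
    read as 1.234, so a real tool would need locale information as well.
    """
    # Keep only digits, separators, and a minus sign (drops currency symbols).
    cleaned = "".join(ch for ch in text if ch.isdigit() or ch in ",.-")
    if cleaned.rfind(",") > cleaned.rfind("."):
        # Comma is the decimal mark: drop dots, then turn the comma into a dot.
        cleaned = cleaned.replace(".", "").replace(",", ".")
    else:
        # Dot is the decimal mark (or there is none): drop commas.
        cleaned = cleaned.replace(",", "")
    return Decimal(cleaned)


def sum_fields(values: list[str]) -> Decimal:
    """Add up the amounts from several selected fields."""
    return sum((parse_amount(v) for v in values), Decimal("0"))


def totals_match(net: str, vat: str, total: str,
                 tolerance: Decimal = Decimal("0.01")) -> bool:
    """Check whether net + VAT equals the total, within a rounding tolerance."""
    difference = parse_amount(net) + parse_amount(vat) - parse_amount(total)
    return abs(difference) <= tolerance
```

A labeler-facing tool could use a check like totals_match to flag an invoice where the selected net, VAT, and total fields do not add up, which often indicates a mislabeled field.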

Additionally, the Chrome extension allows us to track key performance metrics, including the number of invoices labeled, labeler speed, errors identified during review, and feedback from reviewers. These insights allow us to continuously optimize performance and focus on areas where improvement is needed.
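
To make the metrics above more concrete, here is a minimal Python sketch of how per-labeler figures could be aggregated from individual labeling records. The record fields (labeler, seconds_spent, review_errors) and the summarize function are hypothetical names chosen for this example, not the schema actually used in the project.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelingRecord:
    """One labeled invoice (hypothetical schema for illustration)."""
    labeler: str
    seconds_spent: float
    review_errors: int  # errors found by the reviewer on this invoice


def summarize(records: list[LabelingRecord]) -> dict[str, dict[str, float]]:
    """Return invoice count, average speed, and error rate per labeler."""
    grouped: dict[str, list[LabelingRecord]] = defaultdict(list)
    for record in records:
        grouped[record.labeler].append(record)

    summary: dict[str, dict[str, float]] = {}
    for labeler, items in grouped.items():
        count = len(items)  # always >= 1, so the divisions below are safe
        summary[labeler] = {
            "invoices_labeled": count,
            "avg_seconds_per_invoice": sum(i.seconds_spent for i in items) / count,
            "errors_per_invoice": sum(i.review_errors for i in items) / count,
        }
    return summary
```

Comparing errors_per_invoice against avg_seconds_per_invoice across labelers is one simple way to spot where speed may be coming at the expense of accuracy.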

Results

Our approach has significantly improved the client's AI training process:

  • Increased efficiency: With the Chrome extension, our labelers work faster, processing up to 3,000 invoices monthly.
  • Scalability: We can easily scale the team with just 20 days' notice, allowing the client to ramp up capacity as needed.
  • Cost savings: By using our team, the client has reduced costs compared to their internal and external labeling resources, while benefiting from our data-driven performance monitoring.
  • Improved accuracy: The combination of real-time performance tracking, GPT integration, and feedback loops has led to fewer errors and better quality data for AI training.
  • Regular updates: Monthly meetings with the client keep them informed of progress, provide insights from analytics, and allow us to continuously refine our approach based on feedback.

Our solution not only improved the efficiency of the labeling process but also provided the client with the ability to scale their operations quickly and affordably.