AI Accounting Agent

Industry

Finance

Client

Finance United

Team

SCALE

Year

2024

We were hired to build a Lithuanian-language NLP system that turns the daily flood of invoices, bank statements, and contracts into processed, compliance-checked, insight-ready data, designed from day one as a licensable product for Lithuania's 10,000+ accounting firms.



Challenge

Accounting firms run on documents. Invoices, bank statements, and contracts arrive by email, as scanned PDFs, and as digital files, in no consistent format and at volumes that grow faster than headcount. Every document needs to be validated against current regulations and checked for discrepancies. Do it manually and errors compound: a missed regulatory update becomes a compliance gap that surfaces months later in an audit.

Finance United wanted to automate these workflows, but the ambition went further than internal efficiency. Lithuania has over 10,000 accounting firms, most of them SMEs facing the same documentation bottleneck. The goal was to build a system that works for Finance United first and becomes a licensable product second, turning an internal tool into a market opportunity.

The technical barrier was language. Lithuanian is a low-resource language, which means off-the-shelf NLP models can't handle the accounting-specific terminology and legal phrasing the system needs to process accurately. The domain-specific training data therefore had to be built.

Approach

The foundation is a data ingestion pipeline that accepts documents from emails, scanned PDFs, and digital files and normalizes everything into a consistent text format. This removes the format-diversity problem: the system doesn't care how a document arrives, only what it contains.
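To make the normalization step concrete, here is a minimal sketch in Python, not the production pipeline. It assumes each source type has its own extractor and that everything ends up in one record type. The names `NormalizedDocument`, `ingest`, and the three extractor functions are hypothetical, and the scan extractor is a placeholder where a real system would call an OCR engine.

```python
import unicodedata
from dataclasses import dataclass
from email import message_from_string
from typing import Callable, Dict


@dataclass
class NormalizedDocument:
    source: str  # how the document arrived: "email", "scan", or "digital"
    text: str    # cleaned text in one consistent form


def clean_text(raw: str) -> str:
    """Compose Unicode to NFC (so Lithuanian letters such as 'ą' have a
    single representation) and collapse runs of whitespace."""
    return " ".join(unicodedata.normalize("NFC", raw).split())


def from_email(raw: str) -> str:
    """Return the body of a simple, non-multipart email message."""
    payload = message_from_string(raw).get_payload()
    return payload if isinstance(payload, str) else ""


def from_scan(raw: str) -> str:
    """Placeholder (assumption): a real system would run OCR here."""
    return raw


def from_digital(raw: str) -> str:
    """Digital files are assumed to be already decoded to text."""
    return raw


# Hypothetical registry mapping each source type to its extractor.
EXTRACTORS: Dict[str, Callable[[str], str]] = {
    "email": from_email,
    "scan": from_scan,
    "digital": from_digital,
}


def ingest(source: str, raw: str) -> NormalizedDocument:
    """Route a raw document to its extractor and normalize the result."""
    if source not in EXTRACTORS:
        raise ValueError(f"unsupported source: {source}")
    return NormalizedDocument(source, clean_text(EXTRACTORS[source](raw)))
```

Everything downstream of `ingest` sees only `NormalizedDocument.text`, which is the point of the design: adding a new intake channel means adding one extractor, not touching the model.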

The NLP model was trained specifically on Lithuanian accounting language, including the regulatory vocabulary that makes compliance checking possible. Fine-tuning for a low-resource language meant building domain-specific datasets rather than relying on pre-trained multilingual weights alone, which is a heavier lift but the only path to the accuracy that accounting work demands. The model identifies compliance issues and flags discrepancies in real time, as documents are processed, rather than weeks later during review.
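What "flagging discrepancies as documents are processed" can look like is easiest to show with a small rule-based check in Python. This is an illustrative sketch, not the NLP model itself: it assumes the invoice fields have already been extracted, and the `Invoice` fields, flag names, and `allowed_rates` parameter are all hypothetical. The VAT rates to accept are passed in by the caller rather than hard-coded, since they depend on current law.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List, Set


@dataclass
class Invoice:
    net: Decimal       # amount before VAT
    vat_rate: Decimal  # e.g. Decimal("0.21")
    vat: Decimal       # VAT amount stated on the invoice
    total: Decimal     # total stated on the invoice


def check_invoice(
    inv: Invoice,
    allowed_rates: Set[Decimal],
    tolerance: Decimal = Decimal("0.01"),
) -> List[str]:
    """Return a list of discrepancy flags; an empty list means no issue found."""
    flags: List[str] = []

    # The stated rate must be one the caller considers valid.
    if inv.vat_rate not in allowed_rates:
        flags.append("unexpected_vat_rate")

    # The stated VAT amount must match net * rate, rounded to cents.
    expected_vat = (inv.net * inv.vat_rate).quantize(Decimal("0.01"))
    if abs(expected_vat - inv.vat) > tolerance:
        flags.append("vat_amount_mismatch")

    # The stated total must equal net + VAT.
    if abs(inv.net + inv.vat - inv.total) > tolerance:
        flags.append("total_mismatch")

    return flags
```

Because the check runs per document and returns its flags immediately, a mismatch is visible at intake rather than at review time.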

Behind the model sits an adaptive knowledge base that stores insights, patterns, and best practices from every processed transaction. Unlike a static rule engine, this knowledge base improves with use: each interaction makes subsequent processing more accurate, creating a compounding advantage that deepens over time. A legal information scraper runs continuously alongside it, pulling the latest regulatory data so compliance checks always reflect current law without manual intervention.
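The "improves with use" property can be illustrated with a deliberately simple Python sketch. It assumes the knowledge base learns from confirmed categorizations and is not a description of the actual store. The class `KnowledgeBase`, its methods, and the vendor and category strings are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Optional, Tuple


class KnowledgeBase:
    """Toy adaptive store: remembers how each vendor's documents were
    categorized and suggests the most frequent category next time."""

    def __init__(self) -> None:
        self._counts: Dict[str, Counter] = defaultdict(Counter)

    def record(self, vendor: str, category: str) -> None:
        """Store one confirmed categorization for a vendor."""
        self._counts[vendor][category] += 1

    def suggest(self, vendor: str) -> Optional[Tuple[str, float]]:
        """Return (category, share of past confirmations) or None if unseen."""
        counts = self._counts.get(vendor)
        if not counts:
            return None
        category, hits = counts.most_common(1)[0]
        return category, hits / sum(counts.values())


kb = KnowledgeBase()
for category in ["utilities", "utilities", "utilities", "rent"]:
    kb.record("UAB Example", category)
suggestion = kb.suggest("UAB Example")  # most frequent category and its share
```

A static rule engine would return the same answer on day one and day one hundred. Here every confirmed transaction shifts the counts, so suggestions for frequently seen vendors become more reliable over time.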

The system also includes a gamification layer that awards points for user contributions and participation, encouraging knowledge-sharing across the platform's user base and building collective intelligence that benefits all users.

Architecture/Backend

Multi-source ingestion & normalization; adaptive knowledge base; legal info scraper.

AI/ML models

Accounting NLP in LT; real-time compliance monitoring; document insight extraction.

Infrastructure/Deployment

Licensable design; gamification; real-time self-improving knowledge base.

Key results

  1. Automated end-to-end document processing for invoices, bank statements, and contracts arriving in mixed formats, replacing the manual handling that was both the largest time sink and the primary error source.

  2. Built real-time compliance monitoring that catches regulatory discrepancies during processing, not during audits, shifting from reactive discovery to proactive prevention.

  3. Trained a Lithuanian-language NLP model on accounting and legal vocabulary, solving the low-resource language problem that made off-the-shelf solutions unusable for this domain.

  4. Architected the system as a licensable product from day one, targeting Lithuania's 10,000+ accounting firms with an adaptive knowledge base that gets more accurate with every transaction and a legal scraper that keeps compliance checks current automatically.