The project focuses on developing and enhancing OCR (Optical Character Recognition) solutions that automatically extract information, especially fees and recoverable amounts, from thousands of PDF documents.
The goal is to eliminate manual verification and achieve near 100% accuracy in extraction results.
Responsibilities:
- Develop and optimize OCR pipelines for reading and extracting structured and unstructured data from PDF documents.
- Create and maintain Python scripts for processing, validating, and transforming extracted data.
- Implement testing, performance tuning, and continuous improvements to OCR models and extraction rules.
- Collaborate with the client’s internal teams (engineering and business) to understand requirements and propose efficient technical solutions.
- Support the technical documentation and maintenance of the developed solution.
Technical Requirements:
- Advanced English for technical collaboration with the client’s global team
- Strong Python experience, ideally 8 to 10 years, covering automation, PDF manipulation, regular expressions, and testing
- Hands-on use of AI tools such as Claude or Codex, with a focus on prompt engineering
- Solid (Senior-level) experience in OCR projects
- Experience with data pipelines, modeling, and integration of results (e.g., JSON, CSV, APIs)
- QA experience
- AWS experience
- Familiarity with agentic workflows
- Strong analytical skills and attention to detail to ensure high accuracy
Nice-to-Have Qualifications:
- Previous experience in document automation projects within energy, utilities, or finance companies
- Knowledge of Machine Learning applied to OCR (e.g., layout analysis, entity recognition)
- Familiarity with Google Vision, AWS Textract, or Azure Cognitive Services
- Experience with Gemini
- Architecture background
We aim to provide our team with a welcoming, dynamic, and collaborative environment. To achieve this, we offer several initiatives, such as:
- 100% remote opportunities 👨🏻‍💻
- Home office allowance 💻
- Regular feedback 💬
- Referral program 🏅
- Psychological support 🙋🏻‍♂️
- Workplace exercise sessions 🏋️
- Knowledge academy 🧠
- Partnership with an English school 🔤
- Monthly transparency meetings 🔃
- Online happy hours 🍻
- Welcome kit 🎁