

# Automating Document Data Extraction with OpenClaw and Kudra AI

OpenClaw is an open-source AI agent that lives on your machine, capable of reading files, browsing the web, and answering you directly in Telegram or WhatsApp. But out of the box, standard LLMs struggle with highly structured, complex documents like medical records, tax forms, or layered legal contracts. They tend to hallucinate numbers or miss nested tables, creating unreliable results.

If you want your agent to process serious paperwork reliably, you need to pair it with dedicated Document Intelligence.

In this tutorial, we will build a custom **OpenClaw Skill** that connects your agent to **Kudra AI**, a document extraction engine. With it, your OpenClaw agent can hand complex documents to a tool built for them, saving you time and improving accuracy. Let's dive into the details.

---

## Why OpenClaw and Kudra Are a Perfect Match

OpenClaw uses a modular approach to AI with its **Skills** feature, allowing you to extend the agent's functionality with minimal effort. While OpenClaw's default capabilities are impressive for general tasks, it's not designed to excel at extracting data from highly structured PDFs. That's where Kudra AI comes into play.

Kudra AI is a specialized document extraction engine built to understand the fine-grained structures of complex files. Where general LLMs tend to hallucinate values, misinterpret tables, or lose context in deeply nested data, Kudra uses layout-aware optical character recognition (OCR) and context-based parsing to deliver results tailored to your document type.

By combining OpenClaw's Skills framework with Kudra's API, you can bridge this gap. Once configured, your agent can recognize when a document exceeds its native capabilities and invoke Kudra when needed.

---

## The Architecture

Before building, it's important to understand the architecture.
OpenClaw's Skills system is what makes this integration possible. A Skill in OpenClaw is simply a folder containing:

- A `SKILL.md` file, which defines the prompts and logic for when the skill should be used.
- An executable script (Python, Node.js, Bash, or another language) that does the actual processing.

In our case, the `SKILL.md` will teach OpenClaw to handle complex documents by offloading the heavy lifting to Kudra. Meanwhile, our Python script will serve as the bridge to Kudra's API, sending the document over HTTPS and returning structured data.

The same Skill works whether you're running OpenClaw on a local or a remote machine. Let's get into the practical steps.

---

## Step 1: Create the Skill Directory

First, create a directory in your OpenClaw skills folder where the new Skill will reside.

```bash
mkdir -p ~/.openclaw/skills/kudra-extractor
cd ~/.openclaw/skills/kudra-extractor
```

This directory will house everything the Skill needs to function within OpenClaw. The name `kudra-extractor` is arbitrary but descriptive.

---

## Step 2: Write the SKILL.md Prompt

The `SKILL.md` file instructs OpenClaw on how and when to apply this Skill. Here's the content of our `SKILL.md` file:

```markdown
# Kudra Document Extractor Skill

Use this skill when the user asks you to extract data from a complex PDF, medical record, or legal document.
Do not try to read complex PDFs manually. Use the `extract.py` tool provided in this directory.

Tool: `./extract.py --file <path_to_pdf>`
```

### How the Prompt Works

The `Tool` line points OpenClaw to the script `extract.py`, which will handle all the communication with Kudra. When your agent receives a task matching the skill description (e.g., "extract data from X"), OpenClaw will delegate the task to this tool. Keep the instructions clear and specific so the agent doesn't mishandle requests.
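
The folder layout described under The Architecture (a `SKILL.md` plus an executable script) is easy to get slightly wrong, for example by forgetting to make the script executable. The helper below is a minimal sketch for checking a Skill folder by hand. It is not part of OpenClaw, and the function name `check_skill_dir` is hypothetical. It only assumes the two files named in this tutorial.

```python
import os
import tempfile
from pathlib import Path


def check_skill_dir(skill_dir, script_name="extract.py"):
    """Return a list of human-readable problems found in a Skill folder."""
    skill_dir = Path(skill_dir).expanduser()
    problems = []
    if not (skill_dir / "SKILL.md").is_file():
        problems.append("SKILL.md is missing")
    script = skill_dir / script_name
    if not script.is_file():
        problems.append(f"{script_name} is missing")
    elif not os.access(script, os.X_OK):
        problems.append(f"{script_name} is not executable (run chmod +x)")
    return problems


if __name__ == "__main__":
    # Demo on a throwaway folder so nothing under ~/.openclaw is touched.
    # To check the real Skill, call:
    #   check_skill_dir("~/.openclaw/skills/kudra-extractor")
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "SKILL.md").write_text("# Kudra Document Extractor Skill\n")
        print(check_skill_dir(tmp))  # only SKILL.md exists in the demo folder
```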

---

## Step 3: Write the Extraction Script

The Python script (`extract.py`) connects OpenClaw to the Kudra API. Write the following code:

```python
#!/usr/bin/env python3
import argparse
import json
import os
import sys

import requests

parser = argparse.ArgumentParser()
parser.add_argument('--file', required=True, help="Path to the PDF document")
args = parser.parse_args()

# Kudra API setup
api_key = os.environ.get("KUDRA_API_KEY")
if not api_key:
    sys.exit("KUDRA_API_KEY is not set")
url = "https://api.kudra.ai/v1/extract"
headers = {'Authorization': f'Bearer {api_key}'}

# Read and upload the file
with open(args.file, 'rb') as f:
    files = {'document': f}
    response = requests.post(url, headers=headers, files=files, timeout=120)

# Fail on 4xx/5xx responses instead of printing an error body as if it were a result
response.raise_for_status()

# Return the extracted JSON back to OpenClaw
print(json.dumps(response.json(), indent=2))
```

### Key Points

- **Environment Variable for Security:** The API key is read from an environment variable rather than hard-coded in the script.
- **Handles File Uploads:** The script opens the PDF at the given path, uploads it, and waits for the Kudra response.
- **Fails Loudly:** A missing key or an HTTP error stops the script with a non-zero exit code, so OpenClaw sees the failure.
- **Formatted Output:** Kudra's JSON response is pretty-printed for easy reading and further processing.

Make the script executable with:

```bash
chmod +x extract.py
```

---

## Step 4: Configure Your API Key

For `extract.py` to function, OpenClaw must inject the Kudra API key into its environment. Open your OpenClaw configuration file:

```bash
nano ~/.openclaw/openclaw.json
```

Add your key under `env`:

```json
{
  "env": {
    "KUDRA_API_KEY": "sk_live_your_api_key_here"
  }
}
```

Replace `"sk_live_your_api_key_here"` with your actual Kudra API key.

---

## Step 5: Test Your New Skill

Restart OpenClaw to load your new Skill:

```bash
openclaw gateway restart
```

Now test it! For example, send your OpenClaw agent a message:

**You:** "Extract all table data and contact details from the employment contract PDF."

**OpenClaw:** (Notices the complexity, runs `extract.py`, and fetches structured data.)
*"Here's the extracted data: [Table data] [Contact details]"*

---

## Advanced Tip: Creating Custom Extraction Templates in Kudra

Kudra supports template-based extraction for specific document types. For example, if you routinely process invoices, you can create a Kudra template to extract only fields like totals, line items, and dates.

To use a template:

1. Add your template ID to the API call, replacing `invoice-template-id` with the ID of a template you created in Kudra:

   ```python
   response = requests.post(url, headers=headers, files=files, data={"template_id": "invoice-template-id"})
   ```

2. Update your `SKILL.md` to let OpenClaw know about this option:

   ```markdown
   Extract data using Kudra. Default behavior for general documents, or supply `template_id` for structured types (e.g., invoices).
   ```

---

## Benefits of Kudra AI Integration

Adding Kudra AI to OpenClaw delivers several advantages:

### 1. **Accuracy in Complexity**

Kudra's layout-aware OCR is built to capture nested tables and multi-column formats that OpenClaw alone might overlook.

### 2. **Scalability**

From a single medical form to thousands of tax filings, Kudra processes documents at scale, making it a good fit for businesses and power users.

### 3. **Customization**

Custom templates let you tailor Kudra to specific formats, so each extraction returns the fields you need.

### 4. **Seamless Workflow**

With this integration, you don't need multiple apps. Use messengers like Telegram or WhatsApp to process documents on the fly.

---

## FAQ

1. **What types of documents can Kudra handle?**
   Kudra works best with PDFs, scanned images, and structured forms. Supported types include:
   - Medical records
   - Legal contracts
   - Invoices and bills
   - Bank statements
   - Tax forms
2. **Do I need a Kudra subscription?**
   Yes, Kudra requires an active subscription. Their API pricing plans are based on usage, offering flexibility for small and large-scale operations.
3. **Can I use this Skill on remote servers?**
   Absolutely.
OpenClaw works on any machine with internet access. Copy the Skill directory to the remote machine and configure the API key there as in Step 4.
4. **What if the extraction fails?**
   If the script exits with an error or prints an error response, OpenClaw can relay the failure to you. Common reasons include invalid API keys or exceeding your API quota. Checking the HTTP status before parsing the response makes these failures easier to diagnose.
5. **Can I add more tools like Kudra?**
   Yes! OpenClaw's modular nature means you can add other APIs or tools by following the same Skill creation process.

---

## Conclusion

By combining OpenClaw's flexibility with Kudra AI's document intelligence, you can automate the extraction of complex data with precision. This integration gives your personal AI agent document-processing capabilities well beyond its defaults.

Setting it up requires only basic scripting knowledge, and the result is a custom solution tailored to your workflow. Whether you're a freelancer handling legal contracts, a doctor processing medical forms, or a business managing invoices, this setup saves you time and improves accuracy.

Start building your Kudra Skill today and unlock the true potential of OpenClaw!

---

## Advanced Use Case: Automating Bulk Document Processing

One of the biggest advantages of integrating Kudra AI with OpenClaw is the ability to process documents in bulk. For users handling many files (such as medical researchers working with patient records or businesses processing invoices), this integration takes much of the tedium out of manual work.

### Automating Bulk Workflows

To enable bulk processing, you can extend the functionality of `extract.py`:

1. Modify the script to accept a folder path as an alternative to a single file. Drop `required=True` from `--file` so the script can run with `--folder` alone:

   ```python
   import glob
   # --file must no longer be required=True, or --folder alone is rejected
   parser.add_argument('--file', help="Path to a single PDF document")
   parser.add_argument('--folder', help="Path to a folder containing PDF documents")
   ```

2. Use Python's `glob` module to loop through all PDFs in the folder:

   ```python
   if args.folder:
       # Sort so results come back in a predictable order
       pdf_files = sorted(glob.glob(os.path.join(args.folder, "*.pdf")))
       for pdf in pdf_files:
           with open(pdf, 'rb') as f:
               files = {'document': f}
               response = requests.post(url, headers=headers, files=files)
           print(f"Results for {pdf}:")
           print(json.dumps(response.json(), indent=2))
   ```

3. Update your `SKILL.md` to describe this new capability:

   ```markdown
   Use the `extract.py` tool for single or bulk PDF processing. For bulk, supply a folder path: `./extract.py --folder <path_to_folder>`.
   ```

### Practical Example: Invoice Processing at Scale

Imagine you're an accountant with a folder full of client invoices. With this Skill, you can simply drop all the PDFs into a designated folder and ask OpenClaw to process them.

On your device, in `~/Documents/Invoices/`:

- `Client1_Invoice.pdf`
- `Client2_Invoice.pdf`
- `Client3_Invoice.pdf`

You can then message OpenClaw:

> "Extract all totals and dates from the invoices in the Invoices folder."

Within minutes, you receive structured data for every file, ready to be exported to your accounting software.

---

## Comparing Kudra AI to Standard LLMs for Document Processing

It's natural to wonder why Kudra AI is necessary when general-purpose LLMs, like GPT models, exist. While LLMs are strong at understanding natural language and generating insights, they struggle to handle the structured complexity of many real-world documents.
Here's a side-by-side comparison:

| **Feature** | **Kudra AI** | **Standard LLMs** |
|---|---|---|
| **Accuracy** | High on structured tables, forms, and scanned text | Prone to errors in layout or table parsing |
| **Layout Handling** | Handles nested keys and multi-column structures | May misinterpret layouts as linear text |
| **Reliability** | Returns machine-readable JSON | May hallucinate fields or return ambiguous data |
| **Customization** | Supports templates for specific document types | Requires extensive prompt engineering |
| **API-First Design** | Designed for automation and integration | Requires intermediate steps for output parsing |

### Why Choose Kudra

The primary reason to invest in Kudra is its focus. Document processing is Kudra's specialization, while LLMs are generalists. This focus means more consistent, accurate results, particularly for high-stakes use cases like contract analysis or healthcare administration.

Integrating Kudra with OpenClaw amplifies these advantages. OpenClaw acts as the orchestrator, deferring only the necessary tasks to Kudra while handling everything else within its default capabilities.

---

## How to Enhance Security and Privacy

When working with sensitive documents like medical records or legal contracts, security is paramount. Fortunately, both Kudra and OpenClaw offer mechanisms to safeguard your data.

### Tips for Secure Integration

1. **API Key Management:** Always store your API keys securely. Do not hard-code them into scripts. Using environment variables (as shown in Step 4) is a safer practice.
2. **Local File Handling:** OpenClaw and the Skill run on your machine, but every document you extract is uploaded to Kudra's API. Send only the files that actually need extraction and keep everything else local to reduce exposure.
3. **Use Role-Based Access Control (RBAC):** Kudra supports role-based access control for API keys.
Assign limited permissions, such as restricting keys to only document extraction, for an additional layer of security.
4. **Audit Logs:** Enable logging to keep track of every document processed:

   ```python
   # Append one JSON line per document; log metadata, not the extracted content
   with open('audit_log.txt', 'a') as logger:
       logger.write(json.dumps({'file': pdf, 'status': response.status_code}) + '\n')
   ```

5. **Token Expiry:** Set short expiry times for API keys in production settings. Rotate them regularly to minimize risks.

Taking these steps helps you handle sensitive client or business documents responsibly and supports compliance with regulations like GDPR or HIPAA.
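
The audit-log tip in the security section can be taken a step further. The sketch below is an illustration, not something Kudra or OpenClaw provides. It records a timestamp, file name, SHA-256 hash, and HTTP status for each document, and deliberately keeps the extracted data out of the log. The function names `audit_entry` and `append_audit` and the `audit_log.jsonl` file name are hypothetical.

```python
import hashlib
import json
import os
import tempfile
from datetime import datetime, timezone


def audit_entry(path, status_code):
    """Build a log record that identifies a document without storing its contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in chunks so large PDFs are not loaded into memory at once
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return {
        "time": datetime.now(timezone.utc).isoformat(),
        "file": os.path.basename(path),
        "sha256": digest.hexdigest(),
        "status": status_code,
    }


def append_audit(log_path, entry):
    """Append one JSON object per line so the log stays easy to parse."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")


if __name__ == "__main__":
    # Demo with a throwaway file standing in for a processed PDF.
    with tempfile.TemporaryDirectory() as tmp:
        pdf = os.path.join(tmp, "sample.pdf")
        with open(pdf, "wb") as f:
            f.write(b"%PDF-1.4 demo")
        log_path = os.path.join(tmp, "audit_log.jsonl")
        append_audit(log_path, audit_entry(pdf, 200))
        with open(log_path, encoding="utf-8") as log:
            print(log.read())
```

In `extract.py`, you would call `append_audit(...)` with `audit_entry(pdf, response.status_code)` after each request. The hash lets you prove later which exact file was processed without keeping a copy of its contents in the log.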