Make extraction models robust

Well-designed models work reliably across different document variations.

Start with AI generation

The best way to create a robust model is to let Moby generate it:

Click Generate with AI
Provide a prompt, upload a workpaper, or select sample documents
Review and refine the suggested fields

AI generation creates descriptive field names and handles common variations automatically.

Example of a good model

Here's what a well-designed invoice extraction model looks like:

Field Name	Description
Invoice Number	Invoice reference, usually at top. May appear as "Invoice #", "Inv No.", or just a number.
Invoice Date	Date of invoice, typically near the invoice number. Format varies (DD/MM/YYYY, MM/DD/YYYY, etc.).
Vendor Name	Company that issued the invoice. Usually in header or letterhead.
Total Amount	Final amount due, usually at bottom right. May be labeled "Total", "Amount Due", "Grand Total". Includes tax.
Due Date	Payment deadline. May appear as "Due Date", "Payment Due", or "Pay By".

What makes this good:

Descriptive field names (not just "Amount" or "Date")
Descriptions explain where to find the value
Mentions common label variations
Notes format differences
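
One way to make these criteria concrete is to hold the model as plain data and run a quick check over it. The sketch below is illustrative only: Field, GENERIC_NAMES, and lint_field are hypothetical helpers rather than part of Moby, and the five-word minimum is an arbitrary assumption.

```python
from dataclasses import dataclass


@dataclass
class Field:
    """Hypothetical stand-in for one row of the table above (not Moby's format)."""
    name: str
    description: str


# The example invoice model, copied from the table above.
INVOICE_MODEL = [
    Field("Invoice Number",
          'Invoice reference, usually at top. May appear as "Invoice #", '
          '"Inv No.", or just a number.'),
    Field("Invoice Date",
          "Date of invoice, typically near the invoice number. "
          "Format varies (DD/MM/YYYY, MM/DD/YYYY, etc.)"),
    Field("Vendor Name",
          "Company that issued the invoice. Usually in header or letterhead."),
    Field("Total Amount",
          'Final amount due, usually at bottom right. May be labeled "Total", '
          '"Amount Due", "Grand Total". Includes tax.'),
    Field("Due Date",
          'Payment deadline. May appear as "Due Date", "Payment Due", or "Pay By".'),
]

# Assumed list of names that are too vague on their own.
GENERIC_NAMES = {"amount", "date", "name", "number", "total"}

# Assumed minimum description length, in words.
MIN_DESCRIPTION_WORDS = 5


def lint_field(field):
    """Return a list of warnings for one field; an empty list means no issues found."""
    warnings = []
    if field.name.strip().lower() in GENERIC_NAMES:
        warnings.append("name is too generic")
    if len(field.description.split()) < MIN_DESCRIPTION_WORDS:
        warnings.append("description is very short")
    return warnings
```

Every field in INVOICE_MODEL passes both checks, while a field named "Amount" with the description "Invoice total" triggers both warnings. A check like this catches only the most obvious problems; it cannot tell whether a description is accurate for your documents.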

Refine descriptions for edge cases

If extraction misses values, improve the field description:

Before: "Invoice total"

After: "Total amount due on the invoice, usually at the bottom right. May be labeled as 'Total', 'Amount Due', 'Grand Total', or 'Balance Due'. Includes tax and shipping if applicable."
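
When deciding what to add to a description, it can help to list the labels you actually saw on the documents where extraction failed and check which ones the description does not mention yet. The helper below is a rough sketch under that assumption; missing_labels is a hypothetical function, and the case-insensitive substring match it uses is deliberately crude.

```python
def missing_labels(description, observed_labels):
    """Return the observed labels that the description does not mention.

    Uses a case-insensitive substring match, so a short label such as
    "Total" also counts as mentioned when it appears inside "Grand Total".
    """
    text = description.lower()
    return [label for label in observed_labels if label.lower() not in text]


# Labels noted on documents where the total was missed (illustrative values).
OBSERVED = ["Total", "Amount Due", "Grand Total", "Balance Due"]

BEFORE = "Invoice total"
AFTER = (
    "Total amount due on the invoice, usually at the bottom right. "
    "May be labeled as 'Total', 'Amount Due', 'Grand Total', or 'Balance Due'. "
    "Includes tax and shipping if applicable."
)

gaps_before = missing_labels(BEFORE, OBSERVED)  # three labels not yet covered
gaps_after = missing_labels(AFTER, OBSERVED)    # nothing left to add
```

With the "before" description, three of the four observed labels are unmentioned; with the "after" description, none are.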

Test across clients

A model that works for one client's invoices may need adjustment for another.

Testing workflow:

Create initial model based on sample documents
Test on 5-10 documents from each client type
Note any extraction errors
Refine descriptions to handle variations
Re-test and iterate
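
To keep track of errors between test rounds, you could record each checked value and compute accuracy per client and field. The following is a minimal sketch that assumes you log results yourself as (client, field, expected, extracted) tuples; field_accuracy and the sample data are hypothetical and do not come from Moby.

```python
from collections import defaultdict


def field_accuracy(results):
    """Compute the share of correct extractions per (client, field) pair.

    results: iterable of (client, field, expected, extracted) tuples.
    A value counts as correct only if it matches the expected value exactly.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for client, field, expected, extracted in results:
        key = (client, field)
        totals[key] += 1
        if expected == extracted:
            correct[key] += 1
    return {key: correct[key] / totals[key] for key in totals}


# Made-up test log: one field, two clients, two documents each.
SAMPLE_RESULTS = [
    ("Client A", "Invoice Number", "INV-001", "INV-001"),
    ("Client A", "Invoice Number", "INV-002", "INV-002"),
    ("Client B", "Invoice Number", "7731", "7731"),
    ("Client B", "Invoice Number", "7732", None),  # extraction missed the value
]

accuracy = field_accuracy(SAMPLE_RESULTS)
# accuracy[("Client A", "Invoice Number")] is 1.0
# accuracy[("Client B", "Invoice Number")] is 0.5
```

Grouping by client as well as by field shows which client's documents need a refined description, instead of hiding the problem in an overall average.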

When to create new models

Create a new model when:

Document format is significantly different
Different fields need to be extracted
Existing model accuracy is below 90%

Use the same model when:

Documents are similar in structure
Layout varies only slightly
Same fields need to be extracted
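
The criteria above can be written down as a simple checklist function. This is only a sketch of the decision logic described in this section: new_model_reasons is a hypothetical helper, and whether a format counts as "significantly different" remains a judgment you supply as an input.

```python
def new_model_reasons(existing_fields, required_fields, accuracy,
                      same_format, threshold=0.90):
    """Return the reasons to create a new model; an empty list means reuse.

    existing_fields / required_fields: iterables of field names.
    accuracy: measured accuracy of the existing model, from 0.0 to 1.0.
    same_format: your own judgment of whether the documents share a format.
    threshold: the 90% cutoff stated above.
    """
    reasons = []
    if not same_format:
        reasons.append("document format is significantly different")
    if set(required_fields) != set(existing_fields):
        reasons.append("different fields need to be extracted")
    if accuracy < threshold:
        reasons.append("existing model accuracy is below threshold")
    return reasons
```

For example, new_model_reasons(["Invoice Number"], ["Invoice Number"], 0.95, True) returns an empty list, which suggests keeping the existing model.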
