Create & refine Extraction Models

Extraction models define what data to pull from documents.

Field types

Type	Use for	Example
Text	Free-form text	Vendor name, description
Number	Numeric values	Amounts, quantities
Date	Date values	Invoice date, due date
List/Table	Repeating items	Line items on an invoice

Extracting lists (line items)

The List/Table field type extracts repeating data from a single document — like invoice line items, transaction rows, or inventory lists.

How list extraction works

Define a list field with its columns (sub-fields):

Field	Columns
Line Items	Description, Quantity, Unit Price, Amount
Transactions	Date, Reference, Debit, Credit

Moby extracts each row as a separate entry, preserving the table structure.
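To make the row-per-entry idea concrete, here is a minimal sketch of how a list field and its extracted rows could be represented. The `ListField` class and its `add_row` method are hypothetical names used for illustration only; they are not Moby's API or export format.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ListField:
    """Hypothetical stand-in for a List/Table field: a name plus its columns."""

    name: str
    columns: list[str]
    rows: list[dict[str, str]] = field(default_factory=list)

    def add_row(self, *values: str) -> None:
        # Each table row becomes one separate entry keyed by column name.
        if len(values) != len(self.columns):
            raise ValueError(
                f"expected {len(self.columns)} values, got {len(values)}"
            )
        self.rows.append(dict(zip(self.columns, values)))


# Illustrative data only; the column names come from the table above.
line_items = ListField(
    "Line Items", ["Description", "Quantity", "Unit Price", "Amount"]
)
line_items.add_row("Widget", "2", "10.00", "20.00")
line_items.add_row("Shipping", "1", "5.00", "5.00")
```

Keeping every row keyed by the same column names is what preserves the table structure: each entry can be read on its own, and the full list can still be laid out as a table.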

When to use list extraction

Good for:

Invoice line items (up to ~100 rows)
Short transaction lists
Inventory summaries
Contract schedules

Limits: Quality drops after ~100 rows

List extraction works best for smaller tables. For tables with more than about 100 rows, extraction accuracy decreases significantly.

For large tables, use OCR to Excel instead:

Select the document in Workspace
Click OCR to Excel
Specify the page range containing the table
Export to a clean Excel file

See OCR to Excel for details.

When to choose which

Scenario	Use
Invoice with 20 line items	List extraction
Bank statement with 500 transactions	OCR to Excel
Contract with a fee schedule	List extraction
Full general ledger export	OCR to Excel
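The rule of thumb behind this table can be sketched as a small helper. The function name `choose_method` and the constant `ROW_LIMIT` are assumptions made for this example; the figure of 100 is the approximate guidance given above, not a hard product limit.

```python
ROW_LIMIT = 100  # approximate threshold from the guidance above (assumption: treated as a cutoff)


def choose_method(row_count: int, limit: int = ROW_LIMIT) -> str:
    """Return the suggested approach for a table with `row_count` rows."""
    if row_count < 0:
        raise ValueError("row_count must be non-negative")
    # Small tables go to list extraction; larger ones to OCR to Excel.
    return "List extraction" if row_count <= limit else "OCR to Excel"


invoice_choice = choose_method(20)      # invoice with 20 line items
statement_choice = choose_method(500)   # bank statement with 500 transactions
```

In practice the row count is usually an estimate, so it is reasonable to treat tables near the threshold as a judgment call and test both approaches.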

Creating a new model (recommended: AI-assisted)

The fastest way to create a model is to let Moby generate it for you:

Go to Models in the sidebar
Click New Extraction Model
Click Generate with AI
Provide a prompt, upload a workpaper, or select sample documents
Review the suggested fields
Adjust names and descriptions if needed
Save the model

Start with AI generation

AI-assisted generation is the default and recommended approach. It saves time and often catches fields you might miss manually.

Manual field creation

If you prefer to build from scratch:

Click New Extraction Model
Add fields one by one
Give each field a clear name and description
Save the model

Testing and iteration

Test your model on a few documents before running a large batch:

Select your model
Run it on 3-5 sample documents
Review extraction accuracy
Refine field descriptions if needed
Re-test until satisfied
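One way to make the "review extraction accuracy" step measurable is to compare extracted values with values you have checked by hand. The sketch below assumes you have copied both sets into plain Python dictionaries, one per sample document; `field_accuracy` is a hypothetical helper, not a Moby feature, and the sample values are invented for illustration.

```python
from __future__ import annotations


def field_accuracy(expected: list[dict], extracted: list[dict]) -> dict[str, float]:
    """Share of sample documents where each field matched the hand-checked value."""
    if len(expected) != len(extracted):
        raise ValueError("expected and extracted must cover the same documents")
    fields = sorted({name for doc in expected for name in doc})
    scores = {}
    for name in fields:
        # Only count documents where a hand-checked value exists for this field.
        pairs = [(e, x) for e, x in zip(expected, extracted) if name in e]
        hits = sum(1 for e, x in pairs if x.get(name) == e[name])
        scores[name] = hits / len(pairs)
    return scores


# Illustrative data for three sample documents.
expected = [
    {"Vendor": "Acme", "Total": "120.00"},
    {"Vendor": "Globex", "Total": "75.50"},
    {"Vendor": "Initech", "Total": "300.00"},
]
extracted = [
    {"Vendor": "Acme", "Total": "120.00"},
    {"Vendor": "Globex", "Total": "75.00"},  # mismatch on Total
    {"Vendor": "Initech", "Total": "300.00"},
]

scores = field_accuracy(expected, extracted)
needs_refinement = [name for name, score in scores.items() if score < 1.0]
```

Fields that land in `needs_refinement` are the ones whose descriptions are worth revisiting before the next test run.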

Tips for better accuracy

Use descriptive field names — "Invoice Total Amount" is better than "Amount"
Add field descriptions — Explain where the field typically appears
Handle variations — Mention alternate formats in descriptions (e.g., "Total" vs "Grand Total")
Test across clients — Models may need adjustment for different document formats
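The first two tips can be turned into a quick self-check before saving a model. This is only a sketch: `review_field` and the `GENERIC_NAMES` set are hypothetical and chosen for illustration, and Moby itself may not validate field definitions this way.

```python
from __future__ import annotations

# Illustrative set of names that are too vague on their own (an assumption).
GENERIC_NAMES = {"amount", "date", "total", "name", "number"}


def review_field(name: str, description: str = "") -> list[str]:
    """Return hints for a field definition, based on the tips above."""
    hints = []
    if name.strip().lower() in GENERIC_NAMES:
        hints.append("name is generic; say which value it refers to")
    if not description.strip():
        hints.append("add a description of where the field typically appears")
    return hints


vague = review_field("Amount")
clear = review_field(
    "Invoice Total Amount",
    "Bottom of the invoice; may be labelled 'Total' or 'Grand Total'",
)
```

Here `vague` collects both hints, while `clear` comes back empty because the name is specific and the description mentions where the value appears and its alternate labels.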
