Create & refine Extraction Models

Extraction models define what data to pull from documents.

Field types

Type        Use for          Example
Text        Free-form text   Vendor name, description
Number      Numeric values   Amounts, quantities
Date        Date values      Invoice date, due date
List/Table  Repeating items  Line items on an invoice

Extracting lists (line items)

The List/Table field type extracts repeating data from a single document — like invoice line items, transaction rows, or inventory lists.

How list extraction works

Define a list field with its columns (sub-fields):

Field         Columns
Line Items    Description, Quantity, Unit Price, Amount
Transactions  Date, Reference, Debit, Credit

Moby extracts each row as a separate entry, preserving the table structure.
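
As a rough mental model, a list field can be pictured as a named set of columns, with each extracted row stored as one entry keyed by column name. The sketch below is illustrative only: this page does not describe Moby's API or export format, and `ListField` and `rows_to_entries` are hypothetical names, with made-up sample rows.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ListField:
    """Hypothetical stand-in for a List/Table field definition."""
    name: str
    columns: tuple[str, ...]


def rows_to_entries(field: ListField, rows: list[list[str]]) -> list[dict[str, str]]:
    """Turn raw table rows into one entry per row, keyed by column name."""
    entries = []
    for row in rows:
        if len(row) != len(field.columns):
            raise ValueError(f"expected {len(field.columns)} cells, got {len(row)}")
        entries.append(dict(zip(field.columns, row)))
    return entries


# Columns taken from the "Line Items" example above; the rows are invented.
line_items = ListField("Line Items", ("Description", "Quantity", "Unit Price", "Amount"))
entries = rows_to_entries(line_items, [
    ["Widget", "2", "10.00", "20.00"],
    ["Gadget", "1", "5.50", "5.50"],
])
```

Each element of `entries` corresponds to one row of the source table, so the row order and column layout are kept.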

When to use list extraction

Good for:

  • Invoice line items (up to ~100 rows)
  • Short transaction lists
  • Inventory summaries
  • Contract schedules

Limits: Quality drops after ~100 rows

List extraction works best for smaller tables. Once a table exceeds roughly 100 rows, extraction accuracy decreases significantly.

For large tables, use OCR to Excel instead:

  1. Select the document in Workspace
  2. Click OCR to Excel
  3. Specify the page range containing the table
  4. Export to a clean Excel file

See OCR to Excel for details.

When to choose which

Scenario                              Use
Invoice with 20 line items            List extraction
Bank statement with 500 transactions  OCR to Excel
Contract with a fee schedule          List extraction
Full general ledger export            OCR to Excel
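
The rule of thumb behind these scenarios can be written as a simple check. This is a sketch that assumes the approximate 100-row guideline from this page is used as the cutoff; it is not a hard product limit, and `choose_method` and the constant name are hypothetical.

```python
# Approximate guideline from this page, treated here as a cutoff for illustration.
LIST_EXTRACTION_ROW_LIMIT = 100


def choose_method(estimated_rows: int) -> str:
    """Suggest an approach based on the estimated number of table rows."""
    if estimated_rows < 0:
        raise ValueError("estimated_rows must not be negative")
    if estimated_rows <= LIST_EXTRACTION_ROW_LIMIT:
        return "List extraction"
    return "OCR to Excel"


invoice_choice = choose_method(20)      # invoice with 20 line items
statement_choice = choose_method(500)   # bank statement with 500 transactions
```

In practice the row count is an estimate, so tables near the cutoff are worth testing both ways.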

The fastest way to create a model is to let Moby generate it for you:

  1. Go to Models in the sidebar
  2. Click New Extraction Model
  3. Click Generate with AI
  4. Provide a prompt, upload a workpaper, or select sample documents
  5. Review the suggested fields
  6. Adjust names and descriptions if needed
  7. Save the model

Start with AI generation

AI-assisted generation is the default and recommended approach. It saves time and often catches fields you might miss manually.

Manual field creation

If you prefer to build from scratch:

  1. Click New Extraction Model
  2. Add fields manually one by one
  3. Give each field a clear name and description
  4. Save the model

Testing and iteration

Test your model on a few documents before running a large batch:

  1. Select your model
  2. Run on 3-5 sample documents
  3. Review extraction accuracy
  4. Refine field descriptions if needed
  5. Re-test until satisfied
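
One way to make the "review extraction accuracy" step concrete is to compare extracted values against values you have checked by hand and compute a per-field match rate. The sketch below assumes you have copied both sets of values into plain dictionaries yourself; `field_accuracy` is a hypothetical helper, not a Moby feature, and the sample values are invented.

```python
def field_accuracy(expected, extracted):
    """Return the share of sample documents in which each field matched exactly."""
    totals = {}
    hits = {}
    for want, got in zip(expected, extracted):
        for field, value in want.items():
            totals[field] = totals.get(field, 0) + 1
            # Compare after trimming whitespace; a missing field counts as a miss.
            if got.get(field, "").strip() == value.strip():
                hits[field] = hits.get(field, 0) + 1
    return {field: hits.get(field, 0) / totals[field] for field in totals}


# Hand-checked values for three sample documents (invented data).
expected = [
    {"Vendor Name": "Acme Ltd", "Invoice Total Amount": "120.00"},
    {"Vendor Name": "Birch Co", "Invoice Total Amount": "75.50"},
    {"Vendor Name": "Cedar LLC", "Invoice Total Amount": "300.00"},
]
# Values the model returned for the same documents (invented data).
extracted = [
    {"Vendor Name": "Acme Ltd", "Invoice Total Amount": "120.00"},
    {"Vendor Name": "Birch Co", "Invoice Total Amount": "70.00"},
    {"Vendor Name": "Cedar LLC", "Invoice Total Amount": "300.00"},
]
accuracy = field_accuracy(expected, extracted)
```

A field with a noticeably lower match rate than the others is a good candidate for a clearer description before the next test run.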

Tips for better accuracy

  • Use descriptive field names — "Invoice Total Amount" is better than "Amount"
  • Add field descriptions — Explain where the field typically appears
  • Handle variations — Mention alternate formats in descriptions (e.g., "Total" vs "Grand Total")
  • Test across clients — Models may need adjustment for different document formats
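
The first three tips can be turned into a quick self-check on a draft field list. The sketch below uses an invented heuristic (single-word names and empty descriptions are flagged); neither the data layout nor `needs_attention` comes from Moby, and the field descriptions are example wording only.

```python
# Draft fields as plain dictionaries (hypothetical layout, for illustration).
fields = [
    {
        "name": "Invoice Total Amount",
        "description": "Final amount due, usually near the bottom of the last page; "
                       "may be labeled 'Total' or 'Grand Total'.",
    },
    {"name": "Amount", "description": ""},
]


def needs_attention(field):
    """Return a list of reasons a field definition may be too vague."""
    problems = []
    if len(field["name"].split()) < 2:
        problems.append("name is a single word; consider something more specific")
    if not field["description"].strip():
        problems.append("description is empty")
    return problems


report = {field["name"]: needs_attention(field) for field in fields}
```

Here "Invoice Total Amount" passes both checks, while "Amount" is flagged for its name and its missing description.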