PDF Table Extraction
for Developers & AI Agents
Register in seconds, submit PDFs via REST or MCP, receive JSON, Excel, CSV, or Markdown.
25 free extraction credits on every new account — no credit card, no subscription.
Get your API key
25 free extraction credits included — no credit card required.
Just want to use TableForge in Claude?
No API key needed — connect via Claude's Connectors UI in 3 steps.
Three ways to integrate
Choose the approach that fits your stack.
MCP Server
Zero-code integration for Claude Desktop, Cursor, Zed, and any MCP-compatible host. Add one config block — tools appear automatically in your AI assistant.
Recommended for AI agentsREST API
Async job model — submit, poll, retrieve. Works from any language or automation platform. OpenAPI spec available for SDK generation.
Any language / platformOpenAPI Spec
Machine-readable API definition at /openapi.json. Import into Postman, generate a typed SDK, or feed directly to your AI agent.
REST API — one-shot quickstart
Register → submit → poll → retrieve. All from code, no browser required.
// 1. Register (fully automated — no human step)
const reg = await fetch('https://www.tableforge.ai/api/v1/auth/register', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ email: 'bot@example.com', password: 'securepass' }),
}).then(r => r.json())
const apiKey = reg.apiKey.key // store this — shown once
// 2. Submit extraction job
const form = new FormData()
form.append('file', fs.createReadStream('report.pdf'), 'report.pdf')
form.append('mode', 'data_clean') // table | data_clean | markdown | layout_match
// form.append('layout_scope', 'pages') // layout_match only: 'tables' (default) | 'pages'
// form.append('table_names', JSON.stringify(['Revenue Summary'])) // selective by name
// form.append('pages', JSON.stringify([1, 2, 3])) // selective by page
const { jobId } = await fetch('https://www.tableforge.ai/api/v1/jobs', {
method: 'POST',
headers: { 'Authorization': `Bearer ${apiKey}` },
body: form,
}).then(r => r.json())
// 3. Poll until complete
let job
do {
await new Promise(r => setTimeout(r, 30_000))
job = await fetch(`https://www.tableforge.ai/api/v1/jobs/${jobId}`, {
headers: { 'Authorization': `Bearer ${apiKey}` },
}).then(r => r.json())
} while (job.status === 'queued' || job.status === 'processing')
// 4. Retrieve result — choose your format
// ?format=json (default) | excel | csv | markdown
const result = await fetch(
`https://www.tableforge.ai/api/v1/jobs/${jobId}/result?format=json`,
{ headers: { 'Authorization': `Bearer ${apiKey}` } }
).then(r => r.json())
console.log(`Extracted ${result.tables.length} tables from ${result.pageCount} pages`)
// Download as Excel:
// const xlsx = await fetch(`…/result?format=excel`, { headers: … })
// fs.writeFileSync('output.xlsx', Buffer.from(await xlsx.arrayBuffer()))Two-phase workflow
Scan first with POST /api/v1/analyze, then extract only the tables you want.
Scan credits and extraction credits are metered separately — previewing a document before committing to extraction avoids wasting credits on unwanted tables.
POST /api/v1/analyzeScan the document. Returns table names, page ranges, and an analysis_id. Uses scan credits only.
POST /api/v1/jobsPass analysisId + optional tableNames or pages filter. Skips re-scan — uses the cached result.
GET …/result?format=excelRetrieve as JSON, Excel, CSV, or Markdown. Same as the one-shot flow.
// Phase 1 — scan the document (uses scan credits, not extraction credits)
const form = new FormData()
form.append('file', fs.createReadStream('report.pdf'), 'report.pdf')
const scan = await fetch('https://www.tableforge.ai/api/v1/analyze', {
method: 'POST',
headers: { 'Authorization': `Bearer ${apiKey}` },
body: form,
}).then(r => r.json())
// { analysis_id, pageCount, tables: [{name, pages, estimatedRows, estimatedColumns}], expiresAt }
scan.tables.forEach((t, i) =>
console.log(`${i+1}. "${t.name}" — pages ${t.pages.join(', ')} (~${t.estimatedRows}r × ${t.estimatedColumns}c)`)
)
// 1. "Revenue Summary" — pages 1, 2 (~15r × 6c)
// 2. "Cost Breakdown" — page 3 (~20r × 4c)
// 3. "Regional Data" — pages 4, 5 (~8r × 5c)
// Phase 2 — extract only the tables you want (analysis_id skips re-scan)
const { jobId } = await fetch('https://www.tableforge.ai/api/v1/jobs', {
method: 'POST',
headers: { 'Authorization': `Bearer ${apiKey}`, 'Content-Type': 'application/json' },
body: JSON.stringify({
analysisId: scan.analysis_id,
mode: 'data_clean',
tableNames: ['Revenue Summary', 'Cost Breakdown'], // omit to extract all
// pages: [1, 2, 3], // or filter by page
}),
}).then(r => r.json())
// Poll + retrieve exactly as in the one-shot flow
// analysis_id is valid 48 hours — reuse it to run multiple jobs without paying scan credits twicetableNames does partial, case-insensitive matching. pages includes tables that overlap any of the specified page numbers. Both filters AND together when combined.Connect to Claude in seconds
No config files. No API keys to copy. Just paste a URL.
Open Claude Desktop or Claude.ai
Go to Customize → Connectors → Add custom connector
Paste this URL
https://www.tableforge.ai/api/mcpSign in and click Allow
Claude opens a browser window — sign in to TableForge and approve access. Done.
Other clients (Claude Code CLI, Cursor, Continue.dev, Zed)
Use Authorization: Bearer header — not ?key= in the URL. Keys in URLs appear in server logs, CDN caches, and browser history.
Claude Code (CLI) — run once in your terminal:
claude mcp add --transport http tableforge \
--header "Authorization: Bearer tf_live_YOUR_KEY" \
https://www.tableforge.ai/api/mcpAdd --scope user to share across all projects. OAuth auto-discovery is supported in Claude Code but has known bugs — the --header flag is more reliable.
Cursor — ~/.cursor/mcp.json:
{
"mcpServers": {
"tableforge": {
"url": "https://www.tableforge.ai/api/mcp",
"headers": {
"Authorization": "Bearer tf_live_YOUR_KEY"
}
}
}
}Cursor also supports "Authorization": "Bearer ${env:TF_KEY}" to read from an environment variable.
Continue.dev — config.yaml:
mcpServers:
tableforge:
type: streamable-http
url: https://www.tableforge.ai/api/mcp
requestOptions:
headers:
Authorization: "Bearer tf_live_YOUR_KEY"Zed — settings.json:
{
"context_servers": {
"tableforge": {
"url": "https://www.tableforge.ai/api/mcp",
"headers": {
"Authorization": "Bearer tf_live_YOUR_KEY"
}
}
}
}Zed also supports OAuth auto-discovery — omit the headers block and Zed will open a browser for sign-in.
Extraction modes
Pass as mode on job submission. Retrieve results with ?format=json|excel|csv|markdown.
Each format is a separate call — retrieve JSON for processing, then download the same job as Excel or CSV. Results persist until job expiry.
| Mode | Credits/page | Download formats | Best for |
|---|---|---|---|
| table | 1× | JSON · Excel · CSV · Markdown | Quick extraction, simple tables |
| data_clean | 2× | JSON · Excel · CSV · Markdown | Database import, analytics pipelines |
| markdown | 2× | Markdown (native) | LLM context windows, RAG pipelines |
| layout_match | 4× | JSON · Excel · CSV · Markdown | Legal docs, compliance, pixel-perfect fidelity |
layout_match — layout_scope sub-mode
Pass layout_scope to control what gets extracted from each page:
tables(default)Extracts only the detected tables from each page.
form.append('layout_scope', 'tables')pagesFull-page extraction — captures all content as it appears: text, headers, tables, and surrounding context together.
form.append('layout_scope', 'pages')JSON body: use layoutScope instead of layout_scope.
Monthly credits reset on your billing cycle. Overage billing available on all paid plans.
API access by plan
Ready for production?
Pro plan — 500 extraction pages/month, full quota, overage billing, no rate limits.
Start with the free trial above, upgrade when you're ready.