Methodology

Data sources

All tax data is sourced from authoritative public databases and cross-referenced across multiple sources for accuracy. No AI-generated or estimated data is used.

How the data is assembled

The ingest pipeline fetches structured data (CSV, JSON) from each source through its API or public download endpoint. Records are validated against a typed Zod schema, stored as individual JSON objects in a Cloudflare R2 bucket, and assembled into vector tiles with tippecanoe. Country geometry and identity fields (ISO codes, names, regions, centroids) both come from Natural Earth (public domain).
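As a rough illustration of the validation step, the sketch below checks one record before it would be stored. It deliberately swaps in a hand-written type guard instead of Zod so that it runs without dependencies. The record shape (iso3, corporateRate, source) and the function name parseTaxRecord are assumptions made for this example, not the pipeline's actual schema.

```typescript
// Hypothetical record shape; the real schema is not shown in this document.
interface TaxRecord {
  iso3: string;                 // ISO 3166-1 alpha-3 code
  corporateRate: number | null; // percent; null when no source covers it
  source: string;               // label of the source that supplied the value
}

type ParseResult =
  | { ok: true; record: TaxRecord }
  | { ok: false; errors: string[] };

// Plain type guard standing in for a Zod schema: collects every problem
// instead of stopping at the first one.
function parseTaxRecord(input: unknown): ParseResult {
  if (typeof input !== "object" || input === null) {
    return { ok: false, errors: ["record must be an object"] };
  }
  const raw = input as Record<string, unknown>;
  const errors: string[] = [];

  const iso3 = raw["iso3"];
  if (typeof iso3 !== "string" || !/^[A-Z]{3}$/.test(iso3)) {
    errors.push("iso3 must be three uppercase letters");
  }

  const rate = raw["corporateRate"];
  const rateIsValid =
    rate === null ||
    (typeof rate === "number" && Number.isFinite(rate) && rate >= 0 && rate <= 100);
  if (!rateIsValid) {
    errors.push("corporateRate must be null or a number between 0 and 100");
  }

  const source = raw["source"];
  if (typeof source !== "string" || source.length === 0) {
    errors.push("source must be a non-empty string");
  }

  if (errors.length > 0) {
    return { ok: false, errors };
  }
  return {
    ok: true,
    record: {
      iso3: iso3 as string,
      corporateRate: rate as number | null,
      source: source as string,
    },
  };
}
```

Returning a result object rather than throwing lets a pipeline log every rejected record and keep going, which is one plausible design for a batch ingest.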

Source priority

When multiple sources provide the same field, the pipeline takes the value from the highest-priority source, in this order: manually verified jurisdictions (zero-tax and territorial classifications) → Tax Foundation ITCI (OECD countries) → OECD SDMX API → Tax Foundation worldwide → PwC Tax Summaries (gap-fill).
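The priority rule can be sketched as a lookup over an ordered list. The source identifiers, the Candidate type, and the function name pickByPriority are hypothetical labels for this example. The sketch also assumes that a source reporting no value (null) is skipped so that a lower-priority source can fill the gap.

```typescript
// Priority order as listed above, highest first.
// These identifiers are illustrative, not the pipeline's actual keys.
const SOURCE_PRIORITY = [
  "manual",                   // manually verified jurisdictions
  "tax-foundation-itci",      // Tax Foundation ITCI (OECD countries)
  "oecd-sdmx",                // OECD SDMX API
  "tax-foundation-worldwide", // Tax Foundation worldwide
  "pwc",                      // PwC Tax Summaries (gap-fill)
] as const;

type SourceId = (typeof SOURCE_PRIORITY)[number];

interface Candidate {
  source: SourceId;
  value: number | null; // null means the source has no value for this field
}

// Returns the candidate from the highest-priority source that has a value,
// or null when no source covers the field.
function pickByPriority(candidates: Candidate[]): Candidate | null {
  for (const source of SOURCE_PRIORITY) {
    const hit = candidates.find((c) => c.source === source && c.value !== null);
    if (hit !== undefined) {
      return hit;
    }
  }
  return null;
}
```

Keeping the order in one array means a change in priority is a one-line edit, and the input order of the candidates does not affect the result.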

Coverage

231 jurisdictions are included. Coverage varies by field: corporate tax rates cover 208 jurisdictions (90%), personal income tax and VAT rates cover about 138 (60%), and capital gains and withholding tax rates cover about 130 (56%). The remaining gaps are small territories and island nations that no public structured source covers.
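The percentages above are each field's jurisdiction count divided by the 231 total, rounded to the nearest whole percent. A minimal sketch of that arithmetic follows; the function name coveragePercent is an assumption for illustration.

```typescript
const TOTAL_JURISDICTIONS = 231;

// Share of jurisdictions covered, rounded to the nearest whole percent.
function coveragePercent(covered: number, total: number = TOTAL_JURISDICTIONS): number {
  return Math.round((covered / total) * 100);
}

const corporate = coveragePercent(208);    // 208 / 231 ≈ 0.900 → 90
const incomeAndVat = coveragePercent(138); // 138 / 231 ≈ 0.597 → 60
const gainsAndWht = coveragePercent(130);  // 130 / 231 ≈ 0.563 → 56
```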

Refresh cadence

The dataset is refreshed monthly (1st of the month, 03:00 UTC) via the ingest-cron.yml GitHub Actions workflow.
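A schedule like this would typically be written as the cron expression 0 3 1 * * (GitHub Actions evaluates schedules in UTC), although the workflow file itself is not reproduced here. To show what the cadence means in practice, the sketch below computes the next scheduled refresh from a given moment. The function name nextRefresh is hypothetical, and the sketch ignores any start delay the scheduler may add.

```typescript
// Returns the next "1st of the month, 03:00 UTC" strictly after `now`.
function nextRefresh(now: Date): Date {
  // Candidate: the 1st of the current month at 03:00 UTC.
  const thisMonth = new Date(
    Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), 1, 3, 0, 0),
  );
  if (thisMonth.getTime() > now.getTime()) {
    return thisMonth;
  }
  // Otherwise roll over to next month. Date.UTC normalizes month 12
  // to January of the following year.
  return new Date(
    Date.UTC(now.getUTCFullYear(), now.getUTCMonth() + 1, 1, 3, 0, 0),
  );
}
```

For example, calling nextRefresh with a date in mid-December returns 03:00 UTC on 1 January of the following year.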