
Financial document KV extraction with normalization and reconciliation #126

Open
zavera wants to merge 1 commit into Azure-Samples:main from zavera:add-financial-document-sample


zavera commented Jun 6, 2026


Purpose

Adds sample_analyze_financial_documents.py to Pre_or_post_processing_samples.

Covers five IRS form types used in financial aid and tax reconciliation workflows:

  • IRS Form 1040
  • W-2
  • Schedule C (self-employment)
  • Schedule E (rental income)
  • Schedule K-1 Form 1065 (partnership income)

The post-processing layer normalizes the raw string output from Azure Document Intelligence (DI) into typed Decimal values and optionally reconciles them against reference values, scoring each discrepancy as HIGH, MEDIUM, or LOW severity. Synthetic sample data is included in Data/.
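
For reviewers who want a feel for the normalization step before opening the sample, here is a minimal sketch of converting raw extracted strings to Decimal. It is not the code in this PR: the function name `normalize_amount`, the set of null tokens, and the choice to return percent values as fractions are all assumptions made for illustration.

```python
import re
from decimal import Decimal, InvalidOperation
from typing import Optional

# Hypothetical set of strings treated as "no value"; the sample may use a different list.
_NULL_TOKENS = {"", "n/a", "na", "none", "-", "--"}


def normalize_amount(raw: Optional[str]) -> Optional[Decimal]:
    """Convert a raw extracted string to a Decimal, or None if it holds no usable value."""
    if raw is None:
        return None
    text = raw.strip()
    if text.lower() in _NULL_TOKENS:
        return None

    # Accounting-style negatives: "(500.00)" means -500.00.
    negative = text.startswith("(") and text.endswith(")")
    if negative:
        text = text[1:-1]

    is_percent = text.endswith("%")

    # Drop currency symbols, thousands separators, percent signs, and whitespace.
    cleaned = re.sub(r"[$,%\s]", "", text)
    if cleaned.startswith("-"):
        negative = True
        cleaned = cleaned[1:]

    try:
        value = Decimal(cleaned)
    except InvalidOperation:
        return None
    if not value.is_finite():
        # Reject strings such as "NaN" or "Infinity" that Decimal would otherwise accept.
        return None

    # Assumption: percentages are returned as fractions (12.5% becomes 0.125).
    if is_percent:
        value = value / Decimal(100)
    return -value if negative else value


if __name__ == "__main__":
    for sample in ["$1,234.56", "(500.00)", "12.5%", "N/A"]:
        print(repr(sample), "->", normalize_amount(sample))
```

Returning None for unparseable input, rather than raising, keeps a single bad field from aborting a whole document; whether the sample makes the same choice should be checked against its source.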

Contributed by Ambreen Zaver, Callisto Tech.

Does this introduce a breaking change?

[x] No

Pull Request Type

[x] Feature

How to Test

pip install azure-ai-documentintelligence python-dotenv
export DOCUMENTINTELLIGENCE_ENDPOINT=
export DOCUMENTINTELLIGENCE_API_KEY=
python sample_analyze_financial_documents.py

What to Check

  • Normalization handles currency symbols, parenthetical negatives, percent values, and N/A correctly
  • W-2 non-negative field suppression works correctly
  • Delta and severity scoring matches expected thresholds
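
The delta and severity check in the list above could be exercised with something like the following sketch. The thresholds, the function name `score_delta`, and the rule that an exact match yields no severity are hypothetical; this PR description does not state the actual cutoffs, so substitute the values used in the sample.

```python
from decimal import Decimal
from typing import Optional, Tuple

# Hypothetical thresholds on the absolute dollar difference; not taken from the PR.
HIGH_THRESHOLD = Decimal("1000")
MEDIUM_THRESHOLD = Decimal("100")


def score_delta(
    extracted: Optional[Decimal], reference: Optional[Decimal]
) -> Optional[Tuple[Decimal, Optional[str]]]:
    """Return (delta, severity) for one field, or None when either value is missing."""
    if extracted is None or reference is None:
        return None

    delta = extracted - reference
    magnitude = abs(delta)

    if magnitude == 0:
        # Assumption: an exact match is reported with no severity at all.
        return delta, None
    if magnitude >= HIGH_THRESHOLD:
        return delta, "HIGH"
    if magnitude >= MEDIUM_THRESHOLD:
        return delta, "MEDIUM"
    return delta, "LOW"


if __name__ == "__main__":
    print(score_delta(Decimal("52000.00"), Decimal("50500.00")))
    print(score_delta(Decimal("10"), Decimal("60")))
    print(score_delta(Decimal("100"), Decimal("100")))
```

Keeping both values as Decimal avoids the floating-point rounding that could otherwise push a delta across a threshold boundary.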

zavera (Author) commented Jun 6, 2026

@microsoft-github-policy-service agree company="Callisto Tech"
