# BI Resilience and Semantic Completeness — Handoff V1

**Created**: 2026-05-02
**Owner**: BI / Core API / Cube
**Trigger**: 2026-05-02 07:07:41 UTC incident — Pascucci asked "show me margin for coffee products last month", got 502 `bi_plan_invalid`
**Read first**: `CLAUDE_MEMORY_BANK_BI_RESILIENCE_AND_SEMANTIC.md` (full architectural memory), `plans/BI_RESILIENCE_AND_SEMANTIC_COMPLETENESS_V1.md`, `plans/BI_QUESTION_CATALOG_AND_CUBE_COVERAGE_V1.md`

## Goal of This Doc

A focused entry point for any future agent or operator working on:
- BI resilience (ai-gateway / core-api retry, watchdog drain, UI auto-retry).
- BI semantic completeness (margin, AR, AP, cash, runway, balance-sheet measures and their AI prompt routing + deterministic fallbacks).
- Verification of any catalog question against Finanly's truth source.

## Hard Rules

1. **Verification source for BI = Finanly only.** ERPNext is NOT a verification source for BI. The only sanctioned verification paths:
   - `POST /v1/business/bi/ask` (or `/v1/personal/bi/ask`)
   - `POST /cubejs-api/v1/load` with a tenant-scoped JWT signed using `CUBE_API_SECRET`
   - SQL against Finanly Postgres `ledger_*` tables, scoped exactly like the ask endpoint
2. **Books are clean.** If SQL disagrees with what the BI ask returns, the SQL is wrong (typically a missed `tenant_id`, `ledger_integration_instance_id`, or account-type filter, or a wrong sign convention).
3. **Cube owns the semantic layer.** Adding a new measurable BI concept means: add a Cube measure, register it in `business_report_builder_schema.py` and `cube_report_adapter.py`, add an AI prompt rule, and wire a deterministic fallback when applicable.

## Architecture Map

```
UI fetch (90s timeout, retryOn503) → Next.js BFF → core-api /v1/business/bi/ask
                                                       │
                                          retry on 503 ai_gateway_unreachable
                                          (1s/4s/12s; 17s total)
                                                       │
                                            on retries exhausted + governed_intent:
                                            _build_governed_intent_fallback_plan
                                            → _serve_deterministic_plan via Cube
                                                       │
                                                       ▼
                                             ai-gateway /internal/v1/business/bi/ask
                                                       │
                                            pause check (file marker)
                                            if all attempts 598/599 → HTTPException(503)
                                                       │
                                                       ▼
                                                     Ollama
```

The watchdog (out-of-repo) drains in this order: `POST /internal/v1/admin/pause` → `stop -t 30 ollama` → `start ollama` → poll `/internal/v1/ready` → `POST /internal/v1/admin/unpause`.
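
The core-api retry stage in the map above can be sketched as follows. This is a minimal illustration, not the code in `routers/business_bi.py`: `GatewayUnreachable`, `parse_retry_delays`, and `call_with_retry` are hypothetical names, and the sketch assumes one initial attempt followed by one retry per configured delay.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class GatewayUnreachable(Exception):
    """Hypothetical stand-in for a 503 ai_gateway_unreachable response."""


def parse_retry_delays(raw: str) -> List[float]:
    """Parse a comma-separated delay list such as '1.0,4.0,12.0'."""
    return [float(part) for part in raw.split(",") if part.strip()]


def call_with_retry(
    call_once: Callable[[], T],
    delays: List[float],
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Try once, then retry after each delay while the gateway is unreachable."""
    for delay in delays:
        try:
            return call_once()
        except GatewayUnreachable:
            sleep(delay)
    # Final attempt: if this also fails, the exception propagates so the
    # caller can decide whether a deterministic fallback applies.
    return call_once()
```

With the delays `1.0,4.0,12.0` the worst case sleeps 1 + 4 + 12 = 17 seconds before the last attempt, which matches the 17s budget in the map.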

## Files

### ai-gateway (`finanly/services/ai-gateway/src/finanly_ai_gateway/`)
- `app.py`:
  - `_OllamaUnreachableError` (raised when every attempt was 598/599)
  - `_PAUSE_MARKER`, `_is_paused()`, `_probe_ollama_ready()`
  - `GET /internal/v1/health`, `GET /internal/v1/ready`, `POST /internal/v1/admin/pause`, `POST /internal/v1/admin/unpause`
  - `_bi_generate_plan_single`, `_bi_generate_plan_agent` track per-attempt status, raise `_OllamaUnreachableError` only when ALL attempts hit 598/599
  - `_bi_ask_single_impl`, `_bi_ask_agent_impl` translate the exception to `HTTPException(503, code="ai_gateway_unreachable")`
- `bi_prompt_business_plan_v1.txt`: rules 14 (margin/profitability), 15 (opex), 16 (bottom-line), 17 (AR), 18 (AP), 19 (cash/runway), 20 (no-data clarify), 21 (deterministic period resolution)
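
The "only when ALL attempts hit 598/599" rule for `_OllamaUnreachableError` can be illustrated with a small sketch. The names below are hypothetical stand-ins, and treating an empty attempt list as "not unreachable" is an assumption rather than documented behavior.

```python
from typing import Iterable

# Statuses this doc treats as "Ollama could not be reached".
UNREACHABLE_STATUSES = {598, 599}


class OllamaUnreachable(Exception):
    """Hypothetical stand-in for _OllamaUnreachableError."""


def raise_if_all_unreachable(attempt_statuses: Iterable[int]) -> None:
    """Raise only when every recorded attempt ended in 598 or 599."""
    statuses = list(attempt_statuses)
    # Assumption: with no recorded attempts there is nothing to report.
    if statuses and all(s in UNREACHABLE_STATUSES for s in statuses):
        raise OllamaUnreachable(f"all {len(statuses)} attempts unreachable: {statuses}")
```

A mixed history such as `[598, 200]` or `[599, 500]` does not raise, so other failure modes keep their own error codes instead of being reported as `ai_gateway_unreachable`.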

### core-api (`finanly/services/core-api/src/finanly_core_api/`)
- `routers/business_bi.py`:
  - Retry: `_call_ai_gateway_bi_plan_once` + `_call_ai_gateway_bi_plan` wrapper. Env `BI_AI_GATEWAY_RETRY_DELAYS_S=1.0,4.0,12.0`. Audit events `bi.ai_gateway_retry`, `bi.ai_gateway_exhausted`.
  - Intent detection: `_detect_governed_intent`, `_is_margin_question`, `_is_operating_question`, `_is_ar_question`, `_is_ap_question`, `_is_cash_question`, `_is_balance_sheet_question`. AR/AP disambiguation uses `_AR_TOKENS` (specific) + `_AR_SHARED_AGING_TOKENS` (shared) + AP-priority guard.
  - Period resolution: `_resolve_period_for_question` returns an absolute ISO date range for "last month/quarter/year", "ytd"/"mtd"/"qtd", "last N days/weeks/months", or an explicit year, and defaults to YTD when the question names no period.
  - Plan builders: `_build_governed_margin_fallback_plan`, `_build_governed_ar_aging_fallback_plan`, `_build_governed_ap_aging_fallback_plan`, `_build_governed_cash_fallback_plan`, `_build_governed_balance_sheet_fallback_plan`, `_build_governed_intent_fallback_plan` dispatcher.
  - Wire-in: `ask_bi` wraps `_call_ai_gateway_bi_plan` in try/except — on `ai_gateway_unreachable` AND governed intent → `_serve_deterministic_plan` (compiles + executes via Cube; returns success envelope with `deterministic_fallback: True`).
  - No-data envelope: `_PRIMARY_TABLE_TO_SOURCE_TABLE`, `_NO_DATA_HINT_BY_TABLE`, `_probe_source_data_empty` — when `report.row_count == 0` AND source canonical table is empty for tenant, append actionable hint to `answer` + `no_data_reason` + `no_data_source_table` fields. Audit event `bi.no_source_data`.
- `routers/personal_bi.py`: retry mirror of business_bi. The deterministic fallback is not yet wired on personal, by design: personal questions don't share the governed-accounting set.
- `services/business_report_builder_schema.py`: 13 new synthetic-measure fields on `ledger_gl_entries` and `ledger_invoice_lines`.
- `services/cube_report_adapter.py`: 19 new `_BUSINESS_MEASURES` + 50+ new `_BUSINESS_DIMENSIONS` mappings.
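
To make the period-resolution behavior concrete, here is a minimal sketch covering a subset of the phrases listed above. `resolve_period` is a hypothetical function, not `_resolve_period_for_question` itself, and the inclusive end dates are an assumption.

```python
from datetime import date, timedelta
from typing import Tuple


def resolve_period(question: str, today: date) -> Tuple[str, str]:
    """Return an inclusive (start, end) ISO date range for a few common phrases."""
    q = question.lower()
    if "last month" in q:
        end = today.replace(day=1) - timedelta(days=1)  # last day of previous month
        start = end.replace(day=1)
    elif "last year" in q:
        start, end = date(today.year - 1, 1, 1), date(today.year - 1, 12, 31)
    elif "mtd" in q:
        start, end = today.replace(day=1), today
    else:
        # "ytd" and any question without a recognized period fall back to year-to-date.
        start, end = date(today.year, 1, 1), today
    return start.isoformat(), end.isoformat()
```

For the incident question ("show me margin for coffee products last month") evaluated on 2026-05-02, this sketch resolves to 2026-04-01 through 2026-04-30. Passing `today` explicitly keeps the function deterministic and easy to test.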

### Cube (`finanly/services/cube-core/model/business/`)
- `revenue.js`: 10 new measures — `revenueByAccount`, `expensesByAccount`, `totalCogs`, `totalOpex`, `grossProfit`, `grossMarginRate`, `operatingIncome`, `operatingMarginRate`, `netIncome`, `netMarginRate`.
- `invoice_lines.js`: 3 new measures — `costSum` (qty × landed-cost incoming_rate), `grossProfitSum`, `marginRate`.
- `ar_aging.js` (NEW): `BusinessArAging` with `arBalance`, `arCurrent`/`arBucket1_30`/`arBucket31_60`/`arBucket61_90`/`arBucket90Plus`, `daysOverdueAvg/Max`.
- `ap_aging.js` (NEW): symmetric for purchase invoices.
- `cash_balances.js` (NEW): `BusinessCashBalances` with `cashPosition` (debit-credit on Bank/Cash assets), `totalInflow`, `totalOutflow`.
- `balance_sheet.js` (NEW): `BusinessBalanceSheet` with `totalAssets`, `totalLiabilities`, `totalEquity`, `workingCapital` (sign conventions per accounting standard).
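
The aging measure names imply a bucketing rule along these lines. This is a sketch under assumptions: the boundaries are inferred from the measure names only, and basing the age on the invoice due date is a guess, so check `ar_aging.js` before relying on it.

```python
from datetime import date


def aging_bucket(due_date: date, as_of: date) -> str:
    """Map an open invoice to the AR aging measure it would contribute to."""
    days_overdue = (as_of - due_date).days
    if days_overdue <= 0:
        return "arCurrent"  # not yet past due
    if days_overdue <= 30:
        return "arBucket1_30"
    if days_overdue <= 60:
        return "arBucket31_60"
    if days_overdue <= 90:
        return "arBucket61_90"
    return "arBucket90Plus"  # assumed to mean more than 90 days overdue
```

Under this reading, `arBalance` would equal the sum of the five buckets for any given `as_of` date, which gives a quick consistency check when probing Cube.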

### UI (`ui/finanly-ui/`)
- `lib/coreApi.ts`: `coreProxyFetch` accepts `timeoutMs` (default 90s, AbortController-backed) and `retryOn503` (one automatic retry, 3s after a 503 `ai_gateway_unreachable`).
- `app/business/tools/bi/page.tsx`, `app/personal/tools/bi/page.tsx`: all 4 BI ask call sites pass `timeoutMs:90_000, retryOn503:true` and surface a friendly "AI is briefly unavailable while the model restarts. Please try again in a few seconds." message.

## How to Verify a Catalog Question

Pick a question from `plans/BI_QUESTION_CATALOG_AND_CUBE_COVERAGE_V1.md` (or its Dropbox mirror at `/mnt/sata2tb/Dropbox/Claude Files/BI_QUESTIONS_CATALOG_FINANLY.md`). Each row names the Cube measure(s) and dimension(s) needed.

```bash
# 1. Mint a tenant-scoped JWT (mirrors core-api).
CUBE_SECRET=$(grep "^CUBE_API_SECRET=" /home/docker/finanly.ai/infra/docker/.env | cut -d= -f2-)
docker exec -e CUBE_SECRET="$CUBE_SECRET" finanly-cube-core-1 node -e "
const jwt = require('jsonwebtoken');
console.log(jwt.sign({tenant_id: '<tenant-uuid>'}, process.env.CUBE_SECRET, {algorithm:'HS256'}));
"
# Tenants:
#   Pascucci USA  837eebaa-1ae8-5b79-95c3-26649fb25c42
#   Fine Line     5bd249c8-99b0-5343-a112-31aeae1c4f3a
#   Nourishing    9f26992c-66ab-5d16-af1f-b9ae98f40021
#   Ad Astrum     2328ddfa-f48d-5213-872b-aa07a3b1db01

# 2. Issue a Cube query.
docker exec finanly-cube-core-1 sh -c '
curl -s -X POST http://localhost:4000/cubejs-api/v1/load \
  -H "Authorization: <jwt-from-step-1>" \
  -H "Content-Type: application/json" \
  -d "{\"query\": {\"measures\": [\"BusinessRevenue.grossProfit\"], \"timeDimensions\": [{\"dimension\":\"BusinessRevenue.postingDate\",\"dateRange\":\"last year\"}]}}"
'

# 3. (Optional) Cross-check via raw SQL on Finanly Postgres — must use SAME tenant_id and SAME account-classification rules the Cube measure encodes.
docker exec finanly-postgres-1 psql -U finanly_bootstrap -d finanly -c "..."
```

If the Cube response differs from the SQL, the SQL is wrong.
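
When Node is not at hand, the token from step 1 can also be minted with the Python standard library. The sketch below is an assumption-labeled equivalent, not the core-api code: `mint_hs256_jwt` is a hypothetical helper, and it omits the `iat` claim that `jsonwebtoken` adds by default.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT compact form requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def mint_hs256_jwt(claims: dict, secret: str) -> str:
    """Build a compact HS256 JWT: header.payload.signature."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(
        secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{_b64url(signature)}"
```

Usage would look like `mint_hs256_jwt({"tenant_id": "<tenant-uuid>"}, cube_secret)`, where `cube_secret` holds the value of `CUBE_API_SECRET`.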

## Common Pitfalls

1. **Aggregating across all integrations**: Some tenants have multiple `integration_instances` (ERPNext + Plaid + Mercury + ...). Only the ERPNext one has GL/invoice data. The Cube measures don't filter by integration instance by default; on the BI ask path, integration scope is added at the report adapter layer. A pure Cube probe therefore aggregates across all of a tenant's integrations, which is fine when only one carries data.
2. **Sign convention**: Asset / Expense / debit-side accounts increase on debit. Income / Liability / Equity accounts increase on credit. The Cube measures encode this:
   - `revenueByAccount` = SUM(credit-debit) on `root_type='Income'`
   - `totalCogs` = SUM(debit-credit) on `account_type='Cost of Goods Sold'`
   - `cashPosition` = SUM(debit-credit) on `account_type IN ('Bank','Cash')` AND `root_type='Asset'`
3. **Voucher-type vs account-classification**: The legacy `BusinessRevenue.totalRevenue` measure filters by `voucher_type='sales invoice'`. The new `BusinessRevenue.revenueByAccount` is voucher-agnostic and uses account classification — this is the correct measure for a complete P&L because revenue can also come from journal entries, opening balances, etc.
4. **"Books look wrong" → SQL is wrong, not the data.** User has confirmed Finanly UI numbers are correct.
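
The sign conventions in pitfall 2 can be checked on a handful of rows with a sketch like this. `GlEntry` and the three functions are hypothetical and only mirror the formulas listed above; they are not the Cube model or the `ledger_gl_entries` schema.

```python
from decimal import Decimal
from typing import Iterable, NamedTuple


class GlEntry(NamedTuple):
    """Hypothetical, simplified view of one GL row."""
    root_type: str
    account_type: str
    debit: Decimal
    credit: Decimal


def revenue_by_account(entries: Iterable[GlEntry]) -> Decimal:
    """Income accounts increase on credit: SUM(credit - debit)."""
    return sum((e.credit - e.debit for e in entries if e.root_type == "Income"), Decimal("0"))


def total_cogs(entries: Iterable[GlEntry]) -> Decimal:
    """COGS accounts increase on debit: SUM(debit - credit)."""
    return sum(
        (e.debit - e.credit for e in entries if e.account_type == "Cost of Goods Sold"),
        Decimal("0"),
    )


def cash_position(entries: Iterable[GlEntry]) -> Decimal:
    """Bank/Cash asset accounts increase on debit: SUM(debit - credit)."""
    return sum(
        (
            e.debit - e.credit
            for e in entries
            if e.account_type in ("Bank", "Cash") and e.root_type == "Asset"
        ),
        Decimal("0"),
    )
```

If a hand-written SQL cross-check returns the negative of the Cube value, a flipped `debit - credit` versus `credit - debit` is the first thing to look for.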

## Tests

- `services/ai-gateway/tests/test_bi_resilience.py` (7 tests) — `_OllamaUnreachableError` semantics, pause marker, Ollama probe.
- `services/core-api/tests/test_bi_ai_gateway_resilience.py` (26 tests) — error-code extraction, intent detection (incl. the failing-incident question), period resolution, deterministic plan builder shapes, retry-config sanity.

Run inside live containers:
```bash
docker cp /home/docker/finanly.ai/finanly/services/ai-gateway/tests finanly-ai-gateway-1:/app/tests
docker exec finanly-ai-gateway-1 sh -c "cd /app && PYTHONPATH=src python -m pytest tests/test_bi_resilience.py -q"

docker cp /home/docker/finanly.ai/finanly/services/core-api/tests/test_bi_ai_gateway_resilience.py finanly-core-api-1:/app/tests/
docker exec -e FINANLY_ALLOW_LIVE_DB_TESTS=1 finanly-core-api-1 sh -c "cd /app && PYTHONPATH=src python -m pytest tests/test_bi_ai_gateway_resilience.py -q"
```

## Pending Followups

- **Personal BI deterministic fallback parity**: business_bi has the full intent dispatch; personal_bi has only the retry layer. Personal's question set is different (annotations, baskets, transactions) and will need its own intent/builder set when the user asks for parity.
- **`new-v2` catalog questions**: 21 questions need additional source-data ingestion (inventory on-hand qty for turns, full cash-flow waterfall split into operating/investing/financing, customer LTV/churn, vendor payment-timing). Each is tracked in the catalog with its data prerequisite.
- **Watchdog hardening**: `plans/OLLAMA_WATCHDOG_HARDENING_V1.md` describes the out-of-repo edits. Until applied, the watchdog still kills in-flight requests; the resilience plumbing in this work absorbs that with the 17s retry budget but the hardening eliminates the disruption entirely.

## Related Plans / Memory

- `plans/BI_RESILIENCE_AND_SEMANTIC_COMPLETENESS_V1.md`
- `plans/BI_QUESTION_CATALOG_AND_CUBE_COVERAGE_V1.md` (in-repo) and `/mnt/sata2tb/Dropbox/Claude Files/BI_QUESTIONS_CATALOG_FINANLY.md` (mirror)
- `plans/OLLAMA_WATCHDOG_HARDENING_V1.md`
- `CLAUDE_MEMORY_BANK_BI_RESILIENCE_AND_SEMANTIC.md`
- `CLAUDE_MEMORY_BANK_AI_INTEGRATION.md` (BI runtime, sister doc)
- `CLAUDE_MEMORY_BANK_CHANGELOG.md` (2026-05-02 entry)