# Cube Semantic Map V1 — Finanly Business BI

**Created**: 2026-05-02
**Status**: Live
**Sister docs**: `CUBE_SEMANTIC_ENGINE_HANDOFF_V1.md`, `BI_RESILIENCE_AND_SEMANTIC_HANDOFF_V1.md`, `SEMANTIC_CONNECTION_MATRIX_V1.md`
**Catalog mirror**: `plans/BI_QUESTION_CATALOG_AND_CUBE_COVERAGE_V1.md`, `/mnt/sata2tb/Dropbox/Claude Files/BI_QUESTIONS_CATALOG_FINANLY.md`

## Goal of this doc

Answer one question: **what cubes exist, what they aggregate, how they connect, and how the AI generates reports against them?**

This is the **schema-first** reference. The catalog (sister doc) is the **question-first** reference. Together they are the AI's complete knowledge map.

## How an AI BI Report is Generated (End-to-End)

```
User natural-language question ("show me margin for coffee products last month")
    │
    ▼
ai-gateway (Ollama gpt-oss:20b)
    Reads bi_prompt_business_plan_v1.txt rule set (rules 1-21).
    Emits a strict semantic plan JSON keyed by `primary_table` (a Postgres-table-name token):
      { "primary_table": "ledger_gl_entries",
        "values": [{"table":"ledger_gl_entries","field":"gross_profit","aggregation":"sum"}],
        "filters": [{"field":"posting_date","operator":"between","value":["2026-04-01","2026-04-30"]}],
        "rows":   [{"table":"ledger_gl_entries","field":"account_name"}], ... }
    │
    ▼
core-api semantic_plan_compiler + cube_report_adapter
    Translates the table-name plan to a Cube query:
      - `_BUSINESS_DIMENSIONS[(table, field)]`  → Cube dimension path  (e.g. `BusinessRevenue.accountName`)
      - `_BUSINESS_MEASURES[(table, field, agg)]` → Cube measure path  (e.g. `BusinessRevenue.grossProfit`)
    │
    ▼
Cube  (POST /cubejs-api/v1/load with tenant-scoped JWT)
    Runs the SQL the cube's `sql:` block defines (with SECURITY_CONTEXT.tenant_id substitution).
    │
    ▼
Returns rows + columns to core-api → user
```

The AI never sees Cube measure names directly. It speaks in **canonical Postgres table+field tokens** (`ledger_gl_entries.gross_profit`). The adapter is the dictionary.

## Cube Inventory (Business mode, live as of 2026-05-02)

**18 business cubes, 89 measures, 668 dimensions.** Verified via `GET /cubejs-api/v1/meta`.

### Source-table cubes (1:1 with a canonical Postgres table)

| Cube | Source table | Measures | Key dims | AI `primary_table` token |
|------|--------------|----------|----------|---------------------------|
| `BusinessRevenue` | `ledger_gl_entries` joined to `ledger_accounts` + counterparty/customer/supplier | 16 | postingDate, accountName, accountType, accountRootType, counterpartyDisplayName, counterpartyGroupName, customerGroup, customerTerritory, supplierGroup, supplierCountry, currency, costCenterSourceObjectId | `ledger_gl_entries` |
| `BusinessInvoiceLines` | `ledger_invoice_lines` joined to invoices + counterparty + product + customer + supplier | 10 | productItemName, productItemGroup, productBrand, productItemCode, customerName, customerGroup, customerTerritory, supplierName, supplierGroup, salesChannel, invoicePostingDate (+ 250+ JSON-payload dims) | `ledger_invoice_lines` |
| `BusinessInvoices` | `ledger_invoices` | 3 | invoiceKind, status, postingDate, dueDate, counterpartyName, salesChannel | `ledger_invoices` |
| `BusinessProducts` | `ledger_products` | 5 | itemCode, itemName, itemGroup, brand, barcode (+ 125+ JSON-payload dims for ERPNext fields) | `ledger_products` |
| `BusinessCustomers` | `ledger_customers` | 3 | customerName, customerType, customerGroup, territory, defaultCurrency, address (+ 60+ JSON-payload dims) | `ledger_customers` |
| `BusinessSuppliers` | `ledger_suppliers` | 3 | supplierName, supplierType, supplierGroup, country, defaultCurrency, address (+ 40+ JSON-payload dims) | `ledger_suppliers` |
| `BusinessCounterparties` | `ledger_counterparties` | 1 | displayName, counterpartyKind, groupName, territory | `ledger_counterparties` |
| `BusinessAccounts` | `ledger_accounts` | 1 | accountName, rootType, accountType, accountCurrency, parentSourceObjectId | `ledger_accounts` |
| `BusinessTime` | `transactions` | 2 | postedAt, amount, currency, direction, counterpartyName, status, sourceSystem | `transactions` |
| `BusinessExpenses` | `ledger_gl_entries` view filtered to expense | 2 | (small) | (not used directly) |
| `BusinessShopifyChannelDaily` | `shopify_daily_channel_metrics` | 3 | metricDate, channelGroup, channelKey, currency | `shopify_daily_channel_metrics` |
| `BusinessGa4ChannelDaily` | `ga4_daily_channel_metrics` | 5 | metricDate, channelGroup, channelKey, propertyId | `ga4_daily_channel_metrics` |
| `BusinessBlendedShopifyGa4Daily` | `blended_shopify_ga4_daily` | 7 | metricDate, channelGroup, channelKey | `blended_shopify_ga4_daily` |
| `BusinessSmoke` | dummy | 1 | (test only) | – |

### Aggregation/derived cubes (V1 of this work — 2026-05-02)

These are NEW cubes added by the BI Resilience + Semantic Completeness work. They do **not** map 1:1 to a Postgres table — they're tenant-scoped views over `ledger_gl_entries` or `ledger_invoices` with SQL-LATERAL denormalization. They expose synthetic `primary_table` tokens (`ledger_ar_aging`, `ledger_ap_aging`, `ledger_cash_balances`, `ledger_balance_sheet`) that the AI prompt is taught to use for receivables/payables/cash/balance-sheet questions.

| Cube | Source | Measures | Key dims | AI `primary_table` token |
|------|--------|----------|----------|---------------------------|
| `BusinessArAging` | `ledger_invoices WHERE invoice_kind ∈ {sales|sales_invoice|sales invoice} AND outstanding_amount > 0` + LATERAL counterparties + customers | arBalance, arCurrent, arBucket1_30, arBucket31_60, arBucket61_90, arBucket90Plus, daysOverdueAvg, daysOverdueMax, invoiceCount | postingDate, dueDate, agingBucket, counterpartyDisplayName, counterpartyGroupName, customerGroup, customerTerritory, salesChannel | `ledger_ar_aging` |
| `BusinessApAging` | `ledger_invoices WHERE invoice_kind ∈ {purchase|purchase_invoice|purchase invoice} AND outstanding_amount > 0` + LATERAL counterparties + suppliers | apBalance, apCurrent, apBucket1_30, apBucket31_60, apBucket61_90, apBucket90Plus, daysOverdueAvg, daysOverdueMax, invoiceCount | postingDate, dueDate, agingBucket, counterpartyDisplayName, supplierGroup, supplierCountry | `ledger_ap_aging` |
| `BusinessCashBalances` | `ledger_gl_entries` LATERAL accounts where `account_type ∈ {Bank,Cash} AND root_type=Asset` | cashPosition (= SUM(debit-credit)), totalInflow (= SUM(debit)), totalOutflow (= SUM(credit)), entryCount | postingDate, postingMonth, postingQuarter, postingYear, accountName, accountType, accountCurrency | `ledger_cash_balances` |
| `BusinessBalanceSheet` | `ledger_gl_entries` LATERAL accounts where `root_type ∈ {asset,liability,equity}` | totalAssets (debit-credit on assets), totalLiabilities (credit-debit on liabilities), totalEquity (credit-debit on equity), workingCapital, entryCount | postingDate, accountName, accountType, rootType, accountCurrency | `ledger_balance_sheet` |

### How aggregation cubes connect (cross-cube join policy)

**No `joins:` declarations between cubes.** All aggregation cubes are self-contained:

- `BusinessRevenue.sql` includes LATERAL JOINs to `ledger_accounts`, `ledger_counterparties`, `ledger_customers`, `ledger_suppliers` — so account / counterparty / customer / supplier dimensions are first-class fields on the cube itself.
- `BusinessInvoiceLines.sql` includes LATERAL JOINs to `ledger_invoices`, `ledger_counterparties`, `ledger_products`, `ledger_customers`, `ledger_suppliers` — same pattern.
- The 4 new aggregation cubes (`BusinessArAging`, `BusinessApAging`, `BusinessCashBalances`, `BusinessBalanceSheet`) follow the same pattern: SQL-LATERAL denormalization, no Cube-level `joins:` block.

**Why no cube-to-cube joins?** Because the cube's `sql:` block is where tenant filtering happens (`WHERE tenant_id = SECURITY_CONTEXT.tenant_id`). Cube-level joins would need to push the same filter through the join graph, which is harder to reason about than denormalization. The denormalized pattern keeps each cube tenant-scoped and self-contained.

**Trade-off**: a question like *"AR balance grouped by ledger_products.brand"* cannot be answered by joining `BusinessArAging` to `BusinessProducts` at the Cube layer — but no real BI question does that (AR is invoice-header; brand is line-item). Multi-cube questions in practice only mix Revenue + Invoice-lines, both of which already include their cross-table dims via LATERAL.

## Complete Measure Catalog (Business Mode)

### `BusinessRevenue` (16 measures)

Source: `ledger_gl_entries` joined to `ledger_accounts`. Sign convention: `revenue_amount = credit-debit`, `expense_amount = debit-credit`.

| Measure | Formula | Use |
|---------|---------|-----|
| `entryCount` | `count` | Q-INC count |
| `totalRevenue` | SUM(revenue_amount WHERE voucher_type='sales invoice' AND root_type='Income') | Existing — voucher-filtered, narrower |
| `totalExpenses` | SUM(expense_amount WHERE voucher_type='purchase invoice' AND root_type='Expense') | Existing — voucher-filtered |
| `totalAmount` | SUM(revenue_amount) | Raw |
| `totalDebit` | SUM(debit) | Raw |
| `totalCredit` | SUM(credit) | Raw |
| `revenueByAccount` (NEW) | SUM(revenue_amount WHERE root_type='Income') | **Voucher-agnostic revenue** — captures journal entries, openings, etc. |
| `expensesByAccount` (NEW) | SUM(expense_amount WHERE root_type='Expense') | Voucher-agnostic expenses |
| `totalCogs` (NEW) | SUM(expense_amount WHERE account_type='Cost of Goods Sold') | Q-COG-* — account-classification truth |
| `totalOpex` (NEW) | SUM(expense_amount WHERE root_type='Expense' AND account_type != 'Cost of Goods Sold') | Q-OPX-001 |
| `grossProfit` (NEW) | `revenueByAccount − totalCogs` | Q-GM-001 |
| `grossMarginRate` (NEW) | `(revenueByAccount − totalCogs) / NULLIF(revenueByAccount, 0)` | Q-GM-002 |
| `operatingIncome` (NEW) | `revenueByAccount − totalCogs − totalOpex` | Q-BTM-001 |
| `operatingMarginRate` (NEW) | `operatingIncome / NULLIF(revenueByAccount, 0)` | Q-BTM-002 |
| `netIncome` (NEW) | `revenueByAccount − expensesByAccount` | Q-BTM-004 |
| `netMarginRate` (NEW) | `netIncome / NULLIF(revenueByAccount, 0)` | Q-BTM-005 |

### `BusinessInvoiceLines` (10 measures)

Source: `ledger_invoice_lines` joined to invoices + counterparty + product + customer + supplier.

| Measure | Formula | Use |
|---------|---------|-----|
| `lineCount` | `count` | Q-INC line count |
| `quantitySum` | SUM(quantity) | Q-INC-015 |
| `stockQtySum` | SUM(stock_qty) | – |
| `netAmountSum` | SUM(net_amount) | Q-INC-003-014 (per-product/customer/channel revenue) |
| `baseNetAmountSum` | SUM(base_net_amount) | – |
| `rateAvg` | AVG(rate) | – |
| `incomingRateAvg` | AVG(incoming_rate) | Avg landed cost per unit |
| `costSum` (NEW) | SUM(quantity × COALESCE(NULLIF(incoming_rate,0), product_valuation_rate, product_standard_rate, 0)) | **Per-line landed COGS** (incoming_rate already includes landed allocations from Phase F) |
| `grossProfitSum` (NEW) | `netAmountSum − costSum` | Q-GM-004 (per-product gross profit) |
| `marginRate` (NEW) | `(netAmountSum − costSum) / NULLIF(netAmountSum, 0)` | Q-GM-003 (per-product margin %) |

### `BusinessArAging` / `BusinessApAging` (9 measures each)

Aging buckets computed against `CURRENT_DATE`:
- `current`: due_date >= today OR no due_date
- `d1_30`, `d31_60`, `d61_90`, `d90_plus`: graded by `(today - due_date)` days

Each measure (e.g. `arBucket1_30`) is `SUM(outstanding_amount WHERE aging_bucket = '<bucket>')`. `arBalance` / `apBalance` = SUM(outstanding_amount) over all buckets.

### `BusinessCashBalances` (4 measures)

Source: `ledger_gl_entries` filtered to `account_type ∈ {Bank,Cash}` AND `root_type='Asset'`. Cash-side accounting:
- `cashPosition = SUM(debit - credit)` (assets increase on debit)
- `totalInflow = SUM(debit)`, `totalOutflow = SUM(credit)`

For "current cash position" use without timeDimensions; for "monthly trend" group by `postingMonth`/`postingQuarter`/`postingYear`.

### `BusinessBalanceSheet` (5 measures)

Sign-conventional sums per root_type:
- `totalAssets = SUM(debit-credit WHERE root_type='Asset')`
- `totalLiabilities = SUM(credit-debit WHERE root_type='Liability')`
- `totalEquity = SUM(credit-debit WHERE root_type='Equity')`
- `workingCapital` proxies current-asset accounts (Bank, Cash, Receivable, Stock) minus current-liability accounts (Payable, Tax, Current Liability).

## AI ↔ Cube Mapping Layer (the dictionary)

The AI emits plans keyed by `(table, field, aggregation)`. The translator lives in `finanly/services/core-api/src/finanly_core_api/services/cube_report_adapter.py`.

### Measures (table, field, aggregation) → Cube path

| Token (from AI) | Cube path |
|-----------------|-----------|
| `ledger_gl_entries.amount.sum` | `BusinessRevenue.totalAmount` |
| `ledger_gl_entries.revenue_by_account.sum` (NEW) | `BusinessRevenue.revenueByAccount` |
| `ledger_gl_entries.cogs.sum` (NEW) | `BusinessRevenue.totalCogs` |
| `ledger_gl_entries.opex.sum` (NEW) | `BusinessRevenue.totalOpex` |
| `ledger_gl_entries.gross_profit.sum` (NEW) | `BusinessRevenue.grossProfit` |
| `ledger_gl_entries.gross_margin_rate.avg` (NEW) | `BusinessRevenue.grossMarginRate` |
| `ledger_gl_entries.operating_income.sum` (NEW) | `BusinessRevenue.operatingIncome` |
| `ledger_gl_entries.net_income.sum` (NEW) | `BusinessRevenue.netIncome` |
| `ledger_invoice_lines.cost.sum` (NEW) | `BusinessInvoiceLines.costSum` |
| `ledger_invoice_lines.gross_profit.sum` (NEW) | `BusinessInvoiceLines.grossProfitSum` |
| `ledger_invoice_lines.margin_rate.avg` (NEW) | `BusinessInvoiceLines.marginRate` |
| `ledger_ar_aging.outstanding_amount.sum` (NEW) | `BusinessArAging.arBalance` |
| `ledger_ar_aging.ar_bucket_1_30.sum` (NEW) | `BusinessArAging.arBucket1_30` |
| `ledger_ap_aging.outstanding_amount.sum` (NEW) | `BusinessApAging.apBalance` |
| `ledger_cash_balances.cash_position.sum` (NEW) | `BusinessCashBalances.cashPosition` |
| `ledger_balance_sheet.total_assets.sum` (NEW) | `BusinessBalanceSheet.totalAssets` |
| ... (full table in `cube_report_adapter.py:_BUSINESS_MEASURES`) | |

### Dimensions (table, field) → Cube path

Same pattern. Lives in `_BUSINESS_DIMENSIONS` in the adapter. Examples:

| Token | Cube path |
|-------|-----------|
| `ledger_gl_entries.account_name` (NEW) | `BusinessRevenue.accountName` |
| `ledger_gl_entries.account_type` (NEW) | `BusinessRevenue.accountType` |
| `ledger_gl_entries.posting_date` | `BusinessRevenue.postingDate` (time) |
| `ledger_invoice_lines.product_item_name` | `BusinessInvoiceLines.productItemName` |
| `ledger_ar_aging.aging_bucket` (NEW) | `BusinessArAging.agingBucket` |
| `ledger_cash_balances.account_name` (NEW) | `BusinessCashBalances.accountName` |

## How AI is Taught to Use This (prompt rules)

`finanly/services/ai-gateway/src/finanly_ai_gateway/bi_prompt_business_plan_v1.txt` rule-by-rule:

| Rule | Question intent | AI emits `primary_table` | Resolves to Cube |
|------|-----------------|---------------------------|-------------------|
| 9 | "sales by product by customer" | `ledger_invoice_lines` | `BusinessInvoiceLines.netAmountSum` × productItemName + customerGroup |
| 10 | "revenue by customer" | `ledger_invoices` | `BusinessInvoices.totalInvoiced` |
| 10b | sales/revenue by product trait | `ledger_invoice_lines` | `BusinessInvoiceLines.netAmountSum` × productCustomFields/JSON dims |
| 12 | "paid/spent to vendor or class" | `transactions` | `BusinessTime.netAmount` × counterpartyName |
| **14 (NEW)** | margin / gross profit / GP% / COGS / profitability | `ledger_gl_entries` (company) or `ledger_invoice_lines` (per-product) | `BusinessRevenue.grossProfit/grossMarginRate/totalCogs` or `BusinessInvoiceLines.grossProfitSum/marginRate` |
| **15 (NEW)** | OpEx | `ledger_gl_entries` | `BusinessRevenue.totalOpex` |
| **16 (NEW)** | operating income / net income / EBITDA | `ledger_gl_entries` | `BusinessRevenue.operatingIncome/netIncome` |
| **17 (NEW)** | AR / receivables / aging / DSO | `ledger_ar_aging` | `BusinessArAging.*` |
| **18 (NEW)** | AP / payables / DPO | `ledger_ap_aging` | `BusinessApAging.*` |
| **19 (NEW)** | cash position / runway / burn | `ledger_cash_balances` | `BusinessCashBalances.cashPosition` |
| **20 (NEW)** | per-product breakdown when no invoice_lines for tenant | sets `clarify` | core-api routes to GL fallback |
| **21 (NEW)** | period resolution | absolute ISO `[start, end]` in filters | – |

## Deterministic Fallback (when AI is down or wrong)

`finanly/services/core-api/src/finanly_core_api/routers/business_bi.py` ships `_build_governed_*_fallback_plan` builders that synthesize the same `primary_table` plans the AI would emit, but without calling the AI:

| Detected intent (regex on question) | Builder | Default primary_table | Default measure |
|-------------------------------------|---------|------------------------|-----------------|
| `_is_margin_question` | `_build_governed_margin_fallback_plan` | `ledger_gl_entries` | `gross_profit` (sum) or `gross_margin_rate` (avg) if "%" / "rate" / "percentage" |
| `_is_operating_question` | dispatcher reuses margin builder + substitutes `operating_income` | `ledger_gl_entries` | `operating_income` |
| `_is_ar_question` | `_build_governed_ar_aging_fallback_plan` | `ledger_ar_aging` | `outstanding_amount` (sum) or 5 bucket measures if "aging" |
| `_is_ap_question` | `_build_governed_ap_aging_fallback_plan` | `ledger_ap_aging` | `outstanding_amount` or buckets |
| `_is_cash_question` | `_build_governed_cash_fallback_plan` | `ledger_cash_balances` | `cash_position` |
| `_is_balance_sheet_question` | `_build_governed_balance_sheet_fallback_plan` | `ledger_balance_sheet` | three-line `total_assets`/`total_liabilities`/`total_equity` |

Triggered when (a) AI gateway exhausted retries, OR (b) AI returned `clarify` but intent matched. Dispatched in `ask_bi` via `_detect_governed_intent` + `_serve_deterministic_plan`.

## Verifying Any Question Through Cube

```bash
# 1. Mint tenant-scoped JWT
CUBE_SECRET=$(grep "^CUBE_API_SECRET=" /home/docker/finanly.ai/infra/docker/.env | cut -d= -f2-)
docker exec -e CUBE_SECRET="$CUBE_SECRET" finanly-cube-core-1 node -e \
  "console.log(require('jsonwebtoken').sign({tenant_id:'<uuid>'}, process.env.CUBE_SECRET, {algorithm:'HS256'}))"

# 2. Hit Cube /load with the same query the AI plan would compile to
docker exec finanly-cube-core-1 sh -c '
curl -s -X POST http://localhost:4000/cubejs-api/v1/load \
  -H "Authorization: <jwt>" -H "Content-Type: application/json" \
  -d "{\"query\":{
        \"measures\":[\"BusinessRevenue.grossProfit\",\"BusinessRevenue.grossMarginRate\"],
        \"dimensions\":[\"BusinessRevenue.accountName\"],
        \"timeDimensions\":[{\"dimension\":\"BusinessRevenue.postingDate\",\"dateRange\":\"last month\"}],
        \"filters\":[{\"member\":\"BusinessRevenue.accountType\",\"operator\":\"equals\",\"values\":[\"Cost of Goods Sold\"]}]
      }}"
'
```

For the verification matrix with live numbers per tenant, see the catalog (Dropbox file or in-repo plan).

## What's NOT in this map yet (new-v2 catalog)

- Inventory cube (`BusinessInventory`) — needs on-hand quantity from a stock-balance source.
- Cash flow waterfall (`BusinessCashFlow`) split into operating / investing / financing — needs account-purpose classification beyond `account_type`.
- Customer LTV / churn cube — needs invoice-line aggregation across product lifetimes.
- Vendor payment-timing cube — needs `paid_at` populated on invoices.

Each is tracked in `plans/BI_QUESTION_CATALOG_AND_CUBE_COVERAGE_V1.md` with its data prerequisite.

## Related Docs

- `CUBE_SEMANTIC_ENGINE_HANDOFF_V1.md` — runtime + tenancy + JWT signing
- `CUBE_CORE_API_CONTRACT_V1.md` — public API contract
- `CUBE_MIGRATION_CUTOVER_RUNBOOK_V1.md` — backend switch ops
- `SEMANTIC_CONNECTION_MATRIX_V1.md` — table-level join graph (machine-readable in `business_connection_matrix_v1.yaml`)
- `SEMANTIC_FIELD_INVENTORY_V1.md` — per-table field inventory (`business_field_inventory_v1.yaml`)
- `SEMANTIC_DIMENSIONAL_COVERAGE_MATRIX_V1.md` — coverage by dimension category
- `BI_RESILIENCE_AND_SEMANTIC_HANDOFF_V1.md` — resilience layer + 86-question catalog
