Glossary & Disambiguation

This page is canonical. The system has accumulated several overloaded terms — source vs signal_source, "coverage" vs "confidence", "raw" vs "displayed" scores. This page is the disambiguating reference; other pages link here rather than re-explain.

The structure follows the Vextor ops-doc pattern: one row per distinct meaning, with the canonical form spelled out so confusion patterns can be retired by reference.

Disambiguation: Overloaded Terms

Each overloaded term appears below as one row per distinct meaning, making the overload explicit.

`source` — 2 meanings

Meaning	Scope	Canonical form	Key distinguisher
Database table holding source records	DB schema	`signal_source` (table name in current schema)	The actual table is named `signal_source` in migrations 001/002/006. ADR-0003 uses the shorter `source` in prose; both refer to the same thing.
Conceptual "data source" the system scrapes	Domain language	"source" or "data source"	Used in user-facing copy (`/methodology` page) and ADR language. Refers to the abstract upstream (e.g. "the BHRRC source"), not the DB row.

Common confusion pattern: "Check the source table" is ambiguous — prefer "check the signal_source table" for DB references; reserve "source" for domain talk. Migration 012 may rename the table to source in due course; until it does, the table name is signal_source.

`coverage` — 3 meanings

Meaning	Scope	Canonical form	Key distinguisher
Sub-criterion-level binary	DB column	`score_sub_criterion.coverage_pct` (0.0 or 1.0)	At sub-criterion grain, `coverage_pct` is binary: 1.0 if any signal exists for this rule × institution × run, 0.0 otherwise. EXISTS-based per migration 009.
Pillar / Stage 1 / Stage 2 ratio	DB column	`score_pillar.coverage_pct` and equivalents (0.0–1.0)	At pillar+ grain, `coverage_pct` is the unweighted ratio of covered applicable rules to total applicable rules.
UI hero metric	Display	`coverage` (rendered as percentage with RAG band)	What the user sees in the institution detail hero. Same number as Stage 1 `coverage_pct`, formatted as integer percent.

Common confusion pattern: "Barclays is at 11% coverage" is the displayed Stage 1 coverage_pct (3 of 27 applicable rules covered). At the rule level individual rules are either covered (1.0) or not (0.0), not 11%. When discussing a rule, say "covered" or "uncovered"; when discussing an institution or pillar, give the percent.
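The two grains can be sketched in a few lines. This is a minimal illustration, not the migration 009 logic itself: the function names are hypothetical, and rounding to the nearest integer for display is an assumption.

```javascript
// Sub-criterion grain: binary. 1.0 if any signal exists for the
// rule x institution x run, 0.0 otherwise.
function ruleCoverage(signals) {
  return signals.length > 0 ? 1.0 : 0.0;
}

// Pillar / stage grain: unweighted ratio of covered applicable rules
// to total applicable rules.
function ratioCoverage(ruleCoverages) {
  if (ruleCoverages.length === 0) return 0.0; // assumption: an empty set reads as 0
  const covered = ruleCoverages.filter((c) => c === 1.0).length;
  return covered / ruleCoverages.length;
}

// Display: integer percent (RAG banding omitted; Math.round is an assumption).
function displayCoverage(coveragePct) {
  return `${Math.round(coveragePct * 100)}%`;
}

// 27 applicable rules, 3 of which have at least one signal.
const signalsPerRule = Array.from({ length: 27 }, (_, i) =>
  i < 3 ? [{ confidence: 0.9 }] : []
);
const perRule = signalsPerRule.map(ruleCoverage);
const stage1Coverage = ratioCoverage(perRule);

console.log(perRule[0], perRule[26]);         // 1 0
console.log(displayCoverage(stage1Coverage)); // 11%
```

No individual rule is ever "11% covered"; the percent only exists once rules are aggregated.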

`confidence` — 2 meanings

Meaning	Scope	Canonical form	Key distinguisher
Per-signal quality measure	DB column	`signal.confidence` (0.0–1.0)	How sure is the scraper about this finding. e.g. SBTi "not found" returns `conf=0.5` because absence could be a name-match issue.
Aggregated propagated quality	DB column	`score_*.confidence` (0.0–1.0)	At sub-criterion grain, coverage × signal quality. At pillar grain and above, a weighted mean of the values one level down.

Confidence is NOT coverage. Confidence answers "how sure are we about this finding"; coverage answers "did we look at all?" A rule with high-confidence boolean=0 (e.g. NatWest's "Commitment removed" on SBTi) is fully covered, even though it's a negative finding.
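A small sketch of the distinction, under stated assumptions: the rule names are placeholders, "quality" is taken to be the signal's own confidence, and the equal weights in the roll-up are hypothetical (the real weights are not given on this page).

```javascript
// One signal per rule. `value` is the boolean finding,
// `confidence` is how sure the scraper is about it.
const signals = [
  { rule: "rule-a", value: 0, confidence: 0.95 }, // confident negative finding
  { rule: "rule-b", value: 0, confidence: 0.5 },  // "not found", may be a name-match miss
];
const applicableRules = ["rule-a", "rule-b", "rule-c"]; // rule-c has no signal

function describeRule(ruleId) {
  const signal = signals.find((s) => s.rule === ruleId);
  const coverage = signal ? 1.0 : 0.0;
  const quality = signal ? signal.confidence : 0.0;
  // Sub-criterion grain: coverage x quality.
  return { rule: ruleId, coverage, confidence: coverage * quality };
}

// Higher grains: weighted mean of the values below (weights are assumed).
function weightedMean(items) {
  const totalWeight = items.reduce((sum, item) => sum + item.weight, 0);
  const weighted = items.reduce((sum, item) => sum + item.weight * item.value, 0);
  return weighted / totalWeight;
}

const rows = applicableRules.map(describeRule);
const pillarConfidence = weightedMean(
  rows.map((row) => ({ value: row.confidence, weight: 1 }))
);

console.log(rows);
console.log(pillarConfidence);
```

`rule-a` is fully covered with high confidence even though the finding is negative; `rule-c` is the only uncovered rule.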

`score` — 4 meanings

Meaning	Scope	Canonical form	Key distinguisher
Raw v0.4 methodology composite	DB column	`score_composite.composite_raw_v04`	Faithful v0.4 calculation including 50/100 base values for uncovered rules. Stored for audit, not displayed.
Coverage-weighted displayed composite	DB column	`score_composite.composite_coverage_weighted`	Per ADR-0001 — average only over rules with live signals. This is the headline number.
Existing alias column	DB column	`score_composite.composite`	After migration 011, an alias of `composite_coverage_weighted`. Kept for backward compat.
Rule-level numeric	DB column	`score_sub_criterion.score` (0–100) and `raw_score` (0–5)	Per-rule scoring; doesn't apply to a whole institution.

Common confusion pattern: "Barclays scores 19.2" was the v0.4 raw number under low coverage. Per ADR-0001 the displayed number is now ~66.7 (coverage-weighted). Both exist in the DB. The displayed score is what the UI shows; raw is for audit and convergence checks.
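The relationship between the three composite columns can be illustrated as follows. This is a deliberately simplified sketch: it uses an unweighted mean, a single assumed base value of 50 for uncovered rules, and made-up rule scores. It does not reproduce the v0.4 weighting or the Barclays figures above.

```javascript
// Assumption: one base value, on the 0-100 scale, stands in for uncovered rules.
const BASE_VALUE = 50;

// Illustrative rule-level scores; not real data.
const ruleScores = [
  { ruleId: "rule-a", score: 80, covered: true },
  { ruleId: "rule-b", score: 40, covered: true },
  { ruleId: "rule-c", score: null, covered: false },
  { ruleId: "rule-d", score: null, covered: false },
];

const mean = (values) => values.reduce((a, b) => a + b, 0) / values.length;

// Raw: every applicable rule counts; uncovered rules take the base value.
function compositeRaw(rules) {
  return mean(rules.map((r) => (r.covered ? r.score : BASE_VALUE)));
}

// Coverage-weighted: average only over rules with live signals.
function compositeCoverageWeighted(rules) {
  const live = rules.filter((r) => r.covered);
  return live.length === 0 ? null : mean(live.map((r) => r.score));
}

const coverageWeighted = compositeCoverageWeighted(ruleScores);
const compositeRow = {
  composite_raw_v04: compositeRaw(ruleScores),   // stored for audit
  composite_coverage_weighted: coverageWeighted, // headline number
  composite: coverageWeighted,                   // alias after migration 011
};

console.log(compositeRow);
```

The point of the sketch is only that the raw and displayed numbers come from the same rule scores and diverge whenever coverage is incomplete.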

`signal_source` / `source_id` — 3 meanings

Meaning	Scope	Canonical form	Key distinguisher
Logical source identifier	DB column	`signal_source.source_id` (e.g. `NZBA-MEMBERS`)	The string code. Used as FK throughout.
Display name of source	UI / docs	`name` column in `signal_source` (e.g. "UNEP FI Net-Zero Banking Alliance")	The human-readable label. Don't use the `source_id` in UI copy.
Scraper module	Filesystem	`src/scrapers/<name>.js` (e.g. `nzba.js`)	The code that fetches and parses that source. Module name is lowercase, source_id is SCREAMING-KEBAB.

Pattern note: SBTI-DASHBOARD and SBTI-CORPORATE are legacy misnomers (the source is an Excel file, not a dashboard or a CSV). The rename to SBTI-VALIDATED is deferred and tracked as a known parked item.
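A sketch of how the three meanings relate, assuming the three examples in the table describe the same source. The in-memory array stands in for the `signal_source` table, and the `scraper` field, the pattern, and the helper name are hypothetical.

```javascript
// Stand-in for rows of signal_source plus the module that feeds each one.
const signalSources = [
  {
    source_id: "NZBA-MEMBERS",                 // string code, used as FK
    name: "UNEP FI Net-Zero Banking Alliance", // human-readable label for UI copy
    scraper: "src/scrapers/nzba.js",           // lowercase module name
  },
];

// SCREAMING-KEBAB: uppercase alphanumeric segments joined by hyphens.
const SOURCE_ID_PATTERN = /^[A-Z0-9]+(-[A-Z0-9]+)*$/;

// UI copy should go through this rather than printing the source_id.
function displayName(sourceId) {
  const row = signalSources.find((s) => s.source_id === sourceId);
  if (!row) throw new Error(`Unknown source_id: ${sourceId}`);
  return row.name;
}

console.log(displayName("NZBA-MEMBERS"));           // UNEP FI Net-Zero Banking Alliance
console.log(SOURCE_ID_PATTERN.test("NZBA-MEMBERS")); // true
console.log(SOURCE_ID_PATTERN.test("nzba"));         // false
```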

`run` — 3 meanings

Meaning	Scope	Canonical form	Key distinguisher
Scrape execution	DB row	`scrape_run.run_id` (integer)	One row per scrape execution. Created by `run.js` runner. Holds start_at, finish_at, status, error_count.
Scoring execution	Conceptual	"score run"	Re-running the scoring engine against an existing scrape run. Doesn't create a new `scrape_run` row — overwrites `score_*` rows for the same `run_id`.
Cron-triggered combined	Conceptual	"weekly run"	Sunday 02:00 UTC scheduled execution that does scrape then score.

Common confusion pattern: "Re-run the scores" doesn't create a new run_id — it updates the score rows for the existing one. Only the scraper run creates new run_ids.
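The difference between the two kinds of run can be shown with an in-memory stand-in for the tables. The function names and the composite key are assumptions for illustration; only the behaviour (scrape inserts, score overwrites) comes from the table above.

```javascript
// In-memory stand-ins for scrape_run and one score_* table.
const db = {
  scrape_run: [],
  score_composite: new Map(), // keyed by run_id + institution_id (assumed key)
};

// A scrape execution always creates a new scrape_run row.
function startScrapeRun() {
  const run_id = db.scrape_run.length + 1;
  db.scrape_run.push({ run_id, status: "running", error_count: 0 });
  return run_id;
}

// A score run writes against an existing run_id and overwrites prior rows.
function scoreRun(run_id, institution_id, composite) {
  if (!db.scrape_run.some((r) => r.run_id === run_id)) {
    throw new Error(`No scrape_run with run_id ${run_id}`);
  }
  db.score_composite.set(`${run_id}:${institution_id}`, {
    run_id,
    institution_id,
    composite,
  });
}

const runId = startScrapeRun();
scoreRun(runId, "institution-1", 60);
scoreRun(runId, "institution-1", 62); // "re-run the scores": same run_id, row replaced

console.log(db.scrape_run.length);   // 1
console.log(db.score_composite.size); // 1
```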

`institution` — 2 meanings

Meaning	Scope	Canonical form	Key distinguisher
The entity being screened	DB row	`institution.institution_id` (LEI)	LEI-keyed. Pilot set is 8: 4 UK banks + 4 non-financials.
Legal entity identifier (general)	Domain language	LEI	A 20-character GLEIF-issued identifier. By default the holding company's LEI, not an operating subsidiary's (Barclays PLC, not Barclays Bank PLC).

Pattern note: Default to holding-company LEI for any institution. Operating subsidiary LEIs are used only for source-specific lookups (e.g. UK Modern Slavery Act, which is filed by the subsidiary). That sub-entity ID lives in scraper_config_json.
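One way to express the default-plus-override rule, as a sketch. The shape of `scraper_config_json` (a `subsidiary_lei` map keyed by source) and the `UK-MSA` source code are hypothetical, and the LEIs below are format-valid placeholders, not real identifiers.

```javascript
// LEI shape: 18 alphanumeric characters followed by 2 check digits.
// Only the format is checked here; the check digits are not verified.
const LEI_FORMAT = /^[A-Z0-9]{18}[0-9]{2}$/;

// Holding-company LEI by default; subsidiary LEI only where a source needs it.
function leiForSource(institution, sourceId) {
  const config = institution.scraper_config_json || {};
  const overrides = config.subsidiary_lei || {}; // hypothetical key
  return overrides[sourceId] || institution.institution_id;
}

const institution = {
  institution_id: "HOLDCO00000000000001", // placeholder holding-company LEI
  scraper_config_json: {
    subsidiary_lei: { "UK-MSA": "SUBSID00000000000002" }, // placeholder
  },
};

console.log(leiForSource(institution, "NZBA-MEMBERS")); // HOLDCO00000000000001
console.log(leiForSource(institution, "UK-MSA"));       // SUBSID00000000000002
console.log(LEI_FORMAT.test(institution.institution_id)); // true
```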

`rule` — 2 meanings

Meaning	Scope	Canonical form	Key distinguisher
Atomic scoreable unit	DB row	`rule.rule_id` (e.g. `E1.1`, `G3-PRB.1`)	The thing a scraper feeds and the scoring engine evaluates. 148 rows: 89 financial + 24 universal + 35 non-financial.
Sub-criterion (grouping above rules)	Conceptual / DB	`sub_criterion` (e.g. E1)	A group of rules under a sub-criterion code. The v0.4 hierarchy is pillar → sub-criterion → rule.

Pattern note: ADR-0005 adds a third grouping — themes — for UI presentation only. Themes don't appear in the DB or affect scoring.

`applicability` — 2 meanings

Meaning	Scope	Canonical form	Key distinguisher
Which rules apply to which institution	DB column	`rule.applicable_sectors` (GICS code string)	`'40'` = financials only; `'ALL'` = universal; specific GICS codes = sector-scoped.
Which scrapers run for which institution	Conceptual	Routing logic in `run.js`	Determined at scraper-runner level based on institution sector. Financials-only scrapers (NZBA, SBTi for banks) skip non-financial institutions.
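Rule-level applicability can be sketched as a prefix match on GICS codes. The assumptions here: `applicable_sectors` holds either `'ALL'` or one or more comma-separated codes, and a shorter code matches every institution whose 8-digit GICS code starts with it. The function name is hypothetical.

```javascript
// Returns true when a rule with this applicable_sectors value applies
// to an institution with this 8-digit GICS sub-industry code.
function ruleApplies(applicableSectors, institutionGics) {
  if (applicableSectors === "ALL") return true; // universal rule
  return applicableSectors
    .split(",") // assumption: comma-separated list of codes
    .map((code) => code.trim())
    .some((code) => code.length > 0 && institutionGics.startsWith(code));
}

const bank = "40101010";   // sector 40 (financials)
const energy = "10102010"; // sector 10

console.log(ruleApplies("40", bank));      // true
console.log(ruleApplies("40", energy));    // false
console.log(ruleApplies("ALL", energy));   // true
console.log(ruleApplies("10,15", energy)); // true
```

Scraper routing in `run.js` is a separate decision made per institution sector; this sketch covers only the rule side.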

Disambiguation: Stages

Stage 1 vs Stage 2

	Stage 1	Stage 2
Scope	All institutions (financial + non-financial)	Financial institutions only (GICS sector 40)
Combines	E pillar + S pillar + G pillar	Stage 1 ESG + Credit + Returns
DB table	`score_stage1_esg`	`score_stage2_composite`
Status	Live	Partial: only the Stage 1 ESG input is live; the credit and returns dimensions are placeholders ("awaiting")
Triangulation thesis	No	Yes — original framework justification

The triangulation thesis (ESG × credit × returns) was the original justification for the framework, and it only ever applied to banks. The two-stage split makes that explicit: Stage 1 is the universal ESG-only screening, and Stage 2 is the financials-only triangulation.

Disambiguation: Documentation Surfaces

Surface	What it is	What it's for
ADRs (`docs/adr/`)	Standing design decisions	"Why is the system shaped this way?"
Design note (`DESIGN_NOTE_v2_two_stage.md`)	Architecture rationale	"Why did we go to two-stage?"
Migrations	Schema record	"What's in the DB?"
v0.4 workbook	Framework spec	Historical reference. Framework lives on; implementation diverges.
Slack `#esg-screening`	Event log	"What changed in the last session?"
`CLAUDE.md` (in repo)	CC-on-VM operating protocol	How Claude behaves on the VM
Project instructions (claude.ai)	Chat-session operating protocol	How chat sessions behave
`ops.esg-screen.org` (this site)	Standing reference	The shape of the system, not what just happened

Quick reference: terms that are NOT overloaded

For completeness — a few terms that get used loosely but have a single canonical meaning:

GICS — Global Industry Classification Standard. Always the MSCI/S&P 4-level hierarchy. gics_classification table holds the taxonomy; institution.gics_* columns hold per-institution classification.
Peer group — peer_group table. Per institution, the system finds peers via GICS fallback ladder: sub-industry → industry → industry group → sector. Thresholds 10/15/25.
Red flag — manual-at-intake binary per v0.4. Fossil financing, weapons manufacturing, etc. Set on institution, not derived from signals. Per ADR-0004, no amber state.
Watchlist — surface for findings not captured by score (e.g. NatWest's withdrawn SBTi commitment, BHRRC allegations). Schema pending; expected as part of BHRRC scraper cycle.
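The peer-group fallback ladder can be sketched as a walk from the narrowest GICS level to the widest. GICS codes are 8 digits, with the sector, industry group, industry and sub-industry given by the first 2, 4, 6 and 8 digits. How the 10/15/25 thresholds map onto the levels is not stated here, so this sketch takes a single hypothetical `minPeers` parameter instead; the function and field names are also assumptions.

```javascript
// Narrowest level first, widest last.
const LADDER = [
  { level: "sub-industry", digits: 8 },
  { level: "industry", digits: 6 },
  { level: "industry group", digits: 4 },
  { level: "sector", digits: 2 },
];

// Returns the first level at which enough peers share the target's GICS prefix.
function findPeers(target, institutions, minPeers) {
  for (const { level, digits } of LADDER) {
    const prefix = target.gics.slice(0, digits);
    const peers = institutions.filter(
      (inst) => inst.id !== target.id && inst.gics.startsWith(prefix)
    );
    if (peers.length >= minPeers) return { level, peers };
  }
  return { level: null, peers: [] }; // no level has enough peers
}

// Illustrative institutions; ids are placeholders.
const universe = [
  { id: "a", gics: "40101010" },
  { id: "b", gics: "40101010" },
  { id: "c", gics: "40101015" },
  { id: "d", gics: "40201020" },
  { id: "e", gics: "10102010" },
];

console.log(findPeers(universe[0], universe, 2).level); // industry
console.log(findPeers(universe[0], universe, 3).level); // sector
console.log(findPeers(universe[0], universe, 4).level); // null
```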

When you find a new disambiguation

This page is canonical. New overloads land here, not in scattered comments. Add a row, link to it, move on. The point is that future-you shouldn't have to re-derive the distinction.
