
Glossary & Disambiguation

This page is canonical. The system has accumulated several overloaded terms — source vs signal_source, "coverage" vs "confidence", "raw" vs "displayed" scores. This page is the disambiguating reference; other pages link here rather than re-explain.

The structure follows the Vextor ops-doc pattern: one row per distinct meaning, with the canonical form spelled out so confusion patterns can be retired by reference.


Disambiguation: Overloaded Terms

Each overloaded term appears below as one row per distinct meaning, making the overload explicit.

source — 2 meanings

| Meaning | Scope | Canonical form | Key distinguisher |
| --- | --- | --- | --- |
| Database table holding source records | DB schema | signal_source (table name in current schema) | The actual table is named signal_source in migrations 001/002/006. ADR-0003 uses the shorter source in prose; both refer to the same thing. |
| Conceptual "data source" the system scrapes | Domain language | "source" or "data source" | Used in user-facing copy (/methodology page) and ADR language. Refers to the abstract upstream (e.g. "the BHRRC source"), not the DB row. |

Common confusion pattern: "Check the source table" is ambiguous — prefer "check the signal_source table" for DB references; reserve "source" for domain talk. Migration 012 may rename the table to source in due course; until it does, the table name is signal_source.


coverage — 3 meanings

| Meaning | Scope | Canonical form | Key distinguisher |
| --- | --- | --- | --- |
| Sub-criterion-level binary | DB column | score_sub_criterion.coverage_pct (0.0 or 1.0) | At sub-criterion grain, coverage_pct is binary: 1.0 if any signal exists for this rule × institution × run, 0.0 otherwise. EXISTS-based per migration 009. |
| Pillar / Stage 1 / Stage 2 ratio | DB column | score_pillar.coverage_pct and equivalents (0.0–1.0) | At pillar+ grain, coverage_pct is the unweighted ratio of covered applicable rules to total applicable rules. |
| UI hero metric | Display | coverage (rendered as percentage with RAG band) | What the user sees in the institution detail hero. Same number as Stage 1 coverage_pct, formatted as integer percent. |

Common confusion pattern: "Barclays is at 11% coverage" is the displayed Stage 1 coverage_pct (3 of 27 applicable rules covered). At the rule level individual rules are either covered (1.0) or not (0.0), not 11%. When discussing a rule, say "covered" or "uncovered"; when discussing an institution or pillar, give the percent.
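The three grains can be sketched as follows. This is a minimal illustration, not the scoring engine's code: the function names are hypothetical, the rounding mode for the displayed percent is an assumption, and the 27-rule input only mirrors the 3-of-27 example above.

```javascript
// Sub-criterion grain: binary. 1.0 if any signal exists, 0.0 otherwise.
function subCriterionCoverage(signals) {
  return signals.length > 0 ? 1.0 : 0.0;
}

// Pillar / stage grain: unweighted ratio of covered applicable rules
// to total applicable rules.
function ratioCoverage(ruleCoverages) {
  if (ruleCoverages.length === 0) return 0; // assumption: no applicable rules means 0
  const covered = ruleCoverages.filter((c) => c === 1.0).length;
  return covered / ruleCoverages.length;
}

// Display grain: the same ratio formatted as an integer percent.
// Math.round is an assumption; the real UI may floor or truncate.
function displayCoverage(coveragePct) {
  return `${Math.round(coveragePct * 100)}%`;
}

// 27 applicable rules, 3 of which have at least one signal.
const ruleCoverages = Array.from({ length: 27 }, (_, i) =>
  subCriterionCoverage(i < 3 ? [{ id: i }] : [])
);
console.log(displayCoverage(ratioCoverage(ruleCoverages))); // "11%"
```

Each individual entry in ruleCoverages is still 0.0 or 1.0; only the ratio across them produces the 11%.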


confidence — 2 meanings

| Meaning | Scope | Canonical form | Key distinguisher |
| --- | --- | --- | --- |
| Per-signal quality measure | DB column | signal.confidence (0.0–1.0) | How sure is the scraper about this finding. e.g. SBTi "not found" returns conf=0.5 because absence could be a name-match issue. |
| Aggregated propagated quality | DB column | score_*.confidence (0.0–1.0) | Coverage × quality at sub-criterion, weighted mean propagating up. |

Confidence is NOT coverage. Confidence answers "how sure are we about this finding"; coverage answers "did we look at all?" A rule with high-confidence boolean=0 (e.g. NatWest's "Commitment removed" on SBTi) is fully covered, even though it's a negative finding.
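A rough sketch of the propagation described in the table, under stated assumptions: "quality" is taken to be the signal's confidence, the weights are supplied by the caller, and both function names are hypothetical. The actual weighting scheme is not specified on this page.

```javascript
// Sub-criterion grain: coverage (0.0 or 1.0) multiplied by signal quality.
// Assumption: "quality" here is signal.confidence.
function subCriterionConfidence(coveragePct, signalConfidence) {
  return coveragePct * signalConfidence;
}

// Higher grains: weighted mean of child confidences.
// Assumption: weights are whatever the caller passes in.
function weightedMean(items) {
  const totalWeight = items.reduce((sum, item) => sum + item.weight, 0);
  if (totalWeight === 0) return 0;
  const weightedSum = items.reduce((sum, item) => sum + item.value * item.weight, 0);
  return weightedSum / totalWeight;
}

const children = [
  { value: subCriterionConfidence(1.0, 0.5), weight: 1 }, // covered, uncertain finding
  { value: subCriterionConfidence(0.0, 0.0), weight: 1 }, // uncovered rule
  { value: subCriterionConfidence(1.0, 1.0), weight: 1 }, // covered, certain finding
];
console.log(weightedMean(children)); // 0.5
```

Note that the third child could just as well be a negative finding: a confident boolean=0 still contributes full coverage and full confidence.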


score — 4 meanings

| Meaning | Scope | Canonical form | Key distinguisher |
| --- | --- | --- | --- |
| Raw v0.4 methodology composite | DB column | score_composite.composite_raw_v04 | Faithful v0.4 calculation including 50/100 base values for uncovered rules. Stored for audit, not displayed. |
| Coverage-weighted displayed composite | DB column | score_composite.composite_coverage_weighted | Per ADR-0001 — average only over rules with live signals. This is the headline number. |
| Existing alias column | DB column | score_composite.composite | After migration 011, an alias of composite_coverage_weighted. Kept for backward compat. |
| Rule-level numeric | DB column | score_sub_criterion.score (0–100) and raw_score (0–5) | Per-rule scoring; doesn't apply to a whole institution. |

Common confusion pattern: "Barclays scores 19.2" was the v0.4 raw number under low coverage. Per ADR-0001 the displayed number is now ~66.7 (coverage-weighted). Both exist in the DB. The displayed score is what the UI shows; raw is for audit and convergence checks.
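To make the raw-versus-displayed difference concrete, here is a minimal sketch. It assumes equal rule weights and a caller-supplied base value for uncovered rules; the function names and rule objects are hypothetical, and the numbers are illustrative only (they are not Barclays data and do not reproduce the v0.4 workbook).

```javascript
// Displayed composite: average only over rules with live signals.
function compositeCoverageWeighted(rules) {
  const live = rules.filter((rule) => rule.covered);
  if (live.length === 0) return null; // nothing to average
  return live.reduce((sum, rule) => sum + rule.score, 0) / live.length;
}

// Raw composite: uncovered rules contribute a base value instead of being skipped.
// Assumption: a single base value, passed in by the caller.
function compositeRaw(rules, baseValue) {
  if (rules.length === 0) return null;
  const total = rules.reduce(
    (sum, rule) => sum + (rule.covered ? rule.score : baseValue),
    0
  );
  return total / rules.length;
}

const exampleRules = [
  { covered: true, score: 80 },
  { covered: true, score: 60 },
  { covered: false },
  { covered: false },
];
console.log(compositeCoverageWeighted(exampleRules)); // 70
console.log(compositeRaw(exampleRules, 50)); // 60 (50 is a placeholder base value)
```

The two functions diverge more sharply as coverage drops, which is why both columns are kept: one for display, one for audit.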


signal_source / source_id — 3 meanings

| Meaning | Scope | Canonical form | Key distinguisher |
| --- | --- | --- | --- |
| Logical source identifier | DB column | signal_source.source_id (e.g. NZBA-MEMBERS) | The string code. Used as FK throughout. |
| Display name of source | UI / docs | name column in signal_source (e.g. "UNEP FI Net-Zero Banking Alliance") | The human-readable label. Don't use the source_id in UI copy. |
| Scraper module | Filesystem | src/scrapers/<name>.js (e.g. nzba.js) | The code that fetches and parses that source. Module name is lowercase, source_id is SCREAMING-KEBAB. |

Pattern note: SBTI-DASHBOARD and SBTI-CORPORATE are legacy misnomers (source is an Excel file, not a dashboard or CSV). Rename to SBTI-VALIDATED deferred — known parked item.
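The naming convention in the table can be checked mechanically. The sketch below is one possible lint helper, not something that exists in the repo: both function names and the exact regular expressions are assumptions about what "SCREAMING-KEBAB" and "lowercase module name" allow.

```javascript
// source_id: uppercase letters and digits, segments joined by single hyphens.
const SOURCE_ID_PATTERN = /^[A-Z0-9]+(-[A-Z0-9]+)*$/;

// Scraper module: lowercase file name ending in .js.
// Assumption: digits, hyphens, and underscores are also acceptable.
const MODULE_NAME_PATTERN = /^[a-z0-9_-]+\.js$/;

function isSourceId(value) {
  return SOURCE_ID_PATTERN.test(value);
}

function isScraperModuleName(value) {
  return MODULE_NAME_PATTERN.test(value);
}

console.log(isSourceId("NZBA-MEMBERS")); // true
console.log(isSourceId("nzba")); // false
console.log(isScraperModuleName("nzba.js")); // true
console.log(isScraperModuleName("NZBA-MEMBERS")); // false
```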


run — 3 meanings

| Meaning | Scope | Canonical form | Key distinguisher |
| --- | --- | --- | --- |
| Scrape execution | DB row | scrape_run.run_id (integer) | One row per scrape execution. Created by run.js runner. Holds start_at, finish_at, status, error_count. |
| Scoring execution | Conceptual | "score run" | Re-running the scoring engine against an existing scrape run. Doesn't create a new scrape_run row — overwrites score_* rows for the same run_id. |
| Cron-triggered combined | Conceptual | "weekly run" | Sunday 02:00 UTC scheduled execution that does scrape then score. |

Common confusion pattern: "Re-run the scores" doesn't create a new run_id — it updates the score rows for the existing one. Only the scraper run creates new run_ids.
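The distinction can be modelled with a small in-memory stand-in for the database. Everything here (the RunStore class, its method names, the row shapes) is hypothetical and only illustrates the behaviour described above: scraping allocates a run_id, scoring overwrites rows under an existing one.

```javascript
class RunStore {
  constructor() {
    this.nextRunId = 1;
    this.scrapeRuns = new Map(); // run_id -> scrape_run row
    this.scoreRows = new Map(); // run_id -> score rows for that run
  }

  // A scrape execution is the only operation that creates a new run_id.
  startScrapeRun() {
    const runId = this.nextRunId++;
    this.scrapeRuns.set(runId, { runId, status: "running", errorCount: 0 });
    return runId;
  }

  // A score run targets an existing run_id and replaces its score rows.
  scoreRun(runId, rows) {
    if (!this.scrapeRuns.has(runId)) {
      throw new Error(`No scrape_run row for run_id ${runId}`);
    }
    this.scoreRows.set(runId, rows);
  }
}

const store = new RunStore();
const runId = store.startScrapeRun();
store.scoreRun(runId, [{ ruleId: "E1.1", score: 40 }]);
store.scoreRun(runId, [{ ruleId: "E1.1", score: 55 }]); // re-score: same run_id
console.log(store.scrapeRuns.size); // 1
console.log(store.scoreRows.get(runId)[0].score); // 55
```

In this model the "weekly run" is simply startScrapeRun followed by scoreRun on the run_id it returned.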


institution — 2 meanings

| Meaning | Scope | Canonical form | Key distinguisher |
| --- | --- | --- | --- |
| The entity being screened | DB row | institution.institution_id (LEI) | LEI-keyed. Pilot set is 8: 4 UK banks + 4 non-financials. |
| Legal entity identifier (general) | Domain language | LEI | A 20-character GLEIF-issued identifier. Always for the holding company, not operating subsidiaries (Barclays PLC, not Barclays Bank PLC). |

Pattern note: Default to holding-company LEI for any institution. Operating subsidiary LEIs are used only for source-specific lookups (e.g. UK Modern Slavery Act, which is filed by the subsidiary). That sub-entity ID lives in scraper_config_json.
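Because institution_id is an LEI, a malformed key can be caught before it reaches the DB. The sketch below checks the 20-character shape and the ISO 17442 check digits (ISO/IEC 7064 MOD 97-10). The function names are hypothetical, nothing here is taken from the repo, and the prefix in the usage example is made up rather than a real institution's LEI.

```javascript
// Map each character to its base-36 value (0-9 stay, A=10 ... Z=35)
// and concatenate the decimal values into one long digit string.
function leiToNumericString(lei) {
  return lei
    .split("")
    .map((ch) => parseInt(ch, 36).toString())
    .join("");
}

// Valid when the shape is 18 alphanumerics + 2 check digits
// and the numeric form leaves remainder 1 modulo 97.
function isValidLei(lei) {
  if (!/^[A-Z0-9]{18}[0-9]{2}$/.test(lei)) return false;
  return BigInt(leiToNumericString(lei)) % 97n === 1n;
}

// Compute the two check digits for an 18-character prefix.
function computeLeiCheckDigits(prefix18) {
  const remainder = BigInt(leiToNumericString(prefix18 + "00")) % 97n;
  return String(98n - remainder).padStart(2, "0");
}

// Hypothetical prefix, used only to exercise the functions.
const prefix = "529900EXAMPLE00000";
const candidate = prefix + computeLeiCheckDigits(prefix);
console.log(candidate.length); // 20
console.log(isValidLei(candidate)); // true
console.log(isValidLei("6" + candidate.slice(1))); // false (one character changed)
```

A checksum pass only shows the string is well formed; it says nothing about whether the LEI belongs to the holding company rather than a subsidiary.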


rule — 2 meanings

| Meaning | Scope | Canonical form | Key distinguisher |
| --- | --- | --- | --- |
| Atomic scoreable unit | DB row | rule.rule_id (e.g. E1.1, G3-PRB.1) | The thing a scraper feeds and the scoring engine evaluates. 148 rows: 89 financial + 24 universal + 35 non-financial. |
| Sub-criterion (grouping above rules) | Conceptual / DB | sub_criterion (e.g. E1) | A group of rules under a sub-criterion code. The v0.4 hierarchy is pillar → sub-criterion → rule. |

Pattern note: ADR-0005 adds a third grouping — themes — for UI presentation only. Themes don't appear in the DB or affect scoring.


applicability — 2 meanings

| Meaning | Scope | Canonical form | Key distinguisher |
| --- | --- | --- | --- |
| Which rules apply to which institution | DB column | rule.applicable_sectors (GICS code string) | '40' = financials only; 'ALL' = universal; specific GICS codes = sector-scoped. |
| Which scrapers run for which institution | Conceptual | Routing logic in run.js | Determined at scraper-runner level based on institution sector. Financials-only scrapers (NZBA, SBTi for banks) skip non-financial institutions. |
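Rule-level applicability can be sketched as a prefix match on GICS codes. This is an illustration under assumptions, not the engine's logic: it assumes applicable_sectors holds either 'ALL' or a comma-separated list of GICS codes, that the institution is identified by its 8-digit sub-industry code, and that a rule code matches any institution whose code starts with it. The function name is hypothetical.

```javascript
function ruleAppliesTo(applicableSectors, institutionGicsCode) {
  if (applicableSectors === "ALL") return true; // universal rule
  return applicableSectors
    .split(",") // assumption: comma-separated list of codes
    .map((code) => code.trim())
    .filter((code) => code.length > 0)
    .some((code) => institutionGicsCode.startsWith(code));
}

const diversifiedBank = "40101010"; // GICS sub-industry under sector 40 (Financials)
const integratedOilGas = "10102010"; // GICS sub-industry under sector 10 (Energy)

console.log(ruleAppliesTo("40", diversifiedBank)); // true
console.log(ruleAppliesTo("40", integratedOilGas)); // false
console.log(ruleAppliesTo("ALL", integratedOilGas)); // true
```

Scraper-level applicability in run.js is a separate decision; a check like this covers only the rule side.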

Disambiguation: Stages

Stage 1 vs Stage 2

| | Stage 1 | Stage 2 |
| --- | --- | --- |
| Scope | All institutions (financial + non-financial) | Financial institutions only (GICS sector 40) |
| Combines | E pillar + S pillar + G pillar | Stage 1 ESG + Credit + Returns |
| DB table | score_stage1_esg | score_stage2_composite |
| Status | Live | Stage 1 only — Stage 2 credit/returns dimensions placeholder ("awaiting") |
| Triangulation thesis | No | Yes — original framework justification |

The triangulation thesis (ESG × credit × returns) was the original banks-only justification. The two-stage split keeps the universal ESG-only screening separate from the financials-only triangulation, so neither is presented as the other.


Disambiguation: Documentation Surfaces

| Surface | What it is | What it's for |
| --- | --- | --- |
| ADRs (docs/adr/) | Standing design decisions | "Why is the system shaped this way?" |
| Design note (DESIGN_NOTE_v2_two_stage.md) | Architecture rationale | "Why did we go to two-stage?" |
| Migrations | Schema record | "What's in the DB?" |
| v0.4 workbook | Framework spec | Historical reference. Framework lives on; implementation diverges. |
| Slack #esg-screening | Event log | "What changed in the last session?" |
| CLAUDE.md (in repo) | CC-on-VM operating protocol | How Claude behaves on the VM |
| Project instructions (claude.ai) | Chat-session operating protocol | How chat sessions behave |
| ops.esg-screen.org (this site) | Standing reference | The shape of the system, not what just happened |

Quick reference: terms that are NOT overloaded

For completeness — a few terms that get used loosely but have a single canonical meaning:

  • GICS — Global Industry Classification Standard. Always the MSCI/S&P 4-level hierarchy. gics_classification table holds the taxonomy; institution.gics_* columns hold per-institution classification.
  • Peer group — peer_group table. Per institution, the system finds peers via a GICS fallback ladder: sub-industry → industry → industry group → sector. Thresholds 10/15/25.
  • Red flag — manual-at-intake binary per v0.4. Fossil financing, weapons manufacturing, etc. Set on institution, not derived from signals. Per ADR-0004, no amber state.
  • Watchlist — surface for findings not captured by score (e.g. NatWest's withdrawn SBTi commitment, BHRRC allegations). Schema pending; expected as part of BHRRC scraper cycle.
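The peer-group fallback ladder mentioned in the list above can be sketched as a walk up the GICS hierarchy. Several things here are assumptions: the function and field names are hypothetical, the thresholds 10/15/25 are read as minimum peer counts for the first three rungs in order (this page does not say how they map), and the sector rung is treated as a final fallback with no minimum. The prefix lengths follow the standard GICS code structure (8, 6, 4, and 2 digits).

```javascript
const PEER_LADDER = [
  { level: "sub-industry", prefixLength: 8, minPeers: 10 },
  { level: "industry", prefixLength: 6, minPeers: 15 },
  { level: "industry group", prefixLength: 4, minPeers: 25 },
  { level: "sector", prefixLength: 2, minPeers: 0 }, // assumption: last resort
];

// Return the first rung that yields enough peers, plus the peers themselves.
function findPeerGroup(target, institutions) {
  for (const rung of PEER_LADDER) {
    const prefix = target.gics.slice(0, rung.prefixLength);
    const peers = institutions.filter(
      (inst) => inst.id !== target.id && inst.gics.startsWith(prefix)
    );
    if (peers.length >= rung.minPeers) {
      return { level: rung.level, peers };
    }
  }
  return { level: "sector", peers: [] }; // not reached while the last rung has minPeers 0
}

// Illustrative universe: 3 peers in the same sub-industry,
// 20 more in a sibling sub-industry of the same industry.
const target = { id: "T", gics: "40101010" };
const universe = [
  target,
  ...Array.from({ length: 3 }, (_, i) => ({ id: `A${i}`, gics: "40101010" })),
  ...Array.from({ length: 20 }, (_, i) => ({ id: `B${i}`, gics: "40101015" })),
];
const group = findPeerGroup(target, universe);
console.log(group.level); // "industry"
console.log(group.peers.length); // 23
```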

When you find a new disambiguation

This page is canonical. New overloads land here, not in scattered comments. Add a row, link to it, move on. The point is that future-you shouldn't have to re-derive the distinction.