Stack
The infrastructure layer. Where things run, how they deploy, where secrets live.
For the data model that runs on this stack, see Data model. For the architecture this stack implements, see Overview.
At a glance
| Layer | Technology | Notes |
|---|---|---|
| VM | Azure Linux (Ubuntu 24.04) | esg-screening-01 in RG rg-esg-screening |
| Runtime | Node.js + PM2 | Single VM, single process |
| Database | SQLite at data/esg.db | better-sqlite3 driver |
| Scheduler | node-cron in-process | Sunday 02:00 UTC weekly |
| External APIs | googleapis | Originally for the Sheets writer; likely to be deprecated per ADR-0002 |
| Tunnel | Cloudflare Tunnel | Connector token in 1Password |
| Auth | Cloudflare Access | OTP, 5-address allow list |
| DNS | Cloudflare | esg-screen.org zone in account mcmillangrubb |
| Repo | GitHub | McMillanGrubb/esg-screening |
| CI/CD | (none yet) | Manual git pull + pm2 restart on the VM |
VM
Hostname: esg-screening-01
OS: Ubuntu 24.04
Login: azureuser (default Azure VM convention)
Worktree: /home/azureuser/esg-screening (the repo working directory)
Access paths:
- Cloudflare tunnel SSH (preferred): ssh esg-screening-01 via the SSH config that routes through the tunnel
- Public port 22 on the Azure NSG: still open as of last check, but redundant since the tunnel works. Worth closing.
Daily operations happen via CC-on-VM (see Session protocol).
Node + PM2
The Node application runs under PM2 for process management and auto-restart.
| Process name | What it is | Where |
|---|---|---|
| esg-web | Express server for the read interface (per ADR-0002) | Pending, not yet running |
| esg-scheduler | node-cron scheduler for weekly scrape + score | Enabled when ENABLE_SCHEDULER=1 env var is set |
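A minimal sketch of how the ENABLE_SCHEDULER gate and the Sunday 02:00 UTC schedule could fit together. The names schedulerEnabled, startScheduler and runWeeklyJob are hypothetical, and the cron expression is inferred from the schedule stated in the table; the repo's actual scheduler module may look different.

```javascript
// Sunday 02:00 in five-field cron syntax: minute hour day-of-month month day-of-week.
const WEEKLY_CRON = "0 2 * * 0";

// True only when the env var is exactly "1", matching ENABLE_SCHEDULER=1.
function schedulerEnabled(env = process.env) {
  return env.ENABLE_SCHEDULER === "1";
}

// Hypothetical entry point: registers the weekly job only when the gate is open.
// runWeeklyJob stands in for the scrape + score routine.
function startScheduler(runWeeklyJob, env = process.env) {
  if (!schedulerEnabled(env)) return null;
  const cron = require("node-cron"); // loaded lazily so the gate check has no dependency
  return cron.schedule(WEEKLY_CRON, runWeeklyJob, { timezone: "UTC" });
}
```

Pinning the timezone to UTC keeps the run time independent of whatever timezone the VM is set to.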
Manual triggers during development:
npm run scrape # Trigger a scrape run on demand
npm run score [run_id] # Re-score against a specific or latest run
PM2 commands:
pm2 list
pm2 logs esg-web
pm2 restart esg-web
pm2 show esg-scheduler
SQLite
The DB file lives at data/esg.db in the repo worktree. It is NOT
checked into Git — .gitignore excludes it. Schema is rebuilt from
src/db/migrations/*.sql on a fresh checkout.
Driver: better-sqlite3 (synchronous, in-process). Singleton client
at src/db/client.js.
Backups: not automated. The DB is small enough that a manual
sqlite3 esg.db .dump > backup.sql on demand is fine for now. Worth
automating when the DB grows or before a destructive migration.
Migrations: applied via the migration runner. The runner wraps each
file in its own transaction — do not include BEGIN TRANSACTION /
COMMIT inside migration files (nested transactions error in SQLite).
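A sketch of what a runner with these properties might look like, assuming the better-sqlite3 API (exec, prepare, transaction). The function names, the schema_migrations column names, and the pre-flight check for stray BEGIN / COMMIT statements are assumptions for illustration, not the repo's actual runner.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Matches a standalone BEGIN [TRANSACTION]; or COMMIT [TRANSACTION]; statement.
// A trigger body's BEGIN is not followed by ";" and so does not match.
const TXN_STATEMENT =
  /^\s*(BEGIN(\s+(DEFERRED|IMMEDIATE|EXCLUSIVE))?(\s+TRANSACTION)?|COMMIT(\s+TRANSACTION)?)\s*;/im;

function hasTransactionStatement(sql) {
  return TXN_STATEMENT.test(sql);
}

// db is expected to be a better-sqlite3 Database instance.
function applyMigrations(db, dir) {
  // Tracking table; column names here are assumed.
  db.exec(
    "CREATE TABLE IF NOT EXISTS schema_migrations (name TEXT PRIMARY KEY, applied_at TEXT NOT NULL)"
  );
  const applied = new Set(
    db.prepare("SELECT name FROM schema_migrations").all().map((row) => row.name)
  );
  const files = fs.readdirSync(dir).filter((f) => f.endsWith(".sql")).sort();
  const ran = [];
  for (const file of files) {
    if (applied.has(file)) continue; // already recorded, so re-running is a no-op
    const sql = fs.readFileSync(path.join(dir, file), "utf8");
    if (hasTransactionStatement(sql)) {
      throw new Error(`${file} contains BEGIN/COMMIT; the runner supplies the transaction`);
    }
    // One transaction per file: the migration and its tracking row commit together.
    db.transaction(() => {
      db.exec(sql);
      db.prepare(
        "INSERT INTO schema_migrations (name, applied_at) VALUES (?, datetime('now'))"
      ).run(file);
    })();
    ran.push(file);
  }
  return ran;
}
```

Recording the tracking row inside the same transaction means a failed migration leaves no trace and can simply be retried.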
Cloudflare
Account: mcmillangrubb
Zone: esg-screen.org
Workers / Pages: TBD — the ops site (this site you're reading) will deploy via Cloudflare Pages.
Cloudflare Tunnel: named esg-screening. Connector token lives in
1Password under the ESG Screening vault (it is NOT a User API
Token; it's a Tunnel connector token, surfaced under Zero Trust →
Networks → Tunnels). The distinction matters: looking on the wrong
dashboard page means searching for a token that is not there.
Cloudflare Access: gates esg-screen.org (the product) and
ops.esg-screen.org (this site) under a single Access app. OTP
flow, 5-address email allow list. Adding the ops site to the existing
app rather than a new one keeps the allow list in one place.
DNS
Zone: esg-screen.org in the mcmillangrubb Cloudflare account.
Records:
| Type | Name | Target | Purpose |
|---|---|---|---|
| CNAME | esg-screen.org | Tunnel | Product UI |
| CNAME | ops.esg-screen.org | Cloudflare Pages | This ops site |
Old DNS: esg.mcmillangrubb.com — superseded by esg-screen.org. Do
not use the old name.
Secrets
Where each secret lives:
| Secret | Location | Notes |
|---|---|---|
| GitHub auth (for the VM to push) | gh CLI on the VM, token in ~/.config/gh/hosts.yml | Set up via gh auth login device-code flow on 12 May |
| Cloudflare Tunnel token | 1Password ESG Screening vault | Connector token |
| GLEIF API | None — public unauthenticated | |
| SBTi data download | None — public Excel | |
| UK Companies House | None for the read endpoints used | |
| Slack | Bot token in environment (managed by claude.ai integration, not on the VM) | |
Anti-patterns to avoid:
- PAT (personal access token) at rest on the VM filesystem (e.g. /root/.esg-gh-token). gh auth login is the preferred pattern.
- Secrets in committed files (.env is gitignored; never commit one).
- Secrets in scraper_config_json. That column is for non-secret config (alternate names, Companies House numbers, etc.).
Repository
Org: McMillanGrubb
Main repo: esg-screening
Ops site repo: esg-screen-ops (pending — to be created when ADR-0006 lands)
Branch convention: main only. No feature branches in single-operator
mode. If/when a second contributor lands, switch to PR-based workflow.
Commit convention: Conventional Commits-ish prefixes:
- feat(scope): ... for new functionality
- feat(schema): ... for migrations
- fix(scope): ... for fixes
- docs: ... for documentation
- chore: ... for housekeeping
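The prefixes above could be checked mechanically, for example from a commit-msg hook. A minimal sketch; the function name and the exact scope character set are assumptions, and treating the scope as optional for every type is a loosening the convention above does not spell out.

```javascript
// Accepts the four prefixes listed above; a lowercase scope in parentheses is optional.
const COMMIT_PREFIX = /^(feat|fix|docs|chore)(\([a-z0-9_-]+\))?: \S/;

// Returns true when the first line of a commit message follows the convention.
function isConventionalSubject(subject) {
  return COMMIT_PREFIX.test(subject);
}
```

For instance, isConventionalSubject("feat(schema): add runs table") passes, while a bare "update stuff" does not.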
Push permissions: managed by gh on the VM. See "Secrets" above.
Local development (on the VM)
The VM is also the dev environment. There is no separate "dev" instance of the stack. Changes happen in the worktree, tested in-place against the live DB, committed and pushed when stable.
This is fine at single-operator scale. Pattern note: when adding
destructive migrations, take a DB backup first (sqlite3 data/esg.db
.dump > data/backups/esg-pre-NNN.sql).
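The pre-migration backup can be scripted so it is harder to forget. A sketch that shells out to the same sqlite3 .dump command; it assumes NNN is the migration number zero-padded to three digits, and the function names are hypothetical.

```javascript
const fs = require("node:fs");
const path = require("node:path");
const { execFileSync } = require("node:child_process");

// Assumption: NNN is the migration number, zero-padded to three digits.
function backupPathFor(migrationNumber, backupDir = "data/backups") {
  const nnn = String(migrationNumber).padStart(3, "0");
  return path.posix.join(backupDir, `esg-pre-${nnn}.sql`);
}

// Runs `sqlite3 <dbPath> .dump` and writes its stdout straight into the backup file,
// so the dump is never buffered in memory.
function dumpDatabase(migrationNumber, dbPath = "data/esg.db") {
  const target = backupPathFor(migrationNumber);
  fs.mkdirSync(path.dirname(target), { recursive: true });
  const out = fs.openSync(target, "w");
  try {
    execFileSync("sqlite3", [dbPath, ".dump"], { stdio: ["ignore", out, "inherit"] });
  } finally {
    fs.closeSync(out);
  }
  return target;
}
```

Run from the worktree root, dumpDatabase(7) would write data/backups/esg-pre-007.sql under these assumptions.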
Monitoring
Cron health: PM2 logs for esg-scheduler. No external monitoring yet.
Scraper health: per ADR-0003, the /methodology page in the product
shows last successful run + last error per source. Becomes the
operator-visible source status surface.
Stack health: PM2 status, /health endpoint (TBD on the read interface).
Alerting: none. Single-operator scale, weekly cadence — Rob notices.
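Since the /health endpoint is still TBD, here is one possible shape for it. The response fields are assumptions, not a decided contract; the handler only uses methods from Node's http.ServerResponse, so it should work both as an Express route handler and as a plain node:http listener.

```javascript
// Minimal liveness handler: reports that the process is up and for how long.
function healthHandler(req, res) {
  const body = JSON.stringify({
    status: "ok",
    uptimeSeconds: Math.round(process.uptime()),
  });
  res.writeHead(200, { "Content-Type": "application/json" });
  res.end(body);
}
```

In an Express app this would be mounted with app.get("/health", healthHandler).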
Deployment
Code deploys:
# On the VM
cd /home/azureuser/esg-screening
git pull
npm install # If package.json changed
pm2 restart esg-web
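If the manual steps are ever scripted, the "only if package.json changed" decision can be made from the list of files the pull touched (obtainable, for example, with git diff --name-only ORIG_HEAD HEAD after the pull). A sketch; the function name is hypothetical, and also treating package-lock.json as a trigger is an assumption beyond what the steps above state.

```javascript
// Given the files changed by the pull, return the commands to run afterwards
// as argv arrays, in order.
function postPullSteps(changedFiles) {
  const steps = [];
  const depsChanged = changedFiles.some(
    (f) => f === "package.json" || f === "package-lock.json"
  );
  if (depsChanged) steps.push(["npm", "install"]);
  steps.push(["pm2", "restart", "esg-web"]);
  return steps;
}
```

Returning argv arrays rather than shell strings lets a caller pass each step to child_process.execFileSync without quoting concerns.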
Schema migrations: applied via the migration runner, idempotent (uses
schema_migrations tracking table). CC-on-VM applies migrations during
the cycle that introduces them.
Cloudflare changes (tunnels, Access policies, DNS): manual via the Cloudflare dashboard. Document significant changes in Slack handoffs.
Ops site (this site): deploys via GitHub Actions → Cloudflare Pages
from the esg-screen-ops repo. Push to main = deploy. (See ADR-0006
for the architecture.)