Architecture

How Yako sources Microsoft cloud data, caches it on getyako.com, and serves it to the browser extension.

Overview

Yako has two deliverables:

  • A browser extension (Edge, Chrome, Firefox) that renders the new tab.
  • The getyako.com website, which doubles as a CDN for cached Microsoft data and assets.

The extension only ever talks to getyako.com. It never fetches from GitHub, msportals.io, cmd.ms, or any other origin at runtime. This gives us:

  • Stable URLs that survive upstream repo reorganisations.
  • A single, auditable domain in the extension manifest.
  • Lower latency (responses are cached at Cloudflare's edge) and no exposure to upstream rate limits.

Upstream data sources

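Three community-maintained repositories feed the pipeline (named as in the flow diagram below):

  • merill/MicrosoftCloudLogos: product logos and per-service icons (the logos/ and icons/ trees).
  • merill/cmd: the cmd.ms commands.csv.
  • adamfowlerit/msportals.io: the nine msportals.io category files.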

Sync pipeline

A single Deno task, tasks/sync-icons.ts, orchestrates everything:

  1. Downloads the MicrosoftCloudLogos tarball, extracts only the logos/ and icons/ trees, and writes them to website/public/ms/.
  2. Builds website/public/ms/catalog.json — a unified index of every product (from metadata.md) plus every loose icon, with the best-quality image picked per product (SVG > 512px PNG > smaller).
  3. Downloads commands.csv to website/public/data/commands.csv.
  4. Downloads the nine msportals.io category files to website/public/data/portals/*.json.
  5. Derives two server-side manifests the extension actually consumes:
    • portals-by-category.json — single file bundling all nine msportals categories. Replaces nine HTTP requests with one.
    • portals.json — a flat, deduplicated list of searchable entries (cmd.ms + msportals) with iconUrl already resolved by fuzzy-matching portal names against the logos catalog (exact normalised name, altname, then substring).
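
The image ranking in step 2 and the icon matching in step 5 can be sketched as below. This is a minimal illustration, not the code in tasks/sync-icons.ts: the CatalogImage and CatalogEntry shapes, the field names, and the normalisation rule (lower-case, strip non-alphanumerics) are all assumptions, since the real catalog.json schema is not shown here.

```typescript
// Hypothetical shapes; the real catalog.json schema may differ.
interface CatalogImage {
  path: string;
  format: "svg" | "png";
  size?: number; // pixel width, PNG only
}

interface CatalogEntry {
  name: string;
  altNames: string[];
  iconUrl: string;
}

// SVG always wins; among PNGs the larger one wins (so 512px beats smaller).
function imageRank(img: CatalogImage): number {
  return img.format === "svg" ? Infinity : img.size ?? 0;
}

function pickBestImage(images: CatalogImage[]): CatalogImage | undefined {
  let best: CatalogImage | undefined = undefined;
  for (const img of images) {
    if (best === undefined || imageRank(img) > imageRank(best)) best = img;
  }
  return best;
}

// Assumed normalisation: lower-case and drop everything except a-z and 0-9.
function normalise(name: string): string {
  return name.toLowerCase().replace(/[^a-z0-9]/g, "");
}

// Try exact normalised name, then alternative names, then substring match.
function resolveIconUrl(portalName: string, catalog: CatalogEntry[]): string | undefined {
  const key = normalise(portalName);
  if (key === "") return undefined;

  const exact = catalog.find((e) => normalise(e.name) === key);
  if (exact) return exact.iconUrl;

  const alt = catalog.find((e) => e.altNames.some((a) => normalise(a) === key));
  if (alt) return alt.iconUrl;

  const partial = catalog.find((e) => {
    const n = normalise(e.name);
    return n !== "" && (key.includes(n) || n.includes(key));
  });
  return partial ? partial.iconUrl : undefined;
}
```

A portal for which resolveIconUrl returns undefined is the kind of entry the sync would log as unmatched.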

Build + deploy

Cloudflare Pages builds on every push to main and whenever the deploy hook is triggered. The build command installs Deno, runs deno task sync-icons (which populates website/public/), then runs npm run build for the Astro site. The generated ms/ and data/ trees are .gitignored; only upstream SHAs are committed.

Change detection

.github/workflows/sync-upstream.yml runs hourly (cron at :17) and also reacts to repository_dispatch events from upstream forks. It:

  1. Fetches the current HEAD SHA of each upstream branch (via GitHub API).
  2. Compares against a cached SHA snapshot from the previous run.
  3. If any SHA changed, POSTs the Cloudflare Pages deploy hook.

Apart from a push to main, this workflow is the only trigger that rebuilds cached data: no upstream change, no deploy. Upstream repos can also trigger an immediate rebuild by sending a repository_dispatch event of type upstream-logos-changed (see each source's notify-yako.yml).
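
The comparison step can be sketched as a small pure function. This is an illustration only: the ShaSnapshot shape (repository name mapped to HEAD SHA) and the function names are assumptions, and the actual workflow is a GitHub Actions file rather than TypeScript.

```typescript
// Hypothetical snapshot shape: repository name -> HEAD commit SHA.
type ShaSnapshot = Record<string, string>;

// Return the repositories whose SHA differs from the previous snapshot.
// A repository missing from the previous snapshot counts as changed.
function changedUpstreams(previous: ShaSnapshot, current: ShaSnapshot): string[] {
  return Object.keys(current).filter((repo) => previous[repo] !== current[repo]);
}

// Call the injected deploy trigger only when at least one upstream moved.
async function syncIfChanged(
  previous: ShaSnapshot,
  current: ShaSnapshot,
  triggerDeploy: () => Promise<void>,
): Promise<string[]> {
  const changed = changedUpstreams(previous, current);
  if (changed.length > 0) {
    await triggerDeploy();
  }
  return changed;
}
```

Here triggerDeploy stands in for the POST to the Cloudflare Pages deploy hook; injecting it keeps the comparison logic testable without network access.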

Public URL contract

These URLs are stable. Extension versions currently shipping depend on them.

  • https://getyako.com/ms/catalog.json — logos + icons index.
  • https://getyako.com/ms/logos/… — product logo assets.
  • https://getyako.com/ms/icons/… — per-service icon assets.
  • https://getyako.com/data/commands.csv — raw cmd.ms commands.
  • https://getyako.com/data/portals/<category>.json — raw msportals category files.
  • https://getyako.com/data/portals-by-category.json — bundled msportals view.
  • https://getyako.com/data/portals.json — unified searchable manifest with resolved iconUrl.

Extension consumption

Feature              Endpoint                        Cache
cmd.ms terminal      /data/commands.csv              chrome.storage.local, 24h
msportals.io view    /data/portals-by-category.json  chrome.storage.local, 24h
Add-portal dialog    /data/portals.json              chrome.storage.local, 24h
Icon picker          /ms/catalog.json + /ms/…        chrome.storage.local, 24h
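
The 24h cache with a stale fallback can be sketched as follows. This is a minimal sketch under stated assumptions, not the extension's actual code: the CacheEntry shape, the KeyValueStore interface (a stand-in for a thin wrapper around chrome.storage.local), and the injected fetchText function are all hypothetical.

```typescript
const TTL_MS = 24 * 60 * 60 * 1000; // 24 hours

// Hypothetical cache record: response body plus the time it was fetched.
interface CacheEntry {
  body: string;
  fetchedAt: number; // milliseconds since epoch
}

// Hypothetical storage interface so the sketch runs outside a browser.
interface KeyValueStore {
  get(key: string): Promise<CacheEntry | undefined>;
  set(key: string, entry: CacheEntry): Promise<void>;
}

// An entry is fresh while it is strictly younger than the TTL.
function isFresh(entry: CacheEntry, now: number): boolean {
  return now - entry.fetchedAt < TTL_MS;
}

// Serve from cache when fresh, otherwise refetch from getyako.com.
// If the fetch fails, fall back to the stale copy when one exists.
async function loadCached(
  path: string,
  store: KeyValueStore,
  fetchText: (url: string) => Promise<string>,
  now: number = Date.now(),
): Promise<string | undefined> {
  const cached = await store.get(path);
  if (cached !== undefined && isFresh(cached, now)) {
    return cached.body;
  }
  try {
    const body = await fetchText(`https://getyako.com${path}`);
    await store.set(path, { body, fetchedAt: now });
    return body;
  } catch {
    return cached !== undefined ? cached.body : undefined;
  }
}
```

A call such as loadCached("/data/portals.json", store, fetchText) returns undefined only when the fetch fails and nothing was ever cached for that path.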

Flow diagram

  upstream repos                   yako repo / CF Pages                 extension
  ──────────────                   ────────────────────                 ─────────
  merill/MicrosoftCloudLogos ─┐
                              │    sync-upstream.yml (hourly)
  merill/cmd ─────────────────┼─▶  ├── poll SHAs
                              │    └── deploy hook ──▶ CF Pages build
  adamfowlerit/msportals.io ──┘                         │
                                                        │  deno task sync-icons
                                                        │   ├── download tarballs
                                                        │   ├── build catalog.json
                                                        │   ├── build portals.json
                                                        │   └── build by-category.json
                                                        ▼
                                               getyako.com/{ms,data}/…
                                                        │
                                                        ▼
                                                new tab ──▶ fetch + 24h cache

Stability guarantees

  • Icon URLs never 404 as long as a product exists in the logos fork. Upstream reorganisations are absorbed in the extension's next launch — icon paths resolve fresh from /ms/catalog.json.
  • Extension never hard-fails when upstream is down. Each data endpoint falls back to its cached copy in chrome.storage.local; entries without a matched icon fall back to pastel initials.
  • No client-side override files. Icon matching is fuzzy but entirely server-side. Unmatched portals are logged during sync so upstream gaps can be fixed at the source rather than patched per-client.
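
One way the pastel-initials fallback could work is sketched below. This is purely illustrative: the two-letter rule, the hash, and the HSL saturation and lightness values are assumptions, and the extension's actual rendering is not described in this document.

```typescript
// Take the first letter of up to two words, upper-cased.
function initials(name: string): string {
  return name
    .trim()
    .split(/\s+/)
    .filter((word) => word.length > 0)
    .slice(0, 2)
    .map((word) => word.charAt(0).toUpperCase())
    .join("");
}

// Derive a stable hue (0-359) from the name so each portal keeps its colour.
// High lightness and moderate saturation give a pastel tone.
function pastelColor(name: string): string {
  let hue = 0;
  for (let i = 0; i < name.length; i++) {
    hue = (hue * 31 + name.charCodeAt(i)) % 360;
  }
  return `hsl(${hue}, 70%, 85%)`;
}
```

Because the colour is derived from the name alone, the same portal gets the same tile on every device without any stored state.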