Open-source biosurveillance · MIT licensed

Streamlined pathogen intelligence, straight from the wastewater.

MOSAIC fuses wastewater, genomic, and outbreak-news signals into a single calibrated outbreak posterior, giving epidemiologists a population-scale early warning weeks before clinical case data catches up.

No signup. Real CDC NWSS data.

MOSAIC surveillance console · LIVE
Global sewershed network · P(Rt>1) · 39 sites · 19 countries

Paris, France
P(Rt>1): 94% · Rt: 1.55 · Lead: 91 d
SARS-CoV-2: 85 · Influenza A: 14 · Influenza B: 19 · RSV: 18

Open Paris in the live console

The mission

Outbreaks deserve software that sees them coming.

By the time clinical cases confirm a wave, it is already underway. The pathogens are already in the sewers, the variants are already in the sequence databases, and the first reports are already in the outbreak bulletins. MOSAIC reads all three independent signals at once and turns them into one number an epidemiologist can trust.

CDC NWSS

Wastewater

Viral activity levels from treatment-plant sewersheds, a population-scale signal that doesn't depend on who seeks care or gets tested.

Nextstrain

Genomic

Lineage frequencies tracked over time; a KL-divergence jump flags an emerging variant before it dominates.

WHO DON · ProMED

Outbreak text

NLP extracts pathogen, place and counts from official outbreak reports and clinician posts worldwide.
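
To make the genomic stream's "KL-divergence jump" concrete, here is a minimal sketch of how a divergence between two consecutive lineage-frequency windows might be computed. The function name, the smoothing constant and the lineage shares are all illustrative assumptions, not MOSAIC's actual code or data.

```python
import math

def kl_divergence(p, q, eps=1e-6):
    """KL(p || q) in nats between two lineage-frequency dicts.

    eps is an assumed smoothing constant so that a lineage missing
    from one window does not produce log(0).
    """
    lineages = set(p) | set(q)
    ps = {k: p.get(k, 0.0) + eps for k in lineages}
    qs = {k: q.get(k, 0.0) + eps for k in lineages}
    zp, zq = sum(ps.values()), sum(qs.values())
    return sum(
        (ps[k] / zp) * math.log((ps[k] / zp) / (qs[k] / zq))
        for k in lineages
    )

# Hypothetical lineage shares for two consecutive windows (not real data).
previous = {"lineage_A": 0.90, "lineage_B": 0.10}
current = {"lineage_A": 0.60, "lineage_B": 0.25, "lineage_C": 0.15}

same = kl_divergence(previous, previous)  # identical windows give zero divergence
jump = kl_divergence(current, previous)   # a newly appearing lineage gives a large value

print(f"no change: {same:.4f}, new lineage: {jump:.4f}")
```

Under this sketch, a lineage that was absent in the previous window dominates the score, which is the behaviour one would want from an emerging-variant flag. The size of that effect depends on the assumed smoothing constant.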

How it works

From sewer to signal in four stages.

Each stream is scored independently, fused in a hierarchical Bayesian model, and calibrated so the probability means what it says. Every stage is inspectable in the console.

Ingest · 3 surveillance streams
Wastewater: CDC NWSS
Genomic: Nextstrain
Outbreak text: WHO · ProMED

Per-stream detectors · anomaly scoring
BOCPD: 71%
KL-divergence: 42%
NLP + change-pt: 31%

Bayesian fusion · hierarchical model
EpiEstim Rt: 1.34
Learned weights: logistic
Posterior: NUTS / MCMC

Calibrate → alert · P(Rt > 1)
Isotonic ECE: 0.086
P(Rt>1): -
Lead time: 68 d
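
As a rough illustration of the "learned weights, logistic" step, the sketch below combines three per-stream anomaly scores into one probability with a plain logistic function. It is a deliberately simplified stand-in: the hierarchical model and NUTS sampling named above are not reproduced here, and the weights and bias are hypothetical values chosen for the example, not learned parameters from MOSAIC.

```python
import math

def fuse_scores(scores, weights, bias):
    """Logistic combination of per-stream anomaly scores into one probability."""
    z = bias + sum(weights[name] * scores[name] for name in scores)
    return 1.0 / (1.0 + math.exp(-z))

# Detector scores as displayed in the pipeline panel (71%, 42%, 31%).
scores = {"wastewater": 0.71, "genomic": 0.42, "text": 0.31}

# Hypothetical weights and bias; a real system would fit these to data.
weights = {"wastewater": 3.0, "genomic": 2.0, "text": 1.0}
bias = -2.0

p = fuse_scores(scores, weights, bias)
print(f"fused probability: {p:.3f}")
```

With positive weights, raising any single stream's score raises the fused probability, so no one stream has to cross a threshold on its own for the combined signal to move.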

The platform

Six capabilities, one calibrated posterior.

Wastewater monitoring

Per-sewershed activity levels with BOCPD change-point detection and sustained-elevation flags.

Genomic lineage tracking

Rolling lineage frequencies and divergence-based anomaly scores for emerging variants.

Outbreak text mining

Structured epi events extracted from WHO DON and ProMED with novelty detection.

Bayesian fusion

A hierarchical model combines the streams into a single fused outbreak posterior, P(Rt>1).

Calibrated forecasting

Isotonic calibration so a stated 70% means 70%, validated across four historical outbreaks.

Daily briefings

Auto-generated, per-site situation reports an epidemiologist can act on in minutes.
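
For readers unfamiliar with isotonic calibration, the following is a minimal sketch of the pool-adjacent-violators idea behind it: raw scores are mapped to a non-decreasing step function of observed outcome frequencies. The function names and the toy data are assumptions for illustration, tied scores are not specially pooled, and this is not presented as MOSAIC's implementation.

```python
import bisect

def pav_fit(scores, outcomes):
    """Pool-adjacent-violators fit.

    Returns the sorted scores and a non-decreasing list of calibrated values.
    """
    pairs = sorted(zip(scores, outcomes))
    xs = [s for s, _ in pairs]
    blocks = []  # each block is [sum of outcomes, count]
    for _, y in pairs:
        blocks.append([float(y), 1])
        # Merge backwards while the previous block's mean exceeds the latest one.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return xs, fitted

def calibrate(x, xs, fitted):
    """Step-function lookup: value of the nearest fitted point at or below x."""
    i = bisect.bisect_right(xs, x) - 1
    return fitted[max(i, 0)]

# Toy data (not real): raw scores and whether growth was later observed.
raw_scores = [0.1, 0.3, 0.5, 0.7, 0.9]
observed = [0, 1, 0, 1, 1]

xs, fitted = pav_fit(raw_scores, observed)
print(fitted)
print(calibrate(0.6, xs, fitted))
```

In this toy case the out-of-order pair at 0.3 and 0.5 is pooled into a single level of 0.5, which is how the fit stays monotone.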

Validated, not vibes

Calibrated against four historical outbreaks.

MOSAIC was back-tested on real CDC NWSS records across Omicron, mpox, polio and H5N1. The headline result: well-calibrated probabilities with a meaningful early-warning lead.

Full findings

0.086

Expected calibration error

ECE < 0.10 ⇒ well-calibrated

0.917

AUROC

strong growth discrimination

68 d

Median lead time

ahead of clinical confirmation

1,334

Day-ahead forecasts

real CDC NWSS records, 2021-25

Reliability diagram: predicted vs. observed outbreak frequency (ECE 0.086).
ROC curve: P(Rt>1) growth discrimination (AUROC 0.917).
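
The two headline metrics, expected calibration error and AUROC, can be sketched in a few lines. The version below assumes equal-width probability bins for ECE and the pairwise-ranking definition of AUROC. The bin count, function names and toy inputs are illustrative assumptions, and the page does not state how MOSAIC itself bins its forecasts.

```python
def expected_calibration_error(probs, outcomes, n_bins=10):
    """Weighted mean gap between predicted probability and observed frequency."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    n = len(probs)
    ece = 0.0
    for b in bins:
        if b:
            confidence = sum(p for p, _ in b) / len(b)
            frequency = sum(y for _, y in b) / len(b)
            ece += len(b) / n * abs(confidence - frequency)
    return ece

def auroc(probs, outcomes):
    """Probability that a random positive is scored above a random negative."""
    pos = [p for p, y in zip(probs, outcomes) if y == 1]
    neg = [p for p, y in zip(probs, outcomes) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

# Toy forecasts and outcomes (not real data).
probs = [0.1, 0.4, 0.35, 0.8]
outcomes = [0, 0, 1, 1]

print(expected_calibration_error(probs, outcomes))
print(auroc(probs, outcomes))
```

ECE asks whether stated probabilities match observed frequencies, while AUROC only asks whether growing sites are ranked above non-growing ones, so a model can score well on one and poorly on the other.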

See the whole surveillance picture, live.

Explore sewershed sites across the globe, fused posteriors, live WHO/ProMED outbreak news, lineage mixes, auto-generated briefings, and a built-in Claude assistant. No signup required.

Launch the live demo