Performance

Cost of generating a CBC catalogue data product with gwmock-signal, across backends (lal, pycbc, ripple), methods (per-event vs the batched on-device path), and hardware.

Each run produces the catalogue twice: a cold run that pays the one-time JIT/XLA compilation, and a warm steady-state run. It records both wall times, their difference (compile_seconds), throughput, core-hours, peak memory, and output size. The warm numbers are the headline: at catalogue scale the one-time compile amortizes away, so steady state is what a year-long run actually sees. The cold points are kept beside them because a GPU's compile takes longer than a CPU's, which can mask the device's advantage at small event counts.
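As a rough illustration of how the recorded quantities relate, the sketch below derives compile time, warm throughput, and core-hours from a pair of wall times. The function and field names are hypothetical and are not the gwmock-benchmark schema; clamping the compile time at zero and defining core-hours as wall time times core count are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class DerivedMetrics:
    """Hypothetical container; not the benchmark's actual output schema."""
    compile_seconds: float
    warm_events_per_second: float
    warm_core_hours: float


def derive_metrics(n_events, cold_wall_s, warm_wall_s, n_cores=1):
    # Compile cost is estimated as the cold/warm wall-time difference.
    # Clamping at zero is an assumption, covering runs where timing noise
    # makes the warm pass slightly slower than the cold one.
    compile_seconds = max(cold_wall_s - warm_wall_s, 0.0)
    # Steady-state throughput uses the warm wall time only.
    warm_events_per_second = n_events / warm_wall_s
    # Core-hours here means wall time times cores used (assumed definition).
    warm_core_hours = warm_wall_s * n_cores / 3600.0
    return DerivedMetrics(compile_seconds, warm_events_per_second, warm_core_hours)


# Illustrative inputs: 5000 events, 27 s cold, 12 s warm, one device.
metrics = derive_metrics(n_events=5000, cold_wall_s=27.0, warm_wall_s=12.0)
print(metrics)
```

With these inputs the compile estimate is 15 s, which is more than the 12 s the warm run itself takes. That is the situation in which the cold point understates the device.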

The charts are scatter plots over the waveform model (x-axis): each backend/method/hardware cell is a coloured point, so the hardware comparison lives in the colour legend and models sit side by side. Cold and warm appear as different point shapes, and models are sorted with the best result on the left (highest throughput, lowest wall time / memory). The gwmock-signal version each cell was produced with is shown in the tooltip and the table.

The charts are interactive — hover for exact values, click a cell in the legend to isolate it, and use the chart menu to export. The table below is sortable and searchable.

| cell | model | device | gwmock-signal | warm ev/s | cold wall (s) | warm wall (s) | compile (s) | peak mem (GB) | output (GB) | contributor |
|---|---|---|---|---|---|---|---|---|---|---|
| lal per-event (CPU) | IMRPhenomD | AMD EPYC 7643 48-Core Processor | 0.9.0 | 34 | 151 | 148 | 3.0 | 2.0 | 0.81 | |
| lal per-event (CPU) | IMRPhenomD | Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz | 0.9.0 | 26 | 192 | 191 | 1.2 | 2.0 | 0.81 | |
| pycbc per-event (CPU) | IMRPhenomD | AMD EPYC 7643 48-Core Processor | 0.9.0 | 9 | 569 | 566 | 3.9 | 2.4 | 0.81 | |
| pycbc per-event (CPU) | IMRPhenomD | Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz | 0.9.0 | 7 | 767 | 767 | 0.2 | 2.4 | 0.81 | |
| ripple batched (CPU) | IMRPhenomD | AMD EPYC 7643 48-Core Processor | 0.9.0 | 262 | 27 | 19 | 7.7 | 11.1 | 0.81 | |
| ripple batched (CPU) | IMRPhenomD | Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz | 0.9.0 | 177 | 37 | 28 | 8.4 | 9.5 | 0.81 | |
| ripple batched (GPU a30) | IMRPhenomD | NVIDIA A30 | 0.9.0 | 420 | 27 | 12 | 15.1 | 6.6 | 0.81 | |
| ripple batched (GPU) | IMRPhenomD | NVIDIA GeForce RTX 5060 Ti | 0.9.0 | 225 | 36 | 22 | 13.5 | 6.5 | 0.81 | |
| ripple per-event (CPU) | IMRPhenomD | AMD EPYC 7643 48-Core Processor | 0.9.0 | 3 | 1907 | 1999 | 0.0 | 5.6 | 0.81 | |
| ripple per-event (CPU) | IMRPhenomD | Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz | 0.9.0 | 2 | 2240 | 2312 | 0.0 | 5.7 | 0.81 | |
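To make the amortization argument concrete, the following sketch estimates the event count above which a device with a longer compile but higher warm throughput overtakes a slower one. It assumes a simple model (total time = compile time + events / warm throughput, with throughput independent of catalogue size), which is an approximation rather than anything the benchmark itself computes. The function names are hypothetical, and the inputs are the rounded ripple batched figures for the NVIDIA A30 and the AMD EPYC 7643.

```python
def total_seconds(n_events, events_per_second, compile_seconds):
    """Cold wall time under the assumed model: compile once, then steady state."""
    return compile_seconds + n_events / events_per_second


def break_even_events(fast_eps, fast_compile_s, slow_eps, slow_compile_s):
    """Event count at which both devices have the same cold wall time."""
    if fast_eps <= slow_eps:
        raise ValueError("fast_eps must exceed slow_eps")
    extra_compile = fast_compile_s - slow_compile_s
    if extra_compile <= 0:
        return 0.0  # the faster device also compiles faster: it always wins
    # Seconds saved per event by the faster device once warm.
    saving_per_event = 1.0 / slow_eps - 1.0 / fast_eps
    return extra_compile / saving_per_event


# Rounded inputs: A30 (420 ev/s, 15.1 s compile) vs EPYC 7643 (262 ev/s, 7.7 s).
n_star = break_even_events(420.0, 15.1, 262.0, 7.7)
print(f"break-even at roughly {n_star:.0f} events")
```

Under this model the break-even falls close to 5000 events, in line with both cells reporting the same cold wall time of 27 s. Above that scale the GPU's larger compile is amortized and its warm throughput dominates.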

Reproduce / contribute

uv run --extra signal gwmock-benchmark signal performance \
    --backend ripple --method batched --n-events 5000 \
    -o data/signal/performance/ripple_batched_<your-gpu>.json

Then open a pull request adding the data file — figures and tables regenerate automatically. See Contribute a benchmark.