Publishing Simulation Data to Zenodo
The gwmock repository commands let you create, manage, and publish
gravitational-wave simulation datasets on Zenodo, a community-driven
open-access repository. Publishing there provides long-term preservation,
a DOI, and easy sharing with the GW community.
Overview
Publishing a dataset involves these steps:
- Create a new deposition (draft repository)
- Upload your simulation files (GWF frames, metadata, configs)
- Update metadata (title, authors, keywords, etc.)
- Publish to finalize (generates a persistent DOI)
- Download published records (by anyone, no token needed)
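The upload, update, and publish steps can be rehearsed as a script before touching a real deposition. The sketch below is a dry run: `run_step` and `publish_draft` are hypothetical helpers (not part of gwmock) that only print the gwmock commands documented on this page unless `DRY_RUN=0` is set. The deposition ID and file names are placeholders.

```shell
# Dry-run by default: print each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

# run_step is a hypothetical wrapper, not a gwmock feature.
run_step() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Upload files, apply metadata, then publish an existing draft.
# Arguments: deposition ID, metadata file, one or more data files.
publish_draft() {
    deposition_id="$1"
    metadata_file="$2"
    shift 2
    for data_file in "$@"; do
        run_step gwmock repository upload "$deposition_id" --file "$data_file" || return 1
    done
    run_step gwmock repository update "$deposition_id" --metadata-file "$metadata_file" || return 1
    run_step gwmock repository publish "$deposition_id"
}

publish_draft 123456 deposition_metadata.yaml simulation_output.gwf config.yaml
```

With `DRY_RUN=0` the publish step would still show its confirmation prompt, so the script stops for a final check before anything becomes permanent.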
Setup
Get an API Token
Before using the repository commands, you need a Zenodo API token:
- Production Zenodo: Go to https://zenodo.org/account/settings/applications/tokens/new
- Sandbox (Testing): Go to https://sandbox.zenodo.org/account/settings/applications/tokens/new
When creating a token, ensure it has these scopes:
- deposit:write (write access to create depositions and upload files)
- deposit:actions (permission to publish depositions)
Set Environment Variables
Store your tokens securely as environment variables:
# Production token
export ZENODO_API_TOKEN="your_production_token_here"
# Sandbox token (for testing)
export ZENODO_SANDBOX_API_TOKEN="your_sandbox_token_here"
Tip: Add these to your .bashrc, .zshrc, or .env file to avoid re-entering them.
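To confirm the variables are visible in the current shell without printing the secrets, a small check like the following may help. `check_token` is a hypothetical helper, not a gwmock command. It only tests whether each variable is non-empty and does not contact Zenodo.

```shell
# Report whether a token variable is set, without echoing its value.
# check_token is a hypothetical helper for illustration.
check_token() {
    var_name="$1"
    # Indirect lookup of the variable whose name is in $var_name (POSIX-compatible).
    eval "value=\${$var_name:-}"
    if [ -n "$value" ]; then
        echo "$var_name is set"
    else
        echo "$var_name is missing"
        return 1
    fi
}

check_token ZENODO_API_TOKEN || true
check_token ZENODO_SANDBOX_API_TOKEN || true
```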
Verify Your Token
Test that your token works before publishing:
# Verify production token
gwmock repository verify
# Verify sandbox token
gwmock repository verify --sandbox
If successful, you'll see:
✓ Token is valid!
Environment: Zenodo (Production)
Found 3 draft deposition(s)
Workflow
Step 1: Create a Deposition
Start by creating a new draft deposition:
gwmock repository create \
  --title "GW Mock Data Challenge v1" \
  --description "Simulated binary black hole coalescences for ET"
Interactive mode: Omit options to be prompted:
gwmock repository create
# Deposition Title: GW Mock Data Challenge v1
# Deposition Description: Simulated binary black hole coalescences
Output:
Creating deposition...
✓ Deposition created successfully!
ID: 123456
Next: gwmock repository upload 123456 --file <path>
Save the deposition ID (e.g., 123456) for subsequent commands.
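In a script, the ID can be pulled out of the command output rather than copied by hand. The sketch below assumes the output contains an `ID: <digits>` line exactly as in the sample above; the real output format may differ, and `extract_deposition_id` is a hypothetical helper.

```shell
# Read text on stdin and print the first deposition ID found on an "ID: N" line.
# Assumes the line format shown in the sample output; adjust if it differs.
extract_deposition_id() {
    sed -n 's/^[[:space:]]*ID:[[:space:]]*\([0-9][0-9]*\).*$/\1/p' | head -n 1
}

# Sample text copied from the output above, standing in for the real command.
sample_output='Creating deposition...
✓ Deposition created successfully!
ID: 123456
Next: gwmock repository upload 123456 --file <path>'

DEPOSITION_ID=$(printf '%s\n' "$sample_output" | extract_deposition_id)
echo "$DEPOSITION_ID"
```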
Step 2: Upload Files
Upload your simulation outputs and metadata:
# Single file
gwmock repository upload 123456 --file simulation_output.gwf
# Multiple files
gwmock repository upload 123456 \
  --file simulation_output.gwf \
  --file metadata.yaml \
  --file config.yaml
Features:
- Files are uploaded with automatic timeout adjustment (10 seconds per MB)
- Progress bar shows upload status
- Failed uploads are reported, and transient failures are retried with exponential backoff
Output:
Uploading 3 file(s) to deposition 123456...
Uploading ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100%
✓ simulation_output.gwf (245.50 MB)
✓ metadata.yaml (0.05 MB)
✓ config.yaml (0.02 MB)
Next: gwmock repository update <id> --metadata-file <file>
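To get a feel for the 10-seconds-per-MB rule listed under Features, the arithmetic can be sketched as below. This is an illustration only: `bytes_to_mb` and `estimate_timeout` are hypothetical helpers, and the rounding (up to whole MB, with 1 MB taken as 1024 × 1024 bytes) is an assumption, since the CLI's exact rounding and any minimum timeout are not documented here.

```shell
SECONDS_PER_MB=10

# Round a byte count up to whole megabytes (assumes 1 MB = 1048576 bytes).
bytes_to_mb() {
    echo $(( ($1 + 1048575) / 1048576 ))
}

# Estimated timeout in seconds for a file of the given size in bytes.
estimate_timeout() {
    echo $(( $(bytes_to_mb "$1") * SECONDS_PER_MB ))
}

# A 245.5 MB file (257425408 bytes) rounds up to 246 MB.
estimate_timeout 257425408   # prints 2460
```

Under these assumptions the 245.50 MB frame file from the sample output would be allowed roughly 41 minutes.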
Step 3: Update Metadata
Describe your deposition with structured metadata in a YAML file (here, deposition_metadata.yaml):
creators:
  - name: 'Jane Doe'
    affiliation: 'LIGO Laboratory'
    orcid: '0000-0000-0000-0000'
  - name: 'John Smith'
    affiliation: 'Virgo Collaboration'
keywords:
  - 'gravitational waves'
  - 'mock data challenge'
  - 'binary black holes'
  - 'LIGO'
  - 'Virgo'
license: 'cc-by-4.0'
contributors:
  - name: 'LIGO Laboratory'
    role: 'Hosting institution'
related_identifiers:
  - identifier: '10.7935/gqm7-wf12'
    relation: 'references'
    resource_type: 'publication'
Upload metadata:
gwmock repository update 123456 --metadata-file deposition_metadata.yaml
Output:
Updating metadata for deposition 123456...
✓ Metadata updated successfully
Next: gwmock repository publish 123456
Step 4: Publish
Once files and metadata are complete, publish your deposition:
gwmock repository publish 123456
Confirmation prompt:
Publish deposition 123456? This action is permanent and cannot be undone. [y/N]:
Output (on success):
Publishing deposition 123456...
✓ Published successfully!
DOI: 10.5281/zenodo.123456
Important: Publishing is permanent. Once published, you cannot modify files or delete the record. Always verify metadata before publishing.
Step 5: Share & Download
Your dataset now has a permanent DOI and is discoverable:
Share the DOI: 10.5281/zenodo.123456
Download files (anyone can do this without a token):
gwmock repository download 123456 --file simulation_output.gwf --output ./data.gwf
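When sharing the DOI in papers or emails, the resolvable form is the DOI appended to https://doi.org/. A minimal sketch follows; `doi_url` is a hypothetical helper, not a gwmock command, and its check only confirms the value looks like a DOI (starts with `10.` and contains a slash).

```shell
# Turn a DOI into its resolver URL; reject strings that do not look like a DOI.
doi_url() {
    case "$1" in
        10.*/*) printf 'https://doi.org/%s\n' "$1" ;;
        *) echo "not a DOI: $1" >&2; return 1 ;;
    esac
}

doi_url 10.5281/zenodo.123456
```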
Advanced Usage
Testing with Sandbox
Use the Zenodo Sandbox to test your workflow before publishing to production:
# Create in sandbox
gwmock repository create \
  --title "Test Dataset" \
  --sandbox
# Upload files
gwmock repository upload 123456 --file data.gwf --sandbox
# Publish to sandbox
gwmock repository publish 123456 --sandbox
Sandbox DOI example: 10.5072/zenodo.123456 (note the 10.5072/ prefix)
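Because the two environments use different DOI prefixes, a script can guard against citing a sandbox DOI by mistake. The sketch below relies only on the two prefixes mentioned on this page; `doi_environment` is a hypothetical helper.

```shell
# Classify a Zenodo DOI by its prefix: 10.5281 is production, 10.5072 is sandbox.
doi_environment() {
    case "$1" in
        10.5281/zenodo.*) echo "production" ;;
        10.5072/zenodo.*) echo "sandbox" ;;
        *) echo "unknown" ;;
    esac
}

doi_environment 10.5072/zenodo.123456   # prints sandbox
```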
List Your Depositions
View all your depositions:
# List published records
gwmock repository list
# List draft (unpublished) records
gwmock repository list --status draft
# List in sandbox
gwmock repository list --status draft --sandbox
Output:
Listing published depositions...
┏━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ ID ┃ Title ┃ DOI ┃ Created ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ 123456 │ GW Mock Data Challenge v1 │ 10.5281/zenodo.123456 │ 2024-01-15 │
│ 123455 │ BBH Parameter Study │ 10.5281/zenodo.123455 │ 2024-01-10 │
└────────────┴────────────────────────────────────┴───────────────────────┴────────────┘
Delete a Draft
Remove an unpublished deposition:
gwmock repository delete 123456
# Skip confirmation
gwmock repository delete 123456 --force
Note: Only unpublished (draft) depositions can be deleted. Published records are permanent.
Download Existing Records
Download files from any published record using the deposition ID:
# Download a file
gwmock repository download 123456 \
  --file simulation_output.gwf \
  --output ./downloaded_data.gwf
# Pass the file size in MB so the timeout can be scaled to it up front
gwmock repository download 123456 \
  --file large_dataset.gwf \
  --output ./large_dataset.gwf \
  --file-size-mb 5000
Metadata Best Practices
When publishing GW simulation data, include:
- Title: Clear, descriptive (e.g., "GW Mock Data Challenge v1: Binary Black Holes")
- Description: Simulation parameters, instruments, frequency range
- Creators: Full names and ORCID iDs (if available)
- Keywords: gravitational waves, detector names (LIGO, Virgo), signal types (binary black holes, neutron stars)
- License: cc-by-4.0 is recommended for open science
- Related Identifiers: Link to papers, talks, or other related datasets
Example:
title: 'GW Mock Data Challenge v1: Synthetic Binary Black Hole Signals'
description: |
  Simulated gravitational-wave strain data for LIGO Hanford, LIGO Livingston,
  and Virgo detectors. Includes 1000 binary black hole coalescence waveforms
  with varying masses (10-100 solar masses), spins, and sky positions.
  Sampling rate: 16384 Hz
  Duration: 8 seconds per event
  Frequency range: 20-512 Hz
  Generated using PyCBC v1.18.4 and LALSuite v7.0.
creators:
  - name: 'Jane Doe'
    orcid: '0000-0001-2345-6789'
    affiliation: 'LIGO Laboratory, Caltech'
keywords:
  - 'gravitational waves'
  - 'LIGO'
  - 'Virgo'
  - 'binary black holes'
  - 'mock data challenge'
  - 'synthetic data'
license: 'cc-by-4.0'
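Since publishing is permanent, it can be worth catching a mistyped ORCID iD before it goes into the record. ORCID iDs end in an ISO 7064 MOD 11-2 check character, so a typo can usually be detected offline. The sketch below is a standalone check; `orcid_is_valid` is a hypothetical helper, not part of gwmock, and it does not confirm that the iD belongs to the named person.

```shell
# Return success if the argument is a well-formed ORCID iD with a correct
# ISO 7064 MOD 11-2 check character. Hypothetical helper for illustration.
orcid_is_valid() {
    orcid="$1"
    # Shape check: four hyphen-separated groups, last character a digit or X.
    printf '%s\n' "$orcid" |
        grep -Eq '^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]$' || return 1
    digits=$(printf '%s' "$orcid" | sed 's/-//g')
    total=0
    i=1
    # Fold the first 15 digits into the running checksum.
    while [ "$i" -le 15 ]; do
        d=$(printf '%s' "$digits" | cut -c "$i")
        total=$(( (total + d) * 2 ))
        i=$(( i + 1 ))
    done
    expected=$(( (12 - total % 11) % 11 ))
    if [ "$expected" -eq 10 ]; then
        expected=X
    fi
    # Compare against the 16th character of the iD.
    [ "$(printf '%s' "$digits" | cut -c 16)" = "$expected" ]
}

if orcid_is_valid 0000-0001-2345-6789; then
    echo "checksum ok"
fi
```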
Troubleshooting
403 Forbidden Error
Problem: Publishing or listing fails with 403 Client Error: FORBIDDEN
Solutions:
- Verify your token is valid:
  gwmock repository verify
- Generate a new token from https://zenodo.org/account/settings/applications/tokens/new
- Ensure the token has deposit:write and deposit:actions scopes
- Check that you're using the correct environment (--sandbox for sandbox, omit for production)
Token Not Found
Problem: Error: No Zenodo access token provided
Solution: Set environment variables:
export ZENODO_API_TOKEN="your_token"
export ZENODO_SANDBOX_API_TOKEN="your_sandbox_token"
Upload Timeout
Problem: Large files fail with timeout errors
Solution: The CLI auto-adjusts timeouts based on file size (10 seconds per MB), so very large files (> 10 GB) are uploaded with the same command and no extra timeout option:
gwmock repository upload 123456 --file huge_file.gwf
The retry logic with exponential backoff will automatically retry on transient failures.
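For readers unfamiliar with the pattern, exponential backoff means waiting longer after each failed attempt, typically doubling the delay. The sketch below illustrates the idea as a generic shell wrapper. It is not the CLI's implementation: `retry_with_backoff` is a hypothetical helper, and the attempt count and starting delay are arbitrary choices.

```shell
# retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached, doubling the
# delay after every failure.
retry_with_backoff() {
    max_attempts="$1"
    delay="$2"
    shift 2
    attempt=1
    while true; do
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempt(s)" >&2
            return 1
        fi
        echo "attempt $attempt failed; retrying in ${delay}s" >&2
        sleep "$delay"
        delay=$(( delay * 2 ))
        attempt=$(( attempt + 1 ))
    done
}

# Example (not run here): up to 4 attempts, waiting 2s, 4s, then 8s.
# retry_with_backoff 4 2 gwmock repository upload 123456 --file huge_file.gwf
```

Wrapping the upload this way should rarely be needed, since the CLI already retries transient failures itself.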
Cannot Modify After Publishing
Problem: Need to fix metadata after publishing
Solution: Create a new deposition. Zenodo treats each published version as immutable. You can:
- Create a new deposition with updated metadata
- Link it to the previous version using related_identifiers in metadata
- Publish with an increment (e.g., "v2")
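The linking step can be scripted by generating the related_identifiers fragment for the new deposition's metadata file. In the sketch below, `write_version_link` is a hypothetical helper; the field names mirror the related_identifiers example in Step 3, and the relation value `isNewVersionOf` is an assumption (it is a standard DataCite relation that Zenodo recognizes, but this page does not confirm that the metadata file accepts it).

```shell
# Print a YAML fragment that points a new deposition at the DOI it supersedes.
# The 'isNewVersionOf' relation is an assumption; check it against your setup.
write_version_link() {
    previous_doi="$1"
    cat <<EOF
related_identifiers:
  - identifier: '$previous_doi'
    relation: 'isNewVersionOf'
EOF
}

write_version_link 10.5281/zenodo.123456
```

The output can be appended to the metadata file for the "v2" deposition before running the update command.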