Reading Data
This guide explains how to read and work with GWF (frame file) data generated by gwmock.
We use the GWpy Python package for these examples. For more details, refer to the GWpy documentation.
Reading Frame Files
gwmock generates data in GWF (frame file) format. To read data from a frame file:
from gwpy.timeseries import TimeSeries
# Read specific channel from a frame file
data = TimeSeries.read("filename.gwf", channel="E1:STRAIN")
Parameters:
- filename: Path to the GWF file
- channel: Channel name to read (common format: DETECTOR:CHANNEL_NAME)
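A malformed channel name is an easy mistake to make, so it can be worth validating the string before reading. The helper below is a minimal sketch of the DETECTOR:CHANNEL_NAME format described above; `split_channel` is a hypothetical name for illustration, not part of gwmock or GWpy.

```python
def split_channel(channel: str) -> tuple[str, str]:
    """Split 'DETECTOR:CHANNEL_NAME' into (detector, channel_name).

    Hypothetical helper: it only checks the format, not whether the
    channel actually exists in a frame file.
    """
    detector, sep, name = channel.partition(":")
    if not sep or not detector or not name:
        raise ValueError(f"expected DETECTOR:CHANNEL_NAME, got {channel!r}")
    return detector, name


detector, name = split_channel("E1:STRAIN")
print(detector, name)
```

A string without a colon, or with an empty part on either side, raises `ValueError` before any file is opened.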
Example
from gwpy.timeseries import TimeSeries
# Read E1 strain data
e1_data = TimeSeries.read("E-E1-NOISE_STRAIN-1577491218-4096.gwf", channel="E1:STRAIN")
# Check properties
print(f"Duration: {e1_data.duration}")
print(f"Sampling frequency: {e1_data.sample_rate}")
print(f"Start time: {e1_data.t0}")
Merging Frame Files
Frame files generated by gwmock may contain different types of content (noise, signals, glitches). To obtain a realistic data stream, merge multiple files:
from gwpy.timeseries import TimeSeries
# Read noise and signal data
noise_data = TimeSeries.read("filename_noise.gwf", channel="E1:STRAIN")
signal_data = TimeSeries.read("filename_signal.gwf", channel="E1:STRAIN")
# Combine them
combined_data = noise_data.inject(signal_data)
You can also merge files directly using the CLI:
gwmock merge filename_noise.gwf filename_signal.gwf \
--metadata noise-0.metadata.yaml \
--metadata signal-0.metadata.yaml \
--channel E1:STRAIN \
--output-channel E1:STRAIN
This produces a merged frame file and a merged metadata file documenting all input files and merge details.
Merging Multiple Files
To merge a sequence of files:
from gwpy.timeseries import TimeSeries
files = [
    "E-E1-NOISE_STRAIN-1000000000-1024.gwf",
    "E-E1-NOISE_STRAIN-1000001024-1024.gwf",
    "E-E1-NOISE_STRAIN-1000002048-1024.gwf",
]
# Read all files
data_list = [TimeSeries.read(f, channel="E1:STRAIN") for f in files]
# Concatenate
combined = data_list[0]
for data in data_list[1:]:
    combined = combined.append(data)
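Before concatenating, it can help to confirm that the files cover a contiguous stretch of time. The sketch below does this from the file names alone. It assumes every name ends in `-<gps_start>-<duration>.gwf`, as the example names above suggest. `gps_span` and `find_gaps` are hypothetical helpers, not part of gwmock or GWpy.

```python
from pathlib import Path


def gps_span(filename: str) -> tuple[int, int]:
    """Return (gps_start, gps_end) parsed from '...-<gps_start>-<duration>.gwf'."""
    stem = Path(filename).stem
    # The last two hyphen-separated fields are assumed to be start and duration.
    start, duration = stem.rsplit("-", 2)[-2:]
    return int(start), int(start) + int(duration)


def find_gaps(filenames: list[str]) -> list[tuple[int, int]]:
    """Return (end_of_previous, start_of_next) wherever consecutive files
    do not meet exactly, which covers both gaps and overlaps."""
    spans = sorted(gps_span(name) for name in filenames)
    return [
        (prev_end, next_start)
        for (_, prev_end), (next_start, _) in zip(spans, spans[1:])
        if prev_end != next_start
    ]


files = [
    "E-E1-NOISE_STRAIN-1000000000-1024.gwf",
    "E-E1-NOISE_STRAIN-1000001024-1024.gwf",
    "E-E1-NOISE_STRAIN-1000002048-1024.gwf",
]
print(find_gaps(files))
```

An empty list means each file starts where the previous one ends. The check reads only the names, so the times stored inside the frames should still be verified after reading.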
Warning
Two time series can only be combined if:
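- Time properties match: Same sampling frequency, and time spans that suit the operation: overlapping spans when adding series with inject(), contiguous (gap-free) spans when concatenating with append()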
- Units match: Both must have the same physical units (e.g., strain)
If units differ, override them before combining:
from astropy.units import Unit
noise_data.override_unit(Unit(""))
signal_data.override_unit(Unit(""))
Accessing Metadata
gwmock automatically generates metadata files for each simulation. Access them with:
import yaml
# Read metadata (YAML format; requires the PyYAML package)
with open("metadata/noise-0.metadata.yaml", "r") as f:
    metadata = yaml.safe_load(f)
print(metadata["simulator_name"])
print(metadata["simulator_config"])
print(metadata["output_files"])
Metadata includes:
- Simulator configuration
- Random number generator state (for reproducibility)
- Output file names
- Version information
- Generation timestamps
For a quick guide on how to inspect and reuse metadata files to reproduce a dataset, see the Metadata Files page.
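Once the metadata is loaded, one simple use is to check that the files it lists are still on disk. The sketch below assumes that `output_files` is a list of file names relative to some output directory. The exact layout of the metadata is not specified here, so treat that as an assumption. `missing_outputs` and the `"output"` directory are hypothetical.

```python
from pathlib import Path


def missing_outputs(metadata: dict, base_dir: str = ".") -> list[str]:
    """Return the entries of metadata['output_files'] not found under base_dir.

    Assumes 'output_files' is a list of relative file names (an assumption
    about the metadata layout, not a documented guarantee).
    """
    base = Path(base_dir)
    return [
        name
        for name in metadata.get("output_files", [])
        if not (base / name).exists()
    ]


# Stand-in for a dictionary loaded with yaml.safe_load()
metadata = {"output_files": ["E-E1-NOISE_STRAIN-1000000000-1024.gwf"]}
print(missing_outputs(metadata, base_dir="output"))
```

An empty result means every listed file was found under the given directory.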
Working with Multiple Detectors
Process data from multiple detectors:
from gwpy.timeseries import TimeSeries
detectors = ["E1", "E2", "E3"]
# Read data for each detector
detector_data = {}
for detector in detectors:
    channel = f"{detector}:STRAIN"
    filename = f"E-{detector}-NOISE_STRAIN-1000000000-1024.gwf"
    detector_data[detector] = TimeSeries.read(filename, channel=channel)
# Process or analyze each
for detector, data in detector_data.items():
    print(f"{detector}: {data.duration.to('minute')} of data")
Plotting Data
Visualize the data using GWpy's plotting utilities:
from gwpy.timeseries import TimeSeries
import matplotlib.pyplot as plt
# Read data
data = TimeSeries.read("E-E1-NOISE_STRAIN-1000000000-1024.gwf", channel="E1:STRAIN")
# Plot time series
plot = data.plot(title="Strain Data")
plot.show()
# Plot power spectral density
spectrum = data.psd()
plot = spectrum.plot()
plot.show()
Best Practices
- Always specify the channel: Use the full channel name format DETECTOR:CHANNEL_NAME
- Check continuity: Verify time properties before combining files
- Preserve units: Don't remove or override units unless necessary
- Use metadata: Reference metadata files to understand generation parameters
- Handle large files: Read files larger than available RAM in time windows (TimeSeries.read accepts start and end GPS times) instead of loading them whole
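For the large-file case, one possible approach is to split the file's time span into fixed-length windows and read them one at a time. The generator below is a minimal sketch that only computes the window boundaries. `iter_windows` is a hypothetical helper, and the 256-second chunk length is an arbitrary choice.

```python
from typing import Iterator


def iter_windows(gps_start: int, duration: int, chunk: int) -> Iterator[tuple[int, int]]:
    """Yield (start, end) GPS windows of at most `chunk` seconds covering the span."""
    if chunk <= 0:
        raise ValueError("chunk must be a positive number of seconds")
    gps_end = gps_start + duration
    for start in range(gps_start, gps_end, chunk):
        # The final window is shortened so it never runs past the end of the file.
        yield start, min(start + chunk, gps_end)


for start, end in iter_windows(1000000000, 1024, 256):
    # With GWpy installed, each window could then be read on its own, e.g.
    # TimeSeries.read(path, channel="E1:STRAIN", start=start, end=end)
    print(start, end)
```

Processing one window at a time keeps only that window's samples in memory.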
Troubleshooting
"Channel not found" error
Check available channels in the file:
from gwpy.io import gwf
# List all channels
channels = gwf.get_channel_names("filename.gwf")
print(channels)
Units mismatch
Ensure both time series have compatible units:
# Check units
print(data1.unit)
print(data2.unit)
# Convert if needed
data2_converted = data2.to("strain")
Time alignment issues
Verify time properties before merging:
# span holds the (start, end) GPS times of each series
print(f"Data 1: {data1.span[0]} to {data1.span[1]}")
print(f"Data 2: {data2.span[0]} to {data2.span[1]}")
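Once the start and end times are known, the shared interval of two series can be computed directly. The function below is a small sketch that works on plain `(start, end)` pairs; `common_span` is a hypothetical helper, not a GWpy function.

```python
from typing import Optional, Sequence, Tuple


def common_span(
    span1: Sequence[float], span2: Sequence[float]
) -> Optional[Tuple[float, float]]:
    """Return the overlapping (start, end) of two spans, or None if they do not overlap."""
    start = max(span1[0], span2[0])
    end = min(span1[1], span2[1])
    return (start, end) if start < end else None


print(common_span((1000000000, 1000001024), (1000000512, 1000002048)))
print(common_span((1000000000, 1000001024), (1000001024, 1000002048)))
```

Spans that only touch at a boundary share no samples and return `None`; such series are candidates for concatenation rather than addition. With GWpy series, the `span` attribute of each series can be passed in as the two arguments.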