Hints on Data Analysis
What data do we get from the Guardian Earbud?
EEG
Your EEG file is exported as a CSV file with two columns: timestamps and ch1. Here’s what each column represents:
timestamps (Column A): The timestamps in your EEG recording file correspond to the UNIX time format, representing the number of seconds elapsed since the epoch time of January 1, 1970, 00:00:00 UTC. Each timestamp shows the exact time at which the corresponding EEG amplitude was recorded, in seconds with millisecond precision (up to three decimal places).
For example, the timestamp 1730465950.0 corresponds to a specific moment in time: Friday, November 1, 2024, 12:59:10 UTC. Each subsequent timestamp (such as 1730465950.004, 1730465950.008) is 4 milliseconds after the previous one, meaning your device is recording at a sample rate of 250 Hz (one sample every 4 milliseconds).
ch1 (Column B): The ch1 column in the EEG CSV file represents the raw electrical potential difference between the two earpieces of the EEG device, with respect to a reference electrode. This differential signal, expressed in microvolts (µV), captures the brain’s electrical activity by measuring the tiny voltage changes caused by neural processes. Each value in ch1 corresponds to a specific timestamp, indicating the potential difference recorded at that exact moment.
The use of a reference electrode standardizes the measurements, reducing noise and keeping the focus on the brain’s activity. This differential recording technique helps provide cleaner and more reliable EEG signals for analysis.
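As a quick illustration of the two columns described above, the following sketch reads an EEG export with Python's standard library, converts the first timestamp to a UTC datetime, and estimates the sample rate from the spacing between samples. The helper names (load_eeg_csv, estimate_sample_rate) and the inline sample rows are made up for this example, and it assumes the file has a header row with the column names timestamps and ch1.

```python
import csv
import io
from datetime import datetime, timezone
from statistics import median


def load_eeg_csv(file_obj):
    """Read an EEG export into two lists: UNIX timestamps (s) and ch1 values (µV)."""
    timestamps, ch1 = [], []
    for row in csv.DictReader(file_obj):
        timestamps.append(float(row["timestamps"]))
        ch1.append(float(row["ch1"]))
    return timestamps, ch1


def estimate_sample_rate(timestamps):
    """Estimate the sample rate (Hz) from the median gap between samples."""
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    return 1.0 / median(gaps)


# Hypothetical rows standing in for a real export; pass open("your_file.csv") instead.
sample_file = io.StringIO(
    "timestamps,ch1\n"
    "1730465950.000,12.5\n"
    "1730465950.004,13.1\n"
    "1730465950.008,11.8\n"
    "1730465950.012,12.0\n"
)
timestamps, ch1 = load_eeg_csv(sample_file)

start_time = datetime.fromtimestamp(timestamps[0], tz=timezone.utc)
sample_rate = estimate_sample_rate(timestamps)

print(start_time.isoformat())  # UTC time of the first sample
print(round(sample_rate))      # approximate samples per second
```

Using the median gap rather than the mean keeps the estimate stable if a few samples were dropped during recording.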
IMU
The IMU (Inertial Measurement Unit) data in the EEG device provides information about the device’s motion and orientation. Each IMU data point is recorded periodically and includes measurements from an accelerometer, gyroscope, and magnetometer, all timestamped for synchronization with EEG data.
Here’s a breakdown of the IMU fields in the CSV file:
timestamp: The UNIX timestamp indicating when the IMU data was recorded (in seconds since January 1, 1970). This allows you to align the IMU data with EEG data for analysis.
acc_x, acc_y, acc_z: The accelerometer readings in the X, Y, and Z axes (usually in meters per second squared, m/s²). These values represent the linear acceleration experienced by the device along each axis:
Positive or negative values in these axes indicate movement in a specific direction.
acc_z often shows values close to -9.8 m/s² when the device is stationary, due to gravity’s effect (the exact axis and sign depend on how the device is oriented).
magn_x, magn_y, magn_z: The magnetometer readings in the X, Y, and Z axes (usually in microteslas, µT). These values measure the magnetic field around the device:
The magnetometer readings can help determine the orientation of the device with respect to the Earth’s magnetic field, useful in compass-based orientation and heading.
gyro_x, gyro_y, gyro_z: The gyroscope readings in the X, Y, and Z axes (typically in degrees per second or radians per second). These values represent the rotational velocity of the device around each axis:
High values indicate rapid rotation around a given axis, while values close to zero indicate no rotation.
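One common use of the accelerometer fields is flagging EEG segments recorded while the head was moving. The sketch below computes the acceleration magnitude for each IMU row and marks rows that deviate from gravity by more than a threshold. It assumes the readings are in m/s². The function name flag_motion, the 1.0 m/s² threshold, and the sample rows are illustrative choices, not values from the device documentation.

```python
import math

GRAVITY = 9.81  # m/s², standard gravity; assumes accelerometer values are in m/s²


def flag_motion(rows, threshold=1.0):
    """Return (timestamp, is_moving) pairs.

    A row counts as moving when its acceleration magnitude differs from
    gravity by more than `threshold` (m/s²). The threshold is a hypothetical
    starting point and should be tuned on real recordings.
    """
    flags = []
    for row in rows:
        magnitude = math.sqrt(
            row["acc_x"] ** 2 + row["acc_y"] ** 2 + row["acc_z"] ** 2
        )
        flags.append((row["timestamp"], abs(magnitude - GRAVITY) > threshold))
    return flags


# Made-up IMU rows: the first is stationary, the second shows a sudden movement.
imu_rows = [
    {"timestamp": 1730465950.00, "acc_x": 0.1, "acc_y": -0.2, "acc_z": -9.8},
    {"timestamp": 1730465950.02, "acc_x": 3.5, "acc_y": 1.2, "acc_z": -12.4},
]
motion_flags = flag_motion(imu_rows)
print(motion_flags)
```

Working with the magnitude rather than a single axis makes the check independent of how the earbud sits in the ear.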
What’s the difference between raw EEG and filtered EEG?
Raw EEG
Raw EEG data represents the unprocessed electrical signals recorded directly from the brain. It contains:
Brain activity: Neural oscillations like delta, theta, alpha, beta, and gamma waves.
Artifacts: Non-brain signals such as:
Muscle activity (EMG), e.g., jaw clenches or facial movements.
Eye movements or blinks (EOG artifacts).
Electrical noise (e.g., from the environment or the device itself).
Heartbeats (ECG artifacts).
Baseline wander: Slow drifts in the signal due to factors like electrode movement or sweat.
Filtered EEG
Filtered EEG is a processed version of the raw data, designed to remove noise and artifacts while preserving brain activity of interest. Common filters applied:
High-pass filter: Removes slow drifts (baseline wander).
Low-pass filter: Eliminates high-frequency noise (e.g., muscle artifacts or device interference).
Bandpass filter: Focuses on specific frequency bands (e.g., alpha waves between 8–12 Hz).
Notch filter: Removes powerline noise (e.g., 50/60 Hz from electrical equipment).
Filtering helps enhance the signal-to-noise ratio, making it easier to analyze brain activity patterns without interference from unwanted sources.
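As one concrete case of the filters listed above, the sketch below applies a notch filter with SciPy's signal.iirnotch and signal.filtfilt. The 50 Hz powerline frequency, the quality factor of 30, the helper name apply_notch, and the synthetic test signal are assumptions for illustration; use 60 Hz where that is the mains frequency.

```python
import numpy as np
from scipy import signal


def apply_notch(data, notch_freq=50.0, quality=30.0, sample_rate=250.0):
    """Suppress a narrow band around notch_freq (Hz) with zero-phase filtering."""
    b, a = signal.iirnotch(w0=notch_freq, Q=quality, fs=sample_rate)
    return signal.filtfilt(b, a, data)


# Synthetic 4-second signal: a 10 Hz "alpha-like" sine plus 50 Hz interference.
sample_rate = 250.0
t = np.arange(0, 4, 1 / sample_rate)
alpha = 10.0 * np.sin(2 * np.pi * 10 * t)
powerline = 5.0 * np.sin(2 * np.pi * 50 * t)
cleaned = apply_notch(alpha + powerline, sample_rate=sample_rate)

# Residual error against the clean 10 Hz component, ignoring one second at each edge.
residual_rms = np.sqrt(np.mean((cleaned[250:-250] - alpha[250:-250]) ** 2))
print(residual_rms)
```

A higher quality factor narrows the notch, which removes less of the neighboring spectrum but takes longer to settle at the start and end of the signal.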
How do I process raw EEG data?
You can preprocess your raw EEG data using Python libraries such as SciPy. Here’s a basic band-pass filter to get you started:
import numpy as np
from scipy import signal


def do_bandpass(dataset: np.ndarray, filter_range: list, sample_rate=250) -> np.ndarray:
    """
    Band-pass filter the dataset.

    Parameters:
        dataset (numpy array) : dataset to be band passed
        filter_range (list) : [low, high] cutoff frequencies in Hz (usually 0.5 - 35)
        sample_rate (int) : sampling frequency in Hz

    Returns:
        filtered_data (numpy array) : band passed dataset
    """
    # iirfilter returns the numerator (b) and denominator (a) coefficients, in that order.
    b, a = signal.iirfilter(
        int(3),
        [filter_range[0], filter_range[1]],
        btype="bandpass",
        ftype="butter",
        fs=float(sample_rate),
        output="ba",
    )
    # filtfilt runs the filter forward and backward, so no phase shift is introduced.
    filtered_data = signal.filtfilt(b=b, a=a, x=dataset, padtype=None)
    return filtered_data
- Filter Design (signal.iirfilter): The function first designs a third-order IIR Butterworth filter. The order sets the roll-off slope: attenuation increases gradually, at about 18 dB per octave, outside the passband. This suits applications that need smooth frequency transitions rather than abrupt cutoffs.
  - Butterworth Type (ftype="butter"): The Butterworth filter is chosen for its flat frequency response in the passband, which minimizes ripple and keeps amplitude consistent for frequencies within the range.
  - Bandpass Configuration (btype="bandpass"): The filter attenuates frequencies outside the user-specified range [0.5, 35] Hz by combining a low-pass and a high-pass effect, which isolates the desired band.
  - Cutoff Frequencies: The cutoff frequencies [0.5, 35] are given in Hz. Passing fs=float(sample_rate) lets SciPy normalize them against the sampling rate internally, so the cutoff points correspond to the frequencies of interest. Both cutoffs must lie below the Nyquist frequency (half the sample rate, so 125 Hz at 250 Hz).
- Zero-Phase Filtering (signal.filtfilt): Once the filter is designed, it is applied to the dataset with signal.filtfilt, which performs zero-phase filtering. This method applies the filter forward and then backward, canceling the phase shift introduced by the IIR filter. That matters for applications where signal shape and timing are essential.
  - Padding (padtype=None): No padding is added at the signal edges, so the filter makes no assumption about how the signal continues beyond the recording. The trade-off is that filter transients can distort the first and last samples, especially in short signals. For those, consider one of the padding modes that filtfilt supports, such as its default, "odd".
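To see what the design described above does in practice, one option is to inspect the filter's gain at a few frequencies with signal.freqz. The sketch below repeats the same iirfilter call and prints the gain after forward-backward filtering. The three check frequencies are arbitrary picks for illustration (a slow drift, a passband frequency, and a powerline frequency).

```python
import numpy as np
from scipy import signal

sample_rate = 250.0
filter_range = [0.5, 35.0]

# Same design call as described above: third-order Butterworth bandpass.
b, a = signal.iirfilter(
    3,
    filter_range,
    btype="bandpass",
    ftype="butter",
    fs=sample_rate,
    output="ba",
)

# Evaluate the single-pass gain at a slow drift, a passband, and a powerline frequency.
check_freqs = np.array([0.1, 10.0, 60.0])  # Hz, chosen for illustration
_, response = signal.freqz(b, a, worN=check_freqs, fs=sample_rate)
single_pass_gain = np.abs(response)

# filtfilt runs the filter forward and backward, so the magnitude is applied twice.
zero_phase_gain = single_pass_gain ** 2

for freq, gain in zip(check_freqs, zero_phase_gain):
    print(f"{freq:5.1f} Hz -> gain {gain:.4f}")
```

Because the magnitude response is applied twice, the effective roll-off with filtfilt is steeper than that of the single-pass design.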
How do I know if my EEG data is clean?
To assess the quality of your EEG data, you can use our quality score algorithm. You can generate a quality score for each EEG recording from the console.
You will get a CSV file with the following columns:
timestamp: The UNIX timestamp indicating when the quality score was generated.
signalQuality: The quality score of the EEG recording, ranging from 0 to 100.
You can consider a recording clean if its quality score is above 50. Depending on your task, recordings with a quality score above 30 may still be usable.
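To apply these thresholds to an exported quality file, a sketch like the following can summarize how much of a recording passes. It assumes a header row with the column names timestamp and signalQuality and one score per row. The helper name summarize_quality and the sample rows are hypothetical.

```python
import csv
import io


def summarize_quality(file_obj, clean_threshold=50.0, usable_threshold=30.0):
    """Return the mean score and the share of rows above each threshold."""
    scores = [float(row["signalQuality"]) for row in csv.DictReader(file_obj)]
    if not scores:
        raise ValueError("no quality scores found")
    return {
        "mean_score": sum(scores) / len(scores),
        "clean_fraction": sum(s > clean_threshold for s in scores) / len(scores),
        "usable_fraction": sum(s > usable_threshold for s in scores) / len(scores),
    }


# Made-up rows standing in for a real export; pass open("your_quality.csv") instead.
quality_file = io.StringIO(
    "timestamp,signalQuality\n"
    "1730465950,72\n"
    "1730465960,55\n"
    "1730465970,41\n"
    "1730465980,18\n"
)
summary = summarize_quality(quality_file)
print(summary)
```

If your export contains a single score per recording, the same function still works and the fractions reduce to 0 or 1.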
Where can I find public data for analysis?
We provide a sample dataset for you to get started with analysis. You can download the sample data from this link.