Glossary

Sensor Module

DQM

Data Quality Metric. The metric generated by the pipeline, comprising all the quality assessment metrics that are related to the user’s research.

Correctness

One of the three aspects that the pipeline evaluates. It measures the overall level of the representational and value integrity of the input data via APD and SNR.

Completeness

One of the three aspects that the pipeline evaluates. It measures the overall level of availability and validity of the input data via IRLR, MDR, SCR, and VDR.

Consistency

One of the three aspects that the pipeline evaluates. It measures the overall uniformity of the input data via RLC, SRC, and VRC.

APD

Anomalous Point Density. The ratio of the number of outlier (anomalous) data samples to the total number of samples.
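
As an illustration of the ratio, the sketch below counts outliers with the Tukey fence rule (1.5 times the interquartile range). The glossary does not say which outlier detector the pipeline uses, so the rule, the function name `anomalous_point_density`, and the parameter `k` are assumptions made for this example only.

```python
import statistics

def anomalous_point_density(samples, k=1.5):
    """Fraction of samples outside the Tukey fences (assumed outlier rule)."""
    if len(samples) < 2:
        return 0.0  # too few points to estimate quartiles
    q1, _, q3 = statistics.quantiles(samples, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    outliers = sum(1 for x in samples if x < lower or x > upper)
    return outliers / len(samples)

samples = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 100.0]
apd = anomalous_point_density(samples)  # only 100.0 lies outside the fences
```

A lower APD means fewer anomalous samples relative to the size of the data.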

IRLR

Interpretable Record Length Ratio. The ratio of the number of interpretable records to the total number of records. A record is interpretable if it satisfies all of the following: (a) it has more than one row of data; (b) its timestamps are non-decreasing; (c) it has a non-zero standard deviation on each data channel.
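
The three conditions can be sketched as follows. The record layout (a list of rows, each row a tuple of a timestamp followed by channel values) and the names `is_interpretable` and `irlr` are hypothetical, chosen only to make the example self-contained.

```python
import statistics

def is_interpretable(record):
    """Check conditions (a) to (c) for one record (assumed layout: rows of (t, ch1, ch2, ...))."""
    # (a) more than one row of data
    if len(record) <= 1:
        return False
    # (b) timestamps never decrease
    timestamps = [row[0] for row in record]
    if any(later < earlier for earlier, later in zip(timestamps, timestamps[1:])):
        return False
    # (c) every channel has non-zero standard deviation
    channels = zip(*(row[1:] for row in record))
    return all(statistics.pstdev(channel) > 0 for channel in channels)

def irlr(records):
    """Interpretable records divided by all records."""
    if not records:
        return 0.0
    return sum(1 for r in records if is_interpretable(r)) / len(records)

records = [
    [(0, 1.0, 5.0), (1, 2.0, 6.0)],  # interpretable
    [(0, 1.0, 2.0)],                 # fails (a): single row
    [(1, 1.0, 2.0), (0, 2.0, 3.0)],  # fails (b): timestamp decreases
    [(0, 1.0, 2.0), (1, 1.0, 3.0)],  # fails (c): first channel is constant
]
ratio = irlr(records)
```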

MDR

Missing Data Ratio. The ratio of the number of missing data samples to the total number of samples. It measures the level of discontinuity caused by data points that were skipped due to irregular sampling.
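
One way to picture this ratio, assuming the nominal sampling interval is known: compare the number of samples that a regular grid would contain with the number actually present. The function name `missing_data_ratio`, the `interval` parameter, and the choice of the expected sample count as the denominator are assumptions of this sketch, not the pipeline's documented method.

```python
def missing_data_ratio(timestamps, interval):
    """Share of expected samples that are absent, given a nominal sampling interval."""
    if len(timestamps) < 2:
        return 0.0
    span = timestamps[-1] - timestamps[0]
    expected = round(span / interval) + 1         # samples on a regular grid
    missing = max(expected - len(timestamps), 0)  # skipped grid points
    return missing / expected

# Samples at t=3 and t=4 were skipped: 5 present out of 7 expected.
mdr = missing_data_ratio([0.0, 1.0, 2.0, 5.0, 6.0], interval=1.0)
```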

RLC

Record Length Consistency. The uniformity of data record length across multiple records.

SCR

Sensor Channel Ratio. The ratio of the number of records that have full channels to the total number of records. A record is defined to have full channels if its number of channels equals the mode of the channel counts across all records in the given input.
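
A minimal sketch of this definition, assuming the channel count of each record has already been extracted into a list; the function name `sensor_channel_ratio` is hypothetical.

```python
import statistics

def sensor_channel_ratio(channel_counts):
    """Records whose channel count equals the mode, divided by all records."""
    if not channel_counts:
        return 0.0
    full = statistics.mode(channel_counts)  # most common channel count
    return sum(1 for c in channel_counts if c == full) / len(channel_counts)

# Three records have 3 channels (the mode); one record has only 2.
scr = sensor_channel_ratio([3, 3, 2, 3])
```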

SNR

Signal-to-Noise Ratio. The ratio of the desired signal amplitude to the noise amplitude. This metric gives insight into the level of noise rather than its type, indicating how much denoising the data requires.
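
The ratio itself can be illustrated as below, assuming the signal and noise components are already available as separate sequences and that amplitude is taken as the root mean square. How the pipeline separates signal from noise is not described here, so that step is left out; the names `rms` and `snr` are hypothetical.

```python
import math

def rms(values):
    """Root mean square amplitude of a sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def snr(signal, noise):
    """Amplitude ratio and its decibel equivalent (20 * log10 of the ratio)."""
    ratio = rms(signal) / rms(noise)
    return ratio, 20 * math.log10(ratio)

signal = [2.0, -2.0, 2.0, -2.0]
noise = [0.5, -0.5, 0.5, -0.5]
ratio, ratio_db = snr(signal, noise)
```

A higher value indicates that less denoising is likely to be needed.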

SRC

Sampling Rate Consistency. The uniformity of the data sampling rate within a record.

VRC

Value Range Consistency. The uniformity of the value range across multiple records.

VDR

Valid Data Ratio. The ratio of the number of non-NaN data points to the total number of data points.
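
A minimal sketch of this ratio over a flat list of floating-point values; the function name `valid_data_ratio` is hypothetical.

```python
import math

def valid_data_ratio(values):
    """Non-NaN data points divided by all data points."""
    if not values:
        return 0.0
    valid = sum(1 for v in values if not math.isnan(v))
    return valid / len(values)

vdr = valid_data_ratio([1.0, float("nan"), 3.0, float("nan")])
```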

Audio Module

Observation Duration

The length of the audio file(s), in seconds.

Sampling Rate

The number of samples per second of the audio file(s), in Hertz.

Voice Classification

The sounds detected in the audio file(s) by YAMNet, a deep neural network.

RMS

The root mean square (RMS) value of the audio file(s). It can be used to compare the loudness of audio files.
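
For reference, the RMS of a sequence of audio samples is the square root of the mean of the squared samples. The sketch below assumes the samples have already been decoded into a list of floats; the function name `rms_amplitude` is hypothetical.

```python
import math

def rms_amplitude(samples):
    """Square root of the mean of the squared sample values."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

quiet = [0.1, -0.1, 0.1, -0.1]
loud = [0.5, -0.5, 0.5, -0.5]
louder_first = rms_amplitude(loud) > rms_amplitude(quiet)
```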

Video Module

Duration

The length of the video file(s), in seconds.

Resolution

The number of pixels contained in each frame of the video file(s), in pixels.

Format

The extension of the video file(s).

Bit Rate

The number of bits per second of the video file(s), in bits/second.

Frame Rate

The number of frames per second of the video file(s), in frames/second.

Illumination

The average pixel magnitude of the RGB channels of the video file(s).
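
As a rough illustration, the sketch below averages every R, G, and B value over all pixels of all frames. The frame layout (a list of rows, each row a list of `(r, g, b)` tuples) and the function name `illumination` are assumptions; the module may sample frames or weight channels differently.

```python
def illumination(frames):
    """Mean of all R, G, and B values across every pixel of every frame."""
    total = 0
    count = 0
    for frame in frames:
        for row in frame:
            for r, g, b in row:
                total += r + g + b
                count += 3
    return total / count if count else 0.0

# One frame containing a black pixel and a white pixel.
frames = [[[(0, 0, 0), (255, 255, 255)]]]
level = illumination(frames)
```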

Creation Date

The creation date of the media itself, as recorded in its metadata, rather than the creation date of the file. If this field is missing from the metadata, the module returns N/A.

Artifact Ratio

The ratio of the number of distorted frames in the video file(s) to the total number of frames, expressed as a fraction.