Recording Guidelines

Audio Format

Canary Speech publishes requirements and recommendations for producing audio files that contain voice. Following them keeps audio quality high enough that vocal features are not lost to sampling errors or compression, which makes the analysis as accurate as possible.

When uploading a recording, prefix the raw PCM bytes with a WAV header so that Canary Speech can parse the audio encoding. The HTTP request must include a Content-Type header set to any variation of the audio/wav MIME type.
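
As an illustration of the header requirement above, the following sketch wraps raw PCM bytes in a WAV container using Python's standard wave module and prepares the Content-Type header. The function name pcm_to_wav and its default parameters are assumptions chosen for this example, not part of any Canary Speech SDK, and the upload call itself is omitted because the endpoint is not described here.

```python
import io
import wave


def pcm_to_wav(pcm_bytes: bytes, sample_rate: int = 16000,
               channels: int = 1, sample_width: int = 2) -> bytes:
    """Prefix raw PCM bytes with a WAV header (hypothetical helper)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)      # 1 channel per speaker
        wav_file.setsampwidth(sample_width)  # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)   # samples per second
        wav_file.writeframes(pcm_bytes)      # header sizes are filled in here
    return buffer.getvalue()


# One second of 16-bit mono silence at 16,000 samples per second.
pcm = b"\x00\x00" * 16000
wav_bytes = pcm_to_wav(pcm)

# Header to send with the upload request (the request itself is not shown).
headers = {"Content-Type": "audio/wav"}
```

The resulting wav_bytes value can be used as the request body by whichever HTTP client the application already uses.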

Minimum Configuration
Codec	Uncompressed (WAV)
Sample Rate	16,000 samples per second
Bit Depth	16 bits (2 bytes) per sample
Channel Count	1 per speaker
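
A recording can be checked against the minimum configuration before it is uploaded. The sketch below reads the WAV header with Python's standard wave module and reports any values below the minimums listed above. The function name check_wav, the speakers parameter, and the reading of "1 per speaker" as "channel count equals number of speakers" are assumptions made for this example.

```python
import io
import wave

MIN_SAMPLE_RATE = 16000   # samples per second
MIN_SAMPLE_WIDTH = 2      # bytes per sample (16 bits)


def check_wav(wav_bytes: bytes, speakers: int = 1) -> list[str]:
    """Return a list of problems found in the WAV header (hypothetical helper)."""
    problems = []
    # wave.open raises wave.Error if the data is not a readable PCM WAV file.
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav_file:
        if wav_file.getframerate() < MIN_SAMPLE_RATE:
            problems.append(f"sample rate {wav_file.getframerate()} is below {MIN_SAMPLE_RATE}")
        if wav_file.getsampwidth() < MIN_SAMPLE_WIDTH:
            problems.append(f"bit depth {wav_file.getsampwidth() * 8} is below {MIN_SAMPLE_WIDTH * 8}")
        # Assumption: one channel is expected for each speaker in the recording.
        if wav_file.getnchannels() != speakers:
            problems.append(f"{wav_file.getnchannels()} channel(s) for {speakers} speaker(s)")
    return problems
```

An empty list means the header values meet the minimum configuration; it does not say anything about the quality of the audio itself.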

Recommended Configuration
Codec	Uncompressed (WAV)
Sample Rate	48,000 samples per second
Bit Depth	16 bits (2 bytes) per sample
Channel Count	1 per speaker
