Canary Speech publishes several requirements and recommendations for producing audio files that contain voice. Following them keeps audio quality high enough that vocal features are not lost to sampling errors or compression, which keeps the analysis as accurate as possible.
Minimum Configuration | |
---|---|
Codec | |
Sample Rate | 16,000 samples per second |
Bit Depth | 16 bits (2 bytes) per sample |
Channel Count | 1 per speaker |
Recommended Configuration | |
---|---|
Codec | Uncompressed (PCM, WAV) |
Sample Rate | 48,000 samples per second |
Bit Depth | 16 bits (2 bytes) per sample |
Channel Count | 1 per speaker |
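
As an illustration of how the values in these tables might be checked before upload, here is a minimal sketch using Python's standard `wave` module. It is not part of any Canary Speech tooling. The function name `check_wav`, the `speaker_count` parameter, and the choice to treat the minimum sample rate and bit depth as lower bounds are assumptions made for this example.

```python
import wave

# Values copied from the configuration tables above.
MIN_SAMPLE_RATE_HZ = 16_000
MIN_SAMPLE_WIDTH_BYTES = 2  # 16 bits per sample


def check_wav(source, speaker_count=1):
    """Return a list of problems found in a WAV header (empty list = passes).

    `source` may be a file path or a binary file-like object.
    `speaker_count` is a hypothetical parameter: the tables ask for one
    channel per speaker, so the caller has to say how many speakers to expect.
    """
    try:
        with wave.open(source, "rb") as wav:
            sample_rate = wav.getframerate()
            sample_width = wav.getsampwidth()
            channels = wav.getnchannels()
    except (wave.Error, EOFError) as exc:
        # The stdlib wave module only reads uncompressed PCM WAV data,
        # so other containers and compressed formats end up here.
        return [f"not a readable PCM WAV file: {exc}"]

    problems = []
    # Assumption: the minimum values are treated as lower bounds.
    if sample_rate < MIN_SAMPLE_RATE_HZ:
        problems.append(
            f"sample rate {sample_rate} Hz is below {MIN_SAMPLE_RATE_HZ} Hz"
        )
    if sample_width < MIN_SAMPLE_WIDTH_BYTES:
        problems.append(
            f"bit depth {sample_width * 8} bits is below "
            f"{MIN_SAMPLE_WIDTH_BYTES * 8} bits"
        )
    if channels != speaker_count:
        problems.append(
            f"{channels} channel(s) found, expected {speaker_count} "
            "(one per speaker)"
        )
    return problems
```

Calling `check_wav("recording.wav")`, where the path is a placeholder, returns an empty list when the header satisfies the minimum configuration for a single speaker. The check only reads the header; it does not judge how the audio was recorded or whether it was upsampled from a lower-quality source.
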
In addition to the file format, Canary Speech publishes a recommended convention for file names. The convention is not required, but it reduces the likelihood of filename collisions and aids debugging should manual review ever become necessary.
Example: