C#
Installation
Install the NuGet package from https://www.nuget.org/packages/AprilAsr
Getting Started
To get started, import the AprilAsr namespace:
using AprilAsr;
Model
You can load a model like so:
string modelPath = "/path/to/model.april";
AprilModel model = new AprilModel(modelPath);
Models have a few metadata fields:
string name = model.Name;
string description = model.Description;
string language = model.Language;
int sampleRate = model.SampleRate;
Session
A session needs a callback. You can define one inline; this example concatenates the tokens into a string and prints it.
AprilSession session = new AprilSession(model, (result, tokens) => {
    if (tokens == null) return;

    string s = "";
    if (result == AprilResultKind.PartialRecognition) {
        s = "- ";
    } else if (result == AprilResultKind.FinalRecognition) {
        s = "@ ";
    } else {
        s = " ";
    }

    foreach (AprilToken token in tokens) {
        s += token.Token;
    }

    Console.WriteLine(s);
});
Session Options
A session can be created with additional options. Here is the constructor signature:
public AprilSession(AprilModel model, SessionCallback callback, bool async = false, bool noRT = false, string speakerName = "")
Refer to the General Concepts page for an explanation of the asynchronous, non-realtime, and speaker name options.
Feed data
Most of the examples use a very simple method like this to load and feed audio:
// Read the file data (assumes the WAV file is 16-bit PCM)
var fileData = File.ReadAllBytes(wavFilePath);
short[] shorts = new short[fileData.Length / 2];
// Copy only whole samples; a trailing odd byte would overflow the array
Buffer.BlockCopy(fileData, 0, shorts, 0, shorts.Length * sizeof(short));

// Feed the data
session.FeedPCM16(shorts, shorts.Length);
This works only if the WAV file is PCM16 and sampled at the model's sample rate. If you load an MP3, a non-PCM16 or non-16 kHz WAV file, or any other audio file this way, you will likely get gibberish or no results.
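The PCM16 and sample-rate requirements can be checked before any audio is fed. The following is a minimal sketch, not part of AprilAsr: `ParseWav` and `LoadPcm16` are hypothetical helper names. It assumes a canonical little-endian RIFF/WAVE file and, as a further assumption not stated above, mono audio. It also skips the WAV header instead of feeding it as samples.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;
using System.Text;

// Hypothetical helper (not part of AprilAsr): walks the RIFF chunks of a WAV
// file and reports the format fields plus where the sample data lives.
static (int Format, int Channels, int SampleRate, int Bits, int DataOffset, int DataLength)
    ParseWav(byte[] file)
{
    if (file.Length < 12
        || Encoding.ASCII.GetString(file, 0, 4) != "RIFF"
        || Encoding.ASCII.GetString(file, 8, 4) != "WAVE")
        throw new InvalidDataException("Not a RIFF/WAVE file.");

    int format = 0, channels = 0, sampleRate = 0, bits = 0;
    bool sawFmt = false;
    int pos = 12; // first chunk starts right after "RIFF<size>WAVE"
    while (pos + 8 <= file.Length)
    {
        string id = Encoding.ASCII.GetString(file, pos, 4);
        long size = BinaryPrimitives.ReadUInt32LittleEndian(file.AsSpan(pos + 4, 4));
        int body = pos + 8;

        if (id == "fmt " && size >= 16 && body + 16 <= file.Length)
        {
            format = BinaryPrimitives.ReadUInt16LittleEndian(file.AsSpan(body, 2));
            channels = BinaryPrimitives.ReadUInt16LittleEndian(file.AsSpan(body + 2, 2));
            sampleRate = BinaryPrimitives.ReadInt32LittleEndian(file.AsSpan(body + 4, 4));
            bits = BinaryPrimitives.ReadUInt16LittleEndian(file.AsSpan(body + 14, 2));
            sawFmt = true;
        }
        else if (id == "data" && sawFmt)
        {
            // Clamp in case the header overstates the payload size.
            int length = (int)Math.Min(size, file.Length - body);
            return (format, channels, sampleRate, bits, body, length);
        }

        long next = body + size + (size & 1); // chunks are padded to even sizes
        if (next > file.Length) break;
        pos = (int)next;
    }
    throw new InvalidDataException("No fmt/data chunk pair found.");
}

// Hypothetical helper: copies the data chunk into 16-bit samples, rejecting
// anything that is not mono PCM16 (mono is an assumption) at the expected rate.
static short[] LoadPcm16(byte[] file, int expectedSampleRate)
{
    var wav = ParseWav(file);
    if (wav.Format != 1 || wav.Bits != 16 || wav.Channels != 1)
        throw new InvalidDataException("Expected mono 16-bit PCM.");
    if (wav.SampleRate != expectedSampleRate)
        throw new InvalidDataException(
            $"Expected {expectedSampleRate} Hz, got {wav.SampleRate} Hz.");

    // BlockCopy keeps the byte order as-is, which is correct on little-endian hosts.
    short[] samples = new short[wav.DataLength / 2];
    Buffer.BlockCopy(file, wav.DataOffset, samples, 0, samples.Length * sizeof(short));
    return samples;
}
```

With helpers like these, the load step could become `short[] shorts = LoadPcm16(File.ReadAllBytes(wavFilePath), model.SampleRate);`, followed by the same `session.FeedPCM16(shorts, shorts.Length);` call. A file in the wrong format then fails with a clear exception instead of producing gibberish.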
Asynchronous
Asynchronous sessions are a little more complicated. You can create one by setting the asynchronous flag to true:
AprilSession session = new AprilSession(..., async: true);
Now, when feeding audio, be sure to feed it in realtime.
var fileData = File.ReadAllBytes(wavFilePath);
short[] shorts = new short[2400];

for (int i = 0; i < (fileData.Length / 2); i += shorts.Length) {
    int size = Math.Min(shorts.Length, (fileData.Length / 2) - i);
    Buffer.BlockCopy(fileData, i * 2, shorts, 0, size * 2);
    session.FeedPCM16(shorts, size);
    Thread.Sleep(size * 1000 / model.SampleRate);
}

session.Flush();
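The loop above sleeps for the duration of each chunk it has just fed: 2400 samples at 16000 Hz last 150 ms. Because the feed call itself takes time, a fixed sleep per chunk can slowly fall behind real time. The following sketch shows one possible alternative that paces against a running clock. `PlanChunks` and `FeedPaced` are hypothetical helper names, not part of AprilAsr, and the `feed` delegate stands in for the copy-and-`FeedPCM16` step.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper (not part of AprilAsr): splits totalSamples into chunks
// of at most chunkSamples and reports when each chunk ends on the audio timeline.
static IEnumerable<(int Offset, int Count, TimeSpan EndsAt)> PlanChunks(
    int totalSamples, int chunkSamples, int sampleRate)
{
    for (int offset = 0; offset < totalSamples; offset += chunkSamples)
    {
        int count = Math.Min(chunkSamples, totalSamples - offset);
        // Integer tick arithmetic: (samples fed so far) / sampleRate seconds.
        long endTicks = (long)(offset + count) * TimeSpan.TicksPerSecond / sampleRate;
        yield return (offset, count, TimeSpan.FromTicks(endTicks));
    }
}

// Hypothetical helper: calls feed(offset, count) for each chunk, then waits
// until the wall clock catches up with the end of that chunk.
static void FeedPaced(int totalSamples, int chunkSamples, int sampleRate,
                      Action<int, int> feed)
{
    var clock = Stopwatch.StartNew();
    foreach (var chunk in PlanChunks(totalSamples, chunkSamples, sampleRate))
    {
        feed(chunk.Offset, chunk.Count);
        TimeSpan wait = chunk.EndsAt - clock.Elapsed;
        if (wait > TimeSpan.Zero) Thread.Sleep(wait);
    }
}

// Print the schedule for 5000 samples fed in 2400-sample chunks at 16 kHz.
foreach (var chunk in PlanChunks(totalSamples: 5000, chunkSamples: 2400, sampleRate: 16000))
{
    Console.WriteLine($"offset={chunk.Offset} count={chunk.Count} endsAt={chunk.EndsAt}");
}
```

In the asynchronous example, the `feed` delegate could copy `count` samples starting at byte `offset * 2` into `shorts` with `Buffer.BlockCopy` and then call `session.FeedPCM16(shorts, count)`, with `session.Flush()` after `FeedPaced` returns.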
Complete example
using System;
using System.IO;
using AprilAsr;
var modelPath = "aprilv0_en-us.april";
var wavFilePath = "audio.wav";
// Load the model and print metadata
var model = new AprilModel(modelPath);
Console.WriteLine("Name: " + model.Name);
Console.WriteLine("Description: " + model.Description);
Console.WriteLine("Language: " + model.Language);
// Create the session with an inline callback
var session = new AprilSession(model, (result, tokens) => {
    // Some results carry no tokens; skip them
    if (tokens == null) return;

    string s = "";
    if (result == AprilResultKind.PartialRecognition) {
        s = "- ";
    } else if (result == AprilResultKind.FinalRecognition) {
        s = "@ ";
    } else {
        s = " ";
    }

    foreach (AprilToken token in tokens) {
        s += token.Token;
    }

    Console.WriteLine(s);
});
// Read the file data (assumes the WAV file is 16-bit PCM)
var fileData = File.ReadAllBytes(wavFilePath);
short[] shorts = new short[fileData.Length / 2];
// Copy only whole samples; a trailing odd byte would overflow the array
Buffer.BlockCopy(fileData, 0, shorts, 0, shorts.Length * sizeof(short));

// Feed the data and flush
session.FeedPCM16(shorts, shorts.Length);
session.Flush();
Congratulations! You have just performed speech recognition using AprilAsr!