C#

Installation

Install the NuGet package from https://www.nuget.org/packages/AprilAsr

Getting Started

To get started, import the AprilAsr namespace:

using AprilAsr;

Model

You can load a model like so:

string modelPath = "/path/to/model.april";
AprilModel model = new AprilModel(modelPath);

Models have a few metadata fields:

string name = model.Name;
string description = model.Description;
string language = model.Language;
int sampleRate = model.SampleRate;

Session

A session needs a callback, which you can define inline. This example concatenates the tokens into a string and prints it, prefixed with a marker for the result kind.

AprilSession session = new AprilSession(model, (result, tokens) => {
    if (tokens == null) return;

    string s = "";
    if(result == AprilResultKind.PartialRecognition) {
        s = "- ";
    }else if(result == AprilResultKind.FinalRecognition) {
        s = "@ ";
    }else{
        s = " ";
    }

    foreach(AprilToken token in tokens) {
        s += token.Token;
    }

    Console.WriteLine(s);
});
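In practice a callback often needs to keep a running transcript rather than print every result. The following is a minimal sketch of that bookkeeping. It assumes, as the result kind names suggest, that each partial result supersedes the previous one and that a final result closes the current utterance. OnResult and Snapshot are hypothetical helpers, not part of AprilAsr. To keep the sketch runnable without a model, it takes a bool and an already concatenated string in place of AprilResultKind and the token array.

```csharp
using System;
using System.Text;

// Text of finished utterances, plus the latest partial guess for the current one.
var committed = new StringBuilder();
string pending = "";

// Hypothetical stand-in for the session callback: isFinal plays the role of
// AprilResultKind.FinalRecognition, text is the concatenated token strings.
void OnResult(bool isFinal, string text)
{
    if (isFinal)
    {
        // The utterance is complete: move it into the committed transcript.
        committed.Append(text.Trim()).Append('\n');
        pending = "";
    }
    else
    {
        // A newer partial result replaces the older one.
        pending = text;
    }
}

// Committed text followed by whatever is still in progress.
string Snapshot() => committed.ToString() + pending;

// Simulated sequence of results for two utterances.
OnResult(false, " hello");
OnResult(false, " hello wor");
OnResult(true, " hello world");
OnResult(false, " how");
Console.WriteLine(Snapshot());
```

Inside a real callback, the same logic would run after the foreach loop has built the string from the tokens.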

Session Options

A session can be created with additional options. Here is the constructor signature:

public AprilSession(AprilModel model, SessionCallback callback, bool async = false, bool noRT = false, string speakerName = "")

Refer to the General Concepts page for an explanation of the asynchronous, non-realtime, and speaker name options.

Feed data

Most of the examples use a very simple method like this to load and feed audio:

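// Read the file data (assumes the wav file is 16-bit PCM; the header bytes are fed along with the audio)
var fileData = File.ReadAllBytes(wavFilePath);
short[] shorts = new short[fileData.Length / 2];
// Copy only whole samples so that an odd-length file does not overrun the array
Buffer.BlockCopy(fileData, 0, shorts, 0, shorts.Length * sizeof(short));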

// Feed the data
session.FeedPCM16(shorts, shorts.Length);

This works only if the wav file is 16-bit PCM and sampled at the sample rate the model expects (model.SampleRate). If you load an mp3, a wav file that is not PCM16 or not sampled at that rate (16 kHz in these examples), or any other audio format in this way, you will likely get gibberish or no results.
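The caveat above can be checked in code before any audio is fed. The following is a minimal sketch, not part of AprilAsr: ReadPcm16Mono is a hypothetical helper that walks the RIFF chunks of a wav file, rejects anything that is not 16-bit PCM at the expected sample rate, and returns only the samples from the data chunk, without the header. It additionally assumes that the session expects mono audio and that the code runs on a little-endian machine, as the Buffer.BlockCopy approach does too.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper (not part of AprilAsr): returns the samples of a WAV
// file only if it is mono 16-bit PCM at expectedRate, otherwise throws.
static short[] ReadPcm16Mono(byte[] wav, int expectedRate)
{
    if (wav.Length < 12 ||
        Encoding.ASCII.GetString(wav, 0, 4) != "RIFF" ||
        Encoding.ASCII.GetString(wav, 8, 4) != "WAVE")
        throw new InvalidDataException("Not a RIFF/WAVE file");

    bool formatSeen = false;
    int pos = 12; // first chunk starts right after the 12-byte RIFF header
    while (pos + 8 <= wav.Length)
    {
        string id = Encoding.ASCII.GetString(wav, pos, 4);
        long size = BitConverter.ToUInt32(wav, pos + 4);
        int body = pos + 8;

        if (id == "fmt ")
        {
            if (size < 16 || body + 16 > wav.Length)
                throw new InvalidDataException("Truncated fmt chunk");
            int format = BitConverter.ToUInt16(wav, body);       // 1 means PCM
            int channels = BitConverter.ToUInt16(wav, body + 2);
            int rate = BitConverter.ToInt32(wav, body + 4);
            int bits = BitConverter.ToUInt16(wav, body + 14);
            if (format != 1 || channels != 1 || bits != 16 || rate != expectedRate)
                throw new InvalidDataException(
                    $"Unsupported audio: format={format}, channels={channels}, bits={bits}, rate={rate}");
            formatSeen = true;
        }
        else if (id == "data")
        {
            if (!formatSeen)
                throw new InvalidDataException("data chunk appears before fmt chunk");
            // Clamp in case the header claims more bytes than the file holds.
            int bytes = (int)Math.Min(size, wav.Length - body);
            short[] samples = new short[bytes / 2];
            Buffer.BlockCopy(wav, body, samples, 0, samples.Length * sizeof(short));
            return samples;
        }

        // Chunks are padded to an even number of bytes.
        long next = body + size + (size & 1);
        if (next > wav.Length) break;
        pos = (int)next;
    }
    throw new InvalidDataException("No data chunk found");
}
```

With such a helper, the feeding code could become ReadPcm16Mono(File.ReadAllBytes(wavFilePath), model.SampleRate) followed by session.FeedPCM16(samples, samples.Length). Files that fail the check would still need to be converted with an external tool.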

Asynchronous

Asynchronous sessions are a little more complicated. You can create one by setting the asynchronous flag to true:

AprilSession session = new AprilSession(..., async: true);

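When feeding audio to an asynchronous session, be sure to feed it in real time, that is, no faster than it would arrive from a live source. The loop below does this by sleeping for the duration of each chunk.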

var fileData = File.ReadAllBytes(wavFilePath);
short[] shorts = new short[2400]; // chunk buffer (150 ms at 16 kHz)

// i counts samples, so byte offsets and byte counts are doubled
for(int i=0; i<(fileData.Length/2); i+=shorts.Length){
    int size = Math.Min(shorts.Length, (fileData.Length/2) - i);
    Buffer.BlockCopy(fileData, i*2, shorts, 0, size*2);
    session.FeedPCM16(shorts, size);
    // Wait for as long as the chunk takes to play back
    Thread.Sleep(size * 1000 / model.SampleRate);
}

session.Flush();
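The loop above sleeps for each chunk's nominal duration, so any time spent inside the feed call itself is added on top and the feed can drift slightly behind real time. The following is a minimal sketch of one way to pace against a wall clock instead. DelayMs and FeedPaced are hypothetical helpers, not part of AprilAsr, and the feed delegate stands in for a call such as session.FeedPCM16(chunk, size).

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper: milliseconds to wait so that, once samplesFed samples
// have been fed, wall-clock time matches the audio's own duration.
static int DelayMs(long samplesFed, int sampleRate, long elapsedMs)
{
    long audioMs = samplesFed * 1000 / sampleRate;
    return (int)Math.Max(0, audioMs - elapsedMs); // never negative
}

// Feeds samples in chunks, pacing each chunk against a stopwatch.
static void FeedPaced(short[] samples, int sampleRate, int chunkSize,
                      Action<short[], int> feed)
{
    var clock = Stopwatch.StartNew();
    short[] chunk = new short[chunkSize];
    for (int i = 0; i < samples.Length; i += chunkSize)
    {
        int size = Math.Min(chunkSize, samples.Length - i);
        Array.Copy(samples, i, chunk, 0, size);
        feed(chunk, size);
        Thread.Sleep(DelayMs(i + size, sampleRate, clock.ElapsedMilliseconds));
    }
}

// Demo with a quarter second of silence and a delegate that only counts samples.
int fed = 0;
FeedPaced(new short[4000], 16000, 1600, (chunk, size) => fed += size);
Console.WriteLine(fed); // prints 4000
```

Because the delay is computed from the total number of samples fed so far, a slow iteration is compensated by a shorter sleep in the next one.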

Complete example

using System;
using System.IO;
using AprilAsr;

var modelPath = "aprilv0_en-us.april";
var wavFilePath = "audio.wav";

// Load the model and print metadata
var model = new AprilModel(modelPath);
Console.WriteLine("Name: " + model.Name);
Console.WriteLine("Description: " + model.Description);
Console.WriteLine("Language: " + model.Language);

// Create the session with an inline callback
var session = new AprilSession(model, (result, tokens) => {
    // Skip results that carry no tokens
    if (tokens == null) return;

    string s = "";
    if(result == AprilResultKind.PartialRecognition) {
        s = "- ";
    }else if(result == AprilResultKind.FinalRecognition) {
        s = "@ ";
    }else{
        s = " ";
    }

    foreach(AprilToken token in tokens) {
        s += token.Token;
    }

    Console.WriteLine(s);
});

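// Read the file data (assumes the wav file is 16-bit PCM; the header bytes are fed along with the audio)
var fileData = File.ReadAllBytes(wavFilePath);
short[] shorts = new short[fileData.Length / 2];
// Copy only whole samples so that an odd-length file does not overrun the array
Buffer.BlockCopy(fileData, 0, shorts, 0, shorts.Length * sizeof(short));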

// Feed the data and flush
session.FeedPCM16(shorts, shorts.Length);
session.Flush();

Congratulations! You have just performed speech recognition using AprilAsr!
