Virtual Device API Documentation

Overview

We provide APIs to interact with our virtual devices programmatically. These APIs can be accessed via Node.js or HTTP.

They are easy to work with: simply send a payload with what you want to "say" to Alexa or Google, such as:

virtualDevice.message("ask my skill what is the weather").then((result) => {
    console.log(result.transcript); // Prints out the reply from Alexa - e.g., "the weather is nice"
});

It is as easy as that! For more information on how our end-to-end testing works, read here.

Node.js API

Installation

  1. Add the Virtual Device SDK to your project:
    npm install virtual-device-sdk --save
    
  2. Get your token: Follow the instructions here.

Constructor parameters

  • token: Your virtual device token; see here for how to obtain it
  • locale: The locale to use; defaults to en-US
  • voiceID: The Polly voice to use with the current locale; defaults to the locale's default voice
  • skipSTT: Skip speech-to-text for Google (Google can return text directly); defaults to false
  • asyncMode: Process the conversation sent in batch asynchronously; defaults to false
  • stt: The speech-to-text service to use (google or witai); defaults to google
  • locationLat: Latitude of the location used with Google virtual devices
  • locationLong: Longitude of the location used with Google virtual devices
  • conversationId: Set a conversation id in advance for the batch process in async mode

Sending a Message

Here is a simple example in JavaScript:

const vdSDK = require("virtual-device-sdk");
const locale = "en-US";
const voiceId = "Joey";
const virtualDevice = new vdSDK.VirtualDevice("<PUT_YOUR_TOKEN_HERE>", locale, voiceId);
virtualDevice.message("open my skill").then((result) => {
    console.log("Reply Transcript: " + result.transcript);
    console.log("Reply Card Title: " + result.card.mainTitle);
});

Result Payload

Here is the full result payload:

export interface IVirtualDeviceResult {
    card: ICard | null;
    debug?: {
        rawJSON: any;
    };
    sessionTimeout: number;
    streamURL: string | null;
    transcript: string;
    transcriptAudioURL: string;
}

export interface ICard {
    imageURL: string | null;
    mainTitle: string | null;
    subTitle: string | null;
    textField: string;
    type: string;
}
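For illustration, here is a sketch of consuming a result while guarding the nullable fields. The result object below is hard-coded sample data in the shape above, not a real response:

```javascript
// Hard-coded sample data matching IVirtualDeviceResult (not a real response).
const result = {
    card: {
        imageURL: null,
        mainTitle: "Weather",
        subTitle: null,
        textField: "The weather is nice",
        type: "Standard"
    },
    sessionTimeout: 0,
    streamURL: null,
    transcript: "the weather is nice",
    transcriptAudioURL: "https://example.com/transcript.mp3"
};

// card and streamURL can be null, so check before dereferencing them.
const title = result.card ? result.card.mainTitle : "(no card)";
console.log(result.transcript + " / " + title);
```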

Adding homophones

Our end-to-end tests use speech recognition for turning the output speech coming back from the virtual device into text.

This process is imperfect - to compensate for this, homophones can be specified for errors that occur when a reply from the virtual device is misunderstood.

Before sending your message, call the "addHomophones" method to specify the homophones you expect:

const virtualDevice = new vdSDK.VirtualDevice("<PUT_YOUR_TOKEN_HERE>", locale, voiceId);

virtualDevice.addHomophones("white", ["wife", "while"]);

virtualDevice.message("open my skill").then((result) => {
    console.log("Reply Transcript: " + result.transcript);
    console.log("Reply Card Title: " + result.card.mainTitle);
});

Sending messages in batch

You can send multiple messages and expected phrases in an object array. The goal of this method is to handle a complete interaction with the virtual device. By sending all messages to the endpoint, it is able to sequence them faster, avoiding session timeouts.

virtualDevice.batchMessage(
    [{text: "what is the weather"}, {text:  "what time is it"}, {text: "tell test player to play"}],
).then((results) => {
    console.log("Results: ", results);
});

The result will be an array of objects as defined in the Result Payload section.

Processing the results asynchronously

By default, batchMessage returns a promise that resolves once the complete interaction is finished. By setting async mode in the constructor of the SDK instance, you can change that behavior so that it returns a conversation id, which can be used to retrieve the results progressively.

const locale = "en-US";
const voiceId = "Joey";
const skipSTT = false;
const asyncMode = true;

const virtualDevice = new vdSDK.VirtualDevice("<PUT_YOUR_TOKEN_HERE>", locale, voiceId, skipSTT, asyncMode);

virtualDevice.batchMessage(
    [{text: "what is the weather"}, {text: "what time is it"}, {text: "tell test player to play"}],
).then((result) => {
    console.log("Conversation Id: ", result.conversation_id);

    // logic to wait some time

    return virtualDevice.getConversationResults(result.conversation_id);
}).then((results) => {
    console.log("Results: ", results);
});

As the results are processed, subsequent calls to the "getConversationResults" method will return the ones completed so far.
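The waiting step above can be sketched with a small polling helper. The helper below and its interval are illustrative choices, not part of the SDK; you would pass it a function that calls getConversationResults:

```javascript
// Illustrative helper, not part of the SDK: repeatedly calls getResults
// until the expected number of results is available.
async function pollResults(getResults, expectedCount, intervalMs = 2000) {
    while (true) {
        const results = await getResults();
        if (results && results.length >= expectedCount) {
            return results;
        }
        // Wait before asking again.
        await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
}
```

With the SDK instance above, it could be used as `pollResults(() => virtualDevice.getConversationResults(conversationId), 3)` for a three-message batch.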

HTTP API

The VirtualDevice service can also be called directly via HTTP.

To use it, first get your token:
Follow the instructions here.

The Base URL is:
https://virtual-device.bespoken.io

Process Endpoint

Takes a single message and returns the AVS response in text form.

  • URL

    /process

  • Method:

    GET

  • URL Params

    Required:

    message=[string]: the message that we want to send to Alexa

    user_id=[string]: "validation token" obtained from Bespoken Dashboard (http://apps.bespoken.io/dashboard)

    Optional:

    language_code=[string]: one of Alexa's supported locales (e.g. en-US, de-DE, etc.). Default value: "en-US". Taken from https://developer.amazon.com/docs/custom-skills/develop-skills-in-multiple-languages.html#h2-code-changes

    voice_id=[string]: one of Amazon Polly's supported voices (e.g. Joey, Vicki, Hans, etc.). Default value: "Joey". MUST correspond with the language_code. Taken from: https://docs.aws.amazon.com/polly/latest/dg/voicelist.html

    phrases=[string]: a word or phrase used as a hint so that speech recognition is more likely to recognize it as part of the Alexa response. You can repeat this parameter in the query string to send more than one.

    debug=[boolean]: return additional information like process duration, transcript duration and raw response. Default value: "false"

    new_conversation=[boolean]: open a new session, only for google virtual devices. Default value: "false"

    stt=[string]: speech to text service to use, supported services are google and witai. Default value: "google"

  • Success Response:

    • Code: 200
      Content:
       {
              "streamURL": "string",
              "sessionTimeout": 0,
              "transcriptAudioURL": "string",
              "message": "string",
              "transcript": "string",
              "card": {
                  "subTitle": "string",
                  "mainTitle": "string",
                  "textField": "string",
                  "type": "string",
                  "imageURL": "string"
              }
          }
      
  • Error Response:

    • Code: 400 BAD REQUEST
      Content: "message is required"

    • Code: 400 BAD REQUEST
      Content: "user_id is required"

    • Code: 400 BAD REQUEST
      Content: "Invalid user_id"

    • Code: 400 BAD REQUEST
      Content: {error: 'stt is invalid, it could be google or witai'}

    • Code: 500 INTERNAL SERVER ERROR
      Content: {error: 'error message in case of an exception'}

  • Sample Call:

curl "https://virtual-device.bespoken.io/process?message=what%20time%20is%20it&user_id=<your user id>&voice_id=Joey&language_code=en-US"
  • Notes:

    • If language_code or voice_id is not sent, they default to en-US and Joey, respectively.
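In Node.js, the query string above can be assembled with URLSearchParams, which also handles URL encoding. This sketch only builds the URL (the user_id value is a placeholder) and shows the phrases parameter repeated for multiple hints:

```javascript
// Build a /process URL; "phrases" may be repeated for multiple hints.
const params = new URLSearchParams();
params.append("message", "what time is it");
params.append("user_id", "<your user id>"); // placeholder token
params.append("language_code", "en-US");
params.append("voice_id", "Joey");
params.append("phrases", "half past");
params.append("phrases", "o'clock");

const url = "https://virtual-device.bespoken.io/process?" + params.toString();
console.log(url);
```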

Batch process Endpoint

Receives multiple messages and expected phrases in an object array. The goal of this endpoint is to handle a complete interaction with Alexa. By sending all messages to the endpoint, it is able to sequence them faster, avoiding session timeouts.

  • URL

    /batch_process

  • Method:

POST

  • URL Params

    Required:

    user_id=[string]: "validation token" obtained from Bespoken Dashboard (http://apps.bespoken.io/dashboard)

    Optional:

    language_code=[string]: one of Alexa's supported locales (e.g. en-US, de-DE, etc.). Default value: "en-US". Taken from https://developer.amazon.com/docs/custom-skills/develop-skills-in-multiple-languages.html#h2-code-changes

    voice_id=[string]: one of Amazon Polly's supported voices (e.g. Joey, Vicki, Hans, etc.). Default value: "Joey". MUST correspond with the language_code. Taken from: https://docs.aws.amazon.com/polly/latest/dg/voicelist.html

    async_mode=[boolean]: process the messages in the background, results can be obtained in the conversation endpoint. Default value: "false".

    debug=[boolean]: return additional information like process duration, transcript duration and raw response. Default value: "false"

    stt=[string]: speech to text service to use, supported services are google and witai. Default value: "google"

    location_lat=[float]: only for google, latitude coordinate from the location of the request

    location_long=[float]: only for google, longitude coordinate from the location of the request

    conversation_id=[string]: only for async_mode, sets the conversation id in advance

  • Data Params

    Required:

    messages=[array]: an object array where each object represents a message sent to the device. A message can be text or audio:

      • To send text, set the "text" field with the message. The optional "phrases" property is an array of strings, each a word or phrase used as a hint so the speech recognition library recognizes it better.

      • To send audio, set the "audio" field with the base64 representation of the audio. The "format" field is then mandatory; we currently support the formats supported by ffmpeg. For "raw" or "pcm" formats, "frame_rate", "channels" and "sample_width" can also be set; if not, the default values are used: "frame_rate": 16000, "channels": 1, "sample_width": 2.

    {
      "messages": [
        {"text": "string", "phrases": ["string"], "audio": "string", "format": "string", "frame_rate": 0, "channels": 0, "sample_width": 0}
      ]
    }
    
  • Success Response:

    • Code: 200
      Content:
      {
          "results": [
            {
                "streamURL": "string",
                "sessionTimeout": 0,
                "transcriptAudioURL": "string",
                "message": "string",
                "transcript": "string",
                "card": {
                    "subTitle": "string",
                    "mainTitle": "string",
                    "textField": "string",
                    "type": "string",
                    "imageURL": "string"
                }
            }
          ]
      }
      
  • Success Response (async mode):

    • Code: 200
      Content:
      {
          "conversation_id": "string",
      }
      
  • Error Response:

    • Code: 400 BAD REQUEST
      Content: "Invalid message"

    • Code: 400 BAD REQUEST
      Content: "Invalid user_id"

    • Code: 400 BAD REQUEST
      Content: {error: 'stt is invalid, it could be google or witai'}

    • Code: 400 BAD REQUEST
      Content: {error: 'Invalid format in message'}

    • Code: 400 BAD REQUEST
      Content: {error: 'Invalid encoded audio in message'}

    • Code: 400 BAD REQUEST
      Content: {error: 'Error processing audio'}

    • Code: 500 INTERNAL SERVER ERROR
      Content: {error: 'error message in case of an exception'}

  • Sample Call:

    const userId = "<your user id>";
    const voiceId = "Joey";
    const languageCode = "en-US";
    
    $.post(`https://virtual-device.bespoken.io/batch_process?user_id=${userId}&voice_id=${voiceId}&language_code=${languageCode}`,  
    {  
        "messages": [
            {"text":"open guess the price", "phrases":["how many persons"]},
            {"text":"one"}
        ]
    },  
    function(data, status){  
        console.log("Got: " + data.results.length + " responses!");  
    });
    
  • Notes:

    • If language_code or voice_id is not sent, they default to en-US and Joey, respectively.
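As a sketch, a request body combining a text message and an audio message could be built like this. The audio bytes below are placeholder data; in practice you would base64-encode a real recording, e.g. read with fs.readFileSync:

```javascript
// Placeholder audio bytes; replace with a real recording's bytes.
const audioBytes = Buffer.from([0x52, 0x49, 0x46, 0x46]);

const body = {
    messages: [
        // Text message with an optional speech-recognition hint.
        { text: "open guess the price", phrases: ["how many persons"] },
        // Audio message: base64-encoded bytes plus the mandatory format.
        { audio: audioBytes.toString("base64"), format: "wav" }
    ]
};
console.log(JSON.stringify(body));
```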

Conversation Endpoint

Obtains the processed results from a batch process sent in async mode.

  • URL

    /conversation

  • Method:

GET

  • URL Params

    Required:

    uuid=[string]: the id of the conversation, returned by the Batch process when async mode param is set to true

  • Success Response:

    • Code: 200
      Content:
      {
          "results": [
            {
                "streamURL": "string",
                "sessionTimeout": 0,
                "transcriptAudioURL": "string",
                "message": "string",
                "transcript": "string",
                "card": {
                    "subTitle": "string",
                    "mainTitle": "string",
                    "textField": "string",
                    "type": "string",
                    "imageURL": "string"
                }
            }
          ]
      }
      
  • Error Response:

    • Code: 400 Bad Request
      Content: {error: 'Required parameter uuid is missing'}

    • Code: 404 Not Found
      Content: {error: 'The uuid provided doesn\'t match a process'}

    • Code: 500 INTERNAL SERVER ERROR
      Content: {error: 'error message in case of an exception'}

  • Sample Call:

    const userId = "<your user id>";
    const voiceId = "Joey";
    const languageCode = "en-US";
    
    $.post(`https://virtual-device.bespoken.io/batch_process?user_id=${userId}&voice_id=${voiceId}&language_code=${languageCode}&async_mode=true`,
    {
        "messages": [
            {"text":"open guess the price", "phrases":["how many persons"]},
            {"text":"one"}
        ]
    },
    function(data, status){
        console.log("Got conversation id: " + data.conversation_id);
    
        // Logic implementation to wait for a number of seconds
    
        $.get(`https://virtual-device.bespoken.io/conversation?uuid=${data.conversation_id}`,
        function(convData, convStatus){
            console.log("Got: " + convData.results.length + " responses!");
        });
    });