Alexa HTML SDK

Overview

Welcome to the Alexa HTML SDK. This library allows your web application to exchange messages with your skill endpoint and to react to certain local device events. For a skill to display UI on-device using the HTML runtime, the device must support the Alexa.Presentation.HTML interface.

For additional information, please review the following links:

  1. Tech Docs Overview
  2. Interface Reference
  3. Known Issues

Getting Started

Skill

You begin by creating a skill using the standard ASK tools. A skill is required even if you plan to support only devices with HTML.

Alexa.Presentation.HTML.Start

If a device supports HTML, it lists Alexa.Presentation.HTML as a supported interface. Your skill also needs to declare that it supports this interface, either in the skill manifest (if using the ASK CLI) or in the ASK Developer Console.

If the Alexa.Presentation.HTML interface is available, then you can add a new directive indicating you want to load a web page. This can be done as a response to most intents, but cannot be sent with other UI interfaces like APL or RenderTemplate.

{
  "type": "Alexa.Presentation.HTML.Start",
  "data": {}, // data you can send that is made available when the client is initialized
  "configuration": {
    "timeoutInSeconds": 90 // you can supply a timeout of up to 1800 seconds to indicate how long the screen stays on with no customer interaction
  },
  "request": {
      "method": "GET",
      "headers": { "authToken": "xxxxx" }, // Optional Authorization header to access game resources
      "uri": "https://{your domain}/{your file name}.html" // Required, URI of the html page
  }
}

NOTE: This is the JSON representation of the directive. Each SDK has a different way to add directives to your skill response.
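As a rough illustration of assembling this directive in code, the sketch below builds the same JSON shape as a plain object and places it in a raw skill response. The helper name buildStartDirective and the example URL are hypothetical and not part of any SDK. The checks simply mirror the constraints stated above (an HTTPS URI and a timeout of at most 1800 seconds).

```javascript
// Hypothetical helper, not part of the ASK SDK: builds the Start directive as a plain object.
function buildStartDirective(uri, data = {}, timeoutInSeconds = 90) {
    if (typeof uri !== 'string' || !uri.startsWith('https://')) {
        throw new Error('The page URI must use HTTPS');
    }
    if (timeoutInSeconds > 1800) {
        throw new RangeError('timeoutInSeconds may not exceed 1800');
    }
    return {
        type: 'Alexa.Presentation.HTML.Start',
        data, // made available to the web app when the client is initialized
        configuration: { timeoutInSeconds },
        request: { method: 'GET', uri }
    };
}

// Raw response JSON carrying the directive; the URL is a placeholder.
const exampleResponse = {
    version: '1.0',
    response: {
        directives: [buildStartDirective('https://example.com/index.html', { level: 1 })]
    }
};
```

If you use an ASK SDK, its response builder would normally take the directive object in place of the hand-written response above.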

Web Page

The next part of supporting HTML is creating and hosting your web content.

HTML Example

Below is a very simple example of an HTML file that references the Alexa HTML SDK.

<!DOCTYPE html>
<html>
    <head>
        <meta charset="utf-8">
        <title>My First HTML Skill for Alexa</title>
        <style>
            body { margin: 0; background-color: black; color: white;  }
        </style>
    </head>
    <body>
    <script src="https://cdn.html.games.alexa.a2z.com/alexa-html/latest/alexa-html.js"></script>
    <div id="hello"></div>
    <script>
    Alexa.create({version: '1.0'})
        .then((args) => {
            const {
                alexa,
                message
            } = args;
            document.getElementById('hello').innerHTML = '<b>:) Alexa is ready</b>';
        })
        .catch(error => {
            document.getElementById('hello').innerHTML = '<b>:( Alexa not ready</b>';
        });
    </script>
    </body>
</html>

Notice that we set a background color of black. Having a loading page or some sort of experience while your assets load is key to letting your customers know that your app is getting ready for use. We recommend the background color stay consistent during loading and when your app is ready for use in order to avoid a "flashing" effect.

Hosting

You can host this file on AWS S3, a CDN like AWS CloudFront, or any web server you like that supports HTTPS. If you host on S3, remember to make your object public and use the HTTPS endpoint. The HTTPS object URL will usually be https://{bucket name}.s3.amazonaws.com/{object key}. If you are using the static website hosting feature of S3, you need to enable CloudFront to serve HTTPS.

Device Script File Interception

Referencing the HTML SDK via a script tag works differently on-device than off-device. On-device, the HTTP GET request is intercepted, and the JavaScript file that is meant to work on-device is returned instead. Off-device, the request returns the JavaScript file contents from the URL, but the default protocol will not work.

For now, you have a couple of options for local development:

  1. Don't include this script reference unless publishing. You can instead check for existence of the Alexa object and if it does not exist you can create your own mock.
  2. Create your own MessageProvider and trigger events as needed to simulate interactions.
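The first option could look something like the sketch below. Everything here is a hypothetical stand-in for local development rather than SDK code: createMockAlexa, simulateSkillMessage, and sentMessages are illustrative names, and the mock covers only the parts of the client surface used in this document (create, skill.onMessage, skill.sendMessage, and capabilities.microphone).

```javascript
// Hypothetical local-development mock; not part of the Alexa HTML SDK.
function createMockAlexa(startData = {}) {
    const messageHandlers = [];
    const sentMessages = [];

    const client = {
        skill: {
            // Register a handler for messages coming from the (simulated) skill.
            onMessage(handler) { messageHandlers.push(handler); },
            // Record outgoing messages so they can be inspected while developing.
            sendMessage(message) { sentMessages.push(message); }
        },
        capabilities: {
            microphone: { supportsWakeWord: false, supportsPushToTalk: false }
        }
    };

    return {
        // Mirrors the documented shape: resolves with { alexa, message }.
        create() { return Promise.resolve({ alexa: client, message: startData }); },
        // Development helpers (assumed names).
        client,
        sentMessages,
        simulateSkillMessage(message) { messageHandlers.forEach((h) => h(message)); }
    };
}

// Install the mock only when no Alexa object is present.
if (typeof globalThis.Alexa === 'undefined') {
    globalThis.Alexa = createMockAlexa({ level: 1 });
}
```

With the mock installed, the page can call Alexa.create as usual, and you can call Alexa.simulateSkillMessage from the browser console to exercise your onMessage handlers.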

Construction of Alexa API

As usual, the HTML application begins when the page loads. The Alexa connection needs to be established at this time, and you can do so by creating the client.

let client;
Alexa.create({version: '1.0'})
    .then((args) => {
        const {
            alexa,
            message
        } = args;
        client = alexa;
    });

See the create method for additional information.

Versioning

Over time the Alexa API may need to introduce breaking changes to its function signatures. When it does so, the version number will change. You request a specific version number when calling Alexa.create to indicate that you've written your code against a specific set of signatures. Future versions of the Alexa API will accommodate requests for older interfaces wherever possible, so that your code does not break. If you do not provide a version, or set version: "", then the latest version will be used.

Alexa Lifecycle

You can detect whether your customer's device supports HTML by inspecting the context of a skill request and checking for the existence of Alexa.Presentation.HTML in supportedInterfaces:

{
  "System": {
    "application": {
      "applicationId": "amzn1.ask.skill.XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"
    },
    "user": {
      "userId": "amzn1.ask.account.XXXX"
    },
    "device": {
      "deviceId": "amzn1.ask.device.XXXX",
      "supportedInterfaces": {
        "Alexa.Presentation.HTML": {
          "runtime": {
            "maxVersion": "1.0"
          }
        }
      }
    },
    "apiEndpoint": "https://api.amazonalexa.com",
    "apiAccessToken": "XXXXXXXXXXXXXXXXXXXXXXXX"
  }
}

If the device supports HTML, you can respond with an Alexa.Presentation.HTML.Start directive to load your web application. The device then loads the given URL, and the web app must initialize the Alexa SDK and wait for the connection to be established. Once the SDK is initialized, the create promise resolves with the alexa client and any start data you included in your Alexa.Presentation.HTML.Start directive. At this point the rest of the API is ready for use.
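The support check described above can be sketched on the skill side as follows. The function name supportsHtml is hypothetical. It assumes the full request envelope, in which the System object shown above sits under the context property.

```javascript
// Hypothetical helper: returns true when the requesting device lists Alexa.Presentation.HTML.
function supportsHtml(requestEnvelope) {
    const interfaces = requestEnvelope?.context?.System?.device?.supportedInterfaces;
    return Boolean(interfaces && interfaces['Alexa.Presentation.HTML']);
}
```

A handler would typically call this before adding the Start directive, and fall back to a voice-only response when it returns false.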

The Alexa Client Object

The alexa client returned by create is your connection to Alexa. See the full client reference for additional information.

Skill Communication

HTML applications can communicate with their skill endpoint to send local inputs, receive voice inputs, and use Alexa services like the skill store. To do this, the client uses a bidirectional socket-like messaging scheme. The application can send a message to the skill at any time (e.g. if the user touches the screen, or a timer elapses), and the skill can send one to the application at any time (e.g. should it receive an intent request). This mechanism is asynchronous and not guaranteed to be ordered.

alexa.skill.onMessage((message) => {
  // This is invoked for every HandleMessage directive from the skill.
  console.log('received a message: ' + JSON.stringify(message));
});

// You can send whatever data you need to be processed on your skill endpoint
alexa.skill.sendMessage({
    players: 1,
    state: {
        level: 5,
        health: 0.5
    },
    speech: 'are you ready for the next level?'
});

For more information on sendMessage and onMessage, see the skill interface.

NOTE: The skill.sendMessage function is rate limited to a maximum of 2 requests per second. To handle the throttling error (or any other error), please supply the optional callback argument to the sendMessage function. See sendMessage for additional information.
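One way to stay under the limit is to space outgoing messages on the client. The sketch below is one possible approach, not SDK behavior: createThrottledSender is a hypothetical wrapper that queues messages and releases at most one every 500 ms, an interval derived from the limit of 2 requests per second. The schedule parameter exists only so the timing can be swapped out in tests.

```javascript
// Hypothetical wrapper: queues messages and forwards at most one per interval.
function createThrottledSender(send, minIntervalMs = 500, schedule = setTimeout) {
    const queue = [];
    let draining = false;

    function drain() {
        if (queue.length === 0) {
            draining = false; // nothing left; the next message goes out immediately
            return;
        }
        const { message, callback } = queue.shift();
        send(message, callback);
        schedule(drain, minIntervalMs); // wait before releasing the next message
    }

    return function enqueue(message, callback) {
        queue.push({ message, callback });
        if (!draining) {
            draining = true;
            drain();
        }
    };
}

// Usage (assumes `alexa` is the client returned by Alexa.create):
// const sendToSkill = createThrottledSender((m, cb) => alexa.skill.sendMessage(m, cb));
// sendToSkill({ action: 'move', direction: 'left' });
```

Queuing trades latency for reliability. If stale messages are worse than dropped ones in your application, you may prefer to discard queued messages instead, and you should still supply the callback to handle any errors that do occur.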

Device Capabilities

The SDK surfaces capabilities about the device via the alexa.capabilities object. For instance, this object contains information on whether a device supports wake-word and/or push-to-talk.
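As a small illustration, the sketch below reads the two microphone flags referenced in this document (supportsWakeWord and supportsPushToTalk) and maps them to a label your UI could use when prompting for voice input. The function name and the returned labels are hypothetical.

```javascript
// Hypothetical helper: maps microphone capabilities to a UI prompt mode.
function chooseVoicePromptMode(capabilities) {
    const microphone = (capabilities && capabilities.microphone) || {};
    if (microphone.supportsWakeWord) {
        return 'wake-word';
    }
    if (microphone.supportsPushToTalk) {
        return 'push-to-talk';
    }
    return 'none';
}

// Usage (assumes `alexa` is the client returned by Alexa.create):
// const mode = chooseVoicePromptMode(alexa.capabilities);
```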

Speech

You can leverage speech by sending a message to your skill and having the skill return the usual outputSpeech. You can also listen to global speech events on-device, which fire when the device's speech player starts and stops playing.

alexa.speech.onStarted(() => {
    console.log('speech is playing');
});

alexa.speech.onStopped(() => {
    console.log('speech stopped playing');
});

If you're using transformers, you can also fetch and demux the transformed speech mp3 to extract the audio buffer and speech marks.

const transformerSpeechMp3 = 'https://tinytts.amazon.com/resource.mp3';
Alexa.utils.speech.fetchAndDemuxMP3(transformerSpeechMp3).then((audioResult) => {
    const {audioBuffer, speechMarks} = audioResult;
    playAudioAndAnimateCharacterSpeech(audioBuffer, speechMarks);
});

See Speech and SpeechUtils for additional information.
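The speech events above can be folded into a single piece of state, which is handy for things like pausing input or animating a character while Alexa talks. The sketch below is a minimal example built only on onStarted and onStopped. The trackSpeech name and its onChange parameter are hypothetical.

```javascript
// Hypothetical helper: tracks whether the device's speech player is currently playing.
function trackSpeech(speech, onChange = () => {}) {
    let speaking = false;

    speech.onStarted(() => {
        speaking = true;
        onChange(true);
    });
    speech.onStopped(() => {
        speaking = false;
        onChange(false);
    });

    return { isSpeaking: () => speaking };
}

// Usage (assumes `alexa` is the client returned by Alexa.create):
// const speechState = trackSpeech(alexa.speech, (active) => console.log('speaking:', active));
```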

Voice

You can use the voice object to listen for voice input events and react to them on-device. For example, you can listen to global microphone events raised when the device opens and closes the microphone.

alexa.voice.onMicrophoneOpened(() => {
    dimScreen();
    duckAudio();
});
alexa.voice.onMicrophoneClosed(() => {
    undimScreen();
    restoreAudio();
});

On devices that support wake-word activation, you can request the microphone to open programmatically to avoid making a round-trip to the skill.

let requestMicrophoneOpenSupported = alexa.capabilities.microphone.supportsWakeWord;

function openMic() {
    if (requestMicrophoneOpenSupported){
        alexa.voice.requestMicrophoneOpen({
            onError: (reason) => {
                if (reason === "request-open-unsupported") {
                    requestMicrophoneOpenSupported = false;
                    openMicrophoneUsingSkillFallback();
                } else {
                    console.log("Microphone open error: " + reason);
                }
            }
        });
        return;
    }
    if (alexa.capabilities.microphone.supportsPushToTalk) {
        promptUserToPressMicButton();
        return;
    }
    openMicrophoneUsingSkillFallback();
}
function openMicrophoneUsingSkillFallback() {
  // Send a message to your skill and respond with { 'shouldEndSession': false }
  // to open the microphone
  // https://developer.amazon.com/en-US/docs/alexa/custom-skills/request-and-response-json-reference.html
  // e.g.
  alexa.skill.sendMessage({ action: "open-mic" });
}

function promptUserToPressMicButton() {
  // Update your web page to show a prompt to the user to show that the game
  // is waiting for them to respond with a voice command, which can be initiated
  // by pressing the mic button.
}

See Voice for additional information.

2019, Amazon.com, Inc. or its affiliates. All Rights Reserved.