In part one we looked at Alexa as a service, diving into what makes up a skill a developer can put on the platform. We also looked at one half of the Alexa system, the Skills Interface.
So far we’ve set up the Skills Interface to recognize some of the user’s intents, including slots for objects and locations (in our case). We’ve also supplied a listing of utterances and are able to test the system on some example phrases.
When we finished the last post we saw the JSON object Alexa will send to the Skill Service.
{
  "session": {
    "sessionId": "SessionId.<sessionId>",
    "application": {
      "applicationId": "amzn1.ask.skill.<reference>"
    },
    "attributes": {},
    "user": {
      "userId": "amzn1.ask.account.<userId>"
    },
    "new": true
  },
  "request": {
    "type": "IntentRequest",
    "requestId": "EdwRequestId.<requestId>",
    "locale": "en-US",
    "timestamp": "2017-01-30T18:09:08Z",
    "intent": {
      "name": "RecordLocationIntent",
      "slots": {
        "Object": {
          "name": "Object",
          "value": "car keys"
        },
        "Location": {
          "name": "Location",
          "value": "kitchen drawer"
        }
      }
    }
  },
  "version": "1.0"
}
How and where we handle this is (mostly) up to us. Amazon recommends AWS Lambda for your Skill Service, but you can instead use any server that communicates over HTTPS with a valid SSL certificate.
While AWS Lambda is a fine service, we chose to build our service using the Phoenix web framework. Phoenix runs on Elixir, a functional language ideal for this kind of task.
There are various packages you can add to an Elixir application to assist with an Alexa skill. We chose Phoenix Alexa. We liked its small code base providing just enough for our needs.
You add Phoenix Alexa to the mix application in the normal way. Add it to both the deps and the application functions in mix.exs.
def application do
  [applications: [ ... , :phoenix_alexa]]
end
def deps do
  [{:phoenix, "~> 1.2.0"},
   ...
   {:phoenix_alexa, "~> 0.2.0"}]
end
We’ll pipe all calls from Alexa through the :api pipeline. It’s set to receive JSON out of the box. However, getting an Alexa skill certified requires an additional step, which we’ll cover here.
Alexa’s service needs to offer you some guarantee that the messages that reach your endpoint are, indeed, coming from Amazon. Rather than offering you an API token or the like, Amazon signs its messages using an X.509 certificate and a private key. For each request, Amazon takes the body and signs it using its private key. You then have to fetch their certificate, which contains their public key, and ensure that the signature (which they provide as a header) matches the body you received.
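To make the idea concrete, here is a minimal sketch of signing and verifying a body with Erlang's :public_key module. Everything in it is an assumption made for illustration: the SignatureSketch module is hypothetical, the key pair is a throwaway one generated on the spot rather than Amazon's, and the :sha256 digest should be checked against Amazon's documentation. It is not how our Plug is implemented, and it skips the certificate checks a real verifier needs.

```elixir
defmodule SignatureSketch do
  # Hypothetical helper for illustration only.
  # The digest is an assumption; confirm the algorithm in Amazon's documentation.
  @digest :sha256

  # The sender signs the raw body with its private key and Base64-encodes the result.
  def sign(body, private_key) do
    body
    |> :public_key.sign(@digest, private_key)
    |> Base.encode64()
  end

  # The receiver decodes the header value and checks it against the body it received.
  def valid?(body, signature_header, public_key) do
    case Base.decode64(signature_header) do
      {:ok, signature} -> :public_key.verify(body, @digest, signature, public_key)
      :error -> false
    end
  end
end

# Throwaway RSA key pair standing in for Amazon's. In the real flow the
# public key comes from the certificate Amazon publishes.
private_key = :public_key.generate_key({:rsa, 2048, 65_537})
public_key = {:RSAPublicKey, elem(private_key, 2), elem(private_key, 3)}

body = ~s({"version":"1.0"})
signature = SignatureSketch.sign(body, private_key)

# The untouched body verifies; a body altered in transit does not.
IO.inspect(SignatureSketch.valid?(body, signature, public_key))
IO.inspect(SignatureSketch.valid?(body <> " ", signature, public_key))
```

The important detail is that verification runs over the exact bytes received, so the raw body has to be checked before any JSON parsing changes it.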
Sounds complicated? Well, thankfully for you we’ve written and open-sourced a Plug that manages the whole process for you. So let’s add it to our dependencies:
def deps do
  [...,
   {:less_verifies_alexa, "~> 0.1.0"}]
end
And add it to our :api pipeline:
pipeline :api do
  plug :accepts, ["json"]
  if Mix.env == :prod do
    plug LessVerifiesAlexa.Plug, application_id: Application.get_env(:phoenix_alexa, :application_id)
  end
end
We grab the :application_id from the application environment via Application.get_env/2. If there’s something wrong with the request, the Plug will return a status code of 400 and halt the connection. You can find the source for the plug on GitHub, and we’ll take a deep dive into how it works in the third part of this series.
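One way to feed that value in from an operating system environment variable is through the Mix config. This is only a sketch: the file choice and the variable name ALEXA_APPLICATION_ID are our assumptions for illustration, not something the plug requires.

```elixir
# config/prod.exs (sketch)
use Mix.Config

# ALEXA_APPLICATION_ID is a hypothetical variable name. It should hold the
# skill's application id (the amzn1.ask.skill.<reference> value).
config :phoenix_alexa,
  application_id: System.get_env("ALEXA_APPLICATION_ID")
```

Bear in mind that this file is evaluated when the project is compiled, so the variable has to be set in the build environment.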
So, we should now have a call reaching our application with JSON as its payload. We’ll add a route to the router to handle this incoming call.
scope "/command", OurApplication do
  pipe_through :api
  post "/", AlexaController, :command
end
Pretty standard stuff: pipe it through the :api pipeline we set above, then dispatch it to the :command action in the AlexaController.
Now we need to set up the controller to handle the Intents we’ll receive.
In the controller we make use of Phoenix’s web module. Here, we use the Phoenix Alexa controller module.
defmodule OurApplication.AlexaController do
  use OurApplication.Web, :controller
  use PhoenixAlexa.Controller, :command
  ...
The last line there is what handles the incoming request. That controller defines some functions as overridable,
...
defoverridable [launch_request: 2, intent_request: 3, session_ended_request: 2]
...
So, we can override those functions in our controller to manage responses.
The launch_request/2 function handles an initial request (with no intent included). We can leave session_ended_request/2 alone, since the default one works just fine for our needs. Let’s set both of those aside for now; the real magic happens in the intent_request/3 call.
Let’s take a look at a sample of our overrides for intent_request/3:
def intent_request(conn, "RecordLocationIntent", request) do
  response = Responses.record_location(request)
  do_response(conn, response)
end
def intent_request(conn, "FindObjectIntent", request) do
  response = Responses.retrieve_object_location(request)
  do_response(conn, response)
end
Note the use of pattern matching in the function definition’s second parameter. We are being passed whatever string is in the request.request.intent.name of the JSON payload.
An intent of RecordLocationIntent or FindObjectIntent will match. We pass the request off to another module, Responses, calling the relevant function therein. When the response returns we use do_response to handle what’s sent back to Alexa.
Imagine a RecordLocationIntent comes in. The controller is going to call Responses.record_location/1.
def record_location(
      %{
        request: %{
          intent: %{
            slots: %{
              "Object" => %{"value" => object_name},
              "Location" => %{"value" => location}
            }
          }
        },
        session: %{user: %{userId: user_id}}
      }
    ) do
  # Create the entry the first time we hear about an object, update it afterwards.
  case Objects.get_by(user_id, object_name) do
    :not_found ->
      Objects.record(user_id, object_name, location)
    {:ok, _result} ->
      Objects.update(user_id, object_name, location)
  end

  "Ok, we have your #{object_name} in the #{location}"
end
Now, some may balk at the pattern matching going on in the head of the function. Bear in mind we receive the request as a deeply nested struct of structs from the controller. To process the response we need the object_name, the location and the user_id.
We could just dig out the data we need with get_in, but I think there’s something to be said for having a bird’s eye view of the data structure you’re being passed in through pattern matching like this.
Once inside the function we know we have the variables required. There’s no need for conditional logic to check their presence.
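For comparison, here is a rough sketch of the get_in route. Plain maps stand in for the structs Phoenix Alexa hands us (an assumption made to keep the sketch self-contained), and the "Colour" slot is a made-up name used only to show what a missing value looks like.

```elixir
# Plain maps standing in for the request structs (an assumption for this sketch).
request = %{
  request: %{
    intent: %{
      slots: %{
        "Object" => %{"value" => "car keys"},
        "Location" => %{"value" => "kitchen drawer"}
      }
    }
  },
  session: %{user: %{userId: "amzn1.ask.account.example"}}
}

# Each value is dug out separately by walking a path of keys.
object_name = get_in(request, [:request, :intent, :slots, "Object", "value"])
location = get_in(request, [:request, :intent, :slots, "Location", "value"])
user_id = get_in(request, [:session, :user, :userId])

# A missing slot quietly yields nil, so the caller has to check for it.
missing = get_in(request, [:request, :intent, :slots, "Colour", "value"])

IO.inspect({object_name, location, user_id, missing})
```

With real structs the paths would need Access.key/1 for each struct field, since structs do not implement the Access behaviour by default. Either way, the nil checks that get_in pushes onto the caller are exactly what the pattern match in the function head does for us.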
The next clause of record_location/1 in the module reads:
def record_location(_bad_request), do: "We don't have the info we need to process your request."
Pattern matching FTW! Whichever one matches returns a string outlining the result of the call.
We do the same thing for the FindObjectIntent. When called, retrieve_object_location/1 will pattern match the request. This time it’s a little less verbose (as there’s no need for location).
def retrieve_object_location(
      %{
        request: %{
          intent: %{
            slots: %{
              "Object" => %{"value" => object_name}
            }
          }
        },
        session: %{user: %{userId: user_id}}
      }
    ) do
  case Objects.get_by(user_id, object_name) do
    # Same :not_found atom that record_location/1 matches on.
    :not_found ->
      "I'm sorry, we don't have a location for #{object_name}"
    {:ok, result} ->
      {_user_id, %{location: location}} = result
      "You last put your #{object_name} in the #{location}"
  end
end
Again we add a fallback retrieve_object_location/1 clause for when the function above doesn’t match.
def retrieve_object_location(_bad_request), do: "We don't have the info we need to process the request."
Anything falling through to the second declaration means we don’t have what we need. We return a descriptive string.
You may have noticed I’ve used some pretty abstract code in the examples for persistence. How you choose to persist your data is up to you. If using Phoenix you may want to use Ecto. There are other options available.
In our case we wanted to exercise our OTP muscles. We went for a combination of Elixir’s GenServer backed by Erlang’s dets module. We had limited experience with dets, so it was a nice opportunity to further our understanding. There’s a little set up required with supervision and the like, but once it’s in place, it’s a breeze to work with.
dets does have some well documented restrictions in terms of how much data it can work with. But there are ways around those restrictions, either by means of hashing or by swapping out dets altogether. The advantage of having a single storage interface with Objects is that we can make whatever changes we want in the backend without having to rework any of the controller code.
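For readers curious what that combination might look like, here is a minimal sketch of a GenServer that owns a dets table. The module name ObjectsSketch, the key layout and the return shapes are assumptions chosen to resemble the calls shown earlier; it is not the code behind our Objects module, and it leaves out supervision entirely.

```elixir
defmodule ObjectsSketch do
  # Hypothetical storage module for illustration only.
  use GenServer

  @table :objects_sketch

  # Client API

  def start_link(file) do
    GenServer.start_link(__MODULE__, file, name: __MODULE__)
  end

  def get_by(user_id, object_name) do
    GenServer.call(__MODULE__, {:get_by, user_id, object_name})
  end

  # In a :set table an insert overwrites any existing entry for the key,
  # so recording and updating can share one code path in this sketch.
  def record(user_id, object_name, location) do
    GenServer.call(__MODULE__, {:record, user_id, object_name, location})
  end

  # Server callbacks

  @impl true
  def init(file) do
    # dets expects the file name as a charlist.
    {:ok, table} = :dets.open_file(@table, file: String.to_charlist(file), type: :set)
    {:ok, table}
  end

  @impl true
  def handle_call({:get_by, user_id, object_name}, _from, table) do
    reply =
      case :dets.lookup(table, {user_id, object_name}) do
        [{_key, location}] -> {:ok, {user_id, %{location: location}}}
        [] -> :not_found
      end

    {:reply, reply, table}
  end

  def handle_call({:record, user_id, object_name, location}, _from, table) do
    :ok = :dets.insert(table, {{user_id, object_name}, location})
    {:reply, :ok, table}
  end

  @impl true
  def terminate(_reason, table), do: :dets.close(table)
end

# Usage: the file lives in the system temp directory for the sake of the example.
file = Path.join(System.tmp_dir!(), "objects_sketch_#{System.unique_integer([:positive])}.dets")
{:ok, _pid} = ObjectsSketch.start_link(file)
:ok = ObjectsSketch.record("user-1", "car keys", "kitchen drawer")
IO.inspect(ObjectsSketch.get_by("user-1", "car keys"))
```

Routing every read and write through one process keeps access to the table serialized, which is simple to reason about at the cost of throughput.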
Before I close out I want to jump back to the controller to see the request leave the codebase. Earlier I showed the controller calling do_response/2 when we had a response to the intent.
defp do_response(conn, text) do
  response =
    %Response{}
    |> set_output_speech(%TextOutputSpeech{text: text})
    |> set_should_end_session(true)
  conn |> set_response(response)
end
This is pretty straightforward. We’re using functions given to us by Phoenix Alexa. We populate a %Response{} struct via the set_output_speech and set_should_end_session functions. We then use set_response passing in the conn and the response.
Note: As this is a ‘request and response’ type intent, we end the session. There is nothing more we need to do this time around.
set_response/2 sets the response content type, and encodes the response to JSON for us. Nice.
In the case of a caller telling us where they have left something the response from our server is,
{
  "version": "1.0",
  "response": {
    "outputSpeech": {
      "type": "PlainText",
      "text": "Ok, we have your car keys in the kitchen drawer"
    },
    "shouldEndSession": true
  },
  "sessionAttributes": {}
}
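If you are not using Phoenix Alexa, the same shape can be built by hand as a plain map before encoding it with whichever JSON library you prefer. The sketch below is our illustration of that idea: ResponseSketch is a hypothetical module, and it mirrors the JSON above rather than the library's internals.

```elixir
defmodule ResponseSketch do
  # Hypothetical helper that mirrors the JSON shape shown above.
  def plain_text(text, end_session?) when is_binary(text) and is_boolean(end_session?) do
    %{
      "version" => "1.0",
      "response" => %{
        "outputSpeech" => %{"type" => "PlainText", "text" => text},
        "shouldEndSession" => end_session?
      },
      "sessionAttributes" => %{}
    }
  end
end

response = ResponseSketch.plain_text("Ok, we have your car keys in the kitchen drawer", true)
IO.inspect(response)
```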
Alexa will say the text to the caller, and we can now respond to them when they ask where they left the car keys.
Hopefully this gives you some insight into setting up a Skill with Alexa. We enjoyed putting the skill together and leveraging some of Elixir’s power in the process.
Please drop us a line or leave a comment if you want to ask us anything. We’d love to hear about your adventures with Alexa and how your skills are coming along!