Flowstorm Socket V1
This page describes the communication protocol between clients and the server-side web socket channel implementation provided by the Core Application.
If you are just starting your own client implementation, consider using Flowstorm Socket V2 instead.
The web socket is provided by the Core Application at the URI /socket
The Platform Cloud Core Runner service provides the web socket at the URL wss://core.flowstorm.ai/socket
V1 Events
A client using Socket V1 should support the following set of events.
See the detailed list of Channel Events.
Name | Description |
Init | to initialise connection to server |
Request | to send user request |
InputAudioStreamOpen | to initiate input audio processing (ASR) |
| to finish input audio processing |
| to cancel input audio prematurely |
The following event types are emitted by the server
Name | Description |
Ready | to acknowledge initialisation from server side |
Response | to send bot response |
Recognized | to send text result of ASR |
| to pass valid session ID to the client |
| to inform the client that the session has ended and that the session ID is no longer valid and should be forgotten |
Communication Flow
The logic of client-to-server communication is as follows
Client | Server |
Connection initiation or restoration | |
State | waiting for new connections |
State | creates BotSocketAdapter object and waits for events |
Sends Init event | verifies the sender; if OK, sends Ready |
Accepting events
| |
Conversation | |
When the client reaches state, version 2 expects the client to propose a sessionId (unlike v1, where the sessionId was defined strictly by the server). The server can accept it or set a different one; in both cases it will send | sends event |
Sends text messages by sending a Request event | processes the request and sends a Response event |
Accepting Server can also actively send | |
Audio Input for Speech-To-Text | |
When the client is in the Listening state and audio input is enabled, it sends the InputAudioStreamOpen event | The server confirms that it is ready to accept and process the audio input stream by returning InputAudioStreamOpen |
The client streams binary audio data to the server | The server reads the binary audio data and passes it to the ASR (STT) service. When ASR has finished, it sends Recognized |
The client accepts the event | The server releases the resources used to process the input audio |
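The audio input rows above can be sketched as a small helper that splits captured audio into binary packets and hands each one to a sender. This is a minimal sketch under assumptions: the names `audio_packets` and `stream_audio`, the `send_binary` callback and the 3200-byte packet size are hypothetical, since this page does not specify a packet size or audio format.

```python
from typing import Callable, Iterator


def audio_packets(audio: bytes, packet_size: int = 3200) -> Iterator[bytes]:
    """Yield consecutive slices of `audio`, each at most `packet_size` bytes.

    The default packet size is an assumption, not a documented value.
    """
    if packet_size <= 0:
        raise ValueError("packet_size must be positive")
    for offset in range(0, len(audio), packet_size):
        yield audio[offset:offset + packet_size]


def stream_audio(audio: bytes, send_binary: Callable[[bytes], None],
                 packet_size: int = 3200) -> int:
    """Pass every packet to `send_binary` and return the number of packets sent.

    `send_binary` stands in for whatever binary-send call the web socket
    library in use provides (hypothetical callback).
    """
    count = 0
    for packet in audio_packets(audio, packet_size):
        send_binary(packet)
        count += 1
    return count
```

In a real client, `send_binary` would only be called after the server has confirmed the stream with InputAudioStreamOpen, and streaming would stop once Recognized arrives.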
Example of client-server communication
Connection Initiation
The client establishes a web socket connection and is in the Open state, so it can initialise the connection for further communication by sending an Init event containing config parameters that describe how STT (Speech-To-Text) and TTS (Text-To-Speech) will be handled.
The server confirms that it is ready for communication by sending a Ready event, and the client goes to the Sleeping state.
Conversation Request
The client initiates a conversation upon user activity (e.g. pressing the talk button) by sending a Request event containing the #intro action.
The client can generate or set sessionId so that it can attach to an existing or previous session.
deviceId should be unique per hardware + software client combination (e.g. browser type with a unique ID stored in local storage for a web application, application ID + mobile phone ID for mobile clients, client software name combined with MAC address for standalone client appliances, etc.).
A Request can contain attributes describing the client's current state, e.g. client type, location, temperature, etc. To better understand how these client attributes can be used in the server part of the Platform programming model, see the Client Attributes page.
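To make the pieces of a Request concrete, the following sketch assembles one as JSON. It is illustrative only: the wire format is not given on this page, so the JSON field names (`type`, `text`, `attributes`), the helper names `new_device_id` and `build_request`, and the use of a UUID for a generated sessionId are all assumptions.

```python
import json
import uuid
from typing import Any, Dict, Optional


def new_device_id(client_name: str, hardware_id: str) -> str:
    """Combine a software name with a hardware identifier (hypothetical scheme)."""
    return f"{client_name}:{hardware_id}"


def build_request(device_id: str, session_id: Optional[str] = None,
                  text: str = "#intro",
                  attributes: Optional[Dict[str, Any]] = None) -> str:
    """Serialise a Request event as JSON; all field names are assumptions."""
    event = {
        "type": "Request",
        # Reuse the given sessionId to attach to an existing session,
        # otherwise generate a fresh one.
        "sessionId": session_id or str(uuid.uuid4()),
        "deviceId": device_id,
        "text": text,
        # Client attributes such as client type or location.
        "attributes": attributes or {},
    }
    return json.dumps(event)
```

For example, `build_request(new_device_id("webclient", "abc123"), attributes={"clientType": "web"})` produces a Request carrying the #intro action and a newly generated sessionId, while passing `session_id` reuses an earlier session.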
Conversation Response
The server responds with a Response event and the client goes to the Responding state, playing the output audio.
Speech Recognition
Once the client has finished playing audio, it opens input audio by sending an InputAudioStreamOpen event.
The server confirms that the audio stream is open by sending an InputAudioStreamOpen event in return. The client goes to the Listening state and starts sending binary audio packets to the socket.
The server recognises speech in the input audio and sends it back to the client in a Recognized event (so that the client can display the input transcript to the user), and the client goes to the Processing state.
The server generates a response and sends it in a Response event; the client goes to the Responding state and plays the response audio (optionally also displaying the response text).
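The client states named in this example (Open, Sleeping, Responding, Listening, Processing) can be summarised as a small transition table keyed by the server event that triggers each change. This is a sketch of the walkthrough above, not the platform's client code: `ClientState`, `TRANSITIONS` and `next_state` are hypothetical names, and ignoring unexpected events by staying in the current state is an assumption.

```python
from enum import Enum


class ClientState(Enum):
    OPEN = "Open"
    SLEEPING = "Sleeping"
    RESPONDING = "Responding"
    LISTENING = "Listening"
    PROCESSING = "Processing"


# (current state, server event name) -> next state, following the example flow.
TRANSITIONS = {
    (ClientState.OPEN, "Ready"): ClientState.SLEEPING,
    (ClientState.SLEEPING, "Response"): ClientState.RESPONDING,
    (ClientState.RESPONDING, "InputAudioStreamOpen"): ClientState.LISTENING,
    (ClientState.LISTENING, "Recognized"): ClientState.PROCESSING,
    (ClientState.PROCESSING, "Response"): ClientState.RESPONDING,
}


def next_state(state: ClientState, server_event: str) -> ClientState:
    """Return the state after `server_event`.

    Pairs not listed in TRANSITIONS leave the state unchanged (an assumption).
    """
    return TRANSITIONS.get((state, server_event), state)
```

Keeping the transitions in one table makes it easy to check a client implementation against the documented flow.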
Conversation End
When sessionEnded is set to true, the client goes to Sleeping mode and the sessionId value is discarded (e.g. set to null), so a new value will have to be created for a new conversation. If sleepTimeout is non-zero, the client also goes to Sleeping mode but keeps the sessionId until the sleep timeout expires. This makes it possible to return to the same session and have multiple conversations in the same session context.
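The sessionId rules above can be sketched as a small helper that decides when to keep and when to discard the value. Several points are assumptions: the class name `SessionKeeper` and its methods are hypothetical, sleepTimeout is treated as a number of seconds, and both sessionEnded and sleepTimeout are assumed to arrive together with a server response.

```python
import time
from typing import Callable, Optional


class SessionKeeper:
    """Tracks sessionId according to the sessionEnded / sleepTimeout rules."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self.session_id: Optional[str] = None
        self._expires_at: Optional[float] = None

    def on_response(self, session_id: str, session_ended: bool,
                    sleep_timeout: float = 0) -> None:
        """Update the stored sessionId after a server response."""
        self.session_id = session_id
        self._expires_at = None
        if session_ended:
            # Session is over: discard the ID so a new one must be created.
            self.session_id = None
        elif sleep_timeout > 0:
            # Keep the ID, but only until the sleep timeout expires
            # (timeout unit assumed to be seconds).
            self._expires_at = self._clock() + sleep_timeout

    def current(self) -> Optional[str]:
        """Return the sessionId to reuse, or None if a new one is needed."""
        if self._expires_at is not None and self._clock() >= self._expires_at:
            self.session_id = None
            self._expires_at = None
        return self.session_id
```

A client would call `current()` before building the next Request and generate a new sessionId whenever it returns None.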