Use case: Voice call broadcast with prerecorded and text-to-speech (TTS) messages

May 1, 2018 5 min
Aivis Olsteins


In this article we will discuss setting up a voice telephony broadcast which does not require knowledge of telephony protocols or installation of any telephony or Voice over IP equipment.

Let's start with a simple case: you want the system to call a user whose SIP address is 123456@192.168.1.1 and play a prerecorded WAV message stored at the following address: http://server.domain.com/voices/hello.wav.

With our API the request is as simple as this:

POST /voice/call/play

{
  "to": "sip:123456@192.168.1.1:5060",
  "from": "me",
  "playlist":
    [
      {
        "play": "http://server.domain.com/voices/hello.wav"
      }
    ]
}

The response will look like this:

{
  "code": 0,
  "status": 200,
  "data": "accepted",
  "request_id": "9976f46a-243f-4fc6-a441-ebfda38cfad4"
}
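
From the client's side, the exchange above could be sketched in Python roughly as follows. This is only an illustration under assumptions: the base URL `https://api.example.com`, the `application/json` content type and the helper names `build_play_request` and `send` are placeholders that do not come from the API itself, and authentication is left out because it is not described here.

```python
import json
import urllib.request

# Hypothetical base URL; replace it with the address of your API endpoint.
API_BASE = "https://api.example.com"


def build_play_request(to, caller, urls):
    """Build (but do not send) a POST /voice/call/play request."""
    payload = {
        "to": to,
        "from": caller,
        # One playlist entry per file, in the order given.
        "playlist": [{"play": url} for url in urls],
    }
    return urllib.request.Request(
        API_BASE + "/voice/call/play",
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: the API expects a JSON body with this content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a prepared request and return the decoded JSON reply."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


req = build_play_request(
    "sip:123456@192.168.1.1:5060",
    "me",
    ["http://server.domain.com/voices/hello.wav"],
)
# reply = send(req)                  # performs the actual HTTP call
# request_id = reply["request_id"]   # keep this ID for the status check
```

Keeping request construction separate from sending makes it easy to inspect or log the payload before any call is placed.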

Here is what happens:

  1. The request is accepted and assigned an ID, which is returned to your application.
  2. The system places a SIP call to 123456@192.168.1.1.
  3. When the other side picks up the phone, the system plays the voice file from the URL http://server.domain.com/voices/hello.wav.
  4. After the file has played in full, the system hangs up.

Now, how do you know whether the call was successful? When the API returned the request ID, the call had not yet been established, so its outcome was still unknown. To find out what happened to the call, send another request that includes the request ID:

GET /voice/call/status/9976f46a-243f-4fc6-a441-ebfda38cfad4


Note that this is a GET request and therefore carries no data payload. The request ID received in the previous step goes in the URL.
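
As a small sketch, the status request could be assembled in Python like this. The base URL `https://api.example.com` and the helper name `build_status_request` are assumptions made for illustration only.

```python
import urllib.parse
import urllib.request

# Hypothetical base URL; replace it with the address of your API endpoint.
API_BASE = "https://api.example.com"


def build_status_request(request_id):
    """Build (but do not send) a GET /voice/call/status/<request_id> request."""
    # Escape the ID so that it is always a single, safe path segment.
    path = "/voice/call/status/" + urllib.parse.quote(request_id, safe="")
    # No data argument: a GET request has no body.
    return urllib.request.Request(API_BASE + path, method="GET")


status_req = build_status_request("9976f46a-243f-4fc6-a441-ebfda38cfad4")
# with urllib.request.urlopen(status_req) as response:   # actual HTTP call
#     reply = response.read()
```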

Depending on the outcome of the call, the response will be something like this:


{
  "code": 0,
  "status": 200,
  "data": "accepted",
  "request_id": "9976f46a-243f-4fc6-a441-ebfda38cfad4",
  "record":
    {
      "type": "voice",
      "result": "completed",
      "request_time": "2018-05-01 13:35:32",
      "to": "sip:123456@192.168.1.1:5060",
      "from": "me",
      "setup_time": "2018-05-01 13:35:32",
      "connect_time": "2018-05-01 13:35:45",
      "disconnect_time": "2018-05-01 13:36:02",
      "disconnect_cause_code": "200",
      "disconnect_cause_text": "OK",
      "duration": 17,
      "answered": true
    }
}
 

The response above tells us that the call was successful and lasted 17 seconds, i.e. the duration of the recording. Note that if you send the status request early, while the call is not yet established or is still in progress, you will get different output, and some fields (such as the disconnect time, cause and duration) will be absent. Likewise, if the user does not pick up the phone, the set of response attributes will differ (for example, the connect time will not be present).
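
Because fields appear only as the call progresses, a client usually has to poll. The sketch below shows one possible approach in Python. It assumes that the presence of "disconnect_time" in the record marks a finished call, which is an inference from the description above rather than documented behaviour. The names `wait_for_record` and `fetch_status` are hypothetical; `fetch_status` stands for any function that performs the GET request and returns the decoded JSON reply.

```python
import time


def wait_for_record(fetch_status, request_id, interval=2.0, attempts=30):
    """Poll the call status until the call has ended.

    Returns the final "record" object, or None if the call has not
    finished after the given number of attempts.
    """
    for _ in range(attempts):
        reply = fetch_status(request_id)
        # The record, or some of its fields, may be missing early on.
        record = reply.get("record", {})
        # Assumption: "disconnect_time" appears once the call is over.
        if "disconnect_time" in record:
            return record
        time.sleep(interval)
    return None


def was_answered(record):
    """True if the record says the called party picked up."""
    # Fields can be absent, so fall back to False instead of raising.
    return bool(record.get("answered", False))
```

Reading every field with `.get()` keeps the client from failing on replies for calls that are unanswered or still in progress.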

Now, let's look at a case where you want the call to play several files one after another. Here is an example request:


POST /voice/call/play

{
  "to": "sip:123456@192.168.1.1:5060",
  "from": "me",
  "playlist":
    [
      {
        "play": "http://server.domain.com/voices/hello.wav"
      },
      {
        "play": "http://server.domain.com/voices/main-message.wav"
      },
      {
        "play": "http://server.domain.com/voices/goodbye-thanks.wav"
      }
    ]
}

The difference from the previous request is that the “playlist” contains several entries instead of one. They are played one by one, in the order in which they appear in the request, i.e. “hello.wav”, “main-message.wav” and “goodbye-thanks.wav”.

Now, let's say you need to include some dynamic text in the message, for which you do not have a prerecorded file. In this case, you can include a text-to-speech item in the “playlist”, like this:

POST /voice/call/play

{
  "to": "sip:123456@192.168.1.1:5060",
  "from": "me",
  "playlist":
    [
      {
        "play": "http://server.domain.com/voices/hello.wav",
        "type": "remote"
      },
      {
        "play": "http://server.domain.com/voices/main-message.wav",
        "type": "remote"
      },
      {
        "play": "Punctuality is the virtue of the bored",
        "type": "tts",
        "options":
          {
            "language": "en-US",
            "gender": "female"
          }
      },
      {
        "play": "http://server.domain.com/voices/goodbye-thanks.wav",
        "type": "remote"
      }
    ]
}

As you can see from the example above, we have added a “type” attribute to the playlist items. It distinguishes between files stored on remote servers (hence “remote”) and text that needs to be converted to speech (“tts”). A TTS item can also carry options, such as the language and gender of the speaker.
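
To illustrate how dynamic text might be mixed with prerecorded files, here is a minimal Python sketch that builds such a playlist. The helper names `remote`, `tts` and `personalised_playlist`, as well as the greeting text, are made up for this example. The attribute names and values are the ones shown in the request above.

```python
import json


def remote(url):
    """Playlist item for a file fetched from a remote server."""
    return {"play": url, "type": "remote"}


def tts(text, language="en-US", gender="female"):
    """Playlist item for text that is converted to speech."""
    return {
        "play": text,
        "type": "tts",
        "options": {"language": language, "gender": gender},
    }


def personalised_playlist(name):
    """Prerecorded greeting, a spoken dynamic sentence, then a goodbye."""
    return [
        remote("http://server.domain.com/voices/hello.wav"),
        tts("This message is for " + name),
        remote("http://server.domain.com/voices/goodbye-thanks.wav"),
    ]


body = {
    "to": "sip:123456@192.168.1.1:5060",
    "from": "me",
    "playlist": personalised_playlist("Alice"),
}
print(json.dumps(body, indent=2))
```

Only the TTS item changes from call to call, so the prerecorded parts can be reused for every recipient.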

The API can also serve voice files from local storage. This allows faster call establishment, lower bandwidth use and higher availability. To use it, first upload the required file via an API call:


POST /storage/media

Content-Type: audio/wav
Content-Length: 2845

raw audio file content


The server will respond with something like this:

{
  "code": 0,
  "status": 200,
  "data": "accepted",
  "request_id": "cc078956-f4a6-4494-b418-f0facc2f1203",
  "file":
    {
      "id": "9ba994ea-a0b5-47cb-af69-fb0a235d7b19",
      "size": 2845,
      "type": "audio/wav"
    }
}
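
The upload step could be sketched in Python as follows. The base URL `https://api.example.com` and the helper names `build_upload_request` and `stored_file_id` are assumptions for illustration. The request body is the raw audio data, and `urllib` fills in the Content-Length header by itself when the request is sent.

```python
import urllib.request

# Hypothetical base URL; replace it with the address of your API endpoint.
API_BASE = "https://api.example.com"


def build_upload_request(audio_bytes, content_type="audio/wav"):
    """Build (but do not send) a POST /storage/media request."""
    return urllib.request.Request(
        API_BASE + "/storage/media",
        data=audio_bytes,  # raw file content, not JSON
        headers={"Content-Type": content_type},
        method="POST",
    )


def stored_file_id(reply):
    """Return the ID of the stored file from a decoded upload reply."""
    # The file ID lives in the "file" object; "request_id" only
    # identifies the upload request itself.
    return reply["file"]["id"]


# Usage with a file on disk ("hello.wav" is just an example name):
# with open("hello.wav", "rb") as f:
#     upload_req = build_upload_request(f.read())
```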

Once all the necessary files are uploaded, we can send our call request like this:


POST /voice/call/play

{
  "to": "sip:123456@192.168.1.1:5060",
  "from": "me",
  "playlist":
    [
      {
        "play": "9ba994ea-a0b5-47cb-af69-fb0a235d7b19",
        "type": "local"
      },
      {
        "play": "http://server.domain.com/voices/goodbye-thanks.wav",
        "type": "remote"
      }
    ]
}

Note that we use the ID from the “file” object, not the “request_id” of the upload response.
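
When a playlist mixes uploaded and remote files, a client may want to choose the “type” value automatically. The following Python sketch does that under one assumption of its own: anything that parses as an http or https URL is treated as “remote”, and everything else is taken to be the ID of an uploaded file. The helper name `file_item` is hypothetical.

```python
from urllib.parse import urlparse


def file_item(source):
    """Build a playlist item for a stored file ID or a remote URL."""
    # Assumption of this sketch: only http(s) URLs are remote files.
    if urlparse(source).scheme in ("http", "https"):
        return {"play": source, "type": "remote"}
    return {"play": source, "type": "local"}


playlist = [
    file_item("9ba994ea-a0b5-47cb-af69-fb0a235d7b19"),
    file_item("http://server.domain.com/voices/goodbye-thanks.wav"),
]
```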

In the next article we will see how to send calls to multiple users at the same time and how to use more advanced call routing features.

