Programmable Voice API

Use XML response verbs to program real-time call behavior for inbound and outbound voice workflows. This guide follows the official API model with clearer language and examples.

Official SDKs

Start with PHP, Python, or Node.js

The SDKs wrap call creation and modification, generate escaped Voice XML, provide structured errors, and include helpers for documented Gather and Dial callbacks.

Compare installation options in the Developer Hub or follow the first-call tutorial.

How It Works

Your application initiates or modifies a call through REST API endpoints and provides either a URL to an XML response document or an inline XML payload. NextGenSwitch executes each XML verb in order.

Example XML response:

<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <Say>Hello, world!</Say>
</Response>

PHP helper example:

<?php
require_once './vendor/autoload.php';
use NextGenSwitch\VoiceResponse;

$response = new VoiceResponse();
$response->say('Hello World!');

echo $response->xml();
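
For comparison, the same document can be produced without an SDK. The following is a minimal Python sketch that uses only the standard library; `build_say_response` is a hypothetical helper written for this guide and is not part of any NextGenSwitch SDK.

```python
import xml.etree.ElementTree as ET


def build_say_response(text: str) -> str:
    """Return a Voice XML document containing a single <Say> verb."""
    response = ET.Element("Response")
    say = ET.SubElement(response, "Say")
    say.text = text  # ElementTree escapes &, <, and > when serializing
    body = ET.tostring(response, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_say_response("Hello, world!"))
```

Building the tree with an XML library rather than string concatenation keeps caller-supplied text from breaking the document.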

Available XML Verbs

<Say> - Convert text to speech.
<Play> - Play audio media.
<Gather> - Collect DTMF and/or speech input.
<Dial> - Connect to another party or channel.
<Record> - Capture caller audio.
<STREAM> - Start bidirectional WebSocket audio streaming.
<Hangup> - End the active call.
<Pause> - Pause the flow for a configured duration.
<Redirect> - Load another XML instruction document.
<Bridge> - Bridge to another in-progress call.
<Leave> - Exit the queue or waiting context while keeping the call alive.

Create or Modify Calls via API

Use a REST request from your application to trigger or modify call behavior.

Create Call (POST)

Request parameters (required unless marked optional; provide either `response` or `responseXml`, not both):

Parameter	Description
`to`	Destination number for the call.
`from`	Caller number or client identifier.
`statusCallback` (optional)	Webhook URL for live states such as dialing, ringing, established, and disconnected.
`response`	URL to XML instruction document (omit if using `responseXml`).
`responseXml`	Inline XML instruction body (omit if using `response` URL).

curl --header "X-Authorization: your_authorization_code" \
  --header "X-Authorization-Secret: your_authorization_secret" \
  --request POST \
  --data 'to=23123&from=2323&statusCallback=https://example.com/call-status&response=https://example.com/call-flow.xml' \
  https://your-switch.example.com/api/v1/call
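
The same request can be assembled from application code. This is a rough Python sketch using only the standard library; `build_create_call_request` is a hypothetical helper, and the host and credentials are the placeholders from the curl example. It only builds the request so the form body and headers can be inspected before anything is sent.

```python
import urllib.parse
import urllib.request

# Placeholder endpoint taken from the curl example above.
CALL_ENDPOINT = "https://your-switch.example.com/api/v1/call"


def build_create_call_request(to, from_, response_url, status_callback=None,
                              auth_code="your_authorization_code",
                              auth_secret="your_authorization_secret"):
    """Build (but do not send) a form-encoded POST that creates a call."""
    fields = {"to": to, "from": from_, "response": response_url}
    if status_callback:
        fields["statusCallback"] = status_callback  # optional webhook
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(
        CALL_ENDPOINT,
        data=body,
        method="POST",
        headers={
            "X-Authorization": auth_code,
            "X-Authorization-Secret": auth_secret,
        },
    )


if __name__ == "__main__":
    req = build_create_call_request(
        "23123", "2323", "https://example.com/call-flow.xml",
        status_callback="https://example.com/call-status",
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out here because the host is a placeholder.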

Modify Ongoing Call (PUT)

Send a PUT request to the call's URL, with call_id in the path and responseXml in the request body, to change the active call flow.

Endpoint:

PUT https://your-switch.example.com/api/v1/call/{call_id}

Required parameters:

Parameter	Description
`call_id`	Unique ID of the active call to modify.
`responseXml`	Inline XML containing updated flow instructions.

curl --header "X-Authorization: your_authorization_code" \
  --header "X-Authorization-Secret: your_authorization_secret" \
  --request PUT \
  --data-urlencode 'responseXml=<Response><Pause length="2"/><Say loop="1">call has been modified</Say><Dial>1000</Dial></Response>' \
  https://your-switch.example.com/api/v1/call/{call_id}
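
A Python sketch of the same modification follows. It assumes the PUT body is form-encoded in the same way as the Create Call request, which this guide does not state explicitly, so confirm the expected content type against your switch. `build_modify_call_request` is a hypothetical helper and the host is a placeholder.

```python
import urllib.parse
import urllib.request

# Placeholder endpoint taken from the examples above.
CALL_ENDPOINT = "https://your-switch.example.com/api/v1/call"


def build_modify_call_request(call_id, response_xml,
                              auth_code="your_authorization_code",
                              auth_secret="your_authorization_secret"):
    """Build (but do not send) a PUT that replaces an active call's flow."""
    # The call ID goes in the URL path; quote it so odd characters stay safe.
    url = f"{CALL_ENDPOINT}/{urllib.parse.quote(call_id, safe='')}"
    # Assumption: form-encoded body, mirroring the Create Call request.
    body = urllib.parse.urlencode({"responseXml": response_xml}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "X-Authorization": auth_code,
            "X-Authorization-Secret": auth_secret,
        },
    )


if __name__ == "__main__":
    xml = '<Response><Pause length="2"/><Say>call has been modified</Say></Response>'
    req = build_modify_call_request("CALL-123", xml)
    print(req.get_method(), req.full_url)
```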

`<Say>` Verb

Converts text to speech for playback during the call.

Attribute	Description
`loop`	Number of times to speak the text. Depending on the implementation, `0` may loop continuously.

<Response>
    <Say loop="2">This message will be repeated twice.</Say>
</Response>

`<Play>` Verb

Plays an audio file from URL or supported media source.

Attribute	Description
`loop`	Number of times to play the audio. Depending on the implementation, values of `0` or less may loop for an extended period.

<Response>
    <Play loop="3">https://example.com/audio/connecting.mp3</Play>
</Response>

`<Gather>` Verb

Collects user input by DTMF, speech recognition, or both.

Attribute	Description
`action`	Webhook URL for collected input.
`method`	HTTP method for action URL (`POST`/`GET`).
`timeout`	Wait time for input in seconds.
`speechTimeout`	Speech input wait time in seconds.
`numDigits`	Expected DTMF length; `0` waits for timeout or finish key.
`finishOnKey`	DTMF key that finalizes input.
`actionOnEmptyResult`	Whether to call action URL with empty input.
`transcript`	Enable speech transcription.
`beep`	Play beep before capture starts.
`speechProfile`	Custom speech-recognition profile.
`input`	Input mode: `dtmf`, `speech`, or `dtmf speech`.

<Gather action="https://example.com/process_input" method="POST" numDigits="4" timeout="10">
    <Say>Please enter your 4-digit PIN.</Say>
</Gather>

Action callback variables

When gathering finishes, NextGenSwitch sends the result to the action URL. A POST callback uses application/x-www-form-urlencoded; a GET callback sends the same fields as query parameters.

Variable	Description
`call_id`	ID of the call executing the gather.
`digits`	DTMF digits collected before `numDigits`, `finishOnKey`, or timeout ended the gather. The finish key is not included.
`speech_result`	Transcribed speech text when speech input and transcription are enabled. Omitted when no transcription is available.
`confidence`	Speech recognition confidence returned by the configured transcription service. Omitted when unavailable.
`voice`	Path of the captured gather audio file when voice input was recorded. Omitted for DTMF-only gathers.
`rec_path`	Internal captured-audio path. Usually empty for DTMF-only gathers.
`event_from`	Caller ID of the original call.
`event_to`	Destination of the original call.
`event_call_id`	ID of the original call. This normally matches `call_id`.

Example DTMF callback:

call_id=CALL-123
digits=1234
rec_path=
event_from=1001
event_to=2001
event_call_id=CALL-123

The action endpoint should return a valid XML <Response>. NextGenSwitch parses that response and continues the call with its verbs. Return an empty <Response /> when no additional action is required.

In the current switch implementation, the action callback is sent only when at least one digit or a voice recording is available. A timeout with no input does not invoke the action URL.
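
To illustrate the callback round trip described above, here is a minimal Python sketch of an action handler for a POST callback. It parses the form-encoded body and returns Voice XML for the next step. The function name, the `EXPECTED_PIN` value, and the spoken messages are all hypothetical; wiring the function into a web framework is left out.

```python
import urllib.parse
import xml.etree.ElementTree as ET

EXPECTED_PIN = "1234"  # hypothetical value, for illustration only


def handle_gather_callback(body: str) -> str:
    """Turn a form-encoded Gather callback body into the next Voice XML."""
    # keep_blank_values preserves empty fields such as "rec_path=".
    fields = urllib.parse.parse_qs(body, keep_blank_values=True)
    digits = fields.get("digits", [""])[0]

    response = ET.Element("Response")
    say = ET.SubElement(response, "Say")
    if digits == EXPECTED_PIN:
        say.text = "Thank you. Your PIN was accepted."
    else:
        say.text = "Sorry, that PIN was not recognized."
        ET.SubElement(response, "Hangup")
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    sample = ("call_id=CALL-123&digits=1234&rec_path=&event_from=1001"
              "&event_to=2001&event_call_id=CALL-123")
    print(handle_gather_callback(sample))
```

Because a timeout with no input does not invoke the action URL, the handler does not need a separate branch for "no callback"; it only guards against a missing `digits` field.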

`<Dial>` Verb

Connects the current call to a phone number, queue, or external SIP/VoIP destination.

Attribute	Description
`to`	Target phone number, SIP endpoint, or queue name.
`action`	Webhook URL called after dial completes.
`method`	HTTP method for action webhook.
`callerId`	Caller ID for outbound leg.
`answerOnBridge`	Bridge audio immediately after answer.
`ringTone`	Enable/disable ringback tone to caller.
`timeLimit`	Maximum connected call duration in seconds.
`hangupOnStar`	Allow caller to hang up by pressing `*`.
`record`	Recording mode (`record-from-answer` / `record-from-ringing`).
`recordingStatusCallback`	Recording event webhook URL.
`statusCallback`	Call status event webhook URL.
`channel`	Advanced channel selector.
`channel_id`	Advanced numeric channel identifier.

<Response>
    <Dial to="+1234567890" answerOnBridge="true" record="record-from-answer">
        <Play>https://example.com/audio/connecting.mp3</Play>
    </Dial>
</Response>

Action callback variables

After the dialed leg finishes or the dial attempt fails, NextGenSwitch sends the result to the action URL. A POST callback is form-encoded; a GET callback uses query parameters.

Variable	Description
`call_id`	ID of the parent call that executed `<Dial>`.
`bridge_call_id`	ID of the outbound/dialed call leg. It can be empty when the leg could not be created.
`dial_status`	Numeric result: `1` means the dialed leg was established; `0` means the dial failed or was not established.
`duration`	Connected talk time in seconds. It is `0` when the call was not established.
`waiting_duration`	Seconds from starting the dial until the outbound leg was established. For an unanswered attempt, this is the total time spent waiting before completion.
`record_file`	Path of the generated recording when dial recording was enabled and started. Omitted when no recording was produced.
`event_from`	Caller ID of the parent call.
`event_to`	Destination of the parent call.
`event_call_id`	ID of the parent call. This normally matches `call_id`.

Example successful dial callback:

call_id=CALL-123
bridge_call_id=CALL-456
dial_status=1
duration=84
waiting_duration=6
record_file=records/CALL-123/dial/example.mp3
event_from=1001
event_to=2001
event_call_id=CALL-123

As with <Gather>, the dial action endpoint should return Voice XML. Use the returned fields to select the next flow, such as retrying another destination when dial_status=0 or ending the call after a successful connection.
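
The branching described above might look like the following Python sketch. It reads `dial_status` from a form-encoded callback body, ends the call after a successful connection, and otherwise retries a fallback destination. `handle_dial_callback` and `FALLBACK_DESTINATION` are hypothetical names chosen for this example.

```python
import urllib.parse
import xml.etree.ElementTree as ET

FALLBACK_DESTINATION = "1000"  # hypothetical extension, for illustration only


def handle_dial_callback(body: str) -> str:
    """Choose the next Voice XML flow from a form-encoded Dial callback."""
    fields = urllib.parse.parse_qs(body, keep_blank_values=True)
    # dial_status is "1" when the dialed leg was established, "0" otherwise.
    established = fields.get("dial_status", ["0"])[0] == "1"

    response = ET.Element("Response")
    if established:
        # The conversation already took place, so end the parent call.
        ET.SubElement(response, "Hangup")
    else:
        say = ET.SubElement(response, "Say")
        say.text = "We could not reach that party. Trying another line."
        ET.SubElement(response, "Dial", {"to": FALLBACK_DESTINATION})
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    print(handle_dial_callback("call_id=CALL-123&bridge_call_id=&dial_status=0&duration=0"))
```

A missing `dial_status` is treated as a failed dial here, which is a conservative choice for the sketch rather than documented switch behavior.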

`<Record>` Verb

Records caller audio with optional timeout, key-based stop, beep, and transcription support.

Attribute	Description
`action`	Webhook URL receiving recording metadata and file reference.
`method`	HTTP method for action URL.
`timeout`	Seconds of silence before auto-stop.
`finishOnKey`	Key to end recording (set `0` to disable key stop).
`transcribe`	Enable transcription output.
`trim`	Trim leading/trailing silence.
`beep`	Play beep before recording starts.

<Response>
    <Record
        action="https://example.com/handle_recording"
        method="POST"
        timeout="5"
        finishOnKey="#"
        beep="true" />
</Response>

`<STREAM>` Verb

Creates a live, bidirectional audio stream over WebSocket for AI agents or custom real-time services.

<?xml version="1.0"?>
<response>
  <connect>
    <stream name="stream" url="ws://127.0.0.1:8766/ws">
      <parameter name="param1" value="param1_value" />
      <parameter name="param2" value="param2_value" />
    </stream>
  </connect>
</response>
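
When the stream URL or parameters vary per call, the document can be generated rather than hand-written. The Python sketch below mirrors the element and attribute names of the example above exactly, including their lowercase spelling; `build_stream_response` is a hypothetical helper, not an SDK function.

```python
import xml.etree.ElementTree as ET


def build_stream_response(url, name="stream", parameters=None):
    """Build a stream document shaped like the example above."""
    response = ET.Element("response")
    connect = ET.SubElement(response, "connect")
    stream = ET.SubElement(connect, "stream", {"name": name, "url": url})
    # Each custom parameter becomes a <parameter name="..." value="..." /> child.
    for key, value in (parameters or {}).items():
        ET.SubElement(stream, "parameter", {"name": key, "value": str(value)})
    return '<?xml version="1.0"?>\n' + ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    print(build_stream_response(
        "ws://127.0.0.1:8766/ws",
        parameters={"param1": "param1_value", "param2": "param2_value"},
    ))
```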

`<Hangup>` Verb

Immediately disconnects the active call.

<Response>
    <Say>We are now going to hang up.</Say>
    <Hangup />
</Response>

`<Pause>` Verb

Temporarily pauses call execution for the defined duration.

<Response>
    <Pause length="3" />
    <Say>Your call is very important to us.</Say>
</Response>

`<Redirect>` Verb

Stops current execution and loads a new XML instruction document from the specified URL.

<Response>
    <Say>You will now be redirected to a different set of instructions.</Say>
    <Redirect method="GET">https://example.com/new_instructions.xml</Redirect>
</Response>

`<Bridge>` Verb

Bridges the active call with another in-progress call using a bridge call ID.

Attribute	Description
`bridgeAfterEstablish`	Boolean flag controlling bridge timing: `false` attempts immediate bridge, `true` waits until secondary call is fully established.

<Response>
    <Say>We are bridging your call now.</Say>
    <Bridge bridgeAfterEstablish="true">ABC123</Bridge>
</Response>

`<Leave>` Verb

Removes the caller from queue/conference context while continuing the call flow.

<Response>
    <Say>You will now leave the queue, but remain on the call.</Say>
    <Leave />
    <Say>Welcome back! You have left the queue.</Say>
</Response>
