POST https://api.mor.org/api/v1/audio/transcriptions
Create Audio Transcription
# Upload a local audio file (multipart/form-data):
curl --request POST \
  --url https://api.mor.org/api/v1/audio/transcriptions \
  --header 'Authorization: Bearer sk-xxxxxx' \
  --form 'file=@audio.mp3' \
  --form 'model=<string>' \
  --form 'language=<string>' \
  --form 'response_format=json'

# Or pass a pre-signed S3 URL instead of a file (use one or the other, not both):
#   --form 's3_presigned_url=<string>'
{
  "text": "<string>",
  "segments": [
    {
      "id": 123,
      "start": 123,
      "end": 123,
      "text": "<string>"
    }
  ],
  "words": [
    {
      "word": "<string>",
      "start": 123,
      "end": 123
    }
  ]
}
Transcribe audio file to text. This endpoint transcribes audio files using the Morpheus Network providers. It automatically manages sessions and routes requests to the appropriate transcription model. Supports both file upload and S3 pre-signed URLs. Returns JSON or plain text responses based on response_format parameter.
Playground limitation: The interactive API playground does not correctly handle file uploads. Use the cURL examples below or an SDK instead.

Headers

Authorization
string
required
API key in format: Bearer sk-xxxxxx

Body (multipart/form-data)

file
file
Audio file to transcribe. Supported formats include: mp3, mp4, mpeg, mpga, m4a, wav, webm.
Either file or s3_presigned_url must be provided, but not both.
s3_presigned_url
string
Pre-signed S3 URL as alternative to file upload. Useful for large files or when files are already stored in S3.
Use S3 pre-signed URLs for files larger than 25MB or when you want to avoid uploading files directly.
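The either/or rule above (exactly one of file or s3_presigned_url) is easy to enforce client-side before making a request. A minimal sketch of such a validation helper, assuming the field names from the body table; this is an illustrative function, not part of any official SDK:

```python
def build_transcription_fields(file_path=None, s3_presigned_url=None,
                               model=None, response_format="json"):
    """Assemble form fields for /api/v1/audio/transcriptions.

    Enforces the endpoint's rule that exactly one of file_path or
    s3_presigned_url is provided. Illustrative helper only.
    """
    if (file_path is None) == (s3_presigned_url is None):
        raise ValueError("Provide exactly one of file_path or s3_presigned_url")
    fields = {"response_format": response_format}
    if model:
        fields["model"] = model
    if file_path:
        fields["file"] = file_path  # sent as a multipart file part
    if s3_presigned_url:
        fields["s3_presigned_url"] = s3_presigned_url
    return fields
```

Failing fast here gives a clearer error than a round trip to the API when both (or neither) source is supplied.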
model
string
Model ID to use for transcription (blockchain hex address or name).
Use the List Models endpoint to see available transcription models.
language
string
Language code (e.g., en, es, fr) to help improve transcription accuracy. If not specified, the model will attempt to detect the language automatically.
prompt
string
Optional text to guide the model’s transcription. Useful for proper nouns, technical terms, or specific vocabulary that may appear in the audio.
response_format
string
default:"json"
Format for the transcription response. Options:
  • json - JSON object with text and metadata
  • text - Plain text only
  • srt - SubRip subtitle format
  • verbose_json - Detailed JSON with word-level timestamps
  • vtt - WebVTT subtitle format
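To illustrate the shape of the srt option, here is a sketch that renders verbose_json-style segments as SubRip text locally. The API returns SRT directly when response_format is srt; this helper only shows what that output looks like:

```python
def segments_to_srt(segments):
    """Render a list of {"start", "end", "text"} segments as SRT.

    Illustrative only: request response_format="srt" to get this
    directly from the API.
    """
    def ts(seconds):
        # SRT timestamps use HH:MM:SS,mmm
        ms = int(round(seconds * 1000))
        h, rem = divmod(ms, 3_600_000)
        m, rem = divmod(rem, 60_000)
        s, ms = divmod(rem, 1000)
        return f"{h:02}:{m:02}:{s:02},{ms:03}"

    blocks = []
    for i, seg in enumerate(segments, start=1):
        blocks.append(f"{i}\n{ts(seg['start'])} --> {ts(seg['end'])}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)
```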
temperature
number
Sampling temperature between 0.0 and 1.0. Higher values make the output more random, lower values make it more deterministic.
timestamp_granularities
string
Comma-separated list of timestamp granularities. Options: word, segment. Only applicable when response_format is verbose_json.
enable_diarization
boolean
default:"false"
Enable speaker diarization to identify different speakers in the audio. Requires models that support this feature.
output_content
string
Output content type specification for advanced use cases.
session_id
string
Optional session ID to use for this request. If not provided, the system will automatically create or use the session associated with the API key.

Response

The response format depends on the response_format parameter:
text
string
Transcribed text (when response_format is json or text)
segments
array
Array of transcription segments with timestamps (when response_format is verbose_json)
words
array
Array of word-level timestamps (when timestamp_granularities includes word)
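Working with a verbose_json response usually means mapping times to text. A sketch of two small helpers, assuming only the response shape shown above (top-level segments and words lists with numeric start/end in seconds):

```python
def segment_at(response, t):
    """Return the segment text active at time t (seconds), or None.

    Illustrative helper over the verbose_json shape documented above.
    """
    for seg in response.get("segments", []):
        if seg["start"] <= t < seg["end"]:
            return seg["text"]
    return None


def word_timeline(response):
    """Flatten word-level timestamps into (word, start, end) tuples."""
    return [(w["word"], w["start"], w["end"]) for w in response.get("words", [])]
```

Helpers like these are the building blocks for interactive transcripts: seek the player, call segment_at with the current time, and highlight the matching text.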

Example Request

import openai

client = openai.OpenAI(
    api_key="sk-xxxxxx",
    base_url="https://api.mor.org/api/v1"
)

# Upload audio file
with open("audio.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-model",
        file=audio_file,
        response_format="verbose_json",
        timestamp_granularities=["word", "segment"]
    )

print(transcript.text)
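The SDK example above covers direct file upload. For the s3_presigned_url variant you can build the request yourself; a sketch using the standard library, constructing the request without sending it so the payload can be inspected. The field names come from the body table above; whether the server also accepts URL-encoded form bodies (in addition to multipart, as in the cURL example) is an assumption:

```python
from urllib.parse import urlencode
from urllib.request import Request

# Sketch of the S3 pre-signed URL variant. Substitute a real
# pre-signed URL and API key, then pass req to urllib.request.urlopen.
fields = {
    "s3_presigned_url": "https://example-bucket.s3.amazonaws.com/audio.mp3",
    "model": "whisper-model",
    "response_format": "json",
}
req = Request(
    "https://api.mor.org/api/v1/audio/transcriptions",
    data=urlencode(fields).encode(),
    headers={"Authorization": "Bearer sk-xxxxxx"},
    method="POST",
)
```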

Use Cases

Meeting Notes

Automatically transcribe meetings and generate searchable text records

Content Accessibility

Create captions and subtitles for video content

Voice Commands

Convert voice commands to text for voice-controlled applications

Podcast Transcription

Generate searchable transcripts for podcast episodes
Use verbose_json with timestamp_granularities to get word-level timestamps, which are useful for creating interactive transcripts or synchronizing with video.
Large audio files may take longer to process. For files over 25MB, consider using S3 pre-signed URLs instead of direct file uploads.
Speaker diarization can help identify different speakers in multi-person conversations, but requires models that support this feature.