uChat Voice Messages Architecture — Developer Docs

The Voice Messages feature extends the core uChat — E2EE Messenger to support native audio recording and playback. To preserve the zero-knowledge security guarantees of the platform, voice messages are treated as standard file attachments encrypted client-side using the Crypto — Encryption Microservice patterns.

Data Model

In the uchat_messages table, voice messages are stored with a message_type of voice. The frontend playback widget needs the total duration of the track before the audio buffer has been fetched and decrypted, so the duration is stored separately in a voice_duration_ms column. The API exposes these fields on the Message type:

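type Message {
  id: ID!
  msgType: MessageType!   # VOICE for voice messages
  voiceDurationMs: Int    # total duration in milliseconds
  fileUrl: String         # CDN location of the encrypted audio
  fileMime: String        # audio/webm
  encryptedBody: String!  # wrapped AES key (see Key Wrapping)
}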
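
On the client, the schema above might be mirrored by a TypeScript type and a small type guard. This is only a sketch: the VoiceMessage type and the isVoiceMessage helper are hypothetical names, and msgType is typed as a plain string because the full MessageType enum is not listed here.

```typescript
// Client-side mirror of the GraphQL Message type (illustrative only).
interface Message {
  id: string;
  msgType: string; // full MessageType enum is not shown in this doc; "VOICE" marks voice messages
  voiceDurationMs?: number | null;
  fileUrl?: string | null;
  fileMime?: string | null;
  encryptedBody: string;
}

// Hypothetical narrowed type: a voice message with the fields the widget relies on.
type VoiceMessage = Message & {
  msgType: "VOICE";
  voiceDurationMs: number;
  fileUrl: string;
};

// Returns true only when the message is a voice message with a usable duration and URL.
function isVoiceMessage(m: Message): m is VoiceMessage {
  return (
    m.msgType === "VOICE" &&
    typeof m.voiceDurationMs === "number" &&
    m.voiceDurationMs >= 0 &&
    typeof m.fileUrl === "string"
  );
}
```

Checking the duration and URL in the guard, rather than msgType alone, keeps a malformed voice message from reaching the playback widget.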

Audio is recorded with the MediaRecorder API as Opus in a WebM container (audio/webm;codecs=opus).
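
Support for a given container and codec can vary between browsers, so a client may want to check it before constructing the recorder. A minimal sketch, assuming a hypothetical pickRecorderMimeType helper and a fallback list that is not part of this doc; the support check is injected so the function also runs outside a browser, where it would normally be MediaRecorder.isTypeSupported.

```typescript
// Preferred format from this doc first, then a looser WebM fallback (the fallback is an assumption).
const CANDIDATE_MIME_TYPES: readonly string[] = ["audio/webm;codecs=opus", "audio/webm"];

// `isSupported` is injected; in a browser pass (t) => MediaRecorder.isTypeSupported(t).
function pickRecorderMimeType(
  isSupported: (mimeType: string) => boolean,
  candidates: readonly string[] = CANDIDATE_MIME_TYPES,
): string | null {
  for (const type of candidates) {
    if (isSupported(type)) return type;
  }
  return null; // caller decides how to surface "recording not supported"
}
```

In a browser the result would then be passed on as new MediaRecorder(stream, { mimeType }).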

End-to-End Encryption Flow

The recording and encryption flow strictly adheres to the platform's E2EE standards:

Recording: The uChat Client SDK & Interface captures audio chunks via the browser's native MediaRecorder.
Encryption: When recording stops, the recorded Blob is converted to an ArrayBuffer and encrypted with AES-256-GCM.
Upload: The resulting ciphertext is uploaded directly to the platform's CDN.
Key Wrapping: The symmetric AES key used to encrypt the audio blob is itself encrypted using the recipient's Megolm session keys, producing the encryptedBody.
Mutation: The client sends the message to the backend with msgType: VOICE and the recording's exact duration in voiceDurationMs.
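
The Encryption step can be sketched with the Web Crypto API. This is an illustration under assumptions, not the SDK's actual code: encryptVoiceBuffer and decryptVoiceBuffer are hypothetical names, a fresh key and 12-byte IV per recording is assumed, and the Megolm wrapping of the exported key is left out.

```typescript
interface EncryptedVoice {
  ciphertext: ArrayBuffer; // what would be uploaded to the CDN
  iv: ArrayBuffer;         // 96-bit nonce; must never repeat for the same key
  rawKey: ArrayBuffer;     // 32-byte AES key, to be wrapped before it leaves the client
}

// Encrypt the recorded audio with a freshly generated AES-256-GCM key.
async function encryptVoiceBuffer(audio: ArrayBuffer): Promise<EncryptedVoice> {
  const key = await crypto.subtle.generateKey(
    { name: "AES-GCM", length: 256 },
    true, // extractable, so the raw key can be exported and wrapped
    ["encrypt", "decrypt"],
  );
  const iv = crypto.getRandomValues(new Uint8Array(12));
  const ciphertext = await crypto.subtle.encrypt({ name: "AES-GCM", iv }, key, audio);
  const rawKey = await crypto.subtle.exportKey("raw", key);
  return { ciphertext, iv: iv.buffer, rawKey };
}

// Inverse operation, as the receiving client would run it after unwrapping the key.
async function decryptVoiceBuffer(enc: EncryptedVoice): Promise<ArrayBuffer> {
  const key = await crypto.subtle.importKey("raw", enc.rawKey, { name: "AES-GCM" }, false, ["decrypt"]);
  // Throws if the ciphertext or its authentication tag has been tampered with.
  return crypto.subtle.decrypt({ name: "AES-GCM", iv: enc.iv }, key, enc.ciphertext);
}
```

Only the ciphertext would go to the CDN. How the IV travels is not specified here, so the sketch returns it separately and leaves that decision to the caller.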

Playback Widget

When the frontend detects msgType === 'VOICE', it skips the generic file attachment card and mounts a dedicated audio playback widget.

The voiceDurationMs field allows the widget to render the timeline scale immediately.
Once the user clicks play, the audio blob is fetched from the CDN, decrypted in memory, and passed to a native HTML <audio> element via URL.createObjectURL(decryptedBlob).
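
Two small pieces of this widget logic can be sketched as follows. The helper names (formatDuration, toObjectUrl) are hypothetical: formatDuration turns voiceDurationMs into the label shown before any audio is loaded, and toObjectUrl wraps already-decrypted bytes for an <audio> element.

```typescript
// Render voiceDurationMs as m:ss for the timeline label, before the audio is fetched.
function formatDuration(ms: number): string {
  const totalSeconds = Math.max(0, Math.round(ms / 1000));
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  return `${minutes}:${String(seconds).padStart(2, "0")}`;
}

// Wrap decrypted audio bytes in a Blob URL that an <audio> element can play.
// The caller should call URL.revokeObjectURL(url) once the widget no longer needs it.
function toObjectUrl(decrypted: ArrayBuffer, mime: string = "audio/webm"): string {
  const blob = new Blob([decrypted], { type: mime });
  return URL.createObjectURL(blob);
}

// Example: formatDuration(83000) returns "1:23".
```

Revoking the object URL when the widget unmounts releases the decrypted audio held in memory instead of keeping it alive for the lifetime of the page.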

Related

uChat — E2EE Messenger: Core messaging architecture and Megolm ratchet implementation.
uChat Client SDK & Interface: Frontend component structure and state management.
Crypto — Encryption Microservice: Underlying primitives used to secure the voice buffers.
