Recipe F — Voice-cloned audiobook chapter

Outcome: A chapter-XX.mp3 file for each chapter, narrated in a voice cloned from a short reference audio clip. The script loops over every chapter file to produce a full audiobook.

Approx cost: $0.03 per ~500-word chapter (varies with length).

Prerequisites

  • A 30-60 second reference audio clip hosted at a public URL (REF_AUDIO_URL).
  • Chapter text files at chapter-01.txt, chapter-02.txt, etc.
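
Both prerequisites can be checked before spending money on a paid call. The following is a minimal preflight sketch, not part of visa-cli: the function names check_ref_url and check_chapters are hypothetical, and the URL check assumes the host answers HTTP HEAD requests (some hosts do not, in which case a failure here does not necessarily mean the clip is unreachable).

```shell
#!/bin/bash
# Preflight sketch. Helper names are illustrative, not part of visa-cli.

# Succeeds if a HEAD request to the URL (following redirects) ends
# without an HTTP error status. Assumes the host supports HEAD.
check_ref_url() {
  curl -sfIL --max-time 15 "$1" >/dev/null
}

# Succeeds only if the directory holds at least one chapter-*.txt file
# and none of them is empty. Empty files are reported on stderr.
check_chapters() {
  local dir="${1:-.}" found=0 bad=0 f
  for f in "$dir"/chapter-*.txt; do
    [ -e "$f" ] || continue          # the glob matched nothing
    found=1
    if [ ! -s "$f" ]; then
      echo "empty chapter file: $f" >&2
      bad=1
    fi
  done
  [ "$found" -eq 1 ] && [ "$bad" -eq 0 ]
}
```

A possible invocation before running the main script: check_ref_url "$REF_AUDIO_URL" && check_chapters . || echo "preflight failed" >&2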

Script

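#!/bin/bash
set -euo pipefail
shopt -s nullglob   # an unmatched chapter-*.txt expands to nothing, not to itself

REF_AUDIO_URL="${REF_AUDIO_URL:?must set REF_AUDIO_URL to a publicly reachable mp3/wav}"

chapters=(chapter-*.txt)
(( ${#chapters[@]} )) || { echo "No chapter-*.txt files found" >&2; exit 1; }

for chapter_file in "${chapters[@]}"; do
  out="${chapter_file%.txt}.mp3"
  echo "Narrating $chapter_file → $out"

  # Read the chapter text on stdin; take the first URL, else the file path.
  # jq -e exits non-zero if neither field is present, which stops the script.
  AUDIO_SRC=$(visa-cli generate speech --json --yes - \
        --ref-audio-url "$REF_AUDIO_URL" < "$chapter_file" \
    | jq -er '.urls[0] // .filePath')

  case "$AUDIO_SRC" in
    http://*|https://*) curl -fsSL "$AUDIO_SRC" -o "$out" ;;  # -f: fail on HTTP errors
    *)                  cp "$AUDIO_SRC" "$out" ;;             # .filePath fallback: local file
  esac
done

echo "Audiobook chapters ready:"
ls -1 chapter-*.mp3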

Cost breakdown

Step                             Tool            Approx
Speech (per ~500-word chapter)   fal-metavoice   $0.03
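
To get a rough total before running the recipe, the per-chapter figure can be scaled by word count. This sketch assumes cost grows linearly with length at $0.03 per 500 words; that linear model is an assumption made for illustration, not documented pricing, and the function name estimate_cost is hypothetical.

```shell
#!/bin/bash
# Rough cost estimate. ASSUMPTION: price scales linearly with word count
# at $0.03 per 500 words; actual billing may follow a different model.
estimate_cost() {
  local dir="${1:-.}" f words
  for f in "$dir"/chapter-*.txt; do
    [ -e "$f" ] || continue                 # the glob matched nothing
    words=$(( $(wc -w < "$f") ))            # arithmetic strips wc's padding
    echo "$(basename "$f") $words"
  done | awk '{
      cost = $2 / 500 * 0.03
      total += cost
      printf "%s\t%d words\t$%.3f\n", $1, $2, cost
    }
    END { printf "total\t$%.3f\n", total + 0 }'
}
```

Running estimate_cost . in the directory that holds the chapter files prints one line per chapter followed by a total line.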

Notes

  • generate speech clones the voice from --ref-audio-url; a clean 30-60s sample gives the best fidelity.
  • Each chapter is a separate paid call, so the total cost scales with the number of chapter-*.txt files.
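
If a single audiobook file is wanted rather than one MP3 per chapter, the outputs can be joined afterwards. The sketch below uses ffmpeg's concat demuxer, which is not part of this recipe. It assumes ffmpeg is installed, that all chapter files share the same codec and sample rate, that chapter numbers are zero-padded so the glob sorts them correctly, and that file names contain no single quotes. The function names and the default output name audiobook.mp3 are hypothetical.

```shell
#!/bin/bash
# Sketch: join chapter MP3s into one file with ffmpeg's concat demuxer.
# ASSUMPTIONS: ffmpeg is installed; chapters share codec and sample rate.

# Writes an ffmpeg concat list for chapter-*.mp3 in directory $1 to file $2.
# Fails if no chapter files are found.
write_concat_list() {
  local dir list f
  dir=$(cd "${1:-.}" && pwd)          # absolute paths work from any directory
  list="$2"
  : > "$list"
  for f in "$dir"/chapter-*.mp3; do   # glob order is lexical: zero-pad numbers
    [ -e "$f" ] || continue
    printf "file '%s'\n" "$f" >> "$list"
  done
  [ -s "$list" ]
}

# Concatenates the chapters in directory $1 into $2 without re-encoding.
bundle_audiobook() {
  local list status=0
  list=$(mktemp)
  if write_concat_list "${1:-.}" "$list"; then
    ffmpeg -f concat -safe 0 -i "$list" -c copy "${2:-audiobook.mp3}" || status=$?
  else
    status=1
  fi
  rm -f "$list"
  return "$status"
}
```

For example, bundle_audiobook . audiobook.mp3 joins every chapter-*.mp3 in the current directory. Stream copy (-c copy) avoids a second lossy encode, at the price of requiring matching audio parameters across chapters.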