Outcome: After N rounds, a final image that's been iteratively refined based on per-round critiques from a vision-capable LLM. Useful when one-shot prompts don't capture intent.
Approx cost: ~$0.05 × N (per round: image + LLM critique).
#!/bin/bash
set -euo pipefail

PROMPT="${1:-a cyberpunk cat reading a newspaper}"
N="${2:-3}"

# Round 0: initial generation. Keep the whole JSON envelope so the
# critique step can read the image URL from it via --from-stdin.
CURRENT_JSON=$(visa-cli generate image "$PROMPT" --json --yes)
CURRENT_IMG=$(jq -r '.urls[0] // .filePath' <<<"$CURRENT_JSON")
echo "Round 0: $CURRENT_IMG"

for i in $(seq 1 "$N"); do
  # Vision LLM critique: pipe the prior round's envelope (not a fresh
  # generation) so --from-stdin binds its first URL into --image-url
  CRITIQUE=$(printf '%s\n' "$CURRENT_JSON" \
    | visa-cli run-llm --model or-gemini-nano-banana-pro --json --yes \
        --image-url --from-stdin '.urls[0]' \
        "Critique this image for: '$PROMPT'. Be specific and concise. One short paragraph." \
    | jq -r '.text')

  # Re-prompt with the critique and keep the new envelope for the next round
  CURRENT_JSON=$(printf '%s\n' "$PROMPT. Refinement note: $CRITIQUE" \
    | visa-cli generate image - --json --yes)
  CURRENT_IMG=$(jq -r '.urls[0] // .filePath' <<<"$CURRENT_JSON")
  echo "Round $i: $CURRENT_IMG"
  echo "  critique: $CRITIQUE"
done

echo "Final image: $CURRENT_IMG"
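The script above uses the jq filter `.urls[0] // .filePath` to read the image reference from the JSON envelope. As a minimal sketch of what that filter does, the helper below runs it against hand-written sample envelopes. The helper name `extract_image_ref` and the sample values are illustrative assumptions; only the field names `urls` and `filePath` come from the script.

```shell
#!/bin/bash
# Hypothetical helper: print the first URL from an envelope, or fall back
# to .filePath when .urls[0] is missing or null (jq's // alternative operator).
extract_image_ref() {
  printf '%s\n' "$1" | jq -r '.urls[0] // .filePath'
}

# Sample envelopes (made up for illustration, not real visa-cli output)
extract_image_ref '{"urls":["https://example.com/a.png"]}'
extract_image_ref '{"urls":[],"filePath":"/tmp/a.png"}'
```

The first call prints the URL. The second falls back to the file path because `.urls[0]` evaluates to null on an empty array.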
| Step | Tool | Approx. cost |
|---|---|---|
| Image generation | fal-flux-pro | $0.04 |
| Vision critique | or-gemini-nano-banana-pro | $0.01 |
| Per round | | ~$0.05 |
3-round loop = ~$0.15. 10-round loop = ~$0.50.
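The loop totals above are the per-round figure multiplied by the round count. A small sketch of that arithmetic, done in integer cents to avoid floating point in bash, could look like this. The function name `estimate_cost` is hypothetical, and the 5-cent constant is taken from the approximate figures in the table, not from any billing API.

```shell
#!/bin/bash
# Hypothetical helper: approximate loop cost for a given number of rounds.
# Assumes ~5 cents per round ($0.04 image + $0.01 critique, per the table);
# like the figures in the text, it does not count the round-0 image.
estimate_cost() {
  local rounds="$1"
  local per_round_cents=5
  local total=$(( rounds * per_round_cents ))
  printf '$%d.%02d\n' $(( total / 100 )) $(( total % 100 ))
}

estimate_cost 3    # prints $0.15
estimate_cost 10   # prints $0.50
```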
The --from-stdin '.urls[0]' flag binds the first URL of the upstream JSON envelope into --image-url, so no jq round-trip is needed.
run-llm --json with a yes/no system prompt).