How to Use the Entire Segmind Model Library Through an AI Agent
Segmind's llms.txt makes every image, video, and audio model in the catalog discoverable and callable by any AI agent. Here is how to set it up, and why you need to set a budget first.
AI agents are getting genuinely useful. Not just for writing emails or summarizing docs, but for doing real creative work: generating images, producing videos, cloning voices, compositing scenes. The missing piece for most teams has been connecting agents to a media generation backend they can actually reason about.
That is exactly what we built at Segmind.
The Two URLs You Need
We follow the llms.txt convention, which makes our entire model catalog machine-readable.
https://segmind.com/llms.txt
This file lists every model available on Segmind: image generation, video generation, text-to-speech, audio, upscaling, background removal, and more. An agent can fetch this file to discover what is available before deciding which model to call.
https://segmind.com/models/{slug}/llms.txt
Each model has its own spec file at this URL. It tells the agent exactly how to call the model: required parameters, optional parameters, valid enum values (like aspect ratios, resolutions, styles), expected response format, and whether the endpoint is synchronous or async.
That is it. Two URL patterns and the agent knows the full catalog and how to use every model in it.
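Both patterns are simple enough to hard-code. A minimal Python sketch, using only the standard library (the slug is whatever model identifier your agent picks out of the catalog):

```python
import urllib.request

CATALOG_URL = "https://segmind.com/llms.txt"

def model_spec_url(slug: str) -> str:
    """Return the per-model spec URL for a given catalog slug."""
    return f"https://segmind.com/models/{slug}/llms.txt"

def fetch_text(url: str) -> str:
    """Both files are plain text, so a bare GET is all an agent needs."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")
```

An agent-facing tool only needs these two helpers: fetch the catalog once, then fetch the spec for whichever slug it selects.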
Getting Started
- Create a Segmind account at segmind.com. New accounts get free credits to start.
- Grab your API key from the dashboard under API Keys.
- Add it to your agent. In Claude Cowork, paste it as an environment variable or drop it in your credentials file. In a custom agent or Claude.ai project, you can store it as a secret or reference it in your system prompt context.
The base endpoint for all model calls is https://api.segmind.com/v1/{slug}. Your API key goes in the x-api-key header.
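As a sketch of that call shape, again stdlib-only. The payload fields are illustrative, not real parameters: every model defines its own, so read them from the model's llms.txt first.

```python
import json
import urllib.request

API_BASE = "https://api.segmind.com/v1"

def build_model_request(slug: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build a POST request for a Segmind model call.

    The payload keys depend entirely on the model; take them from
    https://segmind.com/models/{slug}/llms.txt before calling.
    """
    return urllib.request.Request(
        url=f"{API_BASE}/{slug}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )

# Sending it spends credits, so only uncomment with a real key and slug:
# with urllib.request.urlopen(build_model_request("your-model-slug", "YOUR_KEY", {"prompt": "..."})) as resp:
#     result = resp.read()
```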
What This Looks Like in Practice
Say you are using Claude Cowork and you want to generate a product visual. You tell Claude: "Generate a lifestyle photo of our new water bottle in a mountain setting." Claude can:
- Fetch segmind.com/llms.txt to find the right image model.
- Fetch the model spec at /models/{slug}/llms.txt to understand the call format.
- Make the API call with a crafted prompt, return the image, and drop it into your deliverable.
No manual API wrangling. No switching tabs. The agent handles the full loop.
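The discovery step of that loop can be sketched in a few lines, assuming the catalog is plain text with one model entry per line (the exact format may differ, so treat this filter, and the sample entries, as illustrative):

```python
def find_models(catalog_text: str, keyword: str) -> list[str]:
    """Return catalog lines mentioning a keyword, e.g. 'image' or 'video'."""
    needle = keyword.lower()
    return [line for line in catalog_text.splitlines() if needle in line.lower()]

# Made-up catalog snippet for illustration:
sample = "fast-image-gen: text-to-image\nvoice-clone: text-to-speech"
print(find_models(sample, "image"))  # → ['fast-image-gen: text-to-image']
```

In practice the agent does this reasoning itself after fetching the file; a filter like this is only needed if you are wiring the loop up by hand.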
This works just as well with other agents: custom GPT agents, LangChain pipelines, n8n workflows, or any agent framework that can make HTTP requests and parse plain text.
Set a Budget Before You Run Anything
This is the part people skip and then regret.
Agents are fast. They will call an API dozens of times if you let them, especially in loops or when iterating on outputs. A video generation model at $0.10 per run adds up quickly if your agent decides to generate 50 variations.
Always set a Segmind budget before running an agent workflow:
- Go to your dashboard and set a monthly or per-session spend limit.
- If you are building a workflow for someone else, set a hard cap on the API key.
- In your agent instructions, explicitly tell the agent how many generations it is allowed to make per task (e.g., "generate no more than 3 variations").
Segmind will stop accepting requests once the budget is hit. That is the safety net. But building the guardrail into your agent instructions means you never get surprised by the bill in the first place.
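Prompt-level instructions are the first guardrail, but if your agent framework lets you wrap tool calls, you can enforce the cap in code as well. A minimal sketch (this class is our own client-side helper, not a Segmind feature):

```python
class GenerationBudget:
    """Client-side cap on how many model calls a single task may make."""

    def __init__(self, max_calls: int = 3):
        self.max_calls = max_calls
        self.used = 0

    def allow(self) -> bool:
        """Consume one call and return True if under budget, else False."""
        if self.used >= self.max_calls:
            return False
        self.used += 1
        return True

budget = GenerationBudget(max_calls=3)
print([budget.allow() for _ in range(5)])  # → [True, True, True, False, False]
```

Check `budget.allow()` before each generation and refuse the call when it returns False; the dashboard spend limit then only ever acts as the backstop.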
What You Can Actually Build
Once an agent has access to Segmind, the scope of what it can produce expands significantly:
- Marketing teams can have agents generate ad creatives, product shots, and social visuals on demand without touching a design tool.
- Video producers can script a scene and have an agent generate the visual assets, voiceover, and background music in a single workflow.
- Developers can wire Segmind into their product so users get AI-generated media without the developer needing to hardcode a specific model.
- Founders wearing every hat can delegate entire content pipelines to an agent and focus on what actually needs human judgment.
The media generation stack has always been powerful. The bottleneck was orchestration. Agents remove that bottleneck.
Start Here
- Catalog: segmind.com/llms.txt
- Model spec: segmind.com/models/{slug}/llms.txt
- Docs: segmind.com/docs
- Sign up: segmind.com
Set your budget, give your agent the API key, and point it at the catalog. Everything else follows.