Introducing Orion – the first visual agent that sees, reasons and acts.
Orion is the first visual agent that can analyze images, videos, and documents – and act on them with precision, through a unified chat-completions API.
Loved by leading
AI companies
Interact with images, videos, and documents through a single API.
Meet the agent that sees, reasons and acts
Frontier models like GPT-5, Claude 4.5, and Gemini 2.5 Pro can describe what they see – but they can’t act on it. Orion unites the reasoning power of large Vision-Language Models with the accuracy of specialized computer-vision tools – all through one unified API.
One interface for all your visual AI needs.
Images, documents, and videos – all through a single chat-completions interface.
Compose through conversation.
Chain visual operations like: detect → crop → enhance → analyze in a single conversation.
Integrate at warp speed
Drop-in replacement for the OpenAI SDK.
Same API pattern – new visual powers.
Auditable outputs
Every response comes with visual proof. Build and integrate with confidence.
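
The chaining described above (detect → crop → enhance → analyze) can be sketched as a multi-turn conversation. This is a minimal sketch rather than Orion's documented workflow: it assumes Orion keeps context when the full message history is resent on each turn, as the OpenAI chat-completions pattern does. The step prompts and the `first_turn` and `run_chain` helpers are illustrative names; the model name is taken from the code sample on this page.

```python
MODEL = "vlmrun-orion-1"  # model name from the sample on this page

# Hypothetical step prompts; the wording is illustrative only.
STEPS = [
    "Detect the main product in this image.",
    "Crop the image to the detected product.",
    "Enhance the cropped image.",
    "Analyze the enhanced image and summarize what it shows.",
]


def first_turn(image_url, prompt):
    """Build the opening user message: one text part plus one image part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


def run_chain(client, image_url, steps=STEPS):
    """Send each step as a new turn, resending the whole history every time."""
    messages = []
    replies = []
    for index, prompt in enumerate(steps):
        if index == 0:
            # Only the first turn carries the image.
            messages.append(first_turn(image_url, prompt))
        else:
            messages.append({"role": "user", "content": prompt})
        # Pass a snapshot so later appends do not alter the sent request.
        result = client.chat.completions.create(model=MODEL, messages=list(messages))
        reply = result.choices[0].message.content
        messages.append({"role": "assistant", "content": reply})
        replies.append(reply)
    return replies
```

To try it against the live API, pass `run_chain` an `OpenAI` client configured with the base URL and API key shown in the sample on this page, plus the URL of your own image.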

One Agent – Infinite Possibilities.
At the core of Orion is a unified architecture that powers understanding, reasoning, and action across every visual modality.




Ridiculously versatile.
Whatever your visual task – Orion knows how to act. Built with dozens of specialized computer-vision and multi-modal tools.
01
Image AI
Caption & Tag
Generate rich, contextual descriptions and semantic labels for any image.
Detection
Segmentation
Pointing
Generate & Edit
UI Parsing
Image Tools

02
Document AI
Document Parsing
Structured OCR
Redaction
03
Video AI
Caption & Tag
Generate & Edit
Video Tools

OpenAI
from openai import OpenAI

# Point the OpenAI SDK at the Orion endpoint.
client = OpenAI(
    base_url="https://agent.vlm.run/v1/openai",
    api_key="<VLMRUN_API_KEY>",
)

# Send a text instruction and an image in a single user message.
result = client.chat.completions.create(
    model="vlmrun-orion-1",
    messages=[
        {"role": "user", "content": [
            {"type": "text", "text": "Analyze the image & animate into a video."},
            {"type": "image_url", "image_url": {"url": "https://..."}},
        ]},
    ],
)
print(result.choices[0].message.content)
For DEVELOPERS
Designed for
developers.
Familiar API, unfamiliar power
All your favorite vision tools, in a single box.
Drop-in replacement for OpenAI SDK.
Handles images, documents, videos via URL or upload.
Streaming support for real-time responses.
Structured outputs with Pydantic / Zod support.
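
The capabilities listed above map onto standard OpenAI-style request shapes. The sketch below is hedged: it assumes Orion accepts base64 data URLs for uploads and the OpenAI `json_schema` response format, neither of which this page confirms. `file_to_image_part`, `json_schema_format`, and `collect_stream` are hypothetical helper names, not part of any SDK.

```python
import base64
import mimetypes
from pathlib import Path


def file_to_image_part(path):
    """Encode a local image as a data URL inside an OpenAI-style image part.

    Assumption: Orion accepts data URLs the way the OpenAI API does.
    """
    mime, _ = mimetypes.guess_type(str(path))
    if mime is None:
        raise ValueError(f"cannot guess MIME type for {path}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{encoded}"}}


def json_schema_format(name, properties, required):
    """Build an OpenAI-style `response_format` requesting schema-shaped JSON.

    Assumption: Orion honors this field. With Pydantic, the schema would
    typically come from `Model.model_json_schema()` instead of a hand-built dict.
    """
    return {
        "type": "json_schema",
        "json_schema": {
            "name": name,
            "strict": True,
            "schema": {
                "type": "object",
                "properties": properties,
                "required": required,
                "additionalProperties": False,
            },
        },
    }


def collect_stream(chunks):
    """Join the text deltas of a response created with `stream=True`."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # some chunks carry no choices
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # content can be None on role or finish chunks
            parts.append(delta)
    return "".join(parts)
```

In use, the part returned by `file_to_image_part` goes into a user message's `content` list, the dict from `json_schema_format` is passed as `response_format`, and the iterator returned by `client.chat.completions.create(..., stream=True)` is handed to `collect_stream`.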
For ENTERPRISES
The new visual intelligence layer for your enterprise.
Deploy Orion securely inside your VPC or private cloud – bringing visual intelligence directly to your infrastructure. Power document, image, and video understanding across teams. SOC 2 Type II and HIPAA-ready.






Frequently asked
questions
FAQs
How is Orion different from GPT-5, Claude, or Gemini?
Frontier models can describe what they see — but not act on it. Orion goes beyond perception by planning, executing, and validating visual tasks. Instead of just describing an image, Orion can detect, crop, segment, generate, and analyze in sequence — reliably and deterministically.
What can I build with Orion?
Developers are already using Orion for a wide range of applications – from visual ETL pipelines that detect, extract, and structure information, to automated product and marketing asset generation, document parsing and redaction, video summarization and clipping, and even medical and geospatial visual analysis. If your workflow involves visual data, Orion can make it intelligent and interactive.
How is Orion priced?
Orion’s pricing is designed to be flexible and transparent — based on the tools you use and the volume of visual tasks you run. Each visual capability (detection, segmentation, OCR, generation, etc.) is priced per use, allowing you to scale from experimentation to production without committing to fixed model tiers. Enterprise plans include predictable monthly or on-prem options for teams that need volume pricing, VPC deployments, or compliance guarantees.
How accurate is Orion compared to traditional CV models?
Orion draws on both modern VLM architectures and traditional computer-vision models, allowing it to reason about and accurately perform visual tasks. In benchmarks across several multi-modal tasks (MMMU, MMBench, DocVQA, RefCOCO, etc.), Orion consistently outperformed leading VLMs on multi-step visual reasoning, structured OCR, and traditional CV tasks such as detection, segmentation, and tracking. See our whitepaper for more details.
How do you keep data private?
Our Pro tier runs entirely in our private cloud deployment. Requests made to our APIs are logged and made available in our observability dashboard. For the enterprise tier, we can enforce stricter privacy requirements (SOC 2, GDPR, HIPAA).
Can I run Orion on-prem or in-VPC?
Yes. For enterprise deployments, VLM Run offers both VPC Peering and In-VPC hosting options — ensuring data never leaves your environment. We’re SOC 2 Type II and HIPAA-ready for teams with compliance requirements.
Try Orion today.








by Autonomi AI Inc. All rights reserved. © 2025


