Inference

Inference offers two models through Lava’s AI Gateway, both available via the Chat Completions endpoint. Requests authenticate with an `Authorization: Bearer` header. See the Inference API docs for provider-specific parameters.

Both managed mode (Lava’s API keys) and unmanaged mode (bring your own credentials) are supported.

Quick Start

// `forwardToken` must already hold your Lava forward token (see Forward Proxy under Next Steps).
// The `u` query parameter is the URL-encoded provider endpoint.
const response = await fetch('https://api.lava.so/v1/forward?u=https%3A%2F%2Fapi.inference.net%2Fv1%2Fchat%2Fcompletions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Authorization: `Bearer ${forwardToken}`,
  },
  body: JSON.stringify({
    model: 'inference-net/schematron-v2-small',
    messages: [{ role: 'user', content: 'Hello!' }],
  }),
});
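
The Quick Start request stops at the `fetch` call. The sketch below shows one way to read the reply. It assumes the response body follows the common OpenAI-style Chat Completions shape (`choices[0].message.content`), which this page does not confirm; `extractReply` is a hypothetical helper name, not part of any Lava SDK.

```javascript
// Pulls the assistant text out of a parsed response body.
// Assumption: OpenAI-style shape { choices: [{ message: { content } }] }.
function extractReply(data) {
  const content = data?.choices?.[0]?.message?.content;
  if (typeof content !== 'string') {
    throw new Error('Unexpected response shape: no choices[0].message.content');
  }
  return content;
}

// Usage with the Quick Start `response`:
//   if (!response.ok) throw new Error(`Request failed: ${response.status}`);
//   console.log(extractReply(await response.json()));
```

Checking `response.ok` before parsing keeps error responses from surfacing as confusing shape errors.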

Chat Completions

Target URL: https://api.inference.net/v1/chat/completions
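
The forward URL in the Quick Start appears to be the Lava forward endpoint with the target URL percent-encoded into the `u` query parameter. A minimal sketch of building it, where `buildForwardUrl` is a hypothetical helper name introduced only for illustration:

```javascript
// Base of Lava's forward endpoint, taken from the Quick Start request.
const LAVA_FORWARD_BASE = 'https://api.lava.so/v1/forward';

// Hypothetical helper: percent-encodes the provider URL into the `u` parameter.
function buildForwardUrl(targetUrl) {
  return `${LAVA_FORWARD_BASE}?u=${encodeURIComponent(targetUrl)}`;
}

const chatCompletionsUrl = buildForwardUrl('https://api.inference.net/v1/chat/completions');
console.log(chatCompletionsUrl);
```

Using `encodeURIComponent` rather than hand-writing the escapes avoids mistakes when the target URL carries its own query string.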


Content Type	`application/json`
Streaming	Yes (set `stream: true` in request body)
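
A sketch of consuming a streamed response after setting `stream: true`. It assumes the stream arrives as OpenAI-style server-sent events (`data: {...}` lines carrying `choices[0].delta.content`, terminated by `data: [DONE]`); this page does not specify the wire format, so check it against the Inference API docs. `extractDeltas` and `streamChat` are hypothetical names.

```javascript
// Extracts text deltas from a block of complete SSE lines.
// Assumption: OpenAI-style chunks, ending with `data: [DONE]`.
function extractDeltas(sseText) {
  const deltas = [];
  for (const line of sseText.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed.startsWith('data:')) continue;
    const payload = trimmed.slice(5).trim();
    if (payload === '' || payload === '[DONE]') continue;
    const content = JSON.parse(payload).choices?.[0]?.delta?.content;
    if (content) deltas.push(content);
  }
  return deltas;
}

// Sends a streaming request and calls `onText` for each text fragment.
async function streamChat(forwardToken, onText) {
  const response = await fetch('https://api.lava.so/v1/forward?u=https%3A%2F%2Fapi.inference.net%2Fv1%2Fchat%2Fcompletions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${forwardToken}`,
    },
    body: JSON.stringify({
      model: 'inference-net/schematron-v2-small',
      messages: [{ role: 'user', content: 'Hello!' }],
      stream: true,
    }),
  });
  if (!response.ok) throw new Error(`Request failed with status ${response.status}`);

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    // Parse only complete lines; keep any partial line for the next chunk.
    const lastNewline = buffer.lastIndexOf('\n');
    if (lastNewline === -1) continue;
    for (const text of extractDeltas(buffer.slice(0, lastNewline))) onText(text);
    buffer = buffer.slice(lastNewline + 1);
  }
  for (const text of extractDeltas(buffer)) onText(text);
}

// Usage: await streamChat(forwardToken, (text) => process.stdout.write(text));
```

Buffering up to the last newline matters because a network chunk can end in the middle of a JSON payload.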

Model	Input / 1M tokens	Output / 1M tokens
inference-net/schematron-v2-small	$0.05	$0.25
inference-net/schematron-v2-turbo	$0.03	$0.15
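
As a worked example of the per-token rates above, the sketch below estimates the cost of a request from its token counts. It uses only the prices in this table; whether any additional fees apply is not stated here, so treat the result as a lower-bound estimate. `estimateCostUsd` is a hypothetical helper.

```javascript
// Prices in USD per 1M tokens, copied from the table above.
const PRICES_PER_MILLION = {
  'inference-net/schematron-v2-small': { input: 0.05, output: 0.25 },
  'inference-net/schematron-v2-turbo': { input: 0.03, output: 0.15 },
};

// Hypothetical helper: cost = tokens / 1,000,000 * rate, summed over input and output.
function estimateCostUsd(model, inputTokens, outputTokens) {
  const price = PRICES_PER_MILLION[model];
  if (!price) throw new Error(`Unknown model: ${model}`);
  return (inputTokens / 1e6) * price.input + (outputTokens / 1e6) * price.output;
}

// 200,000 input tokens and 50,000 output tokens on schematron-v2-small:
// 0.2 * $0.05 + 0.05 * $0.25 = $0.0225
const cost = estimateCostUsd('inference-net/schematron-v2-small', 200000, 50000);
console.log(cost.toFixed(4)); // "0.0225"
```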

Next Steps

All Providers

Browse all supported AI providers

Forward Proxy

Learn how to construct proxy URLs and authenticate requests
