Working Across Multiple Providers
Explore how to programmatically discover and select AI models across multiple providers using OpenRouter. Learn model naming conventions, metadata interpretation, and how to write provider-agnostic code. Understand suffixes for specialized routing and how to keep your model selection flexible as the AI landscape evolves.
In the last lesson, we learned how we could switch between two models by changing a single line of code. With access to over 400 models from more than 60 providers, the next skill is knowing how to navigate that marketplace to find the right tool for a specific job and how to write code that makes swapping models a configuration change rather than an engineering task.
This lesson focuses on discovery. We will learn how to explore the model ecosystem, use the API to find models based on their capabilities, and design application code that can adapt as the market evolves.
Understanding the model landscape
The key to navigating OpenRouter is its standardized naming convention. Every model is identified by a unique string that tells you both who made it and which version it is.
Model naming conventions
Model IDs follow a simple provider/model-name format. This makes it easy to identify the source of any model at a glance.
openai/gpt-5.4
anthropic/claude-sonnet-4.6
google/gemini-3.1-flash-lite-preview
meta-llama/llama-4-scout
Model suffixes for specialized routing
Beyond the base name, OpenRouter offers special suffixes, or variants, that modify routing behavior to optimize for a specific need like cost, speed, or capability. You can append these to any compatible model ID.
:nitro: Routes to the highest-throughput (fastest) providers.
:floor: Routes to the lowest-cost providers available.
:free: Routes to models available on a no-cost tier.
:extended: Uses model versions with a larger context window for longer inputs.
:online: Attaches a web search plugin, giving the model real-time internet access.
:exacto: Routes to a vetted list of providers benchmarked for the highest tool-calling accuracy, ideal for agentic workflows.
:thinking: Enables chain-of-thought reasoning. For Anthropic models, this is handled via a dedicated reasoning parameter.
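Using a variant is just string concatenation: append the suffix to the model ID in the request body. The sketch below is illustrative only; the base model ID and prompt are arbitrary, and it assumes your API key is in an OPENROUTER_API_KEY environment variable.

```python
import json
import os
import urllib.request

BASE_MODEL = "meta-llama/llama-4-scout"  # illustrative; any compatible model works


def chat(model: str, prompt: str) -> str:
    """Send a single-turn chat request to OpenRouter and return the reply text."""
    request = urllib.request.Request(
        "https://openrouter.ai/api/v1/chat/completions",
        data=json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)["choices"][0]["message"]["content"]


if __name__ == "__main__":
    # Same model, three routing behaviors: default, fastest, cheapest.
    for variant in ("", ":nitro", ":floor"):
        print(chat(BASE_MODEL + variant, "Say hello in five words."))
```

Because the variant is part of the model string, switching routing behavior needs no other change to the request.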
Discovering models programmatically
The web page at openrouter.ai/models is a good starting point for manual exploration, but the same data is available via the /api/v1/models endpoint. Querying it programmatically lets your application select models based on real-time capability and pricing data rather than hardcoded assumptions.
The script below fetches all available models and filters for those that meet specific criteria: support for tool calling and a large context window.
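Here is a minimal version using only the standard library. The 200k-token threshold and the required-parameter set are illustrative choices; adjust them to your own requirements.

```python
import json
import urllib.request

# Illustrative filter criteria: tool calling plus a large context window.
MIN_CONTEXT = 200_000
REQUIRED_PARAMS = {"tools"}


def qualifies(model: dict) -> bool:
    """True if the model lists every required parameter and meets the context threshold."""
    supported = set(model.get("supported_parameters") or [])
    return (REQUIRED_PARAMS <= supported
            and (model.get("context_length") or 0) >= MIN_CONTEXT)


def find_models() -> list[str]:
    """Fetch the public model catalog and return the IDs of qualifying models."""
    url = "https://openrouter.ai/api/v1/models"
    with urllib.request.urlopen(url, timeout=30) as response:
        catalog = json.load(response)
    return [model["id"] for model in catalog["data"] if qualifies(model)]


if __name__ == "__main__":
    for model_id in find_models():
        print(model_id)
```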
The script queries the API directly and outputs a current list of qualified models. Embed this logic in your application, and the model list stays current without manual updates.
Reading model metadata
The response from the /api/v1/models endpoint contains rich metadata for each model. Knowing what each field contains tells you whether a model fits your requirements before you make a request.
Here is an example object for a single model:
{
  "id": "openai/gpt-5.4",
  "canonical_slug": "openai/gpt-5.4-20260305",
  "hugging_face_id": "",
  "name": "OpenAI: GPT-5.4",
  "created": 1772734352,
  "description": "GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into a single system. It features a 1M+ token context window (922K input, 128K output) with support for text and image inputs, enabling high-context reasoning, coding, and multimodal analysis within the same workflow.\n\nThe model delivers improved performance in coding, document understanding, tool use, and instruction following. It is designed as a strong default for both general-purpose tasks and software engineering, capable of generating production-quality code, synthesizing information across multiple sources, and executing complex multi-step workflows with fewer iterations and greater token efficiency.",
  "context_length": 1050000,
  "architecture": {
    "modality": "text+image+file->text",
    "input_modalities": ["text", "image", "file"],
    "output_modalities": ["text"],
    "tokenizer": "GPT",
    "instruct_type": null
  },
  "pricing": {
    "prompt": "0.0000025",
    "completion": "0.000015",
    "web_search": "0.01",
    "input_cache_read": "0.00000025"
  },
  "top_provider": {
    "context_length": 1050000,
    "max_completion_tokens": 128000,
    "is_moderated": true
  },
  "per_request_limits": null,
  "supported_parameters": ["frequency_penalty", "include_reasoning", "logit_bias", "logprobs", "max_tokens", "presence_penalty", "reasoning", "response_format", "seed", "stop", "structured_outputs", "tool_choice", "tools", "top_logprobs"],
  "default_parameters": {
    "temperature": null,
    "top_p": null,
    "top_k": null,
    "frequency_penalty": null,
    "presence_penalty": null,
    "repetition_penalty": null
  },
  "expiration_date": null
}
Here are some details you might want to explore when using a model:
id: The unique identifier you use in API requests.
context_length: The maximum number of tokens (input + output) the model can handle in a single request.
pricing: An object detailing the cost per prompt token and per completion token in USD. Completion tokens are typically more expensive than prompt tokens.
supported_parameters: An array listing advanced features the model supports, such as tools or structured_outputs. Always check this before using advanced features to avoid common failures.
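As a quick illustration, the per-token price strings can be parsed into floats to estimate what a request will cost before you send it. The sample metadata below mirrors the pricing values in the example object above; the function name is our own.

```python
def estimate_cost_usd(model: dict, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate request cost from the per-token USD prices in a model's metadata.

    OpenRouter returns prices as strings, so they must be converted to floats.
    """
    pricing = model["pricing"]
    return (prompt_tokens * float(pricing["prompt"])
            + completion_tokens * float(pricing["completion"]))


# Pricing values from the example metadata object above.
gpt = {"pricing": {"prompt": "0.0000025", "completion": "0.000015"}}

# Cost of a request with 10k prompt tokens and 1k completion tokens.
print(f"${estimate_cost_usd(gpt, 10_000, 1_000):.4f}")
```

Note that completion tokens cost six times as much as prompt tokens here, which is why output-heavy workloads deserve their own cost analysis.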
Designing provider-agnostic application code
Now that we can find models programmatically, we should ensure our code is flexible enough to use them. Hardcoding model names directly in your application logic creates technical debt. Instead, treat the model name as a configuration value loaded at runtime. This makes swapping models a configuration update rather than a code change and a redeployment.
import os

# Model name is loaded from an environment variable.
# Changing it requires no code change or redeployment.
SUMMARY_MODEL = os.getenv("SUMMARY_MODEL", "google/gemini-3.1-flash-lite-preview")

def generate_summary(text):
    response = call_openrouter(
        model_name=SUMMARY_MODEL,
        prompt=f"Summarize this: {text}"
    )
    # ...
Externalizing model selection through environment variables or configuration files makes your system adaptable to new models, price changes, and performance tuning without touching application code.
Handling provider-specific nuances
OpenRouter’s API surface is uniform, but the underlying models are not. Three differences matter most in practice:
Tokenization varies across providers: GPT and Claude models use multi-character tokens, so a typical English word might consume one or two tokens. Some open-source models tokenize by character, meaning the same input can consume significantly more tokens and cost more than expected. Because token counts drive pricing, always check the usage field in the response rather than estimating client-side.
Context window sizes differ: The context_length field in model metadata gives you the maximum tokens for the combined input and output of a single request. Models range from 8k to over 1M tokens. Sending a request that exceeds a model’s context window returns an error, so check this field before routing long documents or conversation histories to a model.
Not every model supports every parameter: Features like tool calling, structured outputs, and reasoning are not universally supported. Checking supported_parameters before making a request helps prevent failures where a model ignores an unsupported parameter entirely, returning a generic text response instead of the structured output or tool call you expected.
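These checks can be combined into a small pre-flight guard that runs against a model's metadata before a request is sent. This is a sketch with hypothetical names; the token estimate would come from whatever counting method your application uses.

```python
def validate_request(model_meta: dict, estimated_tokens: int,
                     needed_params: set[str]) -> list[str]:
    """Return a list of problems that would make a request fail or silently
    misbehave on this model. An empty list means the request looks safe."""
    problems = []

    # Context window check: input + output must fit in context_length.
    if estimated_tokens > (model_meta.get("context_length") or 0):
        problems.append("input exceeds the model's context window")

    # Parameter support check: every feature we rely on must be listed.
    supported = set(model_meta.get("supported_parameters") or [])
    missing = needed_params - supported
    if missing:
        problems.append(f"unsupported parameters: {sorted(missing)}")

    return problems
```

A router can call this for each candidate model and skip any that report problems, rather than discovering the mismatch from a runtime error or a malformed response.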
Conclusion
You can now navigate the OpenRouter model marketplace with precision, reading naming conventions, querying metadata programmatically, and writing code that treats model selection as configuration. The next layer is reliability, such as knowing what OpenRouter does when a provider goes down, and how to configure that behavior for your specific requirements. That is the subject of the next lesson.