DeepInfra on Hugging Face Inference Providers 🔥
We’re thrilled to share that DeepInfra is now a supported Inference Provider on the Hugging Face Hub! DeepInfra joins our growing ecosystem, enhancing the breadth and capabilities of serverless inference directly on the Hub’s model pages.
Inference Providers are also seamlessly integrated into our client SDKs (for both JS and Python), making it super easy to use a wide variety of models with your preferred providers.
DeepInfra is a serverless AI inference platform offering some of the most cost-effective per-token pricing in the industry. With a catalog of over 100 models, DeepInfra makes it easy for developers to integrate a wide range of AI capabilities into their applications with minimal setup.
DeepInfra supports a broad spectrum of model types - from LLMs to text-to-image, text-to-video, embeddings, and more. As part of this initial integration, DeepInfra is launching support for conversational and text-generation tasks on Hugging Face, enabling access to popular open-weight LLMs such as DeepSeek V4, Kimi-K2.6, GLM-5.1, and many more. Support for additional tasks (text-to-image, text-to-video, embeddings, and more) will roll out soon!
Read more about how to use DeepInfra as an Inference Provider in its dedicated documentation page. See the full list of models supported by DeepInfra here. Follow DeepInfra on Hugging Face: https://huggingface.co/DeepInfra.
How it works
In the website UI
In your user account settings, you are able to:
- Set your own API keys for the providers you’ve signed up with. If no custom key is set, your requests will be routed through HF.
- Order providers by preference. This applies to the widget and code snippets in the model pages.
As mentioned, there are two modes when calling Inference Providers:
- Custom key: calls go directly to the inference provider, using your own API key for that provider.
- Routed by HF: in this case, you don’t need a token from the provider; charges are applied directly to your HF account rather than to the provider’s account.
Model pages showcase third-party inference providers (the ones that are compatible with the current model, sorted by user preference).
From the client SDKs
DeepInfra is available through the Hugging Face SDKs: huggingface_hub (>= 1.11.2) for Python and @huggingface/inference for JavaScript. The following example shows how to use DeepSeek V4 Pro through DeepInfra. Authenticate with a Hugging Face token and the request will be routed to DeepInfra automatically.
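A minimal Python sketch of a routed request. The model repo id used below is an assumption for illustration - check the model page for the exact name - and the token is a placeholder:

```python
from huggingface_hub import InferenceClient


def ask(prompt: str, hf_token: str) -> str:
    """Send a chat request to DeepSeek V4 Pro via DeepInfra, routed through HF."""
    client = InferenceClient(provider="deepinfra", api_key=hf_token)
    completion = client.chat.completions.create(
        model="deepseek-ai/DeepSeek-V4-Pro",  # hypothetical repo id
        messages=[{"role": "user", "content": prompt}],
    )
    return completion.choices[0].message.content


# Example usage (requires a valid Hugging Face token):
#   print(ask("What is the capital of France?", hf_token="hf_xxx"))
```

Because the token is a Hugging Face one, the call is billed to your HF account; pass a DeepInfra key instead to be billed by DeepInfra directly.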
Billing
For direct requests, i.e. when you use a key from an inference provider, you are billed by that provider. For instance, if you use a DeepInfra API key, you’re billed on your DeepInfra account.
For routed requests, i.e. when you authenticate via the Hugging Face Hub, you’ll only pay the standard provider API rates. There’s no additional markup from us; we just pass through the provider costs directly. (In the future, we may establish revenue-sharing agreements with our provider partners.)
Important Note ‼️
PRO users get $2 worth of Inference credits every month, usable across providers. 🔥 Subscribe to the Hugging Face PRO plan to get access to Inference credits, ZeroGPU, Spaces Dev Mode, 20x higher limits, and more. We also provide free inference with a small quota for our signed-in free users, but please upgrade to PRO if you can!