
Integrating SDKs with Local OpenAI-Compatible Services

This guide demonstrates how to connect your applications to a locally running OpenAI-compatible inference service using the official SDKs. It covers setup, code samples (Python, JavaScript, .NET, cURL), streaming responses, and error handling.


Prerequisites

  • A running inference service at http://localhost:8553/v1/openai
  • Models downloaded/cached locally or accessible via alias/model ID
  • SDKs installed for your preferred language (Python, JS, .NET)
  • (Optional) Authentication configured if your service requires it
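Local endpoints usually ignore the API key, but the OpenAI SDK still requires one to be set. A minimal sketch (the `endpoint_config` helper and the environment variable names are illustrative, not part of the SDK) that centralizes the connection settings used throughout this guide:

```python
import os

# Default address of the local OpenAI-compatible service used in this guide.
DEFAULT_BASE_URL = "http://localhost:8553/v1/openai"

def endpoint_config() -> dict:
    """Return base_url/api_key keyword arguments for the OpenAI client,
    allowing overrides via environment variables."""
    return {
        "base_url": os.environ.get("LOCAL_OPENAI_BASE_URL", DEFAULT_BASE_URL),
        # Local services typically accept any placeholder key.
        "api_key": os.environ.get("LOCAL_OPENAI_API_KEY", "localkey"),
    }
```

The later examples could then be written as `OpenAI(**endpoint_config())`, keeping the endpoint in one place.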

Install SDKs / Packages

```shell
# Python
pip install openai
```

Using the OpenAI SDK with a Local Endpoint

```python
from openai import OpenAI

client = OpenAI(
    api_key="localkey",
    base_url="http://localhost:8553/v1/openai",
)

response = client.chat.completions.create(
    model="phi3.5",
    messages=[
        {
            "role": "system",
            "content": "You are a creative writing assistant who helps improve clarity and tone.",
        },
        {
            "role": "user",
            "content": "Write a short story about a time traveler who accidentally becomes their own ancestor.",
        },
    ],
)

print(response.choices[0].message.content)
```
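The generated text lives at `response.choices[0].message.content`. A small hedged helper (`first_message_content` is illustrative, not part of the SDK) shows the access pattern defensively, and can be exercised offline with a stand-in object of the same shape:

```python
from types import SimpleNamespace

def first_message_content(response) -> str:
    """Return the assistant text from a chat completion response,
    or an empty string if no choices were returned."""
    if not response.choices:
        return ""
    return response.choices[0].message.content or ""

# Stand-in object mimicking the SDK response's attribute shape:
fake = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(content="Hello"))]
)
```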

Streaming Responses

```python
stream = client.chat.completions.create(
    model="phi3.5",
    messages=[{"role": "user", "content": "List five futuristic startup ideas"}],
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
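Each streamed chunk carries an incremental delta, and the final chunk's `delta.content` is typically `None`. If you need the full text rather than live printing, the accumulation step can be factored out into a tiny helper (`accumulate_deltas` is an illustrative name, not an SDK function) that is testable without a running service:

```python
def accumulate_deltas(deltas) -> str:
    """Join streamed content deltas into the complete response text,
    skipping None/empty entries such as the final chunk's delta."""
    return "".join(d for d in deltas if d)

# Simulated sequence of chunk.choices[0].delta.content values:
parts = ["Idea 1: ", "solar ", "microgrids", None]
full_text = accumulate_deltas(parts)
```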

Error Handling Examples

```python
from openai import OpenAI, APIError

client = OpenAI(
    api_key="localkey",
    base_url="http://localhost:8553/v1/openai",
)

try:
    response = client.chat.completions.create(
        model="phi3.5",
        messages=[{"role": "user", "content": "Summarize the principles of thermodynamics"}],
    )
    print(response.choices[0].message.content)
except APIError as e:
    print(f"API request failed: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")
```
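Local services can also fail transiently, for example while a model is still loading. A minimal retry sketch (the `with_retries` helper is illustrative; production code should catch the SDK's specific error classes such as `APIError` rather than bare `Exception`):

```python
import time

def with_retries(call, attempts=3, base_delay=1.0):
    """Invoke a zero-argument callable, retrying on exceptions with
    exponential backoff; re-raises after the final attempt."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))
```

Usage would wrap the request, e.g. `with_retries(lambda: client.chat.completions.create(...))`.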