Integrating SDKs with Local OpenAI-Compatible Services

This guide demonstrates how to connect your applications to a locally running OpenAI-compatible inference service using official SDKs. It includes setup, code samples in Python, JavaScript, .NET, cURL, streaming responses, and error handling.

Prerequisites

A running inference service at http://localhost:8553/v1/openai
Models downloaded/cached locally or accessible via alias/model ID
SDKs installed for your preferred language (Python, JS, .NET)
(Optional) Authentication configured if your service requires it

Install SDKs / Packages

Python


# Python
pip install openai

JavaScript


# JavaScript 
npm install openai

C#


# .NET (C#)
dotnet add package OpenAI

cURL


# cURL requires no installation, just ensure it's available on your system
curl --version

Using the OpenAI SDK with a Local Endpoint

Python


from openai import OpenAI
 
client = OpenAI(api_key="localkey", base_url="http://localhost:8553/v1/openai")
 
response = client.chat.completions.create(
    model="phi3.5",
    messages=[
        {
            "role": "system",
            "content": "You are a creative writing assistant who helps improve clarity and tone."
        },
        {
            "role": "user",
            "content": "Write a short story about a time traveler who accidentally becomes their own ancestor."
        }
    ]
)
 
print(response.choices[0].message.content)

JavaScript


import OpenAI from "openai";
 
const openai = new OpenAI({
  baseURL: "http://localhost:8553/v1/openai",
  apiKey: "localkey",
});
 
async function main() {
  const completion = await openai.chat.completions.create({
    model: "phi3.5",
    messages: [
      { role: "system", content: "You are a creative writing assistant who helps improve clarity and tone." },
      { role: "user", content: "Write a short story about a time traveler who accidentally becomes their own ancestor." }
    ]
  });
 
  console.log(completion.choices[0].message.content);
}
 
main();

C#


using OpenAI;
 
var client = new OpenAIClient(new OpenAIClientOptions
{
  ApiKey = "localkey",
  BaseUrl = "http://localhost:8553/v1/openai"
});
 
var response = await client.Chat.Completions.CreateAsync(new ChatCompletionRequest
{
  Model = "phi3.5",
  Messages = new List<ChatMessage>
  {
      new ChatMessage("system", "You are a creative writing assistant who helps improve clarity and tone."),
      new ChatMessage("user", "Write a short story about a time traveler who accidentally becomes their own ancestor.")
  }
});
 
Console.WriteLine(response.Choices[0].Message.Content);

cURL


curl http://localhost:8553/v1/openai/chat/completions \
  -H "Authorization: Bearer localkey" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "phi3.5",
    "messages": [
      { "role": "system", "content": "You are a creative writing assistant who helps improve clarity and tone." },
      { "role": "user", "content": "Write a short story about a time traveler who accidentally becomes their own ancestor." }
    ]
  }'

Streaming Responses

Python


for chunk in client.chat.completions.create(
    model="phi3.5",
    messages=[{"role": "user", "content": "List five futuristic startup ideas"}],
    stream=True
):
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

JavaScript


const stream = await openai.chat.completions.create({
  model: "phi3.5",
  messages: [{ role: "user", content: "List five futuristic startup ideas" }],
  stream: true
});
 
for await (const part of stream) {
  const content = part.choices[0]?.delta?.content;
  if (content) process.stdout.write(content);
}

C#


await foreach (var update in client.Chat.Completions.CreateStreamingAsync(new ChatCompletionRequest
{
    Model = "phi3.5",
    Messages = new List<ChatMessage>
    {
        new ChatMessage("user", "List five futuristic startup ideas")
    }
}))
{
    if (update.Choices[0].Delta?.Content != null)
    {
        Console.Write(update.Choices[0].Delta.Content);
    }
}

cURL


curl http://localhost:8553/v1/openai/chat/completions \
  -H "Authorization: Bearer localkey" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "phi3.5",
    "messages": [{"role": "user", "content": "List five futuristic startup ideas"}],
    "stream": true
  }'

Error Handling Examples

Python


from openai import OpenAI, APIError
 
client = OpenAI(api_key="localkey", base_url="http://localhost:8553/v1/openai")
 
try:
    response = client.chat.completions.create(
        model="phi3.5",
        messages=[{"role": "user", "content": "Summarize the principles of thermodynamics"}]
    )
    print(response.choices[0].message.content)
except APIError as e:
    print(f"API request failed: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")

JavaScript


import OpenAI from "openai";
 
const openai = new OpenAI({
  baseURL: "http://localhost:8553/v1/openai",
  apiKey: "localkey",
});
 
async function run() {
  try {
    const completion = await openai.chat.completions.create({
      model: "phi3.5",
      messages: [{ role: "user", content: "Summarize the principles of thermodynamics" }],
    });
    console.log(completion.choices[0].message.content);
  } catch (err) {
    console.error("Request failed:", err.message || err);
  }
}
 
run();

C#


using OpenAI;
 
var client = new OpenAIClient(new OpenAIClientOptions
{
    ApiKey = "localkey",
    BaseUrl = "http://localhost:8553/v1/openai"
});
 
try
{
    var response = await client.Chat.Completions.CreateAsync(new ChatCompletionRequest
    {
        Model = "phi3.5",
        Messages = new List<ChatMessage>
        {
            new ChatMessage("user", "Summarize the principles of thermodynamics")
        }
    });
    Console.WriteLine(response.Choices[0].Message.Content);
}
catch (HttpRequestException ex)
{
    Console.WriteLine($"Network error: {ex.Message}");
}
catch (Exception ex)
{
    Console.WriteLine($"Unexpected error: {ex.Message}");
}

cURL


curl http://localhost:8553/v1/openai/chat/completions \
  -H "Authorization: Bearer localkey" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "phi3.5",
    "messages": [{"role": "user", "content": "Summarize the principles of thermodynamics"}]
  }' || echo "Request failed"