OpenAI SDK for TypeScript: Building Chat Completions and Tool Calling in Node.js
A hands-on guide to the official OpenAI TypeScript SDK. Learn how to set up the client, create chat completions, implement function calling with tool definitions, and stream responses in a Node.js application.
Getting Started with the OpenAI TypeScript SDK
The official openai npm package provides a fully typed client for the OpenAI API. Unlike community wrappers, it is maintained by OpenAI and covers every endpoint — chat completions, embeddings, assistants, images, and audio — with complete TypeScript definitions.
This tutorial walks through the core patterns you need for building AI agent backends: client setup, chat completions, tool calling, and streaming.
Installation and Client Setup
Install the SDK and configure your client:
[Diagram: the agentic tool-calling loop. A user message goes to an LLM call with the tools schema; if the model requests a tool, it is executed in a sandboxed runtime and the tool_result is appended to messages before looping back to the LLM. Otherwise the output passes through guardrails and is either returned as the final reply or refused and logged.]
npm install openai
Then initialize the client with your API key:
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});
The constructor accepts optional parameters for baseURL, timeout, maxRetries, and custom fetch implementations. For production, configure retries and timeouts explicitly:
const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  maxRetries: 3,
  timeout: 30_000, // 30 seconds
});
Basic Chat Completions
The chat completions endpoint is the foundation of every agent interaction. Here is a basic request with typed messages:
import OpenAI from "openai";
import type { ChatCompletionMessageParam } from "openai/resources/chat/completions";

const messages: ChatCompletionMessageParam[] = [
  {
    role: "system",
    content: "You are a helpful coding assistant specializing in TypeScript.",
  },
  {
    role: "user",
    content: "Explain the difference between interface and type in TypeScript.",
  },
];

const completion = await client.chat.completions.create({
  model: "gpt-4o",
  messages,
  temperature: 0.7,
  max_tokens: 1024,
});

const reply = completion.choices[0].message.content;
console.log(reply);
The response is fully typed — completion.choices[0].message gives you a ChatCompletionMessage with role, content, tool_calls, and refusal fields.
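Because content can be null when the model refuses, it helps to normalize a reply before displaying it. A minimal sketch, using a simplified local stand-in for the SDK's ChatCompletionMessage type (readReply is a hypothetical helper, not part of the SDK):

```typescript
// Simplified local stand-in for the SDK's ChatCompletionMessage type.
interface AssistantMessage {
  role: "assistant";
  content: string | null;
  refusal?: string | null;
}

// Normalize a completion message into displayable text, surfacing
// refusals explicitly instead of rendering an empty reply.
function readReply(message: AssistantMessage): string {
  if (message.refusal) {
    return `[refused] ${message.refusal}`;
  }
  return message.content ?? "";
}
```

This keeps refusal handling in one place instead of scattering null checks through your UI code.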
Implementing Tool Calling (Function Calling)
Tool calling lets the model invoke functions you define. This is the mechanism that turns a chat model into an agent. You define tools as JSON Schema objects, the model decides when to call them, and your code executes the actual logic.
import type { ChatCompletionTool } from "openai/resources/chat/completions";

const tools: ChatCompletionTool[] = [
  {
    type: "function",
    function: {
      name: "get_weather",
      description: "Get the current weather for a given city",
      parameters: {
        type: "object",
        properties: {
          city: {
            type: "string",
            description: "The city name, e.g., San Francisco",
          },
          units: {
            type: "string",
            enum: ["celsius", "fahrenheit"],
            description: "Temperature unit preference",
          },
        },
        required: ["city"],
      },
    },
  },
];
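Since the model returns tool arguments as a JSON string, it is worth validating the parsed shape before executing anything. A hand-rolled sketch for the get_weather schema above (parseWeatherArgs and WeatherArgs are illustrative names; schema libraries such as Zod are a common alternative):

```typescript
// Illustrative typed view of the get_weather parameters.
type WeatherArgs = { city: string; units?: "celsius" | "fahrenheit" };

// Parse and validate the model-produced JSON arguments string,
// throwing a descriptive error on any schema violation.
function parseWeatherArgs(raw: string): WeatherArgs {
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("arguments must be a JSON object");
  }
  const obj = parsed as Record<string, unknown>;

  const city = obj.city;
  if (typeof city !== "string" || city.length === 0) {
    throw new Error("city is required and must be a non-empty string");
  }

  const units = obj.units;
  if (units !== undefined && units !== "celsius" && units !== "fahrenheit") {
    throw new Error("units must be celsius or fahrenheit");
  }

  return { city, units: units as WeatherArgs["units"] };
}
```

Validating here means a malformed tool call becomes a clean error you can feed back to the model rather than a runtime crash deep inside your tool logic.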
Send the tools alongside messages and handle the model's tool call response:
const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages,
  tools,
  tool_choice: "auto",
});
const message = response.choices[0].message;

if (message.tool_calls) {
  // Append the assistant's message (with its tool_calls) exactly once,
  // before the per-call loop, so it is not duplicated when the model
  // requests several tools at the same time.
  messages.push(message);

  for (const toolCall of message.tool_calls) {
    const args = JSON.parse(toolCall.function.arguments);
    let result: string;

    if (toolCall.function.name === "get_weather") {
      result = await fetchWeather(args.city, args.units);
    } else {
      result = JSON.stringify({ error: "Unknown tool" });
    }

    // Append the tool result, linked to the request by tool_call_id
    messages.push({
      role: "tool",
      tool_call_id: toolCall.id,
      content: result,
    });
  }

  // Get the final response with tool results included
  const finalResponse = await client.chat.completions.create({
    model: "gpt-4o",
    messages,
    tools,
  });
  console.log(finalResponse.choices[0].message.content);
}
Building an Agent Loop
A real agent iterates until the model stops requesting tools:
async function runAgent(
  client: OpenAI,
  systemPrompt: string,
  userMessage: string,
  tools: ChatCompletionTool[],
  maxIterations = 10
): Promise<string> {
  const messages: ChatCompletionMessageParam[] = [
    { role: "system", content: systemPrompt },
    { role: "user", content: userMessage },
  ];

  for (let i = 0; i < maxIterations; i++) {
    const response = await client.chat.completions.create({
      model: "gpt-4o",
      messages,
      tools,
    });

    const choice = response.choices[0];
    messages.push(choice.message);

    if (choice.finish_reason === "stop") {
      return choice.message.content ?? "";
    }

    if (choice.message.tool_calls) {
      for (const toolCall of choice.message.tool_calls) {
        // executeTool dispatches to your own tool implementations
        const result = await executeTool(toolCall);
        messages.push({
          role: "tool",
          tool_call_id: toolCall.id,
          content: JSON.stringify(result),
        });
      }
    }
  }

  return "Agent reached maximum iterations.";
}
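One way to back the executeTool helper used above is a small registry mapping tool names to async handlers. A minimal sketch with a simplified local ToolCall type and a stubbed weather handler (not the SDK's types):

```typescript
// Simplified local stand-in for the SDK's tool call shape.
interface ToolCall {
  id: string;
  function: { name: string; arguments: string };
}

type ToolHandler = (args: Record<string, unknown>) => Promise<unknown>;

// Registry of tool name -> handler. Registering a new tool is one line.
const registry = new Map<string, ToolHandler>();

registry.set("get_weather", async (args) => ({
  city: args.city,
  temperature: 18, // stubbed value; a real handler would call a weather API
}));

// Look up the requested tool and run it; unknown tools return a
// structured error the model can read instead of throwing.
async function executeTool(toolCall: ToolCall): Promise<unknown> {
  const handler = registry.get(toolCall.function.name);
  if (!handler) {
    return { error: `Unknown tool: ${toolCall.function.name}` };
  }
  return handler(JSON.parse(toolCall.function.arguments));
}
```

Returning the "unknown tool" error as data, rather than throwing, lets the model see its mistake and recover on the next iteration.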
Streaming Responses
For real-time UIs, stream tokens as they arrive:
const stream = await client.chat.completions.create({
  model: "gpt-4o",
  messages,
  stream: true,
});

for await (const chunk of stream) {
  const delta = chunk.choices[0]?.delta?.content;
  if (delta) {
    process.stdout.write(delta);
  }
}
The stream is an async iterable. Each chunk contains a delta with partial content, tool call fragments, or a finish reason. The SDK parses the server-sent events for you, so every yielded chunk is already a typed object.
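When you need the complete reply as well as the live tokens, the deltas can be accumulated as they arrive. A sketch with a mocked chunk stream standing in for the SDK's (the mocked chunks mirror only the choices[0].delta shape, nothing else):

```typescript
// Simplified shape of the fields a streamed chunk carries.
interface StreamChunk {
  choices: Array<{ delta: { content?: string | null } }>;
}

// Accumulate partial content deltas into the complete reply.
async function collectStream(
  stream: AsyncIterable<StreamChunk>
): Promise<string> {
  let full = "";
  for await (const chunk of stream) {
    full += chunk.choices[0]?.delta?.content ?? "";
  }
  return full;
}

// Mocked stream for illustration; the real one comes from
// client.chat.completions.create({ ..., stream: true }).
async function* mockStream(): AsyncGenerator<StreamChunk> {
  for (const piece of ["Type", "Script ", "rocks"]) {
    yield { choices: [{ delta: { content: piece } }] };
  }
}
```

In a real handler you would typically both write each delta to the client and append it to the accumulator, so the full text is available for logging or persistence once the stream ends.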
FAQ
How do I handle rate limits with the OpenAI SDK?
The SDK automatically retries on 429 (rate limit) and 500-level errors using exponential backoff. Configure maxRetries in the constructor. For high-throughput applications, implement a token bucket or use the x-ratelimit-remaining-tokens response header to throttle proactively.
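The token-bucket idea can be sketched as follows; the clock is injected so the behavior is deterministic and testable (a production throttle would budget actual token counts per request, not just request counts):

```typescript
// A minimal token-bucket throttle for proactive rate limiting.
// Tokens refill continuously up to a fixed capacity; a request is
// allowed only when enough tokens are available right now.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private capacity: number,
    private refillPerSecond: number,
    private now: () => number = () => Date.now()
  ) {
    this.tokens = capacity;
    this.lastRefill = now();
  }

  // Returns true and deducts tokens if the request fits the budget,
  // false if the caller should wait and retry later.
  tryConsume(count = 1): boolean {
    const current = this.now();
    const elapsedSeconds = (current - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSeconds * this.refillPerSecond
    );
    this.lastRefill = current;
    if (this.tokens >= count) {
      this.tokens -= count;
      return true;
    }
    return false;
  }
}
```

A caller that gets false back can sleep briefly and retry, keeping your request rate under the limit instead of relying on 429 retries alone.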
Can the model call multiple tools in a single response?
Yes. The tool_calls array can contain multiple entries when the model determines it needs several pieces of information simultaneously. Your agent loop should execute all of them (ideally in parallel with Promise.all) before sending the results back.
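That parallel pattern can be sketched like this, with a simplified local ToolCall type and a caller-supplied runTool function standing in for your dispatch logic:

```typescript
// Simplified local stand-ins for the SDK's tool call and tool message shapes.
interface ToolCall {
  id: string;
  function: { name: string; arguments: string };
}

interface ToolMessage {
  role: "tool";
  tool_call_id: string;
  content: string;
}

// Run every tool call from one response concurrently and build the
// matching tool messages, preserving the original call order.
async function runAllTools(
  toolCalls: ToolCall[],
  runTool: (call: ToolCall) => Promise<unknown>
): Promise<ToolMessage[]> {
  return Promise.all(
    toolCalls.map(async (call) => ({
      role: "tool" as const,
      tool_call_id: call.id,
      content: JSON.stringify(await runTool(call)),
    }))
  );
}
```

Because Promise.all preserves input order, the resulting messages can be pushed straight onto the conversation, each already linked to its request by tool_call_id.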
What is the difference between tool_choice "auto" and "required"?
Setting tool_choice: "auto" lets the model decide whether to call a tool or respond directly. Setting tool_choice: "required" forces the model to call at least one tool. Use "required" when you know the user's request demands tool usage, such as data lookups or calculations.
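Beyond "auto" and "required", tool_choice can also force one specific function by name. A sketch of the object shape the chat completions API accepts, targeting the get_weather tool defined earlier:

```typescript
// Forcing a single named function instead of "auto" or "required".
const forcedChoice = {
  type: "function" as const,
  function: { name: "get_weather" },
};
// Pass it as tool_choice in client.chat.completions.create({ ... }).
```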
#OpenAI #TypeScript #Nodejs #FunctionCalling #Streaming #ChatCompletions #AgenticAI #LearnAI #AIEngineering