Skip to content

Add sample IChatClient and IEmbeddingGenerator implementations #46482

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
38 changes: 38 additions & 0 deletions docs/ai/advanced/sample-implementations.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
---
title: "Sample implementations of IChatClient and IEmbeddingGenerator"
description: Learn more about the IChatClient and IEmbeddingGenerator interfaces, see simple implementations, and find links to concrete implementations.
ms.topic: article
ms.date: 05/28/2025
---

# Sample implementations of IChatClient and IEmbeddingGenerator

.NET libraries that provide clients for language models and services can provide implementations of the <xref:Microsoft.Extensions.AI.IChatClient> and <xref:Microsoft.Extensions.AI.IEmbeddingGenerator`2> interfaces. Any consumers of the interfaces are then able to interoperate seamlessly with these models and services via the abstractions.

## The `IChatClient` interface

The <xref:Microsoft.Extensions.AI.IChatClient> interface defines a client abstraction responsible for interacting with AI services that provide chat capabilities. It includes methods for sending and receiving messages with multi-modal content (such as text, images, and audio), either as a complete set or streamed incrementally. Additionally, it allows for retrieving strongly typed services provided by the client or its underlying services.

The following sample implements `IChatClient` to show the general structure.

:::code language="csharp" source="./snippets/sample-implementations/SampleChatClient.cs":::

For more realistic, concrete implementations of `IChatClient`, see:

- [AzureAIInferenceChatClient.cs](https://github.com/dotnet/extensions/blob/main/src/Libraries/Microsoft.Extensions.AI.AzureAIInference/AzureAIInferenceChatClient.cs)
- [OpenAIChatClient.cs](https://github.com/dotnet/extensions/blob/main/src/Libraries/Microsoft.Extensions.AI.OpenAI/OpenAIChatClient.cs)
- [Microsoft.Extensions.AI chat clients](https://github.com/dotnet/extensions/tree/main/src/Libraries/Microsoft.Extensions.AI/ChatCompletion)

## The `IEmbeddingGenerator<TInput,TEmbedding>` interface

The <xref:Microsoft.Extensions.AI.IEmbeddingGenerator`2> interface represents a generic generator of embeddings. Here, `TInput` is the type of input values being embedded, and `TEmbedding` is the type of generated embedding, which inherits from the <xref:Microsoft.Extensions.AI.Embedding> class.

The `Embedding` class serves as a base class for embeddings generated by an `IEmbeddingGenerator<TInput,TEmbedding>`. It's designed to store and manage the metadata and data associated with embeddings. Derived types, like `Embedding<T>`, provide the concrete embedding vector data. For example, an `Embedding<float>` exposes a `ReadOnlyMemory<float> Vector { get; }` property for access to its embedding data.

The `IEmbeddingGenerator<TInput,TEmbedding>` interface defines a method to asynchronously generate embeddings for a collection of input values, with optional configuration and cancellation support. It also provides metadata describing the generator and allows for the retrieval of strongly typed services that can be provided by the generator or its underlying services.

The following code shows how the `SampleEmbeddingGenerator` class implements the `IEmbeddingGenerator<TInput,TEmbedding>` interface. It has a primary constructor that accepts an endpoint and model ID, which are used to identify the generator. It also implements the <xref:Microsoft.Extensions.AI.IEmbeddingGenerator`2.GenerateAsync(System.Collections.Generic.IEnumerable{`0},Microsoft.Extensions.AI.EmbeddingGenerationOptions,System.Threading.CancellationToken)> method to generate embeddings for a collection of input values.

:::code language="csharp" source="./snippets/sample-implementations/SampleEmbeddingGenerator.cs":::

This sample implementation just generates random embedding vectors. For a more realistic, concrete implementation, see [OpenTelemetryEmbeddingGenerator.cs](https://github.com/dotnet/extensions/blob/main/src/Libraries/Microsoft.Extensions.AI/Embeddings/OpenTelemetryEmbeddingGenerator.cs).
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
<Project Sdk="Microsoft.NET.Sdk">

<PropertyGroup>
<TargetFramework>net9.0</TargetFramework>
<ImplicitUsings>enable</ImplicitUsings>
<Nullable>enable</Nullable>
</PropertyGroup>

<ItemGroup>
<PackageReference Include="Microsoft.Extensions.AI.Abstractions" Version="9.5.0" />
</ItemGroup>

</Project>
Original file line number Diff line number Diff line change
@@ -1,9 +1,11 @@
using System.Runtime.CompilerServices;
using Microsoft.Extensions.AI;

public sealed class SampleChatClient(Uri endpoint, string modelId) : IChatClient
public sealed class SampleChatClient(Uri endpoint, string modelId)
: IChatClient
{
public ChatClientMetadata Metadata { get; } = new(nameof(SampleChatClient), endpoint, modelId);
public ChatClientMetadata Metadata { get; } =
new(nameof(SampleChatClient), endpoint, modelId);

public async Task<ChatResponse> GetResponseAsync(
IEnumerable<ChatMessage> chatMessages,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -16,10 +16,10 @@ public async Task<GeneratedEmbeddings<Embedding<float>>> GenerateAsync(
await Task.Delay(100, cancellationToken);

// Create random embeddings.
return new GeneratedEmbeddings<Embedding<float>>(
from value in values
return [.. from value in values
select new Embedding<float>(
Enumerable.Range(0, 384).Select(_ => Random.Shared.NextSingle()).ToArray()));
Enumerable.Range(0, 384)
.Select(_ => Random.Shared.NextSingle()).ToArray())];
}

public object? GetService(Type serviceType, object? serviceKey) =>
Expand Down
23 changes: 5 additions & 18 deletions docs/ai/microsoft-extensions-ai.md
Original file line number Diff line number Diff line change
Expand Up @@ -43,15 +43,14 @@ The following subsections show specific [IChatClient](#the-ichatclient-interface

The following sections show specific [IEmbeddingGenerator](#the-iembeddinggenerator-interface) usage examples:

- [Sample implementation](#sample-implementation)
- [Create embeddings](#create-embeddings)
- [Pipelines of functionality](#pipelines-of-functionality)

### The `IChatClient` interface

The <xref:Microsoft.Extensions.AI.IChatClient> interface defines a client abstraction responsible for interacting with AI services that provide chat capabilities. It includes methods for sending and receiving messages with multi-modal content (such as text, images, and audio), either as a complete set or streamed incrementally. Additionally, it allows for retrieving strongly typed services provided by the client or its underlying services.

.NET libraries that provide clients for language models and services can provide an implementation of the `IChatClient` interface. Any consumers of the interface are then able to interoperate seamlessly with these models and services via the abstractions.
.NET libraries that provide clients for language models and services can provide an implementation of the `IChatClient` interface. Any consumers of the interface are then able to interoperate seamlessly with these models and services via the abstractions. You can see a simple implementation at [Sample implementations of IChatClient and IEmbeddingGenerator](advanced/sample-implementations.md).

#### Request a chat response

Expand Down Expand Up @@ -195,25 +194,13 @@ If you don't know ahead of time whether the service is stateless or stateful, yo

### The `IEmbeddingGenerator` interface

The <xref:Microsoft.Extensions.AI.IEmbeddingGenerator`2> interface represents a generic generator of embeddings. Here, `TInput` is the type of input values being embedded, and `TEmbedding` is the type of generated embedding, which inherits from the <xref:Microsoft.Extensions.AI.Embedding> class.
The <xref:Microsoft.Extensions.AI.IEmbeddingGenerator`2> interface represents a generic generator of embeddings. For the generic type parameters, `TInput` is the type of input values being embedded, and `TEmbedding` is the type of generated embedding, which inherits from the <xref:Microsoft.Extensions.AI.Embedding> class.

The `Embedding` class serves as a base class for embeddings generated by an `IEmbeddingGenerator`. It's designed to store and manage the metadata and data associated with embeddings. Derived types, like `Embedding<T>`, provide the concrete embedding vector data. For example, an `Embedding<float>` exposes a `ReadOnlyMemory<float> Vector { get; }` property for access to its embedding data.
The `Embedding` class serves as a base class for embeddings generated by an `IEmbeddingGenerator`. It's designed to store and manage the metadata and data associated with embeddings. Derived types, like <xref:Microsoft.Extensions.AI.Embedding`1>, provide the concrete embedding vector data. For example, an `Embedding<float>` exposes a `ReadOnlyMemory<float> Vector { get; }` property for access to its embedding data.

The `IEmbeddingGenerator` interface defines a method to asynchronously generate embeddings for a collection of input values, with optional configuration and cancellation support. It also provides metadata describing the generator and allows for the retrieval of strongly typed services that can be provided by the generator or its underlying services.

#### Sample implementation

The following sample implementation of `IEmbeddingGenerator` shows the general structure.

:::code language="csharp" source="snippets/microsoft-extensions-ai/AI.Shared/SampleEmbeddingGenerator.cs":::

The preceding code:

- Defines a class named `SampleEmbeddingGenerator` that implements the `IEmbeddingGenerator<string, Embedding<float>>` interface.
- Has a primary constructor that accepts an endpoint and model ID, which are used to identify the generator.
- Implements the `GenerateAsync` method to generate embeddings for a collection of input values.

The sample implementation just generates random embedding vectors. You can find a concrete implementation in the [📦 Microsoft.Extensions.AI.OpenAI](https://www.nuget.org/packages/Microsoft.Extensions.AI.OpenAI) package.
Most users don't need to implement the `IEmbeddingGenerator` interface. However, if you're a library author, you can see a simple implementation at [Sample implementations of IChatClient and IEmbeddingGenerator](advanced/sample-implementations.md).

#### Create embeddings

Expand Down Expand Up @@ -247,7 +234,7 @@ In this way, the `RateLimitingEmbeddingGenerator` can be composed with other `IE

You can start building with `Microsoft.Extensions.AI` in the following ways:

- **Library developers**: If you own libraries that provide clients for AI services, consider implementing the interfaces in your libraries. This allows users to easily integrate your NuGet package via the abstractions.
- **Library developers**: If you own libraries that provide clients for AI services, consider implementing the interfaces in your libraries. This allows users to easily integrate your NuGet package via the abstractions. For example implementations, see [Sample implementations of IChatClient and IEmbeddingGenerator](advanced/sample-implementations.md).
- **Service consumers**: If you're developing libraries that consume AI services, use the abstractions instead of hardcoding to a specific AI service. This approach gives your consumers the flexibility to choose their preferred provider.
- **Application developers**: Use the abstractions to simplify integration into your apps. This enables portability across models and services, facilitates testing and mocking, leverages middleware provided by the ecosystem, and maintains a consistent API throughout your app, even if you use different services in different parts of your application.
- **Ecosystem contributors**: If you're interested in contributing to the ecosystem, consider writing custom middleware components.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,10 @@
<Nullable>enable</Nullable>
</PropertyGroup>

<ItemGroup>
<PackageReference Include="OllamaSharp" Version="5.1.19" />
</ItemGroup>

<ItemGroup>
<ProjectReference Include="..\AI.Shared\AI.Shared.csproj" />
</ItemGroup>
Expand Down
Original file line number Diff line number Diff line change
@@ -1,7 +1,8 @@
using Microsoft.Extensions.AI;
using OllamaSharp;

IChatClient client = new SampleChatClient(
new Uri("http://coolsite.ai"), "target-ai-model");
IChatClient client = new OllamaApiClient(
new Uri("http://localhost:11434/"), "phi3:mini");

// <Snippet1>
List<ChatMessage> history = [];
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@

<ItemGroup>
<PackageReference Include="Microsoft.Extensions.Hosting" Version="10.0.0-preview.3.25171.5" />
<PackageReference Include="OllamaSharp" Version="5.1.19" />
<ProjectReference Include="..\AI.Shared\AI.Shared.csproj" />
</ItemGroup>

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -3,12 +3,17 @@
using Microsoft.Extensions.AI;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using OllamaSharp;

// <SnippetUse>
HostApplicationBuilder builder = Host.CreateApplicationBuilder(args);

IChatClient client = new OllamaApiClient(
new Uri("http://localhost:11434/"),
"phi3:mini");

builder.Services.AddChatClient(services =>
new SampleChatClient(new Uri("http://localhost"), "test")
client
.AsBuilder()
.UseDistributedCache()
.UseRateLimiting()
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,10 @@
<Nullable>enable</Nullable>
</PropertyGroup>

<ItemGroup>
<PackageReference Include="OllamaSharp" Version="5.1.19" />
</ItemGroup>

<ItemGroup>
<ProjectReference Include="..\AI.Shared\AI.Shared.csproj" />
</ItemGroup>
Expand Down
Original file line number Diff line number Diff line change
@@ -1,9 +1,10 @@
using Microsoft.Extensions.AI;
using OllamaSharp;
using System.Threading.RateLimiting;

IEmbeddingGenerator<string, Embedding<float>> generator =
new RateLimitingEmbeddingGenerator(
new SampleEmbeddingGenerator(new Uri("http://coolsite.ai"), "target-ai-model"),
new OllamaApiClient(new Uri("http://localhost:11434/"), "phi3:mini"),
new ConcurrencyLimiter(new()
{
PermitLimit = 1,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,10 @@
<Nullable>enable</Nullable>
</PropertyGroup>

<ItemGroup>
<PackageReference Include="OllamaSharp" Version="5.1.19" />
</ItemGroup>

<ItemGroup>
<ProjectReference Include="..\AI.Shared\AI.Shared.csproj" />
</ItemGroup>
Expand Down
Original file line number Diff line number Diff line change
@@ -1,9 +1,9 @@
// <Snippet1>
using Microsoft.Extensions.AI;
using OllamaSharp;

IEmbeddingGenerator<string, Embedding<float>> generator =
new SampleEmbeddingGenerator(
new Uri("http://coolsite.ai"), "target-ai-model");
new OllamaApiClient(new Uri("http://localhost:11434/"), "phi3:mini");

foreach (Embedding<float> embedding in
await generator.GenerateAsync(["What is AI?", "What is .NET?"]))
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@

<ItemGroup>
<PackageReference Include="Microsoft.Extensions.Caching.Memory" Version="10.0.0-preview.4.25258.110" />
<PackageReference Include="OllamaSharp" Version="5.1.19" />
<PackageReference Include="OpenTelemetry.Exporter.Console" Version="1.12.0" />
<ProjectReference Include="..\AI.Shared\AI.Shared.csproj" />
</ItemGroup>
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@
using Microsoft.Extensions.Caching.Distributed;
using Microsoft.Extensions.Caching.Memory;
using Microsoft.Extensions.Options;
using OllamaSharp;
using OpenTelemetry.Trace;

// Configure OpenTelemetry exporter
Expand All @@ -14,7 +15,7 @@
// Explore changing the order of the intermediate "Use" calls to see
// what impact that has on what gets cached and traced.
IEmbeddingGenerator<string, Embedding<float>> generator = new EmbeddingGeneratorBuilder<string, Embedding<float>>(
new SampleEmbeddingGenerator(new Uri("http://coolsite.ai"), "target-ai-model"))
new OllamaApiClient(new Uri("http://localhost:11434/"), "phi3:mini"))
.UseDistributedCache(
new MemoryDistributedCache(
Options.Create(new MemoryDistributedCacheOptions())))
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,10 @@
<Nullable>enable</Nullable>
</PropertyGroup>

<ItemGroup>
<PackageReference Include="OllamaSharp" Version="5.1.19" />
</ItemGroup>

<ItemGroup>
<ProjectReference Include="..\AI.Shared\AI.Shared.csproj" />
</ItemGroup>
Expand Down
Original file line number Diff line number Diff line change
@@ -1,7 +1,8 @@
using Microsoft.Extensions.AI;
using OllamaSharp;

IChatClient client = new SampleChatClient(
new Uri("http://coolsite.ai"), "target-ai-model");
IChatClient client = new OllamaApiClient(
new Uri("http://localhost:11434/"), "phi3:mini");

// <Snippet1>
Console.WriteLine(await client.GetResponseAsync(
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,10 @@
<Nullable>enable</Nullable>
</PropertyGroup>

<ItemGroup>
<PackageReference Include="OllamaSharp" Version="5.1.19" />
</ItemGroup>

<ItemGroup>
<ProjectReference Include="..\AI.Shared\AI.Shared.csproj" />
</ItemGroup>
Expand Down
Original file line number Diff line number Diff line change
@@ -1,7 +1,8 @@
using Microsoft.Extensions.AI;
using OllamaSharp;

IChatClient client = new SampleChatClient(
new Uri("http://coolsite.ai"), "target-ai-model");
IChatClient client = new OllamaApiClient(
new Uri("http://localhost:11434/"), "phi3:mini");

// <Snippet1>
await foreach (ChatResponseUpdate update in client.GetStreamingResponseAsync("What is AI?"))
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@
</PropertyGroup>

<ItemGroup>
<PackageReference Include="OllamaSharp" Version="5.1.19" />
<PackageReference Include="OpenTelemetry.Exporter.Console" Version="1.12.0" />
</ItemGroup>

Expand Down
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
using Microsoft.Extensions.AI;
using OllamaSharp;
using OpenTelemetry.Trace;

// Configure OpenTelemetry exporter.
Expand All @@ -8,10 +9,10 @@
.AddConsoleExporter()
.Build();

var sampleChatClient = new SampleChatClient(
new Uri("http://coolsite.ai"), "target-ai-model");
IChatClient ollamaClient = new OllamaApiClient(
new Uri("http://localhost:11434/"), "phi3:mini");

IChatClient client = new ChatClientBuilder(sampleChatClient)
IChatClient client = new ChatClientBuilder(ollamaClient)
.UseOpenTelemetry(
sourceName: sourceName,
configure: c => c.EnableSensitiveData = true)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@

<ItemGroup>
<PackageReference Include="Microsoft.Extensions.AI" Version="9.5.0" />
<PackageReference Include="OllamaSharp" Version="5.1.19" />
</ItemGroup>

<ItemGroup>
Expand Down
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
using Microsoft.Extensions.AI;
using OllamaSharp;

IChatClient client = new SampleChatClient(
new Uri("http://coolsite.ai"), "target-ai-model");
IChatClient client = new OllamaApiClient(
new Uri("http://localhost:11434/"), "phi3:mini");

Console.WriteLine(await client.GetResponseAsync("What is AI?"));
4 changes: 4 additions & 0 deletions docs/ai/toc.yml
Original file line number Diff line number Diff line change
Expand Up @@ -91,6 +91,10 @@ items:
href: tutorials/evaluate-with-reporting.md
- name: "Evaluate response safety with caching and reporting"
href: tutorials/evaluate-safety.md
- name: Advanced
items:
- name: Sample interface implementations
href: advanced/sample-implementations.md
- name: Resources
items:
- name: API reference
Expand Down