
I've been reading Stephen Toub's blog post about building a simple console-based .NET chat application from the ground up with Semantic Kernel. I'm following the examples, but instead of OpenAI I want to use Microsoft's Phi-3 and the nomic embedding model. I could recreate the first examples from the blog post using the Semantic Kernel Hugging Face plugin, but I can't get the text embedding example to run.

I've downloaded Phi-3 and nomic-embed-text and am running them on a local server with LM Studio.

Here's the code I came up with that uses the Hugging Face plugin:

using System.Net;
using System.Text;
using System.Text.RegularExpressions;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Embeddings;
using Microsoft.SemanticKernel.Memory;
using System.Numerics.Tensors;
using Microsoft.Extensions.Logging;
using Microsoft.SemanticKernel.ChatCompletion;

#pragma warning disable SKEXP0070, SKEXP0003, SKEXP0001, SKEXP0011, SKEXP0052, SKEXP0055, SKEXP0050  // Type is for evaluation purposes only and is subject to change or removal in future updates. 

internal class Program
{
    private static async Task Main(string[] args)
    {
        // Initialize the Semantic kernel
        IKernelBuilder kernelBuilder = Kernel.CreateBuilder();
        kernelBuilder.Services.ConfigureHttpClientDefaults(c => c.AddStandardResilienceHandler());
        var kernel = kernelBuilder
            .AddHuggingFaceTextEmbeddingGeneration("nomic-ai/nomic-embed-text-v1.5-GGUF/nomic-embed-text-v1.5.Q8_0.gguf",
            new Uri("http://localhost:1234/v1"),
            apiKey: "lm-studio",
            serviceId: null)
            .Build();

        var embeddingGenerator = kernel.GetRequiredService<ITextEmbeddingGenerationService>();
        var memoryBuilder = new MemoryBuilder();
        memoryBuilder.WithTextEmbeddingGeneration(embeddingGenerator);
        memoryBuilder.WithMemoryStore(new VolatileMemoryStore());
        var memory = memoryBuilder.Build();
        // Download a document and create embeddings for it
        string input = "What is an amphibian?";
        string[] examples = [ "What is an amphibian?",
                              "Cos'è un anfibio?",
                              "A frog is an amphibian.",
                              "Frogs, toads, and salamanders are all examples.",
                              "Amphibians are four-limbed and ectothermic vertebrates of the class Amphibia.",
                              "They are four-limbed and ectothermic vertebrates.",
                              "A frog is green.",
                              "A tree is green.",
                              "It's not easy bein' green.",
                              "A dog is a mammal.",
                              "A dog is a man's best friend.",
                              "You ain't never had a friend like me.",
                              "Rachel, Monica, Phoebe, Joey, Chandler, Ross"];
        for (int i = 0; i < examples.Length; i++)
            await memory.SaveInformationAsync("net7perf", examples[i], $"paragraph{i}");
        var embed = await embeddingGenerator.GenerateEmbeddingsAsync([input]);
        ReadOnlyMemory<float> inputEmbedding = (embed)[0];
        // Generate embeddings for each chunk.
        IList<ReadOnlyMemory<float>> embeddings = await embeddingGenerator.GenerateEmbeddingsAsync(examples);
        // Print the cosine similarity between the input and each example
        float[] similarity = embeddings.Select(e => TensorPrimitives.CosineSimilarity(e.Span, inputEmbedding.Span)).ToArray();
        similarity.AsSpan().Sort(examples.AsSpan(), (f1, f2) => f2.CompareTo(f1));
        Console.WriteLine("Similarity Example");
        for (int i = 0; i < similarity.Length; i++)
            Console.WriteLine($"{similarity[i]:F6}   {examples[i]}");
    }
}

At the line:

for (int i = 0; i < examples.Length; i++)
    await memory.SaveInformationAsync("net7perf", examples[i], $"paragraph{i}");

I get the following exception:

JsonException: The JSON value could not be converted to Microsoft.SemanticKernel.Connectors.HuggingFace.Core.TextEmbeddingResponse

Does anybody know what I'm doing wrong?

I've downloaded the following nuget packages into the project:

Id                                               Versions        ProjectName
Microsoft.SemanticKernel.Core                    1.15.0          LocalLlmApp
Microsoft.SemanticKernel.Plugins.Memory          1.15.0-alpha    LocalLlmApp
Microsoft.Extensions.Http.Resilience             8.6.0           LocalLlmApp
Microsoft.Extensions.Logging                     8.0.0           LocalLlmApp
Microsoft.SemanticKernel.Connectors.HuggingFace  1.15.0-preview  LocalLlmApp
Newtonsoft.Json                                  13.0.3          LocalLlmApp
Microsoft.Extensions.Logging.Console             8.0.0           LocalLlmApp

4 Answers


I found a solution to this problem thanks to Bruno Capuano's blog post about building a local RAG scenario using Phi-3 and SemanticKernel.

The code up to the string input = "What is an amphibian?"; line now looks like this:

    // Initialize the Semantic kernel
    IKernelBuilder kernelBuilder = Kernel.CreateBuilder();

    Kernel kernel = kernelBuilder
        .AddOpenAIChatCompletion(
            modelId: "phi3",
            endpoint: new Uri("http://localhost:1234"),
            apiKey: "lm-studio")
        .AddLocalTextEmbeddingGeneration()
        .Build();

    // get the embeddings generator service
    var embeddingGenerator = kernel.Services.GetRequiredService<ITextEmbeddingGenerationService>();
    var memory = new SemanticTextMemory(new VolatileMemoryStore(), embeddingGenerator);

So although we're not using OpenAI, we can still use the AddOpenAIChatCompletion method, because LM Studio exposes an OpenAI-compatible API.

The AddLocalTextEmbeddingGeneration() method comes from the SmartComponents.LocalEmbeddings.SemanticKernel NuGet package.

I wrote a small console program with most of the examples from the blog posts. You can find it on GitHub.


2 Comments

Thanks for your example. Do you have any idea how we can use the SemanticTextMemory with DI (builder.Services.AddKernelMemory) in Semantic Kernel?
I just found out that Kernel Memory and Semantic Memory (like SemanticTextMemory) are two different things. github.com/microsoft/…

We've made embedding computation fast and straightforward in the LM-Kit.NET SDK, which comes with a community license that includes embedding capabilities.

Disclaimer: I'm one of the developers of this toolkit.

Simply add the following NuGet packages:

  • LM-Kit.NET
  • LM-Kit.NET.Backend.Cuda12.Windows

I've ported your code snippet using it:

 LLM model = new LLM("https://huggingface.co/lm-kit/nomic-embed-text-1.5/resolve/main/nomic-embed-text-1.5-F16.gguf?download=true");
 Embedder embedder = new Embedder(model);

 Stopwatch sw = Stopwatch.StartNew();

 string input = "What is an amphibian?";
 string[] examples = [ "What is an amphibian?",
                           "Cos'è un anfibio?",
                           "A frog is an amphibian.",
                           "Frogs, toads, and salamanders are all examples.",
                           "Amphibians are four-limbed and ectothermic vertebrates of the class Amphibia.",
                           "They are four-limbed and ectothermic vertebrates.",
                           "A frog is green.",
                           "A tree is green.",
                           "It's not easy bein' green.",
                           "A dog is a mammal.",
                           "A dog is a man's best friend.",
                           "You ain't never had a friend like me.",
                           "Rachel, Monica, Phoebe, Joey, Chandler, Ross"];


 float[] inputEmbedding = await embedder.GetEmbeddingsAsync(input);
 float[][] exampleEmbeddings = await embedder.GetEmbeddingsAsync(examples);

 sw.Stop();

 Console.WriteLine($"Elapsed (ms): {Math.Round(sw.Elapsed.TotalMilliseconds)} - Similarities: ");

 for (int index = 0; index < examples.Length; index++)
 {
     float similarity = Embedder.GetCosineSimilarity(inputEmbedding, exampleEmbeddings[index]);
     Console.WriteLine($"{similarity} {examples[index]}");
 }

which produces the following output:

Elapsed (ms): 60 - Similarities:
0,99999046 What is an amphibian?
0,41398236 Cos'è un anfibio?
0,7832368 A frog is an amphibian.
0,7543625 Frogs, toads, and salamanders are all examples.
0,81931585 Amphibians are four-limbed and ectothermic vertebrates of the class Amphibia.
0,6827168 They are four-limbed and ectothermic vertebrates.
0,69980454 A frog is green.
0,51011163 A tree is green.
0,51940656 It's not easy bein' green.
0,7068557 A dog is a mammal.
0,63366437 A dog is a man's best friend.
0,4223255 You ain't never had a friend like me.
0,41579404 Rachel, Monica, Phoebe, Joey, Chandler, Ross
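For reference, the similarity score these APIs return is plain cosine similarity between the two embedding vectors. Here is a minimal, self-contained sketch of that computation; the Cosine helper below is illustrative only, not part of LM-Kit or Semantic Kernel (those provide Embedder.GetCosineSimilarity and TensorPrimitives.CosineSimilarity, respectively):

```csharp
using System;

// Cosine similarity: dot(a, b) / (||a|| * ||b||).
// Scores near 1 mean the vectors (and hence the texts) point in
// similar directions; near 0 means they are unrelated.
static float Cosine(float[] a, float[] b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same length.");

    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return (float)(dot / (Math.Sqrt(normA) * Math.Sqrt(normB)));
}

// Identical vectors score 1; orthogonal vectors score 0.
Console.WriteLine(Cosine(new float[] { 1, 0 }, new float[] { 1, 0 })); // prints 1
Console.WriteLine(Cosine(new float[] { 1, 0 }, new float[] { 0, 1 })); // prints 0
```

This also explains why the first line of the output above is ~0.99999 rather than exactly 1: floating-point rounding in the embedding pipeline.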



I think you cannot use AddHuggingFaceTextEmbeddingGeneration with an embedding model from LM Studio out of the box. The reason is that the HuggingFaceClient internally changes the URL, appending:

pipeline/feature-extraction/

 private Uri GetEmbeddingGenerationEndpoint(string modelId)
     => new($"{this.Endpoint}{this.Separator}pipeline/feature-extraction/{modelId}");

That matches the error message I get in the LM Studio console:

[2024-07-03 22:18:19.898] [ERROR] Unexpected endpoint or method. (POST /v1/embedding/pipeline/feature-extraction/nomic-ai/nomic-embed-text-v1.5-GGUF/nomic-embed-text-v1.5.Q5_K_M.gguf). Returning 200 anyway


To get this working, the URL would have to be changed before the request is sent.
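As a sketch of what such a change could look like, the request URI could be rewritten before it leaves the process, e.g. from a DelegatingHandler on the HttpClient passed to the connector. This assumes LM Studio's OpenAI-compatible /v1/embeddings route; RewriteEmbeddingUri is a hypothetical helper, not a Semantic Kernel API:

```csharp
using System;

// The HuggingFace connector builds its embedding endpoint as
//   {Endpoint}/pipeline/feature-extraction/{modelId}
// which LM Studio rejects. This illustrative helper redirects such a URI
// to LM Studio's OpenAI-compatible /v1/embeddings route instead.
static Uri RewriteEmbeddingUri(Uri original)
{
    const string marker = "pipeline/feature-extraction/";
    if (!original.AbsolutePath.Contains(marker, StringComparison.Ordinal))
        return original; // not a feature-extraction call; leave untouched

    return new Uri(original.GetLeftPart(UriPartial.Authority) + "/v1/embeddings");
}

var bad = new Uri("http://localhost:1234/v1/embedding/pipeline/feature-extraction/" +
                  "nomic-ai/nomic-embed-text-v1.5-GGUF/nomic-embed-text-v1.5.Q8_0.gguf");
Console.WriteLine(RewriteEmbeddingUri(bad)); // prints http://localhost:1234/v1/embeddings
```

In practice you would apply this rewrite inside DelegatingHandler.SendAsync and register the handler on the HttpClient you hand to the kernel builder; whether the HuggingFace request body then matches what LM Studio expects is a separate question.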



Currently you can also use a local model with ONNX, without the hurdle of running a local HTTP server for it.

Check it here: https://github.com/microsoft/semantic-kernel/tree/main/dotnet/samples/Demos/OnnxSimpleRAG

1 Comment

Thanks for the suggestion. I added examples using a local model with ONNX to my GitHub page.
