
I've been reading Stephen Toub's blog post about building a simple console-based .NET chat application from the ground up with Semantic Kernel. I'm following the examples, but instead of OpenAI I want to use Microsoft's Phi-3 and the nomic embedding model. I could recreate the first examples from the blog post using the Semantic Kernel Hugging Face plugin, but I can't get the text embedding example to run.

I've downloaded Phi-3 and nomic-embed-text and am running them on a local server with LM Studio.

Here's the code I came up with that uses the Hugging Face plugin:

using System.Net;
using System.Text;
using System.Text.RegularExpressions;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Embeddings;
using Microsoft.SemanticKernel.Memory;
using System.Numerics.Tensors;
using Microsoft.Extensions.Logging;
using Microsoft.SemanticKernel.ChatCompletion;

#pragma warning disable SKEXP0070, SKEXP0003, SKEXP0001, SKEXP0011, SKEXP0052, SKEXP0055, SKEXP0050  // Type is for evaluation purposes only and is subject to change or removal in future updates. 

internal class Program
{
    private static async Task Main(string[] args)
    {
        // Initialize the Semantic kernel
        IKernelBuilder kernelBuilder = Kernel.CreateBuilder();
        kernelBuilder.Services.ConfigureHttpClientDefaults(c => c.AddStandardResilienceHandler());
        var kernel = kernelBuilder
            .AddHuggingFaceTextEmbeddingGeneration("nomic-ai/nomic-embed-text-v1.5-GGUF/nomic-embed-text-v1.5.Q8_0.gguf",
            new Uri("http://localhost:1234/v1"),
            apiKey: "lm-studio",
            serviceId: null)
            .Build();

        var embeddingGenerator = kernel.GetRequiredService<ITextEmbeddingGenerationService>();
        var memoryBuilder = new MemoryBuilder();
        memoryBuilder.WithTextEmbeddingGeneration(embeddingGenerator);
        memoryBuilder.WithMemoryStore(new VolatileMemoryStore());
        var memory = memoryBuilder.Build();
        // Download a document and create embeddings for it
        string input = "What is an amphibian?";
        string[] examples = [ "What is an amphibian?",
                              "Cos'è un anfibio?",
                              "A frog is an amphibian.",
                              "Frogs, toads, and salamanders are all examples.",
                              "Amphibians are four-limbed and ectothermic vertebrates of the class Amphibia.",
                              "They are four-limbed and ectothermic vertebrates.",
                              "A frog is green.",
                              "A tree is green.",
                              "It's not easy bein' green.",
                              "A dog is a mammal.",
                              "A dog is a man's best friend.",
                              "You ain't never had a friend like me.",
                              "Rachel, Monica, Phoebe, Joey, Chandler, Ross"];
        for (int i = 0; i < examples.Length; i++)
            await memory.SaveInformationAsync("net7perf", examples[i], $"paragraph{i}");
        var embed = await embeddingGenerator.GenerateEmbeddingsAsync([input]);
        ReadOnlyMemory<float> inputEmbedding = (embed)[0];
        // Generate embeddings for each chunk.
        IList<ReadOnlyMemory<float>> embeddings = await embeddingGenerator.GenerateEmbeddingsAsync(examples);
        // Print the cosine similarity between the input and each example
        float[] similarity = embeddings.Select(e => TensorPrimitives.CosineSimilarity(e.Span, inputEmbedding.Span)).ToArray();
        similarity.AsSpan().Sort(examples.AsSpan(), (f1, f2) => f2.CompareTo(f1));
        Console.WriteLine("Similarity Example");
        for (int i = 0; i < similarity.Length; i++)
            Console.WriteLine($"{similarity[i]:F6}   {examples[i]}");
    }
}

At the line:

for (int i = 0; i < examples.Length; i++)
    await memory.SaveInformationAsync("net7perf", examples[i], $"paragraph{i}");

I get the following exception:

JsonException: The JSON value could not be converted to Microsoft.SemanticKernel.Connectors.HuggingFace.Core.TextEmbeddingResponse

Does anybody know what I'm doing wrong?

I've downloaded the following nuget packages into the project:

Id                                               Versions        ProjectName
Microsoft.SemanticKernel.Core                    1.15.0          LocalLlmApp
Microsoft.SemanticKernel.Plugins.Memory          1.15.0-alpha    LocalLlmApp
Microsoft.Extensions.Http.Resilience             8.6.0           LocalLlmApp
Microsoft.Extensions.Logging                     8.0.0           LocalLlmApp
Microsoft.SemanticKernel.Connectors.HuggingFace  1.15.0-preview  LocalLlmApp
Newtonsoft.Json                                  13.0.3          LocalLlmApp
Microsoft.Extensions.Logging.Console             8.0.0           LocalLlmApp

4 Answers


I found a solution to this problem thanks to Bruno Capuano's blog post about building a local RAG scenario using Phi-3 and SemanticKernel.

The code up to the string input = "What is an amphibian?"; line now looks like this:

    // Initialize the Semantic kernel
    IKernelBuilder kernelBuilder = Kernel.CreateBuilder();

    Kernel kernel = kernelBuilder
        .AddOpenAIChatCompletion(
            modelId: "phi3",
            endpoint: new Uri("http://localhost:1234"),
            apiKey: "lm-studio")
        .AddLocalTextEmbeddingGeneration()
        .Build();

    // get the embeddings generator service
    var embeddingGenerator = kernel.Services.GetRequiredService<ITextEmbeddingGenerationService>();
    var memory = new SemanticTextMemory(new VolatileMemoryStore(), embeddingGenerator);

So although we're not using OpenAI, we can still use the AddOpenAIChatCompletion method, because LM Studio exposes an OpenAI-compatible API.

The AddLocalTextEmbeddingGeneration() method comes from the SmartComponents.LocalEmbeddings.SemanticKernel NuGet package.

I wrote a small console program with most of the examples from the blog posts. You can find it on GitHub.


2 Comments

Thanks for your example. Do you have any idea how we can use the SemanticTextMemory with DI (builder.Services.AddKernelMemory) in Semantic Kernel?
I just found out that Kernel Memory and Semantic Memory (like SemanticTextMemory) are two different things. github.com/microsoft/…

We've made embedding computation fast and straightforward in the LM-Kit.NET SDK, which comes with a community license that includes embedding capabilities.

Disclaimer: I'm one of the developers of this toolkit.

Simply add the following NuGet packages:

  • LM-Kit.NET
  • LM-Kit.NET.Backend.Cuda12.Windows

I've ported your code snippet using it:

 LLM model = new LLM("https://huggingface.co/lm-kit/nomic-embed-text-1.5/resolve/main/nomic-embed-text-1.5-F16.gguf?download=true");
 Embedder embedder = new Embedder(model);

 Stopwatch sw = Stopwatch.StartNew();

 string input = "What is an amphibian?";
 string[] examples = [ "What is an amphibian?",
                           "Cos'è un anfibio?",
                           "A frog is an amphibian.",
                           "Frogs, toads, and salamanders are all examples.",
                           "Amphibians are four-limbed and ectothermic vertebrates of the class Amphibia.",
                           "They are four-limbed and ectothermic vertebrates.",
                           "A frog is green.",
                           "A tree is green.",
                           "It's not easy bein' green.",
                           "A dog is a mammal.",
                           "A dog is a man's best friend.",
                           "You ain't never had a friend like me.",
                           "Rachel, Monica, Phoebe, Joey, Chandler, Ross"];


 float[] inputEmbedding = await embedder.GetEmbeddingsAsync(input);
 float[][] exampleEmbeddings = await embedder.GetEmbeddingsAsync(examples);

 sw.Stop();

 Console.WriteLine($"Elapsed (ms): {Math.Round(sw.Elapsed.TotalMilliseconds)} - Similarities: ");

 for (int index = 0; index < examples.Length; index++)
 {
     float similarity = Embedder.GetCosineSimilarity(inputEmbedding, exampleEmbeddings[index]);
     Console.WriteLine($"{similarity} {examples[index]}");
 }

which produces the following output:

Elapsed (ms): 60 - Similarities:
0,99999046 What is an amphibian?
0,41398236 Cos'è un anfibio?
0,7832368 A frog is an amphibian.
0,7543625 Frogs, toads, and salamanders are all examples.
0,81931585 Amphibians are four-limbed and ectothermic vertebrates of the class Amphibia.
0,6827168 They are four-limbed and ectothermic vertebrates.
0,69980454 A frog is green.
0,51011163 A tree is green.
0,51940656 It's not easy bein' green.
0,7068557 A dog is a mammal.
0,63366437 A dog is a man's best friend.
0,4223255 You ain't never had a friend like me.
0,41579404 Rachel, Monica, Phoebe, Joey, Chandler, Ross
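For reference, the similarity score these APIs return is plain cosine similarity between the two embedding vectors. Here is a minimal, self-contained sketch of that computation; the Cosine helper below is illustrative only, not part of LM-Kit or Semantic Kernel (those provide Embedder.GetCosineSimilarity and TensorPrimitives.CosineSimilarity, respectively):

```csharp
using System;

// Cosine similarity: dot(a, b) / (||a|| * ||b||).
// Scores near 1 mean the vectors (and hence the texts) point in
// similar directions; near 0 means they are unrelated.
static float Cosine(float[] a, float[] b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same length.");

    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return (float)(dot / (Math.Sqrt(normA) * Math.Sqrt(normB)));
}

// Identical vectors score 1; orthogonal vectors score 0.
Console.WriteLine(Cosine(new float[] { 1, 0 }, new float[] { 1, 0 })); // prints 1
Console.WriteLine(Cosine(new float[] { 1, 0 }, new float[] { 0, 1 })); // prints 0
```

This also explains why the first line of the output above is ~0.99999 rather than exactly 1: floating-point rounding in the embedding pipeline.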



I think you cannot use AddHuggingFaceTextEmbeddingGeneration with an embedding model from LM Studio out of the box. The reason is that the HuggingFaceClient internally changes the URL, appending:

pipeline/feature-extraction/

 private Uri GetEmbeddingGenerationEndpoint(string modelId)
     => new($"{this.Endpoint}{this.Separator}pipeline/feature-extraction/{modelId}");

That matches the error message I get in the LM Studio console:

[2024-07-03 22:18:19.898] [ERROR] Unexpected endpoint or method. (POST /v1/embedding/pipeline/feature-extraction/nomic-ai/nomic-embed-text-v1.5-GGUF/nomic-embed-text-v1.5.Q5_K_M.gguf). Returning 200 anyway


To get this working, the URL would have to be changed before the request is sent.
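As a sketch of what such a change could look like, the request URI could be rewritten before it leaves the process, e.g. from a DelegatingHandler on the HttpClient passed to the connector. This assumes LM Studio's OpenAI-compatible /v1/embeddings route; RewriteEmbeddingUri is a hypothetical helper, not a Semantic Kernel API:

```csharp
using System;

// The HuggingFace connector builds its embedding endpoint as
//   {Endpoint}/pipeline/feature-extraction/{modelId}
// which LM Studio rejects. This illustrative helper redirects such a URI
// to LM Studio's OpenAI-compatible /v1/embeddings route instead.
static Uri RewriteEmbeddingUri(Uri original)
{
    const string marker = "pipeline/feature-extraction/";
    if (!original.AbsolutePath.Contains(marker, StringComparison.Ordinal))
        return original; // not a feature-extraction call; leave untouched

    return new Uri(original.GetLeftPart(UriPartial.Authority) + "/v1/embeddings");
}

var bad = new Uri("http://localhost:1234/v1/embedding/pipeline/feature-extraction/" +
                  "nomic-ai/nomic-embed-text-v1.5-GGUF/nomic-embed-text-v1.5.Q8_0.gguf");
Console.WriteLine(RewriteEmbeddingUri(bad)); // prints http://localhost:1234/v1/embeddings
```

In practice you would apply this rewrite inside DelegatingHandler.SendAsync and register the handler on the HttpClient you hand to the kernel builder; whether the HuggingFace request body then matches what LM Studio expects is a separate question.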



Currently you can also use a local model with ONNX, without the hurdle of running a local HTTP server for it.

Check it here: https://github.com/microsoft/semantic-kernel/tree/main/dotnet/samples/Demos/OnnxSimpleRAG

1 Comment

Thanks for the suggestion. I added examples using a local model with ONNX to my GitHub page.
