Newest 'huggingface' Questions

0 votes

0 answers

42 views

Optimization Challenge in Hugging Face: Efficiently Serving Multiple, Differently Sized LLMs on a Single GPU with PyTorch [closed]

I am currently working on a Python-based Gen AI project that requires the efficient deployment and serving of multiple LLMs, specifically models with different parameter counts (Llama-2 7B and Mistral ...

Amira Yassin

1

asked yesterday

0 votes

0 answers

30 views

When Running a GGUF Model from Hugging Face Using Ollama, How Will the Modelfile Be Selected?

Background knowledge: According to the Hugging Face documentation, it is now possible to run a GGUF model directly with Ollama using ollama run hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF, something like ...

Gorun

118

asked 2 days ago

0 votes

1 answer

143 views

Error in pyannote.audio Pipeline. Python. HuggingFace

I'm getting an issue with the use_auth_token keyword while implementing a pipeline from pyannote.audio. I already used: pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization-3.1", ...

phantomguild

1

asked Nov 12 at 12:15

-1 votes

0 answers

24 views

How to use the models from huggingface from local machine server

I am trying to use the following model, Emotion Llama, and to understand how to download the models from Hugging Face and place them in the right directory. It actually suggests downloading three models in ...

Jose Ramon

5,364

asked Nov 11 at 20:05

1 vote

0 answers

159 views

Transformers 'could not import module pipeline' to jupyter notebook

I need to run a series of pre-trained, fine-tuned models from Hugging Face in a Jupyter notebook. I have updated to the latest version of both PyTorch and Transformers, but when I run the code from ...

Alex Colville

11

asked Nov 4 at 9:16

1 vote

1 answer

78 views

Xcode Can't Find swift-transformers Package

I'm trying to implement Speech-to-Text transcription in my Swift app using Hugging Face's swift-transformers package to run Whisper models locally. I've added the package to my Xcode project, but when ...

Zaid

451

asked Nov 2 at 15:07

1 vote

0 answers

68 views

How to pass P_map: dict[str, torch.Tensor] to PEFT (LoRA)?

My proxy goal is to change LoRA from h = (W + BA)x to h = (W + BAP)x. Preliminary code is attached for your reference. My actual goal is to train a model with the following loss: \tilde{\Theta} = \arg\min_{\hat{\Delta}} \| f_{(...

Jason Rich Darmawan

2,193

asked Oct 15 at 5:25
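
The change described in the LoRA question above, from h = (W + BA)x to h = (W + BAP)x, can be illustrated with a minimal pure-Python sketch. It does not use PEFT and does not show how a P_map would be passed to it; the helper names, the shapes (W is 2×2, B is 2×1, A is 1×2, P is 2×2), and all numeric values are assumptions chosen only for illustration.

```python
# Minimal sketch (no PEFT, no PyTorch): compare h = (W + BA)x with h = (W + BAP)x.
# Shapes and values below are illustrative assumptions, not taken from the question.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def matadd(a, b):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def matvec(m, v):
    """Multiply a matrix by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def lora_forward(W, B, A, x, P=None):
    """Return (W + BA)x, or (W + BAP)x when a projection P is given."""
    delta = matmul(B, A)          # low-rank update BA, same shape as W
    if P is not None:
        delta = matmul(delta, P)  # P must be d_in x d_in so BAP keeps W's shape
    return matvec(matadd(W, delta), x)

W = [[1.0, 0.0], [0.0, 1.0]]       # frozen base weight (d_out=2, d_in=2)
B = [[1.0], [2.0]]                 # rank-1 factor, d_out x r
A = [[0.5, 1.0]]                   # rank-1 factor, r x d_in
P_identity = [[1.0, 0.0], [0.0, 1.0]]
P_swap = [[0.0, 1.0], [1.0, 0.0]]  # hypothetical P that swaps the two inputs
x = [1.0, 3.0]

h_plain = lora_forward(W, B, A, x)
h_identity = lora_forward(W, B, A, x, P_identity)
h_swap = lora_forward(W, B, A, x, P_swap)
```

With P set to the identity the two formulations coincide, which is a convenient sanity check before training with a non-trivial P.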

-1 votes

2 answers

95 views

LangChain HuggingFace ChatHuggingFace raises StopIteration with any model

I’m trying to use LangChain’s Hugging Face integration to chat with the model TinyLlama/TinyLlama-1.1B-Chat-v1.0 for the very first time, but I’m getting a StopIteration error when calling .invoke(). ...

forstudy

73

asked Oct 10 at 15:36

0 votes

0 answers

61 views

ONNX Runtime Helsinki-NLP in Java

Has anyone managed to translate something using Helsinki-NLP and ONNX Runtime in Java? Using a Python script, I generated these files: ├── encoder_model.onnx ├── decoder_model.onnx ├── ...

minizibi

393

asked Oct 9 at 8:16

2 votes

1 answer

80 views

Why does my system message content contain "image": None when mapping conversation dataset?

I'm creating a conversation dataset for an image classification task where the system message should contain only text, and the user message contains both text and an image. However, after mapping my ...

GauravGiri

21

asked Oct 1 at 18:05

0 votes

0 answers

96 views

tokenizer error: RuntimeError: The size of tensor a (4) must match the size of tensor b (8) at non-singleton dimension 0

When fine tuning a model, using the HuggingFace inference hub, the error below was encountered: The tokenizer has new PAD/BOS/EOS tokens that differ from the model config and generation config. The ...

Kingsley Uchunor

1

asked Oct 1 at 15:52

0 votes

0 answers

89 views

How to solve device mismatch issue when using offloading with QwenImageEditPlus pipeline and GGUF weights

After failing to make the QwenImageEditPlus run (https://huggingface.co/spaces/discord-community/README/discussions/9#68d260e32053323e6bfab30c), I tried a different approach (thanks to all the example ...

Siladittya

1,215

asked Sep 24 at 7:36

1 vote

0 answers

61 views

Why does hugging face trainer still recognize different device between my encoder & classifier head even after I manually map it on the same device

I encountered this error while trying to run the Hugging Face Trainer on multiple GPUs. RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1! I use a ...

Dwi Rezky Fahlan

11

asked Sep 15 at 3:45

0 votes

1 answer

379 views

Hugging face api is not returning any response and receiving errors

I have been stepping into GenAI and am currently working with Hugging Face's open-source models. However, I am not able to receive any response from the API. I have created an access token on hugging ...

Apoorva Walia

21

asked Sep 11 at 3:38

0 votes

0 answers

61 views

How do I compute validation loss for a fine-tuned Qwen model in Hugging Face Transformers during evaluation?

I trained a Qwen model on my own dataset. Now I need to evaluate my trained model using the loss function, but I don’t know how to do it. I saw examples for other metrics such as accuracy and ...

Kathi Meyer

1

asked Sep 3 at 8:05

0 votes

0 answers

25 views

Commit unable to auto-activate while using Gradio on Huggingface, but adding a blank line and commit it from the website works

I was trying to use Gradio in Hugging Face Spaces. I added an app.py file in VS Code, and VS Code told me that the push was all right. However, Hugging Face Spaces declared "No application file...

Alex YAN

1

asked Sep 2 at 4:41

1 vote

0 answers

37 views

Using HuggingFace API using Azure ML Studio

I have an Azure ML studio notebook. I want to use the HuggingFace "cross-encoder-nli-deberta-v3-base" model to do zero-shot classification. This code instantiates the endpoint without error: ...

msand

11

asked Aug 28 at 0:40

0 votes

0 answers

218 views

"Model is not supported for task text generation, supported task: conversational" on LangChain HuggingFaceInference JS

I'm just trying to use a text-generation LLM from a Hugging Face Inference Provider using LangChain in Node.js. I chose the model Qwen/Qwen2.5-1.5B-Instruct; trying out other models did not seem to work, I couldn't ...

Basel_Dev

21

asked Aug 26 at 23:15

0 votes

0 answers

57 views

Smolagents CodeAgent gets error from correct code

The Smolagents CodeAgent is given a task to convert a string into markdown table format. It successfully captures the related part of the string and writes the code for markdown table formatting. ...

aearslan

176

asked Aug 24 at 18:32

0 votes

1 answer

117 views

How to load dataset from huggingface to google colab?

I am trying to load a training dataset in my Google Colab notebook but keep getting an error. Here is the code snippet which returns the error: from datasets import load_dataset ds = load_dataset("...

AlecArk

1

asked Aug 23 at 14:07

1 vote

2 answers

171 views

How to interpret cosine similarity using EmbeddingSimilarityEvaluator

I am reading about Text embeddings in LLM from the book Hands-On Large Language Models. It is mentioned that as follows: from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator from ...

venkysmarty

11.5k

asked Aug 15 at 11:49
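
As background for the cosine similarity question above, here is a minimal pure-Python sketch of the quantity itself. It does not reproduce EmbeddingSimilarityEvaluator or the book's code; the function name and the toy vectors are assumptions for illustration only.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 2-D "embeddings" (hypothetical values, not real model outputs).
same_direction = cosine_similarity([3.0, 4.0], [6.0, 8.0])   # parallel vectors
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])       # unrelated directions
opposite = cosine_similarity([3.0, 4.0], [-3.0, -4.0])       # anti-parallel vectors
```

The score depends only on direction, not on vector length: it approaches 1 for vectors pointing the same way, 0 for orthogonal ones, and -1 for opposite ones.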

1 vote

0 answers

799 views

KeyError when loading GPT-OSS-20B locally with transformers on CPU

I’m trying to load gpt-oss-20b locally using Hugging Face transformers with CPU only. Minimal code: from transformers import pipeline model_path = "/mnt/d/Projects/models/gpt-oss-20b" pipe = ...

mindlesscoding

1

asked Aug 14 at 20:00

0 votes

1 answer

161 views

Is HuggingFace Accelerate's init_empty_weights Context Manager (Properly) Implemented for a Diffuser?

Discussion HuggingFace accelerate's init_empty_weights() properly loads all text encoders I tested to the PyTorch meta device and consumes no apparent memory or disk space while loaded. However, it ...

Matthew Ross

185

asked Aug 11 at 21:54

0 votes

0 answers

232 views

TypeError: PPOTrainer.__init__() got an unexpected keyword argument 'config'

I am trying to initialize a PPO_trainer but have issues. from trl import PPOTrainer, PPOConfig ppo_config = PPOConfig( batch_size=4, learning_rate=1e-5, mini_batch_size=2, use_cpu=...

m0ss

472

asked Aug 6 at 15:43

1 vote

0 answers

53 views

BLIP Fine-Tuning: Special Token Always Biased to One Class in Generated Caption

I'm trying to fine-tune Hugging Face BLIP (Bootstrapped Language-Image Pretraining) to classify pizza boxes as either recyclable (clean) or non-recyclable (contaminated) by generating captions that ...

Wow Wow

11

asked Aug 4 at 20:47

0 votes

0 answers

56 views

Why is LeRobot’s policy ignoring additional camera streams despite custom `input_features`?

I'm using LeRobot to train a SO101 arm policy with 3 video streams (front, above, gripper) and a state vector. The dataset can be found at this link. I created a custom JSON config (the train_config....

Aaron Serpilin

31

asked Jul 29 at 13:44

0 votes

0 answers

47 views

TypeError: 'NoneType' object is not iterable when using ChatHuggingFace with TinyLlama/TinyLlama-1.1B-Chat-v1.0 in LangChain

I'm trying to use the TinyLlama/TinyLlama-1.1B-Chat-v1.0 model from Hugging Face with LangChain using the langchain_huggingface integration. My goal is to get a simple response from the model using ...

Simran Dalvi

1

asked Jul 28 at 20:42

1 vote

1 answer

322 views

Getting StopIteration when using HuggingFaceEndpoint with LangChain and flan-t5-large

I'm trying to use the langchain_huggingface.HuggingFaceEndpoint integration to call the "google/flan-t5-large" model from Hugging Face in a LangChain pipeline. Here's my code: from langchain....

coderbhai

41

asked Jul 25 at 3:18

0 votes

1 answer

239 views

RuntimeError: CUDA error: named symbol not found when using TorchAoConfig with Qwen2.5-VL-7B-Instruct model

I'm trying to load the Qwen2.5-VL-7B-Instruct model from hugging face with 4-bit weight-only quantization using TorchAoConfig (similar to how its mentioned in the documentation here), but I'm getting ...

Sankalp Dhupar

73

asked Jul 21 at 23:41

0 votes

0 answers

63 views

Hugging Face applying Transformation on nested to datasets without loading into memory

I am trying to apply the transformation below to prepare my datasets for fine-tuning with Unsloth and Hugging Face. It requires the dataset to be in the following format. def convert_to_conversation(sample): ...

SoraHeart

428

asked Jul 4 at 11:27

0 votes

0 answers

92 views

Python Flask App with LlamaIndex + Ollama application significantly slower in offline Docker container vs online version with identical setup

Problem I have two nearly identical Python applications using LlamaIndex + Ollama for document Q&A: Online version: ~5 seconds response time Offline version: ~18 seconds response time FYI i am ...

sai

1

asked Jul 2 at 15:33

0 votes

1 answer

240 views

Language Model Evaluation with Custom Task - Hugging Face Lighteval

I am creating a benchmark to evaluate a language model. First, I generated the dataset that I'm gonna prompt the Language model with. Subsequently, I tried to evaluate any tiny language model just to ...

Mahmoud Hanouneh

1

asked Jun 26 at 19:04

1 vote

1 answer

677 views

SFTTrainer: The specified `eos_token` ('<EOS_TOKEN>') is not found in the vocabulary of the given `processing_class` (Qwen2TokenizerFast)

I upgraded my Python trl package to version 0.18.1. I use the SFTTrainer of the package to finetune a Qwen2.5 LLM neural net. Previously, I used the TrainingArgument class to set additional params. I ...

soosmann

119

asked Jun 12 at 10:35

0 votes

1 answer

142 views

How to get the code of the hugging face models?

There is a simple way to download a model from hugging face, # Load model directly from transformers import AutoTokenizer, AutoModel tokenizer = AutoTokenizer.from_pretrained("sentence-...

Uwe.Schneider

1,485

asked Jun 7 at 23:03

3 votes

0 answers

208 views

Cannot run inference with images on llama-cpp-python

I am new to this. I have been trying but could not make the model answer questions about images. from llama_cpp import Llama import torch from PIL import Image import base64 llm = Llama( model_path='Holo1-...

Abhash Rai

61

asked Jun 7 at 5:50

0 votes

0 answers

44 views

Translation returns <unk> token

I'm having relatively good results with HelsinkiNlp models for translation, except for one thing: some special characters are omitted from the translation. If I decode without skipping the special ...

gooopil

17

asked Jun 5 at 1:00

1 vote

0 answers

94 views

Sentence similarity pipeline with @huggingface/transformers

Wanted to use the pipeline api from @huggingface/transformers js for sentence-similarity - but I do not see a specific pipeline for it. The closest thing is text classification and feature extractions ...

Edv Beq

1,020

asked Jun 4 at 16:16

0 votes

0 answers

56 views

Using llama-index with the deployed LLM

I wanted to make a web app that uses llama-index to answer queries using RAG from specific documents. I have locally set up Llama3.2-1B-instruct llm and using that locally to create indexes of the ...

Utkarsh

1

asked May 29 at 11:17

2 votes

1 answer

210 views

JSONDecodeError while using HuggingFace Inference API with LangChain for Embeddings

I’m trying to generate embeddings using the Hugging Face Inference API with LangChain in Python, but I’m running into issues. My goal is to use the API (not local models) to generate embeddings for ...

Jeevan

11

asked May 26 at 13:46

0 votes

0 answers

42 views

I am getting a .NET HuggingFace 403 or 404 error

In my .NET project, I am configuring the Huggingface library as follows: builder.Services .AddKernel() .AddHuggingFaceChatCompletion( model: "deepseek-ai/DeepSeek-R1", ...

Murat Öztürk

11

asked May 26 at 13:04

0 votes

0 answers

30 views

Webpack error: "Module parse failed: Unexpected character '�' (1:0)" when using @xenova/transformers

I'm trying to run a sentiment analysis function using the @xenova/transformers package in a NextJS project with Webpack, but I'm encountering the following error: Module parse failed: Unexpected ...

Santhosh

53

asked May 26 at 13:01

1 vote

1 answer

87 views

HfHubHTTPError when calling DoclingLoader with a pdf file

I have installed docling successfully, but when doing the following: from langchain_docling import DoclingLoader source_path = "shared\abc.pdf" loader = DoclingLoader(file_path=source_path) ...

chaos24

11

asked May 24 at 18:22

0 votes

1 answer

227 views

how to download huggingface-model files by filtering unwanted files

A Hugging Face model, like Qwen32B-GGUF, contains several large quantization-related files. Typically only one of these files is needed and the rest go unused. By huggingface-cli, it ...

Hobin C.

793

asked May 20 at 3:35

0 votes

0 answers

288 views

Getting StopIteration error in HuggingFace

I am using Colab and HuggingFace Token is added in Colab secrets. from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint from dotenv import load_dotenv from google.colab import ...

Sam2021

21

asked May 17 at 11:23

1 vote

2 answers

675 views

Facing issue using a model hosted on HuggingFace Server and talking to it using API_KEY

I am trying to create a simple LangChain app for text generation, using the API to communicate with models on Hugging Face servers. I created a “.env” file and stored my KEY in the variable: “...

Sri2110

345

asked May 11 at 9:34

0 votes

0 answers

48 views

What is the proper way to fill a batch in training an LM all the way to the end eg how to correct my tokenize_and_group_texts_via_blocks?

I’m preparing a text dataset for next-token language-model pre-training. Using HF datasets with batched=True, I wrote a helper that: prepends a BOS token (if the tokenizer has one), appends an EOS ...

Charlie Parker

6,186

asked May 10 at 3:49
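
The block-packing question above describes prepending a BOS token, appending an EOS token, and grouping the result into blocks. A minimal sketch of that general idea follows. It is not the asker's tokenize_and_group_texts_via_blocks helper, whose body is not shown; the function name, the token IDs, and the choice to drop the trailing remainder are all assumptions.

```python
def group_into_blocks(token_lists, block_size, bos_id=None, eos_id=None):
    """Concatenate tokenized documents and cut the stream into fixed-size blocks.

    Assumption: the leftover tokens that do not fill a whole block are dropped,
    which is one common convention; padding the last block is another option.
    """
    stream = []
    for tokens in token_lists:
        if bos_id is not None:      # only prepend BOS if the tokenizer has one
            stream.append(bos_id)
        stream.extend(tokens)
        if eos_id is not None:      # mark the document boundary
            stream.append(eos_id)
    usable = (len(stream) // block_size) * block_size
    return [stream[i:i + block_size] for i in range(0, usable, block_size)]

# Hypothetical token IDs: three short "documents", BOS=1, EOS=2.
docs = [[10, 11, 12], [20, 21], [30]]
blocks_of_4 = group_into_blocks(docs, block_size=4, bos_id=1, eos_id=2)
blocks_of_5 = group_into_blocks(docs, block_size=5, bos_id=1, eos_id=2)
```

Here the concatenated stream has 12 tokens, so a block size of 4 uses every token, while a block size of 5 yields two blocks and discards the last two tokens.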

0 votes

0 answers

109 views

How can I properly load a LoRA weight into a pretrained Stable Diffusion model on TorchServe and enable parallel inference?

I'm attempting to serve a pretrained Stable Diffusion model with LoRA weights applied using TorchServe. However, the LoRA weights don't seem to load properly, and I'm not sure why. Could anyone help ...

박연수

1

asked May 8 at 13:28

0 votes

1 answer

583 views

ollama.generate raises model not found error: "hf.co/mradermacher/Llama-3.2-3B-Instruct-uncensored-GGUF"

I'm trying to run a Python script that uses the ollama library to generate responses from a custom LLM model. My code attempts to call ollama.generate() using the following model name: chosen_model = '...

JaS

45

asked May 4 at 11:32

0 votes

0 answers

137 views

Hugging Face Sentence Transformer API returning 400 error for embeddings with incorrect format

import { DataAPIClient } from "@datastax/astra-db-ts"; import { PuppeteerWebBaseLoader } from "langchain/document_loaders/web/puppeteer"; import axios from "axios"; ...

Rakib islam

1

asked May 1 at 6:37

0 votes

1 answer

107 views

Unable to connect to hugging face model

from sentence_transformers import SentenceTransformer model = SentenceTransformer("BAAI/bge-small-en-v1.5") sentences = [ "The weather is lovely today.", "It's so ...

SM9595

39

asked Apr 30 at 13:06
