
I have been stepping into GenAI and am currently working with Hugging Face's open-source models. However, I am not receiving any response from the API. I created an access token on Hugging Face's platform and added it to my .env file. Here is my code:

from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint
from dotenv import load_dotenv

load_dotenv()

llm = HuggingFaceEndpoint(
    repo_id='TinyLlama/TinyLlama-1.1B-Chat-v1.0',
    task='text-generation'
)

model = ChatHuggingFace(llm=llm)

result = model.invoke("What is the capital of India?")

print(result.content)

And this is the response I am getting:

(venv) PS C:\Users\user\Desktop\langchain models> python ./ChatModels/huggingface.py
Traceback (most recent call last):
  File "C:\Users\user\Desktop\langchain models\venv\Lib\site-packages\huggingface_hub\utils\_http.py", line 409, in hf_raise_for_status
    response.raise_for_status()
  File "C:\Users\user\Desktop\langchain models\venv\Lib\site-packages\requests\models.py", line 1026, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 504 Server Error: Gateway Time-out for url: https://router.huggingface.co/featherless-ai/v1/chat/completions

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\user\Desktop\langchain models\ChatModels\huggingface.py", line 13, in <module>
    result = model.invoke("What is the capital of India?")
  File "C:\Users\user\Desktop\langchain models\venv\Lib\site-packages\langchain_core\language_models\chat_models.py", line 393, in invoke
    self.generate_prompt(
  File "C:\Users\user\Desktop\langchain models\venv\Lib\site-packages\langchain_core\language_models\chat_models.py", line 1019, in generate_prompt
    return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
  File "C:\Users\user\Desktop\langchain models\venv\Lib\site-packages\langchain_core\language_models\chat_models.py", line 837, in generate
    self._generate_with_cache(
  File "C:\Users\user\Desktop\langchain models\venv\Lib\site-packages\langchain_core\language_models\chat_models.py", line 1085, in _generate_with_cache
    result = self._generate(
  File "C:\Users\user\Desktop\langchain models\venv\Lib\site-packages\langchain_huggingface\chat_models\huggingface.py", line 577, in _generate
    answer = self.llm.client.chat_completion(messages=message_dicts, **params)
  File "C:\Users\user\Desktop\langchain models\venv\Lib\site-packages\huggingface_hub\inference\_client.py", line 923, in chat_completion
    data = self._inner_post(request_parameters, stream=stream)
  File "C:\Users\user\Desktop\langchain models\venv\Lib\site-packages\huggingface_hub\inference\_client.py", line 279, in _inner_post
    hf_raise_for_status(response)
  File "C:\Users\user\Desktop\langchain models\venv\Lib\site-packages\huggingface_hub\utils\_http.py", line 482, in hf_raise_for_status
    raise _format(HfHubHTTPError, str(e), response) from e
huggingface_hub.errors.HfHubHTTPError: 504 Server Error: Gateway Time-out for url: https://router.huggingface.co/featherless-ai/v1/chat/completions

That's quite a long one. I did some research and read that perhaps the specific model I am calling does not work over the API, but this happens with every model I try. Please help me out.

  • Error 504 means a problem on the server. It may be only a temporary problem, and you have to wait until they fix it. The error also shows a URL; you could check whether you can open it in a web browser — maybe even the browser will have a problem opening it. Commented Sep 11 at 11:38
  • The [documentation](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0) shows an example using the transformers module. That may suggest the model does not work with the Hugging Face inference endpoint. Commented Sep 11 at 12:00
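As the first comment notes, a 504 is a transient server-side failure, so retrying with backoff often gets past it. A minimal sketch — invoke_with_retry is a hypothetical helper, not part of LangChain, and it assumes any exception is worth retrying:

```python
import time

def invoke_with_retry(call, prompt, attempts=3, base_delay=2.0):
    """Retry a flaky call (e.g. model.invoke) on transient errors such as
    an HTTP 504, doubling the delay between attempts.

    Re-raises the last error if every attempt fails.
    """
    last_err = None
    for attempt in range(attempts):
        try:
            return call(prompt)
        except Exception as err:  # in practice: the HfHubHTTPError above
            last_err = err
            time.sleep(base_delay * (2 ** attempt))
    raise last_err
```

Usage would look like `result = invoke_with_retry(model.invoke, "What is the capital of India?")`; it won't help if the endpoint is persistently down, only if the gateway timeout is intermittent.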

1 Answer


I don't see anything wrong with your code that would cause the error you're getting. It's rather specific to the model TinyLlama/TinyLlama-1.1B-Chat-v1.0.

You can try other, similar models instead, such as meta-llama/Llama-3.1-8B-Instruct, which works:

# main.py
from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint
from dotenv import load_dotenv

load_dotenv()

llm = HuggingFaceEndpoint(
    repo_id='meta-llama/Llama-3.1-8B-Instruct',
    task='text-generation'
)

model = ChatHuggingFace(llm=llm)

result = model.invoke("What is the capital of India?")

print(result.content)

output:

(screenshot of the model's response)
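If a replacement model still fails, it's also worth ruling out a token that never loaded from the .env file — a misnamed key fails silently. A minimal fail-fast sketch; get_hf_token is a hypothetical helper, and HUGGINGFACEHUB_API_TOKEN is assumed to be the variable name your .env file sets:

```python
import os

def get_hf_token(var="HUGGINGFACEHUB_API_TOKEN"):
    """Return the Hugging Face token from the environment, failing loudly
    so a missing or misnamed .env entry doesn't surface later as a
    confusing API error."""
    token = os.environ.get(var)
    if not token:
        raise RuntimeError(f"{var} is not set; check your .env file")
    return token
```

Calling `get_hf_token()` right after `load_dotenv()` turns a silent misconfiguration into an immediate, readable error.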
