I am trying to create a ReAct agent in LlamaIndex using a local gpt-oss-20b model.
I have successfully loaded my local model using HuggingFaceLLM from llama_index.llms.huggingface and it seems to be working correctly. Here is the code I'm using for that part:
```python
import torch
from llama_index.llms.huggingface import HuggingFaceLLM

# This part works fine
llm = HuggingFaceLLM(
    model_name="../gpt-oss-20b-local",
    tokenizer_name="../gpt-oss-20b-local",
    device_map="auto",
    model_kwargs={"torch_dtype": torch.float16},
)
```
Now, I want to use this llm to create an agent. I am following the official LlamaIndex documentation for the ReAct Agent: https://docs.llamaindex.ai/en/stable/examples/agent/react_agent/
The documentation provides an example for streaming events that looks like this:
```python
# (Assuming agent and handler are already defined as per the docs)
# ... agent setup code ...
async for ev in handler.stream_events():
    print(ev)
    print("---")
```
When I try to add this loop to my script, I get a `SyntaxError` because it's not inside an async function.
My Full (Simplified) Code:
```python
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool
from llama_index.core.callbacks import CallbackManager, LlamaDebugHandler

# Assume 'llm' is loaded as shown above

def multiply(a: int, b: int) -> int:
    """Multiply two integers and returns the result integer"""
    return a * b

multiply_tool = FunctionTool.from_defaults(fn=multiply)

# Setup agent
llama_debug = LlamaDebugHandler(print_trace_on_end=True)
callback_manager = CallbackManager([llama_debug])
agent = ReActAgent.from_tools(
    [multiply_tool],
    llm=llm,
    verbose=True,
    callback_manager=callback_manager,
)

# This is the problematic part from the documentation
response = agent.stream_chat("What is 21 * 21?")
handler = llama_debug.get_event_handler("stream_chat")

# The following line causes the error
async for ev in handler.stream_events():
    print(ev)
    print("---")
```
The Error:
```
  File "test.py", line 46
    async for ev in handler.stream_events():
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
SyntaxError: 'async for' outside async function
```
I understand that `async for` must appear inside an `async def` function, but the LlamaIndex documentation presents the code this way. How am I supposed to run this example code?

Do I need to wrap this logic in a main async function and then run it with `asyncio.run()`? What is the correct way to execute these asynchronous streaming examples from the documentation in a standard Python script?
Yes: `async for` (or `await`) is not allowed outside a function. You need to put that code inside an `async` function and drive it with `asyncio.run(your_function())`. Did you try that?
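A minimal, self-contained sketch of that pattern (using a dummy async generator in place of `handler.stream_events()`, since the exact streaming/handler API may differ between LlamaIndex versions):

```python
import asyncio

# Hypothetical stand-in for handler.stream_events(): an async generator
# that yields events one at a time.
async def stream_events():
    for ev in ["Thought: ...", "Action: multiply", "Answer: 441"]:
        yield ev

async def main():
    events = []
    # 'async for' is legal here because we are inside an 'async def' function
    async for ev in stream_events():
        events.append(ev)
        print(ev)
        print("---")
    return events

# At module level, drive the coroutine with asyncio.run()
events = asyncio.run(main())
```

In a Jupyter/IPython notebook you usually don't need the wrapper at all, because notebook cells support top-level `await` (and calling `asyncio.run()` there fails with "event loop is already running").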