This custom_node for ComfyUI adds one-click "Virtual VRAM" for any UNet and CLIP loader, as well as MultiGPU integration in WanVideoWrapper, managing the offload/block swap of layers to DRAM *or* VRAM to maximize the latent space available on your card. Also includes nodes for loading entire components (UNet, CLIP, VAE) directly onto the device you choose.
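The block-swap idea above can be sketched in plain Python. This is a toy simulation of the scheduling pattern only, not the node's actual implementation: keep a budget of layer "blocks" resident on the fast device (VRAM), paging the rest out to slow memory (DRAM). All names and sizes are illustrative.

```python
# Toy block-swap scheduler: only `vram_budget` layer blocks may be
# resident at once; the oldest resident block is evicted to DRAM when
# a new one must be loaded. Purely illustrative, no real devices.

def run_with_block_swap(num_layers, vram_budget):
    """Return the ordered schedule of (load, compute, evict) events."""
    resident = []   # layers currently "in VRAM", oldest first
    events = []
    for layer in range(num_layers):
        if layer not in resident:
            if len(resident) >= vram_budget:
                evicted = resident.pop(0)      # evict oldest block to DRAM
                events.append(("evict", evicted))
            resident.append(layer)             # copy block DRAM -> VRAM
            events.append(("load", layer))
        events.append(("compute", layer))
    return events

events = run_with_block_swap(num_layers=4, vram_budget=2)
```

The trade-off the real node manages is the same: a smaller resident budget frees VRAM for latents at the cost of extra transfer events.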
Privacy-first AI ecosystem for Android. Run GGUF models offline or access 100+ cloud models via OpenRouter. Features 11 premium offline voices, extensible plugins, and dynamic DataHub for context injection. No subscriptions, no data harvesting—just AI on your terms.
Run Llama 3.3, DeepSeek-R1, Phi-4, Gemma 3, Mistral Small 3.1, and other state-of-the-art language models locally with scorching-fast performance. Inferno provides an intuitive CLI and an OpenAI/Ollama-compatible API, putting the inferno of AI innovation directly in your hands.
Creates symbolic links to the GGUF model files in Ollama's blob store so they can be used in other applications such as llama.cpp, Jan, LM Studio, etc.
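The core trick such a tool relies on can be sketched as follows. It assumes the common Ollama layout (`~/.ollama/models/{manifests,blobs}`), where a manifest layer whose mediaType ends in `.image.model` points at the GGUF blob, stored as `sha256-<hex>` for digest `sha256:<hex>`; verify these assumptions against your own installation.

```python
# Sketch: expose an Ollama model blob as a .gguf symlink.
# The manifest/blob layout described in the comments is an assumption
# about Ollama's on-disk format, not a documented stable API.
import json
import os
from pathlib import Path

def link_gguf(manifest_path, blobs_dir, out_path):
    """Symlink the model blob named by a manifest to out_path."""
    manifest = json.loads(Path(manifest_path).read_text())
    for layer in manifest["layers"]:
        if layer["mediaType"].endswith(".image.model"):
            # Blob files are named "sha256-<hex>" for digest "sha256:<hex>"
            blob = Path(blobs_dir) / layer["digest"].replace(":", "-")
            out = Path(out_path)
            if out.is_symlink() or out.exists():
                out.unlink()
            os.symlink(blob, out)
            return out
    raise ValueError("no model layer found in manifest")
```

Pointing llama.cpp, Jan, or LM Studio at the resulting `.gguf` link avoids duplicating multi-gigabyte model files on disk.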
🤖 Serverless AI on Hedera! Smart contracts run LLMs ⚙️ Distilled GGUF on IPFS. Pay crypto 🪙 → get replies 💬. 99.9% cheaper 💸 than big-tech APIs. Decentralized ⛓️ no censorship. “Gov can’t stop it. Nobody can.” 🧠 – Sir Charles Spikes, Cincinnati, Ohio | TELEGRAM: @SirGODSATANAGI ✉️ SirCharlesspikes5@gmail.com #AGI #NoPC
A high-performance local service engine for large language models, supporting various open-source models and providing OpenAI-compatible API interfaces with streaming output support.
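An OpenAI-compatible streaming endpoint delivers tokens as server-sent events. The helper below parses that wire format (the `data: {...}` / `data: [DONE]` framing is the OpenAI convention); any endpoint URL or model name you pair it with is your own deployment detail.

```python
# Sketch: extract content deltas from OpenAI-style streaming (SSE)
# chat-completion events. Works on the raw event lines, so it can be
# exercised without a running server.
import json

def iter_stream_tokens(sse_lines):
    """Yield content deltas from OpenAI-style SSE event lines."""
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue                      # skip comments / blank keep-alives
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break                         # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            yield delta["content"]
```

With the official `openai` client you would instead pass `stream=True` and iterate the returned chunks; this helper only shows what crosses the wire.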
Automate bulk downloads of Hugging Face LLMs with retry logic, manifest export, checksum validation, and usage reporting. Ideal for managing GGUF models at scale.
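The retry-plus-checksum pattern at the heart of such a tool can be sketched as below. `hf_hub_download` is the real `huggingface_hub` entry point; the repo id and filename in the guarded section are placeholders.

```python
# Sketch: download with retries, then verify the file by SHA-256.
# The helpers are generic; only the __main__ block touches the Hub.
import hashlib
import time

def sha256_of(path):
    """Hash a file in 1 MiB blocks to avoid loading it whole."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            h.update(block)
    return h.hexdigest()

def with_retries(fn, attempts=3, delay=0.0):
    """Call fn(), retrying on any exception up to `attempts` times."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise
            time.sleep(delay)

if __name__ == "__main__":
    from huggingface_hub import hf_hub_download  # pip install huggingface_hub
    path = with_retries(lambda: hf_hub_download(
        "some-org/Some-Model-GGUF",     # placeholder repo id
        "model.Q4_K_M.gguf"))           # placeholder filename
    print(path, sha256_of(path))
```

Recording each file's digest in a manifest at download time is what makes later checksum validation and usage reporting possible.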
This repository demonstrates structured JSON generation with streaming output using outlines and llama-cpp-python: llama.cpp handles local model inference, and outlines constrains the output to a schema.
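The shape of that setup is sketched below. The schema is a plain JSON Schema; the guarded section follows the outlines 0.x style, but the exact API differs across outlines versions and the repo id/filename are placeholders, so treat it as a hedged outline rather than a drop-in example.

```python
# Sketch: a JSON schema plus hedged 0.x-style outlines usage.
# Only the schema is exercised here; model loading needs a GGUF file.
import json

# JSON Schema the generator is constrained to satisfy
PERSON_SCHEMA = json.dumps({
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
})

if __name__ == "__main__":
    # pip install outlines llama-cpp-python; API varies by version
    from outlines import models, generate
    model = models.llamacpp(
        "some-org/Some-Model-GGUF",    # placeholder repo id
        "model.Q4_K_M.gguf")           # placeholder filename
    generator = generate.json(model, PERSON_SCHEMA)
    # Streaming variants (e.g. generator.stream(...)) depend on version
    print(generator("Describe a person named Ada."))
```

Because the schema constrains decoding itself, the streamed output is guaranteed to parse as the declared object once complete.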