23,953 questions
0
votes
0
answers
7
views
How do I visualize the latent representation produced by the Stable Diffusion VAE?
I am trying to visualize the latent representation produced by the VAE inside a Stable Diffusion pipeline
from diffusers import StableDiffusionPipeline
import torch
# A CUDA ordinal is simply the ...
0
votes
0
answers
25
views
AWS SageMaker PyTorch Model Deployment - is entry_point needed?
I'm trying to deploy a pre-trained PyTorch model to SageMaker using the Python SDK. I have a model.tar.gz file that is uploaded to S3, with the following structure:
code/
code/requirements.txt
code/...
Tooling
0
votes
0
replies
33
views
Good packages for bounded Linear Quantile Regression?
I'm looking for a good package to train a linear quantile regression model, i.e. $\hat y = \sum_{i=1}^n w_i \cdot x_i$, where $x_i$ are the input features and $w_i$ are the bounded trainable weights. ...
0
votes
0
answers
21
views
Attribution Error when using Huggingface transformers Trainer with FSDP
I am trying to use FSDP with the Hugging Face Transformers Trainer. The training script looks something like
train_dataset = Mydataset(...)
args = TrainingArguments(...)
model = LlamaForCausalLM....
0
votes
0
answers
44
views
Optimization Challenge in Hugging Face: Efficiently Serving Multiple, Differently Sized LLMs on a Single GPU with PyTorch [closed]
I am currently working on a Python-based Gen AI project that requires the efficient deployment and serving of multiple LLMs, specifically models with different parameter counts (Llama-2 7B and Mistral ...
2
votes
1
answer
74
views
Having trouble with R's torch and tensor dimensions
I am trying to follow along with this webpage: https://jtr13.github.io/cc21fall2/tutorial-on-r-torch-package.html
I am trying to understand R's implementation of PyTorch.
I am having some trouble with ...
0
votes
0
answers
39
views
How to force NCCL build to embed PTX for all kernels (prevent linker from stripping ncclDevKernel PTX)?
I am compiling NCCL 2.27.5-1 (I also tried 2.28.9-1) from source for a V100 GPU (sm_70). My goal is to have libnccl.so contain compute_70 PTX for every kernel.
Despite passing explicit -gencode=arch=...
-1
votes
1
answer
42
views
YOLOv8 custom training loop using v8DetectionLoss fails to converge on custom dataset (7 classes) [closed]
I am trying to implement a custom training loop for object detection using YOLOv8 (Ultralytics) and PyTorch. My goal is to fine-tune a pre-trained yolov8n.pt model on the Aquarium dataset, which ...
1
vote
0
answers
56
views
PyTorch installed via uv project shows CPU-only version on Windows with CUDA specification in pyproject.toml
I'm trying to set up a Python project using uv and pyproject.toml on Windows. I want to install the CUDA-enabled PyTorch, but after installing, when I check the version, it shows CPU-only.
Here’s my ...
Advice
0
votes
0
replies
29
views
When using TensorDictPrioritizedReplayBuffer, should I apply the priority weight manually or not?
With Prioritized Experience Replay (PER), we use the beta parameter to compute the weight that offsets the bias introduced by PER. Now, with PyTorch's TensorDictPrioritizedReplayBuffer, I ...
1
vote
2
answers
124
views
pytorch Module B=A, A.to('cpu'), but the tensor in B is still in GPU, why?
After moving module A to the CPU, the original parameter tensor still stays on the GPU. When is it released? Is it wrong to reuse the parameter?
My code:
import torch.nn as nn
class A(nn.Module):
...
2
votes
1
answer
25
views
PyTorch .view() operation to manipulate tensor dimensions vis a vis using torch.unbind followed by torch.cat
In Torch, .view() reshapes the tensor. However, there are multiple ways to reshape a multi-dimensional tensor to a target shape. How does it decide between those different ways?
For example, in Torch, ...
2
votes
1
answer
498
views
PyTorch fails on Windows Server 2019: “Error loading c10.dll” (works fine on Windows 10)
I'm trying to deploy a Python project on Windows Server 2019, but PyTorch fails to import with a DLL loading error.
On my local machine (Windows 10, same Python version), everything works perfectly.
...
1
vote
1
answer
59
views
.so file built on same CPU but different EC2 instances lead to missing symbols
I am building a wheel of PyTorch from source, based on their https://github.com/pytorch/pytorch/blob/v2.6.0/.ci/manywheel/build_common.sh CI build script. I tested on a "local" instance of a ...
Advice
0
votes
2
replies
46
views
Fixing a UNET in pytorch that doesn't work in eval mode due to BatchNorm2d layers
I have a UNET model trained in pytorch (by someone else) that produces quite different results in eval mode to train mode (train mode results look good, eval mode they are rubbish). A bit of googling ...
0
votes
0
answers
52
views
Given groups=1, weight of size [64, 1024, 1, 1], expected input[1, 256, 1, 1] to have 1024 channels, but got 256 channels instead
I have encountered this issue and searched the forums, but I couldn't solve it. How can I solve this problem?
I tried to add a CBAM module to YOLOv12 for my custom dataset to improve accuracy. I ...
0
votes
0
answers
94
views
My SimSiam is collapsing: SimSiam on CUB-200-2011 with ViT
I'm trying to implement SimSiam using a ViT backbone on the CUB-200-2011 dataset. However, during training, the embeddings collapse to a single direction despite using stop-gradient. Here’s what I ...
-1
votes
0
answers
24
views
How to use the models from huggingface from local machine server
I am trying to use the Emotion Llama model and to understand how to download the models from Hugging Face and place them in the right directory. It actually suggests downloading three models in ...
1
vote
1
answer
72
views
Is passing ray resources as options when calling the function equivalent to setting them in the function's decorator?
Is
@ray.remote
def run_experiment(...):
(...)
if __name__ == '__main__':
ray.init()
exp_config = sys.argv[1]
params_tuples, num_cpus, num_gpus = load_exp_config(exp_config)
ray.get(...
0
votes
0
answers
47
views
Unclear formulation in Temporal Fusion Transformer paper
I am currently trying to implement the Temporal Fusion Transformer using PyTorch.
This paper (https://arxiv.org/pdf/1912.09363) is my reference.
Currently I am stuck with the variable selection ...
0
votes
0
answers
31
views
Where is EXECUTORCH_LIBRARY defined in ExecuTorch v1.0?
I’m trying to register a custom operator for ExecuTorch (v1.0, built from the PyTorch 2.5 source tree).
My goal is to create a shared library that defines a few quantum operators and runs them from a ....
0
votes
0
answers
49
views
Torch 2.4.1 doesn't utilize my system memory after CUDA memory runs out
I wrote a lot of scripts to test the compatibility of my system with PyTorch 2.4.1, and they all indicate I can run it. I don't have enough memory on my GPU, so I tried enabling expandable_segments so ...
1
vote
1
answer
122
views
How to configure uv via pyproject.toml to lock PyTorch (+cu118) to a custom index and prevent uv run from using the CPU-only version?
I am managing a project with uv (v0.9.4) that requires a specific PyTorch CUDA build. The generic installation works, but using uv run causes a package conflict, despite the environment being correct.
...
0
votes
0
answers
78
views
IndexError: index -1 is out of bounds for dimension 0 with size 0
I am currently experimenting with modifying the KV cache of the LLaVA model in order to perform controlled interventions during generation (similar to cache-steering methods in recent research). The ...
0
votes
1
answer
32
views
How can I get torch.set_grad_enabled(True) to work in ComfyUI?
I just spent hours figuring out that the following code fails when included in a ComfyUI custom node, but works perfectly fine outside (using the same Python venv). I finally found out that someone ...
0
votes
1
answer
79
views
Unable to step into torch.nn.functional.linear using VS Code debugging
I want to step into the linear function using VS Code's "step into", but the debugger skips over it automatically when I click "step into". Could anyone help me with this?
I used DEBUG=1 when compiling PyTorch.
...
1
vote
0
answers
67
views
Should I use torch.inference_mode() in a prediction method even when using model.eval()? [duplicate]
I'm following the book "Deep Learning with PyTorch Step By Step" and I have a question about the predict method in the StepByStep class (from this repository: GitHub).
The current ...
1
vote
0
answers
159
views
Transformers 'could not import module pipeline' to jupyter notebook
I need to run a series of pre-trained, fine-tuned models from Hugging Face in a Jupyter notebook. I have updated to the latest version of both PyTorch and Transformers, but when I run the code
from ...
Advice
2
votes
0
replies
80
views
How should I balance DSA, ML fundamentals, PyTorch implementation, and Kaggle practice for ML Engineer interviews?
I’m a Computer Science graduate preparing for ML/AI Engineer roles.
I’m facing a dilemma about what to focus on, how much time to allocate to each area, and what exact roadmap to follow to prepare ...
3
votes
0
answers
104
views
I get the error "ImportError: libcudnn.so.9: cannot open shared object file: No such file or directory" when I try to use torch in a virtual env
I have installed CUDA 13 on Fedora 42.
When I use PyTorch locally, torch works fine, but when I create a virtualenv, PyTorch can't find the libcudnn files.
I get the error
ImportError: libcudnn.so.9: ...
2
votes
2
answers
92
views
Decoder only model AI making repetitive responses
I am building a decoder-only transformer using PyTorch, and my dataset of choice is the Plaintext Wikipedia (full English) dataset from Kaggle.
The problem is that my model output is ...
0
votes
1
answer
72
views
Generating response with KV Cached System Prompt throws error when Input Tokens are less than Prompt Tokens
I am trying to run Mistral-7B-Instruct-v0.2.
Each run is PROMPT + details[i].
PROMPT has instructions on how to generate JSON based on details.
As the prefix of each input is the same, kind of like a ...
2
votes
1
answer
35
views
AttributeError: 'NoneType' object has no attribute 'blocks' when running Cache-DiT example with Wan2.2 model
I’m trying to use Cache-DiT to accelerate inference for the Wan2.2 model.
However, when I run the example script,
python run_wan_2.2_i2v.py --steps 28 --cache
I get the following error.
Namespace(...
0
votes
0
answers
37
views
How do I interpret Gaussian process parameters?
I'm performing Gaussian process regression using GPyTorch. I'm modeling two correlated tasks as follows:
class MyModel(gpytorch.models.ExactGP):
def __init__(self, X, Y, likelihood):
super(...
2
votes
0
answers
59
views
Having problems computing PDE Residuals
I'm computing PDE residuals for The_Well datasets (e.g. turbulent_radiative_layer_2D and shear_flow) using finite differences, but the residuals are much larger than I expect. The data are generated ...
0
votes
1
answer
29
views
Can I avoid setting-up and tearing down processes when using PyTorch DataLoader?
In my scenario I use multiple DataLoaders with multiple Datasets to evaluate models against each other (I want to test models with multiple resolutions, which means each dataset has a distinct ...
1
vote
1
answer
108
views
Can uv integrate with e.g. pytorch prebuilt docker env?
So, pytorch requires a rather large bundle of packages. The prebuilt docker pytorch gpu images (https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/running.html) are quite helpful in ...
1
vote
0
answers
160
views
Why does “Command Buffer Full” appear in PyTorch CUDA kernel launches?
I’m using the PyTorch profiler to analyze sglang, and I noticed that in the CUDA timeline, some kernels show “Command Buffer Full”. This causes the cudaLaunchKernel time to become very long, as shown ...
0
votes
0
answers
97
views
ModuleNotFoundError: No module named 'losses.loss'; 'losses' is not a package error when training MAT model (PyTorch, NVIDIA repo)
I'm trying to fine-tune the MAT (Masked Attention Transformer) model from the official repository:
https://github.com/fenglinglwb/MAT
However, I keep getting the following error during training:
...
0
votes
0
answers
32
views
open3d.ml build for torch==2.10 (for sm_120 GPU architecture)
I have an NVIDIA GeForce RTX 5060 with the "Blackwell" architecture and compute capability 12.0, which is why I have to use the nightly build pytorch=2.10.0.dev20251017+cu128, which has support for ...
0
votes
0
answers
93
views
Torch example transformer with TransformerDecoder
In the torch example provided here https://github.com/pytorch/examples/tree/main/word_language_model, the transformer only uses torch.TransformerEncoder, and torch.TransformerDecoder is overwritten with a ...
0
votes
0
answers
36
views
T5-small generates only padding tokens during validation/test in PyTorch Lightning
I'm fine-tuning T5-small using PyTorch Lightning and encountering a strange issue during validation and test steps.
The Problem:
During validation_step and test_step, model.generate() consistently ...
0
votes
0
answers
71
views
Torchvision save segmentation masks to png
I am trying to follow the tutorial at https://docs.pytorch.org/tutorials/intermediate/torchvision_tutorial.html
which works with .png files as segmentation masks.
The png files can be found here:
https://...
2
votes
1
answer
121
views
Fast vectorized maximal independent set greedy algorithm [closed]
I need a really fast vectorized maximal independent set algorithm implemented in pytorch, so I can use it for tasks with thousands of nodes in reasonable time.
I cannot use networkx, it is way too ...
1
vote
0
answers
68
views
How to pass P_map: dict[str, torch.Tensor] to PEFT (LoRA)?
My proxy goal is to change LoRA from h = (W + BA)x to h = (W + BAP)x. Preliminary code is attached for your reference.
My actual goal is to train a model with the following loss: 〖Θ ̃=(arg min)┬Δ ̂ 〗〖‖𝑓_(...
1
vote
0
answers
107
views
How to lazy load jsonl file
I am trying to build a PyTorch Dataset based on some .jsonl files. The size of each .jsonl file is about 2GB, and I have 50 such files. Therefore, it would not be very practical to load all these ...
0
votes
0
answers
52
views
How do I create a multitask GPyTorch model with a user-specified noise covariance matrix?
I've implemented standard homoskedastic multitask Gaussian process regression using GPyTorch as follows:
class MyModel(gpytorch.models.ExactGP):
def __init__(self, X, Y, likelihood):
super(...
1
vote
1
answer
130
views
Torch Conv2d results in both dimensions convolved
I have input shape to a convolution (50, 1, 7617, 10). Here, 7617 is word vectors as rows, and 10 is the number of words in columns. I want to convolve column-wise and obtain (2631, 1, 7617, 1), 1 ...
3
votes
1
answer
75
views
Matching PyTorch and ONNX outputs layer-wise for debugging inference drift
I want to debug layer-by-layer to see where the ONNX model starts deviating from the PyTorch model outputs.
I can extract intermediate outputs in PyTorch using forward hooks, like:
def get_activation(...
0
votes
0
answers
293
views
Installation error while installing GroundingDino
I am trying to install GroundingDino as instructed in the README file of its official GitHub repo, but I get the error below:
Obtaining file:///home/kgupta/workspace/Synthetic_Data_gen/...