I think the standard way is to create a Dataset class object from the arrays and pass the Dataset object to the DataLoader.
One solution is to inherit from the Dataset class and define a custom class that implements __len__() and __getitem__(), passing X and y to __init__(self, X, y).
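A minimal sketch of such a custom class, in case it helps. The class name ArrayDataset, the random stand-in data, and the choice of float32 features with int64 labels are my assumptions, not something from the question.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader


class ArrayDataset(Dataset):
    """Hypothetical wrapper around a feature array X and a label array y."""

    def __init__(self, X, y):
        if len(X) != len(y):
            raise ValueError("X and y must have the same number of rows")
        # Convert once up front. float32 features and int64 labels are an
        # assumption that suits a classification setup.
        self.X = torch.as_tensor(X, dtype=torch.float32)
        self.y = torch.as_tensor(y, dtype=torch.long)

    def __len__(self):
        # Number of samples = number of rows
        return len(self.X)

    def __getitem__(self, i):
        # Return the i-th row of X together with its label
        return self.X[i], self.y[i]


# Stand-in data: 100 samples, 20 features, binary labels
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
y = rng.integers(0, 2, size=100)

loader = DataLoader(ArrayDataset(X, y), batch_size=3)
xb, yb = next(iter(loader))
print(xb.shape, yb.shape)
```

The custom class mainly pays off once __getitem__() has to do more than index a row, for example loading a file from disk or applying an augmentation per sample.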
For your simple case with two arrays, where __getitem__() needs to do nothing beyond returning the values in row i, you can also convert the arrays into Tensor objects and pass them to TensorDataset.
Run the following code for a self-contained example.
# Create a dataset like the one you describe
from sklearn.datasets import make_classification
X,y = make_classification()
# Load necessary Pytorch packages
from torch.utils.data import DataLoader, TensorDataset
from torch import Tensor
# Create dataset from several tensors with matching first dimension
# Samples will be drawn from the first dimension (rows)
dataset = TensorDataset(Tensor(X), Tensor(y))
# Create a data loader from the dataset
# Type of sampling and batch size are specified at this step
loader = DataLoader(dataset, batch_size=3)
# Quick test
next(iter(loader))
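One thing to watch for: Tensor(y) gives you float32 labels. If the labels are class indices that you later feed to a loss such as CrossEntropyLoss, you will probably want int64 instead. A variant that sets the dtypes explicitly could look like this. The random stand-in data and shuffle=True are my assumptions, added only for illustration.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data: 100 samples, 20 features, binary labels
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
y = rng.integers(0, 2, size=100)

# Set dtypes explicitly: float32 features, int64 class labels
features = torch.from_numpy(X).float()
labels = torch.from_numpy(y).long()

dataset = TensorDataset(features, labels)
# shuffle=True reshuffles the rows at the start of every epoch
loader = DataLoader(dataset, batch_size=3, shuffle=True)

xb, yb = next(iter(loader))
print(xb.dtype, yb.dtype)
```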
If x_data and labels are both PyTorch tensors, you can combine them into a TensorDataset and then create a DataLoader from that TensorDataset.