1 vote
1 answer
52 views

For example, I want to keep every 3rd row, but I must keep numbers divisible by 3 (or some special rule like that). When I see a number divisible by 3, that restarts the count, meaning I will start ...
Baron Yugovich's user avatar
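One possible reading of the rule in this question, sketched in Python: keep every 3rd row counted from the last kept row, and always keep a value divisible by 3, which also resets the counter. The function name `keep_rows` and its parameters are hypothetical, and the excerpt is truncated, so the intended rule may differ.

```python
def keep_rows(values, step=3, divisor=3):
    """Keep every `step`-th value; a value divisible by `divisor`
    is always kept and restarts the count (assumed interpretation)."""
    kept = []
    since_last = 0  # rows seen since the last kept row
    for v in values:
        since_last += 1
        if v % divisor == 0 or since_last == step:
            kept.append(v)
            since_last = 0  # restart the count after every kept row
    return kept


rows = [1, 2, 4, 5, 6, 7, 8, 10, 11, 9, 13]
print(keep_rows(rows))
```

The counter lives outside the divisibility test so that both kinds of kept rows reset it in the same place.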
0 votes
2 answers
65 views

First, here is some toy data: df <- data.frame( "stim" = c("face", "object", "pareidolia", "face", "face", "object", "...
thefriendly_plague.doctor's user avatar
0 votes
1 answer
130 views

There is a previous question which asks a similar question (Is there a way to create a loop where I provide a function and dataframe and subsample it, and repeat the function with a subsample N times?)...
Birdman's user avatar
  • 45
0 votes
2 answers
146 views

Let's say I have a large dataset of numeric values: big_dataset = rnorm(n = 500, mean = 20, sd = 10) I want to pull out a subset of observations from big_dataset that have similar values (within 5 ...
CephBirk's user avatar
  • 6,790
0 votes
2 answers
74 views

I am not sure what the correct word for this would be, so apologies for getting the terminology horribly wrong. Basically I have about 1000 datapoints, and I want to randomly subsample 100 data points ...
PowellHall's user avatar
0 votes
1 answer
179 views

I want to create a sub-sample of data frame df, depending on the frequency of a given category in one of its columns, e.g. a. Let's assume we have a data frame like this: df <- data.frame(a = rep(1:...
tpetzoldt's user avatar
  • 5,838
0 votes
1 answer
22 views

I have been tasked with subsampling a data set of cameras to determine whether we can get away with fewer cameras in our camera grid. The dataset already has detection rates for each species at each ...
Sierra McMurry's user avatar
0 votes
1 answer
104 views

I have a dataset containing a weight column, which I would like to subset while adjusting these weights to keep it representative of the original dataset. Let us say I have the dataframe : data....
Kamaloka's user avatar
  • 139
0 votes
1 answer
92 views

I need code or an idea for the following case: we have a dataset of 1000 rows. I want to draw subsamples of 800 rows multiple times (I don't know how many times I should repeat). How should I ...
Nmgh's user avatar
  • 155
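A minimal sketch of repeated subsampling without replacement, using only the Python standard library. The names `repeated_subsamples`, `size`, `repeats` and `seed` are illustrative assumptions. The number of repeats is left as a parameter because the question does not fix it.

```python
import random


def repeated_subsamples(rows, size=800, repeats=10, seed=0):
    """Draw `repeats` independent subsamples of `size` rows each.
    Rows are unique within one subsample, but the same row may
    appear in several subsamples."""
    rng = random.Random(seed)  # seeded so the draws are reproducible
    return [rng.sample(rows, size) for _ in range(repeats)]


data = list(range(1000))  # stand-in for 1000 row indices
subsamples = repeated_subsamples(data)
print(len(subsamples), len(subsamples[0]))
```

Sampling row indices rather than the rows themselves keeps this independent of how the dataset is stored.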
1 vote
0 answers
104 views

I have a dataframe with 67 items that looks like this: df <- data.frame("item"= c("item1", "item2", "item3", "item4", "item5"), "...
Thea's user avatar
  • 11
0 votes
1 answer
320 views

I'm studying how to implement a Skip-Gram model using PyTorch. I am following this tutorial; in the subsampling part the author used this formula: import random import math def subsample_prob(word, t=1e-3):...
Dogo-San's user avatar
  • 375
2 votes
1 answer
74 views

I would like to obtain quantile in a tailored subset. For example in the following dataset: data = data.table(x=c(rep(1,9),rep(2,9)),y=c(rep(1:6,each=3)),z=1:18) For each row i, I want to know, in the ...
Junyang Guo's user avatar
0 votes
1 answer
389 views

How would I generate, in the most concise way, a monthly period index that is observed only every 12 months? I came up with the following solution: pd.period_range(start=pd.Period('1975-07'), ...
user64150's user avatar
0 votes
2 answers
262 views

I have a time series as below: **Date_time** 2018-06-26 17:19:30 2018-06-26 17:20:40 2018-06-26 17:20:41 2018-06-26 17:20:42 [...] 2018-06-26 17:21:36 2018-06-26 17:21:37 2018-06-26 17:21:38 2018-06-...
Jujulie's user avatar
3 votes
1 answer
652 views

Following this exact question In Matlab, how can I use chroma subsampling to downscale a 4:4:4 image to 4:2:0 when the image is in YCbCr? where he is performing chroma downscaling from 4:4:4 to 4:2:0, ...
Sanam's user avatar
  • 243
0 votes
1 answer
459 views

The sampling_table parameter is only used in the tf.keras.preprocessing.sequence.skipgrams method once to test if the probability of the target word in the sampling_table is smaller than some random ...
user12346170's user avatar
0 votes
0 answers
244 views

I have five data frames among which I want to run regressions: df1: stock returns df2: housing returns df3: actual inflation rate df4: expected inflation rate df5: unexpected inflation rate ...
Albi351's user avatar
0 votes
0 answers
227 views

I've been skimming through the Armadillo documentation and examples, but it seems there is no real efficient way to subsample (or resample) a large vector or matrix, such that if you had N elements ...
StarShine's user avatar
  • 2,080
0 votes
1 answer
106 views

I have a dataframe like this names = ["Patient 1", "Patient 2", "Patient 3", "Patient 4", "Patient 5", "Patient 6", "Patient 7"] ...
lordy's user avatar
  • 630
0 votes
1 answer
107 views

Apologies in advance if this has already been asked and for my wording of this question as I am new to R. Is there any way of making my code for subsampling sound files more efficient? I have 148 ...
HarHar's user avatar
  • 35
1 vote
1 answer
2k views

I'm trying to reduce the input data size by first performing a K-means clustering in R then sample 50-100 samples per representative cluster for downstream classification and feature selection. The ...
ML33M's user avatar
  • 415
0 votes
1 answer
198 views

Let's say I have a DataSet that looks like this:
Name | Grade
---------------
Josh | 94
Josh | 87
Amanda | 96
Karen | 78
Amanda | 90
Josh | 88
I would like to create a new DataSet ...
shakedzy's user avatar
  • 2,893
1 vote
1 answer
174 views

I've been trying to loop over left joins (using R). I need to create a table with columns representing samples from a larger table. Each column of the new table should represent each of these samples. ...
D C's user avatar
  • 13
1 vote
0 answers
351 views

I tried to replicate the solution posted here with tf.data.Dataset.interleave, but I'm not quite sure how to apply the interleave method to already created dataset objects. Here is the code: import ...
Hoda's user avatar
  • 41
1 vote
1 answer
344 views

I have a df of measurements over 50 years. I am trying to subsample the data to see what patterns I would have found had I only sampled in 2 years, or in 3, 4, 5, etc, instead of in all 50. I made a ...
Jake L's user avatar
  • 1,097
0 votes
1 answer
413 views

The goal is to sample n data points from the original population. But the original population has serial correlation (consider it time series data) and I want to choose neighboring ...
hbadger19042's user avatar
1 vote
1 answer
167 views

I have two 1D arrays of integers whose sums differ, for example: a = [1,2,2,0,3,5] b = [0,0,3,2,0,0] I would like the sum of each array to be equal to that of the smaller of the two. However I want ...
APiazza's user avatar
  • 13
1 vote
1 answer
1k views

Does sample= 0 in Gensim word2vec mean that no downsampling is being used during my training? The documentation says just that "useful range is (0, 1e-5)" However putting the threshold to 0 would ...
Leonardo Sanna's user avatar
2 votes
2 answers
163 views

I need to write a function involving subsetting a df by a variable n bins. Like, if n is 2, then subsample the df some number of times in two bins (from the first half, then from the second half). If ...
Jake L's user avatar
  • 1,097
1 vote
0 answers
41 views

I need to get all possible combinations nCr of all possible sizes of a numpy array. [1,2,3,4,5] should give us a set of arrays: [1],[2],[3],[4],[5] [1,2],[1,3],[1,4],[1,5],[2,3],[2,4],[2,5],[3,4],[3,...
DDR's user avatar
  • 507
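A short sketch of enumerating every combination of every size with `itertools.combinations`. The helper name `all_combinations` is made up for illustration. It returns tuples rather than NumPy arrays; a NumPy array can be passed in because it is first converted to a list.

```python
from itertools import chain, combinations


def all_combinations(items):
    """Return all non-empty combinations of `items`, ordered by size."""
    items = list(items)
    sizes = range(1, len(items) + 1)
    return list(chain.from_iterable(combinations(items, r) for r in sizes))


combos = all_combinations([1, 2, 3, 4, 5])
print(len(combos))   # 2**5 - 1 non-empty subsets
print(combos[:7])
```

The count grows as 2^n - 1, so for long arrays it is usually better to iterate over the generator than to build the full list.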
1 vote
1 answer
2k views

I've been tasked with performing a 4:2:0 chroma subsampling (color compression) on a series of JPEGs. The first step is to ensure that I can generate my Y, Cb, and Cr values and then convert back ...
Jared Boyd's user avatar
0 votes
1 answer
3k views

I have already converted the jpg images from RGB to YCbCr but must now use Chroma Subsampling to make them 4:2:0. I have searched but have not found any information on how to do this (note: I am very ...
dcalvert's user avatar
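A rough NumPy sketch of 4:2:0 subsampling for an image that is already in YCbCr: the luma plane is kept at full resolution and each chroma plane is reduced by half in both directions. Averaging each 2x2 block is an assumption here; some implementations simply keep one sample per block. The function name `subsample_420` and the (H, W, 3) layout are also assumptions.

```python
import numpy as np


def subsample_420(ycbcr):
    """Split an (H, W, 3) YCbCr array into full-resolution Y and
    half-resolution Cb and Cr planes (2x2 block averages)."""
    y = ycbcr[..., 0]
    h, w = y.shape
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")

    def pool(plane):
        # Group pixels into 2x2 blocks, then average within each block.
        return plane.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

    return y, pool(ycbcr[..., 1]), pool(ycbcr[..., 2])


img = np.arange(4 * 4 * 3, dtype=float).reshape(4, 4, 3)
y, cb, cr = subsample_420(img)
print(y.shape, cb.shape, cr.shape)
```

To display the result, the chroma planes have to be upsampled back to the luma size, for example by repeating each sample twice along both axes.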
0 votes
1 answer
531 views

Given a rectangular image img and a patch side length s, I would like to cover the whole image with square patches of side length s, so that every pixel in img is in at least one patch using the minimal ...
Imago's user avatar
  • 489
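One way to sketch such a covering, assuming axis-aligned patches that must lie fully inside the image and s no larger than either image dimension: lay patches on a regular grid and shift the last patch in each direction back so it ends at the border. This uses ceil(H/s) * ceil(W/s) patches. `patch_origins` is a hypothetical helper that returns the top-left corner of each patch.

```python
import math


def patch_origins(height, width, s):
    """Top-left (row, col) corners of s-by-s patches covering the image."""
    if s > height or s > width:
        raise ValueError("patch must fit inside the image")

    def starts(length):
        n = math.ceil(length / s)
        # Regular grid; the last patch is shifted back to end at the border.
        return [min(i * s, length - s) for i in range(n)]

    return [(r, c) for r in starts(height) for c in starts(width)]


print(patch_origins(5, 7, 3))
```

The shifted border patches overlap their neighbours, which is unavoidable whenever the image size is not a multiple of s.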
1 vote
1 answer
3k views

I want to create an .mp4 output, but it doesn't work... I'm using ffmpeg. My input video is a raw video and I want to have a raw video .mp4 at the end. The code I use: ffmpeg.exe -i input.y4m -...
Coder95's user avatar
  • 131
0 votes
1 answer
651 views

I am trying to reimplement word2vec in PyTorch. I implemented subsampling according to the code of the original paper. However, I am trying to understand how subsampling is implemented in Gensim. I ...
Pietro's user avatar
  • 465
0 votes
1 answer
1k views

Stratified sampling is old, and very significant. Donald Knuth (high priest of computer science) uses it for evaluating the work of his PhD students, and for teaching his deeply and sincerely held ...
EngrStudent's user avatar
  • 2,022
0 votes
0 answers
1k views

In R, I have a data set with 30 categories (N clusters = 30); each cluster has an unequal number of units (the i-th cluster can have 24, 25, 26, 27, or 28 units). I want to do two-stage sampling, first ...
Grace's user avatar
  • 173
2 votes
1 answer
3k views

I am implementing the Skipgram model, both in Pytorch and Tensorflow2. I am having doubts about the implementation of subsampling of frequent words. Verbatim from the paper, the probability of ...
Pietro's user avatar
  • 465
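For reference, a small sketch of the discard rule as stated in the word2vec paper, P(discard w) = 1 - sqrt(t / f(w)), where f(w) is the relative frequency of the word and t is a threshold. The helper names and the seeded `random.Random` are illustrative assumptions, and library implementations may use a somewhat different expression.

```python
import math
import random
from collections import Counter


def discard_prob(freq, t=1e-5):
    """Probability of dropping a word with relative frequency `freq`.
    Clipped at 0 so that words rarer than `t` are always kept."""
    return max(0.0, 1.0 - math.sqrt(t / freq))


def subsample(tokens, t=1e-5, seed=0):
    """Drop each occurrence independently with its word's discard probability."""
    rng = random.Random(seed)
    counts = Counter(tokens)
    total = len(tokens)
    return [w for w in tokens
            if rng.random() >= discard_prob(counts[w] / total, t)]


corpus = ["the"] * 990 + ["rare"] * 10
kept = subsample(corpus, t=0.02)
print(kept.count("the"), kept.count("rare"))
```

The decision is made per occurrence, so a frequent word is thinned out rather than removed from the vocabulary.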
0 votes
1 answer
713 views

Yesterday I asked a similar question: R - Randomly split a dataframe in n equal pieces. The answer I got is nearly what I need, but there are still problems with it. Also I thought about ...
Mr.Spock's user avatar
  • 519
1 vote
0 answers
80 views

How can I efficiently compare matched cohorts in spark? In python for each observation of the minority class in a highly imbalanced dataset sampling k observations from the majority class can be ...
Georg Heiler's user avatar
  • 17.9k
0 votes
1 answer
505 views

I noticed that each time I save a jpg file in PHP, it is saved with sub-sampling. How to remove that? I'm using GD library.
worisi24's user avatar
  • 159
0 votes
1 answer
1k views

I was asked in a test what the size of a 10-second video displayed at 25 fps would be, assuming each chroma sample takes 4 bits, the luminance component takes 8 bits, and 4:2:0 chroma sampling is ...
HQuser's user avatar
  • 640
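The arithmetic can be sketched as follows. With 4:2:0, each 2x2 block of pixels carries four luma samples plus one Cb and one Cr sample. The frame size is not visible in the excerpt, so the 640x480 used below is purely a made-up example, and `video_size_bits` is a hypothetical helper.

```python
def video_size_bits(width, height, fps, seconds, luma_bits=8, chroma_bits=4):
    """Uncompressed size in bits of a 4:2:0 video with even dimensions."""
    frames = fps * seconds
    luma = width * height * luma_bits                         # one Y sample per pixel
    chroma = 2 * (width // 2) * (height // 2) * chroma_bits   # Cb and Cr at quarter resolution
    return frames * (luma + chroma)


bits = video_size_bits(640, 480, fps=25, seconds=10)
print(bits, "bits =", bits // 8, "bytes")
```

With these bit depths the average works out to 8 + 2 * 4 / 4 = 10 bits per pixel.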
1 vote
1 answer
320 views

The title is probably confusing. I have a reasonably large 3D numpy array. I'd like to cut its size by 2^3 by binning blocks of size (2,2,2). Each element in the new 3D array should then contain the ...
Matheus Leão's user avatar
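A common way to do this kind of binning in NumPy is to reshape so that every block gets its own axes and then reduce over those axes. The excerpt is cut off before it says what each new element should contain, so summing is only an assumption; the hypothetical `reduce` parameter lets another reduction such as `np.mean` be swapped in.

```python
import numpy as np


def bin_2x2x2(a, reduce=np.sum):
    """Reduce each (2, 2, 2) block of a 3D array to a single value."""
    d0, d1, d2 = a.shape
    if d0 % 2 or d1 % 2 or d2 % 2:
        raise ValueError("all dimensions must be even")
    # Axes 1, 3 and 5 index the position inside each block.
    blocks = a.reshape(d0 // 2, 2, d1 // 2, 2, d2 // 2, 2)
    return reduce(blocks, axis=(1, 3, 5))


a = np.arange(64).reshape(4, 4, 4)
binned = bin_2x2x2(a)
print(binned.shape)
```

For a contiguous array the reshape is a view, so no Python-level loop over blocks and no intermediate copy is needed.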
1 vote
1 answer
2k views

How can a 1:1 stratified sampling be performed in python? Assume the Pandas Dataframe df to be heavily imbalanced. It contains a binary group and multiple columns of categorical sub groups. df = pd....
Georg Heiler's user avatar
  • 17.9k
1 vote
1 answer
856 views

First, I'm trying to subsample a large dataset with many individuals, but each individual requires a different subsample size. I'm comparing across two time periods, so I want to subsample each ...
user9351962's user avatar
4 votes
2 answers
3k views

I have this problem that I want to plot a data distribution where some values occur frequently while others are quite rare. The number of points in total is around 30.000. Rendering such a plot as png ...
oarfish's user avatar
  • 4,774
1 vote
1 answer
887 views

xgb.cv and sklearn.model_selection.cross_validate do not produce the same mean train/test error even though I set the same seed/random_state and I make sure both methods use the same folds. The code ...
Maauss's user avatar
  • 11
5 votes
4 answers
3k views

I have a dataframe which contains multiple samples (1-n) per group. I would like to sample this dataset, without replacement, so that I have a maximum of 5 samples per group (1-5). This problem has ...
Aaarrrgh's My Game's user avatar
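If the data were in pandas (the excerpt does not say which tool is used), one sketch is to shuffle the rows once and then take the first rows of each group, which gives a random sample without replacement capped at 5 per group. `cap_per_group`, `group_col` and the toy column names are assumptions.

```python
import pandas as pd


def cap_per_group(df, group_col, max_n=5, seed=0):
    """Random sample without replacement, at most `max_n` rows per group."""
    shuffled = df.sample(frac=1, random_state=seed)  # random row order
    # head() keeps the first max_n rows of each group in shuffled order.
    return shuffled.groupby(group_col).head(max_n).sort_index()


df = pd.DataFrame({"group": ["a"] * 8 + ["b"] * 3 + ["c"] * 5,
                   "value": range(16)})
sampled = cap_per_group(df, "group")
print(sampled["group"].value_counts().to_dict())
```

Groups with fewer than 5 rows are returned whole, which avoids the error that sampling a fixed count from each group raises on small groups.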
1 vote
0 answers
222 views

I don't understand when the sampling is done: will the first mini-batch be the same for each epoch? Or is there no difference at all?
Fractale's user avatar
  • 1,704
1 vote
1 answer
2k views

I'm new to Android and have some questions. The idea is to simulate a book page with some images and text on it, and animations that zoom in on a column and, after clicking a button, zoom in on a different ...
john sadeghi's user avatar