6,157 questions
0
votes
0
answers
17
views
Significance analysis of microarrays (SAM) on OMIQ gives q-value of exactly 0 [closed]
I am analyzing a set of flow cytometry data in the online environment of OMIQ. OMIQ has a feature to analyze the data with the Significance analysis of microarrays (SAM) algorithm (https://www.pnas....
0
votes
0
answers
60
views
Clustering without all pairwise distances
I have a set of binarized images containing forms; each image follows one of N layouts. There are a few outliers which do not follow a layout and contain random text and images.
The distance between ...
1
vote
0
answers
65
views
FlowSOM randomly stops because of missing consensus.pdf
I am using FlowSOM() clustering from the FlowSOM package and am getting an error while a vectorized function is running:
Error in map2(): ℹ In index: 8. ℹ With name: FileID8. Caused by error in map() at ...
0
votes
0
answers
34
views
How can I reconcile multiple related documents (invoices, returns, and credit notes) with inconsistent data?
I need some help with a fairly complex task I’ve been assigned: document reconciliation between different types of records.
In short, I have to match documents with different “causal codes”:
2: Goods ...
0
votes
0
answers
51
views
Selective Inference on Ordinal Clustering
I've been using an ordered stereotype (OSM) approach to ordinal clustering with the R library 'clustord'
clustord is very well documented, with a step-by-step vignette. Therefore, to execute row ...
0
votes
0
answers
45
views
How to avoid overmerging with mclust, and failure to reproduce clustering?
I have been working with mclust, and have encountered issues that I can't find an obvious reason for. My main concern is that the threshold for multiple components to be found seems really high, and I ...
0
votes
0
answers
30
views
How to programmatically handle container partition redistribution in GridDB cluster after node failure?
Question
GridDB Container Partition Recovery After Node Failure
I'm working with a 3-node GridDB cluster and need to implement automatic recovery logic when one node fails. My application creates ...
0
votes
0
answers
52
views
Changing post and line colour in deg patterns cluster figures
I have had cluster plots produced for some RNA Seq time course data using the LRT analysis. I believe the plots are produced using the command:
clusters <- degPatterns(cluster_rlog, metadata = meta,...
5
votes
3
answers
244
views
Efficiently group rows within tolerance for multiple numeric columns
I'm trying to group rows that have values within specific error/tolerance.
Input looks like this:
input <- data.frame(Row_number = 1:22,
Name = c(rep("A",6), rep("...
0
votes
0
answers
38
views
Conditional logistic regression with robust standard errors for data matched with replacement
I am working with matched case-control data that used risk-set sampling with replacement (a control can be matched to more than one case). I am trying to figure out the correct syntax for conditional ...
2
votes
1
answer
154
views
How to dynamically partition a 2D array into boxes based on inverse area density?
Context:
I have a 2D array (size N x M), let's call it U, where each cell contains a non-negative value K ≥ 0 representing a "density" at that point. I want to algorithmically divide the ...
0
votes
1
answer
44
views
Spatial clustering with two separate datasets
I'm hoping to get some advice on approaching a clustering problem. I have two separate spatial datasets, being real data and modelled data. The real data contains a binary output (0,1), which is ...
1
vote
1
answer
122
views
How to set minimum-maximum load constraint in Google Route Optimization API
I'm using Google RO API to create clusters. There is a capacity constraint on the clusters and the clusters should not overlap with each other. To do this, I've set the load demand of each shipment to ...
0
votes
1
answer
160
views
DragonFly benchmark: slow on Cluster
I need help regarding dragonfly db, particularly benchmarking.
So here is the story, I tried benchmarking dragonfly as a cache to replace redis. I got the expected result when testing single node; it ...
3
votes
5
answers
153
views
Combine connected list elements to form distinct list elements
I need to combine interconnected list elements to form distinct elements in base R with no additional packages required (while removing NA and zero-length elements).
Edit: I look for flexibility of ...
1
vote
1
answer
154
views
Capacitated Clustering using Google Route Optimization API
Fixed sized clusters
I need help with a capacitated clustering task. I have 400 locations (the number can vary each time), and I need to create fixed-size clusters (e.g., 40 locations per cluster). ...
0
votes
0
answers
28
views
Cluster lat/lon values based on values
I'm trying to cluster values from a map in Python (these values could be income, kindness towards dogs or amount of penguins in supermarkets, for me the values are floats) from different data sources. ...
0
votes
0
answers
57
views
Finding subclusters of a specific cluster
I performed HDBSCAN Clustering
hdbscan_clusterer = hdbscan.HDBSCAN(min_cluster_size=200)
df['Cluster'] = hdbscan_clusterer.fit_predict(data_matrix_for_clustering)
Now, I’m interested in getting the ...
0
votes
0
answers
48
views
Evaluating Fuzzy clustering quality
Initially, I performed k-means clustering and obtained some meaningful clusters. To refine these clusters, I ran Fuzzy C-Means on the k-means centers using the "e1071" package. Are there any ...
2
votes
1
answer
186
views
Clustering lines in bands
Little intro
I have data (link at the bottom), with on the y-axis the score, x-axis the position, for different labels. Now I want to know if there is one label that is "significantly" ...
1
vote
2
answers
94
views
Using Python to group similar values from pair combinations
I have a list of paired values. Values in each pair are declared as similar, meaning two values are considered similar if they appear together in a pair from the list. My goal is to create a list of ...
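The excerpt above is truncated, so the exact goal is not visible. A minimal sketch, assuming the aim is to merge pairs transitively into groups (if a is similar to b and b to c, all three share a group), could use a union-find structure. The function name `group_similar` and the sample pairs are hypothetical and not taken from the question.

```python
from collections import defaultdict


def group_similar(pairs):
    """Group values that are linked, directly or transitively, by pairs."""
    parent = {}  # maps each value to its parent in the union-find forest

    def find(x):
        # Register unseen values as their own root, then walk up to the root,
        # shortening the path as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for a, b in pairs:
        union(a, b)

    # Collect every value under its root.
    groups = defaultdict(set)
    for value in parent:
        groups[find(value)].add(value)

    # Sort for a deterministic result (assumes the values are comparable).
    return sorted(sorted(group) for group in groups.values())


example = group_similar([(1, 2), (2, 3), (4, 5)])
```

With the sample pairs, values 1, 2 and 3 end up in one group and 4 and 5 in another. A graph library's connected-components routine would be an equivalent alternative.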
0
votes
0
answers
39
views
Mapbox Maps iOS - Show unclustered image as an icon in the clustered layer
I'm using GeoJSONSource to show images on the map (like images on the map in Apple Photos).
Those images are loaded from the FeatureCollection object and first thing I do is to add them to map style.
...
0
votes
0
answers
82
views
clustering in Lavaan for structural equation model
I am trying to fit SEM in lavaan that includes both a measurement and structural model. The measurement model consists of six latent variables, which serve as outcomes in the structural model. The ...
2
votes
0
answers
75
views
Fuzzy C-means: all cluster centers converge to the same point after the first centroid update
I am implementing Fuzzy C-means for image segmentation, following the given algorithm:
However, when updating the centroids (this is the first thing that I do), all cluster centers converge ...
1
vote
0
answers
27
views
Cluster detection error (Invalid date in population file) using rsatscan library, purely spatial and discrete Poisson
I'm trying to run a purely spatial analysis using SaTScan in R, but I'm getting date-related errors even though I'm not using any temporal data. Here's the error:
Error: Invalid date '775' in ...
0
votes
0
answers
25
views
Cluster stability measurement for fuzzy clustering
I'm a biologist working in the data science field. I've successfully clustered a heterogeneous disease with K-means, but I switched to Fanny to get membership values and to be able to handle the ...
1
vote
1
answer
73
views
How to use WeightedCluster to aggregate sequences and apply on Multichannel sequence analysis
I have 54399 cases and 2 channels (HOM and HOS), and I want to use multichannel sequence analysis. The data example is as follows:
ID  HOM1  HOM2  HOM3  HOM4  HOS1  HOS2  HOS3  HOS4
1   A     A     B     C     NO    YES   NO    NO
2   ...
1
vote
0
answers
45
views
Use a priority queue to do hierarchical clustering without importing heapq
I am using a priority queue to do hierarchical clustering (I cannot import heapq) and want to use the complete-link method, but I don't know what the problem with my code is; the reason is far from ...
-1
votes
1
answer
181
views
I get this error when I run the k-means code -> AttributeError: 'NoneType' object has no attribute 'split'
from sklearn.cluster import KMeans
cs = []
for i in range(1, 11):
    kmeans = KMeans(n_clusters = i, init = 'k-means++', max_iter = 300, n_init = 10, random_state = 0)
    kmeans.fit(X)
    cs.append(...
0
votes
1
answer
102
views
Community Detection with both Node and Edge Weights
I have a directed graph where there are importance or weight attributes for both the nodes and edges. I am looking for a community or module detection implementation in python that will consider both ...
1
vote
1
answer
191
views
K-Means taking a long time
I'm using k-means for my project for the first time. My dataset has more than 400,000 rows and 11 columns, and I ran k-means for k = 3, 5, 7, 9, and 10. It took more than 65 minutes and there is still no output....
4
votes
3
answers
156
views
Filter rows based on combined set of values in a string
In R, I have the following dataframe with the column "overlap" listing rows that have overlapping values on some other column.
df <- data.frame(overlap = c("1,2,3", "1,2,3"...
0
votes
0
answers
177
views
Efficient parallelization of silhouette score calculation
I have a large dataset (2 million rows, 100 columns), and I need to perform clustering. I used the elbow method to determine the optimal number of clusters. However, in order to get a more refined ...
0
votes
1
answer
80
views
How to solve "Duplicated samples have been found in X" error for DBCV metric
I'm trying to compute the DBCV metric (provided by "git+https://github.com/FelSiq/DBCV") on density-based clusters from a dataset similar to the one shown here:
The calculation is performed ...
1
vote
0
answers
55
views
Clustering for grouping sentences and then caption the cluster with a short name
I have a series of text utterances in summary form (form of sentences). I am trying to perform clustering and group them with similarity in context (not in literal meaning) and report the clusters ...
0
votes
1
answer
55
views
Problems with creating a mathematical clustering model with an additive criterion in CPLEX OPL Studio
I'm trying to create a model in CPLEX OPL Studio for clustering with an additive criterion, but I get a number of errors that I don't know how to fix correctly, because I'm very bad at OPL Studio.
...
-2
votes
1
answer
40
views
Clustering a Grid into segments of equal length
I have a grid with many interconnections. The grid consists of edges of different length. I would like to cluster this grid into segments of similar length. The edges which are summarized in a cluster ...
1
vote
0
answers
44
views
How to visualize the collective behaviour of self-propelled rods in two dimensions?
Multiple self-propelled rods (modelled using an odd number of connected hard spheres) with a fixed self-propulsion velocity are moving in a 2D medium with three different diffusion constants for ...
0
votes
1
answer
135
views
How to create a dendrogram colored by clusters with hclust and cutreeDynamic
I'm working on a clustering problem and I would like to use the hclust functions to create the dendrogram and cutreeDynamic to create clusters from the mentioned dendrogram. In fact, I have already ...
0
votes
1
answer
173
views
How to use the dissimilarity matrix output from vegdist() function for hclust()?
I have computed the dissimilarity matrix using the vegdist() function, with the method specified as "morisita". However, even though the hclust() function is built to read either distance or dissimilarity ...
3
votes
1
answer
58
views
How can I find contour or edges in my picture with opencv in Python3?
I want to detect the three rectangles (white, gray, black) in this picture, as in the image below.
I tried to use the find_contour function in OpenCV for Python, but the light gray stripes disturbed find ...
1
vote
2
answers
150
views
Clustering longitudinal data with labels?
I have longitudinal data as follows:
import pandas as pd
# Define the updated data with samples only in 'sample_A' or 'sample_B'
data = {
'gene_id': ['gene_1', 'gene_1', 'gene_1', 'gene_1', '...
1
vote
1
answer
106
views
Clustering geometries recursively exceeds cluster size limit
I want each cluster to have a maximum of 20 items. Here is my code in PostgreSQL with PostGIS extension:
WITH RECURSIVE clustered_data AS (-- Step 1: Perform initial clustering
SELECT pma.*
...
4
votes
1
answer
496
views
Topic modelling many documents with low memory overhead
I've been working on a topic modelling project using BERTopic 0.16.3, and the preliminary results were promising. However, as the project progressed and the requirements became apparent, I ran into a ...
0
votes
1
answer
519
views
What is the interpretation of this wavy T-SNE plot?
I am trying the T-SNE method to explore a high-dimensional dataset and reduce its dimensionality.
And I have ended up with the following plot.
I have used the TSNE parameters n_components=2 and init='...
0
votes
1
answer
56
views
Spring Boot 3 Session Clustering Error Deployed on External Tomcat10
I use Spring Boot 3.x and an external Tomcat 10.
Set up session clustering on an external Tomcat
If I check on the jsp page, the session is shared, but
If I check the same logic with spring boot ...
-1
votes
1
answer
126
views
What features to extract to cluster text?
I want to build a text classifier, which is then used to suggest the most similar text for a given one.
The flow of the app is the following:
extract the main 10 topics from the text, using a ...
0
votes
1
answer
144
views
pheatmap clustering order
I have this dataset:
> dput(mdata2)
structure(list(EE = c(3.3221428469822, 3.62699732299098, 1.75430154205983,
0.809228977410138, 1.24117055233438, 2.93403148663873, 4.01630566539058,
1....
0
votes
1
answer
253
views
Clustering for SBERT embedding
I have a set of sentences which I have transformed into vectors using SBERT embedding. I would like to cluster these vectors.
When looking for information online, I keep seeing posts telling me to do ...
2
votes
1
answer
95
views
How to delete edges based on cluster_edge_betweenness output
I want to do the same as asked here, using the first approach from the question.
Sadly, the mods variable from the following line is not defined, and I'm asking myself how to adjust it:
g2 <- delete....