21,952 questions
0
votes
1
answer
133
views
Is combining pytorch and ProcessPoolExecutor doable?
I have created a structure, which provided a dataset as opts to a ProcessPoolExecutor and the inputs are the indices for the dataset.
I could provide an MWE, but I tried several approaches and all ...
1
vote
0
answers
58
views
igraph shortest_paths causes copying when launched in parallel with mclapply/parallel
I need to calculate many shortest paths between a set of starting points (origin_nodes) and a set of end points (target_nodes) in a very large graph. The graph can be over 50 GB in RAM. I could, in ...
0
votes
1
answer
77
views
SAS - handling macro variables in nested RSUBMITs
I am trying to code a simulation in SAS with nested RSUBMITs; the nested RSUBMITs are within a macro. I cannot get the macro variables to pass correctly into or out of the 2nd RSUBMIT.
Here is a very ...
0
votes
0
answers
109
views
Running a Rcpp function in parallel is slower than running it in serial
I am developing an R function that uses a function coded in Rcpp (I am still new to this).
I wanted to run simulations with the same R functions on a cluster of x cores. Unfortunately when I do ...
1
vote
1
answer
85
views
Troubleshooting 'row names discarded' warning in parallel simulation output in R using the 'future' package
I'm running a simulation using the future and future.apply packages in R where I need to execute multiple iterations of a function in parallel and bind the results together. When I use more than one ...
0
votes
0
answers
32
views
Write Data contiguously across threads
I have a compute shader that culls object triangles against frustums.
For culling, I use a huge vertex and index buffer and a pair of (offset, count) to identify the range of vertices for a single ...
3
votes
1
answer
100
views
What is the best way to perform parallel reduction to consolidate contributions to a matrix?
I am attempting to parallelise a calculation and consolidate the results into a matrix. A large number of calculations are performed and each one contributes to a summed matrix of all the results.
...
-1
votes
2
answers
200
views
What are the performance implications of using AsParallel() with Parallel.ForEach() in LINQ queries?
I am trying to optimize the performance of a C# application and came across AsParallel() in LINQ.
I want to understand the key differences between them, especially regarding performance when working with larger ...
0
votes
1
answer
57
views
openxlsx and writeData in parallel
I'm trying to use openxlsx and its function writeData to parallelise the export of many individual Excel files that should be somehow summarised in a central Excel file.
As can be shown in the reprex ...
0
votes
1
answer
51
views
Is there a way to improve the parallel process described below to solve an ODE equation?
I am trying to solve coupled ODEs, but I present a small part of the problem here. I have tried to solve for R along the entire grid. The grid size is too big to run conventional methods, hence I am ...
0
votes
0
answers
53
views
shiny and showNotification in parallel
Below is my sequential reprex Shiny App that I want to run in a parallel environment:
lapply(c("shiny", "DT", "parallel"), library, character.only = TRUE)
ui <- ...
0
votes
1
answer
49
views
Parallelization by threads vs Parallelization by processes on backend
I used to work with ordinary synchronous programming, where the architecture implied
that if you need anything to run in parallel,
you queue it in a message system and spawn an extra process on the same or ...
0
votes
0
answers
31
views
How to process kafka batch events in parallel with order guarantees for duplicate IDs in Apache Camel?
I have a problem with parallel processing in Apache Camel. I am consuming a batch of 10 messages from Kafka, which gives me an exchange with a list of exchanges. I want to process these exchanges in ...
0
votes
0
answers
70
views
Running PySpark Jobs in Parallel within Dagster on Dockerized Setup
I am implementing a Docker setup for running Dagster and PySpark together. My docker-compose.yml file looks like this:
dagster:
container_name: dagster
hostname: dagster
build:
...
1
vote
2
answers
178
views
How to properly run Python multiprocessing pool inside larger loop and shut it down before next loop starts
I have a large script where I am processing terabytes of weather/climate data that comes in gridded format. I have a script that uses an outer loop (over years - 1979 to 2024), and for each year, ...
0
votes
0
answers
34
views
How to fix time.process_time() not working when wrapped around a function run with Numba jit, with parallelisation=True
I have a function that has a numba @njit wrapper around it to make it faster, I've set parallel=True to make it run faster. And now want to measure the time it takes, using time.process_time(), ...
2
votes
0
answers
74
views
Parallelizing highly dynamic and unbalanced loads
I have a computation with the following structure (pseudocode):
intermediate_results = []
for source in sources: # (1)
source_data = prepare( load( source ) ) # (2)
for sample in ...
0
votes
0
answers
32
views
recordbatchreader failed when reading parquet file
I tried to use arrow::RecordBatchReader to read multiple row groups from a parquet file in parallel. I use GetRecordBatchReader to acquire the RecordBatchReader. However, I noticed that when the number ...
0
votes
1
answer
117
views
How to partition a list and send requests in parallel
I would like to partition a list into sublists and send a request for each sublist in parallel.
I have a list of product ids and want to partition it into sublists of size 3.
List<List<...
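The task described in this excerpt (split a list of ids into sublists of size 3, then send one request per sublist in parallel) can be sketched as follows. This is a minimal illustration in Python rather than the asker's Java, and `send_request` is a hypothetical placeholder, not an API from the question; a real version would put the HTTP or RPC call there.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(items, size):
    """Split items into consecutive sublists of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def send_request(batch):
    # Hypothetical stand-in for the real request; it only echoes the batch.
    return {"ids": batch, "count": len(batch)}


def send_all(product_ids, size=3, max_workers=4):
    """Send one request per sublist, running the requests concurrently."""
    batches = partition(product_ids, size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the same order as the input batches.
        return list(pool.map(send_request, batches))
```

Threads are used here on the assumption that the requests are I/O-bound; the last sublist may be shorter than 3 when the list length is not a multiple of the batch size.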
-3
votes
1
answer
91
views
How to run a parallel for loop in Python when filling an array? [closed]
I am using the metpy package to calculate many different weather parameters for many different locations across North America for many different hours. I want to fill arrays containing these weather ...
1
vote
1
answer
102
views
how to reduce memory footprint when reading parquet file
I want to read a parquet file batch by batch in parallel. I achieve this by merging multiple consecutive row groups together and reading them with arrow::RecordBatchReader. When I monitor the memory usage ...
0
votes
1
answer
76
views
Big Matrix not available to workers
I am converting a large data frame into a big.matrix object to enable parallel processing (otherwise, the data frame is too large and I run out of RAM). My code is currently like this:
df <- data....
1
vote
2
answers
167
views
Oracle APEX - page monitoring background dbms_scheduler job
I have a procedure which is taking time to execute. The procedure is being called from Oracle APEX upon clicking on a submit button, but it times out after 30 mins. Since the users don't want to ...
1
vote
0
answers
41
views
Python multiprocessing: Exponential slowdown when processing batches of HDF5 files
I'm trying to load and process large amounts of HDF5 files using Python and convert them into dataframes. The HDF5 files are scattered in equal-sized batches. I've tried two approaches using ...
2
votes
2
answers
312
views
How to write a GPU worker pool to run multiple tasks at the same time in bash?
Suppose there are 4 CUDA devices (0,1,2,3) on my computer and there are 10 tasks to run; each task is a script named run01.sh, run02.sh, ..., run10.sh.
The problem is, each task uses only 1 GPU, I ...
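One way to picture the worker pool this question asks about is a queue of free device ids: each task takes an id, runs with `CUDA_VISIBLE_DEVICES` set to it, and returns the id when done. The sketch below is in Python rather than the bash the asker wants, and `run_on_free_gpu` and `run_all` are hypothetical names introduced for illustration only.

```python
import os
import queue
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_on_free_gpu(command, free_gpus):
    """Run one command on whichever GPU id becomes free first."""
    gpu = free_gpus.get()  # blocks until a device id is available
    try:
        # Expose only the chosen device to the child process.
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
        result = subprocess.run(command, env=env)
        return gpu, result.returncode
    finally:
        free_gpus.put(gpu)  # hand the device back for the next task


def run_all(commands, gpu_ids):
    """Run all commands with at most one running task per GPU id."""
    free_gpus = queue.Queue()
    for gpu in gpu_ids:
        free_gpus.put(gpu)
    with ThreadPoolExecutor(max_workers=len(gpu_ids)) as pool:
        futures = [pool.submit(run_on_free_gpu, cmd, free_gpus) for cmd in commands]
        return [f.result() for f in futures]


# Example call, assuming the scripts from the question exist:
# run_all([["bash", f"run{i:02d}.sh"] for i in range(1, 11)], [0, 1, 2, 3])
```

Because a device id is only returned to the queue in the `finally` block, no two running tasks share a GPU, and a new task starts as soon as any device frees up.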
1
vote
2
answers
179
views
Is it possible to read a file in parallel by extending this function?
I developed this function in C to read a file consisting of one word per line, like a standard wordlist.
The function has already been optimized to a reasonable extent, but I would like to know if ...
1
vote
1
answer
45
views
Optimizing PowerShell Script for Remote Execution and Parallel Processing
I have a PowerShell script that performs remote execution of a script on multiple servers. The script checks whether a particular script exists on each server, and if it does, it invokes that script ...
0
votes
2
answers
79
views
In OpenMPI, is there a way to put jobs on specific cores?
The question pretty much says it all. This is for benchmarking purposes. I really do need to target specific cores on specific nodes. Targeting particular nodes is not enough in and of itself for ...
0
votes
1
answer
111
views
Warp Reduce primitives to threads sharing same value [closed]
I'm facing the problem of reducing values to threads in warps that share the same variable's content.
More specifically, in order to avoid atomic add operations on an array I'm evaluating ...
0
votes
0
answers
31
views
Using multiprocessing to simulate multiple computers
I want to use multiprocessing (Python) to simulate the following scenario: there are multiple computing centers, each with multiple computers, and these computers can execute tasks in parallel. ...
2
votes
0
answers
179
views
How do batches in Hugging Face transformers internally work?
While using Hugging Face transformers, when calling .generate, how are the input prompts internally executed, and how does batch size make a difference? For instance, input dimension (1,128) vs (10,...
0
votes
0
answers
35
views
Pre-staging large data files for parallel job execution
Apologies in advance if this is a mundane or unclear question.
I want to scale up a workflow on a cluster to run a program concurrently on several nodes. The program in question references a large, ...
0
votes
0
answers
54
views
How to handle transactions in parallelFlux?
We have a very processing-heavy flux pipeline. To speed it up we are using parallel flux. But now the problem is that everything is waiting behind the database connection and it's still slow. The entire ...
0
votes
0
answers
35
views
How to run two cells in parallel in Jupyter Notebook?
I am wondering if it is possible to run two parallel processes in Jupyter Notebook, in two different cells, so that each process can calculate and print its own results under the cell.
So far I have ...
2
votes
2
answers
145
views
How to process a massive file in parallel in Python while maintaining order and optimizing memory usage?
I'm working on a Python project where I need to process a very large file (e.g., a multi-gigabyte CSV or log file) in parallel to speed up processing. However, I have three specific requirements that ...
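The two constraints named in the title (keeping output order and bounding memory) can be sketched with a sliding window of in-flight chunks. This is an illustrative sketch only: `process_chunk` is a hypothetical placeholder for the real per-chunk work, the asker's full list of requirements is cut off above, and threads are used for brevity, so CPU-bound work would likely need a process pool instead.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def read_chunks(lines, chunk_size):
    """Lazily group an iterable of lines into lists of at most chunk_size."""
    it = iter(lines)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


def process_chunk(chunk):
    # Hypothetical per-chunk work; replace with the real processing.
    return [line.strip().upper() for line in chunk]


def process_in_order(lines, chunk_size=1000, max_workers=4, max_pending=8):
    """Yield processed lines in input order with at most max_pending chunks in flight."""
    pending = deque()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for chunk in read_chunks(lines, chunk_size):
            pending.append(pool.submit(process_chunk, chunk))
            if len(pending) >= max_pending:
                # Wait for the oldest chunk first, which preserves order.
                yield from pending.popleft().result()
        while pending:
            yield from pending.popleft().result()
```

Passing an open file object as `lines` reads it lazily, so memory use is bounded by roughly `max_pending * chunk_size` lines plus their results rather than by the file size.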
0
votes
2
answers
78
views
How to parallelize NUnit tests that need to use FileParameter?
.NET 8.0
NUnit 4.1.0
I have a bunch of tests that are wrapped within a TestFixture attribute.
Each test case is currently opening a file as a FileParameter and passing that to a function:
string path ...
1
vote
1
answer
635
views
Parallelising dlthub Rest API pipeline
I am trying to speed up the following dlt (dlthub) pipeline via parallelisation as shown in the documentation here: https://dlthub.com/docs/reference/performance#parallelism
Here is the original (NOT ...
0
votes
1
answer
101
views
Python Multiprocessing program using only 2 of the 20 cores
I am new to multiprocessing, so this might be a stupid question.
I am using Ubuntu 20.04.6 LTS (64-bit) with a 12th Gen Intel(R) Core(TM) i7-12700K processor and 16GB of RAM under Python 3.9.19. When ...
1
vote
0
answers
55
views
java parallel streams to exclude main thread as worker thread
Do Java parallel streams process all substreams in worker threads? I see the main thread is also used as a worker thread.
Sample program:
package org.example;
import java.util.Arrays;
import java....
2
votes
1
answer
474
views
Can we assign different number of workers to different playwright test environments?
The application that I'm testing has single-user per session enforcement, so I've reduced the number of workers to 1. This has created the problem that my entire Playwright suite takes over an hour to ...
0
votes
1
answer
106
views
Is it possible to ensure an Azure Function executes only once in parallel?
I have the requirement to execute an Azure Function (let's call it OperationalFunction) only once in parallel. What I mean is the following:
I have two "entry point" functions:
I have a ...
0
votes
1
answer
38
views
How to maintain synchronization between distributed python processes?
I have a number of workstations that run long processes containing sequences like this:
x = wait_while_current_is_set
y = read_voltage
z = z + y
The workstations must maintain synchronization with a ...
2
votes
1
answer
267
views
Repeated wandb.init() in parallelized wandb sweeps
I wrote some code trying to parallelize my wandb sweeps since the model I am working with takes a long time to converge and I have a lot of subprocesses to sweep through. Basically I don’t have the ...
0
votes
0
answers
74
views
How can I prevent Task.Run and Parallel.ForEach from using the "outer" thread?
My "outer" code is running under a thread (obviously). That outer code calls Parallel.ForEach and/or Task.Run, and the outer thread also gets used inside those methods.
Consider this code:
...
0
votes
2
answers
340
views
Differences between Superscalar Processors and Very Long Instruction Word Design?
Disclaimer: Somewhat new to deep diving into how the hardware actually executes instructions.
Reading "Game Engine Architecture" by Jason Gregory, and I'm on the Implicit & Explicit ...
2
votes
1
answer
154
views
Coroutines accelerate execution by more times than number of logical processors. Why?
I'm playing around with Kotlin coroutines and now I am testing them on the Nilakantha series (a formula for calculating Pi).
Here is my code:
import kotlinx.coroutines.*
import kotlin.system.measureTimeMillis
...
1
vote
2
answers
152
views
I used a 2 workers pool to split a sum in a function with Python multiprocessing but the timing doesn't speed up, is there something I'm missing?
The Python code file is provided below. I'm using Python 3.10.12 on Linux Mint 21.3 (in case any of this info is needed). The one with a pool of 2 workers takes more time than the one without any ...
1
vote
0
answers
59
views
Speedup ratio and parallel efficiency of implemented simple parallel matrix multiplication
The following is my source code for parallel matrix multiplication, which tests the parallel efficiency obtained with 1, 2, 4, 8, 16, and 32 threads, respectively.
My Operating ...
-2
votes
1
answer
116
views
Is there any way that I can convert my cpu parallel code to the cuda?
I have written code which works on CPU cores very well. But it's not fast enough for me. I want to run it on CUDA cores and I already tried to write a kernel for the Monte Carlo part of it, etc. ...
0
votes
1
answer
68
views
How to merge private arrays into a shared one in OMP - C?
I am trying to parallelize a program for finding local maxima using reduction. However, I am encountering a problem during the merging process: the merged array ends up containing exactly two fewer ...