380 questions
1
vote
2
answers
82
views
How to loop over a polars column in Rust?
let df: polars::prelude::DataFrame = read_dataframe(save_path).unwrap();
let texts = df.column("text").unwrap();
let mut counter: i32 = 0;
for lemma in texts.try_into(){
counter += 1;...
1
vote
0
answers
89
views
Why do I observe such high memory usage when copying a Parquet file with streaming using Polars with Rust?
Goal
I want to write a function in Rust using the Polars crate that does the following:
Copy a Parquet file from one location to another
Handle files larger than RAM
Not load the entire file into ...
1
vote
1
answer
26
views
Update forward_fill when updating to polars 0.48.1
I am updating my Rust polars dependencies from 0.46 to 0.48.
I am having trouble with the following line:
df = df.lazy().select(&[col("ts"), col("sensor_id").forward_fill(None), ...
2
votes
0
answers
180
views
Speeding up Polars rust plugin branching and aggregating
I'm following the polars plugins tutorial (branch mispredictions), and it says that there's a faster way to implement the following code:
#[polars_expr(output_type=Int64)]
fn sum_i64(inputs: &[Series]) -...
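A common way to sidestep branch mispredictions in this kind of element-wise null handling is to compute on the raw values unconditionally and combine the validity flags separately. The sketch below uses plain slices as a stand-in for Polars columns; `add_branching`, `add_branchless`, and the separate values/validity layout are assumptions for illustration, not the Polars plugin API, and the excerpt does not confirm that this is the approach the tutorial has in mind.

```rust
/// Branching version: one `match` per element, which the CPU has to predict.
fn add_branching(left: &[Option<i64>], right: &[Option<i64>]) -> Vec<Option<i64>> {
    left.iter()
        .zip(right)
        .map(|(l, r)| match (l, r) {
            (Some(l), Some(r)) => Some(l + r),
            _ => None,
        })
        .collect()
}

/// Branch-free version: values and validity flags live in separate slices
/// (hypothetical layout). Every slot is added, even null ones, and the
/// validity flags are AND-ed, so the loop bodies contain no `if`/`match`.
fn add_branchless(
    left_values: &[i64],
    left_valid: &[bool],
    right_values: &[i64],
    right_valid: &[bool],
) -> (Vec<i64>, Vec<bool>) {
    let values: Vec<i64> = left_values
        .iter()
        .zip(right_values)
        .map(|(l, r)| l.wrapping_add(*r))
        .collect();
    let valid: Vec<bool> = left_valid
        .iter()
        .zip(right_valid)
        .map(|(l, r)| *l & *r)
        .collect();
    (values, valid)
}

fn main() {
    let branching = add_branching(&[Some(1), None, Some(3)], &[Some(10), Some(20), None]);
    println!("{:?}", branching);

    // Null slots carry a placeholder value (0 here); only the flags mark them as null.
    let (values, valid) = add_branchless(
        &[1, 0, 3],
        &[true, false, true],
        &[10, 20, 0],
        &[true, true, false],
    );
    println!("{:?} {:?}", values, valid);
}
```

The values computed in null slots are meaningless and must be ignored by whoever reads the result; only the combined validity flags say which slots hold real data.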
1
vote
1
answer
97
views
Converting a Rust `futures::TryStream` to a `polars::LazyFrame`
I have an application where I have a futures::TryStream. Still in a streaming fashion, I want to convert this into a polars::LazyFrame. It is important to note that the TryStream comes from the ...
6
votes
1
answer
202
views
'builtin_function_or_method' object cannot be interpreted as an integer when trying to call rust function from python
While trying to develop Python packages using Rust code, I had some problems that seem to precede the function invocation in Rust. I've tried to implement this simple function:
use polars::prelude::...
4
votes
1
answer
616
views
How to use the "is_in" function correctly?
In Polars 0.46.0 it works normally:
let df = df!(
"id" => [0, 1, 2, 3, 4],
"col_1" => [1, 2, 3, 4, 5],
"col_2" => [3, 4, 5, 6, 7],
)
.unwrap();
dbg!(&...
1
vote
2
answers
67
views
Polars Rust Filtering on Group in Aggregate Window
I have a dataframe where I'm calculating the sum of certain messages within x window (say, 30 seconds).
I want to only keep rows where that sum is greater than or equal to 3. Here's my code:
async fn ...
5
votes
1
answer
73
views
Do Polars Expression plugins guarantee aligned chunks?
Does the Polars Expression plugin framework guarantee that chunks are aligned (to avoid footguns)? If I have the following function, it appears that they are:
#[polars_expr(output_type=Float64)]
fn ...
0
votes
1
answer
112
views
Most efficient way for polars plugin expression taking multiple columns
I'm creating a polars plugin for doing different physical calculations. I have quite a few situations where I have 3 to 5 columns as input. Below is a quite trivial case since it could be implement ...
1
vote
1
answer
108
views
Rust Polars upgrade from 46 to 47: diff() function changed and results in a runtime error
I am upgrading a Polars Rust function from v0.46.0 to v0.47.1. The diff() function is broken, and the suggested Nth(1) does not work:
error[E0308]: mismatched types
--> src/lib.rs:1368:35
|
1368 | ...
1
vote
1
answer
210
views
How do I ensure that a Polars expression plugin properly uses multiple CPUs?
I'm writing a polars plugin that works, but never seems to use more than one CPU. The plugin's function is element-wise, and is marked as such in register_plugin_function. What might I need to do to ...
0
votes
1
answer
65
views
encountered problems when converting str to datetime in Rust polars
let t_hour_df = hour_df.clone().lazy()
.with_column( col("hour").cast(DataType::Datetime(TimeUnit::Nanoseconds,Some(PlSmallStr::from("%Y-%m-%d %H%M")))).alias(&...
0
votes
0
answers
171
views
How can I use Polars to read a Parquet file in small batches? BatchedParquetReader seems broken
I want to read a Parquet file in batches/chunks so I don’t have to hold the whole file in RAM. It’s a large file, tens of gigabytes in size. I tried BatchedParquetReader, but it still reads the entire ...
1
vote
0
answers
69
views
How to set the number of threads for `LazyCsvReader`
let gaf_df = LazyCsvReader::new(file_path.clone())
.with_has_header(false)
.with_separator(b'\t')
.finish()?;
let gaf_df_select = gaf_df
.select([
col(&...
1
vote
1
answer
363
views
How to use Polars to read specific columns from a CSV file
I have a very large file generated by other tools, but I don't need all of it; a few columns are enough. When I use Python pandas to read it, I can specify the required ...
0
votes
0
answers
96
views
How to read large Arrow IPC files in batches for transformation with low memory usage?
I'm working with a Rust-based data processing pipeline using the polars and arrow2 crates. I have a flow where I batch-read CSVs and write them to an Arrow IPC file using IpcWriter with compression ...
1
vote
2
answers
145
views
Rust-polars: unable to filter dataframe after renaming the column filtered
The following code runs:
fn main() {
let mut df = df! [
"names" => ["a", "b", "c", "d"],
"values" => [1, 2, 3, 4],
...
1
vote
1
answer
161
views
Using `is_in` in rust-polars
I am trying to subset a rust-polars dataframe by the names contained in a different frame:
use polars::prelude::*;
fn main() {
let mut df = df! [
"names" => ["a", &...
0
votes
0
answers
273
views
Issue Writing Polars DataFrame in Chunks to Arrow/Parquet Without Corruption
What I Am Trying to Do
I'm trying to write a Polars DataFrame in chunks to either an Arrow IPC file or a Parquet file ...
1
vote
2
answers
105
views
Polars column manipulation
I've been trying to figure out how to do this with the Rust Polars library, but I am still learning Rust and Polars, and the casting is holding me back. With this code, I get an error ...
1
vote
1
answer
83
views
How to get all groups in Polars with Rust?
In Python, it looks like this:
df = pl.DataFrame({"foo": ["a", "a", "b"], "bar": [1, 2, 3]})
for name, data in df.group_by("foo"):
print(...
2
votes
1
answer
86
views
How to append a row with column-wise means in a Polars DataFrame in Rust?
I'm trying to calculate the mean of each numeric column in a Polars v0.46.0 DataFrame using Rust, and append the result as a new row at the bottom of the DataFrame.
Here's a simplified example of the ...
0
votes
1
answer
189
views
How to serialize json from a polars df in rust
I want to get JSON from a Polars dataframe, following this answer: Serialize Polars `dataframe` to `serde_json::Value`
use polars::prelude::*;
fn main() {
let df = df! {
"a" => [1,2,3,...
2
votes
1
answer
120
views
How to format an expr in Polars with Rust?
In Python Polars, a column is formatted like this:
df = pl.DataFrame({
"a": [0.15, 0.25]
})
result = df.with_columns(
pl.format("{}%", (pl.col("a") * 100).round(1))
)
print(...
1
vote
0
answers
38
views
Extracting value of type list[i16] from parquet cell [duplicate]
I have a parquet file with two columns: one of type list[i16] and another of type list[f32]:
┌───────────────┬─────────────────────────────────┐
│ Column A ┆ Column B │
│ --...
1
vote
0
answers
144
views
Using multithreading in Polars Expression Plugins
UPDATE:
See this SO post where the streaming engine is used:
How do I ensure that a Polars expression plugin properly uses multiple CPUs?
Original post:
I want to write a custom Polars Expression ...
1
vote
1
answer
145
views
Polars expression that switches on data type
How do I write a Polars expression that switches based on the datatype received? As an example, many Polars operations that work on null values fail on null columns. I want to avoid the computation ...
2
votes
2
answers
204
views
Renaming a single column with unknown name in Polars-Rust
The following renames a column in a Polars-Rust dataframe:
#![allow(unused_variables)]
use polars::prelude::*;
fn main() {
println!("Hello, world!");
let mut df = df! [
&...
0
votes
1
answer
72
views
Secondary y-axis for Plotlars TimeSeriesPlot
I would like to create a Plotlars TimeSeriesPlot for 2 time series that have completely different value ranges. Hence a secondary y-axis would be great for readability (as of now I just see 2 parallel ...
0
votes
1
answer
141
views
using format!() with data from polars cells
I am trying to figure out how to get format!() to work when pulling data from a polars dataframe but am not getting the formatting/alignment. It works when passing a String into format!() instead of ...
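One plausible explanation, sketched below with the standard library only: `format!` width and alignment are honored only when a type's `Display` implementation forwards them (for example via `Formatter::pad`), which `String` does. The `Cell` type is a hypothetical stand-in for a dataframe cell value, and whether the Polars cell type behaves this way is an assumption, though it would match the observation that passing a `String` works.

```rust
use std::fmt;

/// Hypothetical stand-in for a value pulled out of a dataframe cell.
struct Cell(i64);

impl fmt::Display for Cell {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // `write!` starts a fresh format spec, so the caller's width and
        // alignment are silently dropped here.
        write!(f, "{}", self.0)
    }
}

/// Width is ignored because `Cell`'s `Display` does not forward it.
fn unpadded(cell: &Cell) -> String {
    format!("[{:>6}]", cell)
}

/// Converting to `String` first lets `String`'s `Display` apply the padding.
fn padded(cell: &Cell) -> String {
    format!("[{:>6}]", cell.to_string())
}

fn main() {
    let cell = Cell(42);
    println!("{}", unpadded(&cell)); // [42]
    println!("{}", padded(&cell)); // [    42]
}
```

Under that assumption, calling `.to_string()` on the cell value before formatting is the simplest workaround.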
4
votes
1
answer
342
views
Why is that pola.rs rust-code considerably slower than the python version?
I am currently comparing different DataFrame based libs in python and rust. Of course I also check pola.rs, as that lib can be used from within both programming languages.
I tried to write the same ...
1
vote
2
answers
192
views
How to sum a column?
I have difficulty summing a column in a Polars-Rust dataframe. E.g., the following snippet:
use polars::prelude::*;
fn main() {
// let numbers = [1, 2, 3, 4, 5];
let n = 5;
let numbers: Vec<...
1
vote
1
answer
120
views
`scan` with struct as input and output in Polars plugin
I've worked through this Polars plugins tutorial, which covers both cumulative iteration through a column using Rust's scan and working with structs as input and output. I got stuck trying to achieve ...
-2
votes
1
answer
131
views
Reading a float array from file and converting it to integer [closed]
I am wondering about the equivalent to the following Python code in Rust:
import numpy as np
import pandas as pd
X = pd.read_csv('./myfile.tsv', sep='\t')
X1 = (X > 0).astype(np.float64).T
X2 = X1....
2
votes
1
answer
128
views
How to create a custom function expression
I have a function that creates a ChunkArray<ListType> from two chunk arrays, however I'm having a hard time converting the column function into a "Function Expression".
The goal is ...
0
votes
1
answer
105
views
How can I construct and concat a dataframe or series with Array elements?
Suppose I wanted to add into my dataframe a column for arrays. This can be done easily in the python implementation by specifying the schema at construction. Here is some code to show off my issue
fn ...
1
vote
1
answer
178
views
Polars Rust equivalent to pl.lit() (repeated value in df)
In python I can construct a dataframe with a repeated value like this:
import polars as pl
df = pl.DataFrame({"foo": [1,2]}).with_columns(bar=pl.lit("baz"))
Can this be done in ...
0
votes
0
answers
69
views
Custom Expression returns list[f64] instead of f64 when using group_by_dynamic()
When using group_by_dynamic() to perform a rolling calculation, my custom geometric mean expression will return a list[f64] dtype for each value instead of an f64.
However, when performing the ...
1
vote
0
answers
346
views
Printing Polars dataframe in Rust
I am trying to print a dataframe in rust. This is my code:
use polars_core::prelude::*;
use polars_io::prelude::*;
fn main() {
let test_df = example();
println!("{:?}", test_df);
}
...
1
vote
1
answer
125
views
Mutate polars column and keep original column name on custom expression
I am trying to implement a custom expression in Rust polars to calculate the geomean of different columns, essentially replicating the behavior of the .mean() expression where it will apply the ...
0
votes
1
answer
87
views
Polars Rust Napi borrow checker
When creating a Polars Rust Napi function, I am running into a borrow checker error:
returns a value referencing data owned by the current function
Any help would be appreciated. Thx
#[napi(...
0
votes
1
answer
61
views
Rust Polars dataframe aggregate top values from list in dataframe and join back to original dataframe
I'm trying to find a way to populate a new field in a dataframe that is the result of a group_by and aggregation.
For example, in a measurements dataframe, a column reader, has a list of animal sights ...
1
vote
1
answer
456
views
Polars Rust API creating a dataframe from a string variable / reading csv with options from a string
Using the Polars Rust API, is it possible to create a DataFrame directly from a CSV string / reader while specifying options such as the separator?
Currently, I'm working around this by saving the ...
0
votes
0
answers
117
views
How to pass rust polars DataFrame to python object?
I know there's this question. However the code I want is a little different.
[dependencies]
ordered-float: "4.4"
polars: {version="0.43", features = ["lazy", "...
1
vote
1
answer
178
views
Map over struct in Polars
In Rust Polars I have a dataframe with columns a and b and a struct MyStruct with fields a and b (with type u64). I want to convert each row of the dataframe into a MyStruct, returning a vector of ...
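The row-wise conversion described here amounts to zipping the two columns and building one struct per pair. A minimal sketch on plain slices follows; the slices stand in for the typed column data, `rows_to_structs` is a hypothetical helper name, and how the values are extracted from the dataframe (and how nulls are handled) is left out as an assumption.

```rust
/// Mirrors the struct from the question: fields `a` and `b`, both u64.
#[derive(Debug, PartialEq)]
struct MyStruct {
    a: u64,
    b: u64,
}

/// Hypothetical helper: pairs up the two columns element by element.
/// `zip` stops at the shorter input, so equal lengths are assumed.
fn rows_to_structs(col_a: &[u64], col_b: &[u64]) -> Vec<MyStruct> {
    col_a
        .iter()
        .zip(col_b)
        .map(|(&a, &b)| MyStruct { a, b })
        .collect()
}

fn main() {
    let rows = rows_to_structs(&[1, 2, 3], &[10, 20, 30]);
    println!("{:?}", rows);
}
```

With real columns the same zip-and-map shape applies; nullable columns would additionally need a decision on what a missing `a` or `b` should become.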
0
votes
1
answer
110
views
Query SQL Server with integrated security - Query Failed SourceNotSupport("MsSQL"), Any Comments?
I am trying to read SQL Server data into a polars DataFrame in Rust with connectorx. However, I have never succeeded with Windows authentication. When run, it returns
Query failed: SourceNotSupport("...
1
vote
1
answer
104
views
How do I perform this aggregation over columns based on data from other columns in Polars?
I have data that looks like this loaded into Polars:
uid groupid thresholds class data1 data2 data3 data4
X1 X 0.0 0 1 1 1
X2 X 0.0 0 1 1 1
X3 X 0.0 0 1 1 1
Y1 Y 0.0 1 1 1 1
Y2 Y 0.0 1 1 1 1
Y3 Y ...
0
votes
1
answer
165
views
Error while filtering the DataFrame by column value in Rust with Polars: Expected &column, found &str
I’m working with a CSV file in Rust using the Polars library and successfully read the CSV into a DataFrame. Now, I need to filter the DataFrame based on a specific value in the "City" ...
0
votes
2
answers
193
views
Failed to resolve polars_core, arrow::legacy, Dataframe in polars-lazy = "0.44.2"
Despite the:
reading of the polars_lazy 0.44.2
successful installation via cargo add polars-lazy
the following code results in errors:
error[E0433]: failed to resolve: could not find legacy in arrow
...