
I can read a CSV file that has no column headers with the following code, using polars in Rust:

use polars::prelude::*;

fn read_wine_data() -> Result<DataFrame> {
    let file = "datastore/wine.data";
    CsvReader::from_path(file)?
        .has_header(false)
        .finish()
}


fn main() {
    let df = read_wine_data();
    match df {
        Ok(content) => println!("{:?}", content.head(Some(10))),
        Err(error) => panic!("Problem reading file: {:?}", error)
    }
}

Now I want to add column names to the DataFrame, either while reading or afterwards. How can I do that? Here is a vector of column names:

let COLUMN_NAMES = vec![
    "Class label", "Alcohol",
    "Malic acid", "Ash",
    "Alcalinity of ash", "Magnesium",
    "Total phenols", "Flavanoids",
    "Nonflavanoid phenols",
    "Proanthocyanins",
    "Color intensity", "Hue",
    "OD280/OD315 of diluted wines",
    "Proline"
];

How can I add these names to the DataFrame? The data can be downloaded with the following command:

wget https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data
  • Not super familiar with polars, but with Python DataFrame libraries you can pass in headers when reading CSV files. From a quick look at the polars docs, it seems that you can use the with_schema or with_dtypes method on the CsvReader, where a Schema would be your column definitions. pola-rs.github.io/polars/polars/prelude/… Commented Jun 11, 2022 at 18:11
  • Yes, I have seen that API doc too, but I have not found any instructions on what the inputs to with_schema or with_dtypes should look like. It takes a Schema type, of course, but I could not find a concrete example of constructing a Schema. The Schema::new() method does not take any input. Commented Jun 11, 2022 at 21:19
  • It looks like Schema has another method, with_column, that takes in a column name and a datatype. So, my assumption would be Schema::new().with_column("columnname", DataType::Int32): pola-rs.github.io/polars/polars/chunked_array/object/… Commented Jun 12, 2022 at 15:41

2 Answers


This seemed to work, by creating a schema object and passing it in with the with_schema method on the CsvReader:

use polars::prelude::*;
use polars::datatypes::DataType;

fn read_wine_data() -> Result<DataFrame> {
    let file = "datastore/wine.data";

    // Define a schema with a single column, "wine", read as f32.
    let mut schema: Schema = Schema::new();
    schema.with_column("wine".to_string(), DataType::Float32);

    CsvReader::from_path(file)?
        .has_header(false)
        .with_schema(&schema)
        .finish()
}


fn main() {
    let df = read_wine_data();
    match df {
        Ok(content) => println!("{:?}", content.head(Some(10))),
        Err(error) => panic!("Problem reading file: {:?}", error)
    }
}

Granted I don't know what the column names should be, but this is the output I got when adding the one column:

shape: (10, 1)
┌──────┐
│ wine │
│ ---  │
│ f32  │
╞══════╡
│ 1.0  │
├╌╌╌╌╌╌┤
│ 1.0  │
├╌╌╌╌╌╌┤
│ 1.0  │
├╌╌╌╌╌╌┤
│ 1.0  │
├╌╌╌╌╌╌┤
│ ...  │
├╌╌╌╌╌╌┤
│ 1.0  │
├╌╌╌╌╌╌┤
│ 1.0  │
├╌╌╌╌╌╌┤
│ 1.0  │
├╌╌╌╌╌╌┤
│ 1.0  │
└──────┘
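
The output above has a single column because the schema defines a single field. Assuming the schema should carry one entry per column in the file, a quick check with only the standard library can compare the number of names against the number of fields in a data row. This is a minimal sketch: count_fields is a hypothetical helper, it does not handle quoted fields, and the sample row is illustrative rather than read from the file.

```rust
/// Column names from the question, one per expected column in wine.data.
const COLUMN_NAMES: [&str; 14] = [
    "Class label", "Alcohol", "Malic acid", "Ash",
    "Alcalinity of ash", "Magnesium", "Total phenols", "Flavanoids",
    "Nonflavanoid phenols", "Proanthocyanins", "Color intensity", "Hue",
    "OD280/OD315 of diluted wines", "Proline",
];

/// Count comma-separated fields in one line.
/// Hypothetical helper; it does not support quoted fields.
fn count_fields(line: &str) -> usize {
    line.trim_end().split(',').count()
}

fn main() {
    // Illustrative row, assumed to have the same shape as a line of wine.data.
    let sample_row = "1,14.23,1.71,2.43,15.6,127,2.8,3.06,.28,2.29,5.64,1.04,3.92,1065";
    let found = count_fields(sample_row);
    if found == COLUMN_NAMES.len() {
        println!("row has {} fields, matching the {} names", found, COLUMN_NAMES.len());
    } else {
        println!("mismatch: {} fields vs {} names", found, COLUMN_NAMES.len());
    }
}
```

In practice the row would come from the first line of the data file instead of a literal.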

Here is the full solution working for me:

use std::path::PathBuf;

use polars::prelude::*;
use polars::prelude::DataType::{Float64, Int64};

// One Field per column named in the question, in file order.
fn read_csv_into_df(path: PathBuf) -> Result<DataFrame> {
    let schema = Schema::from(vec![
        Field::new("class_label", Int64),
        Field::new("alcohol", Float64),
        Field::new("malic_acid", Float64),
        Field::new("ash", Float64),
        Field::new("alcalinity_of_ash", Float64),
        Field::new("magnesium", Float64),
        Field::new("total_phenols", Float64),
        Field::new("flavanoids", Float64),
        Field::new("nonflavanoid_phenols", Float64),
        Field::new("proanthocyanins", Float64),
        Field::new("color_intensity", Float64),
        Field::new("hue", Float64),
        Field::new("od280/od315_of_diluted_wines", Float64),
        Field::new("proline", Float64),
    ]);
    CsvReader::from_path(path)?
        .has_header(false)
        .with_schema(&schema)
        .finish()
}

I had to use Field with a data type for each column to create a schema, then pass that schema to CsvReader to read the data.
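
The field names in this answer are lowercase, underscore-separated versions of the names in the question. As a rough sketch of how they could be derived rather than typed by hand, the helper below (to_snake_case is a hypothetical name, not part of polars) lowercases a name and replaces spaces with underscores, using only the standard library.

```rust
/// Convert a display name such as "Malic acid" into "malic_acid".
/// Hypothetical helper: it only lowercases and swaps spaces for underscores.
fn to_snake_case(name: &str) -> String {
    name.trim().to_lowercase().replace(' ', "_")
}

fn main() {
    // A few of the column names from the question.
    let names = ["Class label", "Malic acid", "OD280/OD315 of diluted wines"];
    for name in names {
        println!("{} -> {}", name, to_snake_case(name));
    }
}
```

The resulting strings could then be passed as the first argument to Field::new, paired with the data type chosen for each column.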
