Using Ruby CSV to extract one column

Question

I've been trying to get a single column out of a CSV file.

I've gone through the documentation, http://www.ruby-doc.org/stdlib/libdoc/csv/rdoc/index.html but still don't really understand how to use it.

If I use CSV.table, the response is incredibly slow compared to CSV.read. I admit the dataset I'm loading is quite large, which is exactly the reason I only want to get a single column from it.

My request currently looks like this:

@dataTable = CSV.table('path_to_csv.csv')

and when I debug I get a response of

#<CSV::Table mode:col_or_row row_count:2104 >

The documentation says I should be able to use by_col(), but when I try to output

<%= debug @dataTable.by_col('col_name or index') %>

it gives me an "undefined method 'col'" error.

Can somebody explain how I'm supposed to use CSV, and whether there is a way to get columns faster using 'read' instead of 'table'?

I'm using Ruby 1.9.2, whose standard CSV library is based on FasterCSV, so I don't need to use the FasterCSV gem.

You might be getting it wrong because by_col doesn't take any arguments.
– Gal, Commented May 11, 2011 at 19:32

jkebinger · Accepted Answer · 2011-05-11 20:16:30Z

16

To pluck a column out of a csv I'd probably do something like the following:

col_data = []
CSV.foreach(FILENAME) {|row| col_data << row[COL_INDEX]}

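That should be substantially faster than any operation on a CSV::Table, because CSV.foreach streams the file one row at a time instead of loading the whole table into memory first.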
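As a hedged variation on the snippet above: if the file has a header row, passing `headers: true` to CSV.foreach should let you pick the column by name while still reading one row at a time. The sample data and the column name `name` below are made up for illustration.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical sample data, written to a temp file so the sketch is self-contained.
file = Tempfile.new(['sample', '.csv'])
file.write("id,name,score\n1,alice,90\n2,bob,85\n")
file.close

# Stream the file row by row. With headers: true each row is a CSV::Row,
# so a field can be looked up by its header string instead of a numeric index.
col_data = []
CSV.foreach(file.path, headers: true) { |row| col_data << row['name'] }

p col_data # the values of the "name" column, header line excluded

file.unlink
```

Without `headers: true` the header line is treated as an ordinary row, so the first element collected by index would be the column title itself.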


coderhs · Accepted Answer · 2015-01-30 06:28:49Z

16

You can get the values from a single column of the CSV file using the following snippet.

@dataTable = CSV.table('path_to_csv.csv')
@dataTable[:columnname]
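A small sketch of why the key is a symbol here: CSV.table reads the file with headers enabled, converts header names to symbols, and converts numeric fields to numbers. The file contents and header names below are invented for the example.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical sample file; note the space and capitals in "First Name".
file = Tempfile.new(['people', '.csv'])
file.write("First Name,Age\nalice,30\nbob,25\n")
file.close

table = CSV.table(file.path)

# Headers are downcased, spaces become underscores, and the result is a Symbol.
p table.headers

# Indexing the table with a header returns that column as a flat array.
names = table[:first_name]
ages  = table[:age] # numeric fields have been converted to Integers

p names
p ages

file.unlink
```

The whole file is still parsed into memory first, so this is convenient rather than fast on a large dataset.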


1 Comment

sixty4bit Over a year ago

this is the best answer

BobRodes · Accepted Answer · 2023-08-11 04:02:47Z

I found that this works for me (I'm using the OP's variable name here):

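# headers: true makes CSV.read return a CSV::Table; without it you get a plain Array, which has no by_col!
@dataTable = CSV.read('path_to_csv.csv', headers: true)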
@dataTable.by_col!
p @dataTable.values_at('Field1')

This prints all the values in the column Field1 as an array of one-element arrays: [[value1], [value2], [value3], ...]. So

p @dataTable.values_at('Field1').flatten

will print all the values in the column Field1 in a single array.
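For comparison, here is a minimal sketch, assuming a table parsed with `headers: true`, that puts the `values_at(...).flatten` approach next to plain header indexing, which returns the flat column directly. The inline data and the names Field1/Field2 are placeholders.

```ruby
require 'csv'

# Hypothetical inline data, parsed with headers so the result is a CSV::Table.
table = CSV.parse("Field1,Field2\na,1\nb,2\n", headers: true)

# Column mode: values_at with a header yields one single-element array per row.
nested = table.by_col.values_at('Field1')
flat_via_values_at = nested.flatten

# Default mixed mode: indexing by header string returns the column as a flat array.
flat_via_index = table['Field1']

p nested
p flat_via_values_at
p flat_via_index
```

Here the non-bang `by_col` is used so the original table keeps its default access mode.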

If you want to loop through all the fields in a table one by one, here's one way to do that. First, switch the table to column mode with by_col!, so that integer indexes reference columns rather than rows. Then you can do something like this:

# headers: true is needed so that CSV.read returns a CSV::Table
@dataTable = CSV.read('path_to_csv.csv', headers: true)
@dataTable.by_col!

0.upto(@dataTable.headers.size - 1) do |i|
  p @dataTable.values_at(i).flatten.compact.size # Or whatever you want here
end

This is a way to work up summary values from a CSV file, which can then be used to create a pivot table. If there's a requirement to input data from a CSV file and output summary data in the form of a pivot table, this might be a straightforward way to go.
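As a rough sketch of that kind of summary, the following counts the non-empty values in each column and keys the result by header name. The data is invented, and counting non-nil fields is just one example of a summary you might compute.

```ruby
require 'csv'

# Hypothetical data with some empty fields; unquoted empty fields parse as nil.
csv_text = "Field1,Field2\na,1\nb,\nc,\n,4\n"
table = CSV.parse(csv_text, headers: true)

# Build a { header => count of non-empty values } hash, one entry per column.
counts = table.headers.each_with_object({}) do |header, summary|
  summary[header] = table[header].compact.size
end

p counts
```

Keying by header rather than by position keeps the summary readable if the column order in the file changes.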
