0

I have a CSV file as following

ID      Required  -- these are headers
SD0005   Yes      -- row information

I have to validate each row against header. Say ID contains letters and numbers and length should not be more than 6. Required header should be either yes or no in every row.

How can I achieve this functionality in Ruby if I have to process a huge file which has more than 1000 rows with good performance?

I'm reading particular row against each header as follows

CSV.foreach('file path', :headers => true) do |csv_obj| 
csv_obj['ID'] 
csv_obj['Required']

Is there a way to know which condition failed while validating column against header for a row. I need to know for which condition it failed and print it out

New to ruby. Appreciate help

1
  • You need the processing in ruby. Why would you have validation code in javascript? Commented Sep 9, 2015 at 14:50

1 Answer 1

1

To get the data into Ruby from the CSV file, try the following:

# This will read the data in with headers, convert the column names to symbols, 
# and then turn each resulting CSV::Row instance into a hash

data = CSV.read('temp.csv', headers: true, header_converters: :symbol).map(&:to_h)

this should return the following:

=> [{:id=>"SD0005", :required=>" yes"}, ...]

Once you have all of your information in a format you can work with in Ruby, you could make a method that checks the validity of each ID.

def valid_id?(id_string)
  # uses Regular Expressions to ensure ID is 6 
  # characters that consist of only numbers/letters
  # The double-bang(!!) turn a truthy value to `true`
  # or false value to `false`

  !!id_string.match(/^[\D|\d]{6}$/)
end

If you wanted to test another column's validity, do so with a separate method.

def valid_required?(req_column)
  req_column.downcase == 'yes' ||   req_column.downcase == 'no'
end

Create a main validator method

def all_valid?(row)
  valid_id?(row[:id]) && valid_required?(row[:required])
end

Then keep only the records whose ID is valid

# #select keeps values whose block evaluates to `true`
valid_records = data.select { |record| all_valid?(record) }
Sign up to request clarification or add additional context in comments.

4 Comments

I have functions in javascripts folder under app folder(directory structure created when we create rails project) For the above hash created how can i call function for every column for a given row? Say i have to call valid_id for id and valid_required for required and so on.Each column in header has a function. And then store complete row information only if validation passes for all columns of row to database. could you be more specific how we can code this in ruby?
You can create another method that checks the validity of a record's required required property like valid_required? then create a method that checks all of your validations. Ex: Inside of def all_columns_valid write valid_id? && valid_required?.
I'd encourage you to do some exploration on the CSV Ruby Documentation. It should help give you the understanding that's helpful in debugging these sorts of things. If all else fails, see here stackoverflow.com/questions/4822422/output-array-to-csv-in-ruby
Thanks Alex. Will check it out

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.