Ruby newbie here. I've got a product CSV where the first column is a unique SKU and the second column is a product ID that can be repeated across multiple products (there are many other columns, but these are the pertinent ones). For example:
SKU | Prod ID
99 | 10384
100 | 10385
101 | 10385
102 | 10386
103 | 10386
104 | 10387
In the script I'm writing, the first row that uses a product ID becomes a 'parent', and any subsequent rows with the same product ID get treated differently (i.e., as different sizes).
I'm currently reading in the whole CSV rather than processing it line by line with foreach, as I assumed I'd need all the data available to find the duplicates.
The issue is that I'm not sure how to identify the first time a product ID is used and then identify any further instances of its use.
My first thought was to somehow identify the duplicates (uniq?), then create a new column and put a 1 in it if this is the first time the product ID has occurred and a 0 if it has occurred previously. After looking at uniq, I'm not sure how I'd then go back to the main list and mark my 1s and 0s.
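To make the 1/0 idea concrete, here is a minimal sketch of one way it could work, using a Set to remember which product IDs have already been seen. The sample text, the constant names and the method name `mark_first_occurrences` are made up for illustration, and the data is assumed to be pipe-delimited with no padding spaces, so this is not the real file layout.

```ruby
require 'csv'
require 'set'

# Hypothetical sample in the layout described above (the real file has more columns).
SAMPLE = <<~DATA
  SKU|Prod ID
  99|10384
  100|10385
  101|10385
  102|10386
  103|10386
  104|10387
DATA

# Returns [sku, prod_id, flag] arrays, where flag is 1 for the first row
# carrying a given product ID and 0 for every later row with that ID.
def mark_first_occurrences(csv_text)
  seen = Set.new
  CSV.parse(csv_text, headers: true, col_sep: '|').map do |row|
    sku, prod_id = row.fields[0], row.fields[1]
    # Set#add? returns nil when the value is already in the set
    flag = seen.add?(prod_id) ? 1 : 0
    [sku, prod_id, flag]
  end
end

MARKED = mark_first_occurrences(SAMPLE)
MARKED.each { |fields| puts fields.join(' | ') }
```

Because the set only has to hold the IDs seen so far, the same check should also work row by row inside CSV.foreach, without loading the whole file first.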
Can someone please point me in the direction of the classes/methods I need to be looking at?
Thanks, Liam
Edit for John D: This gives me a hash, but with 1:1 entries rather than one entry holding all instances of a product ID:
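items = {}  # SKU => [product ID]
CSV.foreach(INPUT, :headers => true, :header_converters => :symbol, :col_sep => "|", :quote_char => "\x00") do |csv_obj|
  # Each SKU (column 0) becomes its own key, holding a one-element array with its product ID (column 1)
  items[csv_obj.fields[0]] = [csv_obj.fields[1]]
end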
which gives: "230709"=>["88507"], "109064"=>["9019"]
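For comparison, a rough sketch of the "one product ID to all of its SKUs" shape could key the hash on the product ID and append each SKU to an array, instead of keying on the SKU. The sample text, the constant names and the method name `skus_by_product` are hypothetical, and the sketch assumes the rows appear in file order so that the first SKU in each array is the 'parent'.

```ruby
require 'csv'

# Hypothetical sample in the layout described above (the real file has more columns).
SAMPLE_ROWS = <<~DATA
  SKU|Prod ID
  99|10384
  100|10385
  101|10385
  102|10386
  103|10386
  104|10387
DATA

# Returns { prod_id => [sku, sku, ...] } with SKUs kept in file order.
def skus_by_product(csv_text)
  # The default block creates an empty array the first time a product ID is seen
  groups = Hash.new { |hash, key| hash[key] = [] }
  CSV.parse(csv_text, headers: true, header_converters: :symbol, col_sep: '|').each do |row|
    groups[row.fields[1]] << row.fields[0]
  end
  groups
end

GROUPED = skus_by_product(SAMPLE_ROWS)

# First SKU per product ID is treated as the parent, the rest as variants
GROUPED.each do |prod_id, (parent, *variants)|
  puts "#{prod_id}: parent=#{parent} variants=#{variants.inspect}"
end
```

With this shape there is no separate 1/0 column: the position within each array says whether a SKU was the first occurrence of its product ID.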