Hello! I would like to combine many CSV files horizontally (the total number will be around 120-150) into one CSV file by taking one column from each file (in this case the column called "grid"). All the files have the same columns and the same number of rows (they are built the same way) and are stored in the same directory. I've tried the csv module and pandas. I don't want to list all 120 files by hand; I need a script that does it automatically. I'm stuck and out of ideas...

Some input CSV files (data) and the merged CSV file I would like to get: https://www.dropbox.com/transfer/AAAAAHClI5b6TPzcmW2dmuUBaX9zoSKYD1ZrFV87cFQIn3PARD9oiXQ

This is what my code looks like when I use the csv module:

import os
import glob
import csv

os.chdir('\csv_files_direction')

extension = 'csv'
files = [i for i in glob.glob('*.{}'.format(extension))]
out_merg = ('\merged_csv_file_direction')

with open(out_merg,'wt') as out:
    writer = csv.writer(out)
    for file in files:
        with open(file) as csvfile:
            data = csv.reader(csvfile, delimiter=';')
            result = []
            for row in data:
                a = row[3] #column which I need
                result.append(a)

With this code I get values only from the last CSV file; the rest are missing. What I want is one specific column from each CSV file in the directory.

And with pandas:

import os
import glob
import pandas as pd
import csv

os.chdir('\csv_files_direction')

extension = 'csv'
files = [i for i in glob.glob('*.{}'.format(extension))]
out_merg = ('\merged_csv_file_direction')
in_names = [pd.read_csv(f, delimiter=';', usecols = ['grid']) for f in files]

With pandas I get the data from all the CSV files as a list of DataFrames, which I can index with e.g. in_names[1]. I confess this is my first try with pandas and I don't know what my next step should be.

I will really appreciate any help! Thanks in advance, Mateusz

1 Answer

For the csv part, I think you need another list defined OUTSIDE the loop. Something like:

import csv
import glob
import os

# Folder that contains this script; run the script from there so the
# relative paths below resolve.
dirname = os.path.dirname(os.path.realpath(__file__))

extension = 'csv'
files = glob.glob('*.{}'.format(extension))
out_merg = 'merged_csv_file_direction'

result = []  # defined outside the loop, so it keeps one list per file
with open(out_merg, 'wt') as out:
    writer = csv.writer(out)
    for file in files:
        with open(file) as csvfile:
            data = csv.reader(csvfile, delimiter=';')
            col = []
            for row in data:
                a = row[3]  # column which I need
                col.append(a)
            result.append(col)

NOTE: I have also changed the way the script gets into the folder. Now you can run the file directly in the folder that contains the two folders (one to read the data from and the other to save the data to).
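To get the collected columns side by side in the output file, one option is to transpose the list of lists with zip(*...) before writing it. The following is only a sketch under assumptions: the function name merge_column, the demo file names a.csv and b.csv, and the demo column layout are made up for illustration, and it assumes every file has the same number of rows (zip stops at the shortest column).

```python
import csv
import glob
import os
import tempfile


def merge_column(in_dir, out_path, col_index=3, delimiter=";"):
    """Take one column from every CSV in in_dir and write them side by side."""
    files = sorted(glob.glob(os.path.join(in_dir, "*.csv")))
    columns = []  # one inner list per input file
    for path in files:
        with open(path, newline="") as f:
            reader = csv.reader(f, delimiter=delimiter)
            columns.append([row[col_index] for row in reader])
    with open(out_path, "w", newline="") as out:
        # zip(*columns) turns "one list per file" into "one tuple per row"
        csv.writer(out, delimiter=delimiter).writerows(zip(*columns))
    return files


# Small demo with two made-up input files in a temporary folder
with tempfile.TemporaryDirectory() as tmp:
    in_dir = os.path.join(tmp, "data")
    os.mkdir(in_dir)
    for name, offset in (("a.csv", 0), ("b.csv", 10)):
        with open(os.path.join(in_dir, name), "w", newline="") as f:
            w = csv.writer(f, delimiter=";")
            w.writerow(["id", "x", "y", "grid"])
            for i in range(3):
                w.writerow([i, i, i, i + offset])
    out_path = os.path.join(tmp, "merged.csv")  # kept outside in_dir
    merge_column(in_dir, out_path)
    with open(out_path, newline="") as f:
        merged = list(csv.reader(f, delimiter=";"))
    print(merged)
```

Note that the header row of the merged file will repeat the same column name once per input file, so you may want to replace it with the file names. Writing the output outside the input folder keeps it from being picked up by glob on the next run.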

For the pandas part, you can loop over the files again. This time you need to concatenate (pd.concat) the DataFrames you created with in_names = [pd.read_csv(f, delimiter=';', usecols = ['grid']) for f in files]. I think you can use:

import os
import glob
import pandas as pd

os.chdir(r'\csv_files_direction')

extension = 'csv'
files = glob.glob('*.{}'.format(extension))
out_merg = r'\merged_csv_file_direction'
in_names = [pd.read_csv(f, delimiter=';', usecols=['grid']) for f in files]
# axis=1 places the DataFrames side by side; the default (axis=0) stacks them
result = pd.concat(in_names, axis=1)
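If it helps, here is a slightly fuller sketch of the pandas route that names each output column after its source file and saves the result. The function name merge_grid and the demo files a.csv and b.csv are assumptions made up for illustration; passing a dict to pd.concat with axis=1 uses the dict keys as column names.

```python
import glob
import os
import tempfile

import pandas as pd


def merge_grid(in_dir, column="grid", delimiter=";"):
    """Read one column from every CSV in in_dir and place them side by side."""
    files = sorted(glob.glob(os.path.join(in_dir, "*.csv")))
    series = {}
    for path in files:
        name = os.path.splitext(os.path.basename(path))[0]  # file name without .csv
        series[name] = pd.read_csv(path, delimiter=delimiter, usecols=[column])[column]
    # axis=1 aligns the Series on the row index and makes one column per file
    return pd.concat(series, axis=1)


# Small demo with two made-up input files in a temporary folder
with tempfile.TemporaryDirectory() as tmp:
    in_dir = os.path.join(tmp, "data")
    os.mkdir(in_dir)
    for name, offset in (("a", 0), ("b", 10)):
        demo = pd.DataFrame({"id": range(3), "grid": [i + offset for i in range(3)]})
        demo.to_csv(os.path.join(in_dir, name + ".csv"), sep=";", index=False)
    merged = merge_grid(in_dir)
    merged.to_csv(os.path.join(tmp, "merged.csv"), sep=";", index=False)
    print(merged)
```

Sorting the file list makes the column order reproducible, since glob does not guarantee any particular order.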

Let me know if it works.

6 Comments

Hi @ragioniere! The results of the first option (with csv) are not quite what I wanted. Each file's column is saved as a row (one below the other) rather than as a column (one next to the other). The second option (with pandas) gives me no results; it only saves the name of the column.
Do you have any ideas how to transpose the list, or store the data another way, so that the CSV keeps its column shape?
For the CSV part: this happens because we are using a list to store each file's column, and a list prints as [element1, element2, element3]. If you need the values arranged horizontally, you need to save them to a file outside the program. My question is: in what kind of file? Knowing that, we can work out the next step for writing the list to a file "horizontally". I think I misunderstood the pandas part of the question. Do you need to extract the same column as before, and then transpose it to make it horizontal?
Check this link; I think it will clarify what I mean: dropbox.com/transfer/…
The CSV part is now complete. The list is now composed of multiple lists, so in the end you will have 150 elements inside result that you can save to a CSV file. Tell me if you need the pandas part.