0

I have 2 csv files containing bathymetry data - single and multibeam

CSV 1 which is the multibeam data looks like:

X Y Z
626066.4 234058.2 6.69
626066.4 234059.2 6.89
626066.4 234060.2 7.06

And CSV 2 which contains the singlebeam data looks like:

x Y A B C
627839 232463.4 14.22 14.46 14.71
627839 232463.1 14.22 14.46 14.71

I would like to use pandas to merge the 2 CSVs based as follows:

  1. The X and Y coordinates of CSV2 are combined with the pre-existing X,Y coordinates of CSV 1
  2. The specified Z value of CSV 2 (A, B or C) is combined with CSV 1 Z value

This way I will have a combined xyz dataset of a survey area which will evidentally be more accurate due to a combination of multi and single beam data.

For clarification, what I want to end up with results wise is:

X Y Z
626066.4 234058.2 6.69
626066.4 234059.2 6.89
626066.4 234060.2 7.06
627839 232463.4 14.22
627839 232463.1 14.22

I have tried the below code snippet, but need a way of combining the X's, Y's and Z's based on specified columns of CSV 2.

import pandas as pd

# Read the files into two dataframes.
df1 = pd.read_csv('CSV1.csv')
df2 = pd.read_csv('CSV2.csv')

# Merge the two dataframes, using _ID column as key
df3 = pd.merge(df1, df2, on = 'X')
df3.set_index('X', inplace = True)

# Write it to a new CSV file
df3.to_csv('CSV3.csv')
3
  • can you include sample data, instead of screenshots? Commented Nov 15, 2021 at 20:15
  • 1
    What do you mean by combined? Added together? Averaged? Commented Nov 15, 2021 at 20:17
  • @DNy All records of CSV 2 added beneath CSV 1 Commented Nov 15, 2021 at 20:23

1 Answer 1

1

IIUC you want to be able to tell you function which column is to be handled as Z in df2 and then concat both lists:

def concat_df_on_z(df1, df2, z_col):
    df2 = df2[['X', 'Y', z_col]].rename(columns={z_col: 'Z'})
    return pd.concat([df1, df2])

df3 = concat_df_on_z(df1, df2, 'B')
print(df3)

Output:

          X         Y      Z
0  626066.4  234058.2   6.69
1  626066.4  234059.2   6.89
2  626066.4  234060.2   7.06
0  627839.0  232463.4  14.46
1  627839.0  232463.1  14.46
Sign up to request clarification or add additional context in comments.

3 Comments

If I use the script import pandas as pd # Read the files into two dataframes. df1 = pd.read_csv('C:/Users/Public/multibeam.csv') df2 = pd.read_csv('C:/Users/Public/singlebeam.csv') def concat_df_on_z(df1, df2, z_col): df2 = df2[['X', 'Y', z_col]].rename(columns={z_col: 'Z'}) return pd.concat([df1, df2]) df3 = concat_df_on_z(df1, df2, 'B') df3.to_csv(r'C:\Users\Public\CSV3.csv') I get the below results
X[m] Y[m] Z[m] X Y Z 0 626066.43 234058.2 6.69 NaN NaN NaN 1 626066.43 234059.2 6.89 NaN NaN NaN 2 626066.43 234060.2 7.06 NaN NaN NaN 3 626067.43 234057.2 6.69 NaN NaN NaN 4 626067.43 234058.2 6.89 NaN NaN NaN ... ... ... ... ... ... ... 34158 NaN NaN NaN 627148.40 233739.17 12.94 34159 NaN NaN NaN 627148.27 233739.35 12.92
It's very hard to read as a comment. What is the problem? You can get rid of the index column while writing to csv: df3.to_csv('CSV3.csv', index=False). What are X[m] Y[m] Z[m] ?

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.