Find gaps in spatial database based on attributes and fill those gaps with new features with existing attributes

Question

I have a large line shapefile (over 2 million features) showing animal movements.

Simplified example:
ID  Year  st_time  end_time  stX      stY     enX      enY
1   2018  070505   070603    5857000  461000  5848000  380000
2   2018  070603   070705    5848000  380000  5907100  379000
3   2018  070742   070852    5929000  359100  5936000  364000
4   2018  070852   070915    5936000  364000  5890700  363500
etc.

Each feature has attributes including a start and end time, and most features form a continuous sequence of lines (the end time of one feature is the start time of the next, e.g. IDs 1 and 2 in the example). Occasionally there are gaps that I need to find and fix (e.g. between IDs 2 and 3 in the example) by drawing the missing line, e.g.

ID  Year  st_time  end_time  stX      stY     enX      enY
1   2018  070505   070603    5857000  461000  5848000  380000
2   2018  070603   070705    5848000  380000  5907100  379000
5   2018  070705   070742    5907100  379000  5929000  359100
3   2018  070742   070852    5929000  359100  5936000  364000
4   2018  070852   070915    5936000  364000  5890700  363500
etc.

I have split the data by year, so I now have over 30 files, and for each of them I need to do the following:

Check whether the start time of ID n+1 matches the end time of ID n (or alternatively compare the coordinates).
If not, create a new line feature where
start time = ID n end time, and
end time = ID n+1 start time
The new features must carry these attributes.

I'm working with ArcGIS Pro 2.3.3 and would like to use ModelBuilder or Python for this. I assume pandas would be the package for handling the table, but I don't quite know how to approach finding the gaps, and especially creating the new features with the appropriate attributes.

I know about the XY To Line tool in ArcPy, but to use it I would first need to create the table containing the new feature rows (the previous step).

Bera · Accepted Answer · 2019-06-29 08:19:54Z

Your question is almost all pandas, so you could try posting it on Stack Overflow and might get a shorter answer. But this seems to work:

import pandas as pd
import arcpy

#Option 1 (the one I tried): export the shapefile table to csv and read it, keeping the times as strings
df = pd.read_csv(r'C:\somefolder\input.csv', dtype={'st_time':str,'end_time':str})
#Option 2: read the attribute table directly with da.SearchCursor instead (use one option, not both)
#columns = ['ID', 'Year', 'st_time', 'end_time', 'stX', 'stY', 'enX', 'enY']
#df = pd.DataFrame.from_records(data=arcpy.da.SearchCursor(r'C:\data\shapefile123.shp',columns), columns=columns)

df[['st_time','end_time']] = df[['st_time','end_time']].apply(lambda x: pd.to_datetime(x,format='%H%M%S').dt.time) #Convert st_time and end_time from strings to time objects
df['ok'] = df['end_time'] == df['st_time'].shift(-1) #True if this row's end time matches the next row's start time

def create_missing_rows(x):
    newrows = []
    newid = x.ID.max()+1
    for index, row in x.iterrows():
        if x.shape[0]-1 != index: #Skip last row
            if row.ok == False:
                ID = newid
                newid+=1
                st_time = row.end_time
                end_time = x.iloc[index+1].st_time
                stX,stY = row[['enX','enY']]
                endX,endY = x.iloc[index+1][['stX','stY']]
                year = row.Year
                newrows.append([ID,year,st_time,end_time,stX,stY,endX,endY,True])
    return pd.DataFrame(newrows, columns=x.columns)

df = pd.concat([df,create_missing_rows(df)])
df.sort_values(by='st_time', inplace=True)
df.drop(columns=['ok'], inplace=True)
df.reset_index(drop=True, inplace=True)
df.to_csv(r'C:\somefolder\out.csv')

df is now:

   ID  Year   st_time  end_time      stX     stY      enX     enY
0   1  2018  07:05:05  07:06:03  5857000  461000  5848000  380000
1   2  2018  07:06:03  07:07:05  5848000  380000  5907100  379000
2   5  2018  07:07:05  07:07:42  5907100  379000  5929000  359100
3   3  2018  07:07:42  07:08:52  5929000  359100  5936000  364000
4   4  2018  07:08:52  07:09:15  5936000  364000  5890700  363500

Then run the XY To Line tool on the output table to create the line features.
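
For tables as large as the one in the question (over 2 million rows), the row-by-row loop may be slow. A vectorized variant of the same idea is sketched below; it is not the code that was tested above. It assumes the rows are already in chronological order, that st_time and end_time are zero-padded HHMMSS strings (so they compare and sort correctly as text), and that the table has exactly the eight columns from the example. The function name fill_gaps is made up for illustration.

```python
import pandas as pd

def fill_gaps(df):
    """Return df plus one new row for every gap between consecutive rows.

    Assumptions (not checked here): rows are in chronological order,
    st_time/end_time are zero-padded HHMMSS strings, and the columns are
    ID, Year, st_time, end_time, stX, stY, enX, enY.
    """
    cur = df.iloc[:-1].reset_index(drop=True)  # rows n
    nxt = df.iloc[1:].reset_index(drop=True)   # rows n+1, aligned with rows n

    # A gap exists where row n ends at a different time than row n+1 starts
    is_gap = cur['end_time'] != nxt['st_time']
    n_gaps = int(is_gap.sum())
    if n_gaps == 0:
        return df.copy()

    first_new_id = int(df['ID'].max()) + 1
    gaps = pd.DataFrame({
        'ID': range(first_new_id, first_new_id + n_gaps),
        'Year': cur.loc[is_gap, 'Year'].to_numpy(),
        'st_time': cur.loc[is_gap, 'end_time'].to_numpy(),   # end of row n
        'end_time': nxt.loc[is_gap, 'st_time'].to_numpy(),   # start of row n+1
        'stX': cur.loc[is_gap, 'enX'].to_numpy(),
        'stY': cur.loc[is_gap, 'enY'].to_numpy(),
        'enX': nxt.loc[is_gap, 'stX'].to_numpy(),
        'enY': nxt.loc[is_gap, 'stY'].to_numpy(),
    })

    out = pd.concat([df, gaps], ignore_index=True)
    # Stable sort keeps the original order for rows with equal start times
    return out.sort_values('st_time', kind='stable').reset_index(drop=True)
```

Calling fill_gaps(df) on a frame read with the read_csv line above (before the time conversion) should give the same rows as the loop version, with the times left as strings. Sorting by st_time alone only makes sense if one table does not span several days.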

Thank you so much! Just letting you know I'll check the code within the next couple of days. Appreciate your help!!
– Sn0W, Commented Jul 2, 2019 at 22:29
