I have a list of DOIs and journal titles from a CSV file, now loaded into a dataframe. I'm trying to reconstruct the journal URL for v1 of each article using a function. What is the correct way to create a new column using a function that takes values from existing columns in the dataframe?

The dataframe looks like this:

PMID    Journal     DOI
1234    medRxiv     10.1101/2020.09.30.320762
2345    bioRxiv     10.1101/2020.05.26.117549

The function I created:

def createURL(doi, journal):
    url = 'https://www.' + journal + '.org/content/' + str(doi) + 'v1'
    return url

My attempt to call the function:

# this raises a KeyError ('PMID')
for row in dfRxiv:
    dfRxiv['URL'][row] = createURL(dfRxiv['DOI'][row], dfRxiv['Journal'][row])

I'm new to Python and I'm sure there's a better way to do this - I appreciate any help!

1 Answer

You're pretty much there. You don't need to iterate through the rows or define a function; pandas can concatenate the string columns directly:

dfRxiv['URL'] = 'https://www.' + dfRxiv['Journal'].astype('str') + '.org/content/' + dfRxiv['DOI'].astype('str') + 'v1'
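
As for the KeyError in the question: iterating over a dataframe with a plain for loop yields column labels rather than rows, so the first value of row is 'PMID', which is not in the index of dfRxiv['DOI']. Here is a minimal sketch of both points, assuming a dataframe rebuilt by hand from the two sample rows in the question (the labels variable is only there for illustration):

```python
import pandas as pd

# Sample data copied from the question; in practice this comes from the CSV.
dfRxiv = pd.DataFrame({
    'PMID': [1234, 2345],
    'Journal': ['medRxiv', 'bioRxiv'],
    'DOI': ['10.1101/2020.09.30.320762', '10.1101/2020.05.26.117549'],
})

# A plain for loop over a dataframe walks the column labels, not the rows.
labels = [row for row in dfRxiv]
print(labels)

# The vectorized concatenation builds every URL in one step.
dfRxiv['URL'] = ('https://www.' + dfRxiv['Journal'].astype('str')
                 + '.org/content/' + dfRxiv['DOI'].astype('str') + 'v1')
print(dfRxiv['URL'].tolist())
```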

Edit: if you do want to use a function:

def createURL(df):
    url = 'https://www.' + df['Journal'].astype('str') + '.org/content/' + df['DOI'].astype('str') + 'v1'
    return url

dfRxiv['URL'] = createURL(dfRxiv)
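
If you would rather keep the original two-argument function from the question, one option is to call it once per row with DataFrame.apply and axis=1. This is only a sketch, again assuming the two sample rows from the question; it is usually slower than the vectorized version on large dataframes, but it leaves the function unchanged:

```python
import pandas as pd

# Sample data copied from the question.
dfRxiv = pd.DataFrame({
    'PMID': [1234, 2345],
    'Journal': ['medRxiv', 'bioRxiv'],
    'DOI': ['10.1101/2020.09.30.320762', '10.1101/2020.05.26.117549'],
})

def createURL(doi, journal):
    # Works on scalar values: one DOI and one journal name.
    return 'https://www.' + journal + '.org/content/' + str(doi) + 'v1'

# axis=1 passes each row to the lambda as a Series keyed by column name.
dfRxiv['URL'] = dfRxiv.apply(
    lambda row: createURL(row['DOI'], row['Journal']), axis=1
)
print(dfRxiv['URL'].tolist())
```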