I am trying to create an XML file from a DataFrame. The issue I am facing is that I want to pass the column names in as parameters, and it is not working.

    ORDER_NO        1175        1304          1421  
   7283630          2019-12-04  2019-12-10   2019-12-12 
   7283650          2019-12-25  NaN          2019-12-20

My code

header = """<ORD>{}</ORD>"""
body ="""
<osi:ORDSTSINF types:STSCDE="{}">
<DTM>{}</DTM>
</osi:ORDSTSINF>
<osi:ORDSTSINF types:STSCDE="{}">
<DTM>{}</DTM>
</osi:ORDSTSINF>
<osi:ORDSTSINF types:STSCDE="{}">
<DTM>{}</DTM>
</osi:ORDSTSINF>"""

for row in df.itertuples():
    with open(f'{row[1]}.xml', 'w') as f:
        f.write(header.format(row[1]))
        f.write(body.format(col[2], row[2], col[3],row[3],col[4],row[4]))

I want to pass the column name wherever STSCDE="{}" appears.

Expected output

<ORD>7283630</ORD>
<osi:ORDSTSINF types:STSCDE="1175">
<DTM>2019-12-04</DTM>
<osi:ORDSTSINF types:STSCDE="1304">
<DTM>2019-12-10</DTM>
<osi:ORDSTSINF types:STSCDE="1421">
<DTM>2019-12-12</DTM>

How can this be done in python?

1 Answer

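itertuples() puts the index at position 0 of each row tuple, so row[1] is ORDER_NO and row[2] is the first status date, whereas df.columns starts at 0 with ORDER_NO. Subtract 1 when selecting the column names, so col[2] becomes cols[1], and likewise for the other values: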

cols = df.columns
for row in df.itertuples():
    with open(f'{row[1]}.xml', 'w') as f:
        f.write(header.format(row[1]))
        f.write(body.format(cols[1], row[2], cols[2],row[3],cols[3],row[4]))

<ORD>7283630</ORD>
<osi:ORDSTSINF types:STSCDE="1175">
<DTM>2019-12-04</DTM>
</osi:ORDSTSINF>
<osi:ORDSTSINF types:STSCDE="1304">
<DTM>2019-12-10</DTM>
</osi:ORDSTSINF>
<osi:ORDSTSINF types:STSCDE="1421">
<DTM>2019-12-12</DTM>
</osi:ORDSTSINF>
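
The offset between the row tuple and df.columns can be checked directly. This is a minimal sketch, assuming the frame from the question is rebuilt by hand with string column names and string dates; the names `first` and `pairs` are only for illustration.

```python
import pandas as pd

# Assumed reconstruction of the frame shown in the question.
df = pd.DataFrame({
    "ORDER_NO": [7283630, 7283650],
    "1175": ["2019-12-04", "2019-12-25"],
    "1304": ["2019-12-10", float("nan")],
    "1421": ["2019-12-12", "2019-12-20"],
})

cols = df.columns
first = next(df.itertuples())  # first row as a tuple, index included

# first[0] is the index, so the value at position i belongs to cols[i - 1].
pairs = [(cols[i - 1], first[i]) for i in range(1, len(first))]
print(pairs)
```

Passing index=False to itertuples() would remove the offset, at the cost of changing every row[...] position used above.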

EDIT:

More dynamic solution with loop:

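header = """<ORD>{}</ORD>"""
body ="""
<osi:ORDSTSINF types:STSCDE="{}">
<DTM>{}</DTM>
</osi:ORDSTSINF>"""

cols = df.columns
for row in df.itertuples():
    # row[0] is the index, row[1] is ORDER_NO, row[2:] are the status dates
    with open(f'{row[1]}.xml', 'w') as f:
        f.write(header.format(row[1]))
        # pair each status code (column name) with its date in this row
        for code, date in zip(cols[1:], row[2:]):
            f.write(body.format(code, date))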

EDIT1: To omit missing values, add a pd.notna check:

import pandas as pd

header = """<ORD>{}</ORD>"""
body ="""
<osi:ORDSTSINF types:STSCDE="{}">
<DTM>{}</DTM>
</osi:ORDSTSINF>"""

cols = df.columns
for row in df.itertuples():
    # row[0] is the index, row[1] is ORDER_NO, row[2:] are the status dates
    with open(f'{row[1]}.xml', 'w') as f:
        f.write(header.format(row[1]))
        for code, date in zip(cols[1:], row[2:]):
            # skip status codes whose date is NaN for this order
            if pd.notna(date):
                f.write(body.format(code, date))
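
To check the NaN handling without writing files, the same loop can build the text in memory. This is only a sketch under assumptions: the frame is rebuilt by hand from the question, `row_to_xml` and `docs` are hypothetical names, the body template includes the closing tag shown in the first output, and itertuples(index=False, name=None) is used so that plain tuples line up with df.columns.

```python
import pandas as pd

HEADER = "<ORD>{}</ORD>"
BODY = '\n<osi:ORDSTSINF types:STSCDE="{}">\n<DTM>{}</DTM>\n</osi:ORDSTSINF>'


def row_to_xml(row, status_cols):
    """Hypothetical helper: build the XML text for one order row."""
    parts = [HEADER.format(row[0])]
    for code, date in zip(status_cols, row[1:]):
        if pd.notna(date):  # skip status codes with a missing date
            parts.append(BODY.format(code, date))
    return "".join(parts)


# Assumed reconstruction of the frame shown in the question.
df = pd.DataFrame({
    "ORDER_NO": [7283630, 7283650],
    "1175": ["2019-12-04", "2019-12-25"],
    "1304": ["2019-12-10", float("nan")],
    "1421": ["2019-12-12", "2019-12-20"],
})

status_cols = df.columns[1:]
docs = {
    int(row[0]): row_to_xml(row, status_cols)
    for row in df.itertuples(index=False, name=None)
}
print(docs[7283650])
```

Each value in docs could then be written to its own file, for example with open(f'{order_no}.xml', 'w'), as in the loops above.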

5 Comments

@jezrael, thanks, it works. Is there an easier way to write the body statement? Right now I am duplicating the statement once per column. Can we instead generate the required number of body blocks automatically from the columns and pass in the values?
I have one more question: for record 2 in the DataFrame, the value in column 1304 is NaN. In that case I want to drop the body generated for column 1304 and keep the rest. In general, wherever NaN occurs, that part has to be excluded from the body. Can that be done?
@aeapen So the output has <osi:ORDSTSINF types:STSCDE= 5 times? I am offline now, on my phone only, so I will add a solution tomorrow.
I don't want <osi:ORDSTSINF types:STSCDE="1304"> <DTM>NaN</DTM> to be created, but for 1175 and 1421 the XML should be generated. In short, I want the code to skip the XML generation when the row value for a status code is NaN.
For record 1 in the DataFrame, XML should be generated for all status codes, but for record 2 only 1175 and 1421 should be created, skipping the NaN.
