I'm trying to parse the following XML data:

http://pastebin.com/UcbQQSM2

This is just an example of the 2 types of data I will run into: companies with the needed address information and companies without it.

From the data I need to collect 3 pieces of information:

1) The Company name

2) The Company street

3) The Company zipcode

I'm able to do this with the following code:

#Creates list of Company names
CompanyList = []
for company in xmldata.findall('company'):
    name = company.find('name').text
    CompanyList.append(name)

#Creates list of Company zipcodes
ZipcodeList = []
for company in xmldata.findall('company'):
    contact_data = company.find('contact-data')
    address1 = contact_data.find('addresses')
    for address2 in address1.findall('address'):
        ZipcodeList.append(address2.find('zip').text)

#Creates list of Company streets
StreetList = []
for company in xmldata.findall('company'):
    contact_data = company.find('contact-data')
    address1 = contact_data.find('addresses')
    for address2 in address1.findall('address'):
        StreetList.append(address2.find('street').text)

But it doesn't really do what I want, and I can't figure out how to get there. I believe it will take some type of 'if' statement, but I'm not sure.

The problem is that where I have:

for address2 in address1.findall('address'):
    ZipcodeList.append(address2.find('zip').text)

and

for address2 in address1.findall('address'):
    StreetList.append(address2.find('street').text)

It only adds to the list the companies that actually have a street name or zipcode listed in the XML, but I also need a placeholder for the companies that DON'T have that information listed, so that my lists match up.

I hope this makes sense. Let me know if I need to add more information.

Basically, I'm trying to find a way to say: if there isn't a zipcode/street name for the Company, put "None"; if there is, put the zipcode/street name.

Any help/guidance is appreciated.

1 Answer

Well I am going to do a bad thing and suggest you use a conditional (ternary) operator.

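street = address2.find('street')
StreetList.append(street.text if street is not None and street.text else 'None')

So this statement says: append street.text if the 'street' element exists and its text is not empty, otherwise append 'None'. The "is not None" check matters because find() returns None when the element is missing, and calling .text on None raises an AttributeError.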
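To see what "empty" actually means for a given element, a quick check like the one below may help. It is only a sketch: the `<address>` fragment is made up for illustration (it is not taken from the pastebin data), and `text_or_none` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: <street> is present but empty, <zip> is missing entirely.
address2 = ET.fromstring("<address><street></street></address>")

empty_street = address2.find('street')  # an Element whose .text is None
missing_zip = address2.find('zip')      # None, because there is no <zip> child


def text_or_none(element):
    """Return the element's text, or the string 'None' if it is missing or empty."""
    return element.text if element is not None and element.text else 'None'


street_value = text_or_none(empty_street)
zip_value = text_or_none(missing_zip)
```

Both cases end up as the string 'None', but for different reasons: the empty element exists and has no text, while the missing one is never returned by find() at all.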

Alternatively, you could create a helper function that does the same test and call it in both places. Note that my Python is rusty, but this should get you close:

def returnNoneIfEmpty(testText):
    if testText:
        return testText
    else:
        return 'None'

Then just call it:

StreetList.append(returnNoneIfEmpty(address2.findtext('street')))
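If the lists still come out with different lengths, the likely cause is that the inner loop never runs for a company that has no 'address' element, so nothing gets appended for it. One way around that is to append exactly once per company. The following is a rough sketch under some assumptions: the sample XML is invented to match the structure implied by the question's code (not copied from the pastebin link), `collect_company_rows` is a hypothetical name, and only the first address of each company is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the structure is inferred from the question's code.
SAMPLE_XML = """
<companies>
  <company>
    <name>Acme</name>
    <contact-data>
      <addresses>
        <address>
          <street>1 Main St</street>
          <zip>12345</zip>
        </address>
      </addresses>
    </contact-data>
  </company>
  <company>
    <name>NoAddress Inc</name>
    <contact-data>
      <addresses/>
    </contact-data>
  </company>
</companies>
"""


def collect_company_rows(root):
    """Return one (name, street, zipcode) tuple per company, using 'None' as filler."""
    rows = []
    for company in root.findall('company'):
        name = company.findtext('name')
        # find() returns None when any step of the path is missing.
        address = company.find('contact-data/addresses/address')
        if address is None:
            street, zipcode = None, None
        else:
            # findtext() returns None for a missing child and '' for an empty one.
            street = address.findtext('street')
            zipcode = address.findtext('zip')
        rows.append((name, street or 'None', zipcode or 'None'))
    return rows


xmldata = ET.fromstring(SAMPLE_XML)
rows = collect_company_rows(xmldata)

# Split into three parallel lists; they always have the same length.
CompanyList = [row[0] for row in rows]
StreetList = [row[1] for row in rows]
ZipcodeList = [row[2] for row in rows]
```

Because each company contributes exactly one row, the three lists stay aligned by index. A company with several addresses would need a different policy (for example, one row per address), which this sketch does not attempt.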

3 Comments

Thanks, but that still seems to only give me the Companies that already have zipcodes. It doesn't put 'None' for the ones that don't. I think it's because of the "for address2 in address1.findall('address'): ZipcodeList.append(address2.find('zip').text)". Specifically, the .findall() goes through and only pulls out the Companies that have a 'zip' present. I need it to do that, but also put 'None' in for the Companies that don't have the 'zip' element. Unless I misunderstood how to use the first part of your answer.
It looks like it pulls all elements that match, not just ones with values. I would print out each one before you test it to see what "empty" means. The if statement assumes that the text would be an empty string, but maybe that is not the case.
if len(address2.find('street').text or '') > 0 else