Inserting values in a Python list

Question

I am working on a script that parses a text file and tries to normalize it enough to insert it into a DB. The data represents articles written by one or more authors. Because the number of authors is not fixed, I get a variable number of columns in my output text file, e.g.:

author1, author2, author3, this is the title of the article
author1, author2, this is the title of the article
author1, author2, author3, author4, this is the title of the article

These results give me a maximum of 5 columns. So, for the first 2 articles I will need to add blank columns so that every row has the same number of columns. What would be the best way to do this? My input text is tab-delimited, and I can iterate through the fields fairly easily by splitting on the tab.

Is it safe to assume that the article title is always the last item of the list? Also, what approach have you tried?
– Joel Cornett, Commented May 19, 2012 at 2:54
I have it working with the variable column count but this won't do. I need to have a set number of columns. I've built lists and tried adding to them but I get stuck with adding the blank items in the list.
– aeupinhere, Commented May 19, 2012 at 2:54

Josiah · Accepted Answer · 2012-05-19 04:24:45Z

Assuming you already know the maximum number of columns and have each row split into its own list (with those rows collected in an outer list), you should be able to just use list.insert(-1, item) to add empty columns just before the title:

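def columnize(mylists, maxcolumns):
    # Pads each row in place, so the caller's lists are modified.
    for row in mylists:
        while len(row) < maxcolumns:
            # insert(-1, ...) puts the blank just before the last item (the title)
            row.insert(-1, None)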

mylists = [["author1","author2","author3","this is the title of the article"],
           ["author1","author2","this is the title of the article"],
           ["author1","author2","author3","author4","this is the title of the article"]]

columnize(mylists,5)
print(mylists)

[['author1', 'author2', 'author3', None, 'this is the title of the article'], ['author1', 'author2', None, None, 'this is the title of the article'], ['author1', 'author2', 'author3', 'author4', 'this is the title of the article']]
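To see why insert(-1, ...) keeps the title last: a negative index counts from the end, and insert places the new item before the element at that position. A small illustration, written for Python 3 (the row and other names are made up for this example):

```python
# insert(-1, x) places x before the current last element.
row = ["author1", "author2", "title"]
row.insert(-1, None)
print(row)

# By contrast, append would put the blank after the title.
other = ["author1", "author2", "title"]
other.append(None)
print(other)
```

This only does what you want if the title really is the last item in every row.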

An alternative version that leaves your original lists unmodified, using a list comprehension:

def columnize(mylists, maxcolumns):
    return [j[:-1]+([None]*(maxcolumns-len(j)))+j[-1:] for j in mylists]

print(columnize(mylists, 5))

[['author1', 'author2', 'author3', None, 'this is the title of the article'], ['author1', 'author2', None, None, 'this is the title of the article'], ['author1', 'author2', 'author3', 'author4', 'this is the title of the article']]
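If you don't want to hard-code the 5, the column count can be derived from the rows themselves. A rough sketch for Python 3, assuming the title is always the last item; pad_rows is a hypothetical name, not something from the code above:

```python
def pad_rows(rows, maxcolumns=None):
    """Return new rows padded with None just before the last item (the title)."""
    if maxcolumns is None:
        # Widest row decides the column count; default=0 covers an empty input.
        maxcolumns = max((len(row) for row in rows), default=0)
    return [row[:-1] + [None] * (maxcolumns - len(row)) + row[-1:] for row in rows]


rows = [
    ["author1", "author2", "author3", "this is the title of the article"],
    ["author1", "author2", "this is the title of the article"],
    ["author1", "author2", "author3", "author4", "this is the title of the article"],
]

padded = pad_rows(rows)
for row in padded:
    print(row)
```

Because it builds new lists with slicing, the input rows keep their original lengths.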

Josh Smeaton · Accepted Answer · 2012-05-19 03:24:29Z

Forgive me if I've misunderstood, but it sounds like you're approaching the problem in a difficult way. It's quite easy to convert your text file into a dictionary that maps each title to a list of its authors:

>>> lines = ["auth1, auth2, auth3, article1", "auth1, auth2, article2","auth1, article3"]
>>> d = dict((x[-1], x[:-1]) for x in [line.split(', ') for line in lines])
>>> d
{'article2': ['auth1', 'auth2'], 'article3': ['auth1'], 'article1': ['auth1', 'auth2', 'auth3']}
>>> total_articles = len(d)
>>> total_articles
3
>>> max_authors = max(len(val) for val in d.values())
>>> max_authors
3
>>> for k,v in d.iteritems():
...     print k
...     print v + [None]*(max_authors-len(v))
... 
article2
['auth1', 'auth2', None]
article3
['auth1', None, None]
article1
['auth1', 'auth2', 'auth3']

Then, if you really want to, you can output this data using the csv module that's built into Python. Or, you could directly output the SQL that you're going to need.
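A minimal sketch of the csv route, written for Python 3. The articles dictionary stands in for d above, and the io.StringIO buffer stands in for a real output file; both are assumptions made for the example:

```python
import csv
import io

# Stand-in for the dictionary built from the parsed lines.
articles = {
    "article1": ["auth1", "auth2", "auth3"],
    "article2": ["auth1", "auth2"],
    "article3": ["auth1"],
}
max_authors = max(len(authors) for authors in articles.values())

# In a real script this would be open("out.csv", "w", newline="").
buffer = io.StringIO()
writer = csv.writer(buffer)
for title, authors in articles.items():
    # Pad the author columns with empty strings, then put the title last.
    padding = [""] * (max_authors - len(authors))
    writer.writerow(authors + padding + [title])

csv_text = buffer.getvalue()
print(csv_text)
```

The csv module also takes care of quoting, which matters if a title contains a comma.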

You are opening the same file many times, and reading it many times, just to get counts that you can derive from the data in memory. Please don't read the file multiple times for these purposes.
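Something along these lines reads the input once and derives the counts from memory. It is only a sketch for Python 3, assuming tab-delimited lines with the title last; load_rows is a hypothetical helper and the io.StringIO object stands in for your opened file:

```python
import io


def load_rows(handle):
    """Read a tab-delimited stream once and return a list of non-empty rows."""
    return [line.rstrip("\r\n").split("\t") for line in handle if line.strip()]


# Stand-in for: with open("articles.txt") as handle: ...
sample = io.StringIO(
    "auth1\tauth2\tauth3\tarticle1\n"
    "auth1\tauth2\tarticle2\n"
    "auth1\tarticle3\n"
)

rows = load_rows(sample)

# Both counts come from the data already in memory, not from re-reading the file.
total_articles = len(rows)
max_columns = max(len(row) for row in rows)
print(total_articles, max_columns)
```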
