I've got a text file containing multiple job titles, and I want to remove the titles that occur more than once. I created two empty lists: one for all the job titles and another that stores only the non-duplicate values. The code I've used is:

with open('jobtitle.txt') as fp:
    jobtitle = []
    jobtitle_original = []
    for line in fp:
        jobtitle.append(line)
    for i in range(0, len(jobtitle)):
        for j in range(0, len(jobtitle_original)):
            if jobtitle_original[j] == jobtitle[i]:
                continue
            else:
                jobtitle_original.append(jobtitle[i])
print jobtitle_original

But it prints an empty list. I'm using Python 2.7.

  • That's not surprising: jobtitle_original has length 0 at the start, so the body of the inner loop never executes and nothing is ever appended. Commented Apr 1, 2014 at 11:27
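
To make the point in the comment above concrete, here is a minimal sketch that reproduces the problem and shows one possible fix. The function names dedupe_buggy and dedupe_fixed and the sample list are made up for illustration, and the code is written to run on both Python 2.7 and Python 3 (hence print with parentheses).

```python
def dedupe_buggy(titles):
    # Mirrors the loops in the question: the inner loop iterates over
    # `unique`, which starts empty, so the append is never reached.
    unique = []
    for i in range(0, len(titles)):
        for j in range(0, len(unique)):
            if unique[j] == titles[i]:
                continue
            else:
                unique.append(titles[i])
    return unique


def dedupe_fixed(titles):
    # Test membership against the whole result list instead of
    # appending from inside a loop over that same list.
    unique = []
    for title in titles:
        if title not in unique:
            unique.append(title)
    return unique


# Hypothetical sample data standing in for the lines of jobtitle.txt.
titles = ['engineer', 'artist', 'teacher', 'teacher', 'engineer']
print(dedupe_buggy(titles))  # []
print(dedupe_fixed(titles))  # ['engineer', 'artist', 'teacher']
```

The `not in` test scans the result list on every iteration, which is fine for a small file but grows quadratically with the number of titles.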

2 Answers

You can simply use set:

jobs = ['engineer','artist','mechanic','teacher','teacher','engineer','engineer']

print list(set(jobs))
# e.g. ['engineer', 'artist', 'mechanic', 'teacher'] (the order may vary)

A simpler demonstration:

>>> lst = [1,4,2,4,3,5,3,5,3,5,4,5,4]
>>> print list(set(lst))
[1, 2, 3, 4, 5]

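set takes any iterable, such as a list, and builds a set containing each distinct item exactly once. You can then convert it back to a list with list(set(something)). Keep in mind that a set is unordered, so the original order of the list is not preserved.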
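
If the order of first appearance matters, a set can still be used to track what has already been seen while a list collects the results. The following is only a sketch of that idea; the helper name unique_in_order is an assumption, not something from the question, and the code runs on both Python 2.7 and Python 3.

```python
def unique_in_order(items):
    """Return the distinct items, keeping the order of first appearance."""
    seen = set()   # fast membership test for items already emitted
    result = []    # preserves the original order
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


jobs = ['engineer', 'artist', 'mechanic', 'teacher',
        'teacher', 'engineer', 'engineer']
print(unique_in_order(jobs))
# ['engineer', 'artist', 'mechanic', 'teacher']
```

This requires the items to be hashable, which is the case for strings such as job titles.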

1 Comment

+1 for you. I've posted my own answer though, just to clarify how to deal with data coming from a file properly.

Combining your file input with the set solution:

with open('jobtitle.txt') as fp:
    result = set(fp.readlines())
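
One thing to watch for: lines read from a file keep their trailing newline, so a title on the last line of a file that lacks a final newline would not compare equal to the same title elsewhere. A possible variant, sketched here with a hypothetical helper unique_titles and made-up sample lines in place of the real file, strips whitespace and skips blank lines before building the set. It runs on both Python 2.7 and Python 3.

```python
def unique_titles(lines):
    """Return the set of distinct, non-blank titles with whitespace stripped."""
    stripped = (line.strip() for line in lines)
    return set(title for title in stripped if title)


# Hypothetical lines as they might come from jobtitle.txt; the last
# line has no trailing newline.
sample_lines = ['engineer\n', 'teacher\n', '\n', 'engineer']

print(len(set(sample_lines)))               # 4: raw lines all differ
print(sorted(unique_titles(sample_lines)))  # ['engineer', 'teacher']
```

With the file from the question, the same helper could be called as unique_titles(fp) inside the with block, since a file object yields its lines when iterated.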
