
The tuples inside the file:

 ('Wanna', 'O')
 ('be', 'O')
 ('like', 'O')
 ('Alexander', 'B')
 ('Coughan', 'I')
 ('?', 'O')

My question is: how can I join two strings from different tuples, taken from the same index, subject to a condition?

For example, in my case I want to join the strings at index [0] when the string at index [1] is 'B' and the next tuple's string at index [1] is 'I'.

So the output will be like:

  Alexander Coughan

This is my code, but the output is not what I want; it just prints None:

   readF = read_file ("a.txt")
   def jointuples(sentt, i):
        word= sentt[i][0]
        wordj = sentt[i-1][0]
        nameq = sentt[i][1]

        if nameq =='I':
           temp= ' '.join (word + wordj)
           return temp

   def join2features(sentt):
        return [jointuples(sentt, i) for i in range(len(sentt))]

   c_joint = [join2features(s) for s in readF]

   c_joint
  • What does "the output is bad" mean? Commented Apr 27, 2015 at 10:09
  • Please include some more samples: What pairs would be joined because of what? Commented Apr 27, 2015 at 10:09
  • I mean the output is not what I want; it just prints None. Commented Apr 27, 2015 at 10:12
  • Despite not being exactly clear on the criteria - you have an if nameq =='I-NP' there... what do you expect that to do? Commented Apr 27, 2015 at 10:15
  • The pairs to be joined are the strings at index [0] when the string at index [1] equals "B" and is followed by "I". In my case I want the output to print Alexander Coughan @Tichodroma Commented Apr 27, 2015 at 10:16

4 Answers

3

Here's how I'd write this:

from ast import literal_eval
from itertools import tee

def pairwise(iterable): # from itertools recipes
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

with open("a.txt") as f:
    for p0, p1 in pairwise(map(literal_eval, f)):
        if p0[1] == 'B' and p1[1] == 'I':
            print(' '.join((p0[0], p1[0])))  # join takes a single iterable
            break

Here's why:

Your file consists of what appear to be reprs of Python tuples of two strings. That's a really bad format, and if you can change the way you've stored your data, you should. But if it's too late and you have to parse it, literal_eval is the best answer.

So, we turn each line in the file into a tuple by mapping literal_eval over the file.
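As a minimal sketch of that parsing step, assuming each non-blank line holds the repr of one two-string tuple (the `lines` list below is a hypothetical stand-in for the contents of a.txt, not data from the question):

```python
from ast import literal_eval

# Hypothetical stand-in for the lines of a.txt; each is the repr of a tuple.
lines = ["('Alexander', 'B')\n", "('Coughan', 'I')\n", "\n"]

# Strip whitespace and skip blank lines before parsing, since literal_eval
# raises on an empty string.
parsed = [literal_eval(line.strip()) for line in lines if line.strip()]
print(parsed)  # [('Alexander', 'B'), ('Coughan', 'I')]
```

Stripping each line first also guards against stray leading spaces, which some Python versions reject when evaluating a literal.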

Then we use pairwise from the itertools recipes to convert the iterable of tuples into an iterable of adjacent pairs of tuples.

So, now, inside the loop, p0 and p1 will be the tuples from adjacent lines, and you can just write exactly what you described: if p0[1] is 'B' and it's followed by (that is, p1[1] is) 'I', join the two [0]s.
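To make the pairing concrete, here is a small sketch that runs the same idea on an in-memory list instead of a file. The `tagged` list and the `names` variable are illustrative names, not part of the original code:

```python
from itertools import tee

def pairwise(iterable):
    """Yield overlapping pairs: (s0, s1), (s1, s2), ..."""
    a, b = tee(iterable)
    next(b, None)  # advance the second iterator by one item
    return zip(a, b)

# Illustrative data in the same (word, tag) shape as the question's file.
tagged = [('like', 'O'), ('Alexander', 'B'), ('Coughan', 'I'), ('?', 'O')]

# Keep a pair only when a 'B' tag is immediately followed by an 'I' tag.
names = [' '.join((p0[0], p1[0]))
         for p0, p1 in pairwise(tagged)
         if p0[1] == 'B' and p1[1] == 'I']
print(names)  # ['Alexander Coughan']
```

On Python 3.10 and later, `itertools.pairwise` provides the same behavior, so the recipe does not need to be copied.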

I'm not sure what you wanted to do with the joined string, so I just printed it out. I'm also not sure if you want to handle multiple values or just the first, so I put in a break.


2

I'll extend the input data to include more 'B' + 'I' examples.

phrases = [('Wanna', 'O'),
    ('be', 'O'),
    ('like', 'O'),
    ('Alexander', 'B'),
    ('Coughan', 'I'),
    ('One', 'B'),
    ('Two', 'I'),
    ('Three', 'B')]

length = len(phrases)
res = ['%s %s' % (phrases[i][0], phrases[i + 1][0])
    for i in range(length)
    if i < length - 1 and phrases[i][1] == 'B' and phrases[i + 1][1] == 'I']
print(res)

The result is:

['Alexander Coughan', 'One Two']

1 Comment

Thanks for the extra help
1

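Here's a one-line solution. Note that it takes the first 'B' word and the first 'I' word without checking that they are adjacent: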

>>> t = [ ('wanna', 'o'),
... ('be', 'o'),
... ('like', 'o'),
... ('Alexander', 'B'),
... ('Coughan', 'I'),
... ('?', 'o')]
>>> x = [B[0] for B in t if B[1]=='B'][0] + ' ' + [I[0] for I in t if I[1]=='I'][0]
>>> print(x)
Alexander Coughan
>>> 


1

I hadn't seen @MykhayloKopytonenko's solution when I went to write mine, so mine is similar:

tuples = [('Wanna', 'O'),
          ('be', 'O'),
          ('like', 'O'),
          ('Alexander', 'B'),
          ('Coughan', 'I'),
          ('?', 'O'),
          ('foo', 'B'),
          ('bar', 'I'),
          ('baz', 'B'),]
results = [(t0[0], t1[0]) for t0, t1 in zip(tuples[:-1], tuples[1:])
                          if t0[1] == 'B' and t1[1] == 'I']
for r in results:
    print("%s %s" % r)

This outputs:

Alexander Coughan
foo bar

If you absolutely must have the results as strings, change the list comprehension to:

 results = ["%s %s" % (t0[0], t1[0]) for t0, t1 in zip(tuples[:-1], tuples[1:])
                               if t0[1] == 'B' and t1[1] == 'I']

This takes advantage of the fact that, based on your criteria, the last element of your list of tuples will never be returned as the first element of the result set. As a result, the zip effectively steps you through (tuples[n], tuples[n + 1]) so that you can easily examine the values.
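A short sketch of what that zip produces may help; the three-element `tuples` list here is a trimmed, illustrative version of the data above:

```python
tuples = [('Alexander', 'B'), ('Coughan', 'I'), ('baz', 'B')]

# Pair each element with its successor. zip stops at the shorter input,
# so zip(tuples, tuples[1:]) would yield exactly the same pairs.
pairs = list(zip(tuples[:-1], tuples[1:]))
print(pairs)
# [(('Alexander', 'B'), ('Coughan', 'I')), (('Coughan', 'I'), ('baz', 'B'))]

# The trailing ('baz', 'B') never appears as the first item of a pair,
# so it can never start a match.
joined = ["%s %s" % (t0[0], t1[0]) for t0, t1 in pairs
          if t0[1] == 'B' and t1[1] == 'I']
print(joined)  # ['Alexander Coughan']
```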
