
I'm fairly new to Python and am trying to put together a Markov chain generator. The part that's giving me trouble adds each word in a list to a dictionary, associated with the word that immediately follows it.

def trainMarkovChain():
    """Trains the Markov chain on the list of words, returning a dictionary."""
    words = wordList()
    Markov_dict = dict()
    for i in words:
        if i in Markov_dict:
            Markov_dict[i].append(words.index(i+1))
        else:
            Markov_dict[i] = [words.index(i+1)]
    print Markov_dict

wordList() is a previous function that turns a text file into a list of words. Just what it sounds like. I'm getting an error saying that I can't concatenate strings and integers, referring to words.index(i+1), but if that's not how to refer to the next item then how is it done?

  • Use enumerate() to get both index as well as item. list.index won't work as expected if your list contains duplicate items. Commented May 1, 2014 at 10:47
  • possible duplicate of Iterate a list as pair (current, next) in Python Commented May 1, 2014 at 10:50
  • words.index(i) + 1 is what you want, but this fails if there are duplicate words. Commented May 1, 2014 at 11:04
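
To make the points in these comments concrete, here is a minimal sketch. The sample list and the variable names are made up for illustration. Iterating over a list yields the items themselves (strings here), which is why i+1 raises the error in the question, and list.index only ever returns the position of the first match.

```python
# Hypothetical sample data with a repeated word.
words = ["the", "cat", "saw", "the", "dog"]

# list.index returns the position of the FIRST match, so the second
# "the" is looked up at position 0 and gets the wrong follower.
followers_via_index = [words[words.index(w) + 1] for w in words[:-1]]

# enumerate yields the real position of each item, duplicates included.
followers_via_enumerate = [words[i + 1] for i, w in enumerate(words[:-1])]

print(followers_via_index)      # ['cat', 'saw', 'the', 'cat']
print(followers_via_enumerate)  # ['cat', 'saw', 'the', 'dog']
```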

4 Answers


You can also do it as:

for a, b in zip(words, words[1:]):
    print(a, b)  # a is the current word, b is the word after it

On each iteration, a is an element of the list and b is the element that follows it.
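
Applied to the question, the pairing could be used roughly like this. This is only a sketch: train_markov_chain and the sample sentence are names invented for the example, and it takes the word list as an argument instead of calling the question's wordList().

```python
def train_markov_chain(words):
    """Map each word to the list of words that follow it (hypothetical helper)."""
    chain = {}
    # zip stops at the shorter input, so the last word is never the "current" one.
    for current, following in zip(words, words[1:]):
        chain.setdefault(current, []).append(following)
    return chain

sample = "the cat saw the dog".split()
chain = train_markov_chain(sample)
print(chain)  # {'the': ['cat', 'dog'], 'cat': ['saw'], 'saw': ['the']}
```

Note that the final word ("dog" here) gets no key of its own unless it also appears earlier in the list, so a generator built on this dictionary has to handle a missing key.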


3 Comments

Good approach, but zip(words, words[1:]) doesn't zip in the last word, as words[1:] is one element shorter.
@famousgarkin, but isn't that what the OP wants? They are looking up the next element, so the loop should stop at the second-to-last word so that it doesn't raise an error.
Yep, may not matter, just pointing out in case someone wonders. And +1 for simplicity.

The following code, simplified a bit, should produce what you require. I'll elaborate more if something needs explaining.

words = 'Trains the Markov chain on the list of words, returning a dictionary'.split()
chain = {}
for i, word in enumerate(words):
    # ensure there's a record
    next_words = chain.setdefault(word, [])
    # break on the last word
    if i + 1 == len(words):
        break
    # append the next word
    next_words.append(words[i + 1])

print(words)
print(chain)

assert len(chain) == 11
assert chain['the'] == ['Markov', 'list']
assert chain['dictionary'] == []

Comments

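def markov_chain(words):
    """Map each word to the list of words that follow it in `words`."""
    markov = {}
    for index, word in enumerate(words):
        # The last word has no successor, so skip it.
        if index < len(words) - 1:
            # Append rather than assign, so a repeated word keeps every follower.
            markov.setdefault(word, []).append(words[index + 1])
    return markov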

The code above takes a list as input and returns the corresponding Markov chain as a dictionary.

Comments


You can use a loop for that, but wrapping the rest of your code in a loop is wasteful when you only need the next element.

There are two nice options to avoid this:

Option 1 - if you know the next index, just index the list directly:

my_list[my_index]

Most of the time you won't know the index, but you may still want to avoid the for loop.

Option 2 - use an iterator:

my_iterator = iter(my_list)
next(my_iterator)    # no loop required
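
A slightly fuller sketch of the iterator option, with a made-up my_list. Each call to next() advances the iterator by one item, and passing a default as the second argument avoids a StopIteration once the items run out.

```python
my_list = ["the", "cat", "saw", "the", "dog"]  # hypothetical sample data
my_iterator = iter(my_list)

first = next(my_iterator)    # 'the'
second = next(my_iterator)   # 'cat'

# Whatever has not been consumed yet is still available.
remaining = list(my_iterator)  # ['saw', 'the', 'dog']

# The iterator is now exhausted; the default is returned instead of
# raising StopIteration.
after_end = next(my_iterator, None)
```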

Comments
