2

I am trying to read an xyz file into Python but keep getting this error message. I'm fairly new to Python, so I would love some help interpreting it!

def main():
    atoms = []
    coordinates = []
    name = input("Enter filename: ")
    xyz = open(name, 'r')
    n_atoms = xyz.readline()
    title = xyz.readline()
    for line in xyz:
        atom, x, y, z = line.split()
        atoms.append(atom)
        coordinates.append([float(x), float(y), float(z)])
    xyz.close()

    return atoms, coordinates


if __name__ == '__main__':
    main()

Error:
Traceback (most recent call last):
  File "Project1.py", line 25, in <module>
    main()
  File "Project1.py", line 16, in main
    atom, x, y, z = line.split()
ValueError: not enough values to unpack (expected 4, got 3)

I believe the ValueError occurs because some lines further down the file contain only 3 values, but I'm not sure why I am getting return errors.

2 Answers

4

One very important rule of thumb, especially in Python: don't reinvent the wheel; use existing libraries.

The xyz format is one of the few universally standardized file formats in chemistry, so IMHO you don't need any logic to determine the length of each line. The first line is an integer n_atoms giving the number of atoms, the second line is a comment line that is ignored, and the next n_atoms lines are [string, float, float, float], as you have already written in your code. A file that deviates from this is probably corrupted.
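As a minimal sketch of that layout using only the standard library (the function name `read_xyz_lines` and the sample water geometry are made up for illustration), the atom count from the first line can be used to read exactly that many coordinate lines, so trailing blank lines or extra data are never unpacked:

```python
import io


def read_xyz_lines(lines):
    """Parse xyz-formatted text from an iterable of lines (hypothetical helper)."""
    lines = iter(lines)
    n_atoms = int(next(lines))      # line 1: number of atoms
    title = next(lines).strip()     # line 2: comment line, kept but not interpreted
    atoms, coordinates = [], []
    for _ in range(n_atoms):        # read exactly n_atoms coordinate lines
        atom, x, y, z = next(lines).split()
        atoms.append(atom)
        coordinates.append([float(x), float(y), float(z)])
    return title, atoms, coordinates


# Illustrative data only: a water molecule followed by a trailing blank line.
sample = io.StringIO(
    "3\n"
    "water\n"
    "O 0.000 0.000 0.117\n"
    "H 0.000 0.757 -0.469\n"
    "H 0.000 -0.757 -0.469\n"
    "\n"
)
title, atoms, coordinates = read_xyz_lines(sample)
```

With a real file, the open file object can be passed in directly, since iterating over it yields lines in the same way.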

Using the pandas library you can simply write:

import pandas as pd
molecule = pd.read_table(inputfile, skiprows=2, sep=r'\s+',
                         names=['atom', 'x', 'y', 'z'])

Or you can use the chemcoord package, which has its own Cartesian class representing molecules in Cartesian coordinates:

import chemcoord as cc
molecule = cc.Cartesian.read_xyz(inputfile)

Disclaimer: I am the author of chemcoord.


1 Comment

Thank you very much! I'll definitely investigate libraries in the future!
2

You are getting the error because you unpack a list into four names in this line:

atom, x, y, z = line.split()

This only works if there are exactly 4 items in the line.

You have to define the logic for what happens when there are only 3 items in a line, like this (within the for loop):

for line in xyz:
    line_data = line.split()
    if len(line_data) == 3:
        # Behavior when only 3 items in a line goes here!
        # Add your code here!
        continue

    atom, x, y, z = line_data
    atoms.append(atom)
    coordinates.append([float(x), float(y), float(z)])

What your program does when it encounters a line with only 3 items depends on what you want it to do.
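For instance, if a line with anything other than 4 items should be treated as an error, one possible sketch reports the offending line number instead of failing on the unpacking itself. The helper name `parse_atom_line`, the error message, and the choice to skip blank lines silently are all illustrative assumptions:

```python
def parse_atom_line(line, line_number):
    """Return (atom, [x, y, z]) for one coordinate line, or None for a blank line."""
    line_data = line.split()
    if not line_data:
        return None  # assumption: blank lines are silently skipped
    if len(line_data) != 4:
        raise ValueError(
            f"line {line_number}: expected 4 items, got {len(line_data)}: {line!r}"
        )
    atom, x, y, z = line_data
    return atom, [float(x), float(y), float(z)]


atoms = []
coordinates = []
# Illustrative lines; in the question's code these would come from the open file.
# Numbering starts at 3 because the first two lines of an xyz file are the header.
for number, line in enumerate(["C 0.0 0.0 0.0\n", "\n", "H 0.0 0.0 1.09\n"], start=3):
    parsed = parse_atom_line(line, number)
    if parsed is None:
        continue
    atom, xyz_values = parsed
    atoms.append(atom)
    coordinates.append(xyz_values)
```

A line with 3 items then produces a message naming the line, which makes it easier to see what the file actually contains at that point.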

5 Comments

It'd probably be good to also check for the len(line_data) == 4 case
@cricket_007 The == 4 case is already implemented. If the length is neither 3 nor 4, the program will (correctly) crash, thereby alerting the programmer that code has to be added for the new format (say, 5 items) too.
I would unpack the list to 4 items only if len(line_data) == 4. Otherwise report an unexpected format: "len(line_data) != 4". I think it is weird to handle only the unexpected format of 3 items and let the program crash on 5, 6, 7, 8, etc.
Can this be generalized with the splat operator?
@mikey That depends on the format. Yes if the format is something like atom x y z comment1 comment2 comment3.... No if a variable number of elements per line does not make sense.
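To illustrate the first case: assuming a format in which anything after the three coordinates is free-form comment text (an assumption, not part of the standard xyz layout), a starred target collects the extra items. The example line is made up:

```python
line = "H 0.0 0.757 -0.469 comment1 comment2"  # made-up example line

# The starred name absorbs zero or more trailing items as a list.
atom, x, y, z, *comments = line.split()
coordinate = [float(x), float(y), float(z)]
```

Note that a line with fewer than 4 items still raises ValueError, so this only relaxes the upper bound on the number of items.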
