I am trying to sort an array and then split it into groups in Python.

Example:

I have a data file like this that I will import:

x   y   z
1   3   83
2   4   38
8   1   98
3   87  93
4   1   73
1   3   67
9   9   18
1   4   83
9   3   93
8   2   47

I want it to first look like this:

x   y   z
1   3   83
1   3   67
1   4   83
2   4   38
3   87  93
4   1   73
8   1   98
8   2   47
9   3   93
9   9   18

So the rows are sorted by the x column in ascending order, with ties broken by the y column.

Finally, I want to build an array out of these rows, grouped by their x value. Can I do that?

So I would end up with:

array[0] = [[1, 3, 83],[1, 3, 67],[1, 4, 83]]
array[1] = [[2, 4, 38]]
array[2] = [[3, 87, 93]]
array[3] = [[4, 1, 73]]
array[4] = [[8, 1, 98],[8,2,47]]

and so on...

Starting out:

import numpy as np
import matplotlib.pyplot as plt

data_file_name = 'whatever.dat'

data = np.loadtxt(data_file_name, skiprows=1)  # skip the "x y z" header line
  • Can you please provide a minimal reproducible example so we can assist with the issues you are having in your implementation attempt? Commented Apr 1, 2016 at 19:15
  • Are you willing to use the Pandas package, or do you want a pure Python solution? Commented Apr 1, 2016 at 19:19
  • Pure python would be the best -- thank you kindly Commented Apr 1, 2016 at 19:19

2 Answers

Here is a numpy solution (given that you used it for loading the data):

import numpy as np

data_file_name = 'whatever.dat'
data = np.loadtxt(data_file_name, 
                  skiprows=1, 
                  dtype=[('x', float), ('y', float), ('z', float)])

data.sort(axis=0, order=['x', 'y', 'z'])

# A set has no guaranteed order, so sort the unique x values before numbering them.
unique_x_col_vals = sorted(set(row[0] for row in data))
array = {n: [list(row) for row in data if row[0] == val] 
            for n, val in enumerate(unique_x_col_vals)}

>>> array
{0: [[1.0, 3.0, 67.0], [1.0, 3.0, 83.0], [1.0, 4.0, 83.0]],
 1: [[2.0, 4.0, 38.0]],
 2: [[3.0, 87.0, 93.0]],
 3: [[4.0, 1.0, 73.0]],
 4: [[8.0, 1.0, 98.0], [8.0, 2.0, 47.0]],
 5: [[9.0, 3.0, 93.0], [9.0, 9.0, 18.0]]}

The dictionary comprehension creates one entry per unique value in column x, and the inner list comprehension collects every row whose x equals that value.

I've used floats when importing the data, but you can also specify int if that matches your data.
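
Since the comments ask for pure Python, the same sort-then-group idea can be sketched with only the standard library. This is a minimal sketch, not a drop-in replacement: the names raw_text, load_rows and groups are made up for illustration, and the sample rows are pasted inline from the question instead of being read from the data file.

```python
from itertools import groupby

# Sample rows copied from the question (assumption: a real script would
# read this text from the data file instead).
raw_text = """x   y   z
1   3   83
2   4   38
8   1   98
3   87  93
4   1   73
1   3   67
9   9   18
1   4   83
9   3   93
8   2   47
"""


def load_rows(text):
    """Parse whitespace-separated integer rows, skipping the header line."""
    lines = text.strip().splitlines()[1:]
    return [[int(field) for field in line.split()] for line in lines]


# Lists compare element by element, so this sorts by x, then y, then z.
rows = sorted(load_rows(raw_text))

# groupby only merges adjacent items, which is why the rows are sorted first.
groups = [list(group) for _, group in groupby(rows, key=lambda row: row[0])]

for index, group in enumerate(groups):
    print(index, group)
```

Here groups is a plain list, so groups[0] holds every row with the smallest x, groups[1] the next x value, and so on.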

You can use pandas for this, with just a couple of lines of code:

import pandas as pd

df = pd.read_csv(txt, sep=r"\s+")  # txt: path to the data file; \s+ matches runs of whitespace
print(df.sort_values(['x', 'y'], ascending=[True, True]))
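
Sorting alone does not produce the per-x groups the question asks for. One possible way to get them, assuming a reasonably recent pandas, is to group the sorted frame by x. The names sample, df_sorted and grouped are illustrative, and the data is read from an in-memory string copied from the question so the sketch runs without a file.

```python
import io

import pandas as pd

# In-memory copy of the question's data (assumption: normally a file path
# would be passed to read_csv instead).
sample = io.StringIO("""x   y   z
1   3   83
2   4   38
8   1   98
3   87  93
4   1   73
1   3   67
9   9   18
1   4   83
9   3   93
8   2   47
""")

df = pd.read_csv(sample, sep=r"\s+")

# Sort by x, then y, then z so rows inside each group are fully ordered.
df_sorted = df.sort_values(["x", "y", "z"])

# groupby yields (key, sub-frame) pairs in ascending key order by default;
# .values.tolist() turns each sub-frame into a list of plain Python rows.
grouped = [frame.values.tolist() for _, frame in df_sorted.groupby("x")]

print(grouped[0])
```

The result is a list of lists of rows, indexed the same way as array[0], array[1], ... in the question.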

1 Comment

A pure Python solution would be better for me in this particular case.
