
I am working in awk on a problem where I need to list a set of items, each of which has components, which in turn have properties.

I have tried two approaches.

1) Define an array list[item-number,component-number][properties].
2) Define an array list[item-number][component-number][properties].

This was interesting in many ways, as I noticed that (2) maintains the order of insertion while (1) does not. I know arrays are associative in awk, so it could very well be a coincidence. However, since the order of insertion is important in my case (and I also want to learn more about awk), I would like to know whether this is really what is happening, and why.

Any ideas? Best regards, Patrik


1 Answer


Neither approach retains any information about the order of insertion. If it seems like either does, that is just coincidence. If the order of insertion is important to you, then you need to write some code to track that order, e.g.

key = foo FS bar
if ( !(key in list) ) {
    keys[++numKeys] = key
}
list[key] = whatever

would give you an array keys[] of the indices in the order they were inserted and an array list[] that maps each key to its value, so you can later do:

for (keyNr=1; keyNr<=numKeys; keyNr++) {
    key = keys[keyNr]
    print list[key]
}

or similar to print the contents of list[] in the order they were inserted.
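Putting the two fragments above together, here is a minimal runnable sketch. The input layout (one "item component property" triple per line) and the shell function name ordered_list are assumptions made for illustration, not part of the original question:

```shell
# Hypothetical helper: reads "item component property" lines on stdin and
# prints each distinct (item, component) key in first-seen order.
ordered_list() {
    awk '
    {
        key = $1 FS $2                 # composite key: item and component
        if ( !(key in list) ) {
            keys[++numKeys] = key      # record the key the first time it appears
        }
        list[key] = $3                 # a later value overwrites an earlier one
    }
    END {
        # Walk the numeric index instead of "for (key in list)",
        # whose traversal order is unspecified.
        for (keyNr = 1; keyNr <= numKeys; keyNr++) {
            key = keys[keyNr]
            print key, list[key]
        }
    }'
}

printf '%s\n' '2 1 red' '1 3 blue' '2 1 green' '1 1 small' | ordered_list
```

With this sample input the keys come out as "2 1", "1 3", "1 1", i.e. in the order they were first seen, and "2 1" carries the last value assigned to it (green).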


5 Comments

I think I will try this approach for the order. Unfortunately this seems to be the Achilles heel of awk. Thanks for the tip!
Dear Ed, why do you prefer FS over SUBSEP?
@patrik you're welcome, but it's not an Achilles heel at all. Why track insertion order and use up time and memory for everyone when very few applications need it and it's trivial to write code to support it if/when you do want it? Awk is a tiny language that executes extremely fast (typically faster than equivalent C programs) due to its philosophy of only providing language constructs to do things that are difficult to do with existing constructs, and this isn't close to being difficult to do. So it's a "pro", not a "con".
@kvantour because, though unlikely, SUBSEP can appear in your input fields while the default FS can't. It's also a couple of chars shorter to type and when you want to print the array indices a blank or other string is easier to see than SUBSEP. If you're using a regexp for FS then the equation changes and then I'd consider using OFS vs SUBSEP.
Also important to mention: if the key is generated from floating point numbers, you might want to adjust CONVFMT.
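
To illustrate the CONVFMT remark, here is a small sketch. The function names convfmt_demo and count and the sample values are made up for the demonstration. A non-integer number used as an array subscript is first converted to a string using CONVFMT (default "%.6g"), so two different values that agree to six significant digits end up as the same key:

```shell
# Hypothetical demo: count how many distinct keys two nearby
# floating point values produce under different CONVFMT settings.
convfmt_demo() {
    awk '
    function count(arr,   k, n) { n = 0; for (k in arr) n++; return n }
    BEGIN {
        # Both values convert to "0.333333" with the default "%.6g".
        narrow[1/3]; narrow[1/3 + 1e-9]
        print "default CONVFMT:", count(narrow)

        # Ten significant digits are enough to tell them apart.
        CONVFMT = "%.10g"
        wide[1/3]; wide[1/3 + 1e-9]
        print "CONVFMT=%.10g:", count(wide)
    }'
}

convfmt_demo
```

The first line reports 1 key and the second reports 2, so whether two floating point keys collide depends on the precision CONVFMT keeps.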
