1
Student Fruits
Foo Apple,Banana
Bar Orange
Baz Lemon,Orange

How do I quantify/analyze the column Fruits with multiple values in a single cell separated by comma?

I created multiple helper columns, then split the values like so:

Student Fruits fruit1 fruit2
Foo Apple,Banana Apple Banana
Bar Orange Orange
Baz Lemon,Orange Lemon Orange

This leads to both horizontally and vertically entered data. So if I need to, say, find out the fruits for 'Foo' in a conventional way, I'd have to restructure this so that each fruit is in its own row to make it spreadsheet readable. It also doesn't translate well if I need to create pivot tables.

Is there a better approach for this kind of task?

1

2 Answers 2

3

The data should preferably be normalized to the first-normal-form:

Student Fruits
Foo Apple
Foo Banana
Bar Orange
Baz Lemon
Baz Orange

You can try formulas like the one below to make the transformation.

=QUERY(
  WRAPROWS(
   FLATTEN(
    MAP(
     A1:A4,
     B1:B4,
     LAMBDA(a,b,
      TOROW(
       REDUCE(,SPLIT(b,","),
        LAMBDA(ac,cur,
          VSTACK(ac,{a,cur}))))))),2),
 "where Col1<>''")
Sign up to request clarification or add additional context in comments.

Comments

2

In these cases I usually restructure the data in a more spreadsheet-friendly format, using a formula along the lines of:

=ARRAYFORMULA(
   QUERY(
     SPLIT(TOCOL(A2:A4&"|"&SPLIT(B2:B4,",")),"|"),
     "where Col2 is not null"))

enter image description here

Depending on what exactly you're trying to do this may not always be needed as there are cases in which it's not difficult to work with the unstructured data.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.