0

I am having trouble getting the logic right for a method I'm trying to write and I thought it might be a fun problem for someone else to have a look at. The target language is Java, though I'm just going to present it generically so as not to discourage anyone from sharing their thoughts.

The input to the method is an arbitrary number of hierarchical sets of data and the output is a list of combinations of members of that data where each combination has one member from each set of hierarchical data. However, every possible combination isn't included. The key to building the combinations is whether or not a given element is a leaf member in the hierarchy or not and each element of each set knows its name and whether or not it is a leaf. First, I'll give an example and then I'll try to articulate the rules. For concreteness, we'll consider that the input is the following three sets:

A1       B1            C1
  A10      B10           C10
  A11         B100       C11
              B101       C12
              B102       C13
              B103
              B104
              B105
           B11
              B110
              B111
              B112
              B113
              B114

The corresponding output would look like this:

(1)   A1  B1   C1        (37)  A10 B111 C11    (73)   A11 B103 C10  
(2)   A10 B1   C1        (38)  A10 B111 C12    (74)   A11 B103 C11 
(3)   A10 B10  C1        (39)  A10 B111 C13    (75)   A11 B103 C12 
(4)   A10 B100 C1        (40)  A10 B112 C1     (76)   A11 B103 C13 
(5)   A10 B100 C10       (41)  A10 B112 C10    (77)   A11 B104 C1    
(6)   A10 B100 C11       (42)  A10 B112 C11    (78)   A11 B104 C10 
(7)   A10 B100 C12       (43)  A10 B112 C12    (79)   A11 B104 C11 
(8)   A10 B100 C13       (44)  A10 B112 C13    (80)   A11 B104 C12 
(9)   A10 B101 C1        (45)  A10 B113 C1     (81)   A11 B104 C13 
(10)  A10 B101 C10       (46)  A10 B113 C10    (82)   A11 B11 C1  
(11)  A10 B101 C11       (47)  A10 B113 C11    (83)   A11 B110 C1    
(12)  A10 B101 C12       (48)  A10 B113 C12    (84)   A11 B110 C10 
(13)  A10 B101 C13       (49)  A10 B113 C13    (85)   A11 B110 C11 
(14)  A10 B102 C1        (50)  A10 B114 C1     (86)   A11 B110 C12 
(15)  A10 B102 C10       (51)  A10 B114 C10    (87)   A11 B110 C13 
(16)  A10 B102 C11       (52)  A10 B114 C11    (88)   A11 B111 C1    
(17)  A10 B102 C12       (53)  A10 B114 C12    (89)   A11 B111 C10 
(18)  A10 B102 C13       (54)  A10 B114 C13    (90)   A11 B111 C11 
(19)  A10 B103 C1        (55)  A11 B1 C1       (91)   A11 B111 C12 
(20)  A10 B103 C10       (56)  A11 B10 C1     (92)   A11 B111 C13 
(21)  A10 B103 C11       (57)  A11 B100 C1     (93)   A11 B112 C1    
(22)  A10 B103 C12       (58)  A11 B100 C10    (94)   A11 B112 C10 
(23)  A10 B103 C13       (59)  A11 B100 C11    (95)   A11 B112 C11 
(24)  A10 B104 C1        (60)  A11 B100 C12    (96)   A11 B112 C12 
(25)  A10 B104 C10       (61)  A11 B100 C13    (97)   A11 B112 C13 
(26)  A10 B104 C11       (62)  A11 B101 C1     (98)   A11 B113 C1    
(27)  A10 B104 C12       (63)  A11 B101 C10    (99)   A11 B113 C10 
(28)  A10 B104 C13       (64)  A11 B101 C11    (100)  A11 B113 C11
(29)  A10 B11  C1        (65)  A11 B101 C12    (101)  A11 B113 C12
(30)  A10 B110 C1        (66)  A11 B101 C13    (102)  A11 B113 C13
(31)  A10 B110 C10       (67)  A11 B102 C1     (103)  A11 B114 C1   
(32)  A10 B110 C11       (68)  A11 B102 C10    (104)  A11 B114 C10
(33)  A10 B110 C12       (69)  A11 B102 C11    (105)  A11 B114 C11
(34)  A10 B110 C13       (70)  A11 B102 C12    (106)  A11 B114 C12
(35)  A10 B111 C1        (71)  A11 B102 C13    (107)  A11 B114 C13
(36)  A10 B111 C10       (72)  A11 B103 C1    

Though I understand the pattern unambiguously, it's hard for me to articulate the rules for the same reason that it's hard for me to write the method. Basically, if an elemnet is a non leaf, it's only used to build combinations with the root element from the sets to it's right. If instead an element is a leaf, it's crossed with all of the posisble combinations to it's right. Obviously the problem is recursive and I feel like the psuedo code should look something like this:

buildCombinations(element, setNumber, numberOfSets)
{
  if (element.IsLeaf( ))
  {
    List list = buildCombinations(elementFromNextList, setNumber++, numberOfSets)
    CrossJon element with list
  }
  else
  {
    IfOnlyIKnew!
  }
}

But I've been staring at it for two days and I just can't quite wrap my head around it. I supsect that I've not done an adaquate job of explaining the situation so please feel free to ask questions for clarification. In any event, this strikes me as the kind of problem that might be obvious to someone a bit smarter than me and I'd love it if anyone can offer any suggestions.

Thanks in advance!

4
  • 1
    What data structure(s) hold the nested lists/trees/whatever? I feel like the pseudocode abstraction has sort of obfuscated some of the seemingly important details like this. Final output is an array/list? Commented May 25, 2020 at 1:26
  • 1
    A11 East C1 contains East? Commented May 25, 2020 at 1:54
  • East should be B10. I've fixed the post. Good eye! Commented May 25, 2020 at 2:02
  • I write a detailed response to ggorlen's question but it was too long and got kicked back. (still finding my way around stackoverfow) Short version: Each set is an ArrayList of MyElement instances where each MyElement has only two attributes, name and isLeaf. And the output could be an ArrayList of String arrays where each String array has 3 elements. Commented May 25, 2020 at 2:05

1 Answer 1

1

You forgot to include B105 in your output. That will add 10 additional lines to the output -

A10 B105 C1
A10 B105 C10
A10 B105 C11
A10 B105 C12
A10 B105 C13

A11 B105 C1
A11 B105 C10
A11 B105 C11
A11 B105 C12
A11 B105 C13

That mistake aside, this problem is a small difference from cartesian product for which there are many well-known algorithms.

The key difference is that a leaf node will never appear after a tree node. One naive algorithm would generate the cartesian product of all tree nodes and then remove the ones where leafs appear after trees.

But we can do better. Instead of wastefully visiting paths that will be pruned from the output, we can intelligently skip ahead when a tree node is encountered -


JavaScript

Here's a quick implementation using JavaScript -

const traverse = ([ t, ...more ], r = []) =>
  // base case: empty t
  t === undefined
    ? [ r ]

  // inductive: t.leaf
  : t.leaf
      ? traverse(more, [...r, t.value])

  // inductive: t.tree
  : [ [ ...r, t.value, ...more.map(t => t.value) ] // <--
    , ...t.children.flatMap(c => traverse([ c, ...more ], r))
    ]

The input to our traverse function is an array of leaf or tree nodes -

const leaf = (value) =>
  ({ leaf, value })

const tree = (value, ...children) =>
  ({ tree, value, children })

Given the trees in your original post -

A1       B1            C1
  A10      B10           C10
  A11         B100       C11
              B101       C12
              B102       C13
              B103
              B104
              B105
           B11
              B110
              B111
              B112
              B113
              B114

We can represent them using tree and leaf like so -

const treeA =
  tree("A1", leaf("A10"), leaf("A11"))

const treeB =
  tree
    ( "B1"
    , tree("B10", leaf("B100"), leaf("B101"), leaf("B102"), leaf("B103"), leaf("B104"), leaf("B105"))
    , tree("B11", leaf("B110"), leaf("B111"), leaf("B112"), leaf("B113"), leaf("B114"))
    )

const treeC =
  tree("C1", leaf("C10"), leaf("C11"), leaf("C12"), leaf("C13"))
const result =
  traverse([ treeA, treeB, treeC ]) // [ [ "A1 B1 C1" ], [ ... ], ... ]
    .map(path => path.join(" "))    // [ "A1 B1 C1", "A10 B1 C1", ... ]

console.log(result.join("\n"))  // "A1 B1 C1\nA10 B1 C1\n..."
console.log(result.length)      // 117

Expand the snippet below to verify the result in your browser -

const leaf = (value) =>
  ({ leaf, value })

const tree = (value, ...children) =>
  ({ tree, value, children })

const traverse = ([ t, ...more ], r = []) =>
  // base case: empty t
  t === undefined
    ? [ r ]

  // inductive: t.leaf
  : t.leaf
      ? traverse(more, [...r, t.value])

  // inductive: t.tree
  : [ [ ...r, t.value, ...more.map(t => t.value) ]
    , ...t.children.flatMap(c => traverse([ c, ...more ], r))
    ]

const treeA =
  tree("A1", leaf("A10"), leaf("A11"))

const treeB =
  tree
    ( "B1"
    , tree("B10", leaf("B100"), leaf("B101"), leaf("B102"), leaf("B103"), leaf("B104"), leaf("B105"))
    , tree("B11", leaf("B110"), leaf("B111"), leaf("B112"), leaf("B113"), leaf("B114"))
    )

const treeC =
  tree("C1", leaf("C10"), leaf("C11"), leaf("C12"), leaf("C13"))

const result =
  traverse([ treeA, treeB, treeC ])
    .map(path => path.join(" "))

console.log(result.join("\n"))
console.log(result.length)

Output -

A1 B1 C1
A10 B1 C1
A10 B10 C1
A10 B100 C1
A10 B100 C10
A10 B100 C11
A10 B100 C12
A10 B100 C13
A10 B101 C1
A10 B101 C10
A10 B101 C11
A10 B101 C12
A10 B101 C13
A10 B102 C1
A10 B102 C10
A10 B102 C11
A10 B102 C12
A10 B102 C13
A10 B103 C1
A10 B103 C10
A10 B103 C11
A10 B103 C12
A10 B103 C13
A10 B104 C1
A10 B104 C10
A10 B104 C11
A10 B104 C12
A10 B104 C13
A10 B105 C1
A10 B105 C10
A10 B105 C11
A10 B105 C12
A10 B105 C13
A10 B11 C1
A10 B110 C1
A10 B110 C10
A10 B110 C11
A10 B110 C12
A10 B110 C13
A10 B111 C1
A10 B111 C10
A10 B111 C11
A10 B111 C12
A10 B111 C13
A10 B112 C1
A10 B112 C10
A10 B112 C11
A10 B112 C12
A10 B112 C13
A10 B113 C1
A10 B113 C10
A10 B113 C11
A10 B113 C12
A10 B113 C13
A10 B114 C1
A10 B114 C10
A10 B114 C11
A10 B114 C12
A10 B114 C13
A11 B1 C1
A11 B10 C1
A11 B100 C1
A11 B100 C10
A11 B100 C11
A11 B100 C12
A11 B100 C13
A11 B101 C1
A11 B101 C10
A11 B101 C11
A11 B101 C12
A11 B101 C13
A11 B102 C1
A11 B102 C10
A11 B102 C11
A11 B102 C12
A11 B102 C13
A11 B103 C1
A11 B103 C10
A11 B103 C11
A11 B103 C12
A11 B103 C13
A11 B104 C1
A11 B104 C10
A11 B104 C11
A11 B104 C12
A11 B104 C13
A11 B105 C1
A11 B105 C10
A11 B105 C11
A11 B105 C12
A11 B105 C13
A11 B11 C1
A11 B110 C1
A11 B110 C10
A11 B110 C11
A11 B110 C12
A11 B110 C13
A11 B111 C1
A11 B111 C10
A11 B111 C11
A11 B111 C12
A11 B111 C13
A11 B112 C1
A11 B112 C10
A11 B112 C11
A11 B112 C12
A11 B112 C13
A11 B113 C1
A11 B113 C10
A11 B113 C11
A11 B113 C12
A11 B113 C13
A11 B114 C1
A11 B114 C10
A11 B114 C11
A11 B114 C12
A11 B114 C13
117       // results.length

from JavaScript to Java

I don't know Java but I can help rewrite the program above in a way that might be easier to translate -

JavaScript                           Java
================================================================
Array.prototype.concat               List.AddAll, Stream.concat
Array.prototype.map                  Stream.map
Array.prototype.flatMap              Stream.flatMap
Array.prototype.join                 String.join
(c => ___)                           Lambda (c -> ___)
[ foo, ...bar ]                      (no equivalent, see below)
const traverse = (trees = []) =>
  // call traverseHelper with trees and initial path for all nodes, []
  traverseHelper(trees, [])
function traverseHelper ([ t, ...more ], r = [])
{ // when no t
  if (t === undefined)
    return [ r ]

  // when t.leaf
  else if (t.leaf)
    return traverseHelper(more, [ ...r, t.value ])

  // when t.tree
  else
  { // when t.tree is encountered, do not add t.value to the children paths
    // simply copy the .value of each tree to the path for this node
    const path = r.concat([ t.value ]).concat(more.map(t => t.value))

    // return the path for this node, combined wit the results for
    // the children nodes.
    return [ path ]
      .concat(t.children.flatMap(c => traverseHelper([ c, ...more ], r)))
  }
}

To my knowledge, Java does not support spread syntax. These are just syntactic sugar for common array manipulations -

function foo(x, y, ...z) {
  console.log("foo x:", String(x))
  console.log("foo y:", String(y))
  console.log("foo z:", String(z))
}

function bar([ x, ...more ]) {
  console.log("bar x:", String(x))
  console.log("bar more:", String(more))
}

const a = [1,2,3]
const b = 4
const c = [5,6]

foo(...a, b, ...c)   // <-- foo(1,2,3,4,5,6)
// x: 1
// y: 2
// z: 3,4,5,6

const d = [...a, b, ...c] // <-- [1,2,3,4,5,6]
bar(d)
// x: 1
// more: 2,3,4,5,6


racket/scheme

Racket is a great language for discussing and sharing algorithms because there is almost no syntax and programs are made up of simple parts -

#lang racket

(define (traverse ts (r null))
  (cond 
    ;; base case: empty
    ((null? ts)  
      (list r))

    ;; inductive: leaf
    ((null? (cdar ts)) 
      (traverse (cdr ts)
                (cons (caar ts) r)))

    ;; inductive: tree
    (else                
      (cons (append (reverse (map car ts)) r)
            (append-map (lambda (c)
                          (traverse (cons c (cdr ts)) r))
                        (cdar ts))))))

This is almost a direct translation of the JavaScript code with the exception that path results are constructed in reverse order.

(define a
  '(a1 (a10) (a11)))

(define b
  '(b1 (b10 (b100) (b101) (b102) (b103) (b104) (b105))
       (b11 (b110) (b111) (b112) (b113) (b114))))

(define c
  '(c1 (c10) (c11) (c12) (c13)))

(define result
  (map reverse (traverse (list a b c))))

result

(map reverse ...) is used to display the results using the correct ordering -

'((a1 b1 c1)
  (a10 b1 c1)
  (a10 b10 c1)
  (a10 b100 c1)
  (a10 b100 c10)
  (a10 b100 c11)
  (a10 b100 c12)
  (a10 b100 c13)
  (a10 b101 c1)
  (a10 b101 c10)
  (a10 b101 c11)
  (a10 b101 c12)
  (a10 b101 c13)
  (a10 b102 c1)
  (a10 b102 c10)
  (a10 b102 c11)
  (a10 b102 c12)
  (a10 b102 c13)
  (a10 b103 c1)
  (a10 b103 c10)
  (a10 b103 c11)
  (a10 b103 c12)
  (a10 b103 c13)
  (a10 b104 c1)
  (a10 b104 c10)
  (a10 b104 c11)
  (a10 b104 c12)
  (a10 b104 c13)
  (a10 b105 c1)
  (a10 b105 c10)
  (a10 b105 c11)
  (a10 b105 c12)
  (a10 b105 c13)
  (a10 b11 c1)
  (a10 b110 c1)
  (a10 b110 c10)
  (a10 b110 c11)
  (a10 b110 c12)
  (a10 b110 c13)
  (a10 b111 c1)
  (a10 b111 c10)
  (a10 b111 c11)
  (a10 b111 c12)
  (a10 b111 c13)
  (a10 b112 c1)
  (a10 b112 c10)
  (a10 b112 c11)
  (a10 b112 c12)
  (a10 b112 c13)
  (a10 b113 c1)
  (a10 b113 c10)
  (a10 b113 c11)
  (a10 b113 c12)
  (a10 b113 c13)
  (a10 b114 c1)
  (a10 b114 c10)
  (a10 b114 c11)
  (a10 b114 c12)
  (a10 b114 c13)
  (a11 b1 c1)
  (a11 b10 c1)
  (a11 b100 c1)
  (a11 b100 c10)
  (a11 b100 c11)
  (a11 b100 c12)
  (a11 b100 c13)
  (a11 b101 c1)
  (a11 b101 c10)
  (a11 b101 c11)
  (a11 b101 c12)
  (a11 b101 c13)
  (a11 b102 c1)
  (a11 b102 c10)
  (a11 b102 c11)
  (a11 b102 c12)
  (a11 b102 c13)
  (a11 b103 c1)
  (a11 b103 c10)
  (a11 b103 c11)
  (a11 b103 c12)
  (a11 b103 c13)
  (a11 b104 c1)
  (a11 b104 c10)
  (a11 b104 c11)
  (a11 b104 c12)
  (a11 b104 c13)
  (a11 b105 c1)
  (a11 b105 c10)
  (a11 b105 c11)
  (a11 b105 c12)
  (a11 b105 c13)
  (a11 b11 c1)
  (a11 b110 c1)
  (a11 b110 c10)
  (a11 b110 c11)
  (a11 b110 c12)
  (a11 b110 c13)
  (a11 b111 c1)
  (a11 b111 c10)
  (a11 b111 c11)
  (a11 b111 c12)
  (a11 b111 c13)
  (a11 b112 c1)
  (a11 b112 c10)
  (a11 b112 c11)
  (a11 b112 c12)
  (a11 b112 c13)
  (a11 b113 c1)
  (a11 b113 c10)
  (a11 b113 c11)
  (a11 b113 c12)
  (a11 b113 c13)
  (a11 b114 c1)
  (a11 b114 c10)
  (a11 b114 c11)
  (a11 b114 c12)
  (a11 b114 c13))
Sign up to request clarification or add additional context in comments.

5 Comments

Thanks! I can't tell you how much I appreciate it. I actually had considered building the whole cross product and pruning the results, but the actual code involves a bunch of extra processing including db access every time a combination is constructed so paying that overhead for combinations I would just throw away anyway didn't make sense. The only problem now is that I don't know any javascript! But it shouldn't be too much work to suss out what's happening. Thanks again, I really appreciate it!
Delighted to help. If you run into any trouble, let me know and I'll try to answer follow up questions :D
So, I've spent several hours working through js tutorials and I understand a lot of the code, but not all of it. Can you explain what's happening in the t.tree case? Specifically, what is returned from the traverse function?
@user2801442 like Java, JavaScript is a messy but versatile language. Hopefully you learned something fun! I added some notes about converting the function to Java
Thanks again. I did finally get this working. As I was working through my crash course in javascript, one of the things that confused me was whether the ...s were functioning as the spread operator or the rest operator. Anyway, I think that I understand about 90% of the js at this point so I'll be satisfied with that!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.