I have the following tree algorithm which prints the conditions for each leaf:

def _grow_tree(self, X, y, depth=0):
    # Identify best split
    idx, thr = self._best_split(X, y)

    # Indentation for tree description
    indent = "    " * depth

    indices_left = X.iloc[:, idx] < thr
    X_left = X[indices_left]
    y_left = y_train[X_left.reset_index().loc[:,'id'].values]

    X_right = X[~indices_left]
    y_right = y_train[X_right.reset_index().loc[:,'id'].values]

    self.tree_describe.append(indent +"if x['"+ X.columns[idx] + "'] <= " +\
                             str(thr) + ':')
    # Grow on left side of the tree  
    node.left = self._grow_tree(X_left, y_left, depth + 1)

    self.tree_describe.append(indent +"else: #if x['"+ X.columns[idx] + "'] > " +\
                         str(thr) + ':')
    # Grow on right side of the tree
    node.right = self._grow_tree(X_right, y_right, depth + 1)

    return node

This produces the following print for a particular case:

["if x['VAR1'] <= 0.5:",
 "    if x['VAR2'] <= 0.5:",
 "    else: #if x['VAR2'] > 0.5:",
 "else: #if x['VAR1'] > 0.5:",
 "    if x['VAR3'] <= 0.5:",
 "    else: #if x['VAR3'] > 0.5:"]

How could I obtain the following output?:

["if x['VAR1'] <= 0.5:",
 "    if x['VAR1'] <= 0.5&x['VAR2'] <= 0.5:",
 "    else: #if x['VAR1'] <= 0.5&x['VAR2'] > 0.5:",
 "else: #if x['VAR1'] > 0.5:",
 "    if x['VAR1'] > 0.5&x['VAR3'] <= 0.5:",
 "    else: #if x['VAR1'] > 0.5&x['VAR3'] > 0.5:"]
  • Didn't you make the indentation because you didn't want this output? Because now indent shows which item is a child of another, and now you want to repeat the x['VAR1'] <= 0.5/x['VAR1'] > 0.5 parts to show that. Commented Feb 25, 2020 at 14:59
  • Initially yes, but now I intend to use the pandas query function to create leaf columns based on conditions and the first way is not practical. I need for each leaf to have all the conditions. Commented Feb 25, 2020 at 15:07
  • So just forward it like you did with depth. Depth grows +1, description elements will grow with conditions. Commented Feb 25, 2020 at 15:25

1 Answer

You could add a new argument to your function that carries the higher-level condition(s) as a string, so that each deeper condition can have them appended. I would also suggest using .format() to build the strings:

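def _grow_tree(self, X, y, depth=0, descr=""):
    # descr holds the ancestors' conditions, each prefixed with " & "
    # (it is empty at the root)

    idx, thr = self._best_split(X, y)

    indent = "    " * depth

    # The split into X_left/y_left and X_right/y_right is the same as in
    # the question and omitted here.

    # Left branch: this node's test, followed by the ancestors' conditions
    cond = "x['{}'] <= {}{}".format(X.columns[idx], thr, descr)
    self.tree_describe.append("{}if {}:".format(indent, cond))

    node.left = self._grow_tree(X_left, y_left, depth + 1, " & " + cond)

    # Right branch: the complementary test, again followed by the ancestors
    cond = "x['{}'] > {}{}".format(X.columns[idx], thr, descr)
    self.tree_describe.append("{}else: #if {}:".format(indent, cond))

    node.right = self._grow_tree(X_right, y_right, depth + 1, " & " + cond)

    return node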
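
To see the idea in isolation, here is a minimal, self-contained sketch. It does not use the real _grow_tree or _best_split: the TREE tuple and the describe function are hypothetical stand-ins, with the splits hard-coded to match the example in the question. Unlike the function above, which puts the ancestors' conditions after the node's own test, this variant puts them first so the lines follow the order asked for in the question.

```python
# Hypothetical stand-in for a fitted tree: (column, threshold, left, right),
# with None marking a leaf. The real _grow_tree would find the splits itself.
TREE = ("VAR1", 0.5,
        ("VAR2", 0.5, None, None),
        ("VAR3", 0.5, None, None))


def describe(node, depth=0, prefix=""):
    """Return description lines, each carrying all ancestor conditions."""
    if node is None:  # leaf: nothing more to describe
        return []
    column, threshold, left, right = node
    indent = "    " * depth

    # Ancestors' conditions (prefix) come first, then this node's own test.
    cond_left = "{}x['{}'] <= {}".format(prefix, column, threshold)
    cond_right = "{}x['{}'] > {}".format(prefix, column, threshold)

    lines = ["{}if {}:".format(indent, cond_left)]
    # Forward the accumulated condition to the children, like depth.
    lines += describe(left, depth + 1, cond_left + " & ")
    lines.append("{}else: #if {}:".format(indent, cond_right))
    lines += describe(right, depth + 1, cond_right + " & ")
    return lines


tree_describe = describe(TREE)
for line in tree_describe:
    print(line)
```

Returning the list instead of appending to self.tree_describe is only a convenience for the sketch. The accumulation works the same way with the append-based version.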