How can I speed up my IDA* algorithm for N-puzzle?

Question

I'm trying to solve the N-Puzzle problem using IDA* algorithm with a Manhattan heuristic. I already implemented the algorithm from the pseudocode in this Wikipedia page (link).

Here's my code so far :

class Node():
    def __init__(self, state, manhattan):
        self.state = state
        self.heuristic = manhattan

    def __str__(self):
        return f"state=\n{self.state}\nheuristic={int(self.heuristic)}"

    def __eq__(self, other):
        return np.array_equal(self.state, other.state)  


def customSort(node):
    return node.heuristic


def nextnodes(node):
    zero = np.where(node.state == 0)
    
    y,x = zero
    y = int(y)
    x = int(x)

    directions = []
    if y > 0:
        directions.append((y - 1, x))
    if x > 0:
        directions.append((y, x - 1))
    if y < constant.SIZE2 - 1:
        directions.append((y + 1, x))
    if x < constant.SIZE2 - 1:
        directions.append((y, x + 1))

    arr = []
    for direction in directions:
        tmp = np.copy(node.state)
        goal = np.where(constant.GOAL_STATE == tmp[direction])
        
        tmp[direction], tmp[zero] = tmp[zero], tmp[direction]

        tile1_score = manhattan_distance(direction, goal)
        tile2_score = manhattan_distance(zero, goal)

        arr.append(Node(tmp, node.heuristic - tile1_score + tile2_score))
    
    arr.sort(key=customSort)

    return arr


def manhattan_distance(a, b):
    return abs(b[0] - a[0]) + abs(b[1] - a[1])


def count_distance(number, state1, state2):
    position1 = np.where(state1 == number)
    position2 = np.where(state2 == number)

    return manhattan_distance(position1, position2)


def manhattan_heuristic(state):
    distances = [count_distance(num, state, constant.GOAL_STATE) for num in constant.SIZE]
    return sum(distances)


def search(path, g, threshold):
    node = path[-1]
    f = g + node.heuristic

    if f > threshold:
        return f
    if np.array_equal(node.state, constant.GOAL_STATE):
        return True

    minimum = float('inf')
    for n in nextnodes(node):
        if n not in path:
            path.append(n)
            tmp = search(path, g + 1, threshold)
            if tmp == True:
                return True
            if tmp < minimum:
                minimum = tmp
            path.pop()

    return minimum


def solve(initial_state):path = solve(initial_state)
    initial_node = Node(initial_state, manhattan_heuristic(initial_state))
    threshold = initial_node.heuristic
    
    path = [initial_node]
    
    while 1:
        tmp = search(path, 0, threshold)
        if tmp == True:
            print(f"GOOD!")
            return path
        elif tmp == float('inf'):
            print(f"WRONG!")
            return False
        threshold = tmp

You can call the solve() function (and create some needed global variables) like this:

def define_goal_state(n):
    global GOAL_STATE
    global SIZE
    global SIZE2
    global ZERO

    m = [[0] * n for i in range(n)]
    dx, dy = [0, 1, 0, -1], [1, 0, -1, 0]
    x, y, c = 0, -1, 1
    for i in range(n + n - 2):
        for j in range((n + n - i) // 2):
            x += dx[i % 4]
            y += dy[i % 4]
            m[x][y] = c
            c += 1

    GOAL_STATE = np.array(m)
    SIZE = range(1, len(GOAL_STATE) ** 2)
    SIZE2 = len(GOAL_STATE)
    ZERO = np.where(GOAL_STATE == 0)
  

initial_state = np.array([8 7 3],[4 1 2],[0 5 6])
define_goal_state(len(initial_state))
path = solve(initial_state)

My implementation of the Wikipedia pseudocode is working and I get pretty fast results in a 8-Puzzle problem. But it gets really slow with a 15-Puzzle problem and above. Regarding the solve() and the search() functions I think I respect almost identically the pseudocode. I also optimized my nextnodes() function to count only the tile that has been moved, not all tiles every time. Should be good here.

So where's the catch ? Why is my implementation taking so long ? Does anyone have a clue ?

Marc · Accepted Answer · 2021-03-17 09:45:52Z

The code does not compile as it is:

The solve function's signature contains unnecessary code

def solve(initial_state):path = solve(initial_state)

A NumPy array cannot be initialized like np.array([8 7 3],[4 1 2],[0 5 6])
Globals are not initialized
constant.GOAL_STATE is not defined

Review

In the method search, checking if n not in path is an expensive operation that can be optimized using a set or a dictionary. This drops the runtime on my machine by 70%. A dictionary seems to be a good option since it keeps insertion order (if Python >= 3.7) and can be used as a queue.
manhattan_heuristic uses many times np.where, it can be optimized. An option is to iterate through the puzzle and when an element is different from the goal, calculate the difference. Converting GOAL_STATE to a dictionary will also improve the performance, instead of calling position1 = np.where(state1 == number) every time.
nextnodes also uses np.where two times which can be avoided. For example, instead of calling zero = np.where(node.state == 0) every time, Node can contain the coordinates of zero as a class property.
Naming: variable SIZE and SIZE2 are confusing because one is a generator and the other is the size of the puzzle. Consider giving more meaningful names.
Globals: try to reduce the use of globals as much as possible as it makes the code hard to test.

Note: this is not a comprehensive review, I think there is still room for improvements, I hope you will get more reviews.

Runtime after the improvements:

Puzzle	Original	Improved
3x3	0.696 s	0.097 s
4x4	7.021 s	0.549 s

Reworked code:

from time import perf_counter as pc

import numpy as np

GOAL_STATE = [[]]
GLOBAL_STATE_DICT = {}
N = 0


class Node:
    def __init__(self, state, manhattan, zero_pos):
        self.state = state
        self.heuristic = manhattan
        self.zero_pos = zero_pos

    def __str__(self):
        return f"state=\n{self.state}\nheuristic={int(self.heuristic)}"

    def __eq__(self, other):
        return np.array_equal(self.state, other.state)

    def __repr__(self):
        return f"state=\n{self.state}\nheuristic={int(self.heuristic)}"

    def __hash__(self):
        return hash(self.state.tobytes())


def customSort(node):
    return node.heuristic


def nextnodes(node):
    zero = node.zero_pos

    r, c = map(int, zero)
    directions = ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
    nodes = []
    for direction in directions:
        if 0 <= direction[0] < N and 0 <= direction[1] < N:
            tmp = np.copy(node.state)
            goal = GLOBAL_STATE_DICT[tmp[direction]]

            tmp[direction], tmp[zero] = tmp[zero], tmp[direction]

            dir_goal_distance = manhattan_distance(direction, goal)
            goal_zero_distance = manhattan_distance(goal, (r, c))

            nodes.append(Node(tmp, node.heuristic - dir_goal_distance + goal_zero_distance, direction))
    return sorted(nodes, key=customSort)


def manhattan_distance(a, b):
    return abs(b[0] - a[0]) + abs(b[1] - a[1])


def manhattan_heuristic(state):
    distance = 0
    for i in range(N):
        for j in range(N):
            num = state[i][j]
            if num != GOAL_STATE[i][j] and num != 0:
                goal = GLOBAL_STATE_DICT[num]
                distance += manhattan_distance((i, j), goal)
    return distance


def search(path, g, threshold):
    node = list(path.keys())[-1]

    f = g + node.heuristic

    if f > threshold:
        return f
    if np.array_equal(node.state, GOAL_STATE):
        return True

    minimum = float('inf')
    for n in nextnodes(node):
        if n not in path:
            path[n] = None
            tmp = search(path, g + 1, threshold)
            if tmp == True:
                return True
            if tmp < minimum:
                minimum = tmp
            path.popitem()

    return minimum

def solve(initial_state):
    zero = np.where(initial_state == 0)
    initial_node = Node(initial_state, manhattan_heuristic(initial_state), zero)
    threshold = initial_node.heuristic
    # The dictionary keeps insertion order since Python 3.7 so it can be used as a queue
    path = {initial_node: None}
    while 1:
        tmp = search(path, 0, threshold)
        if tmp == True:
            print("GOOD!")
            return path.keys()
        elif tmp == float('inf'):
            print("WRONG!")
            return False
        threshold = tmp



def define_goal_state(n):
    global GOAL_STATE
    global N
    global GLOBAL_STATE_DICT

    m = [[0] * n for i in range(n)]
    dx, dy = [0, 1, 0, -1], [1, 0, -1, 0]
    x, y, c = 0, -1, 1
    for i in range(n + n - 2):
        for j in range((n + n - i) // 2):
            x += dx[i % 4]
            y += dy[i % 4]
            m[x][y] = c
            c += 1

    GOAL_STATE = np.array(m)
    N = len(GOAL_STATE)
    GLOBAL_STATE_DICT = {m[r][c]: (r, c) for r in range(N) for c in range(N)}


tests = {'3x3': np.array([[8, 7, 3], [4, 1, 2], [0, 5, 6]]),
         '4x4': np.array([[13, 2, 10, 3],
                                [1, 12, 8, 4],
                                [5, 0, 9, 6],
                                [15, 14, 11, 7]])}

for name, puzzle in tests.items():
    define_goal_state(len(puzzle))
    print('Puzzle:\n', puzzle)
    t0 = pc()
    path = solve(puzzle)
    t1 = pc()
    print(f'{name} depth:{len(path)} runtime:{round(t1 - t0, 3)} s')

Sorry for the delay. Your answer is really helpful, I managed to speed up my IDA algo a little bit. It is still slow on difficult configurations. I can speed it up by changing the heuristic : manhattan seems to be pretty slow. — cuzureau
– cuzureau, Commented Apr 15, 2021 at 13:29

Stack Exchange Network

How can I speed up my IDA* algorithm for N-puzzle?

1 Answer 1

Review

Reworked code:

You must log in to answer this question.

Hot Network Questions

How can I speed up my IDA* algorithm for N-puzzle?

1 Answer 1

Review

Reworked code:

You must log in to answer this question.

Related

Hot Network Questions