Skip to main content
Post Undeleted by Emily L.
added 1698 characters in body
Source Link
Emily L.
  • 16.7k
  • 1
  • 39
  • 89

Memory ConsumptionHash value collisions

IYour hash function might be causing your problems. As you are only using the hash value (getId()) in the closed and open set (as opposed to a Node that implements hashCode and equals to resolve hash value collisions), a hash value collision will appear as the node has already been visited even though it might not have been. Meaning that you skip parts of the search space, parts that could contain the solution. This could result in that you simply miss the easy solutions and keep on searching for a vague memorylong time.

For a 5x5 problem there exists 25 unique tiles (pun not intendedthe empty tile counted as a tile) about JVMs having, so you have 25! ~= 1.55E+25 unique board combinations. Using a default memory limit that is32bit integer as a bit low buthash code, at most 2^32 ~= 4.29E+9 unique board combinations can be increased by passing "-Xms" argument likerepresented without a collision.

So for example "-Xms2G"each hash value there exists ~= 3.61E+15 (=1.55E25/4.29E9) boards that have collisions. Or put the other way, if you pick two random boards the likely hood of a collision is:

Big Bug Busted

collidingBoards/totalBoards = 2.33E-10 (=3.61E+15/1.55E+25 = 1/4.29E9).

ButIf you evaluate 300 000 nodes, we can estimate a upper bound of the most glaring problemchance to not get a collision anywhere in the table by:

1 - (1 - 2.33E-10)^300000 = 0.00007 or 0.007% chance.

So you are very unlikely to have is that your Node class doesn't override hashCodecorrect closed/open sets after 300 000 iterations. Remember with A* you're searching for the optimal solution and you will expand a lot of nodes to find it; As opposed to if you're just looking for equals. Thusa solution then you'll take the first solution you find HashSet(greedy best-first search will usedo this quickly).

The closed set is not necessary (and neither is the open)

The purpose of the default implementations which rely onclosed set is to guarantee that you do not expand the object identitysame state twice as the consistency of the heuristic guarantees that the first time you expand a node is the best way to expand it. Two identical nodesIt is just an optimization.

If you get rid of the closed set you will not compare equal because they are different objects generated at different times instill find the optimal solution but you may expand each state multiple times. However as A* searchremembers the cost to get to a specific state you're going to expand better (less costly) paths first.

SoIf you don't have the closed set, you can simply remove the open set as it is essentially the same as your fringe of nodes to expand.

This should help you on larger problems, but running out of memory after 300k nodes sounds odd, which brings the next point:

Adjust your JVM memory limits

If I'm not mistaken the default memory limit on some JVM's is quite low, you can increase this by passing closeSet-Xms2G doesn't fill its function and only occupies memory and cpu time for no real useexample as command line argument to java.

Memory Consumption

I have a vague memory (pun not intended) about JVMs having a default memory limit that is a bit low but can be increased by passing "-Xms" argument like for example "-Xms2G".

Big Bug Busted

But the most glaring problem you have is that your Node class doesn't override hashCode and equals. Thus the HashSet will use the default implementations which rely on the object identity. Two identical nodes will not compare equal because they are different objects generated at different times in the A* search.

So your closeSet doesn't fill its function and only occupies memory and cpu time for no real use.

Hash value collisions

Your hash function might be causing your problems. As you are only using the hash value (getId()) in the closed and open set (as opposed to a Node that implements hashCode and equals to resolve hash value collisions), a hash value collision will appear as the node has already been visited even though it might not have been. Meaning that you skip parts of the search space, parts that could contain the solution. This could result in that you simply miss the easy solutions and keep on searching for a long time.

For a 5x5 problem there exists 25 unique tiles (the empty tile counted as a tile), so you have 25! ~= 1.55E+25 unique board combinations. Using a 32bit integer as a hash code, at most 2^32 ~= 4.29E+9 unique board combinations can be represented without a collision.

So for each hash value there exists ~= 3.61E+15 (=1.55E25/4.29E9) boards that have collisions. Or put the other way, if you pick two random boards the likely hood of a collision is:

collidingBoards/totalBoards = 2.33E-10 (=3.61E+15/1.55E+25 = 1/4.29E9).

If you evaluate 300 000 nodes, we can estimate a upper bound of the chance to not get a collision anywhere in the table by:

1 - (1 - 2.33E-10)^300000 = 0.00007 or 0.007% chance.

So you are very unlikely to have correct closed/open sets after 300 000 iterations. Remember with A* you're searching for the optimal solution and you will expand a lot of nodes to find it; As opposed to if you're just looking for a solution then you'll take the first solution you find (greedy best-first search will do this quickly).

The closed set is not necessary (and neither is the open)

The purpose of the closed set is to guarantee that you do not expand the same state twice as the consistency of the heuristic guarantees that the first time you expand a node is the best way to expand it. It is just an optimization.

If you get rid of the closed set you will still find the optimal solution but you may expand each state multiple times. However as A* remembers the cost to get to a specific state you're going to expand better (less costly) paths first.

If you don't have the closed set, you can simply remove the open set as it is essentially the same as your fringe of nodes to expand.

This should help you on larger problems, but running out of memory after 300k nodes sounds odd, which brings the next point:

Adjust your JVM memory limits

If I'm not mistaken the default memory limit on some JVM's is quite low, you can increase this by passing -Xms2G for example as command line argument to java.

Post Deleted by Emily L.
Checked the heuristic myself, it is consistent.
Source Link
Emily L.
  • 16.7k
  • 1
  • 39
  • 89

Correctness

I'm not convinced that your heuristic is consistent (wikipedia). If it isn't you can't use the closed set optimization which you're using. Because the first way you reach a given state may not be the best way to get to that state.

You need to check this.

Memory Consumption

For the 5x5 board you have 25 tiles and there thus exists 25! approx 1.55E+25 unique board combinations. Obviously A* will find the solution a without visiting all of the nodes, so your closed set won't go that big. This is just to illustrate the size of the search space.

I have a vague memory (pun not intended) about JVMs having a default memory limit that is a bit low but can be increased by passing "-Xms" argument like for example "-Xms2G".

Big Bug Busted

But the most glaring problem you have is that your Node class doesn't override hashCode and equals. Thus the HashSet will use the default implementations which rely on the object identity. Two identical nodes will not compare equal because they are different objects generated at different times in the A* search.

So your closeSet doesn't fill its function and only occupies memory and cpu time for no real use.

Correctness

I'm not convinced that your heuristic is consistent (wikipedia). If it isn't you can't use the closed set optimization which you're using. Because the first way you reach a given state may not be the best way to get to that state.

You need to check this.

Memory Consumption

For the 5x5 board you have 25 tiles and there thus exists 25! approx 1.55E+25 unique board combinations. Obviously A* will find the solution a without visiting all of the nodes, so your closed set won't go that big. This is just to illustrate the size of the search space.

I have a vague memory (pun not intended) about JVMs having a default memory limit that is a bit low but can be increased by passing "-Xms" argument like for example "-Xms2G".

Big Bug Busted

But the most glaring problem you have is that your Node class doesn't override hashCode and equals. Thus the HashSet will use the default implementations which rely on the object identity. Two identical nodes will not compare equal because they are different objects generated at different times in the A* search.

So your closeSet doesn't fill its function and only occupies memory and cpu time for no real use.

Memory Consumption

I have a vague memory (pun not intended) about JVMs having a default memory limit that is a bit low but can be increased by passing "-Xms" argument like for example "-Xms2G".

Big Bug Busted

But the most glaring problem you have is that your Node class doesn't override hashCode and equals. Thus the HashSet will use the default implementations which rely on the object identity. Two identical nodes will not compare equal because they are different objects generated at different times in the A* search.

So your closeSet doesn't fill its function and only occupies memory and cpu time for no real use.

deleted 6 characters in body
Source Link
Emily L.
  • 16.7k
  • 1
  • 39
  • 89

Correctness

I doI'm not believeconvinced that your heuristic is consistent (wikipedia). As suchIf it isn't you can't use the closed set optimization which you're using. Because the first way you reach a given state may not be the best way to get to that state.

So you can get the wrong solution!You need to check this.

Memory Consumption

For the 5x5 board you have 25 tiles and there thus exists 25! approx 1.55E+25 unique board combinations. Obviously A* will find the solution a without visiting all of the nodes, so your closed set won't go that big. This is just to illustrate the size of the search space.

I have a vague memory (pun not intended) about JVMs having a default memory limit that is a bit low but can be increased by passing "-Xms" argument like for example "-Xms2G".

Big Bug Busted

But the most glaring problem you have is that your Node class doesn't override hashCode and equals. Thus the HashSet will use the default implementations which rely on the object identity. Two identical nodes will not compare equal because they are different objects generated at different times in the A* search.

So your closeSet doesn't fill its function and only occupies memory and cpu time for no real use.

Correctness

I do not believe that your heuristic is consistent (wikipedia). As such you can't use the closed set optimization which you're using. Because the first way you reach a given state may not be the best way to get to that state.

So you can get the wrong solution!

Memory Consumption

For the 5x5 board you have 25 tiles and there thus exists 25! approx 1.55E+25 unique board combinations. Obviously A* will find the solution a without visiting all of the nodes, so your closed set won't go that big. This is just to illustrate the size of the search space.

I have a vague memory (pun not intended) about JVMs having a default memory limit that is a bit low but can be increased by passing "-Xms" argument like for example "-Xms2G".

Big Bug Busted

But the most glaring problem you have is that your Node class doesn't override hashCode and equals. Thus the HashSet will use the default implementations which rely on the object identity. Two identical nodes will not compare equal because they are different objects generated at different times in the A* search.

So your closeSet doesn't fill its function and only occupies memory and cpu time for no real use.

Correctness

I'm not convinced that your heuristic is consistent (wikipedia). If it isn't you can't use the closed set optimization which you're using. Because the first way you reach a given state may not be the best way to get to that state.

You need to check this.

Memory Consumption

For the 5x5 board you have 25 tiles and there thus exists 25! approx 1.55E+25 unique board combinations. Obviously A* will find the solution a without visiting all of the nodes, so your closed set won't go that big. This is just to illustrate the size of the search space.

I have a vague memory (pun not intended) about JVMs having a default memory limit that is a bit low but can be increased by passing "-Xms" argument like for example "-Xms2G".

Big Bug Busted

But the most glaring problem you have is that your Node class doesn't override hashCode and equals. Thus the HashSet will use the default implementations which rely on the object identity. Two identical nodes will not compare equal because they are different objects generated at different times in the A* search.

So your closeSet doesn't fill its function and only occupies memory and cpu time for no real use.

Source Link
Emily L.
  • 16.7k
  • 1
  • 39
  • 89
Loading