Assignment 04 - Individual - Pathfinding


Due: Nov 1, 11:55pm

In this assignment, you will implement a practical algorithm used to find an efficiently traversable path between two points.

NOTE: Not only should your implementation work correctly, but it will need to use space and time efficiently to receive full credit.

This assignment is based on web exercises from the Algorithms 4/e book website.

The Pathfinding Problem


In general, a pathfinding algorithm searches for the "shortest" or "best" path from one node in a graph to another. Nodes represent locations and are connected by edges. An edge represents the ability to move between its two end nodes, and each edge has an associated weight representing the cost of moving across it. The "shortest" or "best" path between any two nodes is the sequence of edges connecting them that has the smallest sum of weights. In other words, there may be many possible sequences of edges that connect the two nodes, but we are interested in choosing only a sequence that is optimal or close to optimal.

Pathfinding is a very common algorithm that is applied in many ways. Below is a short list of pathfinding examples.

Given enough time a breadth-first search algorithm will find a solution assuming one exists, but such a searching solution does not scale well, especially as the dimensionality of movement increases. In this assignment, you will implement Dijkstra's Algorithm and a variant of this called the A* Algorithm, both of which find a solution efficiently. The figures below show the solution (pink path) to the maze from the figure above, where the searched area is tinted magenta. Although the two figures show exactly the same (optimal) path, they use different heuristic coefficient values, and therefore the area of searching is different. The starting point is the upper-left corner, and the ending point is the upper-right corner.

We can use these same pathfinding algorithms to solve all types of mazes.

Nearly all real-time strategy video games use pathfinding to control character movements. The player selects a character and then specifies a destination, and the character starts moving toward that location. The difficulty in this scenario is that there are obstacles (buildings, other characters, etc.) and terrain types that vary in how easy they are to traverse (walking on pavement is generally easier than walking on sand). If the pathfinding algorithm takes this variability into account, the character can choose routes that get it to the destination more quickly.

We will modify our pathfinding algorithm to account for elevation change. A character can move easily from one location to a nearby location if the elevation remains the same. However, the more the elevation changes (either up or down), the more difficult moving there becomes.

The figures below show a top-down view of a randomly generated terrain. The black color represents sea level, and the bright green represents higher ground above water. The starting point is drawn as a large blue circle near the bottom, and the ending point is a large red circle closer to the top-left corner. The top-right, bottom-left, and bottom-right figures visualize three different paths. Note that although the path taken in the bottom-right figure is the shorter way to go ("as the crow flies") between the two points, the change in elevation along that path would make for difficult travel.


Modeling the Pathfinding Problem and the A* Search


We will use several data structures to solve this problem, notably the priority queue, the linked list (tree), and the array. For the priority queue, use the MinPQ data type provided by the algs4.jar library. You will implement a simple linked-list node that will be the key to MinPQ, so it will need to implement the Comparable interface.
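As a rough sketch of what such a comparable node might look like: the class name and field names below are illustrative assumptions, not the required design, and java.util.PriorityQueue stands in for MinPQ only so that the snippet runs on its own.

```java
import java.util.PriorityQueue;

// Hypothetical node; names are illustrative and not part of the assignment's API.
class NodeSketch implements Comparable<NodeSketch> {
    final int row, col;         // location on the grid
    final double cost;          // accumulated cost, used as the queue priority
    final NodeSketch fromNode;  // link back toward the start (null for the start itself)

    NodeSketch(int row, int col, double cost, NodeSketch fromNode) {
        this.row = row;
        this.col = col;
        this.cost = cost;
        this.fromNode = fromNode;
    }

    // A minimum priority queue removes the node with the smallest cost first.
    @Override
    public int compareTo(NodeSketch that) {
        return Double.compare(this.cost, that.cost);
    }

    // Inserts three sample neighbors of a start node and removes the cheapest one.
    static NodeSketch cheapestOfSample() {
        PriorityQueue<NodeSketch> pq = new PriorityQueue<>();
        NodeSketch start = new NodeSketch(1, 1, 0.0, null);
        pq.add(new NodeSketch(1, 2, 4.0, start));
        pq.add(new NodeSketch(0, 1, 1.0, start));
        pq.add(new NodeSketch(2, 1, 9.0, start));
        return pq.poll();
    }
}
```

With MinPQ, the calls corresponding to add() and poll() are insert() and delMin().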

As an aside, we could solve this problem using less memory on average (the worst case is the same) with other data structures, but we have not learned about them yet. Furthermore, we are trading off some computation time (removing a key from the priority queue) for memory usage (keys can be invalidated).

To determine a path, we will begin at the path's starting coordinate and expand outward in all directions until we reach the ending coordinate. For every coordinate we explore, we will create an object that records a path back to the starting location.

For example, suppose we have the following simple terrain, where S indicates the path's starting location and E indicates the end.

  0 1 2 3 4
0 . . . . .
1 . S . . .
2 . . . . .
3 . . . . E
4 . . . . .

Now, if we expanded out in all directions from S and remember the way back, we would see the following.

0 1 2 3 4
0
1 S
2
3 E
4

Expanding once more...

0 1 2 3 4
0
1 S
2
3 E
4

Note that in this example, the top-left spot could point to the right or down, but both paths take the same time. Continuing this fully, we get the following.

0 1 2 3 4
0
1 S
2
3
4

If we follow the path back from E to S, we get: ↑, ←, ←, ←, ↑.

  0 1 2 3 4
0 . . . . .
1 . S . . .
2 . ↑ ← ← ←
3 . . . . ↑
4 . . . . .

The algorithm above runs in \(N^2\) time but always finds the shortest route. However, this is not an interesting problem.
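The expand-and-remember-the-way-back idea can be sketched as a breadth-first search. This is only an illustration under assumptions: the class and method names are made up, cells are plain (row, column) pairs rather than Coord objects, and movement is limited to up, down, left, and right as in the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class GridExpansionSketch {
    // Offsets for up, down, left, right.
    static final int[] DR = {-1, 1, 0, 0};
    static final int[] DC = {0, 0, -1, 1};

    // Returns the cells, start to end, of one shortest route on an open n-by-n grid.
    static List<int[]> shortestRoute(int n, int sr, int sc, int er, int ec) {
        int[] from = new int[n * n];        // the "way back" for each cell; -1 = unexplored
        Arrays.fill(from, -1);
        int start = sr * n + sc, end = er * n + ec;
        from[start] = start;                // the start points to itself
        ArrayDeque<Integer> frontier = new ArrayDeque<>();
        frontier.add(start);
        while (!frontier.isEmpty()) {
            int cur = frontier.remove();
            if (cur == end) break;
            for (int k = 0; k < 4; k++) {
                int r = cur / n + DR[k], c = cur % n + DC[k];
                if (r < 0 || r >= n || c < 0 || c >= n) continue;
                int next = r * n + c;
                if (from[next] != -1) continue;  // already reached by a route at least as short
                from[next] = cur;                // remember the way back
                frontier.add(next);
            }
        }
        List<int[]> route = new ArrayList<>();
        if (from[end] == -1) return route;       // end was never reached
        for (int cur = end; cur != start; cur = from[cur]) {
            route.add(new int[] {cur / n, cur % n});
        }
        route.add(new int[] {sr, sc});
        Collections.reverse(route);              // links run end-to-start, so flip them
        return route;
    }
}
```

For the 5-by-5 example, shortestRoute(5, 1, 1, 3, 4) returns six cells (five moves). Which of the equally short routes it returns depends on the order in which neighbors are visited.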

An example of an interesting problem is this: what if some of those squares require more time to cross? We could model this by assigning a value to each point on the grid, and then as we explore outward we accumulate the values (stored at each point along with the way back). This time, though, if there are two or more possible ways to reach a point, we choose the cheapest one. Below is an example.

0 1 2 3 4
0 +1 +1 +1 +1 +5
1 +9 S +4 +1 +1
2 +9 +9 +4 +8 +2
3 +1 +7 +4 +2 E
4 +1 +1 +2 +4 +1

Starting at S and accumulating while expanding out, we would see the following.

0 1 2 3 4
0 → 2 ↓ 1 ← 2 ← 3 ← 8
1 → 9 S ← 4 ↑ 4 ← 5
2 → 18 ↑ 9 ↑ 8 ↑ 12 ↑ 7
3 → 17 ↑ 16 ↑ 12 ← 14 E
4 → 16 → 15 ↑ 14 ← 18 ← 19

Looking around E, we see that the direction that has the smallest number is up (7), and following the arrows back to S is in fact the cheapest path to take. The modified algorithm above runs in \(aN^2\) time, where \(a > 1\), which isn't much worse than before but now we're solving interesting problems.

A careful observer would notice that we do not actually need to fill in the entire table. In fact, since the shortest path costs 7, any location whose accumulated cost is greater than 7 does not need to be checked (e.g., the bottom-left corner).

A simple way to take this into account is to use a minimum priority queue. We start by placing in this queue the starting location (S). Then, we repeat the following until the queue is empty or we have reached our destination

It would be good to prove to yourself that this implementation will determine the optimal path.

Note: the algorithm above is called Dijkstra's algorithm.
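The queue-driven search can be sketched on the 5-by-5 example as follows. This is a minimal illustration, not the required Pathfinder design: java.util.PriorityQueue stands in for MinPQ, cells are plain array indices, and the weights of the S and E squares (which the example leaves blank) are assumed to be 0 so that the result agrees with the cost of 7 quoted above.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

class DijkstraGridSketch {
    // Returns the cheapest accumulated cost from (sr, sc) to (er, ec), or -1 if unreachable.
    // Entering a cell costs that cell's weight; queue entries are {cost, row, col}.
    static int cheapestCost(int[][] w, int sr, int sc, int er, int ec) {
        int n = w.length, m = w[0].length;
        int[][] best = new int[n][m];              // cheapest cost found so far per cell
        for (int[] row : best) Arrays.fill(row, Integer.MAX_VALUE);
        PriorityQueue<int[]> pq = new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        best[sr][sc] = 0;
        pq.add(new int[] {0, sr, sc});
        int[] dr = {-1, 1, 0, 0}, dc = {0, 0, -1, 1};
        while (!pq.isEmpty()) {
            int[] cur = pq.poll();                 // cheapest location not yet expanded
            int cost = cur[0], r = cur[1], c = cur[2];
            if (cost > best[r][c]) continue;       // a cheaper way here was found later
            if (r == er && c == ec) return cost;   // destination reached: stop early
            for (int k = 0; k < 4; k++) {
                int nr = r + dr[k], nc = c + dc[k];
                if (nr < 0 || nr >= n || nc < 0 || nc >= m) continue;
                int nextCost = cost + w[nr][nc];
                if (nextCost < best[nr][nc]) {
                    best[nr][nc] = nextCost;
                    pq.add(new int[] {nextCost, nr, nc});
                }
            }
        }
        return -1;
    }

    // Weights from the example; the two 0 entries (S and E) are assumptions.
    static int[][] exampleGrid() {
        return new int[][] {
            {1, 1, 1, 1, 5},
            {9, 0, 4, 1, 1},
            {9, 9, 4, 8, 2},
            {1, 7, 4, 2, 0},
            {1, 1, 2, 4, 1},
        };
    }
}
```

Calling cheapestCost(exampleGrid(), 1, 1, 3, 4) gives 7. Because the loop stops as soon as E is removed from the queue, expensive corners of the table are never expanded.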

Although the algorithm above will improve our average runtimes by only searching along the cheapest route so far, it is still not great. Another observation you might have made is that the optimal route will likely take us toward our destination. We will add to our accumulated cost a heuristic value that captures this notion "every step should get us closer". Implementing this observation is the A* search algorithm.

A heuristic is a practical method that is not guaranteed to be optimal or perfect, but it is an approach that reasonably gets us to our goal quicker.

The heuristic that we will use is simply the distance between the surrounding location (\(N\)) and the path's end (\(E\)) times a scalar (\(h\)). In the following equation, \(\text{cost}(C)\) is the accumulated cost to reach the current point \(C\) (the location that we are searching around), and \(\text{travelcost}(C,N)\) is the cost of traveling from \(C\) to \(N\) (a location neighboring \(C\)).

\[\text{cost}(N) = \text{cost}(C) + \text{travelcost}(C, N) + h * \text{dist}(N, E)\]

When \(h = 0\), the A* search algorithm becomes Dijkstra's algorithm. For more information, see Amit Patel's A* Pages.
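To make the formula concrete, here is a small worked sketch. The class and method names are hypothetical, and dist() is assumed to be the straight-line distance between grid cells.

```java
class AStarPrioritySketch {
    // cost(N) = cost(C) + travelcost(C, N) + h * dist(N, E)
    static double priority(double costC, double travelCost, double h, double distToEnd) {
        return costC + travelCost + h * distToEnd;
    }

    // Straight-line ("as the crow flies") distance between two grid cells (an assumption).
    static double dist(int r1, int c1, int r2, int c2) {
        return Math.hypot(r1 - r2, c1 - c2);
    }
}
```

Using the weighted grid from earlier, take C at row 1, column 3 (accumulated cost 4), N at row 1, column 4 (travel cost 1), and E at row 3, column 4, so dist(N, E) = 2. With h = 0.5 the value is 4 + 1 + 0.5 * 2 = 6; with h = 0 it is 5, which is the accumulated cost shown for that square in the table. Since getPathCost() reports the non-heuristic cost, one option is to store the accumulated cost separately from the value used to order the queue.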

Important note: take care of reaching a location by two different paths by invalidating the longer paths in the minimum priority queue. In other words, if a particular location was reached from the left but another path from the right comes along that is cheaper, the left path should be invalidated and the right added.

In the example below, the ← 16 location is invalidated and overwritten with → 13 when the +2 location becomes → 12, because \(12 + 1 = 13 < 16\) (the algorithm reached the +1 location from the left before it could reach it from the cheaper path on the right).

costs ... +5 +1 +2 +3 ...
  \(\vdots\) \(\vdots\) \(\vdots\) \(\vdots\) \(\vdots\) \(\vdots\) \(\vdots\)
step \(i\) ... ← 15 ← 16 +2 → 10 ...
step \(i+1\) ... ← 15 ← 16 → 12 → 10 ...
step \(i+2\) ... ← 15 → 13 → 12 → 10 ...
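One possible way to invalidate an entry without removing it from the queue is to flag it and skip flagged entries when they surface. The sketch below is an assumption-laden illustration: the names are hypothetical, cells are plain indices, and java.util.PriorityQueue stands in for MinPQ.

```java
import java.util.PriorityQueue;

class InvalidationSketch {
    static class Entry implements Comparable<Entry> {
        final int cell;
        final int cost;
        boolean valid = true;   // cleared when a cheaper way to the same cell is found

        Entry(int cell, int cost) {
            this.cell = cell;
            this.cost = cost;
        }

        @Override
        public int compareTo(Entry that) {
            return Integer.compare(this.cost, that.cost);
        }
    }

    final PriorityQueue<Entry> pq = new PriorityQueue<>();
    final Entry[] latest;       // the entry currently trusted for each cell, or null

    InvalidationSketch(int cells) {
        latest = new Entry[cells];
    }

    // Records a way of reaching a cell; returns false if it is no better than the known one.
    boolean offer(int cell, int cost) {
        Entry old = latest[cell];
        if (old != null && old.cost <= cost) return false;
        if (old != null) old.valid = false;   // still in the queue, but now marked dead
        latest[cell] = new Entry(cell, cost);
        pq.add(latest[cell]);
        return true;
    }

    // Removes and returns the cheapest entry that is still valid, or null if none remain.
    Entry nextValid() {
        while (!pq.isEmpty()) {
            Entry e = pq.poll();
            if (e.valid) return e;
        }
        return null;
    }
}
```

In terms of the example, offer(1, 16) records the path from the left, and a later offer(1, 13) marks it invalid and inserts the cheaper path from the right. The stale entry stays in the queue until it is removed and skipped, which is the time-for-memory trade-off mentioned earlier.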

The Pathfinder Class


To model a pathfinding problem, modify the data type Pathfinder and the internal private data type PFNode (pathfinding node) to implement the following API:

A Pathfinder will take a Terrain object as an argument for the constructor. The setPathStart() and setPathEnd() methods will set the starting and ending locations, and the setHeuristic() method will set the search heuristic.

Once the main properties of the Pathfinder object have been set, the computePath() method will perform an A* search to find a path from the starting location to the ending location that has approximately the least cost.

foundPath() returns true if a path was found. getPathCost() returns the total (non-heuristic) cost of traveling along the path. getSearchSize() returns the number of locations that were looked at while finding the path. wasSearched() returns true if the specified Coord was looked at while finding the path.

The getPathSolution() method returns an Iterable<Coord> that iterates along the path from the starting position to the ending position. This will be used to draw the path in PathfinderVisualizer and for the Walker (see next section). Note: each Coord along the solution must be adjacent to the previous and next Coord on the path.
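Since each node records only the way back toward the start, one way to produce a start-to-end iteration is to walk the links from the end node and build the sequence in reverse. The sketch below assumes a hypothetical Node type with a fromNode link and uses (row, column) pairs in place of Coord.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class PathSolutionSketch {
    // Hypothetical stand-in for a search node; not the assignment's PFNode.
    static class Node {
        final int row, col;
        final Node fromNode;    // previous node on the way back to the start, or null

        Node(int row, int col, Node fromNode) {
            this.row = row;
            this.col = col;
            this.fromNode = fromNode;
        }
    }

    // Follows the links back from the end node, adding each node at the front,
    // so that iterating over the result runs from the start to the end.
    static Iterable<Node> startToEnd(Node end) {
        Deque<Node> path = new ArrayDeque<>();
        for (Node n = end; n != null; n = n.fromNode) {
            path.addFirst(n);
        }
        return path;
    }
}
```

If the path is built once at the end of computePath() and stored, getPathSolution() can simply return it, which is in keeping with the constant-time requirement for methods other than computePath() and resetPath().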

The resetPath() method will clear out any path information.

Similar Problem/Fewer Variables: The simplest way to approach this problem is to ignore the travel costs during computePath() and simply find any path. In fact, this can be computed in time linear in \(N\) (not \(N^2\)).

Edge Cases: Throw an IllegalArgumentException if the Coord passed to setPathStart() or setPathEnd() is null. Throw an IndexOutOfBoundsException if the Coord is outside the acceptable range (hint: use the isInBounds() method of Coord). Throw an IllegalArgumentException if computePath() is called before both the start and end locations have been set.
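The checks above might be organized along the following lines. This is a sketch only: CoordStandIn is a made-up substitute for the provided Coord class, its isInBounds() parameter list is an assumption that may not match the real one, and the map is assumed to be n-by-n.

```java
class EdgeCaseSketch {
    // Stand-in for the provided Coord class; the real isInBounds() signature may differ.
    static class CoordStandIn {
        final int i, j;

        CoordStandIn(int i, int j) {
            this.i = i;
            this.j = j;
        }

        // True if both components lie within the given bounds (inclusive).
        boolean isInBounds(int minI, int minJ, int maxI, int maxJ) {
            return i >= minI && i <= maxI && j >= minJ && j <= maxJ;
        }
    }

    private final int n;                 // the map is assumed to be n-by-n
    private CoordStandIn start, end;

    EdgeCaseSketch(int n) {
        this.n = n;
    }

    void setPathStart(CoordStandIn loc) { start = checked(loc); }

    void setPathEnd(CoordStandIn loc) { end = checked(loc); }

    // Shared validation for both setters.
    private CoordStandIn checked(CoordStandIn loc) {
        if (loc == null) throw new IllegalArgumentException("location is null");
        if (!loc.isInBounds(0, 0, n - 1, n - 1)) {
            throw new IndexOutOfBoundsException("location is off the map");
        }
        return loc;
    }

    void computePath() {
        if (start == null || end == null) {
            throw new IllegalArgumentException("start and end must be set first");
        }
        // ... the search itself would go here ...
    }
}
```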

Performance Requirements: Your implementation of computePath() must use space that is linear in the size of the final search space, and it should run in no worse than \(\sim a N^2 \lg N\) time, where \(a\) is about 15% (depending on map and heuristic). Also, all methods except computePath() and resetPath() should run in constant time.

The Walker Class


To model an individual following the path along a terrain, modify the Walker data type to implement the following API:

The constructor takes as parameters the Terrain the walker is to traverse and the path (as an Iterable<Coord>). getLocation() returns the walker's current location, which starts at the first Coord in the path. doneWalking() returns true if the walker has reached the end of the path.

advance() will "walk" the walker along the path for a given amount of time (byTime). This should take into account the cost of following the given path along the Terrain. In other words, if the path is along level ground, the walker should traverse it rather quickly, but if it is uphill or downhill, the walker will go much slower. Use the computeTravelCost() method of the Terrain to determine how much time it takes to travel from the walker's current position to the next position on the path.

Hint: start by implementing this to advance to the next Coord on the path each time to make sure you get it correct. Once you are convinced that it works, then try to consider the difficulty in traveling.
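One way to account for travel difficulty is to bank the elapsed time and step forward only when enough has accumulated to pay for the next move. The sketch below is built on assumptions: cells are int pairs instead of Coord, a function parameter stands in for Terrain's computeTravelCost(), the path is assumed to be non-empty, and the walker is assumed to jump to the next cell once its cost is paid (rather than moving fractionally).

```java
import java.util.Iterator;
import java.util.function.ToDoubleBiFunction;

class WalkerSketch {
    private final Iterator<int[]> rest;                         // cells not yet visited
    private final ToDoubleBiFunction<int[], int[]> travelCost;  // stand-in for Terrain
    private int[] location;      // current cell
    private int[] next;          // next cell on the path, or null when finished
    private double banked = 0;   // time already spent walking toward next

    // Assumes the path contains at least one cell.
    WalkerSketch(Iterable<int[]> path, ToDoubleBiFunction<int[], int[]> travelCost) {
        this.rest = path.iterator();
        this.travelCost = travelCost;
        this.location = rest.next();
        this.next = rest.hasNext() ? rest.next() : null;
    }

    int[] getLocation() { return location; }

    boolean doneWalking() { return next == null; }

    // Spends byTime, moving forward one cell each time the banked time covers the next move.
    void advance(double byTime) {
        banked += byTime;
        while (next != null && banked >= travelCost.applyAsDouble(location, next)) {
            banked -= travelCost.applyAsDouble(location, next);
            location = next;
            next = rest.hasNext() ? rest.next() : null;
        }
    }
}
```

A large byTime can carry the walker across several cells in one call, and a small one may leave it in place while the time accumulates.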

The Terrain Class


We have provided a class to load and store maps or heightmaps. The key methods of this class, though, are computeDistance() and computeTravelCost(), as they will help you perform the A* search. computeDistance() returns the "as the crow flies" distance between the two input coordinates. computeTravelCost() returns the cost of traveling (a combination of distance and height difference) from the first input coordinate to the second.

Other Classes


The Coord class is a convenient way to pass a 2D coordinate (longitude+latitude, i+j, etc.). Note that the class is immutable, so you can assign to it only at construction. The class does contain a useful method called isInBounds() that returns a boolean indicating whether the Coord is within the specified minimum and maximum values (inclusive).

The PathfinderVisualizer is a straightforward class that renders the data stored in Terrain and Pathfinder objects.

The InteractivePathfinderVisualizer adds some simple interaction to the PathfinderVisualizer. See the comments at the top of the .java file for details.

Finally, the TerrainEditor class contains a handful of useful functions to edit a Terrain object. The InteractivePathfinderVisualizer uses these functions to smooth, increase, or decrease the heightmap or to generate a new fractal terrain.

Deliverables


Submit a zip file containing the entire IntelliJ project, including all of the files listed below. The files you are required to modify are marked with <<<<<.

04_Pathfinding/
    .idea/...
    .log/...
    out/...
    src/
        algs4.jar
        stdlib.jar
        Coord.java
        InteractivePathfinderVisualizer.java
        Pathfinder.java                         <<<<<
        PathfinderVisualizer.java
        Terrain.java
        TerrainEditor.java
        Walker.java                             <<<<<
    04_Pathfinding.iml
    readme.txt                                  <<<<<

We will supply an IntelliJ project with stdlib.jar and algs4.jar libraries. These libraries contain many useful classes. On this assignment, the only library functions you may call are those in java.util, stdlib.jar, and algs4.jar. Do not import any other library!

Grading Rubric


3pts - Submission contains all files
3pts - Correct implementation of computePath() and getPathSolution()
3pts - Correct implementation of getPathCost() and getSearchSize()
3pts - Correct implementation of Walker class
3pts - Efficient code
3pts - Completed readme.txt