-Tue Nov 13 11:50:24 PST 1990 Phil Lapsley phil@Berkeley.EDU
-
-This library implements a reasonably general version of the A* algorithm.
-Basically, A* is like an ordered search, but with a heuristic that allows
-it to make a better choices as to which path to take. The subdirectory
-"test" has example code for using A* to search a weighted cartesian matrix.
-The file "XXX//bestpath.c" has code to interface Empire to the A*
-algorithms.
-
-This library is copyrighted; see the file COPYRIGHT for details.
-
-COMPILATION
-
-Do a "make" in this directory to make the library. Cd into "test" and
-do a "make" there to make a test program, called (ta da) "test".
-
-LIBRARY USAGE
-
-Pretty much all the data that the user needs to communicate to the library
-is stored in an "as_data" structure, which is created by a call to "as_init":
-
- struct as_data *adp;
-
- adp = as_init( ... );
-
-The arguments to as_init specify a number of key things the algorithm will
-use, and these are discussed below.
-
-Once you have an "as_data" structure, you can fill in its "from" and "to"
-members, which specify the coordinates of the points between which you want
-to go. The basic coordinate data structure is the "as_coord", which is
-just an "x" integer and a "y" integer. So:
-
- adp->from.x = ...;
- adp->from.y = ...;
- adp->to.x = ...;
- adp->to.y = ...;
-
-and then call as_search:
-
- as_search(adp);
-
-If the return value of as_search is 0, the algorithm found a path from
-"from" to "to". The path is stored in the "path" member of the "as_data"
-structure, and is just a linked list of coordinates ("as_coord" structs)
-from "from" to "to".
-
-If the return value of as_search is -1, the algorithm couldn't find a
-path from "from" to "to". If the return is -2, a system error occurred
-(most probably malloc failed).
-
-After a call to as_search, lots of malloced data 'n' stuff will be
-floating around, all pointed to by items in the "as_data" data structure.
-If you call "as_search" again (presumably with new "from" and "to"
-coordinates), the system will free these data structures and reallocate
-new ones. If you no longer want the data structures at all (i.e., you
-never intend to call "as_search" again), you can call:
-
- as_delete(adp);
-
-and this will free up all data associated with the as_data structure pointed
-to by adp.
-
-ARGUMENTS TO AS_INIT
-
-"as_init" takes eight arguments, in the following order:
-
- maxneighbors The maximum number of neighboring sectors a
- coordinate can have. On a cartesian grid,
- this would be 8. On a hex map, it would be 6.
-
- hashsize The size of the hash table used to determine
- if we've visited a particular coordinate.
-
- hashfunc A pointer to a function that takes an
- "as_coord" as an argument and returns an
- integer that will be used as a hash value.
- If this is a NULL pointer, the as library
- will assign its own hash function that just
- adds the x and y coordinates together.
-
- neighborfunc A pointer to a function that takes a coordinate
- (call it "c"), a pointer to an array of
- coordinates of length "maxneighbors", and
- a user data pointer (see below). This
- function should figure out the coordinates
- of all the neighbors of "c" and put them in
- the array cp[0] ... cp[maxneighbors-1].
- It should then return the number of neighbors
- found.
-
- lbcostfunc A pointer to a function that takes two
- coordinates, call them "from" and "to",
- and a user data pointer (see below). It
- returns a double precision *LOWER BOUND*
- on the cost to get from "from" to "to".
- "from" and "to" may be separated by an
- arbitrary distance. More on this below.
-
- realcostfunc A pointer to a function that takes two
- coordinates, call them "from" and "to",
- and a user data pointer (see below). It
- returns a double precision value of
- the *actual cost* to move from "from" to
- "to". Note that "from" will never be more
- than one sector away from "to".
-
- seccostfunc A pointer to a function that takes two
- coordinates, call them "from" and "to",
- and a user data pointer (see below). It
- returns a double precision value that will
- be used to break ties when two nodes have
- the same lower bound to the target. An
- example will be discussed below.
-
- userdata A (char *) that can be a pointer to
- any kind of data you want. This will
- be passed as the third argument to the
- neighborfunc, lbcostfunc, realcostfunc,
- and seccostfunc.
-
-Look in test/cart.c to see examples of these functions for cartesian
-coordinates.
-
-NOTES ON THE LOWER BOUND FUNCTION
-
-"lbcostfunc" is *CRUCIAL* to the algorithm's performance. The entire
-idea behind the A* algorithm is that, when considering whether to move
-to a new coordinate, you know two things:
-
- (1) how much it's cost you to get to that coordinate so far,
-
- (2) a LOWER BOUND on how much it will cost to get to the
- destination from that coordinate.
-
-If these two conditions are met, the algorithm will return the optimal
-path to the destination. The closer the lower bound is to the actual
-cost to get from one point to another, the quicker the algorithm will
-find this path. HOWEVER, if the lower bound is ever violated, i.e.,
-if the so-called lower bound function returns a value that is greater
-than the actual cost to get to the destination, then the algorithm will
-not necessarily find the optimal path.
-
-Example:
-
-Assume that we're on a cartesian matrix, and the cost to move from one point
-to another is just the distance between the two points, and that no
-other costs are involved. In this case, the lower bound function could
-be the same as the actual cost function, i.e.,
-
- real cost = lower bound cost = sqrt(dx^2 + dy^2);
-
-In this case, the algorithm will find the destination as quickly as possible.
-
-Another example:
-
-Again assume we're on a cartesian matrix, and the cost to move from
-one point to another is two things: (1) the distance between them, (2) some
-arbitrary cost function we get off of a map. E.g.,
-
- X
-
- 0 1 2 3
- 0 0 0 0 0
- 1 0 0 0 0
- Y 2 0 0 2 0
- 3 0 0 0 0
-
-The real cost to move from (x,y) 0,0 to 1,0 is 1. That is, it's the distance
-(1) plus the value of the map at (1,0), which is 0. The real cost to move
-from (1,2) to (2,2) is 3: the distance (1) plus the map value at (2,2), which
-is 2, totaling 3.
-
-In this case, the lower bound function could still be the distance between
-the two points, since this WILL NEVER BE MORE than the actual cost, and
-hence is a lower bound. I.e.,
-
- real cost = sqrt(dx^2 + dy^2) + map costs
-
- lower bound cost = sqrt(dx^2 + dy^2)
-
- lower bound cost <= real cost for all coordinates
-
-This is what the the example in the "test" directory uses.
-
-A third example:
-
-You could make the lower bound function always return 0. This would be
-a valid lower bound of the cost to move between any two points. In this
-case, the A* algorithm will behave like a breadth-first search, and isn't
-very much fun.
-
-SECONDARY COST FUNCTION
-
-The algorithm tries new coordinates in order of lowest lower-bound.
-There can be cases where the lower-bound function for two (or more)
-coordinates is the same, i.e., there is a tie. The algorithm uses
-the secondary cost function (which does NOT have to be a lower bound)
-to choose which coordinate to pick. A typical heuristic might be
-to use Euclidian distance for this secondary cost function, on the
-assumption that it's always better to move closer the destination.
-The Empire code does just this.
-
-If you don't need a secondary cost function, just specify a NULL pointer
-to the seccost argument to as_init, and the routines will use 0.0 for you.
-
-EMPIRE INTERFACE
-
-The interface code in "XXX/bestpath.c" is a rather complicated example
-of A* interface code. What it does is based on some features of Empire
-that are explained here.
-
-First, movement cost in Empire is based on a quantity called "mobility
-cost", which is a property of every sector. As we trace a path from
-source to destination, we add up mobility cost as we go. Once we're
-there, we have a total mobility cost. This is what we'd like to
-minimize.
-
-Second, Empire has highways, which are zero cost movement paths. This
-hurts the A* algorithm, because it means that the lower bound cost
-function is very weak. For example, think what happens if we move from
-one side of the country to another: if the two sectors are attached via
-a highway, the cost will be very small -- in fact, it will be the cost
-to move out of the source sector, and the cost to move into the
-destination sector. If, on the other hand, the two sectors aren't
-connected by a highway, the cost could be quite large. Thus, the lower
-bound is just the cost to move out of a sector plus the cost to move
-into a sector. This is a pretty weak lower bound, and means that we
-have to explore a lot of sectors.
-
-Third, the mobility costs tend to tie a lot, as explained in the
-section above on secondary cost functions. Thus, we use the Empire
-"mapdist" function as a secondary sort function to break ties. For
-example, consider the case where a sector borders two highway sectors,
-one on each side. The lower bound function will say that the two have
-equal lower bound costs, since they're both highways and cost nothing
-to move on. The secondary sort function is then used to break the tie --
-it says, "Take the one that moves you closer to the destination".
-
-Fourth, all of the information we need about a sector (its mobility
-cost, who owns it, etc.) is stored in the sector file on disk. This
-means that the getsect() function to get it off disk will do a read(),
-which is VERY expensive. Because of the weak lower bound, A* ends up
-checking lots of sectors, including sectors that it's seen before.
-To avoid doing thousands of system calls per second, the bestpath.c file
-has sector caching routines that store in memory copies of every
-sector we read. (The function bp_getsect handles this). This means
-we never read a sector from disk more than once, although at the expense
-of using lots of memory.
-
-THEORY OF OPERATION
-
-The basic idea is as follows:
-
- 1. Add the start node to the head of a master queue.
- 2. Is the head of the queue where we want to be? If so, stop.
- 3. Make a list of all the neighbors coordinates around the
- coordinate of the head.
-
- 4. For each neighbor coordinate,
-
- Compute the lower bound cost to the destination from here.
-
- If this coordinate is already on the either the
- master queue or the "tried" queue:
-
- 4a. If it was on either the "master" queue or the
- "tried" queue and the one on the queue has a
- smaller lower-bound than the neighbor, ignore
- this neighbor and go on to the next neighbor.
-
- 4b. If it was on the "master" queue and the new
- neighbor has a smaller lower bound,
- Move the one on the queue to a different
- queue called "subsumed".
-
- 4c. If it was on the "tried" queue and the new
- neighbor has a smaller lower bound,
- Move the one on the "tried" queue to the
- master queue, and update its lower bound
- value to be as the new neighbor, and
- update its backpointer appropriately.
-
- We now have a list of all neighbor coordinates that are either
- not on the queue already, or are cheaper than the ones on the
- queue.
-
- 5. Move the node at the head of the queue (the one whose neighbors
- we now have in a list) onto a different queue called "tried".
-
- 6. Sort this list of neighbors, and merge it into the master queue,
- keeping the master queue ordered by lower bound cost to destination.
-
- 7. Goto 2.
-
-My algorithm does all of this, plus a little more (the above doesn't really
-mention backpointers except in passing), EXCEPT: I don't do step 4c.
-I'm not convinced that this can ever occur if the lower bound rule isn't
-broken, and I have yet to see it occur. However, if as_winnow returns -1,
-this error has occurred.