Use the new path finder for sea & air, drop bestownedpath()
bestownedpath() is a rather simple-minded breadth-first search. It's
slower than the new path finder, and maintaining it in addition to the
new path finder makes no sense.
This way, we compute all distribution paths from the same center in
one go, and thus fully exploit the fast multiple paths from same
source capability of Dijkstra's algorithm.
Sorting by dist center increases the average length of runs from 4.5
to 73 for my continental test case, and from 3 to 10 for my island
test case.
Compared to the commit before the previous one, distribution path
assembly runs more than 40 times faster for my continental test case,
and more than 5 times faster for my island test case.
The new path finder now runs my continental test case more than 30
times faster than the old A*, and the island test case more than 6
times, in a fraction of the memory. This makes the continental
updates run 3.5 times faster, and the island updates 6% faster.
Distribution path assembly no longer dominates the continental
update's run time: it takes less than 10% instead of more than 70%.
Exploit fast "multiple paths from same source" in distribution
Dijkstra's algorithm can find multiple paths from the same source.
This is much faster than starting from scratch for every path.
Make distribution path assembly work that way. This speeds up runs of
distributions to the same center. The next commit will reorder path
searches to maximize the length of these runs. It also has benchmark
results.
Allocates four bytes per sector, actually uses only the first 4*n
bytes, where n is the number of distributing sectors.
Use the new path finder for land paths, drop old A*
This gets rid of the memory leak mentioned in the previous commit.
To get rid of the buffer overruns for long paths mentioned in the
previous commit, make BestLandPath() fail when path length exceeds
1023 characters.
assemble_dist_paths() and move_ground() pass buffers with a different
size. Eliminate assemble_dist_paths()'s buffer. Update now works
regardless of distribution distance (the distribute command still
limits to 1023, to be fixed in a later commit). Enlarge
move_ground()'s buffers. Doubles the length of paths accepted by
explore, move, and transport.
I use two test cases to benchmark the path finders: "continental" (Hvy
Metal 2 updates) and "island" (Hvy Plastic 2 updates).
The new path finder runs my tests around 3-4 times faster than the old
A* without its caches. That's enough to meet its cached performance
for "island", but it's only half as fast for "continental". Not for
long; big speedups are coming.
We've been using Phil Lapsley's A* library to find land paths since
Chainsaw 3. It's reasonably general, and uses relatively complex data
structures to conserve memory. Unfortunately, it occasionally leaks a
bit of memory (see commit 86a187c0), and is unsafe for long paths (see
commit e30dc417).
To speed it up, v4.2.2 added two caches: the neighbor cache and the
path cache.
The neighbor cache attempts to speed up lookup of adjacent sectors.
It allocates 6 pointers per sector for that. In my tests, this is
more, sometimes much more memory than the A* library uses. See commit 7edcd3ea on branch old-astar for its (modest) performance impact.
The path cache attempts to speed up the update's computation of
distribution path costs. There, A* runs many times. Each run finds
many shortest paths, of which only the one asked for is returned. The
path cache saves all of them, so that when one of them is needed
later, we can get it from the path cache instead of running A* again.
The cache is quite effective, but a bit of a memory hog (see commit a02d3e9f on branch old-astar).
I'm pretty sure I could speed up the path cache even more by reducing
its excessive memory consumption --- why store paths when we're only
interested in cost? But that's a bad idea, because the path cache
itself is a bad idea.
Finding many shortest paths from the same source has a well-known
efficient and simple solution: Dijkstra's algorithm[*].
A* is an extension of Dijkstra's algorithm. It computes a *single*
path faster than Dijkstra's. But it can't compute *many* shortest
paths from the same source as efficiently as Dijkstra's.
I could try to modify Phil's code to make it compute many shortest
paths from the same source efficiently: turn A* into its special case
Dijkstra's algorithm (at least for distribution path assembly), then
generalize it to the many paths problem. Of course, I'd also have to
track down its memory allocation bugs, and make it safe for long
paths.
Instead, I'm replacing it. This commit is the first step: a rather
unsophisticated implementation of Dijkstra's algorithm specialized to
hex maps. It works with simple data structures: an array for the hex
map (16 bytes per sector), and a binary heap for the priority queue
(16 bytes per sector, most of it never touched). This is more memory
than Phil's A* uses, but much less than Phil's A* with v4.2.2's
caches.
[*] To fully exploit Dijkstra's "many paths" capability, we need to
compute distribution paths in distribution center order.
Mostly to measure its effectiveness. Compile with AS_NO_PATH_CACHE
defined to disable it.
Turns out the path cache is quite effective. For my continental test
case (Hvy Metal 2 updates), it reduces the number of searches by a
factor of 18.5, speeding up distribution path assembly by a factor of
7. The price is memory: it uses 135 times more memory than the A*
library. For my island test case (Hvy Plastic 2 updates), I get 4
times search reduction, 3.5 times faster distribution path assembly,
36 times more memory.
Fix when best_path() prints A* performance statistics
Print them when A* actually runs, not when best_path() finds a path.
Statistics for unsuccessful runs were lost, and old statistics were
printed for path cache hits.
Why upgrade? I'm not a lawyer, but here's my take on the differences
to version 2:
* Software patents: better protection against abuse of patents to
prevent users from exercising the rights under the GPL. I doubt
we'll get hit with a patent suit, but it's a good move just on
general principles.
* License compatibility: compatible with more free licenses, i.e. can
"steal" more free software for use in Empire. I don't expect to steal
much, but it's nice to have the option.
* Definition of "source code": modernization of some details for today's
networked world, to make it easier to distribute the software. Not
really relevant to us now, as we normally distribute full source code.
* Tivoization: this is about putting GPL-licensed software in hardware,
then make the hardware refuse to run modified software. "Neat" trick
to effectively deny its users their rights under the GPL. Abuse was
"pioneered" by TiVo (popular digital video recorders). GPLv3 forbids
it. Unlikely to become a problem for us.
* Internationalization: more careful wording, to harden the license
outside the US. The lawyers tell us it better be done that way.
* License violations: friendlier way to deal with license violations.
This has come out of past experience enforcing the GPL.
* Additional permissions: Probably not relevant to us.
Also include myself in the list of principal authors.
BestLandPath(), BestDistPath() and best_path() are unsafe by design:
they take a path[] argument without a size, and blindly assume there's
enough space. When that's wrong, bp_path() overruns the caller's
buffer.
move_ground() and assemble_dist_paths() provide space for 512
characters. best(), dist(), path(), att_reacting_units(), s_commod()
and do_unit_move() provide space for 1024 characters.
A malicious player can arrange paths longer than that, but it takes a
lot of land.
BestAirPath() and BestShipPath() also take a path[] argument without a
size, but they're actually safe: bestownedpath() writes at most 100
(MAXROUTE) characters, perform_mission_bomb() provides space for 512,
sorde(), getpath(), do_unit_move() and nav_ship() for 1024.
Optimize assemble_dist_paths() for foreign distribution center
You can't distribute to a foreign sector. This case is relatively
rare. However, unsuccessful path search is relatively expensive, and
the extra check doesn't really slow down the common case.
Fix assemble_dist_paths()'s recovery from invalid dist center
The recovery avoided crashing here, but left the path costs undefined.
If they happend to be non-negative, dodistribute() still crashed. Set
the costs to -1 to avoid that.
Speed up export cost calculation in assemble_dist_paths()
Import and export paths enter the same sectors, except for the last
one. Compute export cost from import cost instead of reverting the
import path. Do it in dodistribute(), so that we need to store only
import costs.
SAVE_FINISH_PATHS hasn't been used since 4.2.2, remove it
Since 4.2.2, assemble_dist_paths() stores a dummy path instead of the
real path to the dist center. That's possible because distribution
doesn't actually use the path, only whether it exists.
The code to store and free the real path is still around, under #ifdef
SAVE_FINISH_PATHS. Remove it.
Fix bitmap overruns when WORLD_X * WORLD_Y not a multiple of 16
World-sized bitmaps were allocated with size WORLD_SZ() / 8, which
expands to (WORLD_X * WORLD_Y / 2) / 8. The divisions truncate unless
WORLD_X * WORLD_Y is a multiple of 16. The bitmaps were one byte too
small then. Bitmap overruns happen when:
* A lookout looks at one of the last sectors of the sector file.
Besides commands look and llook, this affects navigate and march
sub-command 'l'.
* Command spy spies into one of the last sectors of the sector file.
* A map or nmap (but not a bmap) shows one of the last sectors of the
sector file, or a sector that can see one of the last sectors
(visual range is two sectors at 100% efficiency). Besides commands
lmap, map, nmap, pmap, smap, this affects move and transport
sub-command 'm'.
Diagnosed with valgrind.
Already broken in BSD Empire 1.1 (bitmaps were on the stack then).
We know player != other. Because we can have only one player in state
PS_PLAYING per country, and we know other->state == PS_PLAYING, it
follows that player->cnum != other->cnum. Thus, no functional change.
Adds another call to getnatp() hidden in relations_with(), though.
Keeping that optimized isn't worth it.
Use relations_with() for getrel(NP, THEM) where NP isn't THEM
Replacing getrel(NP, THEM), where NP is known to be getnatp(US), by
relations_with(US, THEM) makes a difference only when US equals THEM.
Replace in places where it's obvious that they're not equal.
Adds a few calls to getnatp() hidden in relations_with(). Keeping
that optimized isn't worth it.
Use relations_with() for US==THEM || getrel(NP, THEM)
Replace patterns like "US == THEM || getrel(NP, THEM)...", where NP is
known to be getnatp(US), by "relations_with(US, THEM)...". No
functional change.
Adds a few calls to getnatp() hidden in relations_with(), though.
Keeping that optimized isn't worth it.
This removes a special case for POGO (#0). Before, unoccupied sectors
were treated as "own or allied" for POGO, but not for other deities.
Impact on callers:
* BestAirPath() is not affected, because the change is only reachable
with a non-null bigmap argument.
* sorde() and nav_ship() pass a non-zero ship owner. sorde() ensures
that itself, and prod_ship() does it for nav_ship().
* unit_path() passes the player number when called with a ship
argument, i.e. in the navigate command. Player number is zero for
POGO. Since deities can't navigate foreign ships, this can happen
only when POGO navigates dead ships. Yes, that's possible, needs
fixing.
* getpath() passes the player number (zero for POGO) when called with
argument P_SAILING, i.e. by the sail command.
Thus, the change makes navigate's and sail's path finding work for
POGO exactly like it does for other deities. That's fine.
No functional change, even though this changes rel[intruder] from
NEUTRAL to ALLIED. Uses of rel[]:
* getilists() and ac_encounter() compare rel[cn] to HOSTILE. No
change, because NEUTRAL and ALLIED are both greater than HOSTILE.
* ac_encounter() compares rel[cn] to ALLIED, but only when cn !=
plane_owner. Because it passes plane_owner as argument for
getilists() parameter intruder, rel[cn] can't refer to the changed
element of rel[] here.
The new value of rel[plane_owner] permits simplifying cn ==
plane_owner || rel[cn] == ALLIED to just rel[cn] == ALLIED.
No functional change, because the change affects only
notified[victim], which isn't used in the loop around
notify_coastguard(), and gets overwritten before the interdiction fire
loop.
Use relations_with() for getrel(getnatp(US), THEM) where US!=THEM
Replacing getrel(getnatp(US), THEM) by relations_with(US, THEM) makes
a difference only when US equals THEM. Replace in places where it's
obvious that they're not equal.
Note: getsect() sets player->owner to "player is god or owns this
sector". Thus, after getsect(..., §), sect.sct_own ==
player->cnum implies player->owner. Conversely, !player->owner
implies sect.sct_own != player->cnum. Similarly for getship(),
getplane() and nxtitem().
Use relations_with() where its different value doesn't matter
Switching from getrel() to relations_with() can change the value from
NEUTRAL to ALLIED. The change doesn't matter when the value's only
compared to HOSTILE, as both old and new value are greater than
HOSTILE. Likewise for >= NEUTRAL.
Relations checking with getrel() often needs a special case for "is
same country". If you forget, you get behavior appropriate for a
neutral foreign country, which is usually very wrong (see commit 16c68eb4 for an example).
Unlike getrel(), relations_with() considers countries allied to
themselves. Less dangerous. In fact, allied behavior is typically
just right, so the special case isn't even needed.
Make share_bmap() do nothing for sharing with oneself
Before, it overwrote '?', '.', ' ' in the bmap with the capitalized
country letter, but only for sectors the player owns. Pretty
harmless, just weird. It can't happen currently, because sharebmap
with self fails with "does not have friendly relations towards you".
The second patch hunk fixes a latent bug. Before, rejected deity
flashes led to a bogus "not logged on" message, now they lead to a
"not accepting" message. But deity flashes can't be rejected, so this
doesn't matter.
sendmessage() checked NF_FLASH on two places: when deciding whether to
send the message, and later when telling the player why it didn't send
a flash. This can race with the toggle command as follows: if a flash
could not be sent because the recipient's NF_FLASH was off, and the
recipient toggled it on before the flag was checked again, the flash
command claimed the sender wasn't logged on.
Use feels_like_helping() in dosupport(), lnd_support()
feels_like_helping() case cn == foe is missing in the code it
replaces. No difference in behavior, because:
* cn == foe && cn == friend can't happen. Because you can't get into
ground combat against yourself (assault, attack and paradrop don't
let you), friend != foe for support.
* cn == foe && cn != friend behaves the same: no support.
feels_like_helping() returns 0 because of the explicit case. The
replaced code doesn't support because cn can't be at war with
itself.
Mission execution first builds lists of eligible units, one list per
country. These lists are passed to perform_mission() one by one,
where they get freed.
Bugs:
* unit_interdict() didn't pass the list for the submarine's owner, but
build_mission_list_type() built one. Any submarine movement within
own submarine interdiction mission op areas leaked memory.
* dosupport() passed only lists for countries that actually support
(ally at war with victim), but build_mission_list_type() built lists
for all countries hostile to the victim. Ground combat within
support mission op areas countries that are hostile to one of the
party without actually supporting the other leaked memory.
* perform_mission() failed to free missiles targeting units.
Fixing the latter is straightforward.
Fix the first two by deciding whether a country acts on a mission
trigger before building any lists, in ground_interdict(),
unit_interdict(), dosupport(). Remove the code dealing with that from
build_mission_list_type() and the loops around perform_mission().
lnd_mar() and lnd_mar_one_sector() take an actor argument.
Nevertheless, they sometimes used player->cnum. Fortunately, they are
the same: all callers pass current player for actor. Normalize to
actor for consistency.
Fix land unit attack mobility cost out of allied sectors
Land units pay a mobility penalty when marching into a non-old-owned
sector without sector mobility, to slow them down in newly taken
sectors. Attacking land units pay this penalty regardless of sector
mobility.
When attacking out of an allied sector, the penalty was computed as if
the land unit was owned by that ally. Attacking sectors old-owned by
that ally was too cheap, and taking back one's own was too expensive.
Broken since attacking land units pay the "newly taken" mobility
penalty: commit 2e693275, v4.3.6.
Fix attack when attacking sector gets taken by ally
When an attacking sector got lost while the player was at a prompt,
and the new owner was allied to the player, the server got confused:
1. If the sector attacked with mil, the server let the ghost mil
attack, but not occupy.
2. If the sector was allied, the server reported the sector loss and
land units dropping out of the attack, but claimed the lost sector was
yours.
Fix 1. by dropping sectors from attack when they change owner away
from the player, regardless of relations. Side effect: also drops any
surviving land units there. Before, they dropped out only if the new
owner wasn't allied to the player. That change's okay.
* The check whether the attacker old-owns the attacked sector is
broken, because att_abort() uses sect.sct_oldown uninitialized.
Spotted by the Clang Static Analyzer.
* Its implementation in setrel() is somewhat scary. It's actually
okay, because that part of setrel() only runs within decl(). Other
callers don't reach it: update_main() because player->god != 0
there, and the rest because they never pass a rel < HOSTILE.
* Documentation is a bit vague.
SLOW_WAR hasn't been used in a public game in years. Fixing it is not
worth it, so remove it instead.
The previous commit's message claims the race can lead to duplicated
output, use after free, then double-free. That's correct only up to
the use after free. There is no double-free.
Heap corruption (double-free?) has been observed in Changeling,
though. Player logged in (still in sanctuary), map #, crashed within
removecc()'s free(io->data). Partial backtrace:
raise () from /lib64/libc.so.6
abort () from /lib64/libc.so.6
__libc_message () from /lib64/libc.so.6
malloc_printerr () from /lib64/libc.so.6
removecc (ioq=0x251fd10, cc=468) at ../src/lib/gen/ioqueue.c:350
ioq_dequeue (ioq=0x251fd10, cc=468) at ../src/lib/gen/ioqueue.c:135
io_output (iop=0x251fc90, wait=1) at ../src/lib/empthread/io.c:231
recvclient (cmd=0x258d8e0 "", size=1024) at ../src/lib/player/recvclient.c:82
getcommand (combufp=0x2557068 "map #1") at ../src/lib/player/empdis.c:84
I haven't been able to reproduce.
To hopefully catch ioqueue going south earlier, make ioq_dequeue()
oops when it can't dequeue as many bytes as requested.
Fix race in io_output() that can lead to double-free
Move call of ioq_makeiov() to its use, because calling it before
empth_select() is racy, as follows.
Player thread flushes output by calling io_output(player->iop, 1).
io_output() sets up iov[] to point to queued output. empth_select()
blocks on output.
Another thread sends a C_FLASH or C_INFORM message to this player.
This calls io_output(p->iop, 0). The output file descriptor has
become writable since the player thread blocked on it, so some output
gets written and dequeued.
The player thread resumes, writes out iov[] and dequeues. Any output
already written by the other thread gets duplicated. If the other
thread's dequeue operation freed struct io buffers, there's use after
free followed by double-free.
Don't write garbage to unused trade destination in trade file
struct trdstr members trd_x, trd_y are used only for teleporting
trades. For others, trad() wrote garbage coordinates to the trade
file. They weren't used except by xdump. Fortunately, even there
they're visible only to deities.
Write invalid coordinates instead. Do that in set() as well, so that
coordinates are valid only when we have a teleport destination.