Markus Armbruster [Thu, 27 Dec 2012 13:43:43 +0000 (14:43 +0100)]
Make news item merging deterministic and safe for year 2038
News reporting merges news items into recent items with same contents.
For that purpose, we keep a small cache of recent items. When a new
item can't be merged into an item in the cache, the oldest item gets
evicted to make space for the new one.
ncache() evicts the first item with the smallest timestamp (struct
nwsstr member nws_when). Timestamps are in seconds, therefore clashes
are common, and eviction depends on exact timing. Such indeterminism
can make the smoke test fail.
Moreover, ncache() assumes timestamps cannot exceed 0x7fffffff. If
they do, it always evicts the slot 0. They will in 2038.
Fix by evicting round robin. This always evicts the oldest item.
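The round-robin eviction described above can be sketched as follows. This is a minimal illustration, not the actual ncache() code: the struct, the cache size, and the names news_cache_add() and news_cache_times() are all assumptions made up for the example.

```c
#include <assert.h>

#define NEWS_CACHE_SIZE 4       /* hypothetical size, for illustration */

/* Simplified stand-in for a cached news item, not the real struct nwsstr */
struct cached_news {
    int actor, verb, victim;    /* contents compared when merging */
    int times;                  /* how often the item was reported */
    int used;                   /* slot holds an item */
};

static struct cached_news news_cache[NEWS_CACHE_SIZE];
static int next_evict;          /* slot to reuse next, advances round robin */

/*
 * Merge into a matching cached item, or else replace the next slot in
 * round-robin order.  Returns the slot that now holds the item.
 */
int news_cache_add(int actor, int verb, int victim)
{
    struct cached_news *np;
    int i;

    for (i = 0; i < NEWS_CACHE_SIZE; i++) {
        np = &news_cache[i];
        if (np->used && np->actor == actor && np->verb == verb
            && np->victim == victim) {
            np->times++;        /* merged, slot order is unchanged */
            return i;
        }
    }

    /* No match: the slot filled longest ago is the next one in turn */
    i = next_evict;
    next_evict = (next_evict + 1) % NEWS_CACHE_SIZE;
    np = &news_cache[i];
    np->actor = actor;
    np->verb = verb;
    np->victim = victim;
    np->times = 1;
    np->used = 1;
    return i;
}

/* Number of times the item in SLOT has been reported */
int news_cache_times(int slot)
{
    assert(slot >= 0 && slot < NEWS_CACHE_SIZE);
    return news_cache[slot].times;
}
```

Because no timestamp is consulted, the slot chosen depends only on the order of insertions, so neither clock resolution nor the year 2038 can influence it.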
Markus Armbruster [Thu, 27 Dec 2012 13:00:00 +0000 (14:00 +0100)]
Make smoke test's plane build more robust
The airfield is a sector taken from player 8. How many updates it
takes to convert is highly variable. If it converts late, the
airfield may not be constructed in time. This is currently the case
for me.
Move the airfield to a more dependable sector.
For me, the smoke test now fails frequently, because of differences in
news. To be fixed next.
Markus Armbruster [Tue, 18 Dec 2012 20:57:45 +0000 (21:57 +0100)]
Remove fairland from smoke test
Import xdump instead. To decouple the smoke test from future fairland
changes that result in different worlds.
Markus Armbruster [Mon, 17 Dec 2012 20:45:49 +0000 (21:45 +0100)]
Add fairland test to make check
Markus Armbruster [Mon, 17 Dec 2012 20:44:45 +0000 (21:44 +0100)]
Add files test to make check
Markus Armbruster [Mon, 17 Dec 2012 20:36:53 +0000 (21:36 +0100)]
Factor tests/test-common.sh out of tests/smoke-test
For reuse by future tests.
Markus Armbruster [Sun, 12 Aug 2012 18:54:34 +0000 (20:54 +0200)]
Make smoke test check the final empdump -x
Markus Armbruster [Sat, 14 Jul 2012 14:38:15 +0000 (16:38 +0200)]
Get rid of shell boilerplate in smoke test Empire batch files
Markus Armbruster [Sat, 14 Jul 2012 06:46:38 +0000 (08:46 +0200)]
New make target check
Just a smoke test so far, extracted from src/scripts/nightly/. This
makes the existing smoke test more easily accessible. Noteworthy
differences:
* Instead of patching the code to make output more stable, postprocess
the output to normalize it.
* Compare actual results to expected results instead of the previous
test run's results.
* Much faster. The old test harness used sleep liberally to "ensure"
things always happen in the same order.
Known shortcomings:
* The smoke test hangs when the server fails to complete startup, or
fails to terminate.
* Normalization of xdump hardcodes columns instead of getting them
from xdump meta.
* Normalization of time values in xdump is an ugly hack.
* xdump meta column type isn't normalized. Actual values can vary
between systems, because the width of enumeration types is
implementation-defined. The smoke test works only when they're
represented as int, which is the case on common systems.
* Currently expected to work only with thread package LWP and a
random() that behaves exactly like the one on my development system,
because:
- Thread scheduling is reliably deterministic only with LWP
- The PRN sequence produced by random() isn't portable
- Shell builtin kill appears not to do the job in MinGW
- The Windows server tries to run as service when -d isn't
specified
Further work is needed to address these shortcomings.
Getting C programs to behave exactly the same on all systems is hard.
We'll likely run into system-dependent differences that upset the
smoke test. Floating-point computation seems particularly vulnerable.
Instead of updating src/scripts/nightly/ to use "make check", retire
it. It hasn't been used in quite a while. Investing more into our
homegrown auto-builder doesn't make sense, as canned auto-builders
such as Travis CI and Jenkins are readily available.
The shell scripts src/scripts/nightly/tests/?? become Empire batch
files tests/smoke/. The shell scripts are actually shell boilerplate
around Empire batch files. To make sure git recognizes the move, this
commit moves them unchanged. tests/smoke-test strips the boilerplate
before it feeds the batch files to the client. The next commit will
get rid of that.
Markus Armbruster [Sun, 5 Aug 2012 15:03:08 +0000 (17:03 +0200)]
Don't put file descriptor values in thread names
The names are logged. Logging file descriptor values gets in the way
of regression testing, such as the smoke test that'll be committed
shortly.
Markus Armbruster [Sun, 5 Aug 2012 15:00:07 +0000 (17:00 +0200)]
Don't log threads initialization
Markus Armbruster [Tue, 8 Jan 2013 19:00:09 +0000 (20:00 +0100)]
Simplify lnd_take_casualty()'s land unit retreat code
Bonus: avoids "may be used uninitialized" compiler warnings (the code
was safe despite the warning).
Markus Armbruster [Tue, 8 Jan 2013 17:54:20 +0000 (18:54 +0100)]
Really fix accepting connections from "long" IPv6 address
Commit ee01ac19 (v4.3.23) enlarged player member hostaddr from 32 to
46 characters, but missed natstr member nat_hostaddr. player_main()
copies hostaddr to nat_hostaddr. Can overrun the destination, but
fortunately just into nat_hostname.
Impact:
* Can make praddr() print only a suffix of the address. Used by play
command, for player messages during login and logout, and for
logging.
* Can make player_main()'s test for "same address as last time" fail,
causing extra "Last connection" messages.
* Matching against econfig key privip is not affected.
* Journal event login is not affected.
Markus Armbruster [Tue, 8 Jan 2013 17:02:07 +0000 (18:02 +0100)]
Simplify head_meanwhile()
No functional change.
Markus Armbruster [Tue, 8 Jan 2013 16:49:30 +0000 (17:49 +0100)]
Take ship cost into account when picking missile interdiction target
Due to a typo, shp_missile_interdiction() picks the admissible target
with highest efficiency instead of the one with highest efficiency *
build cost.
Broken in commit cd8d7423, v4.3.8.
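The intended selection rule can be sketched like this. It is only an illustration under assumptions: struct ship_target and pick_interdiction_target() are hypothetical names, and the real shp_missile_interdiction() works on ship lists rather than a plain array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of a candidate target */
struct ship_target {
    int effic;          /* efficiency in percent */
    int cost;           /* build cost */
    int admissible;     /* nonzero if the ship may be targeted */
};

/*
 * Return the index of the admissible target with the highest
 * efficiency * build cost, or -1 if there is none.
 */
int pick_interdiction_target(const struct ship_target *targets, int n)
{
    int i, best = -1;
    long value, best_value = -1;

    assert(n == 0 || targets != NULL);
    for (i = 0; i < n; i++) {
        if (!targets[i].admissible)
            continue;
        /* The typo amounted to comparing effic alone here */
        value = (long)targets[i].effic * targets[i].cost;
        if (value > best_value) {
            best_value = value;
            best = i;
        }
    }
    return best;
}
```

With this rule a damaged but expensive ship can outrank a pristine cheap one, which is the behavior the commit restores.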
Markus Armbruster [Tue, 8 Jan 2013 16:38:42 +0000 (17:38 +0100)]
Really fix setsector and setres not to wipe out concurrent updates
setsector() and setres() continue after check_sect_ok() fails.
Clobbers the updates that made check_sect_ok() fail, triggering a
seqno mismatch oops.
Commit 04a332a8 (v4.3.27) claimed to fix this, but actually only
suppressed the generation oops.
Markus Armbruster [Tue, 8 Jan 2013 16:37:37 +0000 (17:37 +0100)]
Really fix give not to wipe out concurrent updates
give() continues after check_sect_ok() fails. Clobbers the updates
that made check_sect_ok() fail, triggering a seqno mismatch oops.
Commit b58c37e2 (v4.3.27) claimed to fix this, but actually only
suppressed the generation oops.
Markus Armbruster [Tue, 8 Jan 2013 14:59:59 +0000 (15:59 +0100)]
Drop resnoise()'s second parameter
All callers pass the same argument.
Markus Armbruster [Tue, 8 Jan 2013 14:52:27 +0000 (15:52 +0100)]
Fix setsector not to disclose number of landmines to occupier
When the deity sets the number of mines with setsector, the sector
owner (if any) is told the resulting number of mines. Even for
occupied sectors, where mines belong to the old owner, and thus
shouldn't be disclosed. Oops.
Fix setsector not to tell the sector owner anything then.
Markus Armbruster [Tue, 8 Jan 2013 13:43:01 +0000 (14:43 +0100)]
anti, give, grind take <SECTS> argument, fix their documentation
Markus Armbruster [Tue, 8 Jan 2013 13:41:30 +0000 (14:41 +0100)]
Make capital fail more nicely when sector is unsuitable
The command fails without an explanation then. Change it to print
something like "X,Y is not a capital or mountain owned by you."
Markus Armbruster [Tue, 8 Jan 2013 13:26:19 +0000 (14:26 +0100)]
Change capital to take a single sector as argument
Capital takes a <SECTS> argument, and picks the first suitable sector
it finds there. It fails if none can be found, or if the first one
found already is the capital (even when more suitable sectors follow).
Has always worked that way, but never documented.
I don't think the search feature is really useful, and documenting it
isn't worth my while. Change the command to take a <SECT> argument
instead, as documented.
Markus Armbruster [Tue, 8 Jan 2013 12:39:17 +0000 (13:39 +0100)]
Update copyright notice
Markus Armbruster [Sat, 11 Aug 2012 15:21:48 +0000 (17:21 +0200)]
Change GODNEWS reports not to affect headlines and relations
Option GODNEWS controls news reports give's N_GIFT, N_TAKE, and edit's
and setsector's N_AIDS, N_HURTS.
They affect news headlines because of their non-zero r_good_will.
N_TAKE and N_HURTS can downgrade relations because of their negative
r_good_will. All tolerable, except N_TAKE has actor and victim
reversed: the deity running the give command is the victim, and the
sector owner is the actor. Because of that, give with a negative
amount downgrades the deity's relations towards the sector owner.
Inappropriate.
Has always been that way. Chainsaw disabled these news at
compile-time; to enable you had to define GODNEWS (not documented
anywhere). Empire 4.2.0 made GODNEWS a proper option, enabled by
default.
Fix by setting their r_good_will to zero.
Markus Armbruster [Sat, 11 Aug 2012 14:46:25 +0000 (16:46 +0200)]
Fix flying commands for destination equal to assembly point
bomb, drop, fly, paradrop, recon and sweep fail when given a
destination sector equal to the assembly point. Broken in commit
404a76f7, v4.3.27. Reported by Tom Johnson.
Before that commit, getpath() returned NULL on error, "" when input is
an empty path, "h" when it's coordinates of the assembly point, and a
non-empty path otherwise.
The commit accidentally changed it to return "" instead of "h".
Instead of changing it back, make it return NULL when input is an
empty path, and change bomb() & friends to accept empty flight paths.
This also affects sail: it now fails when you give it an empty path,
just like bomb & friends. Path "h" still works.
Markus Armbruster [Sat, 11 Aug 2012 13:36:07 +0000 (15:36 +0200)]
Fix portability bug in configure test for Windows API
The test uses an erroneous non-directive within #ifdef _WIN32 to
signal that _WIN32 is defined. Some compilers choke on this even when
_WIN32 isn't defined. Observed with FreeBSD 4.10's gcc 2.95.4.
Broken in commit c02468fd, v4.3.22. Standalone client build already
broken in commit 774b590f, v4.3.17.
Use an unmatched brace instead.
Markus Armbruster [Sun, 5 Aug 2012 07:23:11 +0000 (09:23 +0200)]
Polish empthread documentation somewhat
Markus Armbruster [Sat, 4 Aug 2012 18:15:15 +0000 (20:15 +0200)]
Open journal before daemonizing, so we can fail in foreground
Just like we open server.log. Also permits calling journal_prng()
right where we seed the PRNG.
Markus Armbruster [Sat, 4 Aug 2012 18:07:39 +0000 (20:07 +0200)]
Permit empth_self() before empth_init()
Next commit wants this.
Markus Armbruster [Sat, 4 Aug 2012 10:08:20 +0000 (12:08 +0200)]
Fix headline pasto in info lookout
Already present in BSD Empire 1.1. Reported by Harald Katzer.
Markus Armbruster [Sun, 1 Jul 2012 12:51:45 +0000 (14:51 +0200)]
Forbid selling conquered populace
Only relevant when the deity allows selling civilians by customizing
table item, which is probably a bad idea.
Markus Armbruster [Sun, 1 Jul 2012 12:33:49 +0000 (14:33 +0200)]
Forbid selling units with unsalable cargo, permit selling military
Deities can customize which commodities can be sold in table item.
Default is to allow anything but civilians and military. However,
this applies only to the commodity market, not to the unit market:
cargo of ships and land units is not restricted.
Make the two markets consistent: permit selling military by default,
forbid selling units carrying unsalable commodities. This outlaws
selling units carrying civilians by default.
Markus Armbruster [Sun, 1 Jul 2012 11:06:23 +0000 (13:06 +0200)]
Drop trade_desc()'s first argument
Markus Armbruster [Sun, 1 Jul 2012 11:03:05 +0000 (13:03 +0200)]
Drop unclean assignments in trade_desc()
Assigning to tp->trd_owner is unclean. Can be dropped safely, because
it has no effect: prior check_trade() drops all trades where the
assignment would change anything.
Markus Armbruster [Sun, 1 Jul 2012 10:21:58 +0000 (12:21 +0200)]
Clean up use of union empobj_storage * as parameter type
Use it only for functions that assign objects through a pointer
parameter. Anything else can and should use struct empobj *.
Markus Armbruster [Sun, 1 Jul 2012 10:13:36 +0000 (12:13 +0200)]
Replace trade_check_item_ok() by check_obj_ok()
Relaxes the sanity check of the argument's ef_type. Could be avoided,
but not worth the bother.
Markus Armbruster [Sun, 1 Jul 2012 10:06:37 +0000 (12:06 +0200)]
Factor check_obj_ok() out of the check_*_ok()
Markus Armbruster [Sun, 1 Jul 2012 09:53:53 +0000 (11:53 +0200)]
New ef_nameof_pretty()
Markus Armbruster [Sun, 1 Jul 2012 08:48:03 +0000 (10:48 +0200)]
Drop obj_changed() parameter sz
Get size from empfile[] instead.
Markus Armbruster [Sun, 1 Jul 2012 08:00:35 +0000 (10:00 +0200)]
Fix obj_changed() to check object exists
Relatively harmless, because these kinds of objects don't go away.
Markus Armbruster [Sat, 30 Jun 2012 19:24:36 +0000 (21:24 +0200)]
Scrapping ships and land units now spreads the plague
Markus Armbruster [Sat, 30 Jun 2012 19:21:49 +0000 (21:21 +0200)]
Don't let scrap give away civilians
Scrapping unloads everything. Even stuff that unload can't: foreign
civilians. Kill them off instead, like scuttle does.
Markus Armbruster [Sat, 30 Jun 2012 15:09:24 +0000 (17:09 +0200)]
Pilots and air cargo now spread the plague
Planes flying one-way with crew or cargo spread plague from their old
base to their new base. Planes dropping cargo spread plague from
their base to the drop's target sector.
Markus Armbruster [Sat, 30 Jun 2012 14:06:54 +0000 (16:06 +0200)]
Clarify info Plague slightly
Markus Armbruster [Sat, 30 Jun 2012 13:34:44 +0000 (15:34 +0200)]
Streamline plist initialization
msl_equip(), find_escorts() and perform_mission() memset() the plist,
then assign to all members but load. Just zero load instead, like
getilists(), msl_sel() and pln_sel() do.
Markus Armbruster [Sat, 30 Jun 2012 13:32:32 +0000 (15:32 +0200)]
Initialize struct plist member queue properly in msl_equip()
Harmless, because queue isn't actually used. Clean it up anyway.
Markus Armbruster [Mon, 25 Jun 2012 05:48:21 +0000 (07:48 +0200)]
scripts: Use mailx rather than mail, and drop bogus -e
Markus Armbruster [Sat, 23 Jun 2012 19:29:15 +0000 (21:29 +0200)]
Don't let fly and drop give away civilians
Flying them to a foreign destination magically changes their
allegiance. Prohibit that.
Equivalent change was already in commit 35887222 (v4.2.17) but got
reverted immediately (commit 20199b22), because fly and drop should
stay consistent with load, which let you give away civilians then. No
more since commit 92a366ce (v4.3.20). This change makes fly and drop
consistent with load again.
Markus Armbruster [Sat, 23 Jun 2012 18:49:48 +0000 (20:49 +0200)]
Replace pln_oneway_to_carrier_ok() by pln_can_land_on_carrier()
Avoids reading the target ship again.
Markus Armbruster [Sat, 23 Jun 2012 18:36:48 +0000 (20:36 +0200)]
Replace pln_onewaymission() by pln_where_to_land()
New function reads and returns target sector/ship. Avoids reading the
target sector unnecessarily. Callers receive the target ship, not
just its number. Next commit will put it to use.
Markus Armbruster [Sat, 23 Jun 2012 15:48:19 +0000 (17:48 +0200)]
Fix fly to permit flying civs to a carrier in an occupied sector
Broken in commit 35887222, v4.2.17.
Markus Armbruster [Sat, 23 Jun 2012 15:10:48 +0000 (17:10 +0200)]
Drop could_be_on_ship()'s load count parameters
Just one caller wants them. Inline that call, and simplify the
others.
Markus Armbruster [Sat, 23 Jun 2012 14:46:11 +0000 (16:46 +0200)]
Inline fit_plane_on_ship() and fit_plane_on_land()
Just one call site each.
Markus Armbruster [Sat, 23 Jun 2012 14:38:04 +0000 (16:38 +0200)]
Don't let planes fly to a carrier without sufficient space
We test whether the carrier has space for each plane individually
instead of whether it has space for all of them. The planes that fit
land, the others abort and get teleported home. Abusable.
pln_oneway_to_carrier_ok() was created in commit 1127762c (v4.2.17) to
fix almost the same bug. It worked fine then, because
fit_plane_on_ship() worked with load counters, and incremented them.
Broken in commit 3e370da5 (v4.3.17), which made fit_plane_on_ship()
count the loaded planes, to permit the removal of load counters. But
unlike load counters, loaded planes don't change during
pln_oneway_to_carrier_ok(). Thus, each plane is checked individually.
Fix by tallying all the planes before checking for space.
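The difference between the broken per-plane check and the tallying fix can be sketched as below. This is a toy model under assumptions: struct carrier_model and both function names are hypothetical, and the real code distinguishes plane types and carrier capabilities.

```c
#include <assert.h>

/* Hypothetical, simplified carrier: one kind of plane slot only */
struct carrier_model {
    int capacity;       /* planes the carrier can hold */
    int loaded;         /* planes already on board */
};

/*
 * Broken approach: every plane is checked against the same, unchanged
 * load, so each one is effectively asked "is there room for one more?"
 */
int each_plane_fits(const struct carrier_model *cp, int nplanes)
{
    int i;

    assert(cp && nplanes >= 0);
    for (i = 0; i < nplanes; i++) {
        if (cp->loaded + 1 > cp->capacity)
            return 0;
    }
    return 1;
}

/* Fixed approach: tally all planes first, then check for space once */
int all_planes_fit(const struct carrier_model *cp, int nplanes)
{
    assert(cp && nplanes >= 0);
    return cp->loaded + nplanes <= cp->capacity;
}
```

For a carrier with one free slot and two incoming planes, the first function accepts the flight and the second rejects it.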
Markus Armbruster [Sat, 23 Jun 2012 14:28:59 +0000 (16:28 +0200)]
Factor inc_shp_nplane() out of could_be_on_ship()
Markus Armbruster [Sat, 23 Jun 2012 14:20:20 +0000 (16:20 +0200)]
Factor ship_can_carry() out of could_be_on_ship()
Markus Armbruster [Mon, 11 Jun 2012 18:18:08 +0000 (20:18 +0200)]
Fix tend to refuse tending civilians to foreign ships
Broken when Chainsaw 2 added tending to allies.
Markus Armbruster [Mon, 11 Jun 2012 17:48:36 +0000 (19:48 +0200)]
Fix tend not to leak which commodities are loaded on friendlies
Tending a negative number of commodities takes from the target ships.
The target ships must be owned. Tend complains when the target
doesn't have the commodity loaded. It does that even for friendly
foreign ships. Don't.
Broken when Chainsaw 2 added tending to allies.
Markus Armbruster [Mon, 11 Jun 2012 17:39:23 +0000 (19:39 +0200)]
Fix tend from target not to stop on foreign target
Tending a negative number of commodities takes from the target ships.
When a target ship is foreign, tend silently stops. This is wrong.
Fix it to skip foreign target ships instead.
Broken when Chainsaw 2 added tending to allies.
Markus Armbruster [Mon, 11 Jun 2012 15:13:55 +0000 (17:13 +0200)]
Let march sub-command 'm' sweep own and allied landmines
Markus Armbruster [Mon, 11 Jun 2012 15:02:53 +0000 (17:02 +0200)]
Land units no longer sweep allied landmines
They don't hit them since commit fe372539, v4.3.27. Sweeping was
forgotten then.
Closes #717591.
Markus Armbruster [Sun, 10 Jun 2012 15:36:07 +0000 (17:36 +0200)]
Fix info bdes on funny designation arguments
Quoting "?" was accidentally fixed in commit 90631d56, v4.3.11.
Update documentation accordingly.
Closes #736592.
Markus Armbruster [Sun, 10 Jun 2012 15:08:06 +0000 (17:08 +0200)]
Fix bmap commands not to parse empty flags argument as "revert"
Broken in commit a00f9e20, v4.3.27.
Markus Armbruster [Sun, 10 Jun 2012 08:52:22 +0000 (10:52 +0200)]
Update copyright notice
Markus Armbruster [Sun, 10 Jun 2012 08:42:17 +0000 (10:42 +0200)]
Bump version to 4.3.31
Markus Armbruster [Tue, 22 May 2012 18:35:52 +0000 (20:35 +0200)]
Update change log again for 4.3.30
Markus Armbruster [Tue, 22 May 2012 18:17:16 +0000 (20:17 +0200)]
Disable damage to base when missile explodes on launch
When a missile explodes on launch, it has a 33% chance to damage its
base.
Unfortunately, damaging the base breaks callers that call msl_launch()
for each member of a list of missiles created by msl_sel() or
perform_mission(). Damage to the base can damage other missiles
there. Any copies of them in the list become stale. When
msl_launch() modifies and writes back such a stale copy, the damage
gets wiped out, triggering a seqno oops.
Affects missile interdiction and interception using missiles with
non-zero load. Stock game's ABMs have zero load, so interception is
safe there. Relatively harmless in practice. Broken in Empire 2.
Instead of fixing the bug, simply disable damage to the base for now.
Markus Armbruster [Fri, 18 May 2012 14:29:19 +0000 (16:29 +0200)]
Fix march not to wipe out concurrent updates
March code reads the land units into a land unit list, and writes them
back when it changes them, e.g. when a land unit stops. If a land
unit changes in the land unit file while it is in such a land unit
list, the copy in the land unit list becomes stale, and must not be
used.
To that end, do_unit_move() calls lnd_mar() after prompting for path
or destination. lnd_mar() re-reads all the land units.
Unfortunately, it still writes back stale copies in certain
circumstances. Known ways to trigger such writes:
* Deity loads land unit onto a ship or land unit
* Land unit's crew killed just right, e.g. by collateral damage from
interdiction, followed by additional updates, such as shell fire
damage
* Sector no longer owned or allied, e.g. allied sector captured by an
enemy (own sector would kill or retreat the land unit)
Writing a stale copy wipes out the updates that made the copy stale,
and triggers a seqno mismatch oops. For instance, damage that follows
killing of all crew by collateral damage from interdiction is wiped
out. If no damage follows, we still get a generation oops.
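Why writing back a stale copy loses updates can be shown with a toy model of sequence numbers. Everything here is an assumption for illustration: struct unit_model, unit_read(), unit_write() and unit_file_damage() are hypothetical, and this sketch refuses a stale write outright, whereas the server described above logs a seqno mismatch oops.

```c
#include <assert.h>

/* Hypothetical, simplified object with a sequence number */
struct unit_model {
    unsigned seqno;     /* bumped on every successful write */
    int damage;
};

static struct unit_model unit_file;     /* authoritative copy "on file" */

/* Take a fresh copy of the object on file */
void unit_read(struct unit_model *copy)
{
    assert(copy);
    *copy = unit_file;
}

/*
 * Write COPY back.  Returns 0 on success, -1 if COPY is stale, i.e.
 * somebody else wrote the object since COPY was read.
 */
int unit_write(const struct unit_model *copy)
{
    assert(copy);
    if (copy->seqno != unit_file.seqno)
        return -1;      /* would clobber a concurrent update */
    unit_file = *copy;
    unit_file.seqno++;
    return 0;
}

/* Damage recorded in the copy on file */
int unit_file_damage(void)
{
    return unit_file.damage;
}
```

A caller holding an old copy has to re-read before changing and writing it, which is what lnd_mar() and shp_nav() are meant to guarantee for the unit lists.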
Markus Armbruster [Thu, 17 May 2012 18:33:34 +0000 (20:33 +0200)]
Fix navigate not to wipe out concurrent updates
Navigation code reads the ships into a ship list, and writes them back
when it changes them, e.g. when a ship stops. If a ship changes in
the ship file while it is in such a ship list, the copy in the ship
list becomes stale, and must not be used.
To that end, do_unit_move() calls shp_nav() after prompting for path
or destination. shp_nav() re-reads all the ships. Unfortunately, it
still writes back stale copies in certain circumstances. Known ways
to trigger such writes:
* Deity sets a sail path
* Ship's crew gone, e.g. killed by shell fire
* Sector no longer navigable, e.g. harbor shelled down, or bridge
built
Writing a stale copy wipes out the updates that made the copy stale,
and triggers a seqno mismatch oops. For instance, ship damage that
kills all crew while the ship is being navigated gets wiped out.
Ron Koenderink [Fri, 11 May 2012 02:59:50 +0000 (20:59 -0600)]
Fix Windows build: gettimeofday() and SHUT_WR missing
Commit 904822e3 introduced use of SHUT_WR, which Windows calls
SD_SEND. Add the obvious work-around.
Commit 49ae6a7b introduced use of gettimeofday(), which the Microsoft
CRT lacks. Add a replacement based on _ftime_s().
Markus Armbruster [Sat, 5 May 2012 14:18:14 +0000 (16:18 +0200)]
Update change log again for 4.3.30
Markus Armbruster [Sat, 5 May 2012 12:16:00 +0000 (14:16 +0200)]
Fix buffer overruns in fairland for island size zero
Fairland creates islands with size 1 + random() % (2 * is - 1), where
"is" is either chosen by the user (fourth command line argument) or
defaults to half the continent size (second command line argument).
Negative values are silently replaced by zero.
Not only does value zero make no sense, it also breaks the code: the
island size is always one then (because random() % -1 is zero), but
allocate_memory() provides only space for zero sectors in sectx[],
secty[] and sectc[]. This leads to buffer overruns in try_to_grow(),
find_coast(), elevate_land(), set_coastal_flags(). Can smash the heap.
Fix by changing the lower bound from zero to one. Diagnosed with
valgrind. Has always been broken.
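The size formula and the new lower bound can be sketched as follows. The function names are hypothetical, and the random number is passed in as a parameter (instead of calling random()) purely to keep the example deterministic.

```c
#include <assert.h>

/* Replace values below one by one, as the fix does for "is" */
int clamp_island_size(int is)
{
    return is < 1 ? 1 : is;
}

/*
 * Island size for a given non-negative random number RND:
 * 1 + RND % (2 * is - 1), which lies in 1 .. 2 * is - 1.
 * Requires is >= 1; with is == 0 the modulus would be -1.
 */
long island_size(long rnd, int is)
{
    assert(rnd >= 0 && is >= 1);
    return 1 + rnd % (2 * is - 1);
}
```

With "is" equal to one the modulus is one, so every island has exactly one sector, and storage sized from "is" is no longer too small for it.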
Markus Armbruster [Sat, 5 May 2012 11:46:15 +0000 (13:46 +0200)]
Fix an out-of-bounds subscript in fairland
elevate_land() tests for capital sector in three places. The third
one is broken: half of the test is done even for islands, subscripting
capx[] and possibly capy[] out of bounds. This could screw up
elevation (unlikely) or crash (even less likely). Diagnosed with
valgrind.
Broken since the test was added in Chainsaw 3.12. Parentheses were
added blindly in 4.0.11 to shut up the compiler. Reindentation (commits
9b7adfbe and ef383c06, v4.2.13) made the bug stand out more, but it
still managed to hide in the general ugliness of fairland's code.
Markus Armbruster [Sat, 5 May 2012 07:17:00 +0000 (09:17 +0200)]
Fix typo in change log
Markus Armbruster [Sun, 29 Apr 2012 18:29:04 +0000 (20:29 +0200)]
Update change log again for 4.3.30
Markus Armbruster [Sun, 29 Apr 2012 16:50:26 +0000 (18:50 +0200)]
Start the makefile's dependency section with a comment
Just to separate it visually from the preceding section.
Markus Armbruster [Sun, 29 Apr 2012 10:36:54 +0000 (12:36 +0200)]
Journal login before changing the player thread's name
The journal logs a thread name for each event. The player thread name
changes on entry to the playing phase. Connecting old and new name
isn't as easy as it should be:
Sun Apr 29 12:13:39 2012 Conn29 input coun POGO
Sun Apr 29 12:13:39 2012 Conn29 input pass peter
Sun Apr 29 12:13:39 2012 Conn29 input play
Sun Apr 29 12:13:39 2012 Play#0 login 0 127.0.0.1 armbru
Sun Apr 29 12:15:39 2012 Play#0 logout 0
To connect Conn29 with Play#0, you have to know that country#0 is
named POGO.
Fix that by logging login before the thread name change:
Sun Apr 29 12:17:41 2012 Conn29 input coun POGO
Sun Apr 29 12:17:41 2012 Conn29 input pass peter
Sun Apr 29 12:17:41 2012 Conn29 input play
Sun Apr 29 12:17:41 2012 Conn29 login 0 127.0.0.1 armbru
Sun Apr 29 12:19:41 2012 Play#0 logout 0
Now "Conn29 login 0" makes the connection obvious.
This involves moving journal_login() from player_main() before
empth_set_name() in its caller play_cmd(). Move journal_logout() as
well, for symmetry.
If player_main() fails, we now log login/logout instead of nothing in
the journal. That's okay. Note that before commit c9f21c0e (v4.3.8),
we logged just login then.
Markus Armbruster [Sun, 29 Apr 2012 07:58:51 +0000 (09:58 +0200)]
Fix arm to require nuke and plane to be in the same sector
It happily arms a plane with a remote nuke. The nuke gets teleported
to the plane when the plane moves (a two-way sortie doesn't count as
move). Broken in 4.3.3. Reported by Harald Katzer.
Markus Armbruster [Thu, 26 Apr 2012 18:15:48 +0000 (20:15 +0200)]
Update change log for 4.3.30
Markus Armbruster [Wed, 28 Mar 2012 16:29:04 +0000 (18:29 +0200)]
Document login_grace_time and the shutdown phase properly
Markus Armbruster [Wed, 28 Mar 2012 16:24:46 +0000 (18:24 +0200)]
Don't send "idle connection terminated" in login phase
Message was introduced in commit 08b94556, v4.3.20. Revert this
change, because it's undocumented, and probably not useful for
clients.
Markus Armbruster [Tue, 27 Mar 2012 18:06:35 +0000 (20:06 +0200)]
Rename play_lock back to update_lock
It was renamed to play_lock because it synchronized not just updates
but also shutdown. Since the previous commit, it again only
synchronizes updates. Rename it back.
Also move its initialization next to shutdown_lock's.
Markus Armbruster [Tue, 27 Mar 2012 17:58:31 +0000 (19:58 +0200)]
Fix synchronization between shutdown and player threads
shutdwn() sets the EOF indicator, aborts the running command, if any,
forbids sleeping on I/O and wakes up the player thread, for all player
threads in state PS_PLAYING. It takes play_lock to prevent new
commands from running. It then waits up to 3s for player threads to
terminate, by polling player_next(), to let output buffers drain.
Issues:
1. Polling is lame.
2. New player threads can still enter state PS_PLAYING. They'll block
as soon as they try to run a command. Somewhat unclean.
3. We can exit before all player threads left state PS_PLAYING, losing
a treasury update, play time update, and log entries. Could happen
when player threads blocked on output until commit 90b3abc5 fixed
that; its commit message describes the bug's impact in more detail.
Since then, the bug shouldn't bite in practice, because player
threads should leave state PS_PLAYING quickly.
Fix by introducing shutdown_lock: player threads in state PS_PLAYING
hold it shared, shutdwn() takes it exclusive, instead of play_lock.
Takes care of the issues as follows:
3. shutdwn() waits until all player threads left state PS_PLAYING, no
matter how long it takes them.
2. New player threads block before entering state PS_PLAYING.
1. shutdwn() still polls up to 3s for player threads to terminate.
Still lame. Left for another day.
Markus Armbruster [Tue, 27 Mar 2012 17:21:38 +0000 (19:21 +0200)]
Start player thread shutdown grace time at shutdwn() entry
Before, it was started after all commands aborted. Shouldn't make a
difference in practice, as command abortion is supposed to be quick.
Markus Armbruster [Sun, 18 Mar 2012 19:23:18 +0000 (20:23 +0100)]
Clean up superfluous includes
Markus Armbruster [Sun, 18 Mar 2012 18:20:21 +0000 (19:20 +0100)]
Belatedly update convert's c_form
Commit 82c91665 (v4.3.16) removed its optional third argument without
updating c_form.
Markus Armbruster [Sun, 18 Mar 2012 18:09:29 +0000 (19:09 +0100)]
Document the header for empmod.c and trdsub.c in prototypes.h
Markus Armbruster [Sun, 18 Mar 2012 18:07:47 +0000 (19:07 +0100)]
Document execute()'s subtle use of player->aborted
Markus Armbruster [Sun, 18 Mar 2012 18:04:21 +0000 (19:04 +0100)]
io_shutdown() is now unused, remove
Markus Armbruster [Sun, 18 Mar 2012 17:30:39 +0000 (18:30 +0100)]
Change login command kill to kill less ruthlessly
The victim's connection closes without any explanation. Output may be
lost. This is because kill_cmd() kills by calling io_shutdown(),
which shuts down the socket and drains the I/O queues.
How this makes the victim's thread terminate is a bit subtle: shutting
down the socket makes it ready. If the victim's thread is waiting for
I/O, it wakes up. Since all further reads return EOF, and all further
writes fail, the command terminates quickly (short of infinite loop
bugs), then the command loop, and finally the thread.
To make kill behave more nicely, change kill_cmd() to work exactly
like server shutdown: send a flash message to the victim, set his EOF
indicator, abort the command, forbid sleeping on I/O, wake up the
victim's thread. Just as reliable, but doesn't lose output.
If the victim's client fails to close his connection, the victim's
thread may still linger in state PS_SHUTDOWN for up to
login_grace_time (default 120s). An attacker could try to use that to
make the server run out of file descriptors or memory, but simply
connecting achieves the same effect more cheaply.
Markus Armbruster [Sun, 18 Mar 2012 17:24:51 +0000 (18:24 +0100)]
Separate max_idle_visitor from max_idle
Cut it to 5 minutes, from max_idle's 15.
Since max_idle now applies only to authenticated players, increasing
it is perfectly safe.
Markus Armbruster [Sun, 18 Mar 2012 17:11:35 +0000 (18:11 +0100)]
Separate login_grace_time from max_idle
max_idle applies in state PS_PLAYING, login_grace_time before (login,
state PS_INIT) and after (logout, state PS_SHUTDOWN).
Cut login_grace_time to two minutes, from max_idle's 15. Two minutes
is plenty to complete login and logout. Makes swamping the server
with connections slightly harder, as they get dropped faster. While
that makes sense all by itself, the real aim is making increasing
max_idle safe. The next commit will complete that job.
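Which limit applies in which state can be sketched like this. The state names come from the text above; the function idle_limit_secs(), its parameters, and the choice of units (max_idle in minutes, login_grace_time in seconds) are assumptions for the example.

```c
#include <assert.h>

/* Player states named in the text above */
enum conn_state { PS_INIT, PS_PLAYING, PS_SHUTDOWN };

/*
 * Seconds a connection may stay idle in STATE.  max_idle governs
 * PS_PLAYING, login_grace_time governs login (PS_INIT) and logout
 * (PS_SHUTDOWN).
 */
int idle_limit_secs(enum conn_state state,
                    int max_idle_mins, int login_grace_secs)
{
    assert(max_idle_mins >= 0 && login_grace_secs >= 0);
    return state == PS_PLAYING ? max_idle_mins * 60 : login_grace_secs;
}
```

With the values mentioned above (15 minutes and two minutes), a connection that never gets past login is dropped after 120 seconds instead of 900.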
Markus Armbruster [Wed, 14 Mar 2012 19:22:17 +0000 (20:22 +0100)]
Fix unwanted player thread blocking on output during shutdown
shutdwn() disables blocking on I/O for all player threads in state
PS_PLAYING, by setting struct player member may_sleep to
PLAYER_SLEEP_NEVER. This ensures the player threads complete logout
quickly and reliably. A thread may still block on I/O in io_close()
called from player_delete(), since commit 904822e3, but that's okay,
because it happens after all game state updates.
Bug: if shutdwn() aborts a command, the player thread returns through
dispatch(), which resets may_sleep back to PLAYER_SLEEP_FREELY. Input
can't block regardless, because the EOF indicator is set, but output
can. When it happens, the player thread may not complete logout
before shutdwn() terminates the process.
This can make us lose a treasury update (similar to the bug fixed by
commit bdc1c40f; the relevant bug description is in commit note
6f8ca87f), play time update, and log entries.
How? There are two paths from dispatch() to player_delete(). Here's
the first one:
1. command()
   Doesn't print since dispatch() returns 0 when it resets may_sleep
2. player_main()
   Loop and call status()
3. status()
   If the command set dolcost to a non-trivial amount, print it
   Charge dolcost
   If player went broke or became solvent, notify him
   Charge time used
   Return 0, because shutdwn() set the EOF indicator
4. player_main()
   Break the loop
   Charge time used
   print Bye-bye
   journal.log the logout
5. play_cmd()
   server.log the logout
6. player_login()
   Loop
   Try to flush output
   get EOF, break loop
   print so long
   call player_delete()
Ways the bug can bite:
A. When we block in 4. print Bye-bye, we can fail to log.
B. When we block in 3. print broke/solvent notification, we can
additionally fail to charge time used.
C. When we block in 3. print dolcost, we can additionally fail to
charge dolcost.
Note: B. and C. couldn't happen before commit bdc1c40f. Instead,
something just like C happened always, whether player thread blocked
or not.
The second path:
1. execute()
   Loop and call status()
2. status()
   As above
3. execute()
   break the loop
4. dispatch()
   Continue with the first path
No additional ways to bite.
Fix by avoiding the may_sleep reset when the player thread is on its
way to terminate: may not sleep and has its EOF indicator set.
Broken in commit 0a4d77e9, v4.3.23.
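The condition guarding the may_sleep reset can be sketched as below. PLAYER_SLEEP_NEVER and PLAYER_SLEEP_FREELY are the values named above; struct player_model, its eof member, and finish_command() are hypothetical stand-ins for the real struct player and dispatch().

```c
#include <assert.h>

/* Only the two values mentioned in the text; the real enum may have more */
enum player_sleep { PLAYER_SLEEP_NEVER, PLAYER_SLEEP_FREELY };

/* Hypothetical, simplified stand-in for struct player */
struct player_model {
    enum player_sleep may_sleep;
    int eof;            /* EOF indicator, set by shutdown */
};

/*
 * Called when a command returns.  Normally lets the thread sleep on
 * I/O again, but leaves may_sleep alone when the thread is on its way
 * to terminate: it may not sleep and has its EOF indicator set.
 */
void finish_command(struct player_model *p)
{
    assert(p);
    if (p->may_sleep == PLAYER_SLEEP_NEVER && p->eof)
        return;         /* shutting down, output must not block */
    p->may_sleep = PLAYER_SLEEP_FREELY;
}
```

Keeping PLAYER_SLEEP_NEVER in place means the prints on the way to player_delete() cannot block, so the charges and log entries listed above still happen.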
Markus Armbruster [Sun, 11 Mar 2012 14:07:48 +0000 (15:07 +0100)]
Fix pr_player() and upr_player() to obey max_idle
The output queue flush can block indefinitely. Permits a client to
hog the thread indefinitely by not reading output.
Broken in commit 08b94556 (v4.3.20) "Reimplement max_idle without a
separate thread". Until then, the idle thread aborted a stuck attempt
to flush output.
Denial of service seems possible.
Markus Armbruster [Sun, 11 Mar 2012 13:56:41 +0000 (14:56 +0100)]
Factor player_output_some() out of pr_player(), upr_player()
Markus Armbruster [Sun, 11 Mar 2012 11:35:18 +0000 (12:35 +0100)]
Fix recvclient() to obey max_idle for output, too
recvclient() flushes the output queue before receiving input. The
receive obeys max_idle, the flush doesn't.
Broken in commit 08b94556 (v4.3.20) "Reimplement max_idle without a
separate thread". Until then, the idle thread aborted a stuck attempt
to flush output.
Markus Armbruster [Sun, 11 Mar 2012 11:32:17 +0000 (12:32 +0100)]
Clean up how recvclient() deals with command abortion
We must not block in io_input() after command abortion unblocked
io_output(). Instead of checking player->aborted, compute the
deadline according to player->may_sleep, like we do for io_output().
Markus Armbruster [Sun, 11 Mar 2012 10:49:46 +0000 (11:49 +0100)]
Fix player_login() to obey max_idle for output, too
player_login() flushes the output queue before receiving input. The
receive obeys max_idle, the flush doesn't. Which means a client could
hog the thread indefinitely.
Broken in commit 08b94556 (v4.3.20) "Reimplement max_idle without a
separate thread". Until then, the idle thread aborted a stuck attempt
to flush output.
Denial of service seems possible.
Markus Armbruster [Sun, 11 Mar 2012 10:43:38 +0000 (11:43 +0100)]
Flush all output before reading a login command, not just some
Before, a client could theoretically make the output queue grow
without bounds.
Markus Armbruster [Sun, 11 Mar 2012 09:06:04 +0000 (10:06 +0100)]
Fix io_close() to obey deadline for output, too
A client can delay thread exit indefinitely by not reading output.
Broken in commit 08b94556 (v4.3.20) "Reimplement max_idle without a
separate thread". Until then, the idle thread aborted a stuck attempt
to flush output.
Denial of service seems possible.
Note that commit 904822e3 moved flushing the output queue from
player_login() to io_close(). It also made io_close() wait for the
client to close the connection. That wait obeys the deadline.