Oops when a stale copy is written back, i.e. the processor was yielded
since the copy was made. Such bugs are difficult to spot. Sequence
numbers catch them when they do actual harm (they also catch other
bugs). Generation numbers catch them even when they do no harm.
New ef_generation to count generations. Call new ef_make_stale() to
increment it whenever the processor may be yielded.
New struct emptypedstr member generation. To conserve space, make it
a bit-field of twelve bits, i.e. generations are only recorded modulo
2^12. Make sure all members of union empobj_storage share it. It is
only used in copies; its value on disk and in the cache is
meaningless. Copies with generation other than ef_generation are
stale. Stale copies that are a multiple of 2^12 generations old can't
be detected, but that is sufficiently improbable.
Set generation to ef_generation by calling new ef_mark_fresh() when
making copies in ef_read() and ef_blank(). nav_ship() and
fltp_to_list() make copies without going through ef_read(), and
therefore need to call ef_mark_fresh() as well. Also call it in
obj_changed() to make check_sect_ok() & friends freshen their argument
when it is unchanged.
New must_be_fresh() oopses when its argument is stale. Call it in
ef_write() to catch write back of stale copies.
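The generation scheme described above can be sketched roughly as follows. This is a minimal illustration, not the actual Empire code: struct obj_hdr is a hypothetical stand-in for struct emptypedstr, the function signatures are simplified assumptions, and must_be_fresh() here prints a message and returns a status instead of oopsing.

```c
#include <assert.h>
#include <stdio.h>

#define GEN_BITS 12
#define GEN_MASK ((1u << GEN_BITS) - 1)

/* Hypothetical stand-in for struct emptypedstr; the real one is larger. */
struct obj_hdr {
    unsigned generation : GEN_BITS;     /* only meaningful in copies */
    int value;
};

static unsigned ef_generation;          /* current generation, modulo 2^12 */

/* Call whenever the processor may be yielded: all copies become stale. */
static void ef_make_stale(void)
{
    ef_generation = (ef_generation + 1) & GEN_MASK;
}

/* Call when making a copy: stamp it with the current generation. */
static void ef_mark_fresh(struct obj_hdr *copy)
{
    copy->generation = ef_generation;
}

/* Return 1 if the copy is fresh; else complain (stand-in for oops), return 0. */
static int must_be_fresh(const struct obj_hdr *copy)
{
    if (copy->generation != ef_generation) {
        fprintf(stderr, "oops: stale copy (generation %u, current %u)\n",
                (unsigned)copy->generation, ef_generation);
        return 0;
    }
    return 1;
}
```

A copy stamped before ef_make_stale() fails the check afterwards. A copy that is exactly 2^12 generations old passes it again, which is the undetectable case mentioned above.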
Player threads may only sleep under certain conditions. In
particular, they must not sleep while a command is being aborted by
the update or shutdown.
io.c should not know about that. Yet io_output_all() does, because it
needs to give up when update or shutdown interrupts it. The function
was introduced in Empire 2, but it didn't give up then. Fixed in
commit a7fa7dee, v4.2.22. The fix dragged unwanted knowledge of
command abortion into io.c.
To clean up this mess, io_output_all() has to go.
First user is io_write(). io_write() automatically flushes the queue.
In wait-mode, it calls io_output_all() when the queue is longer than
the bufsize, to attempt flushing the queue completely. In
no-wait-mode, it calls io_output() every bufsize bytes. Except the
test for that is screwy, so it actually misses some of the flush
conditions.
The automatic flush makes io_write() differ from io_gets(), which is
ugly. It wasn't present in BSD Empire 1.1. Remove it again, dropping
io_write()'s last argument.
Flush the queue in its callers pr_player() and upr_player() instead.
Provide new io_output_if_queue_long() for them. Requires new struct
iop member last_out to keep track of queue growth. pr_player() and
upr_player() call it repeatedly until it makes no more progress. This
flushes a bit less eagerly in wait-mode, and a bit more eagerly in
no-wait-mode.
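One way to picture the new flushing rule is the simulation below. It assumes that last_out records the queue length right after the most recent output attempt, and that a flush is attempted once the queue has grown by more than bufsize since then; both are assumptions about the design, not taken from the source. struct iop_sketch, the sink_room field and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of an output queue. */
struct iop_sketch {
    size_t queued;      /* bytes currently in the output queue */
    size_t last_out;    /* queue length after the last output attempt */
    size_t bufsize;     /* size of one queue buffer */
    size_t sink_room;   /* simulation: bytes the socket takes per write */
};

/* Simulated output: write what the sink accepts, remember queue length. */
static int sim_output(struct iop_sketch *iop)
{
    size_t n = iop->queued < iop->sink_room ? iop->queued : iop->sink_room;

    iop->queued -= n;
    iop->last_out = iop->queued;
    return (int)n;
}

/* Attempt output only if the queue grew by more than bufsize. */
static int sim_output_if_queue_long(struct iop_sketch *iop)
{
    if (iop->queued > iop->last_out + iop->bufsize)
        return sim_output(iop);
    return 0;
}

/* Caller side: repeat until no more progress is made. */
static size_t sim_flush_long_queue(struct iop_sketch *iop)
{
    size_t total = 0;
    int n;

    while ((n = sim_output_if_queue_long(iop)) > 0)
        total += (size_t)n;
    return total;
}
```

In this model a slow sink stops the loop after one write, because the queue has not grown since that write. The next flush attempt happens only after another bufsize bytes have been queued.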
Second user is recvclient(). It needs to flush the queue before
potentially sleeping in io_input(). Do that with a simple loop around
io_output(). No functional change there.
Return number of bytes written on success, -1 on error. In
particular, return zero when nothing was written because the queue was
empty, or because the write slept and got woken up, or because the
write refused to sleep.
Before, it instead returned the number of bytes remaining to be
written when empth_select() failed, when it was woken up from sleep, or
when it refused to sleep. You couldn't tell from the return value whether
the call made progress writing out the queue.
The current callers don't actually notice the change.
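The return value convention can be illustrated with a small stand-alone sketch. It uses plain write() on a file descriptor rather than the server's queue and writev() machinery; struct outq_sketch and both functions are hypothetical. The point is that a positive result always means progress, so a caller can drain the queue with a simple loop.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical miniature output queue. */
struct outq_sketch {
    char buf[256];
    size_t len;
};

/*
 * Return number of bytes written on success, -1 on error.
 * Return zero when the queue is empty or the write would block.
 */
static int output_sketch(int fd, struct outq_sketch *q)
{
    ssize_t n;

    if (q->len == 0)
        return 0;
    n = write(fd, q->buf, q->len);
    if (n < 0) {
        if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
            return 0;           /* no progress, but not an error */
        return -1;
    }
    /* Drop the bytes that were written from the front of the queue. */
    memmove(q->buf, q->buf + n, q->len - (size_t)n);
    q->len -= (size_t)n;
    return (int)n;
}

/* Write until no more progress; return the last result (0 or -1). */
static int flush_sketch(int fd, struct outq_sketch *q)
{
    int n;

    while ((n = output_sketch(fd, q)) > 0)
        ;
    return n;
}
```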
Don't set IO_EOF when writev() returns zero. I don't think this could
happen, but it's wrong anyway, because a short write should not stop
future reads.
The blocking I/O option makes no sense in the server, because it
blocks the server process instead of the thread. In fact, it's been
unused since Empire 2, except for one place, where it was used
incorrectly; that use got removed in the previous commit.
Make I/O non-blocking in io_open() unconditionally. Remove IO_NBLOCK
and io_noblocking().
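For reference, making a descriptor non-blocking on a POSIX system usually looks like the sketch below. The helper name is hypothetical, and how io_open() actually does this (including on WIN32) is not shown in the source.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Set O_NONBLOCK on fd, keeping its other flags. Return 0 or -1. */
static int set_nonblocking_sketch(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;
    return 0;
}
```

With the flag set, a read() on an empty pipe or socket fails with EAGAIN instead of putting the whole process to sleep, which leaves the decision to sleep to the thread package.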
Chainsaw used IO_NEWSOCK together with the notify callback to make the iop
data type usable for sockets it listened on, so that io_select() could
multiplex them along with the sockets used for actual I/O.
io_select() became unused in Empire 2, and finally got removed in
commit 875d72a0, v4.2.13. That made IO_NEWSOCK and the notify
callback defunct. The latter got removed in commit 7d5a6b81, v4.3.1.
pthread.c's empth_select() returned -1 when empth_wakeup() interrupted
select(). The failure then got propagated all the way up, and the
player got logged out. Fix by returning 0 in that case. While there,
retry on EINTR, to match LWP. Also clarify comments.
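The intended behavior can be approximated as follows. This is only a sketch: it assumes the wakeup is delivered as a signal whose handler sets a flag, which may not match how pthread.c actually implements empth_wakeup(). The flag, the function name and the by-value timeout are hypothetical. A retry restarts the full timeout, which is a simplification.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical flag, set by the signal handler of a deliberate wakeup. */
static volatile sig_atomic_t woken_up_sketch;

/*
 * Wait until fd is readable or timeout_sec seconds have passed.
 * Return 1 if readable, 0 on timeout or wakeup, -1 on error.
 */
static int wait_readable_sketch(int fd, long timeout_sec)
{
    for (;;) {
        fd_set readfds;
        struct timeval tv;
        int n;

        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);
        tv.tv_sec = timeout_sec;
        tv.tv_usec = 0;
        n = select(fd + 1, &readfds, NULL, NULL, &tv);
        if (n >= 0)
            return n > 0;
        if (errno != EINTR)
            return -1;          /* genuine failure */
        if (woken_up_sketch) {
            woken_up_sketch = 0;
            return 0;           /* deliberate wakeup is not an error */
        }
        /* Interrupted by an unrelated signal: retry. */
    }
}
```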
Commit 08b94556 introduced the timeout parameter. The empthread
implementation could change it, at least on some systems, and its user
worked around a possible change. However, that behavior was not
documented, and it's inconvenient. Fix the pthread implementation,
and remove the workaround.
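POSIX allows select() to modify its timeout argument, and on Linux it does. A common way to keep that from leaking to the caller is to pass select() a private copy, as in this sketch. The function name is hypothetical, and the actual fix in pthread.c may be structured differently.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>             /* pipe(), handy when trying this out */

/*
 * Wait for fd to become readable without touching *timeout.
 * A null timeout means wait indefinitely. Returns what select() returns.
 */
static int select_keep_timeout_sketch(int fd, const struct timeval *timeout)
{
    fd_set readfds;
    struct timeval tv, *tvp = NULL;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    if (timeout) {
        tv = *timeout;          /* select() may clobber tv, not *timeout */
        tvp = &tv;
    }
    return select(fd + 1, &readfds, NULL, NULL, tvp);
}
```

The caller can then reuse the same struct timeval for every call without resetting it first.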
Remove the KillIdle thread. Add timeout to struct iop, initialized in
io_open(). Obey it in io_input() by passing it to empth_select(). If
empth_select() times out, report that back through io_input() to
recvclient() and player_login(). If player_login() receives a timeout
indication, print a message and terminate the session. If
recvclient() receives a timeout indication, flash a message to the
player and initiate a shutdown of the player's session.
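A rough model of input with an idle timeout is sketched below. The result codes and the function name are hypothetical; the source does not say how io_input() reports a timeout to its callers, only that it does. The timeout is taken by value so that select() cannot change the caller's copy.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical result codes, distinct from a byte count and from EOF (0). */
enum {
    INPUT_ERROR_SKETCH = -1,
    INPUT_TIMEOUT_SKETCH = -2
};

/*
 * Read up to size bytes from fd, waiting at most timeout for input.
 * Return bytes read, 0 on end of file, or one of the codes above.
 */
static int input_sketch(int fd, char *buf, size_t size,
                        struct timeval timeout)
{
    fd_set readfds;
    ssize_t res;
    int n;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    n = select(fd + 1, &readfds, NULL, NULL, &timeout);
    if (n < 0)
        return INPUT_ERROR_SKETCH;
    if (n == 0)
        return INPUT_TIMEOUT_SKETCH;    /* idle for too long */
    res = read(fd, buf, size);
    return res < 0 ? INPUT_ERROR_SKETCH : (int)res;
}
```

A caller in the role of recvclient() or player_login() would treat the timeout code as the cue to notify the player and end the session, rather than as an I/O error.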
Create WIN32 sys/time.h to define struct timeval. This creates some
conflicts with WIN32 windows.h definitions. Including windows.h in
show.c and info.c creates conflicts, so remove those includes. Modify
service.c to include sys/socket.h instead of windows.h to remove the
conflict with sys/time.h.
Move stuff to untangle the ugly cyclic dependencies between the
archives built for selected subdirectories of src/lib/:
* Move common/io.c to empthread/ because it requires empthread stuff
* Move parts of subs/nstr.c to common/nstreval.c to satisfy
  common/ef_verify.o
* Move getstarg.c getstring.c onearg.c from gen/ to subs/ because they
  require stuff from there
* Move bridgefall.c check.c damage.c empobj.c journal.c maps.c
  sectdamage.c from common/ to subs/ because they require stuff from
  there
* Move cnumb.c from subs/ to common/ to satisfy common/type.o
* Move log.c fsize.c from common/ to gen/ because they really belong
  there
* Move emp_config.c mapdist.c from gen/ to common/ because they really
  belong there, and require stuff from libglobal.a
Also package as/ as libas.a to satisfy common/path.o.
Remaining dependencies:
    lib             needs
    --------------------------------------------
    libas.a         libglobal.a
    libcommon.a     libas.a libglobal.a libgen.a
    libgen.a
    libglobal.a
    liblwp.a        libgen.a
    libw32.a[*]     libgen.a
    [*] Except for service.o, which can only be linked into the server
Link order now: liblwp.a libcommon.a libas.a libgen.a libglobal.a
libw32.a. The position of libw32.a is not quite right, but works
anyway.