Move call of ioq_makeiov() to its use, because calling it before
empth_select() is racy, as follows.
Player thread flushes output by calling io_output(player->iop, 1).
io_output() sets up iov[] to point to queued output. empth_select()
blocks on output.
Another thread sends a C_FLASH or C_INFORM message to this player.
This calls io_output(p->iop, 0). The output file descriptor has
become writable since the player thread blocked on it, so some output
gets written and dequeued.
The player thread resumes, writes out iov[] and dequeues. Any output
already written by the other thread gets duplicated. If the other
thread's dequeue operation freed struct io buffers, there's use after
free followed by double-free.
Oops when a stale copy is written back, i.e. the processor was yielded
since the copy was made. Such bugs are difficult to spot. Sequence
numbers catch them when they do actual harm (they also catch different
bugs). Generation numbers catch them even when they don't.
New ef_generation to count generations. Call new ef_make_stale() to
increment it whenever the processor may be yielded.
New struct emptypedstr member generation. To conserve space, make it
a bit-field of twelve bits, i.e. generations are only recorded modulo
2^12. Make sure all members of unit empobj_storage share it. It is
only used in copies; its value on disk and in the cache is
meaningless. Copies with generation other than ef_generation are
stale. Stale copies that are a multiple of 2^12 generations old can't
be detected, but that is sufficiently improbable.
Set generation to ef_generation by calling new ef_mark_fresh() when
making copies in ef_read() and ef_blank(). nav_ship() and
fltp_to_list() make copies without going through ef_read(), and
therefore need to call ef_mark_fresh() as well. Also call it in
obj_changed() to make check_sect_ok() & friends freshen their argument
when it is unchanged.
New must_be_fresh() oopses when its argument is stale. Call it in
ef_write() to catch write back of stale copies.
Unlike POSIX sockets, Windows sockets are not file descriptors, but
"OS handles", with a completely separate set of functions.
However, Windows can create a file descriptor for a socket, and return
a file descriptor's underlying handle. Use that instead of wrapping
our own file descriptors around Windows file descriptors and sockets.
Remove the wrapping machinery: MAX_FDS, enum fdmap_io_type, struct
fdmap, fdmap[], nfd, get_fd(), free_fd(), set_fd(), lookup_handle(),
lookup_fd().
Rewrite SOCKET_FUNCTION(), posix_accept(), posix_socket(),
posix_close(), ftruncate(), posix_open(), posix_read(), posix_write(),
fcntl().
Remove FILE_FUNCTION(), posix_fstat(), posix_lseek(),
SHARED_FUNCTION(), and fileno(), because the system's functions now
work fine.
posix_fsync() is used only #ifdef _WIN32, remove it, and call
_commit() directly.
The old code stuffed WSA error codes into errno, which doesn't work.
Use new w32_set_winsock_errno() to retrieve, convert & stuff into
errno. Adapt inet_ntop() to set the WSA error code instead of errno,
so it can use w32_set_winsock_errno().
Move EWOULDBLOCK from sys/socket.h to w32misc.h, and drop unused
ENOTSOCK, EAFNOSUPPORT.
Use SOCKET rather than int in Windows-specific code.
pthread.c's empth_select() returned 1 instead of 0 when empth_wakeup()
interrupted select(). This made io_input() attempt to read input,
which failed with WSAEWOULDBLOCK. The failure then got propagated all
the way up, and the player got logged out. Fix by returning 0 in that
case.
Player threads may only sleep under certain conditions. In
particular, they must not sleep while a command is being aborted by
the update or shutdown.
io.c should not know about that. Yet io_output_all() does, because it
needs to give up when update or shutdown interrupt it. The function
was introduced in Empire 2, but it didn't give up then. Fixed in
commit a7fa7dee, v4.2.22. The fix dragged unwanted knowledge of
command abortion into io.c.
To clean up this mess, io_output_all() has to go.
First user is io_write(). io_write() automatically flushes the queue.
In wait-mode, it calls io_output_all() when the queue is longer than
the bufsize, to attempt flushing the queue completely. In
no-wait-mode, it calls io_output() every bufsize bytes. Except the
test for that is screwy, so it actually misses some of the flush
conditions.
The automatic flush makes io_write() differ from io_gets(), which is
ugly. It wasn't present in BSD Empire 1.1. Remove it again, dropping
io_write()'s last argument.
Flush the queue in its callers pr_player() and upr_player() instead.
Provide new io_output_if_queue_long() for them. Requires new struct
iop member last_out to keep track of queue growth. pr_player() and
upr_player() call repeatedly until it makes no more progress. This
flushes a bit less eagerly in wait-mode, and a bit more eagerly in
non-wait mode.
Second user is recvclient(). It needs to flush the queue before
potentially sleeping in io_input(). Do that with a simple loop around
io_output(). No functional change there.
LWP and Windows implementations already did that. Rewrite the
pthreads implementation.
The write-bias makes the stupid play_wrlock_wanted busy wait in
dispatch() unnecessary. Remove it.
Return number of bytes written on success, -1 on error. In
particular, return zero when nothing was written because the queue was
empty, or because the write slept and got woken up, or because the
write refused to sleep.
Before, it instead returned the number of bytes remaining to be
written when empth_select() failed, when woken up from sleep, or
refusing to sleep. You couldn't tell from the return value whether
the call made progress writing out the queue.
The current callers don't actually notice the change.
Don't set IO_EOF when writev() returns zero. I don't think this could
happen, but it's wrong anyway, because a short write should not stop
future reads.
The blocking I/O option makes no sense in the server, because it
blocks the server process instead of the thread. In fact, it's been
unused since Empire 2, except for one place, where it was used
incorrectly, and got removed in the previous commit.
Make I/O non-blocking in io_open() unconditionally. Remove IO_NBLOCK
and io_noblocking().
Chainsaw used this together with the notify callback to make the iop
data type usable for sockets it listened on, so that io_select() could
multiplex them along with the sockets used for actual I/O.
io_select() became unused in Empire 2, and finally got removed in
commit 875d72a0, v4.2.13. That made the IO_NEWSOCK and the notify
callback defunct. The latter got removed in commit 7d5a6b81, v4.3.1.
Calculation of sleep duration suffered integer underflow for unsigned
time_t and arguments in the past. This made empth_sleep() sleep for
"a few" years instead of not at all.
Should be more portable to modern systems and could be less portable
to obsolete systems than the traditional sys/time.h sys/types.h
unistd.h incantation.
pthread.c's empth_select() returned -1 when empth_wakeup() interrupted
select(). The failure then got propagated all the way up, and the
player got logged out. Fix by returning 0 in that case. While there,
retry on EINTR, to match LWP. Also clarify comments.
Unused since 4.3.10. Can be used safely only in very special
circumstances.
Removal allows simplifying pthread.c and ntthread.c some. liblwp left
alone.
EMPTH_POSIX and EMPTH_W32 implementations rejected values other than a
single flag. Such values aren't used now, but it violates the
contract all the same.
Commit 08b94556 introduced the timeout parameter. The empthread
implementation could change it, at least on some systems, and its user
worked around a possible change. However, that behavior was not
documented, and it's inconvenient. Fix the pthread implementation,
and remove the workaround.
Remove the KillIdle thread. Add timeout to struct iop, initialized in
io_open(). Obey it in io_input() by passing it to empth_select(). If
empth_select() times out, report that back through io_input() to
recvclient() and player_login(). If player_login() receives a timeout
indication, print a message and terminate the session. If
recvclient() receives a timeout indication, flash a message to the
player and initiate a shut down the player's session.
Create WIN32 sys/time.h to define struct timeval. This creates some
conflicts with WIN32 windows.h definitions. Including windows.h in
show.c and info.c creates conflicts, so remove that. Modify service.c
to include sys/socket.h instead of windows.h to remove the conflict
with sys/time.h.
Dynamically allocate the string space for thread
name in WIN32 (ntthread.c) thread space. This
makes the WIN32 more consistent with the other
environments. It also addresses WIN32 issue of
the print width of 17 being used and only having
space for 16 characters in the fixed allocation.
The WIN32 version did not block when the sleep was
already reached by the time empth_sleep() did the
time remaining calculation. The other versions
of empth_sleep() do always yield.
This change fixes a bug where the threads were not treated
fairly. Before the fix, empth_yield would only yield to
threads already waiting hThreadMutex mutex. It would not
yield to other threads ready to run from the release of other
mutexes. An example of this is that the update task did
not start when force command was issued from script sequence.
Move stuff to untangle the ugly cyclic dependencies between the
archives built for selected subdirectories of src/lib/:
* Move common/io.c to empthread/ because it requires empthread stuff
* Move parts of subs/nstr.c to common/nstreval.c to satisfy
common/ef_verify.o
* Move getstarg.c getstring.c onearg.c from gen/ to subs/ because they
require stuff from there
* Move bridgefall.c check.c damage.c empobj.c journal.c maps.c
sectdamage.c from common/ to subs/ because they require stuff from
there
* Move cnumb.c from subs/ to common/ to satisfy common/type.o
* Move log.c fsize.c from common/ to gen/ because they really belong
there
* Move emp_config.c mapdist.c from gen/ to common/ because they really
belong there, and require stuff from libglobal.a
Also package as/ as libas.a to satisfy common/path.o.
Remaining dependencies:
lib needs
--------------------------------------------
libas.a libglobal.a
libcommon.a libas.a libglobal.a libgen.a
libgen.a
libglobal.a
liblwp.a libgen.a
libw32.a[*] libgen.a
[*] Except for service.o, which can only be linked into the server
Link order now: liblwp.a libcommon.a libas.a libgen.a libglobal.a
libw32.a. The position of libw32.a is not quite right, but works
anyway.
POSIX equivalents instead of using the WIN32 rand/srand functions.
The files were derived from GNU libc source.
(empth_threadMain) [_WIN32]: Remove the srandom() as the
POSIX equivalent are not thread specific as the WIN32 functions were.
that accidently got committed in rev 1.65. Was incorrect in rev 1.66.
(empth_sleep) [_WIN32]: Switch srand() to srandom() to be consistent
with the rest of the empire code.