3 Empire is designed as a smart server with dumb clients. An Empire
4 client needs to know nothing about the game. Even telnet would do. In
5 fact, emp_client is just a slightly specialized telnet.
7 In such a design, presentation is in the server, and it is designed
8 for human consumption. Ideally, presentation and logic are cleanly
9 separated, for easy selection between different presentations.
10 There's no such separation in the Empire server, and separating them
11 now would be a herculean effort.
13 Thus, smart clients have to work with output designed for humans.
14 That's not especially hard, just an awful lot of tedious work, and the
15 result gets easily broken by minor changes in output format, which
18 Instead of making smart clients parse output of commands designed for
19 humans, one can add commands designed for machines. Such commands can
20 share the code implementing game rules with their counterparts for
21 humans. To do that cleanly means separating logic and presentation.
22 Implementing them with their own copy of the code is no good --- how
23 would you ensure that the two copies implement precisely the same game
26 Except for commands that actually don't do anything. There's a useful
27 class of such commands: commands to show game configuration and state
28 without altering it. The only game rules involved are those that
29 govern who gets to see what. Ensuring that those are satisfied is
32 Empire has had one such command since the beginning: dump. Empire
33 4.0.6 added more: sdump, ldump, pdump and ndump. 4.0.7 added lost and
34 support for incremental dumps. These commands have served smart
35 clients well. However, they cover just the most important part of the
36 game state (sectors, ships, planes, land units, nukes), and no game
37 configuration. They are not quite complete even for what they attempt
38 to cover. Finally, their output is harder to parse than necessary.
40 The xdump command is designed to be the dump to end all dumps.
42 Like many good ideas, xdump has secondary uses. Dumping game state as
43 text can be one half of a game export/import facility. Useful to
44 migrate games to different machines or even, with some text mangling
45 perhaps, different server versions. We will see below that xdump
46 isn't quite sufficient for that, but it's a start.
48 If you can import game state, you can import game configuration, or in
49 other words: customize your game. As we will see, configuration files
50 have different requirements, which xdump doesn't satisfy without some
53 If game import code can edit everything, then a deity command capable
54 of editing everything is possible. Proof-of-concept code exists (not
58 Analysis of the data to dump
60 Game state consists of a fixed set of table files (sector, ship, ...),
61 telegram files, and a few miscellaneous files. Game configuration
62 consists of a fixed set of configuration tables and scalar parameters.
64 A table is an ordered set of records (table rows). All records have
65 the same fields (table columns), which are statically typed.
67 Fields may be integers, floating-point numbers, x- or y-coordinates,
68 symbols and symbol sets encoded as integers in a way specific to the
69 server version, and character arrays. Configuration table fields may
70 be pointers to zero-terminated strings, null pointers allowed. No
71 other pointers occur. Unions do not occur.
78 * Capable of dumping all tables.
80 * Can do incremental dumps.
84 * Output is reasonably compact.
86 * Output is trivial to parse. Triviality test: if it's easy in AWK, C
87 (no lex & yacc, just stdio), Lisp (just reader) and Perl (base
88 language, no modules), then it's trivial enough.
90 * Output identifies itself.
92 * Output is self-contained; symbol encoding is explicit.
94 * KISS: keep it simple, keep it stupid.
98 * Generality. We're not trying to design a general mechanism for
101 * Completeness. We're not trying to dump stuff other than tables.
103 * Abstraction. We're not trying to hide how things are stored in the
104 server. When storage changes, xdump output will change as well, and
105 consumers need to be updated. This is not because abstraction
106 wouldn't be nice to have, just because we don't feel up to the task
112 Traditional dumps have a dump function for every table. These
113 functions are simple, but exceedingly dull and repetitive.
115 The selector code works differently. Each table has a descriptor,
116 which among other things defines a dictionary of selector descriptors.
117 A selector descriptor describes a field (table column) visible to
118 players. This is what we call meta-data (data about data). The
119 selector code knows nothing about the individual tables, it just
120 interprets meta-data. That's smart, as it keeps the dull, repetitive
121 parts in more easily maintainable meta-data rather than code.
123 xdump follows the selector design, and uses the existing selector
124 meta-data. This requires extending the meta-data to configuration
125 tables, which weren't previously covered. It also requires some
126 generalization of selector descriptors, so that all fields can be
129 To sum up, meta-data consists of a table of tables, and for each table
130 a table of selectors (table of columns, so to speak). It is specific
131 to the server version and how it is compiled on the host.
133 To interpret a table xdump, you need its meta-data, because without it
134 you have no idea what the columns mean. As meta-data is just a bunch
135 of tables, xdump can dump it. But now you need meta-meta-data to make
136 sense of the meta-data. Fortunately, meta-meta-data is the same for
137 all xdumps, and therefore the recursion terminates with a single
140 xdump dumps symbols and symbol sets as integers. To decode them, you
141 need to know what the symbol numbers and symbol set bits mean. For
142 this purpose, field meta-data includes the table ID of a symbol table.
143 A symbol table is a table of value-name pairs, with the value in the
144 leftmost column. You decode a symbol value by looking it up in the
145 symbol table. You decode a symbol set value by looking up its bits
146 (powers of two) in the symbol table.
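The lookup just described can be sketched in a few lines of Python. This is only an illustration: the function names are made up here, the symbol table is assumed to be already parsed into (value, name) pairs, and the numeric flag values in the example are invented, since real encodings depend on the server version.

```python
def decode_symbol(value, symtab):
    """Return the name paired with value in symtab, or None if absent.

    symtab is a list of (value, name) rows, value in the leftmost column.
    """
    for val, name in symtab:
        if val == value:
            return name
    return None


def decode_symbol_set(value, symtab):
    """Return the names of all bits (powers of two) set in value."""
    names = []
    bit = 1
    while bit <= value:
        if value & bit:
            names.append(decode_symbol(bit, symtab))
        bit <<= 1
    return names


# Hypothetical symbol table; these numbers are for illustration only.
EXAMPLE_FLAGS = [(1, "deity"), (2, "extra"), (4, "const"), (8, "bits")]
```

With that example table, decode_symbol_set(12, EXAMPLE_FLAGS) yields the names for bits 4 and 8. A real client would fetch the table with xdump rather than hard-code it.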
148 Some integer fields are actually keys in other tables. For instance,
149 ship field type is a key in the table of ship types ship-chr, and
150 plane field ship is a key in the ship table. Key value -1 is special:
151 it's a null key. Meta-data encodes these table references just like
152 for symbols: the meta-data has the ID of the referenced table, and
153 that table has the key in the leftmost column. Obviously, that
154 leftmost column is a table key as well, referencing the table itself.
156 A table with its key in the leftmost column can be dumped partially.
157 Without such a key, you need to count records to find the record
158 index, and that works only if you can see a prefix of the complete
161 The special table "ver" collects all scalar configuration parameters
162 in a single record. It does not occur in the table of tables.
165 Syntax of xdump command
170 The xdump output Language
172 Because the output is to be parsed by machines, it needs to be
173 precisely specified. We use EBNF (ISO 14977) for syntax, except we
174 use '-' in meta-identifiers and omit the concatenation symbol ','.
176 table = header { record } footer ;
177 header = "XDUMP" space [ "meta" space ] name space timestamp newline ;
178 name = name-chr { name-chr } ;
179 name-chr = ? ASCII characters 33..126 ? ;
181 footer = "/" number newline ;
182 record = [ fields ] newline ;
183 fields = field { space field } ;
184 field = intnum | flonum | string ;
185 intnum = ? integer in printf %d format ? ;
186 flonum = ? floating-point in printf %g format ? ;
188 | '"' { str-char } '"' ;
189 str-char = "\\" octal-digit octal-digit octal-digit
190 | ? ASCII characters 33..126 except '"' and '\\' ? ;
191 octal-digit = ? '0'..'7' ? ;
193 newline = ? ASCII character 10 ? ;
197 * The syntax for flonum is debatable. Precise conversion between
198 floating-point and decimal is hard, and C libraries are not required
199 to be precise. Using C99's %a format for flonum would avoid the
200 issue, but some programming environments may have trouble converting
201 that back to floating-point. We may change to %a anyway in the
202 future. Clients are advised to accept both.
204 * Strings syntax could perhaps profit from the remaining C escape
205 sequences. Except for '\"': adding that would complicate regular
206 expressions matching the string, and thus violate the `trivial to
209 * Space is to be taken literally: a single space character. Not a
210 non-empty sequence of whitespace.
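To illustrate the triviality claim, here is a minimal parser sketch in Python for one machine-readable table. It is not taken from any existing client. It assumes the whole table is available as one string, that the timestamp is an integer (which the text says may change), and that the number in the footer counts the records. It accepts both %g and %a style floating-point, as advised above. The names parse_field and parse_table are made up for this sketch.

```python
import re

_OCTAL = re.compile(r"\\([0-7]{3})")


def parse_field(tok):
    """Convert one field token to None, str, int or float."""
    if tok == "nil":                      # null string, distinct from ""
        return None
    if tok.startswith('"'):
        if len(tok) < 2 or not tok.endswith('"'):
            raise ValueError("malformed string: " + tok)
        # Replace each \ooo octal escape by the character it encodes
        return _OCTAL.sub(lambda m: chr(int(m.group(1), 8)), tok[1:-1])
    try:
        return int(tok)
    except ValueError:
        pass
    try:
        return float(tok)                 # printf %g style
    except ValueError:
        return float.fromhex(tok)         # printf %a style


def parse_table(text):
    """Parse header, records and footer of a single xdump table."""
    lines = text.split("\n")
    head = lines[0].split(" ")            # fields are split by one space
    if len(head) not in (3, 4) or head[0] != "XDUMP":
        raise ValueError("not an xdump header")
    if len(head) == 4 and head[1] != "meta":
        raise ValueError("unexpected header word: " + head[1])
    table = {
        "meta": len(head) == 4,
        "name": head[-2],
        "timestamp": int(head[-1]),       # assumption: integer timestamp
        "records": [],
    }
    for line in lines[1:]:
        if line.startswith("/"):          # footer; no field starts with "/"
            if int(line[1:]) != len(table["records"]):
                raise ValueError("footer does not match record count")
            return table
        fields = [parse_field(t) for t in line.split(" ")] if line else []
        table["records"].append(fields)
    raise ValueError("missing footer")
```

Because strings cannot contain a literal space, splitting each record on single spaces is enough; no quoting-aware tokenizer is needed.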
214 * The table name in the header is one of the names in xdump table.
216 * The timestamp increases monotonically. It has a noticeable
217 granularity: game state may change between an xdump and the next
218 timestamp increase. If the table has a timestamp field, clients can
219 xdump incrementally by using a conditional ?timestamp>T, where T is
220 one less than the timestamp received with the last xdump of that
223 Timestamp values are currently seconds since the epoch, but this
224 might change, and clients are advised not to rely on it.
226 * The number in the footer matches the number of records.
228 * Fields match their meta-data (see Meta-Data below).
230 * "nil" represents a null string (which is not the same as an empty
231 string). Otherwise, fields are to be interpreted just like C
237 Table meta-data is in xdump table. Fields:
239 * uid: The table ID, key for xdump table. IDs depend on the server
240 version; clients should not hard-code them. This is the leftmost
243 * name: The table name. Clients may identify tables by name.
245 Field meta-data for table T is in xdump meta T. The order of fields
246 in the xdump T matches the order of records in xdump meta T. Fields
249 * name: The field name. Matches the selector name. Clients may
250 identify fields by name. This is the leftmost field.
252 * type: The field's data type, a symbol. Clients should use this only
253 as key for the symbol table. Symbols are:
254 - "d", field uses intnum syntax
255 - "g", field uses flonum syntax
256 - "s", field uses string syntax
257 - "c", field uses string syntax
259 * flags: The field's flags, a symbol set. Flags are:
260 - "deity", field visible only to deities
261 - "extra", field not to be dumped
262 - "const", field cannot be changed (see xundump below)
263 - "bits", field is a symbol set, field type must encode symbol "d",
264 field table must not be -1.
266 * len: If non-zero, then the record encodes an array with that many
267 elements. If field type encodes symbol "c", it is a character
268 array, which is dumped as a single string field. Else, the array is
269 dumped as len fields.
271 * table: Key for xdump table. Unless -1, it defines the table
272 referenced by the field value. Field type must encode symbol "d"
277 * value: The symbol's encoding as integer. If the symbol can be
278 element of a symbol set, this is a power of two.
280 * name: The symbol's name.
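As a sketch of how field meta-data might be applied, the following Python function labels the fields of one parsed record. It assumes the meta-data has already been decoded into dictionaries with the keys name, type (the symbol name, not its number) and len, and that every meta-data record corresponds to a dumped field. The function name and the sample field names in the usage note are invented for illustration.

```python
def label_record(record, fields):
    """Map a list of field values to a dict keyed by field name.

    fields is a list of dicts with keys "name", "type" and "len",
    in the order of the records in xdump meta T.
    """
    # A character array is dumped as one string; other arrays as len fields
    width = sum(f["len"] if f["len"] and f["type"] != "c" else 1
                for f in fields)
    if width != len(record):
        raise ValueError("record does not match meta-data")
    out = {}
    pos = 0
    for f in fields:
        if f["len"] and f["type"] != "c":
            out[f["name"]] = record[pos:pos + f["len"]]
            pos += f["len"]
        else:
            out[f["name"]] = record[pos]
            pos += 1
    return out
```

For example, with fields uid (type "d", len 0), name (type "c", len 20) and item (type "d", len 3), the record [7, "foo", 1, 2, 3] becomes a dict whose item entry is the list [1, 2, 3].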
283 Notes on xdump Implementation
285 Overall impact on the server code is low.
287 To keep xdump simple, storage of game state and game configuration
288 tables has been unified under the common empfile abstraction, making
289 nxtitem-iterators and selectors equally applicable to all tables.
291 xdump required a few extensions to meta-data, which may become useful
292 in other places as well:
294 * Selectors can now deal with arrays (revived struct castr member
295 ca_len). Not yet available on the Empire command line.
297 * Selector meta-data can now express that a selector value is a key
298 for another table (new struct castr member ca_table). The selector
299 code doesn't use that, yet.
301 * Selector flag NSC_EXTRA to flag redundant selectors, so that xdump
304 Meta-data is in empfile[] (table meta-data), src/lib/global/nsc.c
305 (selector meta-data), src/lib/global/symbol.c (symbol tables). The
306 command is in src/lib/commands/xdump.c, unsurprisingly.
309 Hints on Using xdump in Clients
311 Let's explore how to dump a game. To make sense of a table, we need
312 its meta-data, and to make sense of that, we need meta-meta-data.
313 So we start with that:
315 [14:640] Command : xdump meta meta
316 XDUMP meta meta 1139555204
324 To interpret this table, we have to know the field names and their
325 meanings. Clients hard-code them. They should be prepared to accept
326 and ignore additional fields, and to cope with changes in field order,
327 except they may rely on "name" coming first.
329 A word on hard-coding. Clients hard-code *names*. The numbers used
330 for table IDs and to encode symbols are none of the client's business.
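One way to follow this advice is to locate meta-data columns by name rather than by position. The Python sketch below takes the records of xdump meta meta, already parsed into lists, and builds a name-to-column map. It assumes each field of the meta table occupies a single column, and it relies on "name" coming first, as permitted above. KNOWN_FIELDS and column_index are names invented for this sketch.

```python
# Field names the client hard-codes; anything else is ignored.
KNOWN_FIELDS = ("name", "type", "flags", "len", "table")


def column_index(meta_meta_records):
    """Return {field name: column number} for the known meta fields.

    Each record of xdump meta meta describes one column of every
    xdump meta T; its leftmost field is the column's name.
    Assumption: every such column is a scalar (one column wide).
    """
    names = [rec[0] for rec in meta_meta_records]
    if not names or names[0] != "name":
        raise ValueError('expected "name" as the first field')
    index = {n: i for i, n in enumerate(names) if n in KNOWN_FIELDS}
    missing = [n for n in KNOWN_FIELDS if n not in index]
    if missing:
        raise ValueError("missing meta fields: " + ", ".join(missing))
    return index
```

A client would then read, say, rec[index["len"]] from each record of xdump meta T, and keep working if the server adds or reorders fields.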
332 The encoding doesn't normally change within a game. Except when the
333 game is migrated to a sufficiently different server. That's a
334 difficult and risky thing to do, especially as there are no tools to
335 help with migrating (yet). Clients may wish to provide for such
336 changes anyway, by decoupling the client's encoding from the server's,
337 and dumping fresh meta-data on login. Incremental meta-data dump
338 would be nice to have.
340 So we don't know how symbol type and symbol set flags are encoded. To
341 decode them, we need their symbol tables. However, we need flags and
342 type only for tables we don't know, and there's one more table we do
343 know, namely the table of tables. Let's dump that next, starting with
346 [31:640] Command : xdump meta table
347 XDUMP meta table 1139556230
352 Because xdump table is referenced from elsewhere (xdump meta meta
353 field table), the leftmost field must contain the key. Thus, the
354 leftmost field's meta-data field table must be the table ID of xdump
355 table itself. Let's try it:
357 [30:640] Command : xdump 26 *
358 XDUMP table 1139556210
371 It worked! Mind that the special table "ver" is not in the table of
374 Now dump the two symbol tables we postponed. Because xdump accepts
375 table IDs as well as names, we don't have to know their names:
377 [14:640] Command : xdump meta 32
378 XDUMP meta meta-type 1139555298
383 [15:640] Command : xdump 32 *
384 XDUMP meta-type 1139555826
401 [15:640] Command : xdump meta 33
402 XDUMP meta meta-flags 1139555303
407 [24:640] Command : xdump 33 *
408 XDUMP meta-flags 1139555829
415 We now have complete meta-meta information:
417 name type flags len table
418 -----------------------------------------
420 type d (const) 0 meta-type
421 flags d (bits const) 0 meta-flags
425 Dumping the remaining tables is easy: just walk the table of tables.
426 Here's the first one:
428 [36:640] Command : xdump meta 0
429 XDUMP meta sect 1139556498
437 A whole load of tables referenced! Only one of them (not shown above)
440 owner references table nat. No surprise.
442 xloc and yloc together reference the sector table, but that's not
443 expressed in meta-data (yet).
445 Let's stop here before this gets too long and boring. Experiment
446 yourself! Check out example Perl code src/xdump.pl.
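Walking the table of tables can be sketched as a small generator of command lines. This Python fragment only builds the command strings, using the forms shown in the session above; sending them and reading the replies is left out. It assumes the table of tables has been parsed into (uid, name) pairs and refers to tables by name, in line with the advice not to hard-code IDs. The function name is made up.

```python
def dump_commands(table_of_tables):
    """Yield the xdump commands needed to dump every listed table.

    table_of_tables is a list of (uid, name) pairs from xdump table.
    The special table "ver" is not listed there, so it is not covered.
    """
    for _uid, name in table_of_tables:
        yield "xdump meta " + name      # field meta-data first
        yield "xdump " + name + " *"    # then the table itself
```

For the two tables seen so far, list(dump_commands([(0, "sect"), (26, "table")])) produces four commands, starting with "xdump meta sect".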
449 Analysis of xdump as Configuration File Format
451 xdump makes a lousy configuration format because it is unwieldy to
452 edit for humans. That's because configuration files have different
453 requirements than dumps:
455 * Can be edited by humans with common tools, including text editors
458 Using text editors requires a nice fixed-width table layout.
459 Spreadsheet import requires trivial field separation. Tab character
460 field separator or fixed width columns should do. The syntax should
461 allow all that, but not require it.
465 - xdump's rigid horizontal and vertical spacing makes it impossible
466 to align things visually.
468 - xdump uses one line per record, which can lead to excessively long
471 - xdump's string syntax requires octal escape for space.
475 * Each table is self-contained. You don't have to look into other
476 tables to make sense of it.
478 This conflicts with xdump's separation of data and meta-data. You
479 need the table's meta-data to identify fields, and the referenced
480 symbol tables to decode symbols.
482 * Easy to parse. Don't compromise legibility just to please some dumb
485 Since we're trying to apply xdump to the configuration file problem,
486 we get an additional requirement:
488 * Reasonably close to xdump. Translation between machine-readable and
489 human-readable should be straightforward, if meta-data is available.
491 This leads to a human-readable dialect of the xdump language.
494 Human-Readable xdump Language
496 Fundamental difference to basic, machine-readable xdump: the rigid
497 single space between fields is replaced by the rule known from
498 programming languages: whitespace (space and tab) separates tokens and
499 is otherwise ignored. The space non-terminal is no longer needed.
501 Rationale: This allows visual alignment of columns and free mixing of
502 space and tab characters.
504 Comments start with "#" and extend to the end of the line. They are
505 equivalent to a newline.
507 Rationale: Follow econfig syntax.
509 Tables with a record uid in the leftmost field can be `split
510 vertically' into multiple parts. Each part must contain the same set
511 of records. The leftmost field must be repeated in each part. Other
512 fields may be repeated. Repeated fields must be the same in all
513 parts. Naturally, the parts together must provide the same fields as
514 a table that is not split.
516 Rationale: This is the cure for long lines. Line continuation would
517 be simpler, but turns out to be illegible. Requiring record uid is
518 not technically necessary, as counting records works the same whether
519 a table is split or not. Except humans can't count. Perhaps this
520 should be a recommendation for use rather than part of the language.
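The rules for vertical splitting can be illustrated by a merge routine. The Python sketch below assumes each part has already been parsed into a list of column names and a list of rows, with the record uid in the leftmost column. It enforces the rules stated above: same set of records in each part, and repeated fields equal in all parts. The name merge_parts is invented for this sketch.

```python
def merge_parts(parts):
    """Merge vertically split parts into one table.

    parts is a list of (column_names, rows); each row is a list whose
    leftmost element is the record uid.  Returns (column_names, rows).
    """
    columns = []          # merged column order, first occurrence wins
    merged = {}           # uid -> {column name: value}
    order = None          # record order, taken from the first part
    for cols, rows in parts:
        uids = [row[0] for row in rows]
        if order is None:
            order = uids
        elif set(uids) != set(order):
            raise ValueError("parts contain different records")
        for c in cols:
            if c not in columns:
                columns.append(c)
        for row in rows:
            if len(row) != len(cols):
                raise ValueError("row does not match column header")
            rec = merged.setdefault(row[0], {})
            for c, v in zip(cols, row):
                if c in rec and rec[c] != v:
                    raise ValueError("repeated field %s differs" % c)
                rec[c] = v
    return columns, [[merged[u][c] for c in columns] for u in order or []]
```

Merging a part with columns uid and name with a part with columns uid and cost gives one table with columns uid, name, cost.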
526 header = "config" name newline { colhdr } newline ;
527 colhdr = identifier [ "(" ( intnum | identifier ) ")" ] [ "..." ] ;
528 footer = "/config" newline ;
530 If colhdr ends with "...", the table is continued in another part,
531 which shall follow immediately.
535 - The xdump needs to identify itself as human-readable, hence change
536 from "XDUMP" to "config".
538 - The timestamp in the header is useless for the applications we
539 have in mind for human-readable xdumps. The number of records in
540 the footer is of marginal value at best, and a pain for humans to
543 - The column header is due to the self-containedness requirement.
544 It contains just the essential bit of meta-data: the column name.
548 field = intnum | flonum | string | symbol | symset ;
552 - Syntax for symbols and sets of symbols is due to the
553 self-containedness requirement. Machine-readable xdump gets away
554 with just numbers, which have to be decoded using meta-data.
556 * Friendlier numbers and strings:
558 flonum = ? floating-point in scanf %g format ? ;
559 str-char = "\\" octal-digit octal-digit octal-digit
560 | ? ASCII characters 32..126 except '"' and '\\' ? ;
564 - Machine-readable floating-point syntax is too rigid. Accept
565 everything that scanf does. Could also change intnum to %i
566 format, which accepts octal and hexadecimal in C syntax, but that
567 seems not worth the documentation bother.
569 - Machine-readable syntax requires \040 instead of space in strings
570 to allow trivial splitting into fields. This is unacceptable here
571 due to the legibility requirement, hence the change to str-char.
573 * Parse nil as symbol:
575 string = '"' { str-char } '"' ;
577 Rationale: This is a technicality required to keep the parse
582 symbol = identifier ;
583 symset = "(" { symbol } ")" ;
585 The special symbol "nil" is to be interpreted as null string.
589 - The symbol set syntax is the simplest that could work. We need to
590 allow space between the symbols for legibility anyway, so why not
591 make it the delimiter. A stop token is required to find the end
592 of the field, and a start token is useful for distinguishing
593 between symbol and symset. Bracketing with some kind of
594 parenthesis is an obvious solution.
596 The resulting sub-language for records is a superset of
597 machine-readable sub-language for records.
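A tokenizer for human-readable records might look like the following Python sketch. It is not the server's xundump parser. It assumes an identifier is a letter followed by letters, digits or underscores (the exact identifier syntax is an assumption here), returns symbols as a small str subclass so they can be told apart from strings, and returns symbol sets as tuples of names. Symbol and parse_record are names made up for this sketch.

```python
import re

TOKEN = re.compile(
    r'(?P<ws>[ \t]+)'                           # whitespace separates tokens
    r'|(?P<comment>#.*)'                        # comment runs to end of line
    r'|(?P<string>"(?:\\[0-7]{3}|[^"\\])*")'    # may contain plain spaces
    r'|(?P<lpar>\()'
    r'|(?P<rpar>\))'
    r'|(?P<word>[^ \t()"#]+)'                   # number, symbol or nil
)
IDENT = re.compile(r"[A-Za-z][A-Za-z0-9_]*")    # assumed identifier syntax
OCTAL = re.compile(r"\\([0-7]{3})")


class Symbol(str):
    """A symbol name, distinguishable from an ordinary string field."""


def convert_word(text):
    if text == "nil":                           # special symbol: null string
        return None
    for conv in (int, float):
        try:
            return conv(text)
        except ValueError:
            pass
    if IDENT.fullmatch(text):
        return Symbol(text)
    raise ValueError("bad field: " + text)


def parse_record(line):
    """Split one human-readable record line into field values."""
    line = line.rstrip("\n")
    fields, symset, pos = [], None, 0
    while pos < len(line):
        m = TOKEN.match(line, pos)
        if m is None:
            raise ValueError("bad token at column %d" % pos)
        pos = m.end()
        kind, text = m.lastgroup, m.group()
        if kind in ("ws", "comment"):
            continue
        if kind == "lpar":
            if symset is not None:
                raise ValueError("nested symbol set")
            symset = []
        elif kind == "rpar":
            if symset is None:
                raise ValueError("unbalanced )")
            fields.append(tuple(symset))
            symset = None
        elif symset is not None:
            if kind != "word" or not IDENT.fullmatch(text):
                raise ValueError("bad symbol in set: " + text)
            symset.append(text)
        elif kind == "string":
            body = text[1:-1]
            fields.append(OCTAL.sub(lambda o: chr(int(o.group(1), 8)), body))
        else:
            fields.append(convert_word(text))
    if symset is not None:
        raise ValueError("unterminated symbol set")
    return fields
```

Since the record sub-language is a superset of the machine-readable one, the same function also accepts machine-readable records such as 0 "a\040b" 1.5.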
600 See src/lib/global/*.config for examples.
602 Human-readable xdump still has its shortcomings:
604 * Symbolic references work only with symbol tables. Consider sect-chr
605 selector prd, which is a key for table product. xdump should
606 support use of product selector sname values as keys. Same for
607 product selectors ctype and type, which should support item selector
610 * item selector pkg is an array indexed by values in symbol table
611 packing. The column header should support symbolic index values
615 Notes on Table Configuration Implementation
617 econfig key custom_tables lists table configuration files. At this
618 time, reading a custom table merges it with the built-in table, then
619 truncates the result after the last record read from the custom table.
621 Some of the tables are rather ugly in C, and cumbersome to edit. We
622 thus moved them to configuration files (src/lib/global/*.config). The
623 server reads them from builtindir before reading custom tables.
625 The code dealing with these files is in src/lib/common/conftab.c.
627 Actual work is done by src/lib/common/xundump.c, which accepts both
628 human-readable and machine-readable input. The parser is not precise;
629 it accepts human-readable syntax even within tables whose header marks
630 them machine-readable.
632 Configuration tables contain values that are not meant to be
633 customized. For instance, meta-data and symbol tables reflect the
634 encoding of C language constructs in the server. Selector flag
635 NSC_CONST marks them, so that the code can prohibit changes.
637 All tables are checked against meta-data on server startup by
638 ef_verify(). More elaborate checking would be nice, and probably
639 requires additional meta-data.
642 Appendix: Empire 3 C_SYNC --- A Cautionary Tale
644 Clients are just as important as the server, and it's too darn hard to
645 write a good client. In 1995, Ken Stevens decided to do something
648 Ken cast the problem as a data synchronization problem. Quote C_SYNC
649 RFC 5.1, section `Abstract':
651 This is a specification for a new method of synchronizing game data
652 in the Empire client with data in the server.
654 and section `Objectives':
656 This new mode of communication between the server and the client will
657 be called C_SYNC communication and will satisfy the following 6
660 (1) Output format will be version independent. So if someone is
661 using an old EmpireToolkit, then it will still work with a newer
662 version of the server.
664 (2) Every C_SYNC message will be a self-contained packet. i.e. the
665 client will not need to depend on previous messages (header messages)
666 to determine the meaning of a C_SYNC message.
668 (3) A C_SYNC message will be able to represent any of the
669 player-accessible data that is contained in the server database (e.g.
670 enemy ships, nations).
672 (4) Bandwidth will be minimized (i.e. the format will be as
673 concise as possible) while remaining human readable (i.e. no
674 binary messages). [Note that data compression may be added at a later
675 date, but if it is added, it will be added on a separate port to
676 maintain backwards compatability.]
678 (5) The client will be able to tell the server whether it wants
679 to receive C_SYNC messages and whether these messages can be sent
680 asynchroniously (via "toggle sync" and "toggle async" respectively).
682 (6) A portable ANSI C EmpireToolkit will be made available for
683 parsing C_SYNC messages and managing the data they contain.
685 C_SYNC worked by hooking into ef_write() & friends so it could
686 `synchronize' the client on game state changes.
688 Sounds jolly good, doesn't it?
690 Well, it was a failure, and Wolfpack ripped it out right away. Quote
693 Changes to Empire 4.0.0 - Initial release
694 * Initial Wolfpack release - Long live the Wolfpack!!!!
696 * Removed C_SYNC. This is done for 2 reasons. 1) None of us like it or
697 wish to support it. 2) We envision a better scheme for doing similar
698 things will come along.
700 But *why* did it fail? Just because Steve McClure hated it? Nope.
701 C_SYNC failed for several different reasons, each of them bad, but
702 only the last one is truly fundamental.
704 a. Lack of a rigorous and complete definition. The RFC is long on
705 syntax, but short on semantics. For instance, the unit type was
706 encoded as a number. Unit characteristics happened to be dumped in
707 an order that matched these numbers, but that wasn't defined
710 b. Overly complicated syntax. Trouble with encoding of strings.
712 c. Buggy implementation. Malformed C_SYNC messages, duplicate
713 messages, missing messages, semantically incorrect messages, you
716 d. Change of crew before it was finished. Wolfpack took over and
717 understandably wasn't interested in this half-finished mess.
719 None of the above is a fundamental, inherent flaw of the idea. The
720 next one is more serious:
722 e. It failed to achieve objective (4), and therefore slowed down
723 clients too much to be of use in real-time combat. When you fired
724 from a bunch of ships, C_SYNC would push complete records for all
725 the ships and the target to you. Most of that data is redundant.
727 That's because C_SYNC didn't transmit state changes, it
728 resynchronized state, and the pieces of state it could transmit
731 The network was slower then. But let's not be complacent. I/O is
732 slow. Always was, most likely always will be.
734 Maybe sending the messages out of band (separate TCP stream) would
737 And here comes the killer:
739 f. The data to sync is not readily available on the server.
741 Yup. Think about it. The game state on the server is *not* the
742 same as on the client. The server grants the client a carefully
743 limited view on certain parts of server game state on certain
746 To be complete, a machine readable protocol must disclose as much
747 information as the human readable output. Tracking server game
748 state changes cannot do that alone. For instance, lookout tells
749 you ship#, owner and location. That event does not trigger any
750 state change on the server!
752 To be correct, a machine readable protocol must disclose no more
753 information than the human readable output. When you observe a
754 server game state change, you can only guess what event triggered
755 it, and what it disclosed to which player. You're stuck with
756 conservative assumptions. That's the death knell for completeness.
757 Correct assumptions will be non-obvious, so correctness is
758 non-obvious, too, hence hard to achieve and maintain.
760 Bottom line: tracking server state change cannot support a complete
761 client protocol for hard theoretical reasons, and I believe it
762 cannot support a correct one for practical reasons.
764 Oddly enough, people criticized C_SYNC for all the flaws it had (and
765 some it hadn't), except for f.
767 What now? Throw up our hands in despair and give up? Nah. Ken tried
768 a shortcut, and it didn't work. That doesn't mean there's no way at
769 all. I believe the only way to get this done right is by tracking
770 *events*. Whenever something is printed to a player, be it live
771 connection or telegram, we need to transmit precisely the same
772 information in machine readable form. Much more work.
774 xdump shares valuable ideas with C_SYNC, e.g. using selector
775 meta-data. It is, however, much more modest in scope. We're pretty
776 sure we can get it right and get it done in a reasonable time frame.