622 lines
21 KiB
Text
622 lines
21 KiB
Text
Guidelines for Writing Empire Code
|
||
|
||
compiled and edited by Markus Armbruster
|
||
|
||
Preface
|
||
-------
|
||
|
||
Empire is an old program, and the age shows in places. It took a lot
|
||
of people a lot of effort to write it. Chances are that some of them
|
||
were smarter than you and me. Treat the code with the respect it
|
||
deserves.
|
||
|
||
These guidelines are just that: guidelines. Not law. There's no
|
||
Empire Coding Police. People hacked on Empire code long before I knew
|
||
it existed, and I hope they still will when I fade into Empire
|
||
history. But there's substantial experience behind these guidelines,
|
||
so don't dismiss them cavalierly.
|
||
|
||
Here's our goal: algorithmic clarity. All the rules and guidelines
|
||
are subservient to it. If you'd have to obfuscate your code to
|
||
satisfy the letter of some guideline, don't!
|
||
|
||
Mind, I said `clarity', not `self-realization' or `expression of your
|
||
individuality'. Dave Pare wrote:
|
||
|
||
Be invisible. When you make a change, think not about marking your
|
||
place in history, or about showing off how much nicer your two-space
|
||
tabs are than those old, icky eight-space tabs that the stupid
|
||
empire hackers of old used, but think instead about the aesthetics
|
||
of the whole. The resulting lines of code should flow smoothly from
|
||
the beginning of the procedure to the end. Empire is 60,000 lines
|
||
of code. If you're the general case, you're only changing fifty
|
||
lines, so be aware of that.
|
||
|
||
Some guidelines are more serious than others. When I use words like
|
||
`is', `shall' or `must', I mean it. When I write `should', it's a
|
||
recommendation.
|
||
|
||
Many guidelines concern style, i.e. they involve matters of taste.
|
||
That's how it has to be. Uniformity of style is important for
|
||
clarity.
|
||
|
||
These guidelines don't attempt to be exhaustive. More complete
|
||
guidelines that are mostly compatible with Empire can be found at
|
||
http://www.jetcafe.org/~jim/c-style.html
|
||
|
||
|
||
Source tree organization
|
||
------------------------
|
||
|
||
FIXME
|
||
|
||
|
||
Code formatting
|
||
---------------
|
||
|
||
In My Egotistical Opinion, most people's C programs should be
|
||
indented six feet downward and covered with dirt."
|
||
-- Blair P. Houghton
|
||
|
||
Over the years, enough Empire coders lacked the good taste to preserve
|
||
the style of the original Empire code in their changes, and thus
|
||
turned the source code into an unreadable mess. In 2003, we fed it to
|
||
GNU indent.
|
||
|
||
We tried to restore things to the original style, mostly. There is
|
||
one noteable change: basic indentation is now four spaces. Restoring
|
||
the original eight spaces would have resulted in many more long lines,
|
||
which would have to be broken by indent. Since indent isn't good at
|
||
breaking lines tastefully, we reluctantly chose four instead.
|
||
|
||
FIXME mention src/scripts/intend-emp, even better: convert it into an
|
||
indent profile and mention that.
|
||
|
||
If you use Emacs, `stroustrup' indentation style produces satisfactory
|
||
results. The command `c-set-style', normally bound to `C-c .', lets
|
||
you set the style for the current buffer. Set variable
|
||
`c-default-style' to "stroustrup" to switch to this style for all
|
||
buffers.
|
||
|
||
Tab character use
|
||
|
||
Whether you use tab characters or not doesn't really matter that much,
|
||
but tab stops are every eight characters, period.
|
||
|
||
Indentation, placement of braces, function name
|
||
|
||
Basic indentation is four spaces. The opening brace goes on the same
|
||
line as `if', `struct', etc. Put `else' and do loops' `while' one the
|
||
same line as the closing brace. You are encouraged to leave out
|
||
syntactically optional braces. Don't indent case labels.
|
||
|
||
Example:
|
||
|
||
if (is_fubar())
|
||
despair();
|
||
else {
|
||
do {
|
||
char ch = getchar();
|
||
switch (ch) {
|
||
case EOF:
|
||
return;
|
||
case '\n':
|
||
break;
|
||
default:
|
||
grind_along(ch);
|
||
}
|
||
} while (happy());
|
||
}
|
||
|
||
In a function definition, the return type, function name and the
|
||
braces must all start on a new line, unindented, like this:
|
||
|
||
type
|
||
name(arguments)
|
||
{
|
||
body
|
||
}
|
||
|
||
This does not apply to function declarations.
|
||
|
||
Breaking long lines
|
||
|
||
Line length should not exceed 75 characters. Break such lines at a
|
||
logical place, preferably before an operator with low precedence.
|
||
Line up the continuation line with the innermost unclosed parenthesis,
|
||
if there is one, else indent it four spaces.
|
||
|
||
Example:
|
||
|
||
FIXME
|
||
|
||
Bad example:
|
||
|
||
FIXME
|
||
|
||
Blank lines
|
||
|
||
Use blank lines to separate different things. Functions must be
|
||
separated by a blank line. You are encouraged to insert a blank line
|
||
between a block's declarations and statements.
|
||
|
||
Spaces
|
||
|
||
There is a space after `for', `if', and `while'. If `for' and `while'
|
||
are followed by `;' (empty loop body) on the same line, there is a
|
||
space before the `;'. There is no space between the function name and
|
||
the `(', but there is a space after each `,'. There should be no
|
||
space between type cast operator and operand. There is no extra space
|
||
after '(' and before ')'.
|
||
|
||
Example:
|
||
|
||
for (p = s; *p; ++p) ;
|
||
printf("%ld\n", (long)(p-s));
|
||
|
||
Comments
|
||
|
||
The function comment describing what a function does goes directly
|
||
above the definition.
|
||
|
||
Comments to the right of code should start in column 32 if possible
|
||
(counting from zero).
|
||
|
||
Comment lines should be indented exactly like the code the belong to.
|
||
|
||
You are encouraged to format multi-line comments like this:
|
||
|
||
/*
|
||
* Please use complete sentences, with proper grammer,
|
||
* capitalization and punctuation. Use two spaces between
|
||
* sentences.
|
||
*/
|
||
|
||
But please avoid elaborate boxes like this:
|
||
|
||
/***********************************************************
|
||
* Such decorations are a waste of time, hard to edit, and *
|
||
* ugly, too. *
|
||
***********************************************************/
|
||
|
||
Do not use
|
||
|
||
// C++/C99 comments
|
||
|
||
because they are not portable C89.
|
||
|
||
|
||
Conditional compilation
|
||
|
||
Unless the conditional code is very short, please comment it like
|
||
this:
|
||
|
||
#ifdef FOO
|
||
...
|
||
#endif /* FOO */
|
||
|
||
#ifdef BAR
|
||
...
|
||
#else /* !BAR */
|
||
...
|
||
#endif /* !BAR */
|
||
|
||
|
||
Style
|
||
-----
|
||
|
||
Names
|
||
|
||
DoNotUseStudlyCaps! Stick to lower_case and use underscores. Upper
|
||
case is for macros and enumeration constants.
|
||
|
||
File names should not differ in case only, since not all file systems
|
||
distinguish case. Best stick to lower case. Avoid funny characters
|
||
in file names. This includes space.
|
||
|
||
Preprocessor
|
||
|
||
Like many good things, the preprocessor is best used sparingly.
|
||
Especially conditional compilation.
|
||
|
||
Do not use the preprocessor to `improve' the syntax of C!
|
||
|
||
Macro names should be ALL_CAPS to make them stand out. Otherwise,
|
||
they can be mistaken for objects or functions. This is bad, because
|
||
`interesting' behavior can hide inside macros, like not evaluating
|
||
arguments, or changing them. Exception: if a function-like macro
|
||
behaves exactly like a function, then you may use a lower case name.
|
||
|
||
Parameters that can take expression arguments must be parenthesized in
|
||
the expansion. If the expansion is an expression, it should be
|
||
parenthesized as well.
|
||
|
||
You are encouraged to use enumeration constants instead of macros when
|
||
possible. The existing code doesn't, but it makes sense all the same.
|
||
|
||
Comments
|
||
|
||
Every file should have a file comment FIXME
|
||
|
||
Every function should have a function comment that describes what it
|
||
does. FIXME elaborate. Writing such comments helps both future
|
||
maintainers and yourself: if it's too hard to write a concise function
|
||
comment, then your function is likely too complicated and could use a
|
||
redesign.
|
||
|
||
The existing code has very little useful comments, and it'll likely
|
||
take us years to fix it. Please don't make it harder than it already
|
||
is.
|
||
|
||
Declarations
|
||
|
||
Do not declare system functions yourself; include the appropriate
|
||
system header.
|
||
|
||
Use prototypes, not old-style function declarations.
|
||
|
||
To get the best use of C's type checking, each function or variable
|
||
with external linkage should have exactly one declaration, in some
|
||
header file, and that declaration must be in scope at the definition.
|
||
No other declarations should exist. In particular, please include the
|
||
appropriate header instead of just declaring stuff again. The code
|
||
used to be full of this kind of junk, and it was quite some work to
|
||
clean it up.
|
||
|
||
Forward declarations of static functions should all go in one place
|
||
near the beginning of the file.
|
||
|
||
If you want a typedef name in addition to a structure or union tag,
|
||
give it the same name, like this:
|
||
|
||
typedef struct foo foo;
|
||
|
||
Yes, this is incompatble with C++. Reducing the number of names for
|
||
the same thing is more useful than compatibility to a programming
|
||
language we don't use.
|
||
|
||
Please don't hide away pointers with typedefs, like this:
|
||
|
||
typedef struct foo *foop;
|
||
|
||
When I see `foo *', I *know* what it is. When I see `foop', I have to
|
||
look it up.
|
||
|
||
Booleans
|
||
|
||
Do not use increment operators to set a variable to logical true! It
|
||
obfuscates the purpose, and narrow variables could even overflow.
|
||
Just assign 1. A lot of cleanup remains to be done there.
|
||
|
||
Type casts
|
||
|
||
Casting pointers to and from `void *' clutters the code and serves no
|
||
purpose (repeat after me: C is not C++). It is also unsafe, because
|
||
the cast suppresses type warnings.
|
||
|
||
Casting function values to void clutters the code and serves little
|
||
purpose (repeat after me: C is not Pascal).
|
||
|
||
Casting an argument to the type of the parameter clutters the code and
|
||
serves no purpose (repeat after me: ISO C is not K&R C).
|
||
|
||
Factor out common code
|
||
|
||
Do not gratuitiously duplicate code! Ken Stevens said it well, and
|
||
it's as relevant as ever:
|
||
|
||
Cut-and-Paste coding is by far the biggest problem that the current
|
||
Empire suffers from. This is how cut-and-paste coding happens. Joe
|
||
Shmuck decides that he wants to add a new function to the server.
|
||
So he goes hunting through the server to find some already existing
|
||
code which does something similar to what he wants to do. This is
|
||
good. You should always write new code to imitate old code. What
|
||
is bad is when Joe Shmuck decides to simply "copy" 200 lines of code
|
||
from the old function into his new function, and then just change a
|
||
couple of little things here and there to suit his needs. This
|
||
method, known as Cut-and-Paste coding is the fastest and easiest way
|
||
to code. However, it results in a server that is impossible to
|
||
maintain. What Joe _should_ have done, is "move" the 200 lines of
|
||
code into a new _third_ function which the first two both call.
|
||
This is called writing a "general solution" to handle both cases.
|
||
Most of my work in the Empire2 project consisted in cleaning up
|
||
after a bunch of Joe Shmucks.
|
||
|
||
|
||
Portability
|
||
-----------
|
||
|
||
FIXME C89, POSIX
|
||
|
||
FIXME sizes, printf formats
|
||
|
||
FIXME reserved names
|
||
|
||
FIXME conditional compilation is a last resort
|
||
|
||
|
||
Robustness
|
||
----------
|
||
|
||
Error checking
|
||
|
||
Check error conditions meticulously. The existing code is bad enough
|
||
at this, please don't make it worse.
|
||
|
||
FIXME what to do on error
|
||
|
||
Sanity checking
|
||
|
||
In many places, the code checks for conditions that should not happen,
|
||
and then tries to recover. This is sound defensive programming.
|
||
Please use CANT_HAPPEN() and CANT_REACH() for this purpose, because
|
||
they log the error condition, and optionally abort the program, or
|
||
write out a core dump. This is called an "oops".
|
||
|
||
Buffer overruns
|
||
|
||
FIXME
|
||
|
||
Thread safety
|
||
|
||
Thread stacks cannot grow. If you use too much automatic storage, you
|
||
can overrun the stack with disastrous consequences. FIXME elaborate
|
||
|
||
Empire uses non-preemptive threads: a thread runs until it yields the
|
||
processor. This simplifies programming, but there are still traps to
|
||
avoid. Yielding the processor is an explicit thread operation, and
|
||
whether a thread operation yields is documented in empthread.h.
|
||
However, the operation may be buried in library code.
|
||
|
||
In particular, player input may yield. Player output may yield only
|
||
if it goes to the current player, and his last command doesn't have
|
||
the C_MOD flag. You can use this to simplify command code: set C_MOD
|
||
if you don't want to worry about yield on output. This is commonly
|
||
done for commands that modify game state.
|
||
|
||
Be careful with static storage. Read-only data is fine. More than
|
||
one thread writing static data is problematic, but there are safe
|
||
uses. For instance, a static scratch buffer that is never used across
|
||
`yields' is safe.
|
||
|
||
Yielding the processor invalidates *all* the game state you copied
|
||
into variables. You have to reread and possibly recheck. See below.
|
||
|
||
Accessing game state
|
||
|
||
Game state consists of several sets of objects, e.g. sectors, ships,
|
||
bmaps, ... Each set is fully held in memory and backed by a file in
|
||
the data directory.
|
||
|
||
There are several ways to access an object in set FOO:
|
||
|
||
* You can get a pointer to the object (not a copy!) with getFOOp().
|
||
|
||
This is inherently dangerous, because it allows you to update the
|
||
object in memory without updating the disk file. It's used mostly
|
||
in update code, where it pays major performance dividends, but there
|
||
are uses in other places. Whether they are wise is debatable.
|
||
|
||
Obviously, the object can change when you let another thread run.
|
||
This is not an issue in update code, because it never yields.
|
||
|
||
* You can get a copy with getFOO() and write it back with putFOO().
|
||
|
||
Putting updates both the object in memory and the disk file.
|
||
|
||
Any change to the object invalidates the copy. Putting such an
|
||
invalid copy will clobber the change(s) that invalidated it,
|
||
possibly resulting in corrupted game state. The code oopses on such
|
||
puts, but it can't repair the damage.
|
||
|
||
There are two common ways to invalidate a copy: calling a function
|
||
that updates the object you copied (unless it does that through your
|
||
copy), and yielding the processor, which lets other threads update
|
||
the object you copied.
|
||
|
||
Therefore, you have to reget after a possible invalidation, and deal
|
||
with changes. In particular, if you checked whether the object is
|
||
suitable for a task, you need to check again after regetting it. If
|
||
you can afford to bail out when something changed, use
|
||
check_FOO_ok().
|
||
|
||
Function comments should state what objects the function can update.
|
||
Unfortunately, they generally don't.
|
||
|
||
It's best to keep puts close to gets, both in runtime and in the
|
||
source code.
|
||
|
||
* Bmaps have special access functions.
|
||
|
||
Each country has two bmaps: the working bmap and the true bmap.
|
||
Unfortunately, the bmap code calls the former `bmap' and the latter
|
||
`map'.
|
||
|
||
You update bmaps with map_set(). This doesn't write through to the
|
||
file; instead, it returns non-zero when the update changed anything.
|
||
In that case, you must write bmaps to disk with writemap() before
|
||
you yield. If you only updated the working bmap, then you can call
|
||
writebmap() instead.
|
||
|
||
Reading player input
|
||
|
||
Reading player input can fail, and you must check for that.
|
||
Neglecting it can break the interrupt feature for players (normally
|
||
Ctrl-C), and produce extra prompts that could confuse clients. Even
|
||
worse is neglecting it in a loop that terminates only when input is
|
||
read successfully.
|
||
|
||
When reading input fails, you should normally abort the command with
|
||
status RET_SYN. There are exceptions, e.g. when aborting a pinpoint
|
||
bombing run over the target.
|
||
|
||
Some functions to read player input return a special error value you
|
||
can check, e.g. recvclient(), prmptrd() and uprmptrd() return a
|
||
negative value, getstring() and getstarg() return NULL.
|
||
|
||
Others return the same error value for failed read and for successful
|
||
read of input that is invalid. Then you need to check
|
||
player->aborted; if it is set after a read, the read failed.
|
||
|
||
|
||
git
|
||
---
|
||
|
||
Commit related changes together, unrelated changes separately.
|
||
|
||
Write meaningfull commit messages. Start with a single short line
|
||
(ideally less than 50 characters) summarizing the change, followed by
|
||
a blank line and then a more thorough description.
|
||
|
||
The purpose of the change log is not to explain how the code works;
|
||
that should be done in the source code itself. It's to explain *why*
|
||
you made the change, and what is affected by it.
|
||
|
||
Don't change whitespace gratuitiously, say just because your editor
|
||
screws up tabs. Such changes make it much harder to figure out who
|
||
changed what and when.
|
||
|
||
|
||
Historical guidelines, superseded by the above
|
||
|
||
|
||
Remarks from Dave Pare:
|
||
|
||
And now, a few editorial remarks on my views on Good Techniques for
|
||
Modifying Existing Software:
|
||
|
||
My safari through the hot and bug-ridden jungle that Empire has become
|
||
has given me some new insights on the modification of large software
|
||
packages. My adventure has prompted me to propose some simple coding
|
||
guidelines for those dreaming of hacking on the code:
|
||
|
||
1) Be invisible. When you make a change, think not about marking your
|
||
place in history, or about showing off how much nicer your two-space
|
||
tabs are than those old, icky eight-space tabs that the stupid empire
|
||
hackers of old used, but think instead about the asethetics of the whole.
|
||
The resulting lines of code should flow smoothly from the beginning of the
|
||
procedure to the end. Empire is 60,000 lines of code. If you're the
|
||
general case, you're only changing fifty lines, so be aware of that.
|
||
|
||
2) Procedurize. If you find yourself in a triple-nested loop, or a five
|
||
hundred line procedure, perhaps it's because you need to split some of
|
||
that code off into a procedure. In general, if you've got more than two
|
||
levels of tabs (or indentations), or more than one or two major loops in
|
||
a procedure, something may well be amiss.
|
||
|
||
|
||
Sasha Mikheev on indentation:
|
||
|
||
The empire indentation style can be achived by using
|
||
indent -orig -i8 foo.c
|
||
|
||
or in old c-mode emacs (versions before 19.29):
|
||
;add this to .emacs
|
||
(setq c-indent-level 8)
|
||
(setq c-continued-statement-offset 8)
|
||
(setq c-argdecl-indent 8)
|
||
(setq c-brace-offset -8)
|
||
(setq c-label-offset -8)
|
||
|
||
|
||
Further comments by Ken Stevens:
|
||
|
||
1) Global variables
|
||
The only variables which should be global are constants. If you write
|
||
a routine which changes a global variable, then you will corrupt the
|
||
data when two different players run that routine at the same time.
|
||
|
||
2) Portability.
|
||
The following code:
|
||
char a;
|
||
#ifdef FOO
|
||
unsigned char b;
|
||
#endif FOO
|
||
Should be rewritten as:
|
||
s_char a;
|
||
#ifdef FOO
|
||
u_char b;
|
||
#endif /* FOO */
|
||
|
||
AIX has different conventions for signed chars, and IRIX requires the
|
||
/* comments */ after #endif.
|
||
|
||
3) Cut-and-Paste
|
||
Cut-and-Paste coding is by far the biggest problem that the current
|
||
Empire suffers from. This is how cut-and-paste coding happens. Joe
|
||
Shmuck decides that he wants to add a new function to the server. So
|
||
he goes hunting through the server to find some already existing code
|
||
which does something similar to what he wants to do. This is good.
|
||
You should always write new code to imitate old code. What is bad is
|
||
when Joe Shmuck decides to simply "copy" 200 lines of code from the old
|
||
function into his new function, and then just change a couple of
|
||
little things here and there to suit his needs. This method, known as
|
||
Cut-and-Paste coding is the fastest and easiest way to code. However,
|
||
it results in a server that is impossible to maintain. What Joe
|
||
_should_ have done, is "move" the 200 lines of code into a new _third_
|
||
function which the first two both call. This is called writing a
|
||
"general solution" to handle both cases. Most of my work in the
|
||
Empire2 project consisted in cleaning up after a bonch of Joe Shmucks.
|
||
I took repeated code and "consolidated" it into general function
|
||
libraries.
|
||
|
||
4) Good style.
|
||
Just to add to Dave's "Be Invisible" motto, I'd like to give a little
|
||
example to illustrate some basic do's and don'ts for coding style:
|
||
|
||
The following function has bad style:
|
||
|
||
double att_combat_eff(com,own)
|
||
struct combat *com;
|
||
natid own;{
|
||
double str;
|
||
double eff=1.0;
|
||
if(com->type==EF_SECTOR)
|
||
{
|
||
eff=com->eff/100.0;
|
||
if(com->own==own){
|
||
str=com->sct_dcp->d_ostr;
|
||
}else{
|
||
str=com->sct_dcp->d_dstr;eff=2.0+(str-2.0)*eff;
|
||
}else if(com->type==EF_SHIP&&com->own!=own)
|
||
eff=(1.0+com->shp_mcp->m_armor/100.0);
|
||
return eff;}
|
||
|
||
Here is the same function written with "good" style:
|
||
|
||
double
|
||
att_combat_eff(com, own)
|
||
struct combat *com;
|
||
natid own;
|
||
{
|
||
double eff = 1.0;
|
||
double str;
|
||
|
||
if (com->type == EF_SECTOR) {
|
||
eff = com->eff / 100.0;
|
||
if (com->own == own)
|
||
str = com->sct_dcp->d_ostr;
|
||
else
|
||
str = com->sct_dcp->d_dstr;
|
||
eff = 2.0 + (str - 2.0) * eff;
|
||
} else if (com->type == EF_SHIP && com->own != own)
|
||
eff = (1.0 + com->shp_mcp->m_armor / 100.0);
|
||
|
||
return eff;
|
||
}
|
||
|
||
These are all the things I fixed when changing the bad to the good:
|
||
- Function names should always start a new line (so you can search for them)
|
||
- There should always be a space after a ","
|
||
- Function arguments should be indented 8 spaces
|
||
- There should always be a tab after a type declaration
|
||
- Opening function bracket should be on a line by itself
|
||
- Indentation should be 8 spaces
|
||
- There should always be a space on both sides of every operator
|
||
- There should always be a space after if, for, while, switch
|
||
- The opening bracket should be on the same line as the if
|
||
- There should always be a space on either side of a {
|
||
- There should always be a new line after a ;
|
||
- The closing function bracket should be on a line by itself
|
||
|
||
|