empserver/doc/unicode
2005-05-29 22:21:34 +00:00


Unicode changes:
1. login utf-8
Added login options. The first option is utf-8, which sets the
player's PF_UTF8 flag. Default is off.
Syntax
options utf-8 -- turns utf-8 on
options utf-8=1 -- turns utf-8 on
options utf-8=0 -- turns utf-8 off
options -- lists current options and their values
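The accepted forms of the utf-8 option can be sketched as a small
parser. This is only an illustration of the syntax above; the function
name parse_utf8_option() is hypothetical and is not taken from the
server source.

```c
#include <string.h>

/*
 * Hypothetical helper, not the server's actual code.
 * Parse one argument of the "options" login command.
 * Return 1 and store the new setting in *utf8_on for
 * "utf-8", "utf-8=1" and "utf-8=0"; return 0 otherwise.
 */
static int
parse_utf8_option(const char *arg, int *utf8_on)
{
    static const char name[] = "utf-8";
    size_t len = sizeof(name) - 1;

    if (strncmp(arg, name, len) != 0)
        return 0;
    if (arg[len] == '\0') {
        *utf8_on = 1;           /* bare "utf-8" turns it on */
        return 1;
    }
    if (arg[len] == '='
        && (arg[len + 1] == '0' || arg[len + 1] == '1')
        && arg[len + 2] == '\0') {
        *utf8_on = arg[len + 1] == '1';
        return 1;
    }
    return 0;                   /* unknown option or bad value */
}
```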
2. flash and wall
a. Message as command argument
Interpret raw command line as message text rather than normal
text.
b. Multi-line mode
Read message lines as message text rather than normal text.
c. Break long lines
Count the characters using the UTF8 format. This works for both
ASCII and UTF8 formatted strings.
d. Print lines
Print as message text rather than normal text.
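The character counting in 2c. can be sketched as follows. In UTF-8,
every byte except a continuation byte (bit pattern 10xxxxxx) starts a
new character, so counting those bytes gives the right answer for
plain ASCII as well. The names utf8_char_count() and
utf8_break_offset() are hypothetical, and the sketch assumes the input
is well-formed.

```c
#include <stddef.h>

/* Count characters, not bytes, in a NUL-terminated string. */
static size_t
utf8_char_count(const char *s)
{
    size_t n = 0;

    for (; *s; s++) {
        /* skip continuation bytes 10xxxxxx */
        if (((unsigned char)*s & 0xc0) != 0x80)
            n++;
    }
    return n;
}

/*
 * Return the byte offset at which to break the string so that the
 * first part holds at most max_chars characters and no multi-byte
 * character is split.
 */
static size_t
utf8_break_offset(const char *s, size_t max_chars)
{
    size_t i = 0, n = 0;

    while (s[i]) {
        if (((unsigned char)s[i] & 0xc0) != 0x80) {
            if (n == max_chars)
                break;          /* next character would not fit */
            n++;
        }
        i++;
    }
    return i;
}
```

Breaking on a byte count instead could cut a multi-byte character in
half, which is why the offset is computed from character starts.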
3. Telexes and telex-like things
a. read and wire, MOTD and gamedown message
Print as message text rather than normal text.
c. tele, anno, pray, turn.
Read as message text rather than normal text.
4. Input filtering
a. Parsing commands (normal text)
Ignore control and non-ASCII characters when copying argument
strings.
b. Reading normal text command arguments
Replace control and non-ASCII characters, except for tab, with
'?'.
c. Reading message text command arguments
Support message text arguments, used by 3a. and 2b. Replace
control and, if NF_UTF8 is off, non-ASCII characters.
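A minimal sketch of the message text rule in 4c. follows. The function
name filter_message_text() is hypothetical, and the use of '?' as the
replacement character is an assumption carried over from 4b. The
sketch works on a single argument string, so CR and LF are treated
like any other control character, and it does not check that
non-ASCII bytes form valid UTF-8.

```c
/*
 * Hypothetical sketch, not the server's actual code.
 * Filter a message text argument in place.  utf8_on is nonzero when
 * the client has selected UTF-8 mode.
 */
static void
filter_message_text(char *s, int utf8_on)
{
    for (; *s; s++) {
        /* char may be signed, so look at the byte as unsigned */
        unsigned char c = (unsigned char)*s;

        if (c == '\t')
            continue;           /* tab is valid in message text */
        if (c < 0x20 || c == 0x7f)
            *s = '?';           /* ASCII control character */
        else if (c > 0x7f && !utf8_on)
            *s = '?';           /* non-ASCII byte in ASCII mode */
    }
}
```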
5. Output filtering
Output filtering assumes that there are no control characters or
invalid characters in the output messages: either they have been
filtered out during input filtering, or the server does not
generate them in the first place.
a. Printing normal text
When NF_UTF8 is on, highlighted text is printed using SO/SI.
b. Printing message text
When NF_UTF8 is off, replace UTF8 characters with '?'.
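The conversion in 5a. can be sketched like this: each run of bytes
with the 8th bit set is emitted with the bit cleared, wrapped in SO
(0x0e) and SI (0x0f). The function name highlight_to_so_si() and the
buffer handling are assumptions made for illustration only.

```c
#include <stddef.h>

#define SO 0x0e     /* ASCII Shift Out, starts a highlighted run */
#define SI 0x0f     /* ASCII Shift In, ends a highlighted run */

/*
 * Hypothetical sketch, not the server's actual code.
 * Copy normal text from src to dst, turning runs of bytes that have
 * the 8th bit set into SO ... SI blocks.  Output is truncated if dst
 * is too small.  Return the number of bytes written, not counting
 * the terminating NUL.
 */
static size_t
highlight_to_so_si(char *dst, size_t dstsize, const char *src)
{
    size_t n = 0;
    int in_highlight = 0;

    if (dstsize == 0)
        return 0;
    for (; *src; src++) {
        unsigned char c = (unsigned char)*src;
        int hi = (c & 0x80) != 0;

        /* room for SO or SI, the character, a closing SI and NUL */
        if (n + 4 > dstsize)
            break;
        if (hi != in_highlight) {
            dst[n++] = hi ? SO : SI;
            in_highlight = hi;
        }
        dst[n++] = (char)(c & 0x7f);    /* strip the highlight bit */
    }
    if (in_highlight)
        dst[n++] = SI;                  /* close an open run */
    dst[n] = '\0';
    return n;
}
```

A buffer of 3 * strlen(src) + 1 bytes is always large enough, since
the worst case is a lone highlighted character becoming SO, the
character, and SI.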
Definitions:
1. Normal Text
For normal text, the following ASCII characters are valid:
CR, LF and 0x20-0x7e. Normally, LF is a termination action
event. Normally, CR is not used except by the server.
Normal text does not support UTF8 characters. In normal
text, the 8th bit is used as a highlight bit. If the client
has the utf8 nation flag set, the highlight bit is removed
and the highlight block is prefixed with SO (ASCII Shift Out,
0x0e) and suffixed with SI (ASCII Shift In, 0x0f).
2. Message Text
For message text, the following ASCII characters are valid:
Tab, CR, LF and 0x20-0x7e. Normally, LF is a termination
action event. Normally, CR is not used except by the server.
Message text also supports UTF8 characters if the utf8 nation
flag is turned on; otherwise only the ASCII characters are
supported.
Notes:
1. Strings that are considered message text are commented.
2. Both normal and message text are char strings in the server.
Care needs to be taken, as some compilers treat char as
signed and others default to unsigned char.
3. Unicode functions are prefixed with u.
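Note 2 matters whenever a byte is tested against 0x80 or a range.
Where char is signed, a byte such as 0xe9 is negative, so a plain
comparison like c > 0x7f is never true. A sketch of the safe pattern,
with hypothetical helper names:

```c
/*
 * Hypothetical helpers, not the server's actual code.
 * Both convert to unsigned char first, so they behave the same
 * whether the compiler treats plain char as signed or unsigned.
 */

/* Nonzero if the 8th bit (highlight bit or non-ASCII byte) is set. */
static int
has_high_bit(char c)
{
    return ((unsigned char)c & 0x80) != 0;
}

/* Nonzero if c is in the printable ASCII range 0x20-0x7e. */
static int
is_printable_ascii(char c)
{
    unsigned char uc = (unsigned char)c;

    return uc >= 0x20 && uc <= 0x7e;
}
```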
Notes for Client Implementors:
ASCII Mode
1. If you do not specify any login options, the server will start the
session in ASCII mode.
2. This is close to the previous mode (<4.2.21) but there is more filtering
to remove non-ASCII characters and ASCII control characters.
3. If another client in UTF8 mode sends a message to this client, the
server will replace the non-ASCII characters with question marks.
4. The standout works the same as before where the 8th bit indicates that
the character should be highlighted.
UTF8 Mode
1. The login options must be specified before the play command is sent.
The syntax is 'options utf-8'.
2. The server will filter ASCII control characters but will pass any characters
with the 8th bit set.
3. For the standout mode, the server sends an ASCII SO character at the
beginning of the standout sequence and an ASCII SI character at the end of
the standout sequence.
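On the client side, a UTF8 mode client has to recognize SO (0x0e) and
SI (0x0f) in the server output and render the text between them as
highlighted. The sketch below maps them to the ANSI terminal sequences
for reverse video and attribute reset; that mapping, and the name
so_si_to_ansi(), are illustrative choices only, and a client may
render standout any way it likes.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical client-side sketch.
 * Copy server output from src to dst, replacing SO with ESC[7m
 * (reverse video) and SI with ESC[0m (reset).  All other bytes,
 * including UTF-8 sequences, pass through unchanged.  Output is
 * truncated if dst is too small.  Return the number of bytes
 * written, not counting the terminating NUL.
 */
static size_t
so_si_to_ansi(char *dst, size_t dstsize, const char *src)
{
    static const char on[] = "\033[7m";
    static const char off[] = "\033[0m";
    size_t n = 0;

    if (dstsize == 0)
        return 0;
    for (; *src; src++) {
        /* room for a 4-byte sequence plus the NUL */
        if (n + sizeof(on) > dstsize)
            break;
        if (*src == 0x0e) {
            memcpy(dst + n, on, 4);
            n += 4;
        } else if (*src == 0x0f) {
            memcpy(dst + n, off, 4);
            n += 4;
        } else {
            dst[n++] = *src;
        }
    }
    dst[n] = '\0';
    return n;
}
```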