The HOSTCM protocol is a mechanism that allows a Commodore SuperPET or IBM PC running the Waterloo microSystem kernel to communicate with a remote file server over a serial line. This is a brief overview of the internals of the protocol.
Basics
A session between two machines[1] consists of a structured series of synchronous requests from the client and corresponding responses from the host. These can be thought of as forming a simple RPC-like mechanism with ad-hoc serialization. Communication between client and host is encoded as 7-bit ASCII and transmitted over a serial line; speed, stop bits, and parity can be set depending on the hardware connecting the host and client, but are not part of the protocol itself.
Syntax
This is a simple EBNF summary of the HOSTCM protocol. Literals are enclosed in quotes, square brackets indicate optional parameters, and whitespace is non-significant.
The protocol is asymmetric; the client and host have different roles.
Client machines issue <client-request> messages and accept <host-response> messages as replies; the host does the reverse.
<client-request> ::= <request> <checksum-char> <lineend-char> |
<quit-request> <lineend-char> |
<NAK> <lineend-char>
<request> ::= <init-request> |
<open-request> |
<close-request> |
<get-request> |
<put-request> |
<diropen-request> |
<dirread-request> |
<dirclose-request> |
<rename-from-request> |
<rename-to-request> |
<scratch-request> |
<seek-request>
<init-request> ::= "v" <protocol-id>
<open-request> ::= "o" <mode> <format> "(" <type> [":" <rl>] ")" <filename>
<mode> ::= "r" | "w" | "u" | "a" | "l" | "s"
<format> ::= "t" | "b"
<type> ::= "f" | "v" | "t"
<rl> ::= <ASCII-integer>
<filename> ::= <ASCII-string>
<close-request> ::= "c" <fileid-char>
<put-request> ::= "p" <fileid-char> <eor-char> <data-string>
<get-request> ::= "g" <fileid-char> ["l"]
<diropen-request> ::= "d" [<filename>]
<dirread-request> ::= "f"
<dirclose-request> ::= "k"
<rename-from-request> ::= "w" <filename>
<rename-to-request> ::= "b" <filename>
<scratch-request> ::= "y" <filename>
<seek-request> ::= "r" <fileid-char> <seekoffset>
<quit-request> ::= "q"
<NAK> ::= "N"
<protocol-id> ::= "80"
<fileid-char> ::= <ASCII-char>
<data-string> ::= <ASCII-hex> <ASCII-hex> [<data-string>]
<ASCII-string> ::= <ASCII-char> [<ASCII-string>]
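As an illustration of the grammar above, the following sketch formats the body of an <open-request>. The helper name build_open_request and its parameters are hypothetical and are not taken from the reference implementation; the sketch emits no whitespace (which the grammar treats as non-significant) and does not add the checksum or line-end characters.

```c
#include <stdio.h>

/*
 * Format an <open-request> body into out, without checksum or line end.
 * rl <= 0 means "omit the optional record length".
 * Returns the number of characters written, or -1 if out is too small.
 * (build_open_request is a hypothetical helper for illustration only.)
 */
int
build_open_request(char *out, size_t outsz, char mode, char format,
                   char type, int rl, const char *filename)
{
    int n;

    if (rl > 0)
        n = snprintf(out, outsz, "o%c%c(%c:%d)%s",
                     mode, format, type, rl, filename);
    else
        n = snprintf(out, outsz, "o%c%c(%c)%s",
                     mode, format, type, filename);

    /* snprintf reports the length it wanted; treat truncation as failure. */
    if (n < 0 || (size_t)n >= outsz)
        return -1;
    return n;
}
```

Under these assumptions, opening hello.txt for reading as a text file would produce "ort(t)hello.txt", and opening data.rec for binary writing with 128-byte fixed records would produce "owb(f:128)data.rec".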
Hosts respond to <client-request> messages with <host-response> messages.
<host-response> ::=
<response-char> <response> <checksum-char> <lineend-char> <prompt-char> |
<response-char> <NAK> <lineend-char> <prompt-char>
<response> ::= <OK-response> |
<error-response> |
<open-response> |
<get-response> |
<dirread-response>
<OK-response> ::= "b"
<error-response> ::= "x" <error-string>
<open-response> ::= "b" <fileid-char>
<get-response> ::= "b" <eor-or-eof-char> <data-string>
<eor-or-eof-char> ::= <eor-char> | <eof-char>
<eor-char> ::= "n" | "z"
<eof-char> ::= "e"
<dirread-response> ::= "e" | "b" <ASCII-string>
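A client has to take a <host-response> apart before it can interpret it. The sketch below checks only the framing, using the default framing characters described below (DC3, CR, DC1); it does not verify the checksum. The names split_host_response and frame_kind are hypothetical.

```c
#include <stddef.h>

/* Default framing characters, as given in the text. */
#define RESPONSE_CHAR 0x13  /* DC3 */
#define LINEEND_CHAR  0x0d  /* CR  */
#define PROMPT_CHAR   0x11  /* DC1 */

enum frame_kind { FRAME_BAD, FRAME_NAK, FRAME_RESPONSE };

/*
 * Inspect one complete host message of n bytes.
 * On FRAME_RESPONSE, *body and *bodylen describe the <response> part and
 * *sum holds the <checksum-char>; the checksum is NOT verified here.
 */
enum frame_kind
split_host_response(const char *msg, size_t n,
                    const char **body, size_t *bodylen, char *sum)
{
    /* The shortest legal message is the 4-byte NAK: DC3 'N' CR DC1. */
    if (n < 4 || msg[0] != RESPONSE_CHAR ||
        msg[n - 2] != LINEEND_CHAR || msg[n - 1] != PROMPT_CHAR)
        return FRAME_BAD;

    if (n == 4)
        return msg[1] == 'N' ? FRAME_NAK : FRAME_BAD;

    /* Otherwise: DC3 <response> <checksum-char> CR DC1. */
    *body = msg + 1;
    *bodylen = n - 4;
    *sum = msg[n - 3];
    return FRAME_RESPONSE;
}
```

Because a checksummed response needs at least one response character plus the checksum, a four-byte message can only be a NAK, which is how the sketch tells the two forms apart.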
<lineend-char> is a single ASCII character indicating the end of a message. By default, it is ASCII CR, 0x0d.
<prompt-char> is a single ASCII character indicating that the host is ready for the next message. By default, it is ASCII DC1, 0x11.
<response-char> is a single ASCII character indicating that the host is responding to a client request. By default, it is ASCII DC3, 0x13.
<checksum-char> is a single ASCII character, representing a checksum of the contents of the message up to that point. For a buffer b of size n, it is calculated by:
static char sumchar[] = "ABCDEFGHIJKLMNOP";

/* Sum the low nibble of each byte, then map the low 4 bits of the sum to 'A'..'P'. */
char
checksum(char *b, int n)
{
    int i, s = 0;

    for (i = 0; i < n; i++) {
        s += b[i] & 0xf;
    }
    return (sumchar[s&0xf]);
}
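To make the checksum concrete, here is a minimal sketch that frames a client request by appending the <checksum-char> and the default <lineend-char>. It assumes that the checksum of a client request covers exactly the <request> body; frame_request is a hypothetical helper, and checksum() is repeated only so that the example compiles on its own.

```c
#include <string.h>

static char sumchar[] = "ABCDEFGHIJKLMNOP";

/* Same calculation as the checksum() routine above. */
static char
checksum(const char *b, int n)
{
    int i, s = 0;

    for (i = 0; i < n; i++)
        s += b[i] & 0xf;
    return sumchar[s & 0xf];
}

/*
 * Write request + <checksum-char> + CR into out, NUL-terminated.
 * Returns the frame length (excluding the NUL), or -1 if out is too small.
 * (frame_request is a hypothetical helper for illustration only.)
 */
int
frame_request(char *out, size_t outsz, const char *request)
{
    size_t n = strlen(request);

    if (n + 3 > outsz)
        return -1;
    memcpy(out, request, n);
    out[n] = checksum(request, (int)n);
    out[n + 1] = '\r';              /* default <lineend-char> */
    out[n + 2] = '\0';
    return (int)(n + 2);
}
```

For the init request "v80", the low nibbles are 6, 8 and 0, which sum to 14, so the checksum character is "O" and the framed request is "v80O" followed by CR.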
<lineend-char>
, <prompt-char>
and <response-char>
are all configurable.
Obviously, host and client must be configured
identically for the protocol to work correctly, and all of these characters must be out-of-band to the remainder
of the protocol. I have not tested configurations other than the default.
<ASCII-char> represents the set of printable ASCII characters, and <ASCII-hex> corresponds to members of the set "0123456789ABCDEF".
Semantics
In general, the response to any correctly formatted request from the client to the host is either <OK-response> or <error-response>, as appropriate. Certain messages (<open-request>, <get-request>, and <dirread-request>) have particular variations of the general <OK-response> tailored to their needs.
At any time, either the host or the client may respond to a message with a NAK; this will cause the receiver to repeat the last transmission.
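The NAK rule can be sketched as a small decision function on the client side. This is only an illustration under stated assumptions: the text does not say what triggers a NAK or how many retries are allowed, so the sketch assumes that a client sends a NAK when a reply fails its checksum and gives up after an arbitrary MAX_RETRIES. All names here are hypothetical.

```c
#include <stddef.h>

#define RESPONSE_CHAR 0x13  /* default <response-char>, DC3 */
#define MAX_RETRIES   3     /* arbitrary; the protocol text gives no limit */

enum action {
    ACT_ACCEPT,          /* reply is usable */
    ACT_RESEND_REQUEST,  /* host sent a NAK: repeat our last request */
    ACT_SEND_NAK,        /* reply was damaged: ask the host to repeat it */
    ACT_GIVE_UP          /* too many retries */
};

/*
 * Decide what to do after receiving one complete host message.
 * checksum_ok is the caller's verdict on the reply's checksum (assumed to
 * be the trigger for a client NAK); retries counts earlier repeats of
 * this same exchange.
 */
enum action
next_action(const char *reply, size_t n, int checksum_ok, int retries)
{
    /* A host NAK is <response-char> 'N' <lineend-char> <prompt-char>. */
    int host_nak = (n == 4 && reply[0] == RESPONSE_CHAR && reply[1] == 'N');

    if (!host_nak && checksum_ok)
        return ACT_ACCEPT;
    if (retries >= MAX_RETRIES)
        return ACT_GIVE_UP;
    return host_nak ? ACT_RESEND_REQUEST : ACT_SEND_NAK;
}
```

A caller would loop on this function, resending its framed request or sending "N" followed by the line-end character as directed, and incrementing retries each time.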
Details of the various requests and responses follow:
- <init-request>: start a session with the host. The <protocol-id> is a two-digit ASCII number, presumably corresponding to the version of the protocol that is supported by the client. This document describes <protocol-id> “80”.
- <open-request>: open a file on the host, returning a <fileid-char> if successful.
  - <mode> indicates the mode in which the file will be accessed: “r” (read), “w” (write), “u” (update), “a” (append), “l” (load, read for executable files), or “s” (store, write for executable files).
  - <format> indicates the open format (which is different from the file type, see below). It can be “t” for text, or “b” for binary.
  - <type> is the type of the file; one of “f” (fixed record length), “v” (variable record length), or “t” (text). If the file has fixed-length records, the record length may be encoded in the optional part of the open request. If the file has fixed-length records but the record length is not specified, it defaults to 80.
  For more details of how these parameters interact, see the Waterloo System Overview document.[2]
  If successful, the host responds with an <open-response>, and provides a <fileid-char> to identify the open file for subsequent requests. The identifier is valid until the file is closed.
- <close-request>: close a previously opened file. <fileid-char> is the token returned from the corresponding <open-request>.
- <get-request>: read data from a previously opened file. Returns data starting at the current file location, and extending to the end of the current record or until the response buffer is filled, whichever comes first. For text files, a record is considered to extend to the next end of line.
  The “l” parameter appears to mean that the file pointer should be rewound to the beginning of the current record, and the data read starting at that point. This is a supposition, as I have no examples of programs which use the facility, and the documentation is silent on the matter.
  For the <get-response>, <eor-char> indicates if the returned data concludes a record (“z”), or if the current record continues beyond the file position at the end of the read (“n”). A response that includes <eof-char> indicates that end of file has been reached; this is only sent once no more data remains to be transmitted.
  <data-string> contains the requested file data. If the file type is binary, this parameter is encoded; each byte of data is sent as two ASCII hex digits (high nibble first) so as to pass through a 7-bit channel.[3] Text file data is sent unchanged.
- <put-request>: write data to an open file, at the current position. In the case of binary files, the <data-string> value is encoded in the same way as for <get-request>.
- <diropen-request>: open the named directory for reading, or the current directory if no name is specified. Unlike <open-request>, only one directory can be open at a time, so no identifier is returned.
- <dirread-request>: read the next directory entry, and return a formatted string that shows a user-readable version of the entry data. The details of the information in this string are implementation-dependent. If there are no more entries to read, the host replies with “e”.
- <dirclose-request>: close the currently open directory.
- <rename-from-request>: provide the source file to be renamed by a subsequent <rename-to-request>.
- <rename-to-request>: rename the file named in the immediately preceding <rename-from-request> to the provided name.
- <seek-request>: seek to a position in a previously opened file. The seek offset is a 16-bit quantity. It is unclear whether the offset refers to a record number or a byte offset. The reference implementation presumes the latter, but this choice is arbitrary.
- <quit-request>: the client wishes to terminate the HOSTCM session. The HOSTCM server should release the serial line, and terminate if appropriate.
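The binary <data-string> encoding used by <get-request> and <put-request> (two ASCII hex digits per byte, high nibble first) can be sketched as follows. The function names are hypothetical, and the decoder assumes uppercase digits only, matching the hex digit set given in the syntax section.

```c
#include <stddef.h>

static const char hexdigits[] = "0123456789ABCDEF";

/* Encode n bytes as 2*n hex digits, high nibble first.
 * out must have room for 2*n + 1 characters (including the NUL). */
void
hex_encode(const unsigned char *in, size_t n, char *out)
{
    size_t i;

    for (i = 0; i < n; i++) {
        out[2 * i]     = hexdigits[in[i] >> 4];
        out[2 * i + 1] = hexdigits[in[i] & 0xf];
    }
    out[2 * n] = '\0';
}

/* Value of one uppercase hex digit, or -1 if c is not one. */
static int
hexval(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/* Decode n hex digits into n/2 bytes.
 * Returns the byte count, or -1 on odd length or an invalid digit. */
long
hex_decode(const char *in, size_t n, unsigned char *out)
{
    size_t i;

    if (n % 2 != 0)
        return -1;
    for (i = 0; i < n; i += 2) {
        int hi = hexval(in[i]);
        int lo = hexval(in[i + 1]);

        if (hi < 0 || lo < 0)
            return -1;
        out[i / 2] = (unsigned char)((hi << 4) | lo);
    }
    return (long)(n / 2);
}
```

For example, the three bytes 0x00, 0x7f, 0xff encode to "007FFF", and decoding that string returns the original three bytes.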
Notes
As mentioned, this is only a brief summary of the protocol. For full details, please consult the reference implementation.
[1] For ease of reference, the user’s machine (PC, SuperPET) will be the “client”, and the remote file server will be the “host”.
[2] Good luck with that. The Waterloo microSystem file semantics are overly complicated, poorly documented nonsense. Even the SuperPET ROM implementation doesn’t comply with the vague specifications that the manual provides, at least for certain types of local files. As you might guess, this makes figuring out how HOSTCM should behave somewhat difficult.
Since I’ve been unable to find any period program that actually makes use of the more advanced file system features, it’s probable that no one will ever care. However, if you want to know how I think the modes, types, and formats should interact, have a look at the source.
In particular, see the routine hostopen() in protocol.c; the filedigression.txt also goes into some detail about the more subtle issues.
[3] The attentive reader will note that this inflates the size of a binary file by a factor of two. No one ever claimed that this was going to be fast.