See figure 1.

Retrocomputing and other indulgences.

The HOSTCM protocol

The HOSTCM protocol is a mechanism that allows a Commodore SuperPET or IBM PC running the Waterloo microSystem kernel to communicate with a remote file server over a serial line. This is a brief overview of the internals of the protocol.

Basics

A session between two machines [1] consists of a structured series of synchronous requests from the client and corresponding responses from the host. These can be thought of as forming a simple RPC-like mechanism with ad-hoc serialization. Communication between client and host is encoded as 7-bit ASCII and transmitted over a serial line; speed, stop bits and parity can be set to suit the hardware connecting the host and client, but are not part of the protocol itself.

Syntax

This is a simple EBNF summary of the HOSTCM protocol. Literals are enclosed in quotes, square brackets indicate optional parameters, and whitespace is non-significant.

The protocol is asymmetric; the client and host have different roles. Client machines issue <client-request> messages and accept <host-response> messages as replies; the host does the reverse.

<client-request> ::= <request> <checksum-char> <lineend-char> |
			<quit-request> <lineend-char> |
			<NAK> <lineend-char>
			

<request> ::=   <init-request> |
		<open-request> |
		<close-request> |
		<get-request> |
		<put-request> |
		<diropen-request> |
		<dirread-request> |
		<dirclose-request> |
		<rename-from-request> |
		<rename-to-request> |
		<scratch-request> |
		<seek-request> 


<init-request> ::= "v" <protocol-id>

<open-request> ::= "o" <mode> <format> "(" <type> [":" <rl>] ")" <filename> 

<mode> ::= "r" | "w" | "u" | "a" | "l" | "s"

<format> ::= "t" | "b"

<type> ::= "f" | "v" | "t"

<rl> ::= <ASCII-integer>

<filename> ::= <ASCII-string>

<close-request> ::= "c" <fileid-char>

<put-request> ::= "p" <fileid-char> <eor-char> <data-string>

<get-request> ::= "g" <fileid-char> ["l"]

<diropen-request> ::= "d" [<filename>]

<dirread-request> ::= "f"

<dirclose-request> ::= "k"

<rename-from-request> ::= "w" <filename>

<rename-to-request> ::= "b" <filename>

<scratch-request> ::= "y" <filename>

<seek-request> ::= "r" <fileid-char> <seekoffset>

<quit-request> ::= "q"

<NAK> ::= "N"

<protocol-id> ::= "80"

<fileid-char> ::= <ASCII-char>

<data-string> ::= <ASCII-hex-digit> <ASCII-hex-digit> [<data-string>]

<ASCII-string> ::= <ASCII-char> [<ASCII-string>]
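As an illustration of how the grammar above serializes, here is a minimal sketch that builds the body of an <open-request>. The helper name build_open_request and the convention that rl <= 0 means "omit the optional record length" are my own assumptions, not part of the protocol; the checksum and line-end character still have to be appended before transmission.

```c
#include <stdio.h>

/*
 * build_open_request (hypothetical helper): write
 *   "o" <mode> <format> "(" <type> [":" <rl>] ")" <filename>
 * into out. Passing rl <= 0 leaves out the optional ":" <rl> part;
 * that convention is an assumption of this sketch.
 * Returns the length written, or -1 if out is too small.
 */
static int
build_open_request(char *out, size_t outsize, char mode, char format,
    char type, int rl, const char *filename)
{
	int n;

	if (rl > 0)
		n = snprintf(out, outsize, "o%c%c(%c:%d)%s",
		    mode, format, type, rl, filename);
	else
		n = snprintf(out, outsize, "o%c%c(%c)%s",
		    mode, format, type, filename);

	/* snprintf reports the length it wanted; reject truncation. */
	if (n < 0 || (size_t)n >= outsize)
		return -1;
	return n;
}
```

With mode 'r', format 't', type 'f', a record length of 80 and the filename FOO, this produces the string ort(f:80)FOO.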

Hosts respond to <client-request> messages with <host-response> messages.

<host-response> ::= 
	<response-char> <response> <checksum-char> <lineend-char> <prompt-char> |
	<response-char> <NAK> <lineend-char> <prompt-char>

<response> ::=  <OK-response> |
		<error-response> |
		<open-response> |
		<get-response> |
		<dirread-response> 

<OK-response> ::= "b"

<error-response> ::= "x" <error-string>

<open-response> ::= "b" <fileid-char>

<get-response> ::= "b" <eor-or-eof-char> <data-string>

<eor-or-eof-char> ::= <eor-char> | <eof-char>

<eor-char> ::= "n" | "z"

<eof-char> ::= "e"

<dirread-response> ::= "e" | "b" <ASCII-string>

<lineend-char> is a single ASCII character indicating the end of a message. By default, it is ASCII CR, 0x0d.

<prompt-char> is a single ASCII character indicating that the host is ready for the next message. By default, it is ASCII DC1, 0x11.

<response-char> is a single ASCII character indicating that the host is responding to a client request. By default, it is ASCII DC3, 0x13.

<checksum-char> is a single ASCII character, representing a checksum of the contents of the message up to that point. For a buffer b of size n, it is calculated by

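	/* Maps a 4-bit sum onto the letters A through P. */
	static const char sumchar[] = "ABCDEFGHIJKLMNOP";

	char
	checksum(const char *b, int n)
	{
		int i, s = 0;

		/* Add up the low nibble of each byte in the message. */
		for (i = 0; i < n; i++) {
			s += b[i] & 0xf;
		}

		/* Only the low four bits of the sum are transmitted. */
		return (sumchar[s & 0xf]);
	}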

<lineend-char>, <prompt-char> and <response-char> are all configurable. Obviously, host and client must be configured identically for the protocol to work correctly, and all of these characters must be out-of-band to the remainder of the protocol. I have not tested configurations other than the default.
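Putting the checksum and the default line-end character together, framing an ordinary client request might look like the sketch below. The name frame_request is hypothetical, and the sketch covers only the <request> <checksum-char> <lineend-char> form; per the grammar, <quit-request> and NAK are sent without a checksum.

```c
#include <stddef.h>
#include <string.h>

#define HOSTCM_LINEEND '\r'	/* default <lineend-char>, ASCII CR */

static const char sumchar[] = "ABCDEFGHIJKLMNOP";

/* Same calculation as in the text: sum of low nibbles, low 4 bits kept. */
static char
checksum(const char *b, size_t n)
{
	size_t i;
	int s = 0;

	for (i = 0; i < n; i++)
		s += b[i] & 0xf;
	return sumchar[s & 0xf];
}

/*
 * frame_request (hypothetical helper): copy the request body into out
 * and append <checksum-char> and <lineend-char>, plus a terminating NUL
 * for convenience. Returns the framed length (excluding the NUL), or 0
 * if out is too small.
 */
static size_t
frame_request(const char *request, char *out, size_t outsize)
{
	size_t n = strlen(request);

	if (n + 3 > outsize)	/* body + checksum + lineend + NUL */
		return 0;
	memcpy(out, request, n);
	out[n] = checksum(request, n);
	out[n + 1] = HOSTCM_LINEEND;
	out[n + 2] = '\0';
	return n + 2;
}
```

For the <init-request> "v80", the low nibbles are 6, 8 and 0, which sum to 14, so the checksum character is "O" and the framed message is v80O followed by CR.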

<ASCII-char> represents the set of printable ASCII characters, and <ASCII-hex-digit> corresponds to members of the set "0123456789ABCDEF".
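Since a <data-string> is built from pairs of hex digits, one pair per byte of data, encoding and decoding can be sketched as follows. The function names are hypothetical, and the nibble order (high nibble first) is an assumption on my part; check it against the reference implementation before relying on it.

```c
#include <stddef.h>

static const char hexdigits[] = "0123456789ABCDEF";

/*
 * hex_encode (hypothetical): write two hex digits per input byte.
 * out must have room for 2*n + 1 characters.
 * Assumption: the high nibble is sent first.
 */
static void
hex_encode(const unsigned char *in, size_t n, char *out)
{
	size_t i;

	for (i = 0; i < n; i++) {
		out[2 * i] = hexdigits[in[i] >> 4];
		out[2 * i + 1] = hexdigits[in[i] & 0xf];
	}
	out[2 * n] = '\0';
}

/* Value of one upper-case hex digit, or -1 if c is not one. */
static int
hex_value(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * hex_decode (hypothetical): decode a NUL-terminated data string.
 * Returns the number of bytes written, or -1 on an odd-length or
 * non-hex input, or if out is too small.
 */
static long
hex_decode(const char *in, unsigned char *out, size_t outsize)
{
	size_t n = 0;

	while (in[0] != '\0') {
		int hi = hex_value(in[0]);
		int lo = (in[1] != '\0') ? hex_value(in[1]) : -1;

		if (hi < 0 || lo < 0 || n >= outsize)
			return -1;
		out[n++] = (unsigned char)((hi << 4) | lo);
		in += 2;
	}
	return (long)n;
}
```

Under that assumption, the two bytes 0x48 0x69 travel as the four characters 4869.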

Semantics

In general, the response to any correctly formatted request from the client to the host is either <OK-response> or <error-response>, as appropriate.

Certain messages – <open-request>, <get-request>, and <dirread-request> – have particular variations of the general <OK-response> tailored to their needs.
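On the client side, the first step in handling a reply is to check its framing and look at the character that follows <response-char>. The sketch below does only that, using the default control characters; the enum and function names are hypothetical. It deliberately does not verify the checksum, because the summary above does not say whether <response-char> is included in the sum.

```c
#include <stddef.h>

#define HOSTCM_LINEEND	0x0d	/* default <lineend-char>, ASCII CR */
#define HOSTCM_PROMPT	0x11	/* default <prompt-char>, ASCII DC1 */
#define HOSTCM_RESPONSE	0x13	/* default <response-char>, ASCII DC3 */

enum hostcm_kind {
	HOSTCM_OK,		/* "b": OK, open, get or dirread data */
	HOSTCM_ERROR,		/* "x": <error-response> */
	HOSTCM_END,		/* "e": end of a directory listing */
	HOSTCM_NAK,		/* "N": host asks for a retransmission */
	HOSTCM_MALFORMED
};

/*
 * classify_response (hypothetical): msg holds one complete message,
 * from <response-char> through <prompt-char>, len bytes long.
 */
static enum hostcm_kind
classify_response(const char *msg, size_t len)
{
	/* The shortest legal message is the four-byte NAK. */
	if (len < 4 || msg[0] != HOSTCM_RESPONSE ||
	    msg[len - 2] != HOSTCM_LINEEND || msg[len - 1] != HOSTCM_PROMPT)
		return HOSTCM_MALFORMED;

	/* A NAK carries no checksum, so it is exactly four bytes. */
	if (msg[1] == 'N')
		return len == 4 ? HOSTCM_NAK : HOSTCM_MALFORMED;

	/* Everything else needs at least one body byte plus a checksum. */
	if (len < 5)
		return HOSTCM_MALFORMED;

	switch (msg[1]) {
	case 'b':
		return HOSTCM_OK;
	case 'x':
		return HOSTCM_ERROR;
	case 'e':
		return HOSTCM_END;
	default:
		return HOSTCM_MALFORMED;
	}
}
```

Telling the "b" variants apart (plain OK, <open-response>, <get-response>, <dirread-response>) depends on which request is outstanding, so that is left to the caller.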

At any time, either the host or the client may respond to a message with a NAK; this will cause the receiver of the NAK to repeat its last transmission.
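From the client's point of view, the NAK rule amounts to a retry loop around each request. The sketch below shows one way to write it. The transport structure, the exchange function and the limit of five attempts are all assumptions made for illustration (the protocol summary sets no retry limit), and the fake_line code is only a scripted stand-in for a real serial port.

```c
#include <stddef.h>
#include <string.h>

#define HOSTCM_RESPONSE	0x13	/* default <response-char>, ASCII DC3 */
#define MAX_ATTEMPTS	5	/* assumption: the text sets no limit */

/* Hypothetical transport hooks; a real client would wrap its serial port. */
struct transport {
	int  (*send)(void *ctx, const char *buf, size_t len);	/* 0 if ok */
	long (*recv)(void *ctx, char *buf, size_t size);	/* length or -1 */
	void *ctx;
};

/* A host NAK is <response-char> "N" <lineend-char> <prompt-char>. */
static int
is_nak(const char *msg, long len)
{
	return len >= 2 && msg[0] == HOSTCM_RESPONSE && msg[1] == 'N';
}

/*
 * exchange (hypothetical): send a framed request and return the length
 * of the host's reply, repeating the request each time the host NAKs.
 * Returns -1 on a transport error or after MAX_ATTEMPTS NAKs.
 */
static long
exchange(struct transport *t, const char *req, size_t reqlen,
    char *resp, size_t respsize)
{
	int attempt;

	for (attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
		long n;

		if (t->send(t->ctx, req, reqlen) != 0)
			return -1;
		n = t->recv(t->ctx, resp, respsize);
		if (n < 0)
			return -1;
		if (!is_nak(resp, n))
			return n;
	}
	return -1;
}

/* Scripted stand-in for a serial line: NAKs the first few requests. */
struct fake_line {
	int naks;		/* requests still to be NAKed */
	int sends;		/* requests seen so far */
	const char *reply;	/* what to answer once the NAKs run out */
	size_t replylen;
};

static int
fake_send(void *ctx, const char *buf, size_t len)
{
	struct fake_line *f = ctx;

	(void)buf;
	(void)len;
	f->sends++;
	return 0;
}

static long
fake_recv(void *ctx, char *buf, size_t size)
{
	static const char nak[] = { HOSTCM_RESPONSE, 'N', 0x0d, 0x11 };
	struct fake_line *f = ctx;
	const char *src = nak;
	size_t n = sizeof nak;

	if (f->naks > 0) {
		f->naks--;
	} else {
		src = f->reply;
		n = f->replylen;
	}
	if (n > size)
		return -1;
	memcpy(buf, src, n);
	return (long)n;
}
```

The opposite direction, where the client sends its own NAK after receiving a damaged response, is not shown; it would sit in the caller, after the reply has been checked.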

Details of the various requests and responses follow:

Notes

As mentioned, this is only a brief summary of the protocol. For full details, please consult the reference implementation.

  1. For ease of reference, the user’s machine (PC, SuperPET) will be the “client”, and the remote file server will be the “host”. 

  2. Good luck with that. The Waterloo microSystem file semantics are overly complicated, poorly documented nonsense. Even the SuperPET ROM implementation doesn’t comply with the vague specifications that the manual provides, at least for certain types of local files. As you might guess, this makes figuring out how HOSTCM should behave somewhat difficult.

    Since I’ve been unable to find any period program that actually makes use of the more advanced file system features, it’s probable that no one will ever care. However, if you want to know how I think the modes, types, and formats should interact, have a look at the source.

    In particular, see the routine hostopen() in protocol.c; the file digression.txt also goes into some detail about the more subtle issues. 

  3. The attentive reader will note that this inflates the size of a binary file by a factor of two. No one ever claimed that this was going to be fast. 

—   28 June 2013