The 9845 Assembler Project
The HP 9845 Assembly Language
The HP 9845 was designed as an easy-to-use system for engineers and scientists rather than for computer specialists. Therefore, BASIC was chosen as the built-in, easy-to-learn programming language. The BASIC interpreter was completely integrated into the operating system ROM, which allowed the system to boot up instantly. Other programming languages like FORTRAN or PASCAL were available as separate development packages on floppy disk.
As with all interpreted programming languages, the BASIC code is converted at runtime into machine instructions (the object code) which can be directly executed by the language processor (LPU). Compared to other BASIC interpreters of that time, the HP 9845 BASIC interpreter was highly optimized: by linking directly to precompiled machine language routines, it spent only a minimum of LPU processing time on interpretation overhead.
Before anyone could program the HP 9845 in BASIC, all the executable machine code for the whole operating system, including the BASIC interpreter itself, had to be developed and stored in the firmware ROMs. For all those system programming tasks another programming language was used, the 9845 assembly language (in fact, the HP engineers also used a semi high-level language which produced intermediate assembler code). Assembly languages are pretty close to the machine instructions themselves and therefore highly machine specific. In principle, they describe exactly the execution of machine instructions step by step.
For convenience, each machine instruction is represented by a three-letter operation code (op-code or mnemonic). As an example, the op-code "LDA" stands for the machine instruction which loads the A register with some data. Each instruction generally consists of the op-code itself, one or more arguments if applicable and, for reference from other instructions or data positions, an optional label. For more efficient programming, assemblers provide additional features like pseudo-opcodes, expression evaluation, error checking or macro capabilities.
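The label / op-code / argument structure described above can be illustrated with a small Python sketch. It is not taken from any HP tool. It assumes, purely for illustration, that a label is a first field ending in a colon and that arguments are separated by commas; the names `SourceLine` and `parse_line` are hypothetical, and comments are not handled.

```python
from typing import List, NamedTuple, Optional


class SourceLine(NamedTuple):
    label: Optional[str]
    opcode: str
    operands: List[str]


def parse_line(line: str) -> SourceLine:
    """Split one assembly source line into label, op-code and operands."""
    text = line.strip()
    label = None
    first, _, remainder = text.partition(" ")
    # Assumption: an optional label is the first field and ends with a colon.
    if first.endswith(":"):
        label = first[:-1]
        text = remainder.strip()
    opcode, _, operand_text = text.partition(" ")
    # Assumption: operands are separated by commas.
    operands = [op.strip() for op in operand_text.split(",") if op.strip()]
    return SourceLine(label, opcode.upper(), operands)
```

For example, `parse_line("Loop: LDA Count")` yields the label `Loop`, the op-code `LDA` and the single operand `Count`.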
Part of a typical 9845 assembler listing (taken from the 9845 Assembly Development Manual)
Whereas the BASIC programming language is normally interpreted, assembler programs are generally compiled, i.e. the assembly language program is converted by an assembler into executable machine code in a batch process. There are exceptions to this rule: some BASIC variants work with compilers (MS Visual Basic does), and there are assemblers which interpret the assembly language instructions during runtime. In fact, the Assembly Execution and Development ROM for the HP 9845 is one of those exceptions. But more about this later on.
Although HP BASIC provides many features which were top-of-the-line during the late seventies, not everything can be done in HP BASIC. For example, HP intentionally left out any means for direct access to system resources. Well-known BASIC calls like PEEK or POKE are missing, as well as any exit into user-written machine routines (as provided by the SYS or USR calls in other BASIC dialects). One reason might have been that the system design with two processors, two shared system buses and banked system memory was too complex for the simple BASIC system model. Another reason was certainly to keep the user consistently away from the system resources in order to stabilize the whole system. Even the Assembly Execution and Development ROM protected some critical system resources against manipulation.
Assembly Execution and Development Option ROM
For us today, this direct system access is essential for a couple of tasks, e.g. for dumping system ROM content for preservation purposes. However, what is not possible with HP BASIC can be done in assembler. HP's programmers used their own assemblers like MASM for this purpose. Advantages of assembler beyond direct system access were highly compact code, higher execution speeds and a higher level of protection for intellectual property. The resulting machine code could be stored and distributed in special binary files of type BPRG. This was also one way in which the built-in BASIC instruction set could be extended by special statements (the other way was the so-called option ROMs, which were just another means to store and install machine programs).
The Assembly Language ROMs
If you own HP's Assembly Execution and Development ROM you're better off, since you do already have a good and quite comfortable environment for at least the first steps in assembly language programming, including very useful debugging features. The associated manual provides a very good introduction into and reference of the HP 9845 assembly language, but doesn't explain assembly programming concepts in general, nor does it tell much about the underlying architecture. So you should already have a basic understanding of assembly language programming before you dive into this bulky manual.
Anyway, study of this manual is highly recommended, since it tells much about how the HP 9845 basically works. There are copies for download at bitsavers (here) or at hpmuseum.net (look here). If you need a general introduction to assembly language programming, there are many sites on the web which teach the fundamentals.
HP's Assembly Language ROMs were first developed for the HP 9835 system, where providing an easy-to-use assembly programming environment was part of the product positioning strategy for the 9835A/B systems. At the time this approach could really be considered revolutionary, since assembly programming had been the domain of system and mainframe programmers, and was always done in a multi-stage batch process. HP's engineers now applied the existing interactive programming environment of HP's enhanced BASIC, with syntax checking on input and real-time debugging, to assembly language. Although the Assembly Language ROMs lacked support for some advanced assembly programming features like macros, and no disassembler was available, this semi-interpretative approach was well done and is still unmatched. With the introduction of the HP 9845B model, the Assembly Language ROMs were also provided for those systems.
By the way, there is another option ROM called Assembly Execution ROM for the HP 9845. This ROM is only useful for loading and executing a special file format of type ASMB, which is used by the Assembly Execution and Development ROM as a saving format for object code modules. Unfortunately, this Assembly Execution ROM cannot (as the name already suggests) be used to develop assembly or machine code. Presumably the Assembly Execution ROM was originally intended as a run time environment for the distribution of OEM assembly programs.
The Assembly Language ROMs were not designed as a standalone assembly language system, but were rather integrated into the BASIC development environment. When the Assembly Execution and Development ROM is plugged in, the BASIC instruction set is extended by a couple of special BASIC instructions. The basic idea was that the assembly language program was in fact part of a BASIC program. This was achieved mainly by three new BASIC instructions: the ISOURCE instruction, the IASSEMBLE instruction and the ICALL instruction.
First, the whole assembly language program has to be included in the BASIC program, with each assembly language line preceded by a normal BASIC line number and the ISOURCE statement. Next, the IASSEMBLE instruction has to be included at the start of the BASIC program. And finally, calls from the BASIC code to the assembly language routines can be programmed with the ICALL instruction anywhere within the BASIC program. All this can be done with the standard 9845 program editor. The assembly language program is structured into modules, and each module can hold an arbitrary number of routines. Both modules and routines are identified by unique names within the assembly language instruction code.
Now the BASIC program (with the assembly language program embedded inside) can be run in the normal way. At the point where the BASIC interpreter encounters the IASSEMBLE statement, the complete assembly language part is translated module by module into relocatable object code. Since all syntax checking and symbolic translation is already done during program input, this is a very fast process. The resulting object code modules are placed one after another in a dedicated RAM area, the ICOM region. When finished, the interpreter resumes the normal execution of BASIC statements until it encounters an ICALL instruction. At this point the appropriate machine code routine is invoked. After execution, control returns to the BASIC line just after the ICALL statement.
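The sequential placement of relocatable modules in the ICOM region can be pictured with a toy model in Python. This is only a sketch of the idea and not the ROM's actual algorithm: the function name `place_modules`, the module names and the region size are all hypothetical, and sizes are counted in words.

```python
def place_modules(modules, icom_size):
    """Assign each module an offset inside a region of icom_size words.

    modules is a list of (name, size_in_words) pairs in assembly order.
    Modules are placed back to back, mirroring the "one after another"
    placement described in the text (a simplifying assumption).
    """
    offsets = {}
    next_free = 0
    for name, size in modules:
        if next_free + size > icom_size:
            raise MemoryError(f"module {name} does not fit into the region")
        offsets[name] = next_free
        next_free += size
    return offsets
```

Because the object code is relocatable, a module only needs to know its offset once it has been placed; with two hypothetical modules of 100 and 50 words, the second one simply starts at offset 100.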
One important fact is that object code created with an assembly ROM, when called with an ICALL statement, will be executed on the PPU (not on the LPU like BASIC programs). This makes it possible to directly access all I/O registers and status lines (including the HLT line for vertical retrace).
Once assembled and placed into the ICOM region, object code modules may be saved to mass storage with the ISTORE and loaded with the ILOAD commands (depending on whether saved to tape or to disk there are different file types used, either the OPRM or the ASMB type). Those files containing relocatable object code routines can only be loaded (and used inside BASIC programs) with either the Assembly Execution ROM or the Assembly Execution and Development ROM installed.
As the assembly language source code is actually part of the BASIC program, it will be saved and loaded together with the BASIC code as a DATA or PROG file. If you load those files without the Assembly Execution and Development ROM installed, you'll get an error message for each line containing assembly language source code; however, the BASIC part will load anyway.
Strictly speaking, ROM-based assembly language development always produces hybrid programs with some BASIC and some assembly language statements. This had great advantages, especially for the programmer who wanted to combine the ease of BASIC programming with the power and level of control of assembly language programming. The debugging and test support of the assembly development ROM for assembly language programs was really ahead of its time. The drawback was that the assembly development ROM intentionally did not support the creation of standalone object code or BPRG binary programs (actually, this is only half of the truth - in fact, with some special techniques, the assembly ROMs could very well be used to create binary programs, and this was actually how HP's developers did it, but it was kept as a well-protected secret). So object code created with the assembly language ROM normally requires the presence of the Assembly Execution and Development ROM or at least the Assembly Execution ROM for execution.
Nevertheless, the assembler concept of the 9835/9845 was highly innovative for the time. See the article "Assembly Programming Capability in a Desktop Computer" in the May 1979 issue of the HP Journal for an overview of the 9835/9845 assembler approach.
The Assembly Execution and Development ROM not only included an editor and an assembler, but also a couple of useful commands, including tracing and manipulation of assembler variables, memory dumps in all representations (decimal, octal, hexadecimal, ASCII), plus getting and setting arbitrary memory locations. The latter was restricted by default to non-critical areas; however, the (undocumented) command IPROTECT [ON|OFF] could be used to get full access to all kinds of memory, including processor registers and base page.
The Assembly Programmer's Utility
In addition to the option ROMs described above, HP provided a utility for translating BASIC programs into assembly language programs, the so-called Assembly Programmer's Utility. It shows that HP seriously tried to do everything to make the development of assembly language programs as easy as possible.
The Assembly Programmer's Utility actually consists of a single program named TRANS, which takes BASIC input either from a file or from the keyboard, and compiles this input into an assembler source file, which then can be translated by the Assembly Execution and Development ROM into executable object code. The package also contains a number of small assembly code libraries and some sample BASIC programs which demonstrate the performance gains of the assembly code.
The compiler can create both stand-alone programs and assembler subroutines, which can later be referenced within a BASIC program via ICALL statements. Of course, an Assembly Execution and Development ROM is required to produce object code modules. Once created, these modules can also be loaded and run with an Assembly Execution ROM. For a few operations like printing, mass storage access or error processing, the Assembly Programmer's Utilities make use of utilities provided by the Assembly ROMs. In every other respect the produced assembly code is completely independent of any BASIC routines implemented in ROM.
There are a few restrictions on the BASIC input code. Probably the most important one is that identifier names can't be longer than 13 characters (the compiler translates those identifiers into internal names by appending two additional characters). Also, only integer arithmetic is supported (no floating point), so EMC processing has to be implemented on your own (see the Assembly Development ROM Manual for how to program maths routines).
The translation scheme is straightforward, and the translated BASIC lines are even included as comments in the produced assembly listing, so it is easy to track the transformation steps. During translation, all activities are logged to the display, and it is fun to watch the utility working. It can take some time to translate a larger BASIC program, and, since only a few references to ROM routines are included, the resulting code can get much larger than the original BASIC program. However, the performance gains during execution of the resulting object code can be quite impressive. The included sample programs show direct comparisons between BASIC and assembler for a couple of computation-intensive applications, including hidden-line removed 3D diagrams and a wireframe modelled 9845 rotating in real-time.
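Before feeding a BASIC program to the translator, the 13-character limit on identifiers can be checked in advance. The following Python sketch is not part of the utility; `overlong_identifiers` is a hypothetical helper, and extracting the identifier names from the BASIC source is left to the caller.

```python
# From the text: identifiers may have at most 13 characters, because the
# translator appends two more characters to form its internal names.
MAX_IDENTIFIER_LENGTH = 13


def overlong_identifiers(names):
    """Return the identifiers which exceed the 13-character limit."""
    return [name for name in names if len(name) > MAX_IDENTIFIER_LENGTH]
```

An empty result means that, as far as this one restriction is concerned, the names are acceptable.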
3DHDN Example - Hidden Line Removal
Apart from the performance gains, the value for the assembly programmer is twofold. First, the Assembly Programmer's Utility can simplify assembly program development by first writing and debugging a program in BASIC and then transferring it into assembly language. The produced assembler program can still be edited and optimized. Second, both the libraries and the produced assembler code provide good examples of how to implement high-level commands like FOR .. NEXT loops, I/O, array or even graphics processing in assembler.
See the download section below for retrieving the Assembly Programmer's Utility programs, including source code listings for libraries and sample programs. Also, have a look at the Assembly Programmer's Utility Manual, which can be downloaded separately at hpmuseum.net.
The ASM45 Assembler
For those who do not own an Assembly Execution and Development ROM, or for those who like the more conventional way of compiling assembly program listings into object code, I have written a batch assembler/disassembler program running under Windows which translates the HP 9845 assembly language into executable object code. It has a very early version number, but should be pretty stable and takes almost anything as input including MASM or Assembly Execution and Development ROM listings. You can download the Windows executable and the source code here.
Basically, the assembler takes a plain text file with some assembly language code and produces one or more files with object code, one for each module. It offers a wide choice of options and supports binary, decimal, octal, hexadecimal, floating point and ASCII text data representations.
Like most assemblers, the system 45 assembler understands a number of pseudo-instructions in addition to the native instruction set of the system 45 processors. These pseudo-instructions are used to control the assembly process. Some of those pseudo-instructions have been taken from the Assembly Execution and Development ROM (mainly for compatibility reasons), some have been added to support the batch assembly process.
Some pseudo-instructions from the Assembly Execution and Development ROM are not implemented because they don't make sense in a batch assembler (i.e. those for source listing control, conditional assembly and BASIC interfacing), however if they are used, they are simply ignored and do not produce an error.
In turn, some additional features have been added, like support for hexadecimal notation in addition to octal and decimal notation (for both numeric input and output, hexadecimal numbers are indicated by a leading dollar sign, like $FFFF).
Note that the ORG pseudo-instruction should be used before a NAM pseudo-instruction for any non-relocatable object module in order to define the starting address. Any address input at the beginning of an assembler input line is completely ignored.
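The dollar-sign notation for hexadecimal numbers can be illustrated with a short Python sketch. It is not Asm45's own code: the function names are hypothetical, values are assumed to be 16-bit words, and padding the output to four digits is a choice of this sketch rather than documented behaviour.

```python
def parse_hex_literal(text):
    """Convert a literal such as "$FFFF" into an integer word value."""
    if not text.startswith("$"):
        raise ValueError("hexadecimal literals start with a dollar sign")
    value = int(text[1:], 16)
    # Assumption: values are 16-bit machine words.
    if not 0 <= value <= 0xFFFF:
        raise ValueError("value does not fit into a 16-bit word")
    return value


def format_hex_word(value):
    """Format a word in dollar-sign notation, padded to four digits."""
    return f"${value & 0xFFFF:04X}"
```

Round-tripping a value through both functions, for instance `format_hex_word(parse_hex_literal("$1F"))`, gives the padded form `$001F`.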
Below is a summary of all available pseudo-instructions for the assembler. See the Assembly Development ROM Manual for a detailed description of those pseudo-instructions.
Pseudo-Instruction | Description |
NAM <name> | Designates the beginning of a module and specifies the module name. |
END <name> | Designates the end of module <name>. |
ORG <address> | (Re-)sets the address counter to <address>. Unless a completely relocatable module is intended, an ORG pseudo-instruction should be used immediately before the module declaration (NAM) in order to define the target address of the module. |
BSS <expression> | Reserves a block of memory. |
LIT <expression> | Reserves memory for literals and links. |
DAT <expression>[,<expression>[,...]] | Defines data generators. |
EQU <expression> | Defines a symbol. |
SET <expression> | Defines a symbol (same as EQU). |
ENT <symbol>[,<symbol>[,...]] | Identifies entry points in the module. |
EXT <symbol>[,<symbol>[,...]] | Identifies external entry points. |
IOF | Turns off automatic indirection by the assembler. |
ION | Turns on automatic indirection by the assembler. |
BON | Turns on automatic base page addressing by the assembler. |
BOF | Turns off automatic base page addressing by the assembler. |
REP | Repeat instruction. |
??? <operand> | Yes, three question marks. Not really a pseudo-instruction. Will be generated by the disassembler if there is no system 45 instruction for the object code. However, the ??? can be used to directly specify the word content at the address with <operand>. If compatibility mode is selected with the -isource option, this pseudo-opcode will be replaced by the DAT pseudo-opcode. |
Note: Pseudo-instructions have precedence over command line options.
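How the NAM and END pseudo-instructions delimit modules, each of which ends up in its own object file, can be sketched in Python. This is an illustration only and not the way Asm45 is implemented; `split_modules` is a hypothetical name, and the sketch assumes that NAM and END appear as the first field of a line, without a preceding label.

```python
def split_modules(lines):
    """Group source lines by module, using NAM <name> and END <name>."""
    modules = {}
    current = None
    for line in lines:
        fields = line.split()
        if len(fields) >= 2 and fields[0].upper() == "NAM":
            # NAM opens a new module with the given name.
            current = fields[1]
            modules[current] = []
        elif fields and fields[0].upper() == "END":
            # END closes the module which is currently open.
            current = None
        elif current is not None and fields:
            modules[current].append(line.strip())
    return modules
```

Lines outside any NAM ... END pair are dropped in this sketch, which keeps it short but is a simplification.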
The assembler is invoked by
asm45 [<options>] <input_file> [<output_file>]
<input_file> must be a valid assembler source file. <output_file> will be used for the object code. If no <output_file> is given, the object code is stored in the file out.obj.
Below is a summary of all available options for the assembler. Execute Asm45 with the -h option for a summary. Also have a look into the README file for up-to-date information.
Option | Description |
-h | Output a summary of the command line options. |
-i <address> | Specifies the initial memory address (the address where the object code shall be installed). As default the memory address 0 is used. When starting with address 0 keep in mind that the first 32 words are reserved for the processor. Also be aware that the base page usually takes the first 512 words of the lower and the last 512 words of the upper memory block. |
-l | Produce a normalized assembler code listing in addition to the object code. The listing will be sent to the standard output. |
-n | Activate line numbering (useful for transferring to Assembly Execution & Development ROM) |
-r | Select ROM mode instead of RAM mode. In ROM mode, all references to base page addresses are of type absolute (in RAM mode, those references are of type base page). |
-s | Output symbol table for the assembler source code. The symbol table will be sent to the standard output. |
-v | Enables more detailed output (mainly for diagnostic purposes) |
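The address ranges mentioned for the -i option can be expressed as two small Python predicates. This is a sketch under stated assumptions: addresses are taken to be 16-bit word addresses, so that the upper memory block ends at $FFFF, and the function names are hypothetical.

```python
BASE_PAGE_HALF = 512        # words at each end of the address space
PROCESSOR_RESERVED = 32     # first words reserved for the processor
ADDRESS_SPACE = 0x10000     # assumption: 16-bit word addresses


def is_base_page(address):
    """True if the address lies in the lower or upper base page half."""
    address &= 0xFFFF
    return address < BASE_PAGE_HALF or address >= ADDRESS_SPACE - BASE_PAGE_HALF


def is_processor_reserved(address):
    """True if the address is one of the first 32 reserved words."""
    return 0 <= address < PROCESSOR_RESERVED
```

A check like this can help when choosing an initial address for a module that should stay clear of both areas.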
The ASM45 Disassembler
Once assembly language programs have been converted into machine code, reading and understanding the code is a very hard task. Not completely impossible, but really tough. The best understanding can be achieved if you know the original developers. If not, the original commented assembly program listing would be the second-best choice. Since those listings often represented man-years of work, they were kept top-secret, and even today it is sometimes hard to get a copy (if there is one at all). The third-best way is to use a so-called disassembler, which re-translates the machine code into an (of course uncommented but at least readable) assembly program listing, and to do the necessary reverse-engineering to reconstruct the original purpose.
For the HP 9825 calculator, the predecessor of the 9845, HP was nice enough to include the commented assembly listings for the complete operating system in the patent documents (at least in the early ones, until they realized what they had done). Unfortunately HP obviously decided not to repeat this for the HP 9845 series, so the operating system of the HP 9845 is, for the most part, still a well-kept secret. We can only hope that HP eventually makes those listings public, provided they still exist.
For the time being, we can only try to dump the ROMs and the BPRG files, and use a disassembler to make them more readable. With the right command line options, the Asm45 program can support this task as a disassembler for object code. In this case, Asm45 takes an object code file as input and produces an assembly language listing, with all internal references being represented by labels. This step is supported by the knowledge of several base page locations, which work as system constants and variables and so can be denoted with the proper symbols. The resulting listing can even be altered and re-compiled to object code again. There are also a number of functions built into the assembler which assist further analysis, like a hexdump utility or a cross-reference generator. See the Asm45 README for details.
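To give an idea of why an ASCII view of object code helps in spotting strings, here is a minimal Python sketch that renders a sequence of 16-bit words as text. It is not Asm45's implementation; the name `words_to_ascii` is hypothetical, and the sketch assumes two characters per word with the high byte first.

```python
def words_to_ascii(words):
    """Render 16-bit words as text, two characters per word.

    Assumption: the high byte of each word holds the first character.
    Bytes outside the printable ASCII range are shown as dots.
    """
    chars = []
    for word in words:
        for byte in ((word >> 8) & 0xFF, word & 0xFF):
            chars.append(chr(byte) if 32 <= byte < 127 else ".")
    return "".join(chars)
```

Applied to the four words $4850, $2039, $3834 and $3500, the sketch shows the text "HP 9845" followed by a dot for the trailing zero byte.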
The disassembler is invoked by
asm45 -d [<options>] <input_file>
<input_file> must be a valid object code file. The disassembly is sent to standard output and can be redirected into a file with the following syntax:
asm45 -d [<options>] <input_file> > <file>
Below is a summary of all available options for the disassembler. Execute Asm45 with the -h option for a summary. Also have a look into the README file for up-to-date information.
Option | Description |
-h | Output a summary of the command line options. |
-i <address> | Specifies the initial memory address (the address where the object code shall be installed). As default the memory address 0 is used. When starting with address 0 keep in mind that the first 32 words are reserved for the processor. Also be aware that the base page usually takes the first 512 words of the lower and the last 512 words of the upper memory block. |
-isource | Produce compatible format usable as input for Assembly Execution & Development ROM (e.g. with ISOURCE statements) |
-lpu | Map LPU system constants into lower base page range. Useful for programming binaries. |
-ppu | Map PPU system constants into lower base page range. Useful for programming binaries. |
-o | Assume octal data representation for output (default is hexadecimal). |
-d | Assume decimal data representation for output (default is hexadecimal). |
-w <start>,<end> | Specifies the memory address window (scope) for the disassembly. If nothing is specified, the disassembler assumes 0 as starting address and disassembles the whole input file. |
-a | Adds the ASCII interpretation of the object code to the disassembly. Useful if parts of the object code contain strings. |
-b | Include used base page symbols as intro with EQU statements. Useful for compiling with base page references when using system constants and variables. |
-t | Adds timing information measured in clock cycles for each instruction to the disassembly. Useful where timing is of relevance. |
-c | Adds a short description for each instruction in form of a comment. Useful if not already familiar with the system 45 instruction set. |
-x | Build a cross reference. Useful to identify variables, subroutines and jump vectors, especially for code analysis. |
-s | Build a string table on a heuristic basis. Useful for the identification of text parts in the object code. |
-m | Produce a memory dump in addition to the disassembly. By default, the memory dump uses hexadecimal representation, however with the -o and the -d option it can be changed to octal or decimal. |
-n | Activate line numbering (useful for transferring to Assembly Execution & Development ROM) |
-v | Enables more detailed output (mainly for diagnostic purposes) |
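For repeated disassembly runs, the invocation shown above can be wrapped in a small Python helper. This is a sketch and not part of the Asm45 package: the helper names and file names are hypothetical, only options from the table above are used, and the window bounds are passed through as strings because the expected number format is not specified here.

```python
import subprocess


def build_disassembler_command(input_file, window=None, octal=False,
                               ascii_column=False, cross_reference=False,
                               executable="asm45"):
    """Build the argument list for: asm45 -d [<options>] <input_file>."""
    command = [executable, "-d"]
    if octal:
        command.append("-o")
    if window is not None:
        start, end = window
        command += ["-w", f"{start},{end}"]
    if ascii_column:
        command.append("-a")
    if cross_reference:
        command.append("-x")
    command.append(input_file)
    return command


def disassemble_to_file(command, listing_path):
    """Run the command and redirect its standard output into a file."""
    with open(listing_path, "w") as listing:
        subprocess.run(command, stdout=listing, check=True)
```

Calling `disassemble_to_file(build_disassembler_command("rom.obj", cross_reference=True), "rom.lst")` corresponds to the shell redirection shown above, provided an asm45 executable is on the search path.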
Downloads
Click here for downloading Asm45:
Asm45 1.1 Windows executable: | asm45-11-bin.zip |
Asm45 1.1 source code package: | asm45-11-src.zip |
Here are sample system firmware disassemblies for several 9845 configurations:
System firmware disassemblies with cross reference, hexdump and string analysis for 4 selected system configurations (including firmware object code): | System-ROM-listings.zip |
Finally, here are the Assembly Programmer's Utilities for download:
Assembly Programmer's Utilities (System and Data disks in hpi format) | 09845-10260 9845B-C Assembly Programming (1981).zip |
Assembly Programmer's Utilities - sources (both disks) | 09845-10264 9845B-C Assembly Programming (1981) - sources.zip |
Any comments on this assembler are highly welcome (as are all attempts to analyze the machine code fragments; see the Software section if you would like to give it a try).