The device_disasm_interface and the disassemblers¶
1. Capabilities¶
The disassemblers are classes that provide disassembly and opcode meta-information for the cpu cores and unidasm. The device_disasm_interface connects a cpu core with its disassembler.
2. The disassemblers¶
2.1. Definition¶
A disassembler is a class that derives from util::disasm_interface. It then has two required methods to implement, opcode_alignment and disassemble, and 6 optional, interface_flags, page_address_bits, pc_linear_to_real, pc_real_to_linear, and one with four possible variants, decrypt8/16/32/64.
2.2. opcode_alignment¶
Returns the required alignment of opcodes by the cpu, in PC-units. In other words, the required alignment for the PC register of the cpu. Tends to be 1 (almost everything), 2 (68000...), 4 (mips, ppc...), which an exceptional 8 (tms 32082 parallel processor) and 16 (tms32010, instructions are 16-bits aligned and the PC targets bits). It must be a power-of-two or things will break.
Note that processors like the tms32031 which have 32-bits instructions but where the PC targets 32-bits values have an alignment of 1.
2.3. disassemble¶
This is the method where the real work is done. This method must disassemble the instruction at address pc and write the result to stream. The values to decode are retrieved from the opcode buffer. A data_buffer object offers four accessor methods:
They read the data at a given address and take endianness and nonlinear PCs for larger-than-bus-width accesses. The debugger variant also caches the read data in one block, so for that reason one should not read data too far from the base pc (e.g. stay within 16K or so, careful when trying to follow indirect accesses).
A number of CPUs have an external signal that splits fetches into an opcode part and a parameter part. This is for instance the M1 signal of the z80 or the SYNC signal of the 6502. Some systems present different values to the cpu depending on whether that signal is active, usually for protection purposes. On these cpus the opcode part should be read from the opcode buffer, and the parameter part from the params buffer. They will or will not be the same buffer depending on the system itself.
The method returns the size of the instruction in PC units, with a maximum of 65535. In addition, if possible, the disassembler should give some meta-information about the opcode by OR-ing in into the result:
STEP_OVER for subroutine calls or auto-decrementing loops. If there is some delay slots, also OR with step_over_extra(n) where n is the number of instruction slots.
STEP_OUT for the return-from-subroutine instructions
In addition, to indicated that these flags are supported, OR the result with SUPPORTED. An annoying number of disassemblers lies about that support (e.g. they do a or with SUPPORTED without even generating the STEP_OVER or STEP_OUT information). Don't do that, it breaks the step over/step out functionality of the debugger.
2.4. interface_flags¶
That optional method indicates specifics of the disassembler. Default of zero is correct most of the time. Possible flags, which need to be OR-ed together, are:
NONLINEAR_PC: stepping to the next opcode or the next byte of the opcode is not adding one to pc. Used for old LFSR-based PCs.
PAGED: PC wraps at a page boundary
PAGED2LEVEL: not only PC wraps at some kind of page boundary, but there are two levels of paging
INTERNAL_DECRYPTION: there is some decryption tucked between reading from AS_PROGRAM and the actual disassembler
SPLIT_DECRYPTION: there is some decryption tucked between reading from AS_PROGRAM and the actual disassembler, and that decryption is different for opcodes and parameters
Note that in practice non-linear pc systems are also paged, that PAGED2LEVEL implies PAGED, and that SPLIT_DECRYPTION implies DECRYPTION.
2.5. pc_linear_to_real and pc_real_to_linear¶
These methods should be present only when NONLINEAR_PC is set in the interface flags. They must convert pc to and from a value to a linear domain where the instruction parameters and next instruction are reached by incrementing the value. pc_real_to_linear converts to that domain, pc_linear_to_real converts back from that domain.
2.6. page_address_bits¶
Present on when PAGED or PAGED2LEVEL is set, gives the number of address bits in the lowest page.
2.7. page2_address_bits¶
Present on when PAGED2LEVEL is set, gives the number of address bits in the upper page.
2.8. decryptnn¶
One of these must be defined when INTERNAL_DECRYPTION or SPLIT_DECRYPTION is set. The chosen one is the one which takes what opcode_alignment represents in bytes.
That method decrypts a given value read from address pc (from AS_PROGRAM) and gives the result which will be passed to the disassembler. In the split decryption case, opcode indicates whether we're in the opcode (true) or parameter (false) part of the instruction.
3. Disassembler interface, device_disasm_interface¶
3.1. Definition¶
A CPU core derives from device_disasm_interface through cpu_device. One method has to be implemented, create_disassembler.
3.2. create_disassembler¶
That method must return a pointer to a newly allocated disassembler object. The caller takes ownership and handles the lifetime.
This method will be called at most one in the lifetime of the cpu object.
4. Disassembler configuration and communication¶
Some disassemblers need to be configured. Configuration can be unchanging (static) for the duration of the run (cpu model type for instance) or dynamic (state of a flag or a user preference). Static configuration can be done through either (a) parameter(s) to the disassembler constructor, or through deriving a main disassembler class. If the information is short and its semantics obvious (like a model name), feel free to use a parameter. Otherwise derive the class.
Dynamic configuration must be done by first defining a nested public
struct called config
in the disassembler, with virtual destructor
and pure virtual methods to pull the required information. A pointer
to that struct should be passed to the disassembler constructor. The
cpu core should then add a derivation from that config struct and
implement the methods. Unidasm will have to derive a small class from
the config class to give the information.
5. Missing stuff¶
There currently is no way for the debugger GUI to add per-core configuration. In particular, it is needed for the s2650 and Saturn cores. It should go through the cpu core class itself, since it's pulled from the config struct.
There is support missing in unidasm for per-cpu configuration. That's needed for a lot of things, see the unidasm source code for the current list ("Configuration missing" comments).