DTV (and C64) Programming - Using the CC65 toolchain

From Wikipedia:

The C64 Direct-to-TV, called C64DTV for short, is a single-chip implementation of the Commodore 64 computer, contained in a joystick (modeled after the mid-1980s Competition Pro joystick), with 30 built-in games. The design is similar to the Atari Classics 10-in-1 TV Game. The circuitry of the C64DTV was designed by Jeri Ellsworth, a computer chip designer who had previously designed the C-One.

These are my notes on programming for the C64 DTV and making use of some of the extended functionality of the hardware from within C, rather than 6502 assembly. I hope you find these useful - whilst I came to the DTV more than a decade after its popularity peaked, I've found it a very capable little system!

  • You can compile, assemble and link in one pass using cl65. For projects with multiple source files it is often clearer to run the stages separately: cc65, then ca65, then ld65.
  • There is no distinct target for the DTV; use the CC65 target flag '-t c64'.
  • Compiler optimisation is enabled by '-O'; maximum optimisation is achieved by '-Oirs'.
  • The C64 runtime needs to be linked against 'c64.lib', which should be the last command line argument to ld65.

This example makefile uses CC65 to compile, assemble and then link our source code. It assumes the following about our project layout:

  • Source code is stored in ./src/
  • Intermediary assembler and object files are created in ./bin/
  • Fully linked programme output is copied into ./dist/
  • The Vice x64dtv emulator is installed to run the resulting file(s).

Extend the OBJFILES value with any additional code that needs compiling, and add a stanza to match the bin/main.o entry for each. A macro to glob src/*.c would be an improvement, especially if compiling many different source files.

#################################
# Path to tools
#################################
CC 	= cc65
AS   	= ca65
LD 	= ld65
VICE    = x64dtv
RM	= rm
RMFLAGS	= -f -v
 
#################################
# Compiler flags
#################################
LIB = 
SYSTEM = dtv
EXTRA_INCLUDES = -I./src -I./common 
CFLAGS = -Oirs -t c64 
CFLAGS_DEBUG = -Oirs -t c64 -D DEBUG=1
ASFLAGS = -t c64
LDFLAGS = -t c64
EXTRA_LIBS = c64.lib
LIBRARY_PATH =
 
#################################
# What our application is named
#################################
TARGET_NAME = scrolls.prg
TARGET = bin/$(TARGET_NAME) 
 
#################################
# Targets to build/run
#################################
.PHONY: all full run copy doit clean
all: $(TARGET) 
full: all
run: all copy doit
 
#################################
# List of all needed game object files
#################################
 
OBJFILES = bin/main.o
 
###########################################################
# This builds our game binary from the defined OBJFILES
###########################################################
 
$(TARGET): $(OBJFILES)
	@echo ""
	@echo "========================================"
	@echo " -= Linking $(TARGET) =-"
	@echo ""
	@echo Linking main binary....
	$(LD) $(LDFLAGS) $(LIBRARY_PATH) -o $(TARGET) $(OBJFILES) $(EXTRA_LIBS)
	@echo "======= END OF $(TARGET) ==============="
	@echo ""
	@echo ""
 
#######################################
# C sections
#######################################
 
#### Main ####
bin/main.o: src/main_dtv.c
	@mkdir -p bin
	$(CC) $(CFLAGS) $(EXTRA_INCLUDES) src/main_dtv.c -o bin/main.s
	$(AS) $(ASFLAGS) bin/main.s -o bin/main.o
 
###############################
# Copy in game assets
###############################
copy:
	@echo ""
	@echo "=========================="
	@echo " Copying game assets"
	@echo ""
	@echo "- Copying files..."
	rm -rf dist/*
	mkdir -p dist/assets
	cp -v $(TARGET) dist/
 
	@echo ""
	@echo "=========================="
 
###############################
# Run binary
###############################
doit:
	@echo ""
	@echo "=========================="
	@echo " Running game"
	@echo ""
	cd dist && $(VICE) -autostart $(TARGET_NAME) ; cd - 
 
###############################
# Clean up
###############################
clean:
	@echo ""
	@echo "=========================="
	@echo " Cleaning up"
	@echo ""
	@echo "- Old object files..."
	rm -rf bin/*
	rm -rf dist/*
	@echo ""
	@echo "=========================="

The C64 DTV is a close, though not 100% accurate, re-implementation of the C64, and it also has a substantial number of improvements over the original system:

  • 2MB RAM
  • Blitter
  • Faster memory/processor
  • New video modes
  • Digital 8bit sound

Some of these are exposed directly in CC65, whereas others need to be enabled or toggled by writing to various addresses in memory.

Detecting DTV

#include <accelerator.h>
unsigned char detect_c64dtv(void);

Returns 1 if the code is running on a C64 DTV2/3. Otherwise returns 0.

#include <stdio.h>
#include <accelerator.h>
 
int main(){
 
	if (detect_c64dtv()){
		printf("Hello, DTV world\n");
	} else {
		printf("Hello, C64 world\n");
	}
	return 0;	
}

Output:


Get/Set DTV CPU Speed

#include <accelerator.h>
unsigned char get_c64dtv_speed(void);
unsigned char __fastcall__ set_c64dtv_speed(unsigned char speed);

The DTV can (optionally) run faster than the normal C64. It has two speeds: 0 (SPEED_SLOW / SPEED_1X) and 1 (SPEED_2X).

#include <stdio.h>
#include <accelerator.h>
 
int main(){
	unsigned char s;
	if (detect_c64dtv()){
		s = get_c64dtv_speed();
		printf("Hello, DTV world, speed is %d\n", s);
		s = set_c64dtv_speed(SPEED_2X);
		printf("Speed is now %d\n", s);
	} else {
		printf("Hello, C64 world\n");
	}
	return 0;	
}

Output


These functions raise the DTV above a simple C64 clone and add substantially improved functionality to the system: increased colours, digital sound and a dedicated image blitter.

Enabling 320x200 Linear Framebuffer

One of the main party tricks of the DTV is that it has support for a 320×200, 8bpp (256 colour) screen mode. But this takes up more memory (64000 bytes) than is available in the remainder of the CPU-accessible working RAM of the computer. Hence the framebuffer needs to be created in the memory region of 64kb-2048kb, known as extended/high memory.
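As a rough illustration of the arithmetic above, this sketch works out the size of the framebuffer and the extended-memory address of a single pixel. The names LFB_WIDTH, LFB_HEIGHT, LFB_BASE and lfb_pixel_address() are made up for this example and are not part of CC65; LFB_BASE simply mirrors the FRAMEBUFFER_0 value used later on this page.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for this sketch; not part of CC65 or the DTV docs. */
#define LFB_WIDTH   320U
#define LFB_HEIGHT  200U
#define LFB_BASE    0x010000UL  /* same value as FRAMEBUFFER_0 on this page */

/* One byte per pixel, so the buffer is 320 * 200 = 64000 bytes. */
#define LFB_SIZE    ((uint32_t)LFB_WIDTH * LFB_HEIGHT)

/* Address of pixel (x, y) in extended memory. The cast to uint32_t
   matters on CC65, where int is 16 bits and y * 320 would overflow. */
uint32_t lfb_pixel_address(uint16_t x, uint8_t y)
{
    assert(x < LFB_WIDTH && y < LFB_HEIGHT);
    return LFB_BASE + (uint32_t)y * LFB_WIDTH + x;
}
```

With these values the last pixel, lfb_pixel_address(319, 199), lands at 0x01F9FF, so the whole buffer fits below the 0x020000 boundary.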

Here's a set of simple functions and prototypes that initialise the DTV hardware and set up access to a 320×200, 8bpp linear framebuffer at 0x010000. The location of the framebuffer in extended memory is controlled by the value of FRAMEBUFFER_0:

dtv_Init()

#include <stdint.h>
#include <accelerator.h>
#include "dtv.h"
#include "dtv_reg.h"
 
void dtv_Init(void){
	unsigned char i;
 
	// =================================================
	// Initialise extended functions of DTV
	// =================================================
	POKE(DTV_VIC_REG_EXTENDED, 0x01);	// Enable extended VIC features
 
	// =================================================
	// Enable DTV 2X speed
	// =================================================
	set_c64dtv_speed(SPEED_2X);
 
	// =====================================
	// Set up the linear framebuffer and 8bpp chunky pixel mode
	// =====================================
	POKE(DTV_VIC_REG_CFG, 0x55);		// Enable linear framebuffer mode
	POKE(DTV_VIC_REG_CONTROL_1, 0x5B);	// ECM on, display enabled, 25 rows, YSCROLL = 3
	POKE(DTV_VIC_REG_CONTROL_2, 0x18);	// Multicolour on, 40 columns, XSCROLL = 0
	POKE(DTV_VIC_REG_BORDER, 0);		// Border colour to black
	POKE(DTV_VIC_REG_BG_0, 0);		// BG colour to black
	POKE(DTV_VIC_REG_LFB_STEPSIZE, 0x08);	// Set step size of 8 == 8bpp
 
	// =====================================
	// Set framebuffer address
	// =====================================
	POKE(DTV_VIC_REG_LFB_START_LOW, FRAMEBUFFER_0);
	POKE(DTV_VIC_REG_LFB_START_MED, FRAMEBUFFER_0 >> 8);
	POKE(DTV_VIC_REG_LFB_START_HI, FRAMEBUFFER_0 >> 16);
 
	// =========================================
	// Set the first 16 palette entries to
	// be part of the full 256 colours, and not
	// the standard C64 set.
	// =========================================
	for(i = 0; i < 16; i++){
		POKE(DTV_VIC_PALETTE + i, i);	
	}
}

dtv.h

#ifndef _DTV_H
#define _DTV_H
 
#include <stdint.h>
 
// These are simple macros to emulate BASIC peek/poke or the inb/outb of other platforms
#define POKE(addr,val)     (*(volatile unsigned char*) (addr) = (val))
#define PEEK(addr)         (*(volatile unsigned char*) (addr))
 
#define DMA_COMPLETE		1
#define DMA_IN_PROGRESS		0
#define DMA_RAM_MASK		0x400000	// Mask to apply to addresses when the address is of RAM
#define DMA_ROM_MASK		0x000000	// Mask to apply to addresses when the address is of ROM
 
#define	FRAMEBUFFER_0		0x010000	// Where the 64k linear framebuffer is located in high memory
#define	FRAMEBUFFER_1		0x020000	// Our 64k drawing pad that gets flipped to framebuffer 0
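#define	FRAMEBUFFER_SIZE	(320U * 200U)	// 64000 bytes; unsigned so the product does not overflow CC65's 16-bit int
#define	SCREEN_WIDTH		320		// Pixels (bytes) per line, used by dtv_Blit()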
 
// DMA and blitter functions
void dtv_dma(uint32_t src, uint32_t dst, uint16_t size);
void dtv_Blit(uint32_t src, uint32_t dst, uint16_t w, unsigned char h);
 
// Generic
void dtv_Init(void);
 
#endif
 

dtv_reg.h

// DTV registers which can toggle extended features on/off
#define DTV_VIC_REG_CONTROL_1			0xD011
#define DTV_VIC_REG_CONTROL_2			0xD016
#define DTV_VIC_REG_BORDER			0xD020 // Border pen colour
#define DTV_VIC_REG_BG_0			0xD021 // BG pen colour
#define DTV_VIC_REG_CFG				0xD03C // Controls linear framebuffer mode amongst other settings
#define DTV_VIC_REG_EXTENDED			0xD03F // Enables/disables DTV extended feature set
#define DTV_VIC_REG_LFB_START_LOW		0xD049 // Linear framebuffer address low byte
#define DTV_VIC_REG_LFB_START_MED		0xD04A // Linear framebuffer address middle byte
#define DTV_VIC_REG_LFB_START_HI		0xD04B // Linear framebuffer address high byte
#define DTV_VIC_REG_LFB_STEPSIZE		0xD04C // Linear framebuffer chunk size, 8 == 8bpp
#define DTV_VIC_PALETTE				0xD200 // Start address of palette entries
// DMA registers
#define DMA_PORT_SOURCE_LOW			0xD300 // DMA transfer source address low byte
#define DMA_PORT_SOURCE_MED			0xD301 // DMA transfer source address middle byte
#define DMA_PORT_SOURCE_HI			0xD302 // DMA transfer source address high byte
#define DMA_PORT_DEST_LOW			0xD303 // DMA transfer destination address low byte
#define DMA_PORT_DEST_MED			0xD304 // DMA transfer destination address middle byte
#define DMA_PORT_DEST_HI			0xD305 // DMA transfer destination address high byte
#define DMA_PORT_SOURCE_STEP_LOW		0xD306 // DMA source step size low byte
#define DMA_PORT_SOURCE_STEP_HI			0xD307 // DMA source step size high byte
#define DMA_PORT_DEST_STEP_LOW			0xD308 // DMA destination step size low byte
#define DMA_PORT_DEST_STEP_HI			0xD309 // DMA destination step size high byte
#define DMA_PORT_SIZE_LOW			0xD30A // DMA transfer size (bytes) low byte
#define DMA_PORT_SIZE_HI			0xD30B // DMA transfer size (bytes) high byte
#define DMA_PORT_SOURCE_MODULO_LOW		0xD30C
#define DMA_PORT_SOURCE_MODULO_HI		0xD30D
#define DMA_PORT_DEST_MODULO_LOW		0xD30E
#define DMA_PORT_DEST_MODULO_HI			0xD30F
#define DMA_PORT_SOURCE_LENGTH_LOW		0xD310
#define DMA_PORT_SOURCE_LENGTH_HI		0xD311
#define DMA_PORT_DEST_LENGTH_LOW		0xD312
#define DMA_PORT_DEST_LENGTH_HI			0xD313
#define DMA_PORT_CLEAR_IRQ			0xD31D
#define DMA_PORT_MODULO_ENABLE			0xD31E
#define DMA_PORT_STATUS				0xD31F // DMA status & control register
// Blitter registers
#define BLITTER_SOURCE_A_LOW			0xD320	// Blitter source A address, low byte
#define BLITTER_SOURCE_A_MED			0xD321	// Blitter source A address, middle byte
#define BLITTER_SOURCE_A_HI			0xD322	// Blitter source A address, high byte
#define BLITTER_SOURCE_A_MODULO_LOW		0xD323	// Blitter source A line modulo, low byte
#define BLITTER_SOURCE_A_MODULO_HI		0xD324	// Blitter source A line modulo, high byte
#define BLITTER_SOURCE_A_LENGTH_LOW		0xD325	// Length of one line of pixels of source A, low byte
#define BLITTER_SOURCE_A_LENGTH_HI		0xD326	// Length of one line of pixels of source A, high byte
#define BLITTER_SOURCE_A_STEP			0xD327
#define BLITTER_SOURCE_B_LOW			0xD328	// Blitter source B address, low byte
#define BLITTER_SOURCE_B_MED			0xD329	// Blitter source B address, middle byte
#define BLITTER_SOURCE_B_HI			0xD32A	// Blitter source B address, high byte
#define BLITTER_SOURCE_B_MODULO_LOW		0xD32B	// Blitter source B line modulo, low byte
#define BLITTER_SOURCE_B_MODULO_HI		0xD32C	// Blitter source B line modulo, high byte
#define BLITTER_SOURCE_B_LENGTH_LOW		0xD32D	// Length of one line of pixels of source B, low byte
#define BLITTER_SOURCE_B_LENGTH_HI		0xD32E	// Length of one line of pixels of source B, high byte
#define BLITTER_SOURCE_B_STEP			0xD32F
#define BLITTER_DEST_LOW			0xD330	// Blitter destination address, low byte
#define BLITTER_DEST_MED			0xD331	// Blitter destination address, middle byte
#define BLITTER_DEST_HI				0xD332	// Blitter destination address, high byte
#define BLITTER_DEST_MODULO_LOW			0xD333	// Blitter destination line modulo, low byte
#define BLITTER_DEST_MODULO_HI			0xD334	// Blitter destination line modulo, high byte
#define BLITTER_DEST_LENGTH_LOW			0xD335	// Length of one line of pixels of destination, low byte
#define BLITTER_DEST_LENGTH_HI			0xD336	// Length of one line of pixels of destination, high byte
#define BLITTER_DEST_STEP			0xD337
#define BLITTER_SIZE_LOW			0xD338	// Blitter copy size (bytes), low byte
#define BLITTER_SIZE_HI				0xD339	// Blitter copy size (bytes), high byte
#define BLITTER_START				0xD33A
#define BLITTER_CFG				0xD33B	//
#define BLITTER_MINTERM_CFG			0xD33E	// Sets operation mode of blitter
#define BLITTER_STATUS				0xD33F

Double Buffering & DMA Copies

To access the extended memory area you can use several techniques; the easiest is to use the DMA controller to move blocks of memory between regions. The controller works with a source address, a destination address and the length of the transfer (in bytes).

The transfer can go from the 0-64kb region to the 64-2048kb region and even from ROM to RAM. The function below is hard coded to transfer RAM to RAM: it ORs DMA_RAM_MASK into both the source and destination addresses, which sets the two address-type bits at the top of the high byte to 01.

The address format is as follows:

  • Bits 0-7: Low byte (8 bits)
  • Bits 8-15: Middle byte (8 bits)
  • Bits 16-21: High byte (6 bits)
  • Bits 22-23: Address type (2 bits)

The meaning of those two address-type bits is:

  • 00: Address is ROM
  • 01: Address is RAM
  • 10: Address is RAM + Register (??)
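The address layout above can be sketched as a pair of small helpers: one that adds the address-type bits, and one that splits the result into the three bytes that go into the low/middle/high registers. The names dma_encode_address(), dma_address_bytes(), DMA_TYPE_ROM and DMA_TYPE_RAM are invented for this illustration; only the bit positions come from the list above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for this sketch. */
#define DMA_TYPE_ROM  0x00U   /* bits 22-23 = 00 */
#define DMA_TYPE_RAM  0x01U   /* bits 22-23 = 01 */

/* Combine a 22-bit address with the 2-bit address type. */
uint32_t dma_encode_address(uint32_t addr, uint8_t type)
{
    assert(addr < 0x400000UL && type < 4U);
    return addr | ((uint32_t)type << 22);
}

/* Split an encoded address into the three bytes written to the
   low/middle/high address registers. */
void dma_address_bytes(uint32_t encoded, uint8_t out[3])
{
    out[0] = (uint8_t)(encoded & 0xFFU);          /* bits 0-7   */
    out[1] = (uint8_t)((encoded >> 8) & 0xFFU);   /* bits 8-15  */
    out[2] = (uint8_t)((encoded >> 16) & 0xFFU);  /* bits 16-23 */
}
```

For example, dma_encode_address(0x010000, DMA_TYPE_RAM) gives 0x410000, the same value as 0x010000 OR-ed with 0x400000.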

So, OR your addresses with 0x400000 for RAM source/destinations, or leave them unmasked to indicate ROM sources (you probably know what you're doing if you're doing a DMA copy from ROM…):

dtv_dma

#include <stdint.h>
#include <accelerator.h>
#include "dtv.h"
#include "dtv_reg.h"
 
void dtv_dma(uint32_t src, uint32_t dst, uint16_t size){
	// Copy 'size' bytes of data from 'src' to 'dst' in high memory
	// using the DMA engine.
	// 'src' and 'dst' are the addresses to copy from and to - they are NOT pointers
	uint32_t src_masked;
	uint32_t dst_masked;
 
	// Source and destination addresses need to be OR-ed with a RAM/ROM
	// mask in order for the DMA controller to recognise that these are in
	// RAM, and not ROM.
 
	src_masked = src | DMA_RAM_MASK;
	dst_masked = dst | DMA_RAM_MASK;
 
	// Wait for any previous DMA to complete
	while (PEEK(DMA_PORT_STATUS) & 0x01){};
 
	// Set source low/medium/high bytes
	POKE(DMA_PORT_SOURCE_LOW, src_masked);
	POKE(DMA_PORT_SOURCE_MED, src_masked >> 8);
	POKE(DMA_PORT_SOURCE_HI, src_masked >> 16);
 
	// Set destination low/medium/high bytes
	POKE(DMA_PORT_DEST_LOW, dst_masked);
	POKE(DMA_PORT_DEST_MED, dst_masked >> 8);
	POKE(DMA_PORT_DEST_HI, dst_masked >> 16);
 
	// Set size of transfer in bytes
	POKE(DMA_PORT_SIZE_LOW, size);
	POKE(DMA_PORT_SIZE_HI, size >> 8);
 
	// Start the transfer
	POKE(DMA_PORT_STATUS, 0x0D);
}

Here are a couple of functions (draw_Clear() & draw_Flip()) which fill an off-screen scratchpad with a solid colour and then copy the entire 320×200 scratchpad to the linear framebuffer using the dtv_dma() function above:

#include <stdio.h>
#include <string.h>
#include "dtv.h"
 
void draw_Clear(void){
	// Clear the contents of the screen buffer with a solid colour, line by line
 
	unsigned char i;
	char blit_buffer[320];
 
	// Fill the blit buffer with 320px worth of black pixels
	memset(blit_buffer, 0x00, 320);
 
	// Copy 200 lines' worth of blit_buffer to the screen buffer.
	// The casts pass the buffer's address as a number (dtv_dma() does not
	// take pointers) and stop 'i * 320' overflowing CC65's 16-bit int.
	for (i = 0; i < 200; i++){	
		dtv_dma((uint32_t)(uint16_t)blit_buffer, FRAMEBUFFER_1 + ((uint32_t)i * 320), 320);
	}
}
 
void draw_Flip(void){
	// Flip the screen buffer to the linear framebuffer in one operation
	dtv_dma(FRAMEBUFFER_1, FRAMEBUFFER_0, (uint16_t) FRAMEBUFFER_SIZE);	
}

Bugs

I've seemingly found a bug which manifests when making sequential DMA calls in a tight loop and the transfer length is 182 bytes or greater.

You can successfully DMA copy quite a large region in a single operation with no side-effects. However, in my testing, a tight loop that DMA transfers line by line produces corrupted transfers for any size over 181 bytes. See the example below:

The colour bars in the image should be continuous for the entire screen width, but they glitch at two points. The above example was generated by the following pseudocode:

screen = FRAMEBUFFER_LOCATION;
for (line = 0; line < SCREEN_HEIGHT; line++){
    memset(line_buffer, colour, SCREEN_WIDTH);
    dtv_dma(&line_buffer, screen, SCREEN_WIDTH);
    colour++;
    screen += SCREEN_WIDTH;
}

If you reduce the transfer size to 181 bytes you do not get the glitching, and if you leave a reasonable number of cycles between DMA operations you also don't get the glitching… but I don't know what that reasonable number of cycles is. It's clearly a timing issue between DMA operations, related to the time it takes for transfers over a certain number of bytes, but all I know is that the documentation says that checking bit 0 of 0xD31F should indicate whether the transfer has finished or not. It's possible (though unlikely) that this is an emulation bug; until I have a working, physical DTV, I won't be able to confirm.
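One possible workaround, which I have not verified on real hardware, is to split every transfer into pieces no longer than the 181 bytes that worked in the tests above. The sketch below takes the copy routine as a function pointer so that the splitting logic can be tried on a PC; on the DTV you would pass dtv_dma instead. dma_copy_chunked(), dma_copy_fn, DMA_SAFE_CHUNK and the fake_dma() stand-in are all names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Largest transfer size that did not glitch in the tests described above. */
#define DMA_SAFE_CHUNK 181U

/* Same signature as dtv_dma() on this page. */
typedef void (*dma_copy_fn)(uint32_t src, uint32_t dst, uint16_t size);

/* Copy 'size' bytes as a series of transfers of at most DMA_SAFE_CHUNK
   bytes each. Returns the number of transfers issued. */
uint16_t dma_copy_chunked(dma_copy_fn copy, uint32_t src, uint32_t dst,
                          uint16_t size)
{
    uint16_t count = 0;
    uint16_t chunk;

    assert(copy != NULL);
    while (size > 0) {
        chunk = (size > DMA_SAFE_CHUNK) ? (uint16_t)DMA_SAFE_CHUNK : size;
        copy(src, dst, chunk);
        src += chunk;
        dst += chunk;
        size = (uint16_t)(size - chunk);
        ++count;
    }
    return count;
}

/* Stand-in for dtv_dma() so the sketch runs off-target: it only records
   how much it was asked to copy and the largest single request. */
uint32_t fake_bytes_copied = 0;
uint16_t fake_largest_chunk = 0;

void fake_dma(uint32_t src, uint32_t dst, uint16_t size)
{
    (void)src;
    (void)dst;
    fake_bytes_copied += size;
    if (size > fake_largest_chunk) {
        fake_largest_chunk = size;
    }
}
```

A 320-byte line becomes two transfers (181 bytes, then 139). Whether this actually avoids the glitch is an open question, since the underlying cause is unknown.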


Blitter Operation

One of the big features of the DTV2 & 3 is the addition of blitter hardware. This dramatically speeds up transfer of memory and can be used to achieve very high speed image/screen manipulation.

The Blitter documentation is extensive, with dozens of different registers to set and configure. However, in most circumstances you will want to copy one solid block of pixels to another area (either on-screen or off - the blitter works over the entire 0-2048kb memory region, but not ROM).

Here I show a variation of the blitter command to transfer w x h pixels from src to dst.

  • src: address of the upper-left pixel of a rectangular block of source pixels
  • dst: address of the upper-left pixel of the destination area.
  • w: width of the source area to copy in pixels
  • h: number of lines of the source area to copy

The source area is OR-ed with the destination and transparency in the source area is honoured (source pixels with value 0 will not overwrite pixels in the destination).
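To make the transparency rule concrete, here is a plain software model of it that can be run on a PC to predict what a blit should produce. It is a simplification and an assumption on my part: source pixels with value 0 are skipped and every other source pixel replaces the destination pixel. It does not try to reproduce the blitter's actual minterm logic, and blit_transparent_model() and its 'pitch' parameter are names made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Software model of a transparent rectangle copy. 'src' and 'dst' point
   at the top-left pixel of each rectangle; 'pitch' is the length of a
   full line in bytes (320 for the framebuffer on this page). */
void blit_transparent_model(const uint8_t *src, uint8_t *dst,
                            uint16_t w, uint8_t h, uint16_t pitch)
{
    uint16_t x;
    uint8_t y;

    assert(w <= pitch);
    for (y = 0; y < h; ++y) {
        for (x = 0; x < w; ++x) {
            if (src[x] != 0) {    /* colour 0 is treated as transparent */
                dst[x] = src[x];
            }
        }
        src += pitch;             /* step both pointers to the next line */
        dst += pitch;
    }
}
```

Comparing the output of this model with a screenshot is one way to spot the kind of blitter glitches described further down.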

dtv_Blit()

#include <stdint.h>
#include <accelerator.h>
#include "dtv.h"
#include "dtv_reg.h"
 
void dtv_Blit(uint32_t src, uint32_t dst, uint16_t w, unsigned char h){
	// Blit 'w * h' bytes of data from 'src' to 'dst' in high memory
	// using the blitter engine.
	//
	// 'src' is the address to copy from - it is NOT a pointer
	// 'dst' is the destination address - it is NOT a pointer
	// 'w' is the width of the source rectangle, in pixels/bytes
	// 'h' is the height of the source rectangle, in lines
 
	uint16_t total_bytes = w * h;
	uint16_t modulus = SCREEN_WIDTH - w;
 
	// ==================================
	// Source address
	// ==================================
	POKE(BLITTER_SOURCE_A_LOW, src);
	POKE(BLITTER_SOURCE_A_MED, src >> 8);
	POKE(BLITTER_SOURCE_A_HI, src >> 16);
 
	// ==================================
	// Destination address
	// ==================================
	POKE(BLITTER_DEST_LOW, dst);
	POKE(BLITTER_DEST_MED, dst >> 8);
	POKE(BLITTER_DEST_HI, dst >> 16);
 
	// ==================================
	// Set line length
	// ==================================
	POKE(BLITTER_SOURCE_A_LENGTH_LOW, w);
	POKE(BLITTER_SOURCE_A_LENGTH_HI, w >> 8);
	POKE(BLITTER_DEST_LENGTH_LOW, w);
	POKE(BLITTER_DEST_LENGTH_HI, w >> 8);
 
	// ==================================
	// Set modulus/wraparound
	// ==================================
	POKE(BLITTER_SOURCE_A_MODULO_LOW, modulus); 
	POKE(BLITTER_SOURCE_A_MODULO_HI, modulus >> 8);
	POKE(BLITTER_DEST_MODULO_LOW, modulus); 
	POKE(BLITTER_DEST_MODULO_HI, modulus >> 8);
 
	// ==================================
	// Step sizes
	// ==================================
	POKE(BLITTER_SOURCE_A_STEP, 0x10);
	POKE(BLITTER_DEST_STEP, 0x10);
 
	// ==================================
	// Set total transfer size
	// ==================================
	POKE(BLITTER_SIZE_LOW, total_bytes);
	POKE(BLITTER_SIZE_HI, total_bytes >> 8);
 
	// ==================================
	// Configure blitter mode and start blit
	// ==================================
	POKE(BLITTER_CFG, 0x04);			// Enable transparency, if supported by blitter
	POKE(BLITTER_MINTERM_CFG, 0x18);	// OR the source with destination
	POKE(BLITTER_START, 0x0B);
}

Here's an example of the blitter function in operation, copying a 100×100 rectangle of pixels from the source position of 1,1 to the destination at 219,100:

This is the source image - a basic set of colour bars:

Using dtv_Blit() we copy from 1,1 to 219,100 a 100×100 rectangle in OR mode:

Notice that the transparent (aka black) pixels in the source region are preserved when copying to the destination, and that the original content of the destination is preserved for those areas. Where the source is not transparent, those pixels then overwrite the destination. The mode of operation can be customised - see the DTV Programming Guide for more details; specifically the ALU mode set for register 0xD33E.
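For reference, the numbers behind a call like the one above can be worked out with a small helper. This is only a sketch: blit_rect_params(), BLIT_SCREEN_W and BLIT_SCREEN_H are hypothetical names, and the 320×200 screen size is assumed from the rest of this page. It turns an (x, y) position and a rectangle size into the start address, line modulo and total byte count that dtv_Blit() uses, and rejects rectangles that would run off the screen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLIT_SCREEN_W 320U   /* assumed line width, as used on this page */
#define BLIT_SCREEN_H 200U

/* Returns 1 and fills the outputs if the rectangle fits on screen,
   otherwise returns 0 and leaves the outputs untouched. 'base' is the
   address of the buffer that holds the image. */
uint8_t blit_rect_params(uint16_t x, uint8_t y, uint16_t w, uint8_t h,
                         uint32_t base, uint32_t *addr,
                         uint16_t *modulo, uint16_t *total)
{
    assert(addr != NULL && modulo != NULL && total != NULL);
    if (w == 0 || h == 0 ||
        x >= BLIT_SCREEN_W || w > BLIT_SCREEN_W - x ||
        y >= BLIT_SCREEN_H || h > BLIT_SCREEN_H - y) {
        return 0;
    }
    *addr = base + (uint32_t)y * BLIT_SCREEN_W + x;  /* top-left pixel        */
    *modulo = (uint16_t)(BLIT_SCREEN_W - w);         /* bytes skipped per line */
    *total = (uint16_t)((uint32_t)w * h);            /* at most 64000 bytes    */
    return 1;
}
```

For the 100×100 rectangle at 219,100 this gives an offset of 32219 bytes from the start of the buffer, a modulo of 220 and a total of 10000 bytes.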

Bugs

However, there is a problem. On the DTV2 systems (the initial PAL models), the blitter is partially bugged and transparent copies result in a vertical banding effect:

Therefore, without mitigation, only DTV3/Hummer systems fully support transparent blits. If you are not bothered about transparency, then the blitter functions fine on DTV2, but to support transparency you need to do some type of workaround.

  • Last modified: 2023/01/25 17:21
  • by john