====== DTV (and C64) Programming - Using the CC65 toolchain ======

~~TOC_HERE 2-5~~

From [[https://en.wikipedia.org/wiki/C64_Direct-to-TV|Wikipedia]]:

> The C64 Direct-to-TV, called C64DTV for short, is a single-chip implementation of the Commodore 64 computer, contained in a joystick (modeled after the mid-1980s Competition Pro joystick), with 30 built-in games. The design is similar to the Atari Classics 10-in-1 TV Game. The circuitry of the C64DTV was designed by Jeri Ellsworth, a computer chip designer who had previously designed the C-One.

These are my notes on programming for the C64 DTV and making use of some of the extended functionality of the hardware from within C, rather than 6502 assembly. I hope you find these useful - whilst I came to the DTV more than a decade after its popularity peaked, I've found it a very capable little system!

{{:blog:c64:commodore64_dtv_mugshot-x600.jpg?200|}}

===== Basic use of CC65 =====

==== CC65 Options for the C64 / DTV ====

  * You can compile, assemble and link in one pass using **cl65**, but with multiple source files you will typically run **cc65**, //then// **ca65**, //then// **ld65** as separate steps (as the Makefile below does).
  * There is no distinct target for the DTV; use the CC65 target flag '**-t c64**'.
  * Compiler optimisation is enabled by '**-O**'; maximum optimisation is achieved by '**-Oirs**'.
  * The C64 runtime needs to be linked against '**c64.lib**'; it should be the last command line argument to **ld65**.

==== Example Makefile ====

This example makefile uses CC65 to compile, assemble and then link our source code. It assumes the following about our project layout:

  * Source code is stored in **./src/**
  * Intermediary assembler and object files are created in **./bin/**
  * Fully linked programme output is copied into **./dist/**
  * The [[https://vice-emu.sourceforge.io/vice_2.html|Vice x64dtv emulator]] is installed to run the resulting file(s).
Extend the **OBJFILES** value with any additional code that needs compiling, and add a stanza to match the **bin/main.o** entry for each. A macro to //glob// **src/*.c** would be an __improvement__, especially if compiling many different source files.

  #################################
  # Path to tools
  #################################
  CC = cc65
  AS = ca65
  LD = ld65
  VICE = x64dtv
  RM = rm
  RMFLAGS = -f -v
  #################################
  # Compiler flags
  #################################
  LIB =
  SYSTEM = dtv
  EXTRA_INCLUDES = -I./src -I./common
  CFLAGS = -Oirs -t c64
  CFLAGS_DEBUG = -Oirs -t c64 -D DEBUG=1
  ASFLAGS = -t c64
  LDFLAGS = -t c64
  EXTRA_LIBS = c64.lib
  LIBRARY_PATH =
  #################################
  # What our application is named
  #################################
  TARGET_NAME = scrolls.prg
  TARGET = bin/$(TARGET_NAME)
  #################################
  # Targets to build/run
  #################################
  all: $(TARGET)
  full: all
  run: all copy doit
  #################################
  # List of all needed game object files
  #################################
  OBJFILES = bin/main.o
  ###########################################################
  # This builds our game binary from the defined OBJFILES
  ###########################################################
  $(TARGET): $(OBJFILES)
      @echo ""
      @echo "========================================"
      @echo " -= Linking $(TARGET) =-"
      @echo ""
      @echo Linking main binary....
      $(LD) $(LDFLAGS) $(LIBRARY_PATH) -o $(TARGET) $(OBJFILES) $(EXTRA_LIBS)
      @echo "======= END OF $(TARGET) ==============="
      @echo ""
      @echo ""
  #######################################
  # C sections
  #######################################
  #### Main ####
  bin/main.o: src/main_dtv.c
      $(CC) $(CFLAGS) $(EXTRA_INCLUDES) src/main_dtv.c -o bin/main.s
      $(AS) $(ASFLAGS) bin/main.s -o bin/main.o
  ###############################
  # Copy in game assets
  ###############################
  copy:
      @echo ""
      @echo "=========================="
      @echo " Copying game assets"
      @echo ""
      @echo "- Copying files..."
      rm -rf dist/*
      mkdir -p dist/assets
      cp -v $(TARGET) dist/
      @echo ""
      @echo "=========================="
  ###############################
  # Run binary
  ###############################
  doit:
      @echo ""
      @echo "=========================="
      @echo " Running game"
      @echo ""
      cd dist && $(VICE) -autostart $(TARGET_NAME) ; cd -
  ###############################
  # Clean up
  ###############################
  clean:
      @echo ""
      @echo "=========================="
      @echo " Cleaning up"
      @echo ""
      @echo "- Old object files..."
      rm -rf bin/*
      rm -rf dist/*
      @echo ""
      @echo "=========================="

===== Accessing DTV Features =====

The C64 DTV is a close, though not 100% accurate, re-implementation of the C64. It also has a substantial number of improvements over the original system:

  * 2MB RAM
  * Blitter
  * Faster memory/processor
  * New video modes
  * Digital 8bit sound

Some of these are exposed directly in CC65, whereas some need to be enabled/toggled by writing to various addresses in memory.

==== Basic DTV Functions ====

=== Detecting DTV ===

  #include <accelerator.h>
  unsigned char detect_c64dtv(void);

Returns 1 if the code is running on a C64 DTV2/3. Otherwise returns 0.
  #include <stdio.h>
  #include <accelerator.h>
  int main(void){
    if (detect_c64dtv()){
      printf("Hello, DTV world\n");
    } else {
      printf("Hello, C64 world\n");
    }
    return 0;
  }

Output:

{{:blog:c64:c64_cc65_dtv_detect.png?500|}}

----

=== Get/Set DTV CPU Speed ===

  #include <accelerator.h>
  unsigned char get_c64dtv_speed(void);
  unsigned char __fastcall__ set_c64dtv_speed(unsigned char speed);

The DTV can (optionally) run faster than the normal C64. It has two speeds: 0 (SPEED_SLOW / SPEED_1X) and 1 (SPEED_2X).

  #include <stdio.h>
  #include <accelerator.h>
  int main(void){
    unsigned char s;
    if (detect_c64dtv()){
      s = get_c64dtv_speed();
      printf("Hello, DTV world, speed is %d\n", s);
      s = set_c64dtv_speed(SPEED_2X);
      printf("Speed is now %d\n", s);
    } else {
      printf("Hello, C64 world\n");
    }
    return 0;
  }

Output:

{{:blog:c64:c64_cc65_dtv_getset.png?500|}}

----

==== Extended DTV Functions ====

These functions raise the DTV above a simple C64 clone and add substantially improved functionality to the system: increased colours, digital sound and a dedicated image blitter.

=== Enabling 320x200 Linear Framebuffer ===

One of the main party tricks of the DTV is its support for a 320x200, 8bpp (256 colour) screen mode. This needs 64000 bytes, which is more than is available in the remainder of the CPU-accessible working RAM of the computer. Hence the framebuffer needs to be created in the memory region of 64kb-2048kb, known as extended/high memory.

Here's a set of simple functions and prototypes that initialise the DTV hardware and set up access to a //320x200, 8bpp linear framebuffer// at 0x010000.
The location of the framebuffer in extended memory is controlled by the value of //FRAMEBUFFER_0//:

**dtv_Init()**

  #include <stdint.h>
  #include <accelerator.h>
  #include "dtv.h"
  #include "dtv_reg.h"
  void dtv_Init(void){
    unsigned char i;
    // =================================================
    // Initialise extended functions of DTV
    // =================================================
    POKE(DTV_VIC_REG_EXTENDED, 0x01); // Enable extended VIC features
    // =================================================
    // Enable DTV 2X speed
    // =================================================
    set_c64dtv_speed(SPEED_2X);
    // =====================================
    // Set up the linear framebuffer and 8bpp chunky pixel mode
    // =====================================
    POKE(DTV_VIC_REG_CFG, 0x55); // Enable linear framebuffer mode
    POKE(DTV_VIC_REG_CONTROL_1, 0x5B);
    POKE(DTV_VIC_REG_CONTROL_2, 0x18);
    POKE(DTV_VIC_REG_BORDER, 0); // Border colour to black
    POKE(DTV_VIC_REG_BG_0, 0); // BG colour to black
    POKE(DTV_VIC_REG_LFB_STEPSIZE, 0x08); // Set step size of 8 == 8bpp
    // =====================================
    // Set framebuffer address (each POKE stores only the low 8 bits of its value)
    // =====================================
    POKE(DTV_VIC_REG_LFB_START_LOW, FRAMEBUFFER_0);
    POKE(DTV_VIC_REG_LFB_START_MED, FRAMEBUFFER_0 >> 8);
    POKE(DTV_VIC_REG_LFB_START_HI, FRAMEBUFFER_0 >> 16);
    // =========================================
    // Set the first 16 palette entries to
    // be part of the full 256 colours, and not
    // the standard C64 set.
    // =========================================
    for(i = 0; i < 16; i++){
      POKE(DTV_VIC_PALETTE + i, i);
    }
  }

**dtv.h**

  #ifndef _DTV_H
  #define _DTV_H
  #include <stdint.h>
  // These are simple macros to emulate BASIC peek/poke or the inb/outb of other platforms.
  // 'volatile' stops the compiler from caching or dropping hardware register accesses.
  #define POKE(addr,val) (*(volatile unsigned char*) (addr) = (val))
  #define PEEK(addr) (*(volatile unsigned char*) (addr))
  #define DMA_COMPLETE 1
  #define DMA_IN_PROGRESS 0
  #define DMA_RAM_MASK 0x400000UL // Mask to apply to addresses when the address is of RAM
  #define DMA_ROM_MASK 0x000000UL // Mask to apply to addresses when the address is of ROM
  #define SCREEN_WIDTH 320U // Framebuffer width in pixels (used by dtv_Blit())
  #define SCREEN_HEIGHT 200U // Framebuffer height in lines
  #define FRAMEBUFFER_0 0x010000UL // Where the 64k linear framebuffer is located in high memory
  #define FRAMEBUFFER_1 0x020000UL // Our 64k drawing pad that gets flipped to framebuffer 0
  // Unsigned and parenthesised: 320 * 200 = 64000 overflows a signed 16-bit int
  #define FRAMEBUFFER_SIZE (SCREEN_WIDTH * SCREEN_HEIGHT)
  // DMA functions
  void dtv_dma(uint32_t src, uint32_t dst, uint16_t size);
  // Generic
  void dtv_Init(void);
  #endif

**dtv_reg.h**

  // DTV registers which can toggle extended features on/off
  #define DTV_VIC_REG_CONTROL_1 0xD011
  #define DTV_VIC_REG_CONTROL_2 0xD016
  #define DTV_VIC_REG_BORDER 0xD020 // Border pen colour
  #define DTV_VIC_REG_BG_0 0xD021 // BG pen colour
  #define DTV_VIC_REG_CFG 0xD03C // Controls linear framebuffer mode amongst other settings
  #define DTV_VIC_REG_EXTENDED 0xD03F // Enables/disables DTV extended feature set
  #define DTV_VIC_REG_LFB_START_LOW 0xD049 // Linear framebuffer address low byte
  #define DTV_VIC_REG_LFB_START_MED 0xD04A // Linear framebuffer address middle byte
  #define DTV_VIC_REG_LFB_START_HI 0xD04B // Linear framebuffer address high byte
  #define DTV_VIC_REG_LFB_STEPSIZE 0xD04C // Linear framebuffer chunk size, 8 == 8bpp
  #define DTV_VIC_PALETTE 0xD200 // Start address of palette entries
  // DMA registers
  #define DMA_PORT_SOURCE_LOW 0xD300 // DMA transfer source address low byte
  #define DMA_PORT_SOURCE_MED 0xD301 // DMA transfer source address middle byte
  #define DMA_PORT_SOURCE_HI 0xD302 // DMA transfer source address high byte
  #define DMA_PORT_DEST_LOW 0xD303 // DMA transfer destination address low byte
  #define DMA_PORT_DEST_MED 0xD304 // DMA transfer destination address middle byte
  #define DMA_PORT_DEST_HI 0xD305 // DMA transfer destination address high byte
  #define DMA_PORT_SOURCE_STEP_LOW 0xD306 // DMA source step size low byte
  #define DMA_PORT_SOURCE_STEP_HI 0xD307 // DMA source step size high byte
  #define DMA_PORT_DEST_STEP_LOW 0xD308 // DMA destination step size low byte
  #define DMA_PORT_DEST_STEP_HI 0xD309 // DMA destination step size high byte
  #define DMA_PORT_SIZE_LOW 0xD30A // DMA transfer size (bytes) low byte
  #define DMA_PORT_SIZE_HI 0xD30B // DMA transfer size (bytes) high byte
  #define DMA_PORT_SOURCE_MODULO_LOW 0xD30C
  #define DMA_PORT_SOURCE_MODULO_HI 0xD30D
  #define DMA_PORT_DEST_MODULO_LOW 0xD30E
  #define DMA_PORT_DEST_MODULO_HI 0xD30F
  #define DMA_PORT_SOURCE_LENGTH_LOW 0xD310
  #define DMA_PORT_SOURCE_LENGTH_HI 0xD311
  #define DMA_PORT_DEST_LENGTH_LOW 0xD312
  #define DMA_PORT_DEST_LENGTH_HI 0xD313
  #define DMA_PORT_CLEAR_IRQ 0xD31D
  #define DMA_PORT_MODULO_ENABLE 0xD31E
  #define DMA_PORT_STATUS 0xD31F // DMA status & control register
  // Blitter registers
  #define BLITTER_SOURCE_A_LOW 0xD320 // Blitter source A address, low byte
  #define BLITTER_SOURCE_A_MED 0xD321 // Blitter source A address, middle byte
  #define BLITTER_SOURCE_A_HI 0xD322 // Blitter source A address, high byte
  #define BLITTER_SOURCE_A_MODULO_LOW 0xD323 // Blitter source A line modulo, low byte
  #define BLITTER_SOURCE_A_MODULO_HI 0xD324 // Blitter source A line modulo, high byte
  #define BLITTER_SOURCE_A_LENGTH_LOW 0xD325 // Length of one line of pixels of source A, low byte
  #define BLITTER_SOURCE_A_LENGTH_HI 0xD326 // Length of one line of pixels of source A, high byte
  #define BLITTER_SOURCE_A_STEP 0xD327
  #define BLITTER_SOURCE_B_LOW 0xD328 // Blitter source B address, low byte
  #define BLITTER_SOURCE_B_MED 0xD329 // Blitter source B address, middle byte
  #define BLITTER_SOURCE_B_HI 0xD32A // Blitter source B address, high byte
  #define BLITTER_SOURCE_B_MODULO_LOW 0xD32B // Blitter source B line modulo, low byte
  #define BLITTER_SOURCE_B_MODULO_HI 0xD32C // Blitter source B line modulo, high byte
  #define BLITTER_SOURCE_B_LENGTH_LOW 0xD32D // Length of one line of pixels of source B, low byte
  #define BLITTER_SOURCE_B_LENGTH_HI 0xD32E // Length of one line of pixels of source B, high byte
  #define BLITTER_SOURCE_B_STEP 0xD32F
  #define BLITTER_DEST_LOW 0xD330 // Blitter destination address, low byte
  #define BLITTER_DEST_MED 0xD331 // Blitter destination address, middle byte
  #define BLITTER_DEST_HI 0xD332 // Blitter destination address, high byte
  #define BLITTER_DEST_MODULO_LOW 0xD333 // Blitter destination line modulo, low byte
  #define BLITTER_DEST_MODULO_HI 0xD334 // Blitter destination line modulo, high byte
  #define BLITTER_DEST_LENGTH_LOW 0xD335 // Length of one line of pixels of destination, low byte
  #define BLITTER_DEST_LENGTH_HI 0xD336 // Length of one line of pixels of destination, high byte
  #define BLITTER_DEST_STEP 0xD337
  #define BLITTER_SIZE_LOW 0xD338 // Blitter copy size (bytes), low byte
  #define BLITTER_SIZE_HI 0xD339 // Blitter copy size (bytes), high byte
  #define BLITTER_START 0xD33A
  #define BLITTER_CFG 0xD33B
  #define BLITTER_MINTERM_CFG 0xD33E // Sets operation mode of blitter
  #define BLITTER_STATUS 0xD33F

----

=== Double Buffering & DMA Copies ===

There are several ways to access the extended memory area; the easiest is to use the DMA controller to shift blocks of memory between regions.

The controller works with a source address, a destination address and the length of the transfer (in bytes). The transfer can go from the 0-64kb region to the 64-2048kb region and even from ROM to RAM.

The function below is hard coded to transfer RAM to RAM: note the bitmask applied to the source and destination addresses, which sets the upper 2 bits of the high byte of both to //01//.
The address format is as follows:

  * Bits 0-7: **Low** byte (8 bits)
  * Bits 8-15: **Middle** byte (8 bits)
  * Bits 16-21: **High** byte (6 bits)
  * Bits 22-23: **Address type** (2 bits)

The two address type bits have the following meanings:

  * 00: Address is ROM
  * 01: Address is RAM
  * 10: Address is RAM + Register (??)

So, OR your addresses with 0x400000 for RAM source/destinations, or leave them unmasked to indicate ROM sources (you probably know what you're doing if you're doing a DMA copy from ROM...):

**dtv_dma**

  #include <stdint.h>
  #include "dtv.h"
  #include "dtv_reg.h"
  void dtv_dma(uint32_t src, uint32_t dst, uint16_t size){
    // Copy 'size' bytes of data from 'src' to 'dst' in high memory
    // using the DMA engine.
    // 'src' and 'dst' are the addresses to copy from/to - they are NOT pointers
    uint32_t src_masked;
    uint32_t dst_masked;
    // Source and destination addresses need to be OR-ed with a RAM/ROM
    // mask in order for the DMA controller to recognise that these are in
    // RAM, and not ROM.
    src_masked = src | DMA_RAM_MASK;
    dst_masked = dst | DMA_RAM_MASK;
    // Wait for any previous DMA to complete (bit 0 set == busy)
    while (PEEK(DMA_PORT_STATUS) & 0x01){};
    // Set source low/medium/high bytes
    POKE(DMA_PORT_SOURCE_LOW, src_masked);
    POKE(DMA_PORT_SOURCE_MED, src_masked >> 8);
    POKE(DMA_PORT_SOURCE_HI, src_masked >> 16);
    // Set destination low/medium/high bytes
    POKE(DMA_PORT_DEST_LOW, dst_masked);
    POKE(DMA_PORT_DEST_MED, dst_masked >> 8);
    POKE(DMA_PORT_DEST_HI, dst_masked >> 16);
    // Set size of transfer in bytes
    POKE(DMA_PORT_SIZE_LOW, size);
    POKE(DMA_PORT_SIZE_HI, size >> 8);
    // Start the transfer
    POKE(DMA_PORT_STATUS, 0x0D);
  }

Here are a couple of functions (**draw_Clear()** & **draw_Flip()**) which fill an off-screen scratchpad with a solid colour, and then copy the entire 320x200 scratchpad to the linear framebuffer using the **dtv_dma()** function above:

  #include <stdint.h>
  #include <string.h>
  #include "dtv.h"
  void draw_Clear(void){
    // Clear the contents of the screen buffer with a solid colour, line by line
    unsigned char i;
    char blit_buffer[320];
    // Fill the blit buffer with 320px worth of black pixels
    memset(blit_buffer, 0x00, 320);
    // Copy SCREEN_HEIGHT worth of blit_buffers to the screen buffer.
    // The buffer's 16-bit address is passed as a number, not as a pointer, and the
    // line offset is computed in 32 bits because 199 * 320 overflows a 16-bit int.
    for (i = 0; i < 200; i++){
      dtv_dma((uint32_t)(uint16_t)blit_buffer, FRAMEBUFFER_1 + ((uint32_t)i * 320), 320);
    }
  }
  void draw_Flip(void){
    // Flip the screen buffer to the linear framebuffer in one operation
    dtv_dma(FRAMEBUFFER_1, FRAMEBUFFER_0, (uint16_t) FRAMEBUFFER_SIZE);
  }

== Bugs ==

I've seemingly found a bug which manifests when making sequential DMA calls in a tight loop and the transfer length is 182 bytes or greater.

You can successfully DMA copy quite a large region in a single operation with no side-effects. However, in my testing, a tight loop which DMA transfers line by line produces corrupted transfers with any size over 181 bytes. See the example below:

{{:blog:c64:vice_dtv_256_speed2x.png?400|}}

The colour bars in the image should be continuous for the entire screen width, but they glitch at two points. The above example was generated by the following pseudocode:

  screen = FRAMEBUFFER_LOCATION;
  for (line = 0; line < SCREEN_HEIGHT; line++){
    memset(line_buffer, colour, SCREEN_WIDTH);
    dtv_dma(&line_buffer, screen, SCREEN_WIDTH);
    colour++;
    screen += SCREEN_WIDTH;
  }

If you reduce the transfer size to 181 bytes you do not get the glitching, and if you have a //reasonable-number-of-cycles// between DMA operations you also don't get the glitching... but I don't know what that //reasonable-number-of-cycles// value is.

It's clearly a timing issue between DMA operations, related to the time it takes for transfers over a certain number of bytes, but all I know is that the documentation says that checking bit 0 of 0xD31F should indicate whether the transfer has finished or not. It's possible (though unlikely) that this is an emulation bug - until I have a working, physical DTV, I won't be able to confirm.
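
Based purely on the observation above that transfers of 181 bytes or fewer did not glitch, one possible mitigation is to split each copy into chunks of at most 181 bytes. The following is a minimal sketch and has not been tested on real hardware or in the emulator: **dma_copy_chunked()**, **DMA_SAFE_CHUNK**, the **dma_fn** callback type and the **stub_dma()** recorder are all hypothetical names introduced for illustration, and whether chunking actually avoids the timing problem is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Largest transfer size that did not glitch in the tests described above. */
#define DMA_SAFE_CHUNK 181u

/* Hypothetical callback type with the same shape as dtv_dma(). */
typedef void (*dma_fn)(uint32_t src, uint32_t dst, uint16_t size);

/* Copy 'size' bytes as a series of transfers of at most DMA_SAFE_CHUNK
 * bytes each. Returns the number of transfers that were issued. */
uint16_t dma_copy_chunked(dma_fn dma, uint32_t src, uint32_t dst, uint16_t size)
{
    uint16_t calls = 0;

    assert(dma != 0);
    while (size > 0) {
        /* Take a full chunk, or whatever is left for the final transfer */
        uint16_t n = (size > DMA_SAFE_CHUNK) ? (uint16_t)DMA_SAFE_CHUNK : size;
        dma(src, dst, n);
        src += n;
        dst += n;
        size -= n;
        ++calls;
    }
    return calls;
}

/* Host-side stand-in for dtv_dma(): records what would have been
 * transferred, so the chunking logic can be checked off-target. */
static uint32_t stub_total; /* total bytes requested so far */
static uint16_t stub_max;   /* largest single transfer seen */

static void stub_dma(uint32_t src, uint32_t dst, uint16_t size)
{
    (void)src;
    (void)dst;
    stub_total += size;
    if (size > stub_max) {
        stub_max = size;
    }
}
```

On the DTV you would pass **dtv_dma** as the callback, e.g. ''dma_copy_chunked(dtv_dma, src, dst, 320)'', which issues one 181-byte and one 139-byte transfer per line instead of a single 320-byte one.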

----

=== Blitter Operation ===

One of the big features of the DTV2 & 3 is the addition of [[https://en.wikipedia.org/wiki/Blitter|blitter hardware]]. This dramatically speeds up transfer of memory and can be used to achieve very high speed image/screen manipulation.

The Blitter documentation is extensive, with dozens of different registers to set and configure. However, in most circumstances you will want to copy one solid block of pixels to another area (either on-screen or off - the blitter works over the entire 0-2048kb memory region, but //not// ROM).

Here I show a variation of the blitter command to transfer w x h pixels from src to dst.

  * **src**: address of the upper-left pixel of a rectangular block of source pixels
  * **dst**: address of the upper-left pixel of the destination area
  * **w**: width of the source area to copy in pixels
  * **h**: number of lines of the source area to copy

The source area is OR-ed with the destination, and transparency in the source area is honoured (source pixels with value //0// will //not// overwrite pixels in the destination).

**dtv_Blit()**

  #include <stdint.h>
  #include "dtv.h"
  #include "dtv_reg.h"
  void dtv_Blit(uint32_t src, uint32_t dst, uint16_t w, unsigned char h){
    // Blit 'w * h' bytes of data from 'src' to 'dst' in high memory
    // using the blitter engine.
    //
    // 'src' is the address to copy from - it is NOT a pointer
    // 'dst' is the destination address - it is NOT a pointer
    // 'w' is the width of the source rectangle, in pixels/bytes
    // 'h' is the height of the source rectangle, in lines
    //
    // 'w * h' must fit in 16 bits. SCREEN_WIDTH is the framebuffer width in pixels (320).
    uint16_t total_bytes = w * h;
    uint16_t modulus = SCREEN_WIDTH - w;
    // ==================================
    // Source address
    // ==================================
    POKE(BLITTER_SOURCE_A_LOW, src);
    POKE(BLITTER_SOURCE_A_MED, src >> 8);
    POKE(BLITTER_SOURCE_A_HI, src >> 16);
    // ==================================
    // Destination address
    // ==================================
    POKE(BLITTER_DEST_LOW, dst);
    POKE(BLITTER_DEST_MED, dst >> 8);
    POKE(BLITTER_DEST_HI, dst >> 16);
    // ==================================
    // Set line length
    // ==================================
    POKE(BLITTER_SOURCE_A_LENGTH_LOW, w);
    POKE(BLITTER_SOURCE_A_LENGTH_HI, w >> 8);
    POKE(BLITTER_DEST_LENGTH_LOW, w);
    POKE(BLITTER_DEST_LENGTH_HI, w >> 8);
    // ==================================
    // Set modulus/wraparound
    // ==================================
    POKE(BLITTER_SOURCE_A_MODULO_LOW, modulus);
    POKE(BLITTER_SOURCE_A_MODULO_HI, modulus >> 8);
    POKE(BLITTER_DEST_MODULO_LOW, modulus);
    POKE(BLITTER_DEST_MODULO_HI, modulus >> 8);
    // ==================================
    // Step sizes
    // ==================================
    POKE(BLITTER_SOURCE_A_STEP, 0x10);
    POKE(BLITTER_DEST_STEP, 0x10);
    // ==================================
    // Set total transfer size
    // ==================================
    POKE(BLITTER_SIZE_LOW, total_bytes);
    POKE(BLITTER_SIZE_HI, total_bytes >> 8);
    // ==================================
    // Configure blitter mode and start blit
    // ==================================
    POKE(BLITTER_CFG, 0x04); // Enable transparency, if supported by blitter
    POKE(BLITTER_MINTERM_CFG, 0x18); // OR the source with destination
    POKE(BLITTER_START, 0x0B);
  }

Here's an example of the blitter function in operation, copying a **100x100** rectangle of pixels from the source position of **1,1** to the
destination at **219,100**:

This is the source image - a basic set of colour bars:

{{:blog:c64:c64_dtv_blitter_example_source.png?400|}}

Using dtv_Blit() we copy a 100x100 rectangle from 1,1 to 219,100 in OR mode:

{{:blog:c64:c64_dtv_blitter_example_dtv3.png?400|}}

Notice that the transparent (aka //black//) pixels in the source region do not overwrite the destination, so the original content of the destination is preserved in those areas. Where the source is not transparent, those pixels overwrite the destination.

The mode of operation can be customised - see the [[https://www.c64-wiki.com/wiki/C64DTV_Programming_Guide#BLITTER_DATAPATH|DTV Programming Guide]] for more details; specifically the ALU mode set for register **0xD33E**.

== Bugs ==

However, there is a problem. On the **DTV2** systems (the initial PAL models), the blitter is partially bugged and transparent copies result in a vertical banding effect:

{{:blog:c64:c64_dtv_blitter_example_dtv2.png?400|}}

Therefore, //without mitigation//, only **DTV3/Hummer** systems fully support transparent blits. If you are //not// bothered about transparency, then the blitter functions fine on DTV2, but to support transparency you need some type of workaround.
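
The blit example earlier on this page describes its rectangles by pixel coordinates, whereas **dtv_Blit()** takes linear addresses. A minimal sketch of that conversion follows, assuming the 320-pixel-wide, one-byte-per-pixel framebuffer layout described in the framebuffer section. **fb_pixel_addr()** and **FB_WIDTH** are hypothetical names, not part of the original code.

```c
#include <assert.h>
#include <stdint.h>

#define FB_WIDTH 320u /* assumed framebuffer width in pixels (8bpp, 1 byte per pixel) */

/* Linear address of pixel (x, y) in a framebuffer that starts at 'base'.
 * The row offset is computed in 32 bits so that it cannot overflow a
 * 16-bit int on cc65 (199 * 320 = 63680). */
uint32_t fb_pixel_addr(uint32_t base, uint16_t x, uint16_t y)
{
    assert(x < FB_WIDTH);
    return base + (uint32_t)y * FB_WIDTH + x;
}
```

With the framebuffer at 0x010000, the source corner 1,1 maps to 0x010141 and the destination corner 219,100 maps to 0x017DDB, so the blit from the example could be issued as ''dtv_Blit(fb_pixel_addr(FRAMEBUFFER_0, 1, 1), fb_pixel_addr(FRAMEBUFFER_0, 219, 100), 100, 100)''.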