Recollection 02 ch05 Examination Of An Early Tape Loader
An examination of an early tape loader
by Fungus/Nostalgia/Onslaught
Loader: Pavloada V2 (I believe, correct me if I'm wrong please ;)
Used on Hektic, Orbitron, Munch Mania, Cosmic Kanga and other games from Mastertronic and several other software houses in 1985 and 1986.
Foreword:
Cracking tapes is an interesting and challenging process. I have decided to dissect an old Pavloader as lesson 1 in a series of tape cracking articles I hope to continue with. All work was done only with the Action Replay Monitor (and NOT the freezer!). Please enjoy ;)
Kudos to Qed for doing the first article long ago. (In C= Hacking)
The Commodore C64 Programmers Reference Guide, and Mapping The Commodore 64 were the reference books used in the research of registers, rom routines and system memory locations.
64doc.txt and NMOS_6502_extra_opcodes.txt documents were used in the reverse engineering of the unimplemented opcode auto-boot routine. These texts (and MUCH more!) are available at ftp://ftp.funet.fi/pub/cbm/documents/chipdata
And now the fun begins.
Insert your prospective tape into the tape drive. Rewind, clear tape counter.
Enter the AR or RR monitor and type the following.
L"",01
The tape header will now load into the tape buffer, which is located at $033c. The boot file will also load to $02a7.
If you examine the loaded parts, they appear as gibberish. The loader itself occupies the space in the tape buffer $0351-$03fc. It is EOR encoded at this point. Let's have a look at the boot file, which was loaded to $02a7 and ended at $0304.
$0302-$0303 is the BASIC idle loop vector. The boot file loads right over it and points it at $02a7, so BASIC jumps into the boot code as soon as the load finishes. This is how MOST tapes autostart. There are several different ways of achieving an autostart; this happens to be one of the most common. I'll discuss other methods in future installments.
NOTE: Disk games can also autostart this way, and you can use it in your own programs as well.
Now, look at 02a7 with
M 02a7
you should see,
.:02a7 64 ae 4e bf 02 14 cc a2
If not, it's not the same loader. Note: the 4th byte, $bf, may be something else.
If you disassemble it with a normal monitor, you get seeming gibberish.
.> 02a7 64 ???
.> 02a8 ae 4e bf ldx $bf4e
.> 02ab 02 ???
.> 02ac 14 ???
.> 02ad cc a2 ff cpy $ffa2
etc...
Well, it is really a sneaky trick. The code is partially comprised of unimplemented opcodes.
Let's have a look at how the code really looks to the processor.
.> 02a7 64 ae skb $ae ;skip byte ($ae)
.> 02a9 4e bf 02 lsr $02bf ;decode byte at $02bf ($06 becomes $03)
.> 02ac 14 cc skb $cc ;skip byte ($cc)
.> 02ae a2 ff ldx #$ff ;load x index with #$ff
.> 02b0 8b 51 xaa #$51 ;and x index with #$51 and transfer to accumulator (lda #$51)
.> 02b2 87 fb sax $fb ;and x index with accumulator and store in $fb
.> 02b4 04 4c skb $4c ;skip byte $4c
.> 02b6 8b e1 xaa #$e1 ;and x index with #$e1 and transfer to accumulator (lda #$e1)
.> 02b8 54 cc skb $cc ;skip byte $cc
.> 02ba 8f 28 03 sax $0328 ;and x index with accumulator and store in $0328 (disable run/stop)
.> 02bd af 3c 03 lax $033c ;load x index and accumulator with memory address $033c ($03, filetype)
.> 02c0 87 fc sax $fc ;and x index with accumulator and store in $fc
.> 02c2 a0 ff ldy #$ff ;load y index with #$ff
.> 02c4 b3 fb lax ($fb),y ;load x index and accumulator with indirect address at $fb/$fc ($0351)
.> 02c6 54 20 skb $20 ;skip byte $20
.> 02c8 4d 02 03 eor $0302 ;eor accumulator with memory address $0302 ($a7, loaded)
.> 02cb 80 ee skb $ee ;skip byte $ee
.> 02cd 4d 17 03 eor $0317 ;eor accumulator with memory address $0317 ($fe, normally)
.> 02d0 89 20 skb $20 ;skip byte $20
.> 02d2 91 fb sta ($fb),y ;store accumulator in indirect address at $fb/$fc ($0351)
.> 02d4 14 cc skb $cc ;skip byte $cc
.> 02d6 88 dey ;decrement y index.
.> 02d7 c0 ff cpy #$ff ;compare y index with #$ff
.> 02d9 80 ee skb $ee ;skip byte $ee
.> 02db d0 e7 bne $02c4 ;branch if y index <> #$ff
.> 02dd 14 4c skb $4c ;skip byte $4c
.> 02df f0 70 beq $0351 ;branch if y index = $ff , Start real loader
.> 02e1 a0 c0 ldy #$c0 ;load y index with #$c0
.> 02e3 1b 3c 03 aso $033c,y ;arithmetic shift left memory, or with accumulator
.> 02e6 88 dey ;decrement y index.
.> 02e7 d0 fa bne $02e3 ;branch if y index <> 0
.> 02e9 14 2e skb $2e ;skip byte $2e
.> 02eb 20 93 fc jsr $fc93 ;jump to subroutine $fc93
.> 02ee 6c 4e 00 jmp ($004e) ;jump to indirect address in $4e/$4f
.> 02f1 20 33 a5 jsr $a533 ;jump to subroutine $a533
.> 02f4 89 ee skb $ee ;skip byte $ee
.> 02f6 20 59 a6 jsr $a659 ;jump to subroutine $a659
.> 02f9 4c ae a7 jmp $a7ae ;jump to basic start $a7ae
Now, that makes a lot more sense doesn't it?
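If you want to see why the two listings disagree, it comes down to instruction lengths. Here is a little Python sketch (run it on a PC, not the C64) that walks the first nine bytes of the boot file twice: once the way a normal monitor does, counting every unknown opcode as one byte, and once with the two-byte skb forms included. The length tables only cover the opcodes in these nine bytes, and all the names are my own, so treat it as an illustration and not a full disassembler.

```python
# Instruction lengths for the handful of opcodes at the start of the boot file.
# Only the opcodes needed for this walk are listed; this is not a full 6502 table.
DOCUMENTED = {0xAE: 3, 0x4E: 3, 0xCC: 3, 0xA2: 2}  # ldx abs, lsr abs, cpy abs, ldx #imm
UNDOCUMENTED = {0x64: 2, 0x14: 2}                  # two-byte skb forms used by the boot code

# First nine bytes of the boot file as seen at $02a7.
BOOT = bytes([0x64, 0xAE, 0x4E, 0xBF, 0x02, 0x14, 0xCC, 0xA2, 0xFF])


def instruction_starts(code, base, lengths):
    """Return the address of each instruction; unknown opcodes count as 1 byte."""
    starts, pc = [], 0
    while pc < len(code):
        starts.append(base + pc)
        pc += lengths.get(code[pc], 1)
    return starts


# What a monitor without illegal opcode support sees.
monitor_view = instruction_starts(BOOT, 0x02A7, DOCUMENTED)
# What the processor actually executes.
cpu_view = instruction_starts(BOOT, 0x02A7, {**DOCUMENTED, **UNDOCUMENTED})

print([f"${a:04x}" for a in monitor_view])
print([f"${a:04x}" for a in cpu_view])
```

The first list lands on $02a7, $02a8, $02ab, $02ac and $02ad like the gibberish listing, the second on $02a7, $02a9, $02ac and $02ae like the real one. The skb opcodes are there to knock a normal disassembler out of step.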
It's easy to see the program start by initializing some zero page vectors and decoding the loader at $0351, before branching to it from $02df. There is also some other stuff after the start code. Wonder what that is? I guess we have to continue examining the loader to determine this. So let's decode it, shall we? The following routine will decode the loader without having to use the above code, although it's important to understand how these things work. Cheating is NOT the way to do things properly; you're only cheating yourself of greater knowledge.
start lda #$50 ;setup indirect at $fb/$fc to $0350
sta $fb
lda #$03
sta $fc
ldy #$af ;load y index with #$af bytes to decode
decode
lda ($fb),y ;decode the loader in the cassette buffer ($0351-$03ff)
eor #$a7 ;notice the decode loop goes backwards through memory
eor #$fe
sta ($fb),y
dey
bne decode
rts
The values for decoding were taken from the boot, or break points were set in the boot code to extract the needed values. Feel free to practice doing this for yourself, sometimes it's a challenge all in itself.
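Since EOR is its own inverse and the two keys are always applied together, the pair collapses into one key: $a7 EOR $fe = $59. Below is a small Python sketch of the same decode for anyone working on a memory dump on a PC. The function and constant names are mine, and it assumes your dump starts at $0351 and that your tape uses the same two key values, which may not hold for every variant of the loader.

```python
KEY_A = 0xA7  # low byte of the vector at $0302 once the boot file has loaded
KEY_B = 0xFE  # value the boot code reads from $0317
COMBINED_KEY = KEY_A ^ KEY_B  # both EORs folded into one key


def decode(encoded):
    """EOR every byte with the combined key; running it twice restores the input."""
    return bytes(b ^ COMBINED_KEY for b in encoded)


if __name__ == "__main__":
    print(f"combined key = ${COMBINED_KEY:02x}")
```

Because the operation is symmetric, the same function re-encodes a patched loader if you ever want to put one back.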
Let's have a look at the decoded loader now. (ooooh the fun stuff!)
.> 0351 78 sei ;disable interrupts
.> 0352 ad 11 d0 lda $d011 ;load accumulator with vic control register
.> 0355 29 ef and #$ef ;and with %11101111 (bit 4 = 0 , blank screen)
.> 0357 8d 11 d0 sta $d011 ;store accumulator in vic control register
.> 035a a9 00 lda #$00 ;load accumulator with 00
.> 035c 85 c6 sta $c6 ;keyboard queue = 0
NOTE: sometimes ( .> 035e 85 9d sta $9d ;kernal msgs off ) is present here
.> 035e a9 80 lda #$80 ;load accumulator with #$80 : restart loop
.> 0360 8d 04 dd sta $dd04 ;store accumulator in cia 2 timer a low byte latch
.> 0363 a9 01 lda #$01 ;load accumulator with #$01
.> 0365 8d 05 dd sta $dd05 ;store accumulator in cia 2 timer a high byte latch
.> 0368 a9 19 lda #$19 ;load accumulator with #%00011001
.> 036a 8d 0e dd sta $dd0e ;store accumulator in cia 2 control register a
.> bit 0 = 1 start timer
.> bit 1 = 0 timer a output mode to pb6 = no
.> bit 3 = 1 one shot mode
.> bit 4 = 1 force load timer a
.> bit 5 = 0 count phase 02 clock cycles
.> bit 6 = 0 serial port i/o mode = input
.> 036d a5 01 lda $01 ;load accumulator with i/o port
.> 036f 29 1f and #$1f ;and accumulator with #%00011111
.> 0371 85 01 sta $01 ;store accumulator in i/o port , bit 5 off = cassette motor on
.> 0373 a0 00 ldy #$00 ;load y index with 00
.> 0375 20 bb 03 jsr $03bb ;sync to block
.> 0378 20 d2 03 jsr $03d2 ;get a byte
.> 037b 85 20 sta $20 ;store load address low byte
.> 037d 85 c1 sta $c1 ;make a copy of it
.> 037f 20 d2 03 jsr $03d2 ;get a byte
.> 0382 85 21 sta $21 ;store load address high byte
.> 0384 85 c2 sta $c2 ;make a copy of it
.> 0386 20 d2 03 jsr $03d2 ;get a byte
.> 0389 85 22 sta $22 ;store end address low byte
.> 038b 85 c3 sta $c3 ;make a copy of it
.> 038d 20 d2 03 jsr $03d2 ;get a byte
.> 0390 85 23 sta $23 ;store end address high byte
.> 0392 85 c4 sta $c4 ;make a copy of it
.> 0394 20 d2 03 jsr $03d2 ;get a byte - main loading loop
.> 0397 91 c1 sta ($c1),y ;store it in memory at the indirect address loaded from block header
.> 0399 e6 c1 inc $c1 ;increment load address low byte
.> 039b d0 02 bne $039f ;skip next instruction if not equal to 00
.> 039d e6 c2 inc $c2 ;increment load address high byte
.> 039f a5 c1 lda $c1 ;load accumulator with load address low byte
.> 03a1 c5 c3 cmp $c3 ;compare accumulator with low byte of end address (affecting the carry flag)
.> 03a3 a5 c2 lda $c2 ;load accumulator with load address high byte
.> 03a5 e5 c4 sbc $c4 ;subtract end address high byte with carry from accumulator (checking for end of file)
.> 03a7 90 eb bcc $0394 ;if carry is clear then continue loading
.> 03a9 20 d2 03 jsr $03d2 ;get a byte
.> 03ac d0 b0 bne $035e ;reset and restart load if not 00 (files to load)
.> 03ae 20 d2 03 jsr $03d2 ;get a byte
.> 03b1 85 4e sta $4e ;store start jump low byte
.> 03b3 20 d2 03 jsr $03d2 ;get a byte
.> 03b6 85 4f sta $4f ;store start jump high byte
.> 03b8 4c e1 02 jmp $02e1 ;jump back to boot file (finished loading now)
.> 03bb 20 e2 03 jsr $03e2 ;get a bit - sync to data block routine
.> 03be 66 bd ror $bd ;rotate bit into input byte. bits enter at bit 7 and move right, so bytes arrive low bit first ($bd = $00 on startup)
.> 03c0 a5 bd lda $bd ;load accumulator with input byte
.> 03c2 c9 96 cmp #$96 ;compare input byte to sync byte
.> 03c4 d0 f5 bne $03bb ;if not equal keep checking
.> 03c6 20 d2 03 jsr $03d2 ;get a byte
.> 03c9 c9 96 cmp #$96 ;compare accumulator to sync byte
.> 03cb f0 f9 beq $03c6 ;loop until end of sync mark
.> 03cd c9 81 cmp #$81 ;compare accumulator to block id = #%10000001
.> 03cf d0 ea bne $03bb ;resync
.> 03d1 60 rts ;return from subroutine
.> 03d2 a2 08 ldx #$08 ;load x index with #$08 - get a byte routine
.> 03d4 20 e2 03 jsr $03e2 ;get a bit
.> 03d7 66 bd ror $bd ;rotate bit into input byte
.> 03d9 ee 20 d0 inc $d020 ;increment border color (load effect)
.> 03dc ca dex ;decrement x index
.> 03dd d0 f5 bne $03d4 ;loop if 8 bits not received
.> 03df a5 bd lda $bd ;load accumulator with input byte
.> 03e1 60 rts ;return from subroutine
.> 03e2 a9 10 lda #$10 ;load accumulator with the mask #%00010000
.> 03e4 2c 0d dc bit $dc0d ;test the bits in accumulator against cia 1 interrupt control register
.> bit 4: cassette read / serial bus SRQ input
.> 03e7 f0 fb beq $03e4 ;if bit = 0 then wait more
.> 03e9 ad 0d dd lda $dd0d ;load cia 2 interrupt control register
.> bit 0 = timer A timeout (0 or 1)
.> 03ec 4a lsr ;shift bit into carry flag
.> 03ed a9 19 lda #$19 ;load accumulator with #%00011001
.> 03ef 8d 0e dd sta $dd0e ;store accumulator in cia 2 control register a
.> bit 0 = 1 start timer
.> bit 1 = 0 timer a output mode to pb6 = no
.> bit 3 = 1 one shot mode
.> bit 4 = 1 force load timer a
.> bit 5 = 0 count phase 02 clock cycles
.> bit 6 = 0 serial port i/o mode = input
.> 03f2 60 rts ;return from subroutine
...and there it is.
$0351-$0372 turns off IRQs, blanks the screen, sets up the timer for the loader and turns the cassette motor on.
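To put some numbers on that timer setup, here is a Python sketch that works out the latch value written to $dd04/$dd05 and picks apart the $19 written to $dd0e. The clock figures are the usual C64 PAL and NTSC CPU clocks; the bit names follow the CIA control register A layout, and the function names are my own. It only does the arithmetic and ignores the few cycles the polling loop itself costs.

```python
PAL_HZ = 985_248      # C64 PAL CPU clock in Hz
NTSC_HZ = 1_022_727   # C64 NTSC CPU clock in Hz

# CIA control register A, bit 0 first.
CRA_BITS = [
    "start timer",
    "timer A output on PB6",
    "PB6 toggle mode",
    "one-shot mode",
    "force load latch",
    "count CNT pulses",
    "serial port output",
    "50 Hz TOD input",
]


def cra_flags(value):
    """List the names of the bits that are set in a control register A value."""
    return [name for bit, name in enumerate(CRA_BITS) if (value >> bit) & 1]


def latch_cycles(low, high):
    """Combine the two latch bytes into a cycle count."""
    return (high << 8) | low


def cycles_to_us(cycles, clock_hz):
    """Convert CPU cycles to microseconds for a given clock."""
    return cycles * 1_000_000 / clock_hz


if __name__ == "__main__":
    threshold = latch_cycles(0x80, 0x01)
    print(cra_flags(0x19))
    print(threshold, "cycles")
    print(round(cycles_to_us(threshold, PAL_HZ), 2), "us on PAL")
    print(round(cycles_to_us(threshold, NTSC_HZ), 2), "us on NTSC")
```

So $19 means start, one-shot and force load, and the latch holds $0180 = 384 cycles. That is somewhere under 0.4 ms on either machine, and it is the dividing line the loader uses to tell a short pulse from a long one.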
The load begins at $0373, where Y is cleared and a jsr to the sync routine at $03bb syncs to a data block. The loader then reads the start and end address of the file into zero page.
Note: Conveniently, it already keeps two copies of these addresses ($20-$23 and $c1-$c4), which is handy for making your own transfer tool (which will be the next article using this loader).
The load loop is quite simple, fetching bytes and storing them and then checking for the end of file. Upon reaching the end of file mark, it loads another byte. If this byte is equal to 00, it reads the start address, exits the loader and does a few things to start the file, whether it be machine language or BASIC. If not, it restarts and loads the next file.
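Put together, one block on tape (after the sync bytes and the $81 id) reads as: load address, end address, data, one "more files" byte and, if that byte is 00, the start address. Here is a Python sketch of that layout as I read it from the listing. The function name and the returned dictionary are my own invention. Note the cmp/sbc pair leaves the carry set once the running address reaches the end address, so the end address is one past the last byte stored.

```python
def parse_block(stream):
    """Parse one block from the bytes that follow the $81 block id.

    Mirrors the loader's order of operations: a byte is stored, the load
    pointer is incremented, and only then is it compared with the end address.
    """
    load = stream[0] | (stream[1] << 8)   # load address, low byte first
    end = stream[2] | (stream[3] << 8)    # end address, low byte first
    pos, cur = 4, load
    payload = bytearray()
    while True:
        payload.append(stream[pos])
        pos += 1
        cur = (cur + 1) & 0xFFFF
        if cur >= end:                    # carry set after cmp/sbc: stop loading
            break
    more = stream[pos] != 0               # non-zero: another file follows
    pos += 1
    start = None
    if not more:
        start = stream[pos] | (stream[pos + 1] << 8)  # goes to $4e/$4f
        pos += 2
    return {"load": load, "end": end, "data": bytes(payload),
            "more": more, "start": start, "used": pos}


if __name__ == "__main__":
    # Made-up example: 3 bytes loaded to $0801-$0803, last file, start at $080d.
    demo = bytes([0x01, 0x08, 0x04, 0x08, 0xAA, 0xBB, 0xCC, 0x00, 0x0D, 0x08])
    print(parse_block(demo))
```

The example bytes are invented for the demonstration and do not come from any of the games mentioned.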
The routine at $03bb is the data block sync routine. It rotates bits in one at a time until it gets a match to the sync byte ($96), then reads whole bytes until it gets something other than a sync byte. If this byte does not match the data block id byte ($81), the loader goes back to hunting for the next sync.
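The sync hunt is easy to model on a list of 0/1 values. The Python sketch below follows the routine at $03bb step by step: shift one bit at a time until the register reads $96, swallow whole sync bytes, then check for the $81 id. The names are mine, and it assumes you already have the bits as a plain list (how you get them off a tape image is outside this sketch).

```python
SYNC = 0x96      # sync byte the loader hunts for
BLOCK_ID = 0x81  # id byte that must follow the run of sync bytes


def ror(byte, bit):
    """Shift a new bit in at bit 7, like 'ror $bd' with the bit in carry."""
    return (bit << 7) | (byte >> 1)


def to_bits(byte):
    """Bits of a byte in the order they arrive on tape: low bit first."""
    return [(byte >> i) & 1 for i in range(8)]


def find_block(bits):
    """Return the index of the first bit after the block id, or None."""
    pos, shift, n = 0, 0, len(bits)
    while True:
        # Bit-by-bit hunt until the shift register holds the sync byte.
        while shift != SYNC:
            if pos >= n:
                return None
            shift = ror(shift, bits[pos])
            pos += 1
        # Byte aligned now: read whole bytes while they are still sync bytes.
        while shift == SYNC:
            if pos + 8 > n:
                return None
            for _ in range(8):
                shift = ror(shift, bits[pos])
                pos += 1
        if shift == BLOCK_ID:
            return pos
        # Anything else: go back to the bit-by-bit hunt.


if __name__ == "__main__":
    stream = [1, 0, 1] + to_bits(SYNC) * 3 + to_bits(BLOCK_ID) + to_bits(0x01)
    print(find_block(stream))
```

The three junk bits at the front of the example show why the first stage has to work on single bits: the loader has no idea where byte boundaries are until it has seen one sync byte.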
The bit fetch routine at $03e2 waits for a 1 in bit 4 of $dc0d, the cassette read line. When this bit is set, a pulse has been read from the datasette. We now check the timeout flag (bit 0) of the cia 2 interrupt control register at $dd0d. If a timeout occurred, then the bit is a 1; if not, it is a 0. The timer is restarted for the next bit, and the byte fetch routine at $03d2 rotates the fetched bit into the input byte. When 8 bits have been fetched it loads the input byte and returns.
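In other words, every pulse is measured against the one-shot timer: a gap longer than the latch value gives a 1, a shorter gap gives a 0. Here is a rough Python model of the bit and byte fetch. The threshold is the $0180 from the listing, but the pulse lengths in the example ($0100 and $0200 cycles) are made up by me purely for illustration, and the exact behaviour right at the boundary and the cycles spent in the polling loop are glossed over.

```python
THRESHOLD = 0x0180  # timer A latch value set up at $035e-$0365


def bit_from_pulse(cycles, threshold=THRESHOLD):
    """1 if the one-shot timer would have run out before the next edge, else 0."""
    return 1 if cycles > threshold else 0


def byte_from_pulses(pulses):
    """Assemble one byte from 8 pulse lengths, the way 'ror $bd' does."""
    value = 0
    for cycles in pulses[:8]:
        value = (bit_from_pulse(cycles) << 7) | (value >> 1)
    return value


if __name__ == "__main__":
    SHORT, LONG = 0x0100, 0x0200  # hypothetical pulse lengths, not measured from tape
    # $96 sent low bit first: 0,1,1,0,1,0,0,1
    pulses = [SHORT, LONG, LONG, SHORT, LONG, SHORT, SHORT, LONG]
    print(f"${byte_from_pulses(pulses):02x}")
```

Feeding it the eight pulses in the example gives back the sync byte, since the first pulse read ends up in bit 0 after eight rotates.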
Well kiddies, that's it for now. Next time we discuss how to make a tape to disk transfer out of this routine so we don't have to crack it by hand every time. Bye for now.
Fungus/Nostalgia/Onslaught.