6502 Assembly Language

Software Portability and Optimization

Introduction

The 6502 is an 8-bit processor with a 16-bit address bus, so it can address 64 kilobytes (2^16 bytes) of memory. Since each 16-bit address consists of two 8-bit bytes, memory can be viewed as 256 pages of 256 bytes each. In this blog post, I will use the 6502 emulator to work with bitmap operations.
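To make the page view of memory concrete, here is a minimal Python sketch that splits a 16-bit address into its page number (high byte) and offset (low byte). Python is used only as a calculator here, and the helper name split_address is made up for illustration.

```python
def split_address(address):
    """Split a 16-bit address into (page, offset), each in the range 0..255."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    return address >> 8, address & 0xFF

print(2 ** 16)                                   # 65536 bytes of address space
page, offset = split_address(0x05E0)
print(f"page ${page:02x}, offset ${offset:02x}")  # page $05, offset $e0
```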

Initial Code

The following code fills the emulator's bitmapped display with yellow. First, we set a pointer at $40 to hold the address of the first 256 pixels of the 32x32 display ($0200). Next, the colour and the index are set. Then we enter a loop that walks through each page (each "page" is one-quarter of the bitmapped display) and sets all 256 pixels of that page to the given colour. When a page is done, the high byte of the pointer is incremented to move to the next page, and the loop stops once the page number reaches 6:
        lda #$00        ; set a pointer at $40 to point to $0200
        sta $40
        lda #$02
        sta $41
        lda #$07        ; colour number
        ldy #$00        ; set index to 0

loop:   sta ($40),y     ; set pixel at the address (pointer)+Y
        iny             ; increment index
        bne loop        ; continue until done the page

        inc $41         ; increment the page
        ldx $41         ; get the current page number
        cpx #$06        ; compare with 6
        bne loop        ; continue until done all pages

Calculating Performance

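To measure the performance of the code above, we need the number of cycles each instruction takes, which comes from the 6502 documentation, and the number of times each instruction executes. A clock speed of 1 MHz is assumed. The "Alt" columns cover the case where a branch is not taken, which costs 2 cycles instead of 3.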

Instruction         Cycles  Cycle Count  Alt Cycles  Alt Count  Total
LDA #$00            2       1            -           -          2
STA $40             3       1            -           -          3
LDA #$02            2       1            -           -          2
STA $41             3       1            -           -          3
LDA #$07            2       1            -           -          2
LDY #$00            2       1            -           -          2
loop: STA ($40),y   6       1024         -           -          6144
INY                 2       1024         -           -          2048
BNE loop            3       1020         2           4          3068
INC $41             5       4            -           -          20
LDX $41             3       4            -           -          12
CPX #$06            2       4            -           -          8
BNE loop            3       3            2           1          11

Total: 11325 cycles
CPU Speed: 1 MHz
Time per Cycle: 1 µs
Total: 11325 µs
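The total can be double-checked with a few lines of Python. This is only a sketch that re-adds the rows of the table above; the tuple layout and the name total_cycles are my own choices for illustration.

```python
# Rows copied from the table above:
# (instruction, cycles, count, alt_cycles, alt_count)
CYCLE_ROWS = [
    ("LDA #$00", 2, 1, 0, 0),
    ("STA $40", 3, 1, 0, 0),
    ("LDA #$02", 2, 1, 0, 0),
    ("STA $41", 3, 1, 0, 0),
    ("LDA #$07", 2, 1, 0, 0),
    ("LDY #$00", 2, 1, 0, 0),
    ("STA ($40),y", 6, 1024, 0, 0),
    ("INY", 2, 1024, 0, 0),
    ("BNE loop (inner)", 3, 1020, 2, 4),
    ("INC $41", 5, 4, 0, 0),
    ("LDX $41", 3, 4, 0, 0),
    ("CPX #$06", 2, 4, 0, 0),
    ("BNE loop (outer)", 3, 3, 2, 1),
]

def total_cycles(rows):
    """Sum cycles * count plus alt_cycles * alt_count over all rows."""
    return sum(cycles * count + alt_cycles * alt_count
               for _, cycles, count, alt_cycles, alt_count in rows)

total = total_cycles(CYCLE_ROWS)
print(total)          # 11325 cycles
print(total / 1000)   # 11.325 ms at 1 MHz, since one cycle takes 1 µs
```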

Improvements

The code below fills the screen with a solid colour slightly faster by removing some of the instructions, which also makes the program smaller. However, in a nutshell, it is the same code:
        ldx #$02        ; start at page $02 (address $0200)
        lda #$07        ; set colour to yellow

store:  stx $01         ; store the current page in the high byte of the pointer at $00
                        ; (assumes $00 and Y are still zero, as after an emulator reset)
loop:   sta ($00),y     ; set pixel at the address (pointer)+Y
        iny             ; increment index Y
        bne loop        ; continue until done the page
        inx             ; increment the page
        cpx #$06        ; compare with 6
        bne store       ; if less, store the page value

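The table lists a variant of the same loop that counts the pages down with "dex" instead of up with "inx". Both instructions take 2 cycles, so the totals apply to either version.

Instruction         Cycles  Cycle Count  Alt Cycles  Alt Count  Total
ldx #$05            2       1            -           -          2
lda #$07            2       1            -           -          2
next: stx $01       3       4            -           -          12
loop: sta ($00),Y   6       1024         -           -          6144
iny                 2       1024         -           -          2048
bne loop            3       1020         2           4          3068
dex                 2       4            -           -          8
cpx #$01            2       4            -           -          8
bne next            3       3            2           1          11

Total: 11303 cycles
CPU Speed: 1 MHz
Time per Cycle: 1 µs
Total: 11303 µs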
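To put the saving in perspective, the short Python sketch below compares the two totals and works out how much of the time is spent in the inner loop (sta, iny, bne loop). The constant names are mine, and the numbers are taken from the two tables.

```python
# Inner loop cost from the tables: sta + iny + bne loop (taken and not taken)
INNER_LOOP = 6 * 1024 + 2 * 1024 + (3 * 1020 + 2 * 4)
ORIGINAL_TOTAL = 11325
IMPROVED_TOTAL = 11303

saved = ORIGINAL_TOTAL - IMPROVED_TOTAL
print(saved)                                  # 22 cycles saved
print(f"{saved / ORIGINAL_TOTAL:.2%}")        # 0.19%
print(f"{INNER_LOOP / IMPROVED_TOTAL:.1%}")   # 99.6% of the time is the inner loop
```

Since almost all of the cycles are spent in the three inner-loop instructions, changes to the setup and page-switching code can only ever give a small speed-up.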

Modifications

Changing the colour to be displayed is simple. We just need to load a different value into the "A" register after the pointer has been initialized. Any of the 16 colour numbers ($00 to $0f) can be used.

For example, loading $0e into the accumulator (lda #$0e instead of lda #$07) sets the colour to light blue.

To display each page in a different colour, we need a table with one colour per page, and whenever we move to the next page we also need to advance to the next colour in the table:
        lda #$00        ; set a pointer at $40 to point to $0200
        sta $40
        lda #$02
        sta $41

        lda #$00        ; keep the index of the current colour
        sta $10         ; at address $10
        ldy #$00        ; set index to 0
        lda colours,y   ; load the first colour from the table

loop:   sta ($40),y     ; set pixel at the address (pointer)+Y
        iny             ; increment index
        bne loop        ; continue until done the page

        inc $10         ; move to the next colour
        ldy $10         ; get the colour index
        lda colours,y   ; put the colour into the accumulator
        ldy #$00        ; clear Y

        inc $41         ; increment the page
        ldx $41         ; get the current page number
        cpx #$06        ; compare with 6
        bne loop        ; continue until done all pages
        brk             ; stop here so the colour table is not executed as code

colours:                ; colours for each of the four pages
        dcb $07, $01, $02, $04


Experiments

I do not provide the code for each experiment. Instead, I include a screenshot for each, because the code is the same initial code with minor additions.

1. Adding a "tya" instruction after the "loop:" label and before the "sta ($40), y" instruction results in the following:

As can be seen from the screenshot, each row shows 16 colours repeated twice. The TYA instruction transfers index Y to the accumulator before every pixel is stored, Y increases by one per pixel, and the display uses only the lowest four bits of each byte as the colour. The colour therefore cycles every 16 pixels, which is twice per 32-pixel row.

2. Adding "lsr" after the "tya" results in the following:

The number of colours is the same, but each colour on a row is now twice as wide (2 pixels) as in the previous screenshot, because "lsr" shifts the accumulator right by one bit, which halves the copy of index Y before it is used as the colour.

3. Each additional "lsr" instruction doubles the width of a single-colour run on a row, so after the 5th "lsr" each colour takes up a whole row (32 pixels):

4. On the other hand, an "asl" instruction halves the number of colours, because it multiplies the copy of index Y by two, which leaves only the even-numbered colours of the palette. Each consecutive "asl" instruction halves the number of colours again:

5. Going back to the initial code, if we add a second "iny" instruction, only every second pixel is coloured.
However, if we add a third "iny" instruction, the whole display is filled again. Y now advances by 3 per pixel, and because 256 is not a multiple of 3, Y wraps around past zero instead of landing on it. The zero flag that "bne" depends on is therefore not set at the end of the first pass, the loop keeps going, and the later passes fill in the pixels that were skipped. I found that only a power-of-two number of "iny" instructions works as intended, colouring only every 2nd, 4th, 8th, 16th ... pixel:
4 "iny"s

3 "iny"s


6. To give each pixel a random colour, we load a value from the pseudo-random number generator at address $fe into the accumulator each time a pixel is drawn.
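The patterns from experiments 1 to 5 can be reproduced without the emulator. The Python sketch below is only a model under the assumptions stated in the experiments (the display uses the low four bits of each byte, and Y is an 8-bit register that wraps around). The function names row_colours and coloured_offsets are hypothetical.

```python
from math import gcd

WIDTH = 32  # pixels per display row

def row_colours(transform, row=0):
    """Colour numbers on one row when A = transform(Y) before each store."""
    return [transform(y) & 0x0F for y in range(row * WIDTH, (row + 1) * WIDTH)]

def coloured_offsets(step):
    """Offsets within a page written when Y advances by `step` per store."""
    seen, y = set(), 0
    while True:
        seen.add(y)
        y = (y + step) & 0xFF   # Y is 8 bits wide and wraps around
        if y == 0:              # bne falls through only when Y lands on zero
            return seen

print(row_colours(lambda y: y))                           # tya: 0..15 twice
print(row_colours(lambda y: y >> 1))                      # tya, lsr: runs of 2
print(sorted(set(row_colours(lambda y: (y << 1) & 0xFF))))  # tya, asl: even colours

for step in (2, 3, 4):
    # pixels coloured per page, and the count predicted by 256 / gcd(step, 256)
    print(step, len(coloured_offsets(step)), 256 // gcd(step, 256))
```

In this model the number of pixels coloured per page is 256 divided by the greatest common divisor of the step and 256, which is why only power-of-two steps colour exactly every Nth pixel.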

Some Challenges

1. The program below draws lines at the top and bottom of the display (a red line across the top, a green line across the bottom). We simply set pointers to the first and the last row and then draw 32 pixels for each:
define TOP    $02       ; top colour constant (red)
define BOTTOM $05       ; bottom colour constant (green)

setup:  lda #$00        ; set a pointer at $40 to point to $0200
        sta $40
        lda #$02
        sta $41
        lda #$e0        ; set a pointer at $42 to point to $05e0
        sta $42
        lda #$05
        sta $43
        ldy #$00        ; set index to 0
        lda #TOP        ; get TOP colour

loop:   sta ($40),y     ; set pixel at the address (pointer)+Y
        iny             ; increment index
        cpy #$20        ; draw only 32 pixels
        bne loop        ; continue until done the top row

switch: ldy #$00        ; set index to 0
        lda #BOTTOM     ; get BOTTOM colour

loop2:  sta ($42),y     ; set pixel at the address (pointer)+Y
        iny             ; increment index
        cpy #$20        ; draw only 32 pixels
        bne loop2       ; continue until done the bottom row
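The two pointer values used above ($0200 and $05e0) follow from the display layout: the bitmap starts at $0200 and each row takes 32 bytes. A small Python sketch of that arithmetic, with constant and function names chosen only for illustration:

```python
SCREEN_BASE = 0x0200   # address of the first pixel of the display
ROW_WIDTH = 32         # pixels (bytes) per row
DISPLAY_ROWS = 32      # rows on the display

def row_address(row):
    """Address of the first pixel in the given row (0 is the top row)."""
    if not 0 <= row < DISPLAY_ROWS:
        raise ValueError("row must be between 0 and 31")
    return SCREEN_BASE + row * ROW_WIDTH

print(f"${row_address(0):04x}")    # $0200, the top row
print(f"${row_address(31):04x}")   # $05e0, the bottom row
```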


2. The program below sets all of the display pixels to yellow, except for the middle four pixels, which are drawn in blue. First, we fill every pixel with the main colour, and then we draw the centre block in the other colour:
define WIDTH  2         ; width of center block
define HEIGHT 2         ; height of center block
define CENTER $06       ; center block colour
define MAIN   $07       ; main colour

setup:  lda #$ef        ; set a pointer at $10 to point to the location
        sta $10         ; of the block to be drawn in the center ($03ef)
        lda #$03
        sta $11
        lda #$00        ; set the number of rows drawn at $12
        sta $12
        ldy #$00        ; index for screen column
        ldx #$02        ; start at page $02 (address $0200)
        lda #MAIN       ; set main colour

store:  stx $01         ; store the current page in the high byte of the pointer at $00
                        ; (assumes $00 is still zero, as after an emulator reset)
loop:   sta ($00),y     ; set pixel at the address (pointer)+Y
        iny             ; increment index Y
        bne loop        ; continue until done the page
        inx             ; increment the page
        cpx #$06        ; compare with 6
        bne store       ; if less, store the page value

center: lda #CENTER     ; set center colour
        sta ($10),y     ; set pixel at the block pointer + Y
        iny
        cpy #WIDTH      ; check if the block width is filled
        bne center      ; if no, continue filling

        inc $12         ; increment row counter

        lda #HEIGHT     ; check if block height is filled
        cmp $12         ; if no, continue filling
        beq done        ; if yes, exit

        lda $10         ; load the low byte of the block pointer
        clc
        adc #$20        ; add 32 to go to the next row
        sta $10
        lda $11         ; carry to high byte if needed
        adc #$00
        sta $11

        ldy #$00        ; back to the first column
        beq center      ; always taken, because ldy #$00 set the zero flag

done:   brk             ; stop when finished
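The clc / adc #$20 / adc #$00 sequence above is a 16-bit addition done one byte at a time. The Python sketch below models it and lists the four addresses the centre block ends up at. It is an illustration only, and the helper name add_to_pointer is made up.

```python
def add_to_pointer(low, high, amount):
    """Add an 8-bit amount to a 16-bit pointer stored as (low, high) bytes."""
    total = low + amount
    carry = total >> 8                     # 1 when the low byte overflows
    return total & 0xFF, (high + carry) & 0xFF

low, high = 0xEF, 0x03                     # pointer to $03ef, as in setup
for _ in range(2):                         # HEIGHT rows
    base = (high << 8) | low
    print([f"${base + col:04x}" for col in range(2)])   # WIDTH columns
    low, high = add_to_pointer(low, high, 0x20)          # move down one row
# prints ['$03ef', '$03f0'] and then ['$040f', '$0410']
```

Moving from the first row of the block to the second crosses a page boundary ($03ef + $20 = $040f), which is why the carry into the high byte matters here.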

Conclusion

As far as I am concerned, 6502 assembly language is a great starting point for learning the low-level operations and processes used in modern computing. It gives you an idea of how complex processors are and what it takes to develop software at a low level. As for me, I have learned how to work with a bitmapped display and draw whatever I want using instructions from the documentation. I have enjoyed the problem-solving process.

Author: Iurii Kondrakov 
GitHub: github.com

P.S. This blog post was created for SPO600 Lab 2.
