Friday, September 28, 2012

Tracking Local Variable Usage

I'm trying to learn to code for the NES, and one thing that becomes obvious very quickly is that one has to take advantage of zero page memory. It's faster, for one, but if you have multiple subroutines it's also better to reuse that memory. The idea is that if a subroutine needs memory, you give it some zero page RAM and reuse that RAM with another subroutine later. If a subroutine needs something to be static, assign it memory explicitly in the ZEROPAGE or BSS segment. So I was reusing zero page for subroutine locals and got paranoid that I was going to overwrite something or run into some other problem, so I came up with a fairly easy-to-use method of tracking local variable RAM usage and parameter RAM usage (since I also use zero page to pass parameters). It does NOT use a stack; it just gives you warnings via NintendulatorDX's debugOut macro when you might be doing something dangerous. Use in code is as follows:
; In a main module, define:

FUNCTION_RANGE_CHECKING_ON = 1 ;Turn off if functions are okay
FUNCTION_LOCAL_SIZE = $0A ;# bytes for local zero page shared
FUNCTION_PARAMS_SIZE = $06 ;# bytes for parameters

; these two must be the first thing reserved in zeropage
.segment "ZEROPAGE"
function_locals: .res FUNCTION_LOCAL_SIZE  
param:   .res FUNCTION_PARAMS_SIZE     

Anywhere else or in an included file define your callable functions as:
.func get_nametableaddress


.locals
 nametableaddress .byte
.endlocals

; IN:  reg x has nametable x, reg y has nametable y
; OUT : reg x has low address, reg y has high address
; LOCAL: Uses 1 byte

 tya     
 asl    
 
 asl    
 asl
 asl
 asl
  
 stx local::nametableaddress
 ora local::nametableaddress 
 tax
 tya 
 lsr
 lsr
 lsr
 ora #$20
 tay
 rts
 
.endfunc
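
As a cross-check of the shifts and ORs in this routine, here is a small Python model of the same arithmetic. Python is used only for illustration, and the function name is made up for this sketch. It assumes x is 0-31 and y is 0-29, and it should agree with $2000 + y*32 + x.

```python
def nametable_address(x, y):
    """Model of get_nametableaddress: returns the (low, high) address bytes."""
    # tya / asl x5 / ora x: the low 8 bits of y*32, combined with x
    low = ((y << 5) & 0xFF) | x
    # tya / lsr x3 / ora #$20: the bits of y*32 that overflow the low byte, plus $20
    high = (y >> 3) | 0x20
    return low, high


low, high = nametable_address(10, 10)
print(f"${high:02X}{low:02X}")  # prints $214A
```

The two halves never overlap because x only needs the low 5 bits of the low byte, while the shifted y occupies the top 3.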

If you wish to define a function as above inside a scope and export/import it:
;export:
.exportfunc get_nametableaddress

;import:
.importfunc get_nametableaddress
In code, you can use call:
  call get_nametableaddress, #10,#10
The macro will generate a small amount of code that checks the local variable usage and parameter usage. In this example no parameter memory is used because the macro just loads registers X and Y. There is no stack! Only a small amount of code is output; it checks for overuse or nested use of parameter and local memory and outputs a warning, so you can look closer and see whether there is a problem. If you do pass parameters in parameter memory, use clear_param num, where num is the number of parameters acknowledged as read/no longer needed:
.func some_function

; IN reg X,Y (addressIN) and a, param+0 (addressout)

.locals
 addressIN .word
 addressOUT .word
.endlocals



 stx local::addressIN
 sty local::addressIN + 1
 
 sta local::addressOUT
 lda param+0
 clear_param 1
 sta local::addressOUT + 1
 ;.....
If you look at the very first line of code:
FUNCTION_RANGE_CHECKING_ON = 1 ;Turn off if functions are okay
If you turn that setting off (set it to 0 rather than removing the line, since the macros test the symbol with .if), no extra checking code will be generated and your assembled code will be the same as if you had not used the checks at all.

Code:

; Define this somewhere in your main module near the beginning:
; FUNCTION_RANGE_CHECKING_ON = 1
; FUNCTION_LOCAL_SIZE = $08
; FUNCTION_PARAMS_SIZE = $08

;.segment "ZEROPAGE"
;function_locals: .res FUNCTION_LOCAL_SIZE
;param: .res FUNCTION_PARAMS_SIZE


.feature leading_dot_in_identifiers

.if ::FUNCTION_RANGE_CHECKING_ON
.pushseg
.segment "BSS"
locals_used: .res 1
params_used: .res 1
.popseg
.endif

.macro .exportfunc func1, func2, func3, func4, func5, func6, func7, func8, func9, func10

.export func1
.export .ident(.sprintf("%s%s",.string(func1),"___sizeof_locals" ))
.ifnblank func2
.exportfunc func2, func3, func4, func5, func6, func7, func8, func9, func10
.endif

.endmacro

.macro .importfunc func1, func2, func3, func4, func5, func6, func7, func8, func9, func10

.import func1
.import .ident(.sprintf("%s%s",.string(func1),"___sizeof_locals" ))
.ifnblank func2
.importfunc func2, func3, func4, func5, func6, func7, func8, func9, func10
.endif

.endmacro


.macro .func name

.ifdef _name_
.undefine _name_
.endif
.define _name_  name

.proc _name_

.endmacro


.macro .locals
_locals_ .set 1
.struct local

.endmacro

.macro .endlocals
.endstruct
.endmacro



.macro .endfunc
.endproc

.ifdef  _name_::_locals_
.ident(.concat(.string(_name_),"___sizeof_locals" )) = .sizeof(_name_::local)
.else
.ident(.concat(.string(_name_),"___sizeof_locals" ))  = 0
.endif

.endmacro


.macro call function, paramX, paramY, paramA, param0, param1, param2, param3

; use in order: reg.x, reg.y, reg.a, param zeropage, ....etc

.if .xmatch(paramA,a)
pha
.endif


.if ::FUNCTION_RANGE_CHECKING_ON
.ifdef  ::.ident(.concat(.string(function),"___sizeof_locals" ))

lda locals_used
if not zero
debugOut { "WARNING: Nested local variable usage calling: ", .string(function) }
endif
lda # ::.ident(.concat(.string(function), "___sizeof_locals"))
clc
adc locals_used
sta locals_used
cmp #FUNCTION_LOCAL_SIZE
if greater
debugOut { "WARNING: Function: '", .string(function), "' local memory exceeded: ",  fHex8 { locals_used } }
endif
.endif
.if .paramcount > 4
lda params_used
if not zero
debugOut { "WARNING: Nested parameter usage: ", .string(function) }
endif
lda #<( .paramcount - 4)
clc
adc params_used
sta params_used
cmp #FUNCTION_PARAMS_SIZE
if greater
debugOut { "WARNING: Function: '", .string(function), "' parameter memory exceeded: ", fHex8 { params_used } }
endif
.endif
.endif
.ifnblank param3
lda param3
sta param+3
.endif
.ifnblank param2
lda param2
sta param+2
.endif
.ifnblank param1
lda param1
sta param+1
.endif
.ifnblank param0
lda param0
sta param+0
.endif
.ifnblank paramA
.if .xmatch (paramA,a)
pla
.else
lda paramA
.endif
.endif
.ifnblank paramY
.if .not .xmatch (paramY,y)
ldy paramY
.endif
.endif
.ifnblank paramX  ; reg.x
.if .not .xmatch(paramX,x)
ldx paramX
.endif
.endif
jsr function
.if ::FUNCTION_RANGE_CHECKING_ON .and .defined(::.ident(.concat(.string(function), "___sizeof_locals")))
lda locals_used
sec
sbc # ::.ident(.concat(.string(function), "___sizeof_locals"))
sta locals_used
.endif
.endmacro


.macro clear_param num ; clear a value from the param count
.if ::FUNCTION_RANGE_CHECKING_ON
.repeat num
dec params_used ; if range checking on, dec params
.endrepeat
.endif
.endmacro
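
To make the bookkeeping easier to follow, here is a rough Python model of what the call and clear_param macros track when range checking is on. It is only a sketch of the counters, not a translation of the macros: the class and method names are hypothetical, the 8-bit wraparound of the counters is ignored, and the body argument simply stands in for the called function's code.

```python
class UsageTracker:
    """Hypothetical model of the locals_used / params_used counters."""

    def __init__(self, local_size, params_size):
        self.local_size = local_size      # FUNCTION_LOCAL_SIZE
        self.params_size = params_size    # FUNCTION_PARAMS_SIZE
        self.locals_used = 0
        self.params_used = 0
        self.warnings = []

    def call(self, name, sizeof_locals, zp_params=0, body=None):
        # Before the jsr: flag nested use, then add this function's share.
        if self.locals_used != 0:
            self.warnings.append(f"nested local variable usage calling: {name}")
        self.locals_used += sizeof_locals
        if self.locals_used > self.local_size:
            self.warnings.append(f"'{name}' local memory exceeded")
        if zp_params > 0:
            if self.params_used != 0:
                self.warnings.append(f"nested parameter usage: {name}")
            self.params_used += zp_params
            if self.params_used > self.params_size:
                self.warnings.append(f"'{name}' parameter memory exceeded")
        if body is not None:
            body()                        # stands in for the function's code
        # After the rts: give the local bytes back.
        self.locals_used -= sizeof_locals

    def clear_param(self, num):
        # The callee acknowledges num parameters as read.
        self.params_used -= num


tracker = UsageTracker(local_size=0x0A, params_size=0x06)


def some_function_body():
    tracker.clear_param(1)
    tracker.call("get_nametableaddress", sizeof_locals=1)  # nested call


tracker.call("get_nametableaddress", sizeof_locals=1)      # no warning
tracker.call("some_function", sizeof_locals=4, zp_params=1,
             body=some_function_body)
print(tracker.warnings)
```

In this run the only warning comes from the nested call, which is the point of the scheme: a warning is a prompt to check whether the inner function's locals really overlap something the outer function still needs.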



Thursday, September 20, 2012

Weird "Double Dabble" code.

This code, as the comments suggest, takes a 24-bit value of up to 999,999 decimal and converts it to base 100, stored in three bytes. The code assumes it can use local variables starting at address $0000, and that it can use the macros from here.

.proc convert_to_base100 ; (addressIN: word, addressOUT: word)

; regx, regy: low, hi, address input (24 bit value up to 999,999) 
; reg a, pla: low high address out
; convert 24 bits to 3 base-100 bytes: 10,000s, 100s, 1s
; double dabble: to correct a single digit, check before it is
; shifted whether it will become 10 or more. Correcting after the
; shift means adding 6; correcting before the shift means adding 3
; for the same result. Adding 3 is done right before the shift that
; would have moved the digit into the correct spot
; 
; so for a value of 100 you add 9C or 4E before the shift
;
; 100 in hex:       64     32
;                +  9C   + 4E
;                -----   -----
;                   100    80  --> shl 1 = 100
;
; add 4E if the value to be shifted once more is >=32hex
;
; 0 - 9 is ten digits, 0 -99 one hundred digits , need to check the 
; hundreds for greater than or equal 32
; 
; 999999 = 00001111   01000010   00111111  4 shifts to discard zeros
; to be shifted into 00000000 00000000  00000000 
; 32 in binary = 00110010  = 6 more shifts before check needed 
; 10 shifts total needed before testing
; 
; 14 more shifts after
;

.struct
 addressIN   .byte 2
 addressOUT   .byte 2
 tempIN    .byte 3
 tempOUT   .byte 3
.endstruct

 
 stx addressIN
 sty addressIN+1
 sta addressOUT
 pla
 sta addressOUT+1
 
 ldy #2
 repeat 
  lda (addressIN),y
  sta tempIN,y
  dey
 until negative
 
 
 lda #0
 sta tempOUT
 sta tempOUT+1
 sta tempOUT+2
 
 clc

 ; 4 shifts to remove zeros
  
  lda tempIN
  rol a   
  rol tempIN + 1
  rol tempIN + 2
  rol a   
  rol tempIN + 1
  rol tempIN + 2
  rol a   
  rol tempIN + 1
  rol tempIN + 2
  rol a   
  rol tempIN + 1
  rol tempIN + 2
 
 
 ldy #4 ; 4 shifts that have to go into OUT
 repeat
  rol a   
  rol tempIN + 1
  rol tempIN + 2
  
  rol tempOUT
  dey
 until zero
 
  ; TWO MORE
  
  lda tempIN + 1
  rol a
  rol tempIN + 2
  
  rol tempOUT
  rol a
  rol tempIN + 2
  
  rol tempOUT
  sta tempIN + 1
 
  
 ; 10 shifts done 
 
 ldx #0
 
 ldy #6
 repeat

  jsr check32
  
  rol tempIN + 1
  rol tempIN + 2
  
  rol tempOUT
  rol tempOUT + 1
  
  dey
 until zero
 
 ; after this next check, byte 1 could hold $80
 ; 6 more shifts to check byte 2 ..don't need to though
 

 ldy #8
 repeat
 
  ldx #1
  jsr check32
  dex
  jsr check32
 
  
  ; clc
  ; rol tempIN
  ; rol tempIN + 1
  rol tempIN + 2
  
  rol tempOUT
  rol tempOUT + 1
  rol tempOUT + 2
  
  
  dey
 until zero
 
 ldy #2
 repeat 
  lda tempOUT,y
  sta (addressOUT),y
  dey
 until negative
 

 rts

 
 check32:
 lda tempOUT,x
 cmp #$32
 if greaterORequal
  ; carry is set
  adc #$4D ; add #$4E
  sta tempOUT,x
 endif
 
 rts
 
 
.endproc
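
For anyone who wants to check the add-$4E trick without an emulator, here is a minimal Python sketch of the same idea. It is not a line-by-line port: it runs all 24 shifts and checks every output byte each time, rather than skipping the shifts and checks the assembly proves unnecessary. The function name is made up for this example.

```python
def to_base100(value):
    """Base-100 double dabble: returns [ones, hundreds, ten_thousands]."""
    out = [0, 0, 0]
    for bit in range(23, -1, -1):
        # Before each shift, correct any digit that would reach 100 or more.
        # Adding $4E to a digit >= $32 sets bit 7, so the shift carries a 1
        # into the next byte and leaves 2*digit - 100 behind.
        for i in range(3):
            if out[i] >= 0x32:
                out[i] = (out[i] + 0x4E) & 0xFF
        # Shift the next input bit through the three output bytes.
        carry = (value >> bit) & 1
        for i in range(3):
            shifted = (out[i] << 1) | carry
            carry = shifted >> 8
            out[i] = shifted & 0xFF
    return out


print(to_base100(999999))  # prints [99, 99, 99]
print(to_base100(123456))  # prints [56, 34, 12]
```

The result is ordered low byte first, matching tempOUT, tempOUT+1, tempOUT+2 in the routine.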

Monday, September 10, 2012

ca65 tokenlist

Just some info on what ca65 recognizes and automatically turns into tokens, which matters when, for example, you use those characters in the arguments of a macro call. http://cc65.oldos.net/snapshot/sources/cc65-snapshot-2.13.9.20120311/src/ca65/token.h

Thursday, September 6, 2012

CA65 Highlevel Macros


EDIT: I see this blog post may still turn up in search engines. Please use the code maintained here: https://gitlab.com/ca65/ca65hl
This macro code is much cleaner and faster than any of the old code related to these posts.

I always liked the idea of higher level assembly as long as it didn't get too carried away and there was no worry about getting too far away from the opcodes. I was trying to use NESHLA earlier but it just seems to be missing something, and CA65 has a lot more going for it. I decided to try to get some high-level functionality with ca65 macros, and I think I am pretty much done.

I had some rough code going before, but it was a bit sloppy and I think what I have now is pretty good and very near, if not entirely complete. There may be some bugs, but not many. So a summary of the features:

All the macro code is doing is taking a simple expression representing 6502 flags, and evaluating which branch to use, as well as generating some labels. Valid expressions are:
C set
C clear
Z set
Z clear
N set
N clear
V set
V clear

You can also always negate the flag check with a no or not in front of these expressions. You can also add an additional set or clear at the end. The set will do nothing, but the clear will negate the 'set'. This makes more sense with .define macros:
 .define less              C clear 
 .define greaterORequal    C set   
 .define carry             C set     
 .define zero              Z set     
 .define equal             Z set     
 .define plus              N clear 
 .define positive          N clear 
 .define minus             N set    
 .define negative          N set   
 .define bit7              N set   
 .define overflow          V set     
 .define bit6              V set     

You can create any identifiers you like for different flags as long as they follow the format above. There is a special case for testing for greater or less or equal:

.define greater                G set
.define lessORequal            G clear

This is a fake flag that makes it easier to evaluate the condition.
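
The negation rules above can be modeled in a few lines of Python. This is only an illustration of how a condition reduces to a flag and a state, not how the macros are implemented. The function name and the dictionary are made up for the sketch, and it handles a single leading no/not.

```python
# A few of the .define names from above, as plain strings.
DEFINES = {
    "less": "C clear", "greaterORequal": "C set", "zero": "Z set",
    "negative": "N set", "bit7": "N set", "bit6": "V set",
    "greater": "G set", "lessORequal": "G clear",
}


def resolve(condition):
    """Reduce a condition such as 'not bit6 clear' to (flag, 'set'|'clear')."""
    tokens = condition.split()
    negate = False
    if tokens[0] in ("no", "not"):       # leading negation
        negate = True
        tokens = tokens[1:]
    if tokens[0] in DEFINES:             # expand a .define-style name
        tokens = DEFINES[tokens[0]].split() + tokens[1:]
    flag, state = tokens[0], tokens[1]
    want_set = (state == "set")
    for extra in tokens[2:]:             # a trailing 'set' does nothing,
        if extra == "clear":             # a trailing 'clear' flips the test
            want_set = not want_set
    if negate:
        want_set = not want_set
    return flag, "set" if want_set else "clear"


print(resolve("bit7 set"))    # prints ('N', 'set')
print(resolve("bit6 clear"))  # prints ('V', 'clear')
print(resolve("not zero"))    # prints ('Z', 'clear')
```

For the real flags C, Z, N and V, each result corresponds to one 6502 branch instruction. The fake G flag is a combination of real flags, so it is left unresolved here.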


IF-ELSE-ENDIF

IF-ENDIF blocks can be created and nested without any practical limitation:

lda $1234
cmp #$aa
if greater
 lda $4321
 sta $5678
 bit $2000
 if bit7 set
   if bit6 clear
    ldx #3
   else
    ldx #2
   endif
 endif
endif

You can add {} on the condition if you like: if {carry clear}

DO-WHILE ... REPEAT-UNTIL

You can start a block with do or repeat and end it with while <condition> or until <condition>. repeat is functionally identical to do; it just reads better sometimes. These blocks can also be nested as needed.

 do
  do
   lda (address),y
   sta PPU_DATA
   iny
  while not zero
  inc address+1
  dex
 while not zero   
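
To show what the nested loops above actually iterate over, here is a small Python model of the same control flow. It is a sketch under stated assumptions: the function name is made up, memory is a plain list, and the output list stands in for writes to PPU_DATA. It assumes X holds the number of 256-byte pages and is not zero on entry.

```python
def copy_pages(memory, address, pages, y=0):
    """Model of the nested do/while copy: returns the bytes written."""
    written = []
    x = pages
    while True:                          # outer do
        while True:                      # inner do
            written.append(memory[address + y])   # lda (address),y / sta PPU_DATA
            y = (y + 1) & 0xFF           # iny wraps from $FF to $00
            if y == 0:                   # while not zero
                break
        address += 0x100                 # inc address+1
        x = (x - 1) & 0xFF               # dex
        if x == 0:                       # while not zero
            break
    return written


memory = [i & 0xFF for i in range(0x400)]
print(len(copy_pages(memory, 0x0000, 2)))  # prints 512
```

Because both loops test their condition at the bottom, the body always runs at least once, and a nonzero starting Y only shortens the first page.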

WHILE-ENDWHILE ..

This is a while <condition> do. The do indicates this begins a block. Use an endwhile to close the block.

OTHER NOTES

There is some decent error checking, and I have tested it quite a bit and am about to try it in my own code. Please let me know of any problems if you use this.

Updated: Changed the macro code to use one set of global variable 'constants'. Before this change, a new set of counters would be defined every time that the code was inside a new scope, which sounds good, but I don't see much advantage, so the macro has been changed to all global counters. I actually like it better.

Please see updated code here: http://mynesdev.blogspot.ca/2012/10/ca65-hl-macros-updated-again.html

Saturday, September 1, 2012

6502 techniques I like/use


I like to use some of these ideas, or variations on them:

Flags

It's important to be able to set flags. If you want to be quick about it and avoid using any CPU registers you can use:
inc flag          ;set flag to 1
;later in code:
dec flag          ;set flag to 0
This is all great until you run into some logic where you aren't sure if the flag was set or not and you want to clear it. Instead, I use this:
sec
ror flag          ;set flag bit7 to 1
; later in code:
lsr flag          ;set flag bit7 to 0
In this case you achieve the same effect, but clearing the flag works every time, as long as you only ever use bit 7 to test the flag:
bit flag           ;test the flag
bmi label          ;branch if set
So I just implement some ca65 macros to make it easier to utilize:
.macro setflag flag
   sec
   ror flag
.endmacro

.macro clrflag flag
   lsr flag
.endmacro

.macro bfs label
   bmi label
.endmacro

.macro bfc label
   bpl label
.endmacro
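Another way to clear the flag is to do a lda #0 / sta flag. By the standard 6502 cycle counts it costs the same as lsr flag (5 cycles on zero page), so it is only faster if the accumulator already holds 0, and it takes up a bit more code space and overwrites the accumulator. More to come...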
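
The difference between the inc/dec approach and the ror/lsr approach is easy to see in a small Python model of an 8-bit flag byte. This is an illustration only, and the helper names are made up for the sketch.

```python
def inc(flag):
    return (flag + 1) & 0xFF         # inc flag


def dec(flag):
    return (flag - 1) & 0xFF         # dec flag


def setflag(flag):
    return (flag >> 1) | 0x80        # sec / ror flag: carry goes into bit 7


def clrflag(flag):
    return flag >> 1                 # lsr flag: 0 goes into bit 7


def is_set(flag):
    return bool(flag & 0x80)         # bit flag / bmi


# Clearing an already clear inc/dec flag wraps it to $FF, which reads as set.
print(hex(dec(0x00)))                        # prints 0xff

# Clearing a ror/lsr flag twice is harmless: bit 7 stays 0.
print(is_set(clrflag(clrflag(setflag(0)))))  # prints False
```

Note that the byte itself may be nonzero after a clear ($40 after one set and one clear), which is why the flag must only ever be tested through bit 7.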

NESDEV beginning


There are a lot of smart people in the online NESDEV world, and most of them are very helpful. Despite this, I've found it is a bit difficult to get started in the world of NES development. This is no one's fault; it's due to the fact that there are many different tools and ideas about what assemblers to use and how to code different things.

I decided to blog a bit about what I've found that works, in an attempt to share how great my development is going, as well as to help others.

The tools I've decided to use, and the strategies I've used to code are the best I have tried at the time, and make me happy enough to stop looking, but I'm always open to something new or better if the change is worth it and it can improve upon what I am currently using.


Where to start?


Assembler

So, to start, I'll say I haven't even considered C, though it is possible using cc65, a cross compiler for 6502-based systems. What I have settled on is ca65, the assembler that is part of the cc65 package. Originally, I really liked the idea of NESHLA, but despite some really cool ideas, it's not really suitable for large projects. The macro functionality of ca65 is powerful enough to make things somewhat high level anyway. This assembler is often considered difficult to use, but once you have a suitable nes.cfg file, it's not really any harder than any other, and it has many more advanced features waiting for you when you need them.


Editor

Notepad++ with NppExec is perfect for editing. I have colors set up the way I like, and NppExec allows quick execution of any commands I want, which is mainly make with a great generic makefile from here. (Note: there will be some changes needed, as mentioned in the post.) NppExec has a console capture window that can easily be set up so that double-clicking an error brings you to the offending line.





Debugging

Nintendulator is often considered one of the most accurate emulators, so it's good for testing code. As well, thefox has added some features to a customized version (see link for image) that allow for much easier debugging. These three things combined allow for quick NES development. Of course there are always other options and other opinions, but this is my suggestion at this time.

Other tools

For CHR (graphics) editing, I suggest NES Screen Tool (see link for image) as a great generic NES graphics tool. It's also good for learning about (and debugging) the screen addresses, as it shows the offsets of the nametable and attribute locations. You can also click the 1x grid to see pattern tile layout, the 2x grid to see which tiles must use the same colors, and the 4x grid to see the areas that use the same attribute table addresses. (It also shows which bits are applied to the 2x grid.)

That's it for a beginning, the intention of this blog is to track my progress from here, or anything I think is interesting and related to NESDEV.

But I'm a noob/newb and I don't quite...

If you don't quite feel like you are ready to start from this point, there are a few things I would suggest:

1. Learn binary and hexadecimal. You don't have to be good at it, just understand the basics and get better from there.
2. Learn the basic concepts of CPU and address and data bus. Along with this you should have some understanding of (all) the CPU registers, RAM/ROM and the PPU chip in the NES.
3. Learn about the 6502 instructions. As you'll no doubt find out, the NES CPU core is a MOS 6502 without decimal mode, so ignoring decimal mode, practically all 6502 (not 65c02) documentation applies to the NES CPU.
4. If you have the above more or less complete, the next step would be to start coding some simple code that does something and ideally synchronizes to the PPU framerate with the NMI signal.

That's it for the intro..