Talk:Fixed cycle delay

From NESdev Wiki
Jump to navigationJump to search

No requirements missing from most examples?

Shouldn't there be a "no requirements" entry for every example? Isn't that the natural maximum byte size / endpoint for each of these? - Rainwarrior (talk) 20:14, 16 March 2016 (MDT)

It is not possible to do 3-cycle delay without any requirements. The best you can get is JMP *+3 which requires relocations, but does not have any execution-time side-effects. Additionally, "no requirements" would always involve a long sequence of "NOP"s followed by a "JMP" if the cycle count is odd. To reduce the size of the page, I omitted these for larger delays, and additionally started considering "writes to stack" harmless from 64 cycles onwards, because in most applications it is. --Bisqwit (talk) 15:02, 17 March 2016 (MDT)

Page size problematic? Maybe the wiki sin't the best place for exhaustive permutations...

I'm noticing the wiki has some serious problems trying to diff some of the history on this page. I noticed this edit with the comment "Further tweak code to prefer repeated sequences, because reducing the page size helps fend off MediaWiki crashing". Given the explosive nature of permutations here, maybe it would be better to just implement this as a javascript tool on a webpage, and link it from here? (Would also be nice because you could just dial in constraints, etc.) - Rainwarrior (talk) 21:02, 16 March 2016 (MDT)

I'd recommend splitting them by 2-19, 20-100, 100-200, etc. --Tepples (talk) 09:40, 17 March 2016 (MDT)
I agree with principle on a Javascript tool, but there is the problem that finding these delay options is rather CPU-time intensive. The running time for my generator program is O(n^2) for the number of cycles to delay. This bodes badly with a tool that would run in the browser. --Bisqwit (talk) 15:18, 17 March 2016 (MDT)
All O(n^2) tells me is that if n gets large enough, execution times would eventually be unbearable, but n isn't unbounded here. Does generating a single case really take a significant amount of time for, say, n=200? (How long does the whole table take?) - Rainwarrior (talk) 16:53, 17 March 2016 (MDT)

Formalizing random writes

"it is difficult to formalize the rules under which one could write to such random addresses."

A write to a random address in a 256-byte page is fine if all addresses in the page are decoded to nothing, to a read-only memory, to a read-only port without side effects, or to memory that will be overwritten later. Most NES mappers decode $4100-$41FF to nothing. In addition, many games use $0200-$02FF, $0300-$03FF, or $0700-$07FF to hold a display list rebuilt from scratch every frame; if the delay occurs before the next rebuild of the display list, there is no conflict. --Tepples (talk) 09:43, 17 March 2016 (MDT)

I tried to think of a way to use "INC abs,X" in a partial-instruction context, but without any additional game-specific knowledge, there are only a few addresses where one can write safely. Here's a sample:

A9 FE    LDX #$FE ; hides 'INC $FDD0,X'
D0 FD    BNE *-1

This would read from $FDCE, read from FECE, and then write to FECE. Writing to $8000-$FFFF is unsafe in several mappers, so I tried a bit different approaches.

A9 FE    LDX #$3E ; hides 'ROL $3FE0,X'
E0 3F    CPX #$3F
D0 FB    BNE *-3

This would read from $3F1E (maps to $2000; write-only register, so reading is safe), and then read and write $401E (unmapped, so access is safe). This would be completely safe. Similar combinations were found for each RMW operation (limited by the value of X which doubles as the RMW opcode number). However, there is no guarantee that the code will not loop infinitely, because the branch instruction is not masked on the second iteration, unlike in the previous sample. If the second opcode was 1-byte long, then the first byte of the branch instruction would be masked, but the second byte would still be executed. The only byte that would work for the second byte of the branch instruction would be F8, which means the code would have to be at least 9 bytes long, which further complicates matters.

If branches and partial-instruction execution are not involved, then it bears down to how to set up X in such manner that a wrap is guaranteed, and whether the setup code + the RMW instruction are short enough for the number of cycles they consume.

I am yet to think of a way where RMW abs,X can be used. I was able to incorporate INC zp,X in a few situations where X is known to be zero, though. Of course if you can know a certain page is free for tampering, something could be devised. There might also be benefit if it is known some zp address contains a pointer to another address that can be read/written safely. The addresses I considered unsafe for reading were 4015-4017, 4020-5FFF, and (&7 in (2,7)) in 2000-3FFF. The addresses I considered unsafe for writing were everything except 4018-401F, and (&7==2) in 2000-3FFF. --Bisqwit (talk) 15:18, 17 March 2016 (MDT)