Star Fox: solving the throughput problem
Added 2021-04-13 17:01:02 +0000 UTC
I recently published an article explaining the throughput problem: the resolution and frame rate you can achieve are directly tied to how many operations per second the SNES CPU can perform and how many pixels you must draw.
But what about Star Fox? How many frames per second can it reach given its current configuration?
Currently, the game runs at about 9~12 FPS (on average, it requires 7 frames to update its frame buffer). The frame buffer is 224x192 pixels, and under the current configuration the game takes three frames to upload it to VRAM. At the SNES's roughly 60 frames per second, this means the game can't output more than 20 FPS right now.
So, keeping the resolution and maximum frame rate in mind, how many cycles do we have available per pixel?
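The 20 FPS ceiling follows from dividing the display refresh rate by the number of frames each upload takes. A minimal sketch, assuming a nominal 60 Hz NTSC refresh rate; the function name `max_fps` is made up for illustration:

```python
# Assumption: nominal 60 Hz NTSC refresh (the real rate is slightly higher).
NTSC_REFRESH_HZ = 60


def max_fps(frames_per_update: int) -> float:
    """Upper bound on game FPS when one frame buffer upload
    spans `frames_per_update` display frames."""
    return NTSC_REFRESH_HZ / frames_per_update


# Three display frames per upload, as in the current configuration.
print(max_fps(3))
```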
Doing the math:
- Game resolution => 224x192 = 43,008 pixels
- Super FX clock => 21.48 MHz = 21,477,272 cycles per second
- Cycles per frame => 1364 master cycles x 262 scanlines - 4 master cycles (PPU NTSC correction) = 357,364 cycles per frame.
- Time available at 20 FPS => 3 [frames] * 357,364 [cycles per frame] = 1,072,092 cycles
- Cycles per pixel at 20 FPS => 1,072,092 [cycles] / 43,008 [pixels] ~= 24.928 cycles per pixel
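The arithmetic above can be reproduced with a short script. This is only a sketch of the same calculation; the constant names are my own and the values are the ones listed above:

```python
# Values taken from the list above; names are illustrative.
WIDTH, HEIGHT = 224, 192
MASTER_CYCLES_PER_SCANLINE = 1364
SCANLINES_PER_FRAME = 262
NTSC_CORRECTION_CYCLES = 4
FRAMES_PER_UPDATE = 3  # 20 FPS target

pixels = WIDTH * HEIGHT
cycles_per_frame = (
    MASTER_CYCLES_PER_SCANLINE * SCANLINES_PER_FRAME - NTSC_CORRECTION_CYCLES
)
budget = FRAMES_PER_UPDATE * cycles_per_frame
cycles_per_pixel = budget / pixels

print(f"pixels           = {pixels:,}")
print(f"cycles per frame = {cycles_per_frame:,}")
print(f"budget at 20 FPS = {budget:,}")
print(f"cycles per pixel = {cycles_per_pixel:.3f}")
```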
Unlike the SA-1, the Super FX can't draw pixels while the frame buffer is being transferred to VRAM, because it lacks the Super MMC controller that lets both CPUs access the same resource (RAM, ROM, SRAM, etc.) freely. So we have to take that into account:
- 2 pixels can be stored in a single byte (4BPP frame buffer)
- Each byte takes 8 master cycles to be transferred (2.68 MHz)
- Total Super FX cycles spent waiting for each completed frame buffer => 43,008 / 2 * 8 = 172,032 cycles
- Time available for drawing => 1,072,092 [cycles] - 172,032 [cycles] = 900,060 [cycles]
- Cycles per pixel => 900,060 [cycles] / 43,008 [pixels] ~= 20.928 cycles per pixel
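The transfer penalty can be checked the same way. A sketch using the figures from the two lists above; the constant names are illustrative:

```python
# Figures from the text; names are illustrative.
PIXELS = 224 * 192                 # 43,008 pixels
BUDGET_AT_20_FPS = 3 * 357_364     # 1,072,092 cycles across three frames
PIXELS_PER_BYTE = 2                # 4BPP frame buffer
MASTER_CYCLES_PER_BYTE = 8         # transfer cost per byte

frame_buffer_bytes = PIXELS // PIXELS_PER_BYTE
transfer_cycles = frame_buffer_bytes * MASTER_CYCLES_PER_BYTE
drawing_cycles = BUDGET_AT_20_FPS - transfer_cycles
cycles_per_pixel = drawing_cycles / PIXELS

print(f"transfer cycles  = {transfer_cycles:,}")
print(f"drawing cycles   = {drawing_cycles:,}")
print(f"cycles per pixel = {cycles_per_pixel:.3f}")
```

Since the transfer costs 8 cycles per byte and each byte holds 2 pixels, it works out to exactly 4 cycles per pixel, which is the gap between 24.928 and 20.928.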
So we have roughly 21 cycles per pixel, which makes maintaining 20 FPS pretty challenging. At the Super FX 1's 10.74 MHz clock, this drops to about 10 cycles per pixel. No wonder the game always ran so slowly; the devs pulled off a miracle using the Super FX's pipeline processing. Keep in mind that the chip doesn't only do the drawing: it also takes care of the space projection, game logic, and collision detection. That's a lot of tasks to get done in so little time.
If you are curious, outputting Star Fox at 30 FPS (two frames per frame buffer update) would leave us only roughly 12.618 cycles per pixel for drawing everything.
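To compare targets, the whole calculation can be wrapped in one function. This is a rough sketch under the same assumptions as the figures above; `cycles_per_pixel` and its `clock_scale` parameter are hypothetical names, and halving the budget for the 10.74 MHz Super FX 1 is my reading of the estimate above:

```python
# Figures from the text; function and parameter names are hypothetical.
PIXELS = 224 * 192                       # 43,008 pixels
CYCLES_PER_FRAME = 1364 * 262 - 4        # 357,364 cycles at 21.48 MHz
TRANSFER_CYCLES = PIXELS // 2 * 8        # 172,032 cycles per frame buffer


def cycles_per_pixel(frames_per_update: int, clock_scale: float = 1.0) -> float:
    """Drawing budget per pixel for a given number of display frames per update.

    clock_scale=0.5 models a chip running at half the 21.48 MHz clock.
    """
    drawing = frames_per_update * CYCLES_PER_FRAME - TRANSFER_CYCLES
    return drawing * clock_scale / PIXELS


print(f"20 FPS, 21.48 MHz: {cycles_per_pixel(3):.3f}")
print(f"30 FPS, 21.48 MHz: {cycles_per_pixel(2):.3f}")
print(f"20 FPS, 10.74 MHz: {cycles_per_pixel(3, clock_scale=0.5):.3f}")
```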
Initially, the Star Fox Super FX 2 upgrade will target 20 FPS, so we can have around 21 cycles per pixel for drawing. If we get the game optimized to the point where each pixel takes around 12 cycles, it might be possible to make it run at 30 FPS. It's an almost impossible value, but who knows, maybe with some Super FX overclocking (done via a cartridge mod) we can get this fast?
My custom Super FX cartridge, capable of programming custom ROMs, is expected to arrive around mid-May. With it, I will be able to actively test and experiment with possible optimizations for my future Super FX work.
What are your expectations for the Star Fox + Super FX 2 + Delta Based Correction patch?
Comments
I'll write a post about it, thanks for the idea! Answering your question directly, I would take a look at Ersanio's ASM tutorial here: https://ersanio.gitbook.io/assembly-for-the-snes/ for learning general 65c816 assembly, then I would take a look at SNES Dev Book I and II for figuring out its architecture - https://floating.muncher.se/bot/manual/book1_text.pdf & https://floating.muncher.se/bot/manual/book2_text.pdf plus anomie & no$cash documentation for more technical details about the SNES and reference docs for the registers: https://floating.muncher.se/bot/regs.txt & https://problemkaputt.de/fullsnes.htm
Vitor
2021-04-25 19:31:16 +0000 UTC
If a usual software developer (C#, Java, PHP, etc) would like to learn more about snes hacking and is interested in contributing to the project, what would be a good path? I think I might start by learning assembly, I guess...
Zoio Silva
2021-04-24 14:41:42 +0000 UTC
A solid 20 fps would be a huge improvement already.
dogen
2021-04-13 21:19:28 +0000 UTC