• Hurdles to making a multitasking environment on the NES

    I’ve been thinking about what would be required to make a multitasking environment/platform on the NES, one that:


    • Can load applications on-demand as independent processes
    • Can launch multiple instances of each application
    • Uses cooperative multitasking

    Realistically you’ll want the cartridge to have some RAM and allow bank switching for both RAM and ROM in order to increase the memory & storage available to programs.

    • The 6502 has a single stack that’s fixed to live from $0100 to $01FF. This range is mapped to console RAM and can’t be bank-switched. Each process wants its own stack, so processes will either have to share this very limited space, or you’ll have to swap the contents of the stack when switching tasks.
      • Compare this to x86, where you can point SS at any segment and SP at any location within that segment.
    • Similarly, process memory stored in system RAM will need to be swapped out on task switch. Memory located above $4020 could be bank-switched instead.
    • Programs will want to reach into other banks, either to jump to code in a bank that isn’t currently mapped in or simply to read data from one.
      • We can add code to perform this work and store it in a fixed bank that’s always available, and make the compiler use that instead of a plain JSR.
      • Pointers will need to include bank information as well.
      • Compare this to x86, where you can jump to a different segment directly without losing access to the caller’s segment.
    • Graphics/PPU state also needs to be associated with each process.
      • It’s probably easiest to give the active process the full screen, instead of allowing background processes to share the display (eg. overlapping/tiled windows). A background application’s CHR ROM (or other video data) probably has to be swapped out to make room for the active process’s, so the background application couldn’t be drawn properly, and there are challenges around sharing the current palette.
        • It might be possible to switch banks/contexts between scanlines. That would require windows to span the full screen width, but it would let them stack vertically.
      • A system menu UI could be handled using code & data in reserved always-available banks, like the code we use to handle moves and jumps across banks.
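
    To make the stack-swapping option concrete, here is a minimal sketch in C that simulates it on a host machine rather than running on a 6502. The names (Cpu, Task, switch_task) are made up for illustration, and it assumes each process’s saved stack lives somewhere that survives a task switch, such as cartridge RAM. It copies only the live part of the stack page, since the bytes at or below the stack pointer are unused.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    enum { STACK_PAGE_SIZE = 256 };

    /* Hypothetical stand-in for the console: the fixed stack page and the S register. */
    typedef struct {
        uint8_t stack_page[STACK_PAGE_SIZE]; /* models $0100-$01FF */
        uint8_t sp;                          /* next free slot; 0xFF when empty */
    } Cpu;

    /* Hypothetical per-process copy of the stack, kept outside the stack page. */
    typedef struct {
        uint8_t saved_stack[STACK_PAGE_SIZE];
        uint8_t saved_sp;
    } Task;

    void task_init(Task *t) {
        memset(t->saved_stack, 0, sizeof t->saved_stack);
        t->saved_sp = 0xFF; /* a new task starts with an empty stack */
    }

    /* Same order as the 6502: store at S, then decrement S. */
    void cpu_push(Cpu *cpu, uint8_t value) {
        cpu->stack_page[cpu->sp] = value;
        cpu->sp--;
    }

    /* Same order as the 6502: increment S, then load from S. */
    uint8_t cpu_pull(Cpu *cpu) {
        cpu->sp++;
        return cpu->stack_page[cpu->sp];
    }

    /* Copy only the live bytes, which sit strictly above sp. */
    static void copy_live(uint8_t *dst, const uint8_t *src, uint8_t sp) {
        size_t first = (size_t)sp + 1;
        memcpy(dst + first, src + first, STACK_PAGE_SIZE - first);
    }

    /* Save the running task's stack, then load the next task's stack. */
    void switch_task(Cpu *cpu, Task *from, Task *to) {
        assert(from != to);
        copy_live(from->saved_stack, cpu->stack_page, cpu->sp);
        from->saved_sp = cpu->sp;
        copy_live(cpu->stack_page, to->saved_stack, to->saved_sp);
        cpu->sp = to->saved_sp;
    }
    ```

    The cost of a switch grows with how deep both stacks are, which is one argument for keeping call depth shallow in processes on a system like this.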

  • Using setjmp/longjmp from JNA

    TLDR: I didn’t think it would work, and it didn’t.

    Today I had a goal to call an established C library from Java, but it uses setjmp and longjmp for error reporting.

    I had been hoping/planning to use Java Native Access to interact with the library. This is just a simple hobby project, so I want to keep it as simple as I realistically can. That means I don’t want to add a C build step to my project at all, not to mention having the build target multiple OS platforms and CPU architectures.

    But I didn’t really expect setjmp and longjmp to work in Java. I have no idea what the JVM does with the execution environment and I expected longjmp would interfere with it in a way that would very probably corrupt the JVM’s state.

    I tried it anyway. It didn’t work. The program crashed with SIGABRT after longjmp (running on Linux).

    I encountered some things I found a little more interesting than just “it doesn’t work”, though:

    jmp_buf’s size isn’t predictable

    setjmp requires that you allocate a jmp_buf to store the environment in.

    jmp_buf is defined in the system setjmp.h. On my 64-bit Linux system, sizeof(jmp_buf) == 200, and it’s defined as a 1-element array containing a struct, so it can be allocated easily then passed by reference.

    I dug into setjmp.h first to understand it more, and realized the size of jmp_buf isn’t really predictable:

    1. It varies by architecture even with the same C library, and
    2. The standard only requires it to be an array type; it doesn’t say what the array contains, so it doesn’t have to wrap a struct. It could just hold a handle or whatever.
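
    Since sizeof only exists at C compile time, the one way I know to get a trustworthy number is to ask the C side. Here is a minimal sketch of what that could look like. Both adapter_jmp_buf_size and ADAPTER_JMP_BUF_MAX are hypothetical names, and the 512-byte bound is an arbitrary guess, which is why the build is made to fail if a platform exceeds it.

    ```c
    #include <assert.h>
    #include <setjmp.h>
    #include <stddef.h>

    /* Hypothetical upper bound a non-C caller could allocate without knowing the real size. */
    #define ADAPTER_JMP_BUF_MAX 512

    /* Fail the C build, rather than overrun a buffer at run time, if a platform needs more. */
    static_assert(sizeof(jmp_buf) <= ADAPTER_JMP_BUF_MAX,
                  "jmp_buf is larger than the bound assumed by the caller");

    /* Hypothetical exported helper: reports the real size on this platform and C library. */
    size_t adapter_jmp_buf_size(void) {
        return sizeof(jmp_buf);
    }
    ```

    This needs a C11 compiler for static_assert, and of course it reintroduces the C build step I was trying to avoid.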

    setjmp could be a macro

    The standard doesn’t specify whether setjmp is a function or macro. JNA can only call functions, since macros are inlined by the compiler at build time.

    (I didn’t check how it’s implemented in other C libraries, like MSVCRT on Windows or libSystem on macOS.)

    Not exactly related, but I also happened to call fflush(stdout) from Java. It turns out that stdout is actually specified in C89/C99 to be a macro. In glibc, it’s also exported as extern FILE *stdout so I was able to use that, but code that relies on that symbol doesn’t conform to the standard.

    I guess I’m gonna have to write a C adapter library that’s more Java-friendly.
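
    The shape I have in mind for that adapter is to keep both setjmp and longjmp on the C side, so the jump never crosses a JVM frame, and hand Java a plain status code instead. This is only a sketch: fakelib_halve stands in for the real library (whose API I’m not reproducing here), and adapter_halve and its return values are made-up names.

    ```c
    #include <assert.h>
    #include <setjmp.h>
    #include <stddef.h>

    /* Hypothetical stand-in for a library call that reports errors with longjmp. */
    static int fakelib_halve(jmp_buf *on_error, int value) {
        if (value % 2 != 0) {
            longjmp(*on_error, 1); /* does not return */
        }
        return value / 2;
    }

    /*
     * Hypothetical adapter entry point, meant to be the only thing Java calls.
     * Returns 0 and stores the answer in *result on success, or -1 if the
     * library bailed out with longjmp.
     */
    int adapter_halve(int value, int *result) {
        jmp_buf env;

        assert(result != NULL);
        if (setjmp(env) != 0) {
            /* We arrive here a second time when the library calls longjmp. */
            return -1;
        }
        *result = fakelib_halve(&env, value);
        return 0;
    }
    ```

    From Java’s point of view this is an ordinary function that returns an int, which is the kind of thing JNA handles well.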

  • Smooth scrolling inside caves on Dragon Warrior for the NES

    Lately I’ve been wondering if it’s possible to achieve smooth scrolling in Dragon Warrior on the NES.

    This post describes the problem and a possible solution. I haven’t tried this solution, but based on my understanding of the NES’s PPU (picture processing unit), I think it could work. At least, maybe it’ll inspire someone to try it or to propose different techniques.


    In Dragon Warrior, scrolling is smooth in totally-exposed areas, which means all areas outside of caves. This is achieved using a method typical of the NES: updating the PPU’s scroll registers and updating the tiles just outside the visible screen area while scrolling. I’m not going to explain PPU mirroring and scrolling here; there are good explanations elsewhere, like Retro Game Mechanics Explained’s great video on the topic.

    Dragon Warrior uses vertical mirroring, which means that video memory holds a background two whole screens wide, so tiles can be updated off-screen while moving horizontally. During a vertical scroll, however, the tiles being updated at the top and bottom of the screen are visible. This is hidden by most TVs (and emulators!) because those rows fall outside the physically visible screen area, due to overscan.

    In caves, Dragon Warrior doesn’t scroll smoothly at all. This is caused by two conflicting factors:

    • To hide the tiles being updated as they change or become visible/invisible, they need to be on the screen edges.
    • In caves, you can only see the area immediately surrounding you, which varies from a 1x1-tile area (no lighting) to a 7x7-tile area (RADIANT spell). This doesn’t reach the edges of the screen.

    As a result, stepping by one “Dragon Warrior” tile in a cave only has two frames of animation: one stepping halfway (one 8x8 character) and then one to step the next half.

    Aside: While viewing the name table in FCEUX I observed that the X scroll position alternates between 32 ⯈ 0 ⯈ 32 during a cave step:

    1. X scroll = 32 (at idle)
    2. Update all the tiles for the half-step in the off-screen nametable at X = 0
    3. Set X scroll to 0
    4. Update all the tiles for the full step in the off-screen nametable at X = 32
    5. Set X scroll to 32

    The problem

    This feels sluggish and disorienting, especially when you’re only using a torch or have no light at all. In fact, the brick floor is only an 8x8 repeating pattern, and since a half-tile step moves by 8 pixels at a time, you can’t tell if you’re moving at all. Pressing in a direction isn’t a reliable way to move because move inputs only seem to register on certain frames, so depending on when you press a direction and for how long, you might turn & move, or just turn.

    Clip of cave movement with no light.

    Most players would be using a torch or the RADIANT spell, so they can orient themselves using the walls visible around them. But speedrunners, especially those playing Dragon Warrior randomizer, are more likely to lack or avoid using torches or RADIANT, and they’re familiar enough with the cave layouts that they don’t need to see where they are in them. However, they still need to know whether or not their inputs registered as movements.

    In a situation like this, players typically bonk against walls to orient themselves, which plays a sound effect. But sometimes you need to make a turn before reaching a wall, so the problem isn’t solved for all cases.

    Proposed solution

    I think it’s possible to use the NES’s smooth scrolling in caves, while still keeping a limited visible area.

    The process is essentially the same as full-screen smooth scrolling, but instead of updating characters that are just off the edges of the screen, we’ll update them right outside of the player’s lit area, and try to hide them so that the result looks clean.

    This is a little different for horizontal and vertical scrolling because of the PPU’s limitations.

    Horizontal scrolling

    This process uses black sprites on the left and right of the lit area to cover our tiles while they’re updated. This is a pretty common technique for full-screen horizontal scrolling, where the left edge of the screen is masked using a PPU feature that does precisely that; and the right edge is covered by sprites.

    When the player moves left or right:

    1. Place a column of black sprites just outside the left and right of the lit area. These will cover the characters as they’re being updated.
    2. Place the new set of characters on the side that’s coming into view – eg. on the right when the player moves right. They’ll be obscured by the column of black sprites on that side.
    3. Update the X scroll as usual during a horizontal scroll. The outgoing side will also be covered by a column of black sprites that we added in step 1.
    4. Once the scroll completes, remove the outgoing characters and the sprites.
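
    Step 1 is mostly arithmetic, so here is a small C sketch of it. It runs on a host machine and only computes where the mask sprites would go; it doesn’t touch OAM. The names (MaskSprite, place_mask_sprites) and the idea of describing the lit area by its top-left pixel and its size in pixels are my own assumptions, as is using 8x8 sprites.

    ```c
    #include <assert.h>
    #include <stddef.h>

    enum { CHAR_PX = 8 }; /* one character is 8x8 pixels */

    /* Hypothetical record for one mask sprite's on-screen position. */
    typedef struct {
        int x;
        int y;
    } MaskSprite;

    /*
     * Fills out[] with one column of 8x8 mask sprites immediately left of the
     * lit area and one immediately right of it. Returns how many were written,
     * never more than capacity.
     */
    size_t place_mask_sprites(int lit_left, int lit_top, int lit_size_px,
                              MaskSprite *out, size_t capacity) {
        assert(out != NULL);
        assert(lit_size_px % CHAR_PX == 0);

        int column_x[2] = { lit_left - CHAR_PX, lit_left + lit_size_px };
        size_t count = 0;

        for (int side = 0; side < 2; side++) {
            for (int y = lit_top; y < lit_top + lit_size_px; y += CHAR_PX) {
                if (count == capacity) {
                    return count;
                }
                out[count].x = column_x[side];
                out[count].y = y;
                count++;
            }
        }
        return count;
    }
    ```

    For the largest lit area (7x7 tiles, so 112 pixels tall) this comes to 14 sprites per column and 28 in total, with only two mask sprites on any one scanline.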

    Since the hero is typically the only sprite visible while in caves, sprite limitations are unlikely to be a problem. The only other sprite I can think of is the princess, who is probably also a sprite in the swamp cave. The Dragonlord at the bottom of Charlock is also a sprite, but that area isn’t dark; it’s a full-screen fully-visible area.

    Vertical scrolling

    This is the same process as horizontal scrolling, but we can’t use sprites to obscure the characters being updated, because there’s a limit of 8 sprites per scanline and a horizontal row of masking sprites across the widest lit area would put more than 8 on the same scanlines. However, it’s possible to update the X scroll position between scanlines, which games often use to achieve a status bar that stays in position instead of scrolling with the rest of the screen. We’re going to use that to display an empty part of the nametable for the areas where we’re updating the characters during a scroll.

    First, we need to ensure that the nametable is empty (all black characters) at Xscroll + 32, so that when we swap over to it, we won’t show anything. This only needs to be done when entering the cave, because we will never have any reason to draw anything in that space while doing other cave stuff.

    Then, during a vertical move, we need to do this on every frame:

    1. Increment X scroll by 32 so that we’re showing the empty area.
    2. Update the Y scroll to keep a smooth scroll.
    3. After drawing the last empty scanline, increment X scroll by 32 again so it wraps around and starts showing our tiles.
    4. Allow it to draw the entire lit area.
    5. After the last lit scanline is drawn, increment X scroll by 32 again so that we show the empty area again.
    6. At the end of the frame, increment X scroll by 32 again to return it back to the visible area. This is where we want to stay while idle, and during a movement we can repeat this process. As an optimization, we can skip this step and step 1 if the next frame is a continuation of the vertical scroll, because they cancel each other out.
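
    The per-frame steps above can be sketched as a function that answers “what X scroll is in effect on this scanline?”. This is a host-side model in C of the values only, not of the PPU register writes or their timing. The name x_scroll_at and its parameters are hypothetical, and it assumes X scroll is counted in characters across a 64-character-wide pair of nametables, so adding 32 twice wraps back to where it started.

    ```c
    #include <assert.h>

    enum {
        NAMETABLE_PAIR_CHARS = 64, /* two 32-character nametables side by side */
        BLANK_OFFSET_CHARS = 32    /* distance from the lit nametable to the empty one */
    };

    /* Add 32 characters, wrapping around the pair of nametables. */
    static int flip_nametable(int x_scroll) {
        return (x_scroll + BLANK_OFFSET_CHARS) % NAMETABLE_PAIR_CHARS;
    }

    /*
     * Returns the X scroll in effect while the given scanline is drawn.
     * Scanlines in [lit_top, lit_bottom) show the lit area; all other
     * scanlines show the empty nametable.
     */
    int x_scroll_at(int idle_x_scroll, int scanline, int lit_top, int lit_bottom) {
        assert(lit_top <= lit_bottom);

        int x = flip_nametable(idle_x_scroll); /* step 1: start the frame on the empty area */
        if (scanline >= lit_top) {
            x = flip_nametable(x);             /* step 3: wrap around to our tiles */
        }
        if (scanline >= lit_bottom) {
            x = flip_nametable(x);             /* step 5: back to the empty area */
        }
        return x;                              /* step 6 happens after the last scanline */
    }
    ```

    With an idle X scroll of 32 and a lit area on scanlines 64 to 175, this gives 0 above the lit area, 32 inside it, and 0 again below it.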

    As with horizontal scrolling, we’ll need to update the incoming and outgoing tiles during the scroll, but this is also as usual during a vertical scroll; we’re just doing it near the lit area rather than at the screen edges.

    Split scrolling can cause some artifacting/jitter as the scroll position might not get updated before the next scanline starts to be drawn, but since we have just empty black space at the start of each scanline, I don’t think the artifacting will even be visible.

    Possible issues and limitations

    Visible updates during scroll

    Although I think this process will work, I haven’t actually tested it. My primary concern is that we don’t have enough time to update the incoming characters before we need to show them. I think it would be unacceptable to delay a tile movement to update those characters before the animation, because this would make continuous movement (eg. holding right for 2+ tile movements) jittery as it would need to pause on each tile.

    I’m also not sure if we would leak palette changes into the lit area during scroll.

    UI interference

    It’s possible that this solution would interfere with the UI and battle view, but I don’t think it will because none of those can happen during a scroll, and when the hero’s position is aligned with the tile we don’t need to do any work to scroll at all.

    Mitigating these issues

    In existing full-screen scroll implementations, games ensure that the characters immediately off-screen are ready for the next scroll. If we did that, we’d need to maintain the X scroll split and the “hiding” sprites at all times. I think this would solve the problem with having enough time to update characters before the scroll, because we can update them during the scroll instead – just like games already do for full-screen scrolling.

    This would definitely cause interference with the UI. To mitigate that, it might be acceptable to clear all of our scroll machinery (our hiding sprites and pre-drawn characters) when we need to show UI, and restore it when the UI disappears. This would probably add some delay to UI appearance and removal, but I think that would be acceptable as the UI isn’t particularly quick to begin with.

    Terms used in this post

    • PPU: The NES’s picture processing unit that manages video memory, the nametable, scrolling, and everything else connected to the display.
    • Player: The player of the game.
    • Hero: The hero sprite that the player controls.
    • Character: A PPU character, which is a single 8x8-pixel entry in the PPU’s nametable.
    • Tile: A “Dragon Warrior” tile, a 16x16-pixel background tile composed of four 8x8-pixel nametable characters arranged in a 2x2 fashion. This is a common arrangement on the NES because even though characters are 8x8, color palettes apply only to a 16x16-pixel space.
    • Lit area: The visible tiles in the cave. The area varies from 1x1 to 7x7 depending on whether the player is using the RADIANT spell (which gradually decays from 7x7 to 1x1), a torch (3x3 decaying to 1x1), or neither (1x1).
