Cave Story, Counterpoint, Cross Relations, and Bitonality

Two articles I wrote in 2012 using tracks from the Cave Story OST to explain some music theory and harmony concepts–republished here.

Subject of today’s post (first post!)…the Cave Story OST! If you aren’t familiar with it you can download it for free here or listen on YouTube…it would be good to marinate in some of this before we get into the details 🙂

Some background first:

Cave Story is a miracle of a game. It’s a free 2D platforming shooter for the PC, developed entirely by one guy, “Pixel,” in his free time. It takes after the style of some of the classic games of its genre: Mega Man, Metroid, Castlevania, but the actual experience of the game is unique. The gameplay, story, and music are all top-notch, and it’s probably one of my favorite single-player games of all time. The only tragedy is that the game is short–although considering that one person did everything it’s quite an accomplishment.

A game developed by one guy, in his free time–and he did EVERYTHING. Yes, that includes the music. This game had some of the best music ever written for a video game. But before we get to Cave Story specifically, let’s first talk about video game music in general:

Video game music has always been sort of looked down upon, in the same way film music/incidental music in general has historically been considered inferior to “absolute” music (“Symphony in G Minor”). In this case video game music is low art in two ways–the fact that the music is incidental, and more importantly, that the video game medium itself is still largely considered unworthy of any real critical consideration. A few games (the Legend of Zelda in particular) have entered a space where I think the level of respect is higher–Nintendo recently commissioned an original symphony, The Legend of Zelda: Symphony of the Goddesses, which has been going on a hugely popular tour at big concert venues all around the US and Japan. The Play! symphony series has also attained a high level of popularity.

But something I think a lot of people fail to appreciate is that there is a unique challenge to writing video game music–something that doesn’t exist in other mediums.

Video game music repeats.

The person playing the game is in charge of what happens in the game, so as a composer you have to assume that the music you write will loop for any given amount of time. It means your music has to be catchy, interesting, and somehow able to be repeated without sounding repetitive. And so as far as I’m concerned, the Mario theme is one of the greatest musical achievements of our time. There’s a cover of it on pretty much every instrument on YouTube and it doesn’t seem to get old, despite it being only a few minutes of music. And it’s not like techno or some other electro-dance music genres, where musical repetition is a big defining feature of the music. In a lot of game music, there is melodic and harmonic progression, the kind you get with a Radiohead song or a Chopin Nocturne. But regardless of how you feel about those pieces of music, most people would not want to listen to music like that looping for an infinite amount of time. I think it is really worth respecting good game music, at the very least, for having that magical attribute that allows it to be played back over and over without fear of annoyance.

Now onto Cave Story:

I mentioned earlier that game music, despite its repetitive nature, usually has interesting melodic and harmonic progressions as well. The Cave Story soundtrack is an amazing example of this–the tracks are typically between a minute and 2 minutes, but the amount of musical interest in those is astounding. Let’s talk about COUNTERPOINT!


I guess I should apologize for talking so much about getting to Cave Story and then when I finally start talking about it I go off on random music theory stuff…but it’s important! And not at all random!

Counterpoint, in short, is the process of using multiple melodic lines to create your musical structure. It occurs when you have two melodic lines (meaning, it’s enough of a tune that you could enjoy humming it) combining together to create harmony (as opposed to having a harmonic section whose sole purpose is to set a backdrop for the melody). When you sing a round (like in “Row Row Row Your Boat”) that’s a very simple example of counterpoint at work.
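If it helps to see the idea of a round in numbers, here is a minimal Python sketch. It assumes a made-up eight-note melody (written as MIDI note numbers, not taken from any real song) and a hypothetical helper, `round_intervals`, that lines the melody up against a delayed copy of itself and measures the distance between the two voices at each step.

```python
# Hypothetical melody as MIDI note numbers (60 = middle C).
# Invented for illustration; it is not from any real song.
MELODY = [60, 62, 64, 65, 67, 65, 64, 62]


def round_intervals(melody, delay):
    """Pair a melody with a copy of itself that enters `delay` steps later.

    Returns the distance in semitones (reduced to within one octave)
    between the two voices at every step where both are sounding.
    """
    intervals = []
    for step in range(delay, len(melody)):
        leader = melody[step]            # first voice, further along
        follower = melody[step - delay]  # second voice, `delay` steps behind
        intervals.append(abs(leader - follower) % 12)
    return intervals


print(round_intervals(MELODY, delay=2))  # [4, 3, 3, 0, 3, 3]
```

With a delay of two steps, the voices sit mostly 3 or 4 semitones apart (minor and major thirds), which is why one tune can act as the harmony for its own delayed copy.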

An example I really like is from James Blake’s “CMYK”: it’s easy to hear but the effect is awesome. He introduces the first melody at 0:14 (sampled from Kelis – Caught Out There: 1:17) and after that one is good and stuck in your head, he introduces the second melody at 2:04 (sampled from Aaliyah – Are You That Somebody: 0:48). Near the end of the song at 2:58, he plays them both at the same time, and despite both being singable melodies they act as harmonic backings for each other. They are both melody and harmony at the same time. THIS IS AWESOME. YOU SHOULD BE EXCITED ABOUT THIS.

So one reason I really like Cave Story’s soundtrack is because it has great counterpoint. Let’s take “Safety,” the music that plays when you’re in the Teleporter room in Mimiga Village (Arthur’s House).

The music begins with your usual melody-harmony relationship–a clear melody over a lower repeating harmonic pattern (as opposed to a melody in counterpoint, the repeating harmonic pattern isn’t much of a “tune”). Let’s go through this in detail:

0:06–Melody and harmonic pattern enter. The higher, tinny-sounding one is the melody, lower “rounder” sounding one is the harmony (this should be easy to hear)

0:22–A higher-pitched decorative pattern enters. Again, this isn’t really counterpoint since the pattern itself is not particularly melodic, but it serves to decorate the music, since this is the 2nd time you’re hearing the melody

0:43–A new section begins, with one melody by the lower “rounder” sounding instrument (and a bass, but that’s wayyy in the background and just for support)

0:52–A new melody joins, played by the tinny-sounding instrument! It enters through imitation, which is just where it imitates (this is not hard) the previous melody and we hear these two melodies play out together. Using the interaction of the two melodies the music builds up to the “high point,” at 1:06 (literally the highest note in the song), and then the music starts over.

Spend some time listening to the interaction of the two melodies! Try to separate them in your mind as individual lines (imagine 2 people, each singing their own part). It really adds a lot of interest to the music, doesn’t it?
Note also, that the counterpoint is used near the end of the song as a sort of unifying, final gesture to build up and to end the piece, very much in the same way that James Blake’s “CMYK” does.


Now that you’ve gotten familiarized with counterpoint, let’s talk about specific things you can do in counterpoint. Things are gonna get a little bit more technical.

Cross-relations are something that Pixel uses a lot in the Cave Story OST. The basic idea is this–when you write counterpoint you typically want your melodies to harmonize well together, you want them to “get along.” But interesting music rarely has things get along all the time; tension and interest are often built from creating clashes and dissonance. But an important thing to understand about harmonic dissonance is that usually, the dissonance that occurs is still within an established harmonic context (G Major, for example). In an attempt to explain this without getting too technical, let’s think about “harmonic context” (“G Major”) in terms of colors. If every note (in the 12 note scale) represented a color, G Major represents a subset of those colors that happen to work well together. So let’s just say, for example, that a G Major painting consists of mostly blues, greens, and some purple. Your typical dissonance events would still happen WITHIN those colors. So maybe most of your painting consists of blues with pockets of green and purple, but in one section there is a green and purple splotch that stands out. That would be a typical dissonant event.
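For anyone who prefers code to paint, the “subset of the 12 notes” idea can be sketched in a few lines of Python. This is only an illustration: the names `NOTE_NAMES`, `MAJOR_STEPS`, and `major_scale` are my own, and everything is spelled with sharps for simplicity.

```python
# The 12-note palette, spelled with sharps only for simplicity.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Whole-step (2) / half-step (1) pattern that defines any major scale.
MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]


def major_scale(tonic):
    """Return the seven note names of the major scale starting on `tonic`."""
    position = NOTE_NAMES.index(tonic)
    scale = []
    for step in MAJOR_STEPS:
        scale.append(NOTE_NAMES[position])
        position = (position + step) % 12  # wrap around the 12-note palette
    return scale


print(major_scale("G"))  # ['G', 'A', 'B', 'C', 'D', 'E', 'F#']
```

Seven of the twelve notes make it into the G Major “painting”; the other five are the colors that would stick out.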

Cross-relations are different. Cross-relations involve, quite literally, invasions by notes that are not part of the main subset of notes at all, and the invasion must occur in a contrapuntal situation. If we think back to the color example, imagine your bluish painting. A cross-relation would be like slapping an orange dot onto that painting–a small, probably tiny, orange dot, but no matter how small it is, you’re certain to see it. The challenge with its musical analogue, of course, is that because music moves through time, small things can occasionally slip by unnoticed. I’ll cover those in the next post!
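The “orange dot” can be sketched the same way. This is a rough Python illustration of the idea as I’ve described it here, not a full textbook cross-relation detector: the melody is invented, and `foreign_notes` is a hypothetical helper that simply reports any note falling outside the key.

```python
# The notes of G Major: the "blue" palette from the painting analogy.
G_MAJOR = {"G", "A", "B", "C", "D", "E", "F#"}


def foreign_notes(melody, key_notes):
    """Return (position, note) for every melody note outside the key."""
    return [(i, note) for i, note in enumerate(melody) if note not in key_notes]


# Hypothetical melody, invented for illustration (not from the soundtrack).
melody = ["G", "A", "B", "D", "F", "E", "D"]

print(foreign_notes(melody, G_MAJOR))  # [(4, 'F')]
```

Six of the seven notes belong to G Major. The lone F (G Major has F# instead) is the orange dot: tiny, but out of place.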

Last time I gave a short intro to Cave Story and video game music in general, introduced the idea of counterpoint, and then ended with an attempt to explain cross-relations with a painting analogy. To quickly reiterate that idea, let’s go straight into another track on the OST, “Geothermal.” The piece is in D Minor (actually it’s more like in limbo between D minor and F major, but that’s not important right now)–this doesn’t have to mean anything to you, other than, as with any other key, D Minor has a certain subset of notes within the 12-note palette that distinguish D Minor from all the other 23 keys. Using the example from last time, let’s say D Minor uses mostly bluish colors, with some green and purple for dissonance. A cross-relation, which I described as an “invasion by a foreign note,” would be like a small orange dot against the blue backdrop. Let’s see where this appears in “Geothermal.”
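A quick aside before the walkthrough, for the curious: the “limbo” between D minor and F major has a simple explanation if you list the notes. The Python sketch below assumes the natural minor scale and uses helper names of my own (`scale_notes`, `MINOR_STEPS`); notes are spelled with sharps, so A# stands in for B-flat.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]
MINOR_STEPS = [2, 1, 2, 2, 1, 2, 2]  # natural minor (an assumption here)


def scale_notes(tonic, steps):
    """Return the set of note names in the scale built on `tonic`."""
    position = NOTE_NAMES.index(tonic)
    notes = set()
    for step in steps:
        notes.add(NOTE_NAMES[position])
        position = (position + step) % 12
    return notes


d_minor = scale_notes("D", MINOR_STEPS)
f_major = scale_notes("F", MAJOR_STEPS)

print(sorted(d_minor))     # ['A', 'A#', 'C', 'D', 'E', 'F', 'G']
print(d_minor == f_major)  # True
```

The two keys draw on exactly the same seven notes and differ only in which note feels like “home,” so a piece can hover between them without ever using a foreign note.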

Again, I’ll go through this in detail:

0:01–The intro sets the harmonic backdrop for the piece–most of the song actually just goes back and forth between these 2 chords introduced here. You can hear the chord change at 0:05 and it changing back at 0:10.

0:18–Melody comes in! Take note of what 0:23 sounds like. The first time you hear this part of the melody, it is in line with the other parts of the music (the harmonic backing). Blue on blue.

0:36–The melody repeats. In a tradition dating back as far as written music goes (and probably even before that), the second time you hear a melody has to be different from the first in some sort of way. The “different thing” that happens here is at 0:40.

Do you hear the difference between 0:23 and 0:40? There should be something “off” about what happens at 0:40. What you are hearing is the cross-relation! It’s the orange dot against the blue backdrop. Another way to think of it is: if someone were playing this music live, what happens at 0:40 might be perceived as a mistake. Cross-relations sound “off” because they ARE, in a way, “wrong notes.” They’re not part of the subset that’s dictated by the key of the piece (D Minor). Chopin wrote an Etude that played with this concept (and it was then aptly nicknamed the “Wrong Note” Etude):

Isn’t that awesome?! What a special and interesting way to vary the second repetition of the melody! This is not something you usually see when you see music makers creating interest in the 2nd repetition of a melody. The typical ways of creating variation come in the form of:

drums (beginning vs 0:22):

added instrumentation (beginning vs 1:10):

or (more commonly in classical music) harmonic variation (beginning vs 0:19).

And of course people often vary the melody itself (but I excluded that because it’s not an exact repetition of the melody anymore, is it?). But what Pixel does here with the use of dissonance is really pretty rare in music today.

Of course, he doesn’t do it just once! Let’s look at one of my favorite tracks from the Cave Story OST, “Labyrinth Fight.”

0:01–Similar to the other songs we’ve looked at so far, it starts by introducing the harmonic framework of the song. However, here we must recognize a crucial difference–in the “Geothermal” example, the introductory passage is a repeating pattern. It anchors around a few notes that make up a chord (which is why I noted above that there are only 2 chords, and that they change back and forth between each other). Another way to say this is that you could play all the notes in that passage simultaneously and it would create the chord you want. This is NOT the case here. The introductory passage in Labyrinth Fight is not a repeating pattern–it’s a melody of its own! It’s the bassline of the song, which means that while it does SUGGEST the harmonic structure of the song, in this case, it’s melodic as well.

0:13–The melody comes in! But the bassline is melodic as well–what does that imply? Where is the “harmonic backing” we usually have? Thinking back to what I said in the last post, it means that, yes, this is a contrapuntal piece! The bassline and the melody are actually both “melodies” in that they’re interesting on their own, and they have to combine and work together to create the harmonic progression. There is no instrument that acts purely as a “harmonic backing” in this piece.

When the melody first comes in at 0:13, notice that the bassline, in comparison to how it sounds in the intro, becomes much more subdued, and much more “bassy.” If you’re on laptop speakers, chances are you can’t even hear it.
Think of it as the bassline coming in first with his big melodic introduction, then fading away to allow the upper melody (because the bassline is the lower melody, right?) to come in and make HIS big melodic introduction. After they’ve both introduced themselves, both sing at full volume, together (this is at 0:24, the 2nd time we hear the upper melody).

Let’s look now at the first time the upper melody comes in. Right away, right before 0:15, there should be something “off” about the interaction of the two melodies. You may have to listen kind of hard, since the bassline is pretty quiet. If it goes by you pretty easily, that’s fine. I think that there’s not meant to be a harsh or noticeable “offness” quality to it. But keep that passage in your mind. Remember how in “Geothermal,” the second iteration of the melody was accompanied with a cross-relation? Let’s hear what happens the second time the (upper) melody is played in this piece. Listen carefully to what happens at 0:25–it may have been hard to hear at 0:15, but you should be able to hear something clearly “off” at 0:25. It’s clearer because the bassline is playing at full volume this time! How awesome that he manages to achieve the same effect of a “different repeat of the same melody” without even CHANGING THE NOTES this time!

So what exactly is “off” about 0:25? It should be clear at this point that this is another cross-relation. It sounds like a “wrong” note, something is “off,” it must be a cross-relation.

But it’s not.

What? Think back to how I even started introducing the idea of a cross-relation. I had to introduce the concept of a “key,” and I made an analogy about how a cross-relation is like an orange dot on a blue painting. In “Geothermal,” we looked at a musical example of how that “orange” note invaded our “blue” harmonic backing.

Now think back to what happens at 0:25 in “Labyrinth Fight.” If we were to assign “orange” and “blue” to the musical events happening here, which would be orange? Blue?

As I mentioned before, the difference between “Geothermal” and “Labyrinth Fight” is that one is your classic “melody vs harmonic backing” and that the other is “two melodies in counterpoint.” In “Geothermal,” it’s easy to assign orange to the “off” note in the melody, and the blue background to the harmony. But when you have two melodies in counterpoint, we cannot say EITHER is “the harmony” or “the melody.” They’re equally both! The “orange dot on blue” analogy cannot APPLY to counterpoint, because we’re talking about an entirely different system of writing music!

What exactly is going on, then? The best way to describe what’s going on here (because these things are rarely black and white) is bitonality. Breaking that word down, bi meaning two, tonal meaning “harmonic areas,” bitonality is essentially the interaction of two musical objects, each in a DIFFERENT key. Typically in bitonal music, the two keys that are juxtaposed are similar. In “Labyrinth Fight,” the bassline mostly contains notes from G Minor, while the upper melody is in G Major. What this allows for is what you hear in “Labyrinth Fight”: sections of consonant harmony (because G Major and G Minor have many notes in common), with sections of “offness” (where G Major and G Minor clash). If we use a color analogy, we can say, for example, that “G Major” is made up of reds and oranges. “G Minor” can be made up of reds and purples. Where the reds coincide, there are no problems–but occasionally purple and orange will be juxtaposed, and that’s what we hear at 0:25.
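To make the reds, oranges, and purples concrete, here is a small Python sketch comparing the two keys note by note. It assumes the natural minor scale for G Minor (the actual bassline may borrow from other forms of the minor scale), and the variable names are my own.

```python
# The two keys, listed degree by degree from the shared tonic G.
G_MAJOR = ["G", "A", "B", "C", "D", "E", "F#"]
G_MINOR = ["G", "A", "Bb", "C", "D", "Eb", "F"]  # natural minor (an assumption)

# "Reds": notes that both keys share, so they never clash.
shared = [note for note in G_MAJOR if note in G_MINOR]

# "Orange vs purple": scale degrees where the two keys disagree.
clashes = [(major, minor) for major, minor in zip(G_MAJOR, G_MINOR) if major != minor]

print(shared)   # ['G', 'A', 'C', 'D']
print(clashes)  # [('B', 'Bb'), ('E', 'Eb'), ('F#', 'F')]
```

Four of the seven notes are common ground. The other three scale degrees each come in two versions a half step apart, and whenever the melody and the bassline land on opposite versions at the same moment, you get the “offness” described above.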

It takes some knowledge of musical history and musical convention to fully appreciate it, but bitonality in music other than 20th century classical music is RARE. The fact that this kind of stuff showed up in a video game soundtrack is actually really amazing to me. It’s not easy to pull off well! Usually, taking two melodies from different keys and putting them together is cause for disaster. Cave Story pulls it off like no other.

BONUS: AWESOME bitonal piano music from 20th century composer Darius Milhaud: Sorocaba from “Saudades do Brasil”

2 thoughts on “Cave Story, Counterpoint, Cross Relations, and Bitonality”

  1. Awesome analysis, Jason! Really glad I stumbled upon this. This is the kind of stuff that reminds me why I got into music to begin with 🙂

    • Thanks! I wrote these so long ago so they’re in a different style from my writing now, but the content’s still all there. Glad you liked it. I’m planning to do more pieces like this soon, focusing on a few or just a single piece but going through a deep analysis while explaining the music theory behind it.
