work on karya: May 2012

A lot of interesting things have been happening lately. I've been putting off writing about them because, well, all the time spent writing blog stuff is time I could have been writing code.

Bloom

One of the most exciting is that I "finished" the first piece of music to come out of karya. It doesn't totally count, because it's more "re-edited" than written: I started with "bloom", which I wrote a long time ago on the Amiga, and wrote a little program to dump out the old OctaMED save and load it into karya. Since then it's been the testbed for score features, and the motivation for new ones. It inherited the block structure and all the notes from the old tracker piece, so I still don't have many tools for dealing with larger scale score structure. It also doesn't give me much feel for how I should structure a score, given that karya is not subject to the tracker's two-level structure of a list of blocks. It turns out that structure is convenient for caching, so I'll probably need something like it, but I still have a lot of unused flexibility in the layout of the piece.

Also, it's not really finished because there are some things I'd like to add, but they're mostly for experimental purposes rather than musical ones. Of course, I hope that technical experimentation will always be a source of musical inspiration, so presumably this will always be true. One thing is more experimentation with expressive wind melodies. The other is that I noticed that one section sounded interesting backwards. Incidentally, this was thanks to step play being able to go backwards. It's interesting how some features lend themselves to serendipitous discovery. In this sense, as well as its intended purpose of sounding out music at my own speed, step play has been worth the time put into implementing it.

Integration

Anyway, one problem with step play is that, since it's a MIDI-level mechanism, it's not really that easy to turn its output back into score. Initially I wrote a UI event transformer Cmd to reverse a bit of score, but ran into problems associating the controls with each note. I realized I was, once again, redoing work that the deriver already does. I have already put quite a bit of work into a vocabulary of ways to transform notes at the deriver level, and one thing I'd always planned on doing was to come up with a way to bounce a derivation back up to the score. So you can choose whether you'd like to implement a particular effect by writing out the expressions to make it happen, as programming languages do, or transform the source directly, as sequencers tend to do. The advantage of the first is that you retain the original data and the structure of the music is apparent in a high level way in the score, and the advantage of the second is that you can continue to edit the score.

In a way, abstraction in programming languages forces you to pick a point in the abstraction space and makes it hard to reach down below that, but in music you often want to reach across those boundaries (e.g. call this function 15 times, but on the 8th time add 3 to that value instead of 2). In a programming language, this turns into a configuration problem: either every function needs an ad-hoc growing list of parameters, or there's some kind of dynamic environment or implicit parameters, or maybe inheritance and overriding as in the OO world. Actually, the "dynamic environment" is exactly what the deriver already does to try to attack these problems, but there are still plenty of cases where the change is too ad-hoc. Reflecting the derived notes back up to the score is another angle of attack on that problem, and I realized all I really needed was a function from score events to UI events. It was surprisingly easy to write, just about 200 lines.
I decided to call it "integration" because it's the reverse of derivation. I originally intended to give that name to recording MIDI -> score, but "record" is a more conventional name for that more conventional concept.
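
To make the "function from score events to UI events" concrete, here is a minimal sketch of the shape such a function could take. It is not karya's code: ScoreEvent, UiEvent, TrackPair and integrate are hypothetical names, and the types are far simpler than anything a real deriver produces (no controls, pitches as plain text). It groups derived notes by instrument and emits one note track and one pitch track per instrument.

```haskell
import qualified Data.Map as Map
import Data.List (sortOn)

-- | A derived note.  Hypothetical and heavily simplified.
data ScoreEvent = ScoreEvent
    { scoreStart :: Double
    , scoreDuration :: Double
    , scoreInstrument :: String
    , scorePitch :: String
    } deriving (Eq, Show)

-- | An event as it sits on a track: position, duration and text.
data UiEvent = UiEvent
    { uiStart :: Double
    , uiDuration :: Double
    , uiText :: String
    } deriving (Eq, Show)

-- | A note track along with its pitch track.
data TrackPair = TrackPair
    { pairInstrument :: String
    , pairNotes :: [UiEvent]
    , pairPitches :: [UiEvent]
    } deriving (Eq, Show)

-- | Turn derived notes back into tracks, one pair per instrument, with
-- each instrument's events in time order.
integrate :: [ScoreEvent] -> [TrackPair]
integrate events =
    [ TrackPair inst (map note evts) (map pitch evts)
    | (inst, evts) <- Map.toAscList byInstrument
    ]
    where
    byInstrument = Map.map (sortOn scoreStart) $ Map.fromListWith (++)
        [(scoreInstrument e, [e]) | e <- events]
    -- The note event carries the duration, the pitch event is a
    -- zero-duration event holding the pitch text.
    note e = UiEvent (scoreStart e) (scoreDuration e) ""
    pitch e = UiEvent (scoreStart e) 0 (scorePitch e)
```

The hard part that this sketch leaves out is the one mentioned above: deciding which control signals belong to which note, and turning them back into control track events.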

I also realized I could sort of have my cake and eat it too by setting some blocks to automatically integrate the output of another block, and with a merge function, I can figure out which events have been edited or are new, and try to merge the newly integrated output back in. So I should be able to have one part be a derivation of another part, edit the derivation a bit, then edit the source block, and have it merge my changes back in to the integrated result. I can use an event style to mark derived notes, and detect deleted notes, moved notes, and perhaps even transposed ones. The capabilities of the merge function may need to vary by score. This is a feature I had planned on adding from the very beginning, but was putting it off for a distant future version. But now it seems like it shouldn't actually be that difficult to implement.
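
One way such a merge could work is as a three-way comparison between the previous integrated output, the new integrated output, and the destination block as it currently stands. The sketch below is only an illustration under assumptions: it assumes each derived event carries a stable Key identifying its source note (which stands in for the event style marking), and Key, Event, Dest and merge are all hypothetical names. An edited event simply keeps the user's version here, where a fancier merge might reapply a move or a transposition.

```haskell
import qualified Data.Map as Map
import Data.Maybe (mapMaybe)

-- | Hypothetical stable identifier for the source note of a derived event.
type Key = Int

data Event = Event
    { start :: Double
    , dur :: Double
    , text :: String
    } deriving (Eq, Show)

-- | An event in the destination block is either integrated from a source
-- note, or was written by hand.
data Dest = Derived Key Event | Manual Event
    deriving (Eq, Show)

-- | Merge newly integrated output into a destination that may have been
-- edited since the previous integration.
merge :: Map.Map Key Event -- ^ previous integrated output
    -> Map.Map Key Event -- ^ new integrated output
    -> [Dest] -- ^ destination as it is now
    -> [Dest]
merge prev new dest = manual ++ mapMaybe mergeKey (Map.toAscList new)
    where
    -- Hand-written events always survive.
    manual = [d | d@(Manual _) <- dest]
    current = Map.fromList [(k, e) | Derived k e <- dest]
    mergeKey (k, newEvent) = case (Map.lookup k prev, Map.lookup k current) of
        -- The source note is new, so add it.
        (Nothing, _) -> Just (Derived k newEvent)
        -- It was integrated before but is gone now: the user deleted it.
        (Just _, Nothing) -> Nothing
        (Just old, Just cur)
            -- Untouched, so take the new version.
            | cur == old -> Just (Derived k newEvent)
            -- Edited by hand, so keep the edit.
            | otherwise -> Just (Derived k cur)
```

Derived events whose key no longer appears in the new output are dropped, which corresponds to the source note having been deleted.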

The next step is to integrate output into the same block, which presumably involves fancier merging. This has the obvious applications with fugues and canons, in addition to doubled parts, sekaran, imbal, and all manner of parts derived from other parts.

Git Save

Another feature I intended to implement from way back is incremental saving. The idea is to continually save each modification, so that explicit saving is unnecessary (though you can checkpoint to mark specific spots). This means that the undo history is saved along with the rest of the score.

I originally intended to save a simple numbered series of files representing each update, and made some changes to the Updates to support this. But I realized I'd have to explicitly delete the future if I did edits after an undo, and why shouldn't I preserve it instead? That led to the idea of using a hash of the contents of each update so they won't conflict with each other, and provide some assurance that you don't apply them out of order, and that in turn started to sound a bit like git. It turns out that git is actually a general purpose object store with a source control system built on top, so I looked into using git instead of inventing my own system.
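
The idea of keying each update by a hash, so that editing after an undo adds a branch instead of overwriting the future, can be sketched as a tiny content-addressed store. Everything here is an assumption for illustration: Commit, Store, save and history are hypothetical names, updates are plain strings, and an FNV-1a style hash over character codes stands in for the SHA-1 that git uses.

```haskell
import qualified Data.Map as Map
import Data.Bits (xor)
import Data.Char (ord)
import Data.List (foldl')
import Data.Word (Word64)

type Hash = Word64

-- | FNV-1a style hash over character codes.  A stand-in for a real
-- cryptographic hash, good enough for a demonstration only.
fnv1a :: String -> Hash
fnv1a = foldl' step 14695981039346656037
    where step h c = (h `xor` fromIntegral (ord c)) * 1099511628211

-- | An update, plus the hash of the state it applies to.
data Commit = Commit
    { commitParent :: Maybe Hash
    , commitUpdate :: String
    } deriving (Eq, Show)

type Store = Map.Map Hash Commit

-- | The parent is part of the hashed content, so the same update applied
-- to two different states gets two different names.
commitHash :: Commit -> Hash
commitHash (Commit parent update) =
    fnv1a (maybe "" show parent ++ ":" ++ update)

-- | Save an update on top of a parent.  Nothing is ever overwritten.
save :: Maybe Hash -> String -> Store -> (Hash, Store)
save parent update store = (hash, Map.insert hash commit store)
    where
    commit = Commit parent update
    hash = commitHash commit

-- | The updates leading up to a commit, oldest first.
history :: Store -> Hash -> [String]
history store = reverse . go
    where
    go hash = case Map.lookup hash store of
        Nothing -> []
        Just (Commit parent update) -> update : maybe [] go parent
```

Undoing to a commit and then making a different edit is just a second save with the same parent: both futures stay in the store, and each one's history can still be walked back to the root.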

So I have a partial implementation: it can save and load complete states as well as checkpoints, but is not yet hooked into the undo / redo system. It's a bit awkward and quite slow to interact with git by sending tons of shell commands, but it works for the moment. There's a library version of git called libgit2 which I can bind to once this is working. In return, I get git's ability to pack up lots of small files efficiently, add references to mark particular checkpoints, and gc away unreferenced ones. I should also be able to use graphical git tools to visualize the history and branches of a score. I don't think I'll be able to support merges, or have any use for branches, and since I use the object store directly (there's no "working set" in the filesystem) I'm not sure those features make sense anyway.

Hopefully I can finish this up, test it thoroughly, and never have to deal with losing data due to a crash again. Not that crashes happen very much, this is Haskell!

Theory3

Since I now support diatonics and enharmonics I have to understand scales in more detail than I previously did. Specifically I need to know about the key, for scales that have them. This is used for not only diatonic transposition, but also symbolic transposition, both chromatic and diatonic, and choosing the most appropriate enharmonic given an input note.

This is well defined for the church modes, but leads to some interesting questions when we start pushing the boundaries of the tonal system. Specifically, how should non-seven-tone scales work? The simplest examples are octatonic and whole tone. In the case of octatonic, the letters no longer indicate the scale degree because one will have to repeat. That means the usual definition of diatonic transposition, which is to increase the scale degree, can't just increment the note letter. Then there's the question of spelling. If I take the Messiaen approach and say there are only three octatonic scales then I can simply hardcode the spelling for each one. But in practice when I write in octatonic I'm still thinking tonally, so there should be a key. And the key must fall within the scale, so if I'm writing in Db octatonic then I expect that Db should come out when I hit the C# key! So I settled on two octatonic scales, octa21 and octa12, multiplied by each chromatic pitch class.
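
As a small illustration of why diatonic transposition has to step through scale degrees instead of note letters, here is a sketch that treats a scale as a list of semitone intervals. It makes some assumptions: the names octa21 and octa12 are read as the step patterns 2-1-2-1... and 1-2-1-2..., pitches are plain semitone counts with 0 = C, and scaleDegrees, transposeDiatonic and octatonicKeys are hypothetical names. Spelling, which is the harder problem, is left out entirely.

```haskell
import Data.List (elemIndex)

-- | Semitone steps between successive degrees.  Assumption: the scale
-- names describe the step pattern.
octa21, octa12 :: [Int]
octa21 = [2, 1, 2, 1, 2, 1, 2, 1]
octa12 = [1, 2, 1, 2, 1, 2, 1, 2]

-- | Pitch classes (0 = C) of the scale starting on the given key.
scaleDegrees :: [Int] -> Int -> [Int]
scaleDegrees intervals key =
    map (`mod` 12) $ init $ scanl (+) key intervals

-- | Two scales times twelve chromatic pitch classes.
octatonicKeys :: [(String, Int)]
octatonicKeys = [(name, key) | name <- ["octa21", "octa12"], key <- [0 .. 11]]

-- | Transpose a pitch, in semitones, by a number of scale steps.  This
-- works on degree indices, so it doesn't care how many degrees the scale
-- has.  Nothing if the pitch is not in the scale.
transposeDiatonic :: [Int] -> Int -> Int -> Int -> Maybe Int
transposeDiatonic intervals key steps pitch = do
    degree <- elemIndex ((pitch - key) `mod` 12) offsets
    let (octaves, degree') = (degree + steps) `divMod` length offsets
    return $ pitch - offsets !! degree + offsets !! degree' + 12 * octaves
    where
    -- Semitone offset of each degree above the key.
    offsets = init (scanl (+) 0 intervals)
```

For example, scaleDegrees octa21 1 (on Db) is [1,3,4,6,7,9,10,0], and stepping up one degree from the Db at pitch 13 lands on 15, while stepping down one degree lands on 12.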

A simpler approach would be to adjust the system to accommodate these scales naturally, e.g. by spelling octatonic A-H and whole tone A-F, etc. This leads to much simpler notation for things which are purely in that scale, but would lead to difficulties when modulating between, say, octatonic and a church mode, or writing something which is somewhere in between the two.

The weakness of the A-G twelve tone system is that it poorly expresses the specific mode you are writing in, but that same aspect is a strength in that you need not commit fully to any particular mode. So it exists at a peculiar point of tension between tonal style spelling and giving up, writing in a purely 12-tone style, and choosing enharmonics purely melodically.

All this will be especially important for non-12-TET scales, where enharmonics will actually be different frequencies. I'm still not clear on how a 12-note scale can coexist with just intonation, but there's a company called h-pi that makes such keyboards. When the time comes I should put in the time to understand their system better.

For some reason it's taken a long time and 3 iterations to get this right, and it's still not done yet, but it will be necessary eventually.

TrackNum vs. TrackId

This is an internal cleanup headache. I use BlockIds instead of directly talking about Blocks so that I can have multiple Views backed by the same block. I originally thought it might be useful to do the same thing with tracks, so blocks actually have TrackIds instead of Tracks, and theoretically a track could appear in more than one block, or more than once in one block. I say theoretically because now there are two ways to refer to a track: either by its TrackId, or by a (BlockId, TrackNum) pair. When referring to the events I should use the TrackId so that, e.g., I only transform the events of a track once even if it appears in a block twice, and when referring to the track in context it must be by (BlockId, TrackNum). Unfortunately I got that wrong a lot, and as a result a track appearing in a block twice will probably not work right. For instance, the event stack uses the TrackId, not the TrackNum, which means that signal output and Environ are saved by TrackId. But since the two tracks appear in different contexts they may very well render to different signals, and will likely have different Environs.
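
The asymmetry between the two ways of naming a track is easy to see in a toy model. The types below are simplified stand-ins, not the real definitions, and TrackNum is assumed to count from 0: going from (BlockId, TrackNum) to a TrackId always gives at most one answer, while going from a TrackId back to its context can give several, or none.

```haskell
import qualified Data.Map as Map

-- Simplified stand-ins for the real ID types.
newtype BlockId = BlockId String deriving (Eq, Ord, Show)
newtype TrackId = TrackId String deriving (Eq, Ord, Show)
type TrackNum = Int

-- | Each block is a list of TrackIds, and nothing stops a TrackId from
-- appearing more than once.
type Blocks = Map.Map BlockId [TrackId]

-- | Track in context to TrackId: at most one answer.
trackAt :: Blocks -> (BlockId, TrackNum) -> Maybe TrackId
trackAt blocks (blockId, num) = do
    trackIds <- Map.lookup blockId blocks
    case drop num trackIds of
        trackId : _ | num >= 0 -> Just trackId
        _ -> Nothing

-- | TrackId to every context it appears in: zero, one, or many answers.
trackContexts :: Blocks -> TrackId -> [(BlockId, TrackNum)]
trackContexts blocks trackId =
    [ (blockId, num)
    | (blockId, trackIds) <- Map.toAscList blocks
    , (num, t) <- zip [0 ..] trackIds
    , t == trackId
    ]
```

Anything cached under the TrackId alone, such as a rendered signal, has nowhere to record which of those contexts it came from.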

Also TrackIds add some complexity. I have to come up with names for them and worry if those names should be meaningful, GC away orphaned tracks that no longer appear in any block, and deal with all the functions that give a TrackId when they should have given a TrackNum (you can't sort the tracks by their order in the score if all you have is a TrackId, and you're not guaranteed to be able to get a TrackNum from it because of this multiple tracks in a block thing).

So I want to scrap the whole idea of the TrackId. Unfortunately, it has its tentacles in deep and sometimes it's very convenient to be able to talk about a track without reference to a block. I have a State.Track type which is a (BlockId, TrackNum) pair, but that's turned out to be rarely used because in practice I often want to separate out the BlockId and do different things with it.

Every time I come across a case where TrackId is incorrect or error-prone it goes onto the list of reasons to switch. Eventually it'll get heavy enough that I'll be motivated to go through the hassle and breakage.

Memory Hog

At one point I laughed at web browsers for their memory hogging nature. That was back in the days of Netscape Navigator. Well, the laugh's on me now, because I recently noticed the sequencer hanging out at 750 MB, with Chrome down there at a mere 250 MB.

So clearly I need to do some space optimization. It looks like about 40 MB for just the program, but once I enable the REPL and it links in all its libraries that winds up at 116 MB. This probably means the sequencer has all libraries linked in twice, which is just a bit wasteful. Perhaps I can get both the main compile and the GHC API to link shared libraries?

Also the score data is huge, and it grows over time. For that I think I can do some work on interning strings, flattening pointers out of data structures, discarding old undo states (once I've finished the stuff to save them to disk), compacting cached signals and rendered MIDI down to flat bytes, and someday compacting long series of 'set' calls in control tracks down to their raw values, which should be a big time win if there are lots of them.

I sometimes get pretty laggy behaviour under OS X and I think some of it is due to excessive memory usage. OS X definitely has poor behaviour when under load. Work on this machine effectively comes to a stop when linking, to the point where the task switcher will get stuck and vim will get laggy. And it's pretty normal to hit the pause key and wait a couple of seconds for iTunes to actually wake up and pause. So it's probably not totally my fault.

This laptop felt pretty fast when I first got it a number of years ago, but there's just no comparison against my relatively newer Linux desktop. One benefit of taking so long to finish the program is that "wait for computers to get faster" is a reasonable alternative to optimization.

Wednesday, May 2, 2012