work on karya: 2016

Here are some major changes over the last year, in roughly chronological order:

macros

These are basically like function calls, and I actually wound up with two kinds. Maybe overkill?

Now that I have ky files, I started to want to be able to define new calls out of compositions of other ones. These macros are defined textually in the ky language, and are basically just replaced with their values. The replacement is post-parsing, but call names are still looked up dynamically, which is important because it means, for instance, a call to m will look up the instrument specific mute technique. This is kind of like late binding, so these are called dynamic macros.

Then, calls started to require "realize" postproc calls, and require them in a specific order, e.g. cancel notes, then infer pitches, then randomize starts. That made me want to define a call as a composition of other calls, in Haskell. The other calls might be in a different module, and I didn't want to rely on what happens to be in scope, and in contrast to dynamic macros, don't want to be vulnerable to rebinding. This is like static binding, so these are static macros.

gamakam4

Every once in a while I would write a library for gamakam, and then decide it was too clumsy. The 4th incarnation seems to more or less work, though it's still missing a lot. It's actually just a simplification of gamakam3, from last year.

negative duration

I once again reworked how negative durations worked. I figured out that even Balinese notation seems to work better if only certain calls use negative duration. For instance, norot has a bit of preparation, but is mostly sustaining a pitch in the usual positive manner.

But then it's quite common to have a negative duration event aligning with a positive one, and since I represented negative durations as exactly that, I'd wind up with two notes with the same start, which violates the track invariant about no overlapping notes... not to mention it would be a tricky UI question of which one you wanted to edit.

So I decided to change the representation to actually all be positive durations, but negative oriented notes would have a flag, and the UI would interpret that flag as drawing text at the bottom. For some reason I was also under the impression that this would get rid of all of the complicated special casing to support negative durations, since there wouldn't actually be negative durations anymore.

It turned out I was totally wrong about that last part, because while I could indeed do that, it turns out that for the purposes of editing I want to treat negative duration notes in the complementary way (e.g. set duration moves the start, or selections include the end but not the start), so I not only had to keep all the complicated stuff, but I had to reimplement it for the new implementation, and it kind of got even more complicated because now it sort of acts like the note is at the end of the duration, but it's actually just a normal note with a flag. Also it's still not worked out, because it's quite confusing to edit them, since you edit at the top but the text appears at the bottom.

So I don't know. Maybe I just need to polish the rough edges, and then get used to the quirks. Or maybe go back to real negative durations, but come up with some other way to have two events that "start" at the same time.

The bright part is that between this and note cancellation I seem to finally have a more or less usable way to express typical moving kotekan patterns, even if the entry and editing is rough. It just seems way more involved than it should be.

text display refactor

Since the above negative duration stuff meant I started actually writing scores that used negative duration, it annoyed me that text wrapping didn't work for them. Also, I thought it would be worth putting some time into fancy text layout so I could squeeze in the right-aligned merged track text wherever there was room, so I rewrote the text wrapping algorithm. It got really annoyingly complicated, especially with positive and negative events, and it still has some minor bugs I haven't bothered to go figure out yet, but at least negative events wrap upwards now.

It turns out squeezing the right-aligned text into gaps caused by wrapped left aligned text just doesn't happen that often, though.

im, 音

I sampled my reyong, and the simple job of supporting multiple articulations (open, damped, cek, etc.) per pitch turned out to be ridiculously complicated. This is due to terrible MIDI, terrible buggy Kontakt, and its terrible excuse for a scripting language, KSP.

I always intended to add a non-realtime non-MIDI synthesizer, and since surely a sampler is the easiest to implement, I started on an implementation. I got as far as a basic proof of concept, namely a note protocol, an offline renderer, and a simple sample playback VST to handle the synchronization. It's crazy how much simpler things become when I don't have to deal with MIDI.

However, losing realtime response is likely to be quite annoying, though I have a plan for how to get it back in a limited way. And since I already did all the work to get the reyong samples working in Kontakt, I don't have a lot of motivation yet to finish my own implementation. It will likely have to wait until I either have another annoying sampling job, or more likely, finally get around to writing a physical modelling synthesizer.

instrument generalization

Now that I technically have a non-MIDI non-lilypond backend, I needed to generalize instruments into common and backend-specific parts.

Also there was a lengthy and messy transition from the old way where you'd directly name an instrument's full name in the score, to having short aliases to the full instrument name, to aliases becoming separate allocations for the instrument (so you could use the same instrument twice), to aliases becoming the only way to refer to an instrument and renaming them to "allocations."

In retrospect, I should have done it that way from the beginning.

retuning

Of course I already have arbitrary tuning via pitch bend, but it's a hassle to add a bunch of VSTs, so I added support for retuning via the MIDI realtime tuning "standard" (supported only by pianoteq), and retuning via KSP (naturally supported only by Kontakt).

hspp

GHC 7.10 finally has call stacks, which let me get rid of the hacky preprocessor. It served its purpose, but it's much nicer to not need it. I lost calling function names because the 7.10 support for that is kind of broken, but I think it may be fixed in GHC 8.

solkattu

The track format isn't great for expressing rhythms, since the rhythm is implicit in the physical location of the note. Also, except for integration, which is complicated, it's basically "first order" in that it doesn't easily support a score that yields a score. For instance, the mapping between solkattu and mridangam strokes is abstract and dependent on the korvai, and of course could map to any number of instruments.

So I came up with a haskell-embedded DSL which is purely textual and thus can express rhythmic abstraction. It's mostly useful for writing down lessons, but it can be easily reduced to track notation, so in theory I could write solkattu and integrate that to a mridangam track. I also have some experimental alternate realizations to Balinese kendang and ideas for a reyong "backend". That way I could write in solkattu, and then have it directly realized to any number of instruments, either to play simultaneously or exchange material.

It looks like this:

t4s :: [Korvai]
t4s = korvais (adi 6) mridangam $ map (purvangam.)
    [ spread 3 tdgnt . spread 2 tdgnt . tri_ __ tdgnt
    , spread 3 tdgnt . tri (ta.__.din.__.gin.__.na.__.thom)
    , tri_ (dheem.__3) (ta.din.__.ta.__.din.__.p5)
    , tri_ (dheem.__3) (p5.ta.__.din.__.ta.din.__)
    , p123 p6 (dheem.__3)

    , p123 p5 (tat.__3.din.__3)
    , let dinga s = din!s . __ . ga
        in p5.dinga u . ta.ka . p5.p5. dinga i . ta.ka.ti.ku . p5.p5.p5
    , tri (tat.dinga . tat.__.dinga.p5)
    ]
    where
    tdgnt = ta.din.gin.na.thom
    p123 p sep = trin sep p (p.p) (p.p.p)
    purvangam = tri (ta_katakita . din.__6)
    mridangam = make_mridangam $
        [ (ta.din.gin.na.thom, [k, t, k, n, o])
        , (ta.din, [k, od])
        , (dheem, [u])
        , (din, [od])
        , (tat, [k])
        , (ta.ka.ti.ku, [k, p, n, p])
        , (ta.ka, [k, p])
        , (dinga, [od, p])
        ] ++ m_ta_katakita

And reduces to a mridangam realization like this:

k _ p k t k t k k o o k D _ _ _ _ _ k _ p k t k
t k k o o k D _ _ _ _ _ k _ p k t k t k k o o k

D _ _ _ _ _ k _ _ t _ _ k _ _ n _ _ o _ _ k _ t
_ k _ n _ o _ k t k n o _ k t k n o _ k t k n o

I noticed that after editing a score for 10 minutes or so, the UI would start getting laggy. Usually that means a memory leak and too much GC, and sure enough ekg showed that after each derivation memory usage would jump up, and never go back down again.

The first thing I blame is the cache, because it's the only thing that remains after each derivation, besides the note data itself. If the cache itself somehow holds a reference to the previous cache, then no derivation will ever be freed. It's like the joke where all I wanted was the banana, but I got the banana, the monkey holding the banana, and the jungle the monkey lives in. Only in this case I also get all the previous generations of monkeys.

I tried to debug by stripping out various fields in the cache, and got mysterious results. Dropping the cache entirely would fix the leak, but replacing all of its entries with Invalid tokens would add to it. Then I discovered that Data.Map's fmap is always lazy (of course) and so the test itself was insufficiently strict, and Data.Map.Strict.map lead to more consistent results.

I discovered an intentionally lazy field, with a potentially complicated thunk lurking inside. That's the root cause. In this case, the mechanism to get neighbor note pitches sticks the evaluation in a lazy field, with the idea that if you don't need a neighbor pitch (the common case), then you don't have to pay for the evaluation. I don't need that field once the computation is done, so I stripped it out on return. This still didn't solve the problem, because it was going into another intentionally lazy field, so of course the stripping didn't happen. I bang-patterned the value before putting it in the record, and the leak was gone!

As an aside, I discovered that I don't even need to use the value, e.g.: make x = Record (f x) where !unused = f x is already enough. Of course that's perfectly normal in a strict language, but in haskell I'm used to freely deleting unused bindings.

So the leak was gone, but now the UI had a hitch. The reason the field was intentionally lazy was to avoid doing that work in the event loop, so it could be passed to another thread and forced over there. So removed the bang and now the hitch is gone, but the leak is back! But shouldn't the other thread forcing have cleaned up the thunk in the first place? Then I discovered another fun bug:

force_performance perf = perf_logs perf `deepseq` perf_events perf
    `deepseq` perf_warps perf `deepseq` perf_track_dynamic
    `deepseq` ()

Not too obvious, right? It turns out perf_track_dynamic is exactly the field I needed to force, and yes functions are in NFData, so no type error for that.

So I fixed that and... still the leak. Actually, it seems like the leak is gone in the application, but still there in the test. I did all sorts of messing about trying really ensure that field is forced in the test and no luck.

Finally I somewhat accidentally fixed it, by refactoring force_performance to use less error-prone pattern matching instead of accessor functions, and added another field to the deepseq chain while I was at it. It turns out that other field, which has nothing to do with the guilty perf_track_dynamic one, still somehow had a pointer to it in its thunk. Since it's built in the same function that builds the other record, maybe it has a pointer to that whole function, and hence everything that function mentions. And of course one of the core principles of hunting space leaks is that you have to kill them all at once. It's like a hydra, where you have to cut off all the heads at once to have any effect.

The morale is be really careful about intentionally lazy fields. Of course that includes anything wrapped in any standard type like Maybe or (,)... so put in regression tests for both memory usage growing too much (too lazy) and functions taking longer than expected on large input (too strict).

work on karya

Tuesday, December 13, 2016

another year