Sunday, September 2, 2012

sampling


Since I'm wanting to come up with notation for kendang and whatnot I've started to actually get around to sampling, and making instruments out of the samples I've already made.

I posted this on google+:

Sampling is such mind-numbing work.

First decide on how many dynamic levels, how many articulations, how many round-robin variations, how many pitches, and multiply all those together.  That's how many samples.  Then record them all, being as consistent as possible, plus a few variations to give choice when editing.  The slightest noise can ruin the sample.  Try not to swallow or breathe too loudly.  Unplug the refrigerator and wait for airplanes to pass.

Then proceed to snip up all 300-500 samples.  It should be somewhat automate-able, but it's not.  Get sore wrists.  Snip, snip, snip.  Try to pick consistent variations, but not so consistent that there's no point to having the variations.  The annoyance of every bug or UI misdesign in the editor's "inner loop" is multiplied by 1000 repetitions.  Be completely consistent about ordering and naming or samples will get mixed up or lost.

Then lay them out in the sampler, which is almost completely manual, and intentionally designed to resemble a cramped LCD screen from the '90s.  Get the dynamics smooth as possible.  Enjoy the bugs here too, evidently they figure not many people are creating their own samples so they don't need to polish that part so much.

Finally, find out if it actually works for music!  If you consistently played some articulation oddly then stringing them together sounds bizarre and stiff, like an automated subway announcement.  Maybe start over from the beginning, or redo bits, being careful to recreate the original recording situation as much as possible.

And this is percussion, probably the easiest thing to sample.  I have a lot of respect for the people who do this professionally.

update: integration, git, theory, tracknum

I've been practicing kendang in earnest lately, and so on the sequencer side here's kendang tunggal notation, along with composite notation with automatic realization, and how that gets integrated back into score.  The idea is that you'd then go edit the individual parts, becaue the composite notation doesn't express all the details of what wadon and lanang really do.

Work has slowed lately since I've been busy with other things, but there has been some progress.

I've reached one of those points where I'm tired of features, so it's time to work on some more music, and see what features (and bug fixes of course) fall out of that.

    Bali

One of the larger reasons for business was going to Bali to perform in the PKB.  The performance, or rather our integration with it I'd say was only partially successful, but it did imply a sort of trial by fire for the sequnecer.  I originally wanted to see about writing for the gamelan, but it turned out to be more practical to write for the Chinese instruments.  Partially because there were fewer of them, but also because they're used to sight reading, and and it's hard to beat sight-reading when rehearsal time is limited.

Since I naturally wanted to use Karya for writing the music, it entailed a lot of work on the lilypond backend.  Writing to a deadline also enforces an economy of effort, which is reflected in the fact that at present the lilypond backend works ok for a single score, but is definitely not able to typeset anytihng not written specifically with it in mind.  For a few days I would get up early, work on adding a feature so I could print the score out around noon, so we could then rehearse before going over to Gusti's to practice his stuff.

Since I'd like to write mixed acoustic and electronic music, a lilypond backend will be useful in general, so the effort was definitely not wasted.  Also, I'd like to use Karya to write as nice an electronic version as possible, while still preserving the possibility of live performance by printing out a nicely typeset 5-line score.

It's essentially a problem of conversion from a notation at one level of abstraction to another one at a different level, and like all such translation there's no complete solution, but I believe I should be able to get pretty nice lilypond eventually.  Since I need to render e.g. trills to a special lilypond symbol rather than a sequence of notes, I need a seperate derivation and for ornaments to be able to know about lilypond and emit a lilypond annotation when they are in lilypond mode.  I also need to be able to annotate the Karya score with things staff notation cares about, like left and right hand for piano parts.  Attributes should be able to express that kind of thing.

Hand-editing the generated lilypond output would be pretty untenable, so I also have to be able to embed bits of lilypond so I can keep things like layout tweaks and page breaks in the Karya score.

Once completed I think it should be possible to wind up with a fairly nice lilypond score creating environment as well as a sequencer.


Anyway, enough new things, here's how the various themes from previous post have fared:

    Integration

I've been working on this one lately.  I have a basically working setup and design, though there are a number of known problems and I have the feeling it's the sort of thing I'll be picking bugs out of for a long time.  I've got to stop adding these kinds of features!

Integration comes in two flavors: block integration takes the output of a block and creates a new block from it, and track integration takes the output of a track and creates new tracks in the same block.  I'm guessing that I'll use track integration more because it's more convenient to have the source track right next to the destination tracks when editing.

The complicated part of integration is merging.  Namely, I want to be able to edit the destination track, then edit the source track, and have the integrated changes merge with the manual changes on the destination.  So what I need is to turn the destination into a set of edits that describe how it has been edited manually.  When the source changes, I recreate its events and reapply the manual edits.

After thinking about various approaches, I decided that I could save the integration with the block.  When there is a new integration (i.e. the source has changed), I diff the current score against the previous integrated score to derive the edits, and then apply those edits to the new integrated output.  I call the saved integration an "index", because it maps the event's stack (i.e. the stack of the generating score event) to the generated UI event.

The edits can express moving an event, changing its duration, replacing the text, or adding a prefix to the text.  So if you derive some notes, then make them longer and apply an ornament to them, then change the source in a way that their text changes (e.g. change the pitch), they should get the new text, but have the modified duration and prepend the ornament.

I went down a bit of a dead-end trying to use the saved integration to determine the tracknum.  The idea was that the integration could decide to emit the "same" events on a different track, but since the index uses the stacks of the generating event I can track them down and keep them on a consistent destination track.  Unfortunately this ran into trouble because a new event on the source track doesn't exist in the index, so it still needs to come up with a tracknum for it.  It turned out that just given the stack this isn't really possible, since one score event will wind up generating several tracks of UI events, since pitch and each control is a separate track at the UI level.

So I switched tactics---instead of diffing the entire integrate output, I first match up the tracks and diff the individual tracks.  I can match up the control tracks by name.  Since controls are indexed by name in the score event, a score event will never integrate to two control tracks with the same name.  That leaves matching up the note tracks, which I don't have a satisfactory answer for.  I just match them in order.  The effect will be that if a deriver decides to create more note tracks (e.g. it has more simultaneous notes) and if a new track winds up before other ones, they'll get bumped over by one, not line up with the destination tracks anymore, and all of the generated events on the destination track will be mistaken for manually added events, none of the generated events in the integration will match up with the destination, and Chaos will Ensue.

As a result, the index is no longer keyed on the whole stack of the source event, but simply its position in the track.  It's simpler but if the tracks get lined up wrong easy leads to false positives.  I'll have to see from practice whether this is a problem or not.  I don't need to completely solve the merge problem (which is not possible anyway), I only need to adequately solve the subsets that are likely to turn up in the course of the way I'm likely to use integration.  It's probably possible to have the deriver include more information and get a more reliable way to match up tracks, or maybe some technique to ensure the deriver out is "stable" as far as new tracks goes (i.e. they should always be added at the end), but I'll wait to see if there's actually a problem in practice before trying to get even more fancy.

For some reason the complexity of three inputs---diffing one against the other so I can apply the results to the third---proved to be just a little too much for my tiny brain, and needed real concentration to not forget what I'm doing and what input is what.  I guess I know my limits.

One essential thing is coming up with distinct names for each concept, which I guess is one of those Hard Things about programming.  I eventually decided that "integrate" is the verb for the whole process, "integrated" is the output from derivation which is just score events plus some additional bits, "convert" is the process of converting score events to UI ones and splitting tracks, "source" is the block or track that is being integrated from, and "destination" is the block or track that is being integrated to.  It's mostly ok, though "integrateds" is a really awkward variable name, and "destinations" winds up being shortened to "dests".


Integration is analogous to a macro system like Template Haskell, and like TH it requires staged compilation.  So I wind up with a derivation that results in changes, which should result in further derivation.  Since derivation is triggered by score damage, it's easy to get this wrong and wind up with an endless deriving loop.  I originally debated either building the derive -> integrate -> derive cycle into derive (which is how ghc does TH), or keeping integration out at the cmd level and bouncing between Cmd and Derive several times.  I decided on the latter, because all of the facilities for applying changes to a score, turning those changes into damage and firing off the necessary derivations all exist at the Cmd level and it seemed complicated to try to embed them in Derive.

The original plan still required special support in Derive for integration, in that the first time you run it, it derives just the integrate sources with some specific tweaks like omiting tempo tracks, and the output events go to integration rather than the performance.  I eventually figured out I could eliminate all this by collecting integrated events in the Collect state and having the Cmd level integrate look for performances with non-null integrated output.  So you set up integration by just sticking a normal derive transformer in some spot to catch events, and all the work of integration happens at the Cmd level when it notices a derivation includes integrated output.


It's worth noting that one source block or track can have multiple destinations.  So you can e.g. one core melody reflected into several different parts, each of which differ from it in different ways.  It's also theoretically possible to stack the integrations, e.g. a source creates a destination, which you can manually edit, which is then integrated again to create the final destination.  So it's a fairly powerful technique.

    Git Save

It's all hooked up into undo.  So I can load a project at a certain point in its history (available through the 'GIT_DIR=save/file.git git log') and then go back or forward in time.  If I undo and make new changes, the old ones aren't lost, I simply have a branching history.  Redo will just pick the last one I used, but when loading from the cmdline I can give the git hash and load up that specific branch.  These branches are not branches as understood by git-branch since I manage the tags and history myself, and I don't even have a working directory, but at least gitk seems to be able to display them correctly.  I haven't used it much and will no doubt need further polishing, but the basics are there.

The thing that makes me nervous about this is the potential for errors.  Within the app itself I don't worry about change deltas except where I have to.  All cmds directly modify the state and when deltas are needed (to either update the UI or determine what needs to be rederived), a diff is computed on the spot for that particular purpose.  This probably hurts some performance-wise over recording all edits as diffs directly, but I have a feeling it's saved me from lots of buggy hassle.  I have this feeling because I actually do record diffs for event updates, because I don't want to compare every single event to look for diffs, and those updates have been a big hassle with e.g. undo and redo.

The thing is, git save winds up being deltas again, since I don't want to save the entire score every time someone adds a single note.  The same goes for loading old states, undo wouldn't be very usable if every tiny undo required reloading and rederiving everytihng.  But this means when I save a git state it's vitally important that the code that computes the deltas to commit is working with the same previous state as the one denotated by "parent commit" hash entry in the commit.  Otherwise, I'll happily and quietly save deltas that will not take you from the previous state to the next one accurately, and you'll only find out when you try to undo or redo past that bit.  Save files are the most critical for correctness, so they shouldn't be the most dangerous!

I've fixed a few bugs, but at one point I remember seeing a complaint in the logs about wanting to dump state for a view which had been closed, so the bugs aren't fully worked out yet.

That actually indicates a further complication with undo: it's confusing when undo open and closes windows, so while I definitely need to store the window state so I can restore it, undo should probably carry over the current window state, unless it has to close one because the underlying block disappeared.  So if you undo back to a previous state, you're not at the same state you used to be at because the windows are different.  Then if you make changes and it wants to save them as diffs against the previous state, but since the real state and the state on disk differ maybe you get bogus diffs saved to disk.  Ugh.

Given that I like the feature and want to keep it, how can I make myself trust it?  The first thing is that I still save the complete state on explicit checkpoints.  So if things are screwed up, at least I still have what I had without the feature, which is plain old save like most apps have.  I even have a bit better, since I keep all the explicit checkpoints, so it's like one of those systems that backs up the old save instead of overwriting.

Secondly, it seems like I should be able to take the quickcheck approach and throw a giant pile of automatically generated changes at it, then verify that loading the initial state and redoing until the end will wind up with the same final state.  Of course since undo has the ability to mess this up, it needs to include undoes as well.  This is not totally trivial because I have to drive it through the responder loop to ensure that the tests are doing exactly the same thing as a real user, and I'm not totally clear how to make sure I'm getting a broad sampling in the state space, but it should be possible.

Thirdly, of course the most desirable approach is to eliminate or simplify the error-prone parts.  The underlying problem is that I have a state in memory and a state on disk, the state in memory changes so I apply diffs to the state on disk, trusting that firstly the diffs result in the final memory state and secondly that the first disk state is equal to the first memory state.  I could eliminate both at once by saving the entire state, but I think it would be too slow.  Other than that, I can't think of a way to eliminate one or the other separately, but maybe I'll think of something later.

One other little detail is that Git has support for diff compression, but I'm pretty sure it only works for text.  Initially I was worried about the overhead of saving a whole track every time an event on it changed, so I implemented a scheme for saving events in individual files.  But when I actually tested it, it was actually slower than just dumping the whole track, and it was a bit complicated, so I left it out.  Originally the problem was worse because I having to save the whole track on every single keystroke, but I decided I should come up with a wait ot make undo coarser-grained.  To that end, there's a concept of "suppressed" cmds.  These are cmds that, if a bunch of them happen in a row, are collapsed into a single undo action.  The intent was to collapse a series of "add a letter" cmds in raw edit mode.  That wound up being too unpredictable since an intervening cmd would split the undo, so the current implementation suppresses until the edit mode changes.  This is analogous to vim, which waits until you leave insert mode to record an undo.  I'm not totally sure I like it because it's easy to make a lot of separate changes without changing input mode and then an undo undoes too much.

This also added quite a bit of buggy complexity because it has to save a "pre-commit" state, and when it becomes unsuppressed, save both the suppressed undo and the current one.  Part of the bugginess was that undo, like all time travel, is confusing.

It also meant I wound up needing names for cmds, which I had wanted to do anyway.  One thing leads to another.

There seems to be an overall tension between thinking about states "functionally", simply as a series of states, and thinking about them "imperatively", as a series of transformations.  The functional way is much less error-prone, but I always wind up wanting the transformations back out.  This happened with integrate merge as well.  A kind of consolation is that each time I want a different kind of diff.  So perhaps the decision to store state functionally and derive the diffs as needed was the correct one.

    Theory

Not Theory3 anymore since I finally stopped doing rewrites and got rid of the old hacky Derive.Scale.Theory.  It turns out getting sharps and flats and enharmonics right in a generalized system is complicated!  I say generalized because I wanted to use the same system to support standard church modes along with more exotic scales like hijaz (exotic because it uses sharps and flats in the same key signature) or octatonic (exotic because with eight scale degrees, there is no 1:1 correspondence between scale degrees and note letters), and yet more exotic scales on non-piano layouts like octatonic spelled A-H or Bohlen-Pierce, which can have 9 note modes of a 13 note scale that repeats at a twelfth, not an octave.  Wacky stuff.  Hipster Composer says just scales that repeat at the octave are too mainstream.

Actually it turned out "plain" octatonic or whole tone was the most problematic because the scale degrees don't line up with letters, so it's hard to figure out how spelling should work.

I wound up implementing a Just scale as well, that can be retuned based on a base frequency signal.  But since it's diatonic at the moment I didn't wind up using any of the Theory stuff, so now I have three ways to implement scales: Derive.Scale.Util for simple diatonic scales tuned to absolute but non-repeating frequencies, Derive.Scale.TwelveUtil for complex scales that use letters and keys and chromatics and enharmonics, and Derive.Scale.Just which uses letters, but is diatonic, and figures out frequencies on the fly based on key and a set of ratios.  It seems like it would be nice to have a more unified situation here.

    TrackNum vs. TrackId

Rather than following through with the headache of trying to eliminate all TrackIds, I came up with a much easier and practical idea.  Actually, I was coming up with reasons to have the same TrackId on different blocks, namely that they could be used to present a certain view of a score.  For instance, in a large score with many instruments you might want to put the winds and strings together and edit them alone.  If the block contains the same tracks from the complete score, any edits are reflected in both places automatically.  I'm not totally sure this will actually turn out to be a practical way of doing things, but it doesn't seem totally far-fetched that the ability could be useful someday.

Since TrackNum is only ambiguous if the same TrackId occurs more than once in a single block, I can just ban that specific case, but otherwise leave TrackIds alone.

So the easy solution was to add a precondition to the UI State, and enforce it in the function that adds a new track.  Done.  Now I can convert TrackIds to TrackNums and the awkwardness of two non-interchangeable forms of address is mostly gone.

That brought to my attention that State has a number of invariants but no particularly strong protection for them.  If you use only the monadic functions you'll probably maintain them, but you can also go directly modify the State and break them all you want.  I decided to get a little more principled by only exporting the higher level functions, which basically means not exporting State.put.

Unfortunately, that theory immediately started to show its holes.  Firstly, loading a saved state needs State.put.  But I can export put as long as I make it check the invariants.  So it's now an expensive operation, but is at least available.  But then what about undo?  It *should* be safe, since the states have all occured in the past, so they're probably not just random garbage, and undo shouldn't have to do a full consistency check each time you want to undo a single note insert.  So I added 'unsafe_put' which doesn't do the check.

I already had to export the unsafe put because I discovered that if Cmd.M is to implement State.M so I can use State functions unlifted, it needs to implement the State methods, namely 'put'.  So either I cram the 1000 line Cmd module into the 1500 State module (fat chance) or just export unsafe_put.  I guess it's just part of monads not really being modular.

I actually only implemented part of the verification and left the rest for later.  I'm not really convinced it's a big deal.  It doesn't have to be impossible to make a mistake, it just has to not be easy to make one.  I'm protecting myself against mistakes, not against someone intentionally trying to break things.

    Memory Hog

Yaa... probably still hoggy.  I did a few half-hearted heap traces and the heap showed reasonable numbers so maybe the 'ps' numbers are way more than actual data.  I think there's at least two times for copying GC overhead, and then all kinds of libraries and maybe pinned memory that's not counted by the heap profiler.  I'll wait until I can upgrade to ghc 7.6 and its better profiling before fussing with this again.


I'm writing this at 法鼓山 while Hui-ju has a meeting.  A few years ago I was here also, writing out the plan for how to implement caching.  It seems long ago and I suppose it was, since then we've been a year and a half married.  I should find the original cache plan and put it on the blog at some point.