Monday, July 11, 2011

What haskell doesn't have

(but it does have fun!)


At one point, someone on buzz was advocating haskell, and he pulled out the usual "look how elegant the fibonacci function looks!" thing.  Someone else was doing the "the point must be productivity, and you can't prove it adds more productivity" thing.  I think the point is fun, not productivity.  I wrote a big long response, but then felt like getting involved in a big language discussion on work time was kind of a waste of time, so I didn't bother posting it.  And it was mostly going to be a self-indulgent preaching-to-the-choir thing anyway.  But then I thought, well, as long as I've written it I might as well put it somewhere.  And I don't believe I've noticed anyone saying it in exactly this way before.  Anyway, here it is:


I know people like to pull out fibonacci and quicksort, but they don't seem all that compelling to me.

Instead:

Think about the whole thing with iterators or generators or whatever else that so many imperative languages have.  That's all gone.
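
For instance, a rough sketch of what I mean (the map and its contents are made up): laziness means the keys are produced only as they're consumed, so there's no separate iterator or generator protocol to learn.

    import qualified Data.Map as Map

    -- Map.keys builds its list lazily, so only the first five keys
    -- are ever forced, however big the map is.
    firstFewKeys :: Map.Map String Int -> [String]
    firstFewKeys m = take 5 (Map.keys m)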

Select vs. threads vs. coroutines is also gone, but that's just a ghc feature.  All that flap about asynchronous events and event based programming?  Gone.
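
Instead you just fork cheap threads and write straight-line blocking code, and ghc multiplexes them for you.  A minimal sketch (the worker is made up):

    import Control.Concurrent (forkIO, threadDelay)

    main :: IO ()
    main = do
        _ <- forkIO (worker "one")
        _ <- forkIO (worker "two")
        threadDelay 100000  -- give the forked threads a moment to run
      where
        worker name = putStrLn ("hello from " ++ name)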

Think about that whole null pointer thing that all these imperative languages have.  Null pointer exceptions?  Gone!
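
What you get instead is Maybe, and the compiler insists you handle the missing case before you can touch the value.  A tiny sketch using the Prelude's lookup:

    greeting :: [(String, String)] -> String
    greeting names = case lookup "bob" names of
        Nothing   -> "who?"              -- the "null" case, handled up front
        Just name -> "hello " ++ name    -- only here do we have a real value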

Think about that whole thing with reference vs. values.  That's gone.  Or think about the thing with identity vs. equality vs. deep equality, that's gone too.

All that stuff about copy constructors or equals methods or all the 'this.x = x' constructor garbage, all that boring boilerplate is gone.  No more writing toString()s or __str__()s by hand, you can get a useful one for free.
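
For instance (made-up type), one deriving clause stands in for the whole equals/hashCode/toString/constructor pile:

    data Point = Point { px :: Double, py :: Double }
        deriving (Eq, Ord, Show)
    -- Eq is structural equality, Show is a free toString-ish printer,
    -- and the record syntax means no 'this.x = x' constructor body.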

Or how about all that stuff about "const" or "final" or "const correctness" and separate const and non-const methods, that all just goes away and good riddance.

Type casts are gone too.

All that hairy generics stuff is vastly simplified without subtyping.  Contravariance vs. covariance, sub/super constraints on generics, multiple vs. single inheritance, the whole is-a vs. has-a thing is gone.  Yes it means there's no dynamic dispatch, but I don't miss it much.  If you want to parameterize on an operation, just pass the operation as a parameter, instead of passing an object that dynamically dispatches on one of several operations.
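
Concretely, something like this sketch (names invented): instead of an interface with a virtual method, the operation is just an argument.

    report :: (Double -> String) -> [Double] -> String
    report format xs = unlines (map format xs)

    asPercent :: Double -> String
    asPercent x = show (x * 100) ++ "%"

    -- report asPercent [0.1, 0.25] chooses the operation at the call site;
    -- no class hierarchy, no dynamic dispatch needed.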

Think about all the mandatory type declarations those not-python languages have.  That's gone.  Or maybe think about how slow and single-threaded and hard to deploy and runtime-error-loving python is, that's gone too.

That whole thing about control structures being built-in syntax and different from functions?  That's gone.  It's hard to quantify the difference it makes because you stop thinking about the difference between them and just structure your code as feels most natural.  Like the whole order of definition thing (that's gone too), it's just one more restriction removed.  The whole statement vs. expression thing is gone too.
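
For example, here's a made-up 'retry' control structure; it's just an ordinary function, not special syntax:

    retry :: Int -> IO (Maybe a) -> IO (Maybe a)
    retry 0 _      = return Nothing
    retry n action = do
        result <- action
        case result of
            Just x  -> return (Just x)
            Nothing -> retry (n - 1) action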

That lengthy edit, compile, wait, debug cycle is mostly gone too.  It usually takes less than a second to reload a module and start running the code you wrote one second ago.


And of course, this is just a list of the things that are gone.  There's an equally large list of new things that are there, but it might be hard to see how those are useful from the outside.

I have a project with more than 200 source files, each one defines from 5 to 20-ish data types.  If it were java or c++, with their heavyweight class declarations, each one would be a directory with 5 to 20 files in it, about one per data type.  No one is going to go to all that work, so the result is that you create types for the big abstractions and pass the rest around as ints and doubles and strings, or Maps and Lists.  The result is undescriptive signatures, search and replace expeditions when a type changes, reusing "almost right" types that results in extra "impossible" error cases to deal with, mixed up values, and of course runtime type errors.
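
In haskell, by contrast, a real type costs about a line, so you actually make them (made-up domain below):

    newtype TrackId = TrackId Int  deriving (Eq, Ord, Show)
    newtype Bpm     = Bpm Double   deriving (Eq, Ord, Show)
    data TimeRange  = TimeRange { rangeStart, rangeEnd :: Double }
        deriving (Eq, Show)
    -- Cheap enough that a bare Double never gets passed where a Bpm belongs,
    -- and the signatures actually describe what they take.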

Similarly, when defining a new function requires a minimum of 3 lines of boilerplate and some redundant type declarations that need to be updated whenever your types change, and are maybe defined half a screen away, you create a lot fewer functions, and factor less.  If it's too annoying to do, no one will do it.  And if they do, they will do it because they feel obligated to do the right thing, even if it's tedious.  Not because they're having a good time.
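
Whereas in haskell a helper is one line in a where clause with its type inferred, so factoring out the little pieces stays cheap (made-up example):

    total :: [Double] -> Double
    total prices = sum (map withTax prices)
      where withTax p = p * 1.2   -- one line, no separate declaration to keep in sync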

To me, the main compelling thing is *fun*.  All of that stuff which is gone is mostly boring bookkeeping crap.  I never get excited thinking about putting in new 'const' declarations or equals() or toString() methods.  I don't look forward to all the TypeErrors and NameErrors and ValueErrors I might face today, or waiting for the application to relink.  Programming is more fun without that stuff.

11 comments:

  1. So when is your burrito de nuclear waste in a space suit monad tutorial coming out? :)

  2. This article really pushes my buttons about learning Haskell (I've tried in the past, see: http://stackoverflow.com/questions/3884121/haskell-function-application-and-currying). But it would be so much better if you could elaborate/substantiate each of your claims with examples: for each of the "gone" things, "how it is in Python/C/whatever" vs "how it is 'gone' in Haskell". As it is, your claims appear interesting, but what do they really look like when put to the test?

    In other words, I am asking for an elaboration sequel to this post :-)

  3. All of those things that disappear when you use haskell also disappear when you start selling ice cream from an ice cream van. They're not so much plus points of haskell.

    All of those things have a certain function in a normal language. More interestingly, C++ has almost no pad words - each token is required to find out what a certain bit of code means. What in haskell takes the place of their function in a normal language?

  4. Think about that clear syntax and that readable code. That is gone too.

  5. simple syntax for accessing and updating arrays in place... gone.

  6. @ OP
    > Think about the whole thing with iterators or
    > generators or whatever else that so many
    > imperative languages have. That's all gone.

    Not quite; iterators are replaced by Foldable and Traversable, while generators are replaced by iteratees and enumerators. I'm not sure it's an improvement.

    > Type casts are gone too.

    Nope; see Data.Typeable and Data.Dynamic.

    > Yes it means there's no dynamic dispatch, but
    > I don't miss it much.

    Haskell has dynamic dispatch, via classes. I'm terrified just thinking of what a 200-module Haskell project with no classes would look like.

    > That whole thing about control structures
    > being built-in syntax and different from
    > functions? That's gone.

    'if' and 'case' ring a bell? These are non-function control structures.

    @ Candy
    > All of those things have a certain function
    > in a normal language. More interestingly, C++
    > has almost no pad words - each token is
    > required to find out what a certain bit of
    > code means. What in haskell takes the place
    > of their function in a normal language?

    Careful with that phrase "normal language", it's loaded!

    C++ has a notoriously complex syntax, so much so that it's famously difficult to write a correct C++ parser. Modern languages like Python and Haskell are designed to have very simple and regular syntax, so there are far fewer magic symbols floating around.

    @dT/dZ
    > Think about that clear syntax and that
    > readable code. That is gone too.

    Haskell has much clearer syntax, and more readable code, than either C++ or Java. There's not as much magic line noise like (**i[(*(j->x))++] &= ~(foo | bar)) flying around.

    @Anon0AnALY5e
    > simple syntax for accessing and updating
    > arrays in place... gone.

    Eh? Updating arrays is quite simple:

    array <- mallocArray 2 :: IO (Ptr Word8)
    pokeElemOff array 0 0x20
    pokeElemOff array 1 0x30

  7. Thanassis:

    WRT learning haskell, keep on plugging! It'll be fun when you get there.
    Don't mind too much about point free style, I've had exactly the same confusion
    as you with (.) and two argument functions. In fact, part of the point of my
    post was to not mind about all those fancy new features like point-free and
    typeclasses, and just use plain code to write real programs, and enjoy the
    complexities which have been removed rather than added. The fancy features
    will gradually start to creep in once you have some context for where they
    would be useful.

    Exactly the same advice I'd have for someone wanting to learn OO, incidentally
    :)

    > ... what do they really look like when put to the test?

    Well, there's no real "test", gone is gone. How they really look is like less
    code and more fun writing it :) How does GC look when put to the test? One
    less bit of tiresome busywork!

    Most of them are features imperative languages have to have in order to deal
    with pervasive mutation, features which purity renders unnecessary; but some
    are due to laziness or a smarter compiler.

    So iterators and generators are there to delay computation so you can
    interleave it with other computation. They work around the fact that 'for x in
    giant_map.keys()' wants to create the giant list of keys all at once. Laziness
    means that computation is delayed and interleaved automatically, no separate
    concept needed.

    Type declarations going away is just a smarter compiler; in fact C++ is adding
    a limited form of this with the 'auto' keyword. The coroutines stuff is
    just ghc's nifty multiplexed threads implementation, python could theoretically
    do that if it weren't mired down in the GIL. And Go is starting to do the same
    thing.

    The deeper stuff is reference vs. value. Pointers and references are
    introduced in C as a way to efficiently pass data around without copying it.
    But the only reason you wanted to copy it in the first place was that you were
    afraid someone might mutate it. That knocks out copy constructors and
    initializing objects piecemeal (and therefore incompletely constructed objects)
    and a whole bunch of other bits of busywork.

    Oh, and I forgot to mention a big one: locks! Locks are gone!

    Candy:

    I don't get the ice cream metaphor, but if you're saying that every word in
    c++ is necessary, then you might start in a simpler place and look up the java
    equivalent of 'delete x'. Yes, there's a neverending debate about GC as well,
    but in my daily work I almost never think "gosh I miss manual memory
    management". And the one time I did, for a scheduler, I wrote it in C and
    called it from Haskell.

    Did I mention how much more fun the haskell FFI is than, say, swig?

    Anyway, my point was more about deeper things you have to *think* about
    and aren't fun, rather than syntactic things you have to type and aren't fun,
    though the two are interrelated.

    Anon0AnALY5e:

    Actually, array indexing and update is easy, you can write 'get a i' or
    'set a i x'. I've almost never wanted to do that though. My java code is
    full of ImmutableList.of(..) and Collections.unmodifiableList(..) anyway.

    What is awkward is record update. 'rec { record_field = x }' is a poor
    substitute for 'rec.field = x' and most importantly doesn't nest, so by the
    time you're writing

    let modify_b f b = b { b_f2 = f (b_f2 b) }
    in a { a_f1 = modify_b rejigger (a_f1 a) }

    ... you really start wishing for

    a.f1.f2 = rejigger a.f1.f2

    There are libraries that work around this, but no universal built-in solution.

  8. ("a.f1.f2 = rejigger a.f1.f2")

    Law of Demeter is a bad idea, eh?

  9. jmillikin:

    > Not quite; iterators are replaced by Foldable and Traversable, while
    > generators are replaced by iteratees and enumerators. I'm not sure it's an
    > improvement.

    Not in practice, at least not for me. I just reduce it to a list and then map
    over that. E.g. to iterate upwards from 'k' in a Map: 'Map.toList . snd .
    Map.split k'.

    My point was not really about all the complex things you *could* do, but what I
    actually wind up doing in practice for the majority of my time. I used
    Traversable.map for the first time in 12 years of writing haskell just the
    other day, and it was just to map over a fixed-length list, and that was just
    to make me feel better about a compiler warning. I've never used Foldable.
    Meanwhile, in C++, Java, and Python, iterators are ubiquitous. You can't go
    three paces without bumping into one, so therefore you can't use the language
    without learning about them.

    As far as iteratees etc., they're brand new. Probably no one was using them
    even 5 years ago, so you don't really need to learn them to use haskell. I
    think they're an interesting research project and I'd like to try one some day
    though.

    > Nope; see Data.Typeable and Data.Dynamic.

    Same point here. I've never used Dynamic, though I may some day. It's a
    specialized feature that's useful in specialized places. I don't know about
    the Java you look at, but the Java I look at is full of type casts. You can't
    write Java without learning about type casts. It must have been even worse
    pre-generics.

    For that matter, you can get the moral equivalent of pointers and mutation back
    in haskell if you want, even outside of IO and ST. And in fact I do. And then
    of course in that context all the "gone now" things come right back again.

    > Haskell has dynamic dispatch, via classes. I'm terrified just thinking of
    > what a 200-module Haskell project with no classes would look like.

    Hmm, I'm no expert on typeclasses so you may be right about this one, but what
    I meant was that given '(C a) => a ...' the type of 'a' must be determined at
    compile time, and thus the dictionary is also determined at compile time.
    That's what I would call static dispatch. Yes, I know, existentials, etc. etc.
    But as far as I'm concerned, it's another specialized tool for specialized
    places and I've never had a use for them.

    And I never said I didn't use typeclasses, though as it turns out, I don't
    really. A quick grep over 272 source files comes back with 10 class
    declarations. Of those, 3 are hacks to avoid having to write ugly class
    contexts everywhere (that mtl-1 -> mtl-2 thing removed the Functor superclass
    constraint), 1 is a custom CPS monad that is unnecessary but fun to write,
    1 is to work around Data.Serialize being untrustworthy in the long run, 1 is
    not used, 2 are for pretty printing and minor syntactic convenience, and only 2
    are what I would call really useful: one to do typechecking in a simple DSL,
    and one to project values in the DSL into the haskell type system and eliminate
    a whole bunch of 'unexpected X found in Y' runtime error checking.

    In both cases, the useful ones are used in highly localized parts of the code.

    Typeclasses are another one of those things I think people get all excited
    about, but don't actually factor much into day to day work, except of course
    that you use basic ones like Ord and Num all the time. Most of my time is
    writing programs, not libraries.

    I know other people have different styles, and I could probably do fancier
    stuff, and maybe someday I will. But I like concrete monomorphic code.


    I get that sense of terror when I look at something like, say, Text.Regex.Base
    and the (=~) operator. No disrespect to the author since it's a fine library,
    but I wound up writing my own wrapper module with concrete functions because I
    got lost every time I tried to look at the haddock for it.

  10. rdm:

    > Law of Demeter is a bad idea, eh?

    No, it's a good idea :) I'm just lazy and will always write the short version unless I have a specific reason to think it'll change. In practice I often wind up being a good Demeter citizen because haskell makes me write all these modify functions.

    That said, in haskell 'a.b.c' and 'a.b.c := e' would be composed function calls, so when I move a field I just make a new function with the old name as an alias and Demeter is happy. As it is, that already applies to record access, but sadly not if you use the update syntax. Lenses solve these problems, but as I said they're not universal or built-in.

  11. This article expresses exactly how I feel about Haskell!
