Comments on "Bits and Bytes: binary 0.7"

fmaste (2013-05-03 22:31):
Thanks for such a complete answer; it seems you have tried all the possibilities. I'll look at them, especially the arrows tutorial.

Lennart Kolmodin (2013-05-03 22:00):
It's not possible to build a monadic parser that can statically determine how much input it will require, so the binary package is not able to do this. It does not matter whether it's a deep or shallow embedding; it simply can't be done with a monad. To statically determine the required input, I've found the best approach is to use arrows. For the limitations of monads, and why arrows can solve this, here's a great tutorial I found: http://ertes.de/new/tutorials/arrows.html

You can find my binary-arrow experiment here:
https://github.com/kolmodin/binary-arrow

However, it does not always result in a great speedup: allocation seems to be more expensive than boundary checks, so there is little to save by eliminating the boundary checks. Also, the way I wrote the arrow library, recursive parsers are not possible due to the very aggressive inlining it does. Next, I wanted to rewrite it in CPS in the hope that GHC can optimise it better, but I have not yet come around to doing it. If you come to a different conclusion about the efficiency of statically knowing more about the parser, I'm of course interested to hear :)

fmaste (2013-05-03 20:45):
Do you think that knowing in advance how many bytes a monadic expression like {getWord8 >> getWord8 >> getWord8} needs would provide a significant speed improvement?

Is it possible to do this with the kind of deep embedding you have now?

If yes to the first question and no to the second, I can try building a wrapper package using the operational or free package to see how it works.

Lennart Kolmodin (2013-04-30 15:14):
'operational' looks like a deep embedding of a DSL. binary indeed does something similar. However, binary is shipped together with GHC, so I don't want to add dependencies that are not strictly required.

fmaste (2013-04-30 07:01):
Have you thought of using the operational package to solve the problem of checking for sufficient input? You could reify the monad and obtain something like an AST of the parser, although I don't know about the performance of that solution.

Lennart Kolmodin (2013-03-03 20:50):
binary now uses continuation-passing style, from which GHC can generate efficient code. The old binary was a state monad.

binary-0.7 seems to be at least as fast as binary-0.5, never slower, and at times up to 5x faster (or even more in some extreme cases). Of course, to see a large speedup your code needs to be quite tuned already: if the biggest cost in your decoder is not the code from binary, then even binary-0.7 won't make it faster.

So, it seems the error reporting comes for free.

Andrea Vezzosi (2013-03-03 17:56):
How does the performance compare to binary-0.5? Is the better error reporting costing much?
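The point about monads versus arrows in this thread can be illustrated with a minimal sketch. This is not the binary-arrow API; all names here are hypothetical. The idea: pair each parser with a statically known byte count, so applicative-style composition adds the counts before any input is seen, and one bounds check up front covers the whole parser. A monadic bind cannot be given this shape, because the second parser is a function of the first parser's result, which only exists at run time.

```haskell
import qualified Data.ByteString as B
import Data.Word (Word8)

-- Hypothetical sketch: a parser whose input requirement is known
-- before it runs.
data P a = P { size :: Int, run :: B.ByteString -> a }

word8 :: P Word8
word8 = P 1 B.head

-- Applicative-style composition adds the sizes statically.
pair :: P a -> P b -> P (a, b)
pair (P n f) (P m g) = P (n + m) (\bs -> (f bs, g (B.drop n bs)))

-- One bounds check up front suffices for the whole parser.
parse :: P a -> B.ByteString -> Maybe a
parse p bs
  | B.length bs >= size p = Just (run p bs)
  | otherwise             = Nothing

-- Note there is no way to write (>>=) :: P a -> (a -> P b) -> P b
-- with a static size: the size of the second parser may depend on
-- the Word8 value the first one produces.
```

For example, `parse (pair word8 word8)` demands at least two bytes with a single length check, whereas the monadic `getWord8 >> getWord8` from the thread must check at each step (or be clever about it dynamically).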
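The continuation-passing style mentioned for binary-0.7 can also be sketched. This is a toy, not binary's actual internals, and every name is hypothetical: each decoding step takes the remaining input plus a failure and a success continuation, so error reporting is just a call to the failure continuation rather than extra bookkeeping threaded through a state monad.

```haskell
{-# LANGUAGE RankNTypes #-}
import           Control.Monad (ap)
import qualified Data.ByteString as B
import           Data.Word (Word8)

-- Hypothetical CPS decoder: steps pass the remaining input to a
-- success continuation, or an error message to a failure continuation.
newtype Get a = Get
  { unGet :: forall r.
             B.ByteString                -- remaining input
          -> (String -> r)               -- failure continuation
          -> (B.ByteString -> a -> r)    -- success continuation
          -> r
  }

instance Functor Get where
  fmap f (Get m) = Get $ \bs kf ks -> m bs kf (\bs' a -> ks bs' (f a))

instance Applicative Get where
  pure a = Get $ \bs _ ks -> ks bs a
  (<*>)  = ap

instance Monad Get where
  Get m >>= f = Get $ \bs kf ks ->
    m bs kf (\bs' a -> unGet (f a) bs' kf ks)

-- Read one byte, or report failure via the failure continuation.
getWord8 :: Get Word8
getWord8 = Get $ \bs kf ks ->
  case B.uncons bs of
    Nothing        -> kf "getWord8: not enough input"
    Just (w, rest) -> ks rest w

runGet :: Get a -> B.ByteString -> Either String a
runGet (Get m) bs = m bs Left (\_ a -> Right a)
```

Compared with a state monad that returns a `(state, result)` pair at every bind, this shape gives GHC straight-line calls to inline, which is consistent with the speedups and "free" error reporting described in the thread.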