tag:blogger.com,1999:blog-87616235681094550932024-03-07T23:21:11.246+03:00Bits and BytesTo Err is Human, to Arr is Pirate!Lennart Kolmodinhttp://www.blogger.com/profile/11472900998358020886noreply@blogger.comBlogger15125tag:blogger.com,1999:blog-8761623568109455093.post-65852231922874660712013-03-03T13:56:00.000+04:002013-03-03T13:56:20.353+04:00binary 0.7<span style="font-family: Courier New, Courier, monospace;">binary 0.7 </span>has just been released and compared to the current version included in the Haskell Platform, <span style="font-family: Courier New, Courier, monospace;">binary 0.5.1.0,</span> it brings a few new things. Most of this has also been available in the <span style="font-family: Courier New, Courier, monospace;">0.6.*</span> versions which already have been available on hackage for some time.<br />
<br />
<h4>
Incremental interface and improved error reporting</h4>
<div>
This is probably the most requested feature, and was introduced already in <span style="font-family: Courier New, Courier, monospace;">binary 0.6</span>. <br />
In older versions of <span style="font-family: Courier New, Courier, monospace;">binary</span>, you could run the Get monad with the following function:<br />
<pre><span class="hs-definition">runGet</span> <span class="hs-keyglyph" style="color: red;">::</span> <span class="hs-conid">Get</span> <span class="hs-varid">a</span> <span class="hs-keyglyph" style="color: red;">-></span> <span class="hs-conid">Lazy</span><span class="hs-varop">.</span><span class="hs-conid">ByteString</span> <span class="hs-keyglyph" style="color: red;">-></span> <span class="hs-varid">a</span>
</pre>
<div>
It takes the monad you want to execute, together a with a lazy <span style="font-family: Courier New, Courier, monospace;">ByteString</span> containing all the input it will need to finish. While this is straight forward and easy to use, it does not meet all needs:
<br />
<ol>
<li>There is no error reporting (compare with a function returning <span style="font-family: Courier New, Courier, monospace;">Maybe a,</span> or <span style="font-family: Courier New, Courier, monospace;">Either ErrorMessage a</span>). If a decoder error happens, it'll just call <span style="font-family: Courier New, Courier, monospace;"><a href="http://hackage.haskell.org/packages/archive/base/4.6.0.1/doc/html/Prelude.html#v:error" target="_blank">error :: String -> a</a></span>.</li>
<li>What if you don't have all input when you start to decode, like for example if you're reading from a large file or a socket? Previously, lazy IO has been used to address this, but we will look into why this is not always suitable.</li>
</ol>
To address (1), we introduce the data type <span style="font-family: Courier New, Courier, monospace;">Decoder</span> as the return type for our new run function.
<br />
<pre><span class="hs-comment" style="color: green;">-- | Incremental interface to run a decoder.</span>
<span class="hs-definition">runGetIncremental</span> <span class="hs-keyglyph" style="color: red;">::</span> <span class="hs-conid">Get</span> <span class="hs-varid">a</span> <span class="hs-keyglyph" style="color: red;">-></span> <span class="hs-conid">Decoder</span> <span class="hs-varid">a</span></pre>
A <span style="font-family: Courier New, Courier, monospace;">Decoder</span> can have successfully finished in which case it'll be the <span style="font-family: Courier New, Courier, monospace;">Done</span> constructor and have the final value, or have encountered a problem, in which case it'll be the <span style="font-family: Courier New, Courier, monospace;">Fail</span> constructor together with the error message. In both cases, you will also get the remaining unused input, and the number of bytes consumed when the decoder succeeded or failed. In the next section, we will see why the this new run function does not need input as a second argument.<br />
<br />
For (2), the traditional answer has been to use lazy I/O to get a lazy stream of data, which will hide that I/O is going on, and we can pretend that we already have all the input in memory. While this is fine for some scenarios, we have given up control over error handling and file handle resources. <a href="http://stackoverflow.com/questions/5892653/whats-so-bad-about-lazy-i-o" target="_blank">For some applications, this is simply not good enough.</a><br />
<br />
Here's where the incremental interface can help. It works in the same way in <span style="font-family: Courier New, Courier, monospace;">binary</span> as seen in other libraries for parsing. Instead of taking any input, it immediately produces a <span style="font-family: Courier New, Courier, monospace;">Decoder</span><span style="font-family: inherit;">. In addition to </span><span style="font-family: Courier New, Courier, monospace;">Done</span><span style="font-family: inherit;"> and </span><span style="font-family: Courier New, Courier, monospace;">Fail</span><span style="font-family: inherit;"> mentioned above, it also has the </span><span style="font-family: Courier New, Courier, monospace;">Partial</span><span style="font-family: inherit;"> constructor, meaning that the </span><span style="font-family: Courier New, Courier, monospace;">Decoder</span><span style="font-family: inherit;"> needs more input in order to finish.</span><br />
<pre><span class="hs-comment" style="color: green;">-- | A decoder produced by running a 'Get' monad.</span>
<span class="hs-keyword" style="color: blue;">data</span> <span class="hs-conid">Decoder</span> <span class="hs-varid">a</span>
<span class="hs-keyglyph" style="color: red;"> = </span><span class="hs-conid">Done</span> <span class="hs-varop">!</span><span class="hs-conid">B</span><span class="hs-varop">.</span><span class="hs-conid">ByteString</span> <span class="hs-varop">!</span><span class="hs-conid">ByteOffset</span> <span class="hs-varid">a</span>
<span class="hs-comment" style="color: green;">-- ^ The decoder has successfully finished. Except for the</span>
<span class="hs-comment" style="color: green;">-- output value you also get any unused input as well as the</span>
<span class="hs-conid"> <span class="hs-comment" style="color: green;">-- number of bytes consumed.</span></span>
<span class="hs-conid"> <span class="hs-keyglyph" style="color: red;">|</span> Fail</span> <span class="hs-varop">!</span><span class="hs-conid">B</span><span class="hs-varop">.</span><span class="hs-conid">ByteString</span> <span class="hs-varop">!</span><span class="hs-conid">ByteOffset</span> <span class="hs-conid">String</span>
<span class="hs-comment" style="color: green;">-- ^ The decoder ran into an error. The decoder either used</span>
<span class="hs-comment" style="color: green;">-- 'fail' or was not provided enough input. Contains any</span>
<span class="hs-comment" style="color: green;">-- unconsumed input and the number of bytes consumed.</span>
<span class="hs-keyglyph" style="color: red;">|</span> <span class="hs-conid">Partial</span> <span class="hs-layout" style="color: red;">(</span><span class="hs-conid">Maybe</span> <span class="hs-conid">B</span><span class="hs-varop">.</span><span class="hs-conid">ByteString</span> <span class="hs-keyglyph" style="color: red;">-></span> <span class="hs-conid">Decoder</span> <span class="hs-varid">a</span><span class="hs-layout" style="color: red;">)</span>
<span class="hs-comment" style="color: green;">-- ^ The decoder has consumed the available input and needs</span>
<span class="hs-comment" style="color: green;">-- more to continue. Provide 'Just' if more input is available</span>
<span class="hs-comment" style="color: green;">-- and 'Nothing' otherwise, and you will get a new 'Decoder'.</span></pre>
</div>
<br />
<h4>
GHC Generics</h4>
<div>
<span style="font-family: Courier New, Courier, monospace;">binary</span> has also got support for automatically generating instances of the Binary class for your data types, using GHC's generics.</div>
<pre><span style="color: blue; font-style: italic;">{-# LANGUAGE DeriveGeneric #-}</span>
<span style="color: green;">import</span> Data<span style="color: red;">.</span>Binary
<span style="color: green;">import</span> GHC<span style="color: red;">.</span>Generics <span style="color: red;">(</span>Generic<span style="color: red;">)</span>
<span style="color: blue;">data</span> Foo <span style="color: red;">=</span> Foo
<span style="color: green;">deriving</span> <span style="color: red;">(</span>Generic<span style="color: red;">)</span>
<span style="color: green;">-- GHC will automatically fill out the instance</span>
<span style="color: blue;">instance</span> Binary Foo</pre>
The was committed to binary by Bryan O'Sullivan, though pieces copied from the cereal package where it was written by Bas van Dijk.<br />
<div>
<br /></div>
<h4>
Control.Alternative</h4>
<span style="font-family: Courier New, Courier, monospace;">binary</span> now implements the <a href="http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Applicative.html#t:Alternative" target="_blank">Alternative class</a>, and you can use the <span style="font-family: Courier New, Courier, monospace;"><|></span> combinator to try a decoder and fallback to another if the first failed. Although many binary formats don't strictly require backtracking in the decoder, it can be used for convenience or if it leads to code which is easier to read. In the example below, we parse frames in some binary format, trying first one kind of frames, then the other. If it's neither kind, we fail altogether.<br />
<br />
<pre><span style="color: blue;">data</span> Frame <span style="color: red;">=</span> FrameA <span style="color: blue; font-style: italic;">{- some fields specific to A frames -}</span>
<span style="color: red;">|</span> FrameB <span style="color: blue; font-style: italic;">{- some fields specific to B frames -}</span>
decodeFrame <span style="color: red;">::</span> Get Frame
decodeFrame <span style="color: red;">=</span> decodeFrameA <span style="color: blue;"><|></span> decodeFrameB <span style="color: blue;"><|></span> fail <span style="color: #b45f06;">"not a frame!"</span>
decodeFrameA :: Get Frame
decodeFrameA = <span style="color: red;">do</span>
word <span style="color: red;"><-</span> getWord8
unless <span style="color: red;">(</span>isFrameAMagicWord word<span style="color: red;">)</span> <span style="color: red;">$</span> fail <span style="color: #b45f06;">"not an A-frame!"</span>
<span style="color: blue; font-style: italic;">{- continue decoding... -}</span>
return (FrameA {..})</pre>
<pre></pre>
<pre></pre>
<pre>decodeFrameB :: Get Frame
decodeFrameB = <span style="color: red;">do</span>
word <span style="color: red;"><-</span> getWord8
unless <span style="color: red;">(</span>isFrameBMagicWord word<span style="color: red;">)</span> <span style="color: red;">$</span> fail <span style="color: #b45f06;">"not an B-frame!"</span>
<span style="color: blue; font-style: italic;">{- continue decoding... -}</span>
return <span style="color: red;">(</span>FrameB {..}<span style="color: red;">)</span></pre>
<br />
In this particular example, though, it's not required to have backtracking, as we could have first decoded the first <span style="font-family: Courier New, Courier, monospace;">Word8</span>, and then chosen to continue with either <span style="font-family: Courier New, Courier, monospace;">FrameA</span> or <span style="font-family: Courier New, Courier, monospace;">FrameB</span>. This continues to be the better choice for performance and should be preferred when possible, but one can imagine a more complicated binary format where using <span style="font-family: Courier New, Courier, monospace;"><|></span> is required or simply leads to cleaner code.<br />
<br />
<h4>
Control.Applicative</h4>
<div>
For each use of a primitive decoder, such as <span style="font-family: Courier New, Courier, monospace;">getWord8</span>, <span style="font-family: Courier New, Courier, monospace;">getWord32le</span>, and so on, <span style="font-family: Courier New, Courier, monospace;">binary</span> needs to check whether there is sufficient input left, or if it needs to demand more input from the caller (by returning a <span style="font-family: Courier New, Courier, monospace;">Partial</span>, as discussed above).</div>
<div>
<br /></div>
<div>
<pre>decodeWordTuple :: Get (Word32, Word16)
decodeWordTuple = <span style="color: red;">do</span>
a <span style="color: red;"><-</span> getWord32le
b <span style="color: red;"><-</span> getWord16le
return <span style="color: red;">(</span>a,b<span style="color: red;">)</span>
</pre>
<pre></pre>
</div>
<div>
<br />
Here, <span style="font-family: Courier New, Courier, monospace;">binary</span> will check twice whether there is enough input. First for the 4 bytes of the <span style="font-family: Courier New, Courier, monospace;">Word32</span>, and a second time whether there is 2 bytes left for the <span style="font-family: Courier New, Courier, monospace;">Word16</span>. Once could think this is wasteful, and that we should only check once at the beginning for the 6 bytes that we know will be required. Unfortunately, with a monadic API this is not possible (maybe possible with some new fancy kind of rewrite rules?).</div>
<div>
When using the <a href="http://hackage.haskell.org/packages/archive/base/latest/doc/html/Control-Applicative.html#t:Applicative" target="_blank">applicative class</a> though, <span style="font-family: Courier New, Courier, monospace;">binary</span> tries to be clever and join these checks.</div>
<div>
<br /></div>
<div>
<pre>decodeWordTuple :: Get (Word32, Word16)
decodeWordTuple = (,) <span style="color: red;"><$></span> getWord32le <span style="color: red;"><*></span> getWord16le
</pre>
</div>
<div>
<br /></div>
<div>
It does this by using GHC's rewrite rules to rewrite the expression into a single decoder. Note that for this to make a worthwhile difference to the performance of your application, your application needs to already be quite tuned. The rewriting also relies heavily on that all decoder functions gets inlined, and can therefore be a bit hard to trigger.</div>
<br />
<h4>
Backwards compatibility</h4>
<div>
I'm pleased to say that while the internals of binary has undergone major changes, it still largely provides the same API to the developer. A few convenience functions have been added, and a few functions that doesn't make sense any more have been removed. Read the haddocks for the <a href="http://hackage.haskell.org/package/binary-0.7.0.1" target="_blank">full API here</a>.</div>
<div>
<br /></div>
<pre><b>Data.Binary:</b>
<span style="background-color: #b6d7a8;">decodeFileOrFail :: Binary a => FilePath</span>
<span style="background-color: #b6d7a8;"> -> IO (Either (ByteOffset, String) a)
decodeOrFail :: Binary a =></span>
<span style="background-color: #b6d7a8;"> Lazy.ByteString </span><span style="background-color: #b6d7a8;">-> Either (Lazy.ByteString, ByteOffset, String)
(Lazy.ByteString, ByteOffset, a)</span>
<b>Data.Binary.Get:</b>
<span style="background-color: #b6d7a8;">runGetIncremental :: Get a -> Decoder a
runGetOrFail :: Get a -> Lazy.ByteString
-> Either (Lazy.ByteString, ByteOffset, String)
(Lazy.ByteString, ByteOffset, a)
pushChunk :: Decoder a -> ByteString -> Decoder a
pushChunks :: Decoder a -> Lazy.ByteString -> Decoder a</span>
<span style="background-color: #b6d7a8;">pushEndOfInput :: Decoder a -> Decoder a</span>
<span style="background-color: #f4cccc;">uncheckedLookAhead :: Int64 -> Get Lazy.ByteString
uncheckedSkip :: Int64 -> Get ()</span>
</pre>
<div>
<br />
The functions <span style="font-family: Courier New, Courier, monospace;">lookAhead</span> and <span style="font-family: Courier New, Courier, monospace;">lookAheadM</span> which disappeared in <span style="font-family: Courier New, Courier, monospace;">binary-0.6.*</span> and have been brought back in this release, making it easier to transition from <span style="font-family: Courier New, Courier, monospace;">binary-0.5.</span><br />
<br />
<h4>
Code</h4>
The package is available on <a href="http://hackage.haskell.org/package/binary" target="_blank">hackage</a>.<br />
The code is available on <a href="https://github.com/kolmodin/binary" target="_blank">github</a>.<br />
<br />
Contributors since the last Haskell Platform version:<br />
<br />
<ul>
<li>Lennart Kolmodin</li>
<li>Johan Tibell</li>
<li>Bryan O'Sullivan</li>
<li>Bas van Dijk</li>
<li>Paolo Capriotti</li>
<li>Eyal Lotem</li>
<li>Gabor Greif</li>
</ul>
</div>
</div>
Lennart Kolmodinhttp://www.blogger.com/profile/11472900998358020886noreply@blogger.com7Moskva, Ryssland55.749646 37.62368000000003655.749646 37.623680000000036 55.749646 37.623680000000036tag:blogger.com,1999:blog-8761623568109455093.post-81826764884123792172011-02-17T01:58:00.000+03:002011-02-17T02:14:51.663+03:00binary by the numbers<div id="new-home"><h1>New home</h1>
<p>I've taken the liberty to fork the <code>darcs</code> repo of <code>binary</code> and put it on github. It contains both the latest released version as well as the new experimental continuation based <code>Get</code> monad a branch called <code>cps</code>.</p>
<blockquote>
<p>git clone <a href="https://github.com/kolmodin/binary">git://github.com/kolmodin/binary</a></p>
</blockquote></div>
<div id="performance"><h1>Performance</h1>
<p>It's interesting to run the benchmark of <code>binary</code> on different architectures and with different versions of <code>GHC</code>. Although there recently has been work within the community with fast writing (<code>blaze-builder</code> comes to mind) I've mostly been working on how to read things fast.</p>
<p>The classic <code>binary</code> implementation of the <code>Get</code> monad is a state monad while the new experimental version is continuation based, so fundamentally different. They also perform differently. To produce the numbers below I ran the benchmark suite of <code>binary</code>. It reads either <code>Word8</code>, <code>Word16</code>, <code>Word32</code> or <code>Word64</code> in a (hopefully) tight loop and then presents how fast it could do it. For example, see this graph over performance in a 32bit environment;</p>
<img src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhIOK3YSfPpbx-6blofofB7_udKlUpgxoMkUBEz_5EDiCpexeGFWUu_4uwrpboGGT5vR76pz6kOv0t8M2Sd4qoT9r8NheDfSde-H3vpVE0Mhv6PfppXf94yZvUgTA4qcV9u_6o3hqUUwtsr/" />
<p>The nice news is that <code>GHC 7.0.1</code> always performs better than <code>GHC 6.12.3</code>. Also the experimental <code>cps</code> branch (the wide green line) is faster than the classic <code>master</code> branch.</p>
<p>Things seems to be going well in 32bit land. Let's have a look in a 64bit environment;</p>
<img src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgA-iQKCGPqPDH4ouK0gHmVEKa2c1Ch9HR2Wb90vGA0zLqpp4fYecHNdqvBLdfwWG-rm0iw-8cJ0XYR1avNwCAPnhGnyfVuaM-pfJJCi1a3GsadpmJyfSlOZYgjtZ_tPEwRuGwGgkuOLwlL/" />
<p>This gives a different picture. <code>GHC 7.0.1</code> still performs better than <code>GHC 6.12.3</code>, but we can also see that the <code>cps</code> branch can't keep up with the state monad based <code>master</code> branch (in contrast to when compiling for 32bits). Future work will include to figure out why, and how to fix it.</p>
<p>Lets have a look at how binary performs at writing too;</p>
<img src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjfjW8Az7W4XNXcMLb-a8y5ELqUT6DBUetaagFwORl-nk3HyjmpRmOgTqWcDqNmYcHiuETcBBKOmuAdEwcHRffttSzGx1DVCWa8j0OCLzk-SxMj-5MTWEvqhsm1XesgNfMSfNECQ3vsKQRo/" />
</div>
<div id="benchmark-environment"><h1>Benchmark Environment</h1>
<p>The tests have been performed on a Sandy Bridge CPU using GHCs native backend. I wanted to try the LLVM backend too, but unfortunately LLVM crashes when compiling the benchmark executable.</p></div>Lennart Kolmodinhttp://www.blogger.com/profile/11472900998358020886noreply@blogger.com9tag:blogger.com,1999:blog-8761623568109455093.post-32426680829649996702011-01-09T05:36:00.000+03:002011-01-08T20:37:35.464+03:00About binaryAbout a year ago I started hacking on a project to change the Get monad of the binary package. I've been working off and on since... :)
The change will allow us to incrementally provide the Get monad with input, as well as allow backtracking and a more graceful error handling.
The code was public, but got moved with the haskell.org infrastructure changes. It's now available again;
<pre class="source-code"><code>darcs get http://code.haskell.org/~kolmodin/code/binary/</code></pre>
Johan Tibell writes about the API update in <a href="http://blog.johantibell.com/2011/01/haskell-library-improvements-id-like-to.html">his recent blog post</a>.
Developers familiar with attoparsec code will largely be familiar with the new binary Get monad too, as it's been a heavy influence.
The type for the parse function would essentially be something like this:
<pre class="source-code"><code>data Result r =
Fail ByteString [ByteString] -- an error msg and a trace
| Partial (ByteString -> Result r) -- incremental input
| Done r -- finished!
</code></pre>
A few forks of binary tries to address this too, all in their own way. Currently I know of <a href="http://hackage.haskell.org/package/cereal">cereal</a> and <a href="http://hackage.haskell.org/package/binary-strict">binary-strict</a>.
<h2>Performance</h2>
When benchmarking cereal and the new binary package, cereal comes out on top, at the expense of not being able to consume the input in incremental chunks.
I couldn't find the benchmark suit for binary-strict.
The reason for cereal being faster, I think, is due to that its simpler code when having a simpler state, essentially only a single strict ByteString (the input).
In binary (and attoparsec) it's a bit more complicated, due to the incremental input (in combination with supporting MonadPlus):
<pre class="source-code"><code>data S =
S { input :: !B.ByteString -- the current input chunk
, next_input :: !B.ByteString -- saved input to be used when backtracking
, read_all :: !Bool -- have we requested all input available to parse?
} deriving Show
newtype Get a =
C { runCont :: forall r.
S ->
Failure r ->
Success a r ->
Result r }
type Failure r = S -> [String] -> String -> Result r
type Success a r = S -> a -> Result r
bindG :: Get a -> (a -> Get b) -> Get b
bindG (C c) f = C $ \st0 kf ks -> c st0 kf (\st1 a -> runCont (f a) st1 kf ks)
</code></pre>
Unfortunately, this results in bad performance.
I'm guessing that's it's because it's reconstructing the state value
(of type <code>S</code>), and the remaining <code>ByteString</code> input for each value consumed, thus a lot of allocations.
So, as an experiment, I manually unpacked <code>S</code>;
<pre class="source-code"><code>-- No longer using S
{-
data S = S { input :: !B.ByteString
, next_input :: !B.ByteString
, read_all :: !Bool
} deriving Show
-}
newtype Get a =
C { runCont :: forall r.
-- these three were part of S, now they are separate arguments
B.ByteString -> -- 1
B.ByteString -> -- 2
Bool -> -- 3
Failure r ->
Success a r ->
Result r }
type Failure r = B.ByteString -> B.ByteString -> Bool -> [String] ->String -> Result r
type Success a r = B.ByteString -> B.ByteString -> Bool -> a -> Result r
bindG :: Get a -> (a -> Get b) -> Get b
bindG (C c) f = C $ \inp next eof kf ks ->
c inp next eof kf
(\inp' next' eof' a -> runCont (f a) inp' next' eof' kf ks)
</code></pre>
With ghc-7, this yields a huge speed boost and reaches about half the speed of the old binary library (and ~30% faster than cereal). Unfortunately, I find the code is less readable and harder to maintain. Maybe it's worth it though.
I got a hint to see <a href="http://hackage.haskell.org/trac/ghc/ticket/1349">ghc ticket #1349</a>. Duncan Coutts <a href="http://haskell.1045720.n5.nabble.com/More-speed-please-td3182083.html">summarizes the issue</a>, this time about the Put monad. There are a lot of details and examples in those mails, suggesting an extension to GHC to control strictness in the arguments of higher order components. It'd allow us to write the prettier version, yet enjoying the nicer properties of the unpacked version. It's unlikely that we'll see the proposal implemented soon, though.
It seems there are four options;
<ul>
<li> Go with the manually unpacked code </li>
<li> Drop support for backtracking </li>
<li> Drop support for incremental input </li>
<li> Find something even better, you're all invited :)</li>
</ul>Lennart Kolmodinhttp://www.blogger.com/profile/11472900998358020886noreply@blogger.com4tag:blogger.com,1999:blog-8761623568109455093.post-81134451915511021572008-05-10T01:08:00.000+04:002008-05-10T01:48:46.695+04:00HCAR DeadlineIf you haven't followed the mailing lists lately, you might have missed that the HCAR deadline is today (10th May).
<blockquote><a href="http://www.haskell.org/communities/">The Haskell Communities & Activities Report</a> is a bi-annual overview of
the state of Haskell as well as Haskell-related projects over the
last, and possibly the upcoming 6 months.
[..]
If you are working on any project that is in some way related to Haskell, please write a short entry and submit it to the us. Even if the project is very small or unfinished or you think it is not important enough -- please reconsider and submit an entry anyway!</blockquote>
Get the full details <a href="http://www.haskell.org/pipermail/haskell/2008-April/020378.html">here</a>.
I'm much looking forward to the next edition!Lennart Kolmodinhttp://www.blogger.com/profile/11472900998358020886noreply@blogger.com0tag:blogger.com,1999:blog-8761623568109455093.post-47528468713585860862008-01-20T12:00:00.000+03:002008-01-20T02:58:28.946+03:00The Beginning of a New EraI still remember the event <a href="http://www.jobs-in-fp.org/">Jobs in Functional Programming</a>, held here in Göteborg, Sweden, the 14th December last year.
It was the first event of its kind, an event to attract hackers to employers, paying for you to hack in functional programming languages!
Not only were 8 companies represented in some way, 6 of them had people there explaining why you should go to their company and not someone else's :)
With over 100 people registered, and about as many as came, it was a joy.
<a href="http://www.cs.chalmers.se/%7Erjmh/">Professor John Hughes</a>, well known for his work with Haskell, was the host for the evening. John ended the event with a speech, I got reminded of it by <a href="http://article.gmane.org/gmane.comp.lang.haskell.general/15935">SPJ's mail</a>. Congratulations, btw! :)
They both conclude that Haskell has become much more popular, and that it has happened very fast. Five or ten years ago we would not have dreamed of having such an event as we did.
Some students found it humorous when John spoke very enthusiastically about Haskell's new won popularity. But think of it this way, twenty years ago few companies would make a fuzz about FP at all. Today C#, one of the most popular languages out there from one of the (the?) biggest computer companies out there, has lots of ideas from FP. There is also <a href="http://research.microsoft.com/fsharp/fsharp.aspx">F#</a>: Microsoft's newest addition to the .NET family, a somewhat more evolved version of OCaml, adjusted to fit into the .NET world. It's not difficult to make the prediction that Haskell will get even more attention in the near future.
The times are really changing.
As Hughes said in his speech,
this is the beginning of a new era.Lennart Kolmodinhttp://www.blogger.com/profile/11472900998358020886noreply@blogger.com1tag:blogger.com,1999:blog-8761623568109455093.post-85266137494876885212008-01-19T19:46:00.000+03:002008-01-19T21:48:47.642+03:00Wireless on a Thinkpad x61sI've just configured wireless networking on my laptop, so I'd like to share a little checklist to speed up the process for others in the same situation.
As you will see, it is that hard to get it going. It's as simple as setting the kernel modules, drivers, encryption and local settings. What took me the longest was actually to remember to allow my MAC address in the router... bummer :)
This guide is specific to Gentoo Linux, but it should not differ too much against other distributions. I also expect you are familiar with Gentoo and how to compile your kernel.
<code># lspci | grep Wireless
03:00.0 Network controller: Intel Corporation PRO/Wireless 4965 AG or AGN Network Connection (rev 61)</code>
First, we need to get the network interface working.
Enable the new network stack, in the kernel menuconfig, enable
<code>Networking -> Wireless -> Generic IEEE 802.11 Networking Stack (mac80211)</code>.
I used the currently latest stable kernel, <code>gentoo-sources-2.6.23-r3</code>
Also don't forget to compile the required crypto algorithms under Cryptographic API:
<code><M> Cryptographic algorithm manager
<M> CBC support
<M> ECB support
<M> AES cipher algorithms
<M> ARC4 cipher algorithm
<M> Michael MIC keyed digest algorithm
<M> PCBC support</code>
This will build modules called <code>mac80211, blkcipher, aes, arc4, ecb, cryptomgr and crypto_algapi</code>. Load them during boot in <code>/etc/modules.autoload.d/kernel-2.6</code>.
Also modprobe right away, so we can continue without rebooting. Add to <code>kernel-2.6</code> like above.
Install the network drivers; unmask and unhardmask <code>iwlwifi</code> and <code>iwl4965-ucode</code>, add <code>ipw4965</code> to your use flags, then run <code>emerge iwlwifi</code>
This will install the iwlwifi system and the firmware needed by your hardware.
Now run <code>modprobe iwl4965</code>. Check success/failure with <code>ifconfig -a</code> or <code>dmesg</code>, you should have a new interface available.
As soon as you've got the interface up and running, check your HW address. If you've got MAC filtering, don't forget to add it to your "Allow list". Better get that out of the way so we don't forget it...
I'm using wpa_supplicant to configure for my networks, <code>emerge wpa_supplicant</code>.
<code>/etc/conf.d/net:
----------------
#dhcp is default on all devices
modules=( "wpa_supplicant" )
wpa_supplicant_wlan0="-Dwext"
/etc/wpa_supplicant/wpa_supplicant.conf:
----------------------------------------
ctrl_interface=/var/run/wpa_supplicant
# Let any member of the group "wheel" configure the network
ctrl_interface_group=wheel
ap_scan=1
# Add network block to connect to unsecure networks.
# Giving it a low priority will make all other blocks preferred.
network={
key_mgmt=NONE
priority=-9999999
}
# Add a WPA2 network
network={
ssid="my ssid"
proto=WPA2
key_mgmt=WPA-PSK
psk="secret password"
}
</code>
More information about wpa_supplicant, and examples, are available in <code>/usr/share/doc/wpa_supplicant*/</code>.Lennart Kolmodinhttp://www.blogger.com/profile/11472900998358020886noreply@blogger.com0tag:blogger.com,1999:blog-8761623568109455093.post-49657704740362467642007-08-23T19:34:00.000+04:002007-08-23T20:10:41.606+04:00Crap mail clientAt work we're blessed with a wonderful mail client which we are suppose to use, by policy. And most people do. I try not to.
Maybe you've seen this signature in some peoples' mail:
<blockquote>A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?</blockquote>
and indeed they are right.
This particular mail client encourages top posting by doing two things.
The first one is that when you hit reply it formats the new mail in this manner:
<cursor here>
<your signature>
------<other persons name> wrote:------
<other persons mail body indented>
... where the irritating thing is that it places your signature above the other persons mail.
The second thing is that all mails are in HTML, and the client has some limitations in that regard: The text from the mail you're replying to gets indented in a way where you can't inline your replies without having your text indented too.
The interesting stuff happens when people in my organization think of how to workaround these limitations. Some use top posting, and forgets to reply to half of the message. The others use a feature that comes with HTML, coloring!
That is, the common way to reply is that you pick a previously not used color in that mail, and color your inlining with that.
When the mail gets replied for the seventh time there is indeed eight different colors used, and everything is indented seven steps. Yay.... Naturally there is no "standardization" of which color to pick next, so each conversation is colored in a different way. It can get quite tricky to follow. Also, after a while it gets hard to pick a color that is sufficiently different from other colors. Luckily I'm not color blind.
Just to make you realize how bad this really is, imagine this with your favorite indention sensitive programming language:
In your editor;
<ul><li>You can't indent.</li>
<li>When you hit the tab key you just get the color for the current indention level.</li>
<li>The mapping of "intention level" to color is different for each source file.</li>
</ul>
A very interesting thought.
The first person to implement this in the editor of their choice will get a postcard from Göteborg, by yours truly.Lennart Kolmodinhttp://www.blogger.com/profile/11472900998358020886noreply@blogger.com0tag:blogger.com,1999:blog-8761623568109455093.post-51172767525808399512007-06-01T11:53:00.000+04:002007-06-01T18:36:14.500+04:00Code vs. CoderI read a blog this morning, which got a few thoughts started (oh noes, not again!).
Rich Skrenta argued that <a href="http://www.skrenta.com/2007/05/code_is_our_enemy.html">code is our enemy</a> while Jeff Atwood replied that the code itself isn't the problem, but that <a href="http://www.codinghorror.com/blog/archives/000878.html">the real problem is what you see in the mirror</a>.
This reminded me of the good <a href="http://www.md.chalmers.se/%7Erjmh/">professor John Hughes</a> (who taught me Haskell 5 years ago!), who has along with <a href="http://www.md.chalmers.se/%7Ekoen/">Koen Classen</a> and others, been writing on <a href="http://www.cs.chalmers.se/%7Erjmh/QuickCheck/">QuickCheck</a> the past years. It has been rewritten a number of times, each time with more features but fewer LOCs (lines of codes).
Bearing in mind that <a href="http://en.wikipedia.org/wiki/Source_lines_of_code">counting LOCs is regarded as a measurement of productivity</a>, Hughes asks,
<blockquote>Have we had negative productivity?
We do more with less code!</blockquote>
Another example would be in the newly released <a href="http://xmonad.org/">xmonad 0.2</a> where <a href="http://cgi.cse.unsw.edu.au/%7Edons/blog/2007/05/17#xmonad_part1b_zipper">Don (dons) Stewart recently blogged about</a> how he with fewer lines of code did more by moving some of the logic into the data structures. He used a technique called zipper (references in the end of his blog entry) and could thus merge two data structures into one, eliminating the problem of keeping them concise. Easier to write the code and less risk of writing bugs. Did he have negative productivity too? I would have to say no.
Written code has to be maintained by someone, depending on what it was written to do. When GHC 6.6 was released the <a href="http://www.gentoo.org/proj/en/prog_lang/haskell/index.xml">Gentoo Haskell Herd</a> got busy as roughly half of the Haskell packages only compiled with GHC 6.4.2 (about 16 broken out of 40). Naturally we wanted them to work with GHC 6.6 too. It is hard to foresee what will change, no matter how you write the code, thus giving Skrenta a point. Code is the enemy.
<a href="http://www.haskell.org/%7Egentoo/gentoo-haskell/projects/GHC-6.5-failures.html">Reason for the broken packages</a> (<a href="http://www.haskell.org/%7Egentoo/gentoo-haskell/projects/GHC-6.6-failures.html">2</a>) mostly was (in no particular order)
<ul>
<li>deprecated modules and classes</li>
<li>functionality moved to other modules</li>
<li>changed type signatures</li>
<li>changes in the type system?</li>
<li>changes in the build system</li>
<li>UTF8 trouble</li>
</ul>
All this makes me extra jumpy when Erik Hellman, a Java Consultant in Göteborg, blogs about when <a href="http://jsolutions.se/?p=120">a 1400 lines game written in 50 min</a> (in Swedish) which was later presented at the JavaOne conference. 500 lines are auto generated getters/setters, along with 900 manually written lines. Assuming writing the manual lines took 40 min, he wrote a line about every three seconds non-stop. I hardly think though that I even could write 900 lines in 40 min even by heart. He agrees, and notes that this is not possible without an IDE, in which case he recommends IntelliJ IDEA 6.0.
Maybe Hellman is an extremely talented hacker and that his tools indeed are excellent, thus it's not possible to write the code in less than 1400 lines (in Java) and that it really should have taken some other programmer the whole day. Possibly the job has been solved and no one has to touch the code ever again. Personally though, I would be more interested to see how the game would evolve in 50 hours, days, or weeks. I guess it depends on what you want your app to do.
Still impressive to write a game like that in 50 min, but 1400 LOCs? Hellman replies by saying that <a href="http://jsolutions.se/?p=120#comment-867">we are on the same track</a>, and that we should see the other guy's code.
How much logic can you write in 40 min? The resulting massive amount of code does not seem to be a problem.
I guess that we just have different backgrounds.
I have not yet had a look at <a href="http://jsolutions.se/?p=145">the code</a>, which recently was put online, but I do hope that the IDE will help to debug and maintain it.
This raises the question of how many lines it would be in Haskell, given the corresponding libraries exists. Anyone want to give it a try?Lennart Kolmodinhttp://www.blogger.com/profile/11472900998358020886noreply@blogger.com1tag:blogger.com,1999:blog-8761623568109455093.post-50634318297986100252007-04-19T22:40:00.000+04:002007-04-20T00:20:30.900+04:00Pepper<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://upload.wikimedia.org/wikipedia/commons/thumb/d/da/Red_capsicum_and_cross_section.jpg/800px-Red_capsicum_and_cross_section.jpg"><img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer; width: 200px;" src="http://upload.wikimedia.org/wikipedia/commons/thumb/d/da/Red_capsicum_and_cross_section.jpg/240px-Red_capsicum_and_cross_section.jpg" alt="" border="0" /></a>I've been thinking (oh noes!)...
Peppers are hollow, but what's actually inside?
Is it simply air or perhaps some gas secreted while growing?
Assume it is air, is it then air from where the pepper is grown, or from a more recent location?
If the "filling" is indeed from where it's grown - let's say Croatia (I've got reliable intelligence that the best tasting vegetables are grown there):
If I'm lucky, I might just manage to carefully make a couple of holes in the pepper, breath in, and get a small breath of this mysterious gas.
That's a mini vacation right there!Lennart Kolmodinhttp://www.blogger.com/profile/11472900998358020886noreply@blogger.com6tag:blogger.com,1999:blog-8761623568109455093.post-53233897691919834972007-04-19T21:35:00.000+04:002007-04-22T21:21:41.374+04:00Vårt svenska språkOnly in Swedish...
Fick en fråga av en kollega:
Varför heter det att man "<span style="font-weight: bold;">ligger</span> i lumpen", "<span style="font-weight: bold;">sitter</span> i fängelse" och "<span style="font-weight: bold;">går</span> i skolan"?
Nästan Anders och Måns-kvalitet på den. Men hur kommer det sig?
Ligga i lumpen; motiverat med allitteration...
Sitta i fängelse; man kommer ingen vart? Sitt still!
Går i skolan; det går framåt? Err.. Hm..Lennart Kolmodinhttp://www.blogger.com/profile/11472900998358020886noreply@blogger.com0tag:blogger.com,1999:blog-8761623568109455093.post-32382414678207390362007-04-19T19:36:00.000+04:002007-04-19T20:04:41.457+04:00Asking for a day off?Hittade en lapp på jobbet som förklarar varför man inte kan få ledigt om man ber om det...
<div style="text-align: center;">Vet du vad du egentligen begär?
Året har 365 dagar, men du arbetar ju inte varje dag.
Under årets 52 veckor har du ledigt 2 dagar varje vecka.
Då återstår 261 dagar.
Du är ledig 16 timmar varje dag, det blir sammanlagt 170 dagar.
Då återstår 91 dagar som du kan arbeta.
Varje dag tar du lunch 1 timme, det blir sammanlagt 48 dagar.
Nu återstår 43 dagar.
Du behöver inte arbeta 6 röda dagar under året.
Det blir 37 dagar kvar.
Varje dag tar du kafferast 10 minuter, det blir totalt 11 dagar.
Kvar finns 26 arbetsdagar.
Sedan har du 25 dagars semester.
Då återstår bara en dag.
Och den förstår du väl att du inte kan få ledigt.
</div>
Which (loosely) translates into:
I found a note at work why I can't have the day off.
(Notice that this is is with Swedish vacation, 5 weeks each year).
<div style="text-align: center;">Do you even know what you're asking?
Each year has 365 days, but you're not working all of them.
For each of the year's 52 weeks, you've got 2 free days,
leaving 261 days.
Being free 16 hours a day adds up to 170 days.
91 days left to work.
Add up 1 hour lunch each day, that's 48 days.
43 days remains to work.
You don't have to work the 6 holiday-days, 37 days remains.
10 minutes coffee break each day adds up to 11 days.
Left is 26 days to work.
Then you have 25 days vacation.
Only one single day to work.
Now you must understand you can't get that day off?
</div>
I don't really get the numbers to add up, but hey! :D<hints id="hah_hints"></hints>Lennart Kolmodinhttp://www.blogger.com/profile/11472900998358020886noreply@blogger.com0tag:blogger.com,1999:blog-8761623568109455093.post-82669596992610420412007-04-16T12:41:00.000+04:002007-04-16T13:36:38.616+04:00RIP rydis<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://mikael.jansson.be/rydis/rydis-barnfamn_300.jpg"><img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer; width: 320px;" src="http://mikael.jansson.be/rydis/rydis-barnfamn_300.jpg" alt="" border="0" /></a>
<a href="http://mikael.jansson.be/rydis/">Martin "rydis" Rydström</a> disappeared the 24th February earlier this year. His body was found last week in the water outside of Göteborg.
Martin was a fellow lisper, which I only met irl one single time. I regret I didn't get to know you better.
We will remember you, friend.Lennart Kolmodinhttp://www.blogger.com/profile/11472900998358020886noreply@blogger.com2tag:blogger.com,1999:blog-8761623568109455093.post-77733418933230070882007-04-12T18:25:00.000+04:002007-04-12T18:52:20.544+04:00Fun at workImagine my face when I walked into the kitchen at work and saw a keyboard with dirt all over it, well wrapped in plastic film. Is it supposed to be there? What happened and who did it?
While taking another cup of tea, I recalled <a href="http://www.nada.kth.se/%7Ehjorth/krasse/english.html">an old prank</a> I've seen circulate the net. Ah!
<a href="http://picasaweb.google.com/kolmodin/Keyboard/photo#5052235670814221970"><img src="http://lh5.google.com/image/kolmodin/Rh0lkyBwZpI/AAAAAAAAAGc/Ee8rllmSB7A/s288/IMG_0306.JPG" /></a>
Five days later, it had progressed nicely... Looked like a green carpet.
<a href="http://picasaweb.google.com/kolmodin/Keyboard/photo#5052235825433044642"><img src="http://lh5.google.com/image/kolmodin/Rh0ltyBwZqI/AAAAAAAAAGk/Hq8K6bxewss/s288/IMG_0311.JPG" /></a>
Nicely set-up. Guess if he was surprised?
<a href="http://picasaweb.google.com/kolmodin/Keyboard/photo#5052241945761441458"><img src="http://lh6.google.com/image/kolmodin/Rh0rSCBwZrI/AAAAAAAAAGs/izvDkDZC2kQ/s288/IMG_0321_mod.JPG" /></a>
What would you say if your office looked like that after the easter holiday? Oh yeah.
All you need is an old keyboard, some dirt or cotton, and seeds of something easy growing. Childish humour can be an asset too. Complete instructions on the <a href="http://www.nada.kth.se/%7Ehjorth/krasse/english.html">other site</a>.
This is yet a reason why I choose to write the report of my thesis at work instead of at home :D
Now back to work!Lennart Kolmodinhttp://www.blogger.com/profile/11472900998358020886noreply@blogger.com0tag:blogger.com,1999:blog-8761623568109455093.post-604053452408901772007-04-02T22:04:00.000+04:002007-04-02T23:25:07.471+04:00xmonadIf you've followed #haskell@freenode the last month, you've had to try hard to not hear about heard about <a href="http://xmonad.org/">xmonad</a>.
It's a lightweight <a href="http://en.wikipedia.org/wiki/Window_manager">window manager</a>, a clone of dwm, similar to ion, ratpoison, wmii and larswm. To be more precise it's a <a href="http://en.wikipedia.org/wiki/Tiling_window_manager">tiling window manager</a> which means that it makes sure that no windows are overlapping. It can take a while to get used to, but it's well worth the effort. While xmonad still is relatively easy to learn, the <a href="http://modeemi.fi/%7Etuomov/ion/">Ion manifest</a> still applies. It will help you to <a href="http://www.codinghorror.com/blog/archives/000825.html">put down the mouse</a>.
The impressive part is that while one of dwm's goals is to be <2000 lines of C, xmonad has similar features in <400 lines of <a href="http://www.haskell.org/">Haskell</a>. It also does a couple of things dwm doesn't, namely automated testing of the internal window manager properties (with <a href="http://www.cs.chalmers.se/%7Erjmh/QuickCheck/">QuickCheck</a>) and support for <a href="http://en.wikipedia.org/wiki/Xinerama">xinerama</a>. (The lack of xinerama is, otoh, <a href="http://www.suckless.org/wiki/dwm">listed as a feature</a> of dwm).
See a screenshot <a href="http://cse.unl.edu/%7Esjanssen/xmonad-tiles.png">here</a>.
It's not yet released, so you'll have to "<a style="font-family: courier new;" href="http://www.darcs.net/">darcs</a><span style="font-family:courier new;"> get</span>" the sources, instructions on <a href="http://xmonad.org/">the homepage</a>. Unless you're running Gentoo Linux with the <a href="http://haskell.org/%7Egentoo/gentoo-haskell/">haskell overlay</a>, in which case a "<span style="font-family:courier new;">emerge xmonad-darcs</span>" will do.
Give it a try and give feedback on #haskell!Lennart Kolmodinhttp://www.blogger.com/profile/11472900998358020886noreply@blogger.com9tag:blogger.com,1999:blog-8761623568109455093.post-6800294540293886032007-02-07T22:18:00.000+03:002007-02-07T23:19:17.778+03:00Fish out of waterI've been forced out of my safe zone and into Windows. It came with the job...
Now, 6 months later, I've collected a few tools that makes the daily headake set a little later in the afternoon.
<a href="http://fy.chalmers.se/%7Eappro/nt/TXMouse/">True X-Mouse</a>:
The first thing I missed was beeing able to copy and paste simply by selecting and clicking. Fortunately, True X-Mouse solves that.
<a href="http://www.microsoft.com/windowsxp/downloads/powertoys/xppowertoys.mspx">Open Command Window Here</a>:
Part of the Microsoft PowerToys for Windows XP. Right click on any folder and select "Open Command Window Here" and you'll have a cmd.exe with CWD as that folder. Must have saved me hours of typing... Perfect if you use darcs as version control system and need to type a few commands.
<a href="http://gnuwin32.sourceforge.net/">GnuWin32</a>:
Don't like the Windows Search function? Neither do I. I miss my bash shell, and the coreutils :/
Imagine my smiling face when I found a port of many common GNU tools. You can get far with the coreutils and grep package.
<a href="http://virt-dimension.sourceforge.net/">Virtual Dimension</a>:
Another thing that's hard to live without is virtual desktops. I use them all the time on my linux boxes, and constantly hit the keys to switch desktop even on windows machines. This tool makes a pretty good job at faking virtual desktops, but you do notice that it's not real virtual desktops. If (when!) a program hangs, it will appear on all desktops, and if you're unlucky you're stuck on your current desktop for a while. Still a great tool, I'd probably miss this one the most of them all.
<a href="http://www.vim.org/">Vim</a>:
You don't get very far with notepad or wordpad. Using vim in cmd.exe is just weird as cmd.exe is far to crappy. GVim works perfectly though. Emacs is available too if you're into that kind of thing.
<a href="http://www.chiark.greenend.org.uk/%7Esgtatham/putty/download.html">Putty</a>:
Putty is your new SSH client. Nice!
<a href="http://www.rarlab.com/">Winrar</a>:
Opens your .tar.gz and .tar.bz2 (and a bunch of other) files, yay!
<a href="http://www.daemon-tools.cc/">Daemon tools</a>:
Mount your CD images as virtual cdrom drives.
Browsers:
Same same... I use both Firefox and Opera.
With a few of these tools your Windows experiense won't be as frightening, and you might even get some work done instead of just being frustrated. Bonus: All these tools are free (as in beer).
Although I hope that none of you ever need to use these tools, I wish you good luck!
<a href="http://fy.chalmers.se/%7Eappro/nt/TXMouse/"><span class="" style="display: block;" id="formatbar_CreateLink" title="Link" onmouseover="ButtonHoverOn(this);" onmouseout="ButtonHoverOff(this);" onmouseup="" onmousedown="CheckFormatting(event);FormatbarButton('richeditorframe', this, 8);ButtonMouseDown(this);"></span></a>Lennart Kolmodinhttp://www.blogger.com/profile/11472900998358020886noreply@blogger.com6