Archive for the 'My Kit' Category

07/22 Grrrr

I built a functioning UM in around four hours, plus some debugging and tweaking to help me understand more what the code does. Step by step debugger and all the trimmings. Nifty.

But… I can’t get past the decryption level. I am doomed. Oh well, just saved my week-end I guess!

07/21 ICFP Contest

Haven’t decided yet whether to join this year’s ICFP contest – will have to see the task. And even so – which language? Ah well… Erlang would be cool, even if I am not yet up to speed on it. RB could work – after all this is one language I am most comfortable in. But Functional it is not :)

From this year's contest codex

Weird, innit? This is from the codex posted yesterday, I think.
There’s a lot of gobbledygook in there, including a GIF89a header. Waiddaminit! This I know from my fPic efforts. RB sure came in handy to extract the GIF file. Never mind the JPEG above: what I did was remove the first 0x1a73 bytes and save the rest as a GIF. Since you can stuff anything and everything after a valid GIF without breaking it (it will just be ignored, which is, among other things, a crude way to perform steganography…), the file displayed all right, but was still 2.2MB heavy… Screenshot, conversion from PDF to JPG – with an RB app, of course! – and here we go.
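For the curious, the extraction step can be sketched in a few lines of Python. This is not the RB code I used, just a minimal illustration that assumes the GIF starts at the first `GIF89a` signature found in the blob; the function names and the file names in the usage note are made up.

```python
GIF_MAGIC = b"GIF89a"  # signature that opens every GIF89a file


def gif_offset(blob: bytes) -> int:
    """Return the offset of the first GIF89a signature, or raise if absent."""
    offset = blob.find(GIF_MAGIC)
    if offset < 0:
        raise ValueError("no GIF89a header found")
    return offset


def extract_gif(blob: bytes) -> bytes:
    """Drop everything before the GIF signature and keep the rest.

    Trailing bytes after the image data are kept as well; GIF viewers
    ignore them, which is why the result can still be oversized.
    """
    return blob[gif_offset(blob):]
```

Usage would be along the lines of reading a hypothetical `codex.bin` in binary mode, calling `extract_gif` on its contents, and writing the result to `codex.gif`.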

The theme is supposed to be computational archaeolinguistics. Read at face value – I am alas very good at that – it is right down my field of [supposed] expertise. Reading between the lines, as some #erlang fellows mentioned, it could be that the computational is not the means – doing linguistic research with ‘puters – but rather digging an old computer language, or else:

<marc_vw>	it is probably about data mining
<marc_vw>	maybe looking at computation results
<marc_vw>	and trying to figure out what language the hardware used was
		programmed in
<marc_vw>	my guess is reengineering a dead computer language
<noss>		mine too, evaluating some kind of program specification.

Which makes more sense… Other tidbits include Latin, ignoti et quasi occulti, i.e. “unknown and almost hidden”; English, including some written from right to left: welldonedaedsi luap, “Well done, Paul is dead”; several appearances of the word apply, which could be a computer command or a magical word – abracadabra – and other meaningless stuff.

Four more hours to go.

07/13 clic clac

Via Bill de Hóra, unicode converter.

Clic clac!

Why go to the web for that? This thing stays out of the way when not needed:

Automatic copy/paste: when you enter the plain text box, it pastes whatever text the clipboard has, and selects it. In the Unicode box, it pastes the content of the clipboard if it matches the \uXXXX format. After conversion, the result is copied back to the clipboard. What say?
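The conversion itself is simple enough to sketch in Python. This is an illustration only, not the app's code: the function names are invented, and I am assuming that every character (ASCII included) is escaped and that characters outside the BMP are written as UTF-16 surrogate pairs.

```python
import re

# One or more \uXXXX groups and nothing else.
_ONLY_ESCAPES = re.compile(r"(?:\\u[0-9a-fA-F]{4})+")
_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def looks_like_escapes(text: str) -> bool:
    """True if the whole string is in \\uXXXX format (the clipboard check)."""
    return _ONLY_ESCAPES.fullmatch(text) is not None


def to_escapes(text: str) -> str:
    """Plain text -> \\uXXXX, one escape per UTF-16 code unit."""
    data = text.encode("utf-16-be")
    units = [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]
    return "".join(f"\\u{unit:04X}" for unit in units)


def from_escapes(text: str) -> str:
    """\\uXXXX -> plain text, joining surrogate pairs back together."""
    units = [int(hex_digits, 16) for hex_digits in _ESCAPE.findall(text)]
    data = b"".join(unit.to_bytes(2, "big") for unit in units)
    return data.decode("utf-16-be")
```

The clipboard handling is the part that depends on the platform, so it is left out here.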

Download  Tested on a TiBook 17″ running 10.3.9. Prolly won’t work on MacIntels. Not me faults, Guv’.

07/10 X-Encodings in Erlang/mb

Been hard at it. But I hit a snag when encountering a plain, innocuous-looking sinogram: 內. Pretty harmless, right? D’uh. Big time. Because in Japanese it’s 内, not 內. Friggin’ variants. 0x5167 vs 0x5185 [If you don’t know what I am talking about, it’s okay. In this case, ignorance is bliss!]. Can I slap someone – preferably not me? Say hello to hasVariant(X) who just joined us. Nyessss…. Anyway, all fixed now – or rather, for now!
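To give an idea of what a variant check amounts to, here is a rough Python sketch. It is not the Erlang hasVariant(X) itself; the table holds only the one pair mentioned above, and the names `VARIANT_GROUPS`, `has_variant` and `variants_of` are invented for the example.

```python
# Each set groups characters that are variants of one another.
# Only the pair from this post is listed; a real table would be much larger.
VARIANT_GROUPS = [
    {"\u5167", "\u5185"},  # 內 (U+5167) and 内 (U+5185)
]

# Map every character to the other members of its group.
_VARIANTS = {ch: group - {ch} for group in VARIANT_GROUPS for ch in group}


def has_variant(ch: str) -> bool:
    """True if the character has at least one known variant form."""
    return ch in _VARIANTS


def variants_of(ch: str) -> set:
    """Return the known variants of a character (empty set if none)."""
    return set(_VARIANTS.get(ch, ()))
```

When a character is missing from the target encoding, a converter can then retry with each entry of `variants_of(ch)` before giving up.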

A couple of screenshots is worth 1,000 words!

Here’s a screenshot of the source of KanjiTest.html, in SEE. It’s a cross-table of a few sinograms for a slew of encodings, showing the respective code points for each character in all the encodings. The page itself is in utf-8.
KanjiTest.html screenshot
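A cross-table like this one can be approximated with Python's built-in codecs, as a sketch. The encoding list below is my pick of what Python ships with (it has no CCCII codec, for instance), and `code_point_row` is a made-up name; the real table comes out of the Erlang module.

```python
# Encodings available in Python's standard codecs; CCCII is not among them.
ENCODINGS = ["utf-8", "utf-16-be", "shift_jis", "big5", "euc_kr"]


def code_point_row(ch: str, encodings=ENCODINGS) -> dict:
    """Return {encoding: hex bytes} for one character.

    The value is None when the character does not exist in that encoding.
    """
    row = {}
    for name in encodings:
        try:
            row[name] = ch.encode(name).hex().upper()
        except UnicodeEncodeError:
            row[name] = None
    return row


def cross_table(chars: str, encodings=ENCODINGS) -> dict:
    """Build the whole table: one row per character."""
    return {ch: code_point_row(ch, encodings) for ch in chars}
```

Calling `cross_table("內内")` gives one row per sinogram, with `None` wherever an encoding lacks the character.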

Here’s a screenshot of the source of KanjiTestUTF8.html. An extract of the former table, if you will.
KanjiTestUTF8.html screenshot

Look, Mom, with only one hand: a UTF-16 encoded page showing a table of sinograms, in UTF-16 of course.
KanjiTestUTF16.html screenshot

Ooooh, lookit, no hands! Same sinograms, in Shift-JIS. Damn, I rock!
KanjiTestSJ.html screenshot

It’s not exactly rosy. The code’s a deluxe candidate for refactoring – read, it’s a mess – but it is written so that it can easily handle as many new encodings as you can throw at me, provided you give me a UTF-8/16 to said encoding cross-ref file. The whole yahzoo [case folding data, Big5, CCCII, Shift-JIS, EUC-KR] is stored in dets tables; UTF-16⇔UTF-8 is an algorithm, thus a tad faster than a table lookup. Because this thing is nice to have, but not exactly a Ferrari: the test that produces these tables runs in, ahem, er…, 14 seconds? Don’t reach for your gun right now though, because:
A. I am going to work on speed when functionalities are all in and tested
B. Try to do that right now in Erlang :D – good luck!

So I guess slow is better than nothing. But we’ll work on speed.
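As for why UTF-16⇔UTF-8 needs no lookup table: both are just different ways of packing the same code point, so the conversion is pure bit shuffling. Here is a Python sketch of the UTF-16 to UTF-8 direction, for illustration only (the module itself is in Erlang, and these function names are invented).

```python
def utf16_to_code_points(units):
    """Turn a list of UTF-16 code units into Unicode code points."""
    points = []
    i = 0
    while i < len(units):
        unit = units[i]
        if 0xD800 <= unit <= 0xDBFF:
            # High surrogate: must be followed by a low surrogate.
            if i + 1 >= len(units) or not (0xDC00 <= units[i + 1] <= 0xDFFF):
                raise ValueError("unpaired high surrogate")
            low = units[i + 1]
            points.append(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00))
            i += 2
        elif 0xDC00 <= unit <= 0xDFFF:
            raise ValueError("unpaired low surrogate")
        else:
            points.append(unit)
            i += 1
    return points


def code_point_to_utf8(cp):
    """Encode one code point as 1 to 4 UTF-8 bytes."""
    if cp < 0x80:
        return bytes([cp])
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


def utf16_to_utf8(units):
    """Convert UTF-16 code units straight to UTF-8 bytes."""
    return b"".join(code_point_to_utf8(cp) for cp in utf16_to_code_points(units))
```

The legacy encodings (Big5, Shift-JIS and friends) have no such arithmetic relationship to Unicode, which is why they need the cross-ref tables.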

Erlang

07/08 Erlang multi-byte module, continued

There are still many dusty corners, and some major refactoring to do, but it is going well, very well indeed.

Erlang