There are 21 posts tagged with code.


mini-fract 0.6

2009-12-09

I just released mini-fract 0.6, the mostly standalone version of fract. This is a minor update to make it use ZPNG, the supported successor to salza-png that I was previously using. Adam Majewski wrote some introductory notes on how to get started for someone with very little experience with the Common Lisp programming environment. Enjoy!

Code examples from Montreal Python 2

2008-04-12

I just uploaded the code examples from my presentation at Montréal Python #2 on PyQt and PyOpenGL. I fixed the lighting and the positioning of the model but otherwise, the package contains exactly what was on the screen. You can also download the multiple alignment viewer that I presented from my bioinformatics section. Enjoy!

Tail call elimination is good in C too

2008-04-09

All recursive algorithms can be converted into an iterative version. They teach that in school. They teach it as something that should be done, and it used to be true. However, we are living in a wonderful time and old wisdom sometimes ceases to be true.

Some algorithms are much more elegant when written recursively; the recursive version is also easier to prove correct. On the other hand, the iterative version will avoid many stack manipulation operations. Theoretically, any tail-recursive function can be converted into an iteration transparently by the compiler, with no performance penalty compared to the hand-written iterative version. Theoretically.

When I wanted to prove that Common Lisp had really fast implementations, I did my best to use the iterative version of the Mandelbrot formula. Unfortunately, there was no way around it, the iterative version would always run a bit faster. "So be it," I said to myself in resignation.
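To make the two shapes concrete, here is a minimal sketch of the Mandelbrot escape-time loop written both ways. It is in Python purely for illustration (the post itself is about C and Common Lisp), and the function names and the max_depth parameter are made up for this example. Note that CPython does not eliminate tail calls, so the recursive form here really does grow the stack; the point is only to show what a compiler with tail call elimination would be asked to turn into the loop.

```python
def escape_time_recursive(c, z=0j, depth=0, max_depth=200):
    """Tail-recursive form: the recursive call is the very last operation."""
    if depth == max_depth or abs(z) > 2.0:
        return depth
    return escape_time_recursive(c, z * z + c, depth + 1, max_depth)


def escape_time_iterative(c, max_depth=200):
    """Hand-written loop that a tail-call-eliminating compiler could derive."""
    z = 0j
    depth = 0
    while depth < max_depth and abs(z) <= 2.0:
        z = z * z + c
        depth += 1
    return depth
```

Both functions return the same iteration count for any point, for example 200 for c = 0 with the default limit.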

Too many files: Reiser FS vs hashed paths

2007-12-26

On my hard drive, some directories attract lint faster than my CPU fan. I wipe them clean but as soon as I blink they already contain a zillion files. And for some reason, having a zillion files in a directory on GNU/Linux is a really bad thing.

With Gazest, I decided to store only the file metadata in the database and to keep the content on disk. To prevent name clashes, the content filenames are the HMAC-SHA1 hashes of the data. Of course, that means that I have to expect a zillion files in the content directory and as soon as you mention the problem of too many files you hear "Reiser FS," loud and proud from the cube next to yours. But is it really the best solution?
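For readers who have not met hashed paths before, here is a rough sketch of the idea, not Gazest's actual code. The key, the two-level fan-out and the function names are all assumptions made for the example: the content name is the HMAC-SHA1 hex digest of the data, and the first few characters of that digest pick the sub-directories so that no single directory collects every file.

```python
import hashlib
import hmac
import os

SECRET_KEY = b"example-key"  # hypothetical key, chosen for this sketch only


def content_name(data, key=SECRET_KEY):
    """Name a blob after the HMAC-SHA1 hex digest of its bytes."""
    return hmac.new(key, data, hashlib.sha1).hexdigest()


def hashed_path(root, name, levels=2, width=2):
    """Spread files over sub-directories taken from the digest prefix.

    With levels=2 and width=2, 'abcdef...' lands in root/ab/cd/abcdef...
    """
    parts = [name[i * width:(i + 1) * width] for i in range(levels)]
    return os.path.join(root, *parts, name)


def store(root, data):
    """Write data under its hashed path and return that path."""
    path = hashed_path(root, content_name(data))
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as handle:
        handle.write(data)
    return path
```

With two levels of two hex characters each, the files are spread over up to 65,536 leaf directories.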

I'll focus on a single problem: access time of a file stored in a directory with a lot of other files. I don't care about CPU usage, disk usage, fragmentation, paging, or recovery tools. When I read a file, how long will it take? That's it. Surprisingly, I could not find a good benchmark for that typical use case. Sure, there are many anecdotal tales of people who noticed a significant improvement with a new FS when they re-installed their mail server but they end up comparing completely different setups with a variable real world server load. A lack of measurements leads to counterproductive, faith-based debates, so here is my attempt to quantify the issue.
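The kind of measurement described above could be sketched as follows. This is not the benchmark used in the post; the file naming, payload size and function names are assumptions for illustration. A serious run would also have to defeat the operating system's page cache between the write and read phases, which this sketch does not attempt.

```python
import os
import random
import time


def populate(directory, count, payload=b"x" * 1024):
    """Create `count` small files in one flat directory; return their names."""
    names = ["%08d.dat" % i for i in range(count)]
    for name in names:
        with open(os.path.join(directory, name), "wb") as handle:
            handle.write(payload)
    return names


def time_random_reads(directory, names, reads=1000, seed=0):
    """Return the mean wall-clock seconds to open and read one random file."""
    rng = random.Random(seed)
    picks = [rng.choice(names) for _ in range(reads)]
    start = time.perf_counter()
    for name in picks:
        with open(os.path.join(directory, name), "rb") as handle:
            handle.read()
    return (time.perf_counter() - start) / reads
```

Running the same pair of functions against a flat directory and against a hashed tree, on each file system of interest, gives comparable numbers for the one question asked here.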

Exporting kmail filters

2007-12-23

I love emails. Emails can be threaded into conversations, sent to multiple recipients, contain attachments, and be archived into folders for easy retrieval. Emails are scriptable. You can set up filters that will classify them according to complex criteria, put them into folders, call scripts to trigger events, pipe them through commands, or simply delete them.

I use kmail. It has a great filter system with a visual regular expression editor and filters can be bound to toolbar icons or to keyboard shortcuts. But kmail has a really poor filter export facility. I can understand why it's hard to export them: filters are not just action units; they are part of a pipeline and many of my filters are completely meaningless if they are not executed in a very specific order. Nevertheless, I have a set of non-trivial filters at work that I want to keep in sync with my filters at home.

Kmail hides its filters in kmailrc, a .ini file with a lot of auto-generated noise. At first, I tried to copy all the [Filter XX] blocks from one kmailrc to the other but there is a really big problem with this solution: it seems to work. Some of the filters will indeed get imported but if you end up with more filters than you previously had, the last few filters in your pipeline will silently get discarded. For some reason, kmail can't figure out how many filters you have so it keeps the count in the [General] section of the kmailrc. I don't want to count my filters each time I synchronize them so I wrote a convenient script to take care of that. Enjoy!
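The counting step can be sketched in a few lines. This is not the script from the post: the name of the counter key in [General] (taken here to be "filters") and the exact shape of the section headers are assumptions, so check them against your own kmailrc before trusting it.

```python
import re

# Matches section headers such as "[Filter #0]"; the exact label is assumed.
FILTER_SECTION = re.compile(r"^\[Filter [^\]]*\]\s*$")


def fix_filter_count(text, count_key="filters"):
    """Rewrite the filter counter in [General] to match the [Filter ...] blocks.

    count_key is a hypothetical key name; the text is returned unchanged
    (apart from a normalized trailing newline) if no such key is found.
    """
    lines = text.splitlines()
    total = sum(1 for line in lines if FILTER_SECTION.match(line))
    section = None
    result = []
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            section = stripped[1:-1]
        elif section == "General" and stripped.startswith(count_key + "="):
            line = "%s=%d" % (count_key, total)
        result.append(line)
    return "\n".join(result) + "\n"
```

The function works on the file's text, so it can be run on a copy first and diffed against the original.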

Many code news

There are fashions in the markup world. There was a time when using colons (':') to split fields in /etc/passwd was enough, a time when no one had a problem with using TABs as command delimiters in Makefiles. Then came the era of heavy markup: "more semantic!" they all asked, and we received XML.

More semantic is a good thing but anyone who wrote documentation using DocBook knows that the heavy syntax gets annoying really fast. No wonder no one documents his programs. Fortunately, some lazy programmers wanted, for some obscure reason, to document their programs; they propelled us into a new era of lightweight markup.

There are quite a few really good lightweight markups out there, and Gazest supports most of them. For simple formatting, my favorite is definitely Markdown. It reads like text emails: the syntax doesn't do much but the essentials are there and the syntax actually helps to read the source instead of obfuscating it. For blog comments, or anything that won't need much semantics, in applications where you can't use HTML, for security reasons or just because it's a pain to type, Markdown is the way to go.

Gazest is being sponsored

2007-11-08

I'm really excited to announce that the Gazest development is being sponsored by Savoir-faire Linux. The demo site will now run much faster: they have server farms with big fat pipes scattered across Montréal. Cyrille, their CEO, has a great understanding of the post-industrial economy: the market of services where know-how is the main capital. It's always a pleasure to talk with him; he sees free software not as an altruistic endeavor but as the only logical choice for corporations to compete in the modern world. He believes in the economic viability of free software and the current sponsorship is the testimony that those are not empty words. More exciting news to come shortly, stay tuned.

setuptools_git 0.3

2007-11-06

My gitlsfiles plugin is dead: it was a silly name. It has been reborn with the really sexy name of setuptools_git. Setuptools_git 0.3 has better documentation and is more portable than gitlsfiles.

Gazest 0.3.9

2007-10-27

Gazest 0.3.9 is out. It now runs on Alchemy 0.4, there are many bug-fixes here and there, the style has been improved and you can now search. Enjoy!

update: 0.3.9.1 is out: there was a bug with the abuse report form.

A new kind of wiki

2007-10-18

What is a wiki? It has to be more than just a program to transform _light_ *weight* text markup into valid HTML. Beyond the markup, a wiki is a platform to help many people work on a shared document. Being a programmer, I'm familiar with this concept since we use similar tools to work on shared programming projects: revision control systems.

The state of the art in revision control systems is Git and Mercurial. What is it that makes them so good? Some people will tell you that it's because they are distributed. That's a good point but there is more to it. They manage to work without centralized authority by merging concurrent changes. To be able to do that, they have to be able to detect which changes you need to merge and which ones you already have, and the key to doing that is to keep the full family lineage of every revision of every file for all the people that you work with. The main problem with Subversion is not that it's centralized, it's that it flattens the history into a linear series of revisions and that destroys the hope for smart merges.
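A toy model shows why the lineage matters. The sketch below is not how Git or Mercurial store history; it simply assumes a dictionary mapping each revision to its parents, with made-up revision names, and computes which revisions you still need from a colleague.

```python
def ancestors(parents, rev):
    """Return rev plus every revision reachable through parent links."""
    seen = set()
    stack = [rev]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def missing_revisions(parents, mine, theirs):
    """Revisions in their lineage that are absent from mine."""
    return ancestors(parents, theirs) - ancestors(parents, mine)


# Hypothetical history: a -> b, then two branches b -> c and b -> d -> e.
history = {"a": [], "b": ["a"], "c": ["b"], "d": ["b"], "e": ["d"]}
to_merge = missing_revisions(history, mine="c", theirs="e")
```

Here to_merge contains only d and e: the shared revisions a and b are recognized as already present, which is exactly the information a flattened linear history throws away.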

Git and Setuptools

2007-09-28

Explicit is better than implicit. It's in the Zen of Python. Who could disagree? Setuptools has a feature that would prevent me from reaching peace of mind. You can tell it to include in your package all the files that you track with a revision control system. I used to prefer being explicit by using MANIFEST.in, until I started to heavily refactor a package layout. This is one thing that Git does really well. You just add all the new files recursively and it will figure out which files are really new and which are new names for old files. But updating MANIFEST.in can become quite a pain.

What happens in practice is that rules in MANIFEST.in have an extremely broad scope. The latest Pylons recursively includes everything in the template directory. It would be a pain to make the right rule; you need to include all the templates for all the templating languages supported by Buffet and each engine is really permissive on the file extension used to name its template. The current rule will match all Emacs backup files and a lot of junk that most people don't want to distribute. When I switched to including the files tracked by a revision control system, the only file that I explicitly didn't want in there was .gitignore. In this case, being explicit on what we don't want is a lot cleaner than being explicit on what we do want.

"Oh wait," you may ask. I mentioned using Git but Setuptools has no Git plugin. Until now. Here is gitlsfiles (2.4 egg, 2.5 egg), a plugin to have Setuptools package all the files tracked by Git. You just need to install it and Setuptools will figure out the rest.
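For the curious, a file finder of this kind can be sketched as below. This is not the gitlsfiles source: the function names are invented for the example, and error handling is reduced to returning an empty list when Git is missing or the directory is not a repository.

```python
import os
import subprocess


def parse_ls_files(output):
    """Split the NUL-separated output of `git ls-files -z` into paths."""
    return [name for name in output.decode("utf-8").split("\0") if name]


def git_file_finder(dirname=""):
    """List the files tracked by Git under dirname (hypothetical finder)."""
    try:
        output = subprocess.check_output(
            ["git", "ls-files", "-z"], cwd=dirname or os.curdir
        )
    except (OSError, subprocess.CalledProcessError):
        return []
    return parse_ls_files(output)
```

Setuptools discovers such finders through the "setuptools.file_finders" entry point group, so a plugin would declare something like "git = mypackage.finder:git_file_finder" there, where mypackage.finder is a placeholder module name.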

The best time to submit to Digg and Reddit

2007-08-13

For the blogger, making the front page on Digg or Reddit has a major impact. It will bring in a tidal wave of traffic, several external links, a little fortune in adwords revenue and ultimately, crank up the page rank and consecrate the lucky author as the authority on a given subject. The initial wave is nice but it is the recurring traffic that makes websites lively.

Because of all the benefits of being reddited or dugg, it is not uncommon for a blogger to craft some of his posts in order to optimize his chance to make it to the front page on those popular news sites. There are even strategy guides to help him do that. One piece of advice that is often given is to pick your submission time wisely. The idea that the submission time has an impact on the probability of making the front page appeals to common sense but it is supported by very little experimental data. Here I will describe a simple scheme to measure the activity patterns of social news websites.

A cure for gigazillion account syndrome

2007-07-18

This whole thing is moving so fast. Ain't it a great time to be alive? I mean, the Net.

What is Web 2.0? I don't know. I think it doesn't exist yet. At some point, Web applications were templates over a database. You used the application and you knew exactly what the schema was. Fortunately, Web apps now see a bit further than the socket to MySQL. They talk with other apps. The SQL schema is no longer leading the flow, the users are.

"I'm not interested in politics." You hear that all the time. But it's not true. Men and women have lost interest in Federal, Provincial, and City level politics, this is true. But politics is more than that. Look at a white collar office, at someone walking his dog, at a crowded restaurant lineup. People love to give their opinion, to look for approval, to influence others. They love authority, no matter how indirect it may be. Authority is too diluted by broken voting systems at the Federal level but people love politics. And this is what Web 2.0 is all about.

Yould 0.3.5

2007-07-16

Yould 0.3.5 is out. It's the first release under GPLv3. The --list-sets option now works, you can apply a regexp to filter generated words and there is a new --version option.

The new regexp option is useful if you are looking for a word that includes a specific sub-string. If you are looking for a name for a new cat website, you would do "yould -n 30 -R cat"; if you are looking for adverbs for your new artificial language, you would do "yould -n 30 -R ly$". Have fun.
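The effect of the two invocations above can be imitated with a few lines of Python. This is only an illustration of the filtering idea, not Yould's implementation; it assumes that -n is the number of words wanted and that -R keeps words where the pattern matches anywhere, and the word list is invented.

```python
import re


def filter_words(words, pattern, limit=30):
    """Keep up to `limit` words in which the regexp matches somewhere."""
    matcher = re.compile(pattern)
    kept = []
    for word in words:
        if matcher.search(word):
            kept.append(word)
            if len(kept) == limit:
                break
    return kept


candidates = ["catalog", "dog", "quickly", "scatter", "fly", "lyric"]
cat_names = filter_words(candidates, "cat")   # substring match
adverbs = filter_words(candidates, "ly$")     # anchored at the end
```

In this sketch cat_names ends up as catalog and scatter, while adverbs keeps quickly and fly but rejects lyric because of the end anchor.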

Stallman on GPLv3

2007-07-10

The Free Software Foundation just published a video from the launch of the GPL version 3. In it, Stallman gives a short and understandable overview of the changes in version 3 and why it is important to upgrade. No legalese, just plain, understandable speech. Stallman at his best. The GPL version 3 itself is clear and readable. It's a good license and I will definitely release most of my code under it. The video is in Ogg Theora format. I think that some players have problems with that but VLC will work. More info is available on the GPLv3 website.

Yould 0.3

2007-06-16

Yould 0.3 is out. I improved the command line interface and included trained engines for English, French, German, and more. There is also a web interface to automate domain availability checks, thanks to Register 4 Less.

Finding good names

2007-02-12

It happens all the time. I am about to start a new project and I can't find a good name. I could postpone the decision until publication. With my completion rate, that would save me a lot of thinking. Still, the name of a project, a programming one at least, is scattered all over the place from the start: you create a tree in your cvs/git/svn, you create a package/namespace/module in your programming environment and you write the project's name in the documentation (at least you intend to). You can change the name later but you save a lot of work if you start with the right one.

There was a time when you could pick a random word from the dictionary and be done with it. Today, it seems, all the real words are taken, even vellication. Here I use a widely accepted definition of "taken": the .com is owned by someone else. There are other root domains (.net, .org, .info) but the squatters bought most of the dictionary there too. If we don't limit ourselves to the dictionary (and why should we, after all?), combinatorics saves the day. There are just too many ways to put letters together. Squatters can buy all the three-letter acronyms but that's as far as they can get.
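The arithmetic behind that claim is easy to check. The sketch below assumes names made only of the 26 lowercase ASCII letters, which is a simplification (domain names also allow digits and hyphens), and the function name is made up.

```python
import string


def name_count(length, alphabet=string.ascii_lowercase):
    """Number of distinct strings of the given length over the alphabet."""
    return len(alphabet) ** length


three_letter = name_count(3)   # every possible three-letter acronym
eight_letter = name_count(8)   # every possible eight-letter name
```

There are 17,576 three-letter combinations, a number a determined squatter can afford, against more than 208 billion eight-letter ones.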

Raw Sockets

2006-11-22

Python is sometimes described as "just a scripting language". This shows how much some people want to separate all programming languages into two or three categories without even knowing what they are talking about.

One nice thing about Python is the way it exposes the Unix internals. Almost all the system calls have bindings with interfaces so close to C that you can follow the man pages when using them. Of course, efficiency is not always there but you can sketch out a solution in no time and fall back to C when the profiler tells you to do so.

The socket API is exposed in all its gory details and it is possible to do raw sockets: to build each packet byte by byte, including the headers. The C API uses unions that are cast as structs when an individual field needs to be set and cast as byte arrays when written on the wire. The Python API uses string objects, which makes it a bit painful to set a single field but poses no problems when building the packets from scratch.

As an example, it is possible to implement ping in just over 100 lines. Using time.time(), we don't have enough resolution to make accurate readings for round trips under ~5ms but it still works pretty much as expected otherwise. Note that if you are to try it you'll need to be root since it's the only way to access raw sockets (the real ping is setuid).
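The packet-building part of such a ping can be sketched without root privileges. The code below is not the implementation mentioned above; it is a minimal illustration, with invented function names, of packing an ICMP echo request header with struct and filling in the standard Internet checksum. It builds only the ICMP portion and leaves the IP header to the kernel.

```python
import struct

ICMP_ECHO_REQUEST = 8  # ICMP type for an echo request, code is 0


def internet_checksum(data):
    """16-bit one's complement sum of the data, as used by ICMP."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries back in
    return ~total & 0xFFFF


def build_echo_request(identifier, sequence, payload=b""):
    """Return the bytes of an ICMP echo request (header plus payload)."""
    # First pass with a zero checksum field, as the algorithm requires.
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    header = struct.pack(
        "!BBHHH", ICMP_ECHO_REQUEST, 0, checksum, identifier, sequence
    )
    return header + payload


packet = build_echo_request(0x1234, 1, b"ping")
```

Sending the packet is the part that needs root: it goes through a socket created with socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP) and a sendto() call to the target address.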

A new non-baroque language should aim to be as clever as Python in exposing the host platform internals. CL+UFFI is not bad, SBCL's sb-posix is especially nice. Skilled use of macros makes the whole binding set easy to read and to maintain:

       (define-call "link" int minusp (oldpath filename) 
                                      (newpath filename))
       (define-call "lseek" sb-posix::off-t minusp (fd file-descriptor) 
                                                   (offset sb-posix::off-t) 
                                                   (whence int))
       (define-call "mkdir" int minusp (pathname filename) 
                                       (mode sb-posix::mode-t))
       (define-call "mkstemp" int minusp (template c-string))
       (define-call "sync" void never-fails)

On Invocations

2006-10-04

There is no doubt that all programming languages have problems and that a new one is needed as soon as possible. Once someone has an idea of what he wants his language to be, he needs to write either a compiler, an interpreter or both. You don't see pure interpreters that much these days; most language implementations at least compile to byte code. Though the principle of a compiler is simple, we see them as obscure forgotten charms carefully crafted by wizards with powerful incantations.

A compiler just translates the text of a program in language A into a program in language B. Technically, the lexer splits the program into tokens. The parser then builds a tree with the tokens and gives this tree some meaning; that's the abstract syntax tree (AST). Finally the code generator walks the tree and prints all the nodes in language B.
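Those three stages fit in well under a hundred lines for a toy language. In the sketch below, language A is infix integer arithmetic and language B is Lisp-style prefix notation; both choices, and all the function names, are assumptions picked to keep the example short.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def lex(source):
    """Lexer: split the source text into number and operator tokens."""
    source = source.rstrip()
    tokens, pos = [], 0
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise SyntaxError("unexpected character at %d" % pos)
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(tokens):
    """Parser: build a nested-tuple AST by recursive descent."""
    def binary(i, operand, operators):
        node, i = operand(i)
        while i < len(tokens) and tokens[i] in operators:
            op = tokens[i]
            right, i = operand(i + 1)
            node = (op, node, right)  # left-associative
        return node, i

    def expression(i):
        return binary(i, term, ("+", "-"))

    def term(i):
        return binary(i, factor, ("*", "/"))

    def factor(i):
        if i >= len(tokens):
            raise SyntaxError("unexpected end of input")
        if tokens[i] == "(":
            node, i = expression(i + 1)
            if i >= len(tokens) or tokens[i] != ")":
                raise SyntaxError("expected ')'")
            return node, i + 1
        if tokens[i].isdigit():
            return int(tokens[i]), i + 1
        raise SyntaxError("unexpected token %r" % tokens[i])

    node, i = expression(0)
    if i != len(tokens):
        raise SyntaxError("trailing tokens")
    return node


def generate(node):
    """Code generator: walk the AST and print each node in prefix form."""
    if isinstance(node, int):
        return str(node)
    op, left, right = node
    return "(%s %s %s)" % (op, generate(left), generate(right))


def compile_source(source):
    return generate(parse(lex(source)))
```

For example, compile_source("1 + 2 * (3 - 4)") yields the string (+ 1 (* 2 (- 3 4))), with operator precedence resolved by the parser rather than by the reader.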

For a really simple language a compiler will fit in a few hundred lines. It gets tricky when you want to make an interesting language. For the language to be fast you need to transform the AST to eliminate slow operations. Such optimizations include loop unrolling, function call inlining and tail call elimination. Interesting languages will also include a runtime environment: closure bindings, garbage collectors, type systems, etc. The target audience for those topics being limited, there are no general resources where one could learn them. But if you ask in the right place, you might get a list of grimoires with many spells related to those dark arts.

Even though thaumaturgy will never be easy to learn, it is comforting to see that one can learn to conjure without going through apprenticeship. As the task of making a new programming language becomes surmountable, the question "What features should a new language have?" becomes a fixation, a puzzle that the warlock must solve to achieve wizardry.

Torrent Full of Lisp Porno Movies

2006-03-31

The Mandelbrot Set is the index of the parameters for which there is a connected Julia Set. Seeing how the two are related is quite interesting. Especially with animations. And if you like the little sample, there is a torrent full of them.

The complete source code is included, of course. The walks along the main cardioid and the main bulb are pretty neat. From this we can guess that a walk along some secondary bulb would be especially nice. Now, I have to ask for help on that one. The points of intersection of the secondary bulbs are known, they are

   (/ (- 1 (expt (- (exp (* 2 #c(0 1) pi (/ p q))) 1) 2)) 4)

The main bulb is p=1 and q=2. The first person who tells me how to find the center of the secondary bulbs will win a custom rendered movie! Isn't it a good deal?
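For those who do not read Lisp, the expression above translates directly into Python's complex arithmetic. The function name is invented for this sketch; the formula itself is the one quoted in the post.

```python
import cmath


def bulb_contact_point(p, q):
    """Point where the p/q bulb touches the main cardioid.

    Direct transcription of (1 - (exp(2*pi*i*p/q) - 1)^2) / 4.
    """
    w = cmath.exp(2j * cmath.pi * p / q)
    return (1 - (w - 1) ** 2) / 4


main_bulb = bulb_contact_point(1, 2)
```

For p=1 and q=2 the exponential is -1, so the result is (1 - 4) / 4 = -0.75 (up to floating point rounding), which is indeed where the main bulb meets the cardioid on the real axis.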

update: the torrent is gone. Bittorrent is great for large files with flash popularity. Without it, hosting large files like that would have been really difficult. The catch is that you need seeds and you can't hold on to them forever. The movie pack was downloaded 75 times for a total of 19 gigs. Wow! Thanks to all the seeds! I'll pay you back with more fractal stuff soon.

Older stuff


2006-02-25 Viruses