PHP and big numbers

One would expect that one of the most widely used scripting languages in the world would be able to do proper comparisons of numbers, even big numbers, right? Well, PHP is not such a language, at least not on 32-bit systems. Given a script like this:


$t1 = "1244431010010381771";
$t2 = "1244431010010381772";
if ($t1 == $t2) {
    print "equal\n";
}

A current PHP version will output:

schoenfeld@homer ~ % php5 test.php
equal

It will do the right thing on 64-bit systems (not claiming that the numbers are equal). Interestingly enough: a type-aware equality check with === (see my article from a few years ago) will not claim that the two numbers are equal.
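If all you need on a 32-bit system is an exact equality test, one workaround is to compare the numbers as strings, so that no lossy conversion to a native integer happens. A minimal sketch of the idea in shell (the variable names are made up):

```shell
a="1244431010010381771"
b="1244431010010381772"

# Compare character by character instead of converting to integers;
# this stays exact no matter how long the numbers are.
if [ "$a" = "$b" ]; then
  result="equal"
else
  result="not equal"
fi
echo "$result"
```

In PHP the analogous trick is a strict comparison (===) of the strings themselves, which never reinterprets them as numbers.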

LDAP performance is poor..

Today's rant of the day: in a popular LDAP directory management tool, not to be named, there is a message indicating that the performance of the LDAP server is poor. While this might still be true: honestly, building LDAP filters the way you do and then complaining about the LDAP server is like, let's say, searching for papers in the whole city, while you know they are certainly located within a single drawer, in a single closet, in a single room of your apartment, and blaming the city council because your search took so damn long. What a mockery.

Struggling with Advanced Format during a LVM to RAID migration

Recently I decided to invest in another hard disk for my Atom system. That system, which I built almost two years ago, has become the central system in my home network: it serves as a file server hosting my personal data, some git repositories etc., as a streaming server, and since I switched to a cable internet connection it also serves as a router/firewall. Originally I bought that disk to back up some data of the systems in the network, but I realized that all data on this system lived on a single 320 GB 2.5″ disk, and it became clear to me that, in the absence of a proper backup strategy, I should at least provide some redundancy.

So I decided, once the disk was in place, that the whole system should move to a RAID1 across the two disks. Basically this is not as hard as it may seem at first glance, but I had some problems due to a new sector size in some recent hard disks, which is called Advanced Format.

But let's begin at the start. The basic idea of such a migration is:

  1. Install mdadm with apt-get. Make sure to answer 'all' to the question which devices need to be activated in order to boot the system.
  2. Partition the new disk (almost) identically. Because the new drive is somewhat bigger, a fully identical layout wouldn't make sense, but at least the two partitions which are to be mirrored on the second disk need to be identical. Usually this is easily achieved with

    sfdisk -d /dev/sda | sfdisk /dev/sdb

    In this case, it wasn’t that easy. But I will come to that in a minute.

  3. Change the type of the partitions to 'FD' (Linux RAID autodetect) with fdisk
  4. Erase evidence of any old RAID from the partitions. This is probably pointless on a brand-new disk, but we want to be sure:

    mdadm --zero-superblock /dev/sdb1
    mdadm --zero-superblock /dev/sdb2

  5. Create two DEGRADED raid1 arrays from the partitions:

    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 missing
    mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdb2 missing

  6. Create a filesystem on the first RAID device, which will become /boot.
  7. Mount that filesystem somewhere temporary and move the contents of /boot to it:

    mount /dev/md0 /mnt/somewhere
    cp -a /boot/* /mnt/somewhere/

  8. Unmount /boot, edit fstab to mount /boot from /dev/md0, and re-mount /boot (now from md0)
  9. Create mdadm configuration with mdadm and append it to /etc/mdadm/mdadm.conf:

    mdadm --examine --scan >> /etc/mdadm/mdadm.conf

  10. Update the initramfs and grub (no manual modification needed with grub2 on my system) and install grub into the MBR of the second disk.

    update-initramfs -u
    update-grub
    grub-install /dev/sdb

  11. The first point to pray: Reboot the system to verify it can boot from the new /boot.
  12. Create a physical volume on /dev/md1:

    pvcreate /dev/md1

  13. Extend the volume group to contain that device:

    vgextend <your-vg> /dev/md1

  14. Move the physical extents of the whole volume group from the first disk to the degraded RAID:

    pvmove /dev/sda2 /dev/md1

    (Wait for it to complete… this takes some time 😉)

  15. Remove the first disk from the VG:

    vgreduce <your-vg> /dev/sda2

  16. Prepare it for addition to the RAID (see step 3 and 4) and add it:

    mdadm --add /dev/md0 /dev/sda1
    mdadm --add /dev/md1 /dev/sda2

  17. Hooray! Look at /proc/mdstat. You should see that the RAID is recovering.
  18. When recovery is finished, pray another time and hope that the system still boots, now running entirely from the RAID. If it does: finished 🙂
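For step 17, the recovery progress can be pulled out of /proc/mdstat with a little grep. The sample content below is made up, but it follows the format the md driver actually prints:

```shell
# Made-up /proc/mdstat excerpt of a rebuilding RAID1.
mdstat='md1 : active raid1 sda2[2] sdb2[0]
      312424087 blocks [2/1] [U_]
      [==>..................]  recovery = 12.6% (39425792/312424087) finish=76.2min speed=59616K/sec'

# Extract just the progress figure; on a live system you would
# read /proc/mdstat instead of this variable.
progress=$(printf '%s\n' "$mdstat" | grep -o 'recovery = [0-9.]*%')
echo "$progress"
```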

Now to the problem with Advanced Format: there is some movement among the hardware vendors toward a new sector size. Physically my new device has a size of 4096 bytes per sector, somewhat different from the 512 bytes disks used to have over the last decade.

Logically it still has 512 bytes per sector. As far as I understand, this is achieved by mapping 8 logical sectors onto one physical sector, so when partitioning such a disk the alignment has to be chosen so that each partition starts at a logical sector number which is a multiple of 8.

That, obviously, wasn't the case with the old partitioning on my first disk. So I had to create the partitions by specifying start points manually, making sure they are divisible by 8; otherwise fdisk would complain about the layout on the disk. This does not work with cfdisk, because it does not accept manual alignment parameters, and unfortunately the partitions it creates have a wrong alignment. So: good old fdisk, plus some calculations of how many sectors are needed and where to start, to the rescue.
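The alignment arithmetic is easy to check in the shell. Start sector 63, used here purely as an example, is the classic DOS-era default and exactly the kind of start that violates the multiple-of-8 rule:

```shell
start=63   # classic DOS-era partition start, in 512-byte logical sectors

if [ $(( start % 8 )) -eq 0 ]; then
  echo "sector $start is aligned"
else
  # Round up to the next multiple of 8 logical sectors,
  # i.e. the next 4096-byte physical sector boundary.
  next=$(( (start / 8 + 1) * 8 ))
  echo "sector $start is misaligned; next aligned start: $next"
fi
```

Both partitions in the layout below start at multiples of 8 (2048 and 291160), so they sit on physical sector boundaries.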

So the layout is now:

Device Boot      Start         End      Blocks   Id  System
/dev/sdb1            2048      291154      144553+  fd  Linux raid autodetect
/dev/sdb2          291160   625139334   312424087+  fd  Linux raid autodetect

On Debian discussions

In my article "Last time I've used network-manager" I made a claim for which I've been criticized by some people, including Stefano, our current (and just re-elected) DPL. I said that a certain pattern, which showed up in a certain thread, was a prototype for discussions in the Debian surroundings.

Actually I have to admit that this was a very generalizing statement, turning my own point about the discussion culture right back at myself.
Because, as Stefano correctly said, there has been some progress in the Debian discussion culture.
Indeed, there are examples of threads where discussions followed another scheme.
But in my own defence I have to say that such changes are like little plants (in the botanical sense). They take their time to grow, and as long as they are so very new, they are very vulnerable to all small interruptions, regardless of how tiny those interruptions may seem.

I've been following Debian discussions for 6 or 7 years. The scheme I was describing was the one with the most visibility of all Debian discussions. Almost every discussion that was important for a broader audience followed that scheme. There is a reason that Debian is famous for flamewars.
In a way it's quite similar to the perception some people have of network-manager. Negative impressions manifest themselves, especially when they have years of time.
Positive impressions do not get a chance to manifest themselves as long as the progress is not visible enough to survive small interruptions.

I hope that I didn't cause too much damage with my comment, which got cited (out of context) on other sites. Hopefully the Debian discussion culture will improve further, to a point where there is no difference between the examples of very good, constructive discussions we already have in some parts of the project and the project-wide decision-making discussions which affect a broad audience and often lead to flamewars.

Directory-dependent shell configuration with zsh (Update)

For a while I've been struggling with a little itch. I'm using my company notebook both for company work and for Debian-related stuff. Whenever I switch between those two contexts, I had to fix the environment configuration manually. This is mostly about environment variables, because tools like dch rely on some which need to differ between the contexts, such as DEBEMAIL.
A while ago I had the idea to use directory-dependent configuration for that purpose, but I never found the time and mood to actually fix my itch.
Somewhere in the meantime I applied a quick hack ("case $PWD in …; do export …; esac") to my zsh configuration to ease the pain, but it still did not feel right.

For the impatient: below you'll find a way to just use what's described here. The rest of the article contains detailed information on how to implement something like this.

The other day I was cleaning up and extending my zsh configuration and it came to my mind again. I then thought about what my requirements are and how I could solve this. First I thought about using a ready-made solution, like the one in the Grml zsh configuration, but at that point I did not remember it (it took a hint from a co-worker *after* I finished the first version of my solution). Then I came up with my requirements:

  • Separate profile-switching logic from configuration (as far as possible): I don't want to re-dive into script logic every time I decide to change something, like adding or changing a variable. Generally I find a declarative approach much cleaner.
  • Avoid repeating myself
    Basically all I do when switching profiles is change environment variables. Usually I don't want my shell to do extraordinary things, like brewing coffee, when I switch the context, so I'd like to avoid typing an "export foobar…" for every single environment variable and every single profile.
This led to a configuration-based approach as a first step. When thinking about how to represent the configuration I looked into the data types zsh supports. zsh supports (associative) arrays, which is perfect for my needs. I came up with something like this:

  typeset -A EMAILS ENV_COMPANY ENV_DEBIAN
  EMAILS=(
    "private"   ""
    "company"   ""
    "debian"    ""
  )
  ENV_DEBIAN=(
    "DEBEMAIL"  "$EMAILS[debian]"
    "EMAIL"     "$EMAILS[debian]"
  )
  ENV_COMPANY=(
    "DEBEMAIL"  "$EMAILS[company]"
  )

The next part was selecting the right profile. In the first version I used the old case logic, but that broke my separation of logic and configuration. At approximately this point the co-worker pointed me to the grml approach, from which I borrowed an idea:

# Configure profile mappings
zstyle ':chpwd:profiles:*company*' profile company
zstyle ':chpwd:profiles:*debian*' profile debian

and the following code to lookup profile based on $PWD:

1 function detect_env_profile {
2   local profile
3   zstyle -s ":chpwd:profiles:${PWD}" profile profile || profile='default'
4   profile=${(U)profile}
5   if [ "$profile" != "$ENV_PROFILE" ]; then
6     print "Switching to profile: $profile"
7   fi
8   ENV_PROFILE="$profile"
9 }

For an explanation: zstyle is a zsh builtin which is used to "define and lookup styles", as the manpage says, or put differently: another way to store and look up configuration values.
It's nice for my purpose, because it allows storing patterns instead of plain configuration values, which can be matched against $PWD easily with all of the zsh globbing magic. This is basically what's done in line 3. zstyle there sets $profile to the matching zstyle configuration in the :chpwd:profiles: context, or to 'default' if no matching zstyle is found.

The (almost) last part is putting it together with code to switch the profile:

1 function switch_environment_profiles {
2   detect_env_profile
3   config_key="ENV_$ENV_PROFILE"
4   for key value in ${(kvP)config_key}; do
5     export $key=$value
6   done
7 }

The only non-obvious parts of this are lines 3 and 4. Remember, the profiles were defined as ENV_PROFILE, where PROFILE is the name of the profile. We cannot know that key in advance, therefore we have to construct the right variable name from the result of detect_env_profile. We do that in line 3 and look that variable up in line 4.
The deciding aspect here is the P-flag in the parameter expansion. It tells zsh that we do not want the value of $config_key, but instead the value of $WHATEVER_CONFIG_KEY_EXPANDS_TO.
The other flags, k and v, tell zsh that we want both keys and values from the array. Had we omitted those flags, it would have given us the values only.
We then loop over that to configure the environment. Easy, huh?
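For readers more at home in bash: the (P) flag corresponds roughly to bash's ${!name} indirection. The sketch below only mirrors the indirection part (bash cannot expand a zsh associative array this way, so a plain string and a made-up address stand in):

```shell
#!/bin/bash
# A plain string stands in for the zsh associative array here.
ENV_DEBIAN="DEBEMAIL=someone@example.org"

# Build the variable name first, then expand the variable it names.
config_key="ENV_DEBIAN"
echo "${!config_key}"
```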

We would be finished, if this actually did anything. The code above still needs to be called. Luckily for us that's pretty easy to achieve, as zsh has a hook for when the current directory changes. Making all this work is simply a matter of adding something like this:

function chpwd() {
    switch_environment_profiles
}

Now, one could say that the solution in the grml configuration has an advantage: it allows calling arbitrary commands on a profile change, which might be useful to *unset* variables in a given profile, or whatever else you can think of.
Well, it's a matter of three lines to extend the above code for that feature. Add

# Taken from grml zshrc, allow chpwd_profile_functions()
if (( ${+functions[chpwd_profile_$ENV_PROFILE]} )) ; then
  chpwd_profile_${ENV_PROFILE}
fi

to the end of switch_environment_profiles, and now it's possible to additionally add a function chpwd_profile_PROFILE which is called whenever the profile is changed to that profile.

USAGE: I have put the functions into a file which can be included in your zsh configuration; it can be found on github.
Please see the README and the comments in the file itself for further usage instructions.

password-gorilla ACCEPTED into unstable

The password-gorilla package has lacked some love for a while, and at some point in time I orphaned it.
That happened due to the fact that the upstream author was pretty unresponsive and inactive, and my own Tcl skills are very limited. As a result the password-gorilla package was in a bad state, at least from a user's point of view, with several (apparently) randomly occurring error messages and the like, stalled feature development, etc.

But in the meanwhile a promising development arose. A guy named Zbigniew Diaczyszyn wrote me a mail saying that he intended to continue upstream development. Well, "meanwhile" is kind of an understatement: that first mail already arrived in December 2009. He asked me whether I'd like to continue maintaining password-gorilla in Debian. I agreed, but as promising as it sounded to have a new upstream, I was not sure it would work out. However: my doubts were not justified.

In the time between 2009 and now Zbigniew managed to become the official upstream (with the accreditation of the previous upstream), create a github project for it and make several releases.

I know there are several people out there who tested password-gorilla. I know there were magazine reviews covering the old version, which was a bit buggy with recent Tcl/Tk versions. That made a quite good multi-platform password manager, with support for very common password file formats, stand in a bad light.
I recommend that previous users of password-gorilla try the new version, which has recently been uploaded to unstable.

Last time I’ve used network-manager..

There's an ongoing thread on the Debian mailing lists about installing network-manager by default on new Debian installations. I won't say much about the thread itself. It's just a prototype example of Debian project discussions: discuss everything to death, and once it's dead, discuss a little more. And, very important, always restate the same arguments as often as you can. Or, if it's not your own argument, restate the arguments of others. Ending with the same argument stated 100 times. Even if it has already been disproved.

I don't have a strong opinion about the topic in itself. However, there is something I find kind of funny: a statement brought up by the people who strongly oppose network-manager as a default.
A statement I've heard so often that I can't count it anymore.

The last time I’ve tried network-manager it sucked.

It often comes in different masquerades, like:

  • network-manager is crap.
  • network-manager is totally unusable
  • network-manager does not even manage to keep the network connection during upgrades

But it basically boils down to the basic essence of the sentence I've written above. Sometimes I ask people who express this opinion a simple question:

When did you test network-manager the last time?

The answers are different but again the basic essence of the answers is mostly the same (even if people would never say it that way):

A long time ago. Must have been around Etch.

And guess what: There was a time when I had a similar opinion. Must have been around Etch.
During the life cycle of network-manager between Etch and now, a lot has happened. I started using network-manager again at some point during Lenny development.
It is my daily driver for managing the network connections on my notebook. Yes, together with ifupdown, because, yes, network-manager does not support every possible network setup with all the special cases imaginable. But it supports auto-configuration of wired and wireless devices, connecting to a new encrypted network (either WLAN or 802.1x LAN), using UMTS devices, and tethering with a smart phone. And all of that: with a few mouse clicks.
Yes, it had some rough edges in that life cycle. Yes, it had that nasty upgrade bug, which was very annoying.
But face it: It developed a lot. Here are some numbers:
Diffstat between the etch version and the lenny version:
 362 files changed, 36589 insertions(+), 36684 deletions(-)
Diffstat between the Lenny version and the current version in sid:
 763 files changed, 112713 insertions(+), 56361 deletions(-)
The upgrade bug has been solved recently. Late, but better late than never.
So what does that mean? It means that if your last network-manager experience was with Lenny, or even worse around Etch, you had better give it another try if you are interested in knowing what you are talking about. For now it seems that a lot of people do not know. Not even remotely.

Let me introduce DPKG::Log and dpkg-report

We have customers who require a report about what we've done during maintenance windows. Usually this includes a report about upgrades, newly installed packages, etc., and obviously everything we've done apart from that.
Until now we've prepared these reports manually. For a larger bunch of systems this is a big PITA, because, to be somewhat useful, you have to collect the data for all systems and then prepare a report where you have:

  • The upgrades done on all systems (e.g. a libc or kernel update)

    separated from

  • the updates specific to a certain system or system class

It's also error-prone, because humans make mistakes.

Perl to the rescue!
At least the part about generating a report on installed/upgraded/removed packages could be automated, because dpkg writes a well-formed logfile to /var/log/dpkg.log. But I noticed that there apparently is no library specialised in parsing that file. It's not a big deal, because the format of that file is really simple, but a proper library would be nice anyway.
And so I wrote such a library.
It basically takes a logfile, reads it line by line and stores each line, parameterized, in a generic object.
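To illustrate why the format is simple: dpkg.log action lines are just whitespace-separated fields ("date time action package old-version new-version"). The line below is made up but format-correct, and even plain shell word-splitting can take it apart:

```shell
# A made-up, format-correct dpkg.log action line.
line="2011-04-01 20:31:24 upgrade libc6 2.11.2-10 2.11.2-11"

# Word-split the line into positional parameters.
set -- $line
action=$3 package=$4 old=$5 new=$6
echo "$package: $action from $old to $new"
```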

Its features include:

  • Parsing the logfile (obviously)
  • Storing each line of a logfile as a DPKG::Log::Entry object that holds information about the entry type (e.g. action or status update), the associated package (if any), the timestamp, the verbatim line, the line number, etc.
  • Limiting the lines parsed to a time range specified by DateTime objects

Based on that, I wrote another library DPKG::Log::Analyse, which takes a log, parses it with DPKG::Log and then extracts the more relevant information such as installed packages, upgraded packages, removed packages etc.

This, in turn, features:
– Info about newly installed packages
– Info about removed packages
– Info about upgraded packages
– Info about packages which got installed and removed again
– Info about packages which stayed half-installed or half-configured at the end of the logfile (or of the defined reporting period)

These libraries have already been uploaded to CPAN and packaged for Debian.
They passed the NEW queue very quickly and are therefore available in sid.

As an example use (and for my own use case, as stated above), I wrote dpkg-report, which uses the module and a Template::Toolkit based template to generate a report about what happened in a given logfile.
It currently lacks some documentation, but it works roughly like this:

Report for a single host over the full log:


Report for a single host for the last two days:

dpkg-report --last-two-days

Report for multiple hosts (and logfiles):
The script expects each log file to be named <hostname>.dpkg.log, so that it can guess the hostname from the file name, and it can grab all such log files from a directory, if a directory is specified as the log-file argument:

dpkg-report --log-file /path/to/logs

This will generate a report about all systems without any merging.

Report for multiple hosts with merging:

dpkg-report --log-file /path/to/logs --merge

This will do the following:

  • Report packages which were installed/removed/etc. on all systems separately from the upgrades only done on specific systems
  • If systems start with a common name and end in a number (e.g. mail1, mail2, mail3, etc.), report packages which were installed on all systems sharing that common name separately from the upgrades only done on the specific systems
  • For a specific system, list only the changes on that particular system and nothing else.
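The grouping by common name from the second bullet boils down to stripping the trailing number off a hostname. A tiny sketch with made-up hostnames:

```shell
for host in mail1 mail2 web1; do
  # Drop the trailing digits to get the system class ("mail", "web").
  class=${host%%[0-9]*}
  echo "$host belongs to class $class"
done
```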

A (fictive) report could look somewhat like this:

dpkg-Report for all:
Newly Installed:
bar (0.99-12 -> 1.00-1)
dpkg-Report for test*:
Newly Installed:
dpkg-Report for test1:
baz (1.0.0-1 -> 1.0.0-2)
dpkg-Report for test2:
zab (0.9.7 -> 0.9.8)

Currently this report generator is only included in the source package and in the git(hub) repository of the library. I wonder if it makes sense to let the source package build an additional binary package for it.
But it's only a 238-line perl script with a dependency on the perl library, so I'm unsure whether it warrants a new binary package. What do others think?

FAI, my notebook and me

I usually take my (company) notebook with me on business travels.
Twice now I've had the unlucky situation that something bad happened to it on such an occasion. Whenever you need to reinstall your system in a hotel room, you might have the same wish that I had: a way to quickly bring the system back into a state where I can work with it.

Well, I used FAI a while back for a customer. It's a really great tool for automated installations and I prefer it over debian-installer preseeding. Apart from the fact that the partitioning is way easier, it also gives me the power to complete the whole installation up to a point where there is almost nothing left for me to do. It also supports installing completely from CD or USB stick, which makes it suitable for me.

However, my notebook installation has a little "caveat" which made this a bit harder than I previously thought. As it is a notebook and I carry company data on it, it has to be encrypted; the disk as a whole.
The stable FAI version does not support this.
The problem is: the current support for crypto in setup-storage (FAI's disk setup tool) does not go very far. What is supported is creating a LUKS container with a keyfile, saving this keyfile to the FAI $LOGDIR and creating a crypttab.
Unfortunately, for a root filesystem this would leave us with an unbootable system, because it requires manual interaction. And on the other hand, using a keyfile for a crypto root is a no-go anyway. We want a passphrase.
On a side note: crypto-root support with a keyfile is more complex than with a passphrase, as you have to provide a script that knows how to get to the key.

So I started experimenting with scripts in the FAI configuration that added a passphrase and recreated the crypttab. That worked, although it was very ugly.
But thanks to a good cooperation on this with Michael Tautschnig, a FAI and Debian developer, the FAI experimental version 4.0~beta2+experimental18
now supports LUKS volumes with a passphrase that can be specified in the disk_config.

Now it's actually possible to set up a system like mine with FAI out of the box. One thing (apart from the FAI configuration and setup as you want and need it) has to be done anyway:
the initrd support of cryptsetup requires busybox (otherwise you will see a lot of "command not found" errors and your system won't boot) and it requires initramfs-tools, which is standard nowadays.
So you have to make sure that these packages are in your package config!

So now I can define a FAI profile for my notebook, create a partial FAI mirror with the packages it needs, and put all of this together on a USB stick with fai-cd (don't worry about the name; it can be used to create ISO images as well). I can carry it with me, and if I need it, I stick it into my notebook and let FAI automatically reinstall my system. Nice 🙂

Update: Somebody asked me whether he understood me correctly that I'd put my LUKS passphrase on a FAI USB stick in clear text. Obviously, the answer is and should be NO. What I do, and what I'd suggest to others: use a default passphrase in the FAI configuration and install with it; after all, on a fresh installation there is not much to protect. Once it is finished, *change* the passphrase to something secure by adding a new keyslot and removing the old one.

Getting the mp3 mess into control

I have some mp3 files in my collection. Some years ago, back in the times when I was a Windows user, I used "The Godfather" to keep the chaos under control. This software, although not open source, is really good at what it does. And it does almost everything to organize mp3s: from (mass-)tagging to auto-renaming to auto-sorting your mp3s. But this wouldn't be a blog entry of mine if it became… a praise of a Windows software, so…

When I became a daily Linux user I searched for alternatives, with or without a GUI, but I did not really find an application that suited my needs. As far as tagging and renaming are concerned it wasn't that hard. There are really excellent command line tools (lltag, id3, etc.) and rumour has it that there are also GUI tools.
But what I still did not find is a simple, yet flexible, tool to sort a huge collection of mp3s into a flexible structure on the filesystem, like "The Godfather" has been able to do for… ehh… a lot of years.

So the day before yesterday I finally decided to write one of my own and came up with an ~280-line perl script (POD documentation included) which does exactly what I want and is simple.
It does no tagging.
It does no renaming.
All it does is sorting mp3s into a given template-based directory hierarchy based on their ID3 tag.
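This is not the script itself, but the core idea — filling a directory template from tag values — fits in a few lines of shell (the tag values and template are made up; the real script reads them from the ID3 tag):

```shell
# Made-up tag values, as they would come out of an ID3 tag.
artist="Some Artist"
album="Some Album"
title="Some Title"

# The target layout is described as a template with placeholders.
template='${artist}/${album}/${title}.mp3'

# eval substitutes the placeholders; this is only safe because we
# control the template ourselves.
eval "target=\"$template\""
echo "$target"
```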

At this point a warning is due. I'm sharing that script with you under the terms of the GPL, BUT it still needs testing, and therefore you probably do not want to use it without its safety measures (e.g. dry-run, or copy instead of move) to avoid data loss. And note that although it has been written with portability in mind, it has only been tested on a Debian GNU/Linux system.

The script is hosted at github. Here (or raw to download it directly).

After I talked to a co-worker about the script, he told me that arename would do what I want. So you probably don't want to use my script, because arename is probably better tested and much more sophisticated. On the other hand, I had a quick look at the arename manpage, and it is so utterly feature-loaded that it cannot avoid a certain complexity. My tool is simple.
And there is another advantage: if you want arename to handle a certain number of mp3s, you have to use your shell magic to find the files and pipe them to arename. My script finds all mp3s recursively from where you let it start (default is $PWD, so be careful) and will happily move them into a hierarchy under a directory you specify. For the simple job of sorting mp3s it's probably easier to use.

(Oh, and even if it's worth nothing as a tool, it might still be of use as a simple programming example of how one could solve this problem in Perl.)