Posts Tagged ‘BSD’

Here a core, there a core, everywhere a core-core!

Wednesday, April 4th, 2007

Over at CompSoc (where this blog is hosted) we’ve had some issues with PHP for some time now.

PHP executed code as it should, but as it exited it dumped core on signal 11.

I’d probed this once before and spotted that the backtrace pointed the finger at the session module.

I decided to have another look at this today and, armed with the knowledge that PHP was dumping core unloading session.so, I did some searching.

It turns out that simply moving extension=session.so to the top of extensions.ini provides a fix.
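A minimal sketch of that reordering, assuming the file holds one `extension=...` line per module. The helper name `session_first` is made up for illustration, and the path in the usage note is an assumption about where the ports tree puts the file.

```shell
#!/bin/sh
# session_first FILE: print FILE with the session.so line moved to the top.
# Hypothetical helper; it only writes to stdout and never edits FILE in place.
session_first() {
    ini="$1"
    # Emit the session line first (nothing is printed if it is absent)...
    grep -x 'extension=session\.so' "$ini" || true
    # ...then every other line in its original order.
    grep -v -x 'extension=session\.so' "$ini" || true
}
```

Usage would be something like `session_first /usr/local/etc/php/extensions.ini > /tmp/extensions.ini.new`, then a diff of the two files before copying the new one into place.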

I have to say I’m most unimpressed with that.

CompSoc downtime

Thursday, February 8th, 2007

As a follow-up to the last post about /var filling up, here’s another one that’s equally crazy.

In an effort to fix /var once and for all I scheduled some emergency downtime last night. The aim was to make the /var partition bigger. This got off to a pretty bad start when, in true Solaris fashion, I attempted to drop noisy down to single user by typing init 0. In Solaris this drops a SPARC system to the OBP (like the BIOS), and just reboots x86 machines. In FreeBSD runlevel 0 is equivalent to Solaris runlevel 5… shut the system down and then power it off.

So, at about 22:15 last night I switched the primary CompSoc server off. Hardly the fix I was looking for.

After a number of calls to Andy and Inti I had somebody switch it back on (bear in mind that the system is in Manchester and I’m near London)... in an attempt to minimise the downtime, I decided to do the fix as soon as it came back up. This time I got the right runlevel… init 1, but (as I realise now) all sorts of crazy stuff happens to FreeBSD’s serial redirection in single user mode. It appears to knock off serial support and output only to the video console. Again, not a lot of good for me.

Later on in the day somebody else rebooted it and after a number of attempts to get things working, I decided that I’d do the fix in full multi-user mode. This involved disabling logins, stopping almost all services, etc. More lsof was used to determine what was using /var; these were stopped and when there were no open file handles I umount -f’d /var.

I dumped the contents of /var to a different disk and set about updating the disklabel.

# /dev/ad0s1:
8 partitions:
#         size    offset    fstype   [fsize bsize bps/cpg]
  a:   1048576         0    4.2BSD     2048 16384       8
  b:   8388608   1048576      swap
  c: 156296322         0    unused        0     0          # "raw" part, don't edit
  d:   1048576   9437184    4.2BSD     2048 16384       8
  e:  52428800  10485760    4.2BSD     2048 16384   28552
  f:  93381762  62914560    4.2BSD     2048 16384   28552

Above is the disklabel before the change… a is /, d is /var, e is /usr and f is /backup2. What I needed to do was grow d (currently just 512MB) by using some of the space from f, which was an unused backup directory. The obvious problem here is that /usr was in the way. My solution was to grow swap by 512MB, totally remove the d line, shrink f to around 8GB and rename it to d. This sounds a little complicated… it took me a while to get my head around it.
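The sums behind that plan can be checked with plain shell arithmetic. This is only a sketch: it assumes 512-byte sectors (which is what makes d come out at 512MB) and takes "around 8GB" to mean exactly 8 GiB. The variable and function names are mine, not anything bsdlabel provides.

```shell
#!/bin/sh
# All sizes and offsets are in sectors, copied from the label above.
SECTOR=512   # assumed sector size in bytes

sectors_to_mib() {
    echo $(( $1 * $SECTOR / 1048576 ))
}

b_offset=1048576
old_b_size=8388608
old_d_size=1048576      # the old /var
e_offset=10485760
e_size=52428800

# Swap absorbs the old /var so that nothing is left between b and e.
new_b_size=$(( old_b_size + old_d_size ))
new_b_end=$(( b_offset + new_b_size ))    # must equal e_offset

# The new /var starts where /usr ends, i.e. at the old offset of f.
new_d_offset=$(( e_offset + e_size ))
new_d_size=$(( 8 * 1024 * 1024 * 1024 / SECTOR ))

echo "old /var: $(sectors_to_mib "$old_d_size") MiB"    # 512
echo "new swap: $new_b_size sectors, ends at $new_b_end"
echo "new /var: $new_d_size sectors at offset $new_d_offset"
```

If `new_b_end` matches the offset of e, and `new_d_offset` matches the old offset of f, there are no holes and no overlaps in the new label.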

Prior to the change the on-disk layout was something like:

[ a (/) ] [ b (swap) ] [ d (/var) ] [ e (/usr) ] [ f (/backup2) ]

Now that I’ve made the changes the on-disk layout is more like:

[ a (/) ] [ b (bigger swap) ] [ e (/usr) ] [ d (/var) ]

The bsdlabel currently looks like:

# /dev/ad0s1:
8 partitions:
#         size    offset    fstype   [fsize bsize bps/cpg]
  a:   1048576         0    4.2BSD     2048 16384       8
  b:   9437184   1048576      swap
  c: 156296322         0    unused        0     0          # "raw" part, don't edit
  d:  16777216  62914560    4.2BSD     2048 16384   28552
  e:  52428800  10485760    4.2BSD     2048 16384   28552
The beauty of this (as far as I was concerned) was that everything was still contiguous, with no holes and no changing of partition letters. Next step was to newfs the new /var, mount it and restore the contents from the file on the other disk I previously mentioned. No major problems here, although I did manage to restore the contents of /var to both / and my personal home directory. Fortunately this mess was easy to clear up.
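Restoring into the wrong directory is easy to do when the archive unpacks into wherever you happen to be standing. One way to avoid it, sketched here with tar (an assumption: the post doesn't say which tool made the dump), is to always name the directory explicitly with `-C`. Both function names are hypothetical.

```shell
#!/bin/sh
# save_tree DIR ARCHIVE: archive the contents of DIR using relative paths,
# so the archive can later be unpacked into any directory.
save_tree() {
    tar -cf "$2" -C "$1" .
}

# restore_tree ARCHIVE DIR: unpack into DIR and nowhere else.
# Refuses to run if DIR does not exist, rather than falling back to the
# current directory.
restore_tree() {
    if [ ! -d "$2" ]; then
        echo "restore_tree: $2 is not a directory" >&2
        return 1
    fi
    tar -xpf "$1" -C "$2"
}
```

Because the target is an argument rather than the current directory, a stray `cd` can't send the files to / or a home directory.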

So, with all of the files back, I rebooted the box. It didn’t come back up.

After a lot of time talking Inti through the console (which I couldn’t get, because the machine was having none of single-user mode serial) we discovered that the only reason the system wouldn’t boot was that I hadn’t removed the /backup2 entry from /etc/fstab! D’oh! A rookie mistake (but one that I always make).
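A pre-reboot check along these lines might catch a stale fstab entry. It is only a sketch: `missing_fstab_devices` is a made-up name, and it looks only at `/dev/` entries in the first column, ignoring everything else an fstab can contain.

```shell
#!/bin/sh
# missing_fstab_devices FSTAB: print every /dev/ path named in the first
# column of FSTAB that does not exist on this system.
missing_fstab_devices() {
    # Comment lines and blank lines never match the /dev/ pattern.
    awk '$1 ~ /^\/dev\// { print $1 }' "$1" |
    while read -r dev; do
        [ -e "$dev" ] || echo "$dev"
    done
}
```

Running `missing_fstab_devices /etc/fstab` after editing the label and before rebooting should print nothing. Any path it does print is an entry that points at a partition which is no longer there.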

Once we got this removed the system shot up. Allow a few more hours to get both bump and noisy up with LDAP working and we once again have a fully running CompSoc.

It certainly didn’t go as planned, but I believe the end result is a good one:

    # df -h
    Filesystem     Size    Used   Avail Capacity  Mounted on
    /dev/ad0s1a    496M    383M     73M    84%    /
    devfs          1.0K    1.0K      0B   100%    /dev
    /dev/ad0s1e     24G     20G    2.0G    91%    /usr
    /dev/ad0s1d    7.7G    160M    7.0G     2%    /var
    /dev/da0       541G    190G    308G    38%    /data
    linprocfs      4.0K    4.0K      0B   100%    /usr/compat/linux/proc
    procfs         4.0K    4.0K      0B   100%    /proc
    devfs          1.0K    1.0K      0B   100%    /var/named/dev
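Output in this shape is easy to scan for filesystems that are getting full. The filter below is a rough sketch, not something from the original setup: `over_threshold` is a hypothetical name, and it assumes the capacity is in the fifth column and the mount point in the sixth, as in the listing above. It skips anything that isn't a `/dev/` device, since devfs and procfs always report 100%.

```shell
#!/bin/sh
# over_threshold LIMIT: read df-style output on stdin and print the mount
# points of /dev/ filesystems whose capacity is above LIMIT percent.
over_threshold() {
    awk -v limit="$1" '
        NR > 1 && $1 ~ /^\/dev\// {
            pct = $5
            sub(/%/, "", pct)                  # "91%" -> "91"
            if (pct + 0 > limit + 0) print $6
        }'
}
```

Fed the listing above with a limit of 90, this prints only /usr. In day-to-day use it would be `df -h | over_threshold 90`.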

We really need to work on getting serial output from FreeBSD working properly, not to mention installing a new network card so that we can use the internal 10/100 interface for IPMI, which will allow us serial-over-LAN and full remote power capabilities.

When I got home at 7PM I treated myself to a curry and an episode of Prison Break.

Apologies to anybody that was affected by the downtime!

Help! /var is shagged!

Wednesday, February 7th, 2007

The last couple of days have involved some seriously weird /var behaviour over at CompSoc.

I’d narrowed the filling up of /var down to Apache’s error log, and had been removing it more or less twice a day. Today I disabled the error.log file and restarted Apache… a measure designed to last us until I had a chance to properly fix the issue.

This evening I came along and had a prod:

    # df -h /var
    Filesystem     Size    Used   Avail Capacity  Mounted on
    /dev/ad0s1d    496M    488M    -32M   107%    /var

Yikes! That’s not good, but… what’s this?

    # du -sh /var
    143M    /var

wtf? This got me pretty confused, but after some searching around I came upon a handy article at www.cyberciti.biz/tips/freebsd-why-command-df-and-du-reports-different-output.html that led me down the path of lsof. Here’s what I did:

    # lsof|grep var | sort -r -k 7
    lsof: WARNING: compiled for FreeBSD release 6.0-RELEASE-p6; this is 6.2-RELEASE.
    httpd     91413          root   11w    VREG       0,92  361496571    32972 /var (/dev/ad0s1d)
    httpd     10028           www   11w    VREG       0,92  361496571    32972 /var (/dev/ad0s1d)
    httpd     10022           www   11w    VREG       0,92  361496571    32972 /var (/dev/ad0s1d)
    httpd     10021           www   11w    VREG       0,92  361496571    32972 /var (/dev/ad0s1d)
    httpd      9973           www   11w    VREG       0,92  361496571    32972 /var (/dev/ad0s1d)
    httpd      9881           www   11w    VREG       0,92  361496571    32972 /var (/dev/ad0s1d)
    httpd      9865           www   11w    VREG       0,92  361496571    32972 /var (/dev/ad0s1d)
    httpd      9809           www   11w    VREG       0,92  361496571    32972 /var (/dev/ad0s1d)
    httpd      9807           www   11w    VREG       0,92  361496571    32972 /var (/dev/ad0s1d)
    httpd      9654           www   11w    VREG       0,92  361496571    32972 /var (/dev/ad0s1d)
    httpd      7245           www   11w    VREG       0,92  361496571    32972 /var (/dev/ad0s1d)
    httpd      6896           www   11w    VREG       0,92  361496571    32972 /var (/dev/ad0s1d)
    httpd      5909           www   11w    VREG       0,92  361496571    32972 /var (/dev/ad0s1d)
    syslogd    9972          root   19w    VREG       0,85  110849268 58880006 /data/var/log/all.log
    httpd     91413          root   14w    VREG       0,92    4767735    32988 /var/log/httpd/access.log
    httpd     91413          root   13w    VREG       0,92    4767735    32988 /var/log/httpd/access.log
    httpd     10028           www   14w    VREG       0,92    4767735    32988 /var/log/httpd/access.log
    httpd     10028           www   13w    VREG       0,92    4767735    32988 /var/log/httpd/access.log
    [snip]

The first few are just fine, but what the hell is Apache doing with /var/log/httpd/access.log? It’s clearly not that big…

Anyway, a quick restart of Apache and /var returned to a far more reasonable size:

    # df -h /var
    Filesystem     Size    Used   Avail Capacity  Mounted on
    /dev/ad0s1d    496M    144M    312M    31%    /var
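A likely reading of the numbers, offered as an interpretation rather than something the original investigation confirmed: the httpd entries at the top of the lsof output have no file name, only "/var (/dev/ad0s1d)", and a size of 361496571 bytes. That is what a file looks like once it has been removed while a process still has it open. du can't see it because it has no name, but df still counts its blocks until the last descriptor is closed. The sketch below reproduces the effect with a throwaway file (all names are made up) and converts that size to MiB.

```shell
#!/bin/sh
# Show that removing a file which is still open does not release it.
demo_dir=$(mktemp -d)
log="$demo_dir/error.log"

exec 3>"$log"             # a long-running writer opens the log, as httpd does
echo "first line" >&3
rm "$log"                 # the name is gone, so du no longer counts the file
echo "second line" >&3    # the writer carries on; the blocks are still in use
write_status=$?           # 0: the write to the unlinked file succeeded
exec 3>&-                 # closing the descriptor is what frees the space
rmdir "$demo_dir"

# Size of the unnamed httpd entries in MiB, for comparison with df and du:
echo $(( 361496571 / 1048576 ))    # prints 344
```

344 MiB is almost exactly the gap between the 488M that df reported and the 143M that du could find, and it is the amount that came back when Apache was restarted. If that reading is right, emptying a busy log in place (for example `: > error.log`) should release the space without a restart, and `lsof +L1` should list any unlinked files that are still open.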

Panic over… for now.

FreeBSD 6

Wednesday, July 13th, 2005

With a little spare time today I’ve had a look at the latest information on FreeBSD 6.

There are some great new things being introduced: WPA authentication, sound fixups, a new dhclient and generally much greater stability. From what I’ve read I’m really liking how 6 is shaping up, and with any luck I’ll give it a whirl when I get back to England.

I don’t yet know the status of Beagle and other Mono-based apps, but I do know Mono is supported in the ports repository, so that’s a good start.

Maybe it’s not quite up to Ubuntu in terms of user-friendliness, but I certainly believe it has a great many other redeeming features.