Blogs.sun.com blogs

One of my (many) pet peeves is shell scripts that fail to delete the temporary files they use. Included in this pet peeve are shell scripts that create more temporary files than they absolutely need. In most cases that number is 0, but there are a few cases where you really do need a temporary file; if it is temporary, make sure you always delete it.

The trick here is to use the EXIT trap handler to delete the file. That way if your script is killed (unless it is killed with SIGKILL) it will still clean up. Since you will be using mktemp(1) to create your temporary file, and you want to minimize any race condition where the file could be left around, you need to do (Korn shell):

trap '${TMPFILE:+rm ${TMPFILE}}' EXIT

TMPFILE=$(mktemp /tmp/${0##*/}.temp.XXXXXX)

If further down the script you delete or rename the file, all you have to do is unset TMPFILE, e.g.:

mv $TMPFILE /etc/foo && unset TMPFILE
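
The pattern above can be tried out in isolation with a minimal sketch like the following. It is written for a POSIX shell rather than ksh specifically, and the function names and the "demo" file name template are made up for illustration. Each function runs in a subshell so that the EXIT trap fires when the subshell ends.

```shell
#!/bin/sh
# Sketch only: with_tmpfile, keep_tmpfile and the "demo" template are
# illustrative names, not taken from any real script.

# Create a temporary file, use it, and let the EXIT trap remove it.
with_tmpfile() {
    (
        TMPFILE=
        # Set the trap before creating the file so there is no window in
        # which the file exists but nothing is registered to remove it.
        trap 'if [ -n "${TMPFILE:-}" ]; then rm -f "$TMPFILE"; fi' EXIT
        TMPFILE=$(mktemp /tmp/demo.temp.XXXXXX) || exit 1
        echo "scratch data" > "$TMPFILE"
        echo "$TMPFILE"     # print the name so a caller can check it is gone
    )
}

# Variant: hand the file over with mv, then unset TMPFILE so the trap
# has nothing left to delete.
keep_tmpfile() {
    (
        TMPFILE=
        trap 'if [ -n "${TMPFILE:-}" ]; then rm -f "$TMPFILE"; fi' EXIT
        TMPFILE=$(mktemp /tmp/demo.temp.XXXXXX) || exit 1
        mv "$TMPFILE" "$1" && unset TMPFILE
    )
}
```

Calling `with_tmpfile` prints the name of a file that should no longer exist by the time the caller sees it, while `keep_tmpfile /some/path` should leave the renamed file in place.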
As noted here, I have been reading two books for a couple of months and not made much progress in recent times. My progress through the books under consideration tends to be directly proportional to the amount of time spent traveling on trains or planes, and I have not been doing much of either for the last few months, hence the slow progress. A bit more focus and a train trip meant I finished one of them last week.

Beautiful Security is a collection of 16 chapters written by 16 different people(s) with 16 different perspectives on 16 different aspects of security. This means there is no common thread other than it is about computer security. In my view this is no bad thing.

I think my favorite chapter was "The evolution of PGP's web of trust" by Phil Zimmermann and Jon Callas. The history and insight into the design decisions was really interesting. I also enjoyed the 1st chapter by Peiter Zatko on "Psychological Security Traps".

My interest in Computer Security got triggered about 6 months ago when I got cornered into helping 2 farmers run their PC and laptop. The virus and malware problems were just stunning. Work also had a few triggers (if you work for Sun ask me about the "find" incident) and this book has been very good at giving an informed view on 16 different areas of computer security.

After a couple of months off races, I am really looking forward to the Cardington Cracker in a couple of weeks.

Gregynog is a Country house near Newtown which was left to the University of Wales. For the last 13 years it has hosted a weekend of interview skills for 2nd Year Computer Science students from Aberystwyth University, of which I have managed to miss 2.

This year Paul Humphreys and I ran objective setting sessions. A bit like life coaching, but without the 100 quid an hour overhead.

So good luck to those whose objectives included

  • Propose to his girl friends (maybe I should have pointed him to the Kepner Tregoe process for decision analysis)
  • Eat a baby dolphin
  • Stop smoking (not sure what, best not to ask)
  • Get an industrial year which involves Android
  • Finish their assignment by an appropriate date
  • Write code to do xxxxx in C, C++, Haskell, Perl, Python, etc. and put it on their blog.

It takes all sorts to make a world. Many of the outcomes could have been tighter and better clarified, but it was an exercise in "How".

Both Paul and I also set out our list of 10 each, so I am off to finish my 2 books I have been reading since March and Paul will have dug manure into his allotment if it ever stops raining.

If you share a file system using the CIFS server (not SAMBA) and create a file in that file system using Windows XP the file ends up with these strange permissions and an ACL like this:

: pearson FSS 12 $; ls -vd Bad
d---------+  2 cjg      staff          2 Nov 13 17:11 Bad
     0:user:cjg:list_directory/read_data/add_file/write_data/add_subdirectory
         /append_data/read_xattr/write_xattr/execute/delete_child
         /read_attributes/write_attributes/delete/read_acl/write_acl
         /write_owner/synchronize:allow
     1:group:2147483648:list_directory/read_data/add_file/write_data
         /add_subdirectory/append_data/read_xattr/write_xattr/execute
         /delete_child/read_attributes/write_attributes/delete/read_acl
         /write_acl/write_owner/synchronize:allow

: pearson FSS 13 $; 


The first thing that riles some UNIX users is the lack of any file permissions, although things seem to work fine. The strange group ACL is for the local WINDOWS SYSTEM group. However, the odd thing for me is that it renders iTunes on the Windows system unable to see the files that it has created.

The solution is to add a default ACL to the root of the file system (well to every object in the file system if the file system is not new) that looks like this:

A+owner@:full_set:fd:allow,everyone@:read_set/execute:fd:allow

So this has the rather pleasant side effect of setting the UNIX permissions to something more recognisable:

: pearson FSS 20 $; ls -vd Good
drwxr-xr-x+  2 cjg      staff          2 Nov 13 18:16 Good
     0:owner@:list_directory/read_data/add_file/write_data/add_subdirectory
         /append_data/read_xattr/write_xattr/execute/delete_child
         /read_attributes/write_attributes/delete/read_acl/write_acl
         /write_owner/synchronize:file_inherit/dir_inherit/inherited:allow
     1:everyone@:list_directory/read_data/read_xattr/execute/read_attributes
         /read_acl:file_inherit/dir_inherit/inherited:allow
: pearson FSS 21 $; 

and the even more pleasant side effect of making iTunes work again!

I've said many times that dtrace is not just a wonderful tool for developers and performance gurus. The Kings of Computing, which are of course System Admins, also find it really useful.

There is an ancient version of make called Parallel make that occasionally suffers from a bug (1223984) where it gets into a loop like this:

waitid(P_ALL, 0, 0x08047270, WEXITED|WTRAPPED)	Err#10 ECHILD
alarm(0)					= 30
alarm(30)					= 0
waitid(P_ALL, 0, 0x08047270, WEXITED|WTRAPPED)	Err#10 ECHILD
alarm(0)					= 30
alarm(30)					= 0
waitid(P_ALL, 0, 0x08047270, WEXITED|WTRAPPED)	Err#10 ECHILD

This will then consume a CPU and the user's CPU shares. The application is never going to be fixed, so the normal advice is not to use it. However, since it can be NFS mounted from anywhere I can't reliably delete all copies of it, so occasionally we will see runaway processes on our build server.

It turns out this is a snip to fix with dtrace. Simply look for cases where the wait system call returns an error and errno is set to ECHILD (10); if that happens more than 20 times in a row for the same process, without that process calling fork in between, then stop the process.

The script is simple enough for me to just do it on the command line:


# dtrace -wn 'syscall::waitsys:return / arg1 <= 0 &&
execname == "make.bin" && errno == 10  && waitcount[pid]++ > 20 / {
	stop();
	printf("uid %d pid %d", uid, pid) }
syscall::forksys:return / arg1 > 0 / { waitcount[pid] = 0 }'
dtrace: description 'syscall::waitsys:return ' matched 2 probes
dtrace: allowing destructive actions
CPU     ID                    FUNCTION:NAME
  2  20588                   waitsys:return uid 36580 pid 29252
  3  20588                   waitsys:return uid 36580 pid 2522
  5  20588                   waitsys:return uid 36580 pid 28663
  7  20588                   waitsys:return uid 36580 pid 29884
 10  20588                   waitsys:return uid 36580 pid 941
 15  20588                   waitsys:return uid 36580 pid 1098

This was way easier than messing around with prstat, truss and pstop!
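
The counting rule in that predicate can be tried out away from DTrace with a small sketch. It traces nothing: it reads a made-up log format of `<pid> <event>` lines, where the event is either ECHILD or FORK, and prints each pid once its run of ECHILD events passes the limit. The function name and the input format are assumptions for illustration only.

```shell
#!/bin/sh
# Sketch of the counting rule only; no tracing and no stopping of processes.
# Input on stdin: lines of "<pid> <event>", event being ECHILD or FORK
# (a hypothetical format). Output: each pid whose run of consecutive
# ECHILD events exceeds the limit, printed once.
find_spinners() {
    limit=${1:-20}
    awk -v limit="$limit" '
        # A successful fork resets the run for that pid.
        $2 == "FORK" { count[$1] = 0 }
        # Same post-increment comparison as the D predicate.
        $2 == "ECHILD" && count[$1]++ > limit && !seen[$1]++ { print $1 }
    '
}
```

Feeding it a short log with a low limit, for example `find_spinners 2 < events.log`, makes it easy to see that a fork in the middle of a run of failures resets the count.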

At the request of the users, the access hours for Sun Ray users in the house have been relaxed so that on Friday and Saturday nights the Sun Rays in bedrooms can be used later.

This required that the access hours script be updated to understand the day of the week, and hence the access_hour file format has also changed in an incompatible way. There is now an extra column, the first after the name of the user, representing the days of the week on which the rule applies. The day of the week field will take a wild card '*', ranges (1-5 for Monday to Friday) or lists (1,3,5). Sunday is day 0, as any self-respecting geek would have it.

The new access_file I have looks something like this:

    user0:0-4:0001:2300:P8.00144f7dc383
    user2:0-4:0630:2300
    user3:0-4:0630:2230
    user4:0-4:0630:2100
    user4:5-6:0630:2200

The script is still here: http://blogs.sun.com/chrisg/resource/check_access_hours
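
As a rough illustration of how a day-of-week field like the one described above could be matched, here is a minimal sketch. It is not taken from check_access_hours. The function name is made up, and allowing ranges inside a list (such as 0,5-6) is my assumption rather than something the file format is known to support.

```shell
#!/bin/sh
# Sketch: does day-of-week $2 (0 = Sunday .. 6 = Saturday) match field $1?
# The field may be "*", a single day, a range "1-5" or a list "1,3,5".
# day_matches is a hypothetical helper, not part of the real script.
day_matches() {
    spec=$1
    day=$2
    if [ "$spec" = "*" ]; then
        return 0
    fi
    old_ifs=$IFS
    IFS=,
    set -- $spec            # split the list on commas
    IFS=$old_ifs
    for item in "$@"; do
        case $item in
            *-*)            # a range such as 1-5
                lo=${item%-*}
                hi=${item#*-}
                if [ "$day" -ge "$lo" ] && [ "$day" -le "$hi" ]; then
                    return 0
                fi
                ;;
            *)              # a single day
                if [ "$day" -eq "$item" ]; then
                    return 0
                fi
                ;;
        esac
    done
    return 1
}
```

Something like `day_matches 5-6 "$(date +%w)"` would then succeed only on the days covered by the second user4 line above, since `date +%w` also counts Sunday as 0.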

I spent the last 2 days at a customer site in the south east of England. On my way home last night I decided to explore a route up a mountain called The Blorenge. I did not take any pictures, though I am sure the view would have been great if it was light. Being November the 5th, I felt the youff of Abergavenny let me down somewhat with few fireworks going off.

The Blorenge from the north is just over 500 meters of ascent, some of which is up an old mine works incline and some on open hill side. Nearly all of it is steep, so until the top plateau there was little I was able to run. Still a great hill training venue which is quite reasonable to do at night. Indeed, I was quite surprised to see another set of lights out running who clearly knew an easier/better way down than straight back down the north face. I really missed my Mudclaws for the 1st 100m of descent.

Many thanks to Martin Beal and his blog for the idea. I have passed it hundreds of times, to my shame, but never thought of using it as a training ground and an excuse to break up the drive home. So if Martin at the top end of the sport can do the ascent in 21.5 minutes, those of us at the other end might find 30 minutes a good target. Last night the ascent took about 45 minutes to the plateau, but some of that was spent reading the route description and looking for the track in the dark.

Spent part of my day off doing 2 reps of Cader Idris, getting in just over 6000ft of ascent and 11 miles. Pictures are not great as I used a phone, choosing to leave the camera behind to save weight. The view from the top was hazy, so I did not bother to take any more. Still, a very nice morning out for mid October.

12 years ago to the day I turned up 90 minutes late for my 1st day in the OS Group (clearly a lad from the country who underestimated M25 traffic by quite a lot) at Watchmoor Park. So let's play one of those NLP type positive affirmation games, listing the 1st 12 things which come into my head about what has been great about the last 12 years at Sun, in no particular order:
  • CS-CTE, the people, the legacy (not just the mid week beers email alias) and its unspoken philosophy for solving hard problems, broadly similar to a synthesis of Von Neumann, Buddhism, a curious Jack Russell Terrier and The Ramones. If it was a religion, I would devote my life to it.
  • Rob Hulme
  • People called Phil, Andrew and Wayne who appeared at 1st meeting to have larger than life minerals, but in reality don't.
  • DTrace
  • Hyperactive children called White, Nash and Haslam
  • ATS/SGRT/SR/Rational Process/KT
  • The Melanson and Gardiner show
  • The post cuddly OS Group
  • Punjabi National Bank
  • Being an OS Ambassador before the program was castrated and had its limbs removed.
  • UK Academic customers, even the ones who expect you to commit Harakiri for the C compiler being removed from Solaris 2.0
  • Work from Home
Looking forward to more of the same, just different as the scale of change steps up a notch or two.
Nothing to do with David Lynch's rather strange early 90's TV series, but a short local mixed terrain race in Aberystwyth.

I need to stop doing the Aber Twin Peaks race. Despite the history (I ran past most of it without knowing about it) and the marshaling, organization, en-route support, etc. and goody bag all being top notch, at 7 miles and 1000ft it is probably incompatible with training for and racing in events like the Highland Fling and the Nant Peris Horseshoe, given my own unique combination of natural ability (not) and time to train. My time was about 10 seconds faster than last year at just under 62 minutes. So a year of training and not much progress? What did strike me was that after 5 minutes and a cup of tea I felt like I could have started running again. Last year I was a wreck for the rest of the day.

What I suspect has happened is that my VO2 max has remained much the same and the threshold at which the body chooses its form of fuel (glycogen or fat) has improved. The latter is not a significant part of this race, as the body typically has over an hour's glycogen store. I am much better at hill climbing and longer distances than I was this time last year.

So you get what you train for and I will continue following my running interest of longer mountain based races, but I think I will start to include a flat speed (all things are relative) session at least once every 2 weeks in my training.

I worked on a farm as my summer job while I was at School and University, so I feel justified in claiming some agricultural heritage by sweat and toil if not by descent. About once every 3 months I buy a copy of Farmers Weekly to keep some background in what is going on in agriculture.

Buying a copy of Farmers Weekly in mid-Wales has no shame attached to it. I bought a copy of something titled along the lines of "Railway Enthusiast" for my 3 year old, who is mad about trains. While I did not comment, I felt the need to tell the chap in W H Smiths in Manchester Airport, who I had never seen before and will never see again, that it was not for me. Still, small man has got my money's worth out of the train mag.

This week's copy of F. W. has an Opinion article by a Norfolk farmer called David Richardson, who comes across as a red-faced, older version of Ben Goldacre in wellies, which is probably a compliment. I suspect he would be quite happy if certain types of politicians were culled from politics.

Sound Science, however, is about evidence and research. It examines all possibilities. For that reason a scientist will never concede any product or process is 100% safe. They will admit while existing knowledge shows something is 99.999% OK, there is an outside chance something may be discovered that prevents 100% designation. It is this reluctance towards absolutism that makes them vulnerable to criticism by some.
My experience of mid-Wales farmers is a tendency to pick and choose from what science has to offer as it suits them, but that is probably a mirror of the population in general and our training to believe what the media tell us on science without questioning the quality of the copy.

Also in this issue was the Farmers Weekly's Awards 2009. A good example of an exception to David Richardson's wise words where Elin Jones the local A.M. (Welsh Assembly Government Member), and Christianne Glossop, the Welsh Chief Vet, have used a combined science and common sense approach to the very real problem of Bovine TB. There is no single "black and white" right course of action around this subject, but a complex set of tradeoffs and risks. Some may say it is political suicide for Elin won't be able to count on the support of the voting Badger population in Ceredigion any longer.

Back in the equally whacky world of diagnosing computer system problems, claiming to be 100% sure of the root cause or fix of a problem is a leading indicator of knowing too little about a computer system's ecosystem to make a useful contribution. I think so at least, but am open to evidence which may change my mind on this matter.

Since the "nevada" builds of Solaris next are due to end soon and for some time the upgrade of my home server has involved more than a little bit of TLC to get it to work I will be moving to an OpenSolaris build just as soon as I can.

However before I can do this I need to make sure I have all the software to provide home service. This is really a note to myself so I don't forget anything.

  • Exim Mail Transfer Agent (MTA). Since I am using certain encryption routines, virus detection and spamassassin, I was unable to use the standard MTA, sendmail, when the system was originally built and have been using exim from Blastwave. I hope to build and use exim without getting all the cruft that comes with the Blastwave package. So far this looks like it will be simple as OpenSolaris now has OpenSSL.

  • An imapd. Currently I have a Blastwave version, but again I intend to build this from scratch; again the addition of OpenSSL and libcrypto should make this easy.

  • Clamav. To protect any Windows systems and to generally not pass on viri to others clamav has been scanning all incoming email. Again I will build this from scratch as I already do.

  • Spamassassin. Again I already build this for nevada so building it for OpenSolaris will be easy.

  • Ddclient. Having dynamic DNS allows me to login remotely and read email.

  • Squeezecenter. This is a big issue and in the past has proved hard to get built thanks to all the perl dependencies. It is for that reason I will continue to run it in a zone so that I don't have to trash the main system. Clearly with all my digital music loaded into the Squeezecenter software this has to work.

I'm going to see if I can jump through the legal hoops that will allow me to contribute the builds to the contrib repository via Source Juicer. However as this is my spare time I don't know whether the legal reviews will be funded.

Due to the way OpenSolaris is delivered I also need to be more careful about what I install, rather than being able to choose everything. First I need my list from my laptop. Then in addition to that I'll need

  • Samba - pkg:/SUNWsmba

  • cups - pkg:/SUNWcups

  • OpenSSL - pkg:/SUNWopenssl

Oh and I'll need the Sun Ray server software.

Long, long time since I have been in the Arenig range and the Arenig race gave a perfect excuse. Pleased with a time of 1.19:54, it was as fast as I could have managed, with a bit of a sprinting tussle at the end. At 6 miles, it was shorter than I am used to, so I found the faster pace of a shorter race a bit of a struggle (you get what you train for), but the running was great and the descent was fast. The last mile along a disused railway line seemed to go on for ages. Pictures in the usual place.

Cracking race, starting in the middle of nowhere. Best soup and cakes of any race west of Offa's Dyke.

After a customer visit yesterday, I took some time out on the drive home for a jog (still tired from Sunday's race) up Moel Famau. Intersected by Offa's Dyke, I think it must have the best view in Wales. To the west Snowdon, Cader Idris, the Arenigs, etc. and to the east Liverpool, Manchester, Wrexham, etc. OK, so it has the best view to the West in Wales. Caught the sun setting behind Snowdon as I was starting to descend, stunning. Note to self, must remember to carry camera on such adventures. Great training area and a place it would be reasonable to run at night with a head torch.

The Sun Corp User Group meeting yesterday went well. We had a fair turn out, some good questions and I have been asked to do a couple of company specific sessions as follow on.

Sometimes you don't give the best answer when answering cold. Once you have had a little time to mull it over you think of a possibly better answer or an equally valid alternative approach (or you were just wrong the 1st time, it happens sometimes).

To the gentleman with the umount problem which fails every few mounts prior to a backup, I would also check the assumption that umount is failing because of a busy file. If you are getting an error message in the log which points that way, then fine; if not, it would be a good idea to get the error code of the umount2 system call when it does fail. We could use DTrace, but we could also wrap the umount command inside truss in the script:

truss -t umount2 -v umount2 umount /wibble

and make sure the output is logged. I still think the live dump is the right way to chase this if you can script it right and don't want to sit by the machine for months on end at 2am when the backup kicks off. This is still the wrong answer, as ZFS snapshots would allow you to avoid the umount altogether.
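
One way to make sure the result of the umount is captured is a small wrapper in the backup script. This is only a sketch: the function name and the log file path in the usage note are made up, and it relies on truss writing its trace to standard error, which is its default when no output file is given.

```shell
#!/bin/sh
# Sketch: run a command, append its stderr and exit status to a log file.
# run_logged and its argument order are illustrative, not from any real script.
run_logged() {
    log=$1
    shift
    "$@" 2>>"$log"
    status=$?
    printf '%s exit=%d cmd=%s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$status" "$*" >>"$log"
    return "$status"
}
```

In the backup script this might be called as `run_logged /var/tmp/umount.log truss -t umount2 -v umount2 umount /wibble`, so that the errno from a failed umount2 call ends up next to a timestamp.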

To the people concerned about Oracle performance on ZFS, the offer of 30 minutes or so of SharedShell (http://www.sun.com/123) still stands. One question I should have also asked is whether the ZFS intent log is split out and on separate fast storage.

On the way home I went for a run near New Radnor. Very nice area indeed and great running. I did come across a flag pole with a red flag and some signage which discussed explosives and this being a restricted area. From later discussions with a local, this was a test range; people with metal detectors turn up, hoist the flag and dispose of the various bits of scattered ordnance, and you really know when they are there.

I am staying on the paths next time. The thought does not appeal of either arriving at the pearly gates on a World War 2 mortar shell or being a minor character in 1 episode of Old Harry's Game who spends eternity debugging Windows and Linux performance issues with the promise of a DTrace port always a few days away.



In spite of the possible objective danger, some great running but I am keeping to the paths until the clearance work is complete in 2011.

My typical Sunday afternoon training regime has been more along the lines of I have 2 hours, go out and run and lets see where we end up before we (man plus dog) need to turn around. Somewhat more structured this afternoon with 3 reps up Pumlumon from the west. Only about 6 miles in total. The up bits were mostly runnable, the down bits were soft and fast and the flat bits were absent. Each rep included 300m of ascent and descent. Each time I (dog stayed at home today) got to the top the weather was different. Sunny 1st time, spitting rain 2nd time and in cloud with a cold wind the 3rd ascent, all in the space of about an hour.

No pictures or timings as the battery on both the Garmin and the camera were flat, not much forward planning there then.



In a quite different setting, on Tuesday I am presenting at the Sun Corporate User Group Breakfast Meeting in the City. More details can be found here. I doubt porridge will be served, but there is some coffee, pastry things and DTrace.

I've finally worked out how to drive purple-url-handler. Strictly John worked it out, so I will stand on his shoulders, but for some reason it would not work for me and I now know why and have a workaround.

First you need an XMPP URI on a web page. Something like:

xmpp:chrisg_fans@muc.im.sun.com?join

When clicked in a browser that has the right helper, something OpenSolaris has had for some time, this will take your IM client to that room. However, with pidgin that is only the case if the room is available on the first XMPP server listed in your list of accounts. So given that this room is on Sun's IM server, with the list of accounts looking like this:


It will try to connect to the first XMPP server listed, which is Google, and hence fail. Changing the order to be:


and then logging in and out, and the link will now work. You can drag and drop the entries in pidgin.
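
To see why the server part of the link matters, it can help to pull the URI apart. The sketch below only splits an `xmpp:` URI of the simple form shown above into its room, server and action. It says nothing about how purple-url-handler itself is implemented, and the function name is made up.

```shell
#!/bin/sh
# Sketch: split "xmpp:room@server?action" into its parts.
# parse_xmpp_uri is a hypothetical helper for illustration only.
parse_xmpp_uri() {
    uri=$1
    case $uri in
        xmpp:*) ;;
        *) return 1 ;;              # not an xmpp: URI
    esac
    rest=${uri#xmpp:}
    case $rest in
        *\?*)
            action=${rest#*\?}      # text after the first "?"
            jid=${rest%%\?*}        # text before the first "?"
            ;;
        *)
            action=
            jid=$rest
            ;;
    esac
    room=${jid%%@*}
    server=${jid#*@}
    printf 'room=%s server=%s action=%s\n' "$room" "$server" "$action"
}
```

The server value is the part that has to be reachable from whichever account the client picks, which is where the account ordering comes in.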

I was pondering a number of things on my evening run up Pumlumon last night via a new path which approaches from the west direct to the summit. No photos, as I decided to take an extra thermal in my bum bag rather than a camera, which turned out to be a good choice as the sun started to set. The main pondering was whether a footballer paid N millions a year had advanced their sport to the same extent as Sarah Outen or Jez Bragg.

I listened to Sarah Outen interviewed on Radio 2 and it was just very impressive what she had achieved in rowing across the Indian Ocean. It felt disappointing she was having to sell her boat to pay back a loan. So many daft things she could be getting on with which push the limit beyond what has been done before in a small boat with no friends.

I have not met Jez, but we have run in a few of the same races (Jez at the front, me at the other end of the field). 3rd in the Western States is an uncelebrated national success. Maybe that's the difference: real ambition and ability which push at the boundaries of what is possible do not get widely celebrated.

I don't find the calls for reform of bankers' pay any more or less compelling than the case for the reform of footballers' pay. Rowing solo across the Indian Ocean has a high level of obvious risk with a very personal set of rewards. I don't see a lot of risk in being a footballer or a banker, so the rationale for the level of monetary reward escapes me, as neither appears to be pushing the boundaries of what is possible and inspiring other people. Maybe that is entertainment for you. No one is going to pay to watch footage of Sarah rowing 10 hours+ a day in the middle of the ocean or Jez plodding down a trail which looks the same as the next trail.

This looks worth a play with.
Taken a week to write this up, but a lot has been going on work wise in my little world. 2 years in a row the Pumlumon Challenge got fantastic weather. For those who only visit mid-Wales for this event, it is always like that. Claims of average rainfall in the region of 1760mm are just figures picked from the Internet without any factual basis, hence my year round sun tan (not).

The race is part of the Vasque series of Ultra marathons, most of which are mountain based. I did the race last year for the 1st time and this year have done 2 other races in the series.

Wynne, the chief organizer, managed the most informal start for a race to date. Without any warning or build up, a very informal "off you go" was quite amusing.

I was still tired from the Nant Peris Horseshoe the week before, which probably demonstrates to me I gave the Peris my all and a week is not enough to recover at these distances, which is stating the obvious. It was hot, and I found after about 10 miles I was more tired than I should have been, so I was about 30 minutes slower than last year. Since the start is about a mile as the Red Kite flies from my house, I know the route quite well, which really helped on the descent down Drosgol, picking up about 10 minutes by following the secret quad bike tracks.

I enjoyed this year's race a lot more and even stopped to take a few pictures on top of Drosgol. Great event, fine organization (could have done with more water available at the bottom of Hengwm) and really good to see some of the people like Nick who I had met on other races in the series.

A couple of photos show the large number of native flies, which were not inclined to bite. Visit them now before the Wind Turbines scheduled to be installed around Nant-y-Moch are put in place (I feel one only has the right to pass adverse comments on such things when their loft is fully insulated, if you get my drift).

Cristina Cifuentes invited me to speak at Michael Chan's (Campus Ambassador - UTS) Software Freedom Day event at UTS yesterday. The broad guidance given to me by Michael was "Alan can you talk about Solaris, ZFS and Zones. If possible can you demonstrate this?".

OK, I've not done a lot with zones, especially under OpenSolaris, and I do love a challenge.

Last year I spoke on a number of things at the day at Sydney Uni and I was a bit disappointed in the slideware I created for the stuff on ZFS. This year I decided that the best slides that I had seen on ZFS were from Jeff and Bill in their ZFS - The last Word in Filesystems presentations, so I used a few slides from there. The fun part was as I only had the pdf file I had to recreate the wonderful drawings they used on the slides I wanted. I learned a bit about drawing in OpenOffice doing this :) (and was told after I finished that apparently there is an OpenOffice plugin that will let you pull images out of pdf documents. oh well).

I also found some good zones resources in Sun blogs, most notably Cloning Zones by Brian Leonard.

Wikipedia was also helpful verifying some timelines for Solaris and OpenSolaris.

At the end of the slides I flipped over to a couple of terminal sessions and demonstrated cloning some simple zones as I'd outlined in the talk. Very simple zones with no resources imported or networks. The impressive part was that we could demonstrate that with cloning we could provision and boot a new clone in under 10 seconds, which led nicely into Andrew Latham's talk on Cloud Computing.

Anyway, the pdf of the presentation can be downloaded here.

Today I took the plunge and moved from working on our Nevada based Sun Ray Servers to one running OpenSolaris. So that I could get the full OpenSolaris look and feel I first purged my home directory of a number of configuration files and directories using a script like[1] this:

#!/bin/ksh -p
TARGET=b4OpenSolaris
test -d $HOME/$TARGET || mkdir $HOME/$TARGET
mv $HOME/.ICEauthority $HOME/$TARGET
mv $HOME/.cache $HOME/$TARGET
mv $HOME/.chewing $HOME/$TARGET
mv $HOME/.config $HOME/$TARGET
mv $HOME/.dbus $HOME/$TARGET
mv $HOME/.dmrc $HOME/$TARGET
mv $HOME/.gconf $HOME/$TARGET
mv $HOME/.gconfd $HOME/$TARGET
mv $HOME/.gksu.lock $HOME/$TARGET
mv $HOME/.gnome2 $HOME/$TARGET
mv $HOME/.gnome2_private $HOME/$TARGET
mv $HOME/.gstreamer-0.10 $HOME/$TARGET
mv $HOME/.gtk-bookmarks $HOME/$TARGET
mv $HOME/.iiim $HOME/$TARGET
mv $HOME/.local $HOME/$TARGET
mv $HOME/.nautilus $HOME/$TARGET
mv $HOME/.printer-groups.xml $HOME/$TARGET
mv $HOME/.rnd $HOME/$TARGET
mv $HOME/.sunstudio $HOME/$TARGET
mv $HOME/.sunw $HOME/$TARGET
mv $HOME/.updatemanager $HOME/$TARGET
mv $HOME/.xesam $HOME/$TARGET
mv $HOME/.xsession-errors $HOME/$TARGET

I generated the list by installing OpenSolaris in a VirtualBox and then logging in and doing a bit of browsing and general usage and then seeing what was created. Additionally “.mozilla” was created but I chose to retain that so that I can keep all the history that is in my browser.
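
The same job can be done with a loop, which keeps the list of names in one place and skips anything that does not exist. This is a sketch in POSIX sh rather than the ksh script I actually ran, and the function name and its argument order are made up for illustration.

```shell
#!/bin/sh
# Sketch: move a list of dot files and directories into a holding directory.
# stash_dotfiles <home> <target> <name>... is a hypothetical helper.
stash_dotfiles() {
    home=$1
    target=$2
    shift 2
    mkdir -p "$home/$target" || return 1
    for name in "$@"; do
        # Skip entries that are not there so mv has nothing to complain about.
        if [ -e "$home/$name" ]; then
            mv "$home/$name" "$home/$target/" || return 1
        fi
    done
}
```

It could be called as `stash_dotfiles "$HOME" b4OpenSolaris .ICEauthority .cache .config` and so on through the list above, leaving “.mozilla” off the list so that it stays put.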

Once logged in I have removed the update-manager icon as I am not the administrator. I have also removed the power notification and network monitor as they provide no useful data on a Sun Ray server.

Using “System->Preferences->Startup Applications” I unchecked the codeina update notifier and added my script for updating my IM status.

So far so good, but it is taking a while to get used to the menu being at the top and the window list at the bottom of the screen.

[1] Like, as in similar to, and not this exact script, as mine had my home directory hard coded into it.

Some of the most common failures that result in customer calls are misuses of the memory allocation routines, malloc, calloc, realloc, valloc, memalign and free. There are many ways in which you can misuse these routines and the data that they return and the resulting failures often occur within the routines even though the problem is with the calling program.

I'm not going to discuss here all the ways you can abuse these routines, but look at one particular type of abuse: the double free. When you allocate memory using these routines it is your responsibility to free it again so that the memory does not “leak”. However you must only free the memory once. Freeing it more than once is a bug and the results of that are undefined.

This very simple code has a double free:

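#include <stdlib.h>

/* Recurse n levels deep, then free x at the bottom of the recursion. */
void
doit(int n, char *x)
{
        if (n-- == 0)
                free(x);
        else
                doit(n,x);
}
int
main(int argc, char **argv)
{
        char *x;
        char *y;

        x = malloc(100000);
        
        doit(3, x);     /* first free of x */
        doit(10, x);    /* second free of the same pointer: the bug */
        return (0);
}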

and if you compile and run that program all appears well;


However, a more realistic program could go on to fail in interesting ways, leaving you with the difficult task of finding the culprit. It is for that reason that libumem has good checking for double frees:


: exdev.eu FSS 26 $;  LD_PRELOAD=libumem.so.1 /home/cg13442/lang/c/double_free
Abort(coredump)
: exdev.eu FSS 27 $; mdb core
Loading modules: [ libumem.so.1 libc.so.1 ld.so.1 ]
> ::status
debugging core file of double_free (64-bit) from exdev
file: /home/cg13442/lang/c/double_free
initial argv: /home/cg13442/lang/c/double_free
threading model: native threads
status: process terminated by SIGABRT (Abort), pid=18108 uid=14442 code=-1
> ::umem_status
Status:         ready and active
Concurrency:    16
Logs:           (inactive)
Message buffer:
free(e53650): double-free or invalid buffer
stack trace:
libumem.so.1`umem_err_recoverable+0xa6
libumem.so.1`process_free+0x17e
libumem.so.1`free+0x16
double_free`doit+0x3a
double_free`doit+0x4d
double_free`doit+0x4d
double_free`doit+0x4d
double_free`doit+0x4d
double_free`doit+0x4d
double_free`doit+0x4d
double_free`doit+0x4d
double_free`doit+0x4d
double_free`doit+0x4d
double_free`doit+0x4d
double_free`main+0x100
double_free`_start+0x6c

> 

Good though this is, there are situations when libumem is not used and others where it can't be used[1]. In those cases it is useful to be able to use dtrace to do this, and anyway it is always nice to have more than one arrow in your quiver:


: exdev.eu FSS 54 $; /usr/sbin/dtrace -qs doublefree.d -c /home/cg13442/lang/c/double_free 2> /dev/null
Hit Control-C to stop tracing
double free?
	Address: 0xe53650
	Previous free at: 2009 Jun 23 12:23:22, LWP -1
	This     free at: 2009 Jun 23 12:23:22, LWP -1
	Frees 42663 nsec apart
	Allocated 64474 nsec ago by LWP -1

              libumem.so.1`free
              double_free`doit+0x3a
              double_free`doit+0x4d
              double_free`doit+0x4d
              double_free`doit+0x4d
              double_free`doit+0x4d
              double_free`doit+0x4d
              double_free`doit+0x4d
              double_free`doit+0x4d
              double_free`doit+0x4d

: exdev.eu FSS 56 $; 

If run as root you can get the real LWP values that did the allocation and the frees:

: exdev.eu FSS 63 $; pfexec /usr/sbin/dtrace -qs doublefree.d -c /home/cg1344>
Hit Control-C to stop tracing
double free?
	Address: 0xe53650
	Previous free at: 2009 Jun 23 14:21:29, LWP 1
	This     free at: 2009 Jun 23 14:21:29, LWP 1
	Frees 27543 nsec apart
	Allocated 39366 nsec ago by LWP 1

              libumem.so.1`free
              double_free`doit+0x3a
              double_free`doit+0x4d
              double_free`doit+0x4d
              double_free`doit+0x4d
              double_free`doit+0x4d
              double_free`doit+0x4d
              double_free`doit+0x4d
              double_free`doit+0x4d
              double_free`doit+0x4d

: exdev.eu FSS 64 $;

Here is the script in all its glory.

#!/usr/sbin/dtrace -qs

BEGIN
{
	printf("Hit Control-C to stop tracing\n");
}
ERROR 
/ arg4 == DTRACEFLT_KPRIV || arg4 == DTRACEFLT_UPRIV /
{
	lwp = -1;
}

pid$target::realloc:entry,
pid$target::free:entry
{
	self->addr = arg0;
	self->recurse++;
}

pid$target::realloc:return,
pid$target::free:return
/ self->recurse /
{
	self->recurse--;
	self->addr = 0;
}

pid$target::malloc:entry,
pid$target::memalign:entry,
pid$target::valloc:entry,
pid$target::calloc:entry,
pid$target::realloc:entry,
pid$target::free:entry
/ lwp != -1 && self->lwp == 0 /
{
	self->lwp = curlwpsinfo->pr_lwpid;
}

pid$target::malloc:entry,
pid$target::calloc:entry,
pid$target::realloc:entry,
pid$target::memalign:entry,
pid$target::valloc:entry,
pid$target::free:entry
/ self->lwp == 0 /
{
	self->lwp = lwp;
}

pid$target::malloc:return,
pid$target::calloc:return,
pid$target::realloc:return,
pid$target::memalign:return,
pid$target::valloc:return
{
	alloc_time[arg1] = timestamp;
	allocated[arg1] = 1;
	free_walltime[arg1] = 0LL;
	free_time[arg1] = 0LL;
	free_lwpid[arg1] = 0;
	alloc_lwpid[arg1] = self->lwp;
	self->lwp = 0;
}

pid$target::realloc:entry,
pid$target::free:entry
/ self->recurse == 1 && alloc_time[arg0] && allocated[arg0] == 0 /
{
	printf("double free?\n");
	printf("\tAddress: 0x%p\n", arg0);
	printf("\tPrevious free at: %Y, LWP %d\n", free_walltime[arg0],
		free_lwpid[arg0]);
	printf("\tThis     free at: %Y, LWP %d\n", walltimestamp,
		self->lwp);
	printf("\tFrees %d nsec apart\n", timestamp - free_time[arg0]);
	printf("\tAllocated %d nsec ago by LWP %d\n",
		timestamp - alloc_time[arg0], alloc_lwpid[arg0]);

	ustack(10);
}

pid$target::realloc:entry,
pid$target::free:entry
/ self->recurse == 1 && alloc_time[arg0] && allocated[arg0] == 1 /
{
	free_walltime[arg0] = walltimestamp;
	free_time[arg0] = timestamp;
	free_lwpid[arg0] = self->lwp;

	allocated[arg0] = 0;
}

pid$target::free:entry
/self->lwp && self->recurse == 0/
{
	self->lwp = 0;
}

[1] Most of the cases where it “can't” be used are because it finds fatal problems early on in the start up of applications. The application writers then make bizarre claims that this is a problem with libumem and will tell you it is not supported with their app. In fact the problem is with the application.

Iostat[1] has been around for years and, until DTrace came along and allowed us to look more deeply into the kernel, was the tool for analysing how the IO subsystem was working in Solaris. However, interpreting its output has caused problems in the past.

First if you are looking at latency issues it is vital that you use the smallest time quantum to iostat you can, which as of Solaris 10 is 1 second. Here is a sample of some output produced from “iostat -x 1”:

                  extended device statistics                 
device    r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b 
sd3       0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0 
                 extended device statistics                 
device    r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b 
sd3       5.0 1026.5    1.6 1024.5  0.0 25.6   24.8   0  23 
                 extended device statistics                 
device    r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b 
sd3       0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0 


The first thing to draw your attention to is the column “%b”, which the manual tells you is:

%b percent of time the disk is busy (transactions in progress)

So in this example the disk was “busy”, i.e. had at least one transaction (command) in progress, for 23% of the time period. That is 0.23 seconds, as the time period was 1 second.

Now look at the “actv” column. Again the manual says:

actv average number of transactions actively being serviced (removed from the queue but not yet completed)
This is the number of I/O operations accepted, but not yet serviced, by the device.

In this example the average number of transactions outstanding for this time quantum was 25.6. Now here is the bit that is so often missed. Since we know that all the transactions actually took place within 0.23 seconds, and were not evenly spread across the full second, the average queue depth when busy was 100/23 * 25.6, or 111.3. Thanks to dtrace and this D script you can see the actual IO pattern[2]:

Even having done the maths, iostat smooths out peaks in the IO pattern and thus under-reports the peak number of transactions as 103.0 when the true value is 200.

The same is true for the bandwidth. The iostat output above reports 1031.5 transactions a second (r/s + w/s), but again this does not take into account that all those IO requests happened in 0.23 seconds. So the true figure for the device would be 1031.5 * 100/23, which is 4485 transactions/sec.
If we up the load on the disk a bit then you can conclude more from the iostat output:

device    r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b 
sd3       0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0 
                 extended device statistics                 
device    r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b 
sd3       5.0 2155.7    1.6 2153.7 30.2 93.3   57.1  29  45 
                 extended device statistics                 
device    r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b 
sd3       0.0 3989.1    0.0 3989.1 44.6 157.2   50.6  41  83 
                 extended device statistics                 
device    r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b 
sd3       0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0 

The %w column is now non-zero. From the manual, %w is:

%w percent of time there are transactions waiting for service (queue non-empty)

This is telling us that the device's active queue was full. So on the third line of the above output the device's queue was full for 0.41 seconds. Since the queue depth is quite easy to find out[3], and in this case was 256, you can deduce that the queue depth for that 0.41 seconds was 256. Thus the average for the 0.59 seconds left was (157.2-(0.41*256))/0.59, which is 88.5. The graph of the dtrace results tells a different story:


These examples demonstrate what can happen if your application dumps a large number of transactions onto a storage device. The throughput will be fine, and the iostat data can make things appear OK, but if the granularity of the samples is not close to your requirement for latency, any problem can be hidden by the statistical nature of iostat.

[1] Apologies to those who saw a draft of this on my blog briefly.

[2] The application creating the IO attempts to keep 200 transactions in the disk all the time. It is interesting to see that it fails, as it does not get notification of the completion of the IO until all or nearly all the outstanding transactions have completed.

[3] This command will do it for all the devices on your system:

echo '*sd_state::walk softstate | ::print -d -at "struct sd_lun" un_throttle' | pfexec mdb -k

However, be warned: the throttle is dynamic, so dtrace gives the real answer.

What do you do if you manage to delete or corrupt /etc/name_to_major? Assuming you don't have a backup, a ZFS snapshot or an alternative boot environment (in which case you are probably in the wrong job), you would appear to be in trouble.

The first thing is not to panic. Do not reboot the system. If you do that it won't boot and your day has just got a whole lot worse. The data needed to rebuild /etc/name_to_major is in the running kernel, so it can be rebuilt from that. If your system is an x86 system it is also in the boot archive.

However, if you have no boot archive, or have overwritten it with the bad name_to_major, this script will extract it from the kernel, albeit slowly:

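#!/bin/ksh
# Ask the running kernel for the driver name behind each major number
# and print "name major" pairs in /etc/name_to_major format.
i=0
while ((i < 1000))
do
        # In ksh the read runs in the current shell, so $x is set here.
        print "0t$i::major2name" | mdb -k | read x && echo "$x $i"
        ((i += 1))
done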

Redirect that into a file[1], then move the remains of your /etc/name_to_major out of the way and copy the file into place.

Next time make sure you have a backup or snapshot or alternative boot environment!

[1] You will see lots of errors of the form “mdb: failed to convert major number to name”; these are to be expected. They can be limited to just one by adding “|| break” to the mdb line, but that assumes that you have no holes in the major number listings, which you may have if you have removed a device, so it is best not to risk that.

My winter bike is ready for another dose of rain and darkness. This year I have a new headlight, the 2.4W Schmidt Edelux: a single LED that throws out more light than my old 12V set up. The old Lumitec Oval Plus sensor failed at the end of last winter such that the only part that still worked was the lamp. Neither the sensor nor the switch would turn it off, and the standlight also failed. While I don't set much store by forward standlights, with so many failures it was only a matter of time before the light really failed, something I can't really risk.

So I have joined the 21st century and have an LED. I've only test ridden it up and down the road, which has street lights, so does not really do it justice but it was very impressive. Like the Oval Plus sensor it comes on automatically and has a manual override.

Unlike the Oval Plus, the switch is well protected, being a reed switch operated by a magnet, so there is no way for water to get inside. Like all Schmidt lights it has a 5 year warranty and looks fantastic.

It is powered by the SON hub generator and also switches the rear light.

Photo by Robyn Gerhard

The Nant Peris Horseshoe starts in Llanberis and takes in the summits of Elidir, Y Garn, Glyder Fawr, Lliwedd, Snowdon and Moel Cynghorion. It is considered by many runners to be the hardest fell race in the Welsh calendar at 17.5 miles and 8500ft of ascent. If you think differently and know of a harder fell race, apart from maybe the Welsh 1000m race and the Tryfan Downhill Dash, do add a comment.

I was very pleased indeed to finish in 5 hours and 13 minutes, which was in the region of 1 hour 20 mins faster than last year's epic. I had a target time of 5.30, so in general the race went well. The weather was nowhere near as hot and, apart from the rock being slippery, it was really good running weather with a stiff wind and cloud on the tops, but not so much as to make it cold or navigation difficult. It is the 1st race I have run where I felt I achieved my potential on the day. The ascents up Snowdon and Moel Cynghorion were very tough (there would be something wrong if they were not) and it was cheering for the 3 marshals on top of Snowdon to comment that I was in a much better state than last year.

Dilwen, if you read this, it is now one all for the only race that matters series, next year is the decider!

Mike Blake and friends did an excellent job of organization and marshaling, getting the combination of freedom, safety and challenge just right. The contribution of the 3 sponsors (Vic Hotel in Llanberis, 1st Hydro and the Snowdon Mountain Railway) makes a huge difference to the race in terms of access to land in the Llanberis slate quarries to run over, getting the 5 gallons of water to the top of Snowdon on the train and the accommodation for the race HQ. This is one aspect that I was not aware of. It did make me and others chuckle that the prize for the 1st man (I think this is right) was a pair of tickets on the Snowdon Railway to the summit. If you can run that race in 3 hours 12, you do not need the train to get to the summit. It would be a nice treat for his grandparents maybe.

Results will end up here and some photos of the race can be found here. This pic is taken about a mile or so from the start, hence the rather fresh look; this pic is taken about a mile from the end, hence the general focus on putting one foot in front of the other.

The next race, much closer to home (about a mile as the Kite flies), is the Pumlumon Challenge, which is good value at 27 miles and 5500ft and is part of the Ultra Running Championships. 3 of the races in this series in a year is enough for me.

The UK Corporate Solaris Users Group on Tuesday 29th September has a breakfast meeting where we are going to sprinkle a little DTrace on your cereal. DTrace was a requested topic at the last meeting, so in keeping with my preferred style of delivery, it will be demo only. Diagnosis and performance analysis is a full contact sport, so best to show it as it really is.

More details can be found here.

Editing sd.conf has always been somewhat difficult, thanks to it not being a documented interface: the interface was never intended to be exposed and it was even architecture specific. Fortunately Micheal documented it, which meant that it was known even if the syntax remained obscure.

However, after ARC case 2008/465 was approved and the changes pushed as part of bug 6518995, you can now use a more human readable syntax[1]:

sd-config-list=
        "ATA     VBOX HARDDISK", "disksort:false";

As it turns out, the “disksort”[2] option, along with throttle-max and throttle-min, are the ones I most often want to tune.

Here is the current list of tunables lifted straight from the ARC case.


Tunable_Name                Commitment  Data_Type
cache-nonvolatile           Private     BOOLEAN
controller-type             Private     UINT32
delay-busy                  Committed   UINT32
disksort                    Private     BOOLEAN
timeout-releasereservation  Private     UINT32
reset-lun                   Private     BOOLEAN
retries-busy                Private     UINT32
retries-timeout             Committed   UINT32
retries-notready            Private     UINT32
retries-reset               Private     UINT32
throttle-max                Private     UINT32
throttle-min                Private     UINT32


[1] This reminds me of the change to /etc/printcap that allowed you to specify the terminal flags as strings rather than as a bitmap. All the mystery seemed to be removed!

[2] While I used disksort as an example for this case, I can't think of any reason why you would have it enabled for a virtual disk in VirtualBox.

Compared to the last 2 years of mud, it was a relief that the worst the weather did was give a serious threat of drizzle, but backed off before going through with it. The 2 previous years where we spent 3 nights sleeping in a VW Transporter van with 2 children under 5 who spent the day playing in the mud did test the resolve.

The Beautiful Days festival at Escot Park in Devon is now in its 7th year and it has taken me a week to get round to writing it up. The Guardian describes the festival as a family based folk punk hoedown, which gives a high level flavor. It is a middle sized festival at around 10,000 people with a mix of bands drawn from folk, punk, reggae and rock spanning the last 40 years of music; for example Hawkwind played the same stage and evening as The King Blues. With a couple of stages, you can't see every band and we also spent some time in the kids area. My highlights were

  • Hawkwind : 40 years and still delivering the nearest thing to space travel through the medium of guitar based rock. Great laser show and they really deliver a wall of sound. Probably my top act of the weekend aided only by the products of the Otter Brewery which may have put me in a minority.
  • The highlight in terms of exposure to new bands was The King Blues which was a folk punk mash up with some punchy lyrics.
  • Another highlight of the past for me was The Blockheads. I am proud that the 1st record I bought at the age of 10 was "What a waste", giving me some punk credentials 31 years later. Really nice balance between remembering Ian Dury and being a current band.
  • I have seen the Levellers so many times, that I can't judge if they were better or worse than previous times or other bands. I enjoyed it and playing the whole of "A Weapon called the Word" panned out well.
  • It's hard to say it, but I found The Pogues a partial disappointment. There is something special about Shane MacGowan which still comes out, but he came across as being in a bit of a state (read hell of a state). Still, the Pogues were a band I had wanted to see for a long time and I am pleased I stayed around.
  • Pronghorn, the world's premier Cow Punk band, was good fun. Suspect they are in a Gene pool of one, but still a good act for an open air stage in the afternoon.
  • It should not work, but it did. Les Tribute are all dressed in red, mashing up covers of disco, rock, new romantic and every other type of chart hit from the 1980's with serious showmanship and humor. Went down really well. I am curious how they find bands like that.
  • I had not heard of Lamb, Gong or the Subhumans, but all were a good way to spend an hour or so.
  • I have seen Dreadzone a couple of times and they were on form. Small people really got into the spirit of it during their set.
  • Howard Marks was interesting. I disagree with almost everything he says, but he is still good value to listen to.
  • Mitch Benn who does the music bit on Radio 4 "The Now Show" was very good indeed.
  • There is a lot of other stuff going on, including full contact Morris dancing where they hit each other with 2 inch wooden sticks, various modern sculptures, a golf cart dressed up as a pirate ship and the option of Confession with Nuns who have whips (it would be quite hard to explain).

So, much fun had by the King family once again, made much easier by the weather and we hope to do it again next year.

An assortment of images.

Next race is the Nant Peris Horseshoe on Saturday at 18 miles and 8500ft of ascent where we have something to prove after last year's rather poor showing.