Ed's journal — LiveJournal

A new feature in RHEL and CentOS 7.4+ is Network Bound Disk Encryption (NBDE).

Specifically - it extends a LUKS encrypted volume, such that you can use some servers on the local network to perform the decryption automatically.

And in particular - this can be done on root volumes, meaning that _all_ of your 'at rest' data is encrypted. 

Why encrypt root? 

- Cloud hosts - you might very well want to have proprietary information encrypted at rest when being hosted in the cloud

- Desktops - if your machines are physically accessible they can be stolen or have drives removed. (Or just booted into 'recovery' mode, bypassing audit controls)

- Laptops - losing them on the train. 

But all these things come with a pretty significant drawback - in order to reboot them, you need someone to physically enter a password at boot time. That ends up being a pretty big problem if you - for example - want to patch and restart a batch of servers. 

So enter NBDE - the client 'talks' to some servers on the local network, and uses a key exchange to generate a passphrase for drive decryption. It can do this at boot time, by enabling the network and appropriate modules in dracut.

CentOS/RHEL 7.4+ ship with the packages needed to do this: 

Clevis (and clevis-dracut) for the client-side decryption.

Tang - the authentication/decryption server.

To set this up you will need 3 things:

A test host that you can reformat to encrypt as 'client'

A decryption server (to run tang). Ideally 2 - or more - for resilience. 
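As a sketch of how the pieces fit together - the hostname and device here are illustrative examples, and will need substituting for your own:

```shell
# On each decryption server - tang is a socket-activated service:
yum install tang
systemctl enable --now tangd.socket

# On the client:
yum install clevis clevis-luks clevis-dracut

# Bind an existing LUKS volume to the tang server. You'll be prompted
# for a current LUKS passphrase, and asked to trust the server's keys:
clevis bind luks -d /dev/sda2 tang '{"url":"http://tang1.example.com"}'

# Rebuild the initramfs so networking and clevis run at boot time:
dracut -f
```

(On later releases the subcommand is spelled `clevis luks bind`. With two or more tang servers, clevis's 'sss' pin lets you require any k-of-n of them for extra resilience.)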



This week I have mostly been fiddling with Docker and Elasticsearch/Logstash/Kibana. (Known as 'ELK').

The basics are something I've fiddled with before - elasticsearch is a NoSQL database that's built to shard and scale. Logstash is a log parsing tool, which extracts log metadata and inserts it into ... well, a variety of databases, but in this case I'm using elasticsearch.

And Kibana is a visualisation tool, that - amongst other things - has a configuration for doing logstash parsed logs out of an elasticsearch back end.

I've tried this before - and it worked fine - but what I wanted to try this time is making a scalable system. And thus docker containers. If you haven't encountered them, they're ... sort of like a mini virtual machine. You create a docker image - which is essentially an application, but bundled with all its dependencies.

And from the image, you create containers - runnable instances of an application. But the key point is, each container is ... well, self contained. All the dependencies are bundled up together, which makes them particularly portable - relocate and start wherever you need/want. (Well, provided you have at least a basic docker build - the whole point is you don't actually need to install much else).

But the thing I was trying to do here is use a private docker network, and create a set of containers that would basically auto-configure - allowing you to 'spin up' extra nodes as you need to.

With the elasticsearch database this is working nicely - because you're instantiating containers off images, you need to think in terms of persistence. You can therefore create and attach a 'storage' container, that _is_ persistent - and just attach to that with your current elasticsearch image.

But the base 'discovery' mechanism is an IP unicast, which allows you to specify a set of 'discovery' nodes to find the initial cluster. It works well enough, but it does require you have a particular set of IP addresses active.
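For flavour, the 'storage container' plus private-network arrangement looks something like this - the names, and the 2.x-era discovery setting, are illustrative:

```shell
# A private (multi-host capable) network, so nodes can find each other:
docker network create elk

# A 'storage' container that never runs - it just owns the data volume:
docker create -v /usr/share/elasticsearch/data --name es-data \
    elasticsearch /bin/true

# An elasticsearch node, attached to the network and the storage, with
# unicast discovery pointed at the expected node names:
docker run -d --name es1 --net elk --volumes-from es-data \
    elasticsearch -Des.discovery.zen.ping.unicast.hosts=es1,es2
```

Spinning up an extra node is then just another `docker run` with a new name (and its own storage container), pointed at the same discovery hosts.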

Logstash/Kibana are a bit less good at the dynamic discovery, so I'm still working on that. Logstash, given its near-real-time nature, shouldn't be too hard to start/stop and do node discovery as part of the startup script; Kibana is a bit less easy.

So I'm thinking I might try looking at haproxy next, or some other discovery mechanism.

But otherwise, as it stands - I've got a container 'set' with which it took me about 10 minutes to start up an extra 'node' in my cluster, to add storage/compute resources. (And most of that was installing the updates I needed for docker-engine to do the multi-host networking).

So all good so far.
use strict;
use warnings;

use Text::CSV;
use Data::Dumper;

my %count_of;
my @field_order;

foreach my $file (@ARGV) {
    my $csv = Text::CSV->new( { binary => 1 } );
    open( my $input, "<", $file ) or die "Can't open $file: $!";
    my $header_row = $csv->getline($input);
    foreach my $header (@$header_row) {
        if ( not $count_of{$header} ) {
            push( @field_order, $header );
        }
        $count_of{$header}++;
    }
    close($input);
}

print "Common headers:\n";
my @common_headers = grep { $count_of{$_} >= @ARGV } keys %count_of;
print join( "\n", @common_headers ), "\n";

my %lookup_row;
my $key_field;
if (@common_headers) { $key_field = pop @common_headers }

foreach my $file (@ARGV) {
    my $csv = Text::CSV->new( { binary => 1 } );
    open( my $input, "<", $file ) or die "Can't open $file: $!";
    $csv->column_names( @{ $csv->getline($input) } );
    while ( my $row_hr = $csv->getline_hr($input) ) {
        my $key = $.;    # fall back to the line number if there's no shared key
        if ($key_field) {
            $key = $row_hr->{$key_field};
        }
        $lookup_row{$key}{$file} = $row_hr;
    }
    close($input);
}

my $csv_out = Text::CSV->new( { binary => 1 } );
$csv_out->print( \*STDOUT, \@field_order );
print "\n";

foreach my $key ( sort keys %lookup_row ) {
    my %combined_row;
    foreach my $file ( sort keys %{ $lookup_row{$key} } ) {
        foreach my $header (@field_order) {
            if ( $lookup_row{$key}{$file}{$header} ) {
                if ( not defined $combined_row{$header}
                    or $combined_row{$header} ne
                    $lookup_row{$key}{$file}{$header} )
                {
                    $combined_row{$header} .=
                        $lookup_row{$key}{$file}{$header};
                }
            }
        }
    }
    my @row = map { $_ // '' } @combined_row{@field_order};
    $csv_out->print( \*STDOUT, \@row );
    print "\n";
}



There have been instances recently of people saying or doing something inappropriate, and there being an associated furore over it.

A scientist making sexist comments about 'distractingly sexy' women in the lab.

A guy wearing a 'pin ups' T-shirt when talking about a space mission.

Or the whole 'sad puppies' thing around the Hugo awards.

Even "gamer gate".

The problem in these scenarios is that there seems to be an urge to categorize people as either 'good people' or 'bad people'. And then there's a massive debate.

There is just no such thing as an unambiguously good - or bad - person. It's never so simple. There's no ethical calculus that lets you be a saint for 40 years, and then get a free pass on murdering a baby or two.

Nor do you -ever- get to 'cancel out' past mistakes. You can seek redemption, but the only way you ever get it is via forgiveness, not fixing the past.
So it's actually quite harmful to apply this sort of abstraction. We saw this in the Jimmy Savile affair (and many many other examples). People who couldn't believe he was doing what he was doing, because of all the good things he did.
And the truth is - he did both. He _did_ do a lot of good, and raise a lot of awareness and supported charities. We're fools if we dismiss that in light of the subsequent revelations.
But likewise - that doesn't excuse - or prevent from happening - the _other_ things he (allegedly?) did.
And the same is true of pretty much everyone. Everyone has a price. Everyone has pressure points. Everyone has weaknesses. Everyone has prejudices. Everyone makes mistakes.
And in some cases - these prejudices, mistakes and weaknesses lead to causing a harm that can never be repaired. And you just have to live with that. That doesn't mean you cannot do better, or indeed that you become irredeemable. It doesn't make you unforgivable.

Think how horrible _that_ would be. One mistake, you're now a 'bad person' and that is that.

A lot of people misunderstand what it is to forgive. It isn't about letting something pass, and saying 'never mind, it doesn't matter'. If it didn't matter, it wouldn't need to be forgiven. It's about letting go of _your_ pain, anger, hate or fear. Acknowledging a harm - understanding that it hurt and always will - but letting go of its ability to control your future.
Continuing to hate someone is ultimately very poisonous. It taints your world view massively.
That doesn't mean we should let "wrong" pass - not by any means. Challenge it whenever you can, especially when it's at a point that by doing so, you can change the course of a thing. But merely accept that every person in the world is a complex bundle of ambiguities, and picking fights and censure rarely changes anyone's opinion.
And you never get to 'fix' the past.

This started as a comment on a facebook post, but turned into a bit of a rant.

One of my pet peeves with IT (there are many; this is just one) is the notion of 'daily checks'.

Some places have a daily checklist, that's a list of tasks they have some IT person look at, each day, to make a note that everything is OK.
This is based on a fundamentally flawed assumption - that somehow a human is better at a routine task than a computer.

This is just plain wrong. Yes, there are some things that a person will be able to spot that a computer won't. But these are not things that go on a daily check list. They're the things you see when you roll up your sleeves and do an end-to-end diagnosis.

Otherwise... a computer can do a 'daily check' much more frequently than a person can. It can do it all day, every day. And it can notify you when there's a problem. By doing so, you don't get the 'road blindness' effect - people are bad at paying attention to persistent states, they're much better at picking out anomalies.

If the light is always yellow on your system, then you won't notice when a _new_ 'yellow' alert shows up.

So really - if your 'daily check' is any more involved than 'check your email' or perhaps 'open your monitoring portal' - then you're doing it wrong. Make your computers keep watch on each other, because that way you'll know what's wrong, when it went wrong and you won't have to wait up to 24 hours before you spot the problem.

Which, let's face it - if anything is significantly wrong, your phone will already be ringing anyway.
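As a trivial sketch of the principle - a machine-run check that stays silent when everything is fine, and only produces output (which cron will happily mail you) on an anomaly. The threshold and the helper name are made up for illustration:

```shell
# check_disk reads 'df -P' style output on stdin, and prints a line only
# for filesystems at or over the given percentage threshold:
check_disk() {
    awk -v limit="${1:-90}" 'NR > 1 {
        used = $5
        sub(/%/, "", used)
        if (used + 0 >= limit) printf "ALERT: %s is %s%% full\n", $6, used
    }'
}

# In real use, from cron:  df -P | check_disk 90
# A canned example - only the full filesystem produces any output:
printf 'Filesystem 1024-blocks Used Available Capacity Mounted\n/dev/sda1 100 95 5 95%% /\n/dev/sdb1 100 10 90 10%% /data\n' | check_disk 90
# prints: ALERT: / is 95% full
```

Silence means all is well - which is exactly the property a human checklist lacks.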
I'm sure many of you will have noticed - several accounts have been disabled by Facebook, thanks to their policy that 'thou shalt use real names'.

They're under some misguided concept that anonymity leads to trolling, or maybe it's just marketing.
Whatever. It's extremely misguided.

I've used my real name on Facebook since the start. It's not particularly bothered me. However I'm also quite well aware that as a straight, white, middle class citizen of a first world country... the world is actually pretty well geared up to my convenience.

There are quite a few reasons why someone might not want to use their real name - but oddly, the vast majority of them _don't apply_ to the straight, white, middle class citizens of the world such as say, Mark Zuckerberg or the majority of employees of Facebook. http://www.usatoday.com/story/tech/2014/06/25/facebook-diversity/11369019/

Now, leaving aside _malicious_ reasons to 'fake up' a facebook persona (I can think of a few. Fraud, stalking, anonymous trolling).

There's also some really quite serious reasons why removing anonymity is an evil thing to do:
- Victims of domestic abuse. Children might want to avoid being tracked down by an abusive parent. Partners might want to avoid being tracked down by an abusive ex.
- Victims of crime in general - as a straight, white, middle class male you will likely never have to worry about being targeted by a rapist.
- People who work in particular professions - are seriously disadvantaged when their personal and professional lives collide. Police officers, teachers, social workers... are all much more at risk of online abuse as a result of what they do.
- LGBT individuals, who risk harassment or abuse - let's not forget, there are some countries that treat homosexuality as a crime.
- Political activists and dissidents - How much do you trust your government anyway? Do you think people who live in say, Syria, or China feel the same? Do you really think it's 'fair' to put these people at risk?
- People in responsible positions, such as in banks, who are at risk of being targeted by organised crime.

One of the things I particularly remember is a person I met who used to work in the prison service. One of the things he had hammered into him on his first day: DISCLOSE NOTHING. Because they'll be in close proximity to some serious criminals for extended periods. Some of which were clever enough and sociopathic enough to put together a profile on 'the screws'. And if they ever completed their picture, at some point a nice man would show up outside their house, and ... would probably offer to do them a favour. But would make clear that refusal wasn't an option, and things could get extremely nasty if they didn't want to co-operate.

Just by disclosing their favourite pub and which footie team they supported... they'd been tracked down, and their family put at risk. Coerced into corruption.

At which point they're basically screwed. Quitting your job might remove your risk of being corrupted, but there really is no guarantee that a criminal group won't retaliate. Not everyone can afford to provide their own personal witness protection scheme. Until you've experienced what a systematic campaign of petty harassment and vandalism can feel like - you really don't appreciate just how horrifically destructive it might be.

But the thing is - Google started requiring Real Names when they started G+.
They've since dropped it, presumably because they realised that 'Don't Be Evil' didn't really include putting people at risk simply because they've never had to worry about being victimised.


And to top it all - a firstname/lastname is really a very westernised view of the world. Not every country works like that.

So seriously Mr. Zuckerberg and Facebook. Think hard about the people who'll be burned by this policy.
You may have found Object Oriented Perl useful - it's not a tool for every job, but if you've a complicated data model, or a data driven process, it's invaluable. (Not to mention code encapsulation - but that doesn't actually seem to come up as much).

You may have also found being able to thread and queue useful. (Perl threading and queues)

However what you'll also have probably found is that multithreading objects is a significant sort of nuisance. Objects are jumped-up hashes, and there are some quite significant annoyances with sharing hashes between threads.

However, another module which I find very useful in this context is Storable (CPAN).
What Storable does - essentially - is allow you to easily store and retrieve data to the local filesystem. It's geared up to hashes particularly:
use Storable;  
store \%table, 'file';  
$hashref = retrieve('file');  

This is quite a handy way for handling 'saved state' in your Perl code. (Less useful for config files, because the 'stored' file is binary formatted).

However what Storable _also_ supports is objects - which as you'll recall from the previous blog post are basically hashes with some extra bells and whistles. Better yet, there are two other functions that allow Storable to serialise to and from memory:
my $packed_table = freeze ( \%table );  
my $hashref = thaw ( $packed_table );   

This also works very nicely with objects, which in turn means you can then 'pass' an object around a set of threads using the Thread::Queue module.
use Storable qw( freeze thaw );
use threads;
use Thread::Queue;
use MyObject;

my $work_q = Thread::Queue->new();

sub worker_thread {
    while ( my $packed_item = $work_q->dequeue ) {
        my $object = thaw($packed_item);
        $object->run_some_methods();
        $object->set_status("processed");
        # maybe return $object via 'freeze' and a queue?
    }
}

my $thr = threads->create( \&worker_thread );
my $newobject = MyObject->new("some_parameters");
$work_q->enqueue( freeze($newobject) );
$work_q->end();
$thr->join();

Because you're passing the object around within the queue, you're effectively cloning it between threads. So bear in mind that you may need to freeze it and 'return' it somehow once you've done something to its internal state. But it does mean you can do this asynchronously without needing to arbitrate locking or shared memory. You may also find it useful to be able to 'store' and 'retrieve' an object - this works as you might expect. (Although I daresay you might need to be careful about availability of module versions vs. defined attributes if you're retrieving a stored object)


You may not have heard of Kerberos. But there's a pretty good chance that you've used it, if you've used Windows in a place of work in the last ... 10 years or so.

It's a method of single sign on, designed at MIT about 20 years ago. It's really quite clever - so much so, that no one's managed to beat it in that time. It was intended to be a way of authenticating users in an untrusted network, for Unix.
Ironically - it was Microsoft that turned it 'mainstream'. Active Directory is - basically - a combination of Kerberos and LDAP. (Which are the two key elements of a Kerberos authentication domain).

The reason it's quite clever? Well, prior to its invention, Unix (and Windows) basically had an account per server. It had extended a little into 'shared' accounts with things like NIS and YP. (Which is basically a 'shared' account list, that each server can authenticate against if it wishes).

But you still had to type a password in to each server you logged in to. You could set up some sort of 'override' (rsh 'authorized hosts' and later ssh public/private key pairs) but it didn't handle network level authentication.

What kerberos does, is allow you to 'declare' your identity to an authorisation server (the Key Distribution Centre, or KDC - which in Windows is an Active Directory domain controller). It uses encryption to handle the authentication mechanism - which is another clever innovation, because you then don't have to send your password in the clear.

You encrypt - locally - a message. You send it to the DC. Which then - because it 'knows' your password - can decrypt the message. And send you one back, encrypted the same way. To prevent shenanigans, it requires you to encrypt the current time, to make replay attacks harder. (Which is why AD/Kerberos breaks when your clocks are more than 5 minutes out of sync).

It issues a 'ticket granting ticket' (TGT). This is a 'backstage pass', and - provided it's still valid - can be used to request access to other services in the network. You request access to another service by 'asking' for a ticket for it - the KDC then (because it knows the 'machine account' password for the server) sends _you_ a ticket, containing an (encrypted) authorisation. The server you're trying to access can decrypt it (using its machine account credentials).
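From a Linux client, that dance is visible with the MIT Kerberos command line tools (realm and hostname here are examples):

```shell
kinit alice@EXAMPLE.COM    # authenticate once; this obtains the TGT
klist                      # show the TGT, plus any service tickets held

# Subsequent kerberised connections need no password - e.g. OpenSSH with
# GSSAPI authentication (-K also forwards/delegates the credentials):
ssh -K fileserver.example.com
```

Run `klist` again after the ssh and you'll see the service ticket for the file server has appeared alongside the TGT.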

And because stuff is handed around encrypted (Kerberos doesn't explicitly specify encryption mechanisms) you get a way of proving you are who you say you are, and that your remote server is also the one you expected to be talking to - the message can only be decrypted by its intended recipient.

It's actually pretty cool - Single Sign on is something that remains a challenge to implement (securely/safely). And Kerberos is about the only game in town.
Is there anything quite as annoying as waiting for a delivery that doesn't arrive?
I think I have one - the delivery that arrives, but the driver doesn't even bother to knock on your door.

Interlink Express - marvelous chaps - were due to deliver my parcel today.
They even gave me a very specific 1 hour time window, and a button to reschedule, via a text message.

I was due to be working from home, so I didn't do any rescheduling. So at about 12:15, I heard my letter box rattle - that's about when the post arrives, so I wandered over to see what had arrived. (Probably more bills :().
But no, it was a 'sorry we missed you' card from Interlink express, explaining how - because I wasn't in to sign for this consignment - they had taken it away again.

Now, I was prepared for this eventuality - after all there's no guarantee I wouldn't be on the toilet when my parcel arrived, or something.

However - I'm absolutely, 100% certain that the guy didn't even knock. Because I heard the letterbox rattle, as the card came through it, and - just about - heard the sound of a van leaving as I got to the doormat.

I'm bemused. This delivery has cost me ... oh, £6.99 I think it was?
But they've clearly gone to the effort to drive to my house, find the right door, and put a card through it. Is there some mystery reason why the last 30s of effort might be just a bit too much?
Do delivery drivers have a time window 'per delivery' which they have to be careful not to overrun? So are in danger of getting to 14m30 on this delivery, and simply not have time to _actually_ unload it, because they only have 30s left?

I'm really not sure. All I know is, I'm immensely irked by the frankly shoddy customer service this represents. I was at home, waiting for the delivery - and they've treated it as a game of 'knock down ginger'. (Only without the knocking).

PS - I hope Interlink Express customer services have set up google alerts, and therefore will see this post. Hi there. I'm a disappointed customer.
The Vale Wildlife hospital again, put out a call for volunteers on Facebook - at this time of year in particular, they're really rather busy, and it's the holiday season. So I sort of got volunteered - their hedgehogs needed a bit more attention.

With a day starting 'before 8am' (on a Saturday *shudder*) I made my way to the hospital. Made a brief hello, and then started to give the guy who actually knew what he was doing a hand. The work at hand was the hedgehogs. This was a ... room? shed? well, a place with a number of hedgehog enclosures. They were due their weekly weighing, feeding and cleaning. Mostly these were recuperating patients, who'd been admitted for a variety of reasons - usually 'got stuck somewhere' or otherwise needed rescuing - one had fallen into a swimming pool, for example.

A few were pregnant or had recently given birth - a few enclosures had '5-ish' hoglets. Which look a lot like miniature hedgehogs, but their spines haven't actually hardened and gone prickly, and they're just a bit smaller and more wobbly. Working down the line of enclosures involved scooping the hedgehog(s) out with a pair of leather gloves. I was on the 'larger end' of their volunteers - most, particularly on this day, seemed to be female and college/university age - and so the gloves in the room were just too small.

(This is the 'before' photo - these little ones are in the process of being moved, one at a time, to a new, clean enclosure, after being weighed).

But all in all, an interesting sort of a day. I'm unfortunately not going to be in a position to do a regular shift at the hospital, but think I shall keep an eye out for when they're short-handed again. (I'm on the list as being able to fetch rescuees on my way home from work, which is a little less time intensive.)
A couple of weekends ago, on a bit of a quiet afternoon, we took a bat in a box for a ride in the car. A had spotted a post on a facebook group from the Vale Wildlife hospital, that ... they'd had a report of bat that needed rescuing, and they were really busy. (They basically always are).
With my vast experience of bats (i.e. none at all) and being at a vague loose end, we volunteered to go be bat-taxi. Slightly concerned that - as a protected species - handling them was a 'no-no', we were reassured that it was in a box.

Off we went to a hotel in South Gloucestershire, to fetch said bat. It's a rather pretty place, and I trundled into reception to declare 'I'm here about the bat'. I was treated to a stroll "below stairs" - an odd split, as the luxurious plush carpets and rich furnishings gave way to whitewashed walls and slightly battered lino. And there was the bat - in its 'box'.

... which when I thought of a box, I thought 'with a lid' - but it turns out their definition was more like a flimsy cardboard tray. (I can only assume they hadn't worked out that, y'know, bats can fly).

When asked 'so what species was it' I had to ad-lib slightly, and point out that bats really weren't my field. (Implying perhaps that I had any clue whatsoever about ... well, any form of wildlife at all). (At the hospital, they were happy to tell me that it was a pipistrelle).

So upon getting back to the car, there had to be a hasty bit of tissue box vandalism, just to ensure the bat wouldn't, in fact, be 'exploring' the bat-taxi. Thinking about it - it'd probably be less of a problem than a bird, because at least bat can 'see' the windscreen. But even so.

A had it on her lap in the front seat, holding the 'tissue box' type lid, and off we went. A few miles down the road, we realised that the combination of 'dark' + 'airconditioning' might be ... well, a bit like 'night time' and the bat was starting to wake up and wriggle. But with the wriggling, the cunning plan was to press on, and hope it didn't get out.

A little further on, the bat had found the edge of the box, and was trying to squeeze out, and A could feel it tickling her hand as it wriggled.

We got to the hospital without further incident - only to find that the wriggling had been the bat finding somewhere cozy and warm to hide - Almost in the palm of A's hand.
We weren't entirely sure if the aforementioned 'no handling' law really applied to bats coming to sit on you, but thought they might be tolerant of the fact that it was for the purposes of getting it to the wildlife hospital.

Said pipistrelle was admitted overnight, fed and watered, and was to be examined by a specialist in the morning - with an aim of recovery, then release - there weren't any signs it was any more 'ill' than 'got lost and stuck in a hotel room'.
This is the first in ... sort of two series of books. (as in, there's two sets in the same world, but following different lead characters, mostly).

It's an 'urban magic' book, set in London. Having typed that, you may be thinking of several other really good examples of 'urban fantasy' - such as Dresden Files, or Alex Verus, or perhaps Rivers of London. It's a little like those, but perhaps more the latter.

I have to say, I found it somewhat hard going at first, because - well, because I was expecting more of the same - a Wizard, who happens to live in a city type of story. And it's not like that at all. It's more sorcery and shamanism than wizardry. By which I mean - the 'magic' of the city is bound to the patterns of life _in_ the city, so some of the time, the story telling seems almost dreamlike.

It also starts in a bit of a rush and confusion - which is difficult at first, but gets easier. Bear with it - the protagonist has been out of circulation for a while (which helps with introducing you to the shape of the world).

It's also incredibly evocative - the underpinning principle is that magic is life, and the power of a modern sorcerer (or shaman, or warlock, or wizard) are innately tied to the patterns of life within the city. I like that it's set in London - which is a city with an awful lot of history to it. And that history is part of the magic. So you have the 'powers of the city' - the bag lady, the beggar king, the neon court, the graffiti artists. You have the magic of pigeons (which see everything) and foxes. You have the power of a warding, based on the terms and conditions of the London underground, and graffiti paint being (potentially) magic sigils.

It's a different sort of thing, because it is innately tied to the magic of a city, and I think it's really marvelous as a result (if slightly harder going).
In the news today is some headline-grabbing nonsense about protecting children from the evils of porn.
I'd like to suggest that this is just nonsensical - almost all the sensational nonsense is generally about cheap titillation and scaremongering.

What goes on between two (legally and informed) consenting individuals is none of the business of state, or indeed anyone else.

There's various types of extreme porn that are illegal - and personally, I think that distracts from the important point. Because at the end of the day, no matter how extreme, the depiction is only a picture. A depressing or disturbing one maybe, but still - just a picture.

The _problem_ is two separate things:
- Harm done to the subject. Especially when consent cannot be given (e.g. because of being too young). If abuse is committed, then that's a crime in and of itself.
- Harm done to the 'viewer'. It's hard to say for sure what effect repeated exposure to disturbing content actually has, but there's suggestions of links between extreme porn and future abuse. Correlation doesn't imply causation though - there's nothing to say that that link hasn't reduced the future abuse, rather than increased it.

But in neither case do we really do much good by trying to censor the internet. The WHOLE POINT of the internet is it's uncontrolled and uncontrollable. Trying to control search terms is on a par with trying to ban drugs by current street name - an exercise in futility, because as soon as one gets banned, there'll be a new one in use.

I would suggest instead that disturbing and damaging porn is a mental health problem - not a crime (in and of itself - obviously if people are harmed, then that's a crime in its own right). You can't fully protect children from exposure to disturbing concepts, and assuming the magic of an internet filter will do it for you is a deeply flawed assumption. (There's not many parents who would call themselves more tech savvy than their teenage children, either).

It's far better to engage and understand - from all directions. Don't censor or censure, but encourage openness. And yes, that does mean that some people with some quite disturbing fantasies will come to light. But far better that, than the problem being suppressed until it's far too late for some innocent victim.

The internet is a real power in our society today - ideas and concepts can be moved around like never before. This means all sorts of good things happen as a result. It also means all sorts of bad things can too - there's a lot of nastiness buried in the human psyche, and that'll never go away. But you can shine a light on it, and reveal it for what it is.
I'm currently musing on a difficult problem. Given a large storage estate, which contains some large filesystems, what is an efficient way to process 'the whole lot'?
As an illustrative case - take virus scanning. It's desirable to periodically scan 'everything'. There's other scenarios such as backups, accounting and probably a few others.
But it's led me to consider it - given an order of magnitude of a petabyte, distributed over a billion or so files, what is an efficient way to do it?
Again - take the same illustrative case. A virus scanner, that can process 100k files per hour. At that rate, you're looking at 10,000 hours - or a little over a year. Even if you could keep a system doing that all the time, you're still faced with - potentially - having to keep track of how far you got, on something that's changing as you go.

So with that in mind, I'm thinking about ways to scale the problem. The good bit is - as you end up with substantial numbers, you also have a lot of infrastructure to make use of - you can't physically get to a petabyte, without a lot of spindles and controllers. And that usually means array level readahead caching too.
Which means optimally, you'll 'go wide' - try and make use of every spindle and every controller at once. And also, ideally doing it whilst maximising readahead efficiency, and minimising contention. (And of course, given the timescale you almost certainly have to 'fit in' with a real production workload, including backups).

The problem can be simplified to - given a really large directory structure, what's an efficient way to traverse it and break it down into 'bite size pieces'. Again, following on the virus checking example - maybe you want to break down into '100k file' pieces, because then each chunk is about an hour of processing, which can be queued and distributed. And then you will scale this, by taking each filesystem as a standalone object, to be traversed and subdivided.

You may also end up having to do something similar in future too - again, virus checking - you probably want to repeat the process, but you can then apply some sort of incremental checking (e.g. check file modification times, perhaps - although that may be unwise unless you can verify that the file actually is unchanged).

The other part of the problem is - well, you can't easily maintain a long list of 'every file' - for starters, you already essentially do that - it's called 'your filesystem'. And otherwise you're looking at a billion record database, which is also ... well, a different scale of problem.

So I've started reading about Belief Propagation https://en.wikipedia.org/wiki/Belief_propagation - but what I'm thinking of in terms of approach is to - essentially - use checkpoints to subdivide a filesystem. You use a recursive traversal (e.g. similar to Unix's 'find'), but you work between a 'start' and an 'end' checkpoint: skip everything until the start, then process and batch everything up until the 'end' checkpoint.
Ideally, you'll measure the distance between your checkpoints as you go, and 'mark off' each time you complete a batch.

For the sake of parallelising and distributing, though - given that you _can_ tell the number of inodes allocated to a filesystem (which approximates the number of files), you can tell how many 'checkpoints' you would need within that filesystem. At which point you start traversing downwards, in depth order, until you get a number of directories in the right order of magnitude - and use each of those as your first set of checkpoints. As you run, redistribute the checkpoints: for a batch size of n, take a new checkpoint every n/2 files, and if the distance between one checkpoint and the next is less than n/2, simply delete it. That should mean you get 'checkpoints' between n/2 and n in size. There'll be some drift between iterations, but as long as it's within the same order of magnitude, that doesn't matter overly.
Start 'finding', accumulate 'a few' batches, and then leave them to be processed, moving on to a different 'part' of the storage system, to do the same. (Separate server, location, whatever). You don't want your search to get too far ahead of your processing - you're probably looking at memory buffering your batches, and having too much buffered is a waste.
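As a very rough sketch of the 'bite size pieces' idea (in Python for brevity - the batch size and any worker that consumes the chunks are placeholders, not production code):

```python
import os

def batches(root, batch_size):
    """Walk a directory tree and yield 'bite size' lists of file paths.

    Each yielded list is a self-contained chunk that can be handed off
    to a worker (virus scanner, backup job, etc).
    """
    batch = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()   # deterministic order, so a (dirpath, filename)
        filenames.sort()  # pair can act as a restartable checkpoint
        for name in filenames:
            batch.append(os.path.join(dirpath, name))
            if len(batch) >= batch_size:
                yield batch
                batch = []
    if batch:
        yield batch
```

The first and last path of each chunk can then be recorded as the 'start' and 'end' checkpoints, so a later (or interrupted) run can skip straight to where it left off.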

But it'll always be a bit of a challenge - fundamentally, there's only so fast you can move substantial volumes of data.
I started taking a look at object oriented perl the other day. Mostly because I was deconstructing something that didn't work quite right. Anyone even a little bit familiar with Perl will realise they've probably already seen it in action, because in Perl, OO is driven by hashes, references and packages.

(Here's a hint - any time you've used '->' that's probably calling an object, and - because OO lets you encapsulate - there's a lot of that in imported modules).

The basics are - an object is a package, with an internal hash. And ... that's about it.
There's one 'new thing' that you may not have seen - 'bless'. Which is perl's way of giving a generic reference a class. Because they're applicable to objects, you'll see the subroutines within the package referred to as 'methods'.

You 'use' a method, with '->'. This is exactly the same as just running the subroutine, but perl passes the object reference (that you 'blessed') into the subroutine as the first argument.
$object -> get_value ( "fish" );
Is equivalent to:
&Package::get_value ( $object, "fish" );

You then rely on the 'get_value' sub to 'know what to do' with $object. (Which is one of the underlying principles of OO - you ask it to do something, you don't deal with how it accomplishes it).

By convention too, packages should include a method 'new' - a constructor that sets up the blessed reference, and does any other initialisation that's necessary. (Doesn't have to be called this, but it's usually a good idea). Similarly - 'internal' subroutines are prefixed with _ to indicate they shouldn't be called directly. Unlike stricter languages, perl doesn't enforce privacy within objects. You _can_ diddle with attributes and internal methods, but it's asking for future pain, so don't do it.

If you create a sub called 'DESTROY' then this is called when an object would be deleted (usually due to going out of scope, or on program termination).

That's about it, really. Quite a bare bones implementation. If you want more 'OO' style features, there's a module called Moose, which implements a lot of more advanced features.

Here's some sample, illustrative code:

use strict;
use warnings;

package MyObject;

sub new {
  my ( $class ) = @_;
  print "New called\n";
  print join ( "\n", @_ ), "\n";

  my $self = {};
    #need to give self something, because it needs to be
    #a reference to something - in this case, an empty hash
    #you don't need to do this if you do something like:
  #my $self;
  #$self -> {_description} = "New Object";
  #because if you do that, self is no longer an undefined scalar, it's a reference.
  print "And Done\n";
  bless ( $self, $class );
  #note - the return code of 'bless' is the object reference.
  #perl implicitly returns the result of the last operation
  #so this 'return' below would occur implicitly if bless were the last
  #line in the sub.
  return $self;
}

sub print_something {
  my ( $self, @args ) = @_;
  print "Printing something (", $self, ") : ", @args, "\n";
}

sub set_description {
  my ( $self, $desc ) = @_;
  $self -> {_description} = $desc;
}

sub get_description {
  my ( $self ) = @_;
  return $self -> {_description};
}

sub DESTROY {
  my ( $self ) = @_;
  print "Tidying up the object\n";
  print "Args of:", join ( "\n", @_ ), "\n";
}

1; #a module must return a true value when loaded

Code to drive 'MyObject':

use strict;
use warnings;

use MyObject;

  my $object_for_me = MyObject -> new();
  $object_for_me -> print_something("Cool");
  $object_for_me -> set_description ( "New Description" );
  print $object_for_me -> get_description, "\n";
  print "Doing it 'subroutine style'\n";
  &MyObject::set_description ( $object_for_me, "Different Description" );
  print &MyObject::get_description ( $object_for_me ),"\n";

print "Ending program\n";
Inspired by the crowd at Maelstrom, and mostly driven by the_wood_gnome.
(Repost with minor redrafts)

If you're struck by inspiration for a letter - the dirtier the pun, the better - then please let me know. (And in true piratin' form, these are more like guidelines, than actual rules)

The Pirate AlphabetCollapse )


Following on from my previous post about why RRDtool is awesome.
A worked case study.

First off, we take the Raspberry Pi.
Install 'rrdtool' using:
sudo apt-get install rrdtool
And the perl library:
sudo apt-get install librrds-perl

Sky routers have a router stats page, on
(You will need your router username and password)
You can check it works with
wget --http-user username --http-password password

There's a table in there that looks a bit like:
Read more...Collapse )
I've been playing recently with a piece of software that I keep coming back to, and discovering new coolness.
It's called RRDtool.

RRD stands for Round Robin Database. And what this tool does is allow you to insert time based statistics into an RRD, and extract them later as graphs. It includes automatic statistic aggregating and archiving, which means it's ideal for ... well, all sorts of statistics really.

It's used by a spectacular number of utilities - including Cacti, MRTG, and - the way I first ran into it - Big Brother. Here's the full list.

But the tool itself is really very useful - there's all sorts of things that have performance counters, and ... RRDtool is almost perfectly suited to collating them - allowing you to sample information at almost any frequency you choose - and then consolidate it, from high resolution diagnostics to longer term trends.

It's easier to get going than you might think - first of all, you create an RRD. You do so by giving it a 'Data Source' (DS) - which defines the input data - and a 'Round Robin Archive' (RRA) - which defines resolution, retention and auto-archiving.

And then you insert your collected samples into the RRD, and it turns them into consolidated data points - which can be extracted and turned into graphs (very easily - and it's a very powerful graphing tool that also allows you to add in formulae and transformations if you so desire), or just pulled out as a set of data points.
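As an illustrative sketch of that create/update/graph cycle - the DS and RRA values here are assumptions for a 5-minute traffic counter, not a recommendation; see the rrdcreate man page for the exact field meanings:

```shell
# One data source ('inbytes', a COUNTER sampled every 300s), plus two
# archives: 288 five-minute averages (a day at full resolution) and
# 365 daily averages (a year of trend data).
rrdtool create traffic.rrd --step 300 \
    DS:inbytes:COUNTER:600:0:U \
    RRA:AVERAGE:0.5:1:288 \
    RRA:AVERAGE:0.5:288:365

# Insert a sample ('N' means 'now'), then render the last day as a graph.
rrdtool update traffic.rrd N:1234567
rrdtool graph traffic.png --start -86400 \
    DEF:in=traffic.rrd:inbytes:AVERAGE \
    LINE1:in#0000ff:"bytes in"
```

The DS and RRA definitions are where the 'resolution, retention and auto-archiving' live - everything after that is just feeding it samples.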

But the reason I've been particularly impressed with it recently is because it implements something called 'Holt-Winters forecasting'. Now, to save too much brain ache, what this is is a technique for smoothing off a graph. But the important part is that it includes a seasonal variance (and 'seasonal' is meant in the statistical sense - in most cases in IT, a 'season' is a day), which means you've got a mechanism to smooth and predict - along with an expected variance, based on your seasonal trend.

This means that - rather than setting a threshold of 'bad' and 'good' on your system state (which rarely works well, because a lot of system statistics are very hard to give such a binary answer) - you can instead detect aberrant behavior - simply count (on a rolling window) the number of times your measurement strays outside the expected variance, and flag an error if it does.
This is really very cool indeed. Self tuning statistics that can bring themselves to your attention when they're 'interesting'.

I would also note - this has led me to discover R - a statistical modelling and manipulation tool. What this has helped with is tuning the Holt-Winters parameters - as with any smoothing algorithm, you set 'weights' controlling how fast the curve adjusts, and the answer to 'what should I set these parameters to?' is normally 'it depends'.
What R will do is (easily) let you feed in your samples, run 'HoltWinters' on them, and spit out the optimal parameters based on your data. (It minimises the squared prediction error to find which parameters provide a 'best fit').
This too is awesome. (And R can do amazing amounts of other stuff too, which I like).

Anyway, if - like me - you like being able to see traffic graphs, CPU loads, average/peak concurrent user trends, response times and all sorts of stats in graphical and historical form - this is exactly the tool for the job.
The event started on Thursday for me - getting to site, setting up and saying 'hi'. Which is just as well, as during Friday a really impressive amount of wind and rain was responsible for the destruction of an impressive number of tents.
Despite that though, shortly after 18:00 on Friday, the weather turned and we had glorious weather the rest of the event.

And it really did make it a blast. The potential visible at the first event, that was suppressed by the cold... sprang into life. There was no shortage of things going on - the only quiet time really was when 'everyone' was off on the battlefield. (And that was fine, as it gave time to stock up on bacon).

Between the different camps being beautifully dressed, and the various people really trying (and succeeding) at giving their nations a real 'feel', just wandering around was a joy. And no shortage of 'Stuff' going on, between the deliberations of the Synod, Senate and Conclave. (And presumably the Bourse/Military council too).

So actually, didn't really end up doing much, aside from wandering around and talking to people. Which wasn't really a bad thing - I really enjoyed it. But I think I do need to up my game a little, and get involved a little more.
I have to say - I really had high hopes of Neil Gaiman's second episode - in what has been a lacklustre season so far, it ... well, frankly I was hoping to see the same magic as The Doctor's Wife. Sadly not. I'm not sure what's going on with this season, but it really hasn't managed to deliver any of the really cracking episodes of the previous seasons.
In case of spoilersCollapse )
Because of spoilerCollapse )
The short summary: the weather was a bit grim. Official 'ground temperature' did drop to -6 on some nights. But despite that, the event was fun.

The event started with us pulling up on site at about 17:00 Thursday. And being rather shocked at the incredible number of tents in place - as far as we could tell, around 1000 people were on site, which is already larger than Maelstrom.

The IC field was also looking quite busy - the Profound Decisions (PD) crew had been putting up copious quantities of tents, and a few 'permanent-ish' buildings, in the form of the senate and the tavern. We ended up putting our tent in the OC field, simply because of available space. Which given that wasn't a small field, is another sign of just how many people were there and keen to be involved.

There had been mud, which threatened to make things difficult - vehicles weren't being allowed on site. The massive pile of wood chips went a long way to making a serviceable road.

Time in was earlier than 'usual' on Friday - there were many players who hadn't caught the start time of 13:00. (Maelstrom always started at 18:00). But given just how many things needed doing over the weekend - the daisy chain of elections - it was necessary. As a consequence of sorts, there wasn't really much room for politicking and campaigning - senators in particular were primarily elected right off the back of ... well, how many voters they turned up with in the first place. On one hand it's slightly annoying, but ... also largely inevitable. There will be time enough over the next year for campaigning to happen.
This, I think, was a pattern repeated over the festival - no one really knew anyone else, so it turned into a massive nepotism exercise. But then, I guess the 'cover story' of the collapse of the existing structures of the Empire is also .... well, going to trigger exactly that result too, so it's all to the good.

I like how the elections were structured, and think they'll be giving plenty of room for ongoing shenanigans - each of the areas of politics worked subtly differently. From the 'mercantile' elections being offered by auction, to the more standard 'popular vote' mechanisms.

I also heard plenty of enthusiastic reports on the battles - I didn't go myself, but by all accounts there were around 600 participants on each 'side' - and the combats were suitably tactical, and - in many ways - unforgiving of errors. (A few groups got isolated, surrounded and torn to shreds).

However I think what really made the event for me - there were clear guidelines on what each nation was 'all about' - including traditions, views and costuming. And at the event this really did create camps with very different 'feels' to them - colours, themes, styles - made it easy to believe you were visiting different nations within the Empire. There was a feel of a festival or fair, which meant wandering around the camps to socialize, trade and scheme felt... well, less like a hazardous operation, and altogether more like the 'done thing'. Or at least, it would have been, if it weren't for the drag of squelching through mud to get there.

I was also quite impressed with the cohesion of the ref team - as far as I can tell, there are 'field refs' who primarily act as points of contact, and can radio back to get more detail - and can therefore handle anything that's needed, rather than needing separate teams. Having an 'egregore' for each nation (basically a personified spirit of the nation) also helped provide a focal point (who also had a radio, so could pass on information about what was going on).

So yes - broadly speaking, the first event was far less chaotic than it really could have been. The weather wasn't anyone's friend, but it also was handled very well by the site crew, making it an inconvenience, not a disaster. And I never had to queue up in GOD, which was almost a shame given how toasty and warm it was in there....
Every year or so, the government issues us all with a free piece of paper and a pencil to scribble on it, and offers to have a look through the whole lot. In the face of such largess, it seems almost rude not to participate. I would therefore like everyone to take the time to do a pretty doodle, scribble or scrawl. If you can manage it, submit a photo of it, and I'll collate them.

I'm referring to ballot papers, and electoral turnout - this PCC election in particular seems to have been something of a farce - reports of very quiet polling stations, widespread low turnouts.

But this isn't really news - voter turnout has been pretty poor for many years, and no one really pays much heed. Go on - what proportion of your electorate turned out for the last council election? Mine was 31%. Nearly 7 out of 10 people didn't spend 10 minutes putting a mark on a piece of paper.

I've heard a variety of excuses - from 'I couldn't be bothered' to 'it makes no difference anyway'.

And I can _sort_ of sympathise. I mean, given the way we do the elections - if there are multiple candidates a vote for a minority group can end up a wasted vote. Whilst it _might_ get their deposit back, realistically the choices in the general election are either for or against the 'lead' candidate.
There are very few three-way marginals out there, and realistically no minority party is going to win, ever. Because of 'first past the post', there's very few candidates outside the 'big three' who ever win a seat. Of the 650 seats, just 29 were in that category.
Likewise... well, take a look if you will - the demographics of the UK: http://en.wikipedia.org/wiki/Demography_of_the_United_Kingdom
Now compare that against the demographics of the House of Commons.

But the problem with low turnout - of not casting _your_ vote - is that then it's hard to tell the difference between:
- those that can't be bothered.
- those that are satisfied that any of the candidates would be acceptable.
- those that don't want to endorse anyone.
- those that object to the election in some way.

A candidate who wins, with 70% of the electorate not casting a vote _can_ call that a mandate.

So I'd like you to consider very hard next election - if none of the candidates appeals to you, then take the time to go and spoil your ballot.
The outcome is much the same - your vote will be excluded from the proceedings, and the same candidate will win. However, it's much harder for them to claim the tacit support of the electorate when people _did_ turn out, and took the time to _not_ vote for them.

I would really like to see an election where the winning candidate was 'outvoted' by spoiled ballots. A clear message _that_ would be.

But realistically - nothing changes if you stay at home. You're deemed to be tacitly supporting whoever won. Even if that wasn't your intent - you get lumped in with all the people who couldn't be bothered.

So please - take the time to spoil your ballot. Make your protest by _actually_ making a protest.

(Note: I mean this in lieu of not voting at all. If you have a candidate or party you wish to support, then knock yourself out. If you don't, I'm sure a minority/independent might be appreciative of getting a bit closer to getting their deposit back.
Whatever you do though - resist the temptation to 'opt out' by staying home. Opt out by drawing a pretty picture on that free piece of paper).
Have you ever wondered why you don't get 'round' numbers of gigabytes when you're buying a phone, or memory card?
No doubt you've seen it - you'll see 8GB, 16GB or 32GB, and almost never 10GB, 20GB or 30GB.

The reason is actually quite simple - because the simplest representation of information is the binary state - on or off. We call this the 'bit' - or Binary Digit and it doesn't matter if that's a light switch, or a punch card with holes and 'not holes'.

If you string together bits - you get a multiplying effect. If I have _two_ bits - each can be in two states, giving us 4 combinations. (ON/ON, ON/Off, Off/ON, and Off/Off).
As we add more bits, we end up with more potential combinations, and we simplify the two states into '1' for on, and '0' for 'off'.

This is where we get the start of binary - computer memory is incredibly simplistic, and you can think of it all as a long sequence of switches - that are either on or off.

A byte - with 8 'bits' - gives us 2 ^ 8 states - or 256.
So if we were wanting to represent numbers - we could convert any number between 0 and 255 into a 'byte'.
But it's cumbersome to write:
100 = 0110 0100

That's why hexadecimal is used - hexadecimal is base 16, so each 'digit' goes from 0-15 (written 0-9, A-F).
That's actually a much easier way of representing a binary number, as you can take each 'chunk' of 4 bits as one digit.
So '100' in decimal could be represented as 0110 (6) and 0100 (4). (The reason we have to use 'hex' is because each 4-bit block has 16 states, so we need extra digit symbols beyond 9.)
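A quick way to sanity-check the conversion - here in Python:

```python
# 100 decimal, as an 8-bit binary string and as hexadecimal.
print(format(100, '08b'))  # -> 01100100
print(format(100, 'x'))    # -> 64 (i.e. 0x64)

# Each hex digit corresponds to one 4-bit 'nibble':
print(format(0b0110, 'x'), format(0b0100, 'x'))  # -> 6 4
```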

But anyway - memory is - in effect - a bunch of switches. Each switch we add doubles the number of possible states - so where '8 bits' gives us 256 states, 9 gives us 512, and 10 gives us 1024.

The 'standard' size is 8 bits to make up a byte - that gives us 0000 0000 -> 1111 1111 - or to represent it in hexadecimal, 00 to FF. (You may see 0x in front of a hex number, just to make it clear that 0x64 is _not_ 64 in decimal.)

This is largely why you get memory sizes in powers of 2.
1 KB = 2 ^ 10 bytes.
1 MB = 2 ^ 20 bytes.
1 GB = 2 ^ 30 bytes.
1 TB = 2 ^ 40 bytes.

Because you double your number of 'states' by adding an extra bit, it ends up effectively wasteful to 'round down' to the neat multiples of 10 we're used to. If you want to hold 'up to 100 (decimal)' different states - 7 bits hold 2^7 states, which is 128; 6 bits hold 2^6 states - or 64.
So we'd need to use 7 bits, and - basically - throw away 28 of those states, never using them. Which is largely why we end up with 'nice neat' powers of 2 for almost all our storage devices.
32 GB being 2^35 bytes.
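The same arithmetic as a tiny Python helper (the function name is just for illustration):

```python
import math

def bits_needed(states):
    """Smallest number of bits whose 2^n combinations cover 'states' values."""
    return max(1, math.ceil(math.log2(states)))

print(bits_needed(100))  # 7 bits: 2^7 = 128, so 28 states go unused
print(bits_needed(256))  # 8 bits: one full byte
```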

*We have Claude Shannon to thank for much of the grounding of information theory: http://en.wikipedia.org/wiki/Claude_Shannon
It seems another thing may be comment worthy on the subject of free speech.
A court case was won today - a gay couple won a case against a B&B which had refused to let them share a double bed.

This I think, is a good thing. The refusal was 'on religious grounds', and I'm quite pleased with the result.

However, that isn't what's prompted me to get out the keyboard. What has, is that someone high profile has decided to contest the ruling, by posting the address of the couple concerned on twitter.
Something which has given me pause to think - what do I think of _that_, in the context of previous posts on freedom of speech?

And my first thought is - I'd consider posting someone's name, address and a veiled threat on Twitter to be at least as offensive as the previous cases.
Which for the sake of reference, in one case covered being sexually explicit about an abducted (and presumed murdered) 5 year old on facebook, and in the other wearing a t-shirt that supported the murder of two police officers.

I am very much hoping to see the gentleman in question in court, and would very much hope he got at least as much of a sentence as the other individuals.

As to the broader question? Well, I'm not sure how I'd compare it if I'm honest. I think willfully violating someone's privacy - especially in a context which encourages harm to them - to be more like criminal behavior than being the 'ordinary' kind of obnoxious and offensive.

And maybe as such, something that warrants the protection of law, where maybe 'just' being offensive might not.
Two stories in the news in the last few days have concerned me slightly.
This one:
Is about a guy who wore a T-shirt that was offensive, following the death of two police officers.
He's got 8 months in prison. (Although, I believe, 4 months for the offence, and 4 for

And this one here:
He made an obnoxious offensive joke about the missing - presumed abducted and murdered - 5 year old, April Jones.

These worry me. I don't wish to defend what they said - it was obnoxious, tasteless, unpleasant and probably distressing.
No, what worries me is that I'm not sure it should be a crime to be an odious gobshite.

There's one simple reason - because when you have a law that makes it illegal to say certain things - things that are 'judged to cause offense' - then you have to have someone who makes the final decision - this joke is crass but ok. That one crosses the line. This point of view is impolite. That one is criminal.

And put very simply - I can think of no one I would trust to make that decision. If you do it via 'popular opinion', then you're at risk of minority oppression - how many people would find it 'ok' to make an unpleasant comment about Jimmy Savile at the moment? Now, same question, but aimed at transsexuals perhaps? Do you see where I'm going with this? There will always be minorities that - by virtue of who they are - are more 'popularly acceptable' to be offensive about.

If you have some other group being the 'taste police' then you've got a bigger risk - that of appointing a group that has bias built in because of who they are - look if you will at the demographics of the current members of parliament. How many are white, middle class, middle aged, male, and from a public school background? Would you say that's more than average from the population?
How about the judiciary? Does that not have the same problem?

I'm worried that very simply you cannot truly appreciate something that is outside your realm of experience. You cannot understand what it is like to be bullied or abused for who you are, if it's never happened to you. If you've never been fat, female, gay, transsexual, black, disabled, ill, raped, abused as a child, mentally unwell... then how can you be someone who passes judgement on 'offensive'?
I have seen a few places now, expressing the sentiment that Mark Bridger - the man who is in custody, in the April Jones abduction case - should be tortured until he tells them the truth.

I would very much hope that no one I consider a friend believes this. But you know what? If you do, I'm _not_ going to torture you.

Here is the problem - we have a trial by jury system, which must have a jury return a verdict that the person is guilty, beyond reasonable doubt.
And until a jury has done this, the person is innocent.

Now, that's not very efficient. After all, proving something 'beyond reasonable doubt' is not exactly easy - it takes time to gather evidence, and sometimes it doesn't prove it to a high enough standard.

But fundamentally, there is no other choice - history has many lessons for us on this subject - that if we allow mobs to rule, then EVERY minority suffers. Be that because they're kicked to death for being a goth, or not allowed to vote because they're a woman.

We have a jury system, and a democratic political system - not because they're good, nor because they're effective, but because as a society we have PROVEN, time and again, that we are not capable of being humanitarian.

So I'd say to you all - please bear it in mind - this man _may_ be guilty. But until a jury - after consideration of the evidence - says: "Guilty, beyond reasonable doubt", then he is as innocent as you are. As innocent as the girl who was beaten to death for being a goth, and as innocent as the child who has gone missing.

If - and only if - he is found guilty - then the situation changes. But even then, we imprison to protect society, and rehabilitate the offender. Torture doesn't do this. It isn't even particularly good at 'finding the truth'. Witches have been burned on the basis of evidence from torture.
I have to say though - I do sympathize with all the people who are angry, frustrated and want the girl to be saved. But a society based on vengeance leaves _everyone in it_ worse off.
This week, I have mostly been playing with Thread::Queue.
One of the downsides of perl threading is that it's not particularly lightweight. Spawning lots of new threads, each to do a single task, isn't a very efficient way of working - especially if you have libraries imported, and large data tables.

So the method I've been playing with is queue oriented - spawn a number of threads equal to some arbitrary parallelism target - 1 per 'resource' consumed is a good bet (so for processor intensive stuff, one per processor - if you're doing remote access to 15 servers, one each).

And then implement a 'queue' which is a thread safe implementation of a FIFO queue (FIFO = First in, First out).

It uses the library Thread::Queue, so you include that at the start of your program. You don't actually strictly speaking need to be threading to use it though - there's other reasons to use a FIFO.

So as a sample:

Read more...Collapse )

Fairly simple, but does allow for daisy chained processing (e.g. moving from one FIFO queue to the next).
The only slightly complicated part is in handling 'thread exiting'. I've taken to using an 'exit' signaller in the queue (use an arbitrary pattern, and 'catch' when that occurs).
However, the other possibility is using some kind of 'all done' shared variable that you set once the queue is fully populated - because what you don't want to do is assume that an empty queue means the work is finished. The queue might be empty because the threads have only just started, because there's a dependency, or because the first items have been 'dequeued' and the other threads are momentarily seeing an empty queue.
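That 'exit signaller' pattern looks roughly like this - sketched here with Python's queue/threading modules as an analogue of the Perl Thread::Queue version (the doubling 'work' is just a placeholder):

```python
import queue
import threading

SENTINEL = None  # the arbitrary 'exit' pattern to catch

def worker(q, results):
    while True:
        item = q.get()
        if item is SENTINEL:
            q.put(SENTINEL)  # pass it on, so sibling workers exit too
            break
        results.append(item * 2)  # placeholder for the real work

q = queue.Queue()
results = []
threads = [threading.Thread(target=worker, args=(q, results))
           for _ in range(4)]
for t in threads:
    t.start()

for item in range(10):  # populate the queue...
    q.put(item)
q.put(SENTINEL)         # ...then signal 'all done'

for t in threads:
    t.join()
```

Re-queueing the sentinel when a worker catches it is what lets one 'exit' marker shut down the whole pool.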

I've been using this mechanism to create a 'cascade' of tasks - run something on one (group of) server(s). Do a some processing. Run something based on the result on another server. This is well suited to queue style processing.
Similarly - because you're queue oriented, then it's also well suited to scaling up (or down) the parallelism. Such as when you're in a multi processor environment, for example - you may want to hog all the processors that are available, but you'll lose efficiency if you overdo it.


"[Vimes] learned something new: the very very rich could afford to be poor. Sybil Ramkin lived in the kind of poverty that was only available to the very rich, a poverty approached from the other side. Women who were merely well-off saved up and bought dresses made of silk edged with lace and pearls, but Lady Ramkin was so rich she could afford to stomp around the place in rubber boots and a tweed skirt that had belonged to her mother. She was so rich she could afford to live on biscuits and cheese sandwiches. She was so rich she lived in three rooms in a thirty-four roomed mansion; the rest of them were full of very expensive and very old furniture, covered in dust sheets.

The reason that the rich were so rich, Vimes reasoned, was because they managed to spend less money.

Take boots, for example. He earned thirty-eight dollars a month plus allowances. A really good pair of leather boots cost fifty dollars. But an affordable pair of boots, which were sort of OK for a season or two and then leaked like hell when the cardboard gave out, cost about ten dollars. Those were the kind of boots Vimes always bought, and wore until the soles were so thin that he could tell where he was in Ankh-Morpork on a foggy night by the feel of the cobbles.

But the thing was that good boots lasted for years and years. A man who could afford fifty dollars had a pair of boots that'd still be keeping his feet dry in ten years' time, while a poor man who could only afford cheap boots would have spent a hundred dollars on boots in the same time and would still have wet feet.

This was the Captain Samuel Vimes 'Boots' theory of socio-economic unfairness."

On a related note - my new boots have arrived. First impressions are that they're comfy and robust.

I'm also dead impressed with the customer service I got from Polimil. (http://www.polimil.co.uk/)
They phoned me the day after I ordered, to say they were out of stock.
They suggested an alternative that they could dispatch immediately (One I had been considering, so it was a good alternative).
And then they gave me an indication of lead time to get new stock - 5-7 working days.
I chose to wait, and was quite impressed to see them dispatch Thursday (so more like 4 working days) to arrive Friday.

It's easy to get used to 'very average' customer service, so I was quite pleased with my shopping experience - I'll be remembering them for next time.

The boots come well reviewed - who couldn't like (from the site)
"I have been wearing this excellent boot for the last five years and nothing else comes close. Some trainer wearers will think it's too rigid but the extra support lends it to the more demanding user. I've hopped over high security spiked steel fencing, trudged through broken glass and needles, even got dragged a quarter mile by a stolen car. Let me tell you, when you are lying on your back at 40mph with only your boots between you and the Tarmac, these bad boys are what you need. I won't wear anything else."
This is perhaps in the wrong order, but to follow on from a couple of rambles lately - threading and perl.
How do you basically do this?
Here is an example:
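A minimal sketch of such an example - the worker just sleeps, standing in for 'connect to a host and run a command', and the hostnames are purely illustrative:

```perl
use strict;
use warnings;
use threads;

# Stand-in for real work, e.g. ssh to a host and capture output.
sub run_on_host {
    my ($host) = @_;
    sleep int( rand(5) );
    return "$host: done";
}

my @threads = map { threads->create( \&run_on_host, $_ ) }
    qw( server1 server2 server3 );

print $_->join(), "\n" for @threads;
```

Swap the sleep for an actual ssh call (via a module such as Net::OpenSSH, or by shelling out) and you've got the 'run the same command on 200 servers' case.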

OK, that's pretty simplistic I know - but the major way I've found threads useful is for what amount to embarrassingly parallel problems, like 'connect to 200 servers, and run the same commands on each of them'. You could quite easily replace that rather trivial 'sleep' subroutine, with one that does an ssh to a host, to run a command and capture the output. (And maybe process the results, before returning them)

Also note: Perl has had threads for a while, but the module doesn't necessarily contain all the functions you need - latest version as of 2012-07-23 is 1.86 which is available from CPAN. (http://search.cpan.org/~jdhedden/threads-1.86/lib/threads.pm)

More detail: http://perldoc.perl.org/threads.html

So, I've been fiddling around with Perl, and threading.
One of the things that's been bugging me, is that when I've tried to do a 'return' from a perl subroutine, it's not worked - and I couldn't for the life of me figure out why.
What's _supposed_ to happen is that you call '$thread->join()' on the thread (once it's finished running), and that captures the return result.

Why it wasn't working is thanks to this little snippet in the documentation (Yes, RTFM, I know. But to be fair, I wasn't looking for it in _that_ bit).

"The context (void, scalar or list) for the return value(s) for "->join()" is determined at the time of thread creation."

Perhaps I'd better backtrack a little though - I mean, anyone who's not really 'into' perl, might not have a clue what I'm talking about when I say 'context'.
So I shall summarise.

Perl is quite clever - it has two 'real' variable types - scalar - which contains anything that's a single value (So any string, integer, float, character, reference). And array (or list) - which is a group of zero or more scalars.

The clever bit is that it can figure out what you mean, by the context in which you do it - a brief illustration (lj cut to avoid breaking formatting).

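A brief illustration of context via 'wantarray' (a sketch - the subroutine name is made up):

```perl
use strict;
use warnings;

sub context_demo {
    if    ( wantarray() )         { return ( 'called', 'in', 'list', 'context' ); }
    elsif ( defined wantarray() ) { return 'called in scalar context'; }
    else                          { return; }    # void context - result discarded
}

my @list   = context_demo();    # wantarray is true
my $scalar = context_demo();    # wantarray is false (but defined)
context_demo();                 # wantarray is undef
```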

What's happening is the rather clever function 'wantarray' is being used to tell the call context of the subroutine. 'wantarray' is undefined in a void context (the result is discarded), true in a list context, and false in a scalar context.

As for why this is useful - consider if you do something like 'grep' - a Unix command to find 'matches' against text patterns. If you do it in a 'list' context, having a list of the lines it found would be useful. If you do it in a scalar context, then having a number of matches is probably more useful (0 being 'false' you could do 'if grep("pattern", @text_block)' for example)
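Perl's built-in grep does exactly this:

```perl
my @lines   = ( 'error: disk full', 'all ok', 'error: no route' );
my @matches = grep { /error/ } @lines;   # list context: the matching lines
my $count   = grep { /error/ } @lines;   # scalar context: number of matches
print "found errors\n" if grep { /error/ } @lines;   # 0 matches is false
```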

So anyway - the context of a threaded subroutine is set when the thread is _created_.
Which means you need to do something like:
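A sketch (variable names follow the snippets above; assigning to a scalar is what forces the scalar context):

```perl
my $thread = threads->create( \&thread_test_subroutine, $first_arg, $second_arg );
```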

On the face of it, you immediately discard '$thread' because it drifts out of scope (and it does). However, it also means your thread is created in a scalar context, so any results it returns will be scalar.
If you _don't_ do this, it'll be in a void context, and any return is discarded. Which was what was tripping me up.

And you will then be able to do:

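A sketch, reusing '$thread' from above:

```perl
my $result = $thread->join();
print "thread returned: $result\n";
```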
Which won't work if you don't start the thread in a scalar context.

Because I imported a bunch of files from various directories, I wasn't quite sure if I'd duplicated my photo collection.

The solution? Well, probably quite a few. But here it is in Perl.

First download: Activestate Perl.

Then, open your favourite text editor (I'm starting to quite like Textpad - http://www.textpad.com/download/ ) but notepad will do just fine.

Place into it the following code:
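A hypothetical reconstruction along the lines described - walk a @paths_to_process list, hash every file, and report any two files with identical content (the example path is made up):

```perl
use strict;
use warnings;
use File::Find;
use Digest::MD5;

# Directories to scan - edit to taste.
my @paths_to_process = ( 'C:\\Users\\Me\\Pictures' );

my %seen;    # digest -> first file seen with that content

find(
    sub {
        return unless -f $_;
        open my $fh, '<', $_ or return;
        binmode $fh;
        my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
        close $fh;
        if ( exists $seen{$digest} ) {
            print "Duplicate: $File::Find::name == $seen{$digest}\n";
        }
        else {
            $seen{$digest} = $File::Find::name;
        }
    },
    @paths_to_process
);
```

Hashing the whole file means it doesn't matter what the file type is, or what it's called - identical content is identical content.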
You'll - probably - need to edit '@paths_to_process'.
In perl, that's a list. Lists are values separated by commas, with a round bracket at each end.
We use single quotes because backslash has a special meaning inside double-quoted strings, and we just want the literal text. (Even in single quotes, a trailing backslash needs doubling, as it would otherwise escape the closing quote.)
You could therefore do
@paths_to_process = ( 'C:\\' ); 

and this would do all of your C drive. I wouldn't suggest that as a good idea, as it'll take a long time (because it has to open and read every file on your hard disk).
So I'd suggest sticking with directories where you know you've got stuff that might be duplicated. (E.g. pictures directories - but this doesn't really care what the file type is).

Anyway - then save that as 'duplicate_finder.pl' (or anything you like, basically, as long as it ends '.pl' which tells Perl to 'work with' this file when you double click it). I'd suggest running it from a command prompt, but that's personal taste. (It prints text - the window will probably vanish after, but don't worry as there'll be a text file there with the results)
Dear everyone who uses post codes on your website.
Would you kindly refrain from insisting my post code is incorrect if it doesn't include a space.
It's not rocket science to parse post codes.
They are
a) not case sensitive.
b) The second part is ALWAYS a digit followed by two letters.
c) The first part is one or two letters followed by one or two digits (occasionally with a trailing letter, as in SW1A).

(The first part is also geographically grouped, but the second part may not be.)

So there's no ambiguity involved - you can't get 'mixed up' between AB1 23CD and AB12 3CD, AB123CD, ab123CD. They're all the same post code. (And it's region 'AB12').
Not being able to parse this on your website is just sloppy - especially since you're clearly doing some validation on your inputs already (you've spotted it hasn't got a space).
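As a sketch of how little code that parsing takes (the regex just encodes the rules above; illustrative only):

```perl
use strict;
use warnings;

# Normalise a UK post code: uppercase, strip spaces, then split off the
# inward code - which is always the last three characters (digit + two letters).
sub normalise_postcode {
    my ($pc) = @_;
    $pc = uc $pc;
    $pc =~ s/\s+//g;
    return unless $pc =~ /^([A-Z]{1,2}\d[A-Z\d]?)(\d[A-Z]{2})$/;
    return "$1 $2";
}

print normalise_postcode('ab123cd'), "\n";    # AB12 3CD
```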
It would be impossible to avoid a comparison with Jim Butcher's Dresden files. So I won't. It's a similar sort of a story - it's about a modern day, urban wizard. It's set in London - Camden, specifically - so it resonated a little more clearly with me than Harry Dresden's stomping ground, Chicago.
I think regardless, it stands the comparison quite well - it's less light-hearted, with fewer 'comedy' moments, but none the worse for it.
Alex Verus is the lead character, and runs an 'Arcana' shop in Camden. He's not a powerhouse of a mage, but he does have a very useful schtick - that of being a diviner, and able to see the future. I have to say, that sounded like a recipe for disaster for me - after all, whilst the author has perfect foreknowledge, it's a bit dull if the characters do. But it's handled very well, and plausibly so. It's the thing that lets the otherwise not amazingly powerful Alex keep on going.

It's well paced, and a real page turner, and one I think you'll not regret - assuming you like the general premise (or otherwise enjoyed Dresden Files).
Amazon Link.

Certainly one I'll be picking up more of.
So, having come back from Maelstrom. The antepenultimate event. (Isn't that a good word?).
On a new character. End of the world is nigh. Things are starting to move. I had a variety of things I couldn't do, because of something else I did. But that was all fun, none the less (Nothing quite like someone trying to pick on you, whilst you're unable to do much about it).

Also: There was Pimms, good company, and socialising. The weather control was turned up a bit from last event. Given the choice between 'cold, huddled around the fire' and 'baking hot during the day, mayhem during the night' I'll definitely take the latter :).

Would quite like it if people weren't _quite_ so uptight come nightfall, but the onslaught of the Evil Dead makes that largely inevitable. I'm now in a place where - admittedly from a short run up - I have a win condition ahead of me, which I probably have no chance of making, but am going to try anyway. (It's probably not entirely congruent with some other win conditions out there, but that makes it all the more interesting).
This weekend's excursion, was to sunny Chepstow, for Waltz on the Wye. Staying a little outside town, in a B&B, which I can honestly say was absolutely wonderful. The Friday evening saw a meal, followed by heading into town for the evening - mostly centered around the drill hall.
The highlight of the evening I think would have to be Morgan and West, the time travelling magicians. But a 'bloody mary' crumpet in the park by the river was rather fun.

Saturday, we had an early start for an epic breakfast, followed by a retreat to tactically snooze, before heading into the town for lunchtime. A wander around the mechanized market, a visit to the castle (and the contraptions exhibition). Not forgetting, of course, the 'chap hop work-shop' with the illustrious Professor Elemental. Who was a bit overwhelmed - expecting a turnout of 'about 10 people'. I'm sure it comes as no surprise that there was considerably more interest than that.

A takeout pizza, and a 'quick-ish' change, before the evening's entertainment. A cabaret show - including the can can, some juggling, hula hoops, and of course, a spot of waltzing. The highlight of the evening would have to be the aforementioned Professor Elemental though - he's got a highly entertaining stage presence.

Sunday was a little more sedate - the kind people at the B&B did us a later breakfast, allowing for a bit more of a lie in. Again, into town after packing up. Catching 'the curious case' theatre production - which was somewhat hard to describe, but was quite funny.

There was a little more of the market, another crumpet or two, and a talk about hidden messages in jewellry.

Am now home. Fairly tired, and now convinced I look good in a top hat.

I would, however, like to make some observations.
Firstly: Goggles-on-hat are now cliché. On some outfits they make sense; mostly, they're just a bit silly really. I have nothing against wearing the goggles, and putting them on top of your hat to get them out of the way. Having them as permanent hat ornaments, though, is just a bit ridiculous. It's the metaphorical equivalent of wandering around the countryside wearing riding gear, having never been near a horse in your life.

Also: I now have several ideas that are amusing me. Hedgehog skirts - crinoline with spikes that 'pop up' (Maybe also doing a 'colour unfurling thing').
Maybe kickstool crinolines (e.g. one with a mobile chair to sit on, as you go).
Cog skirts, so multiple of them can mesh together - turn every dance into an excess of twirling (and battling for gearing ratios).
How to make a portable 'smoke machine' to have a chimney stack on a stovepipe hat.
'book' style phone and kindle cases.
And I'm also trying to figure out if you can subvert the 'goggles on hat' by making them lenses for a projector.

So yes. All good fun. Think that'll be one of the ones we aim to do next year.

Oh, and whilst I remember: Air Kraken. Do they live on air biscuits?
So I have in the past, been a pig about the national lottery. Calling it a stupid tax. (Or a greed tax, which is why I occasionally buy a ticket when there's a really huge jackpot).
But one thing that I can see, is that it's got potential as a 'piece of a dream' - I mean, leaving aside how exploitative that is, you've got a moment to think of what you'd do with it, and the small possibility that it pans out.

So I'd like to suggest an alternative - the 'job' lottery. It's a bit like the real lottery, only it's one where you put in an application for a job that'd be life changing. Be that because it's your dream job, or it's got the best hours and benefits, or just ... well, because it's near somewhere you'd really like to live.

Take a moment to enter each week - there's plenty of websites, but I'd suggest playing on http://www.indeed.co.uk - you may not be qualified, but I guarantee your odds of 'winning' are a lot higher than hitting the jackpot on the lottery. Second prize is a tour, a cup of coffee, and a chat about what you'd need to get into it.
And even if you don't win... well, you've got your CV and applying skills brushed up a bit. And anyway, it's not like you'd win the lottery, either.

But once you do that, then take a moment to dream about what that job could mean for your future. What it'd be like to live there, or do that, or get paid HOW MUCH?!

(Entry fee is probably cheaper than the lottery, too. At least, provided you're applying for the 'send us a CV' ones...)
So, being a fan - as I am - of Perl, I've had a reason to take a bit of a look at threaded code.
Threading is one possible implementation of parallel code and - for my purposes - is quite useful when you've got multiple things going on, which require 'something else' to respond.

Such as - for example - if you've got to log in to a lot of different servers, to perform a simple task - the 'login' takes more real time, than it does 'processing time' - so by threading, you end up with the task being accomplished faster.

Perl is actually quite easy to 'thread' with - the only really hard part is that your perl interpreter must be compiled to support threading - and if it doesn't by default, it's a bit of a hassle to recompile it. The good news is that current versions of perl seem to support it by default (in my small sample of 'a couple of systems', I didn't need to rebuild).

The basics of how to do it, can be found in 'perldoc perlthrtut' - but it goes a bit like this:

add 'use threads;' to the start of your code.

In your perl code, create a subroutine that will run as a thread:
sub thread_test_subroutine {
  my ( $arg_1, $arg_2 ) = @_;
  print $arg_1;
  sleep ( rand(10) );
  print $arg_2;
  return $arg_1 + $arg_2;
}

When you call that subroutine, rather than doing so in the normal fashion, do so using
threads -> create.

my $thread = threads -> create ( \&thread_test_subroutine, ( $first_arg, $second_arg ) );

And for the sake of neatness, you need to 'join' the thread (joining is perlish for 'wait for it to finish, and collect any return values'):

$thread -> join();

And it's just like that.
Because I'm a smartarse, I wanted to get a bit more clever - you can extend this idea for creating several threads, to all run in parallel.

So for example:
for ( my $count = 0; $count < 10; $count++ ) {
  threads -> create ( \&thread_test_subroutine, ( $first_arg, $second_arg ) );
}

foreach my $thread ( threads -> list() ) {
  if ( $thread -> tid() ) {
    my $result = $thread -> join();
    print $thread -> tid(), " returned ", $result, "\n";
  }
}
Which creates 10 threads, waits for all 10 to 'do their thing' and then joins them. Hopefully you can see where that gets a bit handy, if you've got a lot of networked devices to do stuff with.

But the hard part when messing with threads, is things like communicating between them. There's two libraries that I've been looking at today for that purpose.

You see, when I create a 'load' of threads, the last thing I want to do is to do so in an open ended fashion - 100 threads for 100 servers might be ok.
10,000 threads for 10,000 servers might cause a bit of a problem.

That's where Thread::Semaphore comes in - and, in case Thread::Semaphore isn't available on a given system, I've also been looking at threads::shared.

Thread::Semaphore lets you create a 'shared' counter (which defaults to 1).
There's two parts to it - 'down()' and 'up()'.
down() decreases the semaphore - and if it's already zero, waits until it can.
up() increases the semaphore.

So if you insert above:

my $resource_limit = Thread::Semaphore -> new ( 5 );

And within each thread's subroutine, call:

$resource_limit -> down();
# ... do the resource-limited work here ...
$resource_limit -> up();

when done. You'd then end up with a scenario where - with 10 threads 'existing' - only 5 are actually 'running' at any time.

You can also have your resource limit locally, within a loop:
sub thread_test_subroutine {
  my ( $semaphore, $host, $arg_1, $arg_2 ) = @_;
  print "$host";
  print $arg_1;
  # check there's a resource available to do this bit
  $semaphore -> down();
  sleep ( rand(10) );
  $semaphore -> up();
  print $arg_2;
  return $arg_1 + $arg_2;
}

my @server_list = ( "one", "two", "three", "four" );

foreach my $host ( @server_list ) {
  my $resources_per_host = Thread::Semaphore -> new ( 2 );
  for ( my $count = 0; $count < 10; $count++ ) {
    threads -> create ( \&thread_test_subroutine, ( $resources_per_host, $host, $first_arg, $second_arg ) );
  }
}

Which passes your semaphore object into each thread, such that you'll only have two 'active' threads per server in your list.

If you don't have Thread::Semaphore available, you can 'fake it' by using a (thread) shared variable, and a lock.


sub thread_test_subroutine {
  my ( $semaphore, $host, $arg_1, $arg_2 ) = @_;
  print "$host";
  print $arg_1;
  # check there's a resource available to do this bit
  {
    lock ( $semaphore );   # $semaphore is a reference to the shared variable
    sleep ( rand(10) );
  }
  # lock is released, because it's now out of scope.
  print $arg_2;
  return $arg_1 + $arg_2;
}

my @server_list = ( "one", "two", "three", "four" );

foreach my $host ( @server_list ) {
  my $resources_per_host : shared;
  for ( my $count = 0; $count < 10; $count++ ) {
    threads -> create ( \&thread_test_subroutine, ( \$resources_per_host, $host, $first_arg, $second_arg ) );
  }
}

Not quite as good, as you only get two states - 'in use' or 'not' - per host. But it does still allow for a bit of crude throttling.

But it was somewhat easier than I thought to do something that was practically useful, using threaded perl. These snippets are for my own reference, rather than practical usefulness, and bear in mind - if you're playing with parallel programs, you can end up with all kinds of exciting and interesting things going on, if you're not careful.

If you've watched TV recently, you've probably seen an advert for one of the 'payday loan' companies. This is the most recent not-quite-a-scam, following on from the 'buy your gold at a stupidly low price' dealers.

However, were you aware just how much it costs to be 'consumer credit licensed'? As a sole trader, it will cost you the princely sum of £585. If it's a company, that goes up to a whole £1225.
( http://www.oft.gov.uk/OFTwork/credit-licensing/fees-refunds-payments/#named2 )

That ... well, that really doesn't buy an awful lot of regulation. Which is part of the reason the 'payday loan companies' are proliferating in this country - because it's at least relatively easy to set up, and really quite hard to fund effective regulation.

Alongside this, it probably hasn't escaped your notice, that interest rates are really low at the moment - savings accounts are just about tipping 3% interest, but ... frankly, that's a fairly lousy rate.

I wonder if the time hasn't come for peer-to-peer finance. Sort of like a 'lending market place' offering the opportunity to lend and borrow.
Offer out 'lending amounts' in small slices, to minimise individual risk, but still be in a position to - effectively - supply short-medium term loans (probably unsecured) and sit comfortably in that niche that means the lender gets a higher return than a 'savings account' in return for a little more risk.
And a borrower has access to credit that doesn't charge them 5000% APR.

What do you think? Would you make use of such a facility, if it were an option? To put down - say - £1000, to be lent out in £10 (or smaller) slices to individual borrowers, at interest rates that are 'competitive' with the existing loan/overdraft market?
Bank overdrafts clock in at 'around' 15-20%.
Credit card borrowing can be up to about 30%.
Bank loans might be in the 6-12% range.
Even mortgages are generally (outside of fixed term) 4-5% (plus usually an arrangement fee).

And 'short term' credit, such as unarranged overdraft and those payday loans usually attract relatively fierce rates.

Now, you'd be trading off some risk in return for the increased rates (apparently payday loans experience a 10-20% default rate) - there is a possibility of loss, and perhaps not full utilisation of the money (e.g. if it's not being 'borrowed', it won't necessarily be attracting interest).

But the idea would be that you'd be able to specify acceptable borrowing quantity, interest rate and term, and probably a max proportion per individual, and a ticky box list of what you'd consider 'acceptable' in terms of who to loan to. (credit scoring, reference, security, er... whatever - but these'd be rolled into the price, in terms of applying a cost to the borrowing)

The individual would be able to 'request a quote' indicating approximate term of borrowing, and thus repayment rates (and in turn, interest). Which'd be calculated based on how many people were prepared to lend, and at what rate. (And there'd probably be a smidgin in the middle to cover operating costs, credit scoring, etc.)
Two quotes that are related. Bonus points if you can guess the source (of both, the first one's easy!)
"Fear, hate, anger, the dark side are these."
If he'd thought about it, I'm sure he'd have included blame and resentment as well.
"There's magic in forgiveness."

Now, the reason they're related - they're all about emotions, and emotions aimed at someone or something.
That often leads to a trap though - because it's all too easy to focus on the target.
You fear Someone.
You hate Someone.
You blame Someone.
And you forgive Someone.

But the problem here, is the really important person in these emotions, is not the 'someone else'. But you.
The real reason these emotions are 'of the dark side' is nothing to do with what you fear or hate. It's because they taint your perception of the world. They're an insidious poison that skews your worldview, and makes it a darker place.

That goes double if the thing you hate, fear or blame is yourself. I'm sure everyone's done it, at some point or other. That feeling of failure, of guilt that leads to self doubt, and self hate.

But still - it's not the subject of the emotion that matters, but the source.

The reason forgiveness is so inextricably linked though - forgiving someone is _also_ very little to do with the subject. It's all about letting go and putting aside all those dark side emotions. Forgiving isn't about forgetting. It's about letting go of that poison inside of you, and letting it go away, rather than clutching it close to your heart, where it can most hurt you.

So next time you feel tempted to 'stray towards the dark side' - just stop and think for a moment. Who's that actually hurting?