Friday, 27 March 2009

Lack of Planning

With this job at Jericho, it feels like I'm just constantly putting out fires. In one way that's a good thing, meaning that at least I'm busy and have work to do (which is not a bad thing in a recession). However, it does mean that I find less time to sit down and do some real planning.

For example, I've started looking into getting some industry qualifications in order to put into practice my idea that in the IT industry, to be successful, you must be constantly learning. Some industry qualifications would also look very nice on my resume. The ones I started looking at have been the Red Hat certifications and the Cisco certifications, although I wouldn't be opposed to getting Microsoft Certified, but that's a low priority. You can find so many '.NET' monkeys around that it would almost be an advantage (in my humble opinion) to not have any certs from the software giant.

Also, I've been thinking about starting my own Linux consulting firm. Nothing big, mostly just a way for me to make some extra cash on the weekends and further refine my skills. I think there's a real gap in the market here in NZ. I was also inspired by learning about OpenLogic and seeing one of their presentations. This was a very nice reminder that FOSS is in fact used in large enterprises, a fact that I sometimes forget working in a .NET shop like Jericho.

Here at work, the jobs keep piling up. On the system administration side of things, I have to set up a box to act as a router, firewall, logger and so on. For this I've gone with SmoothWall, a dedicated 'router' distro which I've used before with success at Primesoft. The job after that will be to set up our own email server, which will replace the contract we have with Net24/IPX. The machine's main purpose will be to act as a spam filter and a proxy, although I'll also get it to keep copies of the mail locally and set it up with an IMAP and a POP server, so that we still have access to our mail if our local Exchange server dies. I'm probably going to go with an Ubuntu box running Postfix and SpamAssassin. I also really want to set up Bayesian filtering on the server. The tricky part will be setting up a feedback mechanism, so that we can get the employees to train the filter.
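As a rough idea of what the feedback mechanism could look like: a minimal Python sketch, assuming employees drag misclassified mail into two folders that the server stores as Maildir directories, and that SpamAssassin's sa-learn tool is installed on the box. The folder paths and the function names are made up for illustration.

```python
import subprocess


def sa_learn_command(kind, folder):
    """Build the sa-learn command line for one training folder.

    kind is "spam" or "ham"; folder is a directory of messages
    (the paths used below are hypothetical).
    """
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    return ["sa-learn", "--" + kind, folder]


def train_from_folders(spam_folder, ham_folder):
    """Feed both feedback folders to the Bayes database.

    Raises CalledProcessError if sa-learn exits with an error.
    """
    for kind, folder in (("spam", spam_folder), ("ham", ham_folder)):
        subprocess.run(sa_learn_command(kind, folder), check=True)


if __name__ == "__main__":
    # Hypothetical locations; something like cron would run this nightly.
    train_from_folders("/var/mail/feedback/missed-spam/cur",
                       "/var/mail/feedback/not-spam/cur")
```

The nice thing about this shape is that the employees never touch the server: they just move mail between folders in their mail client.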

On the Deliverability side of things, the job we're doing right now is creating a classifier for our asynchronous bounces. Previously we've been using a product called ListNanny, but it has failed us miserably. Obviously it's a product made for small lists, and it stands no chance when it comes to enterprise-class ESPs. So far, we've got a set of regular expressions which we use to determine the classification. The set is not large enough: running a test over the last 10,000 messages yields a match rate of only about 20-30%. So, we're not only going to have to increase the size of the regex set, but also create a system to maintain it. I also had the idea of using a bounce classifier that I found on CPAN, but when I tested it out, it just didn't seem that accurate. It would be possible to rewrite/redefine it, seeing as it is open source, but having our own solution has its own advantages.
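To make the regex approach concrete, here is a small Python sketch of a rule table plus a match-rate check. The patterns and category names are invented for illustration and are not our real set; the point is only the shape: an ordered list of (category, pattern) pairs, and a function that reports how many messages matched anything at all.

```python
import re

# Hypothetical rules: the first matching pattern wins.
RULES = [
    ("hard bounce", re.compile(
        r"user unknown|no such user|mailbox (not found|unavailable)", re.I)),
    ("soft bounce", re.compile(
        r"mailbox (is )?full|over quota|try again later", re.I)),
    ("complaint", re.compile(r"abuse report|marked as spam", re.I)),
    ("reply", re.compile(r"out of (the )?office|auto.?reply", re.I)),
]


def classify(message):
    """Return the category of the first matching rule, or None."""
    for category, pattern in RULES:
        if pattern.search(message):
            return category
    return None


def match_rate(messages):
    """Fraction of messages that matched at least one rule."""
    if not messages:
        return 0.0
    matched = sum(1 for m in messages if classify(m) is not None)
    return matched / len(messages)
```

Running match_rate over a sample of recent bounces after every change to RULES gives a quick way to see whether a new pattern actually helped.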

Personally, given the ever-changing nature of asynchronous bounce messages, I think this would be a perfect problem to tackle using an AI technique such as a Bayesian classifier, the same as is used by SpamAssassin. Thinking about it, it wouldn't require that much tweaking. Instead of getting the system to give a message a single spam score, you give it a score for each category (reply, hard bounce, soft bounce, complaint and so on). This system would also need a feedback mechanism to keep the false positives and false negatives to a minimum. Providing that feedback would actually be a really boring job, effectively resigning some poor soul to being a 'bounce monkey'.
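A minimal sketch of the per-category scoring idea, in Python. This is a plain multinomial naive Bayes with add-one smoothing, which is a textbook stand-in and not SpamAssassin's actual algorithm. The class name, the tokenizer and the training examples are all assumptions for illustration.

```python
import math
import re
from collections import Counter, defaultdict


class BounceClassifier:
    """Toy naive Bayes: one log-score per category instead of one spam score."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> messages seen
        self.vocab = set()

    @staticmethod
    def tokenize(text):
        return re.findall(r"[a-z0-9]+", text.lower())

    def train(self, text, category):
        """Feedback hook: call this with the human-corrected category."""
        words = self.tokenize(text)
        self.word_counts[category].update(words)
        self.doc_counts[category] += 1
        self.vocab.update(words)

    def scores(self, text):
        """Return {category: log-probability score} for a message."""
        words = self.tokenize(text)
        total_docs = sum(self.doc_counts.values())
        result = {}
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)  # category prior
            for w in words:
                # Add-one smoothing so unseen words don't zero the score.
                score += math.log(
                    (counts[w] + 1) / (total_words + len(self.vocab)))
            result[category] = score
        return result

    def classify(self, text):
        """Highest-scoring category; needs at least one trained message."""
        s = self.scores(text)
        return max(s, key=s.get)
```

The 'bounce monkey' job then boils down to calling train() with the right category whenever the classifier gets one wrong.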

The other project we've got going is trying to create a real-time 'whiz-bang' graphing application for monitoring our sends, something similar to glTail. I've already started looking at employing the same approach as glTail, i.e. an ssh connection that is kept alive, providing the 'stream' which is then used to create the numbers to be displayed on the graph. I've started looking at using either Perl (for its easy text manipulation) or Java (for its cross-platform-ness). The problem with Java is that it doesn't have any official ssh libraries. There are libraries out there, but I have my doubts as to the quality. With Perl, it would only run on Linux, so integrating it with ssh would be a piece of cake (well, actually a pipe, but hey...). I guess I could try using Cygwin and integrating that with a Java front end somehow.... Anyway, it'll probably be a while until I have enough time to start worrying about that.
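The 'ssh as a pipe' idea is simple enough to sketch. The following is a Python illustration rather than the Perl or Java I'd actually use, and it assumes the ssh client is on the PATH, key-based login is set up, and the remote log marks each event with one of a few keywords. The host, log path and keywords are hypothetical.

```python
import subprocess
from collections import Counter


def stream_remote_log(host, path):
    """Yield lines from `tail -F path` on host over one long-lived ssh pipe."""
    proc = subprocess.Popen(
        ["ssh", host, "tail", "-F", path],
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        for line in proc.stdout:
            yield line.rstrip("\n")
    finally:
        proc.terminate()


def running_counts(lines, keywords=("sent", "bounced", "deferred"), every=100):
    """Yield a snapshot of per-keyword totals after every `every` lines."""
    counts = Counter()
    for i, line in enumerate(lines, 1):
        lowered = line.lower()
        for kw in keywords:
            if kw in lowered:
                counts[kw] += 1
                break  # count each line at most once
        if i % every == 0:
            yield dict(counts)


if __name__ == "__main__":
    # Hypothetical host and log file; each snapshot would feed the graph.
    for snapshot in running_counts(
            stream_remote_log("mailhost", "/var/log/mail.log")):
        print(snapshot)
```

Keeping the parsing separate from the ssh plumbing means the counting can be tested with a plain list of lines, and the graphing front end only ever sees the snapshots.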

Tuesday, 24 March 2009

Long Overdue Update

It's been a while since the last time I wrote. It's been about 4 months since I started at Jericho Ltd and the job now takes up the majority of my time. Everything from producing deliverability reports for the clients to making sure our network is secure is handled by me. I often find myself looking at the huge list of things to do and getting demotivated. It would take another 4 months just to get everything off that list.

On a brighter note, I went to an 'Agile Professionals' conference (agile as in the methodology) today. The speaker (Jeff Smith from Suncorp) didn't have a bad thing to say about open source, which was good. He also talked about the attitudes and philosophies that you should have when trying to create teams (people-centric) and where the innovation in a company should come from (bottom-up). These were all very good points. It was a shame that only Clint and I showed up to the conference; it would have been nice to have a few more people from work there.

I've also been reading. I finished Fred Brooks's legendary book 'The Mythical Man-Month' and the first two books in John Scalzi's Old Man's War series. The Mythical Man-Month is a real gem and deserves its reputation as a great book. I don't think I've ever seen so many truisms about software expressed so eloquently. John Scalzi's work is good as well, although the claims on the cover of the book about him being Heinlein's equal are exaggerations.