Wednesday, December 19, 2007

Java NIO and The Grizzly

I have been working on a small side project to develop a highly scalable reporting and analysis service. Part of the design calls for all "processing" nodes to maintain persistent connections to every other "processing" node in the cloud. I knew the idea of using blocking IO and the 1:1 thread/connection model was going to be horrible for this design.

I first turned to the Java NIO packages. I always start by trying to understand the underlying technology before I look into libraries. While I wouldn't say that the Java NIO packages are exceptionally difficult to work with, they leave a lot to be desired in the documentation department. Even the examples and tutorials that exist on the internet appear incomplete. Very few touch on the best methods for dealing with write operations. Between writes and fully understanding how to properly iterate over my selected key set (you have to call iterator.remove() after calling iterator.next()), I spent a week trying to get a firm grasp on what was really going on. By the end of the second week, I had created a prototype Java application that listened on sockets and played hot potato with a serializable Java object. I had acquired my basic understanding, and I was ready to look into libraries.
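To make those two sticking points concrete, here is a minimal sketch of a selector loop using only java.nio. It is not the prototype described above; the class name NioEchoSketch, the roundTrip helper, and the 4096-byte buffer are all assumptions made for illustration. It shows the iterator.remove() call on the selected key set and one hedged way to handle writes: write what the socket will take, keep the rest in the buffer, and ask for OP_WRITE only while bytes are still pending.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

// Hypothetical single-threaded echo server; every name here is illustrative.
class NioEchoSketch implements Runnable {
    private final Selector selector;
    private final ServerSocketChannel server;

    NioEchoSketch(int port) throws IOException {
        selector = Selector.open();
        server = ServerSocketChannel.open();
        server.configureBlocking(false);
        server.socket().bind(new InetSocketAddress("127.0.0.1", port)); // port 0 = any free port
        server.register(selector, SelectionKey.OP_ACCEPT);
    }

    int port() { return server.socket().getLocalPort(); }

    void close() throws IOException { selector.close(); server.close(); }

    public void run() {
        try {
            while (selector.isOpen()) {
                selector.select(); // blocks until at least one channel is ready
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove(); // the selector never removes handled keys itself
                    handle(key);
                }
            }
        } catch (IOException | ClosedSelectorException e) {
            // Selector was closed or failed: leave the loop.
        }
    }

    private void handle(SelectionKey key) {
        if (!key.isValid()) return;
        try {
            if (key.isAcceptable()) {
                SocketChannel client = server.accept();
                if (client != null) {
                    client.configureBlocking(false);
                    // Each connection carries its own buffer as the key attachment.
                    client.register(selector, SelectionKey.OP_READ, ByteBuffer.allocate(4096));
                }
                return;
            }
            SocketChannel ch = (SocketChannel) key.channel();
            ByteBuffer buf = (ByteBuffer) key.attachment();
            if (key.isReadable() && ch.read(buf) == -1) {
                ch.close(); // peer hung up; closing the channel also cancels the key
                return;
            }
            buf.flip();
            ch.write(buf);  // non-blocking: may write only part of the buffer
            buf.compact();  // keep any unwritten bytes for the next pass
            int ops = 0;
            if (buf.position() > 0) ops |= SelectionKey.OP_WRITE; // bytes still pending
            if (buf.hasRemaining()) ops |= SelectionKey.OP_READ;  // room to read more
            key.interestOps(ops);
        } catch (IOException e) {
            try { key.channel().close(); } catch (IOException ignored) { }
        }
    }

    // Starts the server on a free port, sends msg over a blocking socket, returns the echo.
    static String roundTrip(String msg) throws IOException {
        NioEchoSketch echo = new NioEchoSketch(0);
        Thread loop = new Thread(echo);
        loop.setDaemon(true);
        loop.start();
        try (Socket s = new Socket("127.0.0.1", echo.port())) {
            byte[] out = msg.getBytes("UTF-8");
            s.getOutputStream().write(out);
            byte[] in = new byte[out.length];
            new DataInputStream(s.getInputStream()).readFully(in);
            return new String(in, "UTF-8");
        } finally {
            echo.close();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("hot potato"));
    }
}
```

Leaving OP_WRITE registered all the time would make select() return immediately on every pass, since a socket is almost always writable. Toggling it based on whether the buffer still holds data is one common way around that, though a real service would also need a policy for slow readers.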

Grizzly is the library I am currently looking at. Getting a simple echo service up and running in Grizzly is a no-brainer. The concept can be completed in under 20 lines of code. Grizzly even comes with protocol parsers and other useful interfaces that make developing your own protocol directly on top of TCP or UDP a straightforward exercise. I haven't gotten too far down the path of implementation yet, but I will definitely be using Grizzly rather than rolling my own NIO solution.

The code for the Grizzly version of a simple echoing server is below:
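// Imports assume the Grizzly 1.x package layout (com.sun.grizzly).
import java.io.IOException;
import com.sun.grizzly.Controller;
import com.sun.grizzly.DefaultProtocolChain;
import com.sun.grizzly.DefaultProtocolChainInstanceHandler;
import com.sun.grizzly.ProtocolChain;
import com.sun.grizzly.TCPSelectorHandler;
import com.sun.grizzly.filter.EchoFilter;
import com.sun.grizzly.filter.ReadFilter;

// The class name EchoServer is only a wrapper chosen for this listing.
public class EchoServer {
    public static void main(String[] args) throws IOException {
        Controller controller = new Controller();
        TCPSelectorHandler handler = new TCPSelectorHandler();
        handler.setPort(9090);
        controller.setProtocolChainInstanceHandler(
            new DefaultProtocolChainInstanceHandler() {
                // Reuse a pooled chain, or build one that reads and then echoes.
                public ProtocolChain poll() {
                    ProtocolChain chain = protocolChains.poll();
                    if (chain == null) {
                        chain = new DefaultProtocolChain();
                        chain.addFilter(new ReadFilter());
                        chain.addFilter(new EchoFilter());
                    }
                    return chain;
                }
            }
        );
        controller.addSelectorHandler(handler);
        controller.start();
    }
}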

I also researched the following libraries: EmberIO (part of Pyrasun), Apache Mina, and Coconut AIO. I will be posting some of my experiences with these libraries later.

Saturday, May 19, 2007

A small side project can be just the motivation you need.

Some of us entered programming because we have a passion for it. Work tends to smother that passion, but a small utility app or a personal project is all it takes to remind us of the enjoyment this profession can bring.

A personal example is some recent burn-out I was suffering. Work was getting to me, and I was tired of my main project; I would go so far as to say I was dreading it. A friend asked me to write a simple app: just something to take a downloaded OFX file and modify some fields to match how he prefers to track his accounting. I managed to write the program and deliver the first build to him in under an hour. We spent the next hour trying to figure out why Quicken refused to import the modified OFX. When everything was fixed in the third hour, the end user declared it a stunning success.

It renewed my interest in programming. I felt success and it was good. The next day I went into work, ready to make the larger project a stunning success. Of course, that was the same day all of AOL's servers went up in flames.

Saturday, May 12, 2007

VMware and FreeBSD

My biggest problem with running an OS under virtualization is clock drift. With the default settings, nearly every OS I install shows some form of clock drift. I have no idea if this will help anyone, but here are my settings for FreeBSD 6.2+ under VMware.

First, I always rebuild my kernel. There are two Lance drivers: lnc and le. The le driver is newer and has better considerations for locking. I comment out the slower, more trusted lnc device and replace it with the newer one:
#device lnc
device le

FreeBSD 6.1+, and maybe some older versions, support both the BusLogic (bt) and the LSI Logic (mpt) SCSI adapters. I personally recommend the BusLogic driver. I forget the exact details, but in the VMware certification classes, they said the BusLogic driver was the better performing driver.

I also enable device polling in order to gain some possible speed boosts. I believe this also reduces error messages you may see from the lnc driver:
options DEVICE_POLLING
options HZ=100

And lastly in the kernel world, I comment out apic. The device apic line is technically deprecated according to the NOTES file, but it is still in there anyway.
#device apic

After rebuilding and installing the kernel, I also make the following changes to /boot/loader.conf. These lines really just reinforce what I did in the kernel configuration file and should work even if you don't rebuild your kernel:
hint.apic.0.disabled=1
kern.hz=100

After all this, you shouldn't see any error messages from the lnc driver, and you shouldn't see any issues with clock drift. I am still trying to figure out the best way to get the vmxnet driver working.

Monday, May 7, 2007

Open Source License Business

If you are developing open source software, the bevy of open source licenses you have to choose from is rather enormous. You have choices of everything from Public Domain to GPL to Dual licenses, etc.

And that is a good thing. While making choices is hard, having a choice is always a good thing. And in the case of open source licenses, the plethora of options gives you the opportunity to decide how your software impacts other developers and corporations. The two licenses I run into most often are the BSD license and the GPL license.

The BSD license is about the user's freedom and the author's credit. The user has the right to do anything and everything with the licensed material. The only restriction is that the original author gets credit for the original author's material.

Think about it this way. BSD promotes free trade and a constant exchange of ideas based on individual freedoms and values. The marketplace in this free-trade economy is populated by the developers of the world. The copyright holder or licensor has no power or authority to require tithe or change-sets. Under the BSD license the world becomes a free market where ideas are free to be used in any way possible.

In contrast, the GPL license is about the material's freedom. The user has the right to do anything with the licensed material as long as he makes an attempt to put all of his changes and usages into the open source community.

The GPL license was designed to make sure that open source code never finds its way into proprietary software. It takes away the end user's freedom to attempt to make improvements for personal gain; however, this loss of individual freedom comes with the benefit of an empowered community. All of the developers can rest assured that their code is not going to disappear and become closed source.

So in summary, the BSD license is not "promoting" proprietary code. It is promoting real individual freedom and opportunity. The BSD license can be considered anarcho-capitalism's equivalent in licenses.

The GPL license does not promote individual freedom; it promotes a community built around the good of all. The GPL license can be considered socialism's equivalent in licenses.

Think about what kind of impact you want your code to have on the world, and make your decision based on that. And remember, no matter what license you choose, the copyright holder always has the power to change their mind.

Friday, April 6, 2007

Tricky Business

Software development is a tricky business. You are constantly trying to balance flexibility and configurability against ease of use and deployment. At the same time, you are trying to figure out how your audience's preferred work-flow balances against writing code that performs at an acceptable speed, and against writing code that you can still manage six months from now. And those crazy business people and managers expect you to do it all inside some arbitrary time frame.

It is a system of compromises; success is highly dependent on getting as much information as possible and making the right compromises. Making the right compromises can become impossible if you completely refuse to ever take a risk. Some of the best, most exciting, and most enjoyable projects I have ever worked on were started because the team I was on all agreed that the reward was worth the risk. Things like Scrum, Waterfall, Iterative, etc. were all defined to help mitigate the risk; however, process is useless if you are incapable of common sense reasoning, explaining your stance, and empathizing with logic beyond your own.

Software development is also about pattern recognition. Making the right choices about risks is much easier if you have experience and can identify patterns in behavior sooner rather than later. This relates to everything from judging the character of the people you work with on your team to identifying the similarities between two apparently unrelated bugs. Pattern analysis is one of the fundamental components of software development. Being able to recognize design patterns when you see them is helpful, but definitely not the purpose of this paragraph.

I cannot say this enough: understanding your audience's work-flow is one of the most important things you can do. Unless you are writing applications for yourself only, you need to understand how the audience for your application will want to use what you develop. It doesn't matter if you develop the most technically advanced application on the planet with much more functionality than the audience needs. If the audience cannot understand how to properly use it, or cannot wrap their heads around what you wrote, your app will quickly be replaced by something less advanced that meets 60% of the requirements but has an "easy-button".

Microsoft has an entire business model based on work-flow development. I am sure someone in the open source community has a similar project; I have no idea what it is, though. But the fact that Microsoft (one of my least favorite companies) has put effort into making work-flow development easier shows how important work-flow development is becoming.

Combine all of this with a positive attitude and more of your projects will be successes, although it doesn't hurt if you are also Smarter Than A Fifth Grader.

Tuesday, April 3, 2007

And Then There Was Music...

My main project at AOL went live yesterday. Stunning success so far. The performance is better than expected, the data sources aren't crashing, and the pages look good. Check out http://music.aol.com/.

The list of bugs that managed to make it to production is much smaller than I expected as well. It is hard to believe that it all just basically worked. I keep waiting for something major to blow up. Things just never go "right" -- it is a law of software development or something. If things don't break at least once in a release, then you missed something.

It has been publicly released for over two days now, and the feedback has been positive. Congrats to my team.

Tuesday, March 27, 2007

"Can't you get the build out sooner?"

The best thing in the world is having people use your library and depend on its functionality.

The worst thing in the world is to have people that have more pull than you start dictating what should go into the library.

I get a warm fuzzy when I learn that someone else is going to use an application or library I wrote for their next project. It is nice to know that all my work wasn't an academic experiment, and I love knowing that someone found a real usage for my ideas.

I get very depressed when I am required to implement something that I feel doesn't belong in my library. Worse, the functionality typically isn't fully baked, and no one really knows what they want. Cruft and spaghetti code soon ensue as modifications and changes are made in order to complete a short term requirement instead of a long term architecture.

... but architecture design is for another post.