« January »
Locations of visitors to this page

Powered by blojsom

Radovan Semančík's Weblog

Wednesday, 12 January 2011
« Giant Leap | Main | OpenIDM Design Summit 2011 »

The more I learn about distributed systems, the more I start to think that nobody has any idea how to make them well. It is unbelievable how little we know about distributed systems design and development. And how many failures do we still need to suffer to learn at least something?

As far back as I remember, there was Sun RPC, CORBA, DCOM, RMI, ... And now there are Web Services and REST. This is at least 20 years of development, yet all of that approaches seems to manifest the same fallacy: they try to hide the network from the developer. I will use Java JAX-WS as an example, but this is in no way specific to Java. The JAX-WS provides a runtime for a web service and generates a web service client. Both are designed to use local Java calls, efficiently hiding the network boundary. It goes a great deal for a programmer to feel comfortable. For example it hides the network exceptions from the programmer (by making them runtime exceptions). This may seem like a good approach, but it is a bad think in the very principle.

Most people would be surprised how applicable is the theory of relativity to a design of software systems. And most developers would be really surprised how slow the light is. The light will travel only approx. 300km in one millisecond. If the system needs a response under 1 millisecond, it just cannot communicate with anything that is more distant than 150km. Add a TCP three-way handshake and you are pretty much down to a size of a city. Assuming Einstein was right there is nothing, absolutely nothing, that could be done about this. No amount of technology or money can speed it up.

One millisecond is unbelievably long time for a computer system. Even a cheap computer can execute more than a million instructions in 1 millisecond. Local memory access is a bit slower, but even with that slow-down the local call is incomparably faster than a call over a fast network. Faster by several orders of magnitude. How could anyone hope to hide such a difference?

Reliability is another problem. While local call cannot (reasonably) fail, network call can and often does. Hiding the errors from the engineer does a disservice. It usually means that the network problems are ignored or handled at the top-level of an application. It means that any serious error in network communication means a sudden death of the application. Applications that are written a little bit better still manifest serious usability problems in presence of network failures. I can see that well enough on my iPhone.

Network is a significant boundary. A boundary that just cannot be swept under the carpet of a software development framework. Anyone trying to do that will most likely fail. Fail miserably.

Posted by rsemancik at 11:16 PM in Software

Add your comment:
(not displayed)
Generate another code

Please enter the code as seen in the image above to post your comment.
Your comments will be submitted for approval by blog owner to avoid comment spam. It will not appear immediately. Also please be sure to fill out all mandatory fields (marked by asterisk). This ... ehm ... imperfect software does not have any error indication for missing input fields.


[Trackback URL for this entry]