Toolchain

Python, NumPy and matplotlib - Visualize Matrix Rotations

2010-08-21T16:40:00.004-04:00

I am using Python, NumPy and matplotlib to experiment with and visualize 3-D graphics techniques. The first step was figuring out how to generate polar plots with matplotlib and to determine the domain and range of spherical-to-Cartesian coordinate conversions.

The next step is to rotate a point in space using matrices, and then to explore the concept of gimbal lock. I will probably tackle gimbal lock in a future posting. Before delving into 3-D rotations, I needed to figure out how to extend my plotting knowledge.

I cleaned up the code from the previous posting, moving the spherical to Cartesian coordinates conversion code to a separate module - sphcoords. sphcoords also includes a function that takes spherical coordinates and returns a 1x3 NumPy matrix that represents a point in space.



import numpy as np
from numpy import matrix

def x_cart(incl_sph_coords, azim_sph_coords):
    return np.sin(incl_sph_coords) * np.cos(azim_sph_coords)

def y_cart(incl_sph_coords, azim_sph_coords):
    return np.sin(incl_sph_coords) * np.sin(azim_sph_coords)

def z_cart(incl_sph_coords):
    return np.cos(incl_sph_coords)

def sph2cart(incl_sph_coord, azim_sph_coord):
    # returns a 1x3 matrix holding the x, y and z coordinates
    return np.matrix([
        x_cart(incl_sph_coord, azim_sph_coord),
        y_cart(incl_sph_coord, azim_sph_coord),
        z_cart(incl_sph_coord)])

Below is an example of the sphcoords function sph2cart, which generates a 1x3 matrix containing Cartesian coordinates. In this case, spherical coordinates of 30 degrees inclination and 0 degrees azimuth are converted (with azi_incr equal to 2*pi/20, the expression -np.pi+(10*azi_incr) evaluates to 0).

poi = sphcoords.sph2cart(np.pi/6, -np.pi+(10*azi_incr))
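As a quick check of what that call produces, here is a minimal sketch that uses only the standard library. The tuple-returning sph2cart below is a stand-in for the sphcoords version (which returns a NumPy matrix), and azi_incr is assumed to be 2*pi/20 as in the plotting script.

```python
import math

def sph2cart(incl, azim):
    """Unit-sphere spherical -> Cartesian, same formulas as sphcoords."""
    return (math.sin(incl) * math.cos(azim),
            math.sin(incl) * math.sin(azim),
            math.cos(incl))

azi_incr = (2 * math.pi) / 20        # assumed: same step as the plotting script
azim = -math.pi + 10 * azi_incr      # ten steps up from -pi lands on 0 radians
x, y, z = sph2cart(math.pi / 6, azim)
print(round(x, 3), round(z, 3))      # x is about 0.5, z is about 0.866
```

The point sits in the x-z plane, 30 degrees away from the z axis.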

By using matplotlib's subplot capability along with spherical to Cartesian coordinate conversions, I was able to depict a 3-D spherical model of space. A single red "point of interest" will be rotated using rotational matrices.

The initial position of the red point is arbitrary. The red point will be initially rotated about the z axis. The rotated position of the red point will remain along the point's original circle of inclination. This is what I was hoping to illustrate with the cyan-colored annotations.

Before going further, let me apologize for my inability to generate circles. The elliptical, squashed figures in my plots will be circular after I figure out how to set fixed aspect ratios.

The Initial Position of the Red Point of Interest

Here are the rotational matrices that were implemented in module rotmat using the NumPy Matrix.

import numpy as np
from numpy import matrix

def x_rot(rads):
  return np.matrix([
  [1,0,0], 
  [0, np.cos(rads), -np.sin(rads)],
  [0, np.sin(rads), np.cos(rads)]])

def y_rot(rads):
  return np.matrix([
  [np.cos(rads), 0, np.sin(rads)],
  [0, 1, 0], 
  [-np.sin(rads), 0, np.cos(rads)]])

def z_rot(rads):
  return np.matrix([
  [np.cos(rads), -np.sin(rads), 0],
  [np.sin(rads), np.cos(rads), 0],
  [0,0,1]])

The 1x3 matrix containing the point of interest's (POI) Cartesian coordinates is rotated about the z axis thusly:

poi = poi*rotmat.z_rot(-5*azi_incr)

The plot below shows the POI rotated 90 degrees about the z axis. The cyan annotations show the range of the next rotation, which is about the x axis. The POI will be rotated 30 degrees about the x axis. The most interesting update will be on the upper right plot, which will show the POI jumping from the yellow 30-degree circle to the black circle at 60 degrees.
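The two rotations can be traced numerically. The sketch below is a plain-Python stand-in for the NumPy matrix code (the helpers row_times, inclination_deg and azimuth_deg are hypothetical names, not part of rotmat). It assumes the POI starts at 30 degrees inclination and 0 degrees azimuth. Because the point is a row vector multiplied on the left of the matrix, z_rot(-pi/2) moves it by +90 degrees of azimuth.

```python
import math

def z_rot(rads):
    c, s = math.cos(rads), math.sin(rads)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def x_rot(rads):
    c, s = math.cos(rads), math.sin(rads)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def row_times(p, m):
    """Multiply a 1x3 row vector p by a 3x3 matrix m (p * m)."""
    return [sum(p[k] * m[k][j] for k in range(3)) for j in range(3)]

def inclination_deg(p):
    """Angle from the +z axis for a point on the unit sphere."""
    return math.degrees(math.acos(max(-1.0, min(1.0, p[2]))))

def azimuth_deg(p):
    """Angle in the x-y plane, measured from the +x axis."""
    return math.degrees(math.atan2(p[1], p[0]))

poi = [0.5, 0.0, math.sqrt(3) / 2]                 # inclination 30, azimuth 0
after_z = row_times(poi, z_rot(-math.pi / 2))      # same as poi*rotmat.z_rot(-5*azi_incr)
after_x = row_times(after_z, x_rot(math.pi / 6))   # same as poi*rotmat.x_rot(np.pi/6)

print(round(inclination_deg(after_z)), round(azimuth_deg(after_z)))  # 30 90
print(round(inclination_deg(after_x)))                               # 60
```

The z rotation changes only the azimuth, so the point stays on its 30-degree circle. The x rotation is what moves it onto the 60-degree circle.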


After Rotating the POI about the Z axis.

After Rotating the POI about the X axis.

The final position of the POI after rotating about each Euler Axis.



import numpy as np
import matplotlib.pylab as plt
from numpy import matrix
import rotmat
import sphcoords


def generate_plot(inclin, azi, poi):
    # plot x-y plane
    plt.subplot(221)
    plt.plot(
        sphcoords.x_cart(inclin[0],azi[0]),
        sphcoords.y_cart(inclin[0],azi[0]), "g-o",
        sphcoords.x_cart(inclin[1],azi[1]),
        sphcoords.y_cart(inclin[1],azi[1]), "y-o",
        sphcoords.x_cart(inclin[2],azi[2]),
        sphcoords.y_cart(inclin[2],azi[2]), "y-o",
        sphcoords.x_cart(inclin[3],azi[3]),
        sphcoords.y_cart(inclin[3],azi[3]), "k-o",
        sphcoords.x_cart(inclin[4],azi[4]),
        sphcoords.y_cart(inclin[4],azi[4]), "k-o",
        sphcoords.x_cart(inclin[5],azi[5]),
        sphcoords.y_cart(inclin[5],azi[5]), "b-o",
        sphcoords.x_cart(inclin[6],azi[6]),
        sphcoords.y_cart(inclin[6],azi[6]), "b-o",
        poi[0,0], poi[0,1], "r-o")
    plt.grid(True)
    plt.ylabel('Y Axis')
    plt.xlabel('X Axis')

    # plot y-z plane
    plt.subplot(222)
    plt.plot(
        sphcoords.z_cart(inclin[0]),
        sphcoords.y_cart(inclin[0],azi[0]), "g-o",
        sphcoords.z_cart(inclin[1]),
        sphcoords.y_cart(inclin[1],azi[1]), "y-o",
        sphcoords.z_cart(inclin[2]),
        sphcoords.y_cart(inclin[2],azi[2]), "y-o",
        sphcoords.z_cart(inclin[3]),
        sphcoords.y_cart(inclin[3],azi[3]), "k-o",
        sphcoords.z_cart(inclin[4]),
        sphcoords.y_cart(inclin[4],azi[4]), "k-o",
        sphcoords.z_cart(inclin[5]),
        sphcoords.y_cart(inclin[5],azi[5]), "b-o",
        sphcoords.z_cart(inclin[6]),
        sphcoords.y_cart(inclin[6],azi[6]), "b-o",
        poi[0,2], poi[0,1], "r-o")
    plt.xlabel('Z Axis')
    plt.grid(True)

    # plot x-z plane
    plt.subplot(223)
    plt.plot(
        sphcoords.x_cart(inclin[0],azi[0]),
        sphcoords.z_cart(inclin[0]),"g-o",
        sphcoords.x_cart(inclin[1],azi[1]),
        sphcoords.z_cart(inclin[1]),"y-o",
        sphcoords.x_cart(inclin[2],azi[2]),
        sphcoords.z_cart(inclin[2]),"y-o",
        sphcoords.x_cart(inclin[3],azi[3]),
        sphcoords.z_cart(inclin[3]),"k-o",
        sphcoords.x_cart(inclin[4],azi[4]),
        sphcoords.z_cart(inclin[4]),"k-o",
        sphcoords.x_cart(inclin[5],azi[5]),
        sphcoords.z_cart(inclin[5]),"b-o",
        sphcoords.x_cart(inclin[6],azi[6]),
        sphcoords.z_cart(inclin[6]),"b-o",
        poi[0,0], poi[0,2], "r-o")
    plt.ylabel('Z Axis')
    plt.grid(True)
    plt.show()


if __name__ == '__main__':
    inclin = []
    azi = []

    inclin_points = range(0,20)
    azi_incr = (2 * np.pi)/float(20)
    azi_points = np.arange(-np.pi, np.pi, azi_incr)

    #90 deg inclination
    inclin.append([np.pi/2 for i in inclin_points])
    #30,150 deg inclinations
    inclin.append([np.pi/6 for i in inclin_points])
    inclin.append([np.pi-np.pi/6 for i in inclin_points])
    #60,120 deg inclinations
    inclin.append([np.pi/3 for i in inclin_points])
    inclin.append([np.pi-np.pi/3 for i in inclin_points])
    # poles
    inclin.append([0])
    inclin.append([np.pi])

    # one azimuth sweep per circle of inclination, one value per pole
    azi.append(azi_points)
    azi.append(azi_points)
    azi.append(azi_points)
    azi.append(azi_points)
    azi.append(azi_points)
    azi.append([np.pi])
    azi.append([np.pi])

    poi = sphcoords.sph2cart(np.pi/6, -np.pi+(10*azi_incr))
    poi = poi*rotmat.z_rot(-5*azi_incr)
    poi = poi*rotmat.x_rot(np.pi/6)
    poi = poi*rotmat.y_rot(np.pi/2)
    generate_plot(inclin, azi, poi)

Spherical to Cartesian Coords - Thanks Python!

2010-08-15T17:53:00.004-04:00

Game and graphics programming requires the modeling of three-dimensional geometries in software. This stuff is relatively simple in theory but easy to screw up and difficult to debug.

The math used in 3-D geometrical programming has its limitations and special cases. One example is the gimbal lock problem that occurs when rotating a body using matrices and Euler angles.

Another problem that I recently stumbled across occurred when trying to find the intersection point of a line and plane. If the line is pointing away from the plane, the intersection point ends up being behind the line's start point. Essentially, if we shoot an arrow directly away from a target, it will still hit the target, but with the back of the arrow.

The problem I describe above with the intersection point, line and plane is not really a problem at all. The special case is easily detected while computing the result and the code can respond as required.
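A minimal sketch of that detection, assuming the line is written as start + t*direction and the plane is given by a point and a normal (the function and parameter names here are hypothetical). A negative t means the intersection lies behind the start point.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def line_plane_intersection(start, direction, plane_point, plane_normal, eps=1e-12):
    """Return (t, point) where start + t*direction meets the plane,
    or None when the line is parallel to the plane."""
    denom = dot(direction, plane_normal)
    if abs(denom) < eps:
        return None
    diff = [p - s for p, s in zip(plane_point, start)]
    t = dot(diff, plane_normal) / denom
    point = [s + t * d for s, d in zip(start, direction)]
    return t, point

# The target is the plane z = 5; the arrow starts at the origin.
t_fwd, hit_fwd = line_plane_intersection([0, 0, 0], [0, 0, 1], [0, 0, 5], [0, 0, 1])
t_back, hit_back = line_plane_intersection([0, 0, 0], [0, 0, -1], [0, 0, 5], [0, 0, 1])

print(t_fwd, t_back)   # 5.0 -5.0: same hit point, but t < 0 flags "behind the start"
```

Both calls report the same intersection point; only the sign of t tells the caller that the second arrow was shot away from the target.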

The real issue is that we must have a complete understanding of the mathematical techniques we use in our software.

There are two ways to understand math: one is analysis, the other is intuition and visualization. Analysis is the stronger of the two methods and includes things such as proofs. Intuition, visualization and experimentation make up the approach that I take.

My latest interest is converting spherical coordinates into Cartesian coordinates. I can easily plug and chug using the standard formulas, but after learning a few hard lessons, I know I had better gain some understanding.

When it comes to three-dimensional programming, it is not as easy to visualize 3-D space as one would imagine. I needed some help to visualize this stuff.

Locations on earth are identified using spherical coordinates, using the intersection of two angles, an angle of inclination (latitude) and an angle of azimuth (longitude). In my investigation, I would like to consider inclination angles from 0 to 180 degrees, azimuths from 0 to 360 degrees and a spherical radius of one.

The plot simplifies the problem by reducing three-dimensions to a two-dimensional display with circles of common inclination. This is essentially a polar-plot. Each circle of common inclination (latitude) represents two inclinations, the inclination (northern hemisphere) and 180 - inclination (southern hemisphere).

As a two-dimensional polar plot, the problem is solved using the azimuth angle, and the distance from the origin to the proper circle of common inclination.

It appears from this work that spherical to Cartesian coordinate conversion is a simple and safe process (with the exception of when the inclination angle is 0 or 180 degrees, of course, where every azimuth maps to the same point).
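The polar-plot view can be sketched in a few lines, assuming a unit sphere so that the distance from the origin to a circle of common inclination is sin(inclination). The helper names polar_radius and to_xy are hypothetical.

```python
import math

def polar_radius(incl):
    """Distance from the origin to the circle of common inclination."""
    return math.sin(incl)

def to_xy(incl, azim):
    """Position on the 2-D polar plot for a point on the unit sphere."""
    r = polar_radius(incl)
    return (r * math.cos(azim), r * math.sin(azim))

north = math.radians(30)     # northern hemisphere
south = math.radians(150)    # 180 - 30, southern hemisphere

# Both inclinations land on the same circle; only z tells them apart.
same_circle = math.isclose(polar_radius(north), polar_radius(south))
z_north, z_south = math.cos(north), math.cos(south)

# At the pole the azimuth has no effect: every azimuth gives the same point.
pole_points = [to_xy(0.0, a) for a in (0.0, 1.0, 2.0)]
print(same_circle, pole_points)
```

This is why the 0 and 180 degree inclinations are the special case: the conversion still works there, but the azimuth can no longer be recovered from the result.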



import numpy as np
import matplotlib.pylab as plt


def x(inclin, azi):
    """ compute the x cartesian coord """
    return np.sin(inclin) * np.cos(azi)


def y(inclin, azi):
    """ compute the y cartesian coord """
    return np.sin(inclin) * np.sin(azi)


def z(inclin):
    """ compute the z cartesian coord """
    return np.cos(inclin)


# one list of 20 identical inclinations per circle of common inclination
inclin = []
inclin.append([np.pi/2 for i in range(0,20)])
inclin.append([np.pi/3 for i in range(0,20)])
inclin.append([np.pi/4 for i in range(0,20)])
inclin.append([np.pi/6 for i in range(0,20)])
inclin.append([np.pi/8 for i in range(0,20)])
inclin.append([np.pi/12 for i in range(0,20)])
inclin.append([np.pi/24 for i in range(0,20)])
inclin.append([0 for i in range(0,20)])
azi = np.arange(-np.pi, np.pi, (2 * np.pi)/float(20))

plt.plot(
    x(inclin[0],azi), y(inclin[0],azi), "g-o",
    x(inclin[1],azi), y(inclin[1],azi), "r-o",
    x(inclin[2],azi), y(inclin[2],azi), "c-o",
    x(inclin[3],azi), y(inclin[3],azi), "m-o",
    x(inclin[4],azi), y(inclin[4],azi), "y-o",
    x(inclin[5],azi), y(inclin[5],azi), "k-o",
    x(inclin[6],azi), y(inclin[6],azi), "b-o",
    x(inclin[7],azi), y(inclin[7],azi), "w-o")
plt.legend(
    ('90', '60, 120', '45, 135',
     '30, 150', '23, 157',
     '15, 165', '8, 172', '0, 180'))
plt.grid(True)
plt.ylabel('Y Cartesian Coord')
plt.title('Cartesian from Spherical')
plt.xlabel('X Cartesian Coord')
plt.show()

The Cult of Software Engineering Academia

2010-03-19T14:10:00.012-04:00

This morning I was online checking out a software architecture conference that I had attended before and was considering attending this year. I was looking over the keynote speakers and found the bio of one particular speaker to be interesting.

The speaker, a PhD and college professor, is a leading authority on modern software development methodologies. Her focus is guiding organizations to institute the cultural changes required to adopt modern software development methods.

Her bio included a quotable-quote or catch-phrase that went like this: "You can take a man out of the Stone Age, but you can't take the Stone Age out of the man".

I was rocked back by the arrogance of that statement. I am sure that she and I would be on the same page regarding software methods and the need for cultural change, but to publicly blast those who are not ready to accept your opinions as "less-evolved" is arrogant and absurd.

Perhaps we are seeing the deficiencies in her culture. The culture of academia. A culture accustomed to feeding bright, fresh, hungry minds. A culture accustomed to unquestionable omnipotence over its audience.

This culture of education has attempted to recast itself into a "culture of change", targeting the software industry. But in the process of recasting itself, it has done little in the way of introspection.

Perhaps the most important skills fostered by working software professionals include communication, collaboration and negotiation. Influence does not come easy and neither does experience. Both have to be earned.

Veteran software professionals have learned to listen as well as to speak, to accept criticism and to consider the ideas and opinions of others. Veteran software professionals would feel negatively about a one-way monologue that targets topics so close to the practice of their art.

I am not suggesting that "change agents" engage in conversation as a way to "handle" an audience, using dialog merely as a soft-skill. Instead I would hope that any discussion would be an honest, open, two-way exchange of ideas.

Of course, this may require the "change agent" to closely engage with her audience. In effect, to lose some of her elite status in order to gain the acceptance of her ideas.

To be a participant instead of a prescriber.

Write Software Requirements For The Love Of It

2010-03-13T23:10:00.009-05:00

Yeah sure. Right. Who am I kidding, writing software requirements kind of stinks. But right now, I cannot imagine a better use of my time.

Writing Good Software Requirements is Hard

Writing good requirements is at least as hard as writing good code. One key attribute of a good software requirement is that it must be unambiguous. The English language is notoriously ambiguous. The legal profession, political debate and religious holy wars are based on differing interpretations of the written word. Conversely, software programming languages are designed to be unambiguous because they have to be interpreted or parsed by a machine. Perhaps this is why programmers love to code - the machine detects and points out errors, or better yet, validates our efforts by correctly performing the operations that we have specified.

This brings us to another key attribute of a good software requirement: the requirement must be correct. This is obvious, but the critical point is that the requirement must be perceived as correct by all of the software system's stakeholders. The point of the software requirements specification (SRS) is to capture a commonly agreed upon view of how the software system will function. The act of generating the SRS can end up being a process of socialization and team building. (Right now most programmers are saying "Have fun, I will be in the lab writing code".)

Looking at some job postings, they all seem to seek programming language expertise and experience. Bah! Programming is natural, joyous and simple compared to writing requirements. Writing requirements is like being a salmon, battling its way upstream to spawn and die. Well, maybe it's not that bad, but it's not as gratifying as writing code.

The Greatest Software Requirements Pitfall

Specifying software requirements that include design and implementation details will lead you and your project to your doom.

It is actually difficult, unnatural, and odd to specify system requirements in a way that is bereft of design and implementation. For one thing, as software developers, we are primarily problem solvers and solution creators. We are champing at the bit to design and implement. Secondly, we are comfortable talking in terms of the software system itself. Specifying software functions without including implementation details is not easy.

Writing software requirements that include design and implementation is a trap. Essentially, if the implementation is used to define requirements, the implementation itself must be controlled. This leads to an explosion of requirements documentation. It becomes difficult to make even simple changes to the system without stakeholder consensus and documentation maintenance.

A Software Requirements Fallacy

There is an old software engineering joke: one developer says to the other, "You start coding while I go and get the requirements". At this point all learned software engineers slap their knees and laugh because it's funny to think that some developers would start building a system before they knew what the system was required to do.

I believe this line of thought is incorrect. Software requirements are not necessary to start software system development. Software requirements are paramount when trying to figure out if software development is complete.

Initial development can begin with some basic knowledge of the intended system. Initial development, proof-of-concepts, and prototypes will foster an understanding of the domain, encourage communications and team-building.

Essentially, I am suggesting that the act of software requirements specification does not have to imply the use of the waterfall methodology. I am suggesting that software requirements specification works well with iterative and incremental methods.

Unfortunately I cannot speak to how Agile methodologies factor in software requirements. I would bet that most include some form of functional description document. Perhaps the system is developed in increments that lead to some collective consensus on the intended system's functions.

A Software Requirements Truth

An SRS is simply a checklist of functions that a software system must perform.

Ultimately, the effectiveness and usability of the software system, as perceived by its end-users, is primarily driven by the talent, dedication and expertise of the software development team. This is because the one, unwritten requirement levied on all software systems should be:

The software system shall not stink.

Friqing Out with Python Closures

2009-01-28T13:42:00.058-05:00

You have to love snow days. After spending an hour shoveling snow this morning, I fired up my laptop for a guilt-free afternoon of tinkering around with some code. I opened up some Python code that I had written last year when I was discovering functional programming with Erlang and relating it to languages that I was familiar with.

Python is a great language for exploration. Back in the old days I did a lot of programming using C++ and Microsoft's Component Object Model (COM). C++ COM is a fairly complex technology so I found it preferable to prototype object-oriented designs using Java. The Java language's interface mechanism was a good analog for COM's interface-based programming paradigm. Java was a sexy thing compared to COM and Java was the lingua-franca of object oriented design during this period.

Then I discovered Python and found that it featured a familiar approach to object orientation and it looked like it would support rapid prototyping and concept exploration.

Python proved to be an excellent language, but it did not serve well as a prototyping language for C++ applications. In the C++ language, data types are everything. Most of the advanced features of the language, such as templates and classes, provide ways to manage functionality across data types. The C++ compiler is the ultimate master and the programmer is subjected to its rule of type.

Python cannot be used to prototype solutions to C++ problems because the problems do not appear in Python.

Most circa-2000 C++ programmers were ill-equipped to absorb Python. To absorb Python meant to be influenced and changed by Python - to refute idioms previously held sacrosanct.

To close the loop, a next logical step would be to gain perspective by relating Python to functional programming languages and not just to Java and C++.

My Python Data Structures Project

I go through a cycle every now and again that involves Python and data structures. A good data structure to code is the Priority Queue. The binary heap-based Priority Queue features a tree data structure, implemented using an array and clever array index manipulations. A binary min heap is shown below. Essentially, a binary heap must always be correctly ordered: children must always be greater than or equal to their parents (in the case of the binary min heap).

The binary heap is stored in a simple array. Algorithms maintain the relations between array elements as shown below. One exception is that it is easier to implement the array-based binary heap if the first element is stored at index 1 in the array.
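As a rough sketch of those index relations, assuming the 1-based layout described above (index 0 unused): the parent of element i sits at i//2 and its children at 2*i and 2*i+1. The helper names and the sample values are made up for illustration.

```python
def parent(i):
    return i // 2

def left(i):
    return 2 * i

def right(i):
    return 2 * i + 1

def is_min_heap(heap):
    """heap[0] is unused; every child must be >= its parent."""
    return all(heap[parent(i)] <= heap[i] for i in range(2, len(heap)))

# Hypothetical sample heap: 1 is the root, 3 and 2 are its children, and so on.
heap = [None, 1, 3, 2, 7, 4, 5]

print(heap[left(1)], heap[right(1)])   # 3 2  (children of the root)
print(heap[parent(5)])                 # 3    (parent of the element at index 5)
print(is_min_heap(heap))               # True
```

Starting at index 1 is what keeps the arithmetic this simple; with a 0-based layout the children would be at 2*i+1 and 2*i+2.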

One reason to code up a Priority Queue is that it is a building-block data structure with multiple uses. The Priority Queue conceptually fits well into an abstract data type (ADT) such as a C++/Java/Python class. In fact there are Priority Queue ADTs in all three languages. My recent investigations into functional programming led me to create a Priority Queue using lexical closures. My original class-based Priority Queue was named priq. Thus the functional-closure based version became friq. And indeed it is a friq.

The friq

The screen shot above shows the definition of friq, which is a function with a single parameter lt. Parameter lt is a comparison "less-than" function which is more easily demonstrated than explained. It will be demonstrated in the next section. Note that friq is a function definition that encloses other function definitions, and a rather strange nested list heef. The functionality exported by friq is done with its return statement (see below).

The functions enqueue and dequeue shown above are the two functions, enclosed by friq, that are exported to the outside world using friq's return statement.

Functions down_heap and up_heap implement the heavy lifting algorithms for the priority queue and are not exported to the outside world.

Finally we have the enclosing function friq returning a tuple containing the functions we wish to export.

Demonstrating the friq

The following code demonstrates usage of the friq. A list d is loaded with some random integers. The friq is invoked returning a tuple containing a function to enqueue values and another for dequeueing. List d is iterated and its values are enqueued. Then the friq is drained by dequeueing.

The usage of friq appears reasonably simple and straightforward. Note friq's lt parameter being passed a Python lambda function. The results are seen below.

Next we have a slightly more complex demonstration in which a tuple containing a number and a string is enqueued. Note that the code that handles dequeueing has to do extra work to expand the tuple. Also note that the lambda function is slightly more complex in order to handle the tuple. The index 0 signifies that the sort will be on the tuple's number.

The next demonstration is identical to the previous with the exception that the lambda function uses an index of 1, which signifies that the sort will be on the tuple's string.

And finally, a display of the results of both demonstrations that involve the (number, str) tuple.
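The demonstrations described above can be approximated with the sketch below. It is not the original demo code: mini_friq is a hypothetical, simplified stand-in that keeps the same closure interface as friq (it takes lt and returns an (enqueue, dequeue) tuple) but is backed by a sorted list instead of a binary heap, so that the block runs on its own. The sample values are made up.

```python
def mini_friq(lt):
    """Simplified stand-in for friq: same interface, sorted list inside."""
    items = []                      # enclosed state, hidden from callers

    def enqueue(x):
        # insert x in front of the first item that x is 'less than'
        i = 0
        while i < len(items) and not lt(x, items[i]):
            i += 1
        items.insert(i, x)

    def dequeue():
        # highest priority item, or None when the queue is empty
        return items.pop(0) if items else None

    return (enqueue, dequeue)

def drain(dequeue):
    """Dequeue until empty and collect the values in priority order."""
    out = []
    v = dequeue()
    while v is not None:
        out.append(v)
        v = dequeue()
    return out

# Plain integers, ordered by a simple less-than lambda.
enq, deq = mini_friq(lambda a, b: a < b)
for n in [5, 1, 4, 2, 3]:
    enq(n)
ints = drain(deq)

# (number, str) tuples: index 0 sorts on the number, index 1 on the string.
pairs = [(3, "cherry"), (1, "banana"), (2, "apple")]

enq, deq = mini_friq(lambda a, b: a[0] < b[0])
for p in pairs:
    enq(p)
by_number = drain(deq)

enq, deq = mini_friq(lambda a, b: a[1] < b[1])
for p in pairs:
    enq(p)
by_string = drain(deq)

print(ints)        # [1, 2, 3, 4, 5]
print(by_number)   # [(1, 'banana'), (2, 'apple'), (3, 'cherry')]
print(by_string)   # [(2, 'apple'), (1, 'banana'), (3, 'cherry')]
```

The only thing that changes between the three runs is the lt function, which is the point of passing the comparison in as a parameter.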

Here is the complete listing of friq.py


friq.py

#! /usr/local/bin/python

def friq(lt):
    """ 
    function friq that returns lexical closure function objects.    
    param lt - a 'less than' function or lambda
    """
    heef = [[None]]

    def enqueue(x):
        """
        enqueue - visible function 
        Add to the priority queue
        """
        heap = heef[0]
        heap.append(x)
        up_heap(heap)
        
    def dequeue():
        """ 
        dequeue - visible function 
        Get the highest priority element
        """
        value = None
        heap = heef[0]
        if lenf() > 0:
            # The dequeue value is taken from the 2nd element in the array.
            # The first element is not used to ease index arithmetic.
            value = heap[1]
            if lenf() > 1:
                # The last element value is copied to the first position.
                heap[1] = heap[-1]
            # The last value is deleted to reduce the size of the pq by one.
            del heap[-1]
            if lenf() > 1:
                # The val at heap[1] is moved down heap to maintain order.
                down_heap(heap)
        return value

    def heap():
        """ 
        heap - non-visible function 
        Returns the underlying array-based binary heap.
        Can be made visible for debug.    
        """
        return heef[0]

    def lenf():
        """ 
        lenf - non-visible function
        Returns the size of the underlying array-based binary heap.
        Can be made visible for debug.    
        """
        return len(heef[0]) - 1

    def down_heap(heap):
        """ 
        down-heap - non-visible function 
        The first element is moved down heap as required
        to satisfy heap order.
        """
        # px is an index to a bi heap parent
        # cx is an index to a bi heap child 
        px = 1
        v = heap[px]
        while px <= (len(heap)-1)//2:
            # calc the index of the child
            cx = px + px
            # find the index of the min of the two children 
            # (if there are two).
            if cx < len(heap) - 1 and \
                lt(heap[cx + 1], heap[cx]):
                cx = cx + 1
            # make the comparison - if v is higher pri - we are done.
            if lt(v, heap[cx]):
                break
            else:
                # move the childs value to the parent
                heap[px] = heap[cx]
                # make the parent index that of the child.
                px  = cx  
        # finally the v is set to the current parent
        heap[px] = v

    def up_heap(heap):
        """ 
        up-heap - non-visible function
        The last element is moved up heap to satisfy the heap order.
        """
        # cx//2 identifies cx's parent in the bi heap.
        cx = len(heap) - 1
        v = heap[cx]
        while cx//2 > 0 and lt(v, heap[cx//2]): 
            heap[cx] = heap[cx//2]
            cx = cx//2
        heap[cx] = v

    # the enclosing function returning functions that are to be visible
    return (enqueue, dequeue)

Hell is Other People's Code

2008-06-07T15:55:00.011-04:00

My current assignment has me analyzing code for an embedded system that has been under development for 15 years.

There is a lot of money invested in the development of this software.

After 15 years of blood, sweat, and tears, the embedded system now needs modern hardware. But what to do about the very expensive software? Start over or devise some way to migrate the code base so that it may execute on the new hardware?

The idea of throwing away 15 years of software development is hard to swallow - so my task is to come up with a reuse/migration strategy.

But wait, there is more.
1. The software is very brittle and cannot accommodate modification.
2. Even basic maintenance activities are becoming a risky proposition.
3. The software is monolithic with no obvious means of splitting functionality into subsystems.

A secondary goal of the system modernization is to partition the software to increase robustness, maintainability and modifiability.

So I begin to look at the code. And to be honest, I was humbled by its cleanliness. The code was well commented, and consistent naming conventions and styles were applied throughout. It was the code of experienced software developers.

So let us take a step back. The tool I am employing in my analysis is to judge the competence of the engineers that created this troubled software. I think it is fair to carefully judge software developer competence as an analysis tool - but often there is more to the story than what meets the analyst's eye. For example most software systems are developed under the duress of schedule constraints - "trying to pack ten pounds of crap into a five pound bag". But I digress - back to my story.

The "coding in the small" looks good but I had already suspected that the problem existed in the design, or more specifically, how the software modules interact with one another. Let's put the software designer under the microscope.

Now the problem became very evident: many software modules were highly dependent on many other software modules. To be specific, it was easy to find software modules that were dependent on 50 other modules. It was impossible to create an informative diagram showing the dependencies in any type of meaningful way. I also searched for a more specific way to classify or present this problem to management but this also escaped me. This software simply was a rat's nest of overly coupled modules that would continue to challenge our efforts to meet our goals.

So now I have indicted the software designer; hopefully he has made a clean getaway after 15 years. But now I need a motive - how could a software engineer let this happen? Let's travel back in time to 15 years ago and see what was going on. I remember those days and I remember that object-oriented techniques were still a little esoteric to the average developer. Many developers failed to make the leap to OO and many projects suffered through initial forays into OO. The main problem with OO in those days was in the development of frameworks and architectures. OO gave us the ability to define code modules that mapped to objects in our domain, but how to create the executive sections of our software was often not understood, and those sections were under-designed.

Nonetheless, our indicted software designer still should have used the most basic and powerful software design technique, the idea of layering. But no, under analysis, the software has no structure.

The trial of the software designer continues. In defense of the designer, the software has been segmented into a multitude of modules that model the domain - but somehow this act of decomposition has created a mess. Let me introduce exhibit A, the programming language.

The system was programmed in the original version of Ada, the programming language developed for the U.S. Department of Defense in the late 1970s. Ada has been upgraded several times, with the most notable upgrade coming in 1995 and termed Ada95; the original Ada is usually called Ada83.

Ada was specifically created to solve the so-called software crisis that plagued the development of large software systems. Yet here was a large software system in crisis, developed using Ada.

So now I will turn the glare of the spotlight away from the software designer and place it on this unpopular programming language. Now I must confess that I am not an "Ada man". I have worked on Ada projects for short terms but I have never become enamored with the language, although I know a few who have.

I looked up a few of my (with all due respect) "grey haired" colleagues to borrow some manuals on Ada83, not Ada95, because I wanted to know what Ada developers were thinking back during the early days of the system's development. My Ada-savvy co-workers were eager to supply me with the manuals and to share a few thoughts with me.

Now Ada83 is not object-oriented but instead is object-based. Actually Ada83 is a procedural programming language similar to C. One main difference is that Ada includes the concept of a "package". The Ada package abstracts the "file module" and has some very powerful and useful capabilities. It is important to note that the Ada package is not a type like a Java or C++ class is a type. A package is used to define types, functions and variables - much like a class, but with the distinction that only one instance of a package exists in the program. A variable declared within a package is equivalent to a class-wide variable in Java or C++ - i.e. a static variable.

My object-oriented mind now saw the embedded software system as ~500 singleton software classes - with public variables - mostly interconnected to one another. Could it be that the system's engineers did not really understand the object-based, procedural nature of the Ada language and used the Ada package as some type of brain-dead object-oriented class instead of a very powerful file module abstraction?

So let's be clear, in the object-based Ada language, the object is data defined, instantiated and processed by functions in the package, and not the package itself.

Interestingly, it all goes back to some of my previous musings on how State Stinks! Certainly the stinkiest form of state would be static, globally accessible state.

I could go on but I have said all that I need to say about this matter. And I release the incarcerated designer from his cell. We all have skeletons in our closets, and besides, my job is not to find out what and why but how: how to fix this software and move it into the future. So far I am the one who is falling short of his goals.

But such is life in the hell of other people's code.

Bottom-Up Software Design is for Sheep

2008-02-22T22:52:00.056-05:00

Considering the divisive issues of our day: Vim vs Emacs, Ruby vs Python, .Net vs Java, Mac vs PC, and REST vs SOAP, there are plenty of opportunities to spark a holy war. Let's consider another polarizing issue:

Is it better to design and develop software from the top-down or from the bottom-up?

I am fortunate to have the opportunity to work across a wide variety of projects and I am always curious to discover how my newly met colleagues are inclined regarding design. This evening I came across a compelling blog, by Gustavo Duarte, that took a stance opposed to my own. Here is my take on the issue.

Design by decomposition is the quintessential technique of the engineer

One of Gustavo's main points is that Feynman believed that software engineering has much in common with the other engineering disciplines. Considering this point, the most important concept in engineering is modularization. Most products and devices are made up of replaceable parts. Modularization is achieved by decomposing a system into sub-systems. System decomposition is top-down design.

Certainly we can agree that some of the first decisions we might make include the selection of a platform (Linux, Windows), technology stack (LAMP, Java, .NET), or language/framework (Ruby on Rails, Django).

Our choices will determine the composition of our system, how functionality is divided among system components, and the usage of key patterns (e.g., Model-View-Controller).

Do we even realize that we are doing top-down design?

Perhaps the availability of powerful platforms and frameworks is removing the importance of the top-down design perspective from our collective consciousness.

Nonetheless, software system design by decomposition into sub-systems implies a top-down approach to design.

The Space Shuttle Challenger was not a victim of top-down design.

Gustavo's blog walks us through some evidence that suggests that Dr. Feynman believed that top-down design doomed Challenger. The Challenger Disaster illustrates that a likely point of system failure is at sub-system interfaces. I don't see this as an indictment against top-down design.

For an analysis of the multitude of events that led to the Challenger disaster, I recommend the book NO DOWNLINK. Essentially, the Challenger and her crew were the victims of pork-barrel politics, post-space race budget cuts, and the Challenger Syndrome.

Poor design choices such as the use of Solid Rocket Boosters (SRBs) and then the segmenting of the SRBs to aid in transportability were driven more by budgetary and political concerns than by faulty engineering practices.

Defending up-front software design

Another point made by Gustavo is that "Big up-front design is foolish". The word "Big" implies a non-incremental approach. Up-front design is not a foolish idea. However developing software in a non-incremental fashion (i.e. the waterfall method) is a bad idea.

Top-down software design leads to effectively modularized code that fosters many good things such as reuse, testability, and maintainability.

Gustavo drags UML into this discussion and I agree with some of his points. The idea of designers churning out UML blue-prints to throw "over the wall" to "implementors" has no appeal to me and, in fact, simply does not work.

UML is not implicitly evil and the use of UML does not have to imply some type of bureaucratic software factory where coder slaves toil away, mindlessly "implementing" designs.

UML is just a tool that lets you work on your system at a high level. UML is just a white board.

Using UML to create diagrams (which are essentially pictures) to foster communication is a beautiful thing.

To summarize, UML should be used to open a new communication channel, never to replace face-to-face discussion.

Risk

The main problem with up-front design from a developer's perspective is that it doesn't address many of the underlying risks to the success of the project. The term "risk" is a nice way to articulate that feeling you get in your stomach when you know you are doomed.

The best way to handle "risk" is to start working on the risky parts of the system as soon as possible, so that you have plenty of time to handle all the foo bar.

Risk is what leads developers to favor the bottom-up approach. This is because the risky sections of a system exist deep within the system and at sub-system interfaces such as APIs.

Up-front design may add risk because some design decisions are based on assumptions that can only be validated with working code.

Managing Risk

All forms of engineering require lab testing, research and development and prototyping to determine how elements of a system will actually perform and software is no different.

Risk-reduction prototyping is a technique used to mitigate risks, validate design assumptions, and to develop techniques that will be used during the production of a software system.

Conclusion

As software developers, we are also software users. We use frameworks and tool-sets that allow us to focus on the specific application that we are creating. If we don't maintain the ability to design software from the top down, we function more as users and less as developers. We become, in effect, sheep walking the well-worn path instead of being the pioneering trail-blazers we had hoped to be.

I believe that a pragmatic combination of both top-down (up-front design) and bottom-up (risk-reduction prototyping) techniques will greatly increase a software project's chances for success.

Dr. Feynman believed that software engineering has much in common with the other engineering disciplines. This is certainly true, but 'how much' is less certain. Thinking of software engineering as just another engineering discipline comes with its own pitfalls.

What is your take on this topic?

Erlang and Tail Recursion Part II - Still Living, Still Dreaming

2008-01-10T11:06:00.000-05:00

In a previous post, I described how I experienced tail recursion and tail call optimization first hand. I have learned to accept that the Erlang compiler can detect when recursion can be replaced with iteration, preventing stack growth and eventual overflow (which means "hard crash"). The idea is that developers can stay within the language's functional "state-less" paradigm and let the compiler do the messy work.

I spent some time dissecting a very interesting Erlang example that I found on the web at builder.au. There are a lot of interesting nuggets in this example, requiring a little time with the Erlang reference manuals.

However there was one, seemingly innocuous passage of code, that made me realize that I had a few more lessons to learn regarding Tail (or more commonly, Last) Call Optimization (LCO).

Note that while the code is from builder.au, the comments are mine.


Function searchNode

searchNode(Name, LinkDict) ->
    receive
        {link, ToName, ToPid} ->
            % Tail recursion that will be Last Call Optimized
            searchNode(Name, dict:store(ToName, ToPid, LinkDict));
        {search, Path} -> 
            Free = notIn(Name, Path),
            if
                Free ->
                    lists:foreach(
                        fun(X)-> 
                            {_, Pid} = X,
                            Pid ! {search, Path++[Name]}
                        end, 
                    dict:to_list(LinkDict));
                true -> true
            end,
            % Tail recursion that will be Last Call Optimized
            searchNode(Name, LinkDict);
        {endPoint} ->
            % The {endPoint} message identifies that the node
            % 'embodied' by this function is the 'target' of 
            % the search, requiring different message handling, 
            % and therefore a separate function 'last'.
            % But this is not tail recursion?
            last(Name, LinkDict)
    end.

Function searchNode is the function that both represents a node in a graph and an Erlang process. Once this function is initially invoked, it awaits a message from another process. Upon receiving and handling a message, it invokes itself 'tail recursively' to continue as part of the program and graph.

Except when it receives the endPoint message. In this case another function is invoked, the last function. Message endPoint is sent during the initial phase of a search, to the node that is the target of the search. In this case the node does not propagate the search but identifies when the search has completed. Essentially the "endPoint" node/process requires different behavior, therefore control is transferred to a different function.

So what happens when the search is complete? Does last return and if so, wouldn't searchNode return - ending the process/node it represents?

Note that while the code is from builder.au, the comments are mine.


Function last

last(Name, LinkDict) ->
        receive
            {link, ToName, ToPid} -> 
                % Tail recursion that will be LCO
                last(Name, dict:store(ToName, ToPid, LinkDict));
            {search, Path} ->
                % This is the message of interest for this 
                % function.  This indicates that this node was 
                % found by the search, displays the results and -
                % returns control back to searchNode!
                io:format("Found path: ~w~n", [Path++[Name]]),
                searchNode(Name, LinkDict);
            {endPoint} ->
                % Probably just handles the off-chance corner-case 
                % in which a node is identified as the target for a 
                % search while in the process of being currently 
                % searched.  This may not work correctly.
                searchNode(Name, LinkDict)
        end.

Interestingly, when last has achieved its purpose it invokes function searchNode, returning control back to the process/node's primary function.

To my imperative mind this is just wrong

Think about the crazy call stack we are creating that will never unwind, eventually crashing the program. Perhaps this is handled by Last Call Optimization?

Actually the answer was right at my finger-tips, in Concurrent Programming in Erlang Part I (search on LCO).

Essentially LCO handles functions that are mutually recursive as well as the classic 'tail recursion' case.
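Python, unlike Erlang, performs no Last Call Optimization, so it makes a handy place to see what the optimization buys. The sketch below is only an illustration: the names trampoline, search_node and last are hypothetical, and the message handling is reduced to a countdown. Each function returns a small "thunk" describing the next call instead of making the call itself, and a driver loop runs the thunks one at a time, so the stack never deepens.

```python
def trampoline(step):
    # Keep calling returned thunks until a non-callable result appears.
    # Only one hand-off is ever on the Python stack at a time.
    while callable(step):
        step = step()
    return step


def search_node(remaining):
    if remaining == 0:
        return "done"
    # Hand control to the other function, as searchNode hands off to last.
    return lambda: last(remaining - 1)


def last(remaining):
    if remaining == 0:
        return "done"
    # ...and last hands control back again.
    return lambda: search_node(remaining - 1)


# 100,000 hand-offs would exceed Python's default recursion limit if the
# two functions called each other directly; bounced this way they do not.
print(trampoline(lambda: search_node(100_000)))
```

Erlang needs none of this scaffolding, because a call in last position simply reuses the current stack frame, whichever function it targets.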

Some final thoughts.

This example is interesting to me because it shows a situation that requires a node (or some type of entity) to be assigned a special status. In imperative languages, for example, we may choose to set a flag. In Erlang we wish to remain state-less; setting a flag would be improper and probably very difficult. So instead, we request a different behaviour, which is exactly what we wanted in the first place.
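The same idea can be sketched outside Erlang. In the Python below (a simplification with made-up message strings and a hypothetical run helper, not a translation of the builder.au code), no flag is ever set. Each behaviour function handles one message and returns the function that should handle the next one.

```python
def search_node(name, message, log):
    # Ordinary behaviour: an "endPoint" message switches the node
    # over to the `last` behaviour instead of setting a flag.
    if message == "endPoint":
        return last
    if message == "search":
        log.append(f"{name}: forwarding search")
    return search_node


def last(name, message, log):
    # Target behaviour: report the hit when a search arrives.
    if message == "search":
        log.append(f"{name}: found")
    # Either way control goes back, mirroring the Erlang original.
    return search_node


def run(name, messages):
    behaviour, log = search_node, []
    for message in messages:
        # The current behaviour picks its own successor.
        behaviour = behaviour(name, message, log)
    return log


print(run("d", ["search", "endPoint", "search", "search"]))
```

The "special status" lives entirely in which function is currently in charge, so there is no variable to forget to reset.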

Thinking in Erlang - Coding in Everything Else

2008-01-08T17:52:00.000-05:00

With apologies to Bruce Eckel.

Concurrent Programming in Erlang Part I uses a Binary Tree implementation to demonstrate Erlang's tuple data type.


Erlang

insert(Key, Value, nil) ->
    {Key, Value, nil, nil};
insert(Key, Value, {Key,_,Smaller, Bigger}) ->
    {Key, Value, Smaller, Bigger}; 
insert(Key, Value, {Key1,V, Smaller, Bigger}) when Key < Key1 ->
    {Key1, V, insert(Key, Value, Smaller), Bigger};
insert(Key, Value, {Key1, V, Smaller, Bigger}) when Key > Key1 ->
    {Key1, V, Smaller, insert(Key, Value, Bigger)}.

I was intrigued by this elegant approach to the Binary Tree.

Erlang's tuple data type collects the elements that make up a node including the node's key, value, and the references to its children nodes.

The first clause of the insert function is the "base case" that actually creates a new node. The second clause handles the case when the key is found and the value is simply replaced. Clauses three and four handle the cases that require traversal down through the tree.

Notice how the recursion is so wonderfully encoded. This is much cleaner and more intuitive than the imperative solutions that I have worked with.


Python

def insert(key, value, tree):
    if tree is None:
        return (key, value, None, None)
    (key1, v1, smaller, bigger) = tree
    if key == key1:
        return (key1, value, smaller, bigger)
    if key1 > key:
        return (key1, v1, insert(key, value, smaller), bigger)
    return (key1, v1, smaller, insert(key, value, bigger))

I would guess that this elegant Binary Tree implementation is common to functional programming languages in general. Python has some functional language capability and has a tuple data type. I was able to implement the "elegant" Binary Tree using Python.
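A short usage sketch shows the tuple tree in action. The insert function is repeated so that the snippet runs on its own; the in_order helper is an addition of mine and not part of the Erlang original.

```python
def insert(key, value, tree):
    # Tuple-based insert; returns a new tree rather than mutating the old one.
    if tree is None:
        return (key, value, None, None)
    key1, v1, smaller, bigger = tree
    if key == key1:
        return (key1, value, smaller, bigger)
    if key < key1:
        return (key1, v1, insert(key, value, smaller), bigger)
    return (key1, v1, smaller, insert(key, value, bigger))


def in_order(tree):
    # Yield (key, value) pairs in ascending key order.
    if tree is None:
        return
    key, value, smaller, bigger = tree
    yield from in_order(smaller)
    yield (key, value)
    yield from in_order(bigger)


tree = None
for key, value in [(5, "five"), (2, "two"), (8, "eight"), (2, "TWO")]:
    tree = insert(key, value, tree)

print(list(in_order(tree)))
```

Because every insert builds new tuples along the path to the changed node, any earlier version of the tree remains valid and unchanged.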


JavaScript

function insert(key, value, tree) { 
    if(tree == null)
        return {key:key, value:value, smaller:null, bigger:null};
    if(tree.key == key){
        tree.value = value;
        return tree;
    }
    else if(key < tree.key)
        return {key:tree.key, value:tree.value, 
            smaller:insert(key, value, tree.smaller),
            bigger:tree.bigger};
    else
        return {key:tree.key, value:tree.value, 
            smaller:tree.smaller, 
            bigger:insert(key, value, tree.bigger)};
}

I have read that JavaScript is classified as a functional programming language and its object data type looks syntactically similar to Erlang's tuple. I was able to implement the "elegant" Binary Tree using JavaScript. For more fun, I created an interactive Binary Tree web page.

Erlang and the Towers of Hanoi

2007-12-26T18:12:00.000-05:00

My brilliant, 15-year-old son is teaching himself how to program Ruby using Chris Pine's wonderful book, Learn to Program. The book devotes an entire chapter to recursion and I spent an enjoyable evening with my son as we dissected an example program. Pine's example is reminiscent of the Towers of Hanoi, a puzzle that my son has been able to solve since he was 7. So I decided to code up this classic, first in Python and then in Erlang.

Towers of Hanoi - Erlang Style

My son could not believe that this function would solve the puzzle! I used a pencil and graph paper to illustrate an example with 4 plates and was once again amazed by the sheer beauty of this algorithm. One can only marvel as to how someone was able to discover this elegant solution.
It was my son who pointed out the "depth first" nature of the algorithm. The program recurses to a depth determined by the number of plates and then retreats back to the top of the stack of function calls, moving plates as it goes. Each retreat up the stack results in another incursion down to the depth determined by the number of plates. This continues until the original call returns and the program completes.

Essentially the process of recursion lays out a data structure of sorts that establishes a sequence of plate moves.
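Stripped of the Erlang message passing, the recursion can be sketched in a few lines of Python. This is a minimal version written for illustration, not the Erlang listing discussed in this post; the function name hanoi, the peg labels and the moves list are all assumptions.

```python
def hanoi(plates, source, spare, target, moves):
    # Base case: nothing to move.
    if plates == 0:
        return
    # Move the top plates-1 plates out of the way, onto the spare peg.
    hanoi(plates - 1, source, target, spare, moves)
    # Move the largest remaining plate to its destination.
    moves.append((source, target))
    # Bring the plates-1 stack from the spare peg onto the target.
    hanoi(plates - 1, spare, source, target, moves)


moves = []
hanoi(3, "A", "B", "C", moves)
print(moves)
```

The moves list is the "data structure of sorts": it is filled in strictly depth-first order, one entry for each time the recursion pauses between its two descents.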

Notice the line in the code above: managerPid ! {self(), A,C}. This line of code is used to move a plate. The movement of the plate is handled by a separate process that represents a device or maybe even a person responsible for moving the plates.

Function manage_towers, in the code illustration above, executes within the second process and writes the tower movements to the console.

I had some philosophical concerns with the multi-process approach to the Towers of Hanoi. Recursion and distributed computing are two distinct and orthogonal solutions that are based on the same idea, the divide and conquer algorithm. However the solution does have some pragmatic appeal to me and it all begins with the immutability of Erlang's variables.
The thought is that the Towers of Hanoi program cannot actually move the plates, the plates would have to be moved by some entity apart from the program. This allows philosophical head room for both immutable variables and asynchronous calls to a process that records the plate movements.

As I watched the console display the plate movements, my mind reeled as I thought of the program descending and ascending through the call stack, briefly pausing to issue asynchronous messages of instruction to its sister process.

The Towers of Hanoi program is initiated by the code illustration shown above.

Notice that when the call to process_towers returns, the tower manager process (managerPid) is sent a message that indicates that the program has finished.

managerPid ! {self(), finished}

The main process (thread) of the program then blocks at the receive statement, awaiting acknowledgment from the tower manager process. Essentially, the main process is deferring termination until the tower manager process completes.

As you can imagine, the process_towers function computes the solution far quicker than the console can update its display. At some point, the main process waits until the tower manager handles its backlog of messages, in the order in which they arrived.

This is accomplished using Erlang's Mail Box, which probably is implemented as a thread-safe queue.
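For comparison, here is roughly how the same hand-off might look with Python's standard thread-safe queue. It is a sketch under my own assumptions: the function manage_towers mirrors the name used in this post, but the "finished" string, the Event used as the acknowledgment and the shown list (standing in for the console) are invented here.

```python
import queue
import threading


def manage_towers(mailbox, done, shown):
    # Drain the mailbox in arrival order until told we are finished.
    while True:
        message = mailbox.get()
        if message == "finished":
            done.set()  # acknowledge so the sender may terminate
            return
        shown.append(message)  # stand-in for writing to the console


mailbox = queue.Queue()  # thread-safe FIFO, standing in for the Mail Box
done = threading.Event()
shown = []
manager = threading.Thread(target=manage_towers, args=(mailbox, done, shown))
manager.start()

# The sender never waits on the receiver; messages simply queue up.
for move in [("A", "C"), ("A", "B"), ("C", "B")]:
    mailbox.put(move)
mailbox.put("finished")

done.wait()  # block until the manager has worked through its backlog
manager.join()
print(shown)
```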

Many of the popular languages provide thread-safe queue classes but Erlang's Mail Box is built-in, readily available and seamless.

But does it work robustly? I can say yes because of my carelessness.

I tested the program with 10 plates and then decided to really test it with 20 plates. I sat there for a second before I realized what I had done. Towers has an exponential growth rate: n plates require 2^n - 1 moves, so 10 plates required about a thousand plate movements (1,023) and 20 plates require about a million (1,048,575).

I let the program run, all day and into the evening. My son saw the program running and asked if I had mistakenly created an infinite loop.

I ran the concept by him, "If we have 4 plates, it takes 16 moves, 5 plates takes 32 moves and 20 plates takes ...". "One million moves", he immediately replied. "But a million is not a big number for a computer", he said.

We could roughly count the number of plate moves being written to the console and determined that approximately 7 moves were displayed per second. Therefore it would take about two days for the program to run to completion.
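The arithmetic is easy to check. The minimum number of moves for n plates is 2^n - 1, so the round numbers we tossed around in conversation are each one more than the exact count. The snippet below is just a back-of-the-envelope calculation; the rate of 7 moves per second is our rough observation, not a measurement.

```python
def moves_needed(plates):
    # Minimum number of moves for the classic three-peg puzzle.
    return 2 ** plates - 1


MOVES_PER_SECOND = 7  # the rough console rate we counted

for plates in (4, 5, 10, 20):
    moves = moves_needed(plates)
    hours = moves / MOVES_PER_SECOND / 3600
    print(f"{plates:2d} plates: {moves:>9,d} moves, about {hours:.1f} hours")
```

For 20 plates this works out to a little under 42 hours, which is where "about two days" comes from.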

I went to sleep that night, thinking about the many thousands of messages queuing-up in the Tower Manager's Mail Box.

I awoke the next morning to find the program dutifully displaying plate movements. I shut the program down - satisfied that Erlang's messaging system met its challenge.

Erlang, Scalability - The Next Wave

2007-12-15T00:13:00.000-05:00

I had a great time watching the TV series Surface on the SciFi Channel, with my son who is 10. In the final episode, Marine Biologist Dr. Laura Daugherty warns us that tsunami waves come in sets.

The first wave of the Erlang Tsunami was realizing Erlang's ability to harness the potential of multi-core CPUs to perform computations, by using concurrently executing threads and processes.

I had not recovered from the first wave when I was hit by the next wave in the set. The second wave is Erlang's ability to scale. Usually, scalability is used as a characteristic of an architecture, system or technology. I don't recall if I have seen scalability used to characterize a programming language. Certainly, most programming languages provide features that can be used to develop scalable systems - but what would it take to make the language inherently scalable?


loop(State) ->
    receive
        {call, From, Request} ->
            {Result, State2} = handle_call(Request, State),
            From ! Result,
            loop(State2)
    end.

Function loop(State) waits at the receive statement for a message that matches tuple {call, From, Request}. When a matching message is received, it is handled by the statement that calls the aptly named function, handle_call. From is messaged with the Result returned by function handle_call.

So, the received message includes information that identifies the operation to perform, the return address, and perhaps some parameters. This operation is "stateless" because the state required to process the request is provided with the request and is not maintained by loop. This is how REST works.

A key-word used in the last couple of paragraphs is "message". Note that the only coupling between sender and receiver in this message passing approach is the "shape" of the data passed (e.g. the tuple {call, From, Request}). This results in a very dynamic, and decoupled approach.
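To illustrate the point about "shape", here is a single-threaded Python sketch. Everything in it is hypothetical: plain deques stand in for mailboxes, handle_call is a toy that adds or multiplies, and the state is merely a count of requests served. Note one simplification: messages of the wrong shape are skipped here, whereas Erlang would leave them waiting in the mailbox.

```python
from collections import deque


def handle_call(request, state):
    # Toy handler: the request carries everything needed to answer it.
    operation, a, b = request
    result = a + b if operation == "add" else a * b
    return result, state + 1  # new state: number of requests served


def loop(mailbox, state):
    # Work through the mailbox in arrival order.
    while mailbox:
        message = mailbox.popleft()
        # Sender and receiver agree only on this shape:
        # ("call", reply_to, request).
        is_call = (
            isinstance(message, tuple)
            and len(message) == 3
            and message[0] == "call"
        )
        if not is_call:
            continue  # simplification: Erlang would leave it queued
        _, reply_to, request = message
        result, state = handle_call(request, state)
        reply_to.append(result)  # stand-in for `From ! Result`
    return state


replies = deque()
mailbox = deque([
    ("call", replies, ("add", 2, 3)),
    ("ignored", "wrong shape"),
    ("call", replies, ("mul", 4, 5)),
])
served = loop(mailbox, 0)
print(list(replies), served)
```

The sender needs no reference to loop or handle_call at all; it only has to put a correctly shaped tuple where the receiver will find it.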

Erlang is highly scalable because its stateless, message-passing approach is suitable for all levels of distributed computing. It matters not if the distribution is among threads in one process or among hosts arrayed around the world. Erlang programmers use the same syntax, patterns and idioms, regardless of the level of distribution.

This is true scalability.

The Tsunami metaphor might have seemed like an overstatement; hopefully I have proved otherwise. But let us consider one final point: the Erlang programmer's skill scales. If a programmer can develop programs using concurrency and Erlang, it matters little if the program runs stand-alone on one box, or is widely distributed across the internet.
Do you feel the wave?

Erlang, Tail Recursion, and Living the Dream

2007-12-12T22:23:00.000-05:00

Erlang Eureka! describes a moment of discovery as the narrator walks through an Erlang code example. Function loop is an example of a function that runs within its own thread, awaits messages from other threads, and then acts on received messages. Note that function loop is a loop only in the sense that loop repeats by calling itself.

loop(Module, State) ->
    receive
        {call, From, Request} ->
            {Result, State2} = Module:handle_call(Request, State),
            From ! {Module, Result},
            loop(Module, State2);
        {cast, Request} ->
            State2 = Module:handle_cast(Request, State),
            loop(Module, State2)
    end.

Loops, such as for and while loops, are considered to be iterative. Function loop is a different beast in that it uses recursion to mimic the iterative loop mechanism. Function loop uses a type of recursion that is called "tail recursion". Essentially, if a function returns the return value of a recursive call, it is tail recursive.

In languages such as Java, C, C++, C# and Ada (imperative languages), tail recursion is frowned upon because it risks overflowing the stack. Tail recursion can be easily replaced with an iterative approach, a process called tail call optimization.

On the other hand, functional languages such as Erlang, Lisp, Scheme, and F# make use of tail recursion as we can see from the code example.

Why?

1. Iteration by recursion is required because variables may be assigned only once in most functional languages, including Erlang. This "assign once" feature is one of the key reasons that Erlang can manage state well in the face of concurrency. Notice how State is updated within loop and stored in State2 which is passed by parameter to the tail recursive call to loop.

2. Functional programming language compilers, such as Erlang's, perform tail call optimization. Functional language programmers will actually structure code to achieve tail recursion (pretty different). See A Deeper Look at Tail Recursion.

I also would like to credit Jomo Fisher's Adventures in F# -- Tail Recursion in Three Languages blog.

It's funny how I set off to research this topic. After experimenting with Erlang over a weekend and blogging about it, I went to work on Monday and actually thought about implementing a loop using recursion, for about a microsecond. My fingers stopped so fast that they left skid marks on the keyboard. Could you imagine implementing an application's main loop using recursion and several seconds into operation the stack overflows, the application crashes, and there you sit feeling very dumb?
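Python happens to be a language that does not perform tail call optimization, so the embarrassing scenario is easy to reproduce. The two functions below are hypothetical stand-ins for a main loop. Both do the same trivial countdown; only the iterative one survives a long run.

```python
def count_down_recursive(n):
    # Tail call: the recursive call's result is returned unchanged.
    if n == 0:
        return "done"
    return count_down_recursive(n - 1)


def count_down_iterative(n):
    # The same logic after replacing the tail call with a loop.
    while n != 0:
        n = n - 1
    return "done"


print(count_down_iterative(100_000))

try:
    count_down_recursive(100_000)
except RecursionError:
    # CPython keeps every frame, so deep tail recursion still fails.
    print("recursive version overflowed the stack")
```

An Erlang compiler effectively turns the first form into the second for you, which is why the recursive style is safe there and not here.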

With a bolt of lightning, a clear perspective of the differences between imperative and functional programming languages lay before me.

It is great when your work is interesting and your work-place is your laboratory - that is why I am Living the Dream.

Metrics-Based Software Management - A Hands On Approach

2007-12-11T20:12:00.000-05:00

Erlang Eureka!

2007-12-06T18:42:00.000-05:00

State Stinks! Part II discusses that managing the operational control of an application is problematic when the application is multi-threaded. Multi-threaded applications consist of concurrently operating execution paths that share common state and data.

Jeff Moser suggested that I look into the Erlang programming language. One of his blog postings led me to an excellent interview with Joe Armstrong, the principal inventor of Erlang. Joe is an engaging individual and has many interesting things to say. Joe prefers "concurrent orientation" over "object orientation" and he makes a compelling point. Some of my past posts reflect my opinion that strongly favors object orientation. I also ruminated on threads and the odd pairing of concurrency and object orientation. Basically this is the kind of stuff that piques my interest so I proceeded to download Erlang.

First of all, everything went well, the Erlang web site is well organized and full of resources. Erlang has been around since 1991 and is a very well documented, mature programming language.

I started working my way through the tutorials, looking forward to that "eureka" moment when I would see Erlang's secret sauce. To begin with, Erlang's functional approach with (immutable) variables that can only be bound once is well chronicled. Other features of the language, such as dynamic typing combined with assignment by pattern-matching, make for a feature-rich, expressive language that you just have to try for yourself.

My eureka moment?


init(Module) ->
    register(Module, self()),
    State = Module:init(),
    loop(Module, State).

loop(Module, State) ->
    receive
        {call, From, Request} ->
            {Result, State2} = Module:handle_call(Request, State),
            From ! {Module, Result},
            loop(Module, State2);
        {cast, Request} ->
            State2 = Module:handle_cast(Request, State),
            loop(Module, State2)
    end.

Note the function loop(Module, State). Erlang variables are capitalized. loop has two parameters Module and State. I won't discuss Module, think of it as a file containing a group of helper functions.

State is an immutable value containing our application's operational state.

First, the receive statement blocks the thread in which loop is executing and awaits a message (from another thread) whose data matches one of two tuples (variables enclosed in curly braces): either {call, From, Request} or {cast, Request}.

If the message matches {call, From, Request}, the next three lines in the code will execute. First, the Request is delegated to one of the handle_call helper functions in Module. Note that the immutable State is passed along. The updated state is returned to variable State2.

Next, the calling thread identified by From is messaged (!) with data {Module, Result}.

Finally, loop is called for its next iteration with the updated state present in the State2 variable.

It is important to note that the above code fragment is not pseudo-code. The code demonstrates sending and receiving messages between threads and how messages are used to share state.

At the top of this post I mentioned that an application's threads share common state. Erlang threads do not directly share common state but must share state as immutable variables passed using messages. Erlang calls threads "processes" because they do not share state.

Eureka! - Erlang's secret sauce isn't very secret, just saucy. State is maintained via very simple and explicit operations that are designed to scale up into large, robust systems. The code fragment did not show any detail about State/State2 and how State was updated (e.g. inside the Module:handle_call function).
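One way to picture what handlers such as Module:handle_call might do with State is the Python sketch below. It is purely illustrative: the State dataclass, its count field and the numeric requests are all invented, and a plain list of messages replaces the mailbox. The point is that each handler receives the old state and returns a new one, and nothing is ever modified in place.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class State:
    # Frozen: attributes cannot be reassigned after construction.
    count: int = 0


def handle_call(request, state):
    # Returns (result, new_state); the old state is left untouched.
    new_state = replace(state, count=state.count + request)
    return new_state.count, new_state


def handle_cast(request, state):
    # Fire-and-forget update: only a new state comes back.
    return replace(state, count=state.count + request)


def loop(messages, state):
    replies = []
    for message in messages:
        if message[0] == "call":
            _, request = message
            result, state = handle_call(request, state)
            replies.append(result)
        elif message[0] == "cast":
            _, request = message
            state = handle_cast(request, state)
    return replies, state


start = State()
replies, final = loop([("call", 2), ("cast", 5), ("call", 1)], start)
print(replies, final.count, start.count)
```

The starting value still reads zero at the end, just as State remains untouched in the Erlang version while State2 carries the update into the next call to loop.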

Note that Erlang doesn't remove the need to manage state; in fact, state within an Erlang application may be very complex if it uses multiple "processes" to handle computations.

Erlang allows us to use its full toolkit to simply, safely and explicitly work with the data used to maintain the operational state of a software application.

State Stinks! Part II

2007-12-04T10:27:00.001-05:00

State Stinks! Alright. My previous rant should have sufficed but I haven't succeeded in getting problems with state off of my chest! What could I have forgotten that would require its own posting? I had forgotten to mention the most worrisome of all forms of state - a form of state that I will refer to as uber-state.

To recap my previous post, I tried to illustrate how much of a software developer's life is spent managing and maintaining state. I also pointed out that many of our problems are self-inflicted: we trade a few gray hairs to optimize performance - sometimes just out of habit. All of these problems are intensified in multi-threaded applications. In short, state maintenance limits what we can accomplish with multi-threaded applications.

Uber-State - What we need is a supreme form of state that lives above normal application-maintained state. Some ancient guy said "give me a long enough pole and a fulcrum out in space and I can move the world" (or something to that effect). What we need is a fulcrum out in space. Easy enough, our application's supreme entity is the operating system. What we need is state that is managed by the OS that we can use as leverage points to assert control of our application. Right? Well, this is what we currently use to wrangle multi-threaded applications. Operating systems provide objects that we can use to synchronize thread operations and they work as designed.

But even Uber-State Stinks! - I am not trying to be difficult here! Let me explain. Operating System thread synchronization objects are very simple and are typically referred to as "primitives". To really harness the power of multi-threaded applications in a multi-core processor world, I need complete control of the operational state of my application. I don't just need to synchronize threads, I need to orchestrate my application. Uber-State you say (actually I said it) - horsefeathers!

I had mentioned in my previous post that naive OO designs typically ignore an application's operational state, treating state like a second-class citizen.

Using OS thread synchronization primitives has the same effect. Our applications rely on big-brother to adjudicate and control access to state - leaving our application's state machine in disarray.

State doesn't have to stink - State needs to be a first-class citizen in our applications and work the way we would expect!

So where am I going with this? To begin with, let it be known that I do not only mean to prescribe to others with my postings, but I also wish to elicit suggestions. A former colleague, Jeff Moser, suggested that I look into Erlang. I have listened to his suggestion and my findings are fuel for my next posting to be entitled "Erlang Eureka!".

State Stinks!

2007-12-02T12:23:00.000-05:00

Sorry, I am not referring to your favorite institution of higher learning (although they may indeed stink). I am referring to the data utilized by the software application that you are responsible for creating and/or maintaining.

Data is not necessarily state - Typically we think of application data as a raw material of sorts. This material must be processed, shifted, sorted, stored, transformed, accessed, and so on.

States of existence - Another form of data is the data used to manage the operation of the software application itself. This is typically the kind of data we associate with "state". In this case, we are talking about states of existence, the dynamic behavior of the application, the software application as a machine.

State as a second-class citizen - I am referring to Object-Oriented design. You see it all the time, the essence of OO design is the class hierarchy. A team models the application domain into classes and begins to code. The problem - the application's engine is not considered. The code that defines the behavior of an application ends up scattered across event-handling routines, forever a breeding ground for bugs and maintenance headaches.

State as a Sin - This is a dramatic way of saying that we use state as an optimization. Artificial state is a good term for caches and the like. This type of state is typically misused and comes with a price: we have to maintain this state to ensure that it is fresh. Once again we have stumbled across a breeding ground for problems.

Smart Pointers - Yeah right! I know what Scott Meyers says, "use smart pointers". I have used smart pointers for years. My question is why do I need to use smart pointers in a properly designed C++ application? Why not ensure that a C++ class wraps all dynamic memory concerns to begin with and let the copy constructors, assignment operators, and destructors do their jobs? The only reason I can think of is because of performance concerns. Due to a need to optimize, we now are responsible for managing a computer's memory resources, and we do so in a way that subverts the intent of C++ auto-scoping classes and destructor semantics. And we call it "smart".

State Maintenance is Like Pounding Sand - Did I mention concurrency?

Multi-Core Processors and Concurrency - The carrot and the stick. We humans can be stubborn. We may not always change for a carrot or a stick, but the carrot and the stick combination can be hard to refuse.
Carrot - Multi-Core processors. Think super computer.
Stick - Each core in the multi-core processor must be driven by its own thread or process.
Carrot + Stick = We must learn how to program computers for multiple concurrent processes/threads.

Stop Pounding Sand - So we need to rethink application development. We will write applications that are not optimized from the perspective of one thread of execution, but will blaze across multi-core processors. We will employ programming technologies that make state a first-class citizen and sinful state a bad memory.

This is all Good - Why? Because State Stinks! State maintenance is boring, tedious, wasteful and problematic.

Grr++ ..

2007-11-21T15:00:00.000-05:00

It's been about two years since I've done any serious C++; however, my current interests (networking) have led me back to my software development roots.

I have been working through W. Richard Stevens's UNIX Network Programming book, coding up the examples on my laptop running CentOS Linux. Not only am I becoming more knowledgeable in network programming, I am also stretching my skills into the Unix world - getting some work with gcc, g++, gdb, vim, make, as well as some shell scripting.

Usually when learning from a computer programming book, you have the option to download the example source code, but I think it is beneficial to type in the code. This said, I only want to type in the code once! Essentially I want to create libraries that I can reuse as I work through the book and perhaps I may wind up with stuff that I can reuse in the future.

I enjoy W. Richard Stevens's classic UNIX C code, but I am compelled to wrap the C functions in C++ classes, and here I am, back in my old stomping grounds.

Now instead of diligently working through Stevens's book, I am writing C++ wrapper classes.

As I remember, this was a common theme during my past self studies. I always wanted to devise some system but I usually spent my time creating libraries and tools. The work on libraries probably bore fruit in my professional life but it seemed like I never wrote that next killer app in my spare time.

Of course I discovered Java, C#, Perl, and Python, all of which featured powerful libraries. C++ began to drift off of the radar.

But now I am back. So now what?
To begin with, I like C++. Experience has shown me that GC and reference-counted memory management schemes are not a free lunch. I have seen horrendous memory leaks in GC systems due to things like circular references.
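
The circular-reference problem can be sketched in Python, the language used elsewhere on this blog. This is only an illustration under stated assumptions: it relies on CPython's behavior (reference counting plus a separate cycle collector), and the `Node` class and `cycle_survives_refcounting` function are hypothetical names made up for the example.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can hold a reference to a partner."""

    def __init__(self, name):
        self.name = name
        self.partner = None


def cycle_survives_refcounting():
    """Return (alive_before_gc, alive_after_gc) for a two-object cycle."""
    gc.disable()  # keep the cycle collector from running on its own
    try:
        a, b = Node("a"), Node("b")
        a.partner, b.partner = b, a      # circular reference
        probe = weakref.ref(a)           # observes a without keeping it alive
        del a, b                         # drop the last outside references
        alive_before = probe() is not None
        gc.collect()                     # explicit cycle collection
        alive_after = probe() is not None
        return alive_before, alive_after
    finally:
        gc.enable()
```

On CPython the two objects keep each other alive after the last outside reference is dropped, and only the explicit collection pass reclaims them. A scheme based on reference counts alone would never release them.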

The ability to create full objects on the stack and then let nature's memory manager do its thing actually seems powerful.

So where does C++ stink?
Smell 1 - Libraries - I want XPath and XML DOM, Regular Expressions, Data Access and Concurrency libraries at my fingertips because I have been spoiled.

Smell 2 - Tight coupling with C. This is both a blessing and a curse. The blessing is that (dare I say) all system APIs are written for the C programming language (my examples being Win32 and POSIX). Therefore C++ libraries are easily created to wrap these low-level APIs.

I think much of the bad rap C++ got was when it was used as an abstraction layer above C. Take MFC (please take it!) for example. I wonder how many developers swore off C++ after a few years with that crappy framework. I also wonder how many developers learned how to write crappy C++ code after a few years with that framework. Java seemed like a plush, polished gem after MFC.

Also as an aside, the scripting languages did a much (I mean MUCH) better job of abstracting away the C system-level APIs.

So here I am a few years down the road and supposedly wiser, stupidly writing crappy C++ libraries around C code and feeling like I am wasting my time (which I surely am).

So C++ is not necessarily a bad language, but I need library and framework support.

Probably the only downside is that I need to assemble my own toolkit. First of all there is Boost which appears to take STL to the next level. There is ACE which provides an abstraction layer around the system APIs, particularly in the realm of networking and concurrency. And there are other sources of libraries, notably Apache which has C++ versions of its Xerces XML library.

No worries and maybe I can find some type of library to write along the way (old habits die hard)!

I am just a metaphor (so give me a break)

2006-06-06T14:36:00.000-04:00

Sometimes people would ask me what I do for a living. My typical response would be "Software Engineer" or "Software Developer". Invariably the questioning party would give me a blank look and then say "Ooh, you are a computer programmer".
Yes I am a computer programmer. But lately I have been working as an architect. Now when I give "Software Architect" as my job title, I just get the blank look.
Architects are people that design skyscrapers; what does this have to do with software? "Well, I am primarily concerned with the high-level design of software applications", I might say.
"Ooh, you are a computer programmer" comes the predictable response.
Yes I am a computer programmer, but now I am much more, I have become a metaphor, a figure of speech.
Actually I harbor no ill feelings towards the trusty metaphor. It seems like the more colorful and metaphorical the metaphor, the better. In celebration of my friend the metaphor, I think I will post the occasional blog that catalogues some of the metaphors that have served me well.

Metaphor 1: If I am an architect, other computer programmers may be cabinet makers.
Cabinet makers are among the most crafty of the tradesmen and their skill is evident. One may make the claim that the cabinet maker is at the pinnacle of the building trade. But an extremely talented cabinet maker may not be equipped to design a building. Some software developers delight us with their wizardry as they extract data streams from serial ports, handily twiddle bits through the ether(net) and such. But when they scale up their endeavours it becomes a mess - spaghetti wizardry. Unstructured, insane but yet genius.
I shall now metaphorically refer to these developers as cabinet makers. As an architect, it is my job to scope out where the cabinets need to go. I like to dabble with the occasional cabinet, but as an architect I am increasingly focusing my efforts on the big picture.

Metaphor 2: Software is a machine.
Many times you hear about how the software industry is immature because of quality issues. Mechanical systems off-loaded complexity to electronics and electronics off-loaded complexity to software. When we find another place to dump this complexity, then the software industry will become mature. The point is that software is a very efficient way to create a very complex machine. This machine could be created in hardware, imagine Babbage with a CNC Mill/Lathe! If software statements were represented using moving, mechanical parts, a software app would approach the complexity of an aircraft carrier.
So a software app is a very complex machine that is comprised of multiple moving parts.
Let's plan accordingly.

Bruce on Threading Terminology

2005-10-03T10:51:00.000-05:00

Bruce Eckel posted on Threading Terminology over at artima.com. The beauty of this post is following along with Bruce as he intuitively wrestles with Java 1.5 threads as a service to his Thinking in Java readers. As with most popular blogs, the discussion that follows is interesting. A common thought is that despite the power and clarity of the Java 1.5 threading model, the same pitfalls remain.

Multithreaded Applications: asynchrony - a blessing or curse?

We have the power to create a new thread in our code. We create a thread and send it on a mission. Our new thread is J.E.B. Stuart and his task is to find the Union's flank and then to report immediately. Meanwhile the main thread can continue its march towards Washington. At this point, we have two elements operating asynchronously, but eventually the main thread may have to wait for Stuart's report. At this point the strategy begins to unravel, the main thread has to halt its advance and await notification from Stuart.
The wait proceeds, now the main thread may have to choose between retiring from the field or proceeding on to battle. One problem is that the main thread still has a responsibility to Stuart. Stuart cannot be left orphaned precariously near the Union's lines. In any case Stuart must be ordered to terminate his patrol.
The essence of multithreaded apps is independence and asynchrony, yet the reality is that lines of communication must be maintained between threads and protocols must be established to handle the situations that will arise.
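
The scenario above can be sketched with Python's standard `threading` module. This is a minimal illustration, not a prescription: the names `patrol`, `campaign`, `stop_order` and `report` are hypothetical, and the "flank" is found after a fixed number of steps purely for the sake of the example.

```python
import threading


def patrol(stop_order, report):
    """Scout thread: look for the flank until found or recalled."""
    steps = 0
    while not stop_order.is_set():
        steps += 1
        if steps == 3:                    # pretend the flank was found
            report.append("flank located")
            return
        stop_order.wait(0.01)             # pause, but wake early if recalled


def campaign(timeout):
    """Main thread: send out the scout, wait a bounded time, then recall."""
    stop_order = threading.Event()
    report = []
    scout = threading.Thread(target=patrol, args=(stop_order, report))
    scout.start()
    scout.join(timeout)       # main thread halts and awaits the report
    stop_order.set()          # either way, order the patrol to end
    scout.join()              # never leave the scout orphaned
    return report
```

Calling `campaign(1.0)` gives the scout ample time to report. With a very short timeout the main thread gives up waiting, but it still recalls the scout and joins it before moving on, which is the protocol the paragraph above argues for.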

Synchrony - the beating heart of time

Many of the elements of a threading service are synchronization objects such as Mutexes, Semaphores, Critical Sections, Signals, Events, and Monitors. With a few simple lines of code, we loose a fury upon the world, and then write reams of code trying to control this bastard child. Ultimately the synchrony of single-threadedness is revealed as more powerful than the asynchronous multithread. It is the only way to subdue the creature.

Multiprocess or Multithread

Multithreaded programs have been identified as more performant and more scalable than their multiprocess counterparts. A good example is process spawning web-servers, such as CGI servers, vs. threaded servers such as Mod Perl. One very brilliant blogger whose URL I cannot recall (I will update this blog when I remember), suggests that multiprocess is far more robust than multithreaded. I especially enjoyed his term for hard-to-find multithreaded bugs - "heisenbugs".
However, if one process is dependent on the state of another process, synchronicity issues will exist. What about multimachine distributed systems and such? Message Queue applications are used to synch up the constituents.

In conclusion - good, bad, indifferent or just reality.

It is goodness that programming languages and platforms continue to put a fine point on their support for multithreaded programs. Yet multithreaded support needs to exist in the viscera of applications that need threads. It is not a problem for language designers, it is a problem for application designers. It is a problem for very experienced application designers.

Optimization Patterns

2005-07-01T16:25:00.000-05:00

The following is a reply to the post The Limits of the MVC Design Pattern.
It seems like the "Limit of the MVC Design Pattern" is that it defies optimization.
It is no surprise that it would be difficult to optimize a general solution such as an architectural framework or design pattern. Maybe we are stumbling across the need for "Optimization Patterns".
One example of an Optimization Pattern is refactoring, which is optimizing the structure of code.
One argument against the need to optimize is that performance is a shrinking concern. Sometimes it seems brain-dead to continually reload data-structures instead of caching, but reloading and requerying may have no effect on the bottom line - which is the "user experience".
New evidence suggests that software performance may be very important. Stored data is growing at incredible rates, rates that even surpass Moore's Law. This is good news for us geeks since complex solutions involving caches, multiple threads and efficient algorithms will be in demand.
So MVC promotes reuse and decoupling which are good things, but what patterns promote optimization? One answer may be the Inversion of Control Pattern (IoC). Briefly, IoC is used in frameworks because it facilitates the design of configurable systems. Therefore "Optimization Patterns" really are "Flexibility Patterns".
Systems will have to be designed with extension and modification points to support changes not yet imagined.

Last night I dreamt of trees...

2005-03-27T08:34:00.000-05:00

of data, trees of reality.

To begin, yesterday I spent a good part of the day working on my latest project which combines learning Python and brushing up on some basic data structures, such as Binary Trees. Maybe "brushing up" isn't a good way to describe my intentions. I am not satisfied with knowing that I can code up a Binary Tree whenever I care to. What I am looking for is a visceral, deep, intuitive understanding.
I coded up an AVL Tree which is a self-balancing Binary Search Tree. In order to test the tree, I had to come up with some specific values to insert and then determine the resulting tree. This work was done, not on a computer, but with pen and paper. After several hours of sketching trees, I was suddenly struck with the limitation of the Binary Tree. As the tree is traversed from root to leaf, branches of the tree are eliminated from consideration. All that matters are the options that lie ahead. This is what makes Binary Trees so powerful for searching, but not so great for modeling activities. In real life, branches that connect nodes of existence come and go, seemingly at random. Nodes are connected and removed due to past-events and probabilities. We seem to have the ability to skip around the tree of reality, to teleport amongst its various nodes.
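
The way a search discards branches can be sketched in a few lines of Python. This is a plain, unbalanced binary search tree rather than the AVL Tree described above (no rotations are performed), and the names `Node`, `insert` and `search_path` are hypothetical.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into an unbalanced binary search tree; return the root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root


def search_path(root, key):
    """Return the keys visited while searching for key, root first."""
    path = []
    node = root
    while node is not None:
        path.append(node.key)
        if key == node.key:
            break
        # Choosing one child removes the other subtree from consideration.
        node = node.left if key < node.key else node.right
    return path


root = None
for k in [50, 30, 70, 20, 40, 60, 80]:
    root = insert(root, k)
```

Here `search_path(root, 60)` visits 50, then 70, then 60. The entire subtree under 30 is never looked at once the first choice is made.
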
But reality is a Binary Tree. Consider the effect of time on reality.
Time delimits and defines reality. Choices are made, paths are taken. We don't have to concern ourselves with options on paths we are not on. Reality is a Binary Tree, could you imagine otherwise? The choices we make resolve to a path, which in turn resolves to a line. The time-line.
I looked at the smallest of Binary Trees, three nodes, one of which was the parent with its two children, and saw infinite possibilities. I looked at a main branch of a large tree and, even with all of its sub-trees to consider, all that came to mind was the removal of possibility, the refinement of the real.
So I went to sleep last night and I dreamt of trees. Elements of life, connected by branches now seen.

Temporal Anomalies

2005-03-02T08:32:00.000-05:00

My favorite Star Trek NG plots involve "temporal anomalies" in which the time-line is disrupted and the crew of the Enterprise is plunged into the grips of an alternate reality. In real life, temporal anomalies do exist, in multi-threaded software applications.
The most fundamental building-block of computer algorithms is the sequential execution of code. Code is always executed sequentially for each "unit of execution" in a computer program. A unit of execution in a computer program is known as a thread.
A thread is a time-line in code, a unit of reality if you will. Maybe this is why programming with multiple threads seems so odd to me.
Operations on threads involve things like putting threads to sleep, which is equivalent to suspending time. Threads are "blocked" which is like having time run into a log-jam. A use for "blocking" is to make a thread wait so that it may "join" with another thread (time-line).
Note that other forms of reality such as state and data (which is like matter), are accessible by all threads in a program. Therefore a protocol has to exist between threads to ensure that time-lines don't corrupt matter. For example, you would not want to go back in time and marry your parent (sorry for the visual).
Multi-threaded programs have their own set of common problems. One is "dead-lock", when time-lines attempt to resolve conflicts over matter access and effectively block each other out. The other problem is known as a "race-condition", where the outcome depends on the order in which time-lines happen to act on the same matter.
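
A minimal sketch of such a protocol, using Python's standard `threading.Lock`: several threads update one shared counter, and the lock ensures that only one of them touches it at a time. The `Counter` class and `run` function are hypothetical names for illustration.

```python
import threading


class Counter:
    """Shared 'matter' touched by several time-lines (threads)."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:          # only one thread at a time may update
            self.value += 1


def run(workers=4, per_worker=10_000):
    """Start several threads that all increment one counter; return the total."""
    counter = Counter()

    def work():
        for _ in range(per_worker):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                  # wait for every time-line to finish
    return counter.value
```

With the lock in place the total is always `workers * per_worker`. Without it, the read-modify-write inside `increment` could interleave between threads and silently lose updates.
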
I blogged about object-orientation in Hammer Of The Gods where I made the case that every "thing" can be classified as an object. None of the OO languages I use effectively models threads. Instances of objects have scoped life-times. Objects are just "matter" that exist on the time-line. Threads are the time-line. Threads transcend objects.
I have two dark visions regarding threads. One vision is of a thread, endlessly spinning through time, its time-line eternal. All context and state of interest long-since extinguished. This happens, which is why the Windows Task Manager is a handy tool. The seconds it takes to launch the tool and shut-down the errant task are many eternities to a lost thread.
The other vision is a nightmare involving the use of garbage-collected or reference-count-based memory management. I see a thread, drifting through an empty, irrelevant universe, holding reference to an irrelevant object. Partners in the void. Neither able to release one another from futility. Neither able to redeem the other's existence.

Hammer Of The Gods

2005-02-23T20:41:00.000-05:00

Objects surround us. There are objects that move, objects that are fixed. Objects that reside within other objects and objects that contain other objects. Objects crawl, eat, defecate, breed and die on the surface of other objects. Objects revolve in systems around other massive objects. What about thoughts, concepts, feelings? What about light, gravity, water, the wind? We can consider these objects as well.
So everything is-an Object (reminds me of a very popular programming language). Of course the term object is just an extremely general way to classify things. Software design involves the discovery and classification of objects that reside within a domain, and the creation of objects that serve to frame and facilitate the domain objects.
Object-Oriented Design is a tool and as a tool it has a very specific function. There is an old cliche, "if you have a hammer, everything looks like a nail". But everything is an object, therefore everything is-a nail, in this sense.
Certainly if we name an element in our domain or system, we are speaking of objects. A very powerful naming technique is the use of metaphor. Successful metaphors surround the geekosystem. We don't invent new words, we overload the ones we already have. Words like 'file', 'folder', 'menu', 'icon'. Objects, every one of them. We even overload verbs, for example, 'browse', 'click', 'surf'. One new verb that comes to mind is to 'google'. One 'googles' when at Google. Google is a service (metaphorically speaking), a service is an object.
'Naming' and the use of metaphor is a very important capability for software developers. I think we tend to underestimate the power of well-crafted names and metaphors. The word 'object', with regards to software development, is a great name. The term was coined in the 1960s and is still effective, relevant, and a source of mystery.
The power in the name 'object' comes from the abstract, vague and nebulous nature of the name. By under-specifying what an object is, by allowing the term 'object' to breathe, we give 'objects' great capability. So let's go the whole nine yards and say that everything is an object. It's true anyway.
It's important to be able to live in the abstract, but every now and again we need to touch concrete. The typical way to teach OO is to provide concrete examples and to provide simple tenets such as 'an object is data and the methods that work with the data'. Objects live alongside electron-clouds and Heisenberg. Too much concrete - too much uncertainty. I think most of the break-throughs in OO came from the early pioneers (Smalltalkers) who didn't have rules and concrete. They just created. And speaking of clouds, kind of makes you wonder if Grady Booch was on to something with his form of OO design notation. At least he gave Heisenberg his due.

Graphs and the Space-Time Continuum

2005-02-16T09:01:00.000-05:00

I was given a small hammer when I was just a wee lad. With hammer in hand, I wandered my home searching for something to fix. Later, my father had to retrace my steps with wood-filler and paint, repairing various door-frames, sills and mouldings. I don't do much with hammers anymore, the tools of my trade are software technologies. When I learn something new, it's like getting a new tool and I go in search of an application. This approach is a little non-optimal but it is how I learn. My latest interest is the data-structures known as graphs. Like most software engineers, I learned about graphs in school. Graphs are connected networks of nodes. One of the key points is that the connectivity of nodes is established by having nodes refer to other nodes.
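
The idea of nodes that refer to other nodes can be sketched directly in Python. This is one of several possible representations (an adjacency list keyed by name would work just as well), and `GraphNode`, `connect` and `reachable` are hypothetical names chosen for the example.

```python
from collections import deque


class GraphNode:
    def __init__(self, name):
        self.name = name
        self.neighbors = []       # direct references to other nodes

    def connect(self, other):
        """Create an undirected edge between self and other."""
        self.neighbors.append(other)
        other.neighbors.append(self)


def reachable(start):
    """Return names of all nodes reachable from start, breadth-first."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node.name)
        for nxt in node.neighbors:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


a, b, c, d = (GraphNode(n) for n in "abcd")
a.connect(b)
b.connect(c)
c.connect(a)      # a cycle, which a tree could not contain
```

Starting from `a`, the traversal reaches `a`, `b` and `c`. Node `d` has no connections and can only reach itself.
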
I have never worked with anyone who has ever had the need to create a graph, including the Binary Search Tree. Today, most languages have built-in data-structures such as lists, hash-tables, and vectors. Data is modeled and stored in a database. It is easy to consign the graph to the realm of academia and the exotic. As others discount the graph, I may find a way to use it to my advantage, to make it a permanent part of my toolkit.
To begin, I close my textbook and strap myself into my rocket and launch to low earth orbit. Now adrift in vacuum, away from code examples and technical detail, I once again view the graph. Ignoring the obvious, such as computer networks, air travel routes, and state-machines, what do I see? Anything that interacts with anything probably is a graph. One of my favorite tools is Object-Oriented design. Interacting objects are more exciting to me than functional flow. Interacting objects are nothing more than nodes in a graph.
Can I enhance my ability to model the world with OO by thinking in terms of graphs? Will graphs add a new perspective to my ability to analyze?
Next I dock with my star-ship and warp out of the solar-system. I subject the graph to various experiments. One thing to note is that a graph can exist in three dimensions. Perfect examples are Tinkertoys and K'Nex. But 3-D graphs can be squashed flat and still exist in 2-D. Therefore it is easy to surmise that graphs of many dimensions exist and can be represented in just two dimensions. Infinity in two dimensions.
Now I launch the graph at the event-horizon of a black-hole. As the graph compresses, the intent of each node becomes distorted. Yet the relationships remain. The differences between the nodes diminish as they merge to a singularity and become one. One node associated with itself. With trepidation, I leave the void, with the memories of that brave graph emblazoned in my mind. I return home, reopen my textbook and begin my journey anew.

Techie or Manager

2005-01-27T17:40:00.000-05:00

I drove into work the other day listening to meditative incantations. As I drove, I tried to see the world as a child. Water-towers became docked space ships and streetlights became uni-podded aliens attempting to disintegrate me. Besides the fact that I am weird, why would I do this? No I didn't smoke anything. It's an exercise I like to do when I am doing software design. I like to keep imaginative, and open. This is important because software is soft. Software developers create their own realities. Maybe a little insanity is good for those who create alternate realities.
Anyway, I arrived at work, mentally prepared, but not for what was waiting for me. A note summons me to my boss's office - kind of formal and all. I immediately itemized all my misdeeds of the past week or so. Nothing that I couldn't defend, or so I thought. Well to cut to the chase, I was promoted, to Lead Software Engineer.
My heart sank. To begin with, my mentor, leader, and collaborator of 6 years was moving on. He was the interpersonal hub and arbitrator between the disparate elements of the project. He also had to do all the crappy admin stuff. He never really got to do much software design or code. I sort of led the software design and coding efforts as one of the project's Alpha Geeks.
As I thought about it I realized that I really had it made. I get to play with computers for a living. I get to devise, conceptualize and realize. I don't wanna be no manager. I am already living my dream. So I said 'No'. I told my manager that developing software is my passion and I meant it. You can't argue with that. I also mentioned that she could count on me while transitioning in a new Lead.
My heart rose like the mid-day sun.
I had made the right choice.