Tuesday, 18 February 2048

The hunt for the dangling Iterator (part 1)

Why start a blog on "fuzzy methods" with specialised technical considerations on Java programming?
More on this later! (But if you are not familiar with Java programming you had better skip these initial blog entries.)
If you are already familiar with discussions on Iterators you can also skip this introduction:
Iterator is an interesting feature: it enables a "client code" to pilot a service requested from an implementing code. Usually when you request a service from another class you have something that looks like:
 Result res = implementor.doIt() ;
Control is transferred to the implementor: once the implementor's code is finished, control is transferred back to the caller's code.
Using an Iterator sets up a different kind of cooperation between implementor and client code. Once you write:
 Iterator iter = implementor.iterator() ;
the client code collaborates with the implementor code through successive calls to hasNext() and next().
Now suppose you have a code that wants to retrieve a list of Objects from a Catalog. You can have different ways to specify the service:
  
  Thing[] get(request) ;
or
 
  List<Thing> get(request) ;
or
 
  Iterator<Thing> get(request) ;
The choice depends on realistic expectations about the number of Things you will get and how long it will take to retrieve those things.
Using an Iterator enables the client code to "pump" results as it needs them. So, for instance, if you inadvertently issue a request that could generate thousands of results you can quickly give up and, if your chain of request code is well built, only a limited number of objects will have been pumped into a cache (from a database, for instance).
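To make the "pumping" idea concrete, here is a minimal sketch. It assumes a hypothetical PagedCatalog that stands in for a database and fetches results one page at a time; the class name, the page size and the pagesFetched counter are illustrative assumptions, not part of any real API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical catalog: pretends each page is an expensive database round trip.
class PagedCatalog {
    private final int total;     // how many results the request would produce
    private final int pageSize;  // how many results are fetched per round trip
    int pagesFetched = 0;        // counts simulated round trips

    PagedCatalog(int total, int pageSize) {
        this.total = total;
        this.pageSize = pageSize;
    }

    // Simulates one round trip returning the results [from, from + pageSize).
    private List<Integer> fetchPage(int from) {
        pagesFetched++;
        List<Integer> page = new ArrayList<>();
        for (int i = from; i < Math.min(from + pageSize, total); i++) {
            page.add(i);
        }
        return page;
    }

    // The client pulls results; a new page is fetched only when needed.
    Iterator<Integer> get() {
        return new Iterator<Integer>() {
            private List<Integer> page = new ArrayList<>();
            private int indexInPage = 0;
            private int nextStart = 0;

            @Override
            public boolean hasNext() {
                if (indexInPage < page.size()) {
                    return true;
                }
                if (nextStart >= total) {
                    return false;
                }
                page = fetchPage(nextStart);
                nextStart += page.size();
                indexInPage = 0;
                return !page.isEmpty();
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return page.get(indexInPage++);
            }
        };
    }

    public static void main(String[] args) {
        PagedCatalog catalog = new PagedCatalog(10_000, 100);
        Iterator<Integer> it = catalog.get();
        for (int i = 0; i < 5 && it.hasNext(); i++) {
            it.next(); // the client gives up after five results
        }
        System.out.println("pages fetched: " + catalog.pagesFetched);
    }
}
```

A client that stops after a handful of results has only paid for the pages it actually touched, which is the whole attraction of returning an Iterator rather than an array or a List.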
Once you think you need an Iterator another question is raised: when does the interaction between "partner" objects end? For instance: you use a database and once you have finished pumping results you need to close the interaction with the database.
This is annoying because the implementing object wants to be sure that the client code explicitly ends the interaction ... but how could it be sure? (programmers are so absent-minded you know...)

END INTRODUCTION

Now the story of my different tries to solve this problem.
First idea: declare a Closeable Iterator
public interface CloseableIterator<X> extends Iterator<X> {
    public void close() ; // 
}
Note that close does not throw any exception (this relates to other points when considering Iterators).
So the Catalog<X> could declare :
  public CloseableIterator<X> get(Request<X> req) ;
Now the problem is not solved because the implementing code is not sure that the calling code will actually call close().
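A minimal sketch of what the client has to write for close() to be called reliably, assuming a Java version with try-with-resources (Java 7 or later, which postdates the closure proposals discussed in this post). TrackingIterator and CloseDemo are hypothetical names used for illustration, and this CloseableIterator additionally extends AutoCloseable, which the interface above does not.

```java
import java.util.Arrays;
import java.util.Iterator;

// Same idea as the interface above, plus AutoCloseable (an assumption of this sketch).
interface CloseableIterator<X> extends Iterator<X>, AutoCloseable {
    @Override
    void close(); // narrowed: declares no checked exception
}

// Hypothetical implementation that simply remembers whether close() was called.
class TrackingIterator<X> implements CloseableIterator<X> {
    private final Iterator<X> delegate;
    private boolean closed = false;

    TrackingIterator(Iterator<X> delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean hasNext() {
        return !closed && delegate.hasNext();
    }

    @Override
    public X next() {
        return delegate.next();
    }

    @Override
    public void close() {
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }
}

class CloseDemo {
    // Reads at most 'limit' elements; close() runs even on early exit or exception.
    static <X> int consume(CloseableIterator<X> source, int limit) {
        int n = 0;
        try (CloseableIterator<X> it = source) {
            while (n < limit && it.hasNext()) {
                it.next();
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        TrackingIterator<String> it =
                new TrackingIterator<>(Arrays.asList("a", "b", "c").iterator());
        System.out.println("read " + consume(it, 2) + ", closed: " + it.isClosed());
    }
}
```

The objection raised above still holds: nothing forces the client to write the try block, so the implementing code remains at the mercy of the caller's discipline.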
Step 2: while investigating the closure proposals for Java 1.7 I started designing code like this (see for a lesson on closures):
public interface Iterant<X > {
    public <throws E > void forEach({X => boolean throws E} code) throws E;
    
}
and wrote an implementation that did:
    private CloseableIterator<X> iter ; //initialised by constructor
    private  boolean closed = false ; // should be volatile

    public <throws E> void forEach({X => boolean throws E} f ) throws E {
        try {
            while(iter.hasNext() ) {
                X val = iter.next() ;
  // this will normally be treated differently with break/continue
  // of the proposed "for" modifier
                if (! f.invoke(val)) {
                    break ;
                }
            }
        }finally {
            this.theEnd() ;
        }
    }
    
    protected synchronized void theEnd() {// see comment later on why "synchronized"
        if(!closed) {
            closed = true ;
            iter.close() ; // normally close should be idempotent
        }
    }
Well, it appeared that the code was a bit too specific to the notion of CloseableIterator and could be abstracted a trifle more:
    private Iterator<X> iter ;
    private {=> void } exit ;

    public IterantWithExit(Iterator<X> it, {=> void} exitCode) {
        // test if it is not null ! 
        this.iter = it ;
        this.exit = exitCode ;
    }

    private AtomicBoolean exited = new AtomicBoolean(false) ;
    private AtomicBoolean called = new AtomicBoolean(false) ;

    public <throws E > void forEach({X => boolean throws E} f ) throws E {
         if(called.getAndSet(true)) {
             throw new IllegalStateException("forEach already called");
         }
        try {
            while(iter.hasNext() ) {
                X val = iter.next() ;
                if (! f.invoke(val)) {
                    break ;
                }
            }
        }finally {
            this.theEnd() ;
        }
    }
    
    private  void theEnd() {
        if( ! exited.getAndSet(true)) {
           if(exit != null) exit.invoke() ;
        }
    }
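The closure syntax used above was only a proposal. As a minimal sketch, assuming Java 8 or later, the same class can be approximated with a Runnable for the exit code and a small functional interface for the loop body. Step is a hypothetical name introduced here to carry the generic exception type, which plays the role of the "throws E" parameter of the proposal.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.Objects;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for {X => boolean throws E}.
@FunctionalInterface
interface Step<X, E extends Exception> {
    boolean apply(X value) throws E; // return false to stop the loop
}

class IterantWithExit<X> {
    private final Iterator<X> iter;
    private final Runnable exit; // may be null: then there is nothing to run
    private final AtomicBoolean exited = new AtomicBoolean(false);
    private final AtomicBoolean called = new AtomicBoolean(false);

    IterantWithExit(Iterator<X> it, Runnable exitCode) {
        this.iter = Objects.requireNonNull(it, "iterator");
        this.exit = exitCode;
    }

    <E extends Exception> void forEach(Step<X, E> f) throws E {
        if (called.getAndSet(true)) {
            throw new IllegalStateException("forEach already called");
        }
        try {
            while (iter.hasNext()) {
                if (!f.apply(iter.next())) {
                    break;
                }
            }
        } finally {
            theEnd(); // runs on normal end, on break and on exception
        }
    }

    private void theEnd() {
        if (!exited.getAndSet(true) && exit != null) {
            exit.run();
        }
    }

    public static void main(String[] args) {
        IterantWithExit<String> iterant = new IterantWithExit<>(
                Arrays.asList("a", "b", "c").iterator(),
                () -> System.out.println("exit code called"));
        iterant.forEach(s -> {
            System.out.println(s);
            return !s.equals("b"); // stop after "b"
        });
    }
}
```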
So now we no longer need to create a CloseableIterator, since any code to be called on exit can be supplied by the implementing code.
Are we sure that the exit code will always be called? No! The code provided by the client could stall the loop (waiting for a GUI event, for instance) and a "dangling iterator" could be left behind.
OK, we could add this to the code:
    protected void finalize() {
        this.theEnd() ;
    } // BTW this explains why the code deals with concurrent access
That is: if there is a dangling object the close operation will be called when the object is reclaimed by the garbage collector.
Is this enough? It depends on the importance of the close code, and on whether you can tolerate that close may not be called when the JVM terminates. If you want to be DAMN SURE it is called (except in case of cataclysmic termination) you may add this code:
protected static final WeakHashMap<IterantWithExit, {=> void}> SURVIVORS = 
            new WeakHashMap<IterantWithExit, {=> void}>();
static {
        Runtime.getRuntime().addShutdownHook(new Thread() {
            public void run() {
                for(IterantWithExit iterEx : SURVIVORS.keySet()) {
                    iterEx.theEnd() ;
                }
            }
        });
}
This is heavy artillery but if you need it ....
I was not yet satisfied with this code so during the night this came to my mind:
public interface Capsule<X> {
    <R, throws E> R exec({X => R throws E} code)throws E ;
}
and implementation looking like :
public class ExitCapsule<X> implements Capsule<X> {
           // SURVIVORS initialisation 
  
    private {=> void } exit ;
    private X val ;
  

    public <R, throws E > R exec({X => R throws E} f ) throws E {
         
        try {
             return f.invoke(val) ;
        }finally {
            this.theEnd() ;
        }
    }
    
    public ExitCapsule(X vl,  {=> void} exitCode) {
        this.val = vl ;
        this.exit = exitCode ;
        SURVIVORS.put(this,exitCode);
    }

    protected void finalize() {
        this.theEnd() ;
    }
 // the End code + volatile test for execution
}
THAT looked like some very abstract thing! (Maybe I was influenced by my desperate attempts at understanding the fpl library of my friend Luc Duponcheel ;-))
In that case the client code is going to handle a Capsule<Iterator<Thing>> and iterates as needed ... bleech ...
Frankly the little demon that supervises my sleep did not like this option so he came up with another solution ....
more to come ....

The hunt for the dangling Iterator (part 2)

Our quest was: how to be sure that a client code using an Iterator is calling a close code when the interaction with the implementing code is finished?
This code popped up:
public class ExitIterator<X> implements Iterator<X> {
    // if requested: create weak map SURVIVORS

    private {=> void } exit ;
    private Iterator<X> iterator ;
    private AtomicBoolean exited = new AtomicBoolean(false) ;

    public boolean hasNext(){
        boolean ok = false ;
        try {
            ok = iterator.hasNext();
        }finally {
            if( ! ok) {
                this.theEnd() ;
            }
        }
        return ok ;
    }
    
    public X next() {
        boolean ok = false ;
        X res = null ;
        try {
            res = iterator.next() ;
            ok = true ;
        }finally {
            if( ! ok) {
                this.theEnd() ;
            }
        }
        return res ;
    }
    
    public void remove() {
        iterator.remove() ;
    }
    
    private  void theEnd() {
        if( ! exited.getAndSet(true)) {
           if(exit != null) exit.invoke() ;
        }
    }
    
    public ExitIterator(Iterator<X> it,  {=> void} exitCode) {
 // should check if iterator is null !
        this.iterator = it ;
        this.exit = exitCode ;
        // SURVIVORS.put(this,exitCode);
    }

    protected void finalize() {
        this.theEnd() ;
    }
}
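For readers who want to try this out, here is a minimal sketch of the same ExitIterator, assuming Java 8 or later, with a Runnable standing in for the {=> void} closure. The finalize() method and the SURVIVORS map are left out of this sketch (finalize() has been deprecated since Java 9), so only the "as is" behaviour is shown.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.Objects;
import java.util.concurrent.atomic.AtomicBoolean;

class ExitIterator<X> implements Iterator<X> {
    private final Iterator<X> iterator;
    private final Runnable exit; // stands in for the {=> void} exit closure
    private final AtomicBoolean exited = new AtomicBoolean(false);

    ExitIterator(Iterator<X> it, Runnable exitCode) {
        this.iterator = Objects.requireNonNull(it, "iterator");
        this.exit = exitCode;
    }

    @Override
    public boolean hasNext() {
        boolean ok = false;
        try {
            ok = iterator.hasNext();
        } finally {
            if (!ok) {
                theEnd(); // exhausted, or hasNext() threw
            }
        }
        return ok;
    }

    @Override
    public X next() {
        boolean ok = false;
        try {
            X res = iterator.next();
            ok = true;
            return res;
        } finally {
            if (!ok) {
                theEnd(); // next() threw
            }
        }
    }

    @Override
    public void remove() {
        iterator.remove();
    }

    private void theEnd() {
        if (!exited.getAndSet(true) && exit != null) {
            exit.run();
        }
    }

    public static void main(String[] args) {
        Iterator<String> it = new ExitIterator<>(
                Arrays.asList("a", "b").iterator(),
                () -> System.out.println("exit code called"));
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

The client loop is an ordinary hasNext()/next() loop: it neither knows nor cares that an exit action is attached.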
This code may exist in different flavours
  • "as is" if you reasonably want the close operation to be invoked while the application is running
  • with a SURVIVORS WeakHashMap (see previous blog entry) if you want to be damn sure the close operation is invoked
  • a "belt and suspenders" option with a timeout that calls the exit code if the Iterator has not been used for a certain amount of time
Now :
  • what qualifies this code for being considered a "solution" ?
  • what can we say about the path that led us here?

A solution ?

Well there is a moment where you'll have to stop searching for an even better solution. So what qualifies this code for being landmarked so we can deal with other problems?
I feel that it nicely fits into existing code:
  • A client code that already uses an Iterator is not to be modified: it's up to the implementing code to act
  • An implementing code that already produces an Iterator is to be only slightly modified since we need to slip an ExitIterator with proper exit code in between. So code modifications are minimal.
The code is not perfect and may need further refinements ... but this is for further versions:
  • Why is there a WeakHashMap<Iterator<?>, {=> void}> in the first place? (overkill?)
  • Could we quantify the performance penalty for using this new Iterator? (probably not that big an issue)
  • ... (please add your own criticisms here)
The goal of this discussion is to deal with methods: when do you think you should stop/suspend your design errands?

The path

One may wonder why I didn't come up with the ExitIterator solution directly, since constraints such as not meddling with client code could have been guessed in the first place!!!
Well ladies and gentlemen let me state this: almost everything is obvious .... AFTER! (the Magnus Ridolph principle)
I would even say that focusing on a limited set of hypotheses and reinjecting prerequisites later is the stuff of an efficient search for a solution! This looks like a paradox, but in practice it IS a workable approach.
I'll have to confess: I am not a genius! I need strategies to deal with complex problems and come up with palatable solutions. My experience with different design activities (remember: I have been an architect!) is that they share a similar look and feel: we set out on different paths and the complex solutions are the first to come. We need to be stubborn and not stop too hastily, and we need an attitude of trying to distill (slowly!) new structures and solutions.... This goes through sharing (brainstorming) and a structural aesthetic sense (?)
Looking for a design and freezing the state of our quest is the stuff about "fuzzy" inductive design methods (as opposed to "hard" deductive methods)
More about this later ....

Saturday, 15 February 2048

Those elusive design patterns (part 1)

When I was teaching maths (a long time ago!) I tried to infuse consideration towards well built coherent demonstrations.

Then one day a student asked a question about a theorem: "Clever! but how am I supposed to discover the solution to such a thing?"

I realised I had to teach that a demonstration was the final stage of a complex quest. During that search one had to mix deductive stages ("X implies Y") and inductive stages where the mind wandered in search of clues, paths, dead ends, magical inspirations ...

Can this design process be taught to be more efficient?
- Yes!
Then does the use of such methods guarantee a result?
- No!
Transposing this to information technology can lead to very pessimistic conclusions. When designing software:
  • you are not sure your hypotheses are 100% valid, you are not sure that you are aware of all the hypotheses, and hypotheses may change during the life cycle!
  • we are not sure that the final stage (the software product) is coherent, reliable and even meets expectations.
  • the overall process by which we create this product is blurry: many methods compete for our attention. Some are as "hard" as possible (could be likened to deductive processes), some are more oriented towards behavioural patterns for the programming team.
In this blog I will use a gross approximation which is to qualify "soft" methods (methods not based on proofs) of being "fuzzy". Being rational beings we need rules even if these are rules of thumb or if the theoretical foundations for those rules are unsteady.

Those methods are useful but have a disturbing property: their relevance should be continuously re-evaluated. This is paradoxical: one works on methods to establish a process that will enable further economies of thought ... then we need to constantly review this process. So where is the economy?

In software engineering the paucity of meta-methods practices is dangerous.

I have been working in a corporation using "extreme waterfall" methods: for every step there is an extensive list of prerequisite documents and a list of mandatory documents to be supplied for the next step. Without more context I have no clue whether this is justified, but my question is: is there an evaluation of the overhead generated by the procedure itself?

At the other end of the development spectrum, "agile" methods take an empirical approach, recognising that problems cannot be fully understood. Some rules state that distracting influences, meaning anything not directly focused on the current goal, should be eliminated. This rule also has limits: in some cases it is extremely important and profitable to let someone wander off. So without an evaluation of what happens when rules are violated we miss something!

So, roughly, every method should carry its rationale, its criticisms and its evaluation practices. This is obvious but frequently overlooked.

Let's have a look at programming design patterns and how those patterns should constantly be re-evaluated.

At this stage it is important to recall that the recognition of the "design pattern" notion was inspired by architecture (Ch. Alexander) -though I.M.H.O. the first precise description I know of dates back to Clausewitz and dealt with military strategy-.

-those of you who are not familiar with Java programming may skip the following entries of the blog-

Those elusive design patterns (part 2)

It may sound strange to talk about programming design patterns in a "fuzzy" methods context.
There are two reasons for that:
  • Though the inner mechanisms of programming patterns are well described, their scope must be the subject of very thorough discussions. In that sense I often say that design patterns are more "inspirational paths" than precise recipes.
  • This notion of a structure that guides a search path has wider applications.
The initial descriptions of design patterns are accompanied by rationales and precise guidelines about their use and limits. Funnily, if you browse the web those discussions tend to disappear: while focusing on the essence of the pattern most descriptions tend to forget those guidelines! The result is that we tend to be lazy or, even worse, use code generators once we have identified a pattern. As we profit from the work of our elders we forget about the precautions for use, we forget about re-enacting the logical build of the pattern. I am not talking about reinventing the wheel every time but about quickly following their path and profiting from their experience ... which is different from indiscriminately applying a recipe.
Let's revisit a well known pattern which is Observer.
I have not found online the precise description as given in GOF (so for the moment just a concise description here )
Now compare the GOF description with the Java implementation of Observer/Observable:
public class Observable { // the "subject"
   // details skipped here
   // including interesting case of setChanged()

   public void addObserver(Observer o)  { .....

   public void notifyObservers(Object arg) { ....

public interface Observer {
   public void update(Observable o, Object arg) ;
}
Though many people argued against Observable being a class (and not an interface) I would rather focus on other additional remarks here :
  • Java has opted for a "push" option: the Observer does not have to go after the subject to request state information. The notification object could be anything (including something that enables the Observer to request some state information back!).
  • Implicitly, the Observer here can register itself with several subjects. Other implementations of this pattern may skip this ability.
  • The "contract" for an actual Observer should specify more constraints. A common example is that each update method should terminate as quickly as possible to avoid blocking further notifications to other Observers. One could implement a different set of Observer/Observable where there is a different Thread to handle each notification (or anything that could make the notification asynchronous -we are here slipping towards a publish/subscribe pattern-). Note that abnormal termination of update should also be handled gracefully (that is: if a Throwable is creeping up the stack the Subject should not crash).
  • Another twist to the story is illustrated by the event mechanism in AWT: when notifications are fired in rapid bursts, events may be merged in an event queue.
  • The fact that Observer is an interface is an illustration of the decoupling principle. This can be illustrated with the example of the call-back pattern: an object of type O registers itself with an object of type S that will call back later. Since O "knows" S and S keeps a reference to O, there would be a cyclic dependency of types if S did not refer to O through an interface. This pattern can be considered as a simplified Observer case: there is only one Observer.
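The remark about abnormal termination of update can be sketched as follows. This is a minimal illustration, not the java.util.Observable implementation: Listener and SafeSubject are hypothetical names, and the sketch only catches RuntimeException (what to do with Errors is a separate policy decision).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical observer contract with a typed notification object.
interface Listener<X> {
    void update(X arg);
}

// Hypothetical subject: one failing listener must not starve the others.
class SafeSubject<X> {
    private final List<Listener<X>> listeners = new CopyOnWriteArrayList<>();
    private final List<Throwable> failures = new CopyOnWriteArrayList<>();

    void addListener(Listener<X> listener) {
        listeners.add(listener);
    }

    void notifyListeners(X arg) {
        for (Listener<X> listener : listeners) {
            try {
                listener.update(arg);
            } catch (RuntimeException e) {
                failures.add(e); // record the failure and keep notifying
            }
        }
    }

    List<Throwable> failures() {
        return new ArrayList<>(failures);
    }

    public static void main(String[] args) {
        SafeSubject<String> subject = new SafeSubject<>();
        subject.addListener(s -> { throw new IllegalStateException("boom"); });
        subject.addListener(s -> System.out.println("received " + s));
        subject.notifyListeners("changed");
        System.out.println("failures recorded: " + subject.failures().size());
    }
}
```

Whether failures are logged, collected or reported back to the faulty observer is exactly the kind of contract detail that a pattern description should spell out.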
Let's have a look at a use of this simplified pattern ... this will lead us to further versions of the Observer:
Suppose we are designing a GUI where there is a Component that helps the user to choose an Item and another Component which is able to show the details of the Item.
public class ShowItem extends Panel {
   .....
   public void show(Item choosen) { ....
}
Then the ItemChooser Component:
public class ItemChooser extends Panel {
   ....
   public ItemChooser(ItemCatalog catalog, ItemSelector selector) { ...
}
When the user picks up an Item with the help of the ItemChooser then another code will be notified. The reference to this code (ItemSelector) should be an interface, then
public class Pilot extends Panel implements ItemSelector {
   // this component has a Controller role
   // create subComponents of type ItemChooser and ShowItem
   private ShowItem presenter;
   private ItemChooser chooser ;

   public Pilot(ItemCatalog catalog) {
       chooser = new ItemChooser(catalog, this) ;
       presenter = new ShowItem() ;
       ...
   }

   // the contract of ItemSelector
   public void select(Item choice) {
       presenter.show(choice) ;
   }
   ...
}
Here we could not design an ItemChooser(ItemCatalog, Pilot), otherwise we would get a strong coupling between the Pilot and ItemChooser classes.
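The decoupling can be tried out without any GUI. The following is a minimal sketch in which the Panel superclass and the catalog are dropped; the userPicked method (which simulates the user's click), the Item fields and the accessors on Pilot are assumptions added for the sake of a runnable example.

```java
// Hypothetical, GUI-free versions of the classes discussed above.
class Item {
    final String name;

    Item(String name) {
        this.name = name;
    }
}

// The call-back contract: ItemChooser only knows this interface.
interface ItemSelector {
    void select(Item choice);
}

class ItemChooser {
    private final ItemSelector selector;

    ItemChooser(ItemSelector selector) {
        this.selector = selector;
    }

    // Simulates the GUI event "the user picked an item".
    void userPicked(Item item) {
        selector.select(item);
    }
}

class Pilot implements ItemSelector {
    private final ItemChooser chooser;
    private Item shown; // stands in for the ShowItem presenter

    Pilot() {
        chooser = new ItemChooser(this);
    }

    @Override
    public void select(Item choice) {
        shown = choice;
    }

    ItemChooser chooser() {
        return chooser;
    }

    Item shown() {
        return shown;
    }

    public static void main(String[] args) {
        Pilot pilot = new Pilot();
        pilot.chooser().userPicked(new Item("lamp"));
        System.out.println("shown: " + pilot.shown().name);
    }
}
```

ItemChooser compiles without any reference to Pilot, so it can be reused with any other ItemSelector.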
This said the ItemSelector definition is slightly contrived: can we make it more general, more "pattern like"? This will lead us to reconsider the more general Observer pattern.
More to come ....

Those elusive design patterns (part 3)

(continuation of part 2)
In our previous example I thought that the definition of ItemSelector was slightly contrived. Could we make it more general?

public interface Selector<X> {
 public void select(X select) ;
}

In that case the ItemChooser could look like :
public class ItemChooser extends Panel {
 ....
 public ItemChooser(ItemCatalog catalog, Selector<Item> selector) { ...

We could even try more parameterised code ... but this is beside the point :
 public <X extends Item> ItemChooser(Catalog<X> catalog, Selector<? super X> selector) { ...

So this may lead to consider parameterised code for the Observer pattern :
public interface Observer<X> {
 public void update(X arg) ;
}

Then one could create either parameterised Subject :
 public void notifyObservers(X changed) ;

or "ad hoc" subjects pushing different data types :
 public void addThingObserver(Observer<Thing> obs)
 public void addItemObserver (Observer<Item> obs)

It's a choice: by controlling more closely the type of the notification object we also lose generality.

But let's go back to our callback example. In the Pilot code :
public class Pilot extends  Panel implements Selector<Item> {

We still publicly expose some internal behaviour of Pilot, namely that it responds to a choice made by an inner component. We could hide this by creating an inner class implementing the Selector contract (or many inner classes implementing different Selector contracts), but the closure features enable something which is closer to the callback notion:

public class Pilot {
   private ShowItem presenter= new ShowItem() ;
   private ItemChooser chooser ;
   ....
   public Pilot (Catalog <? extends Item> cat) {
       chooser = new ItemChooser(cat, {Item item => presenter.show(item); });
   }

and ItemChooser having :
 public ItemChooser(Catalog <? extends Item> catalog,
    {Item => void} action ) { ...

or some parameterised variant of it. Note that the decoupling is still better since the code which has the initiative of the call does not even need to know the name/purpose of the called method.

Now how about a Subject defined thus:
public class Dispatcher<X> {
   private ArrayList<{ X=> void}> actions = new ArrayList<{X=>void}>() ;
  
   public void addAction({X => void} action) {
       actions.add(action);
   }
  
   public void fire(X event) {
       for( {X=> void } func : actions) {
           func.invoke(event);
       }
   }
}

A funny twist could be added by making sure that every Observer code gets a different object as the result of the event! This could be done through a Dispatcher<{=>Thing}>, and in that case it's up to each observer to call the method that generates each distinct object (another example: Dispatcher<{=>Iterator<Double>}>).

It could be also through a method that guarantees that the generating method is called by the subject before any transfer to the observer:
   public void mapFire( {=>X } generator) {
       for( {X=> void } func : actions) {
           func.invoke(generator.invoke());
       }
   }
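As a minimal sketch, assuming Java 8 or later, the Dispatcher and its mapFire variant can be written with the standard functional interfaces: Consumer plays the role of {X => void} and Supplier the role of {=> X}. The wildcards are an addition of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

class Dispatcher<X> {
    private final List<Consumer<? super X>> actions = new ArrayList<>();

    void addAction(Consumer<? super X> action) {
        actions.add(action);
    }

    // Every action receives the same event object.
    void fire(X event) {
        for (Consumer<? super X> action : actions) {
            action.accept(event);
        }
    }

    // The generator is called once per action, so each action gets its own object.
    void mapFire(Supplier<? extends X> generator) {
        for (Consumer<? super X> action : actions) {
            action.accept(generator.get());
        }
    }

    public static void main(String[] args) {
        Dispatcher<StringBuilder> dispatcher = new Dispatcher<>();
        dispatcher.addAction(sb -> System.out.println("first got " + System.identityHashCode(sb)));
        dispatcher.addAction(sb -> System.out.println("second got " + System.identityHashCode(sb)));
        dispatcher.fire(new StringBuilder("shared")); // same object twice
        dispatcher.mapFire(StringBuilder::new);       // a fresh object each time
    }
}
```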

conclusions

  • Revisiting the pros and cons of a pattern, browsing through the possible options and weighing the implementation choices will help both to match the implementation to your needs AND to reconsider your analysis in the light shed by the pattern's options. This is a two-way process!

  • Compare the initial UML schema and the latest codes for Dispatcher. This is to illustrate a principle: a design is not independent of the implementation language. Though we may try to design at a higher level of abstraction (and not deal with those pesky details of implementation) the paradigms of the implementation are still reflected at a higher level. You just can't have strategic thinking if you do not have some feedback from the field!
    (note : I even suspect some design patterns to be inherently implementable with interpreted languages -primary suspect: Visitor-)


(feb. 26 2008).

Wednesday, 12 February 2048

Recreational lab: recreate your command pattern

Going on vacation tomorrow so here is a lab:
Let's suppose you have an application where two J.V.M. cooperate across the wire (using some serialization).
Now consider things such as :
interface XY_R<X,Y,R> extends {X, Y => R} , Serializable {}
Some code using this (oos is an ObjectOutputStream and the code is in a static main) :
        XY_R<Double,Double, Double> func = { Double x, Double y => x+y } ;
        oos.writeObject(func);

        Double db = new Double(3.14);
        XY_R<Double,Double, Double> func2 = { Double x, Double y => x+y*db } ;
        oos.writeObject(func2);

  • Experiment with more complex closures (embarking instance data)

  • Rewrite a command pattern with that (so one J.V.M can send a command to be executed by the other J.V.M) .

Have fun !
....
.....
Sad note: though this looks fun, deployment problems are tricky (unless you download dynamically the anonymous inner classes from the JVM which initiates ) ... so it's not such a good idea after all!
....
.....
Plus: since generics are a compile-time thing it does not make much sense here in a dynamic context so better stick to interface MyCommand extends {LocalContext => Result } , Serializable {}
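For readers who want to try the lab today: the closure syntax above never shipped, but in Java 8 and later a lambda is serializable when its target interface extends Serializable. The following minimal sketch makes that assumption; it round-trips a lambda through a byte array inside a single JVM, so it says nothing about the deployment problems mentioned above. SerializableLambdaDemo, scaledSum and roundTrip are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.BiFunction;

// Counterpart of the XY_R interface above, built on BiFunction.
interface XY_R<X, Y, R> extends BiFunction<X, Y, R>, Serializable {}

class SerializableLambdaDemo {
    // The lambda captures 'factor', which is serialized along with it.
    static XY_R<Double, Double, Double> scaledSum(double factor) {
        return (x, y) -> x + y * factor;
    }

    // Writes the object to a byte array and reads it back (same JVM only).
    @SuppressWarnings("unchecked")
    static <T> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
                oos.writeObject(value);
            }
            try (ObjectInputStream ois = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) ois.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        XY_R<Double, Double, Double> func = scaledSum(3.14);
        XY_R<Double, Double, Double> copy = roundTrip(func);
        System.out.println(copy.apply(1.0, 2.0));
    }
}
```

The receiving side still needs the class that defines the lambda on its classpath, which is the deployment problem in another guise.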

.... I need vacations ....

Tuesday, 4 February 2048

Code speed-reading

When dealing with a lot of code from other programmers you are obliged to learn to "speed read".

As with any speed reading technique you need to learn to quickly spot structures and anomalies in code which is supposed to have been compiled and so is syntactically correct. The first step is to refuse to read code which is not properly indented and/or does not follow the "etiquette" of the language (such as the standard naming conventions in Java).

This said, not all programming languages are created equal when it comes to ease of quick-reading. Designers of new languages have so many things to do that I wonder whether the ergonomics of the syntax is a top priority and is based on behavioural studies. But when a new syntactic feature is proposed there are numerous comments and wars of religion on this very problem: is this syntax felt to be "programmer friendly"?

This looks like a very subjective topic; everyone is influenced by their handling of other programming languages and by "gut feelings". This is typically a DESIGN problem: we know there is something to do but the constraints appear fuzzy.

Since I am not presently trying to write a strict PhD thesis on the subject I will try a more "impressionist" approach by putting forward some considerations , subjective judgments and half-baked ideas (and hope more rationality could emerge from ensuing comments).

First remember that some "barbaric" syntactic features become "idioms" everybody can read once we are accustomed to them : remember that one in C ?

  while(*ptdest++ = *ptorigin++) ;

Then remember that the devil is in the details: for instance in Java it is a really bad thing that the same notation (+ * / - % ) is used for arithmetic operations whether the operation is on integer or floating point types. You may think it is a minor glitch but I tested it thoroughly with hundreds of students: after warning them I have them perform a lab with a simple calculation ... 98% of them get it wrong and write code that is not going to work properly! So here the conclusions are not that intuitive.
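A minimal illustration of the trap (this is not the actual lab, which is not reproduced here; DivisionDemo and its methods are hypothetical names): the same "/" silently means integer division when both operands are integers, even if the result is stored in a double.

```java
class DivisionDemo {
    // Both operands are int, so "/" truncates before the result is widened to double.
    static double averageWrong(int sum, int count) {
        return sum / count;
    }

    // Casting one operand first makes "/" a floating point division.
    static double averageRight(int sum, int count) {
        return (double) sum / count;
    }

    public static void main(String[] args) {
        System.out.println(averageWrong(7, 2));
        System.out.println(averageRight(7, 2));
    }
}
```

Nothing on the line "return sum / count;" hints at the truncation: the reader has to look up the declared types of both operands.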

Let's try to describe some problems that slow my code reading:
  • Having a symbol used both in a parenthesis-like fashion and as a "single occurrence": in Java we use "<" as an operator and in expressions like ArrayList<Thing>. Hey, there is not a single case of syntactic ambiguity! Yes, but I am talking about the quick eye movements that scan the code.

Not convinced? let's try something else:
  • Having keywords or symbols with a different meaning according to the context? Primary culprits: < ? extends Iterable> - here I feel bad about extends but not about "?" how is this possible?- Same for < ? extends Thing & Closeable> : I feel bad for the "&" not for the "?".

Do you feel similar problems? They may be just "similar" not "the same" so I would be curious for a real scientific screening of quick code reading.

How about the wonderful things the code editor of your IDE will do for you?
Well, the problem is that coloring keywords or types is not the way to go. A horror museum of code coloring HERE . The basic rules of visual design should stop programmers from using flickering effects when showing code.

At first glance we need to get the structure of the code: where is this variable, method, constructor defined? Where are we in the code? Those structures that help us navigate are somewhat different from the syntactic features: too bad!

Why don't we get languages where keywords, symbols, different flavors of variables (such as local or members), different methods and named blocks of code could be spotted at once because their definition and their use have distinct syntactic features? In a language where a keyword could never be mistaken with a variable name you are free to augment the language with any new keyword!

Another point: we still use text-based languages. I am not advocating a new APL but a distinction between the source code that is compiled and the source code that is handled by the programmer. Why not imagine "three dimensional" languages with different clear views handled by an I.D.E. that could rely on a more obscure two dimensional text formalism (not handled by the programmer)?

Easier said than done: syntax-guided editors look like a good idea but have failed to deliver (our mind likes to wander and hates being too much guided by a hierarchy). .... We need something which should be both loose (where you can freely type as your thoughts flow) and structurable on demand : programming objects can be handled globally, annotated with anything we need (including formalisms alien to the language)... A simple example: when writing Java code with Netbeans I would gladly like to have different colours in the background to mark code executed by different threads!

Conclusion: languages rely on complex studies but the ergonomics of the code should be a subject in itself. Thanks for any hint or link to studies on this.
(March 14 2008)


suggestion about netbeans

(added March 26 2008)
Since there is a contest about netbeans HERE I will suggest some enhancements to the code editor

A programmer should be able to add his own formatting to a code to help better code-reading.
At low level this could be implemented with special comments that could be kept in the code.

Suppose you create a syntax more or less like this:
// special comments at the beginning of the source define tags
/*#def DownloadThread BgColor(10,34,120) Visible*/
// means that the tag will be used to delimit a zone with a background Color
// and that the tag will be shown as a special comment
.... //code


// use
/*#DownloadThread*/
   // code of this Thread


/*#/DownloadThread*/

// other codes

Again the idea is to let the programmer deal with more formatting than the automatic formatting of the editor.

Saturday, 1 February 2048

May happen! ..... So what? (over-engineering qualms)

Common wisdom amongst software developers: if something bad may happen then it will happen! So if your design is left with gaping holes be sure someone will fall into them! But if you happen to know about those "holes" there is another nagging question: is it possible or economically sound to fill in those gaps?

That happened to me a long time ago: I implemented a SQL interpreter and realised that some properties of my code might cause it to hog the runtime, or even crash it. As I was worried by that I talked with other developers and one came up with this startling assertion: those conditions are simply not going to be met in "real life"!

We were stuck in a dilemma: we had no proof that it could not happen, but trying to fill in the gaps would cost a lot in terms of work and performance. So we decided to hide the gaps and wait: the intuition that nothing was going to happen there proved to be right in practice during the whole life of the product.

This example could barely be used for other developpers: it is an anguishing situation to know that something MAY be wrong! This said overanguished developpers may also be fighting windmills: this may happen ... but will it happen? The best thing to do is to prove that it could happen or not... but are we able to provide this proof?

Happily it is sometimes possible to question defensive programming with arguments (provided we do not stick to the rules just to relieve our anguishes).

Though this is a far from perfect illustration I would like jump at the opportunity to emphasize the importance of questionning engineering decisions that could stem from sheer habits.

So again java code:
More or less a typical example about the Singleton pattern :
public class ASingleton {
 // a private constructor
 private ASingleton() { .....}
 // a static instance (could be initialised lazily or in a static block)
 private static ASingleton INSTANCE = ..... ;
 // factory method
 public static ASingleton getInstance() {
  // returns INSTANCE one way or another
 }
}

A rule of thumb is that if ASingleton is Serializable then one should provide :

 private Object readResolve() {
  // returns INSTANCE one way or another
 }

This last code should be discussed: the idea of readResolve is that, without it, reading an instance from an ObjectInputStream may leave you with several different instances of the Singleton! So a good defensive practice is to make sure there is only one instance of the singleton ....
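
To make the concern concrete, here is a minimal sketch of a serialization round trip. The class names (ReadResolveDemo, Plain, Guarded) and the roundTrip helper are hypothetical, introduced only for illustration; Plain is a singleton without readResolve, Guarded is the same singleton with it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class ReadResolveDemo {

    // Hypothetical singleton WITHOUT readResolve.
    static final class Plain implements Serializable {
        private static final long serialVersionUID = 1L;
        private static final Plain INSTANCE = new Plain();
        private Plain() {}
        static Plain getInstance() { return INSTANCE; }
    }

    // Same hypothetical singleton WITH readResolve.
    static final class Guarded implements Serializable {
        private static final long serialVersionUID = 1L;
        private static final Guarded INSTANCE = new Guarded();
        private Guarded() {}
        static Guarded getInstance() { return INSTANCE; }
        // Replaces the freshly deserialized object with the existing instance.
        private Object readResolve() { return INSTANCE; }
    }

    // Writes the object to a byte array and reads it back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Without readResolve the copy is a second instance.
        System.out.println(roundTrip(Plain.getInstance()) == Plain.getInstance());     // false
        // With readResolve the copy resolves to the one instance.
        System.out.println(roundTrip(Guarded.getInstance()) == Guarded.getInstance()); // true
    }
}
```

So the duplication is real; whether it matters is the question discussed next.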

This may happen .... So what?
This should be discussed and we need to go a bit further than the constraint "one instance":

Point 1 : we may need only one state ... but we do not necessarily need one instance!


How about a singleton written that way:
public class Service {
 // could be from a private inner class
 private static ServiceImplementor INSTANCE = .... ;
 // a constructor!
 public Service() {}
 // every method is delegated to a static service
 public Thing method() { return INSTANCE.method() ; }
 // maybe override some methods from class Object
}

In that case:
  • the client code is not necessarily aware that it is using a singleton (this property is hidden from client code)

  • the uniqueness of the service is nonetheless fulfilled! This, by the way, is coherent with the definition of an object as a device that does something: client code is not supposed to know how it is done.

  • if the client code wants to use the object for its monitor (synchronization, wait/notify) or wants to use the == operator, it is not going to be possible ... hey! wait a minute: the client code is not supposed to know that it is using a Singleton.
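
A runnable variant of the Service idea could look like the sketch below. The names TicketService and nextTicket, and the choice of a counter as the shared state, are assumptions made for the example; the equals/hashCode overrides are one possible reading of "override some methods from class Object".

```java
import java.util.concurrent.atomic.AtomicInteger;

class TicketService {

    // Hypothetical implementor that holds the single shared state.
    private static final class ServiceImplementor {
        private final AtomicInteger counter = new AtomicInteger();
        int nextTicket() { return counter.incrementAndGet(); }
    }

    private static final ServiceImplementor INSTANCE = new ServiceImplementor();

    // A public constructor: client code may create as many objects as it likes.
    public TicketService() {}

    // Every method is delegated to the static implementor.
    public int nextTicket() { return INSTANCE.nextTicket(); }

    // All instances stand for the same service, so they compare as equal.
    @Override public boolean equals(Object other) { return other instanceof TicketService; }
    @Override public int hashCode() { return TicketService.class.hashCode(); }

    public static void main(String[] args) {
        TicketService a = new TicketService();
        TicketService b = new TicketService();
        int first = a.nextTicket();
        int second = b.nextTicket();
        System.out.println(second == first + 1); // true: one state behind two instances
        System.out.println(a == b);              // false: two distinct objects
        System.out.println(a.equals(b));         // true
    }
}
```

Two instances, one state: the client cannot tell (and need not care) that a singleton sits behind the scenes.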

Point 2: most singletons are about a context!

  • Most singletons represent something which is external and bound to a context (a System, a spool, a configuration) and the question is: to what context are we bound? a J.V.M? a ClassLoader? a Thread (ThreadLocal objects)? So it is not that uncommon to get different instances representing the context service (even unintentionally: in the case of ASingleton above you may get different instances in a J.V.M if different ClassLoaders are involved).

  • When representing services bound to a context, those singletons are either not serializable or their instance is a representation of a remote context. That is: if JVM1 is sending an instance to JVM2 then this instance is representing the context of JVM1 on JVM2 (a remote reference to a service). In that case there could be two different references on JVM2: one for the local context and one for a remote context.

    One could argue that something like a printing service should be accessed locally! But in that case the request for the service should be local (method call) and there is no reason to send an instance! (Only possible exceptions are those of point 3).

    Note: the class Service above is not suited for this remote reference passing since it will get its behaviour only from the static context of the class. (But you can design hybrid codes with local and remote data).
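
As an illustration of "one instance per context", here is a small sketch where the context is a Thread. ThreadContext, owner and fetchOnOtherThread are hypothetical names chosen for this example; ThreadLocal.withInitial is the standard JDK call (Java 8 and later).

```java
class ThreadContext {

    // One instance per thread, created lazily on first access from that thread.
    private static final ThreadLocal<ThreadContext> CURRENT =
            ThreadLocal.withInitial(ThreadContext::new);

    // Remember which thread this context was created for.
    private final String owner = Thread.currentThread().getName();

    private ThreadContext() {}

    static ThreadContext get() { return CURRENT.get(); }

    String owner() { return owner; }

    // Helper for the demonstration: asks another thread for "its" instance.
    static ThreadContext fetchOnOtherThread() {
        ThreadContext[] holder = new ThreadContext[1];
        Thread worker = new Thread(() -> holder[0] = ThreadContext.get(), "worker");
        worker.start();
        try {
            worker.join(); // join makes the worker's write visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return holder[0];
    }

    public static void main(String[] args) {
        System.out.println(ThreadContext.get() == ThreadContext.get());                // true
        System.out.println(ThreadContext.get() == ThreadContext.fetchOnOtherThread()); // false
    }
}
```

Within its own context the object behaves like a singleton, yet the J.V.M as a whole holds several instances, and nobody is hurt by that.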

Point 3: singletons as special instances of a service

An example of this is Collections.EMPTY_LIST: a singleton that implements the List services. In fact this looks like an enum with a single value.

Since those could be elements in a structure (an object has a reference to a List which happens to be an EMPTY_LIST) they should be serialised.

Then if you want to use checks such as node == EMPTY_LIST, the uniqueness of the instance is required (and readResolve is used).
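
The sketch below shows such an identity check surviving serialization. Node, isLeaf and roundTrip are hypothetical names; the example assumes that the JDK's empty list resolves back to the shared EMPTY_LIST on deserialization, which is the case in the OpenJDK sources but should be treated as an assumption for other runtimes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Collections;
import java.util.List;

class EmptyListIdentityDemo {

    // Hypothetical structure element that refers to a List of children.
    static final class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        final List<String> children;
        Node(List<String> children) { this.children = children; }
        // Identity check against the special instance.
        boolean isLeaf() { return children == Collections.EMPTY_LIST; }
    }

    // Serializes the node to a byte array and reads it back.
    static Node roundTrip(Node node) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(node);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Node) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Node leaf = new Node(Collections.emptyList());
        Node copy = roundTrip(leaf);
        System.out.println(copy != leaf);  // the Node itself is a new object
        System.out.println(copy.isLeaf()); // the == check still holds after the round trip
    }
}
```

Here readResolve earns its keep: the == test would silently break without it.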

May happen! ..... so what?

In my humble opinion the systematic use of readResolve is over-engineering: we are afraid of instance duplication .... should we be afraid of that?

I am certainly interested in examples of singletons that do not match the points above. They would show that it is hard to prove a general point, and that, when over-engineering is suspected, some decisions have to rely on risk analysis (we do not have an expert at hand every time).

(Again the example is a bit far-fetched: there is no big risk involved here).

A side note: this blog entry was written in response to this blog but the editorial process there failed to register a link to this response ... so ?

About Me

To graduate from my architecture school I wrote a thesis on "fuzzy" methods (that was in 1974). Afterwards I became a software engineer (and later a java evangelist) but this just strengthened my views on methods ... I'll try to share (though it is hard to write precisely on fuzzy topics). ...