Tuesday, 18 February 2048

The hunt for the dangling Iterator (part 1)

Why start a blog on "fuzzy methods" with specialised technical considerations on Java programming?
More on this later! (But if you are not familiar with Java programming, better skip these initial blog entries.)
If you are already familiar with discussions on Iterators, you can also skip this introduction:
An Iterator is an interesting feature: it enables "client code" to pilot a service requested from implementing code. Usually, when you request a service from another class, you have something that looks like:
 Result res = implementor.doIt() ;
Control is transferred to the implementor: once the implementor's code is finished, control is transferred back to the caller's code.
Using an Iterator sets up a different cooperation between implementor and client code. Once you write:
 Iterator iter = implementor.iterator() ;
the client code collaborates with the implementor code through successive calls to hasNext() and next().
Now suppose you have code that wants to retrieve a list of objects from a Catalog. There are different ways to specify the service:
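To make the difference concrete, here is a minimal sketch in plain Java. `CountDown` and `CountDownClient` are hypothetical names invented for the illustration (as is the `produced` counter); the only point is that the client decides when each value is produced, and may stop early.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical implementor: produces start, start-1, ..., 1, one value per call to next().
class CountDown implements Iterable<Integer> {
    private final int start;
    int produced = 0; // how many values the client actually pulled (illustration only)

    CountDown(int start) { this.start = start; }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int current = start;

            @Override
            public boolean hasNext() { return current > 0; }

            @Override
            public Integer next() {
                if (current <= 0) throw new NoSuchElementException();
                produced++;
                return current--;
            }
        };
    }
}

class CountDownClient {
    // The client pilots the iteration: it stops as soon as it sees a value <= limit.
    static int firstAtOrBelow(CountDown source, int limit) {
        Iterator<Integer> iter = source.iterator();
        while (iter.hasNext()) {
            int value = iter.next();
            if (value <= limit) return value;
        }
        return -1; // nothing found
    }
}
```

With a `CountDown` starting at 10 and a limit of 8, only three values are ever computed: the implementor does no more work than the client asks for.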
  
  Thing[] get(request) ;
or
 
  List<Thing> get(request) ;
or
 
  Iterator<Thing> get(request) ;
The choice depends on realistic expectations about the number of Things you will get and how long it will take to retrieve them.
Using an Iterator enables the client code to "pump" results as it needs them. So, for instance, if you inadvertently issue a request that could generate thousands of results, you can give up quickly and, if your chain of request code is well built, it only pumps a limited number of objects into a cache (from a database, for instance).
Once you decide you need an Iterator, another question arises: when does the interaction between the "partner" objects end? For instance: you use a database, and once you have finished pumping results you need to close the interaction with the database.
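A rough sketch of what "pumping a limited number of objects into a cache" could look like. Everything here is an assumption made for the illustration: the `PagedIterator` name, the `(offset, size) -> page` fetch function standing in for a database query, and the `pagesFetched` counter.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.BiFunction;

// Hypothetical lazy iterator: fetches results page by page, only when the client asks.
class PagedIterator<T> implements Iterator<T> {
    private final BiFunction<Integer, Integer, List<T>> fetchPage; // (offset, size) -> page
    private final int pageSize;
    private List<T> page = new ArrayList<>(); // the small "cache" of already fetched items
    private int indexInPage = 0;
    private int offset = 0;
    private boolean exhausted = false;
    int pagesFetched = 0; // exposed for the illustration only

    PagedIterator(BiFunction<Integer, Integer, List<T>> fetchPage, int pageSize) {
        this.fetchPage = fetchPage;
        this.pageSize = pageSize;
    }

    @Override
    public boolean hasNext() {
        if (indexInPage < page.size()) return true; // still something in the cache
        if (exhausted) return false;
        page = fetchPage.apply(offset, pageSize);   // pump the next page
        pagesFetched++;
        offset += page.size();
        indexInPage = 0;
        if (page.size() < pageSize) exhausted = true; // short page: nothing behind it
        return !page.isEmpty();
    }

    @Override
    public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        return page.get(indexInPage++);
    }
}
```

A client that gives up after 15 results out of a possible 1000, with a page size of 10, only ever triggers two fetches.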
This is annoying because the implementing object wants to be sure that the client code explicitly ends the interaction ... but how could it be sure? (programmers are so absent-minded you know...)

END INTRODUCTION

Now the story of my different attempts to solve this problem.
First idea: declare a Closeable Iterator
public interface CloseableIterator<X> extends Iterator<X> {
    public void close() ; // declares no checked exception
}
Note that close() does not throw any exception (this relates to other points when considering Iterators).
So the Catalog<X> could declare:
  public CloseableIterator<X> get(Request<X> req) ;
Now the problem is not solved, because the implementing code cannot be sure that the calling code will actually call close().
Step 2: while investigating the closure proposals for Java 1.7 I started designing this code (see for a lesson on closures):
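For the record, here is what a well-behaved client could look like. This is only a sketch under one loud assumption: the interface also extends java.lang.AutoCloseable, which is not part of the declaration above, so that try-with-resources can be used. `AutoCloseableIterator`, `ListBackedIterator` and `WellBehavedClient` are hypothetical names.

```java
import java.util.Iterator;
import java.util.List;

// Variant of the interface above; extending AutoCloseable is an assumption added here.
interface AutoCloseableIterator<X> extends Iterator<X>, AutoCloseable {
    @Override
    void close(); // narrowed: no checked exception
}

// Hypothetical in-memory implementation that records whether close() was called.
class ListBackedIterator<X> implements AutoCloseableIterator<X> {
    private final Iterator<X> delegate;
    boolean closed = false;

    ListBackedIterator(List<X> items) { this.delegate = items.iterator(); }

    @Override public boolean hasNext() { return !closed && delegate.hasNext(); }
    @Override public X next() { return delegate.next(); }
    @Override public void close() { closed = true; }
}

class WellBehavedClient {
    // Stops at the first match; try-with-resources still closes the iterator.
    static <X> boolean contains(ListBackedIterator<X> iter, X wanted) {
        try (ListBackedIterator<X> it = iter) {
            while (it.hasNext()) {
                if (it.next().equals(wanted)) return true;
            }
            return false;
        }
    }
}
```

Nothing forces a client to write it this way, though, which is exactly the problem: the discipline lives in the calling code, not in the implementing code.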
public interface Iterant<X > {
    public <throws E > void forEach({X => boolean throws E} code) throws E;
    
}
and wrote an implementation that did:
    private CloseableIterator<X> iter ; //initialised by constructor
    private  boolean closed = false ; // should be volatile

    public <throws E> void forEach({X => boolean throws E} f ) throws E {
        try {
            while(iter.hasNext() ) {
                X val = iter.next() ;
                // this will normally be treated differently with break/continue
                // of the proposed "for" modifier
                if (! f.invoke(val)) {
                    break ;
                }
            }
        }finally {
            this.theEnd() ;
        }
    }
    
    protected synchronized void theEnd() { // see the comment later on why "synchronized"
        if(!closed) {
            closed = true ;
            iter.close() ; // normally close should be idempotent
        }
    }
Well, it appeared that the code was a bit too specific to the notion of CloseableIterator and could be abstracted a trifle more:
    private Iterator<X> iter ;
    private {=> void } exit ;

    public IterantWithExit(Iterator<X> it, {=> void} exitCode) {
        if (it == null) { // fail fast rather than at the first hasNext()
            throw new NullPointerException("it") ;
        }
        this.iter = it ;
        this.exit = exitCode ;
    }

    private AtomicBoolean exited = new AtomicBoolean(false) ;
    private AtomicBoolean called = new AtomicBoolean(false) ;

    public <throws E > void forEach({X => boolean throws E} f ) throws E {
         if(called.getAndSet(true)) {
             throw new IllegalStateException("forEach already called");
         }
        try {
            while(iter.hasNext() ) {
                X val = iter.next() ;
                if (! f.invoke(val)) {
                    break ;
                }
            }
        }finally {
            this.theEnd() ;
        }
    }
    
    private  void theEnd() {
        if( ! exited.getAndSet(true)) {
           if(exit != null) exit.invoke() ;
        }
    }
So now we no longer need to create a CloseableIterator, since any code to call on exit can be supplied by the implementing code.
Are we sure that the exit code will always be called? No! The code provided by the client could stall the loop (waiting for a GUI event, for instance) and a "dangling iterator" could be left behind.
OK, we could add this to the code:
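For readers who want to try the idea on a JDK without the proposed closure syntax, here is a rough equivalent in plain Java. It is a sketch, not the code above: `java.util.function.Predicate` and `Runnable` stand in for the `{X => boolean}` and `{=> void}` function types, the class name `PlainIterantWithExit` is made up, and the exception transparency (`throws E`) is simply dropped.

```java
import java.util.Iterator;
import java.util.Objects;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Predicate;

// Plain-Java approximation of IterantWithExit (names and types are assumptions).
class PlainIterantWithExit<X> {
    private final Iterator<X> iter;
    private final Runnable exit; // may be null: then there is nothing to run on exit
    private final AtomicBoolean called = new AtomicBoolean(false);
    private final AtomicBoolean exited = new AtomicBoolean(false);

    PlainIterantWithExit(Iterator<X> iter, Runnable exit) {
        this.iter = Objects.requireNonNull(iter, "iter");
        this.exit = exit;
    }

    // The body returns false to stop the loop, like the closure in the original.
    void forEach(Predicate<? super X> body) {
        if (called.getAndSet(true)) {
            throw new IllegalStateException("forEach already called");
        }
        try {
            while (iter.hasNext()) {
                if (!body.test(iter.next())) {
                    break;
                }
            }
        } finally {
            theEnd(); // runs on normal end, on break, and on exception
        }
    }

    // Runs the exit code at most once, whoever calls it first.
    void theEnd() {
        if (!exited.getAndSet(true) && exit != null) {
            exit.run();
        }
    }
}
```

The client never sees the Iterator, so it cannot forget to close it; the exit code runs once whether the loop ends normally, is stopped by the body, or is interrupted by an exception.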
    protected void finalize() {
        this.theEnd() ;
    } // BTW this explains the code dealing with concurrent access: finalize() may run on another thread
That is: if there is a dangling object, the close operation will be called when the object is reclaimed by the garbage collector.
Is this enough? It depends on the importance of the close code, and on whether you can tolerate that close may not be called when the JVM terminates. If you want to be DAMN SURE it is called (except in case of cataclysmic termination) you may add this code:
protected static final WeakHashMap<IterantWithExit, {=> void}> SURVIVORS = 
            new WeakHashMap<IterantWithExit, {=> void}>();
static {
        Runtime.getRuntime().addShutdownHook(new Thread() {
            public void run() {
                for(IterantWithExit iterEx : SURVIVORS.keySet()) {
                    iterEx.theEnd() ;
                }
            }
        });
}
This is heavy artillery but if you need it ....
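If you do bring out the artillery, note that a bare WeakHashMap is not thread-safe and that the hook runs on its own thread. Here is one possible way to arrange it in plain Java, as a sketch only: `ExitGuard` and `closeSurvivors()` are hypothetical names, `Runnable` stands in for `{=> void}`, and the map values are a dummy `Boolean` so that nothing in the map holds a strong reference back to a key.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical guard object: its exit code runs at most once, at the latest on JVM shutdown.
class ExitGuard {
    // Weak keys: a guard that has been garbage collected simply drops out of the map.
    private static final Map<ExitGuard, Boolean> SURVIVORS =
            Collections.synchronizedMap(new WeakHashMap<ExitGuard, Boolean>());

    static {
        Runtime.getRuntime().addShutdownHook(new Thread(ExitGuard::closeSurvivors));
    }

    private final Runnable exit;
    private final AtomicBoolean exited = new AtomicBoolean(false);

    ExitGuard(Runnable exit) {
        this.exit = exit;
        SURVIVORS.put(this, Boolean.TRUE); // register as a potential survivor
    }

    void theEnd() {
        if (!exited.getAndSet(true)) {
            SURVIVORS.remove(this); // properly ended: no longer a survivor
            exit.run();
        }
    }

    // Called by the shutdown hook; works on a snapshot so theEnd() can modify the map.
    static void closeSurvivors() {
        List<ExitGuard> snapshot;
        synchronized (SURVIVORS) { // required when traversing a synchronizedMap view
            snapshot = new ArrayList<>(SURVIVORS.keySet());
        }
        for (ExitGuard guard : snapshot) {
            guard.theEnd();
        }
    }
}
```

Shutdown hooks are still skipped when the JVM is killed abruptly, so this remains "except in case of cataclysmic termination".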
I was not yet satisfied with this code so during the night this came to my mind:
public interface Capsule<X> {
    <R, throws E> R exec({X => R throws E} code)throws E ;
}
and an implementation looking like:
public class ExitCapsule<X> implements Capsule<X> {
           // SURVIVORS initialisation 
  
    private {=> void } exit ;
    private X val ;
  

    public <R, throws E > R exec({X => R throws E} f ) throws E {
         
        try {
             return f.invoke(val) ;
        }finally {
            this.theEnd() ;
        }
    }
    
    public ExitCapsule(X vl,  {=> void} exitCode) {
        this.val = vl ;
        this.exit = exitCode ;
        SURVIVORS.put(this,exitCode);
    }

    protected void finalize() {
        this.theEnd() ;
    }
    // theEnd() code + volatile test for execution
}
THAT looked like a very abstract thing! (Maybe I was influenced by my desperate attempts at understanding the fpl library of my friend Luc Duponcheel ;-))
In that case the client code is going to handle a Capsule<Iterator<Thing>> and iterate as needed ... bleech ...
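To see what the client ends up writing, here is a rough plain-Java rendering of the capsule. It is only a sketch: `java.util.function.Function` and `Runnable` replace the proposed function types, the names `PlainCapsule`, `PlainExitCapsule` and `CapsuleClient` are made up, and the finalize/SURVIVORS plumbing is left out.

```java
import java.util.Iterator;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Function;

// Plain-Java approximation of Capsule<X> (assumed names, no exception transparency).
interface PlainCapsule<X> {
    <R> R exec(Function<? super X, ? extends R> code);
}

class PlainExitCapsule<X> implements PlainCapsule<X> {
    private final X val;
    private final Runnable exit; // may be null
    private final AtomicBoolean exited = new AtomicBoolean(false);

    PlainExitCapsule(X val, Runnable exit) {
        this.val = val;
        this.exit = exit;
    }

    @Override
    public <R> R exec(Function<? super X, ? extends R> code) {
        try {
            return code.apply(val); // the client only touches val inside this call
        } finally {
            theEnd();
        }
    }

    private void theEnd() {
        if (!exited.getAndSet(true) && exit != null) {
            exit.run();
        }
    }
}

class CapsuleClient {
    // The client receives a capsule around an Iterator and iterates inside exec().
    static int countAll(PlainCapsule<Iterator<String>> capsule) {
        return capsule.<Integer>exec(it -> {
            int n = 0;
            while (it.hasNext()) {
                it.next();
                n++;
            }
            return n;
        });
    }
}
```

The whole loop has to live inside the function passed to exec(), and the capsule can be used only once in any meaningful way, which is part of why it feels clumsy.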
Frankly, the little demon that supervises my sleep did not like this option, so he came up with another solution ....
More to come ....


About Me

To graduate from my architecture school I wrote a thesis on "fuzzy" methods (that was in 1974). Afterwards I became a software engineer (and later a Java evangelist), but this just strengthened my views on methods ... I'll try to share (though it is hard to write precisely on fuzzy topics). ...