Tuesday, 4 February 2048

Code speed-reading

When dealing with a lot of code from other programmers you are obliged to learn to "speed read".

As with any speed-reading technique, you need to learn to quickly spot structures and anomalies in code that is supposed to have been compiled and so is syntactically correct. The first step is to refuse to read code which is not properly indented and/or does not follow the "etiquette" of the language (such as the standard naming conventions in Java).

That said, not all programming languages are created equal when it comes to ease of quick reading. Designers of new languages have so many things to do that I wonder whether the ergonomics of the syntax is a top priority and whether it is based on behavioural studies. Yet when a new syntactic feature is proposed there are numerous comments and wars of religion on this very problem: is this syntax felt to be "programmer friendly"?

This looks like a very subjective topic; everyone is influenced by his experience with other programming languages and by "gut feelings". This is typically a DESIGN problem: we know there is something to do but the constraints appear fuzzy.

Since I am not presently trying to write a strict PhD thesis on the subject, I will try a more "impressionist" approach by putting forward some considerations, subjective judgments and half-baked ideas (and hope more rationality will emerge from the ensuing comments).

First remember that some "barbaric" syntactic features become "idioms" everybody can read once we are accustomed to them: remember this one in C?

  while(*ptdest++ = *ptorigin++) ;

Then remember that the devil is in the details: for instance in Java it is a really bad thing that the same notation (+ * / - %) is used for arithmetic operations whether the operation is on integer or floating-point types. You may think it is a minor glitch but I tested it thoroughly with hundreds of students: after warning them I have them perform a lab with a simple calculation ... 98% of them get it wrong and write code that is not going to work properly! So here the conclusions are not that intuitive.
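To make the glitch concrete, here is a minimal sketch. The average calculation is my own assumption (the lab exercise itself is not described here), and the class and method names are invented for illustration. The two methods differ by a single cast, yet the "/" symbol means something different in each.

```java
class DivisionPitfall {
    // Both operands are int: "/" is integer division and truncates,
    // and only the truncated result is widened to double.
    static double averageTruncated(int sum, int count) {
        return sum / count;
    }

    // One operand is converted to double first: "/" is now floating-point division.
    static double averageExact(int sum, int count) {
        return (double) sum / count;
    }

    public static void main(String[] args) {
        System.out.println(averageTruncated(7, 2)); // prints 3.0
        System.out.println(averageExact(7, 2));     // prints 3.5
    }
}
```

Nothing at the point of the "/" tells a quick reader which of the two operations is performed; you have to look up the declared types of the operands.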

Let's try to describe some problems that slow my code reading:
  • Having a symbol used both in a parenthesis-like fashion and as a "single occurrence": in Java we use "<" as an operator and in expressions like ArrayList<Thing>. "Hey, there is not a single case of syntactic ambiguity!" Yes, but I am talking about the quick eye movements that scan the code.
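A small sketch of what the eye has to cope with: in the hypothetical method below (the names are mine, chosen only for illustration), "<" and ">" appear as paired brackets on some lines and as lone comparison operators on others.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AngleBrackets {
    // Keeps the values strictly between low and high.
    static List<Integer> between(List<Integer> values, int low, int high) {
        List<Integer> result = new ArrayList<Integer>();   // "<" ... ">" as brackets
        for (int i = 0; i < values.size(); i++) {          // "<" as an operator
            int v = values.get(i);
            if (low < v && v < high) {                     // two more lone "<"
                result.add(v);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(between(Arrays.asList(1, 5, 9, 12), 3, 10)); // prints [5, 9]
    }
}
```

The compiler has no trouble here; the point is only that a reader scanning for matching pairs gets false hits.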

Not convinced? Let's try something else:
  • Having keywords or symbols with a different meaning according to the context. Primary culprits: < ? extends Iterable> (here I feel bad about "extends" but not about "?": how is this possible?). Same for "&" in a bound such as <T extends Thing & Closeable> (Java accepts "&" only on a declared type parameter, not after a wildcard): I feel bad about the "&" but not about the "?".
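The context-dependent meanings can be put side by side in one short sketch. Thing and ClosableThing are hypothetical classes, and the method names are mine; only the notations themselves come from the Java language.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical class standing in for the "Thing" of the text.
class Thing {
    final String name;
    Thing(String name) { this.name = name; }
}

// Meaning 1 of "extends": class inheritance.
class ClosableThing extends Thing implements Closeable {
    boolean closed = false;
    ClosableThing(String name) { super(name); }
    @Override public void close() { closed = true; }
}

class BoundsDemo {
    // Meaning 2 of "extends": upper bound of a wildcard.
    // Iterable is an interface, yet the word used is still "extends".
    static int countItems(List<? extends Iterable<String>> groups) {
        int n = 0;
        for (Iterable<String> group : groups) {
            for (String ignored : group) {
                n++;
            }
        }
        return n;
    }

    // "&" joins two bounds on the type parameter T (it is not legal after "?").
    static <T extends Thing & Closeable> String nameThenClose(T t) {
        try {
            t.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return t.name;
    }

    public static void main(String[] args) {
        int mask = 6 & 3;                      // the same "&" as bitwise AND
        System.out.println(mask);              // prints 2
        System.out.println(countItems(Arrays.asList(
                Arrays.asList("a", "b"), Arrays.asList("c")))); // prints 3
        ClosableThing c = new ClosableThing("socket");
        System.out.println(nameThenClose(c));  // prints socket
        System.out.println(c.closed);          // prints true
    }
}
```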

Do you experience similar problems? They may be just "similar", not "the same", so I would be curious to see a real scientific study of quick code reading.

How about the wonderful things the code editor of your IDE will do for you?
Well, the problem is that coloring keywords or types is not the way to go. A horror museum of code coloring is HERE. The basic rules of visual design should stop programmers from using flickering effects when showing code.

At first glance we need to get the structure of the code: where is this variable, method, constructor defined? Where are we in the code? Those structures that help us navigate are somewhat different from the syntactic features: too bad!

Why don't we get languages where keywords, symbols, different flavors of variables (such as locals or members), different methods and named blocks of code could be spotted at once because their definition and their use have distinct syntactic features? In a language where a keyword could never be mistaken for a variable name you are free to augment the language with any new keyword!
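Java itself gives a small illustration of the cost of sharing one lexical space between keywords and names. As far as I know, "var" (added in Java 10) could not be made a true keyword without breaking existing code, so it is only a reserved type name, and the sketch below compiles on Java 10 or later. The class and method names are mine.

```java
class KeywordOrName {
    static int demo() {
        int var = 40;           // "var" as an ordinary variable name
        var answer = var + 2;   // first "var": inferred type; second "var": the variable
        return answer;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 42
    }
}
```

A reader scanning the second line has to decide from position alone which "var" is which.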

Another point: we still use text-based languages. I am not advocating a new APL but a distinction between the source code that is compiled and the source code that is handled by the programmer. Why not imagine "three-dimensional" languages with different clear views handled by an IDE that could rely on a more obscure two-dimensional text formalism (not handled by the programmer)?

Easier said than done: syntax-guided editors look like a good idea but have failed to deliver (our mind likes to wander and hates being guided too much by a hierarchy). We need something that is both loose (where you can freely type as your thoughts flow) and structurable on demand: programming objects can be handled globally and annotated with anything we need (including formalisms alien to the language). A simple example: when writing Java code with NetBeans I would gladly have different background colours to mark code executed by different threads!

Conclusion: languages rely on complex studies, but the ergonomics of code should be a subject in itself. Thanks for any hint or link to studies on this.
(March 14 2008)

Suggestion about NetBeans

(added March 26 2008)
Since there is a contest about NetBeans HERE, I will suggest some enhancements to the code editor.

A programmer should be able to add his own formatting to code to make it easier to read.
At a low level this could be implemented with special comments that are kept in the code.

Suppose you create a syntax more or less like this:
// special comments at the beginning of the source define tags
/*#def DownLoadThread BgColor(10,34,120) Visible*/
// means that the tag will be used to delimit a zone with a background Color
// and that the tag will be shown as a special comment
.... //code

// use
   // code of this Thread


// other codes

Again the idea is to let the programmer deal with more formatting than the automatic formatting of the editor.
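To show that such special comments would be cheap for an editor to recognise, here is a minimal sketch of a parser for the definition line. The grammar is only my guess from the single example above (tag name, BgColor with three numbers, optional Visible), and TagDefinition is a hypothetical class, not part of any NetBeans API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagDefinition {
    // Assumed grammar: /*#def <Name> BgColor(<r>,<g>,<b>) [Visible]*/
    private static final Pattern DEF = Pattern.compile(
            "/\\*#def\\s+(\\w+)\\s+BgColor\\((\\d+)\\s*,\\s*(\\d+)\\s*,\\s*(\\d+)\\)(\\s+Visible)?\\s*\\*/");

    final String name;
    final int red, green, blue;
    final boolean visible;   // true when the tag itself should be shown as a comment

    TagDefinition(String name, int red, int green, int blue, boolean visible) {
        this.name = name;
        this.red = red;
        this.green = green;
        this.blue = blue;
        this.visible = visible;
    }

    // Returns the parsed definition, or null when the line is not a tag definition.
    static TagDefinition parse(String line) {
        Matcher m = DEF.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        return new TagDefinition(
                m.group(1),
                Integer.parseInt(m.group(2)),
                Integer.parseInt(m.group(3)),
                Integer.parseInt(m.group(4)),
                m.group(5) != null);
    }

    public static void main(String[] args) {
        TagDefinition t = parse("/*#def DownLoadThread BgColor(10,34,120) Visible*/");
        System.out.println(t.name + " " + t.red + "," + t.green + "," + t.blue + " " + t.visible);
        // prints DownLoadThread 10,34,120 true
    }
}
```

Because the definitions live in ordinary comments, a compiler or an editor that does not know the convention simply ignores them.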

About Me

To graduate from my school of architecture I wrote a thesis on "fuzzy" methods (that was in 1974). Afterwards I became a software engineer (and later a Java evangelist), but this just strengthened my views on methods ... I'll try to share (though it is hard to write precisely on fuzzy topics). ...