Monday, February 9, 2009

Your design depends on the glue you have...

The Post-it notes is an interesting case where a glue type created a whole set of products and even programming tutorial!!!

In paper Why Functional Programming Matters, John Huges, has interesting observations:

It is now generally accepted that modular design is the key to successful programming, .... However, there is a very important point that is often missed. When writing a modular program to solve a problem, one first divides the problem into subproblems, then solves the sub-problems and combines the solutions. The ways in which one can divide up the original problem depend directly on the ways in which one can glue solutions together. Therefore, to increase ones ability to modularise a problem conceptually, one must provide new kinds of glue in the programming language. Complicated scope rules and provision for separate compilation only help with clerical details; they offer no new conceptual tools decomposing problems.

One can appreciate the importance of glue by an analogy with carpentry. A chair can be made quite easily by making the parts - seat, legs, back etc. - and sticking them together in the right way. But this depends on an ability to make joints and wood glue. Lacking that ability, the only way to make a chair is to carve it in one piece out of a solid block of wood, a much harder task. This example demonstrates both the enormous power of modularisation and the importance of having the right glue.


He says:

To assist modular programming, a language must provide good glue. Functional programming languages provide two new kinds of glue - higher-order functions and lazy evaluation.



On higher-order functions he says:

.....functional languages allow functions which are indivisible in conventional programming languages to be expressed as a combination of parts - a general higher order function and some particular specialising functions. Once defined, such higher order functions allow many operations to be programmed very easily. Whenever a new datatype is defined higher order functions should be written for processing it. This makes manipulating the datatype easy, and also localises knowledge about the details of its representation. The best analogy with conventional programming is with extensible languages - it is as though the programming language can be extended with new control structures whenever desired.



On Lazy evaluation he says:

The other new kind of glue that functional languages provide enables whole programs to be glued together. Recall that a complete functional program is just a function from its input to its output. If f and g are such programs, then (g . f ) is a program which, when applied to its input, computes

g (f input)

The program f computes its output which is used as the input to program g. This might be implemented conventionally by storing the output from f in a temporary file. The problem with this is that the temporary file might occupy so much memory that it is impractical to glue the programs together in this way. Functional languages provide a solution to this problem. The two programs f and g are run together in strict synchronisation. F is only started once g tries to read some input, and only runs for long enough to deliver the output g is trying to read. Then f is suspended and g is run until it tries to read another input. As an added bonus, if g terminates without reading all of f ’s output then f is aborted. F can even be a non-terminating program, producing an infinite amount of output, since it will be terminated forcibly as soon as g is finished. This allows termination conditions to be separated from loop bodies - a powerful modularisation.

Since this method of evaluation runs f as little as possible, it is called “lazy evaluation”. It makes it practical to modularise a program as a generator which constructs a large number of possible answers, and a selector which chooses the appropriate one. While some other systems allow programs to be run together in this manner, only functional languages use lazy evaluation uniformly for every function call, allowing any part of a program to be modularised in this way. Lazy evaluation is perhaps the most powerful tool for modularisation in the functional programmer’s repertoire.



You can read the whole paper: Why Functional Programming Matters

Friday, February 6, 2009

pure, side effect, types, and Computation....

A friend (a purist one!) was arguing that a purely functional programming paradigm is not useful in real life. His point was that a pure functional language means no side effect. Without side effect there is not much a software can do. Simon Peyton Jones, one of the gurus of the Haskell, had an interesting comment in that a computer program that has no side effect is essentially a heater! As it generates heat and no real useful work.

So what is the disconnect? It has to do with a type systems and the scope of your types (essentially what are you typing in your system?)

Lets start from Assembly programming. In an assembly language program any instruction can read/write from/to any address in its space. To understand the code (as in debugging it) you need to go line by line and make sure you understand the side effects in each instruction.

High-level languages improved on this. For instance in a C program (or Java), you know the types of the input and output to/from your function (or method). The key point to realize is that the type of input/output doesn't say anything as to what the method will (or will not) cause when it is invoked. For instance, if a method takes two integer and returns another integer (as in add), you don't know if the invocation of the method would create any side effects on the file system. The best you can do is to look at the program's (or Class) include (or imports). For instance a class that doesn't import the file system classes (directly or indirectly) then its methods would not have any side effects or dependency in the file system.

Then comes Dependency Injection to solve the problem in Object oriented model. Dependency Injection allowed you to abstract out your object's external dependencies into an interface that would get injected at run time into the class.. It is an improvement on the base Object Oriented model. But on a given method invocation, you still didn't know what would happen. At least there is nothing in the compiler that you can count on to ensure the dependencies. So for example, a call to a method1 on a class can influence a subsequent call to method2 of the class. You would not have any idea about such backdoor connections. With good programing practices you can get by. But when you are thinking about testing, reuse or sharing of code this kind of issues always creep in.

Clearly some languages do away with types all together. But here we are talking about cases where (for whatever reason) you want to have a type system. What is interesting is that in languages that do have type system, the types of the input and output is not enough to be able to make sense of the code (as in maintain it, share it, or write unit tests for it, etc).


If instead of typing the input/out you actually type the computation then you can get lot out of your typing system. Take for instance the saying: life is not a destination, it is journey. In C/Java/etc life is a destination. All you care about is the type of the returned data. In typing the computation it is all about the journey. But how is that different?

Let say your kids are going on a summer trip with their school. What is important to you? Do you just care that they reach their destination or are you also concern about the type of activities that they would experience along the way? For instance, you want to know if they serve alcohol in the trip or not. If kids are going on the trip, then may be you want to find alternatives. If adults are going for the spring break, well that may be the thing you are looking for.

The key is that you don't want to forbid or encourage alcohol consumption (side effects) just that you want to make sure the type of the journey (the computation) matches your requirements as in destination AND journey. You want to know all the effects and not just the return value. After all, if you want a type system, why should you just stop at the return values?

In Haskell, the type of the function is not just the return value, it is the type of computation (journey). That is the key concept. Of course the type of the return value would also be included in the type of computation. Furthermore, there are cleaver ways to take a computation and define as abstract type (similar to template programming C++ or Generics in Java) where the actual type of the return value depends on the invocation of the computation (based on other parameters). If there are any side effects it would also show up on your type system. You have full disclosure, or specification. Modeling IO is an interesting example of using type system to fully define computations with side effect. The IO Inside article might be a good start to get more details.

To sum it up, in a pure functional language you don't want to avoid side effects. You just want to know the computational models that do generate side effect and use them accordingly. The beauty of it is that the compiler ensures that a computation would not violate its computational contract.

Knowing the computation's type (as oppose to just its return value) is quite beneficial. It like the dependency injected but at function level. Hence, the type signature can be used to perform automatic testing of the functions. Furthermore, the caller can control the various aspect of the computation. For instance, lets say the computation signature indicates that the computation generates log messages. The caller can call the function with its own logging implementation in the computation, which sounds awfully like Aspects programming.