Code generating code

At work, we are currently experimenting with model driven software development. In the past, we were using the UML2 editor on Eclipse to create models of out persistent objects and later generated code directly from the model. The step to create not only Java code, but also other artifacts (like SQL scripts or code to import the persistent data from XML files) was small, the step to create a more abstract level was a little bit larger. Eclipse makes this somewhat simple because it delivers a infrastructure for modeling, transformation and generating code. The most interesting part seems to me the decision, how the higher-level model should look like.

The me, this decision illustrates the problem with MDSD: if forces you to decide for a model. You then need to do everything in this model, and hope it fits to ones needs. If not, you need to create another, higher-level, model, and transform this into the other one. There is no concept of ad-hoc generation of code. But it’s exactly this what makes generating code so useful - the ability to replace every repetition of code by generating it from a higher level description of it.

MDSD only seems to be useful if the higher-level model is used multiple times, because its definition and all the work around model transformation is too much to use it only a single time (or in a single project). But code generation principles are designed to be simple, so it is much more economic to use them. In a recent example, I used grep and a small shell script to create stubs from about 100 XML files (we have a dynamic lookup mechanism in our software, and these files would otherwise been visible to the end user). This is code generation - albeit I executed it only once. With the proper infrastructure in place, I would have written a small program to determine all the files to be used, and would create all the stubs every time during the build process.

So, that’s why I’m currently trying to build a small code-generation infrastructure for Java. It uses bean shell scripts for filling an (internal) model, which is then used by Freemarker templates to create artifacts (one or more). These artifacts could be files, or just another bean shell scripts, which are then feed back into the generation engine, and the process starts over. I would prefer to eliminate the internal model (so I could have bean shell scripts directly creating other scripts), but since Java code cannot be handled as data, this is not really feasible. But even more important is to add the ability for incremental generation (of the final files). Often I need to add some methods to the generated Java classes, and I don’t want to loose them every time.

So, there is still some work to be done, but I hope that it will prove useful (and I think I learned something new from LISP).

Posted by Hendrik Lipka at 2007-11-14 (Google)
Categories: development work