How will the new multithreading design in version 0.9 change the scripting language landscape?
There are two good reasons why multithreading in scripting languages is a delicate matter (one that many didn't even want to face). The first is that multithreading can break things. In “good multithreading” (multithreading that is allowed to actually exploit the parallel computational power of modern architectures without excessive overhead), there is no way to recover from an error in a thread. A failing thread is a failing application, and that is hard to reconcile with the controlled-execution concept behind scripting language virtual machines.
The second reason is, as the Lua developers point out, that a language where a = 0 is not deterministic cannot be used effectively in multithreading. Some scripting languages make a = 0 deterministic and visible across threads by locking every assignment instruction, and that is a performance killer in many respects. It doesn't only degrade the performance of the script itself; in a concurrent application, it may severely degrade the performance of the host application by forcing unneeded context switches on it.
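To make the cost of the "lock every assignment" approach concrete, here is a minimal sketch in Python (not Falcon). The class LockedGlobals and the function run_scripts are hypothetical names invented for this illustration; they do not describe how any particular interpreter is implemented. The sketch only counts how often a shared lock must be taken when several threads execute a = 0.

```python
import threading


class LockedGlobals:
    """Hypothetical shared variable table that locks on every assignment."""

    def __init__(self):
        self._lock = threading.Lock()
        self._vars = {}
        self.lock_acquisitions = 0  # only modified while holding the lock

    def assign(self, name, value):
        # Every single assignment pays for a lock acquire and release.
        with self._lock:
            self.lock_acquisitions += 1
            self._vars[name] = value

    def read(self, name):
        with self._lock:
            return self._vars[name]


def run_scripts(n_threads, n_assignments):
    """Run the statement a = 0 repeatedly in several threads."""
    shared = LockedGlobals()

    def script():
        for _ in range(n_assignments):
            shared.assign("a", 0)

    threads = [threading.Thread(target=script) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared.lock_acquisitions, shared.read("a")
```

Under these assumptions, four threads doing a thousand assignments each take the shared lock four thousand times, and every one of those acquisitions is a point where one thread may have to wait for another.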
We opted for a pure agent-based threading model. Each thread runs a separate virtual machine, and communication across threads can happen only through specialised data structures. In this way, each virtual machine can run totally unhindered by global synchronisation. It is possible to share raw memory areas via the MemBuf item type, or to send complete objects, built separately in one thread, to another via an inter-thread item queue.
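The shape of such an agent-based model can be sketched as follows. This is Python rather than Falcon, and the names worker and run_agents are assumptions made for the example; Python's standard queue.Queue merely stands in for an inter-thread item queue. Each agent keeps its state private and talks to the outside world only through queues.

```python
import queue
import threading


def worker(inbox, outbox):
    """One agent: its state is private, so it needs no locks of its own."""
    total = 0  # visible to this thread only
    while True:
        message = inbox.get()
        if message is None:  # sentinel value: no more work
            break
        total += message
    outbox.put(total)  # the only way results leave the agent


def run_agents(payloads):
    """Start one agent per payload and collect the results."""
    outbox = queue.Queue()
    agents = []
    for _ in payloads:
        inbox = queue.Queue()
        thread = threading.Thread(target=worker, args=(inbox, outbox))
        thread.start()
        agents.append((thread, inbox))

    # Communication happens only through the queues.
    for (_, inbox), numbers in zip(agents, payloads):
        for n in numbers:
            inbox.put(n)
        inbox.put(None)

    for thread, _ in agents:
        thread.join()
    return sorted(outbox.get() for _ in agents)
```

Calling run_agents([[1, 2, 3], [10, 20]]) returns [6, 30]. The point of the design is that nothing inside worker is ever touched by another thread, so the only synchronisation is the one hidden inside the queue.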
The point is that, in our opinion, multithreading in scripting languages cannot be seen as multithreading in low-level languages, where each operation can be mapped to activities in the underlying parallel CPUs. The idea of “mutex/event”-based parallel programming is to be rejected in super-high-level languages such as scripting languages, as there are too many basic operations involved in even the simplest instruction. Since, in complex applications written even in low-level languages, those primitives are used to build higher-level communication mechanisms, our view is that multithreading in scripting languages should provide exactly those mechanisms, without trying to force scripts to do what they cannot do well, that is, handle low-level synchronization primitives.
When I write a server, I find myself struggling to build complex synchronisation rules and structures on top of those primitives so as to avoid using them directly, and I don't see why we should impose the same struggle on script users. The realm where primitive synchronisation is useful is not a realm where scripting languages should play a direct role: it's where you would want to write a C module to be used from the scripting language anyhow.
In 0.9 we have introduced an inter-thread garbage collector that accounts for objects present in more than one virtual machine. This is already exploited via the sharing of MemBuf instances, but we plan to extend this support to other kinds of objects. For example, it is currently possible to send a copy of a local object to another thread via an item queue (the underlying data, possibly coming from a module or from an embedding application, can actually be shared; it's just the representation each vm has of the object that must be copied). This makes it a bit difficult to cooperate on creating complete objects across threads, and even if this works in terms of agent-based threading, we're planning to use the new inter-thread GC system to be able to send deep items across threads. Since 0.9, it is already possible to create deep data externally (i.e. in the embedding application or in a module) and send it to a vm in a different thread.
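The copy-on-send behaviour described above can be illustrated with a small Python sketch. CopyingQueue and demo_copy_on_send are hypothetical names invented here, and copy.deepcopy stands in for whatever copying a real vm would do; this is not Falcon's item queue. It shows why cooperation on a single object is awkward: after sending, each side has its own version.

```python
import copy
import queue
import threading


class CopyingQueue:
    """Hypothetical item queue that sends a deep copy of every item."""

    def __init__(self):
        self._queue = queue.Queue()

    def send(self, item):
        # The receiver gets its own representation of the object.
        self._queue.put(copy.deepcopy(item))

    def receive(self):
        return self._queue.get()


def demo_copy_on_send():
    channel = CopyingQueue()
    original = {"name": "job", "steps": [1, 2]}
    channel.send(original)

    # The sender keeps working on its own object after sending.
    original["steps"].append(3)

    received = []
    receiver = threading.Thread(
        target=lambda: received.append(channel.receive())
    )
    receiver.start()
    receiver.join()
    return original, received[0]
```

In this sketch the receiving thread still sees steps equal to [1, 2], while the sender's object now holds [1, 2, 3]; the two threads can no longer build one object together without sending it back and forth.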
The only problem left in doing it natively across two different vms is ensuring that the source vm won't be allowed to work on the object, or on any of the data inside it, while the target vm is working on it. Even if this may seem a limitation, it's exactly what the "object monitor" approach to multithreading dictates, and it is perfectly coherent with our view of higher-level parallel abstraction. Version 0.9 also introduces the mechanism of inter-thread broadcasts, with message-oriented programming extended to inter-thread operations. We still have to work that out completely, but that's the reason why we're numbering this release range “0.9”.
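One way to picture the rule that the source must not touch an object while the target works on it is an ownership handle that is invalidated on send. The sketch below is Python, not Falcon; Handle, OwnershipError and demo_transfer are hypothetical names, and the approach is only an assumption about how such a rule could look, not a description of the planned implementation.

```python
import queue

_MOVED = object()  # marker meaning "this handle no longer owns anything"


class OwnershipError(RuntimeError):
    """Raised when a thread uses an object it has already handed over."""


class Handle:
    """Hypothetical owner of a deep object; access always goes through it."""

    def __init__(self, value):
        self._value = value

    def get(self):
        if self._value is _MOVED:
            raise OwnershipError("object was handed to another thread")
        return self._value

    def send(self, channel):
        # Hand the object over without copying, then give up access.
        value = self.get()
        self._value = _MOVED
        channel.put(Handle(value))


def demo_transfer():
    channel = queue.Queue()
    source = Handle({"rows": [1, 2, 3]})
    source.send(channel)

    target = channel.get()  # would normally run in another thread
    target.get()["rows"].append(4)

    try:
        source.get()
        source_blocked = False
    except OwnershipError:
        source_blocked = True
    return target.get()["rows"], source_blocked
```

Note that plain Python cannot stop code that kept a raw reference to the inner data before sending; a vm that controls every access to its items is in a much better position to enforce the rule.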
Finally, as the vm now has native multithreading constructs, we may also drop the need for different vms for different threads, as each thread may just operate on its own local context, while common operations on the vm (such as loading new modules) can be easily protected. Still, we need to consider the possibility of multiple vms living in different threads, as this is a useful model for embedding applications.
How can a software developer get into Falcon development?
Easily. We can divide the support you may give to Falcon into five main areas. I rank them in order of weighted urgency/complexity ratio.
1. Modules. We need to extend the available features of the language, and modules are a good place to start, both because they are relatively simple to write and build and because they put the writer in contact with the vm and item API quite directly. At the moment we don't have a comprehensive module writer's guide, but examples are numerous and well commented, and the APIs of both the vm and the items are extensively documented. A skeleton module is available for download from our "extensions" area on the site, and provides an easy kick-off for new projects. Some of the most wanted modules and bindings are listed here.
2. Applications. We'd welcome a killer application, such as a comprehensive CMS written in Falcon, but even simpler applications are welcome.
3. Extensions and embeddings. As a scripting engine, we welcome people willing to drive their applications with Falcon; for example, the binding with Kross into KDE applications. We have a cute scripting engine binding for XChat, and we'd like to have one for other scriptable applications (other IM systems, editors, music players etc). We also need to extend the existing HTTP server module binding engine and to apply it to more servers. At the moment we only support Apache.
4. Falcon core. Maintaining and extending the core system, the core module and the Feathers is still quite challenging: the 0.9 development branch has just started, and we need to exploit the most advanced existing techniques in memory management and compiler optimisation, or find new ones. We'll introduce at least two more paradigms in this round, logic programming and type contract programming, and there's plenty of work to do on tabular programming. The area is still open, so if you really want to get your hands dirty on the top-level technology in the field, this is the right place and the right time to give it a try.
5. IDE. We need an IDE for development and debugging of Falcon applications. A terribly interesting tool would be an embeddable IDE that applications may fire up internally to manage their own scripts (consider game mod applications, but also specialised data-mining tools). Falcon has quite an open engine, and integrating it directly into the environment should be easy. I put it fifth because an IDE is useless if the language doesn't develop the other four points in the meantime, but having an IDE ready when the other four points are satisfactorily advanced would really be a godsend.
Jumping in is easy: just get the code you want to work on from our SVN (or make a complete installation of Falcon + dev files and install the skeleton module if you want to write your own extension) and do something. Then give us a shout through our newsgroup, mail or IRC, and you're in. Developers may join as contributors and enter the Committee if their contribution is constant and useful.