Blog: Why we need biological models

A notable enthusiasm in contemporary software engineering is the idea that the field has a lot to learn from biology. You see it everywhere, from using neuroscience to get to AI ("reverse engineering the brain") to employing biological motifs like chemotaxis in network design. There is even a computer science journal -- "Bioinspiration and Biomimetics" -- devoted to the theme.

An example of the material in that journal might be a recent article by Craig Tovey of Georgia Tech and Sunil Nakrani of Oxford on using honeybee behavior to model the management of server farms. The point of connection is that both systems have resource allocation issues. A typical up-to-date hosting center uses virtualization software to dice a single large computing infrastructure (mainframe or cluster) into a constantly changing number of virtual servers (servers that exist only in software). Each of these virtual servers simultaneously runs a slightly different configuration of services (like calls to different databases, or various queues). The problem facing the center as a whole is deciding how big a bite each of these virtual servers should be allowed to take out of the pool of common resources at any one time. (This is called the "orchestration problem".) In practice that decision is made centrally, by calculating the costs and benefits of all the servers, collecting them into a single list, and allocating resources accordingly.

Bees have a more decentralized system, in which each unit of resource -- each foraging bee -- gets to decide for itself how its time might be best spent. A bee returning from a flower patch reports to the rest of the hive via a medium called a waggle dance. The directions to the patch are encoded in the patterns of movements, while the intensity of the dance reflects the quality of the source. Unemployed foragers looking for a source to exploit survey the range of dances, picking one that seems to promise a good return on investment. When that bee returns, she delivers her own report, her own dance, standardized against the mean intensity of the dancing observed before she left the hive. (If there hadn't been much dancing when she left, she gets more excited about a given find than if everyone had been boasting about their own harvests.)
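For readers who think better in code, here is a minimal sketch of that reporting rule. Everything in it is an illustrative assumption rather than a model taken from the bee literature: the function names, the use of a simple ratio to "standardize" a dance against the hive's mean, the floor that keeps a silent hive from causing a division by zero, and the choice to have an unemployed forager simply follow the strongest dance.

```python
def dance_intensity(patch_quality, mean_observed_intensity, floor=1.0):
    """Scale a forager's report by how much dancing she saw before leaving.

    Assumption: "standardized against the mean" is modeled as a plain ratio.
    The floor (hypothetical) avoids dividing by zero when the hive is quiet.
    """
    baseline = max(mean_observed_intensity, floor)
    return patch_quality / baseline


def pick_dance(dances):
    """Return the patch whose dance is most intense.

    Assumption: an unemployed forager follows the single strongest dance;
    a real forager's choice is presumably noisier than this.
    """
    return max(dances, key=dances.get)


# The same find produces a livelier dance in a quiet hive than in a busy one.
quiet_hive = dance_intensity(patch_quality=10.0, mean_observed_intensity=2.0)
busy_hive = dance_intensity(patch_quality=10.0, mean_observed_intensity=5.0)
chosen = pick_dance({"clover": busy_hive, "lavender": quiet_hive})
```

The point of the ratio is only that no bee needs a global view: each report is calibrated against whatever that one bee happened to see.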

In both cases the loads in question fluctuate wildly (in the neighborhood of 1:100) and in both cases these fluctuations have just enough continuity to them to make adapting to them profitable. The challenge therefore is not in figuring out the right average response (easy in either case) but in locating spikes in demand, mobilizing and dispatching resources fast enough to satisfy that spike, and then withdrawing those resources when demand goes away. The basic problem might best be understood as a problem of rapid learning, a theme that arises almost everywhere in contemporary computer science.

Hives execute on this issue impressively: according to the website of the nation's premier bee biologist, Tom Seeley of Cornell, the distribution of bees among nectar sources is close to the theoretical ideal, i.e., the distribution you would see if bees were omniscient. Server farms have lots of room for improvement. So a few months ago Tovey and Nakrani were inspired to write a bee-like internet hosting orchestration algorithm, in which each server published information about its own workload. Underemployed servers surveyed the information published by all the employed servers, and migrated to the load with the greatest revenue potential. The bee algorithm improved hosting performance significantly -- by about 25%.
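The migration step described above can be sketched in a few lines. This is not Tovey and Nakrani's algorithm, only a toy illustration of the idea as summarized here. The names `revenue_potential` and `migrate_idle_servers` are hypothetical, and so is the assumption that a load's potential shrinks as more servers pile onto it (pending revenue divided by the number of servers that would then be sharing it).

```python
def revenue_potential(pending_revenue, servers_assigned):
    """Revenue one more server could expect from a load.

    Assumption: pending revenue is split evenly among the servers on a load,
    counting the newcomer.
    """
    return pending_revenue / (servers_assigned + 1)


def migrate_idle_servers(assignment, pending_revenue):
    """Send each idle server to the load with the greatest revenue potential.

    assignment: dict mapping server name -> load name, or None if idle.
    pending_revenue: dict mapping load name -> revenue waiting to be earned
    (standing in for the workload information each server publishes).
    Returns a new assignment; the input dicts are left untouched.
    """
    result = dict(assignment)
    if not pending_revenue:
        return result

    # Count the servers already working on each advertised load.
    counts = {load: 0 for load in pending_revenue}
    for load in result.values():
        if load is not None:
            counts[load] = counts.get(load, 0) + 1

    # Visit idle servers in name order so the outcome is deterministic.
    for server in sorted(result):
        if result[server] is None:
            best = max(
                pending_revenue,
                key=lambda load: revenue_potential(pending_revenue[load], counts[load]),
            )
            result[server] = best
            counts[best] += 1
    return result


before = {"s1": "web", "s2": None, "s3": None}
after = migrate_idle_servers(before, {"web": 100.0, "batch": 90.0})
```

In this example the first idle server goes to the unattended "batch" load and the second goes to "web", because each migration lowers the potential of the load it joins. No central list is ever sorted; each server decides from the published numbers alone.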

It is interesting to ask why these computer scientists needed to consult the biological model at all. Why didn't one or both of them just solve the problem the way most engineers solve most problems, by thinking carefully about the constraints in the context of the tools at hand?

It is possible that there is a category of problem solving that we humans just can't think about very well, solutions that our brains are just not well adapted to. For these domains we need concrete demonstrations, the way a kindergarten student needs to understand addition in terms of "one egg and one egg" instead of the more abstract "1+1". One such domain might be the task of writing and debugging programs in which solutions emerge from the interaction of simple primitives over time and/or from dividing problems into large numbers of pieces, sending each piece to its own processor, and then combining the outputs of these processors. Both techniques are called parallel computing.

Parallel computing has been an important thread in the theory of computer science for decades. In theory its advantages appear to be considerable -- old-timers might remember Thinking Machines' famous experiment in what it called Connectionism -- but so far single-agent, uniprocessor computing has prevailed. Humans don't seem to be able to do parallel programming very well. Our minds seem to have evolved to follow individual agents over time; we are just not that good at constructing narratives about multiple agents interacting simultaneously.

Fortunately for us, biology does almost everything through one or another species of parallel processing, from genetic networks to neural networks to networks of protein signalling. It is therefore in a position to explain parallel solutions to us one step at a time, in the most elementary fashion. (Once we get the idea we slap our foreheads and move on, doing what we want with it.)

This raises the question of why nature selected every other computational organ out there to work in parallel while it forced our minds -- our conscious thinking -- to be confined to linear uniprocessing. Did nature have a reason for making us stupider than it is?

Perhaps so.
