State Machines and metadata

by Vladimir Zlatanov - vlado at dikini+net

To explain the possible impact of bindapi on node creation and behaviour, let's look at a node as a state machine. A rather general state machine.

take 1 - basic model

state machine algorithm
 react_on_signal(X):
    foreach(action) {
      if guards(X) then {
        action(X)
      }
    }

 guards(X):
    $result = true
    foreach(guard) {
      $result &&= guard(X,action)
    }
    return $result
The code above assumes 'magic' scopes: during execution, each guard magically knows which particular action it guards.

A state is a bag full of things, a set. It can be identified only by the things contained in it. The state machine, on the other hand, adds to the mix the possible actions defined for that state. The actions can happen only if allowed by the appropriate guards, which are condition checkers. Pseudo-code for this general state machine algorithm is outlined in the inlay. Let's follow it through. The world issues a signal, let's say 'view'. For each action, the SM checks whether it is allowed in this state for this signal and, if so, executes it.
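The walk-through above can be sketched in runnable form. This is only a minimal Python illustration, not the author's implementation: the names `StateMachine`, `add_action`, `is_view`, `is_published` and `show` are made up for the example, and the 'magic' scope is replaced by storing each action together with its own guards.

```python
class StateMachine:
    """Take 1: a bag of things plus a list of guarded actions."""

    def __init__(self, state):
        self.state = set(state)   # the state is just a set of things
        self.actions = []         # (action, guards) pairs; replaces the 'magic' scope

    def add_action(self, action, guards):
        # Hypothetical helper: register an action with the guards that protect it.
        self.actions.append((action, list(guards)))

    def react_on_signal(self, signal):
        fired = []
        for action, guards in self.actions:
            # The action runs only if every one of its guards allows it.
            if all(guard(signal, self.state) for guard in guards):
                action(signal, self.state)
                fired.append(action.__name__)
        return fired


# Illustrative guards and action (assumptions, not Drupal functions).
def is_view(signal, state):
    return signal == "view"


def is_published(signal, state):
    return "published" in state


def show(signal, state):
    state.add("shown")


sm = StateMachine({"published"})
sm.add_action(show, [is_view, is_published])
print(sm.react_on_signal("view"))   # ['show']
print(sm.react_on_signal("edit"))   # []
```

Pairing each action with its guards is one possible design choice; a registry keyed by action name would work just as well.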

In theory, such state machines should be able to compute anything computable; they are Turing complete, for the CS types and the theoreticians. If you are interested in the theory, search for Abstract State Machines and/or Gurevich's evolving algebras. In practice, using this model is a bit unsavoury. It can quickly become a mess. We need some extra syntactic sugar to sweeten this up and simplify the usage. We also need some practical means to make all of this UI friendly.

take 2 - blending things and actions

If we generalise the state so that the things in it know how to react to external events, and we add labels for those things, we arrive at a new but equivalent model.

state machine algorithm - take 2
 react_on_signal(X):
    foreach(thing) {
      foreach(action) {
        if guards(X) then {
          action(X)
        }
      }
    }

 guards(X):
    $result = true
    foreach(guard) {
      $result &&= guard(X)
    }
    return $result
 

The state stores things, which consist of some data and guarded actions. When an event reaches the state machine, the appropriate actions are fired. In terms of execution results, nothing really changes. It's the internal structure that differs. We use micro-management to reduce the number of things we have to take into account when defining a state.

So what is a thing, and what can we do with it? A thing is the state from take 1, at least the way it was described until now. It knows what is in it. It knows how to react to events. It is akin to a class in OO terminology. We can take a thing from 'the great outside' and add it to the state. We can remove a thing from a state and throw it to 'the great outside', or the big bucket in the sky. A thing on its own is immutable: as a structure, it can't be changed by anything else.
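To make take 2 more concrete, here is a rough Python sketch of things that carry their own data and guarded actions. All names (`Thing`, `GuardedAction`, `State`, the `title` thing and its `render` action) are hypothetical; immutability is approximated with frozen dataclasses, which is only one way to get that property.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Callable, Tuple


@dataclass(frozen=True)
class GuardedAction:
    name: str
    guards: Tuple[Callable[[str, "Thing"], bool], ...]
    run: Callable[[str, "Thing"], str]


@dataclass(frozen=True)
class Thing:
    label: str                               # label used to find the thing in the state
    data: str                                # what the thing contains
    actions: Tuple[GuardedAction, ...] = ()  # how it reacts to events


class State:
    """Take 2: the state only stores things; the things know how to react."""

    def __init__(self):
        self.things = {}

    def add(self, thing):
        # Take a thing from 'the great outside' and add it to the state.
        self.things[thing.label] = thing

    def remove(self, label):
        # Throw a thing back to 'the great outside'.
        return self.things.pop(label)

    def react_on_signal(self, signal):
        results = []
        for thing in list(self.things.values()):
            for action in thing.actions:
                if all(guard(signal, thing) for guard in action.guards):
                    results.append(action.run(signal, thing))
        return results


title = Thing(
    label="title",
    data="Hello",
    actions=(
        GuardedAction(
            name="render",
            guards=(lambda signal, thing: signal == "view",),
            run=lambda signal, thing: "<h1>" + thing.data + "</h1>",
        ),
    ),
)

state = State()
state.add(title)
print(state.react_on_signal("view"))   # ['<h1>Hello</h1>']
print(state.react_on_signal("edit"))   # []

try:
    title.data = "changed"             # a thing cannot be changed in place
except FrozenInstanceError:
    print("things are immutable")

state.remove("title")
print(state.react_on_signal("view"))   # []
```

Changing a thing then means removing the old one and adding a new one, which matches the add and remove operations described above.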

drupal nodes are state machines

If you compare Drupal nodes with the take 2 state machine, you will notice similarities. Both contain things; Drupal calls them fields. Both react to events. Both guard their actions. OK, to be precise, the Drupal node types are quite tightly coded, so such a clear separation may not be possible, but the bird's-eye view should be correct. Node api is equivalent to the 'add from the outside' type of action described earlier. Drupal's node collection as a whole can be viewed as a mother state machine, containing things, which are nodes.

This kind of abstraction is helpful, at least for me, when trying to work out how to use metadata in Drupal and what the consequences of implementing it are.

metadata Drupal

What is metadata? What do we want to do with it? How can we do that? What is the impact on speed? What are the benefits for the users?

Metadata. Data about data. In different situations within Drupal it can mean different things. Firstly, it is knowledge about the underlying structure of things. In this case, metadata programming will involve smart manipulation of the things' structures, or the 'move to and from the outside' actions described above.

In the case of associating program semantics with taxonomy terms, I guess it depends on the taxonomy domain. Generally this would mean having some kind of template structures that implement that meaning. Adding an association of this type means adding a set of things from the great outside according to the template. These kinds of operations are good, since they implement an ontology, a classification with rules to react to the outside world, which will definitely make Drupal even more useful as a Content Management Platform. It will make Drupal smarter.
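As a rough illustration of "adding a set of things according to the template", the Python sketch below tags a node with a term and copies in the things that term's template prescribes. The `TEMPLATES` table, the term names, the field labels and `associate_term` are all invented for the example; nothing here is an existing Drupal structure or API.

```python
import copy

# Hypothetical templates: each taxonomy term maps to the things it brings along.
TEMPLATES = {
    "event": {"start_date": None, "location": None},
    "recipe": {"ingredients": [], "cooking_time": None},
}


def associate_term(node, term):
    """Return a new node tagged with `term`, with the template's things added."""
    template = TEMPLATES.get(term, {})
    things = dict(node["things"])
    for label, default in template.items():
        # Things already present on the node win over template defaults.
        things.setdefault(label, copy.deepcopy(default))
    return {"terms": node["terms"] | {term}, "things": things}


node = {"terms": set(), "things": {"title": "Summer fair"}}
tagged = associate_term(node, "event")
print(sorted(tagged["things"]))   # ['location', 'start_date', 'title']
print(sorted(node["things"]))     # ['title']
```

The original node is left untouched, in keeping with the idea that structures are replaced rather than modified in place.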

Benefits? Flexible content creation. Smarter content organisation and cross-referencing. Easier adaptation to users' needs. All this can come from combining a visual language for adapting Drupal to one's needs with a set of simple, consistent APIs to define data and behaviour.

virtual machine

The biggest contributor to slow speed in Drupal is the database storage. So the main impact on speed should come from the implementation of how things are stored. It can be helpful to look at this problem as engineering the optimal Virtual Machine. Components:

instruction set: the Drupal actions, assembled from the definitions of thing types - Drupal modules
fast storage: the PHP variables
slow storage: the database storage
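The split between fast and slow storage can be sketched as a small two-tier store. This is an assumption-laden illustration in Python rather than PHP: a plain dict stands in for the PHP variables, an in-memory SQLite database stands in for Drupal's database, and `TwoTierStore`, the `things` table and the `slow_reads` counter are all invented for the example.

```python
import sqlite3


class TwoTierStore:
    def __init__(self, connection):
        self.fast = {}            # fast storage: in-memory variables
        self.slow = connection    # slow storage: the database
        self.slow_reads = 0       # counts how often we had to hit the database
        self.slow.execute(
            "CREATE TABLE IF NOT EXISTS things (label TEXT PRIMARY KEY, value TEXT)"
        )

    def save(self, label, value):
        # Write through: the database stays authoritative, memory is a copy.
        self.slow.execute(
            "INSERT OR REPLACE INTO things (label, value) VALUES (?, ?)",
            (label, value),
        )
        self.slow.commit()
        self.fast[label] = value

    def load(self, label):
        if label in self.fast:
            return self.fast[label]
        self.slow_reads += 1
        row = self.slow.execute(
            "SELECT value FROM things WHERE label = ?", (label,)
        ).fetchone()
        if row is None:
            raise KeyError(label)
        self.fast[label] = row[0]
        return row[0]


store = TwoTierStore(sqlite3.connect(":memory:"))
store.save("title", "Hello")
store.fast.clear()                # simulate a new request: the variables are gone
store.load("title")
store.load("title")
print(store.slow_reads)           # 1
```

Only the first load after the variables are cleared touches the database, which is the kind of saving the virtual machine view is meant to expose.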

At a later stage I will try to produce a more detailed discussion of the different approaches, and the different traps on the SQL side, that I can spot. I hope this will benefit someone, and not only my ego.