Documenting the data architecture for OGPF (OCaml Genetic Programming Framework).
I've tried to modularize the various aspects of OGPF so that each piece is independent of the others. Unfortunately this is not completely possible, due to the interdependent nature of the problem itself. To the left is a dependency chart. It's not as bad as it looks, however. All of the black dependency arrows indicate dependence on simple types and functions which won't change much from one setup to another. Only the big blue line will take some work.
There is a certain balance to be had here. Ideally, your genotype and fitness test would be completely separate. Yet you must also use the fitness test to probe individual genomes and discover how good they are at solving the problem at hand. If nothing else, the fitness test must know the capabilities of the genotype, such as its ability to deal with various types of data (is this genotype able to work with strings? Can it return a string? etc.).
How much you have to coordinate the workings of the genotype and fitness test depends a lot on what sort of problem you're solving. If the fitness test itself is the important part, and you have a genotype you always use, then not much change is necessary. For myself, the genotype basically '''is''' the experiment, so I tend to alter both constantly.
Genotypes define and manipulate individual genomes. Here is the basic signature which a genotype must implement:
module type Sig = sig
  type t
  val combine: t -> t -> t
  val randInstance: int -> t
  val print: t -> unit
  val to_string: t -> string
  val of_string: string -> t
end
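As a rough illustration of what implementing this signature could look like, here is a minimal sketch of a bit-string genotype. The module name `BitGenotype`, the choice of single-point crossover for `combine`, and the reading of `randInstance`'s `int` argument as the genome length are all assumptions made for the example, not part of OGPF.

```ocaml
(* The signature from the text, repeated so the sketch compiles on its own. *)
module type Sig = sig
  type t
  val combine: t -> t -> t
  val randInstance: int -> t
  val print: t -> unit
  val to_string: t -> string
  val of_string: string -> t
end

(* Hypothetical genotype: a genome is an array of bits. *)
module BitGenotype : Sig with type t = bool array = struct
  type t = bool array

  (* Single-point crossover (an assumed strategy): a prefix taken from [a],
     followed by the remainder of [b]. The result has the length of [b]. *)
  let combine a b =
    let n = min (Array.length a) (Array.length b) in
    let cut = Random.int (n + 1) in
    Array.mapi (fun i bit -> if i < cut then a.(i) else bit) b

  (* Assumption: the int argument is the genome length. *)
  let randInstance size = Array.init size (fun _ -> Random.bool ())

  (* Serialize as a string of '0' and '1' characters. *)
  let to_string g =
    String.init (Array.length g) (fun i -> if g.(i) then '1' else '0')

  (* Inverse of [to_string]; rejects any other character. *)
  let of_string s =
    Array.init (String.length s) (fun i ->
      match s.[i] with
      | '0' -> false
      | '1' -> true
      | c -> invalid_arg (Printf.sprintf "BitGenotype.of_string: %C" c))

  let print g = print_endline (to_string g)
end
```

Exposing `type t = bool array` in the module constraint is only for convenience when experimenting; sealing the module with plain `Sig` would keep the genome representation hidden from the rest of the framework.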
Here is an additional signature that some components require (FGenotype):
module type Sig = sig
  type t
  val print: t -> unit
  val to_string: t -> string
  val eval: float -> t -> float
end
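A small sketch of a module satisfying this second signature may help make `eval` concrete. It assumes that the `float` argument is the input value and the result is the genome's output for that input, which the signature does not spell out. Wrapping the signature in a module named `FGenotype`, the module `PolyGenotype`, and the polynomial representation are likewise hypothetical.

```ocaml
(* Assumed layout: the signature lives in a module called FGenotype. *)
module FGenotype = struct
  module type Sig = sig
    type t
    val print: t -> unit
    val to_string: t -> string
    val eval: float -> t -> float
  end
end

(* Hypothetical genotype: polynomial coefficients, lowest degree first. *)
module PolyGenotype : FGenotype.Sig with type t = float list = struct
  type t = float list

  let to_string g = String.concat " " (List.map string_of_float g)

  let print g = print_endline (to_string g)

  (* Horner's rule: c0 + x * (c1 + x * (c2 + ...)).
     Assumption: the float argument is the input x. *)
  let eval x g = List.fold_right (fun c acc -> c +. x *. acc) g 0.0
end
```

With this representation, `PolyGenotype.eval 2.0 [1.0; 2.0; 3.0]` computes 1 + 2·2 + 3·2², which is 17.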
One debate I'm currently having with myself is whether genotypes should be responsible for evaluating themselves. On the one hand, self-evaluation makes the genotype more modular, since the fitness test no longer has that job. On the other hand, the fitness test must then know how to communicate the problem setup to the genotype.
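To make the trade-off concrete, here is one possible shape for the self-evaluating variant, sketched as a functor. The fitness test never inspects the genome; it communicates the problem setup only through the `float` passed to `eval`. Everything here (`Evaluable`, `MakeFitness`, `Line`, the sample points, and the squared-error measure) is invented for illustration and is not OGPF's actual design.

```ocaml
(* The part of the FGenotype signature the fitness test relies on. *)
module type Evaluable = sig
  type t
  val eval: float -> t -> float
end

(* Hypothetical fitness test: fit a genome to a few sample points.
   It talks to the genotype only through [eval]. *)
module MakeFitness (G : Evaluable) = struct
  (* (input, expected output) pairs; the values are made up. *)
  let samples = [ (0.0, 1.0); (1.0, 3.0); (2.0, 5.0) ]

  (* Sum of squared errors: lower is better, 0.0 is a perfect fit. *)
  let error genome =
    List.fold_left
      (fun acc (x, expected) ->
        let diff = G.eval x genome -. expected in
        acc +. diff *. diff)
      0.0 samples
end

(* Hypothetical self-evaluating genotype: the line y = slope * x + intercept. *)
module Line = struct
  type t = { slope : float; intercept : float }
  let eval x g = g.slope *. x +. g.intercept
end

module LineFitness = MakeFitness (Line)
```

The cost of this arrangement shows up in the `eval` signature itself: a problem whose setup does not fit into a single `float` would force a change to both the genotype and the fitness test, which is the coupling described above.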