Parallel Training of Tangled Program Graphs

The objective of this tutorial is to activate parallel training of Tangled Program Graphs (TPGs) with Gegelati by:

instantiating a ParallelLearningAgent, and
making the PendulumWrapper safely copyable so worker threads receive independent environments.

The starting point of this tutorial is the C++ project obtained at the end of the GEGELATI introductory tutorial. While completing the introductory tutorial is strongly advised, a copy of the project resulting from the introductory tutorial can be downloaded at the following link: pendulum_wrapper_solution.zip.

To fully benefit from parallelization, the multi-episode evaluation of the agents, covered in the linked tutorial, can be implemented before starting this tutorial. The result of the multi-episode evaluation tutorial can be downloaded at the following link: gegelati-tutorial-strengthening-solution.zip.

0. Why make the environment copyable?

The learning process of TPGs involves two main time-consuming steps per generation:

Evaluation of the fitness of each individual TPG root within the PendulumWrapper learning environment. The duration of this step, in seconds, is reported for each generation in the T_eval column of the printed log.
Mutation of the TPG population. The duration of this step, in seconds, is reported for each generation in the T_mutat column of the printed log.

When using a LearningAgent, both steps are performed sequentially on a single thread. To accelerate training, it is possible to parallelize these steps across multiple threads/cores by using ParallelLearningAgent.

To better take note of the benefits of parallel training, keep a copy of the logs produced by the sequential training for comparison.

An important feature of Gegelati is that the parallelization of training is fully deterministic, which means that running the same training with the same random seed will always produce the same results, regardless of the number of threads used. This is achieved by ensuring that each worker thread operates on its own independent copy of the learning environment.
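One common way to obtain this kind of thread-count-independent result can be sketched without Gegelati. The sketch below is an illustration under stated assumptions, not Gegelati's implementation: evaluateJob and evaluateAll are hypothetical names, each job receives its own seed derived from a master seed, and each job writes to its own result slot, so neither the number of threads nor the scheduling order can change the outcome.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <thread>
#include <vector>

// Hypothetical stand-in for one evaluation job. Its result depends only on
// the job's own seed, never on the thread that runs it.
double evaluateJob(std::uint64_t seed) {
    std::mt19937_64 rng(seed); // private generator, not shared between jobs
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    double score = 0.0;
    for (int step = 0; step < 100; ++step) {
        score += dist(rng);
    }
    return score;
}

// Evaluate nbJobs jobs on nbThreads threads. Job i always uses the seed
// masterSeed + i and always writes to results[i].
std::vector<double> evaluateAll(std::uint64_t masterSeed, std::size_t nbJobs,
                                unsigned nbThreads) {
    assert(nbThreads > 0);
    std::vector<double> results(nbJobs);
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < nbThreads; ++t) {
        workers.emplace_back([&results, masterSeed, nbJobs, nbThreads, t]() {
            // Static round-robin split: thread t handles jobs t, t + nbThreads, ...
            for (std::size_t i = t; i < nbJobs; i += nbThreads) {
                results[i] = evaluateJob(masterSeed + i);
            }
        });
    }
    for (std::thread& worker : workers) {
        worker.join();
    }
    return results;
}
```

Calling evaluateAll with the same master seed and 1 or 4 threads returns element-wise identical vectors, which mirrors the property observed when comparing sequential and parallel training logs.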

1. Parallelize mutations

Use the ParallelLearningAgent

To enable parallel mutations, the sequential LearningAgent must be replaced with ParallelLearningAgent. By default, the number of threads is set to the number of available hardware threads on the machine.
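The "number of available hardware threads" can be queried in standard C++ with std::thread::hardware_concurrency(). The helper below is a hypothetical sketch of how such a default might be resolved; resolveThreadCount is not a Gegelati function, and whether Gegelati relies on this exact call is an assumption.

```cpp
#include <cassert>
#include <thread>

// Hypothetical helper: use the requested thread count when it is positive,
// otherwise fall back to the number of hardware threads.
unsigned resolveThreadCount(unsigned requested) {
    if (requested > 0) {
        return requested;
    }
    // hardware_concurrency() is only a hint and may return 0 when unknown.
    const unsigned hardware = std::thread::hardware_concurrency();
    const unsigned result = (hardware > 0) ? hardware : 1;
    assert(result >= 1);
    return result;
}
```

With this sketch, resolveThreadCount(4) returns 4, while resolveThreadCount(0) returns the detected hardware thread count, or 1 when it cannot be detected.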

TODO #1:

Edit the file /gegelati-tutorial/src/training/main-training.cpp by replacing the line that instantiates the LearningAgent with a line that instantiates a ParallelLearningAgent:

Solution to #1:

/* main-training.cpp */
// Instantiate and initialize the Learning Agent (LA)
Learn::ParallelLearningAgent la(pendulumLE, instructionSet, params);

First parallel training run

Build and run the main-training target of the project. You should observe that T_mutat times have slightly decreased compared to the sequential training log. The other columns, which describe the characteristics of the trained TPG (NbVert, NbActR, NbTeamR) and the fitness of the agents (Min, Avg, Max), should remain identical to the sequential training.

2. Parallelize evaluations

Make the PendulumWrapper safely copyable

To enable parallel evaluations, the PendulumWrapper must be made safely copyable. This is done first by implementing the copy constructor of the PendulumWrapper class, and then by overriding the clone() method inherited from the LearningEnvironment base class.

TODO #2:

Edit the /gegelati-tutorial/src/environments/pendulum_wrapper.h and /gegelati-tutorial/src/environments/pendulum_wrapper.cpp to add a copy constructor PendulumWrapper(const PendulumWrapper& other) to the class.

It is important to note that the copy constructor implicitly generated by the compiler performs a member-wise copy, which is not suitable in this case: the pointers stored in the data attribute would be copied as-is and would still point to the pendulum of the original object. Therefore, a custom copy constructor must be implemented to ensure that the copy refers only to its own member variables.

Special care should be taken with the std::vector<Data::PointerWrapper<double>> data attribute. This attribute must first be initialized as a copy of the other.data attribute. Then, the pointers contained in the vector must be updated to point to the attributes of this->pendulum, and not to those of other.pendulum, as is the case right after copy-constructing the data attribute.
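The pitfall described above can be reproduced without Gegelati. In the sketch below, ToyPendulum, NaiveWrapper and FixedWrapper are hypothetical stand-ins for the tutorial classes, and plain const double* pointers play the role of Data::PointerWrapper<double>. NaiveWrapper relies on the compiler-generated copy constructor, whereas FixedWrapper re-targets its pointers after copying.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the simulated pendulum.
struct ToyPendulum {
    double angle = 0.0;
    double velocity = 0.0;
};

// Uses the implicit copy constructor: the pointers in data are copied as-is.
struct NaiveWrapper {
    ToyPendulum pendulum;
    std::vector<const double*> data;

    NaiveWrapper() : data{&pendulum.angle, &pendulum.velocity} {}
};

// Custom copy constructor: the pointers are updated to the copy's own pendulum.
struct FixedWrapper {
    ToyPendulum pendulum;
    std::vector<const double*> data;

    FixedWrapper() : data{&pendulum.angle, &pendulum.velocity} {}

    FixedWrapper(const FixedWrapper& other)
        : pendulum(other.pendulum), data(other.data) {
        assert(data.size() == 2);
        data.at(0) = &pendulum.angle;
        data.at(1) = &pendulum.velocity;
    }

    // Copy assignment would need the same care; it is disabled in this sketch.
    FixedWrapper& operator=(const FixedWrapper&) = delete;
};
```

After `NaiveWrapper b(a);`, the pointer `b.data.at(0)` still equals `&a.pendulum.angle`, so a worker thread using `b` would observe the state of `a`. After `FixedWrapper d(c);`, the pointer `d.data.at(0)` equals `&d.pendulum.angle`, so both environments are fully independent.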

Solution to #2:

/* pendulum_wrapper.h */
// Copy constructor
PendulumWrapper(const PendulumWrapper& other);

/* pendulum_wrapper.cpp */
// Copy constructor implementation
PendulumWrapper::PendulumWrapper(const PendulumWrapper& other)
    : LearningEnvironment(other), // Call base class copy constructor
      pendulum(other.pendulum),   // Copy-construct the pendulum
      data(other.data)            // Copy-construct the data vector
{
    // Update pointers in data to point to this->pendulum's attributes
    data.at(0).setPointer(&this->pendulum.getAngle());
    data.at(1).setPointer(&this->pendulum.getVelocity());
}

TODO #3:

Next, override the clone() method in the PendulumWrapper class to return a new instance of PendulumWrapper created using the copy constructor.

Solution to #3:

/* pendulum_wrapper.h */
// Override clone method
Learn::LearningEnvironment* clone() const override;

/* pendulum_wrapper.cpp */
// Override clone method implementation
Learn::LearningEnvironment* PendulumWrapper::clone() const {
    return new PendulumWrapper(*this); // Use copy constructor
}
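Because clone() returns a raw pointer obtained with new, the caller of clone() is responsible for eventually deleting the copy. The sketch below illustrates this clone pattern on a hypothetical minimal hierarchy (ToyEnvironment, ToyPendulumEnv and cloneOwned are not Gegelati names) and shows one way a caller could take ownership with std::unique_ptr.

```cpp
#include <cassert>
#include <memory>

// Hypothetical base class with a polymorphic clone() method.
struct ToyEnvironment {
    virtual ~ToyEnvironment() = default; // virtual: copies are deleted via base pointers
    virtual ToyEnvironment* clone() const = 0;
    virtual int getState() const = 0;
};

// Hypothetical derived environment, cloned through its copy constructor.
struct ToyPendulumEnv : ToyEnvironment {
    int state = 0;

    ToyEnvironment* clone() const override {
        return new ToyPendulumEnv(*this);
    }

    int getState() const override { return state; }
};

// Hypothetical caller-side helper: wrap the raw pointer so that the copy is
// released automatically when it goes out of scope.
std::unique_ptr<ToyEnvironment> cloneOwned(const ToyEnvironment& env) {
    std::unique_ptr<ToyEnvironment> copy(env.clone());
    assert(copy != nullptr);
    return copy;
}
```

A copy created with cloneOwned keeps the state the original had when it was cloned, and later changes to the original do not affect it.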

TODO #4:

To signal to Gegelati that the PendulumWrapper can be safely copied for parallel evaluation, the LearningEnvironment::isCopyable() method must be overridden to return true.

Solution to #4:

/* pendulum_wrapper.h */
// Override isCopyable method
bool isCopyable() const override;

/* pendulum_wrapper.cpp */
// Override isCopyable method implementation
bool PendulumWrapper::isCopyable() const {
    return true; // Indicate that this environment is copyable
}
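The role of this flag can be pictured with the hypothetical sketch below. It assumes that the base class reports false by default and that an agent evaluates on a single thread whenever the environment is not copyable. ToyEnv, CopyableToyEnv and effectiveEvalThreads are illustrative names, not Gegelati code, and the actual decision logic inside ParallelLearningAgent may differ.

```cpp
#include <cassert>

// Hypothetical base class: environments are assumed non-copyable by default.
struct ToyEnv {
    virtual ~ToyEnv() = default;
    virtual bool isCopyable() const { return false; }
};

// Hypothetical environment that declares itself safe to copy.
struct CopyableToyEnv : ToyEnv {
    bool isCopyable() const override { return true; }
};

// Hypothetical decision rule: evaluate in parallel only when several threads
// are allowed and the environment can be duplicated for each worker.
unsigned effectiveEvalThreads(const ToyEnv& env, unsigned maxNbThreads) {
    assert(maxNbThreads >= 1);
    if (maxNbThreads == 1 || !env.isCopyable()) {
        return 1; // sequential evaluation on the single shared environment
    }
    return maxNbThreads;
}
```

Under these assumptions, effectiveEvalThreads returns 1 for a ToyEnv regardless of the thread budget, and returns the full budget for a CopyableToyEnv.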

Test parallel evaluations

Build and run the main-training target of the project. You should observe that T_eval times have significantly decreased compared to the sequential training log. The other columns, which describe the characteristics of the trained TPG (NbVert, NbActR, NbTeamR) and the fitness of the agents (Min, Avg, Max), should remain identical to the sequential training.

It is possible to control the number of threads used by the ParallelLearningAgent by setting the nbThreads parameter in the /gegelati-tutorial/params.json file as follows:

"nbThreads": 4,

Conclusion

In this tutorial, you have successfully enabled parallel training of Tangled Program Graphs (TPGs) in Gegelati by replacing the sequential LearningAgent with ParallelLearningAgent and making the PendulumWrapper safely copyable.

More information about parallel training with Gegelati can be found in the following publication:

K. Desnos, N. Sourbier, P.-Y. Raumer, O. Gesny and M. Pelcat. GEGELATI: Lightweight Artificial Intelligence through Generic and Evolvable Tangled Program Graphs. In Workshop on Design and Architectures for Signal and Image Processing (DASIP), ACM, 2021.