Parallel Training of Tangled Program Graphs
The objective of this tutorial is to activate parallel training of Tangled Program Graphs (TPGs) with Gegelati by:
- instantiating a ParallelLearningAgent, and
- making the PendulumWrapper safely copyable so that worker threads receive independent environments.
The starting point of this tutorial is the C++ project obtained at the end of the GEGELATI introductory tutorial. While completing the introductory tutorial is strongly advised, a copy of the project resulting from that tutorial can be downloaded at the following link: pendulum_wrapper_solution.zip.
To fully benefit from the parallelization, the multi-episode evaluation of the agents, covered in the linked tutorial, can be implemented before starting this tutorial. The result from the multi-episode evaluation tutorial can be downloaded at the following link: gegelati-tutorial-strengthening-solution.zip.
0. Why make the environment copyable?
The learning process of TPGs involves two main time-consuming steps per generation:
- Evaluation of the fitness of each individual TPG root within the PendulumWrapper learning environment. The time spent in this step, in seconds, is reported at each generation in the T_eval column of the printed log.
- Mutation of the TPG population. The time spent in this step, in seconds, is reported at each generation in the T_mutat column of the printed log.
When using a LearningAgent, both steps are performed sequentially on a single thread. To accelerate training, it is possible to parallelize these steps across multiple threads/cores by using ParallelLearningAgent.
To better appreciate the benefits of parallel training, keep a copy of the logs produced by the sequential training for comparison.
An important feature of Gegelati is that the parallelization of training is fully deterministic, which means that running the same training with the same random seed will always produce the same results, regardless of the number of threads used. This is achieved by ensuring that each worker thread operates on its own independent copy of the learning environment.
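The idea behind thread-count-independent results can be illustrated with a small standard-library sketch. This is not Gegelati's implementation: ToyEnv, evaluateAll, and the per-job seeding rule (seed + job index) are hypothetical names and assumptions introduced here for illustration. Each worker copies the environment, and each job seeds its own generator from the job index rather than from the thread that happens to run it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <thread>
#include <vector>

// Hypothetical stand-in for a learning environment; all state is per-instance.
struct ToyEnv {
    double position = 0.0;

    // Run one episode driven by the given generator and return a score.
    double runEpisode(std::mt19937_64& rng) {
        position = 0.0;
        for (int step = 0; step < 100; ++step) {
            // Map each raw draw to -1, 0 or +1 using integer arithmetic only.
            position += static_cast<double>(rng() % 3) - 1.0;
        }
        return position;
    }
};

// Evaluate nbJobs independent jobs on nbThreads worker threads.
// Assumption of this sketch: the seed of a job depends only on (seed, job).
std::vector<double> evaluateAll(const ToyEnv& prototype, std::size_t nbJobs,
                                std::uint64_t seed, unsigned nbThreads) {
    assert(nbThreads > 0);
    std::vector<double> scores(nbJobs);

    auto worker = [&](unsigned threadIdx) {
        ToyEnv env = prototype; // Independent copy owned by this worker
        for (std::size_t job = threadIdx; job < nbJobs; job += nbThreads) {
            std::mt19937_64 rng(seed + job); // Per-job generator
            scores[job] = env.runEpisode(rng); // Each job writes its own slot
        }
    };

    std::vector<std::thread> threads;
    for (unsigned t = 0; t < nbThreads; ++t) {
        threads.emplace_back(worker, t);
    }
    for (auto& th : threads) {
        th.join();
    }
    return scores;
}
```

Calling evaluateAll with the same seed and 1 or 4 threads returns the same vector of scores, because no job reads state written by another job and no result depends on scheduling order.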
1. Parallelize mutations
Use the ParallelLearningAgent
To enable parallel mutations, the sequential LearningAgent must be replaced with ParallelLearningAgent. By default, the number of threads is set to the number of available hardware threads on the machine.
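To get a feel for what this default means on a given machine, the number of hardware threads can be queried with the C++ standard library. The sketch below is an illustration only: detectHardwareThreads and oversubscribes are hypothetical helpers, and whether Gegelati relies on this exact call internally is an assumption.

```cpp
#include <cassert>
#include <thread>

// Number of hardware threads reported by the standard library.
// The standard allows hardware_concurrency() to return 0 when the value
// cannot be determined, so this sketch falls back to 1 in that case.
unsigned detectHardwareThreads() {
    const unsigned reported = std::thread::hardware_concurrency();
    return reported == 0 ? 1u : reported;
}

// True when the requested thread count exceeds what the machine reports.
bool oversubscribes(unsigned requested) {
    assert(requested > 0);
    return requested > detectHardwareThreads();
}
```

Requesting more threads than the machine reports is not an error, but it rarely shortens training further.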
TODO #1:
Edit /gegelati-tutorial/src/training/main-training.cpp, replacing the line that instantiates the LearningAgent with a line that instantiates a ParallelLearningAgent:
Solution to #1:
/* main-training.cpp */
// Instantiate and initialize the Learning Agent (LA)
Learn::ParallelLearningAgent la(pendulumLE, instructionSet, params);
First parallel training run
Build and run the main-training target of the project. You should observe that T_mutat times have slightly decreased compared to the sequential training log. Other columns relative to the trained TPG characteristics (NbVert, NbActR, NbTeamR) and the fitness of agents (Min, Avg, Max) should remain identical to the sequential training.
2. Parallelize evaluations
Make the PendulumWrapper safely copyable
To enable parallel evaluations, the PendulumWrapper must be made safely copyable. This is done first by implementing the copy constructor of the PendulumWrapper class, and then by overriding the clone() method inherited from the LearningEnvironment base class.
TODO #2:
Edit /gegelati-tutorial/src/environments/pendulum_wrapper.h and /gegelati-tutorial/src/environments/pendulum_wrapper.cpp to add a copy constructor PendulumWrapper(const PendulumWrapper& other) to the class.
It is important to note that the default copy constructor generated by the compiler would perform a shallow copy of the member variables, which is not suitable in this case. Therefore, a custom copy constructor must be implemented to ensure that all member variables are properly duplicated.
Special care should be taken with the std::vector<Data::PointerWrapper<double>> data attribute. This attribute must first be copy-constructed from other.data. Immediately after that copy, the pointers contained in the vector still point to the attributes of other.pendulum, so they must be updated to point to the attributes of this->pendulum.
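The pitfall can be reproduced without Gegelati in a few lines. In the sketch below, ToyPendulum, ShallowWrapper, and DeepWrapper are hypothetical stand-ins, and plain const double* pointers replace Data::PointerWrapper<double> to keep the example self-contained. ShallowWrapper relies on the compiler-generated copy constructor, while DeepWrapper re-targets its pointers after copying.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the simulated pendulum.
struct ToyPendulum {
    double angle = 0.0;
    double velocity = 0.0;
};

// Relies on the compiler-generated copy constructor: the pointer vector is
// copied element by element, so a copy still points into the original object.
struct ShallowWrapper {
    ToyPendulum pendulum;
    std::vector<const double*> data;

    ShallowWrapper() : data{&pendulum.angle, &pendulum.velocity} {}
};

// Custom copy constructor: copy the members, then re-target the pointers.
struct DeepWrapper {
    ToyPendulum pendulum;
    std::vector<const double*> data;

    DeepWrapper() : data{&pendulum.angle, &pendulum.velocity} {}

    DeepWrapper(const DeepWrapper& other)
        : pendulum(other.pendulum), data(other.data) {
        // After copying, data still refers to other.pendulum; fix that here.
        data.at(0) = &pendulum.angle;
        data.at(1) = &pendulum.velocity;
    }
};

// Helper: does the first data pointer of w refer to w's own pendulum?
template <typename Wrapper>
bool pointsToOwnPendulum(const Wrapper& w) {
    assert(!w.data.empty());
    return w.data.at(0) == &w.pendulum.angle;
}
```

A copy of ShallowWrapper fails the pointsToOwnPendulum check, which means two worker threads would silently observe the same pendulum. A copy of DeepWrapper passes it.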
Solution to #2:
/* pendulum_wrapper.h */
// Copy constructor
PendulumWrapper(const PendulumWrapper& other);
/* pendulum_wrapper.cpp */
// Copy constructor implementation
PendulumWrapper::PendulumWrapper(const PendulumWrapper& other)
    : LearningEnvironment(other), // Call base class copy constructor
      pendulum(other.pendulum),   // Copy-construct the pendulum
      data(other.data)            // Copy-construct the data vector
{
    // Update pointers in data to point to this->pendulum's attributes
    data.at(0).setPointer(&this->pendulum.getAngle());
    data.at(1).setPointer(&this->pendulum.getVelocity());
}
TODO #3:
Next, override the clone() method in the PendulumWrapper class to return a new instance of PendulumWrapper created using the copy constructor.
Solution to #3:
/* pendulum_wrapper.h */
// Override clone method
Learn::LearningEnvironment* clone() const override;
/* pendulum_wrapper.cpp */
// Override clone method implementation
Learn::LearningEnvironment* PendulumWrapper::clone() const {
    return new PendulumWrapper(*this); // Use copy constructor
}
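The clone() method follows the classic polymorphic copy pattern: code that only holds a reference to the base class can still obtain a full copy of the derived object. The sketch below shows the pattern on its own, with hypothetical ToyEnvironment and ToyCounterEnv classes rather than Gegelati types. The cloneOwned helper is also an assumption of this sketch; it wraps the raw pointer in a std::unique_ptr so that the copy is released automatically.

```cpp
#include <cassert>
#include <memory>

// Hypothetical base class exposing a polymorphic copy operation.
class ToyEnvironment {
  public:
    virtual ~ToyEnvironment() = default;
    virtual ToyEnvironment* clone() const = 0;
    virtual void step() = 0;
    virtual int getStepCount() const = 0;
};

// Hypothetical derived environment with some internal state.
class ToyCounterEnv : public ToyEnvironment {
    int stepCount = 0;

  public:
    // Delegate to the copy constructor, as in the tutorial's solution.
    ToyCounterEnv* clone() const override { return new ToyCounterEnv(*this); }
    void step() override { ++stepCount; }
    int getStepCount() const override { return stepCount; }
};

// Clone through the base interface and take ownership of the result.
std::unique_ptr<ToyEnvironment> cloneOwned(const ToyEnvironment& env) {
    ToyEnvironment* copy = env.clone();
    assert(copy != nullptr);
    return std::unique_ptr<ToyEnvironment>(copy);
}
```

The copy starts with the same state as the original and then evolves independently, which is the property each worker thread needs.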
TODO #4:
To signal to Gegelati that the PendulumWrapper can be safely copied for parallel evaluation, the LearningEnvironment::isCopyable() method must be overridden to return true.
Solution to #4:
/* pendulum_wrapper.h */
// Override isCopyable method
bool isCopyable() const override;
/* pendulum_wrapper.cpp */
// Override isCopyable method implementation
bool PendulumWrapper::isCopyable() const {
    return true; // Indicate that this environment is copyable
}
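A flag such as isCopyable() lets the calling code decide whether cloning is safe before spawning workers. The sketch below illustrates one plausible use, under stated assumptions: the classes are hypothetical stand-ins, the base class defaults to false, and the caller falls back to a single thread when the environment is not copyable. It is not taken from Gegelati's source.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical base class: environments are not copyable unless they say so.
class ToyEnvironment {
  public:
    virtual ~ToyEnvironment() = default;
    virtual bool isCopyable() const { return false; }
};

// Hypothetical environment that opts in to being copied.
class ToyCopyableEnv : public ToyEnvironment {
  public:
    bool isCopyable() const override { return true; }
};

// Number of evaluation threads a caller could use in this sketch:
// the requested count if the environment can be cloned, otherwise 1.
std::size_t chooseNbEvaluationThreads(const ToyEnvironment& env,
                                      std::size_t requested) {
    assert(requested > 0);
    return env.isCopyable() ? requested : 1;
}
```

Under this assumption, forgetting to override isCopyable() does not break training; it only keeps the evaluation sequential, which would show up as unchanged T_eval times.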
Test parallel evaluations
Build and run the main-training target of the project. You should observe that T_eval times have significantly decreased compared to the sequential training log. Other columns relative to the trained TPG characteristics (NbVert, NbActR, NbTeamR) and the fitness of agents (Min, Avg, Max) should remain identical to the sequential training.
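To quantify the comparison between the sequential and parallel logs, the measured times can be turned into a speedup and a parallel efficiency. The helper below is a minimal sketch with hypothetical names (ParallelGain, computeGain). It expects times read from the same log column, for example T_eval, at the same generation of each run.

```cpp
#include <cassert>

// Speedup and efficiency derived from two measured durations.
struct ParallelGain {
    double speedup;    // sequential time divided by parallel time
    double efficiency; // speedup divided by the number of threads
};

ParallelGain computeGain(double sequentialSeconds, double parallelSeconds,
                         unsigned nbThreads) {
    assert(sequentialSeconds > 0.0 && parallelSeconds > 0.0 && nbThreads > 0);
    const double speedup = sequentialSeconds / parallelSeconds;
    return {speedup, speedup / static_cast<double>(nbThreads)};
}
```

An efficiency close to 1 indicates that the threads are well used. A lower value is common when the parallel work per generation is short compared with the cost of distributing it.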
It is possible to control the number of threads used by the ParallelLearningAgent by setting the nbThreads parameter in the /gegelati-tutorial/params.json file as follows:
"nbThreads": 4,
Conclusion
In this tutorial, you have successfully enabled parallel training of Tangled Program Graphs (TPGs) in Gegelati by replacing the sequential LearningAgent with ParallelLearningAgent and making the PendulumWrapper safely copyable.
More information about parallel training with Gegelati can be found in the following publication: