GEGELATI
Learn::LearningEnvironment Class Reference (abstract)

Interface for creating a Learning Environment. More...

#include <learningEnvironment.h>

Inheritance diagram for Learn::LearningEnvironment:
Inherited by Learn::AdversarialLearningEnvironment and Learn::ClassificationLearningEnvironment.

Public Member Functions

 LearningEnvironment ()=delete
 Delete the default constructor of a LearningEnvironment.
 
virtual ~LearningEnvironment ()=default
 Default virtual destructor.
 
 LearningEnvironment (uint64_t nbAct)
 Constructor for LearningEnvironment.
 
virtual LearningEnvironment * clone () const
 Get a copy of the LearningEnvironment.
 
virtual bool isCopyable () const
 Can the LearningEnvironment be copy-constructed to evaluate several LearningAgents in parallel?
 
uint64_t getNbActions () const
 Get the number of actions available for this LearningEnvironment.
 
virtual void doAction (uint64_t actionID)
 Execute an action on the LearningEnvironment.
 
virtual void reset (size_t seed=0, LearningMode mode=LearningMode::TRAINING)=0
 Reset the LearningEnvironment.
 
virtual std::vector< std::reference_wrapper< const Data::DataHandler > > getDataSources ()=0
 Get the data sources for this LearningEnvironment.
 
virtual double getScore () const =0
 Returns the current score of the Environment.
 
virtual bool isTerminal () const =0
 Method for checking if the LearningEnvironment has reached a terminal state.
 

Protected Member Functions

 LearningEnvironment (const LearningEnvironment &other)=default
 Make the default copy constructor protected.
 

Protected Attributes

const uint64_t nbActions
 

Detailed Description

Interface for creating a Learning Environment.

This class defines all the methods that should be implemented for a Learner to interact with a learning environment and learn from it.

Interactions with a learning environment are made through a discrete set of actions. As a result of these actions, the learning environment may update its state, which is accessible through the data sources it provides. The learning environment also provides a score resulting from the past actions, and a termination boolean indicating that the LearningEnvironment has reached a final state that no further action can affect.
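
As an illustration, the following sketch shows what a minimal concrete LearningEnvironment could look like. The class MinimalEnvironment, its one-dimensional state, and its scoring logic are hypothetical; Data::PrimitiveTypeArray and the umbrella header gegelati.h are assumed to be available as in the rest of the GEGELATI API.

#include <cmath>
#include <functional>
#include <vector>

#include <gegelati.h> // assumed umbrella header for the GEGELATI API

// Hypothetical environment: two actions (0: move left, 1: move right) drive
// a 1D position toward the origin. Everything except the overridden
// LearningEnvironment methods is illustrative.
class MinimalEnvironment : public Learn::LearningEnvironment
{
  protected:
    Data::PrimitiveTypeArray<double> state; // single data source seen by agents
    double position;
    uint64_t step;

  public:
    MinimalEnvironment()
        : LearningEnvironment(2), state(1), position(5.0), step(0)
    {
    }

    void doAction(uint64_t actionID) override
    {
        // The base-class implementation checks the actionID bound.
        Learn::LearningEnvironment::doAction(actionID);
        this->position += (actionID == 0) ? -1.0 : 1.0;
        this->state.setDataAt(typeid(double), 0, this->position);
        // Depending on the DataHandler implementation, a call to updateHash()
        // on 'state' may be needed here (see the doAction documentation below).
        this->step++;
    }

    void reset(size_t seed = 0,
               Learn::LearningMode mode = Learn::LearningMode::TRAINING) override
    {
        // This deterministic environment ignores the seed and the mode.
        this->position = 5.0;
        this->step = 0;
        this->state.resetData();
        this->state.setDataAt(typeid(double), 0, this->position);
    }

    std::vector<std::reference_wrapper<const Data::DataHandler>>
    getDataSources() override
    {
        return {this->state};
    }

    double getScore() const override
    {
        return -std::fabs(this->position); // closer to the origin is better
    }

    bool isTerminal() const override
    {
        return this->position == 0.0 || this->step >= 20;
    }
};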

Constructor & Destructor Documentation

◆ LearningEnvironment()

Learn::LearningEnvironment::LearningEnvironment ( uint64_t  nbAct)
inline

Constructor for LearningEnvironment.

Parameters
[in] nbAct   the number of actions that will be usable for interacting with this LearningEnvironment.

Member Function Documentation

◆ clone()

Learn::LearningEnvironment * Learn::LearningEnvironment::clone ( ) const
virtual

Get a copy of the LearningEnvironment.

Default implementation returns a null pointer.

Returns
a copy of the LearningEnvironment if it is copyable; otherwise, a null pointer.
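
On the caller's side, the interplay between isCopyable and clone can be handled with a small guard. This sketch uses only the two methods documented on this page; the helper name duplicateOrNull is illustrative.

#include <gegelati.h> // assumed umbrella header

// Illustrative helper: returns a heap-allocated copy of the environment, or
// nullptr when the environment does not support copying. The caller owns
// the returned pointer.
Learn::LearningEnvironment* duplicateOrNull(const Learn::LearningEnvironment& env)
{
    if (!env.isCopyable()) {
        return nullptr; // the default clone() would also return a null pointer
    }
    return env.clone();
}

Conversely, a subclass that supports parallel evaluation would override isCopyable() to return true and clone() to return new Derived(*this), relying on the protected copy constructor of LearningEnvironment.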


◆ doAction()

void Learn::LearningEnvironment::doAction ( uint64_t  actionID)
virtual

Execute an action on the LearningEnvironment.

The purpose of this method is to execute an action, represented by an actionID between 0 and nbActions - 1. The LearningEnvironment implementation only checks that the given actionID lies within this range. It is the responsibility of this method to call the updateHash method on any dataSources whose content has been affected by the action.

Parameters
[in] actionID   the integer number representing the action to execute.
Exceptions
std::runtime_error   if the actionID exceeds nbActions - 1.

Reimplemented in Learn::ClassificationLearningEnvironment.
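
A typical override, sketched below for a hypothetical MyEnvironment owning a Data::PrimitiveTypeArray<double> member named sensors, first delegates to the base class for the bound check, then applies the action and refreshes the hash of the affected data source (assuming updateHash is accessible, as the note above implies).

// Fragment of the hypothetical MyEnvironment (a LearningEnvironment subclass).
void MyEnvironment::doAction(uint64_t actionID)
{
    // Throws std::runtime_error if actionID exceeds nbActions - 1.
    Learn::LearningEnvironment::doAction(actionID);

    // Hypothetical environment-specific effect of the action.
    this->sensors.setDataAt(typeid(double), 0, static_cast<double>(actionID));

    // Per the documentation above, refresh the hash of the data source
    // whose content was just modified.
    this->sensors.updateHash();
}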

◆ getDataSources()

virtual std::vector< std::reference_wrapper< const Data::DataHandler > > Learn::LearningEnvironment::getDataSources ( )
pure virtual

Get the data sources for this LearningEnvironment.

This method returns a vector of references to the DataHandlers that will be given to the LearningAgent, and to its Programs, to learn how to interact with the LearningEnvironment. Throughout the existence of the LearningEnvironment, the data contained in the DataHandlers may be modified, but never the number, nature, or size of the DataHandlers. Since this method returns references to the DataHandlers, the LearningAgent will assume that the referenced DataHandlers are automatically updated each time the doAction or reset methods are called on the LearningEnvironment.

Returns
a vector of references to the DataHandler.
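
Because the LearningAgent keeps the returned references for the whole training, the DataHandlers must be owned by the environment itself, as in this sketch (the class and member names are illustrative).

// Fragment of the hypothetical MyEnvironment, which owns its two data
// sources as members so the references below outlive every agent call:
//   Data::PrimitiveTypeArray<double> sensors;
//   Data::PrimitiveTypeArray<int> flags;
std::vector<std::reference_wrapper<const Data::DataHandler>>
MyEnvironment::getDataSources()
{
    // The same handlers are returned on every call; only their content
    // ever changes (through doAction or reset).
    return {this->sensors, this->flags};
}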

◆ getNbActions()

uint64_t Learn::LearningEnvironment::getNbActions ( ) const
inline

Get the number of actions available for this LearningEnvironment.

Returns
the integer value of the nbActions attribute.

◆ getScore()

virtual double Learn::LearningEnvironment::getScore ( ) const
pure virtual

Returns the current score of the Environment.

The returned score will be used as a reward during the learning phase of a LearningAgent.

Returns
the current score for the LearningEnvironment.

Implemented in Learn::AdversarialLearningEnvironment, and Learn::ClassificationLearningEnvironment.

◆ isCopyable()

bool Learn::LearningEnvironment::isCopyable ( ) const
virtual

Can the LearningEnvironment be copy-constructed to evaluate several LearningAgents in parallel?

Returns
true if the LearningEnvironment can be copied and run in parallel. Default implementation returns false.

◆ isTerminal()

virtual bool Learn::LearningEnvironment::isTerminal ( ) const
pure virtual

Method for checking if the LearningEnvironment has reached a terminal state.

The boolean value returned by this method, when equal to true, indicates that the LearningEnvironment has reached a terminal state. A terminal state is a state in which further calls to the doAction method will have no effect on the dataSources of the LearningEnvironment, or on its score. For example, this terminal state may be reached for a Game Over state within a game, or when the objective of the learning agent has been successfully reached.

Returns
a boolean indicating termination.

◆ reset()

virtual void Learn::LearningEnvironment::reset ( size_t seed = 0, LearningMode mode = LearningMode::TRAINING )
pure virtual

Reset the LearningEnvironment.

Resetting a learning environment is needed to train an agent. Optionally, a seed can be given to this function to control the randomness of a LearningEnvironment (if any). When available, this feature will be used:

  • for comparing the performance of several agents with the same random starting conditions.
  • for training each agent with diverse starting conditions.
Parameters
[in] seed   the integer value for controlling the randomness of the LearningEnvironment.
[in] mode   the LearningMode in which the Environment should be reset for the next set of actions.

Implemented in Learn::ClassificationLearningEnvironment.
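
For example, a fixed seed makes it possible to compare agents under identical starting conditions. This sketch reuses the hypothetical MinimalEnvironment from the Detailed Description; the driver function is illustrative.

#include <gegelati.h> // assumed umbrella header

void compareAgentsOnSameConditions(MinimalEnvironment& env)
{
    for (size_t run = 0; run < 5; run++) {
        // Passing the same seed before evaluating each agent reproduces the
        // same starting conditions (for environments that use the seed).
        env.reset(/* seed = */ run, Learn::LearningMode::TRAINING);
        // ... evaluate one agent's policy on env here ...
    }
}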

Member Data Documentation

◆ nbActions

const uint64_t Learn::LearningEnvironment::nbActions
protected

Number of actions available for interacting with this LearningEnvironment.


The documentation for this class was generated from the following file: learningEnvironment.h