A multi-agent reinforcement learning evaluation suite

Technology deployed in the real world inevitably faces unforeseen challenges. These challenges arise because the environment in which the technology is developed differs from the environment in which it will be deployed. When a technology transfers successfully, we say it generalizes. In a multi-agent system, such as autonomous vehicle technology, there are two potential sources of generalization difficulty: (1) variation in the physical environment, such as changes in weather or lighting, and (2) variation in the social environment: changes in the behavior of other interacting individuals. Handling social-environment variation is at least as important as handling physical-environment variation, yet it has been studied far less.

As an example of a social environment, consider how self-driving cars on the road interact with other cars. Each car has an incentive to transport its own passengers as quickly as possible. However, this competition can lead to poor coordination (road congestion) that negatively affects everyone. If the cars instead operate cooperatively, more passengers may reach their destinations more quickly. This conflict is called a social dilemma.
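The structure of such a dilemma can be made concrete with a toy two-car merge. The sketch below is only an illustration: the action names, the payoff numbers, and the helper functions are all hypothetical and are not taken from Melting Pot. It checks the two properties that make the situation a dilemma: rushing is each car's best response whatever the other car does, yet both cars do better when both yield.

```python
# Hypothetical payoffs (higher is better) for two cars approaching a merge.
# Keys are (row car's action, column car's action); values are
# (row car's payoff, column car's payoff). The numbers are illustrative only.
PAYOFFS = {
    ("yield", "yield"): (3, 3),
    ("yield", "rush"): (0, 5),
    ("rush", "yield"): (5, 0),
    ("rush", "rush"): (1, 1),
}
ACTIONS = ("yield", "rush")


def best_response(opponent_action):
    """Return the row car's payoff-maximizing action against a fixed opponent."""
    return max(ACTIONS, key=lambda a: PAYOFFS[(a, opponent_action)][0])


def is_social_dilemma():
    """Check that rushing dominates, yet mutual yielding beats mutual rushing."""
    # Rushing is individually best no matter what the other car does...
    rush_dominates = all(best_response(o) == "rush" for o in ACTIONS)
    # ...but if both cars rush, each ends up worse off than if both had yielded.
    mutual_rush = PAYOFFS[("rush", "rush")][0]
    mutual_yield = PAYOFFS[("yield", "yield")][0]
    return rush_dominates and mutual_yield > mutual_rush
```

With these assumed numbers, individually rational choices lead both cars to rush and collect the low mutual-rush payoff, which is the congestion outcome described above.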

However, not all interactions are social dilemmas. For example, there are synergistic interactions in open-source software, zero-sum interactions in sports, and coordination problems at the heart of supply chains. Navigating each of these situations requires a very different approach.

Multi-agent reinforcement learning provides tools that allow us to explore how artificial agents interact with each other and with unfamiliar individuals (such as human users). This class of algorithms is expected to perform better than others when tested for social generalization. However, to date, there has been no systematic evaluation benchmark for assessing this.

Blue: focal population of trained agents. Red: background population of pre-trained bots.

Here we present Melting Pot, a scalable evaluation suite for multi-agent reinforcement learning. Melting Pot assesses generalization to novel social situations involving both familiar and unfamiliar individuals, and is designed to test a wide range of social interactions such as cooperation, competition, deception, reciprocity, trust, and stubbornness. Melting Pot offers researchers a set of 21 MARL “substrates” (multi-agent games) on which to train agents, and more than 85 unique test scenarios on which to evaluate these trained agents. The performance of the agents in these test scenarios quantifies whether the agents:

  • Perform well across a range of social situations in which individuals depend on one another,
  • Interact effectively with unfamiliar individuals who were not seen during training,
  • Pass a universalization test: answering positively the question “What if everyone acted this way?”

The resulting score can then be used to rank different multi-agent RL algorithms by their ability to generalize to novel social situations.
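As a rough illustration of how such a ranking could be computed, the sketch below scores an episode by the average return of the focal (trained) agents only, ignoring the background bots, and then orders algorithms by their mean score across scenarios. This is a minimal sketch under stated assumptions, not Melting Pot's actual scoring code: the function names, the use of a plain unweighted mean, and every number are hypothetical placeholders rather than real results.

```python
from statistics import mean


def focal_per_capita_return(episode_returns, focal_ids):
    """Average episode return over the focal agents only.

    Background bots appear in episode_returns but are deliberately ignored,
    since only the trained agents are being evaluated (an assumption of
    this sketch).
    """
    return mean(episode_returns[agent_id] for agent_id in focal_ids)


def rank_algorithms(scores_by_algorithm):
    """Return algorithm names sorted by mean scenario score, best first."""
    averaged = {
        name: mean(scenario_scores.values())
        for name, scenario_scores in scores_by_algorithm.items()
    }
    return sorted(averaged, key=averaged.get, reverse=True)


# Made-up placeholder returns for one episode: two focal agents, one bot.
episode_returns = {"agent_0": 4.0, "agent_1": 6.0, "bot_0": 9.0}
score = focal_per_capita_return(episode_returns, ["agent_0", "agent_1"])

# Made-up placeholder scores for two hypothetical algorithms.
scores = {
    "algo_a": {"scenario_0": 5.0, "scenario_1": 1.0},
    "algo_b": {"scenario_0": 4.0, "scenario_1": 4.0},
}
ranking = rank_algorithms(scores)
```

In this toy data, the hypothetical algo_b ranks above algo_a because it does reasonably well in both scenarios, whereas algo_a does well in only one. A real benchmark may normalize scores per scenario before averaging; that step is omitted here.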

We hope Melting Pot will become a standard benchmark for multi-agent reinforcement learning. We plan to maintain it, and we will extend it in the coming years to cover more social interactions and generalization scenarios.

Learn more from our GitHub page.
