Metadata-Version: 2.1
Name: dm-robotics-agentflow
Version: 0.6.0
Summary: Tools for single-embodiment, multiple-task, Reinforcement Learning
Home-page: https://github.com/deepmind/dm_robotics/tree/main/py/agentflow
Author: DeepMind
License: Apache 2.0
Classifier: Development Status :: 5 - Production/Stable
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Scientific/Engineering
Requires-Python: >=3.7, <3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: dm-robotics-transformations
Requires-Dist: dm-robotics-geometry
Requires-Dist: numpy >=1.16.0
Requires-Dist: dm-control ==1.0.15
Requires-Dist: mujoco ==3.0.0
Requires-Dist: opencv-python <=4.6.0.66,>=3.4.0
Requires-Dist: attrs >=20.3.0
Requires-Dist: pydot >=1.2.4
Requires-Dist: typing-extensions >=3.7.4

# AgentFlow: A Modular Toolkit for Scalable RL Research


## Overview

`AgentFlow` is a library for composing reinforcement learning (RL) agents. It
provides two core features:

1.  Tools for slicing, transforming, and composing *specs*.
2.  Tools for encapsulating and composing RL tasks.
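
To make "slicing" and "composing" specs concrete, here is a minimal sketch in
plain Python. It does not use AgentFlow's API: `BoundedSpec`, `slice_spec`, and
`compose_specs` are hypothetical names invented for illustration, and the
assumption is that a spec is a named list of per-component bounds.

```python
import dataclasses
from typing import Tuple


@dataclasses.dataclass(frozen=True)
class BoundedSpec:
    """Stand-in for a bounded action spec (hypothetical, not AgentFlow's type)."""
    names: Tuple[str, ...]
    minimum: Tuple[float, ...]
    maximum: Tuple[float, ...]


def slice_spec(spec: BoundedSpec, prefix: str) -> BoundedSpec:
    """Returns the sub-spec whose component names start with `prefix`."""
    keep = [i for i, name in enumerate(spec.names) if name.startswith(prefix)]
    return BoundedSpec(
        names=tuple(spec.names[i] for i in keep),
        minimum=tuple(spec.minimum[i] for i in keep),
        maximum=tuple(spec.maximum[i] for i in keep),
    )


def compose_specs(*specs: BoundedSpec) -> BoundedSpec:
    """Concatenates several specs into one, preserving argument order."""
    return BoundedSpec(
        names=tuple(n for s in specs for n in s.names),
        minimum=tuple(v for s in specs for v in s.minimum),
        maximum=tuple(v for s in specs for v in s.maximum),
    )


# A toy action spec covering a two-joint arm and a gripper.
full = BoundedSpec(
    names=("arm/j0", "arm/j1", "gripper/grasp"),
    minimum=(-1.0, -1.0, 0.0),
    maximum=(1.0, 1.0, 1.0),
)
arm = slice_spec(full, "arm/")
gripper = slice_spec(full, "gripper/")
recomposed = compose_specs(arm, gripper)
```

In this sketch, an agent that only controls the arm would be given `arm`,
while `recomposed` shows that the slices can be joined back into the full spec.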

Unlike the standard RL setup, which assumes a single environment and a single
agent, `AgentFlow` is designed for the single-embodiment, multiple-task regime.
This design is motivated by robotics, where it is common to train RL modules
for various skills and then compose them (possibly with non-learned controllers
too).

Instead of having to implement a separate RL environment for each skill and
combine them ad hoc, with `AgentFlow` you can define one or more `SubTasks`
which *modify* a timestep from a single top-level environment, e.g. adding
observations and defining rewards, or isolating a particular sub-system of the
environment, such as a robot arm.
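
The idea of a subtask that modifies a timestep can be sketched as follows. This
is an illustration in plain Python, not AgentFlow's actual `SubTask` interface:
the `TimeStep` class, the `ReachSubTask` class, its `modify_timestep` method,
and the observation keys are all hypothetical.

```python
import dataclasses
import math
from typing import Dict, Tuple

Observation = Dict[str, Tuple[float, ...]]


@dataclasses.dataclass(frozen=True)
class TimeStep:
    """Stand-in for an environment timestep (hypothetical type)."""
    reward: float
    observation: Observation


class ReachSubTask:
    """Hypothetical subtask: isolates the arm and defines a reaching reward."""

    def __init__(self, target: Tuple[float, ...]):
        self._target = target

    def modify_timestep(self, parent: TimeStep) -> TimeStep:
        # Isolate a sub-system: keep only observations that belong to the arm.
        obs = {k: v for k, v in parent.observation.items()
               if k.startswith("arm/")}
        # Add an observation that the top-level environment does not provide.
        obs["target"] = self._target
        # Define a task-specific reward: negative hand-to-target distance.
        hand = parent.observation["arm/hand_pos"]
        distance = math.sqrt(
            sum((h - t) ** 2 for h, t in zip(hand, self._target)))
        return TimeStep(reward=-distance, observation=obs)


# A timestep from the single top-level environment (toy values).
parent = TimeStep(
    reward=0.0,
    observation={
        "arm/hand_pos": (0.0, 0.0, 0.0),
        "arm/joint_angles": (0.1, 0.2),
        "camera/rgb": (0.5,),
    },
)
agent_view = ReachSubTask(target=(3.0, 4.0, 0.0)).modify_timestep(parent)
```

A second subtask with a different reward or observation filter could wrap the
same `parent` timestep, which is the point of keeping one top-level
environment.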

You then *compose* SubTasks with regular RL-agents to form modules, and use a
set of graph-building operators to define the flow of these modules over time
(hence the name `AgentFlow`).
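
As a rough illustration of "flow over time", the sketch below runs two modules
in sequence, handing over control when the first one reports that it is done.
It is a toy in plain Python under stated assumptions: `Module`, `run_sequence`,
the integer state, and the additive dynamics are hypothetical and are not
AgentFlow's graph-building operators.

```python
import dataclasses
from typing import Callable, List


@dataclasses.dataclass
class Module:
    """Hypothetical module: a named policy plus a termination condition."""
    name: str
    policy: Callable[[int], int]    # maps state -> action
    is_done: Callable[[int], bool]  # True once this module should hand over


def run_sequence(modules: List[Module], state: int,
                 max_steps: int = 100) -> List[str]:
    """Runs modules in order; returns the name of the module used at each step."""
    trace: List[str] = []
    for module in modules:
        while not module.is_done(state) and len(trace) < max_steps:
            state += module.policy(state)  # toy dynamics: add action to state
            trace.append(module.name)
    return trace


# Two toy skills: move up until the state reaches 3, then back down to 1.
reach = Module("reach", policy=lambda s: 1, is_done=lambda s: s >= 3)
retract = Module("retract", policy=lambda s: -1, is_done=lambda s: s <= 1)
trace = run_sequence([reach, retract], state=0)
```

A fixed sequence is the simplest possible flow; a learnable or stochastic
state machine would replace the `for` loop with a choice of which module runs
next.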

The graph-building step is entirely optional, and is intended only for use-cases
that require something like a (possibly learnable, possibly stochastic)
state-machine.

### [Components](docs/components.md)
### [Control Flow](docs/control_flow.md)
### [Examples](docs/examples.md)
