Metadata-Version: 2.1
Name: dm-robotics-agentflow
Version: 0.3.0
Summary: Tools for single-embodiment, multiple-task, Reinforcement Learning
Home-page: https://github.com/deepmind/dm_robotics/tree/main/py/agentflow
Author: DeepMind
License: Apache 2.0
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Scientific/Engineering
Requires-Python: >=3.7, <3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: dm-robotics-transformations
Requires-Dist: dm-robotics-geometry
Requires-Dist: numpy (>=1.16.0)
Requires-Dist: dm-control (==0.0.425341097)
Requires-Dist: opencv-python (>=3.4.0)
Requires-Dist: attrs (>=20.3.0)
Requires-Dist: pydot (>=1.2.4)
Requires-Dist: typing-extensions (>=3.7.4)

# AgentFlow: A Modular Toolkit for Scalable RL Research

## Overview

`AgentFlow` is a library for composing Reinforcement-Learning agents. The core
features that AgentFlow provides are:

1.  tools for slicing, transforming, and composing *specs*
2.  tools for encapsulating and composing RL tasks
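
To make "slicing" and "composing" a spec concrete, here is a minimal sketch in plain Python. It does not use AgentFlow's API: `BoundedSpec`, `slice_spec`, and `compose_specs` are hypothetical names invented for illustration, and the spec is assumed to be a flat, one-dimensional action description.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BoundedSpec:
    """Hypothetical stand-in for a bounded 1-D spec (not an AgentFlow class)."""
    names: Tuple[str, ...]
    minimum: Tuple[float, ...]
    maximum: Tuple[float, ...]


def slice_spec(spec: BoundedSpec, start: int, stop: int) -> BoundedSpec:
    """Returns the sub-spec covering dimensions [start, stop)."""
    return BoundedSpec(
        names=spec.names[start:stop],
        minimum=spec.minimum[start:stop],
        maximum=spec.maximum[start:stop],
    )


def compose_specs(*specs: BoundedSpec) -> BoundedSpec:
    """Concatenates several specs into one, preserving their order."""
    names: Tuple[str, ...] = ()
    minimum: Tuple[float, ...] = ()
    maximum: Tuple[float, ...] = ()
    for spec in specs:
        names += spec.names
        minimum += spec.minimum
        maximum += spec.maximum
    return BoundedSpec(names, minimum, maximum)


# A made-up robot action spec: three arm joints followed by one gripper.
robot_spec = BoundedSpec(
    names=("joint_0", "joint_1", "joint_2", "gripper"),
    minimum=(-1.0, -1.0, -1.0, 0.0),
    maximum=(1.0, 1.0, 1.0, 1.0),
)
arm_spec = slice_spec(robot_spec, 0, 3)
gripper_spec = slice_spec(robot_spec, 3, 4)
```

In this sketch, slicing isolates the part of the action space that a skill controls, and composing the slices in order recovers the original spec.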

Unlike the standard RL setup, which assumes a single environment and an agent,
`AgentFlow` is designed for the single-embodiment, multiple-task regime. This
design is motivated by robotics use cases, which frequently require training RL
modules for various skills and then composing them (possibly together with
non-learned controllers).

Instead of having to implement a separate RL environment for each skill and
combine them ad hoc, with `AgentFlow` you can define one or more `SubTasks`
which *modify* a timestep from a single top-level environment, e.g. adding
observations and defining rewards, or isolating a particular sub-system of the
environment, such as a robot arm.
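
The idea of a subtask that modifies a timestep can be sketched as follows. This is an illustration under stated assumptions, not AgentFlow's actual interface: `TimeStep`, `ReachSubTask`, `modify_timestep`, and the observation keys are all hypothetical, and the reward (negative distance to a target) is an arbitrary choice.

```python
import math
from typing import Dict, NamedTuple, Tuple

Vector = Tuple[float, ...]


class TimeStep(NamedTuple):
    """Hypothetical, simplified timestep: observations plus a reward."""
    observation: Dict[str, Vector]
    reward: float


class ReachSubTask:
    """Hypothetical subtask: isolates the arm and rewards reaching a target."""

    def __init__(self, target: Vector):
        self._target = tuple(target)

    def modify_timestep(self, parent: TimeStep) -> TimeStep:
        """Builds the agent-facing timestep from the environment's timestep."""
        arm_pos = parent.observation["arm_pos"]
        # Euclidean distance between the arm position and the target.
        distance = math.sqrt(
            sum((p - t) ** 2 for p, t in zip(arm_pos, self._target)))
        # Keep only the arm observation and add the task-specific target.
        observation = {"arm_pos": arm_pos, "target_pos": self._target}
        # Define the task reward; the parent's reward is ignored here.
        return TimeStep(observation=observation, reward=-distance)


# One timestep from a made-up top-level environment.
env_timestep = TimeStep(
    observation={"arm_pos": (0.0, 3.0), "camera": (1.0,)}, reward=0.0)
agent_timestep = ReachSubTask(target=(4.0, 0.0)).modify_timestep(env_timestep)
```

The top-level environment's timestep is left untouched, so several such subtasks could each derive their own view from the same environment.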

You then *compose* `SubTasks` with regular RL agents to form modules, and use a
set of graph-building operators to define the flow of these modules over time
(hence the name `AgentFlow`).

The graph-building step is entirely optional, and is intended only for use-cases
that require something like a (possibly learnable, possibly stochastic)
state-machine.
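
As a rough picture of what a flow of modules over time means, the sketch below runs a fixed sequence of modules, handing control to the next one when the current one reports that it is done. It is a deterministic toy, not AgentFlow's graph operators: `Module`, `run_sequence`, `reach`, and `grasp` are hypothetical names, and the integer state stands in for a real environment.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical module type: maps a state to (new_state, done).
Module = Callable[[int], Tuple[int, bool]]


def run_sequence(
    modules: Sequence[Tuple[str, Module]],
    state: int,
    max_steps: int = 100,
) -> Tuple[int, List[str]]:
    """Runs each module until it reports done, then moves to the next one."""
    trace: List[str] = []
    for name, module in modules:
        for _ in range(max_steps):  # Bound each module's run time.
            state, done = module(state)
            trace.append(name)
            if done:
                break
    return state, trace


def reach(state: int) -> Tuple[int, bool]:
    """Toy skill: advance by one step; done once the state reaches 3."""
    new_state = state + 1
    return new_state, new_state >= 3


def grasp(state: int) -> Tuple[int, bool]:
    """Toy skill: a single step that always finishes immediately."""
    return state + 10, True


final_state, trace = run_sequence([("reach", reach), ("grasp", grasp)], 0)
```

A learnable or stochastic state machine would replace the fixed ordering here with a policy that chooses which module runs next.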

### [Components](docs/components.md)
### [Control Flow](docs/control_flow.md)
### [Examples](docs/examples.md)


