A ScalaPy Facade for OpenAI Gym!
The main aim of this facade is to let you use the environments described in OpenAI Gym from Scala.
Currently, there is no interest in creating environments on the Scala side. The development workflow consists of:
- develop your reinforcement learning algorithm in Scala,
- create a functional facade to interact with ScalaPy Gym,
- test your algorithms against the OpenAI baselines and share your results!
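The workflow above can be sketched in plain Scala. This is only an illustration under stated assumptions: the trait `SimpleEnv`, the case class `Step`, the toy `Corridor` environment, and the `Rollout` helper are hypothetical names and are not part of scalapy-gym. In a real project the facade would delegate to a Gym environment instead of the toy one.

```scala
// Hypothetical sketch: none of these names come from scalapy-gym.
final case class Step[O](observation: O, reward: Double, done: Boolean)

// Minimal facade the algorithm depends on; a real one would wrap a Gym environment.
trait SimpleEnv[A, O] {
  def reset(): O
  def step(action: A): Step[O]
}

// Toy stand-in environment: walk right from cell 0 until cell `goal` is reached.
final class Corridor(goal: Int) extends SimpleEnv[Int, Int] {
  private var position = 0
  def reset(): Int = { position = 0; position }
  def step(action: Int): Step[Int] = {
    // Action 1 moves right; any other action moves left (clamped at 0).
    position = if (action == 1) position + 1 else math.max(0, position - 1)
    val done = position >= goal
    Step(position, if (done) 1.0 else 0.0, done)
  }
}

object Rollout {
  // Runs one episode with a policy given as a pure function and returns the total reward.
  def episode[A, O](env: SimpleEnv[A, O], policy: O => A, maxSteps: Int): Double = {
    @annotation.tailrec
    def loop(obs: O, stepsLeft: Int, total: Double): Double =
      if (stepsLeft == 0) total
      else {
        val result = env.step(policy(obs))
        val newTotal = total + result.reward
        if (result.done) newTotal else loop(result.observation, stepsLeft - 1, newTotal)
      }
    loop(env.reset(), maxSteps, 0.0)
  }
}
```

Because the algorithm only sees `SimpleEnv`, the same `Rollout.episode` call could be reused unchanged once the toy environment is swapped for a facade backed by Gym.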
First, set up your ScalaPy project correctly; please refer to the ScalaPy documentation.
Then, add this library as a dependency in your sbt file:
libraryDependencies += "io.github.cric96" %% "scalapy-gym" % "<x.y.z>"
Then install the OpenAI Gym dependencies. I suggest using pyenv. The main dependencies are:
- gym
- scipy
See requirements.txt.
To use other environments (Box2D, MuJoCo, and Atari), please refer to the OpenAI documentation.
This library tries to make environments type-safe, so you have to define:
- action type
- observation type
- action space type
- observation space type
For example, for FrozenLake you should write:
val env = Gym.make[Int, Int, Discrete, Discrete]("FrozenLake-v0")
If you do not care about the action and observation types, you can write:
val env = Gym.unsafe("FrozenLake-v0")
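To illustrate what the four type parameters buy you, here is a minimal, self-contained sketch. It does not reproduce the scalapy-gym definitions: `Space`, `DiscreteSpace`, `TypedEnv`, and `ToyGrid` are hypothetical names chosen for this example, and the toy dynamics are arbitrary.

```scala
// Hypothetical sketch: these definitions are NOT the scalapy-gym API.
import scala.util.Random

// A space knows which values of type A are valid and how to sample one.
trait Space[A] {
  def contains(value: A): Boolean
  def sample(random: Random): A
}

// Finite space {0, ..., n - 1}, analogous to Gym's Discrete space.
final case class DiscreteSpace(n: Int) extends Space[Int] {
  def contains(value: Int): Boolean = value >= 0 && value < n
  def sample(random: Random): Int = random.nextInt(n)
}

// The four type parameters tie actions and observations to their spaces.
trait TypedEnv[A, O, AS <: Space[A], OS <: Space[O]] {
  def actionSpace: AS
  def observationSpace: OS
  def step(action: A): O
}

// Toy environment with 4 actions and 16 states (the sizes of a 4x4 FrozenLake grid).
object ToyGrid extends TypedEnv[Int, Int, DiscreteSpace, DiscreteSpace] {
  val actionSpace: DiscreteSpace = DiscreteSpace(4)
  val observationSpace: DiscreteSpace = DiscreteSpace(16)
  private var state = 0
  def step(action: Int): Int = {
    require(actionSpace.contains(action), s"invalid action: $action")
    state = (state + action + 1) % observationSpace.n // arbitrary toy dynamics
    state
  }
}
```

With this shape, a call such as `ToyGrid.step("left")` is rejected by the compiler, while an untyped environment would only fail at runtime on the Python side.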
A simple loop that advances the simulation could be:
import io.github.cric96.gym.Gym
val env = Gym.unsafe("FrozenLake-v0") // or EnvFactory.ToyText.frozenLakeV0
env.reset()
// 1000 iterations; render and step in the same pass so each frame is followed by a random action
val observations = (0 until 1000).map { _ =>
  env.render
  env.step(env.actionSpace.sample())
}
env.close()
The Python counterpart is:
import gym
env = gym.make("FrozenLake-v0")
env.reset()
for _ in range(1000):
    env.render()
    env.step(env.action_space.sample())  # take a random action
env.close()
As you can see, the experience is very similar :)
Some environments already have the correct typing (see EnvFactory):
- ToyText
  - FrozenLake
  - GuessingGame
  - HotterColder
  - nChain
  - Roulette
- ClassicControl
  - Acrobot
  - CartPole
  - MountainCar
  - MountainCarContinuous
  - Pendulum
- Atari
- Box2D
  - BipedalWalker
  - BipedalWalkerHardcore
  - CarRacing
  - LunarLander
  - LunarLanderContinuous
- MuJoCo
- Robotics
- Algorithms