I know that Reinforce.jl is not trying to emulate OpenAI gym exactly, but I'm curious about the reasoning behind a couple of interface decisions that seem inconsistent with gym's.
First, why doesn't reset!(env) return a state or observation for convenience? From personal experience, when I was using OpenAIGym.jl, reset!(env) always returned false. This happened because Julia implicitly returns the value of the last expression in a function, which in this case was env.done = false. I had to look through the source code to figure out what was going on. Returning a state/observation would be consistent with gym and would avoid confusion for new users.
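As a rough illustration of what I mean (ToyEnv and these reset! bodies are just a made-up sketch, not the actual Reinforce.jl or OpenAIGym.jl source):

```julia
# Toy environment purely for illustration.
mutable struct ToyEnv
    state::Vector{Float64}
    done::Bool
end

function reset!(env::ToyEnv)
    env.state = zeros(length(env.state))
    env.done = false          # last expression, so reset! returns false
end

# A gym-style variant would end with the observation, so callers
# immediately get the initial state back:
function reset_gym_style!(env::ToyEnv)
    env.state = zeros(length(env.state))
    env.done = false
    return env.state          # explicitly return the fresh observation
end
```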
Second, why does step!(env, s, a) return r, s′ instead of s′, r? This is a minor difference in ordering, but once again, coming from gym I had a different expectation for what step! should return.
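For concreteness, here's a toy step! (reusing the illustrative ToyEnv from above; the transition and reward are made up) showing the ordering I'm referring to:

```julia
function step!(env::ToyEnv, s, a)
    s′ = s .+ a               # toy transition
    r  = -sum(abs2, s′)       # toy reward
    env.state = s′
    return r, s′              # Reinforce.jl-style ordering: (reward, next state)
end

env = ToyEnv(zeros(2), false)
r, s′ = step!(env, env.state, [0.1, -0.1])
# whereas gym's env.step(action) puts the next observation first:
# observation, reward, done, info = env.step(action)
```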