Why do we need the features of *all* rounds to predict the final reward? #19
wongsingfo asked this question in Q&A (unanswered)

Both Mortal and Suphx [1] use a global reward predictor to predict the final game reward at the point when the i-th round begins. The predictor uses the features (i.e. the scores of the 4 players, grand_kyoku, honba, and kyotaku) of not only the i-th round but also of all previous rounds.

I am wondering why we need the features from before the i-th round. I think the final reward should be conditionally independent of them given the current round's features: no matter how well or how poorly the player performed from the first round to the (i-1)-th round, the expected final ranking should be the same as long as the features of the i-th round are the same.

[1] Li et al. Suphx: Mastering Mahjong with Deep Reinforcement Learning. arXiv:2003.13590, 2020. Section 3.2.
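For concreteness, here is a minimal PyTorch sketch of a GRU-based global reward predictor of the kind described above. The class name, the 7-dimensional per-round feature layout, and the sizes are illustrative assumptions, not Mortal's or Suphx's actual implementation.

```python
import torch
import torch.nn as nn

class GlobalRewardPredictor(nn.Module):
    """Sketch of a GRU-based global reward predictor (illustrative only).

    Each round contributes one feature vector (4 player scores,
    grand_kyoku, honba, kyotaku -> 7 dims here); the GRU consumes the
    sequence of rounds 1..i and predicts the 4 players' final rewards.
    """

    def __init__(self, in_dim: int = 7, hidden: int = 64):
        super().__init__()
        self.gru = nn.GRU(in_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 4)  # one final-reward estimate per player

    def forward(self, rounds: torch.Tensor) -> torch.Tensor:
        # rounds: (batch, num_rounds_so_far, in_dim)
        _, h_n = self.gru(rounds)         # h_n: (1, batch, hidden)
        return self.head(h_n.squeeze(0))  # (batch, 4)

# Predicting at the start of the 3rd round: the model sees all 3 rounds' features.
model = GlobalRewardPredictor()
feats = torch.randn(1, 3, 7)  # batch=1, 3 rounds so far, 7 features per round
print(model(feats).shape)     # torch.Size([1, 4])
```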
Replies: 2 comments
- I'm not sure. I was following Suphx's method because it had been tested and shown to work. Maybe you could run an experiment that replaces the GRU part with a 2-layer MLP of the same number of parameters, and see whether the performance is the same (see the sketch after these replies).
- I think the assumption here is that a player will tend to use the same strategy in all rounds (this holds for both human players and AIs), so you can predict a player's behaviour from its actions in previous rounds.
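The ablation suggested in the first reply could look something like the following sketch: a 2-layer MLP that sees only the i-th round's features. Everything here (class name, feature layout, widths) is an illustrative assumption, not code from Mortal. Comparing this baseline's validation loss against the GRU predictor's would directly test the conditional-independence claim made in the question.

```python
import torch
import torch.nn as nn

class MlpRewardPredictor(nn.Module):
    """Ablation baseline: 2-layer MLP over the current round's features only.

    If the final reward really is conditionally independent of earlier
    rounds given the current round's features, this should match the GRU
    predictor. `hidden` is a free knob; widen it to match the GRU
    variant's parameter count for a fair comparison.
    """

    def __init__(self, in_dim: int = 7, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 4),  # one final-reward estimate per player
        )

    def forward(self, current_round: torch.Tensor) -> torch.Tensor:
        # current_round: (batch, in_dim) -- the i-th round's features only
        return self.net(current_round)

# Same prediction point as before, but only the current round's features are used.
model = MlpRewardPredictor()
feats = torch.randn(1, 7)
print(model(feats).shape)  # torch.Size([1, 4])
```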