MOS testing problem: action looks random while using POUCT planner

Hi! So I just wanted to test out my understanding by re-creating the MOS problem in a different way. 
- I store the robot state and the object states as two separate numpy arrays. Because of this, instead of `dim` I have an array giving the min x, y and max x, y.
- I use POUCT with a histogram belief for the object search.
- My actions are forward, turn left 45°, turn right 45°, and declare (equivalent to find).
- Instead of having look as an action, I made the agent look all the time, so it always receives an observation from the sensor.
- The policy model just returns all actions, since I don't have the look-before-find restriction.
- The reward is -1 for a turn, -3 for forward, -1000 for a wrong declare, and +1000 for a correct declare if the agent is within a certain distance of the object.
- num_sims = 500
- Planning runs a little slowly, but that's not the main issue.
- For simplicity, I plot and save the poses of the robot and the objects at every timestep.
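
To make the reward described above concrete, here is a minimal sketch of it. It is not copied from `problem.py`: the action names, the `declare_radius` parameter, and its default value are assumptions for illustration, since "a certain distance" is not pinned down above.

```python
import math

TURN_COST = -1.0
FORWARD_COST = -3.0
DECLARE_REWARD = 1000.0
DECLARE_PENALTY = -1000.0


def reward(action, robot_xy, object_xy, declare_radius=1.0):
    """Reward for one step, following the bullet list above.

    `declare_radius` is a hypothetical name and value for the
    "certain distance" within which a declare counts as correct.
    """
    if action in ("turn_left", "turn_right"):
        return TURN_COST
    if action == "forward":
        return FORWARD_COST
    if action == "declare":
        dist = math.dist(robot_xy, object_xy)  # Euclidean distance
        return DECLARE_REWARD if dist <= declare_radius else DECLARE_PENALTY
    raise ValueError(f"unknown action: {action!r}")
```

Note the scale difference: a single declare is worth about 1000 times a turn, so the values the planner estimates are dominated by the declare outcomes it happens to sample.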

I set epsilon to 1 just to make sure everything works first. With that setting, the probability of each object's pose in the belief is already very high after a few iterations, so `sample_belief` gives good information right away, since the probability is basically 1. The problem is that the actions it takes still look pretty random. I tried debugging the tree, but the highest-valued action in the tree, given the observation, looks very random as well.
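
For reference, this is roughly how I compare the actions at the root of the search tree. It is a minimal sketch that assumes the root node exposes a `children` mapping from action to a node with `value` and `num_visits` attributes (I believe pomdp_py's POUCT tree looks like this after planning, via `agent.tree.children`, but treat that as an assumption). The numbers in `fake_children` are made-up stand-ins so the snippet runs on its own; they are not output from my run.

```python
from types import SimpleNamespace


def summarize_root_actions(children):
    """Return (action, num_visits, value) rows, highest value first.

    `children` is assumed to map each action to an object that has
    `num_visits` and `value` attributes.
    """
    rows = [(action, node.num_visits, node.value)
            for action, node in children.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Stand-in for the real root children (illustrative numbers only).
fake_children = {
    "forward": SimpleNamespace(num_visits=140, value=-35.0),
    "turn_left": SimpleNamespace(num_visits=130, value=-32.5),
    "turn_right": SimpleNamespace(num_visits=125, value=-40.0),
    "declare": SimpleNamespace(num_visits=105, value=-600.0),
}

for action, visits, value in summarize_root_actions(fake_children):
    print(f"{action:>10}  visits={visits:4d}  value={value:9.2f}")
```

If the visit counts are spread almost evenly and the values sit close together, the argmax can flip between planning calls, which would match the randomness I'm seeing.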

Here's the link to the full code: https://drive.google.com/file/d/1Kul8ay_ruGrTcr2hjo2_NfcwzlntAB3N/view?usp=sharing
Just run `python problem.py`.
Let me know if you have any questions about the code. It is very similar to the original in structure and there is a lot of it, so I wasn't sure whether I should paste everything here.

Here are the robot's poses after each action. They look very random, and the robot keeps trying to go straight even when it can't, since the grid is only 3 by 27.

![1](https://github.com/user-attachments/assets/3cc5139b-e65c-41aa-a10d-153067a455d0)
![2](https://github.com/user-attachments/assets/d83fd98d-3350-49a8-b30e-1d610b210481)
![3](https://github.com/user-attachments/assets/23a06883-c136-4159-9e52-99a91656be4f)
![4](https://github.com/user-attachments/assets/75019248-0b28-4e4d-a1f2-f98119f6f537)
![5](https://github.com/user-attachments/assets/9a1cbf05-b7cb-421d-b39d-f0ea8eaf5ae2)
![6](https://github.com/user-attachments/assets/c2026f66-e042-4b4d-b298-8f42000df508)
![7](https://github.com/user-attachments/assets/eeb37766-0766-4de3-9bec-32aac11dfedd)
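
To make "can't go straight" concrete, here is a minimal sketch of a forward move checked against the min/max bounds array mentioned earlier. It is not copied from `problem.py`: the function name, the unit step, the rounding of 45° headings to grid cells, and the choice to stay in place when blocked are all assumptions.

```python
import math

import numpy as np


def forward(pose, bounds, step=1):
    """Move one cell along the heading, or stay put if that leaves the grid.

    pose   = (x, y, heading_deg), heading in multiples of 45 degrees
    bounds = np.array([min_x, min_y, max_x, max_y]), inclusive
    """
    x, y, heading = pose
    # Rounding maps 45-degree headings onto the 8 neighbouring cells.
    nx = x + step * round(math.cos(math.radians(heading)))
    ny = y + step * round(math.sin(math.radians(heading)))
    min_x, min_y, max_x, max_y = bounds
    if not (min_x <= nx <= max_x and min_y <= ny <= max_y):
        return pose  # blocked: the pose does not change
    return (nx, ny, heading)


# A 3-by-27 grid, assuming x is the short axis.
bounds = np.array([0, 0, 2, 26])
print(forward((2, 5, 0), bounds))   # at the x edge, facing +x: blocked
print(forward((0, 5, 90), bounds))  # facing +y: free to move
```

Under these assumptions a blocked forward still costs -3 and leaves the pose unchanged, so the planner should learn to avoid it. That is why the repeated forward attempts in the plots look wrong to me.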

