Clarification on Dataset Source, Trace Selection, and Data Split Details

Dear Author,

First of all, thank you for your excellent work! I have found your research to be both insightful and valuable.

I am writing to seek clarification regarding the dataset used in your project. Specifically, I have a few questions:

1. **What is the exact source of the dataset?**  
2. **Which specific traces were used in your study?**  
3. **How were the training, validation, and testing splits performed?**

I noticed that there appear to be two related, partially overlapping datasets of Android raw GNSS measurements (Fu et al., 2020):
- [Google Smartphone Decimeter Challenge](https://www.kaggle.com/competitions/google-smartphone-decimeter-challenge/data)
- [Android smartphones high accuracy GNSS datasets](https://www.kaggle.com/datasets/google/android-smartphones-high-accuracy-datasets)

In Section 5.3 of your journal paper, you mention:
> “(a) a training split (≈ 75% of the data set), (b) a validation split (≈ 10% of the data set), and (c) a testing split (≈ 15% of the data set).”

You also state:
> “As a result of the data set split, the training data set had 93,195 samples, the validation data set had 10,355 samples, and the testing data set had 16,568 samples.”

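However, the ratio of training to validation samples is 93,195 : 10,355 = 9 : 1 (exactly), which does not match the stated 75% : 10% = 7.5 : 1. The 9 : 1 ratio corresponds exactly to the parameter `frac: 0.9` in `[config\train_android_conf.yaml]`, which suggests that the validation set may have been carved out of a combined training/validation pool rather than out of the full data set.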
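For reference, the arithmetic can be checked with a short sketch. The three sample counts are copied from the quotation above; the variable names are my own, and the script does not use or reproduce the repository's splitting code. By my calculation the actual shares come out to roughly 77.6%, 8.6%, and 13.8% of the 120,118 samples in total.

```python
from fractions import Fraction

# Sample counts as quoted from the paper (Section 5.3).
train, val, test = 93_195, 10_355, 16_568
total = train + val + test

# Exact training-to-validation ratio, reduced to lowest terms.
train_val_ratio = Fraction(train, val)

# Share of each split relative to the full data set.
shares = {
    "train": train / total,
    "val": val / total,
    "test": test / total,
}

# Share of training samples within the combined training/validation pool.
# This is the quantity I am assuming `frac` refers to.
train_share_of_pool = Fraction(train, train + val)

print(f"total samples: {total}")
print(f"train : val = {train_val_ratio} : 1")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
print(f"train share of train+val pool: {float(train_share_of_pool)}")
```

If `frac: 0.9` is indeed applied to the combined training/validation pool, that would explain the exact 9 : 1 ratio, but this is only my reading of the numbers.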

Could you please clarify which dataset and traces were used, and how the split was actually performed? Your insights would be greatly appreciated.

Thank you in advance for your time and assistance.

Best regards,  
Sheng Liu
