Clarification on Dataset Source, Trace Selection, and Data Split Details #10

@liusheng2020

Dear Author,

First of all, thank you for your excellent work! I have found your research to be both insightful and valuable.

I am writing to seek clarification regarding the dataset used in your project. Specifically, I have a few questions:

  1. What is the exact source of the dataset?
  2. Which specific traces were used in your study?
  3. How were the training, validation, and testing splits performed?

I noticed that there appear to be two related, partially overlapping datasets of Android Raw GNSS Measurements (Fu et al., 2020):

In Section 5.3 of your journal paper, you mention:

“(a) a training split (≈ 75% of the data set), (b) a validation split (≈ 10% of the data set), and (c) a testing split (≈ 15% of the data set).”

You also state:

“As a result of the data set split, the training data set had 93,195 samples, the validation data set had 10,355 samples, and the testing data set had 16,568 samples.”

However, the ratio of training to validation samples is 93,195:10,355 = 9:1 (exactly), which does not match the stated 75%:10% = 7.5:1. The 9:1 ratio exactly corresponds to the parameter frac: 0.9 in [config\train_android_conf.yaml].
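
For reference, the arithmetic above can be checked with a short Python sketch. The three sample counts are the ones quoted from Section 5.3; the variable names and the `split_shares` helper are only illustrative, and reading `frac: 0.9` as the training fraction of the combined training and validation pool is my assumption, not something the paper or the config confirms.

```python
from fractions import Fraction

# Sample counts quoted from Section 5.3 of the paper.
train, val, test = 93_195, 10_355, 16_568


def split_shares(train, val, test):
    """Return each split's share of the full data set, in percent (2 decimals)."""
    total = train + val + test
    return tuple(round(100 * n / total, 2) for n in (train, val, test))


# Exact ratio of training to validation samples.
train_val_ratio = Fraction(train, val)

# Training fraction of the combined train+validation pool
# (assumed to be what `frac: 0.9` controls).
train_frac_of_pool = Fraction(train, train + val)

shares = split_shares(train, val, test)

print("train : val =", train_val_ratio)             # 9
print("train / (train + val) =", train_frac_of_pool)  # 9/10
print("shares (%) train/val/test =", shares)
```

Under these counts the actual shares come out to roughly 77.59% / 8.62% / 13.79%, rather than the stated 75% / 10% / 15%.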

Could you please clarify this discrepancy and the points above? Your insights would be greatly appreciated.

Thank you in advance for your time and assistance.

Best regards,
Sheng Liu
