
Quickstart example runs for a long time #998

@cspink

Description


I am new to OpenNMT-tf; its features look very useful for what I am trying to do. I started with the quickstart guide found here. Installing the software on an HPC grid, with its idiosyncrasies, can be cumbersome, and I am not entirely sure I have done everything correctly. The system does run, but even when I launch the example with the exact same commands as in the quickstart guide, on a machine with 4 GPUs, it runs for hours, possibly days, at a throughput of ~50k tokens/s. The guide also says this is too little data to get good results, so what will the training times be on real data?

Since this is described as a toy example, I found it odd that it might run for days on such a big system. I would like to know what I can expect from the autoconfig system described in the guide. I also have no idea when training will stop: does it use a convergence criterion or a fixed number of epochs?
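For a rough sense of scale, the runtime can be estimated from the observed throughput. Assuming training runs for a fixed step budget rather than until convergence, and assuming an effective batch size measured in tokens (both numbers below are illustrative assumptions; check the configuration that training prints at startup), a back-of-envelope calculation looks like:

```python
# Back-of-envelope training-time estimate from throughput.
# The step budget and effective batch size are assumed values for
# illustration only; substitute what your run actually reports.
steps = 500_000            # assumed fixed training-step budget
tokens_per_step = 25_000   # assumed effective batch size in tokens
throughput = 50_000        # observed tokens/s on 4 GPUs

seconds = steps * tokens_per_step / throughput
hours = seconds / 3600
print(f"~{hours:.0f} hours")  # ~69 hours at these assumed settings
```

Under these assumptions, a multi-day run at 50k tokens/s is plausible even on 4 GPUs, which would explain the observed behavior.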

The vocabulary files I get out have these line counts:
24998 toy-ende/src-vocab.txt
35819 toy-ende/tgt-vocab.txt
