Open
Description
The translation sentence (<s />) elements have the same id value as the original sentence. See id="t0b0d0p0s0" in the following example. This violates the XML requirement that ID values be unique within a single document.
<p id="t0b0d0p0">
<s id="t0b0d0p0s0"><w id="t0b0d0p0s0w0" ARPABET="T HH IY S" time="0.72" dur="0.25">This</w> <w id="t0b0d0p0s0w1" ARPABET="IY S" time="0.97" dur="0.14">is</w> <w id="t0b0d0p0s0w2" ARPABET="AA" time="1.11" dur="0.05">a</w> <w id="t0b0d0p0s0w3" ARPABET="T EY S T" time="1.16" dur="0.58">test</w>.</s>
<s do-not-align="true" id="t0b0d0p0s0" sentence-id="t0b0d0p0s0" class="sentence__translation editable__translation" xml:lang="eng">Ceci est un test.</s>
</p>
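One way to confirm the collision is to count id values across the document. The following is a minimal sketch using Python's standard library; the function name duplicate_ids is hypothetical, and the sample is a trimmed copy of the example above (ARPABET and timing attributes omitted).

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Trimmed copy of the example above; only the attributes relevant to ids are kept.
SAMPLE = """<p id="t0b0d0p0">
<s id="t0b0d0p0s0"><w id="t0b0d0p0s0w0">This</w> <w id="t0b0d0p0s0w1">is</w> <w id="t0b0d0p0s0w2">a</w> <w id="t0b0d0p0s0w3">test</w>.</s>
<s do-not-align="true" id="t0b0d0p0s0" sentence-id="t0b0d0p0s0" class="sentence__translation editable__translation" xml:lang="eng">Ceci est un test.</s>
</p>"""


def duplicate_ids(xml_text):
    """Return the sorted list of id values that occur on more than one element."""
    root = ET.fromstring(xml_text)
    counts = Counter(
        el.get("id") for el in root.iter() if el.get("id") is not None
    )
    return sorted(value for value, n in counts.items() if n > 1)


print(duplicate_ids(SAMPLE))  # ['t0b0d0p0s0']
```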
There was an attempt to fix this issue, but there is now functionality that depends on this broken implementation. Additionally, any corrective action will need to support the "broken" implementation since older readalong XML files will not get fixed.
Recommendations
- append the suffix trN to the original sentence's id to generate t0b0d0p0s0tr0. Current read alongs have a single translation; the trN suffix would support additional translations.
- use the sentence-id attribute to identify a sentence's translation
- maintain current implementations to support older read along files.
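The recommendations above could be sketched roughly as follows. This is an illustration only, not the project's implementation: the function names are hypothetical, and it assumes that translations are marked by the sentence__translation class (as in the example) and that legacy files missing sentence-id can fall back to the duplicated id.

```python
import xml.etree.ElementTree as ET

# Legacy-style input: the translation reuses the original sentence's id.
SAMPLE = """<p id="t0b0d0p0">
<s id="t0b0d0p0s0"><w id="t0b0d0p0s0w0">This</w> <w id="t0b0d0p0s0w1">is</w>.</s>
<s do-not-align="true" id="t0b0d0p0s0" sentence-id="t0b0d0p0s0" class="sentence__translation editable__translation">Ceci est un test.</s>
</p>"""


def is_translation(el):
    # Assumption: the sentence__translation class marks a translation element.
    return "sentence__translation" in el.get("class", "").split()


def assign_translation_ids(root):
    """Rename each translation id to <sentence-id>trN, numbering N per sentence."""
    next_index = {}
    for el in root.iter("s"):
        if not is_translation(el):
            continue
        # Assumption: legacy files without sentence-id carry the sentence's id in id.
        sentence_id = el.get("sentence-id") or el.get("id")
        if sentence_id is None:
            continue
        n = next_index.get(sentence_id, 0)
        el.set("sentence-id", sentence_id)
        el.set("id", f"{sentence_id}tr{n}")
        next_index[sentence_id] = n + 1


def translations_for(root, sentence_id):
    """Find translations through sentence-id, which works for old and new files."""
    return [
        el for el in root.iter("s")
        if is_translation(el) and el.get("sentence-id") == sentence_id
    ]


root = ET.fromstring(SAMPLE)
assign_translation_ids(root)
print(translations_for(root, "t0b0d0p0s0")[0].get("id"))  # t0b0d0p0s0tr0
```

Because the lookup goes through sentence-id rather than id, it returns the same translation whether or not the file has been migrated, which is what lets older read along files keep working.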