This repository was archived by the owner on Dec 21, 2017. It is now read-only.

VeRI XML file has the wrong encoding #72

@agude

The first line of the VeRI train_label.xml file is:

<?xml version="1.0" encoding="gb2312" ?>

However, xml.etree.ElementTree raises a ValueError when reading this file:

ValueError                                Traceback (most recent call last)
<ipython-input-2-c9f2779c1c84> in <module>()
      2 hog_features_test = HOGFeatureProducer(veri_test)
      3 
----> 4 veri_train = VeriDataset("pelops/datasets/VeRi", "train")
      5 hog_features_train = HOGFeatureProducer(veri_train)

pelops/pelops/datasets/veri.py in __init__(self, dataset_path, set_type)
     48         self.__color_type = {}
     49         if self.set_type is utils.SetType.ALL or self.set_type is utils.SetType.TRAIN:
---> 50             self.__build_metadata_dict()
     51         self.__set_chips()
     52 

pelops/pelops/datasets/veri.py in __build_metadata_dict(self)
     53     def __build_metadata_dict(self):
     54         """Extract car type and color from the label file."""
---> 55         root = xml.etree.ElementTree.parse(self.__filepaths.label_train).getroot()
     56         try:
     57             root = xml.etree.ElementTree.parse(self.__filepaths.label_train).getroot()

/opt/conda/lib/python3.5/xml/etree/ElementTree.py in parse(source, parser)
   1182     """
   1183     tree = ElementTree()
-> 1184     tree.parse(source, parser)
   1185     return tree
   1186 

/opt/conda/lib/python3.5/xml/etree/ElementTree.py in parse(self, source, parser)
    594                     # It can be used to parse the whole source without feeding
    595                     # it with chunks.
--> 596                     self._root = parser._parse_whole(source)
    597                     return self._root
    598             while True:

ValueError: multi-byte encodings are not supported

To work around this, the first line must be changed to:

<?xml version="1.0" encoding="UTF-8" ?>
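
An alternative that avoids editing the dataset file by hand might look like the sketch below. It is not the code in veri.py: the helper name parse_declared_gb2312 is hypothetical, and it assumes the file's bytes really are gb2312 as the declaration claims. The idea is to decode the bytes in Python, drop the XML declaration, and hand the resulting string to ElementTree so that expat never has to handle the multi-byte encoding itself.

```python
import re
import xml.etree.ElementTree as ET


def parse_declared_gb2312(path):
    """Hypothetical helper: parse an XML file whose declaration says gb2312.

    Assumes the bytes on disk are valid gb2312; a file in a different
    encoding will raise UnicodeDecodeError here.
    """
    with open(path, "rb") as handle:
        raw = handle.read()

    # Decode in Python instead of asking expat to do it.
    text = raw.decode("gb2312")

    # Remove the leading XML declaration so the parser never sees the
    # gb2312 label; the text is already a decoded str at this point.
    text = re.sub(r"^\s*<\?xml[^>]*\?>", "", text, count=1)

    # Returns the root Element, like parse(...).getroot() would.
    return ET.fromstring(text)
```

A caller could then replace the xml.etree.ElementTree.parse(...).getroot() call with parse_declared_gb2312(path_to_label_file), where path_to_label_file stands in for whatever path the dataset class already uses.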
