The Multi_Accuracy metric is not compatible with mxnet 1.6.0

Hi,

I tried to train the network, changing only the batch size and GPUs from the default settings, and got the following error. It occurs right after the first batch interval is logged.

`[09:20:05] src/nnvm/legacy_json_util.cc:209: Loading symbol saved by previous version v0.8.0. Attempting to upgrade...
[09:20:05] src/nnvm/legacy_json_util.cc:217: Symbol successfully upgraded!
[09:20:05] src/nnvm/legacy_json_util.cc:209: Loading symbol saved by previous version v0.8.0. Attempting to upgrade...
[09:20:05] src/nnvm/legacy_json_util.cc:217: Symbol successfully upgraded!
INFO:root:start with arguments Namespace(batch_size=2, benchmark=0, data_nthreads=128, disp_batches=20, dtype='float32', gpus='0', image_shape='3,512,512', kv_store='device', load_epoch=None, lr=0.1, lr_factor=0.1, lr_step_epochs='100,200', max_random_aspect_ratio=0, max_random_h=0, max_random_l=0, max_random_rotate_angle=0, max_random_s=0, max_random_scale=1, max_random_shear_ratio=0, min_random_scale=1, model_prefix='./model/tasn', mom=0, monitor=0, network=None, num_classes=200, num_epochs=300, num_examples=5994, num_layers=None, optimizer='sgd', pad_size=0, random_crop=1, random_mirror=1, rgb_mean='123.68,116.779,103.939', test_io=0, top_k=5, wd=0)
[09:20:05] src/io/iter_image_recordio_2.cc:178: ImageRecordIOParser2: ./data/cub/train.rec, use 4 threads for decoding..
[09:20:08] src/io/iter_image_recordio_2.cc:178: ImageRecordIOParser2: ./data/cub/val.rec, use 4 threads for decoding..
learning rate from ``lr_scheduler`` has been overwritten by ``learning_rate`` in optimizer.
INFO:root:Epoch[0] Batch [0-20]	Speed: 33.71 samples/sec	att_net_accuracy=0.000000	part_net_accuracy=0.023810	master_net_accuracy=0.023810	part_net_aux_accuracy=0.023810	master_net_aux_accuracy=0.023810	distillation_loss=5.296982
Traceback (most recent call last):
  File "train.py", line 57, in <module>
    eval_metric = evaluate.Multi_Accuracy(num=6))
  File "/home/ysu/project/attention_net/tasn/tasn-mxnet/example/tasn/common/fit.py", line 195, in fit
    monitor            = monitor
  File "/home/ysu/mxnet_attention/lib/python3.5/site-packages/mxnet/module/base_module.py", line 533, in fit
    self.update_metric(eval_metric, data_batch.label)
  File "/home/ysu/mxnet_attention/lib/python3.5/site-packages/mxnet/module/module.py", line 775, in update_metric
    self._exec_group.update_metric(eval_metric, labels, pre_sliced)
  File "/home/ysu/mxnet_attention/lib/python3.5/site-packages/mxnet/module/executor_group.py", line 640, in update_metric
    eval_metric.update_dict(labels_, preds)
  File "/home/ysu/mxnet_attention/lib/python3.5/site-packages/mxnet/metric.py", line 133, in update_dict
    self.update(label, pred)
  File "/home/ysu/project/attention_net/tasn/tasn-mxnet/example/tasn/common/evaluate.py", line 32, in update
    self.sum_metric[i] += (pred_label.flat == label.flat).sum()
TypeError: 'float' object is not subscriptable
`

The cause is that in MXNet 1.6.0 the `EvalMetric` class tracks not only `num_inst` and `sum_metric`, but also `global_num_inst` and `global_sum_metric`.

The batch_end_callback (here, `Speedometer`) now calls `reset_local()` to reset `num_inst` and `sum_metric`, rather than `reset()` as in older versions of MXNet.

However, the `Multi_Accuracy` class does not implement `reset_local()`, so the inherited `EvalMetric.reset_local()` runs and resets `sum_metric` to the scalar `0.0`. The next call to `update()` then fails when it indexes `self.sum_metric[i]`.
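The mechanism above can be reproduced without MXNet. The sketch below is only an illustration: `EvalMetricStandIn` is a hypothetical stand-in that mimics the two reset methods as described in this issue, and `PerOutputMetric` / `FixedPerOutputMetric` are made-up names, not the actual `Multi_Accuracy` code. It suggests that overriding `reset_local()` in the metric may be enough to avoid the error.

```python
class EvalMetricStandIn:
    """Hypothetical stand-in for the base metric (not MXNet's class)."""

    def __init__(self, name):
        self.name = name
        self.reset()

    def reset(self):
        self.num_inst = 0
        self.sum_metric = 0.0
        self.global_num_inst = 0
        self.global_sum_metric = 0.0

    def reset_local(self):
        # Clears only the local counters, always to scalars.
        self.num_inst = 0
        self.sum_metric = 0.0


class PerOutputMetric(EvalMetricStandIn):
    """Keeps one counter per network output; lacks reset_local()."""

    def __init__(self, num):
        self.num = num  # must be set before the base class calls reset()
        super().__init__("multi-accuracy")

    def reset(self):
        self.num_inst = [0] * self.num
        self.sum_metric = [0.0] * self.num

    def update(self, correct_counts, batch_size):
        for i, correct in enumerate(correct_counts):
            self.sum_metric[i] += correct
            self.num_inst[i] += batch_size


class FixedPerOutputMetric(PerOutputMetric):
    """Same metric, with reset_local() keeping the per-output lists."""

    def reset_local(self):
        self.reset()


broken = PerOutputMetric(num=2)
broken.reset_local()  # what the callback does after logging
try:
    broken.update([1, 2], batch_size=2)
    broken_failed = False
except TypeError:
    broken_failed = True  # sum_metric is now a float, so indexing fails

fixed = FixedPerOutputMetric(num=2)
fixed.reset_local()
fixed.update([1, 2], batch_size=2)  # works: sum_metric is still a list
```

In this sketch `reset_local()` simply delegates to `reset()`, which is fine here because the subclass keeps no global counters; a metric that relies on `global_sum_metric` would need to preserve them instead.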

A quick workaround is to set the `auto_reset` argument of `Speedometer` to `False`.
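The workaround above might look like the following. This is a sketch under assumptions: the helper name `make_batch_end_callback` is hypothetical, and where exactly `Speedometer` is constructed in `common/fit.py` is not shown in this issue. `mx.callback.Speedometer` does accept `batch_size`, `frequent`, and `auto_reset`.

```python
def make_batch_end_callback(batch_size, disp_batches):
    """Build a Speedometer that never calls reset_local() on the metric."""
    # Imported here so that defining this helper does not itself need MXNet.
    import mxnet as mx

    return mx.callback.Speedometer(
        batch_size,
        frequent=disp_batches,
        auto_reset=False,  # skip the per-interval metric reset
    )
```

With `auto_reset=False` the metric is no longer cleared after each logging interval, so the logged accuracies accumulate over the epoch instead of covering only the most recent batches.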
