Pipeline should handle manifest files with OSX/Windows line endings #119

@mhidas

I've just tried uploading a .map_manifest file into the new AODN_moorings_nocheck pipeline (see https://github.com/aodn/chef-private/pull/2984) on 4-nec-hob. The manifest file looked like this:

/mnt/ebs/tmp/test_data/QLD/IMOS_ANMN-QLD_CTPSOKUE_20091120T045000Z_GBRMYR_FV01_GBRMYR-0911-WQM-189_END-20100601T040000Z_C-20120201T063245Z.nc,IMOS/ANMN/QLD/GBRMYR/Biogeochem_timeseries/IMOS_ANMN-QLD_CTPSOKUE_20091120T045000Z_GBRMYR_FV01_GBRMYR-0911-WQM-189_END-20100601T040000Z_C-20120201T063245Z.nc
/mnt/ebs/tmp/test_data/QLD/IMOS_ANMN-QLD_CTPSOKUE_20091120T045000Z_GBRMYR_FV01_GBRMYR-0911-WQM-19_END-20100601T040000Z_C-20120201T063238Z.nc,IMOS/ANMN/QLD/GBRMYR/Biogeochem_timeseries/IMOS_ANMN-QLD_CTPSOKUE_20091120T045000Z_GBRMYR_FV01_GBRMYR-0911-WQM-19_END-20100601T040000Z_C-20120201T063238Z.nc
/mnt/ebs/tmp/test_data/QLD/IMOS_ANMN-QLD_CTPSOKUE_20101103T090500Z_GBRMYR_FV01_GBRMYR-1010-WQM-188_END-20110414T212900Z_C-20120129T141746Z.nc,IMOS/ANMN/QLD/GBRMYR/Biogeochem_timeseries/IMOS_ANMN-QLD_CTPSOKUE_20101103T090500Z_GBRMYR_FV01_GBRMYR-1010-WQM-188_END-20110414T212900Z_C-20120129T141746Z.nc

The pipeline sort of pretended to process the file, ran a harvester, and eventually reported SUCCESS in the log (see 4-nec-hob:/mnt/ebs/log/pipeline/process/tasks.AODN_moorings_nocheck.log, task id aa617f77-4c7a-46a9-828f-964672c52a8e), but in fact it failed completely:

  • Only one harvester (moorings_metadata) was selected to run, though several others have regexes matching the files in the collection;
  • The harvester ran without errors, but actually harvested nothing (it did report a warning for every file like "FILE_INDEX_UPDATER - WARNING: IMOS/ANMN/QLD/GBRMYR/Biogeochem_timeseries/IMOS_ANMN-QLD_CTPSOKUE_20091120T045000Z_GBRMYR_FV01_GBRMYR-0911-WQM-189_END-20100601T040000Z_C-20120201T063245Z.nc not found on index");
  • It did something on S3, but the files are not correctly uploaded. I can see them e.g. in here, but clicking on any of the files results in the error message "The specified key does not exist."

Trying to list the uploaded files on S3 gives weird results:

4-nec-hob:/mnt/imos-test-data$ ll IMOS/ANMN/QLD/GBRMYR/Biogeochem_timeseries/
ls: cannot access 'IMOS/ANMN/QLD/GBRMYR/Biogeochem_timeseries/IMOS_ANMN-QLD_CTPSOKUE_20091120T045000Z_GBRMYR_FV01_GBRMYR-0911-WQM-189_END-20100601T040000Z_C-20120201T063245Z.nc'$'\n': No such file or directory
ls: cannot access 'IMOS/ANMN/QLD/GBRMYR/Biogeochem_timeseries/IMOS_ANMN-QLD_CTPSOKUE_20091120T045000Z_GBRMYR_FV01_GBRMYR-0911-WQM-19_END-20100601T040000Z_C-20120201T063238Z.nc'$'\n': No such file or directory
ls: cannot access 'IMOS/ANMN/QLD/GBRMYR/Biogeochem_timeseries/IMOS_ANMN-QLD_CTPSOKUE_20101103T090500Z_GBRMYR_FV01_GBRMYR-1010-WQM-188_END-20110414T212900Z_C-20120129T141746Z.nc'$'\n': No such file or directory
ls: cannot access 'IMOS/ANMN/QLD/GBRMYR/Biogeochem_timeseries/IMOS_ANMN-QLD_CTPSOKUE_20111017T062000Z_GBRMYR_FV01_GBRMYR-1110-WQM-187_END-20120412T221800Z_C-20121112T033903Z.nc'$'\n': No such file or directory
ls: cannot access 'IMOS/ANMN/QLD/GBRMYR/Biogeochem_timeseries/IMOS_ANMN-QLD_CTPSOKUE_20111017T062000Z_GBRMYR_FV01_GBRMYR-1110-WQM-19_END-20120412T221800Z_C-20121112T033855Z.nc'$'\n': No such file or directory
total 2
drwxrwxrwx 1 root root 0 Jan  1  1970 ./
drwxrwxrwx 1 root root 0 Jan  1  1970 ../
?????????? ? ?    ?    ?            ? IMOS_ANMN-QLD_CTPSOKUE_20091120T045000Z_GBRMYR_FV01_GBRMYR-0911-WQM-189_END-20100601T040000Z_C-20120201T063245Z.nc?
?????????? ? ?    ?    ?            ? IMOS_ANMN-QLD_CTPSOKUE_20091120T045000Z_GBRMYR_FV01_GBRMYR-0911-WQM-19_END-20100601T040000Z_C-20120201T063238Z.nc?
?????????? ? ?    ?    ?            ? IMOS_ANMN-QLD_CTPSOKUE_20101103T090500Z_GBRMYR_FV01_GBRMYR-1010-WQM-188_END-20110414T212900Z_C-20120129T141746Z.nc?
?????????? ? ?    ?    ?            ? IMOS_ANMN-QLD_CTPSOKUE_20111017T062000Z_GBRMYR_FV01_GBRMYR-1110-WQM-187_END-20120412T221800Z_C-20121112T033903Z.nc?
?????????? ? ?    ?    ?            ? IMOS_ANMN-QLD_CTPSOKUE_20111017T062000Z_GBRMYR_FV01_GBRMYR-1110-WQM-19_END-20120412T221800Z_C-20121112T033855Z.nc?
drwxrwxrwx 1 root root 0 Jan  1  1970 non-QC/

Looks like maybe an extra character (shown by ls as $'\n', i.e. a newline) got added on to the end of each file name somewhere along the way?
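For illustration, here is a minimal sketch of how manifest lines could be read so that the line-ending style does not leak into the file names. It assumes each line has the form `local_path,destination_path`, as in the sample manifest above. The function name `parse_manifest` is hypothetical and is not taken from the pipeline code.

```python
def parse_manifest(text):
    """Yield (source_path, dest_path) pairs from manifest text.

    Hypothetical helper for illustration, not the pipeline's actual code.
    Assumes one "local_path,destination_path" entry per line.
    """
    # str.splitlines() splits on \n, \r\n and \r alike and drops the terminators
    for line in text.splitlines():
        line = line.strip()  # also removes stray spaces or tabs around the entry
        if not line:
            continue  # skip blank lines
        source, sep, dest = line.partition(",")
        # a line without a comma has no destination path
        yield source, (dest if sep else None)


# Mixed Windows (\r\n), old Mac (\r) and Unix (\n) line endings in one manifest
sample = "/tmp/a.nc,IMOS/a.nc\r\n/tmp/b.nc,IMOS/b.nc\r/tmp/c.nc,IMOS/c.nc\n"
pairs = list(parse_manifest(sample))
```

If the manifest is read from disk in Python 3, opening it in text mode with the default `newline=None` also translates `\r\n` and `\r` to `\n` on read, so iterating over the file and stripping each line would be another option.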
