This repository was archived by the owner on Feb 2, 2022. It is now read-only.

Multi fatality parsing issue #232

@rgreinho

Issue Type

  • Bug report

Current Behavior

$ scrapd --from "February 11, 2020" --to "February 11, 2020" -vvv
2020-02-14T13:23:52-0600 scrapd.core.apd:233  Retrieving fatalities from 2020-02-11 to 2020-02-11.
2020-02-14T13:23:52-0600 scrapd.core.apd:238  Fetching page 1...
2020-02-14T13:23:54-0600 scrapd.core.apd:39   http://austintexas.gov/department/news/296
2020-02-14T13:23:54-0600 scrapd.core.apd:249  11 fatality page(s) to process.
2020-02-14T13:23:54-0600 scrapd.core.apd:39   http://austintexas.gov/news/fatality-crash-16-2
2020-02-14T13:23:56-0600 scrapd.core.apd:39   http://austintexas.gov/news/fatality-crash-15-2
2020-02-14T13:23:56-0600 scrapd.core.apd:174  Errors while parsing http://austintexas.gov/news/fatality-crash-15-2:
Article fields:
	 * could not retrieve the notes information
2020-02-14T13:23:56-0600 scrapd.core.apd:39   http://austintexas.gov/news/fatality-crash-10-1
2020-02-14T13:23:56-0600 scrapd.core.apd:39   http://austintexas.gov/news/fatality-crash-8-2
2020-02-14T13:23:57-0600 scrapd.core.apd:39   http://austintexas.gov/news/fatality-crash-6-2
2020-02-14T13:23:58-0600 scrapd.core.apd:39   http://austintexas.gov/news/fatality-crash-9-2
2020-02-14T13:23:58-0600 scrapd.core.apd:39   http://austintexas.gov/news/fatality-crash-14-2
2020-02-14T13:23:58-0600 scrapd.core.apd:39   http://austintexas.gov/news/fatality-crash-7-2
2020-02-14T13:23:58-0600 scrapd.core.apd:39   http://austintexas.gov/news/fatality-crash-13-2
2020-02-14T13:23:58-0600 scrapd.core.apd:39   http://austintexas.gov/news/fatality-crash-12-1
2020-02-14T13:23:58-0600 scrapd.core.apd:39   http://austintexas.gov/news/fatality-crash-11-2
2020-02-14T13:23:58-0600 scrapd.core.apd:282  1 fatality page(s) is/are within the specified time range.
2020-02-14T13:23:58-0600 scrapd.core.apd:238  Fetching page 2...
2020-02-14T13:24:00-0600 scrapd.core.apd:39   http://austintexas.gov/department/news/296?page=1
2020-02-14T13:24:00-0600 scrapd.core.apd:249  7 fatality page(s) to process.
2020-02-14T13:24:00-0600 scrapd.core.apd:39   http://austintexas.gov/news/traffic-fatality-85
2020-02-14T13:24:00-0600 scrapd.core.apd:39   http://austintexas.gov/news/fatality-crash-2-2
2020-02-14T13:24:02-0600 scrapd.core.apd:39   http://austintexas.gov/news/fatality-crash-4-1
2020-02-14T13:24:02-0600 scrapd.core.apd:39   http://austintexas.gov/news/traffic-fatality-86
2020-02-14T13:24:02-0600 scrapd.core.apd:39   http://austintexas.gov/news/fatality-crash-3-2
2020-02-14T13:24:02-0600 scrapd.core.apd:39   http://austintexas.gov/news/fatality-crash-5-2
2020-02-14T13:24:02-0600 scrapd.core.apd:39   http://austintexas.gov/news/fatality-crash-1-2
2020-02-14T13:24:02-0600 scrapd.core.apd:282  0 fatality page(s) is/are within the specified time range.
2020-02-14T13:24:02-0600 scrapd.core.apd:286  There are no data within the specified time range on page 2.
2020-02-14T13:24:02-0600 scrapd.cli.cli:89   Total: 1

Outputs:

[
  {
    "case": "20-0420110",
    "crash": 15,
    "date": "2020-02-11",
    "fatalities": [
      {
        "age": 21,
        "dob": "1998-04-28",
        "ethnicity": "White",
        "first": "Owen",
        "gender": "Male",
        "generation": "",
        "last": "Macki",
        "middle": "William"
      }
    ],
    "latitude": 0.0,
    "link": "http://austintexas.gov/news/fatality-crash-15-2",
    "location": "North Capital of Texas Hwy/North Mopac NB Svrd",
    "longitude": 0.0,
    "notes": "",
    "time": "02:02:00"
  }
]
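A quick way to spot the discrepancy is to count the fatalities recorded for each report in the JSON output. The following is only a sketch: `fatality_counts` and `SAMPLE` are hypothetical names introduced here, not part of scrapd, and `SAMPLE` is a trimmed copy of the output above.

```python
import json

# Trimmed copy of the output above (illustrative only).
SAMPLE = """
[
  {
    "case": "20-0420110",
    "crash": 15,
    "link": "http://austintexas.gov/news/fatality-crash-15-2",
    "fatalities": [
      {"first": "Owen", "middle": "William", "last": "Macki"}
    ]
  }
]
"""


def fatality_counts(raw_json):
    """Map each report link to the number of fatalities parsed from it."""
    return {
        entry["link"]: len(entry["fatalities"])
        for entry in json.loads(raw_json)
    }


if __name__ == "__main__":
    for link, count in fatality_counts(SAMPLE).items():
        print(f"{link}: {count} fatality(ies)")
```

For the report in this issue, the count is 1 where 2 is expected.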

Expected Behavior

A second fatality should have been extracted from this report:

 Raquel Gitane Aveytia | Asian female | 07/26/1995
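To illustrate the expected result, here is a minimal sketch that turns lines of the form `Name | Ethnicity gender | MM/DD/YYYY` into fatality entries shaped like the JSON output. It is not scrapd's actual parser: `parse_deceased_line` and `parse_deceased_block` are hypothetical helpers, the pipe-delimited layout is assumed from the line quoted above, and the line for the first victim is reconstructed from the JSON output rather than copied from the report.

```python
import datetime


def parse_deceased_line(line):
    """Parse one 'Name | Ethnicity gender | MM/DD/YYYY' line into a dict.

    Assumes exactly three pipe-separated parts, a name of at least two
    words, and a two-word 'ethnicity gender' description.
    """
    name, description, dob = [part.strip() for part in line.split("|")]
    first, *middle, last = name.split()
    ethnicity, gender = description.split()
    return {
        "first": first,
        "middle": " ".join(middle),
        "last": last,
        "ethnicity": ethnicity.capitalize(),
        "gender": gender.capitalize(),
        "dob": datetime.datetime.strptime(dob, "%m/%d/%Y").date().isoformat(),
    }


def parse_deceased_block(text):
    """Parse every line containing exactly two pipes; skip everything else."""
    return [
        parse_deceased_line(line)
        for line in text.splitlines()
        if line.count("|") == 2
    ]


if __name__ == "__main__":
    # First line reconstructed from the JSON output (assumption);
    # second line quoted from this issue.
    block = (
        "Owen William Macki | White male | 04/28/1998\n"
        "Raquel Gitane Aveytia | Asian female | 07/26/1995\n"
    )
    for fatality in parse_deceased_block(block):
        print(fatality)
```

The key point is that the parser iterates over every matching line instead of stopping after the first one, so a report listing two victims yields two entries in `fatalities`.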

Possible Solution

Steps to Reproduce

  1. scrapd --from "February 11, 2020" --to "February 11, 2020" -vvv

The report is located at https://austintexas.gov/news/fatality-crash-15-2

Labels

kind/bug: Something isn't working
