
A simple Node.js project that scrapes real-time dam data from embalses.net using Cheerio, and exposes it via a REST API. The project collects data on water levels, capacity, and variations of Spain's dams.


regadior/spanish-dams


Dam Data Scraping

This project scrapes dam data from embalses.net. Its goal is to gather real-time information on Spain's dams: water levels, capacity, and variation compared with the previous week and with the same period in previous years.

Features

  • Dam Data Scraping: The project scrapes dam data from the website embalses.net using cheerio and fetch.
  • Dam Information: The data scraped includes the dam's name, water level, water percentage, total capacity, variation compared to last week, and comparisons to the same period from last year.
  • Simple API: A simple REST API exposes the dam data in JSON format.

Installation

  1. Clone the repository to your local machine:

    git clone https://github.com/regadior/spanish-dams.git

  2. Install the dependencies:

    npm install

  3. Run the application in development mode:

    npm run dev

    The application will start running on the default port (3000) or the port specified in the .env file.
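
The port fallback described above can be sketched as follows. This is only an illustration: the README does not name the environment variable, so `PORT` is an assumption, as are the `resolvePort` helper and the idea that the .env file has already been loaded into `process.env` (for example by the dotenv package).

```javascript
// Assumption: the port variable is called PORT; the README only says that
// the port can be set in the .env file.
const DEFAULT_PORT = 3000;

// Hypothetical helper: returns the configured port, or 3000 when the
// variable is missing or is not a valid TCP port number.
function resolvePort(env = process.env) {
  const parsed = Number.parseInt(env.PORT, 10);
  const isValid = Number.isInteger(parsed) && parsed > 0 && parsed < 65536;
  return isValid ? parsed : DEFAULT_PORT;
}

console.log(`The server would listen on port ${resolvePort()}`);
```

Under this assumption, a .env file containing the single line `PORT=8080` would move the server from port 3000 to port 8080.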

How It Works

Cheerio:

Cheerio is used to scrape the data from the target website. It loads the HTML structure of the web page and allows you to query and extract the relevant data. It behaves similarly to jQuery, providing an easy way to interact with HTML elements and traverse the document.

In this project, cheerio is used to:

  • Load the HTML of the main dam data page and find tables with information about dams.
  • Navigate through the HTML structure to get the desired information about each dam.
  • Scrape additional data from individual dam pages by following links extracted from the main page.

API:

An Express server exposes the scraped data: a GET request to /api/data returns it in JSON format.

Endpoints

  • GET /api/data: Returns the scraped dam data in JSON format.
