feat: update the course intro and all lesson intros of the JS2 course to be about JavaScript #1653

Open
wants to merge 3 commits into
base: master
@@ -1,14 +1,14 @@
---
-title: Downloading HTML with Python
+title: Downloading HTML with Node.js
sidebar_label: Downloading HTML
-description: Lesson about building a Python application for watching prices. Using the HTTPX library to download HTML code of a product listing page.
+description: Lesson about building a Node.js application for watching prices. Using the Fetch API to download HTML code of a product listing page.
slug: /scraping-basics-javascript2/downloading-html
unlisted: true
---

import Exercises from './_exercises.mdx';

-**In this lesson we'll start building a Python application for watching prices. As a first step, we'll use the HTTPX library to download HTML code of a product listing page.**
+**In this lesson we'll start building a Node.js application for watching prices. As a first step, we'll use the Fetch API to download HTML code of a product listing page.**

---
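
As a rough illustration of downloading HTML with the Fetch API, here is a minimal sketch. The helper name `downloadHtml`, its injectable `fetchImpl` parameter, and the placeholder URL are assumptions made for this example and are not taken from the lesson. It relies on the global `fetch` available in Node.js 18 and later.

```javascript
// Hypothetical helper: downloads a page and returns its HTML as a string.
// `fetchImpl` defaults to the global fetch built into Node.js 18+;
// making it a parameter is only an assumption to keep the sketch easy to test.
async function downloadHtml(url, fetchImpl = fetch) {
  const response = await fetchImpl(url);
  // fetch() resolves even for 4xx/5xx responses, so check the status explicitly.
  if (!response.ok) {
    throw new Error(`Request to ${url} failed with HTTP ${response.status}`);
  }
  return response.text();
}

// Usage with a placeholder URL:
// downloadHtml('https://example.com/').then((html) => console.log(html.length));
```

Throwing on a non-OK status is one possible design choice. It keeps error pages from being mistaken for product listings further down the pipeline.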

@@ -1,14 +1,14 @@
---
-title: Parsing HTML with Python
+title: Parsing HTML with Node.js
sidebar_label: Parsing HTML
-description: Lesson about building a Python application for watching prices. Using the Beautiful Soup library to parse HTML code of a product listing page.
+description: Lesson about building a Node.js application for watching prices. Using the Cheerio library to parse HTML code of a product listing page.
slug: /scraping-basics-javascript2/parsing-html
unlisted: true
---

import Exercises from './_exercises.mdx';

-**In this lesson we'll look for products in the downloaded HTML. We'll use BeautifulSoup to turn the HTML into objects which we can work with in our Python program.**
+**In this lesson we'll look for products in the downloaded HTML. We'll use Cheerio to turn the HTML into objects which we can work with in our Node.js program.**

---

@@ -1,14 +1,14 @@
---
-title: Locating HTML elements with Python
+title: Locating HTML elements with Node.js
sidebar_label: Locating HTML elements
-description: Lesson about building a Python application for watching prices. Using the Beautiful Soup library to locate products on the product listing page.
+description: Lesson about building a Node.js application for watching prices. Using the Cheerio library to locate products on the product listing page.
slug: /scraping-basics-javascript2/locating-elements
unlisted: true
---

import Exercises from './_exercises.mdx';

-**In this lesson we'll locate product data in the downloaded HTML. We'll use BeautifulSoup to find those HTML elements which contain details about each product, such as title or price.**
+**In this lesson we'll locate product data in the downloaded HTML. We'll use Cheerio to find those HTML elements which contain details about each product, such as title or price.**

---

@@ -1,7 +1,7 @@
---
-title: Extracting data from HTML with Python
+title: Extracting data from HTML with Node.js
sidebar_label: Extracting data from HTML
-description: Lesson about building a Python application for watching prices. Using string manipulation to extract and clean data scraped from the product listing page.
+description: Lesson about building a Node.js application for watching prices. Using string manipulation to extract and clean data scraped from the product listing page.
slug: /scraping-basics-javascript2/extracting-data
unlisted: true
---
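
To make the idea of cleaning scraped data with string manipulation more concrete, here is a small sketch. The helper `parsePriceCents` and the sample price labels are hypothetical, and the code assumes US-style formatting (comma as thousands separator, dot as decimal point).

```javascript
// Hypothetical helper: turns a scraped price label into an integer number of cents.
function parsePriceCents(text) {
  // Keep digits and the decimal point only, dropping currency symbols,
  // thousands separators, and surrounding words or whitespace.
  const cleaned = text.replace(/[^0-9.]/g, '');
  if (cleaned === '' || Number.isNaN(Number(cleaned))) {
    throw new Error(`Cannot parse price from "${text}"`);
  }
  // Work in whole cents to avoid floating-point surprises when comparing prices later.
  return Math.round(Number(cleaned) * 100);
}

console.log(parsePriceCents(' $1,398.00 ')); // 139800
console.log(parsePriceCents('From $74.95')); // 7495
```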
@@ -1,12 +1,12 @@
---
-title: Saving data with Python
+title: Saving data with Node.js
sidebar_label: Saving data
-description: Lesson about building a Python application for watching prices. Using standard library to save data scraped from product listing pages in popular formats such as CSV or JSON.
+description: Lesson about building a Node.js application for watching prices. Using the json2csv library to save data scraped from product listing pages in both JSON and CSV.
slug: /scraping-basics-javascript2/saving-data
unlisted: true
---

-**In this lesson, we'll save the data we scraped in the popular formats, such as CSV or JSON. We'll use Python's standard library to export the files.**
+**In this lesson, we'll save the data we scraped in the popular formats, such as CSV or JSON. We'll use the json2csv library to export the files.**

---
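
To show what exporting to JSON and CSV involves, the sketch below serializes a couple of items using only built-in JavaScript features. It deliberately does not use json2csv, and it is not that library's API. The hand-written `toCsv` and `escapeCsvField` helpers and the sample items are assumptions for illustration, hinting at the quoting work a CSV library takes care of.

```javascript
// Hypothetical sample of scraped items; the field names are assumptions.
const items = [
  { title: 'Deluxe "XL" TV', price: 139800 },
  { title: 'Speaker, wireless', price: 7495 },
];

// Quote a CSV field when it contains a comma, a double quote, or a line break;
// embedded double quotes are doubled.
function escapeCsvField(value) {
  const text = String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Build CSV text with a header row taken from the keys of the first item.
function toCsv(rows) {
  if (rows.length === 0) return '';
  const columns = Object.keys(rows[0]);
  const lines = [columns.map(escapeCsvField).join(',')];
  for (const row of rows) {
    lines.push(columns.map((column) => escapeCsvField(row[column])).join(','));
  }
  return lines.join('\n') + '\n';
}

const json = JSON.stringify(items, null, 2);
const csv = toCsv(items);
console.log(csv);
```

Both strings could then be written to disk, for example with `writeFile` from `node:fs/promises`.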

@@ -1,14 +1,14 @@
---
-title: Getting links from HTML with Python
+title: Getting links from HTML with Node.js
sidebar_label: Getting links from HTML
-description: Lesson about building a Python application for watching prices. Using the Beautiful Soup library to locate links to individual product pages.
+description: Lesson about building a Node.js application for watching prices. Using the Cheerio library to locate links to individual product pages.
slug: /scraping-basics-javascript2/getting-links
unlisted: true
---

import Exercises from './_exercises.mdx';

-**In this lesson, we'll locate and extract links to individual product pages. We'll use BeautifulSoup to find the relevant bits of HTML.**
+**In this lesson, we'll locate and extract links to individual product pages. We'll use Cheerio to find the relevant bits of HTML.**

---
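
Links found in HTML are often relative, so one likely step after extracting them is turning them into absolute URLs. The sketch below assumes a hypothetical listing URL and hypothetical `href` values, and uses the built-in `URL` class rather than anything specific to Cheerio.

```javascript
// Hypothetical values: a listing page URL and href attributes as they might
// appear in its HTML.
const listingUrl = 'https://example.com/collections/sales';
const hrefs = [
  '/products/tv-55',
  'products/speaker',
  'https://example.com/products/amp?ref=1',
];

// new URL(href, base) resolves relative links against the base URL
// and leaves absolute links unchanged.
const productUrls = hrefs.map((href) => new URL(href, listingUrl).href);
console.log(productUrls);
```

Note that the second link has no leading slash, so it resolves against the directory of the listing page (`/collections/`) rather than the site root.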

@@ -1,14 +1,14 @@
---
-title: Crawling websites with Python
+title: Crawling websites with Node.js
sidebar_label: Crawling websites
-description: Lesson about building a Python application for watching prices. Using the HTTPX library to follow links to individual product pages.
+description: Lesson about building a Node.js application for watching prices. Using the Fetch API to follow links to individual product pages.
slug: /scraping-basics-javascript2/crawling
unlisted: true
---

import Exercises from './_exercises.mdx';

-**In this lesson, we'll follow links to individual product pages. We'll use HTTPX to download them and BeautifulSoup to process them.**
+**In this lesson, we'll follow links to individual product pages. We'll use the Fetch API to download them and Cheerio to process them.**

---

@@ -1,7 +1,7 @@
---
-title: Scraping product variants with Python
+title: Scraping product variants with Node.js
sidebar_label: Scraping product variants
-description: Lesson about building a Python application for watching prices. Using browser DevTools to figure out how to extract product variants and exporting them as separate items.
+description: Lesson about building a Node.js application for watching prices. Using browser DevTools to figure out how to extract product variants and exporting them as separate items.
slug: /scraping-basics-javascript2/scraping-variants
unlisted: true
---
@@ -1,7 +1,7 @@
---
-title: Using a scraping framework with Python
+title: Using a scraping framework with Node.js
sidebar_label: Using a framework
-description: Lesson about building a Python application for watching prices. Using the Crawlee framework to simplify creating a scraper.
+description: Lesson about building a Node.js application for watching prices. Using the Crawlee framework to simplify creating a scraper.
slug: /scraping-basics-javascript2/framework
unlisted: true
---
@@ -1,7 +1,7 @@
---
-title: Using a scraping platform with Python
+title: Using a scraping platform with Node.js
sidebar_label: Using a platform
-description: Lesson about building a Python application for watching prices. Using the Apify platform to deploy a scraper.
+description: Lesson about building a Node.js application for watching prices. Using the Apify platform to deploy a scraper.
slug: /scraping-basics-javascript2/platform
unlisted: true
---
@@ -9,32 +9,32 @@ unlisted: true

import DocCardList from '@theme/DocCardList';

-**Learn how to use Python to extract information from websites in this practical course, starting from the absolute basics.**
+**Learn how to use JavaScript to extract information from websites in this practical course, starting from the absolute basics.**

---

-In this course we'll use Python to create an application for watching prices. It'll be able to scrape all product pages of an e-commerce website and record prices. Data from several runs of such a program would be useful for seeing trends in price changes, detecting discounts, etc.
+In this course we'll use JavaScript to create an application for watching prices. It'll be able to scrape all product pages of an e-commerce website and record prices. Data from several runs of such a program would be useful for seeing trends in price changes, detecting discounts, etc.

![E-commerce listing on the left, JSON with data on the right](./images/scraping.webp)

## What we'll do

 - Inspect pages using browser DevTools.
-- Download web pages using the HTTPX library.
-- Extract data from web pages using the Beautiful Soup library.
-- Save extracted data in various formats, e.g. CSV which MS Excel or Google Sheets can open.
+- Download web pages using the Fetch API.
+- Extract data from web pages using the Cheerio library.
+- Save extracted data in various formats (e.g. CSV which MS Excel or Google Sheets can open) using the json2csv library.
 - Follow links programmatically (crawling).
 - Save time and effort with frameworks, such as Crawlee, and scraping platforms, such as Apify.

## Who this course is for

-Anyone with basic knowledge of developing programs in Python who wants to start with web scraping can take this course. The course does not expect you to have any prior knowledge of web technologies or scraping.
+Anyone with basic knowledge of developing programs in JavaScript who wants to start with web scraping can take this course. The course does not expect you to have any prior knowledge of other web technologies or scraping.

## Requirements

-- A macOS, Linux, or Windows machine with a web browser and Python installed.
-- Familiarity with Python basics: variables, conditions, loops, functions, strings, lists, dictionaries, files, classes, and exceptions.
-- Comfort with importing from the Python standard library, using virtual environments, and installing dependencies with `pip`.
+- A macOS, Linux, or Windows machine with a web browser and Node.js installed.
+- Familiarity with JavaScript basics: variables, conditions, loops, functions, strings, arrays, objects, files, classes, and exceptions.
+- Comfort with building a Node.js package and installing dependencies with `npm`.
 - Familiarity with running commands in Terminal (macOS/Linux) or Command Prompt (Windows).

## You may want to know