Conversation

@sgzmd sgzmd commented Oct 13, 2025

Sometimes weird errors happen, terminating the download session. Specifying the `--ignore-errors` flag allows the tool to, well, ignore errors and keep going, which may be desirable behavior in some cases.

Summary by CodeRabbit

  • New Features

    • Added CLI options to control headless mode, error handling, and session limits: --headless, --ignore-errors, and --max-num-photos-in-session.
    • Progress is now persisted/restored so processing can resume from the last saved position.
    • Sessions automatically restart when the per-session photo limit is reached to continue downloads.
  • Bug Fixes

    • Improved error handling: errors are logged and progress is saved; with --ignore-errors processing continues instead of aborting.

coderabbitai bot commented Oct 13, 2025

Walkthrough

Replaces direct Playwright usage with a new browser abstraction (createBrowser), adds a commander-based CLI (headless, ignore-errors, max photos per session), robust boolean parsing, progress save/load, per-session download limits with browser restarts, and try/catch error handling that logs and persists progress and optionally continues.

Changes

Cohort / File(s): Summary
  • Main CLI & flow (index.js): Reworked CLI to use commander; added parseBooleanOption, options object, --headless, --ignore-errors, --max-num-photos-in-session; introduced progress helpers (getProgress, saveProgress); wrapped navigation/download loop in try/catch; tracks photosDownloaded and photosDownloadedInSession; restarts browser when per-session max reached; on error logs, saves progress, and either continues or exits based on --ignore-errors.
  • Browser abstraction (browser.js): New module exporting createBrowser(userDataDir, headless, userAgent) that resolves userAgent, applies stealth plugin, and launches a persistent Playwright Chromium context with configured args, viewport, and download settings.
  • Dependencies (package.json): Added commander dependency (^14.0.1).
  • Ignored file removed (download/.gitignore): Deleted .gitignore that previously ignored all files except itself.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant CLI as index.js (CLI)
  participant Browser as browser.js (createBrowser)
  participant Page as Page / Navigator
  participant Store as ProgressStore

  User->>CLI: run command (flags)
  CLI->>CLI: parse options (commander, parseBooleanOption)

  CLI->>Browser: createBrowser(userDataDir, headless, userAgent)
  Browser-->>CLI: persistent context

  rect rgb(240, 248, 255)
  CLI->>Page: open / navigate loop
  Page->>Store: getProgress()
  loop per-photo
    Page->>Page: downloadPhoto(), write metadata
    Page->>CLI: increment photosDownloadedInSession
    alt photosDownloadedInSession >= maxPerSession
      CLI->>Browser: close / restart
      CLI->>Browser: createBrowser(...)
    end
  end
  end

  alt error thrown
    Page->>CLI: throw
    CLI->>Store: saveProgress(currentUrl)
    alt ignoreErrors = true
      CLI->>Page: continue loop
    else
      CLI-->>User: log error + suggest --ignore-errors, exit(1)
    end
  else success
    CLI-->>User: complete normally
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

I hop through code with whiskers keen,
New tunnels built and pathways clean.
I save my steps when storms appear,
Restart my paws when limits near.
Flag me brave or make me rest—I'll cheer. 🐇✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
  • Description Check: ✅ Passed. Check skipped; CodeRabbit’s high-level summary is enabled.
  • Title Check: ✅ Passed. The title succinctly conveys the main feature added by the pull request, enabling the tool to ignore errors and continue processing, without extraneous details, and it aligns directly with the described objectives of the changeset.
  • Docstring Coverage: ✅ Passed. No functions found in the changes; docstring coverage check skipped.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
index.js (1)

28-33: Consider using a dedicated argument parsing library.

The current argument parsing using process.argv.includes() is functional but basic. For better robustness and future extensibility (help text, validation, type checking), consider using a library like yargs or commander.

Example with yargs:

import yargs from 'yargs';
import { hideBin } from 'yargs/helpers';

const argv = yargs(hideBin(process.argv))
  .option('headless', {
    type: 'boolean',
    default: true,
    description: 'Run browser in headless mode'
  })
  .option('ignore-errors', {
    type: 'boolean',
    default: false,
    description: 'Continue downloading despite errors'
  })
  .help()
  .argv;

const headless = argv.headless;
const ignoreErrors = argv.ignoreErrors;
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5a5593b and f510525.

📒 Files selected for processing (1)
  • index.js (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
index.js (1)
setup.js (1)
  • page (33-33)

Comment on lines 141 to 161
try {
  await page.evaluate(() => document.getElementsByClassName('SxgK2b OQEhnd')[0].click())

  // we wait until new photo is loaded
  await page.waitForURL((url) => {
    return url.host === 'photos.google.com' && url.href !== currentUrl
  },
  {
    timeout: timeoutValue,
  })

  await downloadPhoto(page)
  await saveProgress(page)
} catch (error) {
  console.error('An error occurred:', error);
  await saveProgress(page);
  if (!ignoreErrors) {
    console.log('Aborting due to error. Use --ignore-errors to continue on errors.');
    process.exit(1);
  }
}

⚠️ Potential issue | 🟠 Major

Add consecutive error limit to prevent infinite loops.

When ignoreErrors is enabled and errors persist (e.g., DOM structure changed, network issues), the script will loop indefinitely, repeatedly encountering the same error. This wastes resources and provides no clear indication when manual intervention is needed.

Add a consecutive error counter to abort after a threshold is reached:

 chromium.use(stealth())
 
 const timeoutValue = 300000
 const userDataDir = './session'
 const downloadPath = './download'
 
 let headless = true
 
 // accept --headless=false argument to run in headful mode
 if (process.argv[2] === '--headless=false') {
   headless = false
 }
 
 let ignoreErrors = false;
 
 // accept --ignore-errors argument to control error handling behavior
 if (process.argv.includes('--ignore-errors')) {
   ignoreErrors = true;
 }
+
+const MAX_CONSECUTIVE_ERRORS = 3;
+let consecutiveErrors = 0;

Then update the catch block:

     } catch (error) {
       console.error('An error occurred:', error);
       await saveProgress(page);
+      consecutiveErrors++;
+      
+      if (consecutiveErrors >= MAX_CONSECUTIVE_ERRORS) {
+        console.error(`Aborting after ${MAX_CONSECUTIVE_ERRORS} consecutive errors.`);
+        process.exit(1);
+      }
+      
       if (!ignoreErrors) {
         console.log('Aborting due to error. Use --ignore-errors to continue on errors.');
         process.exit(1);
       }
+      console.log(`Continuing despite error (${consecutiveErrors}/${MAX_CONSECUTIVE_ERRORS} consecutive errors)...`);
     }
   }

And reset the counter on success (after line 153):

       await downloadPhoto(page)
       await saveProgress(page)
+      consecutiveErrors = 0;  // Reset on success
     } catch (error) {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
Before:

try {
  await page.evaluate(() => document.getElementsByClassName('SxgK2b OQEhnd')[0].click())
  // we wait until new photo is loaded
  await page.waitForURL((url) => {
    return url.host === 'photos.google.com' && url.href !== currentUrl
  },
  {
    timeout: timeoutValue,
  })
  await downloadPhoto(page)
  await saveProgress(page)
} catch (error) {
  console.error('An error occurred:', error);
  await saveProgress(page);
  if (!ignoreErrors) {
    console.log('Aborting due to error. Use --ignore-errors to continue on errors.');
    process.exit(1);
  }
}

After:

chromium.use(stealth())

const timeoutValue = 300000
const userDataDir = './session'
const downloadPath = './download'

let headless = true

// accept --headless=false argument to run in headful mode
if (process.argv[2] === '--headless=false') {
  headless = false
}

let ignoreErrors = false

// accept --ignore-errors argument to control error handling behavior
if (process.argv.includes('--ignore-errors')) {
  ignoreErrors = true
}

const MAX_CONSECUTIVE_ERRORS = 3
let consecutiveErrors = 0

// … inside your main loop …
try {
  await page.evaluate(() => document.getElementsByClassName('SxgK2b OQEhnd')[0].click())

  // we wait until new photo is loaded
  await page.waitForURL(
    (url) => url.host === 'photos.google.com' && url.href !== currentUrl,
    { timeout: timeoutValue }
  )

  await downloadPhoto(page)
  await saveProgress(page)
  consecutiveErrors = 0 // Reset on success
} catch (error) {
  console.error('An error occurred:', error)
  await saveProgress(page)
  consecutiveErrors++

  if (consecutiveErrors >= MAX_CONSECUTIVE_ERRORS) {
    console.error(`Aborting after ${MAX_CONSECUTIVE_ERRORS} consecutive errors.`)
    process.exit(1)
  }

  if (!ignoreErrors) {
    console.log('Aborting due to error. Use --ignore-errors to continue on errors.')
    process.exit(1)
  }

  console.log(
    `Continuing despite error (${consecutiveErrors}/${MAX_CONSECUTIVE_ERRORS} consecutive errors)...`
  )
}
🤖 Prompt for AI Agents
In index.js around lines 141 to 161, add a consecutive error counter and
threshold to avoid infinite loops when ignoreErrors is true: declare a let
consecutiveErrorCount = 0 and const MAX_CONSECUTIVE_ERRORS = <reasonable number>
in the surrounding scope, increment consecutiveErrorCount inside the catch block
before deciding whether to continue, and if consecutiveErrorCount >=
MAX_CONSECUTIVE_ERRORS log a clear fatal message and call process.exit(1)
regardless of ignoreErrors; after successful operations (immediately after line
153 where the success path continues) reset consecutiveErrorCount = 0 so
transient failures don't trigger the abort.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
index.js (2)

40-47: Trim .lastdone and preserve original error.

Trailing newlines commonly break new URL(...). Also throw new Error(error) loses stack/message.

-const getProgress = async () => {
+const getProgress = async () => {
   try {
-    const lastDone = await fsP.readFile('.lastdone', 'utf-8')
-    if (lastDone === '') throw new Error('Please add the starting link in .lastdone file')
-    return lastDone
+    const lastDone = (await fsP.readFile('.lastdone', 'utf-8')).trim();
+    if (lastDone === '') throw new Error('Please add the starting link in .lastdone file');
+    return lastDone;
   } catch (error) {
-    throw new Error(error)
+    throw error; // preserves original message/stack
   }
 }

202-214: Fix ReferenceError: path is undefined in catch; recompute target path and validate.

In the catch block, path is out of scope (declared inside try). This will throw and skip moving the file.

   } catch (error) {
-    const randomNumber = Math.floor(Math.random() * 1000000)
-    const fileName = await download.suggestedFilename().replace(/(\.[\w\d_-]+)$/i, `_${randomNumber}$1`)
-
-    var downloadFilePath = path
-
-    // check for long paths that could result in ENAMETOOLONG and truncate if necessary
-    if (downloadFilePath.length > 225) {
-      downloadFilePath = truncatePath(downloadFilePath)
-    }
-
-    await moveFile(temp, `${downloadFilePath}`)
-    console.log('Download Complete:', `${downloadFilePath}`)
+    const randomNumber = Math.floor(Math.random() * 1000000);
+    const randomizedFileName = (await download.suggestedFilename()).replace(/(\.[\w\d_-]+)$/i, `_${randomNumber}$1`);
+    let downloadFilePath = validatePath(`${downloadPath}/${year}/${month}/${randomizedFileName}`);
+    // check for long paths that could result in ENAMETOOLONG and truncate if necessary
+    if (downloadFilePath.length > 225) {
+      downloadFilePath = truncatePath(downloadFilePath);
+    }
+    await moveFile(temp, downloadFilePath);
+    console.log('Download Complete:', downloadFilePath);
   }
♻️ Duplicate comments (1)
index.js (1)

157-164: Consecutive error limit to avoid infinite loops (already suggested).

This loop can spin forever with --ignore-errors. The prior review’s recommendation to add a max consecutive error counter still applies.

🧹 Nitpick comments (2)
browser.js (2)

12-15: Avoid hard‑coding channel: 'chromium' (portability).

For most environments the Playwright-managed Chromium is preferred. Forcing a system channel can fail if not installed. Make it configurable or drop it.

-  return chromium.launchPersistentContext(path.resolve(userDataDir), {
-    headless,
-    channel: 'chromium',
+  return chromium.launchPersistentContext(path.resolve(userDataDir), {
+    headless,
+    // channel: 'chromium', // consider making this configurable

16-24: Guard --no-sandbox usage.

Disabling the sandbox weakens security; only needed in specific containers/root setups. Consider toggling via CLI/env and omit by default.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f510525 and fe6888b.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (4)
  • browser.js (1 hunks)
  • download/.gitignore (0 hunks)
  • index.js (4 hunks)
  • package.json (1 hunks)
💤 Files with no reviewable changes (1)
  • download/.gitignore
✅ Files skipped from review due to trivial changes (1)
  • package.json
🧰 Additional context used
🧬 Code graph analysis (2)
browser.js (1)
index.js (3)
  • userDataDir (15-15)
  • headless (34-34)
  • userAgent (9-12)
index.js (1)
browser.js (2)
  • createBrowser (7-29)
  • createBrowser (7-29)
🔇 Additional comments (1)
index.js (1)

18-31: CLI boolean parsing looks solid.

Good use of a coercer to support true/false/1/0/yes/no and defaulting behavior.

Comment on lines +148 to +156
// Check if the maximum number of photos in session has been reached
if (maxNumPhotosInSession !== -1 && photosDownloadedInSession >= maxNumPhotosInSession) {
  console.log(`Downloaded ${photosDownloadedInSession} photos in this session. Restarting browser...`);
  await browser.close();
  photosDownloadedInSession = 0;
  browser = await createBrowser(userDataDir, headless, userAgent)
  page = await browser.newPage();
  await page.goto(currentUrl);
}

⚠️ Potential issue | 🔴 Critical

Session restart uses stale URL; resume from the current page instead.

currentUrl was captured before navigation. On restart you’ll revisit the old photo and risk duplicates.

-      if (maxNumPhotosInSession !== -1 && photosDownloadedInSession >= maxNumPhotosInSession) {
+      if (maxNumPhotosInSession !== -1 && photosDownloadedInSession >= maxNumPhotosInSession) {
         console.log(`Downloaded ${photosDownloadedInSession} photos in this session. Restarting browser...`);
-        await browser.close();
+        const resumeUrl = await page.url(); // capture current page before closing
+        await browser.close();
         photosDownloadedInSession = 0;
         browser = await createBrowser(userDataDir, headless, userAgent)
         page = await browser.newPage();
-        await page.goto(currentUrl);
+        await page.goto(resumeUrl);
       }
📝 Committable suggestion


Suggested change
Before:

// Check if the maximum number of photos in session has been reached
if (maxNumPhotosInSession !== -1 && photosDownloadedInSession >= maxNumPhotosInSession) {
  console.log(`Downloaded ${photosDownloadedInSession} photos in this session. Restarting browser...`);
  await browser.close();
  photosDownloadedInSession = 0;
  browser = await createBrowser(userDataDir, headless, userAgent)
  page = await browser.newPage();
  await page.goto(currentUrl);
}

After:

// Check if the maximum number of photos in session has been reached
if (maxNumPhotosInSession !== -1 && photosDownloadedInSession >= maxNumPhotosInSession) {
  console.log(`Downloaded ${photosDownloadedInSession} photos in this session. Restarting browser...`);
  const resumeUrl = await page.url(); // capture current page before closing
  await browser.close();
  photosDownloadedInSession = 0;
  browser = await createBrowser(userDataDir, headless, userAgent)
  page = await browser.newPage();
  await page.goto(resumeUrl);
}
🤖 Prompt for AI Agents
In index.js around lines 148-156, the restart logic uses the stale variable
currentUrl captured earlier, causing a revisit of the old photo; before closing
the browser capture the actual current page URL (e.g., const resumeUrl = await
page.url()), then after creating the new browser and page navigate to that
captured resumeUrl instead of currentUrl so scraping resumes where it left off.
