Replies: 2 comments
-
Yeah, plus one! Is there really no way to do this? The crawler seems to default to storing its output in Crawlee-provided storage, writing it to disk, but ideally the crawler would just return the data from the handler function and let the developer choose how and where to persist it.
-
You just need to collect the result into a variable. The crawler cannot simply return the data, because it does not keep arbitrary results in memory.

```js
import { PlaywrightCrawler, Configuration } from 'crawlee';

// Disable writing results to disk.
Configuration.getGlobalConfig().set('persistStorage', false);

let result;
const playwrightCrawler = new PlaywrightCrawler({
    // ...your other options,
    async requestHandler({ request, page, log, parseWithCheerio }) {
        result = getData(); // your own extraction logic
    },
});

await playwrightCrawler.run([{ url: 'https://example.com' }]);
return result; // or resolve(result)
```
-
I have an application that does the following:
I've read the docs, and there is an example of how to parse a single URL, but there is no way to parse a single URL with the help of Puppeteer/Playwright.
It would be very nice if there were a way to parse a single URL, for example like this:
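The snippet the author intended here is missing from the page. As a rough illustration of the kind of API being requested, here is a hypothetical sketch; the `parseSingleUrl` helper below does not exist in Crawlee and is purely illustrative:

```ts
// Hypothetical API sketch -- parseSingleUrl is NOT a real Crawlee export.
import { parseSingleUrl } from 'crawlee';

// Fetch one URL through Playwright and return whatever the handler returns,
// without touching any on-disk storage.
const data = await parseSingleUrl('https://example.com', {
    async handler({ page }) {
        return { title: await page.title() };
    },
});
console.log(data.title);
```

In practice, the answer above achieves the same effect today: disable `persistStorage`, run a `PlaywrightCrawler` over a single-element request list, and collect the result into an outer variable.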