I have tried numerous methods to scrape data from LinkedIn, but every time, my accounts ended up getting blocked. After spending a significant amount of time experimenting with different approaches, I came up with the idea of developing this Chrome extension.
With this extension, I have successfully scraped data from over 500,000 public accounts, gathering millions of posts, comments, likes, and other engagement data. I ran this script continuously for five days straight, and it did not affect my account in any way.
However, please note that this tool is strictly designed for scraping publicly available LinkedIn data. It is not intended for any illegal activities, privacy violations, or unauthorized access. Always comply with LinkedIn’s policies and legal regulations when using this tool.
This Chrome Extension automates collecting posts from multiple LinkedIn company pages. It:
- Reads a list of LinkedIn URLs from a local file (`chunk.txt`).
- Navigates a browser tab to each page, finds the Posts section, then scrolls to load all posts.
- Extracts post content (text, date, likes, comments) and adds the company URL to each row.
- Saves the results in a CSV file for each page.
- If a page has no posts or is unclaimed, it creates a single-row CSV documenting "no post".
- Never stops when a page fails; it automatically skips errors and continues.
- Utilizes a heartbeat system to differentiate between “still loading” and “stuck”. If no heartbeat is received within 30 seconds, it skips that page.
- Reads `chunk.txt`: a JSON array of LinkedIn company URLs, e.g. `["https://www.linkedin.com/company/red-ventures", "https://www.linkedin.com/company/example-company"]`.
- Infinite Scrolling: Automatically scrolls until no more posts are loaded.
- Heartbeat Mechanism: The content script sends "heartbeat" messages every scroll iteration. The extension will only force a skip if 30 seconds pass with no heartbeat.
- Skipping: If a page errors out or is unclaimed or has no posts, the extension logs that scenario (by generating a short CSV) and moves on.
- Updates a `chunk-updated.txt` after each page, listing the remaining URLs to be scraped.
- Never blocks or stops entirely on errors.
```
.
├─ manifest.json
├─ popup.html
├─ popup.js
├─ contentScript.js
└─ chunk.txt
```
- `manifest.json` – Chrome Extension manifest (Manifest V3).
- `popup.html` – The simple UI shown when you click the extension icon.
- `popup.js` – The main logic orchestrating URL navigation, scraping calls, timeouts, and CSV creation.
- `contentScript.js` – Injected on LinkedIn pages; scrolls, sends heartbeats, and scrapes post data.
- `chunk.txt` – A JSON array of LinkedIn URLs, read at runtime.
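For orientation, a Manifest V3 file for an extension of this shape might look roughly like the sketch below. This is a hypothetical example, not the repository's exact `manifest.json`; the specific permission names are assumptions based on the features described (storage, downloads, tab navigation, and a content script on LinkedIn pages):

```json
{
  "manifest_version": 3,
  "name": "LinkedIn Posts Scraper",
  "version": "1.0",
  "permissions": ["storage", "downloads", "tabs"],
  "host_permissions": ["https://www.linkedin.com/*"],
  "action": { "default_popup": "popup.html" },
  "content_scripts": [
    {
      "matches": ["https://www.linkedin.com/*"],
      "js": ["contentScript.js"]
    }
  ]
}
```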
- Download or clone this repository.
- Open Chrome and go to `chrome://extensions`.
- Enable Developer mode (toggle in top-right).
- Click Load unpacked and choose the folder containing this extension.
Chrome will load your extension, showing its name and version.
- Ensure you’re logged into LinkedIn in the same Chrome profile.
- Click the extension icon in the browser’s toolbar.
- In the popup, you’ll see a “Scrape All Pages” button.
- `chunk.txt` is fetched on first run to get the list of LinkedIn pages.
- The extension navigates your active tab to each page, waiting for a full load.
- Infinite scroll runs on each page, and CSV files are downloaded as they’re ready.
- After each page, an updated `chunk-updated.txt` is automatically downloaded to show the remaining URLs.
- If a page is stuck for 30 seconds without sending a "heartbeat," the extension skips that page.
- If a page has no posts, you’ll still get a 1-row CSV documenting "no post."
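The per-page flow above (navigate, scrape, skip on timeout, always continue) can be sketched as a generic orchestration loop. The names `processAll` and `scrapePage` are hypothetical stand-ins for the real tab navigation and scraping calls in `popup.js`:

```javascript
// Process every URL in order. A page whose scrape rejects or exceeds
// timeoutMs is recorded as skipped, and the loop always moves on.
async function processAll(urls, scrapePage, timeoutMs) {
  const results = [];
  for (const url of urls) {
    let timer;
    const timeout = new Promise((_, reject) => {
      timer = setTimeout(() => reject(new Error("timeout")), timeoutMs);
    });
    try {
      const rows = await Promise.race([scrapePage(url), timeout]);
      results.push({ url, rows });
    } catch (err) {
      results.push({ url, skipped: true, reason: err.message });
    } finally {
      clearTimeout(timer); // avoid a stray rejection after the race settles
    }
  }
  return results;
}
```

The `try`/`catch` around each page is what guarantees the "never stops on errors" behavior: a failure produces a skipped entry instead of aborting the run.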
- `heartbeatTimeoutMs` in `popup.js` defaults to 30 seconds. Increase it if you have extremely slow loading or massive pages.
- CSS selectors in `contentScript.js`:
  - `.update-components-text` for post content.
  - `div.feed-shared-update-v2` for each post container.
  - If LinkedIn changes its internal DOM classes, you must update these selectors.
- CSV columns are: `[PageURL, PageName, Content, PostDate, Likes, Comments]`.
- `chunk.txt` is read once per session; if you want a fresh load, remove the existing `chunk_urls` key from `chrome.storage.local` or reload the extension.
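CSV fields containing commas, quotes, or newlines must be quoted, or company names like "Acme, Inc." will break the columns. A sketch of that row-building step, using hypothetical helper names but the column order listed above:

```javascript
// Quote a CSV field per RFC 4180: wrap in quotes and double any
// embedded quotes whenever the value contains , " or a newline.
function csvEscape(value) {
  const s = String(value ?? "");
  return /[",\n]/.test(s) ? '"' + s.replace(/"/g, '""') + '"' : s;
}

// Build a full CSV string with the extension's column order.
function buildCsv(rows) {
  const header = ["PageURL", "PageName", "Content", "PostDate", "Likes", "Comments"];
  return [header, ...rows].map((r) => r.map(csvEscape).join(",")).join("\n");
}
```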
- Reading `chunk.txt`:
  - The extension fetches your local `chunk.txt` (an array of LinkedIn URLs).
  - It stores them in `chrome.storage.local` under the key `"chunk_urls"`.
- Infinite Scroll:
  - On each page, `contentScript.js` attempts to click the `"/posts/"` link if not already on it.
  - It scrolls until no new posts load for 3 consecutive checks (spaced 3 seconds apart).
- Heartbeat:
  - After each scroll iteration, `contentScript.js` calls `chrome.runtime.sendMessage({ action: "heartbeat" })`.
  - `popup.js` resets a 30-second timer on each heartbeat. If the timer expires, no heartbeat arrived in time, so the page is skipped.
- CSV Generation:
  - For each page, we build a CSV with `[pageUrl, pageName, content, postDate, likes, comments]`.
  - If the page is “unclaimed” or “deleted,” we add a single record with `"not found"` and `"no post"`.
  - If an error or timeout occurs, we skip that page but still generate a "skipped" CSV.
- `chunk-updated.txt` is automatically downloaded each time a page is processed, showing leftover URLs.
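The "no new posts for 3 consecutive checks" stopping rule can be sketched generically. Here `countPosts`, `scrollToBottom`, and `onBeat` are stand-ins for the real DOM queries and heartbeat message in `contentScript.js`:

```javascript
const sleep = (ms) => new Promise((res) => setTimeout(res, ms));

// Keep scrolling until the post count stays unchanged for `checks`
// consecutive iterations; onBeat is invoked each pass (the heartbeat).
async function scrollUntilStable(countPosts, scrollToBottom, onBeat,
                                 { checks = 3, delayMs = 3000 } = {}) {
  let last = -1;
  let stable = 0;
  while (stable < checks) {
    scrollToBottom();
    onBeat();               // heartbeat: tell popup.js we're still alive
    await sleep(delayMs);
    const count = countPosts();
    stable = count === last ? stable + 1 : 0;
    last = count;
  }
  return last; // final number of loaded posts
}
```

Because the heartbeat fires on every pass through the loop, a page that is still loading keeps resetting the 30-second timer, while a truly stuck page goes silent and gets skipped.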
- LinkedIn TOS: This scraping may violate LinkedIn’s User Agreement. Use at your own risk.
- Selectors: If LinkedIn changes DOM classes, you must update `contentScript.js`.
- Scale: Very large lists might trigger rate limits or suspicious-activity detection from LinkedIn. Increase sleep intervals if you see errors or partial loads.
- No Headless: This runs in a real, visible Chrome tab—not headless. You can keep working in another window, though.
- No Data / Empty CSV: Check if the user is logged in or the LinkedIn page truly has no posts. Inspect DevTools for errors.
- Manifest or CSP Errors: Ensure you have no inline scripts in Manifest V3.
- Skipping Too Soon: Raise the 30s heartbeat timeout if you have slow pages.
Pull requests are welcome. For major changes, please open an issue first to discuss. Keep in mind the code is subject to break if LinkedIn updates its layout.
This extension’s code is provided as-is; no specific license is attached, so treat it as MIT-style. Use responsibly!
If you face any issues while using this tool or have any questions, feel free to reach out to me via email:
📧 ahsanullahqureshi2021@gmail.com
I will try my best to assist you as soon as possible. 🚀
Enjoy automated LinkedIn post scraping across multiple company pages with robust skipping, CSV output, and a heartbeat-based approach to avoid infinite hangs. Let us know if you have any questions!