Crawl any page, but parse only pages that are whitelisted and not blacklisted

On many websites, you are only interested in one kind of page, such as a product details page. But what if those pages contain no links to other product details pages? A crawler that only visits whitelisted pages would get stuck.
I think it could work if the crawler crawled pages freely (respecting the robots.txt file, of course) but parsed only the qualifying pages to extract items.
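
To make the idea concrete, here is a minimal sketch in plain Python, not tied to any particular crawler library. The names `PARSE_WHITELIST`, `PARSE_BLACKLIST`, `should_parse`, `crawl`, and `get_links` are all hypothetical, and the URL patterns are made up for illustration. Fetching and robots.txt handling are assumed to live inside the caller-supplied `get_links` function.

```python
import re
from collections import deque

# Hypothetical patterns; a real site would need its own.
PARSE_WHITELIST = [re.compile(r"/product/\d+")]
PARSE_BLACKLIST = [re.compile(r"/product/\d+/reviews")]


def should_parse(url):
    """Return True if url matches a whitelist pattern and no blacklist pattern."""
    if any(p.search(url) for p in PARSE_BLACKLIST):
        return False
    return any(p.search(url) for p in PARSE_WHITELIST)


def crawl(start_url, get_links, max_pages=100):
    """Follow every link breadth-first, but collect only the URLs worth parsing.

    get_links is a caller-supplied function (an assumption of this sketch) that
    fetches a page and returns the URLs it links to. Respecting robots.txt
    would be its responsibility.
    """
    seen = {start_url}
    queue = deque([start_url])
    to_parse = []
    visited = 0
    while queue and visited < max_pages:
        url = queue.popleft()
        visited += 1
        # The parse decision is separate from the follow decision:
        # every page is followed, only qualifying pages are parsed.
        if should_parse(url):
            to_parse.append(url)
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return to_parse
```

The point of the sketch is that the whitelist and blacklist only gate item extraction, so category or listing pages still get visited and can lead the crawler to product pages that do not link to each other.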