This repository was archived by the owner on Dec 1, 2024. It is now read-only.
crawl / scrape GitHub topics #53
shinenelson
started this conversation in
General
Replies: 1 comment
This could be converted to a technical issue if it is relevant.
There are 33 public repositories with the terms-of-service topic and 84 public repositories with the privacy-policy topic. These include repositories from companies like GitHub, Basecamp, and Unity-Technologies, among many others.
Since these are version-controlled repositories with plain-text files (mostly Markdown), it may be a lot easier to track changes and source the policy documents verbatim than to scrape them off a website and do processing on top of all that.
I understand this might require some extra technical effort. I am proposing this because I was surprised that Basecamp's Terms of Service had not been annotated yet, even though they keep their policies in a public repository in Markdown format.
What would it take for this to get done?
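As a rough starting point, discovering the candidate repositories could look something like the sketch below. It assumes the GitHub REST search endpoint (`/search/repositories` with a `topic:` qualifier) and uses only the Python standard library. The function names are hypothetical and not part of this project, and unauthenticated search requests are rate limited, so a real crawler would need a token and pagination.

```python
import json
import urllib.parse
import urllib.request

# GitHub REST API endpoint for repository search.
SEARCH_URL = "https://api.github.com/search/repositories"


def build_topic_query(topic, per_page=100):
    """Build the search URL for all public repositories tagged with `topic`."""
    params = urllib.parse.urlencode({"q": f"topic:{topic}", "per_page": per_page})
    return f"{SEARCH_URL}?{params}"


def parse_repositories(payload):
    """Keep only the fields a crawler would need from a search response."""
    return [
        {
            "full_name": item["full_name"],
            "url": item["html_url"],
            "default_branch": item.get("default_branch"),
        }
        for item in payload.get("items", [])
    ]


def fetch_topic_repositories(topic):
    """Fetch the first page of repositories for `topic` (needs network access)."""
    request = urllib.request.Request(
        build_topic_query(topic),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_repositories(json.load(response))
```

Calling `fetch_topic_repositories("terms-of-service")` would return the first page of matches. Each repository could then be cloned and its Git history used to track changes to the policy files, instead of diffing scraped HTML.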