Commit 6c8b68f

Merge pull request #7088 from ethereum/markdownCheckr
Add a markdown checker
2 parents 139f833 + c9c8d43 commit 6c8b68f

11 files changed: +300 -84 lines changed
package.json

Lines changed: 2 additions & 0 deletions
@@ -85,6 +85,7 @@
     "@types/styled-system": "^5.1.15",
     "babel-preset-gatsby": "^2.14.0",
     "github-slugger": "^1.3.0",
+    "gray-matter": "^4.0.3",
     "husky": "^4.2.5",
     "identity-obj-proxy": "^3.0.0",
     "minimist": "^1.2.6",
@@ -102,6 +103,7 @@
     "crowdin-clean": "rm -rf .crowdin && mkdir .crowdin",
     "crowdin-import": "ts-node src/scripts/crowdin-import.ts",
     "format": "prettier --write \"**/*.{js,jsx,json,md}\"",
+    "markdown-checker": "node src/scripts/markdown-checker.js",
     "generate-heading-ids": "ts-node --esm src/scripts/generateHeadingIds.mts",
     "start": "gatsby develop",
     "start:lambda": "netlify-lambda serve src/lambda",

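The new `markdown-checker` script entry runs `node src/scripts/markdown-checker.js`, and the script reads an optional `--lang` argument through `minimist`. The sketch below shows roughly what that lookup amounts to. `readLangArg` is a hypothetical stand-in written for illustration, not part of the commit and not minimist's implementation; it only understands the `--lang=xx` and `--lang xx` forms.

```javascript
// Hypothetical stand-in for `require("minimist")(process.argv.slice(2)).lang`.
// Assumption: only the two common spellings of the flag are handled.
function readLangArg(args) {
  for (let i = 0; i < args.length; i++) {
    if (args[i].startsWith("--lang=")) {
      return args[i].slice("--lang=".length)
    }
    if (args[i] === "--lang" && i + 1 < args.length) {
      return args[i + 1]
    }
  }
  // Mirrors `argv.lang || null` in the script: no flag means "check every language"
  return null
}

console.log(readLangArg(["--lang=fa"]))
console.log(readLangArg(["--lang", "pt-br"]))
console.log(readLangArg([]))
```

Under these assumptions, running `node src/scripts/markdown-checker.js --lang=fa` would restrict the check to the `fa` translation files.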
src/content/community/get-involved/index.md

Lines changed: 0 additions & 1 deletion
@@ -100,7 +100,6 @@ The Ethereum ecosystem is on a mission to fund public goods and impactful projec
 - [Web3 Army](https://web3army.xyz/)
 - [Crypto Valley Jobs](https://cryptovalley.jobs/)
 
-
 ## Join a DAO {#decentralized-autonomous-organizations-daos}
 
 "DAOs" are decentralized autonomous organizations. These groups leverage Ethereum technology to facilitate organization and collaboration. For instance, for controlling membership, voting on proposals, or managing pooled assets. While DAOs are still experimental, they offer opportunities for you to find groups that you identify with, find collaborators, and grow your impact on the Ethereum community. [More on DAOs](/dao/)

src/content/community/research/index.md

Lines changed: 1 addition & 1 deletion
@@ -392,7 +392,7 @@ Decentralizing the entire Ethereum tech stack is an important research area. Cur
 
 #### Background reading {#background-reading-20}
 
-- [Ethereum stack](/developers/docs/ethereum-stack/)
+- [Ethereum stack](/developers/docs/ethereum-stack/)
 - [Coinbase: Intro to Web3 Stack](https://blog.coinbase.com/a-simple-guide-to-the-web3-stack-785240e557f0)
 - [Introduction to smart contracts](/developers/docs/smart-contracts/)
 - [Introduction to decentralized storage](/developers/docs/storage/)

src/content/desci/index.md

Lines changed: 1 addition & 1 deletion
@@ -22,8 +22,8 @@ DeSci aims to create an ecosystem where scientists are incentivized to openly sh
 Decentralized science allows for more diverse funding sources (from [DAOs](/dao/), [quadratic donations](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2003531) to crowdfunding and more), more accessible access data and methods, and by providing incentives for reproducibility.
 
 ### Juan Benet - DeSci, Independent Labs, & Large Scale Data Science
-<iframe width="560" height="315" src="https://www.youtube.com/embed/zkXM9H90g_E" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
 
+<iframe width="560" height="315" src="https://www.youtube.com/embed/zkXM9H90g_E" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
 
 ## How DeSci improves science {#desci-improves-science}
 
src/content/developers/docs/nodes-and-clients/nodes-as-a-service/index.md

Lines changed: 2 additions & 2 deletions
@@ -167,11 +167,11 @@ Here is a list of some of the most popular Ethereum node providers, feel free to
   - [Docs](https://documenter.getpostman.com/view/13630829/TVmFkLwy)
   - Features
     - Access to 50+ blockchain nodes
-    - Free API Key
+    - Free API Key
     - Block Explorers
     - API Response Time ⩽ 1 sec
     - 24/7 Support Team
-    - Personal Account Manager
+    - Personal Account Manager
     - Shared, archive, backup and dedicated nodes
 - [**Pocket Network**](https://www.pokt.network/)
   - [Docs](https://docs.pokt.network/home/)

src/content/developers/docs/programming-languages/python/index.md

Lines changed: 1 addition & 1 deletion
@@ -82,7 +82,7 @@ The following Ethereum-based projects use tools mentioned on this page. The rela
 ## Python Community discussion {#python-community-contributors}
 
 - [Ethereum Python Community Discord](https://discord.gg/9zk7snTfWe) for Web3.py and other Python framework discussion
-- [Vyper Discord]([https://discord.gg/9zk7snTfWe](https://discord.gg/SdvKC79cJk)) for Vyper smart contract programming disucssion
+- [Vyper Discord](<[https://discord.gg/9zk7snTfWe](https://discord.gg/SdvKC79cJk)>) for Vyper smart contract programming disucssion
 
 ## Other aggregated lists {#other-aggregated-lists}
 

src/content/developers/docs/smart-contracts/formal-verification/index.md

Lines changed: 71 additions & 74 deletions
Large diffs are not rendered by default.

src/content/developers/docs/smart-contracts/languages/index.md

Lines changed: 1 addition & 2 deletions
@@ -113,11 +113,10 @@ For more information, [read the Vyper rationale](https://vyper.readthedocs.io/en
 - [Smart contract development frameworks and tools for Vyper](/developers/docs/programming-languages/python/)
 - [VyperPunk - learn to secure and hack Vyper smart contracts](https://github.com/SupremacyTeam/VyperPunk)
 - [VyperExamples - Vyper vulnerability examples](https://www.vyperexamples.com/reentrancy)
-- [Vyper Hub for development](https://github.com/zcor/vyper-dev)
+- [Vyper Hub for development](https://github.com/zcor/vyper-dev)
 - [Vyper greatest hits smart contract examples](https://github.com/pynchmeister/vyper-greatest-hits/tree/main/contracts)
 - [Awesome Vyper curated resources](https://github.com/spadebuilders/awesome-vyper)
 
-
 ### Example {#example}
 
 ```python

src/content/translations/fa/developers/docs/intro-to-ether/index.md

Lines changed: 1 addition & 1 deletion
@@ -42,7 +42,7 @@ sidebar: true
 سوختن اتر در تمام تراکنش‌ها روی اتریوم رخ می‌دهد. وقتی هزینه تراکنش کاربران پرداخت می شود، یک هزینه پایه با توجه به تقاضای شبکه ثبت شده و از چرخه خارج می شود که به همراه حداکثر کارمزد گاز و اندازه متغیر [بلوک] (https://etherscan.io/block/12965263)، کارمزد نهایی تراکنش را مشخص می کند. وقتی تقاضای شبکه زیاد باشد، میزان اتر سوزانده شده از آنچه که استخراج می شود بیشتر شده و از تولید مقدار زیاد آن جلوگیری می گند.
 
 سوزاندن کارمزد پایه از راه‌های مختلفی که ماینرها می‌توانند از آن برای دستکاری شبکه استفاده کنند، جلوگیری می‌کند. برای مثال اگر ماینرها کارمزد پایه را دریافت می کردند، می توانستند تراکنش های خود را به صورت رایگان درج کنند و کارمزد پایه را برای بقیه افزایش دهند. از طرف دیگر، آنها می توانند کارمزد پایه را به برخی از کاربران خارج از زنجیره بازپرداخت کنند، که منجر به بازار کارمزد تراکنش مبهم و پیچیده تر می شود.
-
+
 ## واحدهای خرد اتر {#denominations}
 
 از آنجایی که بسیاری از تراکنش‌ها در اتریوم کوچک هستند، اتر دارای چندین واحد شمارش است که ممکن است برای مقادیر کمتر به آن‌ها اشاره شود. از میان این واحدهای شمارش، Wei و gwei از اهمیت ویژه‌ای برخوردارند.
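
The file above lives under `src/content/translations/fa/`, and the checker added in this commit derives each file's language from that path segment by slicing a fixed number of characters after `/translations/`. As a point of comparison only, here is a minimal sketch that reads the whole segment instead, so codes of any length (`fa`, `pt-br`, `zh-tw`) come out the same way. `langFromPath` is a hypothetical helper, not code from the commit.

```javascript
// Hypothetical alternative to the fixed-width slice used by the checker.
// Assumption: paths use "/" separators, as they do in the diff above.
function langFromPath(filePath) {
  const marker = "/translations/"
  const start = filePath.indexOf(marker)

  // Anything outside the translations folder is treated as English content
  if (start === -1) return "en"

  // Take everything up to the next "/" as the language code
  return filePath.slice(start + marker.length).split("/")[0]
}

console.log(
  langFromPath(
    "src/content/translations/fa/developers/docs/intro-to-ether/index.md"
  )
)
console.log(langFromPath("src/content/translations/pt-br/index.md"))
console.log(langFromPath("src/content/desci/index.md"))
```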

src/scripts/markdown-checker.js

Lines changed: 219 additions & 0 deletions
@@ -0,0 +1,219 @@
const fs = require("fs")
const path = require("path")
const matter = require("gray-matter")
const argv = require("minimist")(process.argv.slice(2))

const LANG_ARG = argv.lang || null
const PATH_TO_INTL_MARKDOWN = "./src/content/translations/"
const PATH_TO_ALL_CONTENT = "./src/content/"
const TUTORIAL_DATE_REGEX = new RegExp("\\d{4}-\\d{2}-\\d{2}")
const WHITE_SPACE_IN_LINK_TEXT = new RegExp(
  "\\[\\s.+\\]\\( | \\[.+\\s\\]\\(",
  "g"
)
const BROKEN_LINK_REGEX = new RegExp(
  "\\[[^\\]]+\\]\\([^\\)\\s]+\\s[^\\)]+\\)",
  "g"
)
const HTML_TAGS = ["</code", "</p>"]
const SPELLING_MISTAKES = [
  "Ethreum",
  "Etherum",
  "Etherium",
  "Etheruem",
  "Etereum",
  "Eterium",
  "Etherem",
  "Etheerum",
  "Ehtereum",
  "Eferum",
]
const CASE_SENSITVE_SPELLING_MISTAKES = ["Thereum", "Metamask", "Github"]
// Ideas:
// Regex for explicit lang path (e.g. /en/) && for glossary links (trailing slash breaks links e.g. /glossary/#pos/ doesn't work)
// We should have case sensitive spelling mistakes && check they are not in links.

const langsArray = fs.readdirSync(PATH_TO_INTL_MARKDOWN)
langsArray.push("en")

function getAllMarkdownPaths(dirPath, arrayOfMarkdownPaths = []) {
  const files = fs.readdirSync(dirPath)

  arrayOfMarkdownPaths = arrayOfMarkdownPaths || []

  for (const file of files) {
    if (fs.statSync(dirPath + "/" + file).isDirectory()) {
      arrayOfMarkdownPaths = getAllMarkdownPaths(
        dirPath + "/" + file,
        arrayOfMarkdownPaths
      )
    } else {
      const filePath = path.join(dirPath, "/", file)

      // Check the extension itself; includes(".md") would also accept
      // names that merely contain ".md" somewhere in the path
      if (filePath.endsWith(".md")) {
        arrayOfMarkdownPaths.push(filePath)
      }
    }
  }

  return arrayOfMarkdownPaths
}

function sortMarkdownPathsIntoLanguages(files) {
  const languages = langsArray.reduce((accumulator, value) => {
    return { ...accumulator, [value]: [] }
  }, {})

  for (const file of files) {
    const isTranslation = file.includes("/translations/")
    const langIndex = file.indexOf("/translations/") + 14
    const isFourCharLang = file.includes("pt-br") || file.includes("zh-tw")
    const charactersToSlice = isFourCharLang ? 5 : 2

    const lang = isTranslation
      ? file.slice(langIndex, langIndex + charactersToSlice)
      : "en"

    if (LANG_ARG) {
      if (LANG_ARG === lang) {
        languages[lang].push(file)
      }
    } else {
      languages[lang].push(file)
    }
  }

  return languages
}

function processFrontmatter(path, lang) {
  const file = fs.readFileSync(path, "utf-8")
  const frontmatter = matter(file).data

  if (!frontmatter.title) {
    console.warn(`Missing 'title' frontmatter at ${path}:`)
  }
  // Description commented out as there are a lot of them missing :-)!
  // if (!frontmatter.description) {
  //   console.warn(`Missing 'description' frontmatter at ${path}:`)
  // }
  if (!frontmatter.lang) {
    console.error(`Missing 'lang' frontmatter at ${path}: Expected: '${lang}'`)
  } else if (frontmatter.lang !== lang) {
    console.error(
      `Invalid 'lang' frontmatter at ${path}: Expected: '${lang}'. Received: '${frontmatter.lang}'.`
    )
  }

  if (path.includes("/tutorials/")) {
    if (!frontmatter.published) {
      console.warn(`Missing 'published' frontmatter at ${path}:`)
    } else {
      try {
        // toISOString() throws if `published` was not parsed as a Date,
        // which lands in the catch below
        const stringDate = frontmatter.published.toISOString().slice(0, 10)
        const dateIsFormattedCorrectly = TUTORIAL_DATE_REGEX.test(stringDate)

        if (!dateIsFormattedCorrectly) {
          console.warn(
            `Invalid 'published' frontmatter at ${path}: Expected: 'YYYY-MM-DD' Received: ${frontmatter.published}`
          )
        }
      } catch (e) {
        console.warn(
          `Invalid 'published' frontmatter at ${path}: Expected: 'YYYY-MM-DD' Received: ${frontmatter.published}`
        )
      }
    }
  }
}

function processMarkdown(path) {
  const markdownFile = fs.readFileSync(path, "utf-8")
  let brokenLinkMatch

  while ((brokenLinkMatch = BROKEN_LINK_REGEX.exec(markdownFile))) {
    const lineNumber = getLineNumber(markdownFile, brokenLinkMatch.index)
    console.warn(`Broken link found: ${path}:${lineNumber}`)

    // if (!BROKEN_LINK_REGEX.global) break
  }

  // TODO: refactor history pages to use a component for network upgrade summaries
  // TODO: create .env commit warning component for tutorials
  // Ignore tutorials with Javascript and ExpandableCards
  /* Commented this out due to console noise (but they are things we should fix!)
  if (!(path.includes("/history/")) && !(markdownFile.includes("```javascript")) && !(markdownFile.includes("ExpandableCard"))) {
    for (const tag of HTML_TAGS) {

      const htmlTagRegex = new RegExp(tag, "g")
      let htmlTagMatch

      while ((htmlTagMatch = htmlTagRegex.exec(markdownFile))) {
        const lineNumber = getLineNumber(markdownFile, htmlTagMatch.index)
        console.warn(`Warning: ${tag} tag in markdown at ${path}:${lineNumber}`)

        if (!htmlTagRegex.global) break
      }
    }
  }
  */

  // Commented out as 296 instances of whitespace in link texts
  // let whiteSpaceInLinkTextMatch

  // while ((whiteSpaceInLinkTextMatch = WHITE_SPACE_IN_LINK_TEXT.exec(markdownFile))) {
  //   const lineNumber = getLineNumber(markdownFile, whiteSpaceInLinkTextMatch.index)
  //   console.warn(`White space in link found: ${path}:${lineNumber}`)
  // }

  checkMarkdownSpellingMistakes(path, markdownFile, SPELLING_MISTAKES)
  // Turned this off for testing as there are lots of Github (instead of GitHub) and Metamask (instead of MetaMask).
  // checkMarkdownSpellingMistakes(path, markdownFile, CASE_SENSITVE_SPELLING_MISTAKES, true)
}

function checkMarkdownSpellingMistakes(
  path,
  file,
  spellingMistakes,
  caseSensitive = false
) {
  for (const mistake of spellingMistakes) {
    // Both variants carry the "g" flag, so exec() advances through the file
    // and the loop below terminates without an extra guard
    const mistakeRegex = caseSensitive
      ? new RegExp(mistake, "g")
      : new RegExp(mistake, "gi")
    let spellingMistakeMatch

    while ((spellingMistakeMatch = mistakeRegex.exec(file))) {
      const lineNumber = getLineNumber(file, spellingMistakeMatch.index)
      console.warn(
        `Spelling mistake "${mistake}" found at ${path}:${lineNumber}`
      )
    }
  }
}

function getLineNumber(file, index) {
  const fileSubstring = file.substring(0, index)
  const lines = fileSubstring.split("\n")
  const linePosition = lines.length
  const charPosition = lines[lines.length - 1].length + 1
  const lineNumber = `${linePosition}:${charPosition}`

  return lineNumber
}

function checkMarkdown() {
  const markdownPaths = getAllMarkdownPaths(PATH_TO_ALL_CONTENT)
  const markdownPathsByLang = sortMarkdownPathsIntoLanguages(markdownPaths)

  for (const lang in markdownPathsByLang) {
    for (const path of markdownPathsByLang[lang]) {
      processFrontmatter(path, lang)
      processMarkdown(path)
    }
  }
}

checkMarkdown()
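
To see what the broken-link check in the script reports, the sketch below reuses `BROKEN_LINK_REGEX` and the `getLineNumber` logic from the script on a small in-memory sample. The `sample` text and the `findBrokenLinks` wrapper are hypothetical and exist only for this illustration; the script itself reads files from disk and logs with `console.warn`.

```javascript
// Copied from the script: a Markdown link whose parenthesised target
// contains whitespace.
const BROKEN_LINK_REGEX = new RegExp(
  "\\[[^\\]]+\\]\\([^\\)\\s]+\\s[^\\)]+\\)",
  "g"
)

// Same logic as getLineNumber in the script, returning "line:column" (1-based).
function getLineNumber(file, index) {
  const lines = file.substring(0, index).split("\n")
  return `${lines.length}:${lines[lines.length - 1].length + 1}`
}

// Hypothetical wrapper: collects positions instead of logging them.
function findBrokenLinks(markdown) {
  const positions = []
  let match

  // The regex is global and shared, so reset its cursor before each scan
  BROKEN_LINK_REGEX.lastIndex = 0
  while ((match = BROKEN_LINK_REGEX.exec(markdown))) {
    positions.push(getLineNumber(markdown, match.index))
  }
  return positions
}

// Hypothetical sample content, not taken from the repository.
const sample = [
  "# Title",
  "A [good link](/dao/) on line two.",
  "A [bad link](/developers/ docs/) on line three.",
].join("\n")

console.log(findBrokenLinks(sample)) // [ '3:3' ]
```

One consequence of the pattern worth keeping in mind: a link with a Markdown title, such as `[a](https://example.com "title")`, also contains whitespace inside the parentheses and would be reported by the same check.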
