Commit a6e4a2a

Fixing of markdown processing
- processing our special markdown notation [[ ]] still had numerous issues, like parsed inside code blocks
- some markdown links still not skipped
- processing order of [[ ]] [[ | ]] was bogus
- added handling of a special form of our notation [[title|-]] that can protect the given title from further autolink/tooltip processing
- prepared possible processing of our markdown notation in H2-H6 headings

Signed-off-by: Hofi <hofione@gmail.com>
1 parent b45d9a6 commit a6e4a2a

1 file changed: +76 -25 lines changed

_plugins/generate_tooltips.rb

Lines changed: 76 additions & 25 deletions
@@ -47,39 +47,61 @@ def prefixed_url(url, base_url)
   return url
 end
 
+def is_modifiable_markdown_part?(part)
+  # NOTE: This allows usage of our custom markdown notation in the other (h2-h6) headings as well.
+  # Unlike in the case of the page titles (h1), the (lunr) search will work nicely for the text parts.
+  return part.start_with?('[[') || part.start_with?('#')
+end
+
 def make_tooltip(page, page_links, id, url, match)
-  match_parts = match.split(/\|/)
+  match_parts = match.split(/(?<!\\)\|/)
+
   # If the text has an '|' it means it comes from our special autolink/tooltip [[text|id]] markdown block
   # We have to reparse it a bit and get the id we must use
   if match_parts.length > 1
     #puts "match_parts: #{match_parts}"
-    match = match_parts[0]
+    title = match_parts[0]
+    if title.length <= 0
+      puts "Error: Empty title in matching part: '#{match}' -> #{match_parts}"
+      # nil means, show the original markdown part, instead of a half rendered one
+      return nil
+    end
     id = match_parts[1]
+    # This is a special use case [[title|-]] that protects the given title from further processing
+    if id == '-'
+      # Just use the original title text
+      return title
+    end
     link_data = page_links[id]
     if link_data != nil
       url = link_data["url"]
       url = prefixed_url(url, page.site.config["baseurl"])
     else
-      puts "Error: Unknown ID in matching part: #{match_parts}"
-      return match
+      puts "Error: Unknown ID in matching part: '#{match}' -> #{match_parts}"
+      # nil means, show the original markdown part, instead of a half rendered one
+      return nil
     end
+  else
+    title = match
   end
 
   if id == nil or id.length <= 0
-    puts "Error: Empty ID in matching part: #{match}"
-    return match
+    puts "Error: Empty ID in matching part: '#{match}' -> #{match_parts}"
+    # nil means, show the original markdown part, instead of a half rendered one
+    return nil
   end
   if url == nil or url.length <= 0
-    puts "Error: Empty URL for ID: #{id} in matching part: #{match}"
-    return match
+    puts "Error: Empty URL for ID: '#{id}' in matching part: '#{match}' -> #{match_parts}"
+    # nil means, show the original markdown part, instead of a half rendered one
+    return nil
   end
 
   # NOTE: Now we treat every link that has protocol prefix part as an external one
   # that allows usage of direct links anywhere if needed (not recommended, plz use external_links.yml instead)
   # but, at the same time requires e.g. all the really external links to be fully qualified (even in external_links.yml as well)
   external_url = is_prefixed_url?(url)
-  match = save_from_markdownify(match)
-  replacement_text = '<a href="' + url + '" class="nav-link content-tooltip"' + (external_url ? ' target="_blank"' : '') + '>' + match + '</a>'
+  title = save_from_markdownify(title)
+  replacement_text = '<a href="' + url + '" class="nav-link content-tooltip"' + (external_url ? ' target="_blank"' : '') + '>' + title + '</a>'
   # puts "replacement_text: " + replacement_text
 
   return replacement_text
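The title/ID handling in this hunk can be illustrated in isolation. The following is a minimal sketch, not the plugin's code: `split_notation` and its symbolic return values are hypothetical names introduced here, and only the unescaped-pipe split and the `-` marker are taken from the diff.

```ruby
# Splits on '|' only when it is not preceded by a backslash,
# the same lookbehind the commit introduces in make_tooltip.
UNESCAPED_PIPE = /(?<!\\)\|/

# Hypothetical helper: classifies the inner text of a [[...]] block.
def split_notation(inner)
  title, id = inner.split(UNESCAPED_PIPE)
  # An empty title is treated as an error case, as in the diff.
  return [:invalid, nil, nil] if title.nil? || title.empty?
  # No pipe at all: a plain [[title]] block.
  return [:plain, title, nil] if id.nil?
  # [[title|-]] protects the title from further processing.
  return [:protected, title, nil] if id == '-'

  [:link, title, id]
end
```

For example, `split_notation('a\|b|id')` keeps the escaped pipe inside the title and returns `[:link, 'a\|b', 'id']`, whereas splitting on a bare `/\|/` would cut the title in two.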
@@ -101,11 +123,15 @@ def process_markdown_part(page, markdown_part, page_links, full_pattern, id, url
     left_separator = $1
     matched_text = $2
     right_separator = $3
-    #puts "\nmatch: #{match}\nleft_separator: #{left_separator}\nmatched_text: #{matched_text}\nright_separator: #{right_separator}"
+    # puts "\nmatch: #{match}\nleft_separator: #{left_separator}\nmatched_text: #{matched_text}\nright_separator: #{right_separator}"
 
     replacement_text = make_tooltip(page, page_links, id, url, matched_text)
-    if add_separator
-      replacement_text = left_separator + replacement_text + right_separator
+    if replacement_text != nil
+      if add_separator
+        replacement_text = left_separator + replacement_text + right_separator
+      end
+    else
+      replacement_text = markdown_part.gsub(/(?<!\\)\|/, "\\\|")
     end
     replacement_text
   end
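When `make_tooltip` returns `nil`, the new fallback keeps the original markdown part and escapes its bare pipes. A small sketch of that escaping step follows; `escape_pipes` is a hypothetical name, and the block form of `gsub` is used here only to sidestep backslash handling in replacement strings. The diff does not state why the pipes are escaped, so no reason is assumed here.

```ruby
# Hypothetical helper mirroring the fallback branch: every '|' that is not
# already preceded by a backslash gets one, everything else is untouched.
def escape_pipes(markdown_part)
  # The block form returns the replacement literally, so '\|' is
  # inserted as a backslash followed by a pipe.
  markdown_part.gsub(/(?<!\\)\|/) { '\|' }
end
```

Because already escaped pipes are skipped by the lookbehind, running the helper twice yields the same text as running it once.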
@@ -120,10 +146,26 @@ def process_markdown_parts(page, markdown)
   # Regular expression pattern to match special Markdown blocks
   # Unlike the others this needs grouping as we use do |match| for enumeration
   # NOTE: Use multi line matching partially as e.g. code blocks can span to multiple lines
-  special_markdown_blocks_pattern = /((?m:````.*?````|```.*?```|``.*?``|`.*?`)|\[\[.*?\]\]|\[.*?\]\(.*?\)\{\:.*?\}|\[.*?\]\(.*?\)|\[.*?\]\{.*?\}|^#+\s.*?$)/
+  markdown_blocks_pattern = /((?m:````.*?````|```.*?```|``.*?``|`.*?`)|\[\[(?:[^\]^\[]|\\\[|\\\])*?\]\]|\[[^\]^\[]*?\]\(.*?\)\{\:.*?\}|\[[^\]^\[]*?\]\(.*?\)|\[[^\]^\[]*?\]:.*?$|\[[^\]^\[]*?\]\s*\[.*?\]|^#+\s.*?$)/
+  # TODO: Always sync the bellow with the one-liner version for readability
+  # FIXME: Check why the /x version bellow is not working the same way
+  # markdown_blocks_pattern = /( # Either Code blocks
+  # (?m: # Even Multiline ones
+  # ````.*?```` | # Code block with 4 backticks
+  # ```.*?``` | # Code block with 3 backticks
+  # ``.*?`` | # Code block with 2 backticks
+  # `.*?` # Inline code with 1 backtick
+  # ) | #
+  # \[\[(?:[^\]^\[]|\\\[|\\\])*?\]\] | # or Our special, custom markdown notation
+  # \[[^\]^\[]*?\]\(.*?\)\{\:.*?\} | # or Link with attribute
+  # \[[^\]^\[]*?\]\(.*?\) | # Link without attribute
+  # \[[^\]^\[]*?\]:.*?$ | # Link reference label declaration
+  # \[[^\]^\[]*?\]\s*\[.*?\] | # Link using reference label
+  # ^#+\s.*?$ # or Headers
+  # )/x
 
   # Split the content by special Markdown blocks
-  markdown_parts = markdown.split(special_markdown_blocks_pattern)
+  markdown_parts = markdown.split(markdown_blocks_pattern)
   #puts markdown_parts
   markdown_parts.each_with_index do |markdown_part, markdown_index|
     # puts "---------------\nmarkdown_index: " + markdown_index.to_s + "\n" + (markdown_index.even? ? "NOT " : "") + "markdown_part: " + markdown_part
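One plausible contributor to the FIXME above, offered as an assumption rather than a confirmed diagnosis: in Ruby's extended (`/x`) mode an unescaped `#` starts a comment that runs to the end of the line, so the heading alternative `^#+\s.*?$` would collapse to a bare `^`. The sketch below compiles only that alternative three ways; the constant names are made up for this illustration.

```ruby
# The heading alternative exactly as written in the one-liner pattern.
HEADING_SOURCE = '^#+\s.*?$'

# Normal mode: '#' is a literal character.
ONE_LINER = Regexp.new(HEADING_SOURCE)
# Extended mode: everything from '#' onward is a comment, leaving only '^'.
EXTENDED_UNESCAPED = Regexp.new(HEADING_SOURCE, Regexp::EXTENDED)
# Extended mode with the '#' escaped, which keeps it literal.
EXTENDED_ESCAPED = Regexp.new('^\#+\s.*?$', Regexp::EXTENDED)

# Returns the three verdicts for one input line, in the order above.
def heading_verdicts(line)
  [ONE_LINER, EXTENDED_UNESCAPED, EXTENDED_ESCAPED].map { |re| re.match?(line) }
end
```

With these definitions, the unescaped extended variant reports a match for any line, including ordinary text, while the escaped variant agrees with the one-liner. Whether this is the only difference in the plugin's `/x` version has not been checked.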
@@ -144,27 +186,36 @@ def process_markdown_parts(page, markdown)
       #puts "searching for #{title} with pattern #{pattern}"
 
       if markdown_index.even?
-        # Content outside of special Markdown blocks, aka. pure text (NOTE: Also excludes the reqursively self added <a ...>title</a> tooltips/links)
+        # Content outside of Markdown blocks, aka. pure text
 
         # Search for known link titles
         # NOTE: Using multi line matching here will not help either if the pattern itself is in the middle broken/spaned to multiple lines, so
         # using whitespace replacements now inside the patter to handle this, see above!
+        # NOTE: Also excludes the reqursively self added <a ...>title</a> tooltips/links
         full_pattern = /(^|[\s.,;:&'"(])(#{pattern})([\s.,;:&'")]|\z)(?![^<]*?<\/a>)/
         markdown_part = process_markdown_part(page, markdown_part, page_links, full_pattern, id, url, true)
       else
-        # Content inside of special Markdown blocks
+        # Content inside of Markdown blocks
 
-        # Handle own auto\tooltip links [[title]], but NOT [[title|id]], see bellow why
-        full_pattern = /(\[\[)(#{pattern})(\]\])/
-        markdown_part = process_markdown_part(page, markdown_part, page_links, full_pattern, id, url, false)
+        # Handle our special markdown notation autolink/tooltip links [[title]], but NOT [[title|id]], see bellow why
+        if is_modifiable_markdown_part?(markdown_part)
+          full_pattern = /(\[\[)(#{pattern})(\]\])/
+          markdown_part = process_markdown_part(page, markdown_part, page_links, full_pattern, id, url, false)
+        end
       end
     end
 
+    # Handle our special markdown notation autolink/tooltip links [[title|id]]
+    # This must be a separate run, as independent from the given title, if ID is presented it will always override the title, and the title exclusion as well
     if markdown_index.odd?
-      # Handle own auto\tooltip links [[title|id]]
-      # This must be a separate run, as independent from the given title, if ID is presented it will always override title, and title exclusion as well
-      full_pattern = /(\[\[)(.+\|.+)(\]\])/
-      markdown_part = process_markdown_part(page, markdown_part, page_links, full_pattern, nil, nil, false)
+      # Content inside of Markdown blocks
+
+      if is_modifiable_markdown_part?(markdown_part)
+        # puts "\nmarkdown_index: " + markdown_index.to_s + "\n" + (markdown_index.even? ? "NOT " : "") + "markdown_part: " + markdown_part
+        # NOTE: The differences in the patter is intentional, allowing empty part on both sides of | allows the same flow inside process_markdown_part
+        full_pattern = /(\[\[)(.*?(?<!\\)\|.*?)(\]\])/
+        markdown_part = process_markdown_part(page, markdown_part, page_links, full_pattern, nil, nil, false)
+      end
     end
 
     #puts "new markdown_part: " + markdown_part
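The even/odd logic in this hunk relies on how `String#split` behaves when the pattern contains a capture group: the captured separators are kept in the result, so plain text lands at even indexes and matched blocks at odd ones. The sketch below uses a deliberately simplified pattern (`SIMPLE_BLOCKS`, an assumption standing in for `markdown_blocks_pattern`) and a hypothetical `classify_parts` helper; the `start_with?` checks mirror `is_modifiable_markdown_part?` from the diff.

```ruby
# Simplified stand-in for markdown_blocks_pattern: inline code or [[...]] only.
SIMPLE_BLOCKS = /(`.*?`|\[\[.*?\]\])/

# Hypothetical helper: labels each part produced by the capturing split.
def classify_parts(markdown, pattern = SIMPLE_BLOCKS)
  markdown.split(pattern).each_with_index.map do |part, index|
    kind =
      if index.even?
        :text               # outside any block, eligible for title matching
      elsif part.start_with?('[[') || part.start_with?('#')
        :modifiable_block   # our notation or a heading
      else
        :protected_block    # code, regular links: left untouched
      end
    [kind, part]
  end
end
```

For the input ``See `code` and [[title|id]] here`` the parts alternate between text and block, and only the `[[title|id]]` part is labelled modifiable. A document that begins with a block still keeps the parity, because `split` emits a leading empty string in that case.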
@@ -293,7 +344,7 @@ def gen_page_link_data(links_dir, link_files_pattern)
     page_id = yaml_content['id']
     page_url = yaml_content['url']
     page_title = yaml_content['title']
-    chars_to_remove = %{"'} #!?.:;}
+    chars_to_remove = %{"'}
     page_title = page_title.gsub(/\A[#{Regexp.escape(chars_to_remove)}]+|[#{Regexp.escape(chars_to_remove)}]+\z/, '')
     #puts "page_title: " + page_title
     if page_title.length == 0
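The last hunk appears to be cosmetic: in the old line the `%{...}` literal already ends at the first `}`, and the remaining ` #!?.:;}` is an ordinary Ruby comment. A short sketch of that reading, with `trim_title` as a hypothetical wrapper around the `gsub` call shown in the diff:

```ruby
# The old and new assignments, side by side; both yield the two quote characters.
OLD_CHARS = %{"'} #!?.:;}
NEW_CHARS = %{"'}

# Hypothetical wrapper: strips the given characters from both ends of a title.
def trim_title(title, chars_to_remove = NEW_CHARS)
  escaped = Regexp.escape(chars_to_remove)
  title.gsub(/\A[#{escaped}]+|[#{escaped}]+\z/, '')
end
```

Under this reading the set of stripped characters did not change in the commit; only the misleading trailing comment was removed.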
