Commit f62aa9b

Docs
Also slipping in one intDecimal regex change
1 parent b5a017e commit f62aa9b

3 files changed

+27
-24
lines changed

README.md

Lines changed: 5 additions & 5 deletions
@@ -89,15 +89,15 @@ There are other `String` parsers in the module `Parsing.String.Basic`, for examp

 Parser combinators are in this package in the module `Parsing.Combinators`.

-A parser combinator is a function which takes a parser as an argument and returns a new parser. The `Data.Array.many` combinator, for example, will repeat a parser as many times as it can. So the parser `many letter` will have type `Parser String (Array Char)`.
+A parser combinator is a function which takes a parser as an argument and returns a new parser. The `many` combinator, for example, will repeat a parser as many times as it can. So the parser `many letter` will have type `Parser String (List Char)`.

 Running the parser

 ```purescript
 runParser "aBabaB" (many ayebee)
 ```

-will return `Right [true, false, true]`.
+will return `Right (true : false : true : Nil)`.

 ### Stack-safety

@@ -114,15 +114,15 @@ stack-safe.

 The original short classic [FUNCTIONAL PEARLS *Monadic Parsing in Haskell*](https://www.cs.nott.ac.uk/~pszgmh/pearl.pdf) by Graham Hutton and Erik Meijer 1998.

-[*Revisiting Monadic Parsing in Haskell*](https://vaibhavsagar.com/blog/2018/02/04/revisiting-monadic-parsing-haskell/) by Vaibhav Sagar is a reflection on the Hutton, Meijer FUNCTIONAL PEARL.
+[*Parsec: Direct Style Monadic Parser Combinators For The Real World*](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/parsec-paper-letter.pdf) by Daan Leijen and Erik Meijer 2001.

 [*Parse, don't validate*](https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/) by Alexis King is about what it means to “parse” something, without any mention of monads.

+[*Revisiting Monadic Parsing in Haskell*](https://vaibhavsagar.com/blog/2018/02/04/revisiting-monadic-parsing-haskell/) by Vaibhav Sagar is a reflection on the Hutton, Meijer FUNCTIONAL PEARL.
+
 [*Parsec: “try a <|> b” considered harmful*](http://blog.ezyang.com/2014/05/parsec-try-a-or-b-considered-harmful/) by Edward Z. Yang is about how to decide when to backtrack
 from a failed alternative.

-[*Parsec: Direct Style Monadic Parser Combinators For The Real World*](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/parsec-paper-letter.pdf) by Daan Leijen and Erik Meijer 2001.
-
 [*Parser Combinators in Haskell*](https://serokell.io/blog/parser-combinators-in-haskell) by Heitor Toledo Lassarote de Paula.

 There are lots of other great monadic parsing tutorials on the internet.

src/Parsing/String/Basic.purs

Lines changed: 10 additions & 7 deletions
@@ -71,12 +71,12 @@ alphaNum = satisfyCP isAlphaNum <?> "letter or digit"

 -- | Parser based on the __Data.Number.fromString__ function.
 -- |
--- | This should be the inverse of `show :: String -> Number`.
+-- | This should be the inverse of `show :: Number -> String`.
 -- |
 -- | Examples of strings which can be parsed by this parser:
 -- | * `"3"`
 -- | * `"3.0"`
--- | * `"0.3"`
+-- | * `".3"`
 -- | * `"-0.3"`
 -- | * `"+0.3"`
 -- | * `"-3e-1"`
@@ -91,24 +91,26 @@ number =
 , string "-Infinity" *> pure (negate infinity)
 , string "NaN" *> pure nan
 , tryRethrow $ do
+-- This primitiv-ish parser should always backtrack on fail.
+-- Currently regex allows some illegal inputs, like "."
+-- The important thing is that the regex will find the correct
+-- boundary of a candidate string to pass to fromString.
 section <- numberRegex
 -- https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/parseFloat
 case Data.Number.fromString section of
 Nothing -> fail $ "Number.fromString failed"
--- Maybe this parser should set consumed flag if regex matches but fromString fails?
--- But currently regex allows some illegal inputs, like "."
--- Anyway this primitiv-ish parser should always backtrack on fail.
 Just x -> pure x
 ] <|> fail "Expected Number"

+-- Non-exported regex is compiled at startup time.
 numberRegex :: forall m. ParserT String m String
 numberRegex = either unsafeCrashWith identity $ regex pattern mempty
 where
 pattern = "[+-]?[0-9]*(\\.[0-9]*)?([eE][+-]?[0-9]*(\\.[0-9]*))?"

 -- | Parser based on the __Data.Int.fromString__ function.
 -- |
--- | This should be the inverse of `show :: String -> Int`.
+-- | This should be the inverse of `show :: Int -> String`.
 -- |
 -- | Examples of strings which can be parsed by this parser:
 -- | * `"3"`
@@ -121,10 +123,11 @@ intDecimal = tryRethrow do
 Nothing -> fail $ "Int.fromString failed"
 Just x -> pure x

+-- Non-exported regex is compiled at startup time.
 intDecimalRegex :: forall m. ParserT String m String
 intDecimalRegex = either unsafeCrashWith identity $ regex pattern mempty
 where
-pattern = "[+-]?[0-9]*"
+pattern = "[+-]?[0-9]+"

 -- | Helper function
 satisfyCP :: forall m. (CodePoint -> Boolean) -> ParserT String m Char
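
Note on the `intDecimalRegex` change above: moving from `[0-9]*` to `[0-9]+` means the pattern now requires at least one digit, so it can no longer succeed on an empty match. The following is a minimal sketch of that difference, using `Data.String.Regex` directly rather than the package's `regex` parser. The leading `^` anchor is an assumption added here to mimic matching at the current input position, and the names `oldPattern` and `newPattern` are illustrative only, not part of the commit.

```purescript
module Main where

import Prelude

import Data.String.Regex (test)
import Data.String.Regex.Flags (noFlags)
import Data.String.Regex.Unsafe (unsafeRegex)
import Effect (Effect)
import Effect.Console (logShow)

main :: Effect Unit
main = do
  let
    -- Hypothetical names; the `^` anchor is an assumption of this sketch.
    oldPattern = unsafeRegex "^[+-]?[0-9]*" noFlags
    newPattern = unsafeRegex "^[+-]?[0-9]+" noFlags
  -- The old pattern matches the empty prefix of input with no digits.
  logShow (test oldPattern "abc") -- true
  -- The new pattern rejects input that does not start with a digit.
  logShow (test newPattern "abc") -- false
  -- Signed integers are accepted by the new pattern.
  logShow (test newPattern "-42") -- true
```

With the old pattern, a digit-free input would pass the regex stage and only be rejected later by `Data.Int.fromString`; with the new one the regex itself fails first.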

src/Parsing/String/Replace.purs

Lines changed: 12 additions & 12 deletions
@@ -113,8 +113,8 @@ breakCapT input sep = hush <$> runParserT input go

 -- | #### Break on and capture one pattern
 -- |
--- | Find the first occurence of a pattern in a text stream, capture the found
--- | pattern, and break the input text stream on the found pattern.
+-- | Find the first occurence of a pattern in the input `String`, capture the found
+-- | pattern, and break the input `String` on the found pattern.
 -- |
 -- | This function can be used instead of
 -- | [Data.String.indexOf](https://pursuit.purescript.org/packages/purescript-strings/docs/Data.String#v:indexOf)
@@ -139,8 +139,8 @@ breakCapT input sep = hush <$> runParserT input go
 -- |
 -- | #### Access the matched section of text
 -- |
--- | If you want to capture the matched string, then combine the pattern
--- | parser `sep` with `match`.
+-- | To capture the matched string combine the pattern
+-- | parser `sep` with the `match` combinator.
 -- |
 -- | With the matched string, we can reconstruct the input string.
 -- | For all `input`, `sep`, if
@@ -317,8 +317,8 @@ splitCapCombinator sep = tailRecM accum { carry: Nothing, rlist: Nil, arraySize:

 -- | #### Split on and capture all patterns
 -- |
--- | Find all occurences of the pattern parser `sep`, split the input string,
--- | capture all the patterns and the splits.
+-- | Find all occurences of the pattern parser `sep`, split the
+-- | input `String`, capture all the matched patterns and the splits.
 -- |
 -- | This function can be used instead of
 -- | [Data.String.Common.split](https://pursuit.purescript.org/packages/purescript-strings/docs/Data.String.Common#v:split)
@@ -332,11 +332,11 @@ splitCapCombinator sep = tailRecM accum { carry: Nothing, rlist: Nil, arraySize:
 -- | The input string will be split on every leftmost non-overlapping occurence
 -- | of the pattern `sep`. The output list will contain
 -- | the parsed result of input string sections which match the `sep` pattern
--- | in `Right`, and non-matching sections in `Left`.
+-- | in `Right a`, and non-matching sections in `Left String`.
 -- |
 -- | #### Access the matched section of text
 -- |
--- | If you want to capture the matched strings, then combine the pattern
+-- | To capture the matched strings combine the pattern
 -- | parser `sep` with the `match` combinator.
 -- |
 -- | With the matched strings, we can reconstruct the input string.
@@ -491,7 +491,7 @@ replaceT input sep = do
 -- | of the leftmost non-overlapping sections of the input string which match
 -- | the pattern parser `sep`, and
 -- | replace them with the result of the parser.
--- | The `sep` parser must return a result of type `String`.
+-- | The `sep` parser must return a result of type `String` for the replacement.
 -- |
 -- | This function can be used instead of
 -- | [Data.String.replaceAll](https://pursuit.purescript.org/packages/purescript-strings/docs/Data.String#v:replaceAll)
@@ -500,11 +500,11 @@ replaceT input sep = do
 -- |
 -- | #### Access the matched section of text in the `editor`
 -- |
--- | To get access to the matched string for the replacement
+-- | To get access to the matched string for calculating the replacement,
 -- | combine the pattern parser `sep`
--- | with `match`.
+-- | with the `match` combinator.
 -- | This allows us to write a `sep` parser which can choose to not
--- | edit the match and just leave it as it is.
+-- | replace the match and just leave it as it is.
 -- |
 -- | So, for all `sep`:
 -- |
