More parser examples #465

@phdoerfler

I'd appreciate a larger list of example parsers. There are the CSV parser and the JSON parser, as well as the two calculator parsers and the ABC parser, of course. However, given sufficiently low caffeine levels, these and the documentation are not enough. I expect that my parser-writing endeavors would be helped by more examples to draw inspiration from.
Is there some place where people contribute their Parboiled2 parsers? GitHub's discussion tab or the wiki could be a great place for this. ANTLR has its own repository with lots of examples.

Compare:

To provide some context as to what kind of problem I am trying to solve, imagine sbt's "inspect tree" output:

[info] Loading global plugins from /home/foobar/.sbt/1.0/plugins
[info] Loading project definition from /work/space/scratch/project
[info] Set current project to scratch (in build file:/work/space/scratch/)
[info] some-project/*:packModuleEntries = Task[scala.collection.Seq[xerial.sbt.pack...
[info]   +-some-project/*:packDuplicateJarStrategy = latest
[info]   +-some-project/*:packExcludeArtifactTypes = List(source, javadoc, test, it,..
[info]   +-some-project/*:packExcludeJars = List(.*log4j.*, .*logback.*, .*specs2.*)
[info]   +-some-project/*:packModuleEntries::streams = Task[sbt.std.TaskStreams[sbt...
[info]   | +-*/*:streamsManager = Task[sbt.std.Streams[sbt.Init$ScopedKey[_ <: Any]]]
[info]   | 
[info]   +-some-project/*:update = Task[sbt.UpdateReport]
[info]     +-*/*:appConfiguration = xsbt.boot.AppConfiguration@ab2e887
[info]         

Say you are interested in the tasks and settings, e.g., some-project/*:packModuleEntries::streams, and perhaps their indentation depth, and don't care about the rest. The output spans multiple lines, not all of which mention a task or setting. Is it better to treat it as one big string and write a parser such as

ContentWeAreInterestedIn.separatedBy(ContentWeIgnore)

?
Or is it better to treat it as individual lines, remove the [info] prefix separately, and maybe even apply another rule to the cleaned-up line using a subparser or the ~> operator? When dealing with separate lines, how does one best filter out lines that do not contain a task or setting, such as the first few lines and the fourth-to-last line, which contains only a |? I think there are multiple valid approaches to this, and I write parsers infrequently enough that figuring this out feels like an uphill battle compared to just using RegEx, .filter, etc., again.
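
For comparison, here is a minimal sketch of the line-by-line option using only the Scala standard library. It is not a Parboiled2 parser, just the RegEx-and-filter baseline that any parser-based answer would have to beat. The names `Entry`, `InspectTree`, `parseLine` and `parse` are made up for illustration, and the rule that depth equals half the width of the tree-drawing prefix is an assumption read off the sample output above, not something sbt documents.

```scala
// Hypothetical names throughout; nothing here comes from Parboiled2 or sbt.
final case class Entry(depth: Int, key: String)

object InspectTree {
  private val InfoPrefix = "[info] "

  // Group 1: tree-drawing prefix made of spaces and '|' characters.
  // Group 2: optional "+-" branch marker (absent on the root line).
  // Group 3: the task or setting key, i.e. everything up to " = ".
  private val EntryLine = """^([ |]*)(\+-)?(\S+) = .*$""".r

  /** Returns the entry on this line, or None for lines without a task or setting. */
  def parseLine(raw: String): Option[Entry] =
    if (!raw.startsWith(InfoPrefix)) None
    else raw.stripPrefix(InfoPrefix) match {
      // The ':' check is an assumed sanity test for "looks like an sbt key".
      case EntryLine(indent, marker, key) if key.contains(":") =>
        // Assumption: each nesting level adds two columns before the "+-" marker.
        val depth = if (marker == null) 0 else indent.length / 2
        Some(Entry(depth, key))
      case _ => None
    }

  /** Splits the output into lines and keeps only those that carry an entry. */
  def parse(output: String): List[Entry] =
    output.linesIterator.flatMap(parseLine).toList
}
```

Calling `InspectTree.parse` on the output above should drop the "Loading ..." and "Set current project ..." lines as well as the lines that hold only a `|` or whitespace, because none of them contain a key followed by ` = `.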

In summary, a cookbook where users could contribute small examples sounds (to me) like a great idea, especially for slightly messy inputs. What are your thoughts?
