
[WIP] Feature: Wireshark Dissector Generator #8576


Open
wants to merge 8 commits into
base: master



@thejtshow thejtshow commented Apr 10, 2025

Hello! This is an awesome project and I'm eager to share some work I think would greatly benefit the flatbuffers community.

Description

This PR is my attempt to create a full-featured Wireshark dissector generator (see #153 and #8333).

This implementation is based on bfbs_gen_lua.h|cpp and relies only on reflection to retrieve all the necessary data. I tried to put all of the actual Lua logic into supporting files in the wireshark/ directory, which made the code much easier to iterate on. Almost all generated code boils down to "call this generic function with these specific parameters".

I built this wholly separate from the existing Lua generator because the existing generator hides a lot of the metadata behind hard-coded locals. I also think the output of this PR could be a great hands-on teaching tool for flatbuffers internals, for those of us curious (or masochistic) enough to dive into the details.

The file naming convention for generated/supporting files is a little weird, as Wireshark itself has some odd file loading quirks (see wireshark/README.md).

Output currently has two modes, regular and verbose. Regular is intended to show you just the data you want to see:

image

Verbose is intended to tell you what every single byte (except for padding) in a buffer is used for:

image

*(Both of these screenshots show auto-generated data full of gibberish - no, you aren't supposed to be seeing "real" strings.)
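To make the idea behind verbose mode concrete, here is a minimal Python sketch (not the Lua code from this PR) that labels every non-padding byte range of a tiny hand-built buffer. It assumes a root table whose fields are all int32; the function name `annotate_root_table` and the sample buffer are hypothetical and exist only for illustration.

```python
import struct


def annotate_root_table(buf):
    """Return sorted (start, length, label, value) tuples for a root table.

    Assumption for this sketch: every field in the table is an int32.
    """
    notes = []
    # The buffer starts with a uint32 offset to the root table.
    (root,) = struct.unpack_from("<I", buf, 0)
    notes.append((0, 4, "root table offset", root))
    # A table starts with a signed offset that is subtracted to find its vtable.
    (soffset,) = struct.unpack_from("<i", buf, root)
    notes.append((root, 4, "offset to vtable (signed)", soffset))
    vtable = root - soffset
    # The vtable header is two uint16 values: vtable size and inline table size.
    vtable_size, table_size = struct.unpack_from("<HH", buf, vtable)
    notes.append((vtable, 2, "vtable size", vtable_size))
    notes.append((vtable + 2, 2, "table inline size", table_size))
    # One uint16 slot per field; 0 means the field is absent (default value).
    for i in range((vtable_size - 4) // 2):
        slot = vtable + 4 + 2 * i
        (field_off,) = struct.unpack_from("<H", buf, slot)
        notes.append((slot, 2, f"field {i} offset", field_off))
        if field_off:
            (value,) = struct.unpack_from("<i", buf, root + field_off)
            notes.append((root + field_off, 4, f"field {i} value (int32)", value))
    return sorted(notes)


# Hand-built 20-byte buffer: root offset, vtable, 2 padding bytes, then the table.
buf = struct.pack("<IHHH2xii", 12, 6, 8, 4, 8, 42)

for start, length, label, value in annotate_root_table(buf):
    print(f"bytes {start}..{start + length - 1}: {label} = {value}")
```

The two bytes between the vtable and the table are alignment padding and get no label, which matches the "except for padding" caveat above.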

This PR is still a WIP and I intend to edit this description down to just the relevant information once I handle the last few features/issues. I wanted to get this up now to collect feedback and get some questions answered.

Still TODO:

  • Generate standalone dissector objects for root types (so you can point Wireshark at your proto and dissect without writing any lua, if your packet is pure flatbuffer data)
  • Support multiple root_type objects in the plugins directory
  • Support vector of union
    • Seems blocked by bfbs generation - reflection of _type is still UType not vector of UType
  • Support displaying non empty default strings and default vectors
    • Also seems blocked by bfbs generation - non scalar defaults not populated in reflection
  • Support proper display of bit_flags
    • Blocked by attributes not being sent to reflection - could work around this by requiring --bfbs-builtins
  • Support automatically parsing nested_flatbuffer
    • Blocked by attributes not being sent to reflection - could work around this by requiring --bfbs-builtins
  • Verify/fix windows paths
  • Support Wireshark 3 - requires require statements
  • Add a healthy dose of comments
  • Squash everything into a single commit
  • Handle (relevant) attributes (if possible?)
  • Testing? Unsure what automated testing would look like for this feature. I have worked to manually verify every feature I could. I'm sure I've missed plenty.
  • Documentation
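For the bit_flags item in the list above, "proper display" presumably means showing the names of the set bits rather than a raw integer. A minimal Python sketch of that decoding follows; it is not the PR's Lua code, and the `flag_names` helper and the `COLOR_FLAGS` table are hypothetical stand-ins for whatever the generator would emit once the attribute is visible through reflection.

```python
def flag_names(value, flags):
    """Return the names of all flags set in value.

    flags maps a name to its single-bit mask. Bits that match no known
    flag are reported as one trailing hex string so nothing is hidden.
    """
    names = [name for name, bit in flags.items() if value & bit]
    known = 0
    for bit in flags.values():
        known |= bit
    leftover = value & ~known
    if leftover:
        names.append(hex(leftover))
    return names


# Hypothetical bit_flags enum: each declared value N becomes the mask 1 << N.
COLOR_FLAGS = {"Red": 1 << 0, "Green": 1 << 1, "Blue": 1 << 3}

print(flag_names(9, COLOR_FLAGS))
```

Without knowing that an enum carries the bit_flags attribute, a dissector can only treat the value as an ordinary enum, which is why the item is listed as blocked.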

Open Questions

  1. Is there a canonical way to handle paths between operating systems in flatbuffers? I am trying to use full paths as namespacing for generated files. I might have missed something simple here.
  2. Am I reinventing the wheel anywhere? Am I doing anything boneheaded that is trivial to do with tools I missed?
  3. What documentation would be required to get this feature merged?
  4. Does anyone have any better ideas for file naming conventions? I went with a dumb, straightforward approach, but it doesn't exactly jibe with the rest of the repository.

Possible Future Work

  1. Built-in gRPC support. This might be as simple as doing nothing or adding a little extra code to the generated root types, but I'm not familiar enough with gRPC to integrate it right now. I think nothing stops one from writing a custom Lua dissector that calls the built-in gRPC dissector and then one of these generated ones.
  2. FlexBuffers support. No clue what this would look like, as I have no experience with FlexBuffers; it would also likely need some attribute visibility modifications to be able to act on them.
  3. Allow users to specify whether they want specific number fields displayed in other formats (hex or binary).
  4. The current architecture does not support filtering vectors/arrays of scalars in Wireshark's filter bar, as these do not have their own specific protocol field types (i.e. you can't say "show me packets which have a value of 5 in this array"). This may be a minor update I can tackle in the initial release.


google-cla bot commented Apr 10, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

github-actions bot added the c++, CI, codegen, documentation, and lua labels on Apr 10, 2025
@thejtshow
Author

Found a small bug when the payload doesn't start at the beginning of the buffer, and another small issue with parse_struct. Additionally, loading is a little weird on Wireshark < 4.0. Will address these later today.

1. if offset is nonzero, the parse function failed
2. struct and table name lookup now done by member_list
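The first fix is a reminder that every offset inside a FlatBuffer is relative to a position within the payload, so a parser handed a larger packet has to add the payload's starting position to each absolute read. The Python sketch below shows one plausible shape of that bug class; it is not the PR's Lua code, and `first_int32_field`, the 3-byte header, and the sample payload are all hypothetical.

```python
import struct


def first_int32_field(packet, base=0):
    """Read field 0 (assumed int32) of the root table of a FlatBuffer
    that starts at packet[base]. Returns None if the field is absent."""
    (root_rel,) = struct.unpack_from("<I", packet, base)
    # The root offset is relative to the payload start, not the packet start.
    table = base + root_rel
    (soffset,) = struct.unpack_from("<i", packet, table)
    vtable = table - soffset
    (vtable_size,) = struct.unpack_from("<H", packet, vtable)
    if vtable_size < 6:
        return None  # the vtable has no slot for field 0
    (field_off,) = struct.unpack_from("<H", packet, vtable + 4)
    if field_off == 0:
        return None  # field not stored, the schema default applies
    (value,) = struct.unpack_from("<i", packet, table + field_off)
    return value


# A 20-byte FlatBuffer holding one int32 field with value 42.
payload = struct.pack("<IHHH2xii", 12, 6, 8, 4, 8, 42)
# The same payload behind a made-up 3-byte transport header.
packet = b"\xAA\xBB\xCC" + payload

print(first_int32_field(payload))
print(first_int32_field(packet, base=3))
```

Calling `first_int32_field(packet)` without `base` raises `struct.error` here, because the header bytes are misread as part of the root offset.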
@thejtshow
Author

Currently tracking a limitation where you can only have one dissector present in the plugins directory at a time. Wireshark doesn't like you sharing common ProtoFields apparently.
