Description
Motivation: describe the problem to be solved
Currently, NannyML relies on pandas data frames and NumPy arrays. pandas is still the most widely used data frame library, but with newer ones such as Polars and Arrow becoming more prevalent, it would be wise to future-proof the project by supporting them. By using Narwhals, the devs can delegate this work to a third-party library while gaining support for more backends than just those mentioned. This may also speed up the code.
Describe the solution you'd like
NannyML becomes a data frame and array agnostic library. This would make NannyML more robust to changes, more versatile, and potentially more efficient (Narwhals is a thin compatibility layer, so users could run on faster backends such as Polars instead of pandas).
Describe alternatives you've considered
The alternatives are to continue using pandas only or to implement broader support in-house.
- Continuing to use pandas only makes the package less versatile and slower, but requires no extra work.
- Implementing the support in-house is more work and is unlikely to end up better than what Narwhals provides.
Additional context
Personally, I think the Narwhals API (which is very similar to that of Polars) is far superior to that of pandas and makes for much cleaner code too. I highly recommend considering this change. If this is something you are interested in looking into, it should be possible to ping Marco Gorelli to discuss any outstanding questions.