GSoC 2024 ‐ Snehil Shah
Hey there! I am Snehil Shah, a computer science undergraduate (as of writing this) at the Indian Institute of Information Technology, Nagpur, India. Apart from my interest in computers and software, I have a dormant passion for audio DSP and synthesis.
The read-eval-print loop (REPL) is a fixture of data analysis and numerical computing and provides a critical entry-point for individuals seeking to learn and better understand APIs and their associated behavior. For a library emphasizing numerical and scientific computing, a well-featured REPL becomes an essential tool allowing users to easily visualize and work with data in an interactive environment. The stdlib REPL is a command-line based interactive interpreter environment for Node.js equipped with namespaces and tools for statistical computing and data exploration enabling easy prototyping, testing, debugging, and programming.
This project aimed to implement a suite of enhancements to the stdlib REPL to achieve feature parity with similar environments for scientific computing such as IPython and Julia. These enhancements include:
- Fuzzy auto-completion
- Syntax highlighting
- Visualization tools for tabular data
- Multi-line editing
- Paged outputs
- Bracketed-paste
- and more...
My work on the REPL started before the official coding period began. Before that, I had contributed to some good first issues (implemented some easier packages, C implementations, and refactorings) to get a gist of project conventions and contribution flow. Since then, we have had an array of improvements to the REPL. Let's go through each of them from the beginning:
-
My first work on the REPL was implementing auto-closing brackets/quotations, a common feature in IDEs and code editors.
- #1680 - feat: add support for auto-closing brackets/quotations in the REPL
Steps:
- The approach is to walk the abstract syntax tree (generated by acorn) for the current line and detect an auto-closing candidate. If one is found, we write the corresponding closing symbol to the output.
- Now there are also cases where the user instinctively types the closing symbol themselves as if this feature never existed, and we should respect that and allow the user to type through the auto-appended symbol.
- What about auto-deleting the appended symbol? When deleting an opening symbol, we check whether the corresponding closing symbol immediately follows it and, if so, delete it as well. We avoid this behavior when the user is editing a string literal, again using acorn to check whether the nodes around the cursor are strings.
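The three behaviors above (auto-append, type-through, and auto-delete) can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the actual PR inspects an acorn AST, whereas this sketch swaps in a plain character scan to decide whether the cursor sits inside a string. The names `PAIRS`, `insideString`, `onInsert`, and `onBackspace` are hypothetical and do not come from the REPL codebase.

```javascript
// Hypothetical sketch; the real implementation walks an acorn AST instead.
const PAIRS = { '(': ')', '[': ']', '{': '}', '\'': '\'', '"': '"', '`': '`' };
const CLOSERS = new Set( Object.values( PAIRS ) );

// Returns true if position `pos` of `line` lies inside a string literal
// (simple quote scan, standing in for the AST-based check).
function insideString( line, pos ) {
    let quote = null;
    for ( let i = 0; i < pos; i++ ) {
        const ch = line[ i ];
        if ( quote ) {
            if ( ch === '\\' ) {
                i += 1; // skip the escaped character
            } else if ( ch === quote ) {
                quote = null;
            }
        } else if ( ch === '\'' || ch === '"' || ch === '`' ) {
            quote = ch;
        }
    }
    return quote !== null;
}

// Handles a typed character and returns the new editor state.
function onInsert( line, cursor, ch ) {
    const before = line.slice( 0, cursor );
    const after = line.slice( cursor );
    // Type-through: the closing symbol is already there, so only move the cursor.
    if ( CLOSERS.has( ch ) && after[ 0 ] === ch ) {
        return { line: line, cursor: cursor + 1 };
    }
    // Auto-append the closing symbol, unless we are inside a string.
    if ( PAIRS[ ch ] !== undefined && !insideString( line, cursor ) ) {
        return { line: before + ch + PAIRS[ ch ] + after, cursor: cursor + 1 };
    }
    return { line: before + ch + after, cursor: cursor + 1 };
}

// Handles a backspace and returns the new editor state.
function onBackspace( line, cursor ) {
    if ( cursor === 0 ) {
        return { line: line, cursor: cursor };
    }
    const removed = line[ cursor - 1 ];
    let end = cursor;
    // Auto-delete the matching closing symbol when it directly follows,
    // but not while editing a string literal.
    if (
        PAIRS[ removed ] !== undefined &&
        PAIRS[ removed ] === line[ cursor ] &&
        !insideString( line, cursor - 1 )
    ) {
        end = cursor + 1;
    }
    return { line: line.slice( 0, cursor - 1 ) + line.slice( end ), cursor: cursor - 1 };
}
```

A character scan like this is easy to follow but cannot distinguish, for example, quotes inside comments or regular expressions, which is one reason an AST-based check is the more robust choice.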
As my first PR on the REPL, it wasn't the safest landing, with @kgryte doing most of the heavy lifting. We did get it through the finish line after a month of coding and review cycles, and by this time I had a good grasp of the REPL codebase.
-
Earlier, when an output was longer than the terminal height, it would scroll all the way to the end of the output. This meant the user had to scroll all the way back up to start reading the output. The pager aims to capture long outputs and display them in a scrollable way.
- #2162 - feat: add pager to allow scrolling of long outputs in the REPL
In general, pagers are simply implemented by halting the printing till the terminal height and waiting for user input to print further. But I wanted to do it differently. With our UI, we page in-place, meaning the pager appears like a screen, and we can still see the parent command on the top. The only downside to this might be the possible jittering of output as we rely on re-rendering the page upon every scroll.
Steps:
- Detecting a pageable output. We do this by checking if the number of rows in the output is greater than the height of the terminal stream (including space for the input command).
- Write the page UI, and maintain the page indexing. During paging mode, the entire REPL is frozen, and is only receptive to pager controls and `SIGINT` interrupts.
- As we receive the page up/down controls, update the indices and re-render the page UI.
- Listen to `SIGWINCH` events to make it receptive to terminal resizes.
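The bookkeeping behind these steps can be sketched as below. This is a minimal illustration and not the REPL's actual pager: `isPageable`, `createPager`, and the `reservedRows` parameter (rows kept for the input command) are hypothetical names, and the sketch only computes which lines belong on the current page rather than drawing anything.

```javascript
// Hypothetical sketch of pager state; rendering and key handling are omitted.

// An output is pageable when it has more rows than the terminal can show
// after reserving space for the input command.
function isPageable( output, terminalRows, reservedRows ) {
    return output.split( '\n' ).length > terminalRows - reservedRows;
}

function createPager( output, terminalRows, reservedRows ) {
    const lines = output.split( '\n' );
    let height = Math.max( 1, terminalRows - reservedRows );
    let top = 0; // index of the first visible line

    // Keep the page window within the bounds of the output.
    function clamp() {
        const maxTop = Math.max( 0, lines.length - height );
        top = Math.min( Math.max( top, 0 ), maxTop );
    }
    return {
        // Lines to draw for the current page.
        page: function page() {
            return lines.slice( top, top + height );
        },
        // Move the window by `delta` rows (negative scrolls up) and re-page.
        scroll: function scroll( delta ) {
            top += delta;
            clamp();
            return this.page();
        },
        // Recompute the page after the terminal height changes.
        resize: function resize( rows ) {
            height = Math.max( 1, rows - reservedRows );
            clamp();
            return this.page();
        }
    };
}
```

In a real terminal, something like `process.stdout.on( 'resize', ... )` could feed `process.stdout.rows` into `resize`, and each call's return value would be re-rendered in place.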
Maintenance work:
-
- #2178 - feat: add a stdlib ASCII art in REPL's default welcome message
Time for a REPL makeover with some new ASCII art.
-
Before:
-
After:
-
One of the most requested and crucial additions to the REPL was syntax highlighting.
-
#2254 - feat: add syntax highlighting in the REPL
This PR adds the core modules for syntax highlighting, namely the tokenizer, and highlighter.
Steps:
- With every keypress, capture the updated line.
- Check whether the line has actually changed. This is a small caching mechanism to avoid performance drag during events such as moving the cursor left/right in the REPL.
- Tokenization. To support various token types, a good tokenizer is crucial. We use acorn to parse the line into an abstract syntax tree. During parsing, we keep a record of basic tokens like comments, operators, punctuation, strings, numbers, and regexps. To resolve declarations, we traverse the AST and collect all declarations (functions, classes, variables, etc.) in the local scope (not yet added to the global context). To resolve identifiers, we search the scopes in the order local > command > global. To resolve member expressions, we recursively traverse (and compute where needed) the global context to tokenize each member.
- Highlight. Each of the token types is then colored accordingly using ANSI escape sequences, and the line is re-rendered in its highlighted form.
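The overall cache, tokenize, and highlight flow can be illustrated with a much smaller stand-in. Note the swap: the sketch below uses a single regular expression instead of acorn, so it only recognizes comments, strings, numbers, and a few keywords, and does no scope or member resolution. The theme colors and the names `THEME`, `tokenize`, `highlight`, and `render` are assumptions made for illustration.

```javascript
// Hypothetical sketch: a regex tokenizer standing in for the acorn-based one.
const RESET = '\u001b[0m';
const THEME = {
    comment: '\u001b[90m', // bright black
    string: '\u001b[32m',  // green
    number: '\u001b[33m',  // yellow
    keyword: '\u001b[35m'  // magenta
};
const KEYWORDS = new Set( [ 'var', 'let', 'const', 'function', 'return', 'if', 'else', 'for', 'while', 'class', 'new' ] );

// Alternation order matters: comments and strings are matched before numbers and words.
const RE = /(\/\/.*$)|("(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*')|(\b\d+(?:\.\d+)?\b)|([A-Za-z_$][\w$]*)/g;

// Returns a list of { type, start, end } tokens for a single line.
function tokenize( line ) {
    const tokens = [];
    for ( const m of line.matchAll( RE ) ) {
        let type = null;
        if ( m[ 1 ] !== undefined ) {
            type = 'comment';
        } else if ( m[ 2 ] !== undefined ) {
            type = 'string';
        } else if ( m[ 3 ] !== undefined ) {
            type = 'number';
        } else if ( KEYWORDS.has( m[ 4 ] ) ) {
            type = 'keyword';
        }
        if ( type !== null ) {
            tokens.push( { type: type, start: m.index, end: m.index + m[ 0 ].length } );
        }
    }
    return tokens;
}

// Wraps each token in the ANSI sequence for its type.
function highlight( line ) {
    let out = '';
    let pos = 0;
    for ( const t of tokenize( line ) ) {
        out += line.slice( pos, t.start ) + THEME[ t.type ] + line.slice( t.start, t.end ) + RESET;
        pos = t.end;
    }
    return out + line.slice( pos );
}

// Re-highlights only when the line changed since the last keypress.
let lastLine = null;
let lastOut = '';
function render( line ) {
    if ( line !== lastLine ) {
        lastLine = line;
        lastOut = highlight( line );
    }
    return lastOut;
}
```

A regex cannot tell a declared variable from an unknown identifier, which is where the AST traversal and scope resolution described above come in.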
-
#2291 - feat: add APIs, commands and tests for syntax-highlighting in the REPL
A follow-up PR adding REPL prototype methods, in-REPL commands, and test suites for the syntax highlighter. This adds various APIs for theming in the REPL, allowing the user to configure it with their own set of themes. Another small thing I took care of was disabling highlighting in non-TTY environments.
-
#2341 - feat: add combined styles and inbuilt syntax highlighting themes in the REPL
This PR adds support for combining ANSI colors and styles to make hybrid colors. So something like `italic red bgBrightGreen` is supported. This will allow for more expressive theming. It also adds more in-built themes.
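To make the idea of combined styles concrete, here is a small sketch of how a space-separated style string could be turned into a single ANSI escape sequence. The SGR codes used (3 for italic, 31 for red foreground, 102 for bright green background, and so on) are standard; the `ANSI` table, `compose`, and `style` are hypothetical names and cover only a handful of styles, not the REPL's actual theming API.

```javascript
// Hypothetical sketch: map style names to standard SGR codes and join them.
const ANSI = {
    bold: 1,
    italic: 3,
    underline: 4,
    red: 31,
    green: 32,
    brightGreen: 92,
    bgRed: 41,
    bgBrightGreen: 102
};

// Converts e.g. 'italic red bgBrightGreen' into one escape sequence.
function compose( spec ) {
    const codes = spec.trim().split( /\s+/ ).map( function toCode( name ) {
        if ( !Object.prototype.hasOwnProperty.call( ANSI, name ) ) {
            throw new Error( 'unknown style: ' + name );
        }
        return ANSI[ name ];
    } );
    return '\u001b[' + codes.join( ';' ) + 'm';
}

// Wraps text in the combined style and resets afterwards.
function style( text, spec ) {
    return compose( spec ) + text + '\u001b[0m';
}
```

Because SGR parameters can be chained with semicolons, one sequence is enough to apply a style, a foreground color, and a background color at once.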
Maintenance work:
-
-
Prior to this, the REPL did support multi-line inputs using incomplete statements, but offered no way to edit them. Adding multi-line editing meant adding support for inserting lines manually, and the ability to move up and edit earlier lines as in a normal editor.
- #2347 - feat: add multiline editing in the REPL
Implementing this is not as easy as it seems. Initially, I thought that just updating the `_rli` instance and using escape sequences with the updated lines by tracking each up/down keypress event would do the trick. But internally, `readline` refreshes the stream after operations like left/right/delete etc. This meant that if we were at line 2 and the stream was refreshed, everything below that line was gone. So, to actually implement this, we had to implement manual rendering with each keypress event.
Steps:
- Track each keypress event like up/down/right/left, backspace (for continuous deletion), and CTRL+O (for manually adding a new line), if the input is a multi-line input.
- Maintain line and cursor indices, and highlighted line buffers to store rendering data.
- After every keypress event, visually render the remaining lines below the current line.
- Maintain the final `_cmd` buffer for final execution.
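The state tracked in these steps might look something like the sketch below. It is an assumption-laden illustration, not the REPL's code: `createEditor` and its methods are hypothetical, and actual terminal rendering is left out. The point is the bookkeeping: a line index, a cursor index, the lines below the cursor that must be redrawn after `readline` refreshes, and the joined command buffer.

```javascript
// Hypothetical sketch of multi-line editor state; no terminal I/O is performed.
function createEditor( text ) {
    const lines = text.split( '\n' );
    let row = lines.length - 1;     // current line index
    let col = lines[ row ].length;  // cursor index within the current line

    return {
        up: function up() {
            if ( row > 0 ) {
                row -= 1;
                col = Math.min( col, lines[ row ].length );
            }
        },
        down: function down() {
            if ( row < lines.length - 1 ) {
                row += 1;
                col = Math.min( col, lines[ row ].length );
            }
        },
        insert: function insert( str ) {
            lines[ row ] = lines[ row ].slice( 0, col ) + str + lines[ row ].slice( col );
            col += str.length;
        },
        // Manually add a new line at the cursor (CTRL+O in the REPL).
        newline: function newline() {
            const rest = lines[ row ].slice( col );
            lines[ row ] = lines[ row ].slice( 0, col );
            lines.splice( row + 1, 0, rest );
            row += 1;
            col = 0;
        },
        // Delete one character, joining with the previous line at column 0.
        backspace: function backspace() {
            if ( col > 0 ) {
                lines[ row ] = lines[ row ].slice( 0, col - 1 ) + lines[ row ].slice( col );
                col -= 1;
            } else if ( row > 0 ) {
                col = lines[ row - 1 ].length;
                lines[ row - 1 ] += lines[ row ];
                lines.splice( row, 1 );
                row -= 1;
            }
        },
        // Lines that need to be re-rendered below the current line.
        below: function below() {
            return lines.slice( row + 1 );
        },
        // Final buffer to hand over for execution.
        command: function command() {
            return lines.join( '\n' );
        },
        position: function position() {
            return { row: row, col: col };
        }
    };
}
```

After each keypress, a renderer would write the current line and then everything returned by `below()`, which restores the lines that a stream refresh would otherwise wipe out.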
-
A plot API for visualizing tabular data can be leveraged for downstream tasks like TTY rendering in the REPL or even in Jupyter environments, allowing users to easily work with tabular data when doing data analysis in the REPL (or elsewhere).
-
#2407 - feat: add `plot/table/unicode`
The plot API supports data types like `Array<Array>`, `MatrixLike` (2D `<ndarray>`), `Array<Object>`, and `Object`. The API is highly configurable, giving users full control over how the render looks instead of limiting them to a pre-defined set of presets. This is what the default render looks like:
┌───────┬──────┬───────┐
│  col1 │ col2 │  col3 │
├───────┼──────┼───────┤
│    45 │   33 │ hello │
│ 32.54 │ true │  null │
└───────┴──────┴───────┘
The plotter also supports wrapping tables, a one-of-a-kind feature that breaks the table into segmented sub-tables when given an appropriate `maxOutputWidth` value.
Implementing this API has been tedious (as evident from the PR footprint), mainly because of the number of properties and signatures it needs in order to parse various data types and support this level of configurability.
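For a feel of what rendering such a table involves, here is a small standalone sketch that draws a box-drawing table from an `Array<Array>` input. It is not the `plot/table/unicode` API: `renderTable` is a hypothetical function, it assumes rectangular data, right-aligns every cell as a simplifying assumption, and does not attempt wrapping or any of the configurable properties.

```javascript
// Hypothetical sketch: render headers and rows as a Unicode box-drawing table.
function renderTable( headers, rows ) {
    // Stringify every cell so that widths can be measured.
    const cells = [ headers ].concat( rows ).map( function toStrings( r ) {
        return r.map( String );
    } );
    // Each column is as wide as its longest cell.
    const widths = headers.map( function width( _, j ) {
        return Math.max.apply( null, cells.map( function len( r ) {
            return r[ j ].length;
        } ) );
    } );
    // Builds a horizontal rule with the given left, middle, and right joints.
    function rule( left, mid, right ) {
        return left + widths.map( function dash( w ) {
            return '─'.repeat( w + 2 );
        } ).join( mid ) + right;
    }
    // Builds one row of right-aligned cells with one space of padding.
    function line( r ) {
        return '│' + r.map( function pad( c, j ) {
            return ' ' + c.padStart( widths[ j ] ) + ' ';
        } ).join( '│' ) + '│';
    }
    return [ rule( '┌', '┬', '┐' ), line( cells[ 0 ] ), rule( '├', '┼', '┤' ) ]
        .concat( cells.slice( 1 ).map( line ) )
        .concat( [ rule( '└', '┴', '┘' ) ] )
        .join( '\n' );
}

console.log( renderTable( [ 'col1', 'col2', 'col3' ], [ [ 45, 33, 'hello' ], [ 32.54, true, null ] ] ) );
```

Supporting other input shapes such as `Array<Object>` would mostly mean normalizing them into headers and rows before this rendering step.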
-
General Bug Fixes
- https://github.com/stdlib-js/stdlib/pull/2430
- https://github.com/stdlib-js/stdlib/pull/2435
TODO: add a summary of the current state of the project.
TODO: add a summary of what remains left to do. If there is a tracking issue, include a link to that issue.
TODO: add a summary of any unexpected challenges that you faced, along with any lessons learned during the course of your project.
TODO: add a report summary and include any acknowledgments (e.g., shout outs to contributors/maintainers who helped out along the way, etc).