
Add GUI Chatbox for GPULlama3.java Inference #33


Open · wants to merge 20 commits into main

Conversation

@svntax commented Jun 25, 2025

This PR adds a new JavaFX GUI for running inference with GPULlama3 (for issue #24).

It adds a new package com.example.gui containing all the new classes for the chatbox GUI, following a Model-View-Controller-Interactor framework.

Key Features

  • Dropdown menu to select an engine (TornadoVM, JVM).
  • Browse button to select the directory of the user's GPULlama3.java install.
  • Dropdown menu and Reload button to search for models inside a /models folder in the user's GPULlama3.java directory.
  • Prompt text field for user input.
  • Run button to trigger inference by running llama-tornado as a new process.
  • Output area to display responses and other logs.

How to Run

After following the "Install, Build, and Run" instructions from the README, run the following:

mvn javafx:run

Notes

  • Dark theme styling is from AtlantaFX.
  • The right panel of the GUI is unfinished: it contains the System Monitoring panel with checkboxes (currently non-functional) and an empty text area where the monitoring terminals would be displayed.
  • I'm using Windows, so Linux / macOS is untested.

Next Steps

  • I can try to add the system monitoring features, although I'm not sure how far I'll get: I'm on Windows, so as far as I know I can't test htop, nvtop, or any of the Linux-specific options.
  • Is the system monitoring display supposed to be embedded terminals? I did a bit of searching and found projects like TerminalFX and JediTermFX for this, but I don't know if that's the best way to implement this.

svntax added 11 commits June 21, 2025 05:16
Following the Model-View-Controller-Interactor framework, added new classes for each component. The view layout follows the POC image from issue beehive-lab#24. The model reflects the properties the user can change in the GUI, which are set up by the controller. The interactor will have the logic for triggering inference and updating the output displays.
The Run button runs 'llama-tornado' as a new process and passes command-line options to it by reading from the chatbox model object. All response and error logs are displayed in the main output text area.
Set minimum widths for all buttons and labels. Replaced SplitPane container node with HBox to avoid having a divider in the middle.
Set up AtlantaFX dependency, changed the GUI style to a dark theme (CupertinoDark), and set accents for buttons.
@CLAassistant commented Jun 25, 2025

CLA assistant check
All committers have signed the CLA.

@mikepapadim requested review from Copilot and mikepapadim, and removed the request for Copilot, June 25, 2025 10:25
@Copilot (Copilot AI, Contributor) left a comment

Pull Request Overview

This PR introduces a new JavaFX GUI for running inference with GPULlama3.java using an MVC-I approach. The changes include:

  • A new package (com.example.gui) containing the GUI components (LlamaChatbox, ChatboxController, ChatboxViewBuilder, etc.).
  • Enhancements to model and interactor classes to support process-based inference.
  • Updates to pom.xml to add JavaFX and AtlantaFX dependencies.

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Summary per file:

  • src/main/java/com/example/gui/LlamaChatbox.java: Main entry point that sets the dark theme and launches the GUI.
  • src/main/java/com/example/gui/ChatboxViewBuilder.java: Constructs the chatbox UI components and binds UI controls to the model.
  • src/main/java/com/example/gui/ChatboxModel.java: Holds the data properties used by the chatbox UI.
  • src/main/java/com/example/gui/ChatboxInteractor.java: Executes the llama-tornado process and streams its output.
  • src/main/java/com/example/gui/ChatboxController.java: Orchestrates the UI initialization and inference task execution.
  • pom.xml: Updates dependencies and configures the JavaFX Maven plugin.

outputArea.setEditable(false);
outputArea.setWrapText(true);
VBox.setVgrow(outputArea, Priority.ALWAYS);
model.outputTextProperty().subscribe((newValue) -> {
Copilot AI commented Jun 25, 2025

The use of 'subscribe' on a JavaFX StringProperty is non-standard. Please replace it with the standard addListener method to ensure proper property change notifications.
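For reference, a minimal sketch of the listener-based form, assuming the subscribe body updates the bound output area (note that ObservableValue.subscribe(Consumer) does exist from JavaFX 21 onward, so this mainly matters when targeting older JavaFX releases):

    // Standard listener form (sketch; assumes the original subscribe body
    // sets the output area's text from the model property).
    model.outputTextProperty().addListener((observable, oldValue, newValue) -> {
        outputArea.setText(newValue);
    });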


Comment on lines 70 to 75
while ((line = bufferedReader.readLine()) != null) {
    builder.append(line);
    builder.append(System.getProperty("line.separator"));
    final String currentOutput = builder.toString();
    javafx.application.Platform.runLater(() -> model.setOutputText(currentOutput));
}
Copilot AI commented Jun 25, 2025

Updating the UI on every line read from the process may cause performance issues for high-volume output. Consider batching updates to reduce the overhead on the JavaFX application thread.

Suggested change:

    int lineCounter = 0; // Counter to track the number of lines read.
    while ((line = bufferedReader.readLine()) != null) {
        builder.append(line);
        builder.append(System.getProperty("line.separator"));
        lineCounter++;
        // Update the UI every 100 lines; any remainder is flushed after the loop.
        if (lineCounter >= 100) {
            final String currentOutput = builder.toString();
            javafx.application.Platform.runLater(() -> model.setOutputText(currentOutput));
            lineCounter = 0; // Reset the counter after updating the UI.
        }
    }
    // Ensure any remaining output is flushed to the UI.
    final String remainingOutput = builder.toString();
    javafx.application.Platform.runLater(() -> model.setOutputText(remainingOutput));


@mikepapadim (Member) commented

Thank you @svntax, it worked just fine on my setup!

@mikepapadim (Member) left a comment

This is great! Thank you for the contribution to #24.

I have some suggestions to make it a bit more robust:

  1. Let's avoid the indirection of calling the Python script from the Java process. Let's extend main to something like the following, and make the GUI part of the actual application:
    // LlamaApp.java
    public static void main(String[] args) throws IOException {
        Options options = Options.parseOptions(args);

        if (options.guiMode()) { // Add a new guiMode() option
            // Launch the JavaFX application
            Application.launch(LlamaGui.class, args);
        } else {
            // Run the existing CLI logic
            Model model = loadModel(options);
            Sampler sampler = createSampler(model, options);
            if (options.interactive()) {
                model.runInteractive(sampler, options);
            } else {
                model.runInstructOnce(sampler, options);
            }
        }
    }
  2. Then we can modify the llama-tornado script to add a --gui flag and launch the GUI from there.
  3. We need to add to the GUI a checkbox or drop-down menu for the two modes: instruct or interactive (see the sketch of these flags after this list).
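A hypothetical sketch of how Options might expose these flags; the record and parsing below are simplified placeholders, not the project's actual parser:

    // Hypothetical, simplified Options; the real class parses many more flags.
    public record Options(boolean guiMode, boolean interactive) {
        public static Options parseOptions(String[] args) {
            boolean gui = false;
            boolean interactive = false;
            for (String arg : args) {
                if ("--gui".equals(arg)) {
                    gui = true;          // launch the JavaFX chatbox instead of the CLI
                } else if ("--interactive".equals(arg)) {
                    interactive = true;  // chat loop rather than a single instruct run
                }
            }
            return new Options(gui, interactive);
        }
    }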

The output text can then be obtained directly from:

  1. System.out.println(responseText);
  2. System.out.print(tokenizer().decode(List.of(token)));
  3. in the interactive case: System.out.print(tokenizer().decode(List.of(token)));

If these changes make sense to you, feel free to extend the PR.

Thanks

    commands.add(String.format("%s\\external\\tornadovm\\.venv\\Scripts\\python", llama3Path));
    commands.add("llama-tornado");
} else {
    commands.add(String.format("%s/.llama-tornado", llama3Path));

Suggested change:
- commands.add(String.format("%s/.llama-tornado", llama3Path));
+ commands.add(String.format("%s/llama-tornado", llama3Path));

.llama-tornado does not work on Linux

Comment on lines 80 to 101
private Node createLlama3PathBox() {
    Button browseButton = new Button("Browse");
    browseButton.getStyleClass().add(Styles.ACCENT);
    browseButton.setMinWidth(80);
    browseButton.disableProperty().bind(inferenceRunning);
    browseButton.setOnAction(e -> {
        DirectoryChooser dirChooser = new DirectoryChooser();
        dirChooser.setTitle("Select GPULlama3.java Directory");
        File selectedDir = dirChooser.showDialog(browseButton.getScene().getWindow());
        if (selectedDir != null) {
            model.setLlama3Path(selectedDir.getAbsolutePath());
        }
    });

    TextField pathField = boundTextField(model.llama3PathProperty());
    HBox box = new HBox(8, createLabel("Llama3 Path:"), pathField, browseButton);
    box.setAlignment(Pos.CENTER_LEFT);
    HBox.setHgrow(pathField, Priority.ALWAYS);
    pathField.setMaxWidth(Double.MAX_VALUE);

    return box;
}

I think this box is not required, as the GUI is launched from within the application, which runs in the root directory of the project. So there is no need to specify that path.

@svntax (Author) commented Jun 26, 2025

I'll work on these changes soon.

svntax added 5 commits June 26, 2025 04:58
Since we run the GUI directly in the project root, the Llama3 path browsing is no longer needed. Models are now scanned at the start too.
The `llama-tornado` script has a new `--gui` flag for launching the GUI chatbox, and as a result, the `--model` argument is no longer required if `--gui` is present.
The main `LlamaApp` class and `Options` now check for the `--gui` flag to launch the JavaFX application.
ChatboxInteractor now uses LlamaApp's methods to directly load and run models instead of indirectly running from the `llama-tornado` Python script. To show responses, a new PrintStream is created to capture output from Model.
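The output capture mentioned in the last commit could look roughly like this (a hedged sketch, not the PR's actual wiring; assumes java.io imports, and the buffer-to-GUI handoff is simplified):

    // Redirect System.out so tokens printed by Model land in the GUI instead of the console.
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    PrintStream capture = new PrintStream(buffer, true);
    PrintStream original = System.out;
    System.setOut(capture);
    try {
        // run inference here; Model prints decoded tokens to System.out
    } finally {
        System.setOut(original);
    }
    javafx.application.Platform.runLater(() -> model.setOutputText(buffer.toString()));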
@svntax (Author) commented Jun 28, 2025

I'm a bit stuck on how to run the application directly from the GUI. I have it working for instruct mode, but interactive mode currently reads input from the command line in a loop (in Model's runInteractive() method). Would we need a new method or changes to Model for interactive mode to work with the GUI? Or is there some other better approach?

Two other things I'd like feedback on:

  • To run models from the GUI, I'm using LlamaApp's methods loadModel() and createSampler() from within ChatboxInteractor, so is it okay to make them public?
  • Using TornadoVM requires LlamaApp.USE_TORNADOVM to read from config flags, but if we want the GUI to let the user choose, how should this change when right now it's a constant?

@mikepapadim (Member) commented, replying to the questions above:

-Handling Interactive Mode with the GUI
Modifying runInteractive() to work with the GUI isn't ideal because its while loop is designed for a command-line interface (CLI).

A cleaner approach is to create a new method in the Model class specifically for the GUI. This method would take a single user input (as a string) and return the model's single response. This way, the GUI can manage the "loop" itself—calling this new method every time the user sends a message.
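For illustration, the shape such a method could take, reduced to a minimal hypothetical interface (the names here are placeholders, not the project's API):

    // Hypothetical single-turn abstraction: the GUI owns the loop and calls
    // step(...) once per user message; conversation state lives behind the interface.
    public interface ChatSession {
        /** Takes one user message and returns the model's reply for that turn. */
        String step(String userInput);
    }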

-Making loadModel() and createSampler() Public
Yes, making loadModel() and createSampler() in LlamaApp public is a perfectly reasonable solution.

The ChatboxInteractor is acting as a controller or intermediary between your GUI and the core application logic in LlamaApp. For the GUI to be able to trigger the model loading and sampler creation process, the methods that perform these actions must be accessible to it. Encapsulation is important, but in this case, you are intentionally exposing specific functionalities to the GUI layer, which is a standard and necessary practice.

-Managing the USE_TORNADOVM Flag
You're right; a static final constant won't work if you want the user to be able to change this setting from the GUI.

The best way to handle this is to change USE_TORNADOVM from a constant to a regular member variable within a configuration object or directly in LlamaApp.

Remove final: Change public static final boolean USE_TORNADOVM to something like private boolean useTornadoVM.

Add a Setter Method: Create a public method to change its value, for example, public void setUseTornadoVM(boolean value).

GUI Integration: Your GUI's checkbox or toggle can now call this setter method before it calls loadModel().

This way, the user's choice is set first, and then the model is loaded with the correct configuration. The setting is no longer a compile-time constant but a runtime configuration option.
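A minimal sketch of that change, assuming a small configuration object (the class and method names below are hypothetical):

    // Hypothetical runtime configuration replacing the compile-time constant.
    public class InferenceConfig {
        private boolean useTornadoVM;

        public void setUseTornadoVM(boolean value) {
            this.useTornadoVM = value;
        }

        public boolean useTornadoVM() {
            return useTornadoVM;
        }
    }

The GUI's engine dropdown would call setUseTornadoVM(...) before loadModel(), so the user's choice takes effect for that run.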

svntax added 4 commits July 4, 2025 06:42
Created a new method `runInteractiveStep()` in Model.java for the GUI to use when interactive mode is selected. It uses a simple data object Model.Response, which keeps track of a chat session's conversation tokens, state, and TornadoVM plan while running. Every time the GUI runs this method, a new Response object is created with updated state, and ChatboxInteractor saves this response as part of the ongoing chat.
Prevent the user from changing any of the GUI settings (like the model and engine) while an interactive session is running.
@svntax (Author) commented Jul 8, 2025

I finally have the GUI working directly with the main application, but there's a memory leak when using TornadoVM. I'm not sure if there's a problem with the approach I have, or if I'm not freeing up resources correctly, or if it's something else.

How to Run

Use the new --gui flag in llama-tornado to launch the GUI.

Windows example:

python llama-tornado --gui

How it works

I copied how LlamaApp runs by creating a Model and Sampler when starting a new chat. Instruct mode uses runInstructOnce(), and interactive mode uses the new Model method runInteractiveStep(), which returns a Response object (a record in Model) holding data for the ongoing conversation (state, conversation tokens, and a TornadoVMMasterPlan if applicable). These are reused every time the user sends a message in interactive mode, until "quit" or "exit" is sent, which ends the chat session and should free resources.
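As a rough sketch of that flow (llm, sampler, options, and endChatSession() are assumed fields and helpers, and the exact runInteractiveStep() signature is a guess, not the PR's actual code):

    // Sketch only: the interactor keeps the latest Response between turns.
    private Model.Response response; // carries state, conversation tokens, and the TornadoVM plan

    void onUserMessage(String text) {
        if ("quit".equals(text) || "exit".equals(text)) {
            endChatSession(); // should free state and the TornadoVM plan
            response = null;
            return;
        }
        response = llm.runInteractiveStep(sampler, options, text, response);
    }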

Memory leak problem

I've tested with Llama-3.2-1B-Instruct-Q8_0.gguf, and I can confirm that in both instruct and interactive mode, if using TornadoVM, the model does load correctly (about 3-4 GB to VRAM), but when I try to free resources with freeTornadoExecutionPlan(), it frees up around 500 MB only.

Since the inference is happening in a background thread, I thought maybe it has to happen in the main JavaFX thread, but that didn't help.

The problem happens only with TornadoVM. You have to close the GUI for the memory to finally be freed. Also, there's no memory leak as far as I can tell if running on CPU (selecting JVM for the engine).

@mikepapadim (Member) commented

@svntax thank you for contributing the GUI, let me try it and see if I can find the root cause of the memory leak.
