
Add GUI Chatbox for GPULlama3.java Inference #33


Open · wants to merge 20 commits into main

Conversation

@svntax commented Jun 25, 2025

This PR adds a new JavaFX GUI for running inference with GPULlama3 (for issue #24).

It adds a new package com.example.gui containing all the new classes for the chatbox GUI, following a Model-View-Controller-Interactor framework.

Key Features

  • Dropdown menu to select an engine (TornadoVM, JVM).
  • Browse button to select the directory of the user's GPULlama3.java install.
  • Dropdown menu and Reload button to search for models inside a /models folder in the user's GPULlama3.java directory.
  • Prompt text field for user input.
  • Run button to trigger inference by running llama-tornado as a new process.
  • Output area to display responses and other logs.

How to Run

After following the "Install, Build, and Run" instructions from the README, run the following:

mvn javafx:run

Notes

  • Dark theme styling is from AtlantaFX.
  • The right panel of the GUI is unfinished: it contains the System Monitoring panel with checkboxes (currently non-functional) and an empty text area where the monitoring terminals would be displayed.
  • I'm using Windows, so Linux / macOS is untested.

Next Steps

  • I can try to add the system monitoring features, although I'm not sure how far I'll get: I'm on Windows, so as far as I know I can't test htop, nvtop, or any of the Linux-specific options.
  • Is the system monitoring display supposed to be embedded terminals? I did a bit of searching and found projects like TerminalFX and JediTermFX for this, but I don't know if that's the best way to implement this.

svntax added 11 commits June 21, 2025 05:16
Following the Model-View-Controller-Interactor framework, added new classes for each component. The view layout follows the POC image from issue beehive-lab#24. The model reflects the properties the user can change in the GUI, which are set up by the controller. The interactor will have the logic for triggering inference and updating the output displays.
The Run button runs 'llama-tornado' as a new process and passes command-line options to it by reading from the chatbox model object. All response and error logs are displayed in the main output text area.
Set minimum widths for all buttons and labels. Replaced SplitPane container node with HBox to avoid having a divider in the middle.
Set up AtlantaFX dependency, changed the GUI style to a dark theme (CupertinoDark), and set accents for buttons.
@CLAassistant commented Jun 25, 2025

CLA assistant check
All committers have signed the CLA.

@mikepapadim requested review from Copilot and mikepapadim, and removed the request for Copilot, June 25, 2025 10:25
@Copilot (Copilot AI, Contributor) left a comment

Pull Request Overview

This PR introduces a new JavaFX GUI for running inference with GPULlama3.java using an MVC-I approach. The changes include:

  • A new package (com.example.gui) containing the GUI components (LlamaChatbox, ChatboxController, ChatboxViewBuilder, etc.).
  • Enhancements to model and interactor classes to support process-based inference.
  • Updates to pom.xml to add JavaFX and AtlantaFX dependencies.

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Summary per file:

  • src/main/java/com/example/gui/LlamaChatbox.java: Main entry point that sets the dark theme and launches the GUI.
  • src/main/java/com/example/gui/ChatboxViewBuilder.java: Constructs the chatbox UI components and binds UI controls to the model.
  • src/main/java/com/example/gui/ChatboxModel.java: Holds the data properties used by the chatbox UI.
  • src/main/java/com/example/gui/ChatboxInteractor.java: Executes the llama-tornado process and streams its output.
  • src/main/java/com/example/gui/ChatboxController.java: Orchestrates the UI initialization and inference task execution.
  • pom.xml: Updates dependencies and configures the JavaFX Maven plugin.

outputArea.setEditable(false);
outputArea.setWrapText(true);
VBox.setVgrow(outputArea, Priority.ALWAYS);
model.outputTextProperty().subscribe((newValue) -> {
Copilot AI commented Jun 25, 2025

The use of 'subscribe' on a JavaFX StringProperty is non-standard. Please replace it with the standard addListener method to ensure proper property change notifications.
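For reference, a minimal sketch of the listener-based form, assuming the subscribe body updates the bound output area (note that ObservableValue.subscribe(Consumer) does exist from JavaFX 21 onward, so this mainly matters when targeting older JavaFX releases):

    // Standard listener form (sketch; assumes the original subscribe body
    // sets the output area's text from the model property).
    model.outputTextProperty().addListener((observable, oldValue, newValue) -> {
        outputArea.setText(newValue);
    });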


Comment on lines 70 to 75
while ((line = bufferedReader.readLine()) != null) {
    builder.append(line);
    builder.append(System.getProperty("line.separator"));
    final String currentOutput = builder.toString();
    javafx.application.Platform.runLater(() -> model.setOutputText(currentOutput));
}
Copilot AI commented Jun 25, 2025

Updating the UI on every line read from the process may cause performance issues for high-volume output. Consider batching updates to reduce the overhead on the JavaFX application thread.

Suggested change:

    int lineCounter = 0; // Counter to track the number of lines read.
    while ((line = bufferedReader.readLine()) != null) {
        builder.append(line);
        builder.append(System.getProperty("line.separator"));
        lineCounter++;
        // Update the UI every 100 lines; any remainder is flushed after the loop.
        if (lineCounter >= 100) {
            final String currentOutput = builder.toString();
            javafx.application.Platform.runLater(() -> model.setOutputText(currentOutput));
            lineCounter = 0; // Reset the counter after updating the UI.
        }
    }
    // Ensure any remaining output is flushed to the UI.
    final String remainingOutput = builder.toString();
    javafx.application.Platform.runLater(() -> model.setOutputText(remainingOutput));


@mikepapadim (Member) commented

Thank you @svntax, it worked just fine on my setup!

@mikepapadim (Member) left a comment

This is great! Thank you for the contribution to #24.

I have some suggestions to make it a bit more robust:

  1. Let's avoid the indirection of calling the Python script from the Java process. Let's extend main to something like the following, and make the GUI part of the actual application:
    // LlamaApp.java
    public static void main(String[] args) throws IOException {
        Options options = Options.parseOptions(args);

        if (options.guiMode()) { // Add a new guiMode() option
            // Launch the JavaFX application
            Application.launch(LlamaGui.class, args);
        } else {
            // Run the existing CLI logic
            Model model = loadModel(options);
            Sampler sampler = createSampler(model, options);
            if (options.interactive()) {
                model.runInteractive(sampler, options);
            } else {
                model.runInstructOnce(sampler, options);
            }
        }
    }
  2. Then we can modify the llama-tornado script to add a --gui flag and launch the GUI from there.
  3. We need to add to the GUI a checkbox or drop-down menu for the two modes: instruct or interactive (see the sketch of these flags after this list).
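A hypothetical sketch of how Options might expose these flags; the record and parsing below are simplified placeholders, not the project's actual parser:

    // Hypothetical, simplified Options; the real class parses many more flags.
    public record Options(boolean guiMode, boolean interactive) {
        public static Options parseOptions(String[] args) {
            boolean gui = false;
            boolean interactive = false;
            for (String arg : args) {
                if ("--gui".equals(arg)) {
                    gui = true;          // launch the JavaFX chatbox instead of the CLI
                } else if ("--interactive".equals(arg)) {
                    interactive = true;  // chat loop rather than a single instruct run
                }
            }
            return new Options(gui, interactive);
        }
    }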

The output text can then be obtained directly from:

  1. System.out.println(responseText);
  2. System.out.print(tokenizer().decode(List.of(token)));
  3. in the interactive case: System.out.print(tokenizer().decode(List.of(token)));

If these changes make sense to you, feel free to extend the PR.

Thanks

    commands.add(String.format("%s\\external\\tornadovm\\.venv\\Scripts\\python", llama3Path));
    commands.add("llama-tornado");
} else {
    commands.add(String.format("%s/.llama-tornado", llama3Path));

Suggested change:
- commands.add(String.format("%s/.llama-tornado", llama3Path));
+ commands.add(String.format("%s/llama-tornado", llama3Path));

.llama-tornado does not work on Linux

Comment on lines 80 to 101
private Node createLlama3PathBox() {
    Button browseButton = new Button("Browse");
    browseButton.getStyleClass().add(Styles.ACCENT);
    browseButton.setMinWidth(80);
    browseButton.disableProperty().bind(inferenceRunning);
    browseButton.setOnAction(e -> {
        DirectoryChooser dirChooser = new DirectoryChooser();
        dirChooser.setTitle("Select GPULlama3.java Directory");
        File selectedDir = dirChooser.showDialog(browseButton.getScene().getWindow());
        if (selectedDir != null) {
            model.setLlama3Path(selectedDir.getAbsolutePath());
        }
    });

    TextField pathField = boundTextField(model.llama3PathProperty());
    HBox box = new HBox(8, createLabel("Llama3 Path:"), pathField, browseButton);
    box.setAlignment(Pos.CENTER_LEFT);
    HBox.setHgrow(pathField, Priority.ALWAYS);
    pathField.setMaxWidth(Double.MAX_VALUE);

    return box;
}

I think this box is not required, as the GUI is launched from within the application, which runs in the root directory of the project. So there is no need to specify that path.

@svntax (Author) commented Jun 26, 2025

I'll work on these changes soon.

svntax added 5 commits June 26, 2025 04:58
Since we run the GUI directly in the project root, the Llama3 path browsing is no longer needed. Models are now scanned at the start too.
The `llama-tornado` script has a new `--gui` flag for launching the GUI chatbox, and as a result, the `--model` argument is no longer required if `--gui` is present.
The main `LlamaApp` class and `Options` now check for the `--gui` flag to launch the JavaFX application.
ChatboxInteractor now uses LlamaApp's methods to directly load and run models instead of indirectly running from the `llama-tornado` Python script. To show responses, a new PrintStream is created to capture output from Model.
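The output capture mentioned in the last commit could look roughly like this (a hedged sketch, not the PR's actual wiring; assumes java.io imports, and the buffer-to-GUI handoff is simplified):

    // Redirect System.out so tokens printed by Model land in the GUI instead of the console.
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    PrintStream capture = new PrintStream(buffer, true);
    PrintStream original = System.out;
    System.setOut(capture);
    try {
        // run inference here; Model prints decoded tokens to System.out
    } finally {
        System.setOut(original);
    }
    javafx.application.Platform.runLater(() -> model.setOutputText(buffer.toString()));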
@svntax (Author) commented Jun 28, 2025

I'm a bit stuck on how to run the application directly from the GUI. I have it working for instruct mode, but interactive mode currently reads input from the command line in a loop (in Model's runInteractive() method). Would we need a new method or changes to Model for interactive mode to work with the GUI? Or is there some other better approach?

Two other things I'd like feedback on:

  • To run models from the GUI, I'm using LlamaApp's methods loadModel() and createSampler() from within ChatboxInteractor, so is it okay to make them public?
  • Using TornadoVM requires LlamaApp.USE_TORNADOVM to read from config flags, but if we want the GUI to let the user choose, how should this change when right now it's a constant?

@mikepapadim (Member) commented, replying to the questions above:

-Handling Interactive Mode with the GUI
Modifying runInteractive() to work with the GUI isn't ideal because its while loop is designed for a command-line interface (CLI).

A cleaner approach is to create a new method in the Model class specifically for the GUI. This method would take a single user input (as a string) and return the model's single response. This way, the GUI can manage the "loop" itself—calling this new method every time the user sends a message.
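For illustration, the shape such a method could take, reduced to a minimal hypothetical interface (the names here are placeholders, not the project's API):

    // Hypothetical single-turn abstraction: the GUI owns the loop and calls
    // step(...) once per user message; conversation state lives behind the interface.
    public interface ChatSession {
        /** Takes one user message and returns the model's reply for that turn. */
        String step(String userInput);
    }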

-Making loadModel() and createSampler() Public
Yes, making loadModel() and createSampler() in LlamaApp public is a perfectly reasonable solution.

The ChatboxInteractor is acting as a controller or intermediary between your GUI and the core application logic in LlamaApp. For the GUI to be able to trigger the model loading and sampler creation process, the methods that perform these actions must be accessible to it. Encapsulation is important, but in this case, you are intentionally exposing specific functionalities to the GUI layer, which is a standard and necessary practice.

-Managing the USE_TORNADOVM Flag
You're right; a static final constant won't work if you want the user to be able to change this setting from the GUI.

The best way to handle this is to change USE_TORNADOVM from a constant to a regular member variable within a configuration object or directly in LlamaApp.

Remove final: Change public static final boolean USE_TORNADOVM to something like private boolean useTornadoVM.

Add a Setter Method: Create a public method to change its value, for example, public void setUseTornadoVM(boolean value).

GUI Integration: Your GUI's checkbox or toggle can now call this setter method before it calls loadModel().

This way, the user's choice is set first, and then the model is loaded with the correct configuration. The setting is no longer a compile-time constant but a runtime configuration option.
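A minimal sketch of that change, assuming a small configuration object (the class and method names below are hypothetical):

    // Hypothetical runtime configuration replacing the compile-time constant.
    public class InferenceConfig {
        private boolean useTornadoVM;

        public void setUseTornadoVM(boolean value) {
            this.useTornadoVM = value;
        }

        public boolean useTornadoVM() {
            return useTornadoVM;
        }
    }

The GUI's engine dropdown would call setUseTornadoVM(...) before loadModel(), so the user's choice takes effect for that run.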

svntax added 4 commits July 4, 2025 06:42
Created a new method `runInteractiveStep()` in Model.java for the GUI to use when interactive mode is selected. It uses a simple data object Model.Response, which keeps track of a chat session's conversation tokens, state, and TornadoVM plan while running. Every time the GUI runs this method, a new Response object is created with updated state, and ChatboxInteractor saves this response as part of the ongoing chat.
Prevent the user from changing any of the GUI settings (like the model and engine) while an interactive session is running.
@svntax (Author) commented Jul 8, 2025

I finally have the GUI working directly with the main application, but there's a memory leak when using TornadoVM. I'm not sure if there's a problem with the approach I have, or if I'm not freeing up resources correctly, or if it's something else.

How to Run

Use the new --gui flag in llama-tornado to launch the GUI.

Windows example:

python llama-tornado --gui

How it works

I copied how LlamaApp runs by creating a Model and Sampler when starting a new chat. Instruct mode uses runInstructOnce(), and interactive mode uses the new Model method runInteractiveStep(), which returns a Response object (a record in Model) holding data for the ongoing conversation (state, conversation tokens, and a TornadoVMMasterPlan if applicable). These are reused every time the user sends a message in interactive mode, until "quit" or "exit" is sent, which ends the chat session and should free resources.
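As a rough sketch of that flow (llm, sampler, options, and endChatSession() are assumed fields and helpers, and the exact runInteractiveStep() signature is a guess, not the PR's actual code):

    // Sketch only: the interactor keeps the latest Response between turns.
    private Model.Response response; // carries state, conversation tokens, and the TornadoVM plan

    void onUserMessage(String text) {
        if ("quit".equals(text) || "exit".equals(text)) {
            endChatSession(); // should free state and the TornadoVM plan
            response = null;
            return;
        }
        response = llm.runInteractiveStep(sampler, options, text, response);
    }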

Memory leak problem

I've tested with Llama-3.2-1B-Instruct-Q8_0.gguf, and I can confirm that in both instruct and interactive mode, if using TornadoVM, the model does load correctly (about 3-4 GB to VRAM), but when I try to free resources with freeTornadoExecutionPlan(), it frees up around 500 MB only.

Since the inference is happening in a background thread, I thought maybe it has to happen in the main JavaFX thread, but that didn't help.

The problem happens only with TornadoVM. You have to close the GUI for the memory to finally be freed. Also, there's no memory leak as far as I can tell if running on CPU (selecting JVM for the engine).

@mikepapadim (Member) commented

@svntax thank you for contributing the GUI, let me try it and see if I can find the root cause of the memory leak.
