Add GUI Chatbox for GPULlama3.java Inference #33
Conversation
Following the Model-View-Controller-Interactor framework, I added new classes for each component. The view layout follows the POC image from issue beehive-lab#24. The model reflects the properties the user can change in the GUI, which are set up by the controller. The interactor will hold the logic for triggering inference and updating the output displays.
The Run button runs 'llama-tornado' as a new process and passes command-line options to it by reading from the chatbox model object. All response and error logs are displayed in the main output text area.
Set minimum widths for all buttons and labels. Replaced SplitPane container node with HBox to avoid having a divider in the middle.
Set up AtlantaFX dependency, changed the GUI style to a dark theme (CupertinoDark), and set accents for buttons.
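For context, a minimal sketch of this launch-and-stream pattern (the `llama-tornado` path, options, and `ChatboxModel` accessors here are assumptions, not the PR's exact code):

```java
import javafx.application.Platform;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.List;

// Sketch: run llama-tornado as a child process and stream its logs into the GUI.
static void runInference(ChatboxModel model) throws Exception {
    ProcessBuilder builder = new ProcessBuilder(
            List.of("./llama-tornado", "--model", model.getModelPath()));
    builder.redirectErrorStream(true); // merge stderr into stdout so errors appear in the same log
    Process process = builder.start();
    StringBuilder output = new StringBuilder();
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(process.getInputStream()))) {
        String line;
        while ((line = reader.readLine()) != null) {
            output.append(line).append(System.lineSeparator());
            final String snapshot = output.toString();
            // UI updates must happen on the JavaFX application thread.
            Platform.runLater(() -> model.setOutputText(snapshot));
        }
    }
    process.waitFor();
}
```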
Pull Request Overview
This PR introduces a new JavaFX GUI for running inference with GPULlama3.java using an MVC-I approach. The changes include:
- A new package (com.example.gui) containing the GUI components (LlamaChatbox, ChatboxController, ChatboxViewBuilder, etc.).
- Enhancements to model and interactor classes to support process-based inference.
- Updates to pom.xml to add JavaFX and AtlantaFX dependencies.
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/main/java/com/example/gui/LlamaChatbox.java | Main entry point that sets the dark theme and launches the GUI. |
| src/main/java/com/example/gui/ChatboxViewBuilder.java | Constructs the chatbox UI components and binds UI controls to the model. |
| src/main/java/com/example/gui/ChatboxModel.java | Holds the data properties used by the chatbox UI. |
| src/main/java/com/example/gui/ChatboxInteractor.java | Executes the llama-tornado process and streams its output. |
| src/main/java/com/example/gui/ChatboxController.java | Orchestrates the UI initialization and inference task execution. |
| pom.xml | Updates dependencies and configures the JavaFX Maven plugin. |
```java
outputArea.setEditable(false);
outputArea.setWrapText(true);
VBox.setVgrow(outputArea, Priority.ALWAYS);
model.outputTextProperty().subscribe((newValue) -> {
```
The use of 'subscribe' on a JavaFX StringProperty is non-standard. Please replace it with the standard addListener method to ensure proper property change notifications.
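For reference, the equivalent registration with the long-standing listener API would look like this (the listener body is an assumption, since the excerpt above is truncated; note that `ObservableValue.subscribe` does exist from JavaFX 21 onward, so the swap mainly matters for older JavaFX versions):

```java
// Standard property-change listener; fires with both the old and new values.
model.outputTextProperty().addListener((observable, oldValue, newValue) -> {
    outputArea.setText(newValue); // assumed body: mirror the model text into the output area
});
```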
```java
while ((line = bufferedReader.readLine()) != null) {
    builder.append(line);
    builder.append(System.getProperty("line.separator"));
    final String currentOutput = builder.toString();
    javafx.application.Platform.runLater(() -> model.setOutputText(currentOutput));
}
```
Updating the UI on every line read from the process may cause performance issues for high-volume output. Consider batching updates to reduce the overhead on the JavaFX application thread.
```diff
- while ((line = bufferedReader.readLine()) != null) {
-     builder.append(line);
-     builder.append(System.getProperty("line.separator"));
-     final String currentOutput = builder.toString();
-     javafx.application.Platform.runLater(() -> model.setOutputText(currentOutput));
- }
+ int lineCounter = 0; // Counter to track the number of lines read.
+ while ((line = bufferedReader.readLine()) != null) {
+     builder.append(line);
+     builder.append(System.getProperty("line.separator"));
+     lineCounter++;
+     // Update the UI every 100 lines or when the process finishes.
+     if (lineCounter >= 100) {
+         final String currentOutput = builder.toString();
+         javafx.application.Platform.runLater(() -> model.setOutputText(currentOutput));
+         lineCounter = 0; // Reset the counter after updating the UI.
+     }
+ }
+ // Ensure any remaining output is flushed to the UI.
+ final String remainingOutput = builder.toString();
+ javafx.application.Platform.runLater(() -> model.setOutputText(remainingOutput));
```
Thank you @svntax, it worked just fine on my setup!
This is great! Thank you for the contribution to #24

I have some suggestions to make it a bit more robust:

- Let's avoid the indirection of calling the Python script from the Java process. Let's extend the main to have something like this and make the GUI part of the actual application:

```java
// LlamaApp.java
public static void main(String[] args) throws IOException {
    Options options = Options.parseOptions(args);
    if (options.guiMode()) { // Add a new guiMode() option
        // Launch the JavaFX application
        Application.launch(LlamaGui.class, args);
    } else {
        // Run the existing CLI logic
        Model model = loadModel(options);
        Sampler sampler = createSampler(model, options);
        if (options.interactive()) {
            model.runInteractive(sampler, options);
        } else {
            model.runInstructOnce(sampler, options);
        }
    }
}
```

- Then we can modify the `llama-tornado` script to add a `--gui` flag and launch the GUI from there.
- We need to add to the GUI a check box or drop-down menu for the two options: either run instruct mode or interactive (see the sketch after this list).
- The output text can then be obtained directly from `System.out.println(responseText);`, and in the interactive case from `System.out.print(tokenizer().decode(List.of(token)));`.

If these changes make sense to you, feel free to extend the PR.

Thanks
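A rough sketch of the instruct/interactive selector suggested above (the control wiring and the model setter name are illustrative, not an agreed API):

```java
// Drop-down offering the two run modes; setInteractiveMode is an assumed setter.
ComboBox<String> modeSelector = new ComboBox<>();
modeSelector.getItems().addAll("Instruct", "Interactive");
modeSelector.getSelectionModel().select("Instruct");
modeSelector.valueProperty().addListener((obs, oldMode, newMode) ->
        model.setInteractiveMode("Interactive".equals(newMode)));
```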
```java
commands.add(String.format("%s\\external\\tornadovm\\.venv\\Scripts\\python", llama3Path));
commands.add("llama-tornado");
} else {
    commands.add(String.format("%s/.llama-tornado", llama3Path));
```
```diff
- commands.add(String.format("%s/.llama-tornado", llama3Path));
+ commands.add(String.format("%s/llama-tornado", llama3Path));
```

`.llama-tornado` does not work on Linux.
```java
private Node createLlama3PathBox() {
    Button browseButton = new Button("Browse");
    browseButton.getStyleClass().add(Styles.ACCENT);
    browseButton.setMinWidth(80);
    browseButton.disableProperty().bind(inferenceRunning);
    browseButton.setOnAction(e -> {
        DirectoryChooser dirChooser = new DirectoryChooser();
        dirChooser.setTitle("Select GPULlama3.java Directory");
        File selectedDir = dirChooser.showDialog(browseButton.getScene().getWindow());
        if (selectedDir != null) {
            model.setLlama3Path(selectedDir.getAbsolutePath());
        }
    });

    TextField pathField = boundTextField(model.llama3PathProperty());
    HBox box = new HBox(8, createLabel("Llama3 Path:"), pathField, browseButton);
    box.setAlignment(Pos.CENTER_LEFT);
    HBox.setHgrow(pathField, Priority.ALWAYS);
    pathField.setMaxWidth(Double.MAX_VALUE);

    return box;
}
```
I think this box is not required, since the GUI is launched from within the application, which runs in the root directory of the project. So there is no need to specify that path.
I'll work on these changes soon.
Since we run the GUI directly in the project root, the Llama3 path browsing is no longer needed. Models are now scanned at the start too.
The `llama-tornado` script has a new `--gui` flag for launching the GUI chatbox, and as a result, the `--model` argument is no longer required if `--gui` is present. The main `LlamaApp` class and `Options` now check for the `--gui` flag to launch the JavaFX application.
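As a rough illustration of that check (the actual parsing logic in Options.java may be structured differently):

```java
// Hypothetical helper: detect --gui up front so --model can be treated as optional.
static boolean hasGuiFlag(String[] args) {
    for (String arg : args) {
        if ("--gui".equals(arg)) {
            return true;
        }
    }
    return false;
}
```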
ChatboxInteractor now uses LlamaApp's methods to directly load and run models instead of indirectly running from the `llama-tornado` Python script. To show responses, a new PrintStream is created to capture output from Model.
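A minimal sketch of that capture, assuming the model writes its response to System.out (the restore and hand-off details here are illustrative, not the PR's exact code):

```java
import javafx.application.Platform;

import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;

// Redirect System.out while inference runs, then push the captured text to the GUI.
ByteArrayOutputStream captured = new ByteArrayOutputStream();
PrintStream originalOut = System.out;
System.setOut(new PrintStream(captured, true, StandardCharsets.UTF_8));
try {
    // run the model here; it prints tokens/responses to System.out
} finally {
    System.setOut(originalOut); // always restore the real stdout
}
final String response = captured.toString(StandardCharsets.UTF_8);
Platform.runLater(() -> model.setOutputText(response));
```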
I'm a bit stuck on how to run the application directly from the GUI. I have it working for instruct mode, but interactive mode currently reads input from the command line in a loop (in …).

Two other things I'd like feedback on:
- Handling interactive mode with the GUI: A cleaner approach is to create a new method in the Model class specifically for the GUI. This method would take a single user input (as a string) and return the model's single response. This way, the GUI can manage the "loop" itself, calling this new method every time the user sends a message.
- Making loadModel() and createSampler() public: The ChatboxInteractor is acting as a controller or intermediary between your GUI and the core application logic in LlamaApp. For the GUI to be able to trigger the model loading and sampler creation process, the methods that perform these actions must be accessible to it. Encapsulation is important, but in this case you are intentionally exposing specific functionality to the GUI layer, which is a standard and necessary practice.
- Managing the USE_TORNADOVM flag: The best way to handle this is to change USE_TORNADOVM from a constant to a regular member variable within a configuration object or directly in LlamaApp. Remove final: change `public static final boolean USE_TORNADOVM` to something like `private boolean useTornadoVM`. Add a setter method: create a public method to change its value, for example `public void setUseTornadoVM(boolean value)`. GUI integration: your GUI's checkbox or toggle can now call this setter method before it calls loadModel(). This way, the user's choice is set first, and then the model is loaded with the correct configuration. The setting is no longer a compile-time constant but a runtime configuration option.
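In code, that last change might look roughly like this (placement inside LlamaApp versus a config object is left open, as above):

```java
// Before: a compile-time constant.
// public static final boolean USE_TORNADOVM = true;

// After: a runtime setting the GUI can flip before loadModel() is called.
private boolean useTornadoVM = true;

public void setUseTornadoVM(boolean value) {
    this.useTornadoVM = value;
}
```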
Created a new method `runInteractiveStep()` in Model.java for the GUI to use when interactive mode is selected. It uses a simple data object Model.Response, which keeps track of a chat session's conversation tokens, state, and TornadoVM plan while running. Every time the GUI runs this method, a new Response object is created with updated state, and ChatboxInteractor saves this response as part of the ongoing chat.
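For illustration only, the data carrier described here might be shaped like this (field names and types are guesses from the description, not the PR's actual code):

```java
import java.util.List;

// Hypothetical sketch of Model.Response: one immutable snapshot per chat turn.
public record Response(
        List<Integer> conversationTokens, // accumulated token ids for the whole chat
        String responseText,              // decoded text of the latest reply
        Object state,                     // model state carried between steps
        Object tornadoVMPlan) {           // TornadoVM plan reused across steps
}
```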
Prevent the user from changing any of the GUI settings (like the model and engine) while an interactive session is running.
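One way to do that is to bind each control's disable property to a shared flag, mirroring how the Browse button is handled earlier in this PR (the control names here are illustrative):

```java
// Lock the settings controls while an interactive session is running.
modelSelector.disableProperty().bind(inferenceRunning);
engineSelector.disableProperty().bind(inferenceRunning);
```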
I finally have the GUI working directly with the main application, but there's a memory leak when using TornadoVM. I'm not sure if there's a problem with my approach, if I'm not freeing up resources correctly, or if it's something else.

How to Run

Use the new Windows example:

How it works

I copied how LlamaApp runs by creating a Model and Sampler when starting a new chat. Instruct mode uses …

Memory leak problem

I've tested with Llama-3.2-1B-Instruct-Q8_0.gguf, and I can confirm that in both instruct and interactive mode, if using TornadoVM, the model does load correctly (about 3-4 GB to VRAM), but when I try to free resources with …

Since the inference is happening in a background thread, I thought maybe it has to happen in the main JavaFX thread, but that didn't help. The problem happens only with TornadoVM. You have to close the GUI for the memory to finally be freed. Also, there's no memory leak as far as I can tell if running on CPU (selecting JVM for the engine).
@svntax thank you for contributing the GUI, let me try it and see if I can find the root cause of the memory leak.
This PR adds a new JavaFX GUI for running inference with GPULlama3 (for issue #24). It adds a new package `com.example.gui` containing all the new classes for the chatbox GUI, following a Model-View-Controller-Interactor framework.

Key Features

- Browse for the user's `GPULlama3.java` install.
- Reads models from the `./models` folder in the user's `GPULlama3.java` directory.
- Runs `llama-tornado` as a new process.

How to Run

After following the "Install, Build, and Run" instructions from the README, run the following:

```
mvn javafx:run
```

Notes

Next Steps

… `htop`, `nvtop`, or any of the Linux-specific options, as far as I know.