OCR integration #13313

Draft
wants to merge 45 commits into
base: main
a51e3b0
Initial implementation using tess4j
Kaan0029 Jun 12, 2025
aca504a
merge
Siedlerchr Jun 12, 2025
48ffb06
add brew to jna path
Siedlerchr Jun 12, 2025
5a256ae
Adapted Exception handling and configured tessdata variable
Kaan0029 Jun 19, 2025
f80cec8
addressed mentor code feedback after 19.06.25
Kaan0029 Jul 3, 2025
db1f577
Merge branch 'upstream-main' into gsoc-ocr-tess4j-initial-implementation
Kaan0029 Jul 10, 2025
4020d3a
fix java modules
Siedlerchr Jul 10, 2025
4987978
fix tessdata path
Siedlerchr Jul 10, 2025
9842dd4
fix tessdata path and module info for lept4j
Siedlerchr Jul 10, 2025
8b133e6
fix tessdata path and module info for lept4j
Siedlerchr Jul 10, 2025
42704ea
fix module access
Siedlerchr Jul 10, 2025
a64e1ea
fix module access
Siedlerchr Jul 10, 2025
6069bc1
fix module access
Siedlerchr Jul 10, 2025
5734252
Avoid throwing exception in configureTessdata when tessdata is missing
Kaan0029 Jul 10, 2025
ab98a3a
Use Path.of instead of Paths.get for modern Java style
Kaan0029 Jul 10, 2025
62dce25
Add OCR tessdata path setting to AI preferences tab and add abstracti…
Kaan0029 Jul 10, 2025
0949300
Merge remote-tracking branch 'origin/gsoc-ocr-tess4j-initial-implemen…
Kaan0029 Jul 10, 2025
e4e45f3
fix(ocr): restore deleted files
InAnYan Jul 10, 2025
15272a6
fix(ocr): restore deleted IconThemes
InAnYan Jul 10, 2025
14332af
Update gradle to 9.1.0-jabref
koppor Jul 11, 2025
5488deb
fix(submodules): fix submodules
InAnYan Jul 11, 2025
26a8500
Update gradle
koppor Jul 11, 2025
e0da447
Merge branch 'gsoc-ocr-tess4j-initial-implementation' of https://gith…
koppor Jul 11, 2025
f1a06ac
Update gradle
koppor Jul 11, 2025
42d34ca
Workaround for gradle bug
calixtus Jul 12, 2025
3ffa1c1
Merge remote-tracking branch 'upstream/main' into gsoc-ocr-tess4j-ini…
calixtus Jul 12, 2025
59d87a2
Workaround for gradle bug
calixtus Jul 12, 2025
bc36a94
Fix submodules
calixtus Jul 12, 2025
e7a043d
Added Debugging Testing file for jna
Kaan0029 Jul 13, 2025
caa48f1
Attempt to enforce specific jna version
Kaan0029 Jul 13, 2025
9b51b23
Merge remote-tracking branch 'origin/gsoc-ocr-tess4j-initial-implemen…
Kaan0029 Jul 13, 2025
c115803
Fix jna include
calixtus Jul 13, 2025
f16c1f4
Add jna-jpms
koppor Jul 13, 2025
8c51016
fix arm64 crash
Kaan0029 Jul 15, 2025
5843c69
fix tessdata path issue
Kaan0029 Jul 15, 2025
1bd2e5b
revert previous tessdata change
Kaan0029 Jul 15, 2025
c7fcc6b
fixed lifecycle
Kaan0029 Jul 16, 2025
a8bab25
fixed bug related to life cycle issue
Kaan0029 Jul 16, 2025
9b9c7b7
Deleted unnecessary comments
Kaan0029 Jul 16, 2025
edc0343
use StringUtil.isBlank
Kaan0029 Jul 16, 2025
695e2b4
Use @NonNull
Kaan0029 Jul 16, 2025
e5e651e
Added TODO comment
Kaan0029 Jul 16, 2025
9ffdbff
delete /tessdata from .gitignore
Kaan0029 Jul 16, 2025
facb463
delete TO-DO comments in FilePreferences
Kaan0029 Jul 16, 2025
265a312
Delete unnecessary comment in OcrProvider
Kaan0029 Jul 16, 2025
@@ -436,6 +436,18 @@ extraJavaModuleInfo {
requires("java.naming")
requires("java.sql")
}
module("net.sourceforge.tess4j:tess4j", "net.sourceforge.tess4j") {
exportAllPackages()
requireAllDefinedDependencies()
requires("java.desktop")
}
module("net.sourceforge.lept4j:lept4j", "net.sourceforge.lept4j") {
exportAllPackages()
requireAllDefinedDependencies()
requires("java.desktop")
requires("java.logging")
}

module("org.apache.pdfbox:pdfbox", "org.apache.pdfbox") {
exportAllPackages()
requireAllDefinedDependencies()
@@ -446,6 +458,32 @@ extraJavaModuleInfo {
requireAllDefinedDependencies()
requires("java.xml")
}

module("org.apache.pdfbox:pdfbox-tools", "org.apache.pdfbox.tools") {
requireAllDefinedDependencies()
exportAllPackages()
requires("java.desktop")
}
module("org.apache.pdfbox:pdfbox-debugger", "org.apache.pdfbox.debugger") {
requireAllDefinedDependencies()
exportAllPackages()
}
module("org.apache.pdfbox:jbig2-imageio", "org.apache.pdfbox.jbig2") {
exportAllPackages()
requireAllDefinedDependencies()
requires("java.desktop")
}
module("org.jboss:jboss-vfs", "org.jboss.vfs") {
preserveExisting()
}
module("org.jboss.logging:jboss-logging", "org.jboss.logging") {
preserveExisting()
}
module("com.github.jai-imageio:jai-imageio-core", "com.github.jaiimageio.core") {
exportAllPackages()
requireAllDefinedDependencies()
requires("java.desktop")
}
module("com.squareup.okio:okio-jvm", "okio") {
exportAllPackages()
requireAllDefinedDependencies()
303,672 changes: 303,672 additions & 0 deletions error_log.txt

Large diffs are not rendered by default.

3 changes: 1 addition & 2 deletions gradle/wrapper/gradle-wrapper.properties
@@ -1,7 +1,6 @@
distributionBase=GRADLE_USER_HOME
distributionPath=wrapper/dists
distributionSha256Sum=bd71102213493060956ec229d946beee57158dbd89d0e62b91bca0fa2c5f3531
-distributionUrl=https\://services.gradle.org/distributions/gradle-8.14.3-bin.zip
+distributionUrl=https\://files.jabref.org/gradle-9.1.0-jabref-14.zip
networkTimeout=10000
validateDistributionUrl=true
zipStoreBase=GRADLE_USER_HOME
2 changes: 1 addition & 1 deletion jabgui/build.gradle.kts
@@ -80,7 +80,7 @@ dependencies {

implementation ("org.apache.pdfbox:pdfbox")

// implementation("net.java.dev.jna:jna")
implementation("net.java.dev.jna:jna-jpms")
implementation("net.java.dev.jna:jna-platform")

implementation("org.eclipse.jgit:org.eclipse.jgit")
@@ -51,6 +51,7 @@
import org.jabref.logic.integrity.FieldCheckers;
import org.jabref.logic.journals.JournalAbbreviationRepository;
import org.jabref.logic.l10n.Localization;
import org.jabref.logic.ocr.OcrService;
import org.jabref.logic.util.TaskExecutor;
import org.jabref.model.database.BibDatabaseContext;
import org.jabref.model.entry.BibEntry;
@@ -80,6 +81,8 @@ public class LinkedFilesEditor extends HBox implements FieldEditorFX {
@Inject private JournalAbbreviationRepository abbreviationRepository;
@Inject private TaskExecutor taskExecutor;
@Inject private UndoManager undoManager;
@Inject private OcrService ocrService;


private LinkedFilesEditorViewModel viewModel;

@@ -325,7 +328,9 @@ private void handleItemMouseClick(LinkedFileViewModel linkedFile, MouseEvent eve
bibEntry,
viewModel,
contextCommandFactory,
-multiContextCommandFactory
+multiContextCommandFactory,
+taskExecutor,
+ocrService
);

ContextMenu contextMenu = contextMenuFactory.createForSelection(listView.getSelectionModel().getSelectedItems());
@@ -2,6 +2,7 @@

import javafx.collections.ObservableList;
import javafx.scene.control.ContextMenu;
import javafx.scene.control.MenuItem;
import javafx.scene.control.SeparatorMenuItem;

import org.jabref.gui.DialogService;
@@ -10,7 +11,11 @@
import org.jabref.gui.copyfiles.CopySingleFileAction;
import org.jabref.gui.fieldeditors.LinkedFileViewModel;
import org.jabref.gui.fieldeditors.LinkedFilesEditorViewModel;
import org.jabref.gui.linkedfile.OcrAction;
import org.jabref.gui.preferences.GuiPreferences;
import org.jabref.logic.l10n.Localization;
import org.jabref.logic.ocr.OcrService;
import org.jabref.logic.util.TaskExecutor;
import org.jabref.model.database.BibDatabaseContext;
import org.jabref.model.entry.BibEntry;

@@ -25,21 +30,27 @@ public class ContextMenuFactory {
private final LinkedFilesEditorViewModel viewModel;
private final SingleContextCommandFactory singleCommandFactory;
private final MultiContextCommandFactory multiCommandFactory;
private final TaskExecutor taskExecutor;
private final OcrService ocrService;

public ContextMenuFactory(DialogService dialogService,
GuiPreferences preferences,
BibDatabaseContext databaseContext,
ObservableOptionalValue<BibEntry> bibEntry,
LinkedFilesEditorViewModel viewModel,
SingleContextCommandFactory singleCommandFactory,
-MultiContextCommandFactory multiCommandFactory) {
+MultiContextCommandFactory multiCommandFactory,
+TaskExecutor taskExecutor,
+OcrService ocrService) {
this.dialogService = dialogService;
this.preferences = preferences;
this.databaseContext = databaseContext;
this.bibEntry = bibEntry;
this.viewModel = viewModel;
this.singleCommandFactory = singleCommandFactory;
this.multiCommandFactory = multiCommandFactory;
this.taskExecutor = taskExecutor;
this.ocrService = ocrService;
}

public ContextMenu createForSelection(ObservableList<LinkedFileViewModel> selectedFiles) {
@@ -86,9 +97,46 @@ private ContextMenu createContextMenuForFile(LinkedFileViewModel linkedFile) {
factory.createMenuItem(StandardActions.DELETE_FILE, singleCommandFactory.build(StandardActions.DELETE_FILE, linkedFile))
);

// Add OCR menu item for PDF files
if (linkedFile.getFile().getFileType().equalsIgnoreCase("pdf")) {
menu.getItems().add(new SeparatorMenuItem());

MenuItem ocrItem = createOcrMenuItem(linkedFile);
menu.getItems().add(ocrItem);
}

return menu;
}

/**
* Creates the OCR menu item for a PDF file.
* The menu item is only enabled if the PDF file exists on disk.
*
* @param linkedFile The linked PDF file
* @return MenuItem configured for OCR action
*/
private MenuItem createOcrMenuItem(LinkedFileViewModel linkedFile) {
MenuItem ocrItem = new MenuItem(Localization.lang("Extract text (OCR)"));

// Create the OCR action
OcrAction ocrAction = new OcrAction(
Comment on lines +121 to +122
Comment is trivial and does not add any new information beyond what is clearly visible in the code. It simply restates what the code is doing.

linkedFile.getFile(),
databaseContext,
dialogService,
preferences.getFilePreferences(),
taskExecutor,
ocrService
);

// Set the action to execute when clicked
ocrItem.setOnAction(event -> ocrAction.execute());
Comment on lines +131 to +132
Comment is redundant and merely describes what the code does without providing additional context or reasoning. The code is self-explanatory.


// Disable if the action is not executable (file doesn't exist)
ocrItem.disableProperty().bind(ocrAction.executableProperty().not());
Comment on lines +134 to +135
Comment restates what is evident from the code itself without providing additional insight or explanation about the underlying logic or design decision.


return ocrItem;
}

@FunctionalInterface
public interface SingleContextCommandFactory {
ContextAction build(StandardActions action, LinkedFileViewModel file);
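
The enablement rule described in the Javadoc of `createOcrMenuItem` (OCR is offered for PDF files and enabled only when the file exists on disk) can be sketched in isolation, without JavaFX or JabRef types. The class `OcrEligibility` and its method are hypothetical names used only for illustration; the `Optional<Path>` argument is assumed to stand in for the result of `linkedFile.findIn(databaseContext, filePreferences)` as used by `OcrAction`.

```java
import java.nio.file.Path;
import java.util.Optional;

// Hypothetical helper, not part of the PR: it mirrors the check that the
// OcrAction constructor performs before marking the action as executable.
final class OcrEligibility {
    private OcrEligibility() {
    }

    /**
     * @param fileType     the linked file's type, e.g. "PDF" (may be null)
     * @param resolvedPath the file's location on disk, empty if it was not found
     * @return true only for PDF files that could be located
     */
    static boolean isOcrExecutable(String fileType, Optional<Path> resolvedPath) {
        // Constant-first comparison is null-safe and ignores case.
        return "pdf".equalsIgnoreCase(fileType) && resolvedPath.isPresent();
    }
}
```

Comparing with the constant first is a small deviation from the diff, which calls `equalsIgnoreCase` on the file type itself; it only matters if the file type can be null, which this page does not show.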
115 changes: 115 additions & 0 deletions jabgui/src/main/java/org/jabref/gui/linkedfile/OcrAction.java
@@ -0,0 +1,115 @@
package org.jabref.gui.linkedfile;

import org.jabref.gui.DialogService;
import org.jabref.gui.StateManager;
import org.jabref.gui.actions.Action;
import org.jabref.gui.actions.ActionHelper;
import org.jabref.gui.actions.SimpleCommand;
import org.jabref.logic.util.BackgroundTask;
import org.jabref.logic.util.TaskExecutor;
import org.jabref.logic.l10n.Localization;
import org.jabref.logic.ocr.OcrService;
import org.jabref.logic.ocr.OcrResult;
import org.jabref.logic.ocr.OcrException;
import org.jabref.model.database.BibDatabaseContext;
import org.jabref.model.entry.LinkedFile;
import org.jabref.logic.FilePreferences;

import java.nio.file.Path;
import java.util.Optional;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
* Action for performing OCR (Optical Character Recognition) on linked PDF files.
*/
public class OcrAction extends SimpleCommand {
private static final Logger LOGGER = LoggerFactory.getLogger(OcrAction.class);

private final LinkedFile linkedFile;
private final BibDatabaseContext databaseContext;
private final DialogService dialogService;
private final FilePreferences filePreferences;
private final TaskExecutor taskExecutor;
private final OcrService ocrService;

public OcrAction(LinkedFile linkedFile,
BibDatabaseContext databaseContext,
DialogService dialogService,
FilePreferences filePreferences,
TaskExecutor taskExecutor,
OcrService ocrService) {
this.linkedFile = linkedFile;
this.databaseContext = databaseContext;
this.dialogService = dialogService;
this.filePreferences = filePreferences;
this.taskExecutor = taskExecutor;

// Only executable for existing PDF files
this.executable.set(
linkedFile.getFileType().equalsIgnoreCase("pdf") &&
linkedFile.findIn(databaseContext, filePreferences).isPresent()
);
this.ocrService = ocrService;
}

@Override
public void execute() {
Optional<Path> filePath = linkedFile.findIn(databaseContext, filePreferences);

if (filePath.isEmpty()) {
dialogService.showErrorDialogAndWait(
Localization.lang("File not found"),
Localization.lang("Could not locate the PDF file on disk.")
);
return;
}

dialogService.notify(Localization.lang("Performing OCR..."));

BackgroundTask<OcrResult> task = BackgroundTask.wrap(() -> {
Member

BTW, it would be great if you could move this background task into a separate class (just inherit from BackgroundTask<OcrResult>).

return ocrService.performOcr(filePath.get());
})
.showToUser(true) // Show in task list
.withInitialMessage(Localization.lang("Performing OCR on %0", linkedFile.getLink()));

task.onSuccess(result -> {
// Use pattern matching with the sealed class
switch (result) {
case OcrResult.Success success -> {
String extractedText = success.text();
if (extractedText.isEmpty()) {
dialogService.showInformationDialogAndWait(
Localization.lang("OCR Complete"),
Localization.lang("No text was found in the PDF.")
);
} else {
// Show preview
String preview = extractedText.length() > 1000
? extractedText.substring(0, 1000) + "..."
: extractedText;

dialogService.showInformationDialogAndWait(
Localization.lang("OCR Result"),
preview
);
}
}
case OcrResult.Failure failure -> {
dialogService.showErrorDialogAndWait(
Localization.lang("OCR failed"),
failure.errorMessage()
);
}
}
})
.onFailure(exception -> {
dialogService.showErrorDialogAndWait(
Localization.lang("OCR failed"),
exception.getMessage()
);
})
.executeWith(taskExecutor);
}
}
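
The review suggestion above, moving the OCR call out of the lambda and into its own task class, might look roughly like the following. This is a sketch under stated assumptions, not the PR's implementation: `SketchBackgroundTask`, `SketchOcrService` and `SketchOcrResult` are hypothetical stand-ins so that the example compiles on its own, and JabRef's real `BackgroundTask`, `OcrService` and `OcrResult` may differ. Only `performOcr(Path)`, `Success.text()` and `Failure.errorMessage()` are taken from the diff.

```java
import java.nio.file.Path;
import java.util.Objects;

// Hypothetical stand-in for JabRef's BackgroundTask<V>; only the method
// that a subclass has to implement is modelled here.
abstract class SketchBackgroundTask<V> {
    abstract V call() throws Exception;
}

// Hypothetical stand-in for OcrService; performOcr(Path) is the call used in the diff.
interface SketchOcrService {
    SketchOcrResult performOcr(Path pdf);
}

// Hypothetical shape of the sealed result type, matching the accessors
// text() and errorMessage() that OcrAction uses.
sealed interface SketchOcrResult {
    record Success(String text) implements SketchOcrResult {
    }

    record Failure(String errorMessage) implements SketchOcrResult {
    }
}

// The OCR call lives in a named task class instead of a lambda.
final class OcrBackgroundTask extends SketchBackgroundTask<SketchOcrResult> {
    private final SketchOcrService ocrService;
    private final Path pdf;

    OcrBackgroundTask(SketchOcrService ocrService, Path pdf) {
        this.ocrService = Objects.requireNonNull(ocrService);
        this.pdf = Objects.requireNonNull(pdf);
    }

    @Override
    SketchOcrResult call() {
        // Runs on whatever thread the executor chooses; no UI access here.
        return ocrService.performOcr(pdf);
    }
}
```

In `OcrAction.execute()`, a class like this would presumably take the place of the `BackgroundTask.wrap(...)` call, while the `onSuccess` and `onFailure` handlers could stay where they are. A named class also makes the task testable with a fake OCR service.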