implemented ocr tool using tesseract #1239

Closed
wants to merge 4 commits into from
Changes from 2 commits
2 changes: 2 additions & 0 deletions .github/workflows/Linux-pack.yml
@@ -144,6 +144,8 @@ jobs:
libqt5widgets5 \
libqt5gui5 \
libqt5svg5-dev \
tesseract-ocr \
libtesseract-dev \
python3 \
python3-pip
- name: Prepare cmake(>=3.13.0)
5 changes: 4 additions & 1 deletion .github/workflows/build_cmake.yml
@@ -50,7 +50,10 @@ jobs:
libqt5core5a \
libqt5widgets5 \
libqt5gui5 \
libqt5svg5-dev
libqt5svg5-dev \
tesseract-ocr \
libtesseract-dev


- name: Create Build Environment
# Some projects don't allow in-source building, so create a separate build directory
2 changes: 2 additions & 0 deletions data/graphics.qrc
@@ -37,6 +37,7 @@
<file>img/material/black/circle-outline.svg</file>
<file>img/material/black/pixelate.svg</file>
<file>img/material/black/arrow-bottom-left.svg</file>
<file>img/material/black/ocr.svg</file>
<file>img/material/white/undo-variant.svg</file>
<file>img/material/white/text.svg</file>
<file>img/material/white/square.svg</file>
@@ -74,5 +75,6 @@
<file>img/material/white/shortcut.svg</file>
<file>img/material/black/filepath.svg</file>
<file>img/material/white/filepath.svg</file>
<file>img/material/white/ocr.svg</file>
</qresource>
</RCC>
Binary file added data/img/material/black/ocr.png
1 change: 1 addition & 0 deletions data/img/material/black/ocr.svg
Binary file added data/img/material/white/ocr.png
1 change: 1 addition & 0 deletions data/img/material/white/ocr.svg
2 changes: 2 additions & 0 deletions src/CMakeLists.txt
@@ -159,6 +159,8 @@ target_link_libraries(
Qt5::Widgets
SingleApplication::SingleApplication
spdlog::spdlog
tesseract
leptonica
)

if (APPLE)
2 changes: 2 additions & 0 deletions src/tools/CMakeLists.txt
@@ -35,6 +35,8 @@ target_sources(
target_sources(flameshot PRIVATE rectangle/rectangletool.h rectangle/rectangletool.cpp)
target_sources(flameshot PRIVATE redo/redotool.h redo/redotool.cpp)
target_sources(flameshot PRIVATE save/savetool.h save/savetool.cpp)
target_sources(flameshot PRIVATE ocr/ocrtool.h ocr/ocrtool.cpp)
target_sources(flameshot PRIVATE ocr/tesseract_tool.h)
target_sources(flameshot PRIVATE selection/selectiontool.h selection/selectiontool.cpp)
target_sources(flameshot PRIVATE sizeindicator/sizeindicatortool.h sizeindicator/sizeindicatortool.cpp)
target_sources(
3 changes: 2 additions & 1 deletion src/tools/capturetool.h
@@ -45,7 +45,8 @@ enum class ToolType
SIZEINDICATOR,
TEXT,
UNDO,
UPLOAD
UPLOAD,
OCR
};

class CaptureTool : public QObject
84 changes: 84 additions & 0 deletions src/tools/ocr/ocrtool.cpp
@@ -0,0 +1,84 @@
// Copyright(c) 2017-2019 Alejandro Sirgo Rica & Contributors
//
// This file is part of Flameshot.
//
// Flameshot is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// Flameshot is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU General Public License
// along with Flameshot. If not, see <http://www.gnu.org/licenses/>.

#include "ocrtool.h"
#include "tesseract_tool.h"
#include "src/utils/screenshotsaver.h"
#include <QPainter>
#include <tesseract/basetesseractApi.h>

Where does this header come from? Per the tesseract docs it should be baseapi.h: https://github.com/tesseract-ocr/tesseract/blob/master/include/tesseract/baseapi.h

I had an issue building this on Arch, which led me down this trail.

@smartlitchi, Feb 24, 2021


Does it work if you replace the header with baseapi.h?

Yes, this works.

#include <leptonica/allheaders.h>

OcrTool::OcrTool(QObject* parent)
  : AbstractActionTool(parent)
{}

bool OcrTool::closeOnButtonPressed() const
{
    return true;
}

QIcon OcrTool::icon(const QColor& background, bool inEditor) const
{
    Q_UNUSED(inEditor);
    return QIcon(iconPath(background) + "ocr.svg");
}

QString OcrTool::name() const
{
    return "OCR";
}

ToolType OcrTool::nameID() const
{
    return ToolType::OCR;
}

QString OcrTool::description() const
{
    return "Copy text in Capture to Clipboard";
}

CaptureTool* OcrTool::copy(QObject* parent)
{
    return new OcrTool(parent);
}

void OcrTool::pressed(const CaptureContext& context)
{
    char *outText;

    tesseract::TessBaseAPI *tesseractApi = new tesseract::TessBaseAPI();

    // TODO: tesseract language configs?
    if (tesseractApi->Init(NULL, "eng")) {
        // TODO: error system notification?
        return;
    }

    Pix *image = TesseractTool::qImage2PIX(context.selectedScreenshotArea().toImage());

Building this on Manjaro fails with the following error:

./flameshot/src/tools/ocr/ocrtool.cpp:71:33: error: ‘qImage2PIX’ is not a member of ‘TesseractTool’
71 |     Pix *image = TesseractTool::qImage2PIX(context.selectedScreenshotArea().toImage());
|                                 ^~~~~~~~~~
make[2]: *** [src/CMakeFiles/flameshot.dir/build.make:1222: src/CMakeFiles/flameshot.dir/tools/ocr/ocrtool.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:404: src/CMakeFiles/flameshot.dir/all] Error 2
make: *** [Makefile:171: all] Error 2

    tesseractApi->SetImage(image);

    outText = tesseractApi->GetUTF8Text();
    printf("OCR output:\n%s", outText);

    const QString qString = outText;

    ScreenshotSaver().saveToClipboard(qString);

    tesseractApi->End();
    delete tesseractApi;
    delete [] outText;
}
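A note on the cleanup path in `OcrTool::pressed` above: the early `return` after a failed `Init` skips `delete tesseractApi`, and the `Pix` returned by the conversion helper is never released. One way to make such cleanup automatic is `std::unique_ptr` with a custom deleter. The sketch below is only an illustration of that pattern. It uses hypothetical stand-in types and functions (`FakeApi`, `FakeImage`, `fakeCreateImage`, `fakeDestroyImage`, `runOcr`) rather than the real Tesseract or Leptonica API, so it compiles without either library. In a real build the deleters would presumably wrap calls such as `End()` and `pixDestroy`.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-ins for library resources; the counters make cleanup observable.
static int g_liveApis = 0;
static int g_liveImages = 0;

struct FakeApi {
    FakeApi() { ++g_liveApis; }
    ~FakeApi() { --g_liveApis; }
    // Mirrors the "non-zero means failure" convention of Init in the diff.
    int init(const std::string& lang) { return lang == "eng" ? 0 : -1; }
    std::string recognize() const { return "hello"; }
};

struct FakeImage {
    int width;
    int height;
};

// C-style create/destroy pair, similar in spirit to a C image library.
FakeImage* fakeCreateImage(int w, int h)
{
    ++g_liveImages;
    return new FakeImage{ w, h };
}

void fakeDestroyImage(FakeImage* img)
{
    --g_liveImages;
    delete img;
}

// Deleter so unique_ptr releases the C-style handle through its destroy function.
struct ImageDeleter {
    void operator()(FakeImage* img) const { fakeDestroyImage(img); }
};
using ImagePtr = std::unique_ptr<FakeImage, ImageDeleter>;

// Returns the recognized text, or an empty string on failure.
// Every return path releases both resources.
std::string runOcr(const std::string& lang)
{
    auto api = std::make_unique<FakeApi>();
    if (api->init(lang) != 0) {
        return {}; // api is destroyed here, so the early return does not leak
    }
    ImagePtr image(fakeCreateImage(4, 4));
    return api->recognize(); // image, then api, are released on scope exit
}
```

Calling `runOcr("xx")` takes the early-return path and `runOcr("eng")` the normal one; in both cases the two counters are back at zero afterwards.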
41 changes: 41 additions & 0 deletions src/tools/ocr/ocrtool.h
@@ -0,0 +1,41 @@
// Copyright(c) 2017-2019 Alejandro Sirgo Rica & Contributors
//
// This file is part of Flameshot.
//
// Flameshot is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// Flameshot is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU General Public License
// along with Flameshot. If not, see <http://www.gnu.org/licenses/>.

#pragma once

#include "src/tools/abstractactiontool.h"

class OcrTool : public AbstractActionTool
{
    Q_OBJECT
public:
    explicit OcrTool(QObject* parent = nullptr);

    bool closeOnButtonPressed() const;

    QIcon icon(const QColor& background, bool inEditor) const override;
    QString name() const override;
    QString description() const override;

    CaptureTool* copy(QObject* parent = nullptr) override;

protected:
    ToolType nameID() const override;

public slots:
    void pressed(const CaptureContext& context) override;
};
43 changes: 43 additions & 0 deletions src/tools/ocr/tesseract_tool.h
@@ -0,0 +1,43 @@
#pragma once

#include <QPainter>
#include <leptonica/allheaders.h>
#include <tesseract/baseapi.h>

// Thanks to Stackoverflow user "user898678": https://stackoverflow.com/a/10019508
// https://github.com/zdenop/qt-box-editor/blob/master/src/TessTools.cpp
// LICENSE: https://github.com/zdenop/qt-box-editor/blob/master/LICENSE
// modified version

class TesseractTool
{
public:
    static PIX* qImageToPIX(const QImage& qImage)
    {
        PIX* pixs;
        l_uint32* lines;

        QImage qImageCopy = qImage.copy();

        qImageCopy = qImageCopy.rgbSwapped();
        int width = qImageCopy.width();
        int height = qImageCopy.height();
        int depth = qImageCopy.depth();
        int wpl = qImageCopy.bytesPerLine() / 4;

        pixs = pixCreate(width, height, depth);
        pixSetWpl(pixs, wpl);
        pixSetColormap(pixs, NULL);
        l_uint32* datas = pixs->data;

        for (int y = 0; y < height; y++) {
            lines = datas + y * wpl;
            QByteArray a((const char*)qImageCopy.scanLine(y),
                         qImageCopy.bytesPerLine());
            for (int j = 0; j < a.size(); j++) {
                *((l_uint8*)lines + j) = a[j];
            }
        }
        return pixEndianByteSwapNew(pixs);
    }
};
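The helper above combines `rgbSwapped()`, a byte-wise copy, and `pixEndianByteSwapNew` to move each pixel into Leptonica's 32-bit layout, in which red occupies the most significant byte of the word. On a little-endian host, and assuming the input pixels are 32-bit `0xAARRGGBB` values (as in `QImage::Format_ARGB32`), those steps appear to amount to rotating each pixel word left by one byte. The sketch below shows that mapping on plain integers. The function names `argbToRgba` and `convertImage` are hypothetical, and the code uses no Qt or Leptonica types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert one 0xAARRGGBB pixel to 0xRRGGBBAA (red in the most significant byte).
constexpr std::uint32_t argbToRgba(std::uint32_t argb)
{
    return (argb << 8) | (argb >> 24);
}

// Convert a tightly packed, row-major image.
// Assumption: exactly width * height 32-bit pixels with no per-row padding
// (a simplification; real scan lines may be padded).
std::vector<std::uint32_t> convertImage(const std::vector<std::uint32_t>& src,
                                        std::size_t width,
                                        std::size_t height)
{
    std::vector<std::uint32_t> dst;
    if (src.size() != width * height) {
        return dst; // size mismatch: return an empty result instead of reading out of bounds
    }
    dst.reserve(src.size());
    for (std::uint32_t px : src) {
        dst.push_back(argbToRgba(px));
    }
    return dst;
}
```

For example, the opaque pixel `0xFF112233` maps to `0x112233FF`. Note that the helper in the diff computes `wpl` as `bytesPerLine() / 4` and passes `depth()` straight to `pixCreate`, so it implicitly assumes a 32-bit image; converting the `QImage` to a 32-bit format first would make that assumption explicit.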
4 changes: 4 additions & 0 deletions src/tools/toolfactory.cpp
@@ -32,6 +32,7 @@
#include "rectangle/rectangletool.h"
#include "redo/redotool.h"
#include "save/savetool.h"
#include "ocr/ocrtool.h"
#include "selection/selectiontool.h"
#include "sizeindicator/sizeindicatortool.h"
#include "src/utils/confighandler.h"
@@ -110,6 +111,9 @@ CaptureTool* ToolFactory::CreateTool(CaptureToolButton::ButtonType t,
case CaptureToolButton::TYPE_CIRCLECOUNT:
tool = new CircleCountTool(parent);
break;
case CaptureToolButton::TYPE_OCR:
tool = new OcrTool(parent);
break;
default:
tool = nullptr;
break;
3 changes: 2 additions & 1 deletion src/utils/confighandler.cpp
@@ -65,7 +65,8 @@ QVector<CaptureToolButton::ButtonType> ConfigHandler::getButtons()
<< CaptureToolButton::TYPE_OPEN_APP
#endif
<< CaptureToolButton::TYPE_PIN << CaptureToolButton::TYPE_TEXT
<< CaptureToolButton::TYPE_CIRCLECOUNT;
<< CaptureToolButton::TYPE_CIRCLECOUNT
<< CaptureToolButton::TYPE_OCR;
}

using bt = CaptureToolButton::ButtonType;
3 changes: 3 additions & 0 deletions src/utils/configshortcuts.cpp
@@ -140,6 +140,9 @@ const QKeySequence& ConfigShortcuts::captureShortcutDefault(
case CaptureToolButton::ButtonType::TYPE_TEXT:
m_ks = QKeySequence(Qt::Key_T);
break;
case CaptureToolButton::ButtonType::TYPE_OCR:
m_ks = QKeySequence(Qt::CTRL + Qt::Key_T);
break;
default:
break;
}
7 changes: 7 additions & 0 deletions src/utils/screenshotsaver.cpp
@@ -56,6 +56,13 @@ void ScreenshotSaver::saveToClipboard(const QPixmap& capture)
}
}

void ScreenshotSaver::saveToClipboard(const QString& text)
{
SystemNotification().sendMessage(
QObject::tr("Text saved to clipboard"));
QApplication::clipboard()->setText(text);
}

bool ScreenshotSaver::saveToFilesystem(const QPixmap& capture,
const QString& path,
const QString& messagePrefix)
1 change: 1 addition & 0 deletions src/utils/screenshotsaver.h
@@ -27,6 +27,7 @@ class ScreenshotSaver
ScreenshotSaver(const unsigned id);

void saveToClipboard(const QPixmap& capture);
void saveToClipboard(const QString& text);
bool saveToFilesystem(const QPixmap& capture,
const QString& path,
const QString& messagePrefix);
3 changes: 3 additions & 0 deletions src/widgets/capture/capturetoolbutton.cpp
@@ -138,8 +138,10 @@ static std::map<CaptureToolButton::ButtonType, int> buttonTypeOrder
defined(Q_OS_MACX))
{ CaptureToolButton::TYPE_OPEN_APP, 17 },
{ CaptureToolButton::TYPE_EXIT, 18 }, { CaptureToolButton::TYPE_PIN, 19 },
{ CaptureToolButton::TYPE_OCR, 20 },
#else
{ CaptureToolButton::TYPE_EXIT, 17 }, { CaptureToolButton::TYPE_PIN, 18 },
{ CaptureToolButton::TYPE_OCR, 19 },
#endif
};

@@ -175,4 +177,5 @@ QVector<CaptureToolButton::ButtonType>
#endif
CaptureToolButton::TYPE_PIN,
CaptureToolButton::TYPE_CIRCLECOUNT,
CaptureToolButton::TYPE_OCR
};
3 changes: 2 additions & 1 deletion src/widgets/capture/capturetoolbutton.h
@@ -53,7 +53,8 @@ class CaptureToolButton : public CaptureButton
TYPE_REDO = 16,
TYPE_PIN = 17,
TYPE_TEXT = 18,
TYPE_CIRCLECOUNT = 19
TYPE_CIRCLECOUNT = 19,
TYPE_OCR = 20
};
Q_ENUM(ButtonType)

17 changes: 17 additions & 0 deletions src/widgets/capture/capturewidget.cpp
@@ -217,6 +217,7 @@ void CaptureWidget::updateButtons()
case CaptureToolButton::ButtonType::TYPE_UNDO:
case CaptureToolButton::ButtonType::TYPE_REDO:
case CaptureToolButton::ButtonType::TYPE_IMAGEUPLOADER:
case CaptureToolButton::ButtonType::TYPE_OCR:
// nothing to do, just skip non-dynamic buttons with existing
// hard coded slots
break;
@@ -1045,6 +1046,10 @@ void CaptureWidget::initShortcuts()
QVariant::fromValue(CaptureToolButton::ButtonType::TYPE_IMAGEUPLOADER)
.toString());

shortcut = ConfigHandler().shortcut(
QVariant::fromValue(CaptureToolButton::ButtonType::TYPE_OCR).toString());
new QShortcut(QKeySequence(shortcut), this, SLOT(ocr()));

new QShortcut(QKeySequence(ConfigHandler().shortcut("TYPE_TOGGLE_PANEL")),
this,
SLOT(togglePanel()));
@@ -1208,6 +1213,18 @@ void CaptureWidget::saveScreenshot()
close();
}

void CaptureWidget::ocr()
{
m_captureDone = true;
if (m_activeTool != nullptr) {
QPainter painter(&m_context.screenshot);
m_activeTool->process(painter, m_context.screenshot, true);
}
hide();
// Process here!
close();
}

void CaptureWidget::undo()
{
m_undoStack.undo();
1 change: 1 addition & 0 deletions src/widgets/capture/capturewidget.h
@@ -77,6 +77,7 @@ private slots:
// TODO replace with tools
void copyScreenshot();
void saveScreenshot();
void ocr();
void undo();
void redo();
void togglePanel();