We are creating a YaCy portal that can be used to crawl the web and evaluate everything we find in multiple ways. It will be a Search-as-a-Service portal that is hosted online but can also be downloaded by anyone.
What you can find here is the early stage of development.
The searchlab will make use of the existing YaCy Grid search engine technology. The public search portal will provide data science dashboards and user accounts. All elements are free software and hosted in this repository or other repositories of the github.com/yacy organization.
To follow the implementation process, have a look at the milestones M1-M6 in the issue tracker at https://github.com/yacy/searchlab/issues
To read more details about the project, visit https://searchlab.eu/en/access/about/
The searchlab application (this repository) was made as the front-end for the YaCy Grid ecosystem. It mainly uses the following components:
- YaCy Searchlab - this project
- YaCy Grid Crawler https://github.com/yacy/yacy_grid_crawler
- YaCy Grid Loader https://github.com/yacy/yacy_grid_loader
- YaCy Grid Parser https://github.com/yacy/yacy_grid_parser
- an S3 store; we are using MinIO from https://min.io/ but you will be able to use any other implementation
- an Elasticsearch instance; we will use OpenSearch from https://opensearch.org/
We had to carefully select the web design, the application server and the overall web technology for a typical full-stack application. We decided against a complex, node-based single-page front-end and instead created a more classical design using server-rendered web pages and a static site generator, combined with modern data-driven concepts and API designs.
- the web front-end is created using the static site generator MkDocs. Its source path is ui.
- the template for the web front-end is based on MkDocs-Theme-Cinder-Superhero, a privacy-aware combination of the Cinder Bootstrap theme for MkDocs with a theme adaptation that turns it into a dark-mode version of Cinder using design ideas of Superhero.
- the back-end server is written in Java and uses Undertow as web server.
- content within the MkDocs pages can use the handlebars template engine for dynamic/server-side content management. This feature has two elements:
  - a handlebars template engine integration in the Undertow server
  - an API concept where each web page that uses handlebars requires a JSON API which provides the data for the handlebars template. Even though the template process also runs server-side, the API for the content that it handles must be provided as an external API.
- server-side includes allow the integration of server-rendered add-on content. This can be used to inject tablesaw- or plotly-generated html (see below).
- as a server-internal data structure, tablesaw provides data table libraries for data science functions. This library allows the output of plotly-based time-series data (see below).
- plotly graphs that visualize tables are added by tablesaw
- to further provide an Excel-like experience to users who require this approach, Bootstrap Table is used for extended table visualization. It offers a large variety of search and export functions.
- to visualize workflows, we also integrated Mermaid, which renders diagrams from text and code inside the MkDocs sources.
The source code is published in the repositories of this GitHub organization. To get the source code, just run
git clone https://github.com/yacy/searchlab.git
git clone https://github.com/yacy/searchlab_apps.git
If you just want to download a zip file with all source, use this link: https://github.com/yacy/searchlab/archive/refs/heads/master.zip
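For the git route above, a small helper can make the clone step repeatable. This is only a sketch: the function name clone_if_missing and the GIT override variable are hypothetical and not part of the repository; only the two repository URLs come from this document.

```shell
# Sketch: clone a repository of the yacy organization unless a directory
# with that name already exists in the current directory.
# GIT is a hypothetical override (e.g. GIT=echo) to preview the command.
clone_if_missing() {
    repo="$1"
    if [ -d "$repo" ]; then
        echo "skip $repo (already present)"
    else
        ${GIT:-git} clone "https://github.com/yacy/$repo.git"
    fi
}
```

Calling clone_if_missing searchlab followed by clone_if_missing searchlab_apps in the same directory yields the side-by-side layout that the docker build described further down expects.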
To build the searchlab, you need the following components:
- python 3 and mkdocs which can simply be installed with
pip install mkdocs
- Java 8 (or higher) which can be obtained e.g. from https://adoptium.net/
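Before building, it can help to check that the tools listed above are actually on the PATH. The following is a minimal sketch; the function name missing_tools is hypothetical, and the executable names python3, mkdocs and java are assumptions about how these tools are installed on your system.

```shell
# Sketch: print every tool from the argument list that is not on the PATH.
# Prints nothing when all tools are found.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

For example, missing_tools python3 mkdocs java prints one line per tool that still has to be installed. It does not check the Java version.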
The application is built in two steps and then started:
- first, the static web pages must be created:
cd ui
mkdocs build
- second, the server must be compiled (run this from the repository root, not from ui)
./gradlew assemble
- finally, the application can be started with
./gradlew run
The searchlab application can then be accessed at http://localhost:8400/
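The two build steps above can be combined into one function, sketched below. It assumes that it is called from the root of the searchlab checkout and that the gradlew wrapper lives there. The function name build_searchlab and the override variables MKDOCS and GRADLEW are hypothetical; they exist only so that the sequence can be previewed without running the real tools.

```shell
# Sketch: run both build steps; call it from the root of the searchlab checkout.
# MKDOCS and GRADLEW are hypothetical override variables for a dry run.
build_searchlab() {
    # The subshell keeps the caller's working directory unchanged after "cd ui".
    ( cd ui && ${MKDOCS:-mkdocs} build ) || return 1
    # Only reached when the static pages were generated successfully.
    ${GRADLEW:-./gradlew} assemble
}
```

Running the mkdocs step in a subshell avoids the need for a separate cd back to the repository root before calling the gradle wrapper.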
A Docker image can be built in one step. Starting in the root of the searchlab checkout, run
cd ..
docker build -t searchlab -f searchlab/Dockerfile .
The image MUST be built from the parent directory of the application folder.
The repository searchlab_apps must exist in parallel to searchlab.
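The layout requirement can be verified before starting the build. This is a sketch under the assumptions stated in this section (searchlab/Dockerfile and searchlab_apps both sit inside the directory used as build context); the function name check_build_context is hypothetical.

```shell
# Sketch: succeed only if the given directory (default: current directory)
# contains searchlab/Dockerfile and the sibling checkout searchlab_apps.
check_build_context() {
    dir="${1:-.}"
    if [ ! -f "$dir/searchlab/Dockerfile" ]; then
        echo "missing $dir/searchlab/Dockerfile" >&2
        return 1
    fi
    if [ ! -d "$dir/searchlab_apps" ]; then
        echo "missing $dir/searchlab_apps" >&2
        return 1
    fi
}
```

A possible use is check_build_context && docker build -t searchlab -f searchlab/Dockerfile . so that the build only starts when both checkouts are in place.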
The resulting image is then available in your local Docker image store and can be started with
docker run -d --rm -p 8400:8400 --name searchlab searchlab
The searchlab application can be accessed at http://localhost:8400/
We also publish Docker images of the searchlab application on Docker Hub. They can be pulled and started with
docker run -d --rm -p 8400:8400 --name searchlab yacy/searchlab
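Both docker run commands above start the container detached (-d), so they return before the server is necessarily ready to answer requests. A small polling helper can bridge that gap. This is a sketch only: the function name wait_for_url is hypothetical, it assumes curl is installed, and it assumes that the start page answers with a success status once the server is up.

```shell
# Sketch: poll a URL until it answers successfully or the attempts run out.
# $1 = URL, $2 = number of attempts (default 30), $3 = pause in seconds (default 1)
wait_for_url() {
    url="$1"
    tries="${2:-30}"
    pause="${3:-1}"
    while [ "$tries" -gt 0 ]; do
        # -f turns HTTP error statuses into a failing exit code
        if curl -fsS --max-time 5 -o /dev/null "$url" 2>/dev/null; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$pause"
    done
    return 1
}
```

For example, wait_for_url http://localhost:8400/ && echo "searchlab is up" waits up to roughly half a minute with the default settings.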