
Update druid otlp docs #46

Merged (6 commits) on Oct 17, 2024
28 changes: 25 additions & 3 deletions README.md
@@ -1,18 +1,18 @@
# OpenTelemetry Data Sources for Java
# OpenTelemetry Data Sources for Java (and others)

This repository contains [OpenTelemetry](https://opentelemetry.io/) servers that can be embedded into other Java-based systems to act as data sources for `logs`, `metrics`, `traces` and `profiles` signals.

Here you can also find implementations of such data sources for a few popular open source systems, as well as additional tools to use when working with OpenTelemetry data.

You will also find additional tools, examples and demos that might be of service on your own OpenTelemetry journey.

> [!TIP]
> This is a public release of code we have accumulated internally over time and so far contains only a limited subset of what we intend to share.
>
> Examples of internal software that will be published here in the near future include:
>
> - A small OTLP server based on [Apache BookKeeper](https://bookkeeper.apache.org/) for improved
> data ingestion reliability, even across node failures
> - [Apache Superset](https://superset.apache.org/) charts and dashboards for OpenTelemetry
> visualizations
> - OpenTelemetry Data Sources for [Apache Pulsar](https://pulsar.apache.org/) for when more
>   complex preprocessing is needed
> - Our [Testcontainers](https://testcontainers.com/) implementations that you can use to
@@ -29,6 +29,7 @@ Here you can also find implementations of such data sources for a few popular op
- [Embed OTLP collectors in Java systems](#embeddable-collectors)
- [Save OpenTelemetry to Apache Parquet files](#apache-parquet-stand-alone-server)
- [Ingest OpenTelemetry into Apache Druid](#apache-druid-otlp-input-format)
- [Visualize OpenTelemetry with Apache Superset](#apache-superset-charts-and-dashboards)
- [More about OpenTelemetry at mishmash io](#opentelemetry-at-mishmash-io)

# Why you should switch to OpenTelemetry
@@ -78,6 +79,13 @@ using the [Apache Parquet Stand-alone server](./server-parquet) contained in thi

If you are the sort of person who prefers to learn by looking at **actual data** - start with the [OpenTelemetry Basics Notebook.](./examples/notebooks/basics.ipynb)

> [!TIP]
> If you're wondering how to get your first OpenTelemetry data sets - check out [our fork of OpenTelemetry's Demo app.](https://github.com/mishmash-io/opentelemetry-demos)
>
> In there you will find complete deployments that will generate signals, save them and let you play with the data - by writing your own notebooks or creating
> Apache Superset dashboards.
>

# Artifacts

## Embeddable collectors
@@ -94,6 +102,7 @@ It is not intended for production use, but rather as a quick tool to save and ex
Parquet files as saved by this Stand-alone server.
- [README](./server-parquet)
- [Javadoc on javadoc.io](https://javadoc.io/doc/io.mishmash.opentelemetry/server-parquet)
- [Quick deployment with a demo app](https://github.com/mishmash-io/opentelemetry-demos)

## Apache Druid OTLP Input Format

@@ -112,7 +121,20 @@ like with [Apache BookKeeper](https://bookkeeper.apache.org) or [Apache Pulsar](
Find out more about the OTLP Input Format for Apache Druid:
- [README](./druid-otlp-format)
- [Javadoc on javadoc.io](https://javadoc.io/doc/io.mishmash.opentelemetry/druid-otlp-format)
- [Quick deployment with a demo app and Apache Superset](https://github.com/mishmash-io/opentelemetry-demos)

## Apache Superset charts and dashboards

![superset-dashboard](https://github.com/user-attachments/assets/8dba1e13-bcb3-41c9-ac40-0c023a3825c8)

[Apache Superset](https://superset.apache.org/) is an open-source modern data exploration and visualization platform.

You can use its rich visualizations, no-code viz builder and its powerful SQL IDE to build your own OpenTelemetry analytics.

To get you started, we're publishing [data sources and visualizations](./superset-visualizations) that you can import into Apache Superset.

- [Quick deployment with a demo app](https://github.com/mishmash-io/opentelemetry-demos)

# OpenTelemetry at mishmash io

OpenTelemetry's main intent is the observability of production environments, but at [mishmash io](https://mishmash.io) it is part of our software development process. By saving telemetry from **experiments** and **tests** of
163 changes: 163 additions & 0 deletions druid-otlp-format/DRUID_GUI_SETUP.md
@@ -0,0 +1,163 @@
# Using the Apache Druid GUI to set up OpenTelemetry ingestion

This is a short, visual guide on how to use the [Apache Druid](https://druid.apache.org/) GUI console to set up OpenTelemetry ingestion jobs using the
[druid-otlp-format extension](./README.md), developed by [mishmash io](https://mishmash.io).

To find more about the open source software contained in this repository - [click here.](../)

# Intro

At the time of writing, the Druid GUI console does not directly allow configuring OTLP ingestion jobs. However, with a little 'hack',
it is possible.

Follow the steps below.

> [!WARNING]
> The guide here assumes you will be ingesting OpenTelemetry data published to Apache Kafka by the [Kafka Exporter.](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/exporter/kafkaexporter/README.md)
>
> In other setups the steps might be different.
>

# Walk-through

#### 1. Open the console

Open the Druid GUI, select `Load data` from the top menu and then click `Streaming` in the dropdown:

![druid-step-1](https://github.com/user-attachments/assets/e0e6ed38-18da-4d99-b4f9-5c22f2510386)

#### 2. Start a new streaming spec

Click on the `Start a new streaming spec`:

![druid-step-2](https://github.com/user-attachments/assets/38b54817-d126-4bab-92c6-5dbae6c169e7)

#### 3. Select the source

When you start a new streaming spec you're given a choice to select where the data should come from. Click `Apache Kafka`:

![druid-step-3](https://github.com/user-attachments/assets/c9c16edc-b1b0-4478-be76-78b9377c1623)

#### 4. Connect to Kafka

To connect Druid to Kafka - enter your Kafka brokers and topic name in the right panel:

![druid-step-4](https://github.com/user-attachments/assets/06d63477-e188-44b2-be39-699c8e4bede0)

Click `Apply` when done. This will trigger Druid to connect and, if the connection was successful, after a little while Druid will
show some samples of what it read from the configured Kafka topic:

![druid-step-5](https://github.com/user-attachments/assets/c3280ab8-6372-46a3-aa81-f8b6c0b6a845)

At this step the data will look messy - Druid doesn't yet know how to interpret it. That's okay; click the `Next: Parse data`
button in the right panel.

#### 5. Apply the OTLP format

Now, you're given a choice of what `Input format` to apply on the incoming data:

![druid-step-6](https://github.com/user-attachments/assets/c870c29a-2199-4b03-801b-ffd38dbea8d4)

Here's the tricky part - the `otlp` format is not available inside the `Input format` dropdown. You'll have to edit a bit of JSON
in order to get past this step.

On the bar just under the top menu, click on the final step - `Edit spec`. A JSON representation of the ingestion spec will be
shown to you. Edit the `inputFormat` JSON object:

![druid-step-7](https://github.com/user-attachments/assets/3429a010-5fa3-407d-ae3e-02f0eff1b8a7)

Use one of the following JSON snippets:
- when ingesting `logs`:
```json
...
"inputFormat": {
"type": "otlp",
"otlpInputSignal": "logsRaw"
}
...
```
- when ingesting `metrics`:
```json
...
"inputFormat": {
"type": "otlp",
"otlpInputSignal": "metricsRaw"
}
...
```
- when ingesting `traces`:
```json
...
"inputFormat": {
"type": "otlp",
"otlpInputSignal": "tracesRaw"
}
...
```
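For orientation: the `inputFormat` object you are editing lives inside the spec's `ioConfig` section. Here is a minimal sketch of the surrounding context, assuming the `logs` signal and placeholder Kafka settings (your broker address and topic name will differ):

```json
"ioConfig": {
  "type": "kafka",
  "consumerProperties": {
    "bootstrap.servers": "my-kafka-broker:9092"
  },
  "topic": "otlp_logs",
  "inputFormat": {
    "type": "otlp",
    "otlpInputSignal": "logsRaw"
  }
}
```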

When done editing, simply go back to the `Parse data` step; you don't need to click anywhere to confirm the changes you made.

Now, Druid will re-read the sampled Kafka messages and will parse them:

![druid-step-8](https://github.com/user-attachments/assets/d015b715-da30-4e15-bd06-861e1879350d)

At this point - click the `Next: Parse time` button in the right panel to continue.

#### 6. Configure the data schema

Now, Druid needs to know a few more things about how you'd like your data to be organized.

Select which column will be used for its **time-partitioning.** For example, click on the `time_unix_nano` column header,
make sure it shows in the right bar and that `Format` says `nano`. Then click `Apply`:

![druid-step-9](https://github.com/user-attachments/assets/016d6c6d-69d9-4fce-b63c-3b9319bb8103)
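In the generated spec JSON, this choice maps to the `timestampSpec` inside `dataSchema` - roughly:

```json
"timestampSpec": {
  "column": "time_unix_nano",
  "format": "nano"
}
```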

When done, click `Next: Transform`:

![druid-step-10](https://github.com/user-attachments/assets/6c166d4a-65b8-41cb-b69d-492a17a665a8)

Here you can configure optional transformations of the data before saving it inside Druid. Continue to the next step - `Filter`.

![druid-step-11](https://github.com/user-attachments/assets/879df0e6-834a-4cb2-b95a-61d94a5b1947)

Filtering is also an optional step. Continue to the `Configure schema` step.

Here you can edit the table schema. In the bar on the right - turn off the `Explicitly specify schema` radio button:

![druid-step-12pre](https://github.com/user-attachments/assets/7c10f38d-4ae9-4e46-82bd-51dd25ac5645)

Doing this will trigger a dialog to pop up; just confirm by clicking `Yes - auto detect schema`:

![druid-step-12](https://github.com/user-attachments/assets/828a8c9a-ff25-4821-bb7c-fe789eb04000)
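With auto-detection confirmed, the spec's `dimensionsSpec` typically ends up relying on Druid's schema discovery, along the lines of:

```json
"dimensionsSpec": {
  "useSchemaDiscovery": true
}
```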

Then proceed to the `Partition` step. Select a `Primary partitioning (by time)`. In this example, we're setting it to `hour`
for hourly partitions:

![druid-step-13](https://github.com/user-attachments/assets/6c6d1492-63fb-4188-baf4-b9b00eec53ce)
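In the spec JSON, hourly partitioning appears in the `granularitySpec` of `dataSchema` - for example:

```json
"granularitySpec": {
  "segmentGranularity": "hour"
}
```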

Continue to `Tune`:

![druid-step-14](https://github.com/user-attachments/assets/a67669ec-b7c4-4d02-bba5-3a1039de48ff)

Set `Use earliest offset` to `True` and continue to `Publish`:

![druid-step-15](https://github.com/user-attachments/assets/1d49b0c0-02e0-487a-8bef-1bc75e0b126f)
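The `Use earliest offset` toggle corresponds to the `useEarliestOffset` flag in the spec's `ioConfig` (other `ioConfig` fields omitted here):

```json
"ioConfig": {
  "useEarliestOffset": true
}
```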

Check whether you would like to change the `Datasource name` or any of the other options, but the defaults should be okay. Move to the
next step - `Edit spec`:

![druid-step-16](https://github.com/user-attachments/assets/e8628c83-ec73-4cc7-9b90-c62d76ab8fba)

This is the same step we visited earlier to set the `inputFormat` to `otlp`. This time there's nothing to do here,
so just click `Submit` in the right panel and that's it!

At the end, you should get a screen similar to this:

![druid-step-17](https://github.com/user-attachments/assets/330b2629-9cbc-419c-b29c-99fdf8e6aaff)

##### Congratulations! :)

You can now start querying OpenTelemetry data! :)



6 changes: 6 additions & 0 deletions druid-otlp-format/README.md
@@ -1,5 +1,7 @@
# Apache Druid extension for OpenTelemetry signals ingestion

![druid-otlp-ingestion](https://github.com/user-attachments/assets/1b6d064e-7335-4365-a694-8cc2eebf1348)

This artifact implements an Apache Druid extension that you can use to ingest
[OpenTelemetry](https://opentelemetry.io) signals - `logs`, `metrics`, `traces` and `profiles` - into Apache Druid, and then query Druid through interactive charts and dashboards.

@@ -37,6 +39,8 @@ To get an idea of why and when to use this Druid extension - here is an example
5. Setup [Apache Superset](https://superset.apache.org/) with a [Druid database driver.](https://superset.apache.org/docs/configuration/databases#apache-druid)
6. Explore your telemetry in Superset!

![superset-dashboard](https://github.com/user-attachments/assets/8dba1e13-bcb3-41c9-ac40-0c023a3825c8)

> [!TIP]
> We have prepared a clone of [OpenTelemetry's demo app](https://opentelemetry.io/docs/demo/) with
> the exact same setup as above.
@@ -290,6 +294,8 @@ that is being prepared. It has the same format as above and you can paste the co
parameters (take the correct config from above). Once you do that - switch back to the `Parse data`
tab and voila!

See a [step-by-step Druid GUI console guide here.](./DRUID_GUI_SETUP.md)

# OpenTelemetry at mishmash io

OpenTelemetry's main intent is the observability of production environments, but at [mishmash io](https://mishmash.io) it is part of our software development process. By saving telemetry from **experiments** and **tests** of