apache / spark
Apache Spark - A unified analytics engine for large-scale data processing
See what the GitHub community is most excited about today.
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
Modern Load Testing as Code
Mill is a fast JVM build tool that supports Java and Scala. It is 2-4x faster than Gradle and 4-10x faster than Maven for common workflows, and aims to make your project’s build process performant, maintainable, and flexible
The Scala 3 compiler, also known as Dotty.
Open-source high-performance RISC-V processor
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
Open-source code analysis platform for C/C++/Java/Binary/JavaScript/Python/Kotlin based on code property graphs. Discord: https://discord.gg/vv4MH284Hc
A Spark plugin for reading and writing Excel files
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
Scala language server with rich IDE features 🚀
Spark RAPIDS plugin - accelerate Apache Spark with GPUs
An open source RESTful API platform for banks that supports Open Banking, XS2A and PSD2 through access to accounts, transactions, counterparties, payments, entitlements and metadata - plus a host of internal banking and management APIs.
An open protocol for secure data sharing
ZIO — A type-safe, composable library for async and concurrent programming in Scala
CMAK is a tool for managing Apache Kafka clusters
A Git platform powered by Scala with easy installation, high extensibility & GitHub API compatibility