Tomas Tulka edited this page Feb 14, 2022 · 43 revisions
  • Cloud is about where we're computing. Cloud-native is about how.
  • Cloud is a kind of deployment; the application must be decoupled from it.

Cloud Native

Cloud Native => designed to thrive in a cloud-based production environment.

  • One can implement a cloud-native solution on-premise.
  • Not all software should be cloud-native.
  • Store config in the environment, not in properties files (they're code); use properties (yaml) only to create a hierarchical abstraction.
  • Use a configuration server (like Spring Cloud Config Server).
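A minimal sketch of reading configuration from the environment, assuming hypothetical variable names (`APP_DATABASE_URL`, `APP_REQUEST_TIMEOUT_S`, `APP_DEBUG`) that do not come from the original notes. The code only defines typed names; the values are bound at startup from whatever the deployment provides.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Typed view over environment variables (all names are hypothetical)."""
    database_url: str
    request_timeout_s: float
    debug: bool


def load_settings(env=None) -> Settings:
    """Read settings from the environment; defaults exist only for non-secrets."""
    env = os.environ if env is None else env
    return Settings(
        # Required: raises KeyError at startup if the deployment did not set it.
        database_url=env["APP_DATABASE_URL"],
        request_timeout_s=float(env.get("APP_REQUEST_TIMEOUT_S", "5")),
        debug=env.get("APP_DEBUG", "false").lower() in ("1", "true", "yes"),
    )
```

Passing a plain dict as `env` keeps the loader testable without touching the real process environment.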

Minimizing the strike price - switching cost from one vendor to another - is often seen as the architectural ideal, but it's rarely the most economical choice. [b]

The Twelve-Factor App

Principles for building software-as-a-service.

  1. Codebase
  • One mainline in revision control, many deploys
  • No need to recompile the app for different environments
  2. Dependencies
  • Explicitly declared and isolated
  • No implicit system-wide packages
  3. Config
  • Store config in the environment
  • Anything which differs in the dev/prod environment
  • Property files define variables that can be used throughout the code, and the values are bound to those variables from the most appropriate source at the right time.
  4. Backing Services
  • Treat backing services as attached resources
  • Reference through URL, never locally
  • Attachable and replaceable without code changes
  • Examples: databases, REST-services, SMTP servers, ...
  5. Design, Build, Release, Run
  • Strictly separate stages
  6. Processes
  • Execute the app as one or more stateless processes
  • Design for failures
  • Shared-nothing architecture
  • Routing
  7. Port Binding
  • Export services via port-binding
  • Use a port per service
  • Mapping is done by the platform
  8. Concurrency
  • Ability to scale horizontally
  • Scale out via the process model
  9. Disposability
  • Maximize robustness with fast startup and graceful shutdown
  10. Dev/Prod Parity
  • Design for continuous delivery
  • Stages as similar as possible
  11. Logs
  • Treat logs as event streams
  • Log to stdout and stderr; the platform captures and routes the streams
  12. Admin Processes
  • Run admin/management tasks as one-off processes
  • Sidecar is a process that runs alongside a main service
  • Not very Java relevant
  13. Audit
  • Audit all actions and changes
  14. Security
  • AuthN/AuthZ
  15. API First
  • Design APIs so that they don't need to change over time
  • Avoid API versioning
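As an illustration of the Logs factor above, here is a sketch of an app that writes one JSON event per line to stdout and leaves collection and routing to the platform. The formatter and the `build_logger` helper are illustrative names, not part of any framework.

```python
import json
import logging
import sys


class JsonLineFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


def build_logger(name: str, stream=None) -> logging.Logger:
    """Return a logger that writes JSON lines to stdout (or a given stream)."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonLineFormatter())
    logger = logging.getLogger(name)
    logger.handlers.clear()    # avoid duplicate handlers on repeated calls
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False   # do not also emit through the root logger
    return logger
```

The app never opens log files or manages rotation; it only emits the stream.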

Well-Architected Framework Pillars

Security

  • Identity and access management
  • Detective controls (searching for security issues)
  • Infrastructure protection (firewalls, gateways)
  • Data protection (encryption, backup, replication)
  • Incident response process

Security Principles

  • Security first
  • Security at all layers (among components)
  • Automate security
  • Traceability (log and audit all actions and changes)
  • Principle of least privilege (each actor has only the access needed to perform the intended action)

Reliability

  • What could fail, will fail.
  • Network fails.
  • Instead of bringing the entire system down, each component should be designed to degrade gracefully. For example (Netflix): during traffic surges, static content is shown instead of a list of movies personalized to the user.
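The graceful-degradation idea can be sketched as a fallback around the failing dependency. The `personalized` callable and the `STATIC_TITLES` list are hypothetical stand-ins for a recommendation service and its static replacement.

```python
from typing import Callable, List

# Hypothetical static fallback shown when personalization is unavailable.
STATIC_TITLES: List[str] = ["Popular title A", "Popular title B"]


def movies_for(user_id: str, personalized: Callable[[str], List[str]]) -> List[str]:
    """Return personalized titles, degrading to static content on failure."""
    try:
        return personalized(user_id)
    except Exception:
        # A failing dependency degrades this one feature instead of
        # failing the whole request.
        return list(STATIC_TITLES)
```

A real system would likely add a timeout or circuit breaker in front of the dependency; this sketch only shows the fallback path.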

Reliability Principles

  • Test recovery (simulate failures)
  • Automatically recover
  • Scale horizontally (replace one large resource with several smaller ones)
  • Automate changes in architecture and infrastructure

Digital Transformation

Adopting new technology is fairly easy. Adjusting your way of thinking and working takes much more time. [a]

Digital companies like FAANG have changed the existing business model, not only the medium.

Organizations don't transform by sticking new labels on existing structures. [a]

Digital companies don't have the luxury of a well-defined target picture. Rather, they live in a world of constant change. [a]

Instead of being a good guesser, companies need to be fast learners so that they can figure out quickly what works and what doesn't. [a]

Organizations that look at the cloud from an infrastructure point of view, seeing servers, storage, and network, will thus miss out on the key benefits of cloud platforms. Instead, organizations should take an application-centric view on their cloud strategy. [a]

Moving to the cloud with your old way of working is the cardinal sin of cloud migration: you'll get yet another data center. [a]

A cloud strategy's objective should not be limited to getting existing applications to run in the cloud somehow. Rather, applications should measurably benefit from running in the cloud; for example, by scaling horizontally or becoming resilient thanks to a globally distributed deployment. [a]

You should be cautious not to transform your existing operating model to the cloud. Instead, you need to bring some elements of the cloud operating model to your environment. [a]

People who sell you stuff will promote an ideal that's rare and difficult, if not impossible, to achieve. [a]

It's best to first get things rolling, perhaps with fewer constraints and more handholding than desired, and then define governance based on what has been learned. [a]

Folks coming from a digital company know the target picture but they don't know the path, especially from your starting point. [a]

The friction that's likely to exist in the organization equates to everyone wading knee-deep in the mud. In such an environment a superstar sprinter won't be moving a whole lot faster than the rest of the folks. Instead, they will grow frustrated and soon leave. [a]

Don't build elaborate APIs that mimic the back-end system's design. Instead, build consumer-driven APIs that provide the services in a format that the front-ends prefer. [a]

Interfaces should reflect where we are heading, not where we came from.

The most expensive server is the one that's not doing anything. Even at a 30% discount. [a]

Ways to split a hybrid cloud:

  1. Tier: Front vs Back
  2. Generation: New vs Old
  3. Criticality: Non-critical vs Critical
  4. Life Cycle: Development vs Production
  5. Data classification: Non-sensitive vs Sensitive
  6. Data freshness: Backup vs Operational
  7. Operational state: Disaster Recovery vs Business as usual
  8. Workload demand: Burst vs Steady

Multi-Tenancy Patterns

+------+ +------+     +------+ +------+     +---------------+
|      | |      |     |tenant| |tenant|     | tenant tenant |
|tenant| |tenant|     +------+ +------+     | tenant tenant |
|      | |      |     +---------------+     | tenant tenant |
|      | |      |     | tenant tenant |     | tenant tenant |
+------+ +------+     +---------------+     +---------------+
       Silo                 Bridge                 Pool

Silo

  • Isolated

Bridge

  • Hybrid
  • Some layers are isolated, some are shared

Pool

  • Shared infrastructure

Principles

  • Access policies for a tenant to restrict access to resources (AWS IAM for example)
  • Token with the tenant identity (JWT)
  • Identity broker pattern (decoupling the app from identity providers)
  • Tag resources
  • Management and monitoring at both the global level and the tenant level

Isolation (through policies)

  1. Account isolation (complex, does not scale well)
  2. Hybrid (pooled, silos in a pool)
  3. Network isolation (more VPNs and/or more subnets)
  4. Layered (shared and isolated layers)

Data Partitioning

  1. Separate database for each tenant
  2. Single database, multiple schemas
  3. Shared database, a single schema (ForeignKey to TenantId)
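The third option (shared database, single schema) can be sketched with an in-memory SQLite database. Table and column names here are illustrative; the point is that every row carries a tenant key and every read is scoped by it.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with a shared, tenant-keyed schema."""
    db = sqlite3.connect(":memory:")
    db.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    db.executescript("""
        CREATE TABLE tenant (id TEXT PRIMARY KEY);
        CREATE TABLE invoice (
            id        INTEGER PRIMARY KEY,
            tenant_id TEXT NOT NULL REFERENCES tenant(id),
            amount    REAL NOT NULL
        );
    """)
    return db


def invoices_for(db: sqlite3.Connection, tenant_id: str) -> list:
    """Return one tenant's invoices; the query is always scoped by tenant_id."""
    rows = db.execute(
        "SELECT id, amount FROM invoice WHERE tenant_id = ? ORDER BY id",
        (tenant_id,),
    )
    return rows.fetchall()
```

This model is the cheapest to operate but relies entirely on the application (or database policies) to keep tenants apart, which is why the tenant filter sits in a single data-access function.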

Serverless

Serverless = FaaS + BaaS

In the cloud context, serverful computing is like programming in low-level assembly language whereas serverless computing is like programming in a higher-level language such as Python.

FaaS is about running backend code without managing your own server systems or your own server applications.

  • The app is stateless, the state lives in data services.
  • Modularize code into separate functions outside of the handler.
  • Orchestrate the application with state machines (AWS Step Functions), not within the functions (Lambdas) - chaining function executions within the code results in a monolithic and tightly coupled application.
  • State machines should be used to minimize the amount of custom try/catch, back-off, and retries within your serverless applications.
  • Saga pattern can also be achieved by using state machines, which will decouple and simplify the logic of the application.
  • Reduce costs by implementing the waiting state using state machines.
  • Reduce unnecessary function invocations by leveraging caching when possible at the API level (AWS API Gateway).
  • Lambda environment variables help separate source code from configuration.
  • Establish externalized connection code (such as a connection pool to a relational database) in the Lambda function’s static constructor/initialization code (that is, global scope, outside the handler) so that it is reused across invocations.
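Several of the points above (configuration via environment variables, logic in functions outside the handler, connections created in global scope) can be sketched in one Python handler. `FakeConnection` is a stand-in for a real client or connection pool, and `TABLE_NAME` is a hypothetical variable name.

```python
import os

# Configuration comes from Lambda environment variables (name is hypothetical).
TABLE_NAME = os.environ.get("TABLE_NAME", "orders")


class FakeConnection:
    """Stand-in for an expensive client such as a database connection pool."""
    created = 0

    def __init__(self, target: str) -> None:
        FakeConnection.created += 1
        self.target = target

    def save(self, item: dict) -> str:
        return f"{self.target}/{item['id']}"


# Created once at import time, outside the handler, so that later
# invocations in the same execution environment reuse it.
CONNECTION = FakeConnection(TABLE_NAME)


def validate(event: dict) -> dict:
    """Business logic lives in plain functions that are easy to unit test."""
    if "id" not in event:
        raise ValueError("event must contain an id")
    return {"id": str(event["id"])}


def handler(event: dict, context: object) -> dict:
    """Entry point: thin glue that delegates to the functions above."""
    location = CONNECTION.save(validate(event))
    return {"statusCode": 200, "body": location}
```

Calling `handler` repeatedly in the same process creates the connection only once, which mirrors how a warm execution environment behaves.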

BaaS means specialized frameworks that cater to specific application requirements. It contains services such as storage management, database management, etc.

Security

  • Follow least-privileged access and strictly allow only what’s necessary to perform a given operation.

Deployment

  • Prefer separate API Gateway endpoints, Lambda functions, and state machines for each stage over relying on aliases and versions alone.
  • API Gateway stage variables and Lambda aliases/versions should not be used to separate stages as they can add additional operational and tooling complexity including reduced monitoring visibility as a unit of deployment.

Serverless Deployment Best Practices

Testing

  • Integration tests shouldn’t mock services you can’t control, since they may change and may provide unexpected results. These tests are better performed when using real services because they can provide the exact same environment a serverless application would use when processing requests in production.
  • Acceptance or end-to-end tests should be performed without any changes because the primary goal is to simulate end users’ actions through the external interface available to them.

DataLake

A data lake is a specific architectural approach designed to create a centralized repository of all potentially relevant data available from enterprise and public sources, which can then be organized, discovered, analyzed, understood, and leveraged by the business.

  • All data in one place (such as S3)

  • Heterogeneous (schemaless) data in a centralized repository

  • Schema is created on read, not on write

  • Data in the DataLake are immutable.

  • A DataLake consists of packages; packages contain datasets (files) and are tagged with metadata.

  • Storage is decoupled from computing.
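Schema-on-read can be illustrated with a reader that imposes its own shape on raw records at query time. The sample records and the "payments" view are invented for the sketch; the stored data is never modified.

```python
import json
from typing import Iterable, Iterator

# Raw, heterogeneous records as they might sit in the lake (hypothetical data).
RAW_EVENTS = (
    '{"user": "u1", "amount": "12.50", "currency": "EUR"}',
    '{"user": "u2", "clicks": 3}',
    '{"user": "u3", "amount": 7}',
)


def read_payments(lines: Iterable[str]) -> Iterator[dict]:
    """Apply a 'payments' schema while reading; the stored data is untouched."""
    for line in lines:
        record = json.loads(line)
        if "amount" not in record:
            continue  # not a payment from this reader's point of view
        yield {
            "user": str(record["user"]),
            "amount": float(record["amount"]),
            "currency": record.get("currency", "EUR"),  # assumed default
        }
```

Another consumer could read the same raw lines with a different schema, for example a "clicks" view, without any change to what is stored.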

DataLake on AWS

Figure: DataLake catalogue and search architecture

DataLake vs. Data Warehouse

  • In a data warehouse, data are cleaned as they are loaded into the warehouse; the same schema is then used for all users.
  • A DataLake contains all the data in raw form; the schema is created on the fly for the concrete user.

Figure: DataLake vs. Data Warehouse

Lambda Architecture

Figures: Lambda Architecture; Lambda Architecture Flow

Lambda Architecture without the batch layer is called Kappa Architecture.
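As a rough sketch, assuming the usual description of Lambda Architecture as a batch layer plus a speed layer whose views are merged at query time, the idea can be shown with two counters. The function names and the page-view example are illustrative only.

```python
from collections import Counter
from typing import Iterable


def batch_view(master_events: Iterable[str]) -> Counter:
    """Batch layer: recompute counts from the full, immutable event log."""
    return Counter(master_events)


def speed_view(recent_events: Iterable[str]) -> Counter:
    """Speed layer: counts for events not yet covered by the batch view."""
    return Counter(recent_events)


def query(page: str, batch: Counter, speed: Counter) -> int:
    """Serving side: merge both views to answer with up-to-date counts."""
    return batch[page] + speed[page]
```

In a Kappa-style variant there would be no `batch_view`; all results would be derived from the stream alone.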

Amazon Web Services (AWS)

Amazon Cognito

Amazon Cognito lets you easily add user sign-up and sign-in and manage permissions for your mobile and web apps. You can create your own user directory within Amazon Cognito. You can also choose to authenticate users through social identity providers such as Facebook or Amazon, with SAML identity solutions, or by using your own identity system.

Amazon Cognito enables you to save data locally on users' devices, allowing your applications to work even when the devices are offline. You can then synchronize data across users' devices so that their app experience remains consistent regardless of the device they use.

User Pool is a user directory that you can use to sign up and sign in users and to manage user profiles. User pools also provide tokens for your users when they sign in. You can use these tokens to control access for the user (via your app) to resources such as backend APIs. User pools can contain both native users who sign in directly (i.e., with a username and password stored in the user pool) and federated users who sign in via an external identity provider.

References