Cloud

Cloud is about where we're computing. Cloud-native is about how.
Cloud is a kind of deployment; the application must be decoupled from it.

Cloud Native

Cloud Native => designed to thrive in a cloud-based production environment.

One can implement a cloud-native solution on-premise.
Not all software should be cloud-native.
Store config in the environment, not in properties files (they're code); use properties (yaml) only to create a hierarchical abstraction.
Use a configuration server (like Spring Cloud Config Server).

Minimizing the strike price (the cost of switching from one vendor to another) is often seen as the architectural ideal, but it's rarely the most economical choice. [b]

The Twelve-Factor App

Principles for building software-as-a-service.

Codebase

One mainline in revision control, many deploys
No need to recompile the app for different environments

Dependencies

Explicitly declared and isolated
No implicit system-wide packages

Config

Store config in the environment
Anything which differs in the dev/prod environment
Property files define variables that can be used throughout the code, and the values are bound to those variables from the most appropriate source at the right times.
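
A minimal sketch of binding config from the environment at startup. The variable names (APP_DATABASE_URL, APP_LOG_LEVEL, APP_REQUEST_TIMEOUT_S) and the Settings class are assumptions made for illustration, not part of any framework.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str
    log_level: str
    request_timeout_s: float


def load_settings(env=os.environ) -> Settings:
    """Bind configuration values from the environment (hypothetical names)."""
    return Settings(
        # Required: fail fast with a KeyError if the deploy did not set it.
        database_url=env["APP_DATABASE_URL"],
        # Optional: fall back to defaults when the variable is absent.
        log_level=env.get("APP_LOG_LEVEL", "INFO"),
        request_timeout_s=float(env.get("APP_REQUEST_TIMEOUT_S", "5")),
    )
```

The same build can then run in every environment; only the values supplied by the deploy differ.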

Backing Services

Treat backing services as attached resources
Reference through a URL, never locally
Attach, detach, or swap them without a code change
Examples: databases, REST-services, SMTP servers, ...
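
As a sketch of treating a backing service as an attached resource, the app can receive one URL and derive all connection details from it. The function name and the example URL are assumptions for illustration.

```python
from urllib.parse import urlsplit


def parse_service_url(url: str) -> dict:
    """Split a backing-service URL into the parts a client library needs."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "user": parts.username,
        "password": parts.password,
        # The path without its leading slash, e.g. the database name.
        "name": parts.path.lstrip("/"),
    }
```

Pointing the app at a different database is then a change to the URL in the environment, not to the code.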

Design, Build, Release, Run

Strictly separate stages

Processes

Execute the app as one or more stateless processes
Design for failures
Shared-nothing architecture
Routing

Port Binding

Export services via port-binding
Use a port per service
Mapping is done by the platform

Concurrency

Ability to scale horizontally
Scale out via the process model

Disposability

Maximize robustness with fast startup and graceful shutdown
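
One possible shape of a graceful shutdown, sketched under the assumption that the platform sends SIGTERM before stopping a process: the handler only sets a flag, and the worker finishes its current job before exiting. All function names here are illustrative.

```python
import signal
import threading

shutdown_requested = threading.Event()


def request_shutdown(signum, frame):
    """Signal handler: record the request, do no work here."""
    shutdown_requested.set()


def install_handlers():
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGTERM, request_shutdown)
    signal.signal(signal.SIGINT, request_shutdown)


def run_worker(jobs, handle):
    """Process jobs until they run out or a shutdown is requested.

    A job that has started is always finished; returns the number handled.
    """
    done = 0
    for job in jobs:
        if shutdown_requested.is_set():
            break
        handle(job)
        done += 1
    return done
```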

Dev/Prod Parity

Design for continuous delivery
Stages as similar as possible

Logs

Treat logs as event streams
Log to stdout and stderr; the platform captures and routes the streams
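
A small sketch of logging as an event stream: one JSON object per line on stdout, with no log files or rotation inside the app. The field names and the make_logger helper are assumptions for illustration.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON event."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


def make_logger(name, stream=sys.stdout):
    """Return a logger that writes JSON events to the given stream."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]   # replace any previously attached handlers
    logger.setLevel(logging.INFO)
    logger.propagate = False      # avoid duplicate output via the root logger
    return logger
```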

Admin Processes

Run admin/management tasks as one-off processes
Sidecar is a process that runs alongside a main service
Not very relevant for Java

Audit

Audit all actions and changes

Security

AuthN/AuthZ

API First

Design APIs so that they don't need to be changed over time
Avoid API versioning

Well-Architected Framework Pillars

Security

Identity and access management
Detective controls (searching for security issues)
Infrastructure protection (firewalls, gateways)
Data protection (encryption, backup, replication)
Incident response process

Security Principles

Security first
Security at all layers (among components)
Automate security
Traceability (log and audit all actions and changes)
Principle of least privilege (each actor has only the access strictly needed to perform the intended action)

Reliability

What could fail, will fail.
Network fails.
Instead of bringing the entire system down, each component should be designed to degrade gracefully. For example (Netflix): during traffic surges, static content is shown instead of a list of movies personalized to the user.
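
The degradation described above could be sketched like this. The static list and the function names are made up for illustration; this is not how Netflix implements it.

```python
STATIC_TITLES = ["Top 10 this week", "New releases", "Classics"]


def recommendations(user_id, personalized, fallback=STATIC_TITLES):
    """Return personalized titles, or static content if that call fails."""
    try:
        return personalized(user_id)
    except Exception:
        # Degrade instead of failing the whole page.
        return list(fallback)
```

A real service would typically also bound the call with a timeout, so that a slow dependency counts as a failure.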

Reliability Principles

Test recovery (simulate failures)
Automatically recover
Scale horizontally (replace one big resource with several smaller ones)
Automate changes in architecture and infrastructure
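
One common building block for automatic recovery is retrying a failed call with exponential back-off. The sketch below is a generic illustration with assumed defaults (3 attempts, 0.1 s base delay); the injectable sleep makes it easy to simulate failures in a test.

```python
import time


def call_with_retry(operation, attempts=3, base_delay_s=0.1, sleep=time.sleep):
    """Call operation(); on failure wait 0.1 s, 0.2 s, ... and try again.

    The exception of the last attempt is re-raised to the caller.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay_s * 2 ** attempt)
```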

Digital Transformation

Adopting new technology is fairly easy. Adjusting your way of thinking and working takes much more time. [a]

Digital companies like FAANG have changed the existing business model, not only the medium.

Organizations don't transform by sticking new labels on existing structures. [a]

Digital companies don't have the luxury of a well-defined target picture. Rather, they live in a world of constant change. [a]

Instead of being a good guesser, companies need to be fast learners so that they can figure out quickly what works and what doesn't. [a]

Organizations that look at the cloud from an infrastructure point of view, seeing servers, storage, and network, will thus miss out on the key benefits of cloud platforms. Instead, organizations should take an application-centric view on their cloud strategy. [a]

Moving to the cloud with your old way of working is the cardinal sin of cloud migration: you'll get yet another data center. [a]

A cloud strategy's objective should not be limited to getting existing applications to run in the cloud somehow. Rather, applications should measurably benefit from running in the cloud; for example, by scaling horizontally or becoming resilient thanks to a globally distributed deployment. [a]

You should be cautious not to transform your existing operating model to the cloud. Instead, you need to bring some elements of the cloud operating model to your environment. [a]

People who sell you stuff will promote an ideal that's rare and difficult, if not impossible, to achieve. [a]

It's best to first get things rolling, perhaps with fewer constraints and more handholding than desired, and then define governance based on what has been learned. [a]

Folks coming from a digital company know the target picture but they don't know the path, especially from your starting point. [a]

The friction that's likely to exist in the organization equates to everyone wading knee-deep in the mud. In such an environment a superstar sprinter won't be moving a whole lot faster than the rest of the folks. Instead, they will grow frustrated and soon leave. [a]

Don't build elaborate APIs that mimic the back-end system's design. Instead, build consumer-driven APIs that provide the services in a format that the front-ends prefer. [a]

Interfaces should reflect where we are heading, not where we came from.

The most expensive server is the one that's not doing anything. Even at a 30% discount. [a]

Ways to split a hybrid cloud:

Tier: Front vs Back
Generation: New vs Old
Criticality: Non-critical vs Critical
Life Cycle: Development vs Production
Data classification: Non-sensitive vs Sensitive
Data freshness: Backup vs Operational
Operational state: Disaster Recovery vs Business as usual
Workload demand: Burst vs Steady

Multi-Tenancy Patterns

+------+ +------+     +------+ +------+     +---------------+
|      | |      |     |tenant| |tenant|     | tenant tenant |
|tenant| |tenant|     +------+ +------+     | tenant tenant |
|      | |      |     +---------------+     | tenant tenant |
|      | |      |     | tenant tenant |     | tenant tenant |
+------+ +------+     +---------------+     +---------------+
       Silo                 Bridge                 Pool

Silo

Isolated

Bridge

Hybrid
Some layers are isolated, some are shared

Pool

Shared infrastructure

Principles

Access policies for a tenant to restrict access to resources (AWS IAM for example)
Token with the tenant identity (JWT)
Identity broker pattern (decoupling the app from identity providers)
Tag resources
Management and monitoring at both the global level and the tenant level
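
To illustrate carrying the tenant identity in a token, here is a bare-bones HS256 JWT sketch using only the standard library. The claim name tenant_id and the shared secret are assumptions. Production code would normally use a maintained JWT library and also validate the algorithm header, expiry, and audience.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that the JWT encoding strips.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: bytes) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def issue_token(claims: dict, secret: bytes) -> str:
    """Create a signed token that carries the given claims."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    signature = _b64url(_sign(f"{header}.{payload}", secret))
    return f"{header}.{payload}.{signature}"


def tenant_from_token(token: str, secret: bytes) -> str:
    """Verify the signature, then return the (hypothetical) tenant_id claim."""
    header, payload, signature = token.split(".")
    expected = _sign(f"{header}.{payload}", secret)
    if not hmac.compare_digest(expected, _b64url_decode(signature)):
        raise ValueError("invalid signature")
    return json.loads(_b64url_decode(payload))["tenant_id"]
```

Every request handler can then resolve the tenant from the verified token instead of trusting a tenant parameter sent by the client.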

Isolation (through policies)

Account isolation (complex, does not scale well)
Hybrid (pooled, silos in a pool)
Network isolation (more VPNs and/or more subnets)
Layered (shared and isolated layers)

Data Partitioning

Separate database for each tenant
Single database, multiple schemas
Shared database, a single schema (ForeignKey to TenantId)
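
The third option (shared database, single schema) can be sketched with SQLite as a stand-in for the real database. Table and column names are illustrative; the point is that every row carries a tenant_id and every query filters on it.

```python
import sqlite3


def create_schema(conn):
    """One schema shared by all tenants; rows are tagged with tenant_id."""
    conn.executescript("""
        CREATE TABLE tenants (id TEXT PRIMARY KEY);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            tenant_id TEXT NOT NULL REFERENCES tenants(id),
            item TEXT NOT NULL
        );
        CREATE INDEX orders_by_tenant ON orders (tenant_id);
    """)


def orders_for(conn, tenant_id):
    """Return only the items that belong to the given tenant."""
    rows = conn.execute(
        "SELECT item FROM orders WHERE tenant_id = ? ORDER BY id", (tenant_id,)
    )
    return [item for (item,) in rows]
```

The pooled model depends on no query ever omitting the tenant filter, which is why it is usually enforced in one data-access layer rather than in each caller.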

Serverless

Serverless = FaaS + BaaS

In the cloud context, serverful computing is like programming in low-level assembly language whereas serverless computing is like programming in a higher-level language such as Python.

FaaS is about running backend code without managing your own server systems or your own server applications.

The app is stateless, the state lives in data services.
Modularize code into separate functions outside of the handler.
Orchestrate the application with state machines (AWS Step Functions), not within the functions (Lambdas) - chaining function executions within the code results in a monolithic and tightly coupled application.
State machines should be used to minimize the amount of custom try/catch, back-off, and retries within your serverless applications.
Saga pattern can also be achieved by using state machines, which will decouple and simplify the logic of the application.
Reduce costs by implementing the waiting state using state machines.
Reduce unnecessary function invocations by leveraging caching when possible at the API level (AWS API Gateway).
Lambda environment variables help separate source code from configuration.
Establish externalized connection code (such as a connection pool to a relational database) referenced in the Lambda function’s static constructor/initialization code (that is, global scope, outside the handler) to reuse.
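
A sketch of the handler structure described above, in the Python Lambda style of handler(event, context). SQLite stands in for a real relational database, and the STAGE variable, the visits table, and the helper names are assumptions for illustration.

```python
import os
import sqlite3

# Global scope: evaluated once per execution environment (cold start).
STAGE = os.environ.get("STAGE", "dev")  # configuration comes from the environment
_connection = None


def get_connection():
    """Create the connection on first use and reuse it on later invocations."""
    global _connection
    if _connection is None:
        _connection = sqlite3.connect(":memory:")  # stand-in for a real database
        _connection.execute("CREATE TABLE visits (name TEXT)")
    return _connection


def record_visit(conn, name):
    """Business logic lives outside the handler, so it can be tested alone."""
    conn.execute("INSERT INTO visits (name) VALUES (?)", (name,))
    (count,) = conn.execute("SELECT COUNT(*) FROM visits").fetchone()
    return count


def handler(event, context):
    """Thin entry point: read the event, delegate, shape the response."""
    count = record_visit(get_connection(), event.get("name", "anonymous"))
    return {"statusCode": 200, "body": f"{STAGE}: visit #{count}"}
```

Warm invocations skip the connection setup entirely. Note that the reused object is a per-environment cache, not durable state; the durable state still belongs in a data service.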

BaaS (Backend as a Service) means relying on specialized managed services that cater to specific application requirements, such as storage management, database management, etc.

Security

Follow least-privileged access and strictly allow only what’s necessary to perform a given operation.

Deployment

Prefer separate API Gateway endpoints, Lambda functions, and state machines for each stage over relying on aliases and versions alone.
API Gateway stage variables and Lambda aliases/versions should not be used to separate stages, as they can add operational and tooling complexity, including reduced monitoring visibility of each stage as a unit of deployment.

[Figure: Serverless Deployment Best Practices]

Testing

Integration tests shouldn’t mock services you can’t control, since they may change and may provide unexpected results. These tests are better performed when using real services because they can provide the exact same environment a serverless application would use when processing requests in production.
Acceptance or end-to-end tests should be performed without any changes because the primary goal is to simulate end users’ actions through the external interface available to them.

DataLake

A data lake is a specific architectural approach designed to create a centralized repository of all potentially relevant data available from enterprise and public sources, which can then be organized, discovered, analyzed, understood, and leveraged by the business.

All data in one place (such as S3)
Heterogeneous (schemaless) data in a centralized repository
Schema is created on read, not on write
Data in the DataLake are immutable.
A DataLake consists of packages; packages contain datasets (files) and are tagged (metadata).
Storage is decoupled from computing.
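
Schema-on-read can be illustrated with a few raw JSON lines that are stored as they arrived and only given a shape when a consumer reads them. The sample records and the read_with_schema helper are made up for this sketch.

```python
import json

# Raw, heterogeneous records, stored exactly as they arrived.
RAW_EVENTS = [
    '{"user": "ann", "amount": "12.50", "country": "DE"}',
    '{"user": "bob", "amount": 7, "note": "gift"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a consumer-specific schema (field -> cast) while reading.

    Fields missing from a record become None; extra fields are ignored.
    """
    for line in raw_lines:
        record = json.loads(line)
        yield {
            field: cast(record[field]) if field in record else None
            for field, cast in schema.items()
        }
```

Two consumers can read the same immutable files with different schemas, for example {"user": str, "amount": float} for billing and {"user": str, "country": str} for analytics.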

[Figure: DataLake on AWS]

DataLake vs. Data Warehouse

In a data warehouse, data are cleaned as they are loaded into the warehouse; the same schema is then used for all users.
A DataLake contains all the data in raw form; the schema is created on the fly for each concrete user.

[Figure: DataLake vs. Data Warehouse]

Lambda Architecture

[Figures: Lambda Architecture; Lambda Architecture Flow]

Lambda Architecture without the batch layer is called Kappa Architecture.
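
As a toy sketch of the idea: in a Lambda Architecture a query merges a batch view, recomputed from the full dataset, with a speed view over events that arrived since the last batch run. The page-view counting and the function names are assumptions for illustration.

```python
from collections import Counter


def batch_view(master_dataset):
    """Batch layer: recompute the view from the complete, immutable dataset."""
    return Counter(event["page"] for event in master_dataset)


def speed_view(recent_events):
    """Speed layer: cover only events not yet included in a batch run."""
    return Counter(event["page"] for event in recent_events)


def query(page, batch, speed):
    """Serving side: merge both views to answer with up-to-date counts."""
    return batch[page] + speed[page]
```

In a Kappa Architecture there is no batch_view; the same stream-processing code is replayed over the event log when a view has to be rebuilt.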

Amazon Web Services (AWS)

Amazon Cognito

Amazon Cognito lets you easily add user sign-up and sign-in and manage permissions for your mobile and web apps. You can create your own user directory within Amazon Cognito. You can also choose to authenticate users through social identity providers such as Facebook or Amazon; with SAML identity solutions; or by using your own identity system.

Amazon Cognito enables you to save data locally on users' devices, allowing your applications to work even when the devices are offline. You can then synchronize data across users' devices so that their app experience remains consistent regardless of the device they use.

User Pool is a user directory that you can use to sign up and sign in users and to manage user profiles. User pools also provide tokens for your users when they sign in. You can use these tokens to control access for the user (via your app) to resources such as backend APIs. User pools can contain both native users who sign in directly (i.e., with a username and password stored in the user pool) and federated users who sign in via an external identity provider.

References

Gregor Hohpe: Cloud Strategy [a]
Gregor Hohpe: The Software Architect Elevator [b]
Cornelia Davis: Cloud Native Patterns
https://d0.awsstatic.com/whitepapers/architecture/AWS_Well-Architected_Framework.pdf
https://d1.awsstatic.com/whitepapers/architecture/AWS-Serverless-Applications-Lens.pdf
https://12factor.net
Kenny Bastani, Josh Long: Cloud Native Java
https://martinfowler.com/articles/serverless.html
Tomcy John, Pankaj Misra: Data Lake for Enterprises
https://martinfowler.com/bliki/DataLake.html
https://docs.aws.amazon.com/cognito/latest/developerguide
Nik Rouda: The Compelling Advantages of a Cloud Data Lake
Haskell AWS Lambda
Multi-Tenant Solutions on AWS
Advanced Design Patterns for DynamoDB
Serverless Microservice Patterns for AWS
Fn - Container-native serverless platform
Don't get locked up into avoiding lock-in
Hybrid cloud
Building serverless SaaS on AWS
AWS EKS Workshop
