Applied Modern Data-Driven Software Architecture

A practical guide that walks you through the complete architecture of an ambitious product that operates efficiently under high load, scales to petabytes (!) of data, and is built on a modern, cloud-native, open-source tech stack.

Planned for 2024

Created by

Hartmut Armbruster


What you'll learn

Get to know different Architecture Areas
Exemplary Architectural Approach
Understand features, capabilities, limitations of:
- ScyllaDB | Apache Cassandra
- Redpanda | Apache Kafka
- Elasticsearch
Working with distributed, eventually consistent data systems (three of them)
- Characteristics and challenges
- Design principles, patterns, anti-patterns
- Integration, data sync, event sourcing
- Schema & app design with sharding + partitioning in mind

Adopting a Software Architect's mindset
Event & Stream Processing
- CDC, r/w with Kafka Connect
- Stateless and stateful data processing with Kafka, Kafka Streams & Apache Flink
Applied architecture deep dive
- Data, Backend, Event Processing, Frontend, Platform
- High traffic and large scale application
- High volumes of data, at high velocity
- Optimise for availability, reliability, performance
Building blocks of secure architecture
- API, User/Client, Data in-transit/at-rest
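To give a flavour of the "schema & app design with sharding + partitioning in mind" item above, here is a minimal sketch of one common wide-column pattern: bucketing a feed's posts by month so that no single partition grows without bound. All names (`PostKey`, `monthBucket`, `partitionOf`) and the monthly bucket size are assumptions made for illustration; they are not taken from the course's actual schema.

```typescript
// Hypothetical sketch: a bounded partition key for posts in a feed.
// Names and bucket size are illustrative assumptions, not the course's schema.

interface PostKey {
  feedId: string;    // partition key component
  bucket: string;    // partition key component: UTC month, e.g. "2024-03"
  createdAt: number; // clustering column: epoch milliseconds
  postId: string;    // clustering column: tie-breaker for equal timestamps
}

// Derive the UTC month bucket ("YYYY-MM") from an epoch-millisecond timestamp.
function monthBucket(epochMs: number): string {
  const d = new Date(epochMs);
  const month = d.getUTCMonth() + 1;
  return d.getUTCFullYear() + "-" + (month < 10 ? "0" + month : String(month));
}

// Build the full key for a post; the bucket is derived, never supplied by hand.
function postKey(feedId: string, postId: string, createdAt: number): PostKey {
  return { feedId, bucket: monthBucket(createdAt), createdAt, postId };
}

// All posts of one feed within one month share a partition.
function partitionOf(key: PostKey): string {
  return key.feedId + "#" + key.bucket;
}
```

With a key like this, reading the latest posts of a feed touches only the most recent bucket(s), while older data stays in separate partitions. The right bucket size depends on write volume per feed, which this sketch does not attempt to estimate.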

What to expect

The overall tone is quite technical
Most of the training is very practical since everything relates to a real application, real use-cases & requirements
Starting off high-level, but getting into low-level/in-depth designs and solutions
Clear visual explanations and software architecture diagrams
The overall presentation style is like an onboarding session (or handover)
The might of a well-conceived, tailored end-to-end architecture

What -not- to expect

NO hands-on practice lessons, tasks, quizzes or similar
NO coding exercises (though the ‘Applied Architecture’ chapter may contain real db schema or simple code samples)
NOT about (enterprise) architecture frameworks such as TOGAF, ITIL, SAFe.
The course focus is on the conceptual design process and actual solutions to given requirements.
NOT teaching soft skills or 'how to become a software architect' (many excellent guides and books are already available)

Audience

Developers with backend experience
Software/Solution Architects
Data Engineers
Everyone interested in learning modern design patterns and new technologies...

Prerequisites

Overall level of difficulty: advanced
-> assumes solid software engineering fundamentals
A plus, but not a must: understanding of basic concepts of distributed systems, eventual consistency, data sharding, data validity
No prior knowledge of the fundamental data technologies is required. Basics will be covered throughout the training and more advanced concepts relevant to the project architecture are explored in detail with real use-cases as part of the ‘applied architecture’ section.
No coding skills are required

Technologies

Fundamental stack, directly relevant to the architecture.

Data
- ScyllaDB / Apache Cassandra
- Redpanda / Apache Kafka
- Elasticsearch
- Cloud Object Storage (~S3)
Backend, Integration & Event Processing
- Stateless Services (Core Backend)
- Kafka Streams, Apache Flink
- CDC, Kafka Connect, Debezium
Web Frontend
- SPA, PWA, SSR
- Serverless, CDN, global
Platform
- Containers
- Kubernetes

While data systems, for example, are defining components of the overall (L1/L2) architecture, the choice of frontend framework is more a matter of personal preference (Svelte vs. React vs. Vue.js, ...).

Data
- ScyllaDB
- Redpanda
- Elasticsearch
- S3 (compatible)
Backend
- Kotlin
- Spring
- Project Reactor
- Caffeine Cache
- Redis
Web Frontend
- TypeScript
- Vue3
- Tailwind CSS
Integration & Event Processing
- Kotlin
- Spring (Boot, Kafka, Cloud Stream)
- Kafka Streams
- Kafka Connect
- Apache Flink
Interfaces & Schemas
- REST, OpenAPI
- Avro
Platform
- Kubernetes
- (...full platform/ops stack yet to be decided)
- k8s platform provider
- serverless web frontend provider
- security
  - HashiCorp Vault
  - (...more yet to be decided)
- ~Flux v2 (GitOps)
- ~Prometheus, Thanos(?), Grafana
- ~Loki vs. ELK vs. other?
- Nginx
(...more yet to be decided)

3rd Party SaaS

Pusher
?SendGrid

The Product

The product covered in this guide is a microblogging and social networking service. Users access the application through its web interface. Registered members submit content such as text posts, links, images, and videos. Content is organised into 'feeds' and can be rated, searched, and filtered; notifications and email digests are sent out. Access to feeds is controlled through organisations, teams, collaborators, roles, and permissions.

Among other non-functional requirements (NFRs), the primary focus is on availability, reliability, performance, and scalability. The app is designed to operate efficiently under high load and scale to petabytes (!) of data, with the ambition to support a scale comparable to, e.g., GitHub, Discord, Reddit, or even Twitter.

Admittedly, this is a bold statement, and large-scale load tests have yet to be executed and published, but I’m confident that the architecture and tech stack will take you a long way.
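
To make the access-control part of the product description a little more concrete, the following is a minimal sketch of how role-based permissions on feeds could be checked. The role names, permission names, and the `Grant` shape are hypothetical and chosen only for illustration; the actual model used in the course may look quite different.

```typescript
// Hypothetical sketch of role-based access control for feeds.
// Roles, permissions, and the Grant shape are illustrative assumptions.

type Permission = "read" | "post" | "moderate" | "admin";
type Role = "viewer" | "contributor" | "moderator" | "owner";

// Each role maps to the set of permissions it carries.
const rolePermissions: Record<Role, Permission[]> = {
  viewer: ["read"],
  contributor: ["read", "post"],
  moderator: ["read", "post", "moderate"],
  owner: ["read", "post", "moderate", "admin"],
};

// A grant assigns a role on a feed to a principal
// (a user, a team, or an organisation, identified by id).
interface Grant {
  feedId: string;
  principalId: string;
  role: Role;
}

// principalIds holds the user's own id plus the ids of every team and
// organisation the user belongs to; one matching grant is sufficient.
function can(
  grants: Grant[],
  principalIds: string[],
  feedId: string,
  permission: Permission
): boolean {
  return grants.some(
    (g) =>
      g.feedId === feedId &&
      principalIds.indexOf(g.principalId) !== -1 &&
      rolePermissions[g.role].indexOf(permission) !== -1
  );
}
```

For example, given a grant that makes a team a `contributor` on a feed, `can(grants, [userId, teamId], feedId, "post")` is true for any member of that team, while `"moderate"` is not. Resolving a user's team and organisation memberships is left out of the sketch.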

Course Curriculum

Coming soon.