Designing Data-Intensive Applications Course
Outline
Ch 1: Reliable, Scalable, and Maintainable Applications
Lab 1.1 - Twitter Timeline (links to a Killercoda scenario; you'll need a free Killercoda account, or a paid one if you prefer)
Ch 5: TODO
Future
- Asynchronous Replication
- Vocab terms from book
- Data Model Throwdown (Relational, Document, Graph, Key-Value, Wide-Column, Search, Time-Series, Blob, Vector, etc.)
- Jepsen consistency levels?
- Synchronous vs Asynchronous Replication
- Eventual Consistency in Asynchronous Replication
- Multileader Replication
- Leaderless Replication
- OLTP vs OLAP vs HTAP
- Columnar Cloud Storage: Parquet, Hive, Iceberg, Delta Lake, Hudi
- B-Trees vs LSM-Trees (other relevant storage engine topics? maybe indexes?)
- Reliability, Scalability, and Maintainability
- Blog on DSRP internals based on real systems: PostgreSQL (https://www.postgresql.org/docs/ and code), SQLite (this and PostgreSQL for OLTP), DuckDB (single-node OLAP), Trino, Polypheny, F1 Lightning, Faiss et al., ML DSRP? (PyTorch and/or TensorFlow, etc.), Parquet, Arrow, Iceberg, MongoDB, Cassandra, Elasticsearch, Redis, Neo4j or another graph database, InfluxDB or another time-series database, OpenStack object storage? other OpenStack?, ZooKeeper or etcd, Calcite
- Kafka vs SQS vs RabbitMQ
- Distributed Actors (Akka, Orleans)
- REST vs RPC
- Serialization: Pickle vs JSON vs Protobuf vs Avro
Examples