Four Distributed Systems Architectural Patterns
Modern 3-tier
The above model evolved to
More and more application functionality moves into the front end. We place a load balancer in front and scale the servers behind it horizontally. (Vertical scaling, by contrast, means upgrading the hardware of a single machine.)
ELB or NGINX can serve as the load balancer.
Cassandra
All the nodes are the same; it's a peer-to-peer database. Each node owns a specific token, and the tokens are used to divide the data for writes and reads. The cluster is always drawn as a circle (ring) labelled with token IDs.
Suppose we write a key-value pair in Cassandra where Tim is the key and Americano is the value. We feed the key to a hash function, and let's say it gets converted to 9F72. That hash decides which node stores the pair. When reading, we run the key through the same hash function to find the node to read from.
To avoid a single point of failure, we replicate the data across several nodes.
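The write, read and replication steps above can be sketched as a toy token ring. This is only an illustration, not Cassandra's implementation: the `TokenRing` class, the node names, the 16-bit MD5-based `token_for` function and the replication factor of 3 are all assumptions made for the example.

```python
import bisect
import hashlib


def token_for(key):
    """Hash a key to a 16-bit token (a stand-in for a real partitioner)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:2], "big")


class TokenRing:
    """Toy peer-to-peer ring: every node is equal and owns one token."""

    def __init__(self, node_tokens, replication_factor=3):
        # Sort nodes by token so the ring can be walked clockwise.
        self.ring = sorted((token, node) for node, token in node_tokens.items())
        self.tokens = [token for token, _ in self.ring]
        self.rf = min(replication_factor, len(self.ring))
        self.storage = {node: {} for node in node_tokens}

    def replicas(self, key):
        # The owner is the first node whose token is >= the key's token,
        # wrapping around the ring; the other replicas are the next nodes.
        start = bisect.bisect_left(self.tokens, token_for(key)) % len(self.ring)
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(self.rf)]

    def write(self, key, value):
        for node in self.replicas(key):
            self.storage[node][key] = value

    def read(self, key, down=frozenset()):
        # Try each replica in ring order, skipping nodes that are down.
        for node in self.replicas(key):
            if node not in down:
                return self.storage[node].get(key)
        raise RuntimeError("all replicas are down")


ring = TokenRing({"node-a": 0x0000, "node-b": 0x4000, "node-c": 0x8000, "node-d": 0xC000})
ring.write("Tim", "Americano")
owner = ring.replicas("Tim")[0]
# Even with the owning node down, another replica still serves the read.
print(ring.read("Tim", down={owner}))
```

Because the same hash function is used for writes and reads, any node can work out where a key lives without asking a central coordinator.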
Strengths of Modern 3-tier architecture
Weakness of Modern 3-Tier Architecture
Sharded Architecture
Simple architecture
As the number of clients grows, so does the load on the application server.
To fix this and reduce the delay, we can shard the application server. We divide the application server by service and put a router in front, which routes each client request to the respective application server.
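A minimal sketch of the router described above, assuming each request carries the name of the service it targets. The `AppServer` and `Router` classes, the service names and the request shape are hypothetical, chosen only for illustration.

```python
class AppServer:
    """One shard of the application tier."""

    def __init__(self, name):
        self.name = name
        self.handled = []

    def handle(self, request):
        self.handled.append(request)
        return f"{self.name} handled {request['path']}"


class Router:
    """Routes each request to the application server that owns its service."""

    def __init__(self, routes):
        # routes maps a service name to the shard responsible for it.
        self.routes = routes

    def route(self, request):
        service = request["service"]
        try:
            server = self.routes[service]
        except KeyError:
            raise LookupError(f"no shard registered for service {service!r}") from None
        return server.handle(request)


chat = AppServer("chat-shard")
files = AppServer("files-shard")
router = Router({"chat": chat, "files": files})
print(router.route({"service": "chat", "path": "/send"}))
```

The router is the only component that knows the mapping, so clients stay unaware of how many shards exist.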
Slack is built using this architectural pattern.
When sharding goes wrong
We usually run the database in a master-replica setup and have some layer to manage the configuration, for example ZooKeeper.
ZooKeeper selects the master and load-balances the reads and writes. Now suppose the link between the replicas and the master goes down.
The replicas that have lost the link with the master will ask ZooKeeper (shown at bottom left) to elect a new master.
Now suppose the links come back. There are then two masters, with a replica pointing to both, and that causes havoc.
Sharding must be decided upfront and is difficult to change later. Few systems provide dynamic sharding.
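To see why resharding later is painful, here is a rough sketch that assumes keys are placed with a simple `hash(key) % shard_count` rule. The notes do not say how keys are mapped to shards, so that rule, the `shard_for` and `fraction_moved` helpers and the sample keys are all assumptions.

```python
import hashlib


def shard_for(key, shard_count):
    """Place a key on a shard using a stable hash modulo the shard count."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def fraction_moved(keys, old_count, new_count):
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(1 for k in keys if shard_for(k, old_count) != shard_for(k, new_count))
    return moved / len(keys)


keys = [f"user-{i}" for i in range(10_000)]
share = fraction_moved(keys, 4, 5)
print(f"{share:.0%} of keys change shard when going from 4 to 5 shards")
```

With a uniform hash, a key stays put only when its hash gives the same remainder for 4 and for 5, which is expected for about one key in five. Adding a single shard therefore relocates roughly 80% of the data under this placement rule.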
What's good in sharding
Client isolation is easy (data and deployment). For example, messaging services are often required to keep their data centers in the same country as their users.
Known, simple technology.
Weakness
Complexity: monitoring and logging in a distributed system is difficult.
No comprehensive view of the data: an ETL process is required to mine it, since we have multiple DBs.
Oversized shards: a shard can grow too big over time.
How to overcome the weakness of sharding
We add a large database like Cassandra to aggregate, log and analyze the data.
Lambda architecture
Lambda distinguishes between streaming data and batch data.
Batch data: data stored at a certain place and addressed by that place, e.g. a file or a DB. It is bounded.
Streaming data: a log of events. It is unbounded.
The Lambda architecture assumes unbounded, immutable data. Events are usually immutable and unbounded.
We can have bounded analysis and unbounded analysis. In bounded analysis the events are stored in a DB; since it keeps the data for a long time, the volume grows and processing it takes a lot of time.
For real-time analysis, we can keep the events in temporary storage (some kind of queue). The data in that case is unbounded, and the analysis can be done quickly because it only touches the temporary storage. Both paths can be connected to Cassandra at the backend to provide any functionality that is missing.
Combined:
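One way to picture the slow bounded path and the fast temporary path working together is the toy event counter below. It is a sketch under assumptions, not a reference design: the `LambdaCounter` class, the `account` field on events, and the rule that a batch run clears the fast view are all invented for the example.

```python
from collections import Counter


class LambdaCounter:
    """Counts events per account using a batch view plus a speed view."""

    def __init__(self):
        self.master_log = []         # append-only log of immutable events
        self.batch_view = Counter()  # recomputed slowly from the whole log
        self.speed_view = Counter()  # updated per event since the last batch run

    def ingest(self, event):
        # Every event goes to long-term storage and to the fast path.
        self.master_log.append(event)
        self.speed_view[event["account"]] += 1

    def run_batch(self):
        # Recompute from scratch over all stored events, then discard the
        # temporary counts that the batch view now covers.
        self.batch_view = Counter(e["account"] for e in self.master_log)
        self.speed_view.clear()

    def query(self, account):
        # A query merges the (possibly stale) batch result with recent events.
        return self.batch_view[account] + self.speed_view[account]


counter = LambdaCounter()
for account in ["tim", "tim", "ann"]:
    counter.ingest({"account": account})
counter.run_batch()
counter.ingest({"account": "tim"})
print(counter.query("tim"))
```

The batch run can take as long as it needs because queries never wait for it; the fast view fills the gap between runs.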
KAFKA
In Kafka we have producers, consumers and topics (named queues of messages). A topic lives on a broker; a broker is just a server running Kafka. A topic can also be partitioned, because the volume of messages can grow. When it does, we need a cluster of brokers to handle the topic.
Topic partitioning
Let's say the topic is partitioned across 3 brokers/servers. Producers are on the right, consumers on the left. Suppose the producers are generating messages.
We can look at some part of the message (account ID, IP address, etc.), hash that part, and use the hash to decide which partition of the topic the message goes to. Messages generated by different producers are pushed to the topic this way.
Partitions are ordered; the topic as a whole is not.
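The partitioning rule above can be simulated without a Kafka cluster. The sketch below is plain Python, not the Kafka client API: the `Topic` class, its method names and the MD5-based hash are assumptions for illustration, and Kafka's own partitioner differs.

```python
import hashlib


class Topic:
    """Toy partitioned topic: each partition is an ordered, append-only list."""

    def __init__(self, name, partition_count):
        self.name = name
        self.partitions = [[] for _ in range(partition_count)]

    def partition_for(self, key):
        # Hash part of the message (here, its key) to pick a partition.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big") % len(self.partitions)

    def produce(self, key, value):
        partition = self.partition_for(key)
        self.partitions[partition].append((key, value))
        offset = len(self.partitions[partition]) - 1
        return partition, offset

    def consume(self, partition, offset=0):
        # Messages come back in the order they were appended to this partition.
        return self.partitions[partition][offset:]


orders = Topic("orders", partition_count=3)
for key, value in [("acct-1", "a"), ("acct-2", "x"), ("acct-1", "b"), ("acct-1", "c")]:
    orders.produce(key, value)

p = orders.partition_for("acct-1")
print([v for k, v in orders.consume(p) if k == "acct-1"])
```

All messages with the same key land in the same partition, so their relative order is preserved. Messages with different keys may sit in different partitions, and nothing orders them against each other.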
Strength of Lambda
Weakness
It's not a general-purpose framework; it's for analysis and logging.
Streaming Architecture