Posted 2021-03-20 | Updated 2025-06-19 | Notes | 6 minutes read (About 884 words)

Cloud Notes of Technical Issues in Distributed System

Time Synchronization
Coordination and agreement
Transactions and concurrency control

Time synchronization

Timing is important: a distributed system needs accurate timestamps to order events correctly.

Computers each have their own physical clocks

Because of hardware differences between servers, their clocks drift at different rates, so after a period of time the physical clocks of different servers differ to some extent. As a direct result, event A may actually occur later than event B and yet carry a timestamp smaller than B's. If state synchronisation is involved, B's data will then overwrite A's data, which is not what we want.

A physical clock is an electronic device that counts oscillations occurring in a crystal at a definite frequency.
The operating system reads the hardware clock value.
Physical clocks are not perfect:
- Clock skew: the instantaneous difference between the readings of any two clocks
- Clock drift: different crystal-based clocks count time at different rates
  - Temperature affects the oscillation frequency
  - Drift rate: the change in the offset between the clock and a nominal perfect reference clock per unit of time
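As a rough illustration of how drift turns into skew, the sketch below computes the skew two clocks accumulate after they were last in sync, and how often they would need to resynchronize to stay within a bound. The function names and the drift rate of 1e-6 seconds per second are assumptions chosen for illustration, not values from these notes.

```python
def skew_after(drift_a: float, drift_b: float, elapsed_s: float) -> float:
    """Skew (seconds) that builds up between two clocks that started in sync.

    drift_a and drift_b are signed drift rates in seconds per second,
    measured against a perfect reference clock.
    """
    return abs(drift_a - drift_b) * elapsed_s


def resync_interval(max_drift: float, max_skew_s: float) -> float:
    """Longest interval (seconds) between synchronizations that keeps two
    clocks within max_skew_s, assuming the worst case where each clock
    drifts at max_drift in opposite directions."""
    return max_skew_s / (2 * max_drift)


# Illustrative values: one clock runs fast and the other slow by 1e-6 s/s.
one_day = 24 * 60 * 60
print(skew_after(1e-6, -1e-6, one_day))  # about 0.17 seconds per day
print(resync_interval(1e-6, 0.001))      # about 500 seconds for a 1 ms bound
```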

External synchronization

Synchronize a group of clocks with an authoritative external source of time
For example, UTC: Coordinated Universal Time
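Network Time Protocol (NTP)

Process time: t + T(round)/2, where t is the time value returned by the time server and T(round) is the round-trip time the process measured for its request.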
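The estimate t + T(round)/2 is the one used in Cristian's method: the process assumes the reply spent about half of the round trip on the wire. A minimal sketch, where `ask_server` is a hypothetical callable standing in for the network request to the time server:

```python
import time


def cristian_estimate(request_sent: float, reply_received: float,
                      server_time: float) -> float:
    """Return the time the process should set its clock to.

    request_sent and reply_received are local clock readings taken just
    before sending the request and just after receiving the reply.
    """
    t_round = reply_received - request_sent
    return server_time + t_round / 2


def sync_once(ask_server, clock=time.monotonic) -> float:
    """Perform one synchronization round.

    ask_server is a hypothetical function that returns the server's time;
    in a real system it would be a network call.
    """
    t0 = clock()
    server_time = ask_server()
    t1 = clock()
    return cristian_estimate(t0, t1, server_time)


# Request sent at local time 10.0 s, reply received at 10.2 s,
# server reported 100.0 s: the estimate is roughly 100.1 s.
print(cristian_estimate(10.0, 10.2, 100.0))
```

The shorter the round trip, the smaller the possible error, so a process may choose to discard a reading whose round-trip time is unusually long.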

Internal synchronization

Synchronize the clocks within a group of computers with one another. A coordinator computer is chosen to be the master; the other computers are slaves. The master periodically polls the slaves, and the slaves send back their clock values.

Berkeley Algorithm
Cristian’s Method
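In the Berkeley algorithm, the master averages the clock values it collected (including its own) and tells each machine how much to adjust its clock by. The sketch below shows only that averaging step; it leaves out round-trip compensation and the filtering of faulty readings, and the function name and the clock values are illustrative assumptions.

```python
def berkeley_adjustments(master_time: float,
                         slave_times: dict[str, float]) -> dict[str, float]:
    """Return the adjustment (in the same unit as the inputs) for each clock.

    slave_times maps a slave's name to the clock value it reported.
    The key "master" is reserved for the master's own clock.
    """
    readings = {"master": master_time, **slave_times}
    average = sum(readings.values()) / len(readings)
    # Each machine receives an offset rather than an absolute time, so the
    # delay in delivering the answer does not add further error.
    return {name: average - value for name, value in readings.items()}


# Clock values in minutes past midnight: 3:00, 3:25 and 2:50.
print(berkeley_adjustments(180, {"a": 205, "b": 170}))
# {'master': 5.0, 'a': -20.0, 'b': 15.0}
```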

Distributed Mutual Exclusion

Safety - at most one process can execute in the critical section at a time
Liveness - requests to enter and exit the critical section eventually succeed, which implies freedom from deadlock and starvation
Ordering - if one request to enter the critical section happened before another, entry to the critical section is granted in that order.

Evaluated by:

Consumed bandwidth
- two messages are required to enter the critical section (a request message and a grant message)
- one message is required to exit the critical section (a release message)
Client delay
- Round-trip delay
Throughput (synchronization delay)
- The time for a release message to reach the server and a grant message to reach the next process.
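The message counts above fit a scheme in which a central server grants permission to enter the critical section. The sketch below simulates such a server in a single process and counts the messages exchanged; the class and method names are hypothetical, and real message passing is replaced by method calls.

```python
from collections import deque


class LockServer:
    """Central server that grants the critical section to one process at a time."""

    def __init__(self):
        self.holder = None        # process currently in the critical section
        self.waiting = deque()    # queued requests, oldest first
        self.messages = 0         # total messages sent in either direction

    def request(self, pid: str) -> None:
        self.messages += 1        # request message: process -> server
        if self.holder is None:
            self._grant(pid)
        else:
            self.waiting.append(pid)

    def release(self, pid: str) -> None:
        if pid != self.holder:
            raise ValueError(f"{pid} does not hold the critical section")
        self.messages += 1        # release message: process -> server
        self.holder = None
        if self.waiting:
            self._grant(self.waiting.popleft())

    def _grant(self, pid: str) -> None:
        self.messages += 1        # grant message: server -> process
        self.holder = pid


server = LockServer()
server.request("p1")              # request + grant
server.request("p2")              # request only, p2 is queued
server.release("p1")              # release + grant to p2
print(server.holder, server.messages)  # p2 5
```

The last step shows the synchronization delay: one release message to the server followed by one grant message to the next process.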

Coordination and agreement

Transactions and concurrency control

Motivation of Synchronization

Recoverability, to handle process crashes
Multiple clients access the same object concurrently
Atomic operations

Atomic Transactions (“atomic and indivisible”)

All or nothing
- either the transaction completes successfully
- or it has no effect at all
Isolation
- Each transaction must be performed without interference from other transactions
- The intermediate effects of a transaction must not be visible to other transactions

Concurrency Control

Lost update
- Two transactions read the same old value and each uses it to calculate a new value, so one update overwrites the other
Inconsistent retrievals
- A transaction observes values that are involved in an ongoing updating transaction
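The lost update problem is easiest to see with a fixed interleaving. In the sketch below, two hypothetical transactions T and U add 10 and 20 to the same balance; the amounts and names are made up for illustration.

```python
def serial_execution(balance: int) -> int:
    """T runs to completion, then U: both updates survive."""
    t_read = balance           # T reads
    balance = t_read + 10      # T writes
    u_read = balance           # U reads T's result
    balance = u_read + 20      # U writes
    return balance


def interleaved_execution(balance: int) -> int:
    """T and U both read before either writes: T's update is lost."""
    t_read = balance           # T reads the old value
    u_read = balance           # U reads the same old value
    balance = t_read + 10      # T writes
    balance = u_read + 20      # U overwrites T's write using the old value
    return balance


print(serial_execution(200))       # 230
print(interleaved_execution(200))  # 220
```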

Rules of Serial Equivalence

For two transactions to be serially equivalent, all pairs of conflicting operations of the two transactions must be executed in the same order at every object they both access.
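The rule can be checked mechanically for a schedule of two transactions. In the sketch below, a schedule is a list of `(transaction, operation, object)` tuples, which is a representation assumed here for illustration; two operations conflict when they come from different transactions, touch the same object, and at least one of them is a write.

```python
def serially_equivalent(schedule: list[tuple[str, str, str]]) -> bool:
    """Check the serial equivalence rule for a schedule of two transactions.

    Each entry is (transaction, operation, object) with operation being
    "read" or "write". The check is only intended for two transactions.
    """
    orders = set()
    for i, (t1, op1, obj1) in enumerate(schedule):
        for t2, op2, obj2 in schedule[i + 1:]:
            conflicting = t1 != t2 and obj1 == obj2 and "write" in (op1, op2)
            if conflicting:
                orders.add((t1, t2))   # t1's operation came first
    # Equivalent only if no pair of conflicts runs in opposite orders.
    return not any((b, a) in orders for a, b in orders)


lost_update = [("T", "read", "a"), ("U", "read", "a"),
               ("T", "write", "a"), ("U", "write", "a")]
serial = [("T", "read", "a"), ("T", "write", "a"),
          ("U", "read", "a"), ("U", "write", "a")]
print(serially_equivalent(lost_update))  # False
print(serially_equivalent(serial))       # True
```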

FIFO?

Locking

Exclusive lock - Pessimistic Lock
Only one transaction can access the object at a time
Assume that concurrency conflicts will occur, and block any operation that may violate data integrity.

Java's synchronized keyword is an implementation of pessimistic locking: every time a thread wants to modify data it first obtains a lock, which ensures that only one thread can manipulate the data at any one time while the others are blocked.
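The same pessimistic idea can be sketched in Python with `threading.Lock` from the standard library, which plays a role similar to Java's synchronized here. The `Counter` class and `run_workers` helper are hypothetical names used only for this example.

```python
import threading


class Counter:
    """Shared counter whose update is protected by an exclusive lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0

    def increment(self) -> None:
        # Every writer takes the lock first; other threads block here
        # until the lock is released at the end of the with block.
        with self._lock:
            self.value += 1


def run_workers(n_threads: int, n_increments: int) -> int:
    """Increment one shared counter from several threads and return its value."""
    counter = Counter()

    def work() -> None:
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


print(run_workers(4, 10_000))  # 40000
```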

Optimistic Lock
Timestamp/version
When the update is committed, compare the current timestamp (or version) of the data in the database with the timestamp you read before the update. If they are the same, the update is OK; otherwise there is a version conflict.
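A minimal sketch of the version check, using an in-memory record in place of a database row. The `Record` class and `VersionConflict` exception are hypothetical; a real database would need to perform the compare and the update as one atomic step.

```python
class VersionConflict(Exception):
    """Raised when the record changed between read and commit."""


class Record:
    def __init__(self, value: int):
        self.value = value
        self.version = 0

    def read(self) -> tuple[int, int]:
        """Return the value together with the version it was read at."""
        return self.value, self.version

    def commit(self, new_value: int, expected_version: int) -> None:
        """Apply the update only if nobody committed since the read."""
        if self.version != expected_version:
            raise VersionConflict(
                f"expected version {expected_version}, found {self.version}")
        self.value = new_value
        self.version += 1


record = Record(100)
a_value, a_version = record.read()
b_value, b_version = record.read()

record.commit(a_value + 10, a_version)       # succeeds, version becomes 1
try:
    record.commit(b_value + 20, b_version)   # stale version 0: rejected
except VersionConflict:
    b_value, b_version = record.read()       # re-read and retry
    record.commit(b_value + 20, b_version)

print(record.value, record.version)  # 130 2
```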
Two-phase locking
Deadlock
- Detection:
  - Find cycles in the wait-for graph
  - Select a transaction for abortion to break the cycle
- Timeout
Read/Write Locks
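- acquire a read lock before performing a read operation
- acquire a write lock before performing a write operation
- a write lock is more exclusive: it conflicts with both read and write locks, while read locks can be shared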
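The read/write rules can be summarised as a small compatibility table. The sketch below is one way to write it down; the `COMPATIBLE` table and `can_grant` function are names invented for this example.

```python
# Key: (lock already held by another transaction, lock being requested).
# True means the requested lock can be granted while the held lock is in place.
COMPATIBLE = {
    ("read", "read"): True,     # read locks can be shared
    ("read", "write"): False,   # a writer must wait for readers
    ("write", "read"): False,   # a reader must wait for the writer
    ("write", "write"): False,  # only one writer at a time
}


def can_grant(requested: str, held_by_others: list[str]) -> bool:
    """Return True if the requested lock conflicts with none of the held locks."""
    return all(COMPATIBLE[(held, requested)] for held in held_by_others)


print(can_grant("read", ["read", "read"]))  # True
print(can_grant("write", ["read"]))         # False
print(can_grant("write", []))               # True
```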

Optimistic concurrency control

Check for conflicting operations before commit
If a conflict is found, abort the transaction; the client may then restart it

Timestamp ordering

Record the timestamp of the most recent read and the most recent write of each object
Compare a transaction's timestamp with them to determine whether its operation can be done immediately, must be delayed, or must be rejected.
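A simplified sketch of the timestamp comparison. It covers only the "done immediately" and "rejected" outcomes; the "delayed" case, where an operation waits for an earlier transaction to commit, is left out. The class and exception names are hypothetical.

```python
class Rejected(Exception):
    """The operation arrived too late and its transaction must abort."""


class TimestampedObject:
    def __init__(self, value):
        self.value = value
        self.read_ts = 0    # timestamp of the latest transaction that read it
        self.write_ts = 0   # timestamp of the latest transaction that wrote it

    def read(self, ts: int):
        # A read is too late if a younger transaction already wrote the object.
        if ts < self.write_ts:
            raise Rejected(f"read at {ts} after write at {self.write_ts}")
        self.read_ts = max(self.read_ts, ts)
        return self.value

    def write(self, ts: int, value) -> None:
        # A write is too late if a younger transaction already read or wrote it.
        if ts < self.read_ts or ts < self.write_ts:
            raise Rejected(f"write at {ts} is older than a recorded access")
        self.value = value
        self.write_ts = ts


obj = TimestampedObject("v0")
print(obj.read(5))           # v0
try:
    obj.write(3, "v1")       # rejected: transaction 5 already read the object
except Rejected as err:
    print("rejected:", err)
obj.write(6, "v2")           # accepted
print(obj.read(7))           # v2
```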

Clusters

Benefits of computer clusters include

Scalable performance
High availability
Fault tolerance
Modular growth
Use of commodity components

Attributes of Computer Clusters

Scalability
Packaging
- Compact packaging: nodes closely packaged in racks
- Slack packaging: nodes located in different locations
Control
- Centralized
- Decentralized
Homogeneity
- Homogeneous cluster: nodes from the same platform
- Heterogeneous cluster: nodes from different platforms

Architecture

The OS should be multiuser, multitasking, and multithreaded
Nodes are interconnected by fast commodity networks
Cluster middleware glues together all node platforms in user space

Design principles of Clusters

Single-System image (SSI)
The same client will see the same view of the service no matter which machine in the cluster it connects to.
Reliability
- the system can operate without a breakdown
Availability
- percentage of time the system is available to the user
Serviceability
- maintenance/repair/upgrades etc.

Operate-Repair cycle

Mean time to failure (MTTF)
- average time the system runs before it fails
Mean time to repair (MTTR)
- average time to fix (restore) the system after a failure

Type of Failures

Unplanned failures vs. planned shutdowns
Transient failures vs. permanent failures
- a transient failure can be fixed by a reboot
Partial failures vs. total failures
- a partial failure affects only part of the system, so the cluster is still usable

Fault-Tolerant

Hot standby
Only the primary nodes are actively doing the useful work
Standby nodes are powered on and running some monitoring programs
Active-takeover
All servers are primary and doing useful work.
When a node fails, users may experience some delay or may lose some data
Failover
When a component fails, the remaining system takes over the services it provided

Failure Cost Analysis

MTTF, MTTR
Availability (%)
Downtime per year (hours)
The yearly failure cost
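These quantities are linked by the usual definition of availability, MTTF / (MTTF + MTTR). The sketch below works through the arithmetic; the function names, the example MTTF and MTTR values, and the cost per hour of downtime are all made-up inputs for illustration.

```python
HOURS_PER_YEAR = 365 * 24


def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Fraction of time the system is up: MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)


def downtime_per_year(mttf_hours: float, mttr_hours: float) -> float:
    """Expected hours of downtime in one year."""
    return (1 - availability(mttf_hours, mttr_hours)) * HOURS_PER_YEAR


def yearly_failure_cost(mttf_hours: float, mttr_hours: float,
                        cost_per_hour: float) -> float:
    """Expected yearly cost, given a hypothetical cost per hour of downtime."""
    return downtime_per_year(mttf_hours, mttr_hours) * cost_per_hour


# Illustrative inputs: fails every 99 hours on average, takes 1 hour to repair.
print(f"{availability(99, 1):.2%}")                  # 99.00%
print(f"{downtime_per_year(99, 1):.1f} hours")       # 87.6 hours
print(f"{yearly_failure_cost(99, 1, 1000):.0f}")     # 87600
```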

https://blog.kwunlam.com/Cloud-Notes-of-Technical-Issues-in-Distributed-System/

Author

Elliot

Posted on

2021-03-20

Updated on

2025-06-19
