Software Design Patterns - Creational Patterns

Software Design Patterns - Creational Patterns

Deals with object creation mechanisms

  • Singleton
  • Builder
  • Prototype
  • Factory
  • Abstract Factory

Singleton Pattern

A class of which only a single instance can exist
Application needs one, and only one, instance of an object.

Applicable when

  • Ownership of the single instance cannot be reasonably assigned
  • Lazy initialization is desirable
  • Global access is not otherwise provided for

Basic version

1
2
3
4
5
6
7
8
9
public final class singleton{
private static final Singleton INSTANCE = new Singleton();

private Singleton() {}

public static Singleton getInstance(){
return INSTANCE;
}
}

Lazy initialization version(Thread-Safe)

  • Pros: Initialised on first call to avoid memory wastage.

  • Cons: locks must be added to ensure thread safety, and adding locks will affect performance.

1
2
3
4
5
6
7
8
9
10
11
12
13
public class Singleton {  
private static Singleton instance = null;
private Singleton() {

}

public static Singleton getInstance() {
if (instance == null) {
instance = new Singleton();
}
return instance;
}
}

Initialiation-on-demand Holder Idiom

1
2
3
4
5
6
7
8
9
10
11
12
public class  Singleton {
private Singleton() {}

private static class LazyHolder {
static final Singleton INSTANCE = new Singleton();
}

public static Singleton getInstance() {
return LazyHolder.INSTANCE;
}

}

Builder Pattern

Complex objects can be created directly without the user knowing the construction process and details of the object.
Solves the Telescoping Problem.

CSDN Bloger「Carson_Ho」

Main parts of GoF’s Builder Pattern

  1. Product object

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    public class Computer{

    //电脑组件的集合
    private List<String> parts = new ArrayList<String>();

    //用于将组件组装到电脑里
    public void Add(String part){
    parts.add(part);
    }

    public void Show(){
    for (int i = 0;i<parts.size();i++){
    System.out.println(“组件”+parts.get(i)+“装好了”);
    }
    System.out.println(“电脑组装完成,请验收”);


    }
    }

  2. A Builder(interface or abstract class)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
public  abstract class Builder {  

//第一步:装CPU
//声明为抽象方法,具体由子类实现
public abstract void BuildCPU()

//第二步:装主板
//声明为抽象方法,具体由子类实现
public abstract void BuildMainboard();

//第三步:装硬盘
//声明为抽象方法,具体由子类实现
public abstract void BuildHD();

//返回产品的方法:获得组装好的电脑
public abstract Computer GetComputer();
}

  1. Concrete Builder(extend the Builder)

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    //装机人员1
    public class ConcreteBuilder extend Builder{
    //创建产品实例
    Computer computer = new Computer();

    //组装产品
    @Override
    public void BuildCPU(){
    computer.Add("组装CPU")
    }

    @Override
    public void BuildMainboard(){
    computer.Add("组装主板")
    }

    @Override
    public void BuildHD(){
    computer.Add("组装主板")
    }

    //返回组装成功的电脑
    @Override
    public Computer GetComputer(){
    return computer
    }
    }

  2. Director object

1
2
3
4
5
6
7
8
9
10
public class Director{
//指挥装机人员组装电脑
public void Construct(Builder builder){

builder.BuildCPU();
builder.BuildMainboard();
builder.BuildHD();
}
}

Prototype Pattern

Creating duplicate object while keeping performance in mind. The operation will directly clone a object in the ram. RAM’s I/O is much faster than storage.

  1. Creator
  2. Prototype
  3. ConcretePrototype

Factory Pattern

In the factory pattern, we create objects without exposing the creation logic to the public and by using a common interface to point to the newly created object.

There’re two type of Factory Pattern

  1. Simple Factory
    • Simple Factory create products(object)
      1
      2
      3
      4
      5
      6
      7
      8
      9
      10
      public interface Dog{
      public void speak();
      };
      public class Poodle implements Dog {
      public void speak(){System.out.println("Poodle says \"arf\"")};
      }
      public class Rottweiler implements Dog {
      public void speak(){System.out.println("Poodle says \"arf\"")};
      }

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
public class DogFactory{
public static Dog getDog(String criteria) {
switch (criteria.toLowerCase()){
case "small": return new Poodle();
case "big": return new Rottweiler();
...
default: throw new RuntimeException("Unsupported dog type: " + criteria);
}
}
}

public class TestFactory {
public static void main(String[] args) {
//create a small dog
Dog dog = Dogfactory.getDog("small";);
dog.speak();

...
}
}

  1. Abstract Factory Pattern
    • Abstract Factory create Factory

…write later…

Information Security Notes 6 - System Security

Information Security Notes 6 - System Security

Host Security

The Goals of computer security include confidentiality, data integrity, service availability.

Common attacks:

  • Phishing
  • Malicious code
  • System vulnerability

Buffer Overflow Attacks

When a program is executed, both code and data will be copied to ram.

Heap (Kind of tree)

Heap is used to stored dynamic data.

Stack

Stack is used to stored local variables, static variables and return address

What is buffer overflow?

When copying data into a buffer, the length of the data exceeds the remaining space in the buffer. Normally, buffer overflow data will only corrupt program data and cause unexpected termination. But if someone carefully constructs the contents of the overflow data, then it is possible to gain control of the system

By Buffer Overflow Attacks, When using the unsafe strcpy/gets function, the system will blindly copy the entire data of data into the memory area pointed to by buffer. buffer has a finite length and once the data of data exceeds BUF_LEN, a buffer overflow will occur.

Overflow the buffer with malicous (shell code), overwrite the return address to the shellcode. The function will return to the shellcode instead of the function caller

Countermeasure

  1. Avoid using high risk API
  2. Measure the length of the string firest
  3. Compiler Assistant

SQL injection

username: "or""="
password: "or""="

Then the SQL will be
SELECT * FROM Users WHERE Name = "" or ""="" AND Pass ="" or ""=""
This will always return true

Counter measure

  1. Write Better Program
  2. Data validation in front-end
  3. Use prepare statement in query instead

Network Security

Ping-of-death

Attackers ping a machine with a very fast rate, the server will be too busy.
It is Theoretically possible, but not realistic because the basic version of ping-of-death is not an efficient attacks.

Ping-of-death, buffer overflow

A typical ICMP packet is 64 bytes only, and do not expect packet larger than 65535(max size of ipv4). The server might encounter a buffer-overflow.

SYN Flooding

In TCP three-ways handshaking requires server to respond. The server will response a SYN-ACK to the client. We might issue a lot of SYN request to the server.

Counter measure

Both ping and SYN can be blacklisted

IP spoofing

The IP packet header stores the routing information of a packet. IP Spoofing refer to generate a fake random source IP address. That makes server cannot block SYN packet directly.
IPSec can be used to assert the correctness of IP header, but it cannot be used to prevent SYN Flooding.

Amplification Attacks with IP Spoofing

Some protocols use UDP instead of TCP connect. For example, Network Time protocol and Domain name service.
Attackers can make requests for large vlumes of replies from these service while putting the victims’s IP addresses as the source addresses.

Real life example: Prank call to a restaurant to order 10 people takeaway and ask them to deliver to a victim’s home.

DDoS - Distributed Denial of Service

Upgrade version of a DoS attack. Attackers controls a lot of devices to DoS attack the victim. Since packets are sent frm different direction, it is not easy block the packets.

Counter Measures

  1. Building Firewall/Intrusion Detection System(IDS)
  2. Challenges like Captcha
  3. Demilitarized zone(DMZ)
  4. Use(Purchase) third party sevice

DNS Poisoning

Return a fake/false entry on a DNS server.

Cloud Notes of Technical Issues in Distributed System

Cloud Notes of Technical Issues in Distributed System

  1. Time Synchronization
  2. Coordination and agreement
  3. Transactions and concurrency control

Time synchronization

Timing is important, for accurately.

Computers each have their own physical clocks

Due to the structural differences between servers, different time drifts are generated after a period of time, so that the physical clocks of different servers differ to some extent. As a direct result, event A may occur in a later order than event B, but the timestamp sent over is indeed less than B. If the synchronisation of state is involved B’s data will overwrite A’s data, which we don’t want to see.

  • Electronic devices that count oscillations occuring in a crystal at a frequency.

  • Operating System reads the hardware clock value.

  • Not perfect

    • Clock skek: the instantaneous difference between the readings of any two clocks
    • Clock drift: different crystal-based clock count time at different rates
      • Temperature matter
      • Drift rate: The change in the offset between the clock and a nominal perfect reference clock per unit of time

External syncronization

Synchronize a group of clocks with an authoritative external source of time
For example, UTC: Coordinated Universal Time
Network Time Protocol(NTP)

Process Time: t+T(round)/2

Internal syncronization

Synchronize between a group of computer. A coordinator computer is chosen to be the master. Other computers are slaves. Master periodically polls the slaves, and the slaves send back their clock values.

  • Berkeley Algorithm
  • Cristian’s Method

Distributed Mutual Exclusion

  1. safety - at most one process can execute at a time
  2. liveness - requests to enter and exit the critical section eventually succeed, freedom from deadlock and starvation
  3. Ordering - entry to thee critical section is granted in that order.

Evaluated by:

  1. Consumed bandwidth
    • required two messages to enter the critical section(request message & grant message)
    • required one messages to exit the critical section(a release message)
  2. Client delay
    • Round-trip delay
  3. Throughput(synchronization delay)
    • THe time for a release messages to the derver and a grant message to the next process.

Coordination and agreement

Transations and concurrency control

Motivation of Synchronization

  1. Recoverable to handle process crash
  2. Multiple clients access the same object concurrently
  3. Atomic operation

Atomicity Transactions “原子不可分割”

  1. All or nothing
    • either completes successfully
    • either has no effect at all
  2. Isolation
    • Each transaction must be performed without interference from other transactions
    • No observation

Concurrency Control

  1. Lost update
    • Use old value to calculate a new value
  2. inconsistent retrievals
    • Transaction observes values that are involved in an ongoing updating transaction

Rules of Serial Equivalence

All pairs of conflicting operations of the two transactions be executed in the same order

FIFO?

Locking

  • Exclusive lock - Pessimistic Lock
    Only one can access the object at the same time
    Assuming that concurrency conflicts will occur, block any operations that may violate data integrity.

Java synchronized is an implementation of pessimistic locking, where every time a thread wants to modify data it first obtains a lock, ensuring that only one thread can manipulate the data at any one time, while the others are blocked.

  • Optimistic Lock
    Timestamp/version
    When the update is committed, check the timestamp of the data in the current database and compare it with the timestamp you got before the update, if it is the same then it is OK, otherwise it is a version conflict.

  • Two Phase lock

  • Deadlock

    • Detection:
      • Find cycles in the wait-for graph
      • Select a transaction for abortion to break the cycle
    • Timeout
  • Read/Write Locks

    • read lock before performs read operation
    • write lock before performs write operation
    • write lock is more exclusive

Optimistic concurrency control

Checks “conflict operations” before commit
If yes, aborts it and the client may restart

Timestamp ordering

Record the most recent time of reading and writing of each object
Compare timestamp => determine it can be done immediately or must be delayed or rejected.

Clusters

Benefits of computer clusters include

  1. Scalable performance
  2. High availability
  3. Fault tolerance
  4. Modular growth
  5. Use of commodity components

Attributes of Computer Clusters

  • Scalability
  • Packaging
    • Compact packaging: closely packaged in racks
    • Slack packaging: Located in different locations
  • Control
    • Centralized
    • Decentralized
  • Homogeneity
    • Homogeneous cluster: Node from the same platfrom
    • Heterogeneous cluster: Node from the different platfrom

Architecture

  • OS should be designed multiuser, multitasking and multithreaded
  • interconnected by fast commodity networks
  • Cluster middleware glues together all node platforms at the user space

Design principles of Clusters

  • Single-System image (SSI)
    The same client will see the same view of the service no matter which machine in the cluster it connects to.
  • Reliability
    • operate without a breakdown
  • Availability
    • percentage of time available to the user
  • Servoceability
    • maintenance/repair/upgrades etc.

Operate-Repair cycle

  • Mean time to failure
    • average time of fails
  • Mean time to repair
    • average time to fix(restore)

Type of Failures

  1. Unplanned failures vs. planned shutdowns
  2. Transient failures vs. permanent failures
    • reboot can fix
  3. Partial failures vs. total failures
    • part of the system, the cluster still usable

Fault-Tolerant

  • Host standby
    only primary nodes are actively doing the useful work
    Standby nodes are powered on and running some monitoring programs
  • Active-takeover
    All servers are primary and doing useful work.
    User may experience some delays or may lost some data
  • Failover
    When a component fails, it allows the remaining system to take over the services

Failure Cost Analysis

  • MTTF, MTTR
  • Avilability(%)
  • The downtime per year(hours)
  • The yearly failure cost

Information Security Notes 5 - IPSec, TLS, VPN, HTTPS

Information Security Notes 5 - IPSec, TLS, VPN, HTTPS

Security Goal

  1. End-to-end Encryption: Only accessible by the sender and receiver
  2. Tunneling: Traffic pattern is hidden. Except the very last end point
  3. Authentication: Packets are authentic
  4. Fast
  5. Free

OSI model in Security

VPN is Network Layer(IP); Proxy is Transport Layer(TCP/UDP)

  • Application Layer
  • Physical/Data Link: hop-to-hop security
  • IP Layer: IPSec
  • Transport Layer: TLS/SSL
  • Upper Layer: HTTPS

IPSec

IPSec aka IP Security. Its purpose is to provide high security features for IP, and VPNs are solutions that arise from the way this security is achieved. ipsec is a framework architecture, consisting of two specific types of protocols.

AH - Authentication Header

  • To authenticate
  • By Hash(MD5, SHA1)
  • Less used than ESP

ESP - Encapsulated Security Payload

  • To encapsulate / encrypt

Why is AH less used? Because AH cannot provide encrypt. Also, AH cannot pass NAT network(because the authenticate function)
Of course, IPSec can use both AH and ESP to achieve the most complete security features in extreme cases, but such solutions are extremely rare.

Transport Mode

  • The IP header remains unchanged and is mainly used for End-to-End application scenarios
  • Provide protection primarily for upper-layer protocols (TCP/UDP)

Tunnal Mode

  • Tunnel mode encapsulates an external IP header after AH and ESP processing, which is mainly used in Site-to-Site application scenarios
  • Provide protection to the entire IP packet
  • Although tunnal mode can be applied to any scenario, tunnal mode requires an additional header overhead
  • For PC-to-PC scenarios, it is recommended to use transport mode

IPSec Traffic Processing

  • IPSec is executed on a packet-by-packet basis
  • IPSec searches the security policy database for a match
  • Discard if no match
  • Policy say bypass: send right away
  • Policy say PROTECT: look for a key to encrypt
    • Run IKE if no key is found

Security Association (SA)

  • Can be understood of the “Choose of parameter”
  • SPI: I local identifier enables receiving system to select the SA.
  • IP Destination: Unicast receiver address
  • Security Protocol: indicates it is an AH or ESP association

SSL

  • Protocols which provide secure communication on the internet
  • Encrypts network connection at the Transport Layer
    • On top of TCP; Under HTTP/FTP
    • TLS successor of SSL,TLS for transport Layer Security
  • Designed to prevent eavesdropping, tampering, and message forgery
  • End-to-End service through TCP

  • Encrypted communications over Internet
  • Ensures that the information is sent unchanged, and only to the server you intended
  • Asymmetric encryption for authentication and key exchange
  • Symmetric encryption to encrpyt data

SSL connection and SSL session

Different types of Handshaking

  • One way Authentication
  • Two way Authentication
  • Abbreviated Handshake
    • important and tedious.

Lower Layer stack of SSL

  • SSL Record Protocol
    • To provide supports to upper layer protocols
      • Message integrity
      • Confidentiality
    • Fragmentation: Cut the data into 2^14 bytes
    • Compression: Optionally, but must be lossless
    • Add MAC: compute a message authentication code

Upper Layer stack of SSL

  • SSL Handshake Protocol
  • SSL Change Cipher Spec. Protocol
  • SSL Alert Protocol
  • Application Protocol

HSTP

  • HTTP Strict Transport Security
  • A web security policy mechanism
  • Header:
    Strict-Transport-Sceurity: max-age=31536000
  • Turn any http to https
  • abort connection if cannot be ensured

Information Security Notes Summary 1-5

Information Security Notes Summary 1-5

Here is some concept summary of my mid-term. Those maths and case is excluded.

Type of Attack

I should understand how the following attack happen:

  • Brute-force Attack
  • Differential Attack
  • Length extension attacks
    • Hash using Merkle-Damgard construction
  • Second pre-image attack(2^(n-1))(fixed message)
  • Birthday attack(2^n/2)(any two message)
  • Meeting in the middle attack
  • Man in the middle attack
  • Replay attack
  • Offline dictionary attack

Symetric

Symetric has no hard problem, all depands on the key size. When we talk about symetric, We usually use AES. DES and 3DES should not be use anymore. Symetric is much faster than Asymmetric(10-100times).

Avalanclve effect: Small change in bit will lead to big change of the output.

  • Stream -> one bi
  • Block -> one block (normally 64bit)
    • More secure

Asymmetric

Asymmetric is a Trapdoor function, everyone can lock but only private key can unlock.

RSA

  • We need at least 2048bit
  • Starting from TLS1.3, RSA is no more included

ECC

  • aP and bP is impossible to compute abP
  • Diffle hellman algorithm
  • 256 ECC is as strong as 3072RSA
  • Legacy software does not support

Hash

Famous Hash:

  • MD5
  • SHA1
  • SHA256

Aim:

  • impossible hard to modify a message without changing the hash
  • impossible hard to Generate a messate that a given hash
  • impossible hard to Find two different message with the same hash

MD construction (collision)

大集合映射到小集合, collision必然會發生

H(P||s) for salt

  • Hash chain: one time password
  • Hash list: hash big file
  • Hash tree: only verify root hash

MAC message Authentication code

Goal:

  • Computable(very fast)
  • Unforgeable
  • one-wayness

MAC requies a key to verify

Digital Signature

Make Message can be publity verfiable

  • MAC? key is shared
  • Public Key? everyone can send an encrypted message

Example of Digital Signatures:

  • RSA-PSS
  • DSA
  • ECDSA - fastest and more secure

Mac is faster than DS, MAC only need a hash function.

Secure Public Key Distibution

Ways to distibution a key:

  1. Manual
  2. Certuficate
  3. Public KEY infarstructure
  4. PGP
  5. ID-base

Certificate:

  • Issued by CA (Certificate Authority)
  • Certificate Revocation List(CRL)

Symmetric key distribution

  • Use dellit-hellman Algorithms

  • Key agreement: both parties contribute some information

  • Perfect forward secrecy

  • Session Key: session key per each commanication session

  • Authenticated: is able to confirm the identity of the partner

  • Absent of Secure channel: do not need pre-share secure channel

Requirement of an AKE protocol

  1. Soundness
  2. Completeness
  3. Key establishment
  4. Mutual authentical
  5. Secure against replay attack
  6. Secure against offline dictionary attack
  7. Perfect forward secrecy
  8. Secure against Denning-Sacco attacks

Information Security Notes 4 - Authentication and Standards

Information Security Notes 4 - Authentication and Standards

Adversatry Model in Networks

  • Read only
  • Write only
  • Drop message
  • Replace message

Encryption is to ensure confidentiality
Digital signature/MAC is to ensure message integrity

Man-in-the-middle attacks

The attacker establishes separate contacts with each end of the communication and exchanges the data it receives so that the two ends of the communication think they are talking directly to each other through a private connection, when in fact the entire conversation is under the full control of the attacker.
Public Key Stolen.

A<=KEY=>M<=Key=>B

Protection of Man-in-the-middle attacks - SSL

Verify that the certificate used by one or both parties involved in the communication is issued by an authoritative and trusted digital certificate authority and is capable of performing two-way identity authentication.

Secure Public Key Distribution

  • Manual
  • Certificate
  • Public Key infrastructure
  • PGP
  • ID-base

Manual

Download a public key from a webpage/an email

  • Anyone can forge such a webpage or email

  • Hash the public key and release the hash instead

  • Hash is much more easier to compare by haman than public key

Visual Hash - RandomArt

Use image hash instead of using hex, most non-IT ppls are allergic to hex
Directly use Vitual hash is insecure because attackers can “easily” generate the same public key with the same visual hash

What should we do:

  1. Broadcast the hash of their public keys
  2. announce their public key.
  3. Compute the visual hashes

Certificate

  • Certificate Authority(CA): It is an authoritative body responsible for issuing and managing digital certificates and acts as a trusted third party, assuming the responsibility of checking the legitimacy of public keys.

It is impossible to do the Man-in-the-middle attacks unless:

  • CA also issues a certificate for attackers
  • attackers applies a certificate with user’s identity from another trusted CA
  • Attackers steals CA’s private key
  • Steals Users’ private key

Information included in a digital certificate:

  • Basic info of the subject: name, email, web address
  • Public key of the subject
  • Basic info of the issuer
  • Issuer’s digital signature on the certificate
  • Validity of the certificate: the period of time that the certificate is valid

X.509

Defines a framework for the provision of authentication services by the X.500 directory to its users
Extensions:

  • .pem
  • .cer
  • .crt
  • .der
  • .p7b

Used in many areas such as S/MIME, IPSec, SSL/TLS

Digital Certificate Creation

  1. Subject generates public key pair
  2. Subject keeps the private key to himself
  3. Subject sends his identity and public key to issuer
  4. Issuer verifies subject’s identity
  5. Issuer signs the certificate and returns it to subject

Digital Certificate Verification

  1. Subject presents the certificate to others
  2. Other checks the issuer’s digital signature on the certificate
  3. Check if the certificate has been revoked

Certificate Revocation List CRL

A list that contains revoked certificates.

Reasons for revocation:

  • CA improperly issued a certificate
  • The private key has been hacked
  • Owner no longer own the domains

Pretty Good Privacy - PGP

Self-signed certificate
User trusts this certifcate because there is a link between issuer and users

ID Based

  • Require a trusted authority
    • Called a Public Key Generator(PKG)
  • PKG generates a private key to the user
  • Anyone can use the users’s ID to send an encrypted message
  • The ID is the public key

Pros

  • Do not need a real public KEY, just use ID
  • Encryption cipher is usually short

Cons

  • Authority has the ability to deduce everyone’s private key
  • ID is not legally bind with a person identity

Symmetric key Distribution

  1. Key agreement: Both parties contribute some information in key establishment.
  2. Perfect forward secrecy: When you lose this long term key, the security of your future actions cannot be guaranteed, but your previous actions are guaranteed to be secure.
  3. Session Key: uses a session key per each communication session
  4. Authenticated: is able to confirm the identity of the partner
  5. Absent of Secure Channel: does not require both party to pre-share a secure channel

Encrypted Key Exchange

  • A and B share a password
  • Diffie-Hellman
  1. Secure against replay attack
  2. Secure against offline dictionary attack
  3. Perfect forward secracy
  4. Secure against Denning-Sacco attacks

More of password-base authentication

  1. Do not choose weak passwords
  2. Do not write down password
  3. Change password regularly
  4. Avoid using personal information as password

Information Security Notes 3 - Hash function

Information Security Notes 3 - Hash function

Hash function, known as message digest, used for verification in general.

Famous hash: MD5, SHA1, SHA-256

Hash is discuss in Data Structure


MD construction || Length extension attacks

Two conditions for Length extension attacks:

  • When thee encrypted data is less than 64 bytes, it will be filled automatically.(padding)
  • Encrypt the plaintext in blocks according to this fixed length and use the encryption result of the previous block as the next encrypted block (Initial Vector).

MD5, SHA-1 SHA-2 are all with MD construction
SHA-3 is not.
The most effective way to prevent hash collisions is to expand the bits where the hash values are taken.

Attacks on (Ideal) Hash functions

  • Brute-force: To break the n-bits hash, we need to perform 2^(n-1) operations.
  • Second Pre-image: given a fixed message, output another message such that their hash values are the same, requires 2^(n-1) operations.
  • Birthday attack: To obtain any two message that has the same hash values, requires only 2^(n/2) operations! Birthday Attacker is Maths

Hash Chain

Can be used in applications that requires only one-way authentication e.g. security token
How it works: Hash another Hash

Hash list

Verifying the top hash can assery the correctness of data
When data is transferred, we cut the file into small pieces. Each data packet have a hash value. We put all hash value together as a top hash. Hash check the top hash.

Hash Tree(Merkle Tree)

Hashing like a tree. By verifying the root hash.
Usage: git, p2p system, Bitcoin

Message Authentication Code (MAC)

MAC is a small bit of information that can be used to check the authenticity and the integrity of a message.

  • Authenticity: from the designated person
  • Integrity: unmodified

Requirment of MAC

  • Computable: very fast
  • Unforgeable: cannot be forged by attacker
  • One-wayness: Message cannot be recovered from MAC

Limitation of MAC

MAC requires a key to verify (symmetric).

Digital Signature

  • Asymmetric key version of MAC
  • SSL?
  • Anyone has the public key can verify the document is signed by a signer

Example of Digial Signatures

  • RSA-PSS
  • DSA
  • ECDSA => shortest, fastest and more secure

Comparing MAC and Digital Signature

  • MAC is not publicly verifiable
  • MAC is faster, because requires a hash operation only

Information Security Notes 2 - Symmetric and Asymmetric Encryption

Information Security Notes 2 - Symmetric and Asymmetric Encryption

Symmetric and Asymmetric key

Symmetric

  • Share Same Key
  • Most classoc crypto are Symmetric
  • Encrypts and Decrypts using the same key
  • Also called “Secret Key Encryption”

Symmetric Overview

  • More secure if the algorithm is opened and publicly review
  • Two type of Symmetric
    1. Stream Cipher: the smallest unit to encrypt is one bit
    2. Block Cipher: the smallest unit to encrypt is a block
  • Block Ciphers are more secure than most stream ciphers
  • Most encryptions scramble the message by substitution and rearrangement for multiple-times
  • Security is bounded by the key size.
    • 64bits key => broken with at most 2^64 operations

History - Data Encryption Standard

DES and 3DES should not be adopted today.

  • Known more by reading “Meeting in the middle attack”

Asymmetric

  • Public Key
  • Private Key
  • Encrypt with public key; Decrypt with private key
  • Also called “Public Key Encryption”
  • Everyone can lock but only the one with private key can unlock
  • Trapdoor Function: As a function, its function value is very easy to calculate, but the calculation of the inverse solution is very difficult unless some special information is known.

Note that there are many Maths about ECC, AES etc. I will note them if I have time later.

Data Mining Notes 1

Attribute of data

  • Nominal (ID numbers)
  • Ordinal (grades)
  • Interval (dates)
  • Ratio

How to measure the similarity of two objects

Similarity = 1 - Dissimilarity

  • Similarity
    • Numerical measure of how alike two data objects are
    • Is higher when objectsare more alike
    • Often falls in the range [0,1]
  • Dissimilarity
    • Numberical measure of how different re two data objects
    • Lower when objects are more alike
    • Miinimum Dissimilarity is often 0
    • Upper limit varies

Proximity refers to a similarity or dissimilarity

Data Quality

Examples of problems:

  • Noise and outliers
  • Missing Values
  • Duplicate data

Duplicate Data

Data set may include data objects that are duplicates, or almost duplicates of one another

  • Major issue when merging data from heterogeous sources

Example: Same person with multiple email addresses
Data Cleaning: Process of dealing with duplicate data issues

Data Preprocessing

  • Aggregation: Conbining two or more attributes
    • Data reduction
    • Change of scale
    • More “stable” data
  • Sampling
  • Dimensionality Reduction
  • Feature subset selection
  • Discretization
  • Attribute Transformation

Sampling

  • Sampling is the main technique employed for data selection
  • Statisticians sample because obtaining the entire set of data interest is too expensive or time consuming
  • Sampling is used in data mining because processing the entire set of data of interest is too expensive or time consuming

Key principle for effective sampling is

  1. using a sample will work almost as well as using the entire data sets, if the sample is representative.
  2. A sample is representative if it has approximately the same property as the original set of data

Types of sampling

  • Simple Random Sampling
  • Sampling without replacement
  • Sampling with replacement
  • Stratified sampling

Dimensionality Reduction

???? Need to learn later

Distributed and Cloud Computing Notes 1

Distributed and Cloud Computing Notes 1

Reasons for Distributed Systems

  • Functional Separation
    • Different Capabilities and purposes
  • Inherent Distribution
    • Information
    • People
  • Power imbalance and load variation
  • Reliability
  • Economies

Consequences of Distributed Systems

  • Concurrency - Each computer is autonomous
    • Carry our tasks independently
    • Tasks coordinate their actions by exchanging messages
    • System capacity can be increased by adding more resources
  • No global clock
  • Independent Failures

Motivation of Distributed Systems

  • To share resource and information
  • The emergence of pervasive networking technology
  • The emergence of mobile and ubiquitous computing
  • The increasing demand for multimedia services
  • The view of distributed systems as a utility

Maintenance of intranet

  • No rick if no connection to internet
  • Firewalls are used to limit services from/to an intranet
    • Limit FTP/Remote Desktop etc.

Mobile computing: Performing computing tasks while the user is on the move, away from his/her usual environment

Eight forms of transparency

  • Access transparency
  • Location transparency
  • Concurrency transparency
  • Replication transparency
  • Failure transparency
  • Mobility transparency
  • Performance transparency
  • Scaling transparency

List of Challenge

  • Heterogeneity
  • Security
    • Confidentiality
      • Protection against disclosure to unauthorized individual information
    • Integrity
      • Protection against alteration or corruption
    • Availability
      • Protection against interference targeting access to the resources
      • DDoS
    • Authenticity or Non-repudiation
      • Proof of sending / receiving an information
      • digital signature

Failure

Availability =MTTF/(MTTF+MTTR)

  • Mean time to failure(MTTF)
    • The average time of normal operation before the system fails
  • Mean time to repair (MTTR)
    • The average time it takes to repair the system and restore it to working condition

Single point failure

Single hardware/Software component failures cause the whole system crash. The key approach to enhancing availability is to make as many as possible partial failures by removing single points of failure

Checkpointing

  • The process of periodically saving the stage of an executing program to stable storage, from which the system can recover after a failure.
    • Each program stae saved is called a Checkpoint .
    • Checkpointing can be realized by operating system at kernel level/Third party library/by the application itself.

Jobs

  • Serial Jobs: Run on a single node
  • Parallel jobs: use multiple nodes
  • Interactive jobs: require fast turnaround time, and their input/output is directed to a termainal
  • Batch jobs: need more resources and don’t need immediate responses. Scheduled jobs.

job Management System

  • A user server: Let user submit jobs.
  • A job scheduler: performs job scheduling
  • A resource manager: allocates and monitors resources. Enforces scheduling policies, and collects accounting information.

Security Mechanisms

  • Encryption(AES, RSA)
  • Authentication(Password, Public key)
  • Authorization(access control)

  • Concurrency
    • Fair scheduling
    • Preserve dependencies
    • Avoid deadlocks
    • Object locking, data consistency, semaphores
  • Fault tolerance (No failure despite faults)
    • Fault detection
      • Checksums
      • Heartbeat
    • Fault masking
      • Retransmission of corrupted messages
      • Redundancy
    • Fault toleration
      • Exception handling
      • Timeouts
    • Fault recovery
      • Rollback mechanisms
  • Scalability
  • Openness
  • Distribution transparency <= Do not let other touch