Data Networks (ECE 158B)



I took this class to learn about network systems and applications. We did hands-on measurement and implementation of network protocols (link layer, IP layer, and transport layer) and experimented with state-of-the-art network architectures and applications. The course covered the following topics: network programming over TCP/UDP sockets; application-layer protocols (DNS, HTTP, P2P); multimedia protocols (video streaming, VoIP, video telephony, virtual reality); data center networking and cloud computing; software-defined networking; and network security.

This post explores foundational and advanced networking topics, diving into the structure and operation of modern networks. From the underlying architecture of the Internet to the complexities of network security and mobility, each section explains why these concepts matter and how they are used in the real world.


Overview

1. Structure of the Internet

2. High-Level Architecture of Different Networks

3. Web Request Workflow

4. TCP vs. UDP Socket Programming


Network Applications

1. Application-Specific Transport Layer Choices

Different applications have unique quality of service requirements that influence the choice between TCP and UDP. For example, streaming applications often use UDP because its low latency is essential for real-time performance, even if it means accepting occasional packet loss. In contrast, web applications rely on TCP’s reliability to ensure that every part of a web page is delivered intact, providing a consistent user experience despite fluctuations in network conditions.
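
To make the trade-off concrete, here is a minimal sketch of the same exchange over both transports in Python; the host and port stand in for a hypothetical echo service.

```python
import socket

# TCP: connection-oriented, reliable byte stream (hypothetical echo server).
tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp.connect(("example.com", 7))        # three-way handshake happens here
tcp.sendall(b"hello")                  # delivery and ordering are guaranteed
reply = tcp.recv(1024)
tcp.close()

# UDP: connectionless datagrams, no handshake and no retransmission.
udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp.sendto(b"hello", ("example.com", 7))   # fire-and-forget; may be lost
data, addr = udp.recvfrom(1024)            # blocks unless a reply arrives
udp.close()
```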

2. P2P vs. Client/Server Architecture

3. Web Page Structure and HTTP

Modern web pages are constructed using HTML, CSS, and JavaScript, which define both the content and its presentation. The HTTP protocol governs the exchange of these resources between web browsers and servers. This separation of content and presentation, combined with HTTP’s request-response model, enables rapid and flexible web development, facilitating everything from static websites to dynamic, interactive applications.
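
To see the request-response model in action, here is a bare-bones HTTP/1.1 GET issued by hand over a plain socket (example.com stands in for any web server):

```python
import socket

s = socket.create_connection(("example.com", 80))
s.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n")

response = b""
while chunk := s.recv(4096):   # read until the server closes the connection
    response += chunk
s.close()

print(response.split(b"\r\n")[0])   # status line, e.g. b'HTTP/1.1 200 OK'
```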

4. Stateless vs. Stateful Protocols

5. Persistent vs. Non-Persistent HTTP

6. HTTP Cookies

HTTP cookies are small pieces of data stored on a client’s device by a web server. They allow servers to maintain state across stateless HTTP transactions by storing session information, user preferences, or tracking data. This mechanism is crucial for enabling personalized user experiences, managing logins, and tracking user behavior, despite the underlying stateless nature of the HTTP protocol.
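
A quick sketch with the standard-library http.cookies module shows both halves of the mechanism; the session token is, of course, made up:

```python
from http.cookies import SimpleCookie

# Server side: mint a session cookie to return in the response headers.
jar = SimpleCookie()
jar["session_id"] = "abc123"           # hypothetical session token
jar["session_id"]["httponly"] = True
print(jar.output())                    # Set-Cookie: session_id=abc123; HttpOnly

# Client side: parse the header and send the value back on later requests,
# which is what lets the stateless server recognize the session.
received = SimpleCookie("session_id=abc123")
print("Cookie:", received.output(header="", attrs=[]).strip())
```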

7. Web Caching Benefits

Web caching involves storing copies of frequently requested resources closer to the user, either on the client side, at an ISP, or on dedicated caching servers. By reducing the need to repeatedly fetch data from the origin server, caching significantly decreases latency and reduces network congestion. This results in faster load times and more efficient use of bandwidth, improving the overall user experience.
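
The core idea fits in a few lines. Below is a toy in-memory cache with a fixed freshness lifetime; fetch_from_origin is a hypothetical stand-in for a real HTTP request:

```python
import time

CACHE: dict[str, tuple[float, bytes]] = {}
MAX_AGE = 60.0  # seconds a cached copy is considered fresh

def fetch_from_origin(url: str) -> bytes:
    return b"<html>...</html>"   # placeholder for a real HTTP fetch

def get(url: str) -> bytes:
    entry = CACHE.get(url)
    if entry and time.time() - entry[0] < MAX_AGE:
        return entry[1]                      # cache hit: no round trip needed
    body = fetch_from_origin(url)            # cache miss: full fetch
    CACHE[url] = (time.time(), body)
    return body
```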

8. FTP Control/Data Separation

File Transfer Protocol (FTP) separates the control channel (used for sending commands) from the data channel (used for transferring files). This design allows for simultaneous management of commands and bulk data transfer, leading to more efficient error handling and resource allocation. The separation also facilitates the use of different security measures and connection types for control and data, optimizing overall performance.
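
Python's standard-library ftplib mirrors this two-channel design directly; the host and credentials below are placeholders:

```python
from ftplib import FTP

ftp = FTP("ftp.example.com")        # control connection on port 21
ftp.login("anonymous", "guest@")    # USER/PASS sent over the control channel
ftp.retrlines("LIST")               # LIST reply arrives on a new data channel
with open("file.txt", "wb") as f:
    ftp.retrbinary("RETR file.txt", f.write)   # bulk bytes on the data channel
ftp.quit()                          # QUIT over the control channel
```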

9. Email Workflow

Email systems are built on a layered protocol architecture where SMTP is used for sending messages and protocols like IMAP or POP are used for retrieving them. The workflow typically involves composing an email, relaying it through various mail servers, and finally delivering it to the recipient’s mailbox. This process not only ensures reliable message delivery but also supports features such as spam filtering, attachment handling, and user authentication.
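
As a sketch of the sending side, here is how a message is handed to the first SMTP hop with Python's smtplib; the server name, addresses, and credentials are all placeholders, and retrieval would use a separate IMAP or POP session:

```python
import smtplib
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "alice@example.com"
msg["To"] = "bob@example.org"
msg["Subject"] = "Hello"
msg.set_content("Sent via SMTP, retrieved later via IMAP or POP.")

with smtplib.SMTP("smtp.example.com", 587) as server:
    server.starttls()                      # upgrade to TLS before authenticating
    server.login("alice", "app-password")  # hypothetical credentials
    server.send_message(msg)
```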

10. DNS Name Resolution

The Domain Name System (DNS) functions like the Internet’s phonebook, translating human-friendly domain names into numerical IP addresses that computers use to identify each other. By providing a hierarchical and distributed lookup service, DNS simplifies user interactions with the Internet and supports load distribution and redundancy, which are essential for large-scale web services.
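
The lookup itself is one call away in most languages; in Python, socket.getaddrinfo asks the system's resolver, which is the same path a browser takes before opening a connection:

```python
import socket

# Resolve a name to addresses using the system's DNS resolver.
for family, _, _, _, sockaddr in socket.getaddrinfo("example.com", 443,
                                                    proto=socket.IPPROTO_TCP):
    print(family.name, sockaddr[0])   # address family and IP for each record
```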

11. P2P and BitTorrent

BitTorrent, a popular P2P file-sharing protocol, divides large files into smaller chunks that can be downloaded concurrently from multiple peers. This chunk-based distribution method not only speeds up downloads but also reduces the burden on any single server. Key mechanisms like tit-for-tat encourage fair sharing among users, ensuring that peers contribute upload bandwidth in exchange for receiving data, thereby creating a balanced ecosystem.
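
A common companion to tit-for-tat is rarest-first piece selection: download the chunk that the fewest peers can supply, so that rare pieces spread before their holders leave. A minimal sketch (the peer bitfields are made up):

```python
from collections import Counter

def rarest_first(have: set[int], peers: list[set[int]]) -> int | None:
    # Count how many peers hold each piece we still need.
    counts = Counter(p for bitfield in peers for p in bitfield if p not in have)
    if not counts:
        return None
    return min(counts, key=counts.get)   # least-replicated wanted piece

peers = [{0, 1, 2}, {1, 2}, {2}]
print(rarest_first(have={2}, peers=peers))   # piece 0 is held by one peer only
```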

12. Distributed Hash Tables (DHT)

DHTs are decentralized systems that provide efficient lookup services by mapping keys to values across a network of nodes. They offer a robust solution for scalable resource discovery without the need for a central directory. This decentralized approach is particularly valuable in P2P networks and distributed applications, where resilience to node failures and dynamic membership are critical.
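
A minimal consistent-hashing sketch captures the core idea: keys and node IDs live on one hash ring, and each key belongs to the first node clockwise from its hash. This is a simplification of what systems like Chord do, and the node names are hypothetical:

```python
import hashlib
from bisect import bisect

def ring_position(name: str) -> int:
    return int(hashlib.sha1(name.encode()).hexdigest(), 16)

nodes = sorted(["node-a", "node-b", "node-c"], key=ring_position)
positions = [ring_position(n) for n in nodes]

def lookup(key: str) -> str:
    # First node at or past the key's position, wrapping around the ring.
    i = bisect(positions, ring_position(key)) % len(nodes)
    return nodes[i]

print(lookup("song.mp3"))   # responsible node; churn only remaps nearby keys
```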

13. Web Caching vs. CDN

While web caching typically involves storing copies of frequently accessed data locally or at a network edge, Content Delivery Networks (CDNs) take this concept to a global scale. CDNs distribute content across numerous geographically dispersed servers, which reduces latency, improves load times, and provides redundancy. This distributed approach is essential for handling high traffic volumes and ensuring a consistent user experience worldwide.


Multimedia Networking

1. Video Streaming Mechanisms

Video streaming relies on two key buffering mechanisms. Initial buffering preloads a segment of the video before playback begins, which helps mitigate initial delays and jitter. Playout buffering, on the other hand, continuously stores incoming data to compensate for variations in network speed, ensuring smooth playback. Both techniques are critical in maintaining quality of service for streaming applications, especially under fluctuating network conditions.
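
A toy simulation makes the interplay visible: playback starts only after the initial buffer fills, and the playout buffer then absorbs bandwidth dips. All numbers below are made up:

```python
arrivals = [1.2, 1.0, 0.3, 0.2, 1.5, 1.4]   # seconds of video received per second
INITIAL_BUFFER = 2.0                         # preload before starting playback

buffered, playing = 0.0, False
for t, a in enumerate(arrivals):
    buffered += a
    if not playing and buffered >= INITIAL_BUFFER:
        playing = True                       # initial buffering complete
    if playing:
        buffered -= 1.0                      # playback drains 1 s per second
        print(f"t={t}s buffer={buffered:.1f}s", "STALL" if buffered < 0 else "")
        buffered = max(buffered, 0.0)
```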

2. UDP vs. TCP for Video Streaming

UDP is often preferred for live streaming and real-time communication because it minimizes delay by forgoing the overhead of connection establishment and error recovery. This results in lower latency, although it means that some packets may be lost without retransmission. Conversely, TCP provides a reliable, ordered stream of data, which is beneficial for pre-buffered video content where quality and integrity are paramount, even though this can introduce delays due to its retransmission mechanisms.

3. DASH vs. Custom Streaming

Dynamic Adaptive Streaming over HTTP (DASH) represents an evolution from traditional streaming methods by dynamically adjusting the video bitrate based on real-time network conditions. This adaptability helps prevent buffering and quality degradation during periods of variable bandwidth. Compared to custom streaming solutions that may use fixed bitrates or proprietary protocols, DASH offers greater compatibility, flexibility, and a better user experience under diverse network environments.
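
The heart of a DASH client is a small rate-selection loop. Here is a throughput-based sketch in that spirit; the bitrate ladder and safety factor are illustrative, not taken from any real manifest:

```python
LADDER_KBPS = [250, 500, 1000, 2500, 5000]   # bitrates listed in the manifest
SAFETY = 0.8                                 # headroom against estimation error

def choose_bitrate(throughput_kbps: float) -> int:
    # Highest advertised bitrate that fits under a fraction of measured throughput.
    usable = throughput_kbps * SAFETY
    candidates = [b for b in LADDER_KBPS if b <= usable]
    return max(candidates) if candidates else LADDER_KBPS[0]

for measured in (300, 1400, 6000):
    print(measured, "->", choose_bitrate(measured), "kbps")
```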

4. CDN for Multimedia

CDNs play an essential role in multimedia delivery by caching video and audio content on servers located near end users. This proximity minimizes latency and reduces the load on central servers during high-demand events. By intelligently redirecting user requests to the nearest edge server, CDNs ensure that content is delivered quickly and reliably, which is especially important for high-bandwidth multimedia applications.

5. VoIP with SIP

Voice over IP (VoIP) applications rely on the Session Initiation Protocol (SIP) to set up, modify, and terminate calls over the Internet. SIP is designed to handle mobility and changing network conditions, making it ideal for users who move between networks. Its role in establishing sessions and negotiating media parameters ensures that voice communications are maintained even when the underlying IP addresses change.

6. Skype Architecture and NAT Traversal

Skype’s architecture leverages a combination of supernodes and relay servers to enable direct communication between users, even when they are behind firewalls or NAT devices. This design facilitates NAT traversal, allowing users on different networks to connect seamlessly. The use of decentralized supernodes not only distributes the network load but also enhances robustness by eliminating a single point of failure.

7. WebRTC and GCC Congestion Control

WebRTC enables real-time audio, video, and data sharing directly between browsers without the need for plugins. Its built-in Google Congestion Control (GCC) algorithm continuously monitors network conditions and dynamically adjusts the sending rate to balance throughput and delay. This adaptive congestion control is crucial for maintaining high-quality, interactive communications in environments with varying network performance.
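
GCC itself is considerably more involved (it combines a delay-based estimator with a loss-based one), but a highly simplified delay-based controller conveys the flavor; the thresholds and factors below are illustrative:

```python
RATE_MIN, RATE_MAX = 100_000, 5_000_000   # bits per second

def update_rate(rate: float, delay_gradient_ms: float) -> float:
    if delay_gradient_ms > 1.0:       # queue building: overuse, back off
        rate *= 0.85
    elif delay_gradient_ms < 0.0:     # queue draining: hold the rate
        pass
    else:                             # stable delay: probe for more bandwidth
        rate *= 1.05
    return min(max(rate, RATE_MIN), RATE_MAX)

rate = 1_000_000.0
for grad in (0.2, 0.5, 3.0, -0.5, 0.1):
    rate = update_rate(rate, grad)
    print(f"gradient={grad:+.1f}ms -> rate={rate/1e6:.2f} Mbps")
```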

8. Traffic Scheduling Algorithms


Data Center Networking and SDN

1. Data Center Traffic Patterns

Data centers experience highly bursty and often synchronized traffic due to parallel computing tasks and distributed applications. This traffic pattern can lead to sudden spikes in demand on particular network links. Understanding these patterns is crucial for designing data center networks that can dynamically allocate resources and avoid bottlenecks, ultimately ensuring high performance and scalability.

2. Scale-Up vs. Scale-Out

3. Load Balancing Limitations

In data centers, load balancing can be implemented on a per-packet or per-flow basis, each with its own set of challenges. Per-packet load balancing may result in packets arriving out of order, while per-flow load balancing can lead to uneven distribution if some flows are significantly larger than others. Recognizing these limitations helps network designers develop hybrid solutions that optimize traffic distribution and minimize latency.
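
Per-flow balancing is usually implemented ECMP-style by hashing the 5-tuple, which keeps a flow's packets on one path (no reordering) at the cost of pinning large flows to one link. A minimal sketch:

```python
import hashlib

def pick_path(src_ip, dst_ip, src_port, dst_port, proto, n_paths: int) -> int:
    # Same 5-tuple always hashes to the same path index.
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    return int(hashlib.md5(key).hexdigest(), 16) % n_paths

print(pick_path("10.0.0.1", "10.0.1.5", 40123, 443, "tcp", n_paths=4))
```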

4. Topology-Aware Addressing and Centralized Source Routing

Topology-aware addressing assigns IP addresses based on the physical or logical layout of the network. This method simplifies routing decisions and improves the efficiency of packet delivery. In centralized source routing, a central controller determines the optimal path for each packet, which can lead to improved performance and easier troubleshooting. However, the reliance on a central point also necessitates robust failover strategies to maintain network resilience.

5. DC-TCP and pFabric

6. Software-Defined Networking (SDN)

SDN represents a paradigm shift by decoupling the network’s control plane from its data plane. This separation allows network administrators to manage and reconfigure the network centrally, leading to increased flexibility and faster deployment of new services. SDN also enables network virtualization, where multiple virtual networks share the same physical infrastructure, resulting in better resource utilization and simplified management in complex, dynamic environments.
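
The match-action abstraction at the heart of SDN data planes (popularized by OpenFlow) can be sketched in a few lines; the rule fields and actions here are simplified stand-ins:

```python
FLOW_TABLE = []   # (match-dict, action) pairs, in priority order

def install_rule(match: dict, action: str):       # control plane
    FLOW_TABLE.append((match, action))

def handle_packet(pkt: dict) -> str:              # data plane
    for match, action in FLOW_TABLE:
        if all(pkt.get(k) == v for k, v in match.items()):
            return action
    return "send_to_controller"                   # table miss

install_rule({"dst_ip": "10.0.0.2"}, "output:port2")
install_rule({"proto": "udp"}, "drop")
print(handle_packet({"dst_ip": "10.0.0.2", "proto": "tcp"}))   # output:port2
```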


Network Security and Wireless Networks

1. Security Principles

At the heart of network security lie three core principles: confidentiality, integrity, and authentication. Confidentiality ensures that sensitive data remains accessible only to authorized users. Integrity protects data from unauthorized alteration during transmission, while authentication verifies the identities of communicating parties. Together, these principles create a robust framework that underpins secure communications across all types of networks.

2. Symmetric Key Cryptography

Symmetric key cryptography uses a single, shared secret for both encryption and decryption. Its main advantage is speed, which makes it ideal for encrypting large amounts of data. However, the challenge of securely distributing and managing the secret key, especially over insecure channels, means that symmetric cryptography is often used in conjunction with other security measures to establish initial trust.
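
As a concrete sketch, here is authenticated symmetric encryption using Fernet from the third-party cryptography package; note that the generated key is exactly the shared secret whose distribution is the hard part:

```python
from cryptography.fernet import Fernet  # third-party 'cryptography' package

key = Fernet.generate_key()          # this key is the shared secret
f = Fernet(key)

token = f.encrypt(b"meet at noon")   # AES-based authenticated encryption
print(f.decrypt(token))              # b'meet at noon'
```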

3. Public Key Cryptography

Public key cryptography, such as RSA, utilizes a pair of mathematically related keys—a public key for encryption and a private key for decryption. This method enables secure communication over open networks and is fundamental for digital signatures and certificate-based authentication. Its strength lies in the mathematical complexity that makes it computationally infeasible to derive the private key from the public one, ensuring secure key exchange and data integrity.
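
A short sketch with the third-party cryptography package shows the asymmetry: anyone holding the public key can encrypt, but only the private-key holder can decrypt:

```python
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes

private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)
ciphertext = public_key.encrypt(b"session key material", oaep)
print(private_key.decrypt(ciphertext, oaep))   # b'session key material'
```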

4. Authentication Protocols

Authentication protocols verify the identities of users and devices before granting access to network resources. These protocols range from simple password-based systems to sophisticated multi-factor authentication and digital certificate schemes. Each method comes with its advantages and vulnerabilities, and choosing the right one depends on the required security level, ease of use, and the potential impact of a breach.
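
One classic building block is challenge-response with an HMAC, which keeps the shared secret off the wire and resists simple replay. A minimal sketch using only the standard library:

```python
import hashlib
import hmac
import secrets

shared_secret = b"correct horse battery staple"   # provisioned out of band

challenge = secrets.token_bytes(16)               # server -> client, fresh nonce
response = hmac.new(shared_secret, challenge, hashlib.sha256).digest()  # client

expected = hmac.new(shared_secret, challenge, hashlib.sha256).digest()  # server
print(hmac.compare_digest(response, expected))    # True -> authenticated
```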

5. Digital Signatures

Digital signatures leverage public key cryptography to verify the origin and integrity of a message or document. By generating a unique signature based on the content and a private key, the sender ensures that any subsequent modification can be detected by the recipient using the corresponding public key. This process not only establishes trust but also provides non-repudiation, meaning that the sender cannot later deny the authenticity of the signed data.
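
Here is a sign-and-verify sketch with RSA-PSS from the third-party cryptography package; tampering with even one byte of the message makes verification fail:

```python
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
pss = padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                  salt_length=padding.PSS.MAX_LENGTH)

message = b"wire $100 to Bob"
signature = key.sign(message, pss, hashes.SHA256())
try:
    # Verify against a modified message to show tamper detection.
    key.public_key().verify(signature, message + b"0", pss, hashes.SHA256())
except InvalidSignature:
    print("tampered message rejected")
```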

6. Email Security

Email security combines encryption, digital signatures, and spam filtering to protect communications from unauthorized access and tampering. By encrypting messages, only the intended recipient can read the content; digital signatures ensure the message’s integrity; and spam filters help prevent phishing and malware distribution. This layered approach is essential for safeguarding sensitive information in both personal and professional contexts.

7. Firewalls

Firewalls are the first line of defense in network security, controlling incoming and outgoing traffic based on predefined rules. Stateless firewalls examine individual packets in isolation, offering a basic level of protection with minimal resource overhead. Stateful firewalls, however, maintain context about active connections, enabling them to detect and block suspicious traffic more effectively. This distinction allows organizations to balance performance with robust security based on their specific needs.
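
The contrast is easy to see in a toy packet filter: the stateless rule looks at each packet's headers in isolation, while the stateful check additionally requires inbound packets to match a connection we initiated. The packet fields here are simplified:

```python
ESTABLISHED = set()   # (remote_ip, remote_port) pairs we opened

def stateless_allow(pkt: dict) -> bool:
    return pkt["dst_port"] == 80          # decision from headers alone

def stateful_allow(pkt: dict) -> bool:
    if pkt["direction"] == "out":
        ESTABLISHED.add((pkt["dst_ip"], pkt["dst_port"]))
        return True
    # Inbound traffic is allowed only as a reply to a tracked connection.
    return (pkt["src_ip"], pkt["src_port"]) in ESTABLISHED

stateful_allow({"direction": "out", "dst_ip": "1.2.3.4", "dst_port": 80})
print(stateful_allow({"direction": "in", "src_ip": "1.2.3.4", "src_port": 80}))
```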

8. Wireless Link Characteristics

Wireless connections differ significantly from wired ones due to factors like interference, signal attenuation, and physical obstructions. These links are prone to higher error rates and variability in quality, necessitating specialized protocols for error correction and retransmission. Understanding these characteristics is essential for designing reliable wireless systems that can adapt to environmental changes and maintain service quality.

9. 802.11 MAC Protocol

The 802.11 MAC protocol underpins Wi-Fi networks by employing techniques like Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA). This method minimizes data collisions by ensuring that devices check for a clear channel before transmitting. Its design is crucial for efficiently managing the shared wireless medium and ensuring that multiple devices can communicate simultaneously without excessive interference.
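
The collision-avoidance part hinges on binary exponential backoff: after each failed attempt the contention window doubles, spreading retries out in time. A sketch with 802.11-style constants:

```python
import random

CW_MIN, CW_MAX, SLOT_US = 15, 1023, 9   # 802.11-style contention parameters

def backoff_slots(retries: int) -> int:
    # Contention window doubles per retry, capped at CW_MAX.
    cw = min(CW_MAX, (CW_MIN + 1) * (2 ** retries) - 1)
    return random.randint(0, cw)        # random slot count to wait

for attempt in range(4):
    print(f"attempt {attempt}: wait {backoff_slots(attempt) * SLOT_US} µs")
```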

10. Cellular vs. Wireline Networks

Cellular networks are engineered to support users on the move, offering seamless connectivity across wide geographical areas. They incorporate advanced features to handle handoffs between cells, manage interference, and optimize spectrum usage. In contrast, wireline networks, typically providing higher data rates and lower latency, are ideal for fixed locations. The differences in design and functionality highlight the trade-offs between mobility and performance.

11. 4G vs. 802.11 MAC

In 4G networks, scheduled access is used to allocate specific time slots or frequency bands to users, enhancing efficiency and reducing collisions in high-density scenarios. This controlled environment contrasts with the contention-based approach of 802.11 MAC in Wi-Fi networks, where devices compete for access to the medium. The scheduled access in 4G results in more predictable performance, which is particularly beneficial in congested environments, while the flexible, albeit less efficient, mechanism of 802.11 is well-suited for less predictable, local wireless communications.

12. Indirect Routing for Mobility

Indirect routing addresses the challenge of maintaining connectivity as mobile users change locations. By using mechanisms such as home agents and foreign agents, the network can dynamically forward packets from the user’s original address to their current location. This technique ensures that mobile devices remain reachable regardless of their movement across different networks, thereby providing a seamless user experience even in highly dynamic environments.