Networking

Key Points:

Extra readings:

Networking

Computers can trade data over the Internet. It's kind of a thing now. But what does that really mean?

Since this is a games class and not a networking class, we'll focus mostly on applications and practice here rather than theory. For a fuller background, please see the sockets book listed in Extra Readings (you have it free with O'Reilly via the library).

Computer networking is a mishmash of various layers and protocols, hardware and software; we're going to confine our attention to two specific parts of what's called the OSI model of networking:

  1. Physical layer (wires, radio waves)
  2. Link layer (ethernet frames, MAC addressing)
  3. Network layer (IP packets, routers and switches)
  4. Transport layer (TCP, UDP)
  5. Session layer (from here on up…)
  6. Presentation layer (it's kind of arbitrary…)
  7. Application layer (and a bit of a mess!)

Protocols like SMTP, HTTP, and friends tend to smear across layers 5-7.

If you're implementing a networked game, you first have to decide how one instance of the game will transmit information to other instances. There are three main choices:

  1. HTTP/web APIs to send information to and retrieve information from a centralized server. This is most appropriate for games that have a slow pace of networked communication and want to maximize interoperability (e.g., between platforms). We'll ignore this for now, but web requests in Rust are pleasant with libraries like ureq, isahc, or reqwest:

    let body: String = ureq::get("http://example.com")
        .set("Example-Header", "header value")
        .call()?
        .into_string()?;
    
  2. TCP sockets with a custom protocol, either with a centralized server or peer-to-peer.
  3. UDP sockets with a custom protocol, either with a centralized server or peer-to-peer.

Technically (1) is a special case of (2), and it could be appropriate for turn-taking games or if you're already comfortable with HTTP. You might consider writing a game server in Rust using one of the many HTTP server crates out there; Rocket, Actix, Warp, and Tide are popular options.

The rest of these notes will focus on options 2 and 3. But first off, what are the differences between TCP (Transmission Control Protocol) and UDP (User Datagram Protocol)?

Transport layer: TCP and UDP

TCP and UDP both live at the transport layer, which is built on top of the Internet Protocol (IP). Other protocols like HTTP, SMTP, and SSH are built on top of TCP; UDP underlies protocols like DNS (the Domain Name System), voice and video communication, and QUIC.

Quoth Wikipedia:

Transmission Control Protocol is a connection-oriented protocol and requires handshaking to set up end-to-end communications. Once a connection is set up, user data may be sent bi-directionally over the connection.

  • Reliable – TCP manages message acknowledgment, retransmission and timeouts. Multiple attempts to deliver the message are made. If data gets lost along the way, data will be re-sent. In TCP, there's either no missing data, or, in case of multiple timeouts, the connection is dropped.
  • Ordered – If two messages are sent over a connection in sequence, the first message will reach the receiving application first. When data segments arrive in the wrong order, TCP buffers the out-of-order data until all data can be properly re-ordered and delivered to the application.
  • Heavyweight – TCP requires three packets to set up a socket connection before any user data can be sent. TCP handles reliability and congestion control.
  • Streaming – Data is read as a byte stream, no distinguishing indications are transmitted to signal message (segment) boundaries.

User Datagram Protocol is a simpler message-based connectionless protocol. Connectionless protocols do not set up a dedicated end-to-end connection. Communication is achieved by transmitting information in one direction from source to destination without verifying the readiness or state of the receiver.

  • Unreliable – When a UDP message is sent, it cannot be known if it will reach its destination; it could get lost along the way. There is no concept of acknowledgment, retransmission, or timeout.
  • Not ordered – If two messages are sent to the same recipient, the order in which they arrive cannot be guaranteed.
  • Lightweight – There is no ordering of messages, no tracking connections, etc. It is a very simple transport layer designed on top of IP.
  • Datagrams – Packets are sent individually and are checked for integrity on arrival. Packets have definite boundaries which are honored upon receipt; a read operation at the receiver socket will yield an entire message as it was originally sent.
  • No congestion control – UDP itself does not avoid congestion. Congestion control measures must be implemented at the application level or in the network.
  • Broadcasts – Being connectionless, UDP can broadcast: sent packets can be addressed to be receivable by all devices on the subnet.
  • Multicast – a multicast mode of operation is supported whereby a single datagram packet can be automatically routed without duplication to a group of subscribers.

The important distinctions for our purposes are that TCP guarantees that packets arrive in order and that all packets arrive (or the connection errors out). TCP also mandates congestion control and other features. All these features come at a cost: latency. UDP makes no such guarantees (packets might arrive out of order or not at all, although an individual packet will always fully arrive or not arrive at all). Nothing comes for free; TCP achieves reliability by asking the sender to re-send lost packets, and it gets ordering by waiting around for missing packets in the stream.

For certain games, TCP is a natural choice. A turn-taking game, for example, doesn't have strong latency requirements but would benefit from the reliability and ordering guarantees of TCP. But for other games, TCP is not appropriate; consider an FPS, where players give dozens of inputs per second—if we miss one, it doesn't really matter since the player will be off doing something else. So, to a rough approximation, the choice of UDP vs TCP depends on how frequently inputs need to be sent between players, and whether old inputs are still important after new ones arrive.

Protocols

Separately from the question of transport protocol, we also need to consider how applications send data over network sockets. At the most basic level, communication looks like this (sketched in code after the list):

  1. One side of the communication (the host) opens (binds) a socket and listens for new connections on some address.
  2. The other side of the communication (the client) opens a socket. For TCP, the client will connect to the host here; if it's a UDP socket, the client can start sending data to the host address right away.
  3. For TCP games, when a connection is established on the host (accepted), a new socket is created on the host and hooked up to that client. For UDP games, this is not necessary.
  4. Once the connection is established (for TCP) or once the sockets are both opened (for UDP), either side can send and receive data from the other.
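
For concreteness, here's a minimal sketch of that lifecycle using the standard library's blocking TCP API (the address and the one-byte message are made up for illustration):

use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};

fn host() -> std::io::Result<()> {
    // Step 1: bind a socket and listen on some address
    let listener = TcpListener::bind("127.0.0.1:8000")?;
    // Step 3: accepting yields a fresh socket hooked up to this client
    let (mut socket, client_addr) = listener.accept()?;
    println!("client connected from {client_addr}");
    // Step 4: either side can now send and receive
    let mut byte = [0u8; 1];
    socket.read_exact(&mut byte)?;
    socket.write_all(&byte)
}

fn client() -> std::io::Result<()> {
    // Step 2: open a socket and connect to the host
    let mut socket = TcpStream::connect("127.0.0.1:8000")?;
    socket.write_all(&[42])?;
    let mut byte = [0u8; 1];
    socket.read_exact(&mut byte)?;
    println!("got back {}", byte[0]);
    Ok(())
}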

But what data are actually sent over the wire? If you're using UDP, you might want to use some bytes to help you implement ordering, reliability, or other features. Whether you're using TCP or UDP, you might want to encrypt the data you're sending (maybe using TLS or some other protocol). But eventually, you need to decide what data you're sending. It's probably worth spending some time thinking about your protocol: whether it's plain text or binary first of all (mostly a consideration for TCP), and then what data you're sending. Are you transmitting each move being made, or the whole game's state every so often? Do you have multiple types of messages (try an enum!)? How do you serialize your complex game data for network transmission (maybe use the serde crate or protocol buffers or capnproto)?
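
One concrete framing choice for a binary TCP protocol is a length prefix: every message is preceded by a four-byte big-endian length, so the receiver always knows where one message ends and the next begins. A minimal sketch with the blocking API (the helper names here are ours, not from any library):

use std::io::{Read, Write};
use std::net::TcpStream;

// Write one frame: 4 bytes of big-endian payload length, then the payload.
fn send_frame(sock: &mut TcpStream, payload: &[u8]) -> std::io::Result<()> {
    sock.write_all(&(payload.len() as u32).to_be_bytes())?;
    sock.write_all(payload)
}

// Read one frame back: first the length, then exactly that many bytes.
fn recv_frame(sock: &mut TcpStream) -> std::io::Result<Vec<u8>> {
    let mut len = [0u8; 4];
    sock.read_exact(&mut len)?;
    let mut payload = vec![0u8; u32::from_be_bytes(len) as usize];
    sock.read_exact(&mut payload)?;
    Ok(payload)
}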

Async Rust

Recommended reading: Asynchronous Programming in Rust.

Rust has a major language feature that we've only slightly touched on so far: asynchronous (async) functions, supported by the standard library's Future trait. A future is a computation-in-flight represented as a data structure. If we have a future F, we can do a few things with it (as our rendering code does): for example, we can wait for it to be ready, or we can create a new future G and schedule it for when F is finished. We can write async functions that, when called, produce Futures (examples from the async Rust book):

// `foo()` returns a type that implements `Future<Output = u8>`.
// `foo().await` will result in a value of type `u8`.
async fn foo() -> u8 { 5 }

We can also write async blocks that produce Futures:

use std::future::Future;

fn bar() -> impl Future<Output = u8> {
    // This `async` block results in a type that implements
    // `Future<Output = u8>`.
    async {
        let x: u8 = foo().await;
        x + 5
    }
}

Note that in order to await foo, bar has to use an async block and produce a future; .await can only appear inside async functions or blocks. Moreover, within that async block, just calling foo() doesn't yield a number we can do anything with; it yields a future, and we must wait for the number to be produced using .await.
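
A detail worth internalizing: futures are lazy. Calling an async fn does no work by itself; nothing runs until the resulting future is awaited (or otherwise polled). A tiny sketch (the name baz is ours):

async fn baz() {
    let f = foo();       // nothing has run yet; `f` is just a Future
    let n: u8 = f.await; // now foo's body actually executes and yields 5
    assert_eq!(n, 5);
}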

So what's the point of all this? Well, let's imagine some networking pseudo-code:

let socket = open_socket();
let message = socket.recv();
socket.send(format!("You said {message}"));
println!("Done!");

The second line will block: it will wait around until a message is received. That means we're not rendering, or processing input, or anything until we get data over the network, which is famously super slow!

We could try to make it better:

let socket = open_socket();
// ... later, near our game update code ...
if socket.ready() {
    let message = socket.recv();
    socket.send(format!("You said {message}"));
    println!("Done!");
}

But… what if send blocks too? We'd need to poll separately for whether we're allowed to send on the socket, and then send information (maybe a little bit at a time!). Even worse, if we are using TCP we might not recv a complete message at once, so we'd need buffers and state all over the place. Plus, we've had to pollute our game update code with something like process_network_justabit() calls. Gnarly!

You might say at this point, "what about multithreading?" We could use an OS thread (std::thread) to handle network communications, and then use a mutex or something to update game state across threads. Sure! But that's a bit heavyweight and maybe even error-prone.
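
For the record, that thread-based approach might look something like this sketch: a dedicated network thread receives datagrams and hands them to the game loop over a channel (here std's mpsc rather than a mutex; the function name is ours):

use std::net::UdpSocket;
use std::sync::mpsc;
use std::thread;

// Spawn a thread that forwards every received datagram to the game loop.
fn spawn_net_thread(sock: UdpSocket) -> mpsc::Receiver<Vec<u8>> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let mut buf = [0u8; 65536];
        loop {
            let Ok((len, _addr)) = sock.recv_from(&mut buf) else { return };
            if tx.send(buf[..len].to_vec()).is_err() {
                return; // the game loop dropped its receiver; we're done
            }
        }
    });
    rx
}

Each frame, the game loop would drain the channel with something like for msg in rx.try_iter() { ... }.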

Instead, our solution will rely on the observation that most of the time, the networking code is doing nothing at all except waiting around. We are bound by I/O latency, not by CPU time, and it's for exactly this type of problem (loading files, network I/O) that async Rust (or JavaScript promises, or Go goroutines) is intended.

The Rust standard library defines the Future trait and ways to create futures from code. We can call poll() on any future to drive it forward, but the standard library provides no scheduler to actually run futures to completion (since the "right" way will vary from application to application); that's the job of an async runtime. Two popular runtimes for async Rust are smol and tokio; we'll use tokio in the examples here.

For your game engine, you'll want to add tokio as a dependency with only particular features enabled:

tokio = { version = "1", features = ["rt-multi-thread", "net", "io-util"] }

You could say features = ["full"] but it's good hygiene in a library like your engine crate not to overcommit on dependencies.

Now we can write a simple UDP echo service (tweaked from the tokio tutorial and API docs):

use tokio::net::UdpSocket;
// UDP packets can't be larger than this
const UDP_MAX_SIZE: usize = 65536;
async fn run_server() {
    // Bind the socket to the address
    let sock = UdpSocket::bind("0.0.0.0:8080").await.unwrap();
    let mut buf = [0; UDP_MAX_SIZE];
    loop {
        // we got a message!
        let (len, addr) = sock.recv_from(&mut buf).await.unwrap();
        println!("{:?} bytes received from {:?}: ", len, addr, );
        // we send the same thing right back!
        let len = sock.send_to(&buf[..len], addr).await.unwrap();
        println!("{:?} bytes sent", len);
    }
}

You may notice there's no main() function there. If you use tokio's macros feature you can write something like:

#[tokio::main]
async fn main() {
    run_server().await
}

But in a real situation, you might not want your main to be an async fn, and you may want more control over the tokio runtime. See the tokio docs for more info here.
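
For instance, a sketch of driving the echo server from a synchronous main with a hand-built runtime:

fn main() {
    // Build the runtime explicitly instead of using #[tokio::main]
    let rt = tokio::runtime::Builder::new_multi_thread()
        .enable_all()
        .build()
        .unwrap();
    // Block this thread until the server future completes
    rt.block_on(run_server());
}

You could also hold onto the runtime and rt.spawn(...) networking tasks while your game loop keeps running on the main thread.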

Note also that UDP datagrams have a maximum size (at most 65,507 bytes of payload over IPv4, and often much less in practice, since datagrams larger than the path's MTU may be fragmented or dropped), and if you try to send or recv something too large you might see an error or data might just get discarded!

The client side might look a bit like this; it repeatedly sends the time elapsed since program start to the server.

use std::net::SocketAddr;
use tokio::net::UdpSocket;

// UDP packets can't be larger than this
const UDP_MAX_SIZE: usize = 65536;
async fn run_client() {
    let start = std::time::Instant::now();
    // Bind the socket to some arbitrary local port
    let sock = UdpSocket::bind("127.0.0.1:0").await.unwrap();
    let addr = "0.0.0.0:8080".parse::<SocketAddr>().unwrap();
    // "connect" it to the remote host.  This doesn't form
    // a connection in the TCP sense, but it does tie this socket
    // to that port so we can just use send and recv.
    sock.connect(addr).await.unwrap();
    let mut buf = [0_u8; UDP_MAX_SIZE];
    loop {
        // send a message...
        let since = start.elapsed().as_micros();
        // NB: native-endian bytes only work here because both ends run on the
        // same machine; a real protocol should pick an endianness (e.g. to_be_bytes)
        let bytes = since.to_ne_bytes();
        buf[..bytes.len()].copy_from_slice(&bytes);
        sock.send(&buf[..bytes.len()]).await.unwrap();
        // wait to get it back!
        let len = sock.recv(&mut buf).await.unwrap();
        println!("{:?} bytes received from {:?}", len, addr);
    }
}

And here's how it would look with TCP (but remember, we get ordering and reliability):

use tokio::net::{TcpListener, TcpStream};
use tokio::io::{AsyncBufReadExt, BufReader, AsyncWriteExt};
async fn run_server() {
    // Bind the listener to the address
    let listener = TcpListener::bind("0.0.0.0:8000").await.unwrap();
    loop {
        // The second item contains the IP and port of the new connection.
        let (socket, _) = listener.accept().await.unwrap();
        // Spawn a task to handle this client; we loop right away to accept the next one!
        tokio::spawn(async move {
            process(socket).await
        });
    }
}
async fn process(mut socket: TcpStream) {
    let (reader, mut writer) = socket.split();
    // BufReader is like Java's BufferedReader, a big help when dealing with TCP
    let mut reader = BufReader::new(reader);
    let mut buf = String::new();
    loop {
        // This protocol will be a line-based protocol.
        // We could instead, say, read one byte for a "message type code"
        // and then read as many bytes as we ought to for that type of message,
        // or read a couple bytes representing message length and then read
        // that many bytes.
        match reader.read_line(&mut buf).await {
            Ok(0) => return,
            Err(_err) => return,
            Ok(n) => {
                println!("Got {n} bytes: {buf}");
                writer.write_all(buf[..n].as_bytes()).await.unwrap();
                buf.clear();
            }
        }
    }
}
And here's the matching echo client:

use tokio::net::TcpStream;
use tokio::io::{AsyncBufReadExt, BufReader, AsyncWriteExt};

async fn echo_client() {
    let mut sock = TcpStream::connect("127.0.0.1:8000").await.unwrap();
    let (reader, mut writer) = sock.split();
    let mut reader = BufReader::new(reader);
    let mut buf = String::new();
    loop {
        // This time, instead of sending a timestamp, we'll send
        // whatever the user types at the console.
        // NB: std::io::stdin() blocks the whole thread; a real game might use
        // tokio::io::stdin() instead, but this is fine for a demo
        std::io::stdin().read_line(&mut buf).unwrap();
        writer.write_all(buf.as_bytes()).await.unwrap();
        // clear buf and read a response from the server
        buf.clear();
        match reader.read_line(&mut buf).await {
            Ok(0) => return,
            Err(_err) => return,
            Ok(n) => {
                println!("Got {n} bytes: {buf}");
                buf.clear();
            }
        }
    }
}

Now, your game won't be sending just timestamps or just arbitrary strings. You'll have some enum:

enum ClientMessage<GameClientMessage> {
    Connect { username: String, password: String }, // or whatever!
    GetPlayerData(PlayerID),
    JoinLobby { name: String },
    CreateLobby { name: String },
    SendChat(String),
    // ...
    GameSpecific(GameClientMessage),
}
enum ServerMessage<GameServerMessage> {
    PlayerJoined(PlayerID),
    PlayerLeft(PlayerID),
    GameStarted,
    GameEnded,
    ChatMessage(PlayerID, String),
    // ...
    GameSpecific(GameServerMessage),
}

You could serialize such an enum to bytes (or a JSON string) using a library like serde (paired with a format crate like bincode or serde_json) or capnp. Often these kinds of libraries allow you to #[derive(...)] the serialization support for your types, which is pretty handy!
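
For instance, a minimal sketch with serde's derive macros and the bincode 1.x API (this assumes serde with its derive feature and bincode in your Cargo.toml, and uses a simplified, non-generic ServerMessage):

use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, Debug, PartialEq)]
enum ServerMessage {
    PlayerJoined(u64),
    ChatMessage(u64, String),
}

fn main() {
    let msg = ServerMessage::ChatMessage(7, "hello!".into());
    // Encode to a compact byte vector, ready to go over the wire...
    let bytes: Vec<u8> = bincode::serialize(&msg).unwrap();
    // ...and decode it again on the receiving side.
    let round_trip: ServerMessage = bincode::deserialize(&bytes).unwrap();
    assert_eq!(msg, round_trip);
}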

Activity: Think of what the client and server messages should be like for the following games:

  1. A billiards game, where players move by deciding on a power level along with an xy (horizontal) and a yz (vertical) angle around the cue ball.
  2. A rock-paper-scissors game
  3. A multiplayer game of tag

Game Network Architecture

One overarching consideration when implementing game networking is: who do you trust? Consider these four scenarios for a Minecraft-like game with players running around a world:

  1. To move, clients transmit control change events like "change direction" and "change speed".
  2. To move, clients transmit their current position twice per second.
  3. To craft items, clients transmit "craft item X" messages. To use items, clients transmit "used item Y" messages.
  4. Clients transmit "lost item X" and "got item Y" messages, and crafting is a sequence of losses and gains.

Activity: Every one of these possibilities has risks from a cheating standpoint. Take a few minutes and figure out what could go wrong in each case.

In conclusion, trust no one, software is awful, and you should throw your computer into the nearest ocean at your earliest convenience.

Really resolving these kinds of problems in an airtight way is difficult. So I recommend avoiding them as much as possible, because ultimately anti-cheating is a cat-and-mouse game and you're playing against, uh, gamers.

One way to avoid solving cheating is to only allow players to play with people they know and forget about having an open-ended competitive scene. Then cheating is fine, and even fun! If, for example, there are no public lobbies and players advertise games/servers themselves, it's no problem. Another way to limit cheating is to have as little computation as possible happen on the client, preferring to validate every action and simulate every result on the server, and never sending a player any information that they "shouldn't" be able to see. Then the clients are essentially just fancy displays for the game state.

For now, we'll mostly ignore cheating and focus just on the technical issues of getting games to share state at all.

Broadly, there are two dominant architectures we've touched on already: client-server and peer-to-peer. In a client-server setup, one machine (which might just be a normal copy of the game, or might be a dedicated server) is in charge of running the game and synchronizing states across clients, and many clients connect to this server to play (sometimes, the player at the machine running the server is also connecting as a client). In a peer-to-peer architecture, clients connect directly to each other and share gameplay updates without a designated server (although in practice, one client may be the game "host" and have special permissions to e.g. kick users or change game modes). You might also see hybrids—clients might all connect to one server, but clients which are close to each other in-game might connect directly to swap updates with lower latency.

Besides the topology of how games connect to each other, we also need to consider what information is transferred between players. Two general approaches are (1) to send changes and (2) to synchronize state. A game might use both (snapshots of state with changes transmitted in between) or commit to one or the other; it may be that clients send changes and the server sends states or vice versa; but the important thing to know is how these two approaches work.

In the first case, we send changes: if a player changes their direction of movement or wants their character to move to a particular position, we transmit that action. If the server and all clients have the property that seeing the same sequence of actions will produce the same state (determinism), this can be very efficient. If not, the server may need to report back to the client (and to other clients) the result of executing that action, which is a matter of synchronizing state. Determinism has many benefits: it makes games easier to test and debug, it allows for efficient save-file and networking formats, and it means you can easily implement things like time-travel features. But it also has drawbacks: it can be complicated to prove a game deterministic (especially if floating-point math on distinct computers is involved, or if actions are time-sensitive), and it can lead to cheating issues if pseudo-random numbers are involved. It's often simpler to make a best effort at determinism on short time-scales (like a few seconds) and use state synchronization to supplement it.

In state synchronization, we send world updates (or complete states-of-the-world) from client to server or (more often) from server to client, and clients revise their internal game state datastructures to match the information from the server. This can lead to judder (where an opponent seems to teleport or rapidly slide from one position to another) or jitter (where the player moves a bit, but the server has sent a report of a different position which took a while to arrive). On the plus side, it definitely brings clients into sync with each other even if they are running at different framerates, and it's relatively easy to graft onto existing engines.
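
In message terms, the two approaches might look like the sketch below (the Action and WorldState types are placeholders, like PlayerID above):

enum NetMessage {
    // (1) send changes: just the action a player performed, stamped with a tick
    Input { player: PlayerID, tick: u64, action: Action },
    // (2) synchronize state: a snapshot of (part of) the world
    Snapshot { tick: u64, world: WorldState },
}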

An important tool for minimizing perceived latency in both regimes, but especially in state synchronization, is client-side prediction. If a player starts moving in a new direction, they should not have to wait for the server to tell them what their new position is; they should be able to simulate it locally and then be corrected, if necessary, by the server. Likewise for a player's opponents: we can assume that they will continue moving in their current direction of motion (extrapolating their position from their velocity), and if the server sends us an update, we can visually interpolate between the client's belief about the opponent's position and their actual position in the world over the course of some true-up interval. More advanced techniques abound, but these notes are already getting pretty long!
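
To make client-side prediction concrete, here's a rough sketch of dead reckoning plus corrective interpolation for a remote player; the type, its fields, and the true-up constant are all invented for illustration:

// How long we take to blend toward the server's authoritative position
const TRUE_UP_SECONDS: f32 = 0.2;

struct RemotePlayer {
    shown_pos: [f32; 2],  // where we currently draw this player
    server_pos: [f32; 2], // latest authoritative position, extrapolated forward
    vel: [f32; 2],        // latest reported velocity
}

impl RemotePlayer {
    // Called whenever a state update arrives from the server
    fn on_server_update(&mut self, pos: [f32; 2], vel: [f32; 2]) {
        self.server_pos = pos;
        self.vel = vel;
    }
    // Called once per frame with the frame's delta-time in seconds
    fn update(&mut self, dt: f32) {
        // Extrapolate: assume the player keeps moving at their last known velocity
        self.server_pos[0] += self.vel[0] * dt;
        self.server_pos[1] += self.vel[1] * dt;
        // Interpolate: ease the drawn position toward the extrapolated estimate
        let t = (dt / TRUE_UP_SECONDS).min(1.0);
        self.shown_pos[0] += (self.server_pos[0] - self.shown_pos[0]) * t;
        self.shown_pos[1] += (self.server_pos[1] - self.shown_pos[1]) * t;
    }
}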