// notes from the open web

This site began in 2008 as a companion to a book about SMIL, the W3C standard for synchronized multimedia. The original project eventually wound down, but the domain stayed with us, so we kept writing here whenever something in open web technology or media engineering felt worth the trouble of putting into words.

The News Archive Problem

A news article looks like a stable object to the reader, since it has a URL and a headline and a timestamp sitting at the top of the page, but the version you are looking at right now is usually only one of several that the publisher has served from that same address in the course of the day. The headline gets reworked once or twice in the first hour as the desk settles on a line, the lede shifts as the story develops or as a competing outlet reframes it, and a few months later the same URL may resolve to a paywall, a 404, or a quietly edited version with a paragraph removed and no public record that anything changed. The web is set up to treat news as a stream that flows past the reader, and treating that same stream as a durable record is a separate engineering problem which, to a surprising degree, nobody has fully solved.

The Internet Archive's Wayback Machine is the closest thing the open web has to a default answer, and the work it does is genuinely extraordinary, but it was not built around the specific shape of a news story. It captures pages on a crawl schedule rather than on edits, which means the version of a headline you are reading at 14:03 in the afternoon may never have been captured at all if the crawler happened to visit at 09:00 and then again the following morning. It also does not structure its captures around the entity "this article" in the way a news-specific archive would, so reconstructing the edit history of a single piece across snapshots is a manual job that involves diffing arbitrary HTML and inferring which changes were editorial and which were boilerplate.

There has been a smaller and more specialised set of projects working at the news layer specifically, and they are interesting to read about even when, or perhaps especially when, they have stopped running. The GDELT Project ingests news across languages and regions and exposes the result as an enormous queryable dataset, which has been valuable to researchers studying media patterns but was never meant to function as a reading interface for a curious human. NewsDiffs, which ran out of MIT and Stanford for several years in the early 2010s, tracked headline and body-text changes on the New York Times, Washington Post, BBC and a handful of others and made the diffs browsable in a way that exposed, sometimes uncomfortably, the editorial choices a newsroom would rather have made invisible. The project went dormant, but the instinct behind it was the right one, and you can still find writers and researchers referencing the captures it produced.

What feels more interesting in the present generation is a group of projects that treat the aggregation surface itself as the primary artifact rather than as scaffolding for something else. The Hear is a working example of this approach: a multi-language headline aggregator that pulls timestamped updates from a configurable set of regional and international outlets, retains what it captures so the reader can step backwards in time, and lets you change country or language without losing the temporal axis you were reading along. The architectural choice underneath, which is to store the stream as a first-class object and then expose the archive as a navigable surface on top of it, is the part that makes this generation of tools different from the simple RSS readers of fifteen years ago, and it sits in the same intellectual neighbourhood as the Common Crawl news subset and the older Memex-style web-clipping tools, applied specifically to the moment-by-moment headline layer that legacy archives tend to flatten.

None of this is anywhere near a solved problem, of course, because the legal questions around archiving paywalled content, the storage questions around whether to keep every edit or only periodic snapshots, and the interface questions around how to present a story with fourteen versions to a reader who only has time for one are all genuinely difficult and largely unresolved. What seems clear is that the projects taking the headline layer seriously as something worth preserving on its own terms, rather than as a side effect of general-purpose web crawling, are quietly doing the work that the open web has needed someone to do for a long time.

What a Video Container Actually Is (And Why It Matters)

The container and the codec are two different things. They get conflated because they travel together, but confusing them causes real problems when video breaks in ways you can't explain. An .mp4 file is not a codec. H.264 is a codec. The file is H.264-encoded video (and probably AAC-encoded audio) wrapped in the MPEG-4 Part 14 container. The container is the packaging. The codec is the compression algorithm for the data inside.

The container holds multiple streams — video, audio, subtitles, chapter markers, metadata — and gives a player the structure it needs to navigate and synchronize them. It records which codec encoded each stream, how the streams line up in time, where the keyframes are (so seeking works without decoding from the start), and whatever descriptive metadata the file carries. The container is a file format specification. The codec is an algorithm.

This distinction matters most when something doesn't play. Browser compatibility questions are almost always really two questions: does the browser support the container, and does it support the codec inside? A browser can parse an MP4 container without supporting every codec that can legally go inside one. When video plays in Chrome but fails in Firefox, or works on desktop but not iOS, the codec is usually where the problem is. MP4 and WebM have broad container support. What's inside determines whether a given decoder can handle it.

Three containers cover most web video. MP4 (formally MPEG-4 Part 14, built on the ISO Base Media File Format, or ISOBMFF) is the format most video comes out of: cameras, phones, production tools. It can hold H.264, H.265/HEVC, AV1, and others. WebM is Google's open container, paired with VP8, VP9, or AV1 video and Vorbis or Opus audio, designed specifically for web delivery. HLS and DASH use a variant called fragmented MP4 (fMP4): regular MP4 restructured so that the codec configuration lives in a short initialization segment (a small "moov" atom with no sample tables) and each media segment carries its own timing and offset metadata, so a player can start on any segment without first reading an index for the whole file.

A standard MP4 file often has its moov atom at the end; ffmpeg writes it there by default because the sample tables it contains can't be finalized until encoding finishes. For progressive download, you move it to the front with ffmpeg's -movflags +faststart, which lets the browser start playing before the full file arrives. That works fine for single-file delivery. For adaptive streaming with HLS or DASH, where the player is fetching separate two-to-six-second segments, a single central index doesn't fit: each segment has to be playable without sample tables stored somewhere else in the file. Fragmented MP4 handles this by putting the timing and offset information into a "moof" box at the head of each segment, leaving only a small initialization segment (codec configuration, no sample tables) to be fetched once up front.
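To see which of these layouts a given file has, you can walk its top-level boxes. The following is a minimal sketch, assuming the whole file (or at least its box headers) is already in memory as a Uint8Array; the function names and the layout labels are ours, not part of any library.

```typescript
// Each top-level ISOBMFF box starts with a 4-byte big-endian size and a
// 4-byte ASCII type. A size of 1 means a 64-bit size follows the type;
// a size of 0 means the box runs to the end of the file.
interface BoxInfo {
  type: string;
  offset: number;
  size: number;
}

function listTopLevelBoxes(bytes: Uint8Array): BoxInfo[] {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const boxes: BoxInfo[] = [];
  let offset = 0;
  while (offset + 8 <= bytes.byteLength) {
    let size = view.getUint32(offset);
    let type = "";
    for (let i = 4; i < 8; i++) {
      type += String.fromCharCode(view.getUint8(offset + i));
    }
    if (size === 1) {
      if (offset + 16 > bytes.byteLength) break;
      // 64-bit size, combined from two 32-bit halves (exact up to 2^53).
      size = view.getUint32(offset + 8) * 4294967296 + view.getUint32(offset + 12);
    } else if (size === 0) {
      size = bytes.byteLength - offset;
    }
    if (size < 8) break; // malformed header: stop instead of looping forever
    boxes.push({ type, offset, size });
    offset += size;
  }
  return boxes;
}

// Hypothetical labels for the three layouts discussed above.
type Mp4Layout = "fragmented" | "faststart" | "moov-at-end" | "unknown";

function classifyLayout(boxes: BoxInfo[]): Mp4Layout {
  const types = boxes.map((b) => b.type);
  if (types.indexOf("moof") !== -1) return "fragmented";
  const moov = types.indexOf("moov");
  const mdat = types.indexOf("mdat");
  if (moov === -1 || mdat === -1) return "unknown";
  return moov < mdat ? "faststart" : "moov-at-end";
}
```

This only reads box headers, so it is cheap to run on the first few kilobytes of a file, though a moov at the end will of course not show up until the tail has been read.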

Codec licensing is where it gets complicated. H.264 is the most widely deployed video codec ever built, but it's covered by patents administered through the MPEG LA patent pool (now part of Via LA). Browsers ship H.264 support either by licensing a decoder or by leaning on one the operating system or hardware already provides, which is why it works everywhere. H.265/HEVC compresses better, roughly half the bitrate of H.264 at equivalent quality, but is messier legally, with multiple competing patent pools and higher per-unit royalties. Browser support is patchier as a result: Safari supports it, Chrome does on some platforms, and Firefox has historically not shipped a decoder of its own. AV1 was produced by the Alliance for Open Media to give the industry a royalty-free path. Browser support is solid across Chrome, Firefox, and Edge; Safari plays it only on Apple devices with a hardware AV1 decoder (the A17 Pro and M3 generations onward).

The codec string in the type attribute is worth actually understanding rather than copying from Stack Overflow. For H.264, `avc1.42E01E` breaks down like this: avc1 is the codec identifier; 42 is the profile byte (0x42 = 66 decimal, the Baseline profile); E0 is the constraint-flags byte, which here narrows it to Constrained Baseline; 1E is the level (0x1E = 30 decimal, meaning Level 3.0, which tops out around standard definition: 720×480 at 30fps or 720×576 at 25fps). 720p at 30fps needs Level 3.1 (`avc1.42E01F`), and serving 1080p requires at least Level 4.0, for example `avc1.640028` (High profile, Level 4.0). The browser can check these values before fetching anything. A device that reports it can't handle Level 4.0 skips that source entirely rather than downloading it and choking partway through, which is the actual reason to be specific rather than just writing `video/mp4`.
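The breakdown is mechanical enough to write down. Here is a small sketch of a parser for the `avc1.PPCCLL` form; the function and field names are our own, it only names a handful of common profiles, and it ignores special cases such as Level 1b.

```typescript
interface AvcCodecInfo {
  profileIdc: number;      // first hex byte, e.g. 0x42 = 66
  constraintFlags: number; // second hex byte, e.g. 0xE0
  levelIdc: number;        // third hex byte, e.g. 0x1E = 30
  profileName: string;
  level: string;           // levelIdc / 10, e.g. "3.0"
}

function parseAvcCodecString(codec: string): AvcCodecInfo | null {
  const match = /^avc[13]\.([0-9A-Fa-f]{2})([0-9A-Fa-f]{2})([0-9A-Fa-f]{2})$/.exec(codec);
  if (match === null) return null;

  const profileIdc = parseInt(match[1], 16);
  const constraintFlags = parseInt(match[2], 16);
  const levelIdc = parseInt(match[3], 16);

  let profileName: string;
  switch (profileIdc) {
    case 66:
      // Baseline with constraint_set1_flag (0x40) set is Constrained Baseline.
      profileName = (constraintFlags & 0x40) !== 0 ? "Constrained Baseline" : "Baseline";
      break;
    case 77:
      profileName = "Main";
      break;
    case 88:
      profileName = "Extended";
      break;
    case 100:
      profileName = "High";
      break;
    default:
      profileName = "profile_idc " + profileIdc;
  }

  return {
    profileIdc,
    constraintFlags,
    levelIdc,
    profileName,
    level: (levelIdc / 10).toFixed(1),
  };
}
```

Calling `parseAvcCodecString("avc1.640028")` yields High profile at level "4.0", and anything that is not an `avc1`/`avc3` string with six hex digits returns null.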

For web delivery: H.264 in MP4 is still the safe baseline. Add VP9 or AV1 in WebM if you want better compression or want out of the licensing situation, with explicit type attributes on each <source> so the browser can decide without fetching. For adaptive streaming, fMP4 segments are the right call for both HLS and DASH — the MPEG-TS segments that older HLS used are mostly legacy at this point, and every player worth using handles fMP4.
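The selection logic a browser applies to a list of sources is roughly "first one I can play wins". A sketch of that idea follows, with the support check injected so it can run anywhere; the file names are placeholders, and in a page the probe could be something like `(t) => video.canPlayType(t) !== ""` or `MediaSource.isTypeSupported`.

```typescript
interface SourceCandidate {
  src: string;
  type: string; // full MIME type including the codecs parameter
}

// Injected support check, so the ordering logic does not depend on a DOM.
type Probe = (mimeType: string) => boolean;

// Candidates listed most-preferred first, the same order you would
// write the <source> elements in. The URLs are hypothetical.
const candidates: SourceCandidate[] = [
  { src: "clip-av1.webm", type: 'video/webm; codecs="av01.0.08M.08, opus"' },
  { src: "clip-vp9.webm", type: 'video/webm; codecs="vp09.00.40.08, opus"' },
  { src: "clip-h264.mp4", type: 'video/mp4; codecs="avc1.640028, mp4a.40.2"' },
];

function pickFirstPlayable(sources: SourceCandidate[], canPlay: Probe): SourceCandidate | null {
  for (const source of sources) {
    if (canPlay(source.type)) return source;
  }
  return null;
}
```

Putting the H.264 entry last is the design choice that matters: devices that can decode something more efficient never reach it, and everything else falls through to the baseline.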

WebCodecs and the End of the Black Box

For most of the web's history, video in the browser worked one way: you handed a URL to a <video> element and the browser handled everything. It fetched the bytes, demuxed the container, decoded the frames, composited them onto the page. The pipeline was completely opaque. If you needed to do anything outside it — process frames before display, encode video in the browser, build a player with custom buffering — you were fighting the platform rather than using it.

The WebCodecs API opens that up. It's a low-level interface to the browser's codec infrastructure — the same H.264 and VP9 decoders <video> uses internally, now reachable from JavaScript. You get individual frames. You control the decode pipeline. You can feed encoded chunks in any order, implement your own buffering, read pixel data off frames before they're composited, encode camera input to a compressed stream without routing through an intermediate canvas.

The API shipped in Chrome 94 in 2021 and landed in Firefox at version 130. It's not meant for ordinary media playback — <video> still does that better with less code. It's for applications that need to work below the abstraction: browser-based video editors, conferencing tools that need frame-level access for effects or background replacement, streaming platforms managing their own adaptive bitrate logic, and anything doing real-time analysis or transformation of video in the browser.

The core objects are VideoDecoder, VideoEncoder, AudioDecoder, and AudioEncoder. Each is constructed with two callbacks, one for output chunks or frames and one for errors, and then given a configuration through configure(): a codec string plus whatever applies to that object, such as coded dimensions for video decoding, or dimensions, bitrate, and frame rate for video encoding. Keyframe placement is not part of the encoder configuration; you request a keyframe per call by passing { keyFrame: true } to encode(). For decoding, you construct the decoder with those callbacks, call configure() with the codec parameters, then feed it EncodedVideoChunk objects through decode(). Each chunk carries the encoded bytes, a timestamp in microseconds, a duration, and a type field that's either "key" for keyframes or "delta" for everything else. The decoder calls your output callback with a VideoFrame for each one it successfully decodes.
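Most of the glue code between a demuxer and the decoder is unit conversion. Below is a sketch of that step, assuming a demuxer that reports sample times in track timescale units along with a sync flag; the DemuxedSample shape is hypothetical, and ChunkInit simply mirrors the fields described above.

```typescript
// Hypothetical shape of what a demuxer hands back for one sample.
interface DemuxedSample {
  data: Uint8Array;
  presentationTime: number; // in timescale units
  duration: number;         // in timescale units
  isSync: boolean;          // true for keyframes
}

// Mirrors the fields an EncodedVideoChunk is built from.
interface ChunkInit {
  type: "key" | "delta";
  timestamp: number; // microseconds
  duration: number;  // microseconds
  data: Uint8Array;
}

function toMicroseconds(value: number, timescale: number): number {
  return Math.round((value * 1e6) / timescale);
}

function toChunkInit(sample: DemuxedSample, timescale: number): ChunkInit {
  if (timescale <= 0) {
    throw new RangeError("timescale must be positive");
  }
  return {
    type: sample.isSync ? "key" : "delta",
    timestamp: toMicroseconds(sample.presentationTime, timescale),
    duration: toMicroseconds(sample.duration, timescale),
    data: sample.data,
  };
}
```

In a browser, each result can be passed to `new EncodedVideoChunk(init)` and then to the decoder's decode() method.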

The VideoFrame you get back carries a timestamp and duration matching what you put in, a format field describing the pixel layout (usually I420 or NV12 for YUV, RGBA for others), and separate codedWidth/codedHeight and displayWidth/displayHeight fields; those can differ if the encoder used crop offsets or non-square pixels. You read pixel data with copyTo(), passing an ArrayBuffer or typed array sized with allocationSize() and an optional layout descriptor to control plane offsets and strides; it returns a Promise that resolves once the copy is done. One thing that catches people: call close() on frames when you're done with them. Each frame holds onto decoder or GPU memory, and waiting for the garbage collector to release it is not reliable. Let enough pile up and the decoder runs out of output buffers, which tends to show up as a stalled pipeline or an opaque error rather than anything helpful.
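It helps to know what an I420 buffer looks like before asking the browser to fill one. The sketch below computes a tightly packed layout (one full-resolution Y plane followed by half-resolution U and V planes); the function name is ours, and in real code the frame's own allocationSize() is the authority on how large the destination must be.

```typescript
interface PlaneLayout {
  offset: number; // byte offset of the plane within the buffer
  stride: number; // bytes per row
}

interface I420Layout {
  byteLength: number;
  planes: PlaneLayout[]; // Y, U, V in that order
}

function tightI420Layout(codedWidth: number, codedHeight: number): I420Layout {
  // Chroma planes are subsampled by two in both directions, rounded up.
  const chromaWidth = Math.ceil(codedWidth / 2);
  const chromaHeight = Math.ceil(codedHeight / 2);
  const ySize = codedWidth * codedHeight;
  const chromaSize = chromaWidth * chromaHeight;
  return {
    byteLength: ySize + 2 * chromaSize,
    planes: [
      { offset: 0, stride: codedWidth },
      { offset: ySize, stride: chromaWidth },
      { offset: ySize + chromaSize, stride: chromaWidth },
    ],
  };
}
```

The planes array has the same offset/stride shape that copyTo() accepts in its layout option, so a layout like this can be handed over when you want the planes at known positions.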

The other common stumbling block is flush(). When you're done feeding chunks to a decoder, you call flush(), which returns a Promise that resolves when all pending output has been delivered. If you're seeking or switching streams, you call reset() instead, which discards queued input and output immediately; it also drops the configuration, so you call configure() again before decoding, and the first chunk you feed afterwards has to be a keyframe. The distinction matters because decoders buffer several frames internally for reference, so a decoder won't necessarily output a frame the moment it receives the corresponding chunk. Without flush(), you can end up missing the last few frames of a clip.
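The two paths can be kept apart with a pair of small helpers. This is a sketch against a minimal DecoderLike interface of our own, which names only the three methods involved so the logic can be exercised without a browser; a real VideoDecoder has methods of these names.

```typescript
// Minimal stand-in for the parts of a decoder used here.
interface DecoderLike<Config> {
  flush(): Promise<void>;
  reset(): void;
  configure(config: Config): void;
}

// End of stream: wait until every buffered frame has been delivered.
async function finishStream<C>(decoder: DecoderLike<C>): Promise<void> {
  await decoder.flush();
}

// Seek or stream switch: throw away pending work, then configure again,
// because reset() leaves the decoder unconfigured.
function abandonForSeek<C>(decoder: DecoderLike<C>, config: C): void {
  decoder.reset();
  decoder.configure(config);
}
```

After abandonForSeek(), the next chunk passed to decode() needs to be a keyframe, so the seek target usually has to be rounded back to one.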

Decoding video in JavaScript — parsing compressed bytes, implementing motion compensation, reconstructing frames — was never practical beyond low resolutions and frame rates. That's the ceiling that WebCodecs removes. It hands decode work to the same native decoder the browser uses for <video>, with hardware acceleration where the device supports it. A browser-based editor can pull 4K footage at full frame rate on hardware where pure JS couldn't have gotten close.

Pair WebCodecs with the File System Access API and WebAssembly and you have enough to build tools that used to require a native app or a server. Demux the container with MP4Box.js or ffmpeg compiled to WASM, feed the encoded chunks to WebCodecs, process the output frames in a WASM module, write results back to disk via File System Access — no upload, no round-trip, no server involved. That stack didn't exist in a usable form two years ago.

It's not easy to implement. The API is deliberately low-level, and there's real plumbing work involved: handling keyframe dependencies when seeking (you can't decode a delta frame without its reference frames), managing decoder state across stream switches, keeping encode and decode in sync during transcoding. The error handling is also more demanding than higher-level APIs: a malformed configuration throws synchronously from configure(), while unsupported configurations and decode or encode failures arrive asynchronously through the error callback, after which the codec is closed and has to be recreated. The static isConfigSupported() method lets you screen a configuration before committing to it. But the pieces are there. What you can build in a browser without a backend has moved considerably.
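The keyframe dependency is the piece most worth sketching. Assuming chunks are held in decode order with non-decreasing timestamps (so no frame reordering, which is a simplification), a seek boils down to finding the chunk at the target time and walking back to the nearest keyframe. The names below are ours.

```typescript
interface ChunkMeta {
  timestamp: number; // microseconds
  type: "key" | "delta";
}

interface SeekRange {
  start: number; // index of the keyframe to begin decoding from
  end: number;   // index of the chunk that covers the target time
}

function chunkRangeForSeek(chunks: ChunkMeta[], targetTimestamp: number): SeekRange | null {
  // Last chunk whose timestamp is at or before the target.
  let end = -1;
  for (let i = 0; i < chunks.length; i++) {
    if (chunks[i].timestamp <= targetTimestamp) {
      end = i;
    } else {
      break;
    }
  }
  if (end === -1) return null;

  // Walk back to the keyframe that this chunk ultimately depends on.
  let start = end;
  while (start >= 0 && chunks[start].type !== "key") {
    start--;
  }
  if (start < 0) return null; // no keyframe before the target
  return { start, end };
}
```

Everything from start to end has to go through the decoder, and the frames before end are decoded only to be closed immediately, which is why long keyframe intervals make seeking feel slow.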
