Buffer

A Buffer is Node’s representation of raw binary data — a fixed-length sequence of bytes. In simple language, it’s like an array of integers from 0 to 255, but stored outside the V8 JavaScript heap so it can be passed cheaply to C code (file system, sockets, crypto).

Buffers came before JavaScript had Uint8Array. Today Buffer is a subclass of Uint8Array — anywhere a typed array works, a Buffer works too.
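A quick way to see the subclass relationship is to hand a Buffer to code that only knows about Uint8Array. In this minimal sketch, `sumBytes` is a hypothetical helper invented for illustration, not part of Node.

```javascript
// Hypothetical helper written against Uint8Array only.
function sumBytes(bytes) {
  let total = 0;
  for (const byte of bytes) total += byte;
  return total;
}

const sample = Buffer.from([1, 2, 3]);

console.log(sample instanceof Uint8Array);        // true
console.log(sumBytes(sample));                    // 6
console.log(sumBytes(new Uint8Array([1, 2, 3]))); // 6
```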

Why we need it

JavaScript strings are UTF-16 encoded internally. When we read a file or receive a network packet, the data is just bytes — could be UTF-8, binary image data, anything. We need a type that represents raw bytes without a charset assumption. That’s Buffer.
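To make the gap between characters and bytes concrete, here is a small sketch that measures the same string under several encodings. The sample text is an arbitrary choice, written with escapes so the exact code points are unambiguous.

```javascript
// "héllo €": seven characters, all inside the Basic Multilingual Plane.
const text = "h\u00e9llo \u20ac";

console.log(text.length);                        // 7  (UTF-16 code units)
console.log(Buffer.byteLength(text, "utf8"));    // 10 (é takes 2 bytes, € takes 3)
console.log(Buffer.byteLength(text, "utf16le")); // 14 (2 bytes per code unit)
```

The string length says nothing about how many bytes end up on disk or on the wire; that depends entirely on the encoding we pick.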

Memory layout

  • V8 heap: JS objects, strings, numbers, arrays. Managed by the GC. Slow to copy to C.
  • Buffer memory (off-heap): raw bytes, allocated by Node’s native allocator rather than on the V8 heap. Zero-copy hand-off to syscalls.

Creating buffers

Three main ways, each with different semantics:

// 1. Allocate N bytes, zero-filled (safe, slightly slower)
const a = Buffer.alloc(10);
console.log(a); // <Buffer 00 00 00 00 00 00 00 00 00 00>

// 2. Allocate N bytes, uninitialized (FAST but may contain old data!)
const b = Buffer.allocUnsafe(10);
// b might contain anything — use only if you immediately overwrite all of it

// 3. From existing data
const c = Buffer.from("hello", "utf8");
console.log(c); // <Buffer 68 65 6c 6c 6f>

const d = Buffer.from([0xde, 0xad, 0xbe, 0xef]);
const e = Buffer.from("SGVsbG8=", "base64"); // → "Hello"

Never use the deprecated new Buffer(n) — it was a security disaster (allocated uninitialized memory by default).
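If we do reach for Buffer.allocUnsafe, one safe pattern is to overwrite every byte before the buffer leaves the function. The sketch below assumes a hypothetical helper, `makePadded`, that writes a message and zeroes whatever is left over.

```javascript
// Hypothetical helper: fixed-size field holding a message, zero-padded.
function makePadded(message, size) {
  const out = Buffer.allocUnsafe(size);          // contents are undefined here
  const written = out.write(message, 0, "utf8"); // returns the number of bytes written
  out.fill(0, written);                          // zero the tail so no stale memory escapes
  return out;
}

console.log(makePadded("hi", 4)); // <Buffer 68 69 00 00>
```

When the speed difference does not matter, Buffer.alloc(size) gives the same guarantee with less room for mistakes.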

Encodings

When converting between string and bytes, we specify an encoding:

  • utf8 (default): variable-width, the standard
  • utf16le: UTF-16 little-endian
  • ascii: 7-bit ASCII, top bit dropped when decoding
  • latin1: 1 byte = 1 code point, lossy for characters above U+00FF
  • base64, base64url: common for transport / URLs
  • hex: pairs of hex digits
  • binary: alias for latin1 (legacy)

const buf = Buffer.from("hello", "utf8");

buf.toString("utf8");    // "hello"
buf.toString("hex");     // "68656c6c6f"
buf.toString("base64");  // "aGVsbG8="
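Because a Buffer holds only bytes, it also works as a bridge between two text encodings. The sketch below converts hex to base64 and back to text, then shows how latin1 loses information for a character outside its range. The sample values are arbitrary.

```javascript
const hexInput = "48656c6c6f";

// hex -> bytes -> base64 -> bytes -> text
const asBase64 = Buffer.from(hexInput, "hex").toString("base64");
const asText = Buffer.from(asBase64, "base64").toString("utf8");
console.log(asBase64); // "SGVsbG8="
console.log(asText);   // "Hello"

// latin1 keeps only the low byte of each code unit, so "€" (U+20AC) is mangled.
const euro = "\u20ac";
console.log(Buffer.from(euro, "utf8"));   // <Buffer e2 82 ac>
console.log(Buffer.from(euro, "latin1")); // <Buffer ac>
```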

Common operations

const buf = Buffer.from("hello world");

buf.length;              // 11 (bytes, not characters)
buf[0];                  // 104 (the byte for 'h')
buf.slice(0, 5);         // <Buffer 68 65 6c 6c 6f>, a view that shares memory (deprecated)
buf.subarray(0, 5);      // same view; the preferred method
buf.includes("world");   // true
buf.indexOf("world");    // 6
buf.equals(Buffer.from("hello world")); // true

// Concat multiple buffers
const merged = Buffer.concat([buf, Buffer.from("!")]);

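slice and subarray share memory, so be careful

buf.subarray() (and the deprecated buf.slice()) returns a view, NOT a copy. Writing to the view mutates the original, unlike Array.prototype.slice or Uint8Array.prototype.slice, which copy.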

const a = Buffer.from("hello");
const b = a.subarray(0, 3);
b[0] = 0x48; // 'H'
console.log(a.toString()); // "Hello"   ← original changed!

If we want a real copy, use Buffer.from(buf).
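Putting the view and the copy side by side makes the difference visible. This is a minimal sketch with arbitrary variable names.

```javascript
const original = Buffer.from("hello");

const view = original.subarray(0, 3);              // shares memory with original
const copy = Buffer.from(original.subarray(0, 3)); // independent bytes

original[0] = 0x4a; // 'J'

console.log(view.toString()); // "Jel"  (follows the original)
console.log(copy.toString()); // "hel"  (unaffected)
```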

Reading and writing typed values

Buffers have helpers for parsing binary protocols — reading integers, floats at specific offsets in big or little endian:

const buf = Buffer.alloc(8);
buf.writeUInt32BE(0x12345678, 0);  // write 4 bytes big-endian at offset 0
buf.writeUInt32LE(0xCAFEBABE, 4);  // little-endian at offset 4

buf.readUInt32BE(0).toString(16);  // "12345678"
buf.readUInt32LE(4).toString(16);  // "cafebabe"

This matters when we’re speaking a binary protocol over TCP, parsing image headers, or implementing wire formats.
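As an illustration, here is a sketch of a made-up frame format: one type byte, a 4-byte big-endian payload length, then the payload. The layout and the names `encodeFrame` and `decodeFrame` are assumptions for this example, not an existing protocol.

```javascript
const HEADER_SIZE = 5; // 1 byte type + 4 bytes length (hypothetical layout)

function encodeFrame(type, payload) {
  const header = Buffer.alloc(HEADER_SIZE);
  header.writeUInt8(type, 0);
  header.writeUInt32BE(payload.length, 1);
  return Buffer.concat([header, payload]);
}

function decodeFrame(frame) {
  if (frame.length < HEADER_SIZE) throw new RangeError("frame too short");
  const type = frame.readUInt8(0);
  const length = frame.readUInt32BE(1);
  if (frame.length < HEADER_SIZE + length) throw new RangeError("truncated payload");
  // subarray returns a view, so no payload bytes are copied here
  const payload = frame.subarray(HEADER_SIZE, HEADER_SIZE + length);
  return { type, length, payload };
}

const frame = encodeFrame(1, Buffer.from("ping"));
console.log(frame);                                 // <Buffer 01 00 00 00 04 70 69 6e 67>
console.log(decodeFrame(frame).payload.toString()); // "ping"
```

The length prefix is what lets a receiver know where one message ends and the next begins in a continuous byte stream.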

Real-world: hashing a file

const fs = require("node:fs");
const crypto = require("node:crypto");

const hash = crypto.createHash("sha256");
const stream = fs.createReadStream("./big-file.zip");

stream.on("data", (chunk) => {
  // chunk is a Buffer
  hash.update(chunk);
});

stream.on("end", () => {
  console.log(hash.digest("hex"));
});

Notice we never convert chunks to strings — that would corrupt binary data. The whole pipeline is buffer → buffer.
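The corruption is easy to reproduce. In this sketch, four bytes that are not valid UTF-8 (they happen to match the start of a JPEG file) are pushed through a string and back. A hex round trip is shown for contrast.

```javascript
const binary = Buffer.from([0xff, 0xd8, 0xff, 0xe0]);

// Decoding as UTF-8 replaces invalid sequences with U+FFFD, which is irreversible.
const decoded = binary.toString("utf8");
const viaUtf8 = Buffer.from(decoded, "utf8");
console.log(decoded.includes("\ufffd")); // true
console.log(viaUtf8.equals(binary));     // false

// hex (or base64) maps every byte to text without loss.
const viaHex = Buffer.from(binary.toString("hex"), "hex");
console.log(viaHex.equals(binary));      // true
```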

Buffer pool — a perf detail

For small buffers (under 4 KiB by default, which is half of Buffer.poolSize), Buffer.allocUnsafe and Buffer.from(string) carve slices out of a shared pre-allocated pool to avoid the cost of a fresh allocation each time. Pooled or not, allocUnsafe skips zero-filling, which is why “unsafe” buffers may contain bits of previously used memory. For larger sizes, Node allocates a dedicated block directly.
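One practical consequence of pooling: a small Buffer's underlying ArrayBuffer can be much larger than the Buffer itself, so passing `buf.buffer` to another API may expose unrelated bytes. The sketch below uses a hypothetical helper, `toExactArrayBuffer`, to copy out only the bytes that belong to the Buffer. The exact pool size printed depends on the runtime.

```javascript
const small = Buffer.from("hi");

// The backing store may be the shared pool rather than a 2-byte allocation.
console.log(small.length);            // 2
console.log(small.buffer.byteLength); // often Buffer.poolSize, not 2

// Hypothetical helper: copy exactly this Buffer's bytes into a new ArrayBuffer.
function toExactArrayBuffer(buf) {
  return buf.buffer.slice(buf.byteOffset, buf.byteOffset + buf.byteLength);
}

const exact = toExactArrayBuffer(small);
console.log(exact.byteLength); // 2
```

Respecting byteOffset and byteLength is the important part; the same rule applies to any typed array view.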

When to use Buffer vs Uint8Array

Uint8Array works in browsers AND Node. Buffer adds convenience methods (toString, write, indexOf for strings, encoding conversions) but is not available in browsers without a polyfill. For shared browser/Node code, prefer Uint8Array + TextEncoder/TextDecoder for string conversion.
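A minimal sketch of the portable approach follows. TextEncoder and TextDecoder are globals in browsers and in current Node versions. The last line is Node-only and wraps the same memory in a Buffer without copying.

```javascript
const encoder = new TextEncoder();        // always encodes to UTF-8
const decoder = new TextDecoder("utf-8");

const utf8Bytes = encoder.encode("h\u00e9llo"); // Uint8Array
console.log(utf8Bytes.length);                  // 6 (é takes 2 bytes)
console.log(decoder.decode(utf8Bytes));         // "héllo"

// Node only: view the same bytes as a Buffer (shares memory, no copy).
const wrapped = Buffer.from(utf8Bytes.buffer, utf8Bytes.byteOffset, utf8Bytes.byteLength);
console.log(wrapped.toString("hex"));           // "68c3a96c6c6f"
```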