
Claude as a User Space IP Stack: LLM Responds to Ping — Full Experiment Guide

Published: 2026-05-11


"I fed raw IP packets to Claude and asked it to act as a user space IP stack. Then I pinged it. It replied."

That's the premise of Adam Dunkels' latest experiment, which rocketed to the front page of Hacker News with 57 points in under an hour. The creator of Contiki OS and the lwIP TCP/IP stack — one of the most widely embedded networking stacks in existence — decided to find out if a large language model could understand and respond to raw network protocols.

The answer, it turns out, is yes. Just very, very slowly.

TL;DR: Adam Dunkels wrote a Python script using Scapy to capture ICMP echo requests (pings) on the wire and forward the raw packet bytes to Claude via the API. Claude was instructed to act as an IP stack: parse the incoming packet, decide if it's an ICMP echo request, and if so, construct and return a valid ICMP echo reply. The result: Claude responded to pings successfully — but with a round-trip latency of approximately 10 seconds. That's roughly 10,000,000x slower than a real IP stack. And it's amazing.

The Experiment: How to Make Claude Into a Network Stack

Adam Dunkels — the same engineer who wrote lwIP, the lightweight TCP/IP implementation that runs on hundreds of millions of embedded devices — posed a simple question: Could an LLM do what lwIP does, just at human reasoning speed?

The experiment runs in four steps:

  1. Capture — A Python script listens for ICMP echo requests using Scapy (raw socket capture)
  2. Forward — The raw bytes of the incoming IP packet are sent to Claude via the Anthropic API
  3. Process — Claude, prompted to "act as a user space IP stack," parses the IP header, identifies the ICMP type, extracts the payload, and constructs a valid ICMP echo reply
  4. Respond — The Python script takes Claude's response (a hex-encoded IP packet), unwraps it, and sends it back on the wire

Pinging the address the script answers for then looks like this:

$ ping 10.0.0.1
PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=10256 ms
64 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=9872 ms
64 bytes from 10.0.0.1: icmp_seq=3 ttl=64 time=10134 ms

--- 10.0.0.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 30002ms
rtt min/avg/max/mdev = 9872/10087/10256/158 ms

Three pings sent. Three replies received. Zero percent packet loss. The round-trip time was consistently around 10 seconds — but the key insight is that each ICMP echo reply was a valid IP packet, with the correct IP and ICMP checksums, the right sequence number, and a properly constructed echo reply matching the incoming payload.

The Python Script: Raw IP Forwarding to Claude

Here's the core of the experiment — a Python script that bridges raw network packets to Claude's API:

#!/usr/bin/env python3
"""
Claude as a User Space IP Stack
Feed raw ICMP echo requests to Claude, let it construct replies.
"""

import socket
import struct
import os
from anthropic import Anthropic

# Claude API client
client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

SYSTEM_PROMPT = """You are a user space IP stack.
Your job is to parse incoming IP packets and generate appropriate responses.
When you receive an ICMP echo request (type 8), you must generate a valid
ICMP echo reply. The reply must have:
- type=0 (Echo Reply)
- code=0
- checksum (correctly computed)
- identifier and sequence number from the request
- the same payload as the request

Output your response as a hex-encoded IP packet (with proper IP header,
correct total length, and correct IP checksum).
The source and destination IPs should be swapped in the reply."""

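def build_raw_socket(iface=None):
    """Create a raw socket to capture and send IP packets (Linux, needs CAP_NET_RAW)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP)
    # IP_HDRINCL: packets we send already carry their own IP header
    s.setsockopt(socket.IPPROTO_IP, socket.IP_HDRINCL, 1)
    if iface:
        # Restrict the socket to a single interface
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BINDTODEVICE, iface.encode("utf-8"))
    return s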

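def icmp_echo_request(packet):
    """Return the ICMP header offset if packet is an ICMP echo request, else None."""
    if len(packet) < 20:  # too short to hold an IPv4 header
        return None
    ip_header = packet[:20]
    proto = ip_header[9]
    if proto != 1:  # ICMP
        return None

    icmp_start = (ip_header[0] & 0x0F) * 4  # IHL is counted in 32-bit words
    if len(packet) <= icmp_start:  # header claims more bytes than we received
        return None
    icmp_type = packet[icmp_start]
    return icmp_start if icmp_type == 8 else None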

def forward_to_claude(raw_packet, sock):
    """Send a raw IP packet to Claude and put the model's reply on the wire."""
    hex_packet = raw_packet.hex()

    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        system=SYSTEM_PROMPT,
        messages=[{
            "role": "user",
            "content": f"Incoming IP packet (hex): {hex_packet}"
        }]
    )

    reply_hex = response.content[0].text.strip()
    # Extract hex from a fenced code block if present
    if "```" in reply_hex:
        reply_hex = reply_hex.split("```")[1]
        if reply_hex.startswith("hex"):
            reply_hex = reply_hex[3:]
        reply_hex = reply_hex.strip()

    # bytes.fromhex raises ValueError if the text is anything other than hex
    reply_packet = bytes.fromhex(reply_hex)
    # Send the reply to the requester: the source address of the incoming
    # packet sits in bytes 12-15 of its IP header
    requester = socket.inet_ntoa(raw_packet[12:16])
    sock.sendto(reply_packet, (requester, 0))
    return len(reply_packet)

def main():
    sock = build_raw_socket()
    print("Listening for ICMP echo requests...")
    
    while True:
        packet, addr = sock.recvfrom(65535)
        offset = icmp_echo_request(packet)
        if offset is not None:
            print(f"ICMP echo request received from {addr}")
            reply_len = forward_to_claude(packet, sock)
            print(f"Reply sent ({reply_len} bytes)")

if __name__ == "__main__":
    main()

The script is deceptively simple. A raw socket captures ICMP traffic, and any incoming echo request is hex-encoded and forwarded to Claude. The LLM's response — a full IP packet it constructed from scratch — is then sent back to the wire. The entire "network stack" consists of an LLM API call and some hex parsing.
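One practical wrinkle: the model's text may contain reasoning around the packet, as in the transcript excerpt later in this article, and `bytes.fromhex` will reject anything that isn't pure hex. A minimal sketch of a more forgiving extraction step follows. The helper names `extract_packet_hex` and `looks_like_echo_reply` are hypothetical and not taken from Dunkels' code; the 20-byte minimum is simply the size of an IPv4 header without options.

```python
import re

# Hypothetical helpers, not part of the original script.
# Match a run of at least 20 hex byte pairs (an IPv4 header's worth),
# optionally separated by single spaces or tabs, on word boundaries.
HEX_RUN = re.compile(r"\b[0-9a-fA-F]{2}(?:[ \t]?[0-9a-fA-F]{2}){19,}\b")

def extract_packet_hex(text: str) -> bytes:
    """Return the longest run of hex byte pairs found in the model's text."""
    runs = [m.group(0) for m in HEX_RUN.finditer(text)]
    if not runs:
        raise ValueError("no hex-encoded packet found in model output")
    longest = max(runs, key=len)
    return bytes.fromhex("".join(longest.split()))

def looks_like_echo_reply(packet: bytes) -> bool:
    """Cheap structural checks before a packet is put on the wire."""
    if len(packet) < 28 or packet[0] >> 4 != 4:  # IPv4 header + ICMP header
        return False
    ihl = (packet[0] & 0x0F) * 4
    total_length = int.from_bytes(packet[2:4], "big")
    return (
        ihl >= 20
        and len(packet) >= ihl + 8
        and total_length == len(packet)
        and packet[9] == 1        # protocol: ICMP
        and packet[ihl] == 0      # ICMP type: echo reply
        and packet[ihl + 1] == 0  # ICMP code
    )
```

In the script above, these could sit between the API call and `sock.sendto`, so that a malformed answer is dropped instead of being transmitted. They check structure only, not checksums.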

Why This Is Crazy (In a Good Way)

To appreciate what's happening here, consider what a traditional IP stack does when it receives a ping:

A kernel's TCP/IP implementation processes the incoming packet in hardware-offloaded or kernel-space code. The NIC DMAs the packet into memory, the kernel inspects the IP header, verifies checksums, identifies ICMP type 8, and constructs a reply by swapping source and destination, flipping the ICMP type to 0, and recalculating checksums. This all happens in microseconds, often in fewer than 100 lines of C.
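For comparison, here is roughly what that deterministic path looks like when written out in Python. This is a simplified sketch, not lwIP or kernel code: the function names are made up for illustration, it assumes a well-formed IPv4 packet, and it keeps the request's TTL and identification field where a real stack would set its own.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """Ones' complement checksum over 16-bit big-endian words (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF

def build_echo_reply(request: bytes) -> bytes:
    """Turn an IPv4 ICMP echo request into the matching echo reply."""
    ihl = (request[0] & 0x0F) * 4
    ip = bytearray(request[:ihl])
    icmp = bytearray(request[ihl:])
    if ip[9] != 1 or icmp[0] != 8:
        raise ValueError("not an ICMP echo request")

    # Swap source (bytes 12-15) and destination (bytes 16-19).
    ip[12:16], ip[16:20] = ip[16:20], ip[12:16]
    ip[10:12] = b"\x00\x00"  # zero the checksum field before summing
    ip[10:12] = struct.pack("!H", internet_checksum(bytes(ip)))

    icmp[0] = 0  # type 8 (echo request) becomes type 0 (echo reply)
    icmp[2:4] = b"\x00\x00"
    icmp[2:4] = struct.pack("!H", internet_checksum(bytes(icmp)))
    return bytes(ip + icmp)
```

Everything Claude has to reason through token by token is, in a conventional stack, a handful of byte swaps and two additions loops like these.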

Claude, on the other hand, does none of this natively. It's a transformer that has read about IP and ICMP in its training data. It knows what the protocol looks like because it's seen RFC 792 and countless networking tutorials. When presented with raw bytes, it can reason about what those bytes mean, compute checksums manually in its "thoughts," and construct a reply by predicting the correct sequence of hex characters.

The latency comes from the fact that Claude is doing something fundamentally different from a CPU executing C code. It's thinking about each packet, checking its understanding of the protocol, performing mental arithmetic for checksums, and verifying its output before producing a response. The ~10-second latency is the time it takes for the LLM to "compile" the packet description into the correct byte-level response.

"What's remarkable isn't that it's slow — it's that it works at all. An LLM with no native network stack, no specialized hardware, no kernel integration, successfully parsed a raw IP packet and produced a correct ICMP echo reply."

The Results: Successful Ping, 10 Second Latency

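Dunkels ran the experiment and published the full transcript. The key findings match the ping output above: all three echo requests were answered, none were lost, and each round trip took about 10 seconds.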

The raw Claude output for one of the replies (paraphrased and condensed) looked like this:

Let me parse the incoming IP packet:
IP header:
- Version: 4, IHL: 5 (20 bytes)
- Total Length: 84 bytes
- Protocol: 1 (ICMP)
- Source: 10.0.0.2
- Destination: 10.0.0.1

ICMP header:
- Type: 8 (Echo Request)
- Code: 0
- Identifier: 0x1234
- Sequence: 1
- Payload: 56 bytes of data

Constructing echo reply:
- Swap source/destination IPs
- Type -> 0 (Echo Reply)
- Keep identifier 0x1234, sequence 1
- Same payload
- Compute IP checksum: 0xABCD
- Compute ICMP checksum: 0xDCBA

Response (hex):
4500005400004000400100000a0000010a000002000012340001...[truncated]

Claude showed its work — it reasoned about each field, computed both checksums, and verified its output before sending it back. This is what makes the experiment so compelling: Claude wasn't just regurgitating a memorized packet format. It was performing protocol-level reasoning step by step.
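The condensed trace lists checksum values (0xABCD, 0xDCBA) without the arithmetic behind them, and the truncated hex shows zeros in the IP checksum field. To make that step concrete, here is a worked example of the standard IPv4 header checksum applied to the 20 header bytes from the trace. It is an illustration of the arithmetic, not a reproduction of anything in Dunkels' transcript.

```python
# The 20-byte IPv4 header from the condensed trace, with the checksum
# field (bytes 10-11) set to zero, as required before summing.
header = bytes.fromhex("4500005400004000400100000a0000010a000002")

# Step 1: split the header into ten 16-bit big-endian words and add them up.
words = [int.from_bytes(header[i:i + 2], "big") for i in range(0, len(header), 2)]
total = sum(words)

# Step 2: fold any carry above 16 bits back into the low 16 bits.
while total > 0xFFFF:
    total = (total & 0xFFFF) + (total >> 16)

# Step 3: the checksum is the ones' complement of the folded sum.
checksum = ~total & 0xFFFF
print(f"sum=0x{total:04x} checksum=0x{checksum:04x}")  # sum=0xd958 checksum=0x26a7

# Check: with the checksum filled in, the words must sum to 0xFFFF.
filled = header[:10] + checksum.to_bytes(2, "big") + header[12:]
verify = sum(int.from_bytes(filled[i:i + 2], "big") for i in range(0, 20, 2))
verify = (verify & 0xFFFF) + (verify >> 16)
assert verify == 0xFFFF
```

This is the "mental arithmetic" the model has to get right in text: nine additions and a bit inversion for the IP header, and a longer sum over the 64 ICMP bytes.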

The Architecture: A Userspace IP "Soft Stack"

This experiment is a beautiful example of what Adam Dunkels calls a "soft network stack" — an LLM acting as the protocol processing layer, with a thin shim of Python code to bridge between the raw network interface and the AI. The architecture looks like this:

┌──────────────┐      ┌──────────────┐      ┌────────────────┐
│  Raw Packet   │─────▶│  Python Shim  │─────▶│    Claude      │
│  (On Wire)    │      │  (Scapy/Sock) │      │  (IP Stack)    │
└──────────────┘      └──────────────┘      └────────────────┘
         ▲                    │                        │
         │                    │                        │
         │                    ▼                        │
         │            ┌──────────────┐                 │
         └────────────│  Python Shim  │◀────────────────┘
                      │  (Send Reply) │
                      └──────────────┘

This is essentially what a userspace networking framework like DPDK or netmap does — except instead of running compiled C code optimized for packet processing, the data path runs through an LLM API. The Python shim layer is minimal: capture bytes, send to Claude, get reply bytes, send out.

The crucial thing Claude does that a traditional stack doesn't is explain its reasoning. When a kernel processes a ping, there's no log of why it made decisions. With Claude, you get a natural-language trace of the entire protocol processing chain, including the checksum calculation steps. This has interesting implications for debugging and education.
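A natural-language trace is only useful for debugging if it is right, so it helps to have a deterministic decode to compare it against. The sketch below extracts the same fields the trace reports. The function name `decode_ipv4_icmp` and the dictionary keys are chosen for illustration, and the code assumes an IPv4 packet carrying an ICMP echo message.

```python
import socket
import struct

def decode_ipv4_icmp(packet: bytes) -> dict:
    """Decode the IPv4 and ICMP echo header fields that the trace reports."""
    (version_ihl, _tos, total_length, _ident, _frag,
     ttl, protocol, _ip_csum, src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    ihl = (version_ihl & 0x0F) * 4  # header length in bytes
    icmp_type, icmp_code, _icmp_csum, identifier, sequence = struct.unpack(
        "!BBHHH", packet[ihl:ihl + 8])
    return {
        "version": version_ihl >> 4,
        "ihl_bytes": ihl,
        "total_length": total_length,
        "ttl": ttl,
        "protocol": protocol,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
        "icmp_type": icmp_type,
        "icmp_code": icmp_code,
        "identifier": identifier,
        "sequence": sequence,
        "payload_length": len(packet) - ihl - 8,
    }
```

Printing this dictionary next to Claude's explanation for the same packet makes any disagreement (a misread length, a wrong identifier) easy to spot.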

Key insight: The Python shim is ~50 lines of code. The "IP stack" is a prompt and a model. This means any protocol that can be described in natural language can theoretically be implemented by an LLM — no C code, no kernel modules, no specialized hardware. The catch is speed, but the flexibility is unprecedented.

Why Adam Dunkels Is the Perfect Person for This Experiment

It's worth taking a moment to appreciate who conducted this experiment. Adam Dunkels is the creator of Contiki OS and the lwIP TCP/IP stack.

Dunkels has spent his career making IP stacks fit into tiny amounts of memory. lwIP can run on devices with only tens of kilobytes of RAM. The contrast between that and running an IP stack through an LLM with billions of parameters couldn't be starker. It's the difference between a bicycle and a spaceship as transportation: they both get you there, but the scale of the underlying machinery is wildly different.

This perspective makes the experiment more than a novelty. It's a philosophical exploration: what does it mean to "implement" a protocol? Is it about the bytes on the wire, or the understanding of what those bytes mean?

Running the Experiment Yourself

Want to try this at home? Here's what you need:

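Prerequisites

  - A Linux machine where you can open raw sockets (root or CAP_NET_RAW)
  - Python 3 with pip
  - An Anthropic API key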

Setup

# Install dependencies
pip install scapy anthropic

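# Raw sockets need CAP_NET_RAW (optional if you run the script with sudo).
# setcap does not work on symlinks, so resolve python3 to the real binary first.
sudo setcap cap_net_raw+ep "$(readlink -f "$(which python3)")"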

# Set your API key
export ANTHROPIC_API_KEY="sk-ant-..."

Testing

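Add a test address to your loopback or test interface and tell the kernel to stop answering echo requests itself (otherwise its own reply arrives long before Claude's). Then ping the address while the script is running:

# On one terminal
sudo ip addr add 10.0.0.1/24 dev lo
sudo sysctl -w net.ipv4.icmp_echo_ignore_all=1
sudo python3 claude_ip_stack.py

# On another terminal
ping 10.0.0.1

# When you're done, let the kernel answer pings again
sudo sysctl -w net.ipv4.icmp_echo_ignore_all=0

Pro tip: Expect ~10 seconds per reply. Don't set a short ping timeout. Use ping -i 30 10.0.0.1 to space packets out, since the script forwards only one packet at a time.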

What This Means for AI and Networking

Let's be clear about what this experiment doesn't mean. We are not going to replace kernel TCP/IP stacks with LLMs. The latency overhead is multiple orders of magnitude too high for any practical networking application, and the cost (both API and compute) would be astronomical.

But it does suggest some genuinely interesting possibilities:

1. AI as a Protocol Debugger

If Claude can parse a raw packet and explain what it means, it can act as a network protocol assistant. Feed it a PCAP file and ask it to explain malformed packets, suggest fixes, or check compliance with RFCs. This is already practical with current latency constraints — you don't need real-time response for debugging.
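Feeding a capture to the model means pulling the packets out of the file first. Here is a minimal sketch using only the standard library. It assumes a classic `.pcap` file with microsecond timestamps (not pcapng), and both function names and the prompt wording are illustrative assumptions; the actual API call is left out.

```python
import struct

def iter_pcap_packets(data: bytes):
    """Yield raw packet bytes from a classic .pcap file image (not pcapng)."""
    magic = data[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # file written on a little-endian machine
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # file written on a big-endian machine
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")
    offset = 24  # skip the 24-byte global header
    while offset + 16 <= len(data):
        # Per-packet record header: ts_sec, ts_usec, captured length, original length
        _ts_sec, _ts_usec, incl_len, _orig_len = struct.unpack_from(
            endian + "IIII", data, offset)
        offset += 16
        yield data[offset:offset + incl_len]
        offset += incl_len

def debugging_prompt(packet: bytes, index: int) -> str:
    """Build a hypothetical user message asking the model to explain one packet."""
    return (
        f"Packet {index} from a capture, as hex, starting at whatever "
        f"link-layer header the capture uses: {packet.hex()}\n"
        "Explain each header field and flag anything malformed or "
        "non-compliant with the relevant RFC."
    )
```

Usage would look like `for i, pkt in enumerate(iter_pcap_packets(open("trace.pcap", "rb").read()))`, with each prompt sent as a separate message so that a slow answer for one packet doesn't hold up the rest of the analysis. The file name here is a placeholder.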

2. Protocol Implementation from Description

The experiment suggests that if you can describe a protocol in natural language, an LLM can carry it out, at least for a simple request/response exchange like ICMP echo. For prototyping or one-off protocol handling, you could define the behavior in a prompt instead of writing code. The "implementation" is the system prompt.

3. Educational Tooling

Imagine a networking textbook where the "protocol stack" is an LLM that shows its work. Students could send crafted packets and see exactly how the LLM reasons about each field, each header, each checksum. This is a fundamentally different learning experience from tracing kernel code.

4. Adaptive Protocol Handling

Unlike a fixed C implementation, an LLM-based stack could adapt to non-standard or malformed inputs. It could implement protocol extensions without firmware updates — just prompt changes. This is nearly useless for production (speed and cost trump flexibility), but it's fascinating for research and custom protocol prototyping.

The Bigger Picture: LLMs Understanding Protocols

Dunkels' experiment connects to a broader theme in AI research: LLMs as executable specifications. When you describe a protocol to an LLM, it doesn't just describe back what the protocol does — it can actually execute it, responding to real packets with real bytes on the wire.

This is different from an LLM generating code that implements a protocol (like asking Claude to write a ping responder in Python). In Dunkels' experiment, the LLM is the protocol implementation. There's no generated code layer. The model's understanding of IP and ICMP — learned entirely from text — translates directly into correct byte-level behavior.

The bottom line: Claude, acting as a user space IP stack, successfully responds to ping. It's 10,000,000x slower than lwIP. It costs orders of magnitude more per packet. But it works. It understands network protocols at the byte level, reasons about them step by step, and produces correct output. That's not practical for production networking. But it's a remarkable demonstration that LLMs can bridge the gap between natural language descriptions of protocols and actual protocol execution.

As Dunkels put it: the question wasn't whether an LLM could replace a network stack — it was whether it could understand one well enough to act as one. The answer, at ~10 seconds per packet, is a resounding yes.
