Claude as a User Space IP Stack: LLM Responds to Ping — Full Experiment Guide
Published: 2026-05-11
"I fed raw IP packets to Claude and asked it to act as a user space IP stack. Then I pinged it. It replied."
That's the premise of Adam Dunkels' latest experiment, which rocketed to the front page of Hacker News with 57 points in under an hour. The creator of Contiki OS and the lwIP TCP/IP stack — one of the most widely embedded networking stacks in existence — decided to find out if a large language model could understand and respond to raw network protocols.
The answer, it turns out, is yes. Just very, very slowly.
The Experiment: How to Make Claude Into a Network Stack
Dunkels posed a simple question: could an LLM do what lwIP does, just at human reasoning speed?
The experiment runs in four steps:
- Capture: A Python script listens for ICMP echo requests, using Scapy or (as in the script shown in this guide) a plain raw socket
- Forward: The raw bytes of the incoming IP packet are hex-encoded and sent to Claude via the Anthropic API
- Process: Claude, prompted to "act as a user space IP stack," parses the IP header, identifies the ICMP type, extracts the payload, and constructs a valid ICMP echo reply
- Respond: The Python script takes Claude's response (a hex-encoded IP packet), decodes it, and sends it back on the wire
PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=10256 ms
64 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=9872 ms
64 bytes from 10.0.0.1: icmp_seq=3 ttl=64 time=10134 ms
--- 10.0.0.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 30002ms
rtt min/avg/max/mdev = 9872/10087/10256/158 ms
Three pings sent. Three replies received. Zero percent packet loss. The round-trip time was consistently around 10 seconds — but the key insight is that each ICMP echo reply was a valid IP packet, with the correct IP and ICMP checksums, the right sequence number, and a properly constructed echo reply matching the incoming payload.
The Python Script: Raw IP Forwarding to Claude
Here's the core of the experiment — a Python script that bridges raw network packets to Claude's API:
#!/usr/bin/env python3
"""
Claude as a User Space IP Stack
Feed raw ICMP echo requests to Claude, let it construct replies.
"""
import os
import socket

from anthropic import Anthropic

# Claude API client
client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

SYSTEM_PROMPT = """You are a user space IP stack.
Your job is to parse incoming IP packets and generate appropriate responses.
When you receive an ICMP echo request (type 8), you must generate a valid
ICMP echo reply. The reply must have:
- type=0 (Echo Reply)
- code=0
- checksum (correctly computed)
- identifier and sequence number from the request
- the same payload as the request
Output your response as a hex-encoded IP packet (with proper IP header,
correct total length, and correct IP checksum).
The source and destination IPs should be swapped in the reply."""


def build_raw_socket(iface=None):
    """Create a raw socket to capture and send IP packets."""
    s = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP)
    # IP_HDRINCL: packets we send already carry their own IP header
    s.setsockopt(socket.IPPROTO_IP, socket.IP_HDRINCL, 1)
    if iface:
        # Linux-only option (the original used its numeric value, 25)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BINDTODEVICE, iface.encode("utf-8"))
    return s


def icmp_echo_request(packet):
    """Return the ICMP offset if the IP packet is an ICMP echo request, else None."""
    if len(packet) < 20:
        return None  # too short to hold an IPv4 header
    ip_header = packet[:20]
    proto = ip_header[9]
    if proto != 1:  # ICMP
        return None
    icmp_start = (ip_header[0] & 0x0F) * 4  # IHL is counted in 32-bit words
    if icmp_start >= len(packet):
        return None
    icmp_type = packet[icmp_start]
    return icmp_start if icmp_type == 8 else None


def forward_to_claude(raw_packet, sock):
    """Send raw IP packet to Claude and send the reply on the wire."""
    hex_packet = raw_packet.hex()
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        system=SYSTEM_PROMPT,
        messages=[{
            "role": "user",
            "content": f"Incoming IP packet (hex): {hex_packet}"
        }]
    )
    reply_hex = response.content[0].text.strip()
    # Extract hex from code block if present
    if "```" in reply_hex:
        reply_hex = reply_hex.split("```")[1]
        if reply_hex.startswith("hex"):
            reply_hex = reply_hex[3:]
        reply_hex = reply_hex.strip()
    # Raises ValueError if the response contains anything other than hex
    reply_packet = bytes.fromhex(reply_hex)
    # Reply to whoever sent the request (source address, bytes 12-15)
    # rather than a hard-coded peer
    dest = socket.inet_ntoa(raw_packet[12:16])
    sock.sendto(reply_packet, (dest, 0))
    return len(reply_packet)


def main():
    sock = build_raw_socket()
    print("Listening for ICMP echo requests...")
    while True:
        packet, addr = sock.recvfrom(65535)
        offset = icmp_echo_request(packet)
        if offset is not None:
            print(f"ICMP echo request received from {addr[0]}")
            reply_len = forward_to_claude(packet, sock)
            print(f"Reply sent ({reply_len} bytes)")


if __name__ == "__main__":
    main()
The script is deceptively simple. A raw socket captures ICMP traffic, and any incoming echo request is hex-encoded and forwarded to Claude. The LLM's response — a full IP packet it constructed from scratch — is then sent back to the wire. The entire "network stack" consists of an LLM API call and some hex parsing.
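The "hex parsing" step is where a setup like this is most fragile: if the model includes any reasoning text alongside the packet, a bare bytes.fromhex call raises an error. A minimal sketch of a more forgiving extractor follows. The function name extract_packet_hex and the 20-byte minimum are illustrative choices rather than part of the original script, and it assumes the model prints the whole packet on a single line.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def extract_packet_hex(text: str, min_bytes: int = 20) -> bytes:
    """Return the longest all-hex line in an LLM response, decoded to bytes.

    Hypothetical helper: assumes the packet sits on one line, optionally
    with spaces between byte groups, and is at least min_bytes long
    (20 bytes is the smallest possible IPv4 header).
    """
    candidates = []
    for line in text.splitlines():
        compact = "".join(line.split())  # drop spaces between byte groups
        if (len(compact) >= 2 * min_bytes
                and len(compact) % 2 == 0
                and all(ch in HEX_DIGITS for ch in compact)):
            candidates.append(compact)
    if not candidates:
        raise ValueError("no hex-encoded packet found in response")
    return bytes.fromhex(max(candidates, key=len))


sample = "Response (hex):\n4500 0054 0000 4000 4001 26a7 0a00 0001 0a00 0002"
print(extract_packet_hex(sample).hex())
# prints 4500005400004000400126a70a0000010a000002
```

Lines of prose, bullet points, and code-fence markers are skipped because they contain non-hex characters, so the reasoning trace can stay in the response without breaking the send path.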
Why This Is Crazy (In a Good Way)
To appreciate what's happening here, consider what a traditional IP stack does when it receives a ping:
A kernel's TCP/IP implementation processes the incoming packet in kernel-space code, sometimes with hardware offload. The NIC DMAs the packet into memory; the kernel inspects the IP header, verifies checksums, identifies ICMP type 8, and constructs a reply by swapping source and destination, setting the ICMP type to 0, and recalculating checksums. This all happens in microseconds, and the echo-handling logic itself often fits in fewer than 100 lines of C.
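For comparison, the deterministic version of that reply construction fits in a few lines. The sketch below is an illustration, not the code any particular kernel runs: it uses the incremental checksum update idea from RFC 1624 instead of re-summing the packet, and it assumes the TTL and all other header fields are left untouched. The function name echo_reply_in_place is hypothetical.

```python
def echo_reply_in_place(request: bytes) -> bytes:
    """Turn an IPv4 ICMP echo request into an echo reply without re-summing it."""
    pkt = bytearray(request)
    ihl = (pkt[0] & 0x0F) * 4            # IP header length in bytes
    if pkt[9] != 1 or pkt[ihl] != 8:     # protocol 1 = ICMP, type 8 = echo request
        raise ValueError("not an ICMP echo request")

    # Swap source and destination addresses. The IP checksum is a sum of
    # 16-bit words, so reordering words leaves the existing value valid
    # (assumption: no other header field is modified).
    pkt[12:16], pkt[16:20] = request[16:20], request[12:16]

    # Changing type 8 -> 0 lowers the first ICMP word by 0x0800, so the
    # stored (complemented) checksum rises by 0x0800, with end-around carry.
    pkt[ihl] = 0
    old = int.from_bytes(pkt[ihl + 2:ihl + 4], "big")
    new = old + 0x0800
    if new > 0xFFFF:
        new = (new & 0xFFFF) + 1
    pkt[ihl + 2:ihl + 4] = new.to_bytes(2, "big")
    return bytes(pkt)


request = bytes.fromhex(
    "4500002000004000400126db0a0000020a000001"  # IP header, 10.0.0.2 -> 10.0.0.1
    "0800210412340001"                          # ICMP echo request, id 0x1234, seq 1
    "61626364"                                  # payload "abcd"
)
print(echo_reply_in_place(request).hex())
```

No arithmetic over the payload is needed at all, which is part of why the conventional path is so fast.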
Claude, on the other hand, does none of this natively. It's a transformer that has read about IP and ICMP in its training data. It knows what the protocol looks like because it's seen RFC 792 and countless networking tutorials. When presented with raw bytes, it can reason about what those bytes mean, compute checksums manually in its "thoughts," and construct a reply by predicting the correct sequence of hex characters.
The latency comes from the fact that Claude is doing something fundamentally different from a CPU executing C code. It's thinking about each packet, checking its understanding of the protocol, performing mental arithmetic for checksums, and verifying its output before producing a response. The ~10-second latency is the time it takes for the LLM to "compile" the packet description into the correct byte-level response.
The Results: Successful Ping, 10 Second Latency
Dunkels ran the experiment and published the full transcript. The key findings:
- Three pings sent, three replies received — 0% packet loss over a ~30-second test window
- Average RTT: ~10,087 ms, roughly 10,000 times slower than a kernel stack that answers in about 1 ms
- Correct ICMP type and code — Claude correctly set type=0 (Echo Reply) and code=0
- Correct checksums — Both the IP header checksum and ICMP checksum were valid
- Payload integrity — The echo reply contained identical payload bytes to the request
- Sequence number preserved — Each reply carried the correct ICMP sequence number
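Most of the properties in this list can be checked mechanically by comparing a reply against the request that triggered it. The following is a minimal sketch of such a check, covering the address swap, type and code, identifier, sequence number, and payload; checksum validation is left out to keep it short. The function name check_echo_reply and the wording of the messages are illustrative.

```python
def check_echo_reply(request: bytes, reply: bytes) -> list[str]:
    """Return a list of mismatches between an echo request and its reply."""
    problems = []
    req_icmp = request[(request[0] & 0x0F) * 4:]   # skip the request's IP header
    rep_icmp = reply[(reply[0] & 0x0F) * 4:]       # skip the reply's IP header

    if reply[12:16] != request[16:20] or reply[16:20] != request[12:16]:
        problems.append("source/destination not swapped")
    if reply[9] != 1:
        problems.append("protocol is not ICMP")
    if rep_icmp[0:2] != b"\x00\x00":
        problems.append("type/code is not 0/0")
    if rep_icmp[4:8] != req_icmp[4:8]:
        problems.append("identifier or sequence number changed")
    if rep_icmp[8:] != req_icmp[8:]:
        problems.append("payload differs")
    return problems


payload = bytes(range(56))
request = (bytes.fromhex("4500005400004000400100000a0000020a000001")
           + b"\x08\x00\x00\x00\x12\x34\x00\x01" + payload)
reply = (bytes.fromhex("4500005400004000400100000a0000010a000002")
         + b"\x00\x00\x00\x00\x12\x34\x00\x01" + payload)
print(check_echo_reply(request, reply))  # prints []
```

An empty list means the reply mirrors the request in every field checked here; the checksum fields in this sample are left as zero because the function does not inspect them.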
The raw Claude output for one of the replies (paraphrased and condensed) looked like this:
Let me parse the incoming IP packet:
IP header:
- Version: 4, IHL: 5 (20 bytes)
- Total Length: 84 bytes
- Protocol: 1 (ICMP)
- Source: 10.0.0.2
- Destination: 10.0.0.1
ICMP header:
- Type: 8 (Echo Request)
- Code: 0
- Identifier: 0x1234
- Sequence: 1
- Payload: 56 bytes of data
Constructing echo reply:
- Swap source/destination IPs
- Type -> 0 (Echo Reply)
- Keep identifier 0x1234, sequence 1
- Same payload
- Compute IP checksum: 0xABCD
- Compute ICMP checksum: 0xDCBA
Response (hex):
4500005400004000400100000a0000010a000002000012340001...[truncated]
Claude showed its work — it reasoned about each field, computed both checksums, and verified its output before sending it back. This is what makes the experiment so compelling: Claude wasn't just regurgitating a memorized packet format. It was performing protocol-level reasoning step by step.
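The checksum arithmetic the model has to carry out "by hand" is the standard Internet checksum from RFC 1071: add the data up as 16-bit words, fold any carries back in, and take the one's complement. As a worked example, the sketch below applies it to the 20 header bytes from the condensed transcript above, whose checksum field is zero. The result it prints differs from the 0xABCD shown there, which suggests the checksum values in the condensed transcript are placeholders.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071: one's complement of the one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Reply IP header from the transcript, checksum field (bytes 10-11) still zero
header = bytes.fromhex("4500005400004000400100000a0000010a000002")
checksum = internet_checksum(header)
print(hex(checksum))                         # prints 0x26a7

# A receiver verifies by summing the header with the checksum filled in
filled = header[:10] + struct.pack("!H", checksum) + header[12:]
print(internet_checksum(filled))             # prints 0 for a valid header
```

The same function applies to the ICMP checksum, computed over the ICMP header and payload instead of the IP header.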
The Architecture: A Userspace IP "Soft Stack"
This experiment is a beautiful example of what Adam Dunkels calls a "soft network stack" — an LLM acting as the protocol processing layer, with a thin shim of Python code to bridge between the raw network interface and the AI. The architecture looks like this:
┌──────────────┐      ┌──────────────┐      ┌────────────────┐
│  Raw Packet  │─────▶│ Python Shim  │─────▶│     Claude     │
│  (On Wire)   │      │ (Scapy/Sock) │      │   (IP Stack)   │
└──────────────┘      └──────────────┘      └────────────────┘
       ▲                     │                      │
       │                     │                      │
       │                     ▼                      │
       │              ┌──────────────┐              │
       └──────────────│ Python Shim  │◀─────────────┘
                      │ (Send Reply) │
                      └──────────────┘
This is essentially what a userspace networking framework like DPDK or netmap does — except instead of running compiled C code optimized for packet processing, the data path runs through an LLM API. The Python shim layer is minimal: capture bytes, send to Claude, get reply bytes, send out.
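One way to read this architecture is that the shim only needs a function from request bytes to reply bytes, and the LLM is one possible implementation of that function. The sketch below makes that slot explicit. The names run_shim, PacketHandler, echo_back, and drop_everything are all hypothetical, and the handlers are trivial stand-ins for the real API call.

```python
from typing import Callable, Iterable, Iterator, Optional

# Anything that maps an incoming packet to a reply (or to None for "no reply")
PacketHandler = Callable[[bytes], Optional[bytes]]


def run_shim(packets: Iterable[bytes], handler: PacketHandler) -> Iterator[bytes]:
    """Pass each captured packet to the handler and yield what should be sent."""
    for packet in packets:
        reply = handler(packet)
        if reply is not None:
            yield reply


def echo_back(packet: bytes) -> Optional[bytes]:
    """Stand-in for the LLM call: returns the input unchanged."""
    return packet


def drop_everything(packet: bytes) -> Optional[bytes]:
    """Stand-in for a handler that never answers."""
    return None


captured = [b"\x01\x02", b"\x03\x04"]
print(list(run_shim(captured, echo_back)))
print(list(run_shim(captured, drop_everything)))
```

Swapping echo_back for a function that calls the Anthropic API, or for a conventional packet-processing routine, would leave the capture and send sides unchanged.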
The crucial thing Claude does that a traditional stack doesn't is explain its reasoning. When a kernel processes a ping, there's no log of why it made decisions. With Claude, you get a natural-language trace of the entire protocol processing chain, including the checksum calculation steps. This has interesting implications for debugging and education.
Why Adam Dunkels Is the Perfect Person for This Experiment
It's worth taking a moment to appreciate who conducted this experiment. Adam Dunkels is the creator of:
- Contiki — the operating system for IoT devices that runs on 8-bit microcontrollers with kilobytes of RAM
- lwIP — the lightweight TCP/IP stack that powers countless embedded systems
- Protothreads — a lightweight threading mechanism for memory-constrained systems
Dunkels has spent his career making IP stacks fit into tiny amounts of memory. lwIP can run on devices with 10-20 KB of RAM. The contrast between that and running an IP stack through a large language model with billions of parameters could hardly be starker. It's the difference between a bicycle and a spaceship as transportation: they both get you there, but the scale of the underlying machinery is wildly different.
This perspective makes the experiment more than a novelty. It's a philosophical exploration: what does it mean to "implement" a protocol? Is it about the bytes on the wire, or the understanding of what those bytes mean?
Running the Experiment Yourself
Want to try this at home? Here's what you need:
Prerequisites
- A Linux machine (raw sockets require root or CAP_NET_RAW)
- Python 3.10+ with the scapy and anthropic packages
- An Anthropic API key with access to Claude Sonnet 4 or Claude Opus 4
Setup
# Install dependencies
pip install scapy anthropic
# You'll need raw socket permissions (setcap rejects symlinks, so resolve the real binary)
sudo setcap cap_net_raw+ep "$(readlink -f "$(which python3)")"
# Set your API key
export ANTHROPIC_API_KEY="sk-ant-..."
Testing
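Configure a secondary IP on your loopback or test interface, start the script, then ping the address from a second terminal:
# On one terminal
sudo ip addr add 10.0.0.1/24 dev lo
# Assumption: the kernel otherwise answers echo requests itself; this sysctl silences it for the test
sudo sysctl -w net.ipv4.icmp_echo_ignore_all=1
sudo python3 claude_ip_stack.py
# On another terminal
ping 10.0.0.1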
Tip: run ping -i 30 10.0.0.1 to space packets out, because the script forwards only one packet to Claude at a time.
What This Means for AI and Networking
Let's be clear about what this experiment doesn't mean. We are not going to replace kernel TCP/IP stacks with LLMs. The latency overhead is multiple orders of magnitude too high for any practical networking application, and the cost (both API and compute) would be astronomical.
But it does suggest some genuinely interesting possibilities:
1. AI as a Protocol Debugger
If Claude can parse a raw packet and explain what it means, it can act as a network protocol assistant. Feed it a PCAP file and ask it to explain malformed packets, suggest fixes, or check compliance with RFCs. This is already practical with current latency constraints — you don't need real-time response for debugging.
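A debugging workflow like this benefits from a deterministic decode to compare the model's explanation against. As a minimal sketch, the function below unpacks the fixed 20-byte IPv4 header into named fields using Python's struct module. The function name parse_ipv4_header and the dictionary keys are illustrative, and IP options beyond the first 20 bytes are ignored.

```python
import socket
import struct


def parse_ipv4_header(packet: bytes) -> dict:
    """Decode the fixed 20-byte IPv4 header into named fields."""
    (ver_ihl, _tos, total_length, ident, flags_frag,
     ttl, protocol, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    return {
        "version": ver_ihl >> 4,
        "header_length": (ver_ihl & 0x0F) * 4,   # IHL is counted in 32-bit words
        "total_length": total_length,
        "identification": ident,
        "dont_fragment": bool(flags_frag & 0x4000),
        "ttl": ttl,
        "protocol": protocol,                    # 1 = ICMP
        "checksum": checksum,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }


# Header bytes taken from the condensed transcript earlier in this guide
fields = parse_ipv4_header(bytes.fromhex("4500005400004000400100000a0000010a000002"))
print(fields["source"], fields["destination"], fields["total_length"])
# prints 10.0.0.1 10.0.0.2 84
```

If the model's natural-language trace disagrees with a decode like this, the discrepancy is itself a useful debugging signal.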
2. Protocol Implementation from Description
The experiment suggests that, at least for a protocol as simple as ICMP echo, describing the behavior in natural language is enough for an LLM to carry it out. For prototyping or one-off protocol handling, you could define the behavior in a prompt instead of writing code. The "implementation" is the system prompt.
3. Educational Tooling
Imagine a networking textbook where the "protocol stack" is an LLM that shows its work. Students could send crafted packets and see exactly how the LLM reasons about each field, each header, each checksum. This is a fundamentally different learning experience from tracing kernel code.
4. Adaptive Protocol Handling
Unlike a fixed C implementation, an LLM-based stack could adapt to non-standard or malformed inputs. It could implement protocol extensions without firmware updates — just prompt changes. This is nearly useless for production (speed and cost trump flexibility), but it's fascinating for research and custom protocol prototyping.
The Bigger Picture: LLMs Understanding Protocols
Dunkels' experiment connects to a broader theme in AI research: LLMs as executable specifications. When you describe a protocol to an LLM, it doesn't just describe back what the protocol does — it can actually execute it, responding to real packets with real bytes on the wire.
This is different from an LLM generating code that implements a protocol (like asking Claude to write a ping responder in Python). In Dunkels' experiment, the LLM is the protocol implementation. There's no generated code layer. The model's understanding of IP and ICMP — learned entirely from text — translates directly into correct byte-level behavior.
As Dunkels put it: the question wasn't whether an LLM could replace a network stack — it was whether it could understand one well enough to act as one. The answer, at ~10 seconds per packet, is a resounding yes.