What an Ultra-Low-Interaction Honeypot Sees When You Watch for Years

On May 19, 2026, between 19:43 UTC and the end of the day, 310,119 distinct source IPs hit a Raspberry Pi in California. They fired 1,807 different payload variants in lockstep, none of them matching any known L7 protocol: raw bytes sent before any TLS or HTTP handshake completed. From the Pi’s perspective, the day was a wall of unsolicited inbound that no other day in the previous twenty-seven months came close to matching.

The wall is in the data because the Pi has been doing one thing, quietly, since 2024: answering scans without engaging. No banners, no honey credentials, no Cowrie shell, no fake-Telnet rabbit hole. Just a sensor that records every packet that lands on its address and lets nothing through. Ultra-low-interaction, long-window observation. The thesis: if you sit completely still on a public IP for years and listen, the bleeding edge of the internet walks past your front door at 3am.

This post is about what walks past.

The sensor

A high-interaction honeypot (Cowrie, T-Pot’s SSH emulator, Dionaea) lets the attacker make progress. Type a password and you “get in.” Drop a binary and the honeypot pretends to execute it. The reward is detailed attacker behavior: what they type, what they download, what they pivot to. The cost is enormous noise and meaningful operational risk. The more your sensor engages, the more it looks like a real target, and the more it attracts retail scanners running mass-exploitation kits that have nothing to teach you.

An ultra-low-interaction honeypot does the opposite. It captures the packet, generates a minimal response (or none), and refuses to play. Senders that need the conversation to continue give up and move on. What survives in the log is the inbound-only side: the SYN, the unsolicited UDP, the SCTP INIT, the GRE encapsulation attempt, the malformed packet, the pre-auth probe, all before any attacker behavioral mimicry kicks in.

That sounds like a strict downgrade. For long-window pattern and anomaly detection, it’s the opposite. The signal-to-noise ratio is exceptional because the sensor never produces engagement noise. Every packet is exactly as the attacker emitted it. No engine, emulator, or trap has shaped what you see. The capture is protocol-promiscuous: TCP, UDP, ICMP, plus SCTP (proto 132), GRE (47), ESP/AH (50/51), IP-in-IP (4), 6to4 (41), and the routing-protocol curiosities like OSPF (89) and EIGRP (88) that turn up occasionally at a public address with no business being there. And the risk to your network is near zero. Nothing replied with anything useful, so nothing pivoted to anything dangerous. It’s a single-purpose Unix daemon in spirit. Does one thing. Listens.

The sensor doing the heavy lifting here isn’t mine. It’s Tom Liston’s patented stateless-TCP honeypot (U.S. Patent No. 12500930), from the same Tom Liston of LaBrea Tarpit fame, applied at Pi-sized scale. The trick is in the name. A normal TCP listener allocates a socket, a conntrack entry, a chunk of kernel state, and an application thread for every incoming SYN. Multiply that by 65,535 ports and the kernel falls over before the attacker finishes typing. Liston’s design crafts the SYN-ACK on the wire without ever creating connection state. No socket pool. No conntrack. No per-flow memory. The “session” lives entirely in the packet that goes back out. That’s how a $40 Raspberry Pi can listen on every TCP port, every UDP port, every ICMP type, every weird IPv4 protocol the kernel will hand it, and the IPv6 versions of all of the above (65,535 × 2 + ICMP + the long tail) without dropping a packet or breaking a sweat. The ideal single-purpose Unix daemon: small, focused, one thing, quietly impossible scale. Every claim in this post is downstream of that one piece of engineering. Without stateless TCP you don’t get all-port coverage on commodity hardware, and without all-port coverage you don’t see the leading edge.
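To make "the session lives entirely in the packet" concrete, here is a minimal sketch of the generic SYN-cookie-style idea: derive the reply's sequence number from the incoming SYN plus a secret, so a later ACK can be checked by recomputation rather than lookup. This is an illustration only and not the patented design, whose internals this post does not describe; SECRET, syn_ack_isn, and ack_is_plausible are hypothetical names.

```python
import hashlib
import hmac
import ipaddress
import struct

# Hypothetical per-sensor secret (assumption); a real deployment would generate and rotate it.
SECRET = b"example-secret-not-for-production"

def syn_ack_isn(src_ip, src_port, dst_ip, dst_port, client_isn):
    """Derive a 32-bit server sequence number from the SYN itself, storing nothing."""
    msg = (
        ipaddress.ip_address(src_ip).packed
        + ipaddress.ip_address(dst_ip).packed
        + struct.pack("!HHI", src_port, dst_port, client_isn)
    )
    digest = hmac.new(SECRET, msg, hashlib.sha256).digest()
    return struct.unpack("!I", digest[:4])[0]

def ack_is_plausible(src_ip, src_port, dst_ip, dst_port, client_isn, ack_number):
    """A genuine follow-up ACK acknowledges ISN + 1; recompute instead of looking up state."""
    isn = syn_ack_isn(src_ip, src_port, dst_ip, dst_port, client_isn)
    return ack_number == (isn + 1) & 0xFFFFFFFF
```

In the follow-up ACK the client's own sequence number is its ISN plus one, so everything needed for the check arrives in the packet being checked. No table is consulted at any point, which is the property that keeps memory flat regardless of how many ports are probed.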

The other half of the recipe is patience. Interesting patterns at the bleeding edge of unsolicited inbound are slow. Operator infrastructure persists across months: a campaign that lasts a weekend looks like noise, but the same operator coming back six months later with new IPs, the same JA4 fingerprint, and the same RDP mstshash cookie is a campaign you can attribute to a coherent actor. Some scanners rate-limit to a few packets per /24 per week to stay under detection thresholds; the activity fans out across dozens of source IPs in the /24 over months, and only continuous observation can reconstruct it. Specific Mozi or Mirai botnet members live at one IP for days to weeks before the swarm rotates them, but the C2 layer behind those nodes lasts for years, and long-window observation lets you see the C2 silhouette through the rotating membership. Calendar effects show up only across cycles: US-holiday timing, weekend low-staffing patterns, end-of-quarter campaign sprints. These need multi-year context before you can honestly call them “the operator is optimizing against your SOC’s calendar.” And the leading edge of weird protocols is rare per day, common per year. SCTP arrivals at a residential public IP are vanishingly unlikely on any given day. Twenty-seven months of capture surfaced 137 of them from 22 distinct sources, enough to look at the port triad and realize you’re seeing SIGTRAN/Diameter recon, not noise. None of this is visible in a one-week pcap. Months don’t fully cut it. Years do.

The pipeline

The sensor (honeypi00) is a Raspberry Pi running Tom Liston’s stateless-TCP honeypot. The honeynet is multi-homed across four static public IPs on four different ISPs (cable, fiber, DSL, and Starlink satellite), so what lands on the sensor isn’t an artifact of any one provider’s routing or CGNAT. The sensor dumps every packet to syslog as a base64-encoded Ethernet frame plus a parsed header, and the log server collects everything. On top of that I built a vibe-coded analysis stack called SensorGrind_AI that chews through the corpus and produces a defender-facing report.
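As a rough illustration of what "a base64-encoded Ethernet frame" gives a parser to work with, the sketch below decodes one frame and pulls out the IPv4 addresses and protocol number. The exact syslog line layout is not described here, so this assumes the base64 field has already been isolated and holds an untagged Ethernet II frame carrying IPv4; parse_frame and PROTO_NAMES are hypothetical names.

```python
import base64
import ipaddress
import struct

# IANA protocol numbers for the protocols named in this post.
PROTO_NAMES = {
    1: "ICMP", 4: "IP-in-IP", 6: "TCP", 17: "UDP", 41: "IPv6-encap (6to4)",
    47: "GRE", 50: "ESP", 51: "AH", 88: "EIGRP", 89: "OSPF", 97: "ETHERIP", 132: "SCTP",
}

def parse_frame(b64_frame):
    """Return (src_ip, dst_ip, proto_name) for an IPv4 Ethernet II frame, else None."""
    frame = base64.b64decode(b64_frame)
    if len(frame) < 14 + 20:          # Ethernet header + minimal IPv4 header
        return None
    ethertype = struct.unpack("!H", frame[12:14])[0]
    if ethertype != 0x0800:           # VLAN tags and IPv6 are out of scope for this sketch
        return None
    ip = frame[14:]
    if ip[0] >> 4 != 4:               # IP version nibble
        return None
    proto = ip[9]                     # protocol field of the IPv4 header
    src = str(ipaddress.IPv4Address(ip[12:16]))
    dst = str(ipaddress.IPv4Address(ip[16:20]))
    return src, dst, PROTO_NAMES.get(proto, f"proto-{proto}")
```

Keeping the whole frame rather than a parsed summary is what lets later stages revisit protocols nobody thought to parse at capture time.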

A consolidator turns the rotated raw syslog into a single sorted merged.log.gz, repairing sensor-side clock drift along the way against accurate-clock anchors from the same syslog stream (the network firewall, mostly; more on that in a moment). The parser streams the merged log into four daily-partitioned Parquet tables: 1.1 billion packets, 97 million deduplicated payload blobs, 32 million parsed L7 events (HTTP, TLS, SSH, RDP, and the rest), and 22.7 million extracted IOCs. Then ~36 analyzer stages run over those tables. Some are dumb aggregations (portscans, workhours, pacing). Some are model-driven (LightGBM threat classifier with temporal split and SHAP, HDBSCAN actor clustering, FP-Growth pattern mining with lift and conviction gates, PELT change-points, Wasserstein and PSI feature drift with BH-FDR correction, survival analysis by cohort, Leiden community detection over a tooling-fingerprint graph). Some are bespoke. botnetcls does YARA-style payload classification. ja4plus computes JA4 / JA4T / JA4S. tooling fingerprints per-IP toolkits. selfmimic watches for inbound packets that echo the sensor’s own banners back at it. nonipproto is the protocol-promiscuous catch-all for SCTP, GRE, ESP/AH, and the long tail.

A campaign-onset detector then looks at each payload sent by ≥10 distinct IPs, computes the per-IP first-arrival distribution, and extracts p5 (launch), median, peak day, and p95 (decay). The ranking is density_score = ip_count × corpus_span / burst_window, which prefers tight bursts over always-on background; payloads whose senders span more than 60% of the corpus drop out entirely. On top of all that is the part that earns the build cost: a cross-signal correlator with 18 hand-coded rules joining two or more analyzer outputs, and an LLM stage that prompts multiple AIs to write a structured narrative for each finding. What happened. Why it matters. The closest named threat model. The risk if the toolkit hit a real target. Recommended action. Cached by prompt SHA so re-runs are free. The whole pipeline runs in about six and a half hours, costs maybe $3 in API calls per run, and produces a ~480 KB markdown report I read with coffee. Section 0.5 of that report, the cross-signal findings with their narratives, above the executive summary, is the part I built this whole thing for.
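The onset scoring described above can be sketched in a few lines. This is a minimal illustration, not the SensorGrind_AI source. Several details are assumptions: burst_window is taken as p95 minus p5 of the per-IP first arrivals, it is floored at one hour to avoid dividing by zero, percentiles use the nearest-rank method, and the 60% cutoff is applied to the span between the first and last first-arrival. The names percentile and campaign_onset are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta

def percentile(sorted_vals, q):
    """Nearest-rank percentile of a pre-sorted list; q is an integer in 1..100."""
    k = max(1, (q * len(sorted_vals) + 99) // 100)   # integer ceiling of q% of n
    return sorted_vals[k - 1]

def campaign_onset(first_seen_by_ip, corpus_start, corpus_end,
                   min_ips=10, max_span_frac=0.60):
    """first_seen_by_ip maps each source IP to the datetime it first sent this payload."""
    if len(first_seen_by_ip) < min_ips:
        return None
    arrivals = sorted(first_seen_by_ip.values())
    day = timedelta(days=1)
    corpus_span = (corpus_end - corpus_start) / day
    if (arrivals[-1] - arrivals[0]) / day > max_span_frac * corpus_span:
        return None                                  # always-on background, drop it
    p5, p50, p95 = (percentile(arrivals, q) for q in (5, 50, 95))
    burst_window = max((p95 - p5) / day, 1 / 24)     # assumption: one-hour floor
    peak_day = Counter(t.date() for t in arrivals).most_common(1)[0][0]
    return {
        "launch": p5, "median": p50, "decay": p95, "peak_day": peak_day,
        "burst_window_days": burst_window,
        "density_score": len(arrivals) * corpus_span / burst_window,
    }
```

Because the burst window sits in the denominator, twenty IPs that all show up within a day outrank two thousand that trickle in over a year, which is the ordering the paragraph above describes.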

What walks past the door

Eighty-seven cross-signal findings in the most recent run, six of them critical, fifty-two high. The Mirai-is-still-a-thing stuff is in there, but it’s not what surprised me.

Start with the telco recon, because it’s the most surprising legitimate signal in the corpus. Sixteen sources, mostly on Chinese carrier and cloud ASNs, have been methodically probing the canonical telco-signaling port triad: 5060 (SIP), 5061 (SIPS), 36412 (S1AP/SCTP). The same sources delivered 137 unsolicited SCTP packets from 22 distinct IPs over the corpus window. SCTP arrivals at a darknet residential IP are not noise; they’re deliberate SIGTRAN/S1AP fingerprinting. The port mix on the TCP side was surgical: 103.56.61.144 hit 5061×22, 5060×10, and 36412×9. This isn’t a VoIP fraud operator. This is mobile-core reconnaissance. The LLM writeup of the finding named the matching TTPs unprompted (SS7 MAP location-tracking, Diameter signaling injection, S1AP fuzzing of MME/HSS) and noted that a successful version of the attack would “geolocate subscribers, intercept SMS-based 2FA, redirect or eavesdrop on calls, and commit international revenue-share fraud billed to the carrier.” The SS7/Diameter attack surface has been public knowledge since 2014 but is still under-defended at most operators. The Pi caught the recon side of an attack class that almost nobody is set up to detect, and attributed it to specific source ASNs. A telco SOC would see the same campaign as a GTP/SIP probe at their signaling DMZ. I saw it as 137 stray SCTP packets at a Raspberry Pi. The recon happens here first.

Next, the May 19 burst. 1,807 distinct payload SHA-256s shared their peak day on May 19, 2026: 310,119 unique source IPs over the day, 12% of all campaign-classified traffic in the entire corpus arriving in one 24-hour window. Every single one of those 1,807 payloads landed in the L7 “unknown” bucket, raw TCP/binary before any application protocol handshake completed. That alignment is not coincidence. Independent operators don’t peak on the same day. This is a single C2 layer (or a single botnet-as-a-service) issuing synchronized tasking to a shared bot fleet across 1,800+ payload variants. The all-unknown L7 profile is operationally suspicious: non-standard ports, custom binary protocols, or pre-handshake probes designed to evade signature-based DPI. You catch this kind of thing only if your sensor records raw packets (no L7 termination) and you compute cross-payload synchronization at the year scale. Neither of those is standard practice at commercial threat-intel shops.
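Cross-payload synchronization is, at its simplest, a group-by on peak day. The sketch below assumes each payload's peak day and sender set have already been computed upstream; synchronized_peak_days and the min_payloads threshold are hypothetical, chosen only to illustrate the join.

```python
from collections import defaultdict

def synchronized_peak_days(peak_day_by_payload, ips_by_payload, min_payloads=100):
    """Group payload hashes by peak day and report days shared by many payloads.

    peak_day_by_payload: {payload_sha256: day}
    ips_by_payload:      {payload_sha256: set of source IPs}
    Returns [(day, payload_count, distinct_ip_count)], largest payload_count first.
    """
    by_day = defaultdict(list)
    for sha, day in peak_day_by_payload.items():
        by_day[day].append(sha)
    out = []
    for day, shas in by_day.items():
        if len(shas) < min_payloads:
            continue
        ips = set().union(*(ips_by_payload[s] for s in shas))
        out.append((day, len(shas), len(ips)))
    return sorted(out, key=lambda row: row[1], reverse=True)
```

A day on which a large number of otherwise distinct payloads peak together is the shape to look for; a per-payload view never shows it because each payload on its own looks like an ordinary burst.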

The detect that’s going to be controversial: twenty-five distinct hosts inside 146.88.241.0/24, Arbor Networks/Netscout’s ATLAS measurement subnet, have each touched the sensor on roughly 58 to 83 days, spread across 833 to 841 day spans, with per-source activity densities of 7 to 10%. The arrival pattern shows a single coordinated campaign spread thinly across the whole /24, with each individual IP staying well under any rate-based detector but the subnet as a whole hitting the sensor most weeks of the year for over two years. Two possibilities. Either this is Arbor/Netscout’s legitimate internet-measurement infrastructure doing its job (in which case their ATLAS scanner isn’t on every threat-intel allowlist it should be on, including mine), or someone has compromised hosts inside the ATLAS range and is running a stealth campaign from inside a DDoS-vendor’s measurement footprint to defeat attribution. Either is publishable. Both matter to a defender about to treat all of 146.88.241.0/24 the same way. You only see this kind of structure with subnet-wide aggregation across years. Per-IP detectors miss it. Subnet aggregation at weekly resolution buries it. You have to compute /24 coordination density over the full corpus span.
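One way to compute that kind of /24 coordination density is sketched below, under stated assumptions: events arrive as (source IP, calendar day) pairs, a subnet qualifies when many of its hosts are each individually quiet, and coverage is measured as distinct ISO weeks with any activity. The function name and both thresholds (min_hosts, max_host_density) are hypothetical.

```python
from collections import defaultdict
from datetime import date
import ipaddress

def subnet_coordination(events, corpus_days, min_hosts=10, max_host_density=0.15):
    """events: iterable of (src_ip, date). Returns one dict per qualifying /24."""
    days_by_host = defaultdict(set)
    for ip, day in events:
        days_by_host[ip].add(day)

    hosts_by_net = defaultdict(dict)
    for ip, days in days_by_host.items():
        net = str(ipaddress.ip_network(f"{ip}/24", strict=False))
        hosts_by_net[net][ip] = days

    findings = []
    for net, hosts in hosts_by_net.items():
        if len(hosts) < min_hosts:
            continue
        densities = [len(d) / corpus_days for d in hosts.values()]
        if max(densities) > max_host_density:
            continue                      # a loud host is a different kind of finding
        union_days = set().union(*hosts.values())
        weeks = {tuple(d.isocalendar())[:2] for d in union_days}   # (ISO year, ISO week)
        findings.append({
            "subnet": net,
            "hosts": len(hosts),
            "max_host_density": max(densities),
            "weeks_active": len(weeks),
            "week_coverage": len(weeks) / (corpus_days / 7),
        })
    return sorted(findings, key=lambda f: f["week_coverage"], reverse=True)
```

The contrast that matters is low per-host density against high subnet-wide week coverage: each address looks idle, while the /24 as a unit almost never is.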

The protocol catalog has its own surprises. Over the 27-month window, 1,500+ unique source IPs have sent unsolicited GRE (proto 47), IP-in-IP (4), 6to4 (41), ESP/AH (50/51), and even ETHERIP (97) packets at a residential IP that never initiates traffic. None of these protocols carry session state in the TCP sense, and all of them are tunneling-class protocols an attacker might use to set up a covert channel, probe for misconfigured tunnel endpoints, or fingerprint a target’s upstream-router stance toward unsolicited encapsulated traffic. A 6to4 packet at a residential IPv4 isn’t how anything legitimate behaves in 2026. An ETHERIP packet is so rare it could plausibly be a single researcher writing a master’s thesis. Yet they keep arriving, from a long tail of distinct sources, year after year. The nonipproto analyzer is the only reason they’re visible at all; 90% of honeypot stacks silently drop anything that isn’t TCP/UDP/ICMP. The AI decoder ran on a sample of each protocol’s payload bytes and identified one of the IP-in-IP samples unprompted as a packet encapsulating an ICMP-redirect attempt, exactly the kind of probe you’d see if someone was looking for routers willing to accept spoofed redirects via tunnel.

There’s a parallel detect at the minute scale. On 2024-06-24 between 07:30 and 07:50 PDT, the sensor saw fifteen consecutive one-minute buckets each carrying 343 to 426 brand-new source IPs, against a baseline of 28 new IPs per minute. That’s not steady-state botnet noise. That’s the moment of a botnet C2 push: an operator detonated a fresh batch of bots into circulation and the sensor was looking. The arrival_bursts detector catches it by asking, minute by minute, how many of the IPs hitting the sensor have never been seen before, and flagging buckets more than five times the median. Steady-state rate-limited scanning never qualifies. Coordinated rotation does.

The last featured detect is one I’m watching now because it’s still warm. Between April 27 and May 1, 2026, 1,190 distinct source IPs hammered the sensor’s SSH port from a tightly-coordinated campaign that launched on the 27th with 547 unique sources in the first 24 hours and decayed inside 3.68 days. Density score of 271,989, ranked first on density among 1,867 campaigns in the corpus. The 69.5.169.0/24 cluster, MEVSPACE and IP-Volume bulletproof ranges, and several DigitalOcean droplets all participated in lockstep. This is the launch of a campaign, caught while it’s still ongoing. The campaign-onset detector with its p5/p95 percentile model is exactly what makes a fresh-and-tight burst visible against the background; a threshold-based “synchronized waves” ranker would have buried it under the year-long noise.

The reason the finding count more than quadrupled this run is a pattern that only resolves at corpus scale: behavioral fingerprint rotation. A single botnet’s stack signature (its JA4T fingerprint, its packet choreography, the odd ephemeral port it keeps ringing, such as TCP/20211, /14762, /23142, or plain Telnet/23) stays constant while the infrastructure underneath it churns. One fingerprint reappeared across 2,422 distinct ASNs hammering TCP/20211; a Telnet/23 credential-spray fingerprint rotated across 2,266 ASNs and 15,992 CIDRs. Per-IP and per-ASN blocklists are a treadmill against this; the only durable detection is the behavior, not the address. Most of the fifty-two high-severity findings this run are some flavor of the same story: the same actor wearing a different network each week.
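Measuring rotation comes down to counting how much infrastructure sits under one behavioral key. The sketch below assumes rows of (fingerprint, ASN, source IP) are available from upstream stages, and it buckets IPs into /24s as a stand-in for "CIDRs" since the prefix length behind the figures above is not stated; rotation_breadth and min_asns are hypothetical.

```python
from collections import defaultdict
import ipaddress

def rotation_breadth(rows, min_asns=3):
    """rows: iterable of (fingerprint, asn, src_ip).

    Returns [(fingerprint, distinct_asns, distinct_slash24s)], widest rotation first.
    """
    asns = defaultdict(set)
    cidrs = defaultdict(set)
    for fp, asn, ip in rows:
        asns[fp].add(asn)
        cidrs[fp].add(str(ipaddress.ip_network(f"{ip}/24", strict=False)))
    out = [(fp, len(asns[fp]), len(cidrs[fp]))
           for fp in asns if len(asns[fp]) >= min_asns]
    return sorted(out, key=lambda row: row[1], reverse=True)
```

Keying on the fingerprint and treating addresses as the thing being counted inverts the usual blocklist view, where the address is the key and behavior is an attribute.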

The rest are mostly known patterns whose value lies in the corpus-scale measurement. 3.3 million ICMP echo packets with non-trivial payload from 18,663 sources lit up the covert-channel detector at corpus scale. 101,532 SYN packets with payload from 3,610 sources plus 23,369 mid-stream ACK/PSH-ACK arrivals with no prior SYN add up to a serious raw-socket-scanner / spoofing-toolkit volume. The RDP mstshash cookie hello has been reused across 1,478 distinct IPs in 106,602 connection attempts (the stateless-TCP design completes enough of the handshake that the client sends the CR_TPDU with the cookie in the third segment, so this signal is real even though the sensor never authenticates anyone). Credential-spray-as-a-service has a longer commercial lifespan than most defenders model. Eighteen otherwise-unrelated campaigns peaked on US Independence Day 2025 with their top five peak-day IP counts converging into a 35-IP band (997–1032 each), operator-discipline calendar targeting visible only because the corpus spans multiple holiday cycles.
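For the RDP signal specifically, the cookie travels in cleartext in the client's connection request as a "Cookie: mstshash=..." line terminated by CRLF, so extracting it and counting reuse needs nothing more than a regex. This is a minimal sketch with hypothetical function names; it assumes payload bytes are available per packet and does not validate the surrounding TPKT/X.224 framing.

```python
import re
from collections import defaultdict

# The routing cookie appears as an ASCII line inside the connection request.
MSTSHASH = re.compile(rb"Cookie: mstshash=([^\r\n]+)\r\n")

def rdp_cookie(payload):
    """Return the mstshash value from a payload, or None if absent."""
    m = MSTSHASH.search(payload)
    return m.group(1).decode("ascii", "replace") if m else None

def cookie_reuse(events):
    """events: iterable of (src_ip, payload_bytes).

    Returns {cookie: (distinct_source_ips, connection_attempts)}.
    """
    ips = defaultdict(set)
    attempts = defaultdict(int)
    for ip, payload in events:
        cookie = rdp_cookie(payload)
        if cookie is None:
            continue
        ips[cookie].add(ip)
        attempts[cookie] += 1
    return {c: (len(ips[c]), attempts[c]) for c in ips}
```

A cookie value shared across many unrelated source addresses is a tooling artifact rather than a user name, which is why its reuse count is worth tracking as a fingerprint.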

A sidebar on the clock that lied to me

A short detour because it’s a useful cautionary tale, then back to the main thread.

For about eight months mid-corpus, the sensor’s timestamps were wrong. Honeypi00 has no battery-backed RTC. The clock comes from NTP at boot. The box was rebooting constantly because its share of the 8-port USB-C power supply running the whole honeynet was starting to fail: just enough current to come up, not enough to keep the network stable. Every reboot, the clock fell back to the hardware-default 2019-02-14 epoch. Eight months of “the analyzer shows 2019 attacks” turned out to be a $30 wall wart.

I fixed the power supply. The clock has held steady since. The software-side correction in the consolidator (anchoring against firewall timestamps that survived the reboot loop on a properly-powered device) stays in the pipeline as defense-in-depth. The next clock failure in this house won’t be the last.
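The anchoring idea can be sketched as a single pass over the merged stream. This is an illustration under assumptions, not the consolidator itself: it assumes lines are processed in the order the log server received them, that each line is tagged as coming from a trusted-clock device ("anchor") or from the sensor, and that any sensor timestamp earlier than the corpus start is treated as bogus. repair_timestamps and its tuple layout are hypothetical.

```python
from datetime import datetime

def repair_timestamps(lines, earliest_plausible):
    """lines: iterable of (source, timestamp, payload) in log-server arrival order.

    source is "anchor" for devices with a trusted clock, anything else for the sensor.
    Yields (timestamp, payload, was_repaired) for sensor lines only.
    """
    last_anchor = None
    for source, ts, payload in lines:
        if source == "anchor":
            last_anchor = ts              # remember the most recent trusted time
            continue
        if ts < earliest_plausible and last_anchor is not None:
            # Sensor clock fell back to its default; borrow the nearest trusted time.
            yield last_anchor, payload, True
        else:
            yield ts, payload, False
```

The repaired timestamps are only as fine-grained as the anchor traffic is frequent, so keeping the was_repaired flag lets downstream minute-scale detectors treat those rows with appropriate suspicion.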

The reason this matters for the main story: long-window analysis makes you confront hardware reality. A week-long capture, you can trust the timestamps and move on. Two years in, you’ll discover that your sensor’s idea of “when” has decoupled from yours, and every signal that depends on time, every drift detector, every campaign-onset percentile, every “this campaign launched in March” finding, was being computed on lies. If you’re going to watch for years, you have to verify the clock at the cadence of years.

Spice over. Back to the main course.

What I’d say to another operator

Patience beats interaction. Every additional layer of pretend-I’m-a-real-target adds noise and attack surface. The information you need for trend and anomaly work lives in the arrival pattern, not the conversation. The minute-scale arrival burst on 2024-06-24 (15 consecutive one-minute buckets of 343 to 426 brand-new IPs against a baseline of 28) is exactly the kind of signal a high-interaction honeypot would lose under engagement noise; the layer-4 sensor caught the moment of detonation because the moment is all it listens for.

Capture protocol-promiscuously, then analyze without assuming TCP/UDP/ICMP is the universe. The most novel finding in this run needed SCTP capture, which 90% of honeypot stacks silently drop.

“Noteworthy” is not a finding. A report that flags 36k-IP patterns without explaining them is just printing noise in a bigger font. The cross-signal correlator with LLM-written narratives is the part of the tool that converts data into “you should care because X.” Build that or your dashboard is decorative.

The leading-edge signal is in the cross-modal joins, not in any single analyzer. Telco SS7 recon is invisible if you only look at TCP scans. It’s invisible if you only look at SCTP arrivals. The signal is in the coincidence, which IPs do both. The harness has to be designed to ask that question.
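The "which IPs do both" question is a set intersection once both signals are keyed by source address. A minimal sketch follows, assuming packets are available as (source IP, IP protocol number, destination port) tuples; telco_recon_candidates and the min_triad_ports threshold are hypothetical.

```python
from collections import defaultdict

TELCO_PORTS = {5060, 5061, 36412}   # the SIP / SIPS / S1AP triad named earlier in this post

def telco_recon_candidates(packets, min_triad_ports=2):
    """packets: iterable of (src_ip, ip_proto, dst_port_or_None).

    Returns the sorted source IPs that both probed at least min_triad_ports of the
    triad over TCP (proto 6) and sent at least one SCTP packet (proto 132).
    """
    tcp_ports = defaultdict(set)
    sctp_sources = set()
    for ip, proto, port in packets:
        if proto == 6 and port in TELCO_PORTS:
            tcp_ports[ip].add(port)
        elif proto == 132:
            sctp_sources.add(ip)
    return sorted(ip for ip, ports in tcp_ports.items()
                  if len(ports) >= min_triad_ports and ip in sctp_sources)
```

Either half alone produces a long list of unremarkable sources; the intersection is short, and that shortness is the finding.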

Anomaly detection over long windows needs percentile-based thinking, not threshold-based. The original synchronized-waves detector ranked by ip_count divided by span_minutes, which silently rewarded always-on background and silently buried the campaigns I wanted. The replacement (per-IP first-arrival percentiles, peak-day extraction, density scoring with a 60%-of-corpus burst-window cutoff) surfaced the May 19 1,807-campaign day on the first run.

The clock is a sensor. Sensors fail. Verify ground truth at the cadence you analyze at. Twenty-seven months of corpus uncovered eight months of hardware-induced timestamp drift. A week-long capture would never have noticed. Anchor every timestamp against a source the sensor didn’t generate.

Slop is real and accumulates. The reckless intern with a Red Bull habit will copy-paste eighteen near-identical _finding(...) calls and add a no-op try: ... except Exception: raise wrapper in five different files because it felt right. You have to be the one who notices. Vibe coding is fine. Vibe analyzing is the dangerous part. Claude wrote 95% of the code; the pipeline is solid. The interpretation of what the pipeline produces is where the AI confidently produces wrong answers, and where the expert has to push back hardest.

Closing

The honeypot has been listening for years. The pipeline has been running for a week. The bleeding edge of unsolicited-inbound recon (the SS7 stuff, the synchronized multi-payload pushes, the Arbor-range slow-and-low, the minute-scale botnet detonations) is in there because the sensor doesn’t blink and the analysis is willing to look at the whole window at once.

What an ultra-low-interaction long-window sensor gives you that a commercial threat-intel feed doesn’t is the leading edge, not the lagging edge. Feeds publish what has been reported. A quiet sensor on a residential IP is watching what’s being tried right now. The SS7 recon campaign has been hitting this box since 2024 and I haven’t seen it written up anywhere. The 2024-06-24 botnet detonation showed up as a fifteen-minute spike at this sensor and nowhere I’ve looked since. The Arbor ATLAS range may or may not belong to who its WHOIS says it does. None of these have ticket numbers at a SOC yet.

If you run a network and you don’t have an ultra-low-interaction sensor somewhere on a public IP just listening, you’re paying for threat intel that’s a published version of what my Pi already saw six months ago. If you have one but only look at the last seven days, you’re throwing away the only thing that makes the sensor useful. And if you have a long-window corpus but no analyzer stack that does cross-signal joins and computes percentiles instead of thresholds, you’re sitting on years of data and printing top-five lists of background noise.

The bad guys aren’t waiting. They’re trying things, at scale, 24/7, against every public IP on the internet, including this one. The point of building the sensor and the pipeline is to find out what they’re trying first, before it shows up in a vendor blog, before it makes the news, before your SOC’s first ticket of the quarter. Quiet sensor, long window, cross-signal analysis. That’s the recipe.

Also, check your power supply.

Credits. The sensor that makes any of this work is Tom Liston’s stateless-TCP sensor, U.S. Patent No. 12500930. Tom invented LaBrea Tarpit twenty-something years ago and has been quietly making unsolicited-inbound observation cheap and complete ever since. The honeypi packaging and analysis stack on top (SensorGrind_AI) are mine.

iamnor