I Built the Fishing-Forecast Tool Some PhDs Told My Friend He Couldn’t

A friend of mine has been fishing seriously for years and recently started getting into AI. He started bolting prompts and scripts together into the beginnings of a fishing helper, the way you do when you have a real hobby and a new toy and you want them to talk to each other. He took it to some sciency PhD types he knew for feedback. They were dismissive in the particular way credentialed people can be dismissive: not “here is the paper that already addresses what you are trying to do,” not “here is the dataset you actually want,” just the gatekeeping flavor. He told me about it. It kinda got my dander up.

So I went looking, on the assumption that whatever the rude PhDs had not handed my friend was sitting in plain sight in the literature and the federal data catalog. I gave it six deep-research passes and that turned out to be true. The USGS has published presence-probability maps for 419 native species across every river and stream reach in the lower 48, free download. Peer-reviewed thermal-niche papers for the species my friend cares about go back to Elliott 1994 on brown trout, Wehrly 2007 on brook trout, Selong 2001 on bull trout. Charbonneau 2025 (more on that one in a minute) measured what the recreational-angler catch rates actually mean against a real population baseline. The federal data substrate behind all of that is even more available than the literature: federal hydrography, USGS real-time gauges, EPA water-quality alerts, NOAA forecasts, Open-Meteo’s sixteen-day extended forecast. None of it is behind a paywall. None of it requires a partnership. The work to bolt it into something useful is real work, but it is not impossible work, and the people who told my friend otherwise either had not looked or had a reason not to want him to.

This post is about what I bolted together for him, where the build got surprising, and a handful of things I had to learn the hard way along the way.

What I built him

The short version: a tool that runs on his laptop, ingests federal data once over the network, then for any river he names and any species he cares about answers the question “where on this water and on which days this week is fishing most likely to be worth my time.” The answer comes back as sixteen colored daily maps, one per day in the forecast window, plus a summary map showing the best day per reach over the whole window, plus a paragraph in plain English naming the best and worst day with the reasons.

The reasons are the load-bearing part. On a representative test query for brown trout on the Big Hole River, the best day in the window was 2026-06-21. Mean water temperature that day was projected at 13.4 C, which sits half a degree below Elliott 1994’s brown trout optimum of 13.9 C. Flow was within the typical recent-baseline band. The seasonal multiplier was at its post-spawn-feeding maximum because June is when brown trout feed hardest. The worst day was 2026-06-29 at a forecast 21.7 C, which is eight tenths of the way down the warm side of the bell. Three days in the window flagged a snowmelt-driven flow anomaly. The tool says all of that in plain English in the narrative file, with the specific numbers and with citations to the papers each threshold came from.

His current workflow before this was a tab graveyard. USGS WaterWatch for the gauge. NWS for the air temp. A state agency’s stocking page if he cares about that. A reach map in TroutRoutes or onWater. The state regulation PDF. A forum thread to see if anyone has reported back from the river this week. Some species-temperature table memorized from a magazine in 1998. Each of those tools is fine on its own. Stitched together they are a chore. None of them answers his actual question, because each one occupies one slice of it and the joins are all in his head.

The tool runs entirely on his machine. No SaaS account. No phone-home. No analytics call. The language model that writes the narrative paragraph is a local llama.cpp model loaded on whatever backend his hardware happens to favor. The data, once pulled, stays on disk. He can run the whole thing in airplane mode. The code is GPL-3.0. The licenses on each model and each dataset get surfaced at the moment of download, not buried in a click-through.

How it works under the hood

Ten components, each independently testable, all anchored on a single substrate: every reach-level join in the system points at one federal reach ID. The hydrography is the truth.

Component	What it does
Hardware Probe	Looks at CPU, RAM, GPU, NPU. Picks the right local-inference backend.
Inference Layer	Loads the chosen language model on the chosen backend.
Model Registry	Catalog of which model to use for which task. Picks based on VRAM available.
Data Ingestion	Per-source modules that pull from each authoritative endpoint into local storage.
Feature Store	Local database. All joins anchor on the federal reach ID.
Prediction Layer	The math. Species priors, water temperature, thermal niches, phenology, flow, scoring, map rendering.
Calibration Layer	The probability type. Every probability ships with a 95% interval that cannot be stripped.
Reasoning Layer	A three-stage AI pipeline (parse the question, plan the tools to call, write the narrative).
Ethics Layer	Sensitive-species suppression. Consulted before any reach-level output is written.
User Interfaces	CLI, local web API, PNG, GeoJSON, Markdown narratives.

The math at the center is one function, a multiplicative composite of five honest factors:

suitability_index = clamp_0_1(
    presence_prior
    * thermal_fit
    * flow_factor
    * seasonal_phenology
    * flow_anomaly
)

The presence prior is the USGS species-presence probability. The thermal fit is a bell curve anchored on the species’s published optimum temperature, with the width set so the function reads about 0.5 at the species’s published upper preferred bound. The flow factor compares the last seven days of discharge to a thirty-day baseline and penalizes both very low (warm, stagnant) and very high (turbid, blown out) conditions. The seasonal phenology multiplier captures the published month-by-month activity curve for each species (spawning months down, post-spawn months up, normal months neutral). The flow anomaly factor catches the gauge readings that have departed by more than two standard deviations from their thirty-day baseline and attenuates the score accordingly. Each factor is deterministic given its inputs. Each carries a source tag. Missing data returns a factor of 1.0 with the honest tag not_modeled:<reason> rather than a substituted value.
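A minimal sketch of how two of those factors and the composite could look in Python. This is not the repo's code. The `Factor` type, the tag strings, the `attenuation=0.5` value, and the 19.0 C upper bound in the example call are placeholders invented for illustration. Only the 13.9 C optimum, the 0.5-at-the-upper-bound rule, the two-standard-deviation threshold, and the `not_modeled:<reason>` convention come from the description above.

```python
import math
import statistics
from dataclasses import dataclass


@dataclass(frozen=True)
class Factor:
    value: float  # multiplier fed into the chain
    source: str   # provenance tag, or "not_modeled:<reason>" when data is missing


def thermal_fit(temp_c, optimum_c, upper_preferred_c):
    """Gaussian bell: 1.0 at the optimum, 0.5 at the upper preferred bound."""
    if temp_c is None:
        return Factor(1.0, "not_modeled:no_water_temperature")
    # Solve exp(-d^2 / (2 * sigma^2)) = 0.5 for sigma, where d = upper - optimum.
    sigma = (upper_preferred_c - optimum_c) / math.sqrt(2.0 * math.log(2.0))
    value = math.exp(-((temp_c - optimum_c) ** 2) / (2.0 * sigma ** 2))
    return Factor(value, "thermal_niche")


def flow_anomaly(latest_cfs, baseline_cfs, attenuation=0.5):
    """Attenuate when the latest reading sits more than 2 sd from the baseline."""
    if latest_cfs is None or len(baseline_cfs) < 2:
        return Factor(1.0, "not_modeled:no_gauge_baseline")
    mean = statistics.fmean(baseline_cfs)
    sd = statistics.stdev(baseline_cfs)
    if sd == 0.0:
        return Factor(1.0, "not_modeled:flat_baseline")
    z = abs(latest_cfs - mean) / sd
    # attenuation=0.5 is an assumed value; the post does not state the real one.
    return Factor(attenuation if z > 2.0 else 1.0, "flow_zscore_30d")


def suitability_index(factors):
    """Multiply every factor, then clamp the product to [0, 1]."""
    product = 1.0
    for factor in factors:
        product *= factor.value
    return min(1.0, max(0.0, product))


factors = [
    Factor(0.8, "presence_prior"),                  # made-up prior for the example
    thermal_fit(13.4, 13.9, 19.0),                  # 19.0 C is a placeholder bound
    Factor(1.0, "not_modeled:flow_factor_skipped"),
    Factor(1.0, "not_modeled:phenology_skipped"),
    flow_anomaly(410.0, [400.0, 395.0, 405.0, 398.0, 402.0]),
]
print(round(suitability_index(factors), 3), [f.source for f in factors])
```

Because every factor carries its own tag, a reader of the output can tell a neutral 1.0 that was measured from a neutral 1.0 that stands in for missing data.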

The probability interval propagates through the whole chain. The map shows the point estimate. The JSON output carries the lower and upper bounds. That is the calibration discipline.
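One way to make the interval impossible to drop is to make it part of the type. The sketch below is an assumption about how that could be done, not the repo's actual class: the `Probability` name, its fields, and the `scaled` rule (a deterministic, non-negative factor scales the point and both bounds together) are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Probability:
    point: float
    lower: float  # lower bound of the 95% interval (required, no default)
    upper: float  # upper bound of the 95% interval (required, no default)
    source: str

    def __post_init__(self):
        # Reject any value whose interval does not bracket the point estimate.
        if not (0.0 <= self.lower <= self.point <= self.upper <= 1.0):
            raise ValueError("need 0 <= lower <= point <= upper <= 1")

    def scaled(self, factor: float, source: str) -> "Probability":
        """Apply a deterministic non-negative factor to point and bounds alike."""
        if factor < 0.0:
            raise ValueError("factor must be non-negative")
        return Probability(
            point=min(1.0, self.point * factor),
            lower=min(1.0, self.lower * factor),
            upper=min(1.0, self.upper * factor),
            source=f"{self.source}*{source}",
        )

    def to_json(self) -> str:
        # The bounds always travel with the point estimate.
        return json.dumps(asdict(self))


prior = Probability(point=0.6, lower=0.4, upper=0.8, source="presence_prior")
print(prior.scaled(0.5, "thermal_fit").to_json())
```

With no defaults on `lower` and `upper`, a caller who forgets the interval gets a `TypeError` at construction time instead of a silently interval-free number.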

Where the design got surprising

I expected the hard part of this build to be the modeling. It wasn’t. The hard part was figuring out what kind of quantity I was actually computing, and being willing to call it what it was when it turned out not to be what I had hoped.

The first surprise was how much of the work was translation. I went in thinking I would need to fit something. There was nothing to fit. The USGS species presence priors were already published, against every reach in the country, with calibrated probabilities. The thermal niches were already in peer-reviewed papers going back thirty years. The temperature-projection physics was published in 1998. The flow anomaly math is undergraduate statistics. My job was not to do science. My job was to find the right published number and put it in the right place in the chain. The PhDs who brushed off my friend were sitting on a translation problem and acting like it was a research problem. They were wrong about that.

The second surprise was the moment I had to back off from a beautiful answer. Charbonneau et al. 2025 measured the hyperstability of recreational-angler catch rates against fisheries-independent baselines. Their key number is that the catch rate scales with the true population to the power 0.23. The intuition is brutal: a 50 percent true decline in fish shows up as only about a 15 percent decline in what anglers catch (0.5 to the power 0.23 is roughly 0.85), because anglers crowd onto the surviving productive reaches and keep their hours on the gear. If you trust angler-reported catch data without correcting for hyperstability, you watch a fishery collapse and your dashboard tells you everything is fine. So I had the correction implemented. Raise the catch rate to the power 1/0.23, which is about 4.35. Beautiful. Then I sat down to think about what kind of quantity my upstream prior actually was, and realized I was about to multiply inches by Celsius. The USGS prior is a presence probability (“is the species here at all”), not a catch rate (“how many fish per hour”). Raising a presence probability to the 4.35 power is an arithmetic operation that produces a number; the number is meaningless. I had a hyperstability-correction-shaped solution looking for a hyperstability-correction-shaped problem, and my data was not that shape. I kept the constant in the codebase, tagged on every probability for provenance, but the actual correction is held at zero weight in the chain. When the next iteration replaces the presence prior with a model trained on real catch data, the weight comes off the floor and the correction starts doing real work. Until then, applying it would be a confident lie.
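The arithmetic behind that power law is quick to check. This sketch assumes the simple form catch_ratio = abundance_ratio ** 0.23 described above. The function names are hypothetical, and nothing here is taken from the repo or from the paper's own code.

```python
BETA = 0.23  # hyperstability exponent quoted above from Charbonneau et al. 2025


def observed_catch_ratio(true_abundance_ratio: float, beta: float = BETA) -> float:
    """What anglers see when the true population changes by the given ratio."""
    return true_abundance_ratio ** beta


def corrected_abundance_ratio(catch_ratio: float, beta: float = BETA) -> float:
    """Invert the power law. Only meaningful when the input is a catch rate."""
    return catch_ratio ** (1.0 / beta)


seen = observed_catch_ratio(0.5)
print(f"true population halves -> catch ratio {seen:.3f} ({1 - seen:.1%} apparent decline)")
print(f"inverse exponent 1/beta = {1.0 / BETA:.3f}")
print(f"round trip back to abundance: {corrected_abundance_ratio(seen):.3f}")
```

The inversion is only valid on a catch rate. Feeding it a presence probability still runs and still returns a float, which is exactly the trap described above.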

The third surprise was the naming. I almost called the output catch_probability. Each of the five factors going into the math is real, individually. None of them is the probability that a particular angler catches a particular fish on a particular day. I had to sit and think about what the chain actually computes, and the honest answer was “a relative ranking of how suitable conditions look for this species at this reach on this day.” That is a suitability_index, not a catch_probability. The rename is two lines of code and a colorbar disclaimer (“Relative suitability index 0-1, NOT a probability”) and it is the single most important honesty move in the whole build. Tools that label this kind of output as “your probability of catching a fish” are lying to their users with math that looks rigorous. I had a chance to be one of those tools and didn’t take it.

A sidebar on what a per-day map exposed

A short tangent, because it’s the kind of bug that only shows up when you look at the right cadence.

The first version of the forecast pipeline rendered one summary map per query: best day over the window, per reach. It looked great. The colors made sense. The narrative paragraph named the right reaches. I shipped it to myself and called it done.

Then I expanded the output to per-day maps. Sixteen daily PNGs instead of one summary. I re-ran the Big Hole and three other Western rivers and looked at the daily sequences. The Madison and the Firehole and the Henrys Fork all showed the same pattern: thirteen consecutive identical maps, then a jump. Days fourteen, fifteen, and sixteen would suddenly carry a different temperature than days one through thirteen. The summary rollup had been smoothing it out. The bug had been hiding in the rollup.

The cause was straightforward in retrospect. The fallback chain for water temperature, when no real-time gauge observation existed at a reach for a given forecast day, was pure inverse-distance-weighted interpolation between the nearest USGS water-temperature gauges in the same watershed. The IDW is a spatial operation. It has no time dimension. So every forecast day in the window got the same interpolated value, derived from the most recent observation at the anchor gauges. If a new observation arrived from one of the anchor gauges partway through the forecast window, the IDW changed and the maps jumped. Otherwise they were flat.
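A bare-bones inverse-distance-weighted interpolation makes the problem visible: the forecast day does not appear anywhere in the function. This is an illustrative sketch. The function name, the planar coordinates, the exponent of 2, and the gauge values are all assumptions, not the repo's implementation.

```python
import math


def idw_temperature(reach_xy, gauges, power=2.0):
    """Interpolate water temperature at a reach from nearby gauges.

    gauges is a list of ((x, y), temp_c) pairs in any consistent planar units.
    """
    if not gauges:
        raise ValueError("need at least one gauge")
    weighted_sum = 0.0
    weight_total = 0.0
    for (gx, gy), temp_c in gauges:
        distance = math.hypot(reach_xy[0] - gx, reach_xy[1] - gy)
        if distance == 0.0:
            return temp_c  # the reach sits exactly on a gauge
        weight = 1.0 / distance ** power
        weighted_sum += weight * temp_c
        weight_total += weight
    return weighted_sum / weight_total


# Made-up gauges: the same inputs produce the same answer for every forecast day.
gauges = [((0.0, 0.0), 12.0), ((2.0, 0.0), 16.0)]
for day in range(3):
    print(day, idw_temperature((1.0, 0.0), gauges))
```

The loop prints the same temperature three times. The value only moves when the gauge list itself changes.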

The fix is one line of math and a new source tag. For each forecast day, project the water temperature as the IDW value plus a Mohseni-Stefan air-to-water delta from the forecast air temperature minus the air temperature on the day the IDW anchor was actually measured. The IDW gives you the spatial anchor. The Mohseni-Stefan piece (published, peer-reviewed since 1998, the standard regression of water on air for streams) gives you the day-to-day movement. The combined projection carries a different source tag than pure IDW, so the user can see on the map caption which math produced the number.
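A sketch of that projection, assuming the logistic air-to-water form usually attributed to Mohseni, Stefan, and Erickson (1998): Tw = mu + (alpha - mu) / (1 + exp(gamma * (beta - Ta))). The parameter values below are illustrative placeholders because real values are fitted per stream. The function names and the source-tag string are hypothetical.

```python
import math

# Placeholder parameters for illustration only; real ones are fitted per site.
PARAMS = {"alpha": 22.0, "beta": 13.0, "gamma": 0.18, "mu": 0.0}


def mohseni_water_temp(air_c, alpha, beta, gamma, mu=0.0):
    """Logistic regression of stream temperature on air temperature."""
    return mu + (alpha - mu) / (1.0 + math.exp(gamma * (beta - air_c)))


def project_water_temp(idw_c, air_forecast_c, air_anchor_c, params=PARAMS):
    """Spatial anchor from IDW plus the modeled change since the anchor day."""
    delta = (mohseni_water_temp(air_forecast_c, **params)
             - mohseni_water_temp(air_anchor_c, **params))
    return idw_c + delta, "idw+mohseni_stefan_delta"


# Made-up inputs: IDW anchor of 14.0 C measured on a day with 18 C air.
for day, air_forecast_c in enumerate([18.0, 21.0, 25.0, 16.0]):
    temp_c, tag = project_water_temp(14.0, air_forecast_c, 18.0)
    print(day, round(temp_c, 2), tag)
```

Using the difference of two model evaluations rather than the model's absolute output keeps the gauge-derived anchor in charge of the level. The regression only supplies the day-to-day movement.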

The lesson, beyond the technical detail: render at the cadence you analyze at. I had been analyzing in summary statistics and rendering in summary statistics, so the staleness was invisible. The moment I started rendering at per-day cadence, the staleness leapt off the screen. The summary map had been pretty for the same reason it had been wrong. Anglers planning a trip do not care about the rollup. They care about Tuesday. The rendering cadence has to match the planning cadence, and if you do not look at the data at the same cadence the user is going to act on, you are going to ship them a tool that lies in the part they trust.
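Looking at the daily sequence can also be automated as a cheap sanity check. This is a hypothetical helper, not something the repo is known to contain, and the sample values are made up. It reports the longest run of identical consecutive daily values so that a suspiciously flat forecast fails a test instead of waiting for someone to eyeball sixteen PNGs.

```python
def longest_flat_run(daily_values, tolerance=1e-9):
    """Length of the longest run of consecutive, effectively identical values."""
    if not daily_values:
        return 0
    longest = current = 1
    for previous, value in zip(daily_values, daily_values[1:]):
        if abs(value - previous) <= tolerance:
            current += 1
            longest = max(longest, current)
        else:
            current = 1
    return longest


# Made-up per-day water temperatures for one reach.
stale = [12.0, 12.0, 12.0, 12.0, 12.0, 13.5, 14.1]
moving = [12.0, 12.4, 13.1, 12.8, 13.5, 13.9, 14.1]
print(longest_flat_run(stale), longest_flat_run(moving))
```

A threshold on the run length, chosen per variable, turns "look at the sequence" into an assertion that runs on every query.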

What I’d say to another builder going at this

A few things, in roughly the order they hit me.

Don’t believe the gatekeepers. If you have a hobby and a builder’s instinct and somebody credentialed tells you a real tool cannot be built without their permission, go check. The published literature is usually more cooperative than the people who wrote it.

The federal data substrate is the most underused asset in US recreational outdoor tech. Anyone who tells you the public-data side of this problem is hard is selling you a product. The work is in the translation, not the access.

Calibration honesty has to be structural. “We promise to remember the interval” loses the moment somebody refactors the rendering layer on a Tuesday and the interval drops out of the call signature. Make the interval a required field on the return value, not an optional extra a caller can fail to read.

Do not apply someone else’s correction to a quantity that is not the same kind of quantity. Before you apply a correction, write down what kind of quantity each side of the multiplication actually is, and whether the correction’s published derivation lives in that quantity’s domain. The arithmetic will produce a number either way. The number will not necessarily mean anything.

Render at the cadence your users will act on. Summary statistics hide bugs that per-instance rendering exposes immediately. If they plan one day at a time, render one day at a time, and look at the sequence.

Reach for the honest name first. The temptation to call your output what your user wishes it were is real and it is wrong. The honest name is the one the math actually supports, not the one the marketing department would prefer.

Sensitive-species suppression has to be a hard gate, not a configuration flag. “Configurable ethics” is the design pattern that ships unethical tools because somebody flipped the flag. Make the moral choice impossible to undo, structurally, and you cannot accidentally undo it.
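What a hard gate can look like in code, as a sketch under stated assumptions: the species name, the reach ID, the record layout, and the function name are placeholders, and the real Ethics Layer may be structured quite differently. The point is what is absent. The function has no override argument, reads no environment variable, and looks up no config value.

```python
# Hypothetical placeholder list; a real one would come from an authoritative source.
_SENSITIVE_SPECIES = frozenset({"example sensitive species"})


def reach_level_record(species: str, reach_id: str, score: float) -> dict:
    """Build the record that maps and GeoJSON writers are allowed to emit.

    The signature deliberately has no `allow_sensitive` parameter, so a caller
    cannot switch the suppression off.
    """
    if species.strip().lower() in _SENSITIVE_SPECIES:
        # Suppressed: no reach location and no score leave this function.
        return {"species": species, "reach_id": None, "score": None, "suppressed": True}
    return {"species": species, "reach_id": reach_id, "score": score, "suppressed": False}


print(reach_level_record("Example Sensitive Species", "reach-0001", 0.72))
print(reach_level_record("brown trout", "reach-0001", 0.72))
```

If every writer only accepts records produced by this function, undoing the suppression requires a code change that shows up in review. A flipped setting would not.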

Build for one real person you know. Triangulating to the median user produces median products. Building for one specific human gives the product a spine no persona-driven design will match.

What it intentionally does not do

The deliberate exclusions. Rivers and streams only: no lakes, reservoirs, or saltwater. No catch probability claim: it is a relative suitability index. No mobile app. No graphical UI today. No telemetry, ever. No scraping of tribally-managed water data. Pennsylvania today as the regulation exemplar; other states come later. No Pacific Northwest salmon and steelhead queries yet (the data systems for those are next on the list).

The bet behind this scope: local-first, calibration-honest, per-day temporal forecasting, on real federal data is a position that TroutRoutes, FishAngler, Navionics, Anglers Atlas, and OnX Fish do not occupy. Each of those tools makes one or two of those bets. None makes all four. The market gap is for the angler who fishes rivers more than they buy gear, cares about the data lineage, and wants the data on their own laptop.

What’s next

The data structures are designed so the underlying models can be swapped without rewriting the surface. The next pass, roughly in priority order:

1. Replace the species presence prior with a spatial stream network regression trained on real observed catch data. The output stops being a relative suitability index and becomes a calibrated catch probability. The honesty disclaimer on the colorbar comes off.
2. Swap in peer-reviewed gridded stream-temperature datasets (NorWeST, EcoSHEDS) for most US reaches.
3. State stocking schedules and regulations for Idaho, Montana, Wyoming, Virginia.
4. Deep-learning nowcasts of per-reach water temperature and flow.
5. Pacific Northwest salmon and steelhead queries.
6. A graphical desktop UI.

Closing

The thing that keeps this build interesting to me, after the technical work is done, is that the bar to clear was honesty, not brilliance. The substrate was sitting in plain sight. The math was published. Somebody just had to put it together and refuse to oversell what came out the other side. That is a low bar in principle and apparently a hard one to clear in practice, because the commercial tools my friend was using had not cleared it.

The code is GPL-3.0. The data is federal and stays on your machine. The model licenses are surfaced at download. If you are the kind of angler who fishes rivers more than twenty days a year and you want a tool you can extend, audit, and run in airplane mode, this is the tool. If you are not that angler but you know one, give it to them.

Tight lines.

Repo: https://github.com/rondilley/Angler_AI

iamnor