Reconstructing Flights from Raw ADS-B
by Ondrej Grünwald, Founder / CEO
ADS-B does not emit flights. It emits position updates — tens of thousands per hour from a single receiver, one row per aircraft per second or so, each carrying a timestamp, an ICAO hex, and some combination of latitude, longitude, altitude, ground speed, vertical speed, and an on_ground bit that is correct most of the time.
A flight is something you have to derive. And the derivation is messier than it sounds.
The previous post on this blog covered the aviation data platform — the lakehouse, the hot path, the orchestration. This post is about the first piece of derived data built on top of that platform: a scheduled Python job that reads aviation_stream.adsb_messages and writes aviation_derived.flights, with one row per real-world flight, a deterministic ID, segmentation timestamps, airport fields for a downstream matching phase, and an end_reason that says what evidence closed the segment.
The hard parts are not the happy path. They are the moments when the signal disagrees with itself — when on_ground lies, when the receiver drops out mid-cruise, when a reprocessing window reopens a flight you already closed.
This post covers the state model, the takeoff and landing rules with their fallbacks, the gap-timeout behavior, and the reprocessing design that makes the job safe to re-run a week back without producing duplicates.
1. The Output Contract
Before talking about how the job works, it helps to look at what it produces. The output is a flights row with these fields:
from dataclasses import dataclass
from datetime import datetime


@dataclass
class FlightRecord:
    flight_id: str                    # deterministic hash of (icao, dep_ts)
    icao: str
    dep_ts: datetime
    arr_ts: datetime | None
    dep_lat: float | None
    dep_lon: float | None
    arr_lat: float | None
    arr_lon: float | None
    end_reason: str                   # LANDED | GAP_TIMEOUT | INCOMPLETE_STREAM | ...
    start_reason: str | None          # TAKEOFF | AIRBORNE_SEEN
    dep_airport_icao: str | None
    dep_airport_iata: str | None
    arr_airport_icao: str | None
    arr_airport_iata: str | None
    arrival_gap_candidate: bool = False
Three details are worth flagging up front.
flight_id is a deterministic hash. sha256("{icao}:dep:{dep_ts.isoformat()}"). The same aircraft starting the same flight at the same timestamp always produces the same ID — across runs, across reprocessing windows, across restarts. This is the single most important property for safe re-runs. Without it, every reprocessing window would risk producing a duplicate row for the same real-world flight.
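A minimal sketch of that derivation, assuming dep_ts is a timezone-aware datetime and the full hex digest is stored. The helper name make_flight_id and the UTC normalization step are illustrative assumptions, not taken from the job itself.

```python
import hashlib
from datetime import datetime, timezone


def make_flight_id(icao: str, dep_ts: datetime) -> str:
    """Hash (icao, dep_ts) into a stable ID. Hypothetical helper name."""
    # Assumption: normalize to UTC so the same instant always serializes
    # to the same string, whatever timezone the caller used.
    key = f"{icao}:dep:{dep_ts.astimezone(timezone.utc).isoformat()}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


dep = datetime(2024, 5, 1, 12, 30, tzinfo=timezone.utc)
first = make_flight_id("4b1814", dep)
second = make_flight_id("4b1814", dep)                        # same inputs, same ID
other = make_flight_id("4b1814", dep.replace(minute=31))      # different departure
```

Because the ID depends only on the aircraft and the departure timestamp, anything that shifts dep_ts between runs (for example, a changed takeoff rule) also changes the ID. That is worth keeping in mind when tuning thresholds.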
end_reason is part of the data, not metadata. Downstream consumers should treat LANDED and GAP_TIMEOUT as different things. They are not failure modes; they are different evidence levels. A GAP_TIMEOUT row means the observed stream ended, not that the aircraft stopped at its last observed position. An INCOMPLETE_STREAM row means the aircraft was still airborne at the end of the processing window. The column tells consumers how much to trust the arrival fields.
Some fields are nullable on purpose. dep_lat/dep_lon may be null when the aircraft was first seen mid-cruise. arr_ts is null only for unfinalized rows. dep_airport_icao is null when no airport sits within match radius (more on that — and the extrapolation trick that recovers many of these — in the follow-up post).

Figure: SkyTrace aircraft history view, with live state on top and finalized aviation_derived.flights records below.
2. A State Machine, Not a Query
The instinct, looking at a stream of ADS-B messages, is to reach for a SQL window function. Group by icao, order by ts, look for gaps. It works for small datasets and small definitions of "flight."
It does not work for production. Three reasons:
- A real flight spans hours. A window function over a multi-month source table either reads the whole table or partitions in ways that lose context across partitions.
- Some flights cross processing runs. The job runs every few minutes; a flight that takes off at minute 0 and lands at minute 90 needs state that survives between runs.
- The signal is partial. on_ground is sometimes wrong, sometimes missing, sometimes lagging by tens of seconds. Single-message rules produce flapping; persistence-based rules need a place to count.
So the job is a per-ICAO state machine. State lives in a small database table:
CREATE TABLE aviation_derived.active_flights (
    icao                TEXT PRIMARY KEY,
    flight_id           TEXT,
    state               TEXT,  -- UNKNOWN | AIRBORNE | ON_GROUND
    last_ts             TIMESTAMPTZ,
    last_lat            DOUBLE PRECISION,
    last_lon            DOUBLE PRECISION,
    last_on_ground      BOOLEAN,
    last_alt            DOUBLE PRECISION,
    last_gs             DOUBLE PRECISION,
    candidate_dep_ts    TIMESTAMPTZ,
    candidate_dep_lat   DOUBLE PRECISION,
    candidate_dep_lon   DOUBLE PRECISION,
    start_reason        TEXT,
    on_ground_since_ts  TIMESTAMPTZ,
    takeoff_since_ts    TIMESTAMPTZ,
    low_speed_since_ts  TIMESTAMPTZ
);
One row per aircraft, latest state only — not a history. The history of flight finalizations lives in flights; active_flights is just the per-aircraft tip of the iceberg.
That decision is load-bearing. Storing only the latest state means restart is cheap: load the row, replay messages newer than last_ts, advance the state. There is no per-aircraft event log to reconcile, no replay-from-the-beginning startup cost.
The cost is that active_flights is not a queryable history of where each aircraft has been. That is fine — that is what flights is for. The two tables have different jobs, and bleeding one into the other would have made both worse.
3. Detecting Takeoff When on_ground Lies
The naive rule is on_ground going from true to false. It works often enough to feel right, then fails in ways that are hard to debug.
on_ground comes from the aircraft's own avionics. Some aircraft never report it. Some report it stuck at true after takeoff for tens of seconds. Some report it as null during climbout. ADS-B receivers add their own noise: a momentary dropout can produce a single on_ground=null message right at the wrong moment.
The job uses three layered rules.
Primary evidence — on_ground is false, or on_ground is null and the altitude is above a threshold:
if on_ground is True:
    airborne_evidence = False
elif on_ground is False:
    airborne_evidence = True
else:
    airborne_evidence = takeoff_alt is not None and takeoff_alt >= takeoff_alt_ft
When airborne_evidence flips and the state was not already AIRBORNE, the flight starts. If the previous message had on_ground=true and the current has on_ground=false, the departure anchor uses the last on-ground point rather than the current point — that puts the anchor closer to the runway and makes airport matching work better:
if state.last_on_ground is True and on_ground is False and state.last_ts is not None:
    dep_ts = state.last_ts
    dep_lat = state.last_lat
    dep_lon = state.last_lon
Takeoff fallback, for when on_ground is unreliable. If altitude sits inside a low band (between takeoff_start_alt_ft and takeoff_max_alt_ft), ground speed is high, and vertical speed is positive, start a timer:
takeoff_fallback = (
    on_ground is not True
    and takeoff_alt is not None
    and takeoff_alt >= takeoff_start_alt_ft
    and takeoff_alt <= takeoff_max_alt_ft
    and msg.get("gs") is not None
    and msg.get("gs") >= takeoff_gs_kts
    and vs is not None
    and vs >= takeoff_vs_fpm
)
If those conditions persist for takeoff_fallback_confirm_secs (30 seconds in production), the flight is force-started, with the initial fallback point preserved as the departure anchor — not the later confirmation point, which would have drifted away from the runway.
The takeoff-max-altitude guard is the rule that bites you if you forget it. Without takeoff_max_alt_ft, a flight that disappears for an hour and reappears at cruise altitude looks like a takeoff at FL350. The guard rejects "takeoff" evidence above 6000 ft, so the new flight is labeled AIRBORNE_SEEN instead, with a null departure airport and a clear signal that we picked it up mid-cruise.
Three rules sound like a lot. They are necessary because the underlying signal is noisier than the documentation suggests, and because the cost of a wrong takeoff (a phantom flight with no real departure) is worse than a missed one.
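The persistence step of the takeoff fallback can be sketched as a small timer that remembers the first qualifying point. This is an illustration, not the job's code: FallbackTimer and update_fallback are hypothetical names, and resetting the timer when the conditions break is an assumption about how the streak is counted.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


@dataclass
class FallbackTimer:
    # First point at which the fallback conditions held (the departure anchor).
    since_ts: Optional[datetime] = None
    lat: Optional[float] = None
    lon: Optional[float] = None


def update_fallback(
    timer: FallbackTimer,
    takeoff_fallback: bool,
    event_ts: datetime,
    lat: float,
    lon: float,
    confirm_secs: int = 30,
) -> Optional[Tuple[datetime, float, float]]:
    """Return the departure anchor once the streak is confirmed, else None."""
    if not takeoff_fallback:
        # Assumption: a broken streak starts over from scratch.
        timer.since_ts = timer.lat = timer.lon = None
        return None
    if timer.since_ts is None:
        # Remember the initial point; it stays the anchor even after confirmation.
        timer.since_ts, timer.lat, timer.lon = event_ts, lat, lon
        return None
    if (event_ts - timer.since_ts).total_seconds() >= confirm_secs:
        return (timer.since_ts, timer.lat, timer.lon)
    return None


t0 = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
timer = FallbackTimer()
early = update_fallback(timer, True, t0, 50.10, 14.26)
anchor = update_fallback(timer, True, t0 + timedelta(seconds=35), 50.13, 14.31)
```

The confirmation arrives 35 seconds in, but the returned anchor is still the first point at t0, which is the behavior the text describes.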
4. Detecting Landing Three Ways
Landing detection runs only while the state is AIRBORNE. There are three independent paths.
Confirmed on-ground. on_ground=true persists for on_ground_confirm_secs:
if state.state == "AIRBORNE":
    if on_ground is True:
        if state.on_ground_since_ts is None:
            state.on_ground_since_ts = event_ts
        elif (event_ts - state.on_ground_since_ts).total_seconds() >= on_ground_confirm_secs:
            ended = _finalize_flight(state, event_ts, "LANDED", ...)
The persistence requirement matters. A single on_ground=true message after a long airborne run could be a one-message glitch from a receiver hand-off. Waiting for the bit to stay true for a minute (60s in production, 120s historically) cuts the false-positive rate to near zero without meaningfully delaying real landings.
Low-speed fallback. When on_ground is missing or unreliable, infer landing from gs ≤ landing_gs_kts and alt_geom ≤ landing_alt_ft (or null altitude, if landing_allow_missing_alt is true). Same persistence requirement, same finalization path.
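A minimal sketch of the low-speed predicate, under the assumption that it is evaluated per message and feeds the same persistence timer as the on_ground path. The function name and the threshold values in the example calls (50 kts, 1500 ft) are illustrative, not production settings.

```python
from typing import Optional


def low_speed_evidence(
    gs: Optional[float],
    alt_geom: Optional[float],
    landing_gs_kts: float,
    landing_alt_ft: float,
    landing_allow_missing_alt: bool,
) -> bool:
    """True when a single message looks like an aircraft rolling on the ground."""
    # Without a ground speed there is nothing to infer from.
    if gs is None or gs > landing_gs_kts:
        return False
    # A missing altitude only counts when the config explicitly allows it.
    if alt_geom is None:
        return landing_allow_missing_alt
    return alt_geom <= landing_alt_ft


slow_and_low = low_speed_evidence(20.0, 400.0, 50.0, 1500.0, False)
cruise = low_speed_evidence(450.0, 35000.0, 50.0, 1500.0, False)
no_alt_allowed = low_speed_evidence(20.0, None, 50.0, 1500.0, True)
no_alt_strict = low_speed_evidence(20.0, None, 50.0, 1500.0, False)
```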
Window-end finalization. If the stream goes silent after landing evidence starts — the aircraft taxied out of receiver coverage, the receiver went down, anything — the processing window end is used as the confirmation clock:
confirmed_on_ground = (
    state.last_on_ground is True
    and state.on_ground_since_ts is not None
    and (window_end - state.on_ground_since_ts).total_seconds() >= on_ground_confirm_secs
)
confirmed_low_speed = (
    state.low_speed_since_ts is not None
    and (window_end - state.low_speed_since_ts).total_seconds() >= on_ground_confirm_secs
)
The arrival timestamp and position come from the last observed message, not from window_end. The window only provides the clock for the confirmation timer, not the data itself.
All three paths feed _finalize_flight, which enforces a minimum flight duration (min_flight_duration_secs, 120s) before emitting a row. Short hops below that threshold are discarded — they are almost always taxi noise or a misclassified on_ground flicker, not real flights.
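The duration check itself is small. A sketch, assuming it compares the departure and arrival timestamps of the candidate segment; long_enough is a hypothetical name, and the real _finalize_flight does more than this.

```python
from datetime import datetime, timedelta, timezone


def long_enough(
    dep_ts: datetime, arr_ts: datetime, min_flight_duration_secs: int = 120
) -> bool:
    """Keep a segment only if it lasted at least the minimum flight duration."""
    return (arr_ts - dep_ts).total_seconds() >= min_flight_duration_secs


dep = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
taxi_flicker = long_enough(dep, dep + timedelta(seconds=45))   # discarded
real_hop = long_enough(dep, dep + timedelta(minutes=25))       # emitted
```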
5. Gap Timeouts
What about flights that just stop transmitting? An aircraft that drops out of receiver coverage mid-cruise, or a receiver that goes down, or a flight that ends up in airspace we do not cover.
The rule is simple: if more than gap_start_minutes (30) pass between consecutive messages for the same ICAO, any open flight is finalized:
if state.last_ts is not None:
    gap = (event_ts - state.last_ts).total_seconds() / 60
    if gap_minutes and gap > gap_minutes:
        gap_arrival_candidate = (
            state.state == "AIRBORNE"
            and state.last_alt is not None
            and state.last_alt <= gap_arrival_alt_ft
            and state.last_gs is not None
            and state.last_gs <= gap_arrival_gs_kts
        )
        ended = _finalize_flight(
            state,
            state.last_ts,
            "GAP_TIMEOUT",
            ...
            arrival_gap_candidate=gap_arrival_candidate,
        )
        _reset_active_flight(state, "UNKNOWN")
        _clear_last_observation(state)
Two things are worth pointing out.
First, end_reason = "GAP_TIMEOUT" is honest. The job does not pretend this was a landing. The arrival fields point to the last observed position, which may be hundreds of miles from any airport. Downstream consumers should treat these rows as flights with an unknown ending — useful for fleet activity counts, useless for arrival statistics.
Second, the arrival_gap_candidate flag is the seam where airport matching can upgrade the row. If the last pre-gap point was already low and slow — i.e., it looked like an approach — the airport-matching phase gets a chance to project the descent forward and check for a nearby airport. If it finds one, the row is upgraded from GAP_TIMEOUT to LANDED. That extrapolation logic is its own post; the segmentation phase just records the candidacy.
After the finalization, both active_flight state and the last-observation cache are cleared. A post-gap message starts a fresh state window — it does not retroactively confirm a landing for the prior flight, and it does not let a stale landing timer leak into the next segment.
6. Reprocessing as a First-Class Concern
Almost every batch job I have seen treats reprocessing as a special mode — a script you run manually, a flag you set, a snowflake state to clean up afterward. This job treats reprocessing as the normal mode and the first-time backfill as a special case.
Every run computes a window from the watermark:
from datetime import timedelta


def compute_effective_window(
    now, last_received_at, lookback_hours, max_reprocess_days,
):
    lookback = timedelta(hours=lookback_hours or 0)
    cap = timedelta(days=max_reprocess_days or 0)
    if last_received_at is None:
        # First run: no watermark yet, so start from the cap (or the lookback).
        start = now - cap if cap.total_seconds() else now - lookback
    else:
        start = last_received_at - lookback
    if cap.total_seconds():
        # Never reach further back than the reprocessing cap.
        min_start = now - cap
        if start < min_start:
            start = min_start
    return {"start": start, "end": now}
The window is [last_event_ts - lookback, now). The lookback overlap means every run reprocesses the tail of the previous window. That overlap is deliberate — it absorbs late-arriving messages, recovers from a previous run that died mid-window, and lets the operator widen the lookback at any time to reprocess a longer tail without writing a separate script.
The max_reprocess_days cap protects the cluster from a runaway backfill. If the watermark is stale (say, the job has been off for a week), the window does not silently scan the entire source table — it caps at max_reprocess_days and emits clear logs about what it skipped.
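A worked example of the cap, with the function repeated so the snippet runs on its own. The 24-hour lookback and 3-day cap are illustrative values, not the production settings.

```python
from datetime import datetime, timedelta, timezone


def compute_effective_window(now, last_received_at, lookback_hours, max_reprocess_days):
    lookback = timedelta(hours=lookback_hours or 0)
    cap = timedelta(days=max_reprocess_days or 0)
    if last_received_at is None:
        start = now - cap if cap.total_seconds() else now - lookback
    else:
        start = last_received_at - lookback
    if cap.total_seconds():
        min_start = now - cap
        if start < min_start:
            start = min_start
    return {"start": start, "end": now}


now = datetime(2024, 5, 10, 12, 0, tzinfo=timezone.utc)

# Healthy job: the watermark is five minutes old, so the window is lookback-sized.
fresh = compute_effective_window(now, now - timedelta(minutes=5), 24, 3)

# Job was off for a week: the uncapped start would be eight days back,
# but the cap pulls it forward to three days.
stale = compute_effective_window(now, now - timedelta(days=7), 24, 3)
```

With these inputs the fresh window starts at 2024-05-09 11:55 UTC and the stale window starts at 2024-05-07 12:00 UTC, exactly three days before now.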
Three properties make the overlap safe:
Idempotent flight IDs. A reprocessed flight has the same (icao, dep_ts) and therefore the same flight_id. The upsert into flights overwrites the previous row in place. No duplicates.
Time-travel guard. When the job reads the existing active_flights state and then encounters a message older than that state's last_ts, it skips:
if not reprocess and state.last_ts is not None and msg["event_ts"] < state.last_ts:
    # Skip time-travel messages from reprocessing to avoid creating duplicates.
    continue
Without this guard, a reprocessed message from before the state's high-water mark could be misinterpreted as the start of a new flight for an aircraft already known to be airborne. The guard is bypassed when the operator explicitly passes --reprocess, which builds state from scratch inside the window.
Watermark advances on event time, not wall time. The job writes the latest event_ts it observed as the new watermark, not now. If the source table lags ingestion by 30 seconds, the watermark stays 30 seconds behind real time and the next run will reprocess that tail.
The combination — deterministic IDs, time-travel guard, event-time watermark — lets the operator run with --lookback-hours 24 or --lookback-hours 168 interchangeably. The shape of the output does not depend on the lookback; only how much work the run does.
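The overwrite-in-place behavior behind the first property can be sketched with an upsert keyed on flight_id. This example uses an in-memory SQLite table as a stand-in for Postgres so it runs anywhere; the three-column table and the literal ID are simplifications for illustration. The ON CONFLICT ... DO UPDATE form is also valid PostgreSQL syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flights (flight_id TEXT PRIMARY KEY, icao TEXT, end_reason TEXT)"
)

UPSERT = """
INSERT INTO flights (flight_id, icao, end_reason)
VALUES (?, ?, ?)
ON CONFLICT (flight_id) DO UPDATE SET
    icao = excluded.icao,
    end_reason = excluded.end_reason
"""

# First run closes the flight on a gap. A later reprocessing run, with
# late-arriving messages available, sees the landing for the same flight_id.
conn.execute(UPSERT, ("abc123", "4b1814", "GAP_TIMEOUT"))
conn.execute(UPSERT, ("abc123", "4b1814", "LANDED"))

rows = conn.execute("SELECT flight_id, end_reason FROM flights").fetchall()
```

The table ends up with a single row carrying the newer end_reason, which is why a wider lookback changes the amount of work but not the shape of the output.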
7. Profiles, Not Flags
The job ships with four config profiles that compose with a base config.yaml:
- config.analysis.yaml: short windows for analytical runs; emits open departures as INCOMPLETE_STREAM so analysts see partial flights
- config.production.yaml: full flights, no window-end finalization; this is the canonical pipeline
- config.realtime.yaml: small bounding box, no backfill, frequent runs; intended for live operational use
- config.realtime.test.yaml: realtime profile writing to the aviation_test schema for integration tests
Profiles are applied via --config-override:
python -m flight_recon.run \
    --config config.yaml \
    --config-override config.production.yaml
Compared to per-environment CLI flags or environment variables, profile composition keeps the diff between modes visible in version control. The production run and the analysis run are not the same code with different flags; they are the same code with different configurations, and the configurations are diff-able files.
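One plausible way to compose a profile with the base config is a recursive merge in which the override wins on conflicts. The sketch below works on already-parsed dictionaries; the merge semantics, the helper name apply_override, and the config keys are assumptions for illustration, since the post does not show how --config-override is implemented.

```python
from copy import deepcopy


def apply_override(base: dict, override: dict) -> dict:
    """Recursively merge override into a copy of base; override wins on conflicts."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge key by key instead of replacing.
            merged[key] = apply_override(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical keys standing in for the parsed contents of config.yaml
# and config.realtime.test.yaml.
base = {
    "segmentation": {"gap_minutes": 30, "on_ground_confirm_secs": 60},
    "output": {"schema": "aviation_derived"},
}
test_profile = {"output": {"schema": "aviation_test"}}

effective = apply_override(base, test_profile)
```

Only the keys a profile names are changed, so each profile file stays a short, reviewable diff against the base.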
The aviation_test schema is the same shape as aviation_derived but isolated. Integration tests against a real Postgres write into aviation_test, so a test run never pollutes the production tables — and a developer can promote a fully validated dataset from aviation_test into aviation_derived as a deliberate step rather than as a side effect.
8. What I Would Do Differently
Callsign should have been a first-class column from row one. Without it, joining flights to FlightAware, AeroAPI, or any external system that keys on callsign requires a separate per-message lookup back into the source table. ADS-B callsigns are emitted intermittently and sometimes change mid-flight (for example, when the crew corrects or re-enters the flight ID), so the right shape is (first_callsign, last_callsign, callsign_changes) rather than a single column. Any of those would be more useful than the current zero columns.
The string-valued start_reason / end_reason fields ended up doing confidence's job. Downstream consumers filter on end_reason = 'LANDED' to get high-confidence arrivals — which works, but every such query is a CASE WHEN over a small string vocabulary instead of a numeric threshold. A future start_confidence / end_confidence float would let segmentation rules contribute additively (the takeoff-fallback rule adds 0.6, a confirmed on-ground transition adds 0.95) and let consumers pick their own threshold rather than memorize which strings mean what.
No enforced producer-side schema contract. The job reads aviation_stream.adsb_messages, populated by the platform's TSDB consumer. The contract between producer and consumer is enforced by Pydantic on one side and by psycopg column names on the other — i.e., not enforced at all. A breaking schema change would surface as silent nulls in finalized flight records, not as an obvious crash. A schema registry (Confluent, Apicurio) or even a shared protobuf definition would catch the break at deploy time.
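Short of a registry, even a startup check would turn silent nulls into a loud failure. A sketch of that stopgap, not a substitute for a real schema contract: the column set is an assumption pieced together from the snippets in this post, and check_source_columns is a hypothetical helper that would be fed the column names the job actually sees.

```python
# Assumed column names; the real adsb_messages schema may differ.
REQUIRED_COLUMNS = frozenset(
    {"event_ts", "icao", "lat", "lon", "alt_geom", "gs", "on_ground"}
)


def check_source_columns(available_columns) -> None:
    """Raise if the source table lacks any column the segmentation rules read."""
    missing = REQUIRED_COLUMNS - set(available_columns)
    if missing:
        raise RuntimeError(f"adsb_messages is missing columns: {sorted(missing)}")


# A producer that renamed on_ground to airborne now fails at startup
# instead of producing flights with null on_ground evidence.
try:
    check_source_columns(
        ["event_ts", "icao", "lat", "lon", "alt_geom", "gs", "airborne"]
    )
    error = None
except RuntimeError as exc:
    error = str(exc)
```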
9. Where This Fits
The segmentation job is small — a few hundred lines of Python, a handful of SQL tables, one watermark row. The work it does is not glamorous; it produces a table that downstream consumers treat as obvious.
That is the point. The lakehouse post argued that a platform is worth building because it makes the next product possible without rebuilding the data layer. This is the first such product: a clean flights table that downstream systems can join against without re-implementing the state machine, the takeoff fallback, the gap-timeout rule, or the reprocessing guard.
Scheduled flight plans cover controlled traffic and can be bought commercially, but they describe intent rather than actual movement and miss everything outside controlled airspace. The derived flights table is the ground-truth complement — owned, replayable, and covering every aircraft a receiver can hear.
The next post in this series covers the half of the job we deferred here: airport matching, including the extrapolation trick that recovers departure and arrival airports when ADS-B coverage starts mid-climb or ends mid-descent, and how the arrival_gap_candidate flag from this post gets used to upgrade GAP_TIMEOUT rows to LANDED when the geometry agrees.
The segmentation rules are the boring part. The airport-matching geometry is where the data gets interesting.