HeadwayForge

GTFS-Realtime validation: keeping your feed (and your analysis) trustworthy

Feed quality · ~8 min read

A transit feed is a public-facing product whether an agency thinks of it that way or not. The moment a GTFS Schedule and a GTFS-Realtime feed are published, they flow into Google Maps, the Transit app, Apple Maps, agency websites, arrival signs, and every analytics tool downstream — including your own. When the feed is clean, none of that is visible. When it is not, riders see phantom buses and wrong arrival times, internal analysis quietly produces bad numbers, and public trust erodes one missed prediction at a time. Feed validation is how you catch the problems before your riders and your reports do.

Static GTFS versus GTFS-Realtime

The two feeds play different roles. The static GTFS Schedule describes the service an agency plans to run — routes, trips, stops, and times in a ZIP of CSV files, refreshed when service changes. GTFS-Realtime describes what is happening right now, streamed continuously as Protocol Buffer messages and refreshed every few seconds to a minute. Realtime does not stand alone: nearly all of it references trip and route IDs that must exist in the matching static feed. If the schedule is wrong or stale, the realtime feed that points at it is wrong too. That dependency is why you validate both, and why you validate the static feed first.
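
To make that dependency concrete, here is a minimal sketch of pulling the trip and route IDs out of a static feed so realtime references can later be checked against them. It assumes the ZIP contains a standard trips.txt with trip_id and route_id columns; the helper names load_schedule_ids and build_sample_feed, and the sample rows, are illustrative and not part of any tool mentioned here.

```python
import csv
import io
import zipfile


def load_schedule_ids(zip_bytes: bytes) -> tuple[set[str], set[str]]:
    """Return (trip_ids, route_ids) found in trips.txt of a GTFS Schedule ZIP."""
    trip_ids: set[str] = set()
    route_ids: set[str] = set()
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        with archive.open("trips.txt") as raw:
            # utf-8-sig strips a byte-order mark if the producer wrote one.
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            for row in csv.DictReader(text):
                trip_ids.add(row["trip_id"])
                # Only routes that have at least one trip appear here;
                # routes.txt holds the full route list.
                route_ids.add(row["route_id"])
    return trip_ids, route_ids


def build_sample_feed() -> bytes:
    """Build a tiny in-memory feed (made-up rows) so the sketch runs offline."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "trips.txt",
            "route_id,service_id,trip_id\n12,WKDY,t1\n12,WKDY,t2\n7,WKDY,t3\n",
        )
    return buffer.getvalue()


trip_ids, route_ids = load_schedule_ids(build_sample_feed())
print(sorted(trip_ids), sorted(route_ids))
```

In practice the bytes would come from the agency's published feed URL rather than a generated sample.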

The three GTFS-Realtime entity types

A GTFS-Realtime feed carries up to three kinds of entity, each answering a different question. Agencies may publish them as separate feed URLs or combine them.

  • Vehicle Positions — where each vehicle is right now: its latitude/longitude, and often bearing, speed, occupancy, and the trip/stop it is currently serving. This is what powers the moving dots on a live map.
  • Trip Updates — predicted or actual arrival and departure times for upcoming stops on a trip, expressed as delays or absolute times against the schedule. This is what drives "next bus in 4 minutes." Trip Updates also carry cancellations and added trips via schedule-relationship flags.
  • Service Alerts — human-readable notices about disruptions: detours, stop closures, elevator outages, delays, scoped to the affected routes, stops, or trips. This is what surfaces "Route 12 detoured due to construction."

Vehicle Positions tell you where service is; Trip Updates tell you when it will arrive; Service Alerts tell riders why something changed. A complete realtime picture usually needs all three.
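
As a rough illustration of how the three types sit side by side in one feed, the sketch below tallies entities by type. It assumes the Protocol Buffer message has already been decoded into plain Python dicts keyed by the FeedEntity field names vehicle, trip_update, and alert; the sample entities and the count_entity_types helper are made up for the example.

```python
from collections import Counter

# FeedEntity field name -> the label used in this article.
ENTITY_FIELDS = {
    "vehicle": "Vehicle Positions",
    "trip_update": "Trip Updates",
    "alert": "Service Alerts",
}


def count_entity_types(entities: list[dict]) -> Counter:
    """Count how many entities of each realtime type a decoded feed carries."""
    counts: Counter = Counter()
    for entity in entities:
        for field, label in ENTITY_FIELDS.items():
            if field in entity:
                counts[label] += 1
    return counts


# Made-up entities, shaped like a decoded combined feed.
sample_entities = [
    {"id": "v1", "vehicle": {"trip": {"trip_id": "t1"},
                             "position": {"latitude": 45.52, "longitude": -122.68}}},
    {"id": "v2", "vehicle": {"trip": {"trip_id": "t2"},
                             "position": {"latitude": 45.50, "longitude": -122.65}}},
    {"id": "tu1", "trip_update": {"trip": {"trip_id": "t1"},
                                  "stop_time_update": [{"stop_id": "s1",
                                                        "arrival": {"delay": 120}}]}},
    {"id": "a1", "alert": {"informed_entity": [{"route_id": "12"}]}},
]

counts = count_entity_types(sample_entities)
print(dict(counts))
```

A type whose count is zero is a quick hint that the agency publishes it at a separate URL or does not publish it at all.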

Why feed quality matters

Feed defects have three audiences, and they fail all three:

  • Riders and the apps they use. Consumer apps are strict. A vehicle on a trip ID that does not exist in the schedule, or a stale timestamp, can make a bus vanish from the map or show a prediction that never comes true. Riders do not blame the feed format — they blame the agency.
  • Internal analysis. Reliability metrics, headway adherence, on-time performance, and realized-service measures are all computed by reconciling realtime against the schedule. Mismatched IDs or dropped positions silently bias those numbers, and a biased reliability finding can send a service decision the wrong way.
  • Public trust. A feed that is visibly unreliable undermines confidence in the whole system, and once riders stop trusting real-time arrivals they stop using them.

Common realtime problems

Most realtime quality issues fall into a handful of recurring patterns:

  • Stale timestamps. The feed header carries a timestamp, and Vehicle Positions and Trip Updates can carry their own. If the header or individual vehicles stop updating, consumers are reading old data presented as current. Freshness is the single most important realtime health signal: a feed that has not moved in minutes is effectively down even if the URL still returns HTTP 200.
  • ID mismatches against the schedule. A Trip Update or Vehicle Position that references a trip_id or route_id not present in the active static feed cannot be placed on the network. This is one of the most common defects, and it usually traces back to the realtime and static feeds drifting out of sync after a service change.
  • Missing vehicles or positions. Vehicles in service but absent from the feed, or position records with no coordinates, leave gaps that make coverage and reliability analysis incomplete.
  • Schedule–realtime drift. When the static feed is updated but the realtime producer still points at the old trips (or vice versa), predictions attach to the wrong runs. Keeping the two feeds version-aligned is a constant operational task.
  • Gaps and bunching, surfaced honestly. Real operations produce bunched and dropped trips; a good realtime feed reports them faithfully rather than hiding them. Analysis depends on the feed telling the truth about what ran.

The rule of thumb: a GTFS-Realtime feed is only as trustworthy as the static feed it references and the freshness of its timestamps. Validate the schedule first, then check that every realtime entity matches a real trip and carries a recent timestamp. Everything else builds on those two checks.
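
The freshness half of that rule can be sketched in a few lines. GTFS-Realtime timestamps are POSIX seconds, so staleness is a subtraction; the 90-second threshold, the find_stale helper, and the sample values below are assumptions chosen for illustration, and the right threshold depends on how often a given feed refreshes.

```python
MAX_AGE_SECONDS = 90  # assumed threshold; tune to the feed's refresh interval


def find_stale(header_ts: int, vehicle_ts: dict[str, int], now: int,
               max_age: int = MAX_AGE_SECONDS) -> list[str]:
    """Return labels for the feed header and any vehicles older than max_age."""
    stale = []
    if now - header_ts > max_age:
        stale.append("feed header")
    for vehicle_id, ts in sorted(vehicle_ts.items()):
        if now - ts > max_age:
            stale.append(f"vehicle {vehicle_id}")
    return stale


# Made-up values: a fresh header, one fresh vehicle, one that stopped reporting.
now = 1_700_000_000
report = find_stale(
    header_ts=now - 10,
    vehicle_ts={"bus-1": now - 20, "bus-2": now - 600},
    now=now,
)
print(report)
```

Passing the current time in as an argument, rather than reading the clock inside the function, keeps the check easy to test and to replay against archived feeds.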

Validating the static feed

Before any realtime check — and before any analysis — validate the schedule. The reference tool is the Canonical GTFS Schedule Validator, the community-standard validator maintained under the MobilityData umbrella, which the wider GTFS ecosystem treats as the baseline definition of a well-formed feed. It parses the feed and reports a list of notices, each with a severity:

  • ERROR — a violation of the specification that will likely break consumers (e.g., a referenced stop_id that does not exist, a malformed time). These must be fixed.
  • WARNING — something that is technically valid but probably unintended or risky (e.g., unusually fast travel, far-flung stops, duplicate entries). These should be reviewed.
  • INFO — observations and good-to-know notes that rarely require action.

Reading the notice rollup — how many errors, which warnings, where they cluster — tells you whether the schedule is fit to publish and fit to analyze. A feed full of errors will produce wrong headways, wrong distances, and a realtime feed that references broken trips, so the static check is the foundation everything else rests on.
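
A notice rollup of this kind can be sketched as a count per severity. The example assumes the validator's JSON report exposes a notices list whose items carry code, severity, and totalNotices fields; treat those field names as an assumption to check against the report your validator version produces. The sample report, counts, and helper names are made up.

```python
import json
from collections import Counter

# Made-up report contents, shaped like the assumed JSON report structure.
SAMPLE_REPORT = """
{
  "notices": [
    {"code": "foreign_key_violation", "severity": "ERROR", "totalNotices": 3},
    {"code": "fast_travel_between_consecutive_stops", "severity": "WARNING", "totalNotices": 12},
    {"code": "unknown_column", "severity": "INFO", "totalNotices": 1}
  ]
}
"""


def rollup_notices(report: dict) -> Counter:
    """Sum notice occurrences per severity (ERROR, WARNING, INFO)."""
    totals: Counter = Counter()
    for notice in report.get("notices", []):
        totals[notice["severity"]] += notice.get("totalNotices", 1)
    return totals


def is_publishable(totals: Counter) -> bool:
    """Treat any ERROR as release-blocking; warnings are left for review."""
    return totals["ERROR"] == 0


totals = rollup_notices(json.loads(SAMPLE_REPORT))
print(dict(totals), "publishable:", is_publishable(totals))
```

Grouping by code as well as severity would show where the problems cluster, which is usually the next question after the totals.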

Best practice: validate continuously and reconcile

Validation is not a one-time gate; feeds change with every service change and the realtime stream changes every minute. A durable practice looks like this:

  • Validate the static feed on every publish with the Canonical Validator, and treat errors as release-blocking.
  • Monitor realtime freshness continuously — alert when feed or entity timestamps go stale.
  • Reconcile realtime against the schedule — confirm that trip and route IDs in Vehicle Positions and Trip Updates resolve to active trips in the current static feed, and watch for drift after each service change.
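
The reconciliation step might look like the following minimal sketch. It assumes realtime entities have been decoded into plain dicts using the GTFS-Realtime field names (vehicle, trip_update, trip, trip_id, schedule_relationship) and that the set of scheduled trip IDs has already been read from the current static feed; the unmatched_trip_refs helper and the sample data are hypothetical.

```python
def unmatched_trip_refs(entities: list[dict],
                        scheduled_trip_ids: set[str]) -> list[tuple[str, str]]:
    """Return (entity_id, trip_id) pairs that do not resolve to a scheduled trip."""
    problems = []
    for entity in entities:
        for field in ("vehicle", "trip_update"):
            trip = entity.get(field, {}).get("trip", {})
            trip_id = trip.get("trip_id")
            if trip_id is None:
                continue
            # ADDED trips are flagged as extra service, so they are
            # legitimately absent from the static schedule.
            if trip.get("schedule_relationship") == "ADDED":
                continue
            if trip_id not in scheduled_trip_ids:
                problems.append((entity["id"], trip_id))
    return problems


# Made-up data: one good reference, one orphan, one added trip.
scheduled = {"t1", "t2", "t3"}
entities = [
    {"id": "v1", "vehicle": {"trip": {"trip_id": "t1"}}},
    {"id": "v2", "vehicle": {"trip": {"trip_id": "old-t9"}}},
    {"id": "tu1", "trip_update": {"trip": {"trip_id": "extra-1",
                                           "schedule_relationship": "ADDED"}}},
]

orphans = unmatched_trip_refs(entities, scheduled)
print(orphans)
```

A sudden jump in the number of unmatched references right after a service change is the typical signature of the two feeds drifting out of sync.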

For planners, this is not a back-office IT concern — it is the precondition for credible analysis. Reliability and headway-adherence findings are computed directly from realtime reconciled against the schedule, so a finding is only as defensible as the feed underneath it. Starting an analysis on an unvalidated feed means you may be measuring data defects, not service.

How HeadwayForge helps

HeadwayForge bakes feed quality into the workflow rather than leaving it to a separate tool. It validates GTFS feed quality using the Canonical Validator and presents the results as notice rollups (error and warning counts you can scan quickly), so you know whether a schedule is sound before you build on it. Alongside that, it surfaces live vehicle positions and realtime health, including feed freshness, so you can tell whether the live feed is current and consistent with the schedule. The result is that an analysis of service supply, reliability, access, or equity starts from data you have already confirmed is clean. See the product overview for how validation fits the rest of the workbench, and the data-coverage view for how feed health is tracked across agencies nationwide.

Start every analysis from a feed you trust

HeadwayForge runs the Canonical GTFS Validator, rolls up the notices, and surfaces live vehicle positions and realtime freshness — so your reliability and service numbers rest on clean data.