As of this spring, Miovision will have turned 1.5 million hours of video into traffic data from multiple video sources, including our own Scout Video Collection Unit. With daily volumes as high as 7,000 hours of video, we rely on a systematic method of combining computer vision algorithms and human verification to ensure data reports meet our customers’ expectations.
In this two-part blog series, I’ll explain how Miovision turns video into traffic data. The series draws on significant support from Miovision Technical Director and Computer Vision Architect Justin Eichel, PhD.
Video-Based Traffic Data Collection
For those not familiar with Miovision’s traffic data collection solution, here’s a quick overview:
- Customer records video: the recording can cover an intersection, a midblock location, a highway, a roundabout, or a pedestrian and bicycle location. Video can be provided by any video source, but we recommend our best-in-class Scout Video Collection Unit.
- Video is uploaded to Miovision: every customer has a cloud-based account on the Miovision Platform where they can upload video and quickly select the types and amount of data they need.
- Data reports are downloaded: the video is turned into traffic data on the Miovision Platform. Data reports are stored on the customer account and available for download in a variety of formats, along with the video recording.
Counting Cars from Video – How We Do It
When the video is uploaded to the Miovision Platform, it is received by one of our Data Services Technicians at our office in Kitchener, Ontario, Canada. From there, every video follows the same process:
Configuration and Processing
- Video is manually configured by a Data Services Technician to identify all possible movements by their entry points and exit points, then submitted for processing
- Traffic data is produced with computer vision by successfully detecting vehicles and converting those detections into traffic observations
- Areas of low computer vision confidence are flagged and manually counted by a Data Services Technician
- 12% of each hour of video is randomly sampled to be manually counted by a Data Services Technician. This double-processing helps ensure we meet our accuracy standards for each vehicle classification, pedestrian count, and bicycle count
- A visual time-of-day inspection is done to review for data abnormalities
- Corridors and adjacent locations are checked with data visualization tools to diagnose any potential data discrepancies between common links.
- Discrepancies are reviewed manually, either by checking for data bin overlap or by examining the study area for mid-block trip generators or sinks
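The random sampling step in the list above can be illustrated with a minimal sketch. It assumes each hour is split into fixed-length segments and that 12% of those segments are drawn at random for a manual count. The 36-second segment length, the function names, and the error metric are all hypothetical choices made for illustration; the post does not describe how the sample is actually drawn or scored.

```python
import random

SAMPLE_FRACTION = 0.12   # from the process above: 12% of each hour
SEGMENT_SECONDS = 36     # hypothetical segment length (100 segments per hour)


def pick_audit_segments(hour_seconds=3600, seed=None):
    """Return sorted (start, end) offsets, in seconds, covering 12% of one hour."""
    rng = random.Random(seed)
    n_segments = hour_seconds // SEGMENT_SECONDS
    n_audit = round(SAMPLE_FRACTION * n_segments)
    chosen = rng.sample(range(n_segments), n_audit)
    return sorted((i * SEGMENT_SECONDS, (i + 1) * SEGMENT_SECONDS) for i in chosen)


def count_error(automated, manual):
    """Relative error per class between automated and manual counts."""
    return {
        cls: abs(automated.get(cls, 0) - manual_count) / manual_count
        for cls, manual_count in manual.items()
        if manual_count > 0
    }


segments = pick_audit_segments(seed=1)
audited_seconds = sum(end - start for start, end in segments)

# Hypothetical counts for the audited segments of one hour
errors = count_error({"car": 98, "truck": 10}, {"car": 100, "truck": 10})
```

Under these assumptions, 12 of the 100 segments are selected, which is 432 seconds (7.2 minutes) of every hour. The per-class errors could then be compared against whatever accuracy standard applies to each classification.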
Producing Traffic Counts with Computer Vision
Our vehicle detection algorithm is trained on a growing dataset of over 7.7 million frames of video to ensure that we cover various locations and scenarios. The algorithm is tuned for rain, snow, changes in lighting, time of day, and various other environmental conditions.
Vehicle detection and directionality are inferred by separating moving vehicles from their background. To successfully isolate vehicle movement, our algorithm performs millions of logic and pattern recognition calculations per frame to understand the scene and localize vehicle motion.
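To make the idea of separating motion from background concrete, here is a minimal sketch of a textbook technique: a running-average background model with a luminosity threshold. It is not Miovision's algorithm, which the post describes as far more involved. The function names, the blending factor `alpha`, and the threshold are assumptions for illustration, and plain Python lists stand in for the image arrays a real system would use.

```python
def update_background(background, frame, alpha=0.05):
    """Blend a new frame into a running-average background model."""
    return [
        [(1.0 - alpha) * b + alpha * f for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def foreground_pixels(background, frame, threshold=30.0):
    """Return (row, col) of pixels whose luminosity departs from the background."""
    return [
        (r, c)
        for r, (b_row, f_row) in enumerate(zip(background, frame))
        for c, (b, f) in enumerate(zip(b_row, f_row))
        if abs(f - b) > threshold
    ]


def centroid(pixels):
    """Mean (row, col) of the foreground pixels: a rough object location."""
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return sum(rows) / len(rows), sum(cols) / len(cols)


# Synthetic grayscale scene: an empty road with luminosity 100.
empty_road = [[100.0] * 8 for _ in range(6)]
background = empty_road
for _ in range(10):                       # let the model settle on empty frames
    background = update_background(background, empty_road)

# A brighter 2x3 "vehicle" enters the scene.
frame = [row[:] for row in empty_road]
for r in (2, 3):
    for c in (1, 2, 3):
        frame[r][c] = 200.0

moving = foreground_pixels(background, frame)
location = centroid(moving)
```

In this toy scene the six bright pixels are flagged as foreground and their centroid gives a rough vehicle position. A simple model like this breaks down under rain, glare, or lighting changes, which is why production systems layer much more scene understanding on top.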
The background is derived through scene understanding using temporal information, structural information, and appearance information, such as colour and luminosity. When scenic elements are understood, vehicle motion can be isolated within a configured area (Fig 1). Using state machines and vehicle tracking algorithms, vehicle directionality is inferred.
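As an illustration of how a state machine can turn a tracked vehicle into a directional movement, the sketch below walks through the zone a vehicle occupies in each frame and reports the entry and exit pair once both have been seen. The zone names, the `TrackState` enum, and `classify_movement` are hypothetical; the post does not describe the states Miovision actually uses.

```python
from enum import Enum, auto


class TrackState(Enum):
    SEARCHING = auto()    # vehicle not yet seen in an entry zone
    IN_TRANSIT = auto()   # entered; waiting for it to reach an exit zone
    COUNTED = auto()      # a complete movement has been recorded


def classify_movement(zone_sequence, entry_zones, exit_zones):
    """Return (entry, exit) for a tracked vehicle, or None if the track is incomplete.

    zone_sequence holds the configured zone the vehicle occupies in each
    frame, or None when it is outside every zone.
    """
    state = TrackState.SEARCHING
    entry = None
    movement = None
    for zone in zone_sequence:
        if state is TrackState.SEARCHING and zone in entry_zones:
            state, entry = TrackState.IN_TRANSIT, zone
        elif state is TrackState.IN_TRANSIT and zone in exit_zones:
            state, movement = TrackState.COUNTED, (entry, zone)
        # COUNTED is terminal: any later frames are ignored.
    return movement


ENTRY_ZONES = {"south_in", "west_in"}     # hypothetical configured entry points
EXIT_ZONES = {"north_out", "east_out"}    # hypothetical configured exit points

complete = classify_movement(
    ["south_in", "south_in", None, None, "east_out"], ENTRY_ZONES, EXIT_ZONES
)
incomplete = classify_movement([None, "south_in", None], ENTRY_ZONES, EXIT_ZONES)
```

The complete track yields the pair `("south_in", "east_out")`, while the track that never reaches an exit zone yields `None` and would not be counted as a movement.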
For each detection, probabilistic models are applied to ensure the detection is actually a vehicle. Other components of the algorithm use pattern recognition and statistical models to eliminate false positives and to reinforce areas of successful vehicle identification.
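One simple example of a probabilistic check is a naive Bayes update that combines independent cues into a single probability that a detection is a vehicle. The sketch below uses Bayes' rule in odds form. The cues, the likelihood ratios, and the acceptance threshold are invented for illustration and are not taken from Miovision's models.

```python
def posterior_vehicle(prior, likelihood_ratios):
    """Combine independent cues via Bayes' rule in odds form.

    Each ratio is P(cue | vehicle) / P(cue | not vehicle).
    """
    odds = prior / (1.0 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds / (1.0 + odds)


ACCEPT_THRESHOLD = 0.9   # hypothetical cut-off for keeping a detection

# Hypothetical cues: plausible size (4x more likely for a vehicle)
# and motion along a lane (3x more likely for a vehicle).
strong = posterior_vehicle(0.5, [4.0, 3.0])    # odds of 12 to 1
# The same cues pointing the other way, as with a drifting shadow.
weak = posterior_vehicle(0.5, [0.25, 0.5])     # odds of 1 to 8

keep_strong = strong >= ACCEPT_THRESHOLD
keep_weak = weak >= ACCEPT_THRESHOLD
```

With these made-up numbers the first detection is kept and the second is rejected as a likely false positive.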
The algorithm itself has been developed over the last several years, beginning with Miovision CEO Kurtis McBride’s work in his master’s thesis, published in 2007. It has since evolved to include:
- static object detector
- dynamic lane estimation
- video rectification
- scene registration
- object tracking
- background estimation
Certain times of day challenge the computer vision algorithm because of lighting conditions and sun glare. At times, humans must intervene to count video segments correctly, but ongoing development work aims to handle these challenging scenes and improve the production system. For example, when the algorithm detects that a video segment is outside of its scope and cannot be counted automatically, that segment is immediately distributed to a Data Services Technician to be manually reviewed and counted. The same segment is also added to a database of training videos used by the development team.
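The routing described above can be sketched in a few lines. This sketch assumes each segment carries a confidence score and that a fixed threshold decides whether it is in scope. The `SegmentRouter` class, the score, and the 0.8 threshold are hypothetical; the post does not say how the algorithm decides that a segment is out of scope.

```python
from dataclasses import dataclass, field

MIN_CONFIDENCE = 0.8   # hypothetical threshold for automatic counting


@dataclass
class SegmentRouter:
    automated: list = field(default_factory=list)      # counted by computer vision
    manual_queue: list = field(default_factory=list)   # sent to a technician
    training_set: list = field(default_factory=list)   # kept for the development team

    def route(self, segment_id, confidence):
        """Send a segment to automatic counting, or to manual review and training."""
        if confidence >= MIN_CONFIDENCE:
            self.automated.append(segment_id)
        else:
            # Out-of-scope segments go to both destinations.
            self.manual_queue.append(segment_id)
            self.training_set.append(segment_id)


router = SegmentRouter()
router.route("seg-001", confidence=0.95)   # clear daytime scene
router.route("seg-002", confidence=0.40)   # heavy sun glare
```

The design point worth noting is the double destination: every segment a person has to count also becomes training material, so the scenes that defeat the algorithm today are the ones it is trained on next.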
Part Two of Turning Video into Traffic Data covers deconstructing video for counting, how we account for error, and algorithm training.