Data Collection in Transport Systems
Our world is becoming substantially more connected as we move toward the realization of the Internet of Things (IoT) concept. IoT describes a state in which computers, industrial equipment, sensors, and virtually all other types of devices are able to share data with each other (1). From PCs to smartphones to vehicles, many everyday objects are already able to communicate with one another. The information technology research and advisory company Gartner estimates that by 2020 there will be approximately 26 billion units in the IoT network, with an expected revenue of $300 billion (2). That is roughly a tenfold increase from the 2.5 billion devices connected in 2009 (1). IoT has been discussed extensively in general data terms, but the concept also has large implications for transportation, where it is already changing how data is collected and used.
In the present day, automobiles are effectively dozens of computers on four wheels. They can communicate with the driver as well as with external networks. This connectedness is only increasing with time, and a steady stream of resources is being invested in the development of autonomous and connected vehicles. These vehicles have the potential to operate in sync with an interconnected transport network. Such a network will generate an enormous amount of data and, with it, opportunities to improve how transport systems are run. Access to this data will force current data collection methods to evolve and new ones to emerge in order to properly harness that potential. Data collection in transport systems and big data are undeniably interconnected.
Traditionally, traffic data has been collected through intrusive methods such as pneumatic road tubes, piezoelectric sensors, and magnetic loops, or non-intrusive methods such as manual counts, passive and active infrared, passive magnetic, and microwave radar (3). These methods often require a significant input of time and money to produce results, and those results are often limited in breadth.
Floating Car Data (FCD) is the idea of collecting, analyzing, and relaying traffic data in real time through global positioning systems (GPS) or mobile networks (2). This data includes vehicle location, infrastructure data, weather data, and transit data (4). Figure 1 shows an example of what the data transfer network at an urban intersection might look like in the near future. The network would inform users of their surroundings and traffic authorities of conditions, facilitating a more seamless system with fewer conflicts between modes. FCD is not a new concept and has already seen considerable application; however, it can be applied further to advance Intelligent Transport Systems (ITS).
One example is FCD’s application in transport safety. In the past, transportation safety programs have relied upon historical crash data to identify critical network locations, determine collision causation, and evaluate safety countermeasures. This reactive approach requires years of data collection because collisions are not high-frequency events. Furthermore, it requires collisions to occur before they can be prevented, and the collision records usually do not capture the entire picture of the crash event. FCD and new data collection and processing technologies allow for a more complete understanding of pre-collision conditions, and they allow large amounts of surrogate data to be collected to identify and prioritize critical sites in real time (5). These shifts in data collection will provide a more solid foundation for decision making. Research on transportation safety and other applications of big data is ongoing. Programs such as the Real-Time Data Capture and Management (DCM) program and the Dynamic Mobility Applications (DMA) program, both instituted by the USDOT, illustrate the move toward greater integration of IoT data. These programs further develop FCD by establishing new data acquisition techniques and advancing application technologies (4).
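To make the FCD idea concrete, the following is a minimal sketch of how average travel speeds might be derived from raw GPS probe reports. It is an illustration under stated assumptions, not a description of how any of the cited programs process their data: the `GpsFix` record layout and the function names are hypothetical, and a real system would also need map matching and privacy safeguards.

```python
from __future__ import annotations

from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


@dataclass
class GpsFix:
    """One probe report from a vehicle (hypothetical record layout)."""
    vehicle_id: str
    timestamp_s: float  # seconds since a common epoch
    lat: float          # degrees
    lon: float          # degrees


def haversine_m(a: GpsFix, b: GpsFix) -> float:
    """Great-circle distance between two fixes, in metres."""
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = (sin(dlat / 2) ** 2
         + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_M * asin(sqrt(h))


def probe_speeds_kmh(fixes: list[GpsFix]) -> dict[str, float]:
    """Average speed per vehicle: path length over elapsed time, in km/h."""
    by_vehicle: dict[str, list[GpsFix]] = {}
    for fix in fixes:
        by_vehicle.setdefault(fix.vehicle_id, []).append(fix)

    speeds: dict[str, float] = {}
    for vehicle_id, track in by_vehicle.items():
        track.sort(key=lambda f: f.timestamp_s)
        elapsed_s = track[-1].timestamp_s - track[0].timestamp_s
        if elapsed_s <= 0:
            continue  # a single fix gives no speed estimate
        # Sum the distances between consecutive fixes along the track.
        distance_m = sum(haversine_m(p, q) for p, q in zip(track, track[1:]))
        speeds[vehicle_id] = distance_m / elapsed_s * 3.6  # m/s to km/h
    return speeds
```

Averaging these per-vehicle speeds over all probes on the same road segment would give a traffic authority a near real-time picture of conditions without any roadside sensor.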
A systemic change as large as the IoT concept comes with many challenges. In their paper "Urban computing in the wild: A survey on large scale participation and citizen engagement with ubiquitous computing, cyber physical systems, and Internet of Things," Salim and Haque note that successfully deploying an urban computing system in the real world is complex and brings challenges not found in a controlled environment (1). In a world with countless dynamic variables, a system must be able to adapt to changing conditions to remain valuable.
One of these challenges is designing systems that attract mass public support through active participation. Salim and Haque add that having the technological capability to collect data is not enough on its own to successfully facilitate big data transfer. We must consider the human element as well when constructing large data collection systems (1). In a transportation context, this could involve a phone app that allows users to report traffic conditions to a central system and to view what others have submitted so that they can adjust their actions accordingly. Human willingness to participate is essential for such applications. Salim and Haque propose a five-step strategy for structuring participation in large-scale projects: identifying needs and dilemmas, identifying stakeholders, identifying incentives, gathering evidence and experience, and providing tools and affordance (1).
Another challenge presented by the IoT is data quality control. Our systems are becoming more automated and data-driven, which introduces new vulnerabilities. Fruehe points out that in an IoT system, bad data leads to bad decisions, especially when the data is not recognized as wrong (6). This issue is being addressed through the development of processing techniques, such as data cleaning methodologies that use self-learning software to differentiate poor data from higher quality data (7).
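As a simple illustration of what separating poor data from good data can look like, the sketch below flags implausible speed readings with a modified z-score based on the median absolute deviation. This is a basic statistical filter chosen for brevity; it is not the self-learning approach described in (7). The function name is hypothetical, and the constants 0.6745 and 3.5 are conventional defaults for this score rather than values taken from the cited work.

```python
from __future__ import annotations

from statistics import median


def flag_suspect_readings(speeds_kmh: list[float],
                          threshold: float = 3.5) -> list[bool]:
    """Return True for each reading whose modified z-score exceeds threshold."""
    if not speeds_kmh:
        return []
    med = median(speeds_kmh)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = median(abs(v - med) for v in speeds_kmh)
    if mad == 0:
        # More than half the readings equal the median, so the spread is zero;
        # fall back to flagging any reading that differs from the median.
        return [v != med for v in speeds_kmh]
    return [0.6745 * abs(v - med) / mad > threshold for v in speeds_kmh]


# Example: readings from one detector, including two faulty values.
readings = [52, 55, 53, 54, 250, 51, -10]
flags = flag_suspect_readings(readings)
clean = [v for v, bad in zip(readings, flags) if not bad]
```

Because the median and the median absolute deviation are barely moved by a few extreme values, the faulty readings do not distort the baseline they are judged against, which is the main weakness of a filter built on the mean and standard deviation.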
Other challenges, such as data security and data retention, arise from the size of these networks and the finite capacity available to store information. Currently, 2.5 quintillion bytes of data are created every day, and 90% of the data in the world was created in the past two years (5). This growth rate puts our storage needs into perspective. An autonomous system is also vulnerable to data being intercepted or interfered with, such as a traffic light receiving false information from an external attacker, so infrastructure must be developed with these threats in mind. With regard to handling the large quantities of data that will be generated, Skorupa argues that data handling entities will be forced to decentralize applications and aggregate data in smaller data centers, contrary to the current trend toward data centralization (2).
The future of transportation data is an expanding interconnected network of data providers and data users. This network will come with challenges, but it promises even greater benefits in the form of increased efficiency and safety. Earlier this week, the Obama administration released guidelines for the introduction of autonomous vehicles onto public roads (8). This breaks new ground in the shift toward a more connected transport system and, ultimately, the Internet of Things. With the future unfolding before us, we must remain vigilant and adaptive to realize the potential these networks hold. In reality, the future of transportation management has as much to do with IT as it does with concrete and steel (9).
(1) F. Salim and U. Haque, "Urban computing in the wild: A survey on large scale participation and citizen engagement with ubiquitous computing, cyber physical systems, and Internet of Things," Elsevier, Melbourne, 2014.
(2) Gartner, "Gartner Newsroom," Gartner, 9 March 2014. [Online]. Available: http://www.gartner.com/newsroom/id/2684616. [Accessed 14 September 2016].
(3) G. Leduc, "Road Traffic Data: Collection Methods and Applications," European Commission, Seville, 2008.
(4) United States Department of Transportation, "Intelligent Transportation Systems Joint Program Office," United States Department of Transportation, 2016. [Online]. Available: http://www.its.dot.gov/factsheets/realtime_dcm_factsheet.htm. [Accessed 14 September 2016].
(5) L. Fu, L. F. Miranda-Moreno, C. Lee and L. Thakali, "Applying Big Data to Road Safety," Canadian Civil Engineer, Montreal, 2015.
(6) J. Fruehe, "The Internet of Things is About Data, Not Things," Forbes, 30 July 2015. [Online]. Available: http://www.forbes.com/sites/moorinsights/2015/07/30/the-internet-of-things-is-about-data-not-things/#5a1f20d74e45. [Accessed 14 September 2016].
(7) V. M. Megler, K. Tufte and D. Maier, "Improving Data Quality in Intelligent," Portland State University, Portland, 2015.
(8) C. Kang, "Self-Driving Cars Gain Powerful Ally: The Government," The New York Times, 19 September 2016. [Online]. Available: http://www.nytimes.com/2016/09/20/technology/self-driving-cars-guidelines.html. [Accessed 23 September 2016].
(9) S. Ezell, "Explaining International IT Application Leadership: Intelligent Transportation Systems," The Information Technology and Innovation Foundation, Washington D.C., 2010.