Statistical Insights into Transportation / New York, US

Project Name: Statistical Insights Into Transportation / New York, US


Project Description:


New York Transportation Statistical Insights is a project meant to inform people planning to travel to New York, as well as local residents interested in how transportation functions in one of the world’s most crowded cities.


In this project I used Python for data analysis and modeling, along with libraries that helped me create new data frames and plot points on a map (Folium, Matplotlib, Pandas, NumPy). These insights can be fed into a Machine Learning model to get results relevant to Smart Cities and Transportation. The analysis addresses one of the biggest problems big cities have: overcrowding. By analyzing the number of cars, traffic incident detections, subway turnstile usage, the number of pedestrians passing by, and weather conditions, we can derive patterns in the way New Yorkers move around the city. These patterns can produce recommendations, based on the datasets, about which days to avoid going out or which routes are better for reaching a destination.
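As a rough illustration of the kind of pattern described above, the sketch below averages subway turnstile entries by weekday with Pandas and picks out the busiest day. The column names (`date`, `entries`) and all values are hypothetical, invented for this example; they are not taken from the project's datasets, and this is not necessarily how the project computed its results.

```python
import pandas as pd

# Hypothetical sample: daily subway turnstile entries (values invented for illustration).
turnstile = pd.DataFrame({
    "date": pd.to_datetime([
        "2020-03-02", "2020-03-03", "2020-03-07", "2020-03-08",
        "2020-03-09", "2020-03-10", "2020-03-14", "2020-03-15",
    ]),
    "entries": [5200, 5400, 2100, 1800, 5000, 5600, 2300, 1700],
})

# Label each row with its weekday name, then average the entries per weekday.
turnstile["weekday"] = turnstile["date"].dt.day_name()
mean_by_weekday = turnstile.groupby("weekday")["entries"].mean()

# The weekday with the highest average is the one a recommendation might flag.
busiest_day = mean_by_weekday.idxmax()
print(mean_by_weekday.sort_values(ascending=False))
print("Busiest weekday in this sample:", busiest_day)
```

The same grouping could be applied to car counts, pedestrian counts or incident detections, assuming each dataset carries a date or timestamp column.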


These insights were derived using cloud-based Python 3.6 instances powered by IBM: Developer Skills Network – Cognitive Class for smaller datasets and IBM Cloud for bigger datasets. Tableau Public was also used for displaying bar charts. The data was first prepared (converting values, creating new data frames for accuracy, removing columns and rows that were not populated) and then used to derive insights and to plot maps and bar charts.
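The preparation steps mentioned above (converting values, removing unpopulated columns and rows, building a new data frame) might look roughly like the following in Pandas. This is a minimal sketch under stated assumptions: the `raw` frame, its column names and its values are hypothetical and stand in for whatever the actual data feeds contained.

```python
import numpy as np
import pandas as pd

# Hypothetical raw feed: column names and values are invented for illustration.
raw = pd.DataFrame({
    "station": ["A", "B", None, "D"],
    "vehicle_count": ["120", "95", None, "n/a"],
    "notes": [np.nan, np.nan, np.nan, np.nan],  # a column that was never populated
})

# Drop columns with no values at all, then rows with no values at all.
cleaned = raw.dropna(axis="columns", how="all").dropna(axis="index", how="all")

# Convert counts from text to numbers; unparseable entries become NaN.
cleaned = cleaned.assign(
    vehicle_count=pd.to_numeric(cleaned["vehicle_count"], errors="coerce")
)

# Keep only rows that still have a usable count, as a new data frame.
cleaned = cleaned.dropna(subset=["vehicle_count"]).reset_index(drop=True)
print(cleaned)
```

Coercing unparseable values to NaN before dropping them keeps the conversion and the filtering as separate, inspectable steps, which makes it easier to see how many rows each step removes.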

I strongly believe that Statistics and Machine Learning can help us build better cities in the future, and I think we can achieve that goal by using computational power. I created the project with this idea in mind, but I also wanted to make it friendly and easy to view and use.



Additional Items: N/A

Developer(s): Andrei Ionut Dumitrache, University of Essex, Award Recipient, Global IoT Datathon hosted by Terbine


Data Feeds Employed:


Open Source Tools:

IBM: Developer Skills Network – Cognitive Class

IBM Cloud

Tableau Public


GitHub: N/A


Original Posting Date: 10 September 2020