This study examines daily ridership for New York City's iconic yellow taxis over 2017-2019, a period that saw a significant decline in ridership. Using a comprehensive dataset from the NYC Taxi and Limousine Commission, several time series approaches, including ARIMA models, were applied to predict daily passenger volumes. The analysis revealed strong seasonal patterns and a consistent linear decline, with daily ridership falling by approximately 200 passengers per day throughout the study period. After comparing multiple modeling approaches, a first-order autoregressive model fitted after detrending and cycle removal yielded the most precise predictions, achieving a test Root Mean Square Error (RMSE) of 34,880 passengers against an average daily ridership of 438,000 passengers. These findings offer insights for policymakers and stakeholders seeking to understand and potentially address the downward trajectory of NYC's yellow taxi service.
The study also explored ARMA models, using grid search to determine the model orders. ARMA(9, 9) and ARMA(6, 4) emerged as promising candidates based on AIC scores and harmonic mean scores. AIC favored ARMA(9, 9), which had a low score of 22.55, although higher-order models of this kind carry a risk of overfitting. BIC, which penalizes model complexity more heavily, favored simpler models such as ARMA(6, 4). Overall, the study sheds light on the dynamics of NYC's yellow taxi ridership and underscores the importance of robust time series modeling techniques when working with complex datasets. By offering insights into passenger trends and forecasting methodologies suited to declining ridership, it serves as a resource for decision-makers aiming to steer NYC's yellow taxi service towards sustainable growth and resilience in an evolving transportation landscape.
- Study focused on daily ridership for New York City's iconic yellow taxis from 2017-2019
- Utilized comprehensive dataset from NYC Taxi and Limousine Commission
- Analysis revealed consistent linear decline of approximately 200 passengers per day over the study period
- First-order autoregressive model with detrending and cycle removal techniques provided most precise predictions
- Test RMSE of 34,880 passengers against average daily ridership of 438,000 passengers achieved by the model
- Exploration of modeling techniques like ARMA models and grid search methods identified potential candidates such as ARMA(9, 9) and ARMA(6, 4)
- AIC scores favored ARMA(9, 9) due to low score of 22.55 while BIC scores favored simpler models like ARMA(6, 4)
- Emphasized importance of employing robust time series modeling techniques for accurate forecasting in complex datasets
Summary
A study looked at how many people rode in New York City's famous yellow taxis each day from 2017 to 2019. It used a large amount of information from the NYC Taxi and Limousine Commission. The study found that the daily number of passengers went down by about 200 each day during this time. Using special math models, the researchers could predict how many passengers would ride each day. The best model had an error of about 34,880 passengers, compared to the average of 438,000 passengers daily.
Definitions
- Daily ridership: The number of people who use a service or travel on a vehicle in one day.
- Dataset: A collection of data or information used for analysis.
- Linear decline: A steady decrease in a straight line over time.
- Autoregressive model: A mathematical model that uses past values to predict future values.
- RMSE (Root Mean Square Error): The square root of the average squared difference between predicted and actual values; lower values indicate more accurate predictions.
- ARMA models: Autoregressive Moving Average models used for time series analysis.
- Grid search methods: A technique for finding the best parameters for a model by testing different combinations systematically.
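- AIC (Akaike Information Criterion) scores: A measure used for comparing statistical models that balances goodness of fit against the number of parameters; lower scores indicate a better trade-off.
- BIC (Bayesian Information Criterion) scores: Another measure for comparing statistical models, where lower scores are also better; for all but very small samples it penalizes additional parameters more heavily than AIC and so tends to favor simpler models.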
Introduction
New York City's yellow taxi service has long been an iconic symbol of the city's bustling streets and vibrant energy. However, in recent years, there has been a noticeable decline in daily ridership for this mode of transportation. This study examines that decline by analyzing a comprehensive dataset from the NYC Taxi and Limousine Commission covering the period of 2017-2019 and by building time series models to forecast daily passenger volumes.
The Dataset
The dataset used in this study contains daily passenger volumes for New York City's yellow taxis from January 1st, 2017 to December 31st, 2019. It includes information such as pickup date and time, number of passengers, trip distance, fare amount, and other relevant variables.
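If the raw data are trip-level records, as the listed fields suggest, daily passenger volumes have to be obtained by summing passenger counts per pickup date. The following is a minimal sketch of that aggregation, not the study's actual preprocessing. The column names `tpep_pickup_datetime` and `passenger_count`, the timestamp layout, and the three sample rows are assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime


def daily_passengers(rows):
    """Sum passenger counts per pickup date from an iterable of trip dicts."""
    totals = defaultdict(int)
    for row in rows:
        # Assumed timestamp layout: "YYYY-MM-DD HH:MM:SS".
        pickup = datetime.strptime(row["tpep_pickup_datetime"], "%Y-%m-%d %H:%M:%S")
        totals[pickup.date().isoformat()] += int(row["passenger_count"])
    # Return the totals ordered by date.
    return dict(sorted(totals.items()))


# Made-up sample rows standing in for a real trip-record file.
SAMPLE_CSV = """tpep_pickup_datetime,passenger_count,trip_distance,fare_amount
2017-01-01 00:15:00,2,1.8,9.5
2017-01-01 13:40:00,1,0.9,6.0
2017-01-02 08:05:00,3,4.2,17.0
"""

daily = daily_passengers(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(daily)  # {'2017-01-01': 3, '2017-01-02': 3}
```

The resulting date-to-count mapping is the kind of daily series that the models discussed in the following sections operate on.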
Methodology
To accurately predict daily passenger volumes for NYC's yellow taxis during the study period, various modeling approaches were employed. These included ARIMA (AutoRegressive Integrated Moving Average) models, which are commonly used for time series analysis.
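The simplest member of the ARIMA family is the first-order autoregressive model, AR(1), in which each value depends linearly on the previous one. The sketch below estimates its two coefficients by ordinary least squares on a synthetic series. It illustrates the general idea only and is not the study's code; the function names, the simulated data, and the coefficient 0.6 are assumptions for illustration.

```python
import random


def fit_ar1(series):
    """Least-squares estimate of (c, phi) in x[t] = c + phi * x[t-1] + noise."""
    x_prev = series[:-1]
    x_next = series[1:]
    n = len(x_prev)
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    cov = sum((a - mean_prev) * (b - mean_next) for a, b in zip(x_prev, x_next))
    var = sum((a - mean_prev) ** 2 for a in x_prev)
    phi = cov / var
    c = mean_next - phi * mean_prev
    return c, phi


def forecast_next(series, c, phi):
    """One-step-ahead AR(1) forecast from the last observed value."""
    return c + phi * series[-1]


# Simulate an AR(1) process with a known coefficient (hypothetical data).
rng = random.Random(0)
true_phi = 0.6
series = [0.0]
for _ in range(5000):
    series.append(true_phi * series[-1] + rng.gauss(0.0, 1.0))

c_hat, phi_hat = fit_ar1(series)
print(f"estimated c={c_hat:.3f}, phi={phi_hat:.3f}")
print(f"next-step forecast: {forecast_next(series, c_hat, phi_hat):.3f}")
```

On a long simulated series the estimated coefficient should land close to the value used to generate the data, which is a quick sanity check before applying the same fit to real residuals.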
Data Analysis
The first step in data analysis was to explore any underlying patterns or trends within the dataset. This was done through visualizations such as line graphs and scatter plots. The analysis revealed strong seasonal patterns within the data, together with a consistent linear decline in which daily ridership fell by approximately 200 passengers each day throughout the study period.
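A linear decline and a repeating cycle of this kind can be estimated and removed before fitting a model to what remains. The sketch below fits a straight line by least squares and then subtracts per-weekday means. The source does not state the period of the cycle, so the 7-day period is an assumption, as are the synthetic series, its starting level of 500,000, and the weekday offsets.

```python
def fit_line(y):
    """Ordinary least squares fit of y[t] = intercept + slope * t, t = 0..n-1."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    return y_mean - slope * t_mean, slope


def remove_trend_and_weekly_cycle(y):
    """Return (slope, weekly means, residual) after detrending and cycle removal."""
    intercept, slope = fit_line(y)
    detrended = [v - (intercept + slope * t) for t, v in enumerate(y)]
    # Mean of the detrended values at each position in the assumed 7-day cycle.
    weekly = [sum(detrended[d::7]) / len(detrended[d::7]) for d in range(7)]
    residual = [v - weekly[t % 7] for t, v in enumerate(detrended)]
    return slope, weekly, residual


# Synthetic series: a decline of 200 per day plus a zero-sum weekly pattern.
offsets = [-30000, -10000, 0, 10000, 20000, 25000, -15000]
y = [500000 - 200 * t + offsets[t % 7] for t in range(7 * 52)]

slope, weekly, residual = remove_trend_and_weekly_cycle(y)
print(f"estimated slope: {slope:.1f} passengers per day")
print(f"largest remaining residual: {max(abs(r) for r in residual):.0f}")
```

Because the weekly pattern is not perfectly orthogonal to time, the fitted slope is close to, but not exactly, the value used to build the series. Fitting trend and cycle jointly would remove that small bias; the two-step version is shown because it is easier to follow.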
Model Selection
ARMA (AutoRegressive Moving Average) models were explored next, with a grid search over model orders used to find the best-fitting parameters. ARMA(9, 9) and ARMA(6, 4) emerged as promising candidates based on AIC (Akaike Information Criterion) scores and harmonic mean scores.
AIC highlighted ARMA(9, 9) as a favorable candidate due to its low score of 22.55. However, caution is warranted because higher-order models are prone to overfitting. In contrast, BIC (Bayesian Information Criterion) scores favored simpler models like ARMA(6, 4), reflecting the heavier penalty that BIC places on model complexity.
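The different preferences of AIC and BIC follow from their penalty terms: AIC adds 2 per parameter, while BIC adds the natural log of the sample size per parameter. The sketch below compares those penalties for the two candidates. It assumes 1,095 daily observations (three non-leap years) and counts p + q + 2 parameters for an ARMA(p, q) model (coefficients, constant, and noise variance); both are illustrative assumptions, and the study's own counts and score scaling may differ.

```python
import math


def aic(log_likelihood, k):
    """Akaike Information Criterion for a model with k parameters."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """Bayesian Information Criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * log_likelihood


def arma_param_count(p, q):
    # Assumed convention: p AR terms, q MA terms, a constant, a noise variance.
    return p + q + 2


def penalties(p, q, n):
    """Return the (AIC, BIC) complexity penalties, excluding the fit term."""
    k = arma_param_count(p, q)
    return 2 * k, k * math.log(n)


n_days = 1095  # assumed: 3 * 365 daily observations for 2017-2019
for p, q in [(9, 9), (6, 4)]:
    aic_pen, bic_pen = penalties(p, q, n_days)
    print(f"ARMA({p}, {q}): AIC penalty {aic_pen}, BIC penalty {bic_pen:.1f}")
```

With about a thousand observations the log of the sample size is close to 7, so each extra parameter costs roughly three and a half times as much under BIC as under AIC. That is why a model with many coefficients can win on AIC yet lose to a smaller model on BIC.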
Model Evaluation
To evaluate the accuracy of the chosen model, a test Root Mean Square Error (RMSE) was calculated. The selected first-order autoregressive model with meticulous detrending and cycle removal techniques achieved an RMSE of 34,880 passengers against an average daily ridership of 438,000 passengers.
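To put the reported error in context, it helps to compute RMSE explicitly and to express the reported value as a share of average ridership. The sketch below is a generic RMSE implementation rather than the study's evaluation code; only the figures 34,880 and 438,000 come from the text.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Reported figures from the study.
reported_rmse = 34_880
average_daily_ridership = 438_000

relative_error = reported_rmse / average_daily_ridership
print(f"{relative_error:.2%}")  # 7.96%
```

An RMSE of roughly 8% of average daily ridership gives a scale-free way to compare this model against alternatives or against forecasts for other services.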
Findings
The analysis revealed that there is a consistent linear decline in daily passenger volumes for NYC's yellow taxis throughout the study period. This suggests that there are underlying factors contributing to this downward trend.
Furthermore, through exploring different modeling techniques and evaluating their performance using AIC and BIC scores, this study highlights the importance of employing robust time series modeling techniques to effectively navigate through complex datasets.
Implications
The findings from this study have important implications for policymakers and stakeholders seeking to understand and potentially address the declining ridership observed in NYC's yellow taxi service. A quantified trend and a forecasting model with a known error give decision-makers a baseline against which to plan and to judge interventions aimed at steering the service towards sustainable growth and resilience in an evolving transportation landscape.
Suggestions for Future Research
While this study provides valuable insights into daily passenger volumes for NYC's yellow taxis during the period of 2017-2019, further research could explore other potential factors contributing to the decline in ridership such as competition from ride-sharing services or changes in consumer behavior.
Additionally, incorporating external variables such as weather conditions or events happening within the city could also improve the accuracy of future predictions. Furthermore, expanding the study to include data from other cities or time periods could provide a broader perspective on the factors influencing yellow taxi ridership.
Conclusion
In conclusion, this study sheds light on the intricate dynamics influencing NYC's yellow taxi ridership and underscores the importance of employing robust time series modeling techniques to effectively navigate through complex datasets. By offering nuanced insights into passenger trends and forecasting methodologies tailored to address declining ridership patterns accurately, this study serves as a valuable resource for decision-makers aiming to steer NYC's yellow taxi service towards sustainable growth and resilience in an evolving transportation landscape.