Forecasting NYC Yellow Taxi Ridership Decline: A Time Series Analysis of Daily Passenger Counts (2017-2019)

AI-generated keywords: analysis forecasting passenger counts time series modeling NYC yellow taxis

AI-generated Key Points

- The study focuses on daily ridership for New York City's iconic yellow taxis from 2017 to 2019
- It uses a comprehensive dataset from the NYC Taxi and Limousine Commission
- The analysis reveals a consistent linear decline of approximately 200 passengers per day over the study period
- A first-order autoregressive model, combined with detrending and cycle removal, provides the most accurate predictions
- The model achieves a test RMSE of 34,880 passengers against an average daily ridership of 438,000 passengers
- Exploring ARMA models with grid search identifies candidates such as ARMA(9, 9) and ARMA(6, 4)
- AIC favors ARMA(9, 9), with a low score of 22.55, while BIC favors simpler models like ARMA(6, 4)
- The study emphasizes robust time series modeling techniques for accurate forecasting in complex datasets

Authors: Gaurav Singh

arXiv: 2507.10588v1 (econ.GN)

License: CC BY 4.0

Abstract: This study analyzes and forecasts daily passenger counts for New York City's iconic yellow taxis during 2017-2019, a period of significant decline in ridership. Using a comprehensive dataset from the NYC Taxi and Limousine Commission, we employ various time series modeling approaches, including ARIMA models, to predict daily passenger volumes. Our analysis reveals strong seasonal patterns, with a consistent linear decline of approximately 200 passengers per day throughout the study period. After comparing multiple modeling approaches, we find that a first-order autoregressive model, combined with careful detrending and cycle removal, provides the most accurate predictions, achieving a test RMSE of 34,880 passengers on a mean ridership of 438,000 daily passengers. The research provides valuable insights for policymakers and stakeholders in understanding and potentially addressing the declining trajectory of NYC's yellow taxi service.

Submitted to arXiv on 11 Jul. 2025

Results of the summarizing process for the arXiv paper: 2507.10588v1

Comprehensive Summary

This study analyzes and forecasts daily passenger counts for New York City's iconic yellow taxis over 2017-2019, a period of significant decline in ridership. Using a comprehensive dataset from the NYC Taxi and Limousine Commission, the author applied several time series modeling approaches, including ARIMA models, to predict daily passenger volumes. The analysis revealed strong seasonal patterns on top of a consistent linear trend: daily ridership fell by approximately 200 passengers per day throughout the study period. After comparing multiple modeling approaches, a first-order autoregressive model, fitted after detrending and cycle removal, gave the most accurate predictions. It achieved a test Root Mean Square Error (RMSE) of 34,880 passengers against an average daily ridership of 438,000 passengers.

The study also explored ARMA models, using a grid search over model orders to choose parameters. ARMA(9, 9) and ARMA(6, 4) emerged as candidates based on AIC scores and harmonic mean scores. AIC favored ARMA(9, 9), with a low score of 22.55, although the study cautions that such higher-order models risk overfitting. BIC, which penalizes model complexity more heavily, favored simpler models such as ARMA(6, 4).

In conclusion, the study sheds light on the dynamics of NYC's yellow taxi ridership and underscores the importance of robust time series modeling techniques for complex datasets. Its findings on passenger trends and forecasting methods offer policymakers and stakeholders a resource for understanding and potentially addressing the decline of NYC's yellow taxi service.

Summary

A study looked at how many people rode in New York City's famous yellow taxis each day from 2017 to 2019. It used a large dataset from the NYC Taxi and Limousine Commission. The study found that daily passenger numbers fell by about 200 each day during this time. Using statistical models, the author could predict how many passengers would ride each day. The best model had a typical error of about 34,880 passengers, compared to an average of 438,000 passengers daily.

Definitions

- Daily ridership: The number of people who use a service or travel on a vehicle in one day.
- Dataset: A collection of data or information used for analysis.
- Linear decline: A steady decrease in a straight line over time.
- Autoregressive model: A mathematical model that uses past values to predict future values.
- RMSE (Root Mean Square Error): A measure of the typical size of prediction errors; lower values mean more accurate predictions.
- ARMA models: Autoregressive Moving Average models used for time series analysis.
- Grid search methods: A technique for finding the best parameters for a model by testing different combinations systematically.
- AIC (Akaike Information Criterion) scores: A measure used for comparing statistical models, where lower scores indicate better fitting models.
- BIC (Bayesian Information Criterion) scores: Another measure for comparing statistical models, where lower scores are better; it penalizes complexity more strongly than AIC, so it tends to favor simpler models.

Introduction

New York City's yellow taxi service has long been an iconic symbol of the city's bustling streets. In recent years, however, its daily ridership has declined noticeably. This study analyzes and forecasts that decline using a comprehensive dataset from the NYC Taxi and Limousine Commission covering the period of 2017-2019.

The Dataset

The dataset used in this study contains daily passenger volumes for New York City's yellow taxis from January 1st, 2017 to December 31st, 2019. It includes information such as pickup date and time, number of passengers, trip distance, fare amount, and other relevant variables.

Methodology

To accurately predict daily passenger volumes for NYC's yellow taxis during the study period, various modeling approaches were employed. These included ARIMA models (AutoRegressive Integrated Moving Average) which are commonly used for time series analysis.
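The "Integrated" part of ARIMA refers to differencing the series before fitting. A minimal sketch of that step follows, using made-up numbers rather than the TLC data; the helper name `difference` is hypothetical, and this summary does not say which differencing order, if any, the study's ARIMA candidates used.

```python
def difference(y, lag=1):
    """Return y[t] - y[t - lag] for every t where both values exist."""
    return [y[t] - y[t - lag] for t in range(lag, len(y))]

# Synthetic series (not the TLC data): a level that falls by 200 each day.
trend_only = [438_000 - 200 * t for t in range(10)]

# First differencing turns a linear trend into a constant.
daily_change = difference(trend_only)
print(daily_change)
```

With `lag=7`, the same helper compares each day with the same weekday one week earlier, which is one common way to suppress a weekly cycle.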

Data Analysis

The first step in data analysis was to explore any underlying patterns or trends within the dataset. This was done through visualizations such as line graphs and scatter plots. The analysis revealed strong seasonal patterns within the data with a consistent linear decline of approximately 200 passengers per day throughout the study period.
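One way to separate a linear trend from a repeating cycle can be sketched as follows. This is an illustration under stated assumptions, not the study's procedure: the series is synthetic, the 7-day period is an assumption (the summary does not state the cycle length), and the function names are hypothetical.

```python
from statistics import mean

def fit_linear_trend(y):
    """Ordinary least squares fit of y[t] = intercept + slope * t, t = 0..n-1."""
    n = len(y)
    t_bar = (n - 1) / 2
    y_bar = mean(y)
    sxx = sum((t - t_bar) ** 2 for t in range(n))
    sxy = sum((t - t_bar) * (v - y_bar) for t, v in enumerate(y))
    slope = sxy / sxx
    intercept = y_bar - slope * t_bar
    return intercept, slope

def remove_trend_and_cycle(y, period=7):
    """Subtract the fitted line, then the mean residual at each cycle position."""
    intercept, slope = fit_linear_trend(y)
    detrended = [v - (intercept + slope * t) for t, v in enumerate(y)]
    cycle = [mean(detrended[k::period]) for k in range(period)]
    residual = [d - cycle[t % period] for t, d in enumerate(detrended)]
    return slope, cycle, residual

# Synthetic data: level 500,000, falling 200 per day, plus an invented
# weekly pattern that sums to zero. None of these numbers come from the TLC data.
weekly = [-30_000, 10_000, 20_000, 0, 20_000, 10_000, -30_000]
series = [500_000 - 200 * t + weekly[t % 7] for t in range(28 * 7)]

slope, cycle, residual = remove_trend_and_cycle(series)
print(round(slope, 3))
```

Removing the trend first and the cycle second is a simple two-step approach. Fitting both components in a single regression is an alternative when the cycle and the trend are correlated.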

Model Selection

The study explored ARMA (AutoRegressive Moving Average) models, using grid search to determine the optimal parameters for model fitting. ARMA(9, 9) and ARMA(6, 4) emerged as promising candidates based on AIC (Akaike Information Criterion) scores and harmonic mean scores. AIC favored ARMA(9, 9), which had a low score of 22.55, but the study cautions that choosing higher-order models risks overfitting. In contrast, BIC (Bayesian Information Criterion) scores favored simpler models like ARMA(6, 4), reflecting BIC's stronger penalty on model complexity.
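The disagreement between the two criteria follows from their standard definitions, AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L, where k is the number of fitted parameters, n the number of observations, and L the maximized likelihood. The sketch below uses invented log-likelihood values purely for illustration; they are not the study's results and do not reproduce the reported AIC of 22.55. Counting k as p + q + 1 (the coefficients plus the noise variance) and setting n to 1095 days are also assumptions.

```python
import math

def aic(log_likelihood, k):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    """Bayesian Information Criterion: k ln(n) - 2 ln L."""
    return k * math.log(n) - 2 * log_likelihood

def arma_param_count(p, q):
    """Assumed count: p AR terms, q MA terms, and the noise variance."""
    return p + q + 1

n = 1095  # assumed sample size: three non-leap years of daily data

# Hypothetical log-likelihoods: the larger model fits slightly better.
candidates = {(9, 9): -1000.0, (6, 4): -1010.0}

scores = {}
for (p, q), ll in candidates.items():
    k = arma_param_count(p, q)
    scores[(p, q)] = (aic(ll, k), bic(ll, k, n))
    print((p, q), round(scores[(p, q)][0], 2), round(scores[(p, q)][1], 2))

best_by_aic = min(scores, key=lambda order: scores[order][0])
best_by_bic = min(scores, key=lambda order: scores[order][1])
```

Because ln(n) exceeds 2 whenever n is at least 8, BIC charges more per parameter than AIC on any realistic sample, so a small gain in fit can win under AIC and still lose under BIC.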

Model Evaluation

To evaluate the accuracy of the chosen model, a test Root Mean Square Error (RMSE) was calculated. The selected first-order autoregressive model with meticulous detrending and cycle removal techniques achieved an RMSE of 34,880 passengers against an average daily ridership of 438,000 passengers.
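A first-order autoregressive model predicts each value as a multiple of the previous one, x[t] = phi * x[t-1] + noise. RMSE is the square root of the mean squared forecast error. The sketch below fits phi by least squares on a tiny synthetic, noise-free series and scores one-step forecasts; the function names and data are hypothetical, and the input is assumed to be already detrended and centered. The last lines put the reported RMSE in proportion to the reported mean ridership.

```python
import math

def fit_ar1(x):
    """Least-squares estimate of phi in x[t] = phi * x[t-1] + noise."""
    num = sum(x[t] * x[t - 1] for t in range(1, len(x)))
    den = sum(x[t - 1] ** 2 for t in range(1, len(x)))
    return num / den

def one_step_forecasts(x, phi):
    """Forecast x[1:], each from the value observed one step earlier."""
    return [phi * x[t - 1] for t in range(1, len(x))]

def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Synthetic residual series (not the TLC data) that halves at every step.
train = [1000 * 0.5 ** t for t in range(10)]
phi = fit_ar1(train)

test = [8.0, 4.0, 2.0, 1.0]
predictions = one_step_forecasts(test, phi)
test_error = rmse(test[1:], predictions)

# Figures reported in the summary: RMSE relative to mean daily ridership.
relative_error = 34_880 / 438_000
print(f"{relative_error:.1%}")
```

On the reported figures, the test RMSE is roughly 8% of average daily ridership.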

Findings

The analysis revealed that there is a consistent linear decline in daily passenger volumes for NYC's yellow taxis throughout the study period. This suggests that there are underlying factors contributing to this downward trend. Furthermore, through exploring different modeling techniques and evaluating their performance using AIC and BIC scores, this study highlights the importance of employing robust time series modeling techniques to effectively navigate through complex datasets.

Implications

The findings have implications for policymakers and stakeholders seeking to understand and potentially address the declining ridership of NYC's yellow taxi service. The estimated trend and the forecasting approach give decision-makers a basis for planning in an evolving transportation landscape.

Suggestions for Future Research

While this study provides valuable insights into daily passenger volumes for NYC's yellow taxis during the period of 2017-2019, further research could explore other potential factors contributing to the decline in ridership such as competition from ride-sharing services or changes in consumer behavior. Additionally, incorporating external variables such as weather conditions or events happening within the city could also improve the accuracy of future predictions. Furthermore, expanding the study to include data from other cities or time periods could provide a broader perspective on the factors influencing yellow taxi ridership.

Conclusion

In conclusion, this study documents a steady decline in NYC's yellow taxi ridership over 2017-2019 and finds that a first-order autoregressive model, applied after detrending and cycle removal, forecasts daily passenger counts with a test RMSE of 34,880 passengers. It also underscores the importance of robust time series modeling techniques when working with complex datasets.

Created on 16 Jul. 2025
