Analyzing the Atlantic hurricane database (HURDAT2)

Anuroobika K
Published in Analytics Vidhya
3 min read · Jun 23, 2021

A hurricane is a tropical cyclone that occurs in the Atlantic Ocean or the northeastern Pacific Ocean. The same kind of storm is called a typhoon if it occurs in the northwestern Pacific Ocean, and a cyclone if it occurs in the South Pacific Ocean or the Indian Ocean.

Earth Space Picture from NASA showing a storm

Hurricanes form over the warm ocean water of the tropics. When warm, moist air over the water rises, it is replaced by cooler air, which then warms and starts to rise in turn. This cycle causes huge storm clouds to form. The storm clouds begin to rotate with the spin of the Earth, forming an organized system. If there is enough warm water, the cycle continues and the storm clouds and wind speeds grow until a hurricane forms.

Now, let’s analyze the hurricane database, which has information on all known tropical and subtropical storms, to understand hurricanes better.

Data collection:

I downloaded the HURDAT2 database from the National Hurricane Center site. This dataset (known as Atlantic HURDAT2) is a comma-delimited text file with six-hourly information on the location, maximum winds, central pressure, and (beginning in 2004) size of all known tropical cyclones and subtropical cyclones.

Detailed information regarding the Atlantic Hurricane Database Re-analysis Project is available from the Hurricane Research Division (Info on HURDAT2 database).

Snapshot of raw data

Data cleansing:

I used a Python script from a trajectory segmentation research work (thanks to Etemad, M., Júnior, A. S., Hoseyni, A., Rose, J., & Matwin, S. (2019)) to convert HURDAT2 to a dataframe that could be processed easily.
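The conversion step can be illustrated with a minimal sketch. This is not the script from the cited research work; it only assumes the published HURDAT2 layout, in which a header line (storm ID, name, row count) is followed by one line per six-hourly observation. The function name `parse_hurdat2`, the output keys, and the sample storm are made up for illustration.

```python
from datetime import datetime

# Made-up rows in the HURDAT2 layout (not real observations).
SAMPLE = """\
AL012099,            EXAMPLE,      2,
20990821, 0000,  , TS, 15.0N,  59.0W,  45, 1006,  105,    0,    0,   45,    0,    0,    0,    0,    0,    0,    0,    0,
20990821, 0600,  , HU, 16.0N,  60.6W,  70,  990,  130,    0,    0,   80,    0,    0,    0,    0,    0,    0,    0,    0,
"""

def parse_hurdat2(text):
    """Turn HURDAT2 text into a list of flat records (one per observation)."""
    records = []
    storm_id = name = None
    for line in text.splitlines():
        fields = [f.strip() for f in line.split(",")]
        if not fields[0]:
            continue  # skip blank lines
        if fields[0][:2].isalpha():
            # Header line: basin + cyclone number + year, then the storm name.
            storm_id, name = fields[0], fields[1]
            continue
        # Data line: latitude/longitude carry a hemisphere letter suffix.
        lat = float(fields[4][:-1]) * (1 if fields[4][-1] == "N" else -1)
        lon = float(fields[5][:-1]) * (-1 if fields[5][-1] == "W" else 1)
        records.append({
            "storm_id": storm_id,
            "name": name,
            "time": datetime.strptime(fields[0] + fields[1], "%Y%m%d%H%M"),
            "status": fields[3],
            "lat": lat,
            "lon": lon,
            "max_wind_kt": int(fields[6]),      # knots
            "min_pressure_mb": int(fields[7]),  # millibars
        })
    return records

records = parse_hurdat2(SAMPLE)
```

A list of dictionaries like this can be passed straight to `pandas.DataFrame(records)` if a dataframe is preferred.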

I then did further data manipulation: creating a new variable called ‘Category’ based on the maximum sustained wind speed; creating another variable called ‘StormID’ from the basin (Atlantic), the ATCF cyclone number for that year, and the year of the storm; dropping a few columns (all quadrant-wise wind radii); and renaming variables.
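As a rough sketch of how these two derived variables might be built, the snippet below bins wind speed with the Saffir-Simpson thresholds expressed in knots (the unit HURDAT2 uses for wind) and assembles an ID from basin, cyclone number, and year. The function names and label strings are assumptions for illustration, and the exact ‘StormID’ format used in the original analysis may differ.

```python
def saffir_simpson_category(wind_kt):
    """Label a maximum sustained wind speed given in knots."""
    if wind_kt < 0:
        return "Unknown"  # HURDAT2 marks missing wind with a negative value
    if wind_kt < 34:
        return "Tropical Depression"
    if wind_kt < 64:
        return "Tropical Storm"
    if wind_kt < 83:
        return "Category 1"  # 64-82 kt, i.e. 74-95 mph
    if wind_kt < 96:
        return "Category 2"
    if wind_kt < 113:
        return "Category 3"
    if wind_kt < 137:
        return "Category 4"
    return "Category 5"      # 137 kt or more, i.e. 157 mph or more

def make_storm_id(basin, cyclone_number, year):
    """Combine basin code, two-digit cyclone number, and year."""
    return f"{basin}{cyclone_number:02d}{year}"

print(saffir_simpson_category(70))    # Category 1
print(make_storm_id("AL", 9, 2011))   # AL092011
```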

Snapshot of processed dataframe

Cyclicality:

Looking at the number of storms per year, it is clearly visible that the count started to rise around the year 2000 and has been on an increasing trend since.
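The yearly count behind such a trend plot could be computed along these lines. This is a sketch under the assumption that each six-hourly record has been reduced to a (year, storm ID) pair; the function name and the sample rows are hypothetical.

```python
from collections import Counter

def storms_per_year(records):
    """Count distinct storms per year from (year, storm_id) pairs."""
    unique = set(records)  # collapse repeated six-hourly rows of one storm
    counts = Counter(year for year, _ in unique)
    return dict(sorted(counts.items()))

# Hypothetical rows: one (year, storm_id) pair per six-hourly record.
rows = [
    (1999, "AL011999"), (1999, "AL011999"), (1999, "AL021999"),
    (2000, "AL012000"),
]
print(storms_per_year(rows))  # {1999: 2, 2000: 1}
```

With a pandas dataframe, a similar result would come from grouping by a year column and counting unique storm IDs, for example `df.groupby("Year")["StormID"].nunique()`, where the column names are assumed.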

Seasonality:

In this data, the hurricane season runs from July to November, with its peak in September.
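One way to check seasonality is to count how many distinct storms were active in each calendar month. The sketch below assumes (month, storm ID) pairs taken from the observation timestamps; the function name and sample rows are hypothetical, and the original analysis may have counted differently (for example by formation month).

```python
from collections import defaultdict

def storms_by_month(records):
    """Count distinct storms active in each month from (month, storm_id) pairs."""
    active = defaultdict(set)
    for month, storm_id in records:
        active[month].add(storm_id)
    return {month: len(ids) for month, ids in sorted(active.items())}

# Hypothetical rows: storm "A" spans August and September.
rows = [(8, "A"), (8, "A"), (9, "A"), (9, "B"), (10, "C")]
counts = storms_by_month(rows)
peak_month = max(counts, key=counts.get)
print(counts)      # {8: 1, 9: 2, 10: 1}
print(peak_month)  # 9
```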

Category frequency:

There are around 700 named storms from 1950 to 2020. 50% of them are Category-1 hurricanes, with maximum wind speeds between 74 and 95 miles per hour (mph), and only four are Category-5 hurricanes, with maximum wind speeds above 156 mph.
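A frequency breakdown like this depends on assigning each storm a single category. A plausible approach, sketched below, is to take the highest category a storm reached over its lifetime and then compute the share of storms at each level. The record layout (a `storm_id` and an integer `category`, with 0 meaning below hurricane strength) and the function name are assumptions, not the original code.

```python
from collections import Counter

def peak_category_shares(records):
    """Share of storms by the highest category each storm reached."""
    peak = {}
    for r in records:
        sid = r["storm_id"]
        peak[sid] = max(peak.get(sid, 0), r["category"])
    counts = Counter(peak.values())
    total = len(peak)
    return {cat: counts[cat] / total for cat in sorted(counts)}

# Hypothetical records for four storms.
rows = [
    {"storm_id": "A", "category": 0}, {"storm_id": "A", "category": 1},
    {"storm_id": "B", "category": 1}, {"storm_id": "B", "category": 1},
    {"storm_id": "C", "category": 3}, {"storm_id": "C", "category": 5},
    {"storm_id": "D", "category": 0},
]
print(peak_category_shares(rows))  # {0: 0.25, 1: 0.5, 5: 0.25}
```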

Anuroobika K writes about data science topics in simple words and also enjoys writing about life skills. Connect on https://www.linkedin.com/in/anuroobika-k-905b8823/