Climate Prediction
Essay by review • February 14, 2011 • Research Paper • 1,785 Words (8 Pages) • 1,580 Views
CP-KNN: Seasonal to Inter-annual Climate Prediction using Data Mining KNN technique.
The impact of seasonal to inter-annual climate prediction on society, business, agriculture and almost every aspect of human life compels scientists to give the matter proper attention. The last few years have seen tremendous achievements in this field. All systems and techniques developed so far use Sea Surface Temperature (SST) as the main factor among other seasonal climatic attributes; statistical and mathematical models are then used for further climate prediction. In this paper we develop a system that uses the historical weather data of a region (rain, wind speed, dew point, temperature, etc.) and applies the data mining algorithm “k-nearest neighbor (KNN)” to classify these historical data into specific time spans. The k nearest time spans (the k nearest neighbors) are then used to predict the weather a month in advance. The experiments show that the system generates accurate results within reasonable time for a month in advance.
Objectives
The motivation behind the research is to extend the application of Data Mining to the fields of meteorology, oceanography and climatology. This will open a new era in the field of Data Mining and climate prediction. The main objectives are:
• Utilization of historical data
• Data Cleansing to convert the data into a uniform format
• Concrete Model for Climate Prediction using Data Mining
• Prediction using Numerical Data
• Improvements in performance and accuracy of Climate Prediction
Hypothesis (Problem Solution)
A huge amount of climatic data, spanning many years, is available. The data should first be brought into a uniform format. Textual information of a categorical nature can be converted to numerical form, e.g. an attribute with a yes/no option is converted to 1/0. Once the data is in numerical format, it is cleansed to remove noisy or missing values. Simple data mining cleansing techniques, e.g. replacing such values with the mean (average), can be used to remove noise from the data.
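The conversion and cleansing steps described above can be sketched as follows. This is a minimal illustration, not the system's actual implementation: the function names `encode_yes_no` and `fill_missing_with_mean` are hypothetical, missing values are assumed to be represented as `None`, and the mean of the observed values is assumed as the replacement.

```python
from statistics import mean


def encode_yes_no(value):
    """Map a yes/no answer to 1/0; leave any other value unchanged."""
    if isinstance(value, str):
        lowered = value.strip().lower()
        if lowered == "yes":
            return 1
        if lowered == "no":
            return 0
    return value


def fill_missing_with_mean(column):
    """Replace missing entries (None, by assumption) with the mean of the rest."""
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("column has no observed values to average")
    fill = mean(observed)
    return [fill if v is None else v for v in column]


# Hypothetical sample values, for illustration only.
rain_flag = [encode_yes_no(v) for v in ["yes", "no", "Yes"]]
temperature = fill_missing_with_mean([30.0, None, 34.0])
```

Mean replacement is only one possible choice; a median or an interpolation between neighboring days would fit the same step.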
The user selects the city, date range and attributes for which the prediction is sought. The system then retrieves the following data from the dataset:
1) The previous data, covering as many days as are to be forecast. This span starts from the first date in the user-specified date range and runs for the number of days to forecast.
2) The data for all dates for the selected attributes and city.
This data is divided into sequences, where the length of each sequence equals No_of_days_to_forcast * No_of_selected_attributes. The previous data forms a sequence by itself; this sequence serves as the base sequence in the distance measurement formula.
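The division into sequences might look like the sketch below. It assumes that each day is stored as a list of the selected attribute values and that the sequences are consecutive and non-overlapping; the text does not say whether the windows overlap, so this is one possible reading. The name `split_into_sequences` and the sample values are hypothetical.

```python
def split_into_sequences(records, days_per_sequence):
    """Cut daily records into consecutive, non-overlapping sequences.

    Each record holds the selected attribute values for one day. Trailing
    days that do not fill a whole sequence are dropped.
    """
    if days_per_sequence <= 0:
        raise ValueError("days_per_sequence must be positive")
    usable = len(records) - len(records) % days_per_sequence
    return [
        records[start:start + days_per_sequence]
        for start in range(0, usable, days_per_sequence)
    ]


# Five days of two attributes (temperature, wind speed), made-up values.
daily_records = [
    [20.0, 5.0],
    [22.0, 6.0],
    [21.0, 4.0],
    [23.0, 7.0],
    [19.0, 3.0],
]
sequences = split_into_sequences(daily_records, days_per_sequence=2)
values_per_sequence = sum(len(day) for day in sequences[0])
```

With two days and two attributes per day, each sequence holds 2 * 2 = 4 values, matching the No_of_days_to_forcast * No_of_selected_attributes rule.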
Once the data is divided into sequences, one of the distance measurement techniques is applied to find the distance between every record sequence and the base sequence. The distance is calculated day-wise and attribute-wise. After the distance calculation, all sequences are sorted by distance. We collect the top k sequences (those nearest to the base sequence) and take their simple mean, again day-wise and attribute-wise, as the predicted value.
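The distance, sorting and averaging steps can be illustrated with the sketch below. The text leaves the distance measure open, so Euclidean distance is assumed here purely for illustration; the function names and the sample sequences are hypothetical as well.

```python
import math


def sequence_distance(sequence, base):
    """Euclidean distance accumulated over every day and every attribute."""
    return math.sqrt(sum(
        (value - base_value) ** 2
        for day, base_day in zip(sequence, base)
        for value, base_value in zip(day, base_day)
    ))


def predict_from_neighbors(sequences, base, k):
    """Average the k sequences nearest to the base, day-wise and attribute-wise."""
    if not 1 <= k <= len(sequences):
        raise ValueError("k must be between 1 and the number of sequences")
    nearest = sorted(sequences, key=lambda s: sequence_distance(s, base))[:k]
    n_days, n_attrs = len(base), len(base[0])
    return [
        [sum(s[d][a] for s in nearest) / k for a in range(n_attrs)]
        for d in range(n_days)
    ]


# Base sequence and three historical sequences (2 days x 2 attributes).
base_sequence = [[20.0, 5.0], [22.0, 6.0]]
history = [
    [[21.0, 5.0], [23.0, 6.0]],
    [[19.0, 5.0], [21.0, 6.0]],
    [[40.0, 0.0], [41.0, 0.0]],
]
prediction = predict_from_neighbors(history, base_sequence, k=2)
```

Here the third sequence is far from the base and is excluded, so the prediction is the element-wise mean of the first two. Attributes with very different scales (e.g. temperature versus wind speed) would likely need normalizing before the distance is computed, which this sketch omits.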
Introduction
Seasonal to inter-annual (S2I) climate prediction is a recent development in meteorology, pursued in collaboration with oceanography and climatology all over the world. Weather and climate affect human society in all dimensions. In agriculture, they increase or decrease crop production [1, 2]. In water management [3], rain, an element of weather, is the most important factor for water resources. Energy sources, e.g. natural gas and electricity, depend greatly on weather conditions. Day-to-day weather prediction has been used for decades to forecast a few days in advance, but recent developments have moved the trend from a few days to inter-annual forecasts [4]. The S2I forecast predicts climate from months to a year in advance. Climate changes from year to year; e.g. rainy/dry and cold/warm seasons significantly influence society as well as the economy. Technological improvements have increased the understanding in meteorology of how different cycles, such as ENSO (El Niño Southern Oscillation, i.e. the alternating warm and cold phases of the ocean) over the Pacific Ocean, and Sea Surface Temperature (SST) affect the climate of regions worldwide. Many countries, including the United States of America (National Oceanic and Atmospheric Administration - NOAA), England (Met Office - Meteorology Department and London Weather Center), Sri Lanka (Department of Meteorology, Sri Lanka), India (India Meteorological Department and National Center for Medium Range Weather Forecasting - NCMRWF) and Bangladesh (Bangladesh Meteorological Department), have started using seasonal climate forecast systems.
Related Work
A number of tools are available for climate prediction. All the initial efforts used statistical models. Most of these techniques predict SST based on the ENSO phenomenon [5]. NINO3 uses a simple average for a specific latitude and longitude ( ) etc. Canonical correlation analysis [11, 12] is another statistical model; it takes data from different oceans (i.e. Indian, Atlantic, Pacific, etc.) and forecasts monthly SST anomalies (notable departures from routine measurements, i.e. very high or very low SST).
The International Research Institute for Climate Prediction (IRI) developed dynamical models based on the Atmosphere General Circulation Model (AGCM), e.g. ECHAM3 (Max Planck Institute), MRF9 (National Centers for Environmental Prediction - NCEP) and CCM3 (National Center for Atmospheric Research - NCAR). Other models, which are statistical, numerical or dynamical (two-tiered), include Canonical Correlation Analysis (CCA) [6; 11], Nonlinear Canonical Correlation Analysis (NCCA) [7], Tropical Atmosphere Ocean Array (TAOA) [8], the Global Forecast System and the Climate Forecast Model. These models use SST as the main attribute for forecasting, among other climatic attributes. The sources of these attributes [9] are ENSO teleconnections (which affect the global climate) and the Indian and Atlantic Oceans (which affect the regional climate). None of these models is accurate for all situations and regions. These systems also use geographical location (longitude and latitude) to identify the different regions instead
...
...