Practical Deep Learning with Climate Data

Deep learning no longer seems to be the hot topic it once was. Thanks to the myriad tutorials available online, anyone can talk about deep learning for five minutes. But has the barrier to entry really dropped to the level of statistical methods such as EOF decomposition?

On one hand, deep learning is oversold as a unique algorithm, leading those eager to try it to overestimate how much their problems actually need it. On the other hand, deep learning frameworks such as PyTorch and TensorFlow give people the impression that they must learn a new programming language.

In practice, the mathematics behind basic deep learning models is not complicated. Deep learning models do require hyperparameter tuning, and unlike traditional algorithms, which give a fixed output for a given input, they demand extensive experimentation. As for the frameworks, the initial learning cost of PyTorch is about the same as that of NumPy. Rather than relying on scattered online tutorials, I suggest that anyone interested in applying these algorithms get systematic training, such as taking a course, especially one with assignments, so that deep learning becomes as approachable as any other commonly used algorithm.

This semester, I enrolled in a course titled “Practical Deep Learning with Climate Data”, finally making up for the regret of having only attended theory classes without daring to do any practical work. The theoretical part is fairly simple, so I rarely attended the lectures. The greatest value of the course lies in the practice: each class comes with a notebook, and to be honest, the workload is substantial. The exercises even include an introductory Python notebook. The early exercises did not allow PyTorch at all, so coding the functions by hand deepens your understanding of the model training process. As the models became more complex, we began using the functions that PyTorch provides. Completing these assignments showed me that PyTorch is just another Python package.
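To give a sense of what coding the training process by hand involves, here is a minimal sketch of gradient descent on a one-variable linear model, using only the Python standard library. It is not taken from the course notebooks: the function name train_linear, the synthetic data, and the learning rate are all assumptions chosen for illustration.

```python
import random


def train_linear(xs, ys, lr=0.05, epochs=500):
    """Fit y ≈ w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Forward pass: residuals of the current model on every sample.
        residuals = [w * x + b - y for x, y in zip(xs, ys)]
        # Backward pass: analytic gradients of mean((w * x + b - y) ** 2).
        grad_w = 2.0 * sum(r * x for r, x in zip(residuals, xs)) / n
        grad_b = 2.0 * sum(residuals) / n
        # Update step: move each parameter against its gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical synthetic data: y = 2x + 1 plus a little Gaussian noise.
rng = random.Random(0)
xs = [i / 10 for i in range(-20, 21)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.1) for x in xs]

w, b = train_linear(xs, ys)
print(f"fitted w = {w:.2f}, b = {b:.2f}")
```

A framework such as PyTorch automates the backward pass and the update step, but the loop it runs has the same three stages.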

Since this is the first time the course has been offered, I did not feel comfortable posting the assignments directly on GitHub. Instead, I have uploaded the course materials and code to Google Drive: https://drive.google.com/drive/folders/1gLsVDVIEdq-21RBXwKJEq5yUZW0g8kNE?usp=sharing. Because the notebooks can be run directly in Colab, I highly recommend the Google Drive link. If you cannot access it due to network issues, you can download the materials here: https://owncloud.gwdg.de/index.php/s/mYPlh7rYnhypveM

The exercises mainly include the following modules:

[Image: list of the exercise modules]

NOTE: This resource (including the notebooks and PDFs) is copyrighted by David Greenberg, and this sharing link is solely for educational exchange. No one may use this resource for commercial purposes without the permission of David Greenberg.

In the notebooks, the instructor describes the required steps in detail using Markdown, and some important steps include code hints, like the one below:

[Image: notebook cell with a code hint]

The exercises are progressive: later exercises let you reuse parts of the code from earlier ones, which gives the series good continuity. In short, it is ideal material for beginners in deep learning.

Upon completing the above exercises, you will find that PyTorch is very flexible and that its documentation is very comprehensive. When needed, simply look up the function you are using (e.g., torch.nn.Conv2d) and check three things. First, the meaning of each parameter: the first parameter of torch.nn.Conv2d, in_channels, is the number of channels of the input data, so if SST fields from the previous seven days are fed in at once, in_channels=7. Second, the size of the input data and the meaning of each dimension: torch.nn.Conv2d expects input in the order (batch_size, in_channels, height, width). Third, the size of the output data and the meaning of each dimension, so that the output can be fed correctly into the next function.
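The shape bookkeeping described above can be checked on paper before any tensor is created. The sketch below is plain Python and does not call PyTorch. It applies the output-size formula given in the torch.nn.Conv2d documentation, simplified to square kernels with the same stride, padding, and dilation in both directions. The helper name conv2d_output_shape, the 64 x 128 grid, and the batch size of 16 are assumptions made for illustration.

```python
def conv2d_output_shape(input_shape, in_channels, out_channels, kernel_size,
                        stride=1, padding=0, dilation=1):
    """Output shape of a 2D convolution for (batch, channels, height, width) input.

    Hypothetical helper: it mirrors the argument order of torch.nn.Conv2d but
    only does the size arithmetic from that layer's documentation.
    """
    batch, channels, height, width = input_shape
    if channels != in_channels:
        # The channel dimension of the data must match the layer's in_channels.
        raise ValueError(f"expected {in_channels} input channels, got {channels}")

    def out_size(size):
        # floor((size + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1)
        return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

    return (batch, out_channels, out_size(height), out_size(width))


# Seven previous days of SST stacked as channels on an assumed 64 x 128 grid.
x_shape = (16, 7, 64, 128)

# Layer 1: 3 x 3 kernel with padding 1 keeps the spatial size unchanged.
h1_shape = conv2d_output_shape(x_shape, in_channels=7, out_channels=32,
                               kernel_size=3, padding=1)
print(h1_shape)  # (16, 32, 64, 128)

# Layer 2: its in_channels must equal the out_channels of layer 1.
h2_shape = conv2d_output_shape(h1_shape, in_channels=32, out_channels=64,
                               kernel_size=3, stride=2, padding=1)
print(h2_shape)  # (16, 64, 32, 64)
```

Chaining the shapes this way makes a mismatch between one layer's output and the next layer's in_channels visible immediately.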

At the end of the course, each student is required to complete a project. Although most of the students are first-year graduate students, every project surprised me. In addition to some excellent projects that extended the exercises from 04 – Convolutional Networks, one group came up with the idea of using an auto-encoder to expand the ensemble size of a dataset.

I am not sure what the best translation of “ensemble” is, but the idea is as follows. In climate models, people have found that slightly perturbing the initial conditions and then applying the same forcing can lead to vastly different results. The observed record of the real world is therefore only one of countless possible realizations. When we observe a rising temperature trend, how can we show that it is caused by external forcing rather than being a coincidence of this particular realization? If we have a large number of ensemble members and analyze them together, we find that although the temperature evolution differs from member to member, the long-term trend in each is upward. The ensemble mean can therefore be taken to represent the externally forced signal.

A model can produce as many members as we like, but some may ask how we can show that each member is realistic, since in reality there is only one realization. Constructing an ensemble from observational data therefore becomes a meaningful project. The group’s idea is that different ensemble members should share the same spatial mean and variance, and that an auto-encoder, a deep learning model that takes an input field and outputs another field, produces output whose mean and standard deviation are close to those of the input. The two ideas line up, which suggests that an auto-encoder could generate a large ensemble from existing data.
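The ensemble-mean argument can be illustrated with a toy calculation. This is a minimal sketch using only the Python standard library, not the group's project code. It assumes a linear forced warming of 0.02 per year and treats internal variability as independent Gaussian noise, which is a strong simplification because real internal variability is correlated in time. The names make_ensemble and linear_trend are hypothetical.

```python
import random


def make_ensemble(n_members, n_years, trend, noise_std, seed=0):
    """Synthetic members: one shared forced trend plus independent noise each."""
    rng = random.Random(seed)
    forced = [trend * t for t in range(n_years)]
    return [[f + rng.gauss(0.0, noise_std) for f in forced]
            for _ in range(n_members)]


def linear_trend(series):
    """Least-squares slope of a series against its time index."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    cov = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    var = sum((t - t_mean) ** 2 for t in range(n))
    return cov / var


FORCED_TREND = 0.02  # assumed forced signal, in degrees per year
members = make_ensemble(n_members=50, n_years=30,
                        trend=FORCED_TREND, noise_std=0.3)

# Averaging across members at each time step cancels much of the noise.
ensemble_mean = [sum(values) / len(values) for values in zip(*members)]

member_trends = [linear_trend(m) for m in members]
mean_trend = linear_trend(ensemble_mean)

print(f"single-member trends range from {min(member_trends):.3f} "
      f"to {max(member_trends):.3f}")
print(f"ensemble-mean trend: {mean_trend:.3f} (forced: {FORCED_TREND})")
```

Individual members scatter around the forced trend, while the trend of the ensemble mean sits much closer to it. That is the sense in which the ensemble mean represents the forced signal.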

As for my own project, I plan to use a seq2seq model to predict the daily NAO index. This is the kind of topic that is worth trying in a course project, but in real scientific research I would not dare choose it, because predicting the NAO index is very challenging, let alone at daily resolution. After my presentation, the instructor remarked that this is currently an impossible task. Still, I enjoy such attempts. In day-to-day research, under pressure to produce output, we often do not dare to try bold ideas, yet some of them can be quite interesting. I will therefore also post updates on my public account about small projects that are not guaranteed to succeed and may not turn out to be useful.

