# 10 Lesser-Known but Incredibly Useful Deep Learning Algorithms You Need to Master in 2024

### The Era of Large Models is Here: 4 Must-Know Research Directions! Miss One and You're OUT!

These algorithms may not deliver the best practical results, but the theory behind them is certainly elegant and profound. Here are my top 10 favorites.

### Word2vec

When I first encountered machine learning, I saw an example in a book: "China - Beijing = France - Paris". It completely changed my perception, and I fell headfirst down the rabbit hole of machine learning.
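A tiny numpy sketch of that analogy, using hypothetical toy embeddings constructed so that each capital equals its country vector plus a shared "capital-of" offset (real word2vec learns such offsets from text, rather than having them built in):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy embeddings: each capital is its country plus a shared
# "capital-of" offset, mimicking the structure word2vec discovers from data.
offset = rng.normal(size=8)
vecs = {}
for country, capital in [("China", "Beijing"), ("France", "Paris"), ("Japan", "Tokyo")]:
    c = rng.normal(size=8)
    vecs[country] = c
    vecs[capital] = c + offset + 0.01 * rng.normal(size=8)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Beijing - China + France should land closest to Paris.
query = vecs["Beijing"] - vecs["China"] + vecs["France"]
query_words = {"Beijing", "China", "France"}
best = max((w for w in vecs if w not in query_words),
           key=lambda w: cosine(query, vecs[w]))
print(best)  # Paris
```

With real pretrained vectors the same nearest-neighbor query (excluding the input words) is how such analogies are evaluated.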

### Variational Autoencoder (VAE)

Variational inference and autoencoders are a natural fit. Unlike a plain autoencoder, the VAE encodes each input as a distribution and draws random samples from it, which forces the network to learn a smooth latent space where nearby codes decode to similar outputs. This idea has profoundly influenced generative models.
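A minimal numpy sketch of the VAE's two distinctive ingredients, the reparameterization trick and the closed-form KL term (toy encoder outputs, no training loop):

```python
import numpy as np

rng = np.random.default_rng(0)

# The encoder predicts a mean and log-variance per latent dimension;
# sampling is made differentiable via z = mu + sigma * eps.
def reparameterize(mu, logvar, rng):
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def kl_to_standard_normal(mu, logvar):
    # KL(q(z|x) || N(0, I)) in closed form, summed over latent dims.
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=-1)

mu = np.array([[0.0, 0.0]])      # hypothetical encoder outputs
logvar = np.array([[0.0, 0.0]])  # sigma = 1, so the KL term is exactly 0
z = reparameterize(mu, logvar, rng)
print(kl_to_standard_normal(mu, logvar))  # [0.]
```

In training, this KL term is added to the reconstruction loss, pulling the latent distribution toward a standard normal.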

### Generative Adversarial Network (GAN)

It breaks away from the usual Encoder-Decoder paradigm entirely. A minimax optimization problem becomes a contest between two networks: a generator that creates samples and a discriminator that judges them.
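The minimax game can be sketched with the two standard losses on toy discriminator logits (illustrative values; the non-saturating generator loss shown here is the common practical variant):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Discriminator maximizes log D(x) + log(1 - D(G(z))); we minimize the negative.
def d_loss(d_real_logit, d_fake_logit):
    return -(np.log(sigmoid(d_real_logit)) + np.log(1.0 - sigmoid(d_fake_logit)))

# Non-saturating generator loss: maximize log D(G(z)).
def g_loss(d_fake_logit):
    return -np.log(sigmoid(d_fake_logit))

# A discriminator that is confident and correct has low loss...
print(d_loss(5.0, -5.0))   # small
# ...while the generator's loss is then large, driving it to improve.
print(g_loss(-5.0))        # large
```

Training alternates gradient steps on these two losses, which is exactly the "one makes, one judges" battle.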

### Graph Convolutional Network (GCN)

**Papers:** Spectral Networks and Locally Connected Networks on Graphs; Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering; Semi-Supervised Classification with Graph Convolutional Networks

**"Graph convolutional networks" is meant broadly here: the term covers the ideas of these three papers, each of which studies a different kind of convolutional network on graphs.**

Spectral CNN was the foundational work. Drawing on spectral graph theory, it applied convolution to graphs for the first time.

Chebyshev Spectral CNN made a key change: it replaced the spectral filter with Chebyshev polynomials of the graph Laplacian, avoiding the expensive eigendecomposition and making the computation much faster.

GCN is the culmination. It simplifies further still, condensing graph convolution into a single normalized matrix multiplication. It achieves the utmost elegance.
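That single matrix can be sketched directly: the GCN propagation rule is H' = ReLU(D^{-1/2}(A + I)D^{-1/2} H W), shown here on a hypothetical 4-node graph with random features and weights (all values illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 4-node path graph, as an adjacency matrix.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

A_hat = A + np.eye(4)                       # add self-loops
d = A_hat.sum(axis=1)
D_inv_sqrt = np.diag(d ** -0.5)
norm_adj = D_inv_sqrt @ A_hat @ D_inv_sqrt  # symmetric normalization

H = rng.normal(size=(4, 3))                 # node features
W = rng.normal(size=(3, 2))                 # layer weights

H_next = np.maximum(0.0, norm_adj @ H @ W)  # one graph-convolution layer
print(H_next.shape)  # (4, 2)
```

Stacking two such layers (with the normalized adjacency precomputed once) is the whole model in the semi-supervised classification paper.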

**Unfortunately, after this point GNN research largely abandoned graph convolution; only message passing remains.**

### PointNet

Using neural networks to process point clouds directly was a major step forward. By aggregating with a symmetric function, it handled the unordered nature of point cloud data, and it marked the start of deep learning in 3D vision.
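A minimal sketch of the idea, assuming a hypothetical shared two-layer MLP: processing every point independently and then max-pooling makes the global feature invariant to the ordering of the points:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shared MLP weights (real PointNet uses learned 1x1 convs).
W1 = rng.normal(size=(3, 16))
W2 = rng.normal(size=(16, 8))

def pointnet_feature(points):
    h = np.maximum(0.0, points @ W1)   # shared per-point MLP, layer 1
    h = np.maximum(0.0, h @ W2)        # layer 2
    return h.max(axis=0)               # symmetric max-pool over all points

cloud = rng.normal(size=(100, 3))      # 100 points in 3D
shuffled = cloud[rng.permutation(100)]

# Reordering the points leaves the global feature unchanged.
print(np.allclose(pointnet_feature(cloud), pointnet_feature(shuffled)))  # True
```

Any symmetric aggregator would give permutation invariance; the paper argues max pooling works particularly well.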

### Neural Radiance Field (NeRF)

Using a neural network to learn how light behaves at every point in a scene is simply amazing, and turning that into images through volume rendering is incredibly cool. 3D vision has found a new path, though working on it can make you lose your hair fast!
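The volume-rendering step can be sketched along a single ray (toy densities and colors; in real NeRF an MLP predicts them from position and view direction):

```python
import numpy as np

# Each sample along the ray has a density sigma and a color c; the
# transmittance T tracks how much light survives to each sample, and the
# pixel color is the weighted sum of sample colors.
def render_ray(sigmas, colors, deltas):
    alphas = 1.0 - np.exp(-sigmas * deltas)                      # opacity per sample
    T = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))   # transmittance
    weights = T * alphas
    return (weights[:, None] * colors).sum(axis=0), weights

# Toy ray: empty space, then an opaque red surface -> the pixel should be red.
sigmas = np.array([0.0, 0.0, 50.0, 50.0])
colors = np.array([[0, 0, 0], [0, 0, 0], [1, 0, 0], [1, 0, 0]], dtype=float)
deltas = np.full(4, 0.5)

rgb, w = render_ray(sigmas, colors, deltas)
print(rgb)  # approximately [1, 0, 0]
```

Because every step is differentiable, the rendering loss can be backpropagated into the network that predicts `sigmas` and `colors`.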

### Deep Q-Network (DQN)

Bringing Q-learning together with deep networks was a brilliant move, and with stabilizing tricks such as experience replay and a target network, the result was remarkably powerful and impressive.
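The bootstrapped target at the core of DQN, sketched with hypothetical target-network Q-values: the online network regresses Q(s, a) toward r + γ · max over a' of Q_target(s', a').

```python
import numpy as np

# Compute the TD target for one transition. `next_q_values` would come from
# a frozen target network in a real DQN; here they are illustrative numbers.
def td_target(reward, next_q_values, gamma, done):
    return reward + (0.0 if done else gamma * np.max(next_q_values))

q_next = np.array([1.0, 3.0, 2.0])  # hypothetical target-net Q(s', .)
print(td_target(reward=1.0, next_q_values=q_next, gamma=0.9, done=False))  # 3.7
print(td_target(reward=1.0, next_q_values=q_next, gamma=0.9, done=True))   # 1.0
```

In the full algorithm, transitions are drawn from a replay buffer and the squared error between Q(s, a) and this target is minimized by gradient descent.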

### AlphaGo

Combining Monte Carlo tree search with deep policy and value networks, AlphaGo defeated Ke Jie at Go, a landmark event in the history of AI.
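The selection rule at the heart of that search can be sketched as the PUCT score (toy priors and visit counts; this is the formula family AlphaGo's search uses, with many details simplified): each node picks the child maximizing Q + c · P · sqrt(N_parent) / (1 + N), blending the value estimate with the policy network's prior.

```python
import numpy as np

# PUCT: exploit children with high value Q, but explore children the policy
# network favors (high prior P) that have few visits N.
def puct_scores(q, prior, visits, c_puct=1.5):
    n_parent = visits.sum()
    return q + c_puct * prior * np.sqrt(n_parent) / (1.0 + visits)

q      = np.array([0.5, 0.2, 0.0])   # mean value of each child so far
prior  = np.array([0.2, 0.3, 0.5])   # policy network's move probabilities
visits = np.array([10,  5,   0])     # visit counts

scores = puct_scores(q, prior, visits)
print(int(np.argmax(scores)))  # 2: the unvisited, high-prior move gets explored
```

Repeating select → expand → evaluate → backup many times yields the visit counts that finally decide the move.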

### Proximal Policy Optimization (PPO)

Schulman created TRPO to keep policy learning stable, but it demanded too much computation, so he devised the simpler PPO. He apparently considered the change minor: he didn't even try to publish a paper, he just posted it online.

PPO became hugely popular: easy to use, stable, and fast. Almost everyone working on applications uses PPO now; even ChatGPT uses it.
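The clipped surrogate objective that makes PPO so simple, sketched in numpy with illustrative probability ratios:

```python
import numpy as np

# ratio = pi_new(a|s) / pi_old(a|s); clipping removes the incentive to move
# the policy further than (1 - eps, 1 + eps) in a single update.
def ppo_objective(ratio, advantage, eps=0.2):
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    return np.minimum(unclipped, clipped)

# With a positive advantage, pushing the ratio past 1 + eps gains nothing:
print(ppo_objective(np.array([1.0, 1.2, 2.0]), advantage=1.0))  # [1.  1.2 1.2]
```

This replaces TRPO's expensive trust-region constraint with a single clip, which is why a plain first-order optimizer suffices.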

### Neural Tangent Kernel (NTK)

The paper explains why training a neural network can be stable when the network is very wide: in that regime, the training dynamics no longer depend on the random initial values, though they still depend on how the network is structured.

I don't have a deep understanding of NTK research. Here, I will only share my intuitive understanding.

When a neural network becomes very wide, its NTK stays almost constant during training. This means the network's learning process can be viewed as a simple linear model (kernel regression).

I will use the ideas from a blog post called "Understanding the Neural Tangent Kernel".

**The blog post reveals some profound phenomena:**

- The network's initialization does not affect its convergence.
- Because the NTK is stable, the network's training process is also stable.
- A network can have far more parameters than it needs ("over-parameterization") and still learn: it can quickly drive the training error to zero without making big changes to its parameters.
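To make the kernel concrete, here is a sketch (all shapes and values hypothetical) that computes the empirical NTK of a toy two-layer ReLU network, K(x, x') = ⟨∇θ f(x), ∇θ f(x')⟩, using the analytic parameter gradients:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy network f(x) = v . relu(W x) / sqrt(m) with a very wide hidden layer.
def grad_f(x, W, v, m):
    h = W @ x                                               # pre-activations
    dv = np.maximum(0.0, h) / np.sqrt(m)                    # gradient w.r.t. v
    dW = ((h > 0) * v)[:, None] * x[None, :] / np.sqrt(m)   # gradient w.r.t. W
    return np.concatenate([dW.ravel(), dv])

def empirical_ntk(X, W, v, m):
    J = np.stack([grad_f(x, W, v, m) for x in X])  # Jacobian, one row per input
    return J @ J.T                                 # K(x, x') = <grad f(x), grad f(x')>

m, d = 4096, 3
W = rng.normal(size=(m, d))
v = rng.normal(size=m)
X = rng.normal(size=(5, d))

K = empirical_ntk(X, W, v, m)
print(np.allclose(K, K.T))                      # a valid kernel: symmetric
print((np.linalg.eigvalsh(K) >= -1e-8).all())   # and positive semidefinite
```

The NTK theory says that as `m` grows, this matrix concentrates around a fixed limit and barely changes during training, which is what justifies the linearized (kernel-regression) view of the dynamics.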

### Four research directions in the era of large models

Technical direction: Focus on advances in large-model techniques, including model design, training tricks, model compression, and low-precision training. Study the strengths, weaknesses, and suitable scenarios of different types of large models.

Application direction: Explore how large models can be used in different fields, including NLP, computer vision, and multimodal tasks. Find areas where the extra capability of large models helps, and make sure the model's abilities fit the specific application.

Theoretical direction: Investigate the scientific question of why large models are effective. Explore theoretical foundations such as sample complexity, model capacity, and pre-training techniques, and seek theoretical breakthroughs on the relationship between large models and human intelligence.

Social direction: Consider how large models might affect society, including law, ethics, and employment. Find ways to make AI safer, fairer, and easier to understand, keeping in mind that social impacts differ across countries and cultures.