Meta Open-Sources "Memory Layers," Redefining Transformer Architecture for Large Models
Meta's "Memory Layers" add trainable memory to Transformer models through an efficient query mechanism that scales parameter count without scaling compute, and the technology is now open source.
Today, global social media giant Meta open-sourced an innovative research breakthrough: Memory Layers.
In today's Transformer-based pre-trained large models, the compute needed to store and retrieve learned information rises in lockstep with the parameter count: dense layers touch every parameter for every token, so adding capacity means adding FLOPs.
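As a rough illustration (not from Meta's announcement, and using the common approximation of about two FLOPs per weight per token for a dense matmul), the per-token cost of a dense layer tracks its parameter count directly:

```python
# Back-of-the-envelope sketch: in a dense feed-forward layer, every weight takes
# part in a multiply-accumulate for every token, so per-token FLOPs scale with
# the number of parameters. Figures are illustrative, not from Meta's post.
def dense_flops_per_token(num_params: int) -> float:
    """Approximate forward-pass FLOPs per token for a dense layer (~2 per weight)."""
    return 2.0 * num_params

for params in (1_000_000_000, 10_000_000_000, 100_000_000_000):
    print(f"{params:>15,d} dense params -> ~{dense_flops_per_token(params):.1e} FLOPs/token")
```

Ten times the dense parameters means roughly ten times the compute per token, which is exactly the coupling Memory Layers are designed to break.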
The Memory Layers replace that brute-force approach with a novel, efficient query mechanism: each query is compared against two much smaller sets of sub-keys, and the best matches from each set are combined to pinpoint the most relevant full keys without ever scanning the entire memory layer.
This means the number of model parameters can be increased dramatically while the compute per token stays essentially flat.
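Meta's announcement does not include code, but a minimal NumPy sketch of this kind of two-table lookup (often called a product-key lookup; all sizes and names below are made up for illustration) shows why the cost stays low: a query only ever touches the two small half-key tables and a handful of value rows, never the full n × n memory.

```python
# Minimal sketch of a two-table (product-key style) memory lookup.
# Illustrative only: sizes, names, and random contents are assumptions.
import numpy as np

rng = np.random.default_rng(0)

d = 64        # query dimension, split into two halves of d // 2
n = 256       # half-keys per table -> n * n = 65,536 addressable memory slots
k = 8         # candidates kept per half and in the final selection
d_value = 64  # dimension of each stored value vector

# Two small half-key tables; their Cartesian product implicitly defines
# n * n full keys that never have to be materialized or scanned.
sub_keys_1 = rng.standard_normal((n, d // 2))
sub_keys_2 = rng.standard_normal((n, d // 2))
values = rng.standard_normal((n * n, d_value))  # one value vector per slot


def memory_lookup(query: np.ndarray) -> np.ndarray:
    q1, q2 = query[: d // 2], query[d // 2:]

    # Score each query half against its small table: cost O(n), not O(n^2).
    scores_1 = sub_keys_1 @ q1
    scores_2 = sub_keys_2 @ q2
    top_1 = np.argpartition(scores_1, -k)[-k:]
    top_2 = np.argpartition(scores_2, -k)[-k:]

    # A full key's score is the sum of its two halves, so the best full keys
    # must come from the k x k grid of best half-key candidates.
    cand = scores_1[top_1][:, None] + scores_2[top_2][None, :]
    best = np.argpartition(cand.ravel(), -k)[-k:]
    i, j = np.unravel_index(best, (k, k))
    slots = top_1[i] * n + top_2[j]  # flat indices into the n * n value table

    # Sparse read: a softmax-weighted sum over just k value vectors.
    w = np.exp(cand[i, j] - cand[i, j].max())
    w /= w.sum()
    return w @ values[slots]


out = memory_lookup(rng.standard_normal(d))
print(out.shape)  # (64,): one retrieved memory vector for this query
```

In this sketch, growing the memory means growing n, and with it the n² stored value vectors (the extra parameters), while the per-query work grows only with n and k, which is the decoupling the announcement highlights.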
For example, the researchers added 12.8 billion memory parameters on top of a model with only 130 million base parameters and achieved performance comparable to Meta's open-source Llama 2-7B while using roughly one-tenth of the computational resources.