Source Idea
Formalizing the scaling laws of transformers using fractional-order differential equations (FODEs) can provide a more accurate description of learning dynamics, especially in sparse data environments.
Files (9)
- README.md
- metadata.json
- requirements.txt
- src/__init__.py
- src/data_loader.py
- src/evaluate.py
- src/fode_optimizer.py
- src/train.py
- src/utils.py
README Preview
# Transformer FODE Project
## Description
This project explores the use of fractional-order differential equations (FODEs) to formalize scaling laws in transformer models, particularly in environments with sparse data. The hypothesis is that FODEs describe learning dynamics more accurately than classical integer-order ODEs.
## Research Hypothesis
Formalizing the scaling laws of transformers using fractional-order differential equations (FODEs) can provide a more accurate description of learning dynamics, especially in sparse data environments.
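One natural way to make this precise (our reading of the hypothesis, not spelled out in the source) is to model training as a fractional gradient flow: the classical gradient-flow ODE `dw/dt = -∇L(w)` is replaced by a Caputo derivative of order α ∈ (0, 1], whose power-law memory kernel matches the long-range, power-law behavior that scaling laws describe:
```latex
% Classical gradient flow:  w'(t) = -\nabla L(w(t))
% Fractional-order generalization (Caputo derivative of order alpha):
{}^{C}\!D_t^{\alpha} w(t) = -\nabla L\bigl(w(t)\bigr), \qquad 0 < \alpha \le 1,
\quad\text{where}\quad
{}^{C}\!D_t^{\alpha} w(t) = \frac{1}{\Gamma(1-\alpha)} \int_0^t \frac{w'(s)}{(t-s)^{\alpha}}\, ds .
```
For α = 1 this reduces to the ordinary gradient flow, so the integer-order baseline is the special case α = 1.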
## Implementation Approach
- Develop a custom FODE-based optimizer (see the optimizer sketch after this list).
- Simulate sparse data environments (a simple subsampling sketch also follows below).
- Train transformer models with the FODE-based optimizer and compare them against baselines trained with standard integer-order (ODE-based) optimizers.
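To make the first item concrete, here is a minimal sketch of a fractional-order optimizer, assuming PyTorch. It discretizes the fractional gradient flow above with a truncated Grünwald-Letnikov scheme, so each update is a weighted combination of recent parameter states plus a gradient step, rather than a function of the current gradient alone. The class name `FODEOptimizer` and the `alpha` and `memory` parameters are illustrative choices, not taken from `src/fode_optimizer.py`.
```python
import torch
from torch.optim import Optimizer


class FODEOptimizer(Optimizer):
    """Truncated Grunwald-Letnikov discretization of the fractional
    gradient flow D^alpha w(t) = -grad L(w). Hypothetical sketch; the
    real src/fode_optimizer.py may use a different scheme."""

    def __init__(self, params, lr=1e-3, alpha=0.9, memory=16):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("fractional order alpha must lie in (0, 1]")
        defaults = dict(lr=lr, alpha=alpha, memory=memory)
        super().__init__(params, defaults)

    @torch.no_grad()
    def step(self, closure=None):
        loss = None
        if closure is not None:
            with torch.enable_grad():
                loss = closure()
        for group in self.param_groups:
            lr, alpha, memory = group["lr"], group["alpha"], group["memory"]
            # GL binomial weights c_j = (-1)^j * binom(alpha, j), built with
            # the recurrence c_j = c_{j-1} * (1 - (alpha + 1) / j). The update
            # weights are a_j = -c_j for j >= 1; for alpha = 1 they collapse
            # to a_1 = 1, a_j = 0 otherwise, i.e. plain gradient descent.
            c = [1.0]
            for j in range(1, memory + 1):
                c.append(c[-1] * (1.0 - (alpha + 1.0) / j))
            a = [-cj for cj in c[1:]]
            for p in group["params"]:
                if p.grad is None:
                    continue
                hist = self.state[p].setdefault("param_history", [])
                hist.insert(0, p.detach().clone())  # [w_t, w_{t-1}, ...]
                del hist[memory:]  # finite GL memory window
                # Renormalize the truncated weights: the full series of a_j
                # sums to 1, so this avoids spurious shrinkage early on.
                total = sum(a[: len(hist)])
                # w_{t+1} = sum_j (a_j / total) * w_{t+1-j} - lr * grad
                new_p = torch.zeros_like(p)
                for a_j, w_past in zip(a, hist):
                    new_p.add_(w_past, alpha=a_j / total)
                new_p.add_(p.grad, alpha=-lr)
                p.copy_(new_p)
        return loss
```
It drops into a standard training loop like any optimizer: `opt = FODEOptimizer(model.parameters(), lr=3e-4, alpha=0.8)`, then the usual `loss.backward()`, `opt.step()`, `opt.zero_grad()`.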
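For the second item, one simple way to simulate data sparsity is to keep only a small random fraction of a token dataset. This sketch also assumes PyTorch; `SparseTextDataset` and `keep_fraction` are hypothetical names, not necessarily what `src/data_loader.py` implements.
```python
import torch
from torch.utils.data import Dataset


class SparseTextDataset(Dataset):
    """Keeps only a random fraction of a token dataset to emulate a
    sparse-data regime. Hypothetical helper; the real src/data_loader.py
    may simulate sparsity differently."""

    def __init__(self, sequences, keep_fraction=0.05, seed=0):
        # sequences: list of 1-D LongTensors of token ids
        g = torch.Generator().manual_seed(seed)
        n_keep = max(1, int(len(sequences) * keep_fraction))
        idx = torch.randperm(len(sequences), generator=g)[:n_keep]
        self.sequences = [sequences[i] for i in idx.tolist()]

    def __len__(self):
        return len(self.sequences)

    def __getitem__(self, i):
        seq = self.sequences[i]
        return seq[:-1], seq[1:]  # inputs and next-token targets
```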
## Setup Instructions
1. Install Python 3.8 or later.
2. Clone this repository.
3. Install dependencies using `pip install -r requirements.txt`.
## Usage Examples
Run training with:
```bash
python src/train.py
```
Evaluate the model with:
```bash
python src/evaluate.py
```
## Expected Results
We aim to demonstrate improved convergence and generalization in sparse data settings with the FODE-based optimizer relative to standard integer-order baselines.
## References
- [Unifying Learning Dynamics and Generalization in Transformers Scaling Law](http://arxiv.org/abs/2512.22088v1)