Files (9)
- README.md
- metadata.json
- requirements.txt
- src/__init__.py
- src/data_loader.py
- src/evaluate.py
- src/model.py
- src/train.py
- src/utils.py
README Preview
# Vision Policy Decomposition
## Project Title
Decomposing Internal Policies in Transformer-based Vision Models to Enhance Image Classification Performance
## Research Hypothesis
Decomposing internal policies in transformer-based vision models enhances image classification performance by optimizing feature extraction layers.
## Implementation Approach
We propose to apply policy decomposition to a vision transformer by identifying and optimizing the internal policies within its layers. We will run experiments on the ImageNet dataset and compare accuracy and efficiency against baseline models that do not use policy decomposition.
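One plausible reading of an "internal policy", sketched here as an assumption rather than as the method in `src/model.py`, is the class distribution obtained by passing an intermediate layer's class-token state through the final classifier head. The example below is written in plain Python so it runs without a deep learning framework. The names `internal_policies`, `layer_states`, `head_weights`, and `head_bias` are hypothetical, the toy values are made up, and the final layer normalization that a real vision transformer applies before its head is omitted for brevity.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def internal_policies(layer_states, head_weights, head_bias):
    """Project each layer's class-token state through the classifier head.

    layer_states: one hidden vector per transformer layer (hypothetical input).
    head_weights: one weight row per class; head_bias: one bias per class.
    Returns one class-probability distribution per layer.
    """
    policies = []
    for state in layer_states:
        logits = [
            sum(w * h for w, h in zip(row, state)) + b
            for row, b in zip(head_weights, head_bias)
        ]
        policies.append(softmax(logits))
    return policies


def entropy(probs):
    """Shannon entropy in nats; lower means a more confident distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Toy example: 3 layers, hidden size 2, 2 classes (all values are made up).
layer_states = [[0.1, 0.0], [0.5, -0.2], [2.0, -1.0]]
head_weights = [[1.0, 0.0], [0.0, 1.0]]
head_bias = [0.0, 0.0]

policies = internal_policies(layer_states, head_weights, head_bias)
for depth, probs in enumerate(policies, start=1):
    rounded = [round(p, 3) for p in probs]
    print(f"layer {depth}: probs={rounded} entropy={entropy(probs):.3f}")
```

Tracking how the per-layer distributions sharpen with depth is one way to decide which layers an optimization step might target; whether the repository does this is not stated in the README.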
## Setup Instructions
1. Clone the repository:
```bash
git clone <repository-url>
cd vision_policy_decomposition
```
2. Install the required packages:
```bash
pip install -r requirements.txt
```
3. Prepare the dataset (ImageNet): Download ImageNet and prepare it following the dataset's download instructions.
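
After preparing the dataset, a quick sanity check can catch a misplaced or incomplete download. The sketch below assumes a one-folder-per-class layout (`<split>/<class>/<image>`), which is a common convention but is not confirmed by this README; `src/data_loader.py` may expect something different. The helper name `summarize_split` and the suffix list are hypothetical.

```python
from pathlib import Path

# Assumed image file extensions; adjust to match the prepared dataset.
IMAGE_SUFFIXES = {".jpeg", ".jpg", ".png"}


def summarize_split(split_dir):
    """Return {class_name: image_count} for a split laid out as <split>/<class>/<image>."""
    counts = {}
    for class_dir in sorted(Path(split_dir).iterdir()):
        if not class_dir.is_dir():
            continue  # skip stray files at the top level of the split
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts
```

For the standard ImageNet-1k training split, `len(summarize_split("path/to/imagenet/train"))` should come out to 1000 classes if the layout assumption holds; the path is a placeholder.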
## Usage Examples
- To train the model:
```bash
python src/train.py
```
- To evaluate the model:
```bash
python src/evaluate.py
```
## Expected Results
We expect policy decomposition to improve image classification accuracy and efficiency relative to baseline transformer models that do not use it.
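
The comparison against a baseline can be made concrete with a small metric helper. This is a minimal sketch assuming top-1 accuracy is the metric of interest, which the README does not specify; the function names and the toy predictions are hypothetical and do not represent results from this project.

```python
def top1_accuracy(predictions, labels):
    """Fraction of examples whose predicted class equals the true label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def accuracy_gain(baseline_preds, candidate_preds, labels):
    """Absolute top-1 accuracy difference: candidate minus baseline."""
    return top1_accuracy(candidate_preds, labels) - top1_accuracy(baseline_preds, labels)


# Toy data for illustration only (not experimental results).
labels = [0, 1, 2, 1]
baseline_preds = [0, 1, 1, 0]
candidate_preds = [0, 1, 2, 0]

gain = accuracy_gain(baseline_preds, candidate_preds, labels)
print(f"baseline={top1_accuracy(baseline_preds, labels):.2f} "
      f"candidate={top1_accuracy(candidate_preds, labels):.2f} gain={gain:+.2f}")
```

A positive gain on a held-out validation set, measured with both models evaluated on identical inputs, would support the hypothesis; efficiency would need a separate measurement such as inference time or FLOPs.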
## References
- Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies (http://arxiv.org/abs/2512.19673v1)