Hierarchical Dataset Selection for Domain Adaptation

hierarchical_dataset_selection_can_improve_domain Not Started

Source Idea

Hierarchical dataset selection can improve domain adaptation by optimizing the source data selection process for transfer learning tasks.

Files (10)

  • README.md
  • metadata.json
  • requirements.txt
  • src/__init__.py
  • src/data_loader.py
  • src/dataset_selector.py
  • src/evaluate.py
  • src/hierarchical_selection.py
  • src/train.py
  • src/utils.py

README Preview

# Hierarchical Dataset Selection for Domain Adaptation

## Description

This project explores the hypothesis that hierarchical dataset selection can improve domain adaptation by optimizing the source data selection process for transfer learning tasks. The aim is to reduce negative transfer by selecting relevant subsets of the source data most aligned with the target domain.

## Research Hypothesis

Hierarchical dataset selection can improve domain adaptation by optimizing the source data selection process for transfer learning tasks.

## Implementation Approach

The project will involve:

- Implementing a hierarchical dataset selector.
- Applying this selector to domain adaptation benchmarks such as Amazon Reviews and Office-31.
- Comparing performance metrics, such as accuracy and robustness, of models trained with and without hierarchical selection.

## Setup Instructions

1. Clone the repository:

   ```bash
   git clone
   cd hierarchical_dataset_selection
   ```

2. Install the required Python packages:

   ```bash
   pip install -r requirements.txt
   ```

3. Download the datasets and place them in the `data/` directory.

## Usage Examples

### Training

To train a model with hierarchical dataset selection:

```bash
python src/train.py --use-hierarchy
```

### Evaluation

To evaluate the model:

```bash
python src/evaluate.py
```

## Expected Results

The project aims to demonstrate improved domain adaptation performance using hierarchical dataset selection, showing higher accuracy and robustness compared to standard methods.

## References

- [Hierarchical Dataset Selection for High-Quality Data Sharing](http://arxiv.org/abs/2512.10952v1)
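
## Illustrative Sketch

The README does not describe how the hierarchical selector works internally, so the following is only a minimal sketch of one plausible two-level scheme, not the contents of `src/hierarchical_selection.py` and not the method of the referenced paper. It assumes each source dataset is a group of feature vectors, ranks groups by the cosine similarity of their centroid to the target centroid, and then ranks samples inside the kept groups the same way. The function names, the parameters `top_groups` and `top_samples`, and the toy data are all hypothetical.

```python
from math import sqrt
from statistics import fmean


def mean_vector(rows):
    """Column-wise mean of a non-empty list of equal-length feature vectors."""
    return [fmean(col) for col in zip(*rows)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)


def hierarchical_select(source_groups, target_rows, top_groups=1, top_samples=2):
    """Two-level selection: rank groups first, then samples inside kept groups.

    source_groups: dict mapping a group name to a list of feature vectors.
    target_rows:   list of feature vectors from the target domain.
    Returns a list of (group_name, sample_index) pairs.
    """
    target_centroid = mean_vector(target_rows)

    # Level 1: score each source group by its centroid's similarity to the target.
    group_scores = {
        name: cosine(mean_vector(rows), target_centroid)
        for name, rows in source_groups.items()
    }
    kept = sorted(group_scores, key=group_scores.get, reverse=True)[:top_groups]

    # Level 2: within each kept group, keep the samples closest to the target.
    selected = []
    for name in kept:
        rows = source_groups[name]
        ranked = sorted(
            range(len(rows)),
            key=lambda i: cosine(rows[i], target_centroid),
            reverse=True,
        )
        selected.extend((name, i) for i in ranked[:top_samples])
    return selected


# Toy data, made up for illustration only.
source_groups = {
    "books": [[1.0, 0.0], [0.9, 0.1], [0.5, 0.5]],
    "kitchen": [[0.0, 1.0], [0.1, 0.9]],
}
target_rows = [[1.0, 0.0], [0.8, 0.0]]

selection = hierarchical_select(source_groups, target_rows)
print(selection)
```

Filtering at the group level first discards whole source datasets that are far from the target before any per-sample scoring is done, which is the intuition behind using selection to limit negative transfer. A real selector would likely use learned embeddings and a tuned similarity measure rather than raw feature centroids.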