reflection-tuning dataset generation (#349)

This commit is contained in:
Sebastian Raschka
2024-09-10 21:42:12 -05:00
committed by GitHub
parent 8ad50a3315
commit 835ed29dbf
7 changed files with 1077 additions and 4 deletions


@@ -1,6 +1,7 @@
-# Generating a Dataset for Instruction Finetuning
+# Generating Datasets for Instruction Finetuning
 
 This folder contains utility code that can be used for generating a dataset for instruction finetuning.
 
 - [llama3-ollama.ipynb](llama3-ollama.ipynb): A notebook that creates a synthetic instruction finetuning dataset using Llama 3 and Ollama
+- [reflection-gpt4.ipynb](reflection-gpt4.ipynb): A notebook that implements an instruction dataset refinement step based on reflection-tuning
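The reflection-tuning refinement step added here boils down to prompting a stronger model to critique and rewrite each dataset entry. A minimal sketch of such a prompt builder follows; the function name and prompt wording are illustrative and not taken from the notebook itself:

```python
def build_refinement_prompt(instruction: str) -> str:
    # Ask the model to critique the instruction and return an improved
    # version; the reply would replace the original dataset entry.
    return (
        "Below is an instruction from an instruction-finetuning dataset.\n\n"
        f"Instruction: {instruction}\n\n"
        "First critique the instruction, then rewrite it so it is clearer "
        "and more specific. Return only the rewritten instruction."
    )

prompt = build_refinement_prompt("Fix the grammar in this sentence.")
```

In the reflection-tuning setup, a prompt like this would be sent to a model such as GPT-4 via the OpenAI API, and the refined instruction written back into the dataset.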


@@ -0,0 +1,4 @@
+{
+    "OPENAI_API_KEY": "sk-...",
+    "_comment": "Enter your API key from https://platform.openai.com/api-keys"
+}
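A notebook would typically read the key out of this JSON config before constructing the OpenAI client. A minimal sketch, assuming the file format shown above (the helper name is illustrative):

```python
import json

def load_api_key(config_text: str) -> str:
    # Parse the JSON config and return the OpenAI API key;
    # the "_comment" field is ignored.
    cfg = json.loads(config_text)
    return cfg["OPENAI_API_KEY"]

example_config = '{"OPENAI_API_KEY": "sk-...", "_comment": "Enter your API key"}'
key = load_api_key(example_config)  # "sk-..."
```

Keeping the key in an untracked config file rather than hard-coding it in the notebook avoids accidentally committing credentials.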


@@ -498,7 +498,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.10.6"
+   "version": "3.11.4"
   }
  },
  "nbformat": 4,

File diff suppressed because it is too large


@@ -0,0 +1,2 @@
+openai>=1.30.3
+tqdm>=4.65.0


@@ -12,4 +12,4 @@
 - [04_preference-tuning-with-dpo](04_preference-tuning-with-dpo) implements code for preference finetuning with Direct Preference Optimization (DPO)
-- [05_dataset-generation](05_dataset-generation) contains code to generate synthetic datasets for instruction finetuning
+- [05_dataset-generation](05_dataset-generation) contains code to generate and improve synthetic datasets for instruction finetuning