Datasets
Browse All Available Datasets
Online Explorer
https://voidful.github.io/NLPrep-Datasets/
Add a new dataset
Follow the template in template/dataset:
- Edit task_datasetname to match your task, e.g. /tag_clner
- Edit dataset.py in template/dataset/task_datasetname:
Edit DATASETINFO. Implement:

    DATASETINFO = {
        'DATASET_FILE_MAP': {
            "dataset_name": "dataset path"  # list for multiple datasets in one tag
        },
        'TASK': ["gen", "tag", "clas", "qa"],
        'FULLNAME': "Dataset Full Name",
        'REF': {"Some dataset reference": "useful link"},
        'DESCRIPTION': 'Dataset description'
    }
load: pre-loads the data listed in 'DATASET_FILE_MAP'. Implement:

    def load(data):
        return data
toMiddleFormat: converts a file into input and target pairs. Implement:

    def toMiddleFormat(path):
        dataset = MiddleFormat(DATASETINFO)
        # some file reading and processing
        dataset.add_data("input", "target")
        return dataset
- Move the task_datasetname folder to nlprep/datasets
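The steps above can be sketched as a single dataset.py. This is only an illustration under stated assumptions: the dataset name, the file name toy_sentiment.tsv, and its two-column tab-separated layout are hypothetical, 'TASK' is assumed to take one of the listed task types, and the small MiddleFormat class is a stand-in that mimics only the add_data call shown in the template so the sketch runs on its own. In a real dataset you would use the library's MiddleFormat instead.

```python
import csv


# Stand-in for the library's MiddleFormat (assumption: only add_data is
# mimicked here so that this sketch is runnable without nlprep installed).
class MiddleFormat:
    def __init__(self, dataset_info):
        self.dataset_info = dataset_info
        self.pairs = []

    def add_data(self, input_text, target):
        self.pairs.append((input_text, target))


DATASETINFO = {
    'DATASET_FILE_MAP': {
        "toy_sentiment": "toy_sentiment.tsv"  # hypothetical name and path
    },
    'TASK': "clas",  # assumption: one of "gen", "tag", "clas", "qa"
    'FULLNAME': "Toy Sentiment Dataset",
    'REF': {"Some dataset reference": "useful link"},
    'DESCRIPTION': 'Hypothetical two-column TSV: sentence, label'
}


def load(data):
    # Nothing to pre-load in this sketch: pass the path through unchanged.
    return data


def toMiddleFormat(path):
    dataset = MiddleFormat(DATASETINFO)
    with open(path, encoding="utf-8", newline="") as f:
        for row in csv.reader(f, delimiter="\t"):
            if len(row) != 2:
                continue  # skip blank or malformed lines
            sentence, label = row
            dataset.add_data(sentence, label)
    return dataset
```

Each row of the file becomes one add_data call, with the sentence as the input and the label as the target. Keeping the parsing inside toMiddleFormat and leaving load as a pass-through follows the shape of the template; a dataset that needs downloading or decompression would do that work in load instead.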