To train your own parser, you will need:
- train.conllu: the training data. Put this file inside the folder ./training_data/new/
- dev.conllu: the development data. Put this file inside the folder ./training_data/new/
- Gold test data: one or more gold-annotated .conllu files. Put these inside ./test_data/gold/, and a copy of each of these inside ./test_data/tobeannotated/

After adding these files to the right directories, the project directory should look like the following (note the placement of train.conllu, dev.conllu, and of the gold test data, here named text1.conllu and text2.conllu):
📦ROOT
┣ 📂models
┃ ┗ 📂OldSlavNet
┃   ┣ 📜model
┃   ┗ 📜model.params
┣ 📂oldslavnet-venv
┣ 📂scripts
┣ 📂test_data
┃ ┣ 📂annotated
┃ ┣ 📂gold
┃ ┃ ┣ 📜text1.conllu
┃ ┃ ┗ 📜text2.conllu
┃ ┗ 📂tobeannotated
┃   ┣ 📜text1.conllu
┃   ┗ 📜text2.conllu
┣ 📂training_data
┃ ┣ 📂new
┃ ┃ ┣ 📜dev.conllu
┃ ┃ ┗ 📜train.conllu
┃ ┗ 📂past
┃   ┗ 📂OldSlavNet
┃     ┣ 📜dev.conllu
┃     ┗ 📜train.conllu
┣ 📜LICENSE
┣ 📜Makefile
┣ 📜README.md
┣ 📜requirements.txt
┣ 📜tag.sh
┗ 📜train.sh
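Before training, it can help to confirm that the files are where the layout above expects them. The following is a minimal sketch of such a check, not part of the project itself: the function name `check_layout` is hypothetical, and it assumes only the folder names shown in the tree above.

```python
from pathlib import Path


def _conllu_names(folder):
    """Return the names of the .conllu files in `folder` (empty if it is missing)."""
    if not folder.is_dir():
        return set()
    return {p.name for p in folder.glob("*.conllu")}


def check_layout(root):
    """Return a list of human-readable problems found under `root`.

    Hypothetical helper: it only mirrors the layout described above.
    """
    root = Path(root)
    problems = []

    # train.conllu and dev.conllu must both sit in ./training_data/new/
    new_dir = root / "training_data" / "new"
    for name in ("train.conllu", "dev.conllu"):
        if not (new_dir / name).is_file():
            problems.append(f"missing {name} in {new_dir}")

    # every gold file needs a same-named copy in ./test_data/tobeannotated/
    gold_dir = root / "test_data" / "gold"
    todo_dir = root / "test_data" / "tobeannotated"
    gold_names = _conllu_names(gold_dir)
    todo_names = _conllu_names(todo_dir)
    if not gold_names:
        problems.append(f"no .conllu files in {gold_dir}")
    for name in sorted(gold_names - todo_names):
        problems.append(f"{name} is in gold/ but not in tobeannotated/")
    for name in sorted(todo_names - gold_names):
        problems.append(f"{name} is in tobeannotated/ but not in gold/")
    return problems
```

Calling `check_layout(".")` from the ROOT directory returns an empty list when everything is in place, and otherwise one message per missing or unmatched file.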
From the ROOT directory, run:
./train.sh
You will be prompted for input, including a name for your model.
This will:
- Create a new folder under ./models/ named after the name you entered for your model, where the trained model itself (the model and model.params files) will be saved.
- Move the files in ./training_data/new/ to a new folder under ./training_data/past/ named after the name you entered for your model.
- Annotate the files in ./test_data/tobeannotated/, compare them with those with the same name under ./test_data/gold/ and generate a text file for each of them with performance metrics under ./models/yourmodelname/validation-output/

After the model has been trained, the project directory should look like the following:
📦ROOT
┣ 📂models
┃ ┣ 📂yourmodelname
┃ ┃ ┣ 📜model
┃ ┃ ┣ 📜model.params
┃ ┃ ┗ 📂validation-output
┃ ┃   ┣ 📜text1-validated.txt
┃ ┃   ┗ 📜text2-validated.txt
┃ ┗ 📂OldSlavNet
┃   ┣ 📜model
┃   ┗ 📜model.params
┣ 📂oldslavnet-venv
┣ 📂scripts
┣ 📂test_data
┣ 📂training_data
┃ ┣ 📂new
┃ ┗ 📂past
┃   ┣ 📂yourmodelname
┃   ┃ ┣ 📜dev.conllu
┃   ┃ ┗ 📜train.conllu
┃   ┗ 📂OldSlavNet
┃     ┣ 📜dev.conllu
┃     ┗ 📜train.conllu
┣ 📜LICENSE
┣ 📜Makefile
┣ 📜README.md
┣ 📜requirements.txt
┣ 📜tag.sh
┗ 📜train.sh
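This section does not say which performance metrics the validation files contain. To illustrate what comparing an annotated file with its gold counterpart can involve, here is a minimal sketch that computes unlabeled and labeled attachment scores (UAS and LAS), two standard dependency-parsing metrics. It is not the project's own evaluation code: the function names are hypothetical, and it assumes both files are in CoNLL-U format with identical tokenization.

```python
def read_conllu(path):
    """Yield each sentence as a list of (id, head, deprel) tuples."""
    sentence = []
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.rstrip("\r\n")
            if not line.strip():
                # a blank line closes the current sentence
                if sentence:
                    yield sentence
                    sentence = []
                continue
            if line.startswith("#"):
                continue  # sentence-level comment
            cols = line.split("\t")
            # skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1)
            if "-" in cols[0] or "." in cols[0]:
                continue
            # CoNLL-U columns: ID is field 1, HEAD is field 7, DEPREL is field 8
            sentence.append((cols[0], cols[6], cols[7]))
    if sentence:
        yield sentence


def attachment_scores(gold_path, pred_path):
    """Return (UAS, LAS) as fractions between 0 and 1."""
    gold = list(read_conllu(gold_path))
    pred = list(read_conllu(pred_path))
    if len(gold) != len(pred):
        raise ValueError("files contain a different number of sentences")
    total = uas = las = 0
    for g_sent, p_sent in zip(gold, pred):
        if len(g_sent) != len(p_sent):
            raise ValueError("sentences contain a different number of tokens")
        for (_, g_head, g_rel), (_, p_head, p_rel) in zip(g_sent, p_sent):
            total += 1
            if g_head == p_head:
                uas += 1  # correct head
                if g_rel == p_rel:
                    las += 1  # correct head and correct relation label
    if total == 0:
        raise ValueError("no tokens to compare")
    return uas / total, las / total
```

Under these assumptions, `attachment_scores("gold.conllu", "parsed.conllu")` would give the two scores for one pair of files; both file names here are placeholders for a gold file and the parser's output for the same text.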