As I understand it, to create a training file you put your words in a text file, one per line. Then after each word you add a space or tab followed by its tag (such as PER, LOC, etc.).
I have also copied a sample from a sample file into WordPad. How do I get this into a gz file that I can load and use in a classifier?
Please guide me through this. I am a newbie and not very good with technology.
Your training file (training-data.tsv) should look like this:
I	O
have	O
been	O
transferred	O
to	O
Vancouver	LOCATION
BC	LOCATION
tomorrow	O
where O means "outside", as in not a named entity, and the space between the two columns is a tab.
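If typing tabs by hand in an editor is error-prone, the file can also be generated by a small script. The following is only a minimal sketch in Python: the function name `format_training_data` and the example sentence are illustrative assumptions, not part of Stanford NER.

```python
# Hypothetical example sentence as (word, tag) pairs; "O" marks a non-entity.
EXAMPLE_SENTENCE = [
    ("I", "O"),
    ("have", "O"),
    ("been", "O"),
    ("transferred", "O"),
    ("to", "O"),
    ("Vancouver", "LOCATION"),
    ("BC", "LOCATION"),
    ("tomorrow", "O"),
]


def format_training_data(tagged_words):
    """Return the file contents: one 'word<TAB>tag' pair per line."""
    lines = []
    for word, tag in tagged_words:
        for field in (word, tag):
            # Whitespace inside a field would break the two-column layout.
            if not field or any(ch.isspace() for ch in field):
                raise ValueError(f"invalid field: {field!r}")
        lines.append(f"{word}\t{tag}\n")
    return "".join(lines)


if __name__ == "__main__":
    # Write the training file next to the script.
    with open("training-data.tsv", "w", encoding="utf-8", newline="\n") as handle:
        handle.write(format_training_data(EXAMPLE_SENTENCE))
```

Running the script produces training-data.tsv in the current folder, with a real tab character between each word and its tag.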
You do not put them in the ser.gz file yourself. The ser.gz file is the classifier model, and it is created by the training process.
To train the classifier, run:
java -cp stanford-ner.jar edu.stanford.nlp.ie.crf.CRFClassifier -prop my-classifier.properties
where my-classifier.properties looks like this:
trainFile = training-data.tsv
serializeTo = my-classification-model.ser.gz
map = word=0,answer=1
...
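For someone who would rather not edit the properties file by hand, here is a rough sketch in Python that writes the three settings shown above and assembles the training command. The helper names `render_properties` and `training_command` are made up for illustration, and the jar path is an assumption that depends on where Stanford NER is unpacked. The remaining settings that the answer elides with "..." are not filled in here.

```python
import shlex

# Only the settings quoted in the answer; the elided ones are left out.
SETTINGS = {
    "trainFile": "training-data.tsv",
    "serializeTo": "my-classification-model.ser.gz",
    "map": "word=0,answer=1",
}


def render_properties(settings):
    """Return the text of a .properties file, one 'key = value' per line."""
    return "".join(f"{key} = {value}\n" for key, value in settings.items())


def training_command(jar_path, properties_path):
    """Build the java command as an argument list (it is not executed here)."""
    return [
        "java", "-cp", jar_path,
        "edu.stanford.nlp.ie.crf.CRFClassifier",
        "-prop", properties_path,
    ]


if __name__ == "__main__":
    with open("my-classifier.properties", "w", encoding="utf-8") as handle:
        handle.write(render_properties(SETTINGS))
    # Assumed jar location; adjust to the actual path of the NER jar.
    command = training_command("stanford-ner.jar", "my-classifier.properties")
    print(shlex.join(command))
```

The script only prints the command so it can be checked and pasted into a terminal. Once training finishes, the model file named in serializeTo is what gets loaded by the classifier.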