diff --git a/README.md b/README.md
index 39dc56f..9ad7fb0 100644
--- a/README.md
+++ b/README.md
@@ -103,10 +103,11 @@ we used ***f1_score*** metric for evaluation of our baseline classifier.
 > F1 score is a weighted average of Precision and Recall (or the harmonic mean of Precision and Recall).
 > The formula for F1 Score is:
 > F1 = 2 * (precision * recall) / (precision + recall)
-
+
 ```python
 eval.metrics.f1_score(y_true, y_pred, labels, average)
 ```
+
 **Parameters**:
 **y_true** : 1-d array or list of gold class values
 **y_pred** : 1-d array or list of estimated values returned by a classifier
@@ -116,13 +117,21 @@ eval.metrics.f1_score(y_true, y_pred, labels, average)
 [Link](/eval/metrics.py) to the metrics source code.
 
 ### Results
-Confusion Matrix Plot
+Confusion Matrix Plot
 
 ### 2) Feedforward Neural Network (using PyTorch)
 A feed-forward neural network classifier with a single hidden layer containing 9 units. While a feed-forward neural network is clearly not the ideal architecture for sequential text data, it was of interest to add a second baseline and examine the added gains (if any) relative to a single perceptron. The input to the feed-forward network remained the same; only the final model was suited to more complex inputs such as word embeddings. Check the feed-forward model source [code](/classifier/linear_model.py) for more details.
+### Running the Model
+```shell
+(citation-env) [user@server citation-analysis]$ python3 -m testing.ff_model_testing
+```
+
+### Results
+Confusion Matrix Plot
+
 ### 3) BiLSTM + Attention with ELMo (AllenNLP Model)
 The Bi-directional Long Short Term Memory (BiLSTM) model was built using the [AllenNLP](https://allennlp.org/) library. For word representations, we used 100-dimensional [GloVe](https://nlp.stanford.edu/projects/glove/) vectors trained on a corpus of 6B tokens from Wikipedia.
 For contextual representations, we used [ELMo](https://allennlp.org/elmo) embeddings, which were trained on a dataset of 5.5B tokens. This model uses the entire input text, as opposed to selected features of the text as in the first two models. It has a single-layer BiLSTM with a hidden dimension size of 50 for each direction.
@@ -193,7 +202,7 @@ We also have another way to make predictions without using `allennlp predict`
 Modify [this](/testing/bilstm_predict.py) source to run predictions on different experiments. It also saves the Confusion Matrix Plot (as shown below) after prediction.
 
 ### Results
-Confusion Matrix Plot
+Confusion Matrix Plot
 
 ## References
 [\[1\]](https://github.com/allenai/scicite) SciCite GitHub Repository
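The F1 formula quoted in the first hunk can be sanity-checked with a minimal sketch. This is plain Python, not the repository's `eval.metrics.f1_score`; the helper name and counts below are illustrative only:

```python
def f1_from_counts(tp, fp, fn):
    """Compute F1 from true-positive, false-positive, and false-negative counts,
    using the README's formula: F1 = 2 * (P * R) / (P + R)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Example: 8 true positives, 2 false positives, 4 false negatives
# precision = 0.8, recall = 8/12, so F1 = 8/11 (about 0.727)
print(round(f1_from_counts(8, 2, 4), 3))
```

Because F1 is the harmonic mean, it is pulled toward the lower of precision and recall, which is why it is preferred over plain accuracy on imbalanced label sets like citation intents.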
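The one-hidden-layer classifier described in the feed-forward section could be sketched as below. This is a hypothetical illustration, not the code in `/classifier/linear_model.py`; only the hidden size of 9 comes from the text, while the input dimension, class count, and activation are assumptions:

```python
import torch
import torch.nn as nn

class FeedForwardClassifier(nn.Module):
    """Single-hidden-layer feed-forward classifier (9 hidden units, per the README)."""
    def __init__(self, input_dim, num_classes, hidden_dim=9):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),  # input features -> 9 hidden units
            nn.ReLU(),                         # assumed nonlinearity
            nn.Linear(hidden_dim, num_classes) # hidden units -> class logits
        )

    def forward(self, x):
        return self.net(x)

# Assumed shapes: 100-dim feature vectors, 3 citation-intent classes
model = FeedForwardClassifier(input_dim=100, num_classes=3)
logits = model(torch.randn(4, 100))  # batch of 4 examples
print(logits.shape)  # torch.Size([4, 3])
```

Such a model still consumes a fixed-length feature vector per sentence, which is why the text notes it cannot exploit sequential structure the way the BiLSTM model can.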