From 57a08e53cb8f19ca0154f6ff6e1f81d6f8fd7f87 Mon Sep 17 00:00:00 2001 From: Pavan Mandava Date: Sat, 1 Aug 2020 13:41:52 +0200 Subject: [PATCH] WIP : README Documentation - Added References --- README.md | 29 ++++++++++++++++++++----- eval/metrics.py | 2 ++ predict.py => testing/bilstm_predict.py | 4 ++-- 3 files changed, 28 insertions(+), 7 deletions(-) rename predict.py => testing/bilstm_predict.py (89%) diff --git a/README.md b/README.md index 22ee99e..be35a20 100644 --- a/README.md +++ b/README.md @@ -14,7 +14,7 @@ This README documentation focuses on running the code base, training the models For more information on the Citation Intent Classification in Scientific Publications, follow this [link](https://arxiv.org/pdf/1904.01608.pdf) to the original published paper and their [GitHub repo](https://github.com/allenai/scicite) ## Environment & Setup -It's recommended to use **Python 3.5 or greater**. Now we can install and create a Virtual Environment to run this project. +This project requires **Python 3.5 or greater**. Install `virtualenv` and create a Virtual Environment to run this project. #### Installing virtualenv ```shell @@ -59,7 +59,8 @@ export SAVED_MODELS_PATH=/mount/arbeitsdaten/studenten1/team-lab-nlp/mandavsi_ri ``` ## Data -We have 3 different intents/classes in the dataset: +This project uses a large dataset of citation intents provided by the `SciCite` [GitHub repo](https://github.com/allenai/scicite). It can be downloaded from this [link](https://s3-us-west-2.amazonaws.com/ai2-s2-research/scicite/scicite.tar.gz).
+We have 3 different intents/classes in this dataset: - background (background information) - method (use of methods) @@ -135,6 +136,7 @@ Our BiLSTM AllenNLP model contains 4 major components: 2. Model - [BiLstmClassifier](/calssifier/nn.py) - The model's `forward()` method is called for every data instance by passing `tokens` and `label` - The signature of `forward()` needs to match with field names of the `Instance` created by the DatasetReader + - This model uses [ELMo](https://allennlp.org/elmo) deep contextualized embeddings. - The `forward()` method finally returns an output dictionary with the predicted label, loss, softmax probabilities and so on... 3. Config File - [basic_model.json](configs/basic_model.json?raw=true) - The AllenNLP Configuration file takes the constructor parameters for various objects (Model, DatasetReader, Predictor, ...) @@ -145,7 +147,7 @@ Our BiLSTM AllenNLP model contains 4 major components: - Batch Size - Dropout - Embeddings - - All the classes that the Config file uses must register using Python decorators (Ex: `@Model.register('bilstm_classifier'`). + - All the classes that the Config file uses must be registered using Python decorators (for example, `@Model.register('bilstm_classifier')`). 4. Predictor - [IntentClassificationPredictor](/testing/intent_predictor.py) - AllenNLP uses `Predictor`, a wrapper around the trained model, for making predictions. - The Predictor uses a pre-trained/saved model and dataset reader to predict new Instances @@ -161,7 +163,7 @@ $ allennlp train \ --include-package classifier ``` We ran a few experiments on this model, the run configurations, results and archived models are available in the `SAVED_MODELS_PATH` directory.
-**Note:** If the GPU cores are not available, set the `"cuda_device":` to `-1` in the [config file](/configs/basic_model.json?raw=true), or the available GPU Core. +**Note:** If no GPU is available, set `"cuda_device"` to `-1` in the [config file](/configs/basic_model.json?raw=true); otherwise, set it to the index of the available GPU. ### Evaluation To evaluate the model, simply run: @@ -184,7 +186,24 @@ $ allennlp predict \ --predictor citation_intent_predictor ``` +There is also another way to make predictions without using the `allennlp predict` command. It returns the list of predictions, softmax probabilities, and more details useful for error analysis. Simply run the following command: +```shell +(citation-env) [user@server citation-analysis]$ python3 -m testing.bilstm_predict +``` +Modify [this](/testing/bilstm_predict.py) source to run predictions on different experiments. It also saves the Confusion Matrix Plot (as shown below) after prediction. + ### Results Confusion Matrix Plot -## References \ No newline at end of file +## References +[\[1\]](https://github.com/allenai/scicite) SciCite GitHub Repository
+This repository contains datasets and code for classifying citation intents; our project is based on it.

+[\[2\]](https://s3-us-west-2.amazonaws.com/ai2-s2-research/scicite/scicite.tar.gz) SciCite Dataset
+Large Dataset of Citation Intents

+[\[3\]](https://allennlp.org/tutorials) AllenNLP Library.
+An open-source NLP research library, built on PyTorch.

+[\[4\]](https://allennlp.org/elmo) ELMo Embeddings
+Deep contextualized word representations.

+[\[5\]](https://guide.allennlp.org/) AllenNLP Guide
+A Guide to Natural Language Processing With AllenNLP.

+ diff --git a/eval/metrics.py b/eval/metrics.py index 741c29c..a689a18 100644 --- a/eval/metrics.py +++ b/eval/metrics.py @@ -212,6 +212,8 @@ def plot_confusion_matrix(confusion_mat, classifier_name, plot_file_name): plt.xlabel('Predicted') plt.savefig(plot_file_name) + print('Confusion Matrix Plot saved to :: ', plot_file_name) + class Result: """ diff --git a/predict.py b/testing/bilstm_predict.py similarity index 89% rename from predict.py rename to testing/bilstm_predict.py index 9b5b199..04dbf4f 100644 --- a/predict.py +++ b/testing/bilstm_predict.py @@ -1,4 +1,5 @@ import classifier + import testing.intent_predictor as pred import eval.metrics as metrics @@ -8,9 +9,8 @@ y_pred, y_true = pred.load_model_and_predict_test_data(saved_model_dir) confusion_matrix = metrics.get_confusion_matrix(y_true, y_pred) +print("Confusion Matrix :: ") print(confusion_matrix) plot_file_path = saved_model_dir+'/confusion_matrix_plot.png' metrics.plot_confusion_matrix(confusion_matrix, "BiLSTM Classifier + Attention with ELMo", plot_file_path) - -print('Confusion Matrix Plot saved to :: ', plot_file_path)