We implemented 3 classifiers and evaluated them on the test dataset:
- Perceptron Classifier
- Feedforward Neural Network Classifier (using [PyTorch](https://pytorch.org/))
- BiLSTM + Attention with ELMo Embeddings (using [AllenNLP](https://allennlp.org/) library)
This README documentation focuses on running the code base, training the models and predictions. For more information about our project work, model results and detailed error analysis, check [this](https://www.overleaf.com/project/5f1b0e8a6d0fb80001ceb5eb) report. Slides from the mid-term presentation are available [here](/presentation.pdf).<br/>
For more information on Citation Intent Classification in Scientific Publications, follow this [link](https://arxiv.org/pdf/1904.01608.pdf) to the original published paper and the authors' [GitHub repo](https://github.com/allenai/scicite).
## Environment & Setup
It's recommended to use **Python 3.5 or greater**. The steps below install `virtualenv` and create a virtual environment for running this project.
#### Installing virtualenv
```shell
python3 -m pip install --user virtualenv
```
#### Creating a virtual environment
**venv** (for Python 3) allows us to manage separate package installations for different projects.
```shell
python3 -m venv citation-env
```
#### Activating the virtual environment
Before we start installing or using packages in the virtual environment we need to _activate_ it.
```shell
source citation-env/bin/activate
```
After activating the virtual environment, the console prompt should look like this:
```shell
(citation-env) [user@server ~]$
```
#### Leaving the virtual environment
To leave the virtual environment, simply run:
```shell
deactivate
```
#### Cloning the Repository
```shell
git clone https://github.com/yelircaasi/citation-analysis.git
```
Now change the current working directory to the project root folder (`> cd citation-analysis`). <br />
**Note:** Stay in the Project root folder while running all the experiments.
#### Installing Packages
Now we can install all the packages required to run this project, listed in the [requirements.txt](/requirements.txt) file.
```shell
(citation-env) [user@server citation-analysis]$ pip install -r requirements.txt
```
#### Environment Variable for Saved Models Path
Run the line below in the console; this variable is used later when training, evaluating and archiving the models.
```shell
export SAVED_MODELS_PATH=/mount/arbeitsdaten/studenten1/team-lab-nlp/mandavsi_rileyic/saved_models
```
## Data
The dataset contains 3 different intent classes.
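Each line of the JSONL data files is one labelled citation context. Below is a minimal sketch of how to peek at a record; the `data/jsonl/train.jsonl` path and the `string`/`label` field names follow the original scicite data format and are assumptions here.
```python
import json

# Read the first record of the (assumed) training split.
with open("data/jsonl/train.jsonl", encoding="utf-8") as f:
    record = json.loads(next(f))

# "string" holds the citation sentence, "label" the intent class
# (field names as in the original scicite JSONL files).
print(record["string"], "->", record["label"])
```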
### 1) Perceptron Classifier
Since we have 3 different classes for classification, we create a Perceptron object for each class.
Check the source [code](/classifier/linear_model.py) for more details on the implementation of the Perceptron Classifier.
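For intuition only, here is a minimal sketch of the one-perceptron-per-class (one-vs-rest) idea; the actual implementation lives in the linked `classifier/linear_model.py` and may differ in its features and update schedule.
```python
import numpy as np

class SimplePerceptron:
    """Minimal binary perceptron; one instance is trained per intent class."""

    def __init__(self, n_features, lr=1.0):
        self.w = np.zeros(n_features)
        self.b = 0.0
        self.lr = lr

    def score(self, x):
        return self.w @ x + self.b

    def update(self, x, target):
        # target is +1 for this perceptron's class, -1 otherwise
        if target * self.score(x) <= 0:  # misclassified -> standard perceptron update
            self.w += self.lr * target * x
            self.b += self.lr * target

def predict(perceptrons, x):
    """One-vs-rest decision: the class whose perceptron scores highest wins."""
    return max(perceptrons, key=lambda label: perceptrons[label].score(x))
```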
### Running the Model
```shell
(citation-env) [user@server citation-analysis]$ python3 -m testing.model_testing
```
[Link](/testing/model_testing.py) to the test source code. All of the hyperparameters can be modified for experimentation.
### Evaluation
We used the ***f1_score*** metric to evaluate our baseline classifier.
The metric is called as `eval.metrics.f1_score(y_true, y_pred, labels, average)`.
[Link](/eval/metrics.py) to the metrics source code.
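For illustration, a call could look like the following; the label strings and the `average` value are hypothetical, only the signature above comes from the code.
```python
from eval.metrics import f1_score  # metrics module linked above

# Hypothetical gold and predicted labels for a handful of test examples.
y_true = ["background", "method", "result", "background"]
y_pred = ["background", "background", "result", "background"]
labels = ["background", "method", "result"]

# Macro averaging weighs all classes equally regardless of their frequency.
print(f1_score(y_true, y_pred, labels, average="macro"))
```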
### Results
<img src="/plots/perceptron/confusion_matrix_plot.png?raw=true" width="600" height = "450" alt = "Confusion Matrix Plot" />
<img src="/plots/perceptron/confusion_matrix_plot.png?raw=true" width="500" height = "375" alt = "Confusion Matrix Plot" />
### 2) Feedforward Neural Network (using PyTorch)
A feed-forward neural network classifier with a single hidden layer containing 9 units. While a feed-forward network is clearly not the ideal architecture for sequential text data, it serves as a second baseline for examining the added gains (if any) relative to a single perceptron. The input to the feed-forward network remained the same as for the perceptron; only the final model was suitable for more complex inputs such as word embeddings.
Check this feed-forward model source [code](/classifier/linear_model.py) for more details on the implementation.
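As a rough sketch of such an architecture (input width, activation function and the training loss are assumptions here; see the linked source for the real model):
```python
import torch
import torch.nn as nn

class FeedForwardClassifier(nn.Module):
    """Single hidden layer of 9 units, as described above."""

    def __init__(self, n_features, n_classes=3, hidden_size=9):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden_size),
            nn.ReLU(),  # activation choice is an assumption
            nn.Linear(hidden_size, n_classes),
        )

    def forward(self, x):
        return self.net(x)  # raw logits; typically paired with nn.CrossEntropyLoss

# Example: logits for a batch of 4 feature vectors of (assumed) width 20
model = FeedForwardClassifier(n_features=20)
logits = model(torch.randn(4, 20))
```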
### 3) BiLSTM + Attention with ELMo (AllenNLP Model)
The Bi-directional Long Short-Term Memory (BiLSTM) model is built using the [AllenNLP](https://allennlp.org/) library. For word representations, we used 100-dimensional [GloVe](https://nlp.stanford.edu/projects/glove/) vectors trained on a corpus of 6B tokens from Wikipedia. For contextual representations, we used [ELMo](https://allennlp.org/elmo) embeddings trained on a dataset of 5.5B tokens. This model uses the entire input text, as opposed to selected features of the text as in the first two models. It has a single-layer BiLSTM with a hidden dimension of 50 for each direction.
We used AllenNLP's [Config Files](https://guide.allennlp.org/using-config-files) to build our model; we only needed to implement a model and a dataset reader and wire them together with a JSON config file.
Our BiLSTM AllenNLP model contains 4 major components:
1. Dataset Reader - reads the citation data from the JSONL files and turns each example into an AllenNLP `Instance`.
2. Model - the BiLSTM + Attention classifier described above.
    - The `forward()` method finally returns an output dictionary with the predicted label, loss, softmax probabilities and so on.
3. Config File - [basic_model.json](configs/basic_model.json?raw=true)
    - The AllenNLP Configuration file takes the constructor parameters for various objects (Model, DatasetReader, Predictor, ...).
    - We can provide a number of Hyperparameters in this Config file:
        - Depth and Width of the Network
        - Number of Epochs
        - Optimizer & Learning Rate
        - Batch Size
        - Dropout
        - Embeddings
    - All the classes used by the Config file must be registered using Python decorators (e.g. `@Model.register('bilstm_classifier')`); a short sketch of this pattern appears after this list.
4. Predictor - [IntentClassificationPredictor](/testing/intent_predictor.py)
    - AllenNLP uses a `Predictor`, a wrapper around the trained model, for making predictions.
    - The Predictor uses a pre-trained/saved model and a dataset reader to predict labels for new `Instance`s.
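As mentioned above, every component class is registered under a name that the config file can refer to. A minimal sketch of that pattern (class bodies omitted; the dataset-reader name is an assumption, the other two names appear elsewhere in this README):
```python
from allennlp.data import DatasetReader
from allennlp.models import Model
from allennlp.predictors import Predictor


@DatasetReader.register("citation_dataset_reader")  # registered name assumed for illustration
class CitationDatasetReader(DatasetReader):
    ...  # turns raw JSONL records into Instances


@Model.register("bilstm_classifier")  # name used in the example above
class BiLstmClassifier(Model):
    ...  # embeddings -> BiLSTM + attention -> label scores


@Predictor.register("citation_intent_predictor")  # name passed to `allennlp predict` below
class IntentClassificationPredictor(Predictor):
    ...  # wraps the trained model for prediction
```
Passing `--include-package classifier` to the AllenNLP commands imports the package so these registrations are executed.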
### Running the Model
AllenNLP provides `train`, `evaluate` and `predict` commands to interact with the models from the command line.
#### Training
```shell
$ allennlp train \
configs/basic_model.json \
-s $SAVED_MODELS_PATH/experiment_10 \
--include-package classifier
```
We ran a few experiments on this model; the run configurations, results and archived models are available in the `SAVED_MODELS_PATH` directory. <br />
**Note:** If no GPU is available, set `"cuda_device"` to `-1` in the [config file](/configs/basic_model.json?raw=true); otherwise set it to the index of an available GPU.
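If you are unsure whether a GPU is visible to your process, a quick check with PyTorch (already a dependency of this project) is:
```python
import torch

# cuda_device -1 means "run on CPU"; otherwise use an index in
# the range [0, torch.cuda.device_count()).
print(torch.cuda.is_available(), torch.cuda.device_count())
```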
### Evaluation
To evaluate the model, simply run:
```shell
$ allennlp evaluate \
$SAVED_MODELS_PATH/experiment_4/model.tar.gz \
data/jsonl/test.jsonl \
--cuda-device 3 \
--include-package classifier
```
### Predictions
To make predictions, simply run:
```shell
$ allennlp predict \
$SAVED_MODELS_PATH/experiment_4/model.tar.gz \
data/jsonl/test.jsonl \
--cuda-device 3 \
--include-package classifier \
--predictor citation_intent_predictor
```
### Results
<img src="/plots/bilstm_model/confusion_matrix_plot.png?raw=true" width="600" height = "450" alt = "Confusion Matrix Plot" />
<img src="/plots/bilstm_model/confusion_matrix_plot.png?raw=true" width="500" height = "375" alt = "Confusion Matrix Plot" />
## References
