diff --git a/README.md b/README.md
index 001be3e..733f93c 100644
--- a/README.md
+++ b/README.md
@@ -4,14 +4,14 @@ Project repo for Computational Linguistics Team Lab at the University of Stuttga
## Introduction
This repository contains code and datasets for classifying citation intents in research papers.
-We implemented 3 different classifiers and evaluated the results:
+We implemented three classifiers and evaluated them on the test dataset:
- - Perceptron Classifier - Baseline (Implemented from scratch)
+ - Perceptron Classifier - Baseline model (Implemented from scratch)
- Feedforward Neural Network Classifier (using [PyTorch](https://pytorch.org/))
- BiLSTM + Attention with ELMo Embeddings (using [AllenNLP](https://allennlp.org/) library)
-This README documentation focuses on running the code base, training the models and predictions. For more information about our project work and detailed error analysis, check [this](https://www.overleaf.com/project/5f1b0e8a6d0fb80001ceb5eb) report.
-For more information on the Citation Intent Classification in Scientific Publications, follow this [link](https://arxiv.org/pdf/1904.01608.pdf) to the original published paper and the [GitHub repo](https://github.com/allenai/scicite)
+This README focuses on running the code base, training the models and making predictions. For more information about our project work, model results and detailed error analysis, check [this](https://www.overleaf.com/project/5f1b0e8a6d0fb80001ceb5eb) report.
+For more information on Citation Intent Classification in Scientific Publications, follow this [link](https://arxiv.org/pdf/1904.01608.pdf) to the original published paper and their [GitHub repo](https://github.com/allenai/scicite).
## Environment & Setup
TODO
@@ -27,11 +27,12 @@ We have 3 different intents/classes in the dataset:
| | background | method | result |
|:---|:---:|:---:|:---:|
| train | 4.8 K | 2.3 K | 1.1 K |
+| dev | 0.5 K | 0.3 K | 0.1 K |
| test | 1 K | 0.6 K | 0.2 K |
## Methods (Classification)
### 1) Perceptron Classifier (Baseline Classifier)
-We implemented [Perceptron](https://en.wikipedia.org/wiki/Perceptron) as a baseline classifier, from scratch (including evaluation). Perceptron is an algorithm for supervised learning of classification. It's a Linear and a Binary Classifier, which means it can only decide whether or not an input feature belongs to some specific class and also it's only capable of learning linearly separable patterns.
+We implemented a [Perceptron](https://en.wikipedia.org/wiki/Perceptron) as our baseline classifier, from scratch (including evaluation). The perceptron is a supervised learning algorithm for classification. It is a linear, binary classifier: it can only decide whether or not an input belongs to a specific class, and it can only learn linearly separable patterns.
```python
class Perceptron:
def __init__(self, label: str, weights: dict, theta_bias: float):
@@ -47,15 +48,13 @@ Since we have 3 different classes for Classification, we create a Perceptron obj
Check the source [code](/classifier/linear_model.py) for more details on the implementation of Perceptron Classifier.
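+
+For intuition, here is a minimal one-vs-rest sketch of the idea (the constructor matches the snippet above; `score` and `predict` are hypothetical helpers, not the repo's exact code):
+
+```python
+class Perceptron:
+    def __init__(self, label: str, weights: dict, theta_bias: float):
+        self.label = label            # the class this perceptron votes for
+        self.weights = weights        # sparse feature -> weight mapping
+        self.theta_bias = theta_bias  # bias term
+
+    def score(self, features: dict) -> float:
+        # Dot product of the sparse feature dict with the weight vector, plus bias.
+        return sum(self.weights.get(f, 0.0) * v for f, v in features.items()) + self.theta_bias
+
+def predict(perceptrons: list, features: dict) -> str:
+    # One perceptron per class; the label with the highest activation wins.
+    return max(perceptrons, key=lambda p: p.score(features)).label
+```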
-#### Running the Model
-
+### Running the Model
> `(citation-env) [user@server citation-analysis]$ python -m testing.model_testing`
-[link](/testing/model_testing.py) to the source code. All the Hyperparameters can be modified to experiment with.
+[Link](/testing/model_testing.py) to the source code. All the hyperparameters can be modified for experimentation.
-**Evaluation**
+### Evaluation
We used the ***f1_score*** metric to evaluate our baseline classifier.
-
> The F1 score is the harmonic mean of precision and recall.
> The formula for F1 Score is:
> F1 = 2 * (precision * recall) / (precision + recall)
@@ -72,12 +71,47 @@ eval.metrics.f1_score(y_true, y_pred, labels, average)
[Link](/eval/metrics.py) to the metrics source code.
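+
+A hypothetical usage sketch (only the signature comes from the repo; the argument values and the `'macro'` averaging option are illustrative assumptions):
+
+```python
+from eval import metrics
+
+y_true = ['background', 'method', 'result', 'background']
+y_pred = ['background', 'method', 'background', 'background']
+labels = ['background', 'method', 'result']
+
+# 'macro' is an assumed averaging mode; check /eval/metrics.py for supported values.
+score = metrics.f1_score(y_true, y_pred, labels, average='macro')
+```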
### Results
-
+
### 2) Feedforward Neural Network (using PyTorch)
+A feed-forward neural network classifier with a single hidden layer of 9 units. While a feed-forward network is clearly not the ideal architecture for sequential text data, we added it as a second baseline to examine the gains (if any) over a single perceptron. Its input features remained the same as the perceptron's; only the final model is suited to more complex inputs such as word embeddings.
-TODO
+Check the feed-forward model's source [code](/classifier/linear_model.py) for more details.
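+
+A minimal PyTorch sketch of the architecture described above (only the 9-unit hidden layer and the 3 output classes come from the text; the activation and dimension names are illustrative assumptions):
+
+```python
+import torch.nn as nn
+
+class FeedForwardClassifier(nn.Module):
+    def __init__(self, input_dim: int, num_classes: int = 3):
+        super().__init__()
+        self.net = nn.Sequential(
+            nn.Linear(input_dim, 9),    # single hidden layer with 9 units
+            nn.ReLU(),                  # activation choice is an assumption
+            nn.Linear(9, num_classes),  # one logit per citation intent
+        )
+
+    def forward(self, x):
+        return self.net(x)
+```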
### 3) BiLSTM + Attention with ELMo (AllenNLP Model)
+We built the Bi-directional Long Short-Term Memory (BiLSTM) model using the [AllenNLP](https://allennlp.org/) library. For word representations, we used 100-dimensional [GloVe](https://nlp.stanford.edu/projects/glove/) vectors trained on a corpus of 6B tokens from Wikipedia. For contextual representations, we used [ELMo](https://allennlp.org/elmo) embeddings trained on a dataset of 5.5B tokens. Unlike the first two models, which rely on selected features from the text, this model uses the entire input text. It has a single-layer BiLSTM with a hidden dimension of 50 for each direction.
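+
+A hedged sketch of the encoder shape (AllenNLP wraps `torch.nn.LSTM`; the GloVe+ELMo concatenation and the 1024-dim ELMo output are standard assumptions, not stated above):
+
+```python
+import torch.nn as nn
+
+encoder = nn.LSTM(
+    input_size=100 + 1024,  # 100-dim GloVe concatenated with ELMo (assumed 1024-dim)
+    hidden_size=50,         # 50 hidden units per direction, as described above
+    num_layers=1,           # single layer
+    bidirectional=True,
+    batch_first=True,
+)
+```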
+
+We used AllenNLP's [Config Files](https://guide.allennlp.org/using-config-files) to build our model: we only needed to implement a model and a dataset reader, then tie them together with a config file.
+
+Our BiLSTM AllenNLP model contains 4 major components:
+
+ 1. Dataset Reader - [CitationDatasetReader](/utils/reader.py)
+ - It reads the data from the file, tokenizes the input text and creates AllenNLP `Instances`
+ - Each `Instance` contains a dictionary of `tokens` and `label` (see the sketch after this list)
+ 2. Model - [BiLstmClassifier](/classifier/nn.py)
+ - The model's `forward()` method is called for every data instance by passing `tokens` and `label`
+ - The signature of `forward()` needs to match the field names of the `Instance` created by the DatasetReader
+ - The `forward()` method finally returns an output dictionary with the predicted label, loss, softmax probabilities and so on
+ 3. Config File - [basic_model.json](/configs/basic_model.json?raw=true)
+ - The AllenNLP Configuration file takes the constructor parameters for various objects (Model, DatasetReader, Predictor, ...)
+ - We can also define a number of hyperparameters in the config file:
+ - Depth and Width of the Network
+ - Number of Epochs
+ - Optimizer & Learning Rate
+ - Batch Size
+ - Dropout
+ - Embeddings
+ 4. Predictor - [IntentClassificationPredictor](/testing/intent_predictor.py)
+ - AllenNLP uses a `Predictor`, a wrapper around a trained model, for making predictions.
+ - The Predictor uses a pre-trained/saved model and its dataset reader to predict labels for new `Instances`.
+
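+A hedged sketch of how the dataset reader builds an `Instance` (point 1 above; assumes AllenNLP 1.x, with illustrative tokenizer/indexer choices):
+
+```python
+from allennlp.data import Instance
+from allennlp.data.fields import TextField, LabelField
+from allennlp.data.token_indexers import SingleIdTokenIndexer
+from allennlp.data.tokenizers import WhitespaceTokenizer
+
+tokens = WhitespaceTokenizer().tokenize("We follow the training setup of the original paper.")
+fields = {
+    "tokens": TextField(tokens, {"tokens": SingleIdTokenIndexer()}),  # must match forward(tokens=...)
+    "label": LabelField("method"),                                    # must match forward(label=...)
+}
+instance = Instance(fields)
+```
+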
+### Running the Model
+TODO
+
+### Evaluation
+TODO
+
+### Results
+
-TODO
\ No newline at end of file
+## References
+ - Cohan et al., 2019. [Structural Scaffolds for Citation Intent Classification in Scientific Publications](https://arxiv.org/pdf/1904.01608.pdf). NAACL 2019. ([GitHub repo](https://github.com/allenai/scicite))
\ No newline at end of file
diff --git a/classifier/nn.py b/classifier/nn.py
index 1f0f51e..d2c17d2 100644
--- a/classifier/nn.py
+++ b/classifier/nn.py
@@ -104,10 +104,6 @@ class BiLstmClassifier(Model):
output_dict['probabilities'] = class_probabilities
output_dict['positive_label'] = label
output_dict['prediction'] = label
- # citation_text = []
- # for batch_text in output_dict['tokens']:
- # citation_text.append([self.vocab.get_token_from_index(token_id.item()) for token_id in batch_text])
- # output_dict['tokens'] = citation_text
return output_dict
diff --git a/plots/bilstm_model/confusion_matrix_plot.png b/plots/bilstm_model/confusion_matrix_plot.png
new file mode 100644
index 0000000..1f824c8
Binary files /dev/null and b/plots/bilstm_model/confusion_matrix_plot.png differ