Metadata-Version: 2.1
Name: wrapperCoreference
Version: 0.0.4
Summary: Coreference Resolution wrapper
Home-page: UNKNOWN
Author: Henry Rosales
Author-email: hrosmendez@gmail.com
License: UNKNOWN
Keywords: Coreference Resolution,NLP
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Education
Classifier: Topic :: Software Development :: Build Tools
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Requires-Python: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, <4
Description-Content-Type: text/markdown
Provides-Extra: dev
Requires-Dist: check-manifest ; extra == 'dev'
Provides-Extra: test
Requires-Dist: coverage ; extra == 'test'

# Coreference Resolution wrapper

Coreference Resolution is the task of finding all expressions in a text that refer to the same entity. It is an important step for many higher-level NLP tasks that involve natural language understanding, such as document summarization, question answering, and information extraction.

This is a simple library that wraps two Coreference Resolution models from the StanfordNLP package: the statistical and the neural model. We use the SpaCy package to load the neural model (a.k.a. *NeuralCoref*) and the stanfordnlp package to load the statistical model (a.k.a. *CoreNLPCoref*).

## Requirements

```bash
pip3 install spacy
pip3 install stanfordnlp
pip3 install wrapperCoreference
```

StanfordNLP also requires you to download the CoreNLP distribution manually; see [the CoreNLP download page](https://stanfordnlp.github.io/CoreNLP/download.html) for more details.

```bash
wget http://nlp.stanford.edu/software/stanford-corenlp-full-2018-10-05.zip
```

## Methods
Example usage of the neural model:
```python
from wrapperCoreference import WrapperCoreference
wc = WrapperCoreference()
wc.NeuralCoref(u'My sister has a dog. She loves him.')
#output: [{'start': 21, 'end': 24, 'text': 'She', 'resolved': 'My sister'}, {'start': 31, 'end': 34, 'text': 'him', 'resolved': 'a dog'}]
```
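
The result is a list of dictionaries carrying character offsets, so it can be applied back to the input text. The following is a minimal sketch, assuming `start` and `end` are 0-based, end-exclusive character offsets (as the sample output above suggests). The helper `resolve_text` is hypothetical and not part of wrapperCoreference; the result list is copied from the example above so the snippet runs without the library installed.

```python
def resolve_text(text, coref_results):
    """Replace every mention in `text` with its resolved antecedent.

    Hypothetical helper, not part of wrapperCoreference. Assumes 0-based,
    end-exclusive character offsets and non-overlapping mentions.
    """
    resolved = text
    # Work from the last mention to the first so that earlier offsets
    # remain valid after each substitution changes the string length.
    for mention in sorted(coref_results, key=lambda m: m['start'], reverse=True):
        resolved = (resolved[:mention['start']]
                    + mention['resolved']
                    + resolved[mention['end']:])
    return resolved


text = 'My sister has a dog. She loves him.'
# Copied from the NeuralCoref example output above.
coref_results = [
    {'start': 21, 'end': 24, 'text': 'She', 'resolved': 'My sister'},
    {'start': 31, 'end': 34, 'text': 'him', 'resolved': 'a dog'},
]

print(resolve_text(text, coref_results))
# My sister has a dog. My sister loves a dog.
```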


Example usage of the statistical model:
```python
from wrapperCoreference import WrapperCoreference
wc = WrapperCoreference()
wc.setCoreNLP('/tmp/stanford-corenlp-full-2018-10-05')
print(wc.CoreNLPCoref(u'My sister has a dog. She loves him.'))
#output: [{'start': 31, 'end': 34, 'text': 'him', 'resolved': 'a dog', 'fullInformation': [{'start': 14, 'end': 19, 'text': 'a dog'}]}, {'start': 21, 'end': 24, 'text': 'She', 'resolved': 'My sister', 'fullInformation': [{'start': 0, 'end': 9, 'text': 'My sister'}]}]
```
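
In this output each mention also carries a `fullInformation` list with the antecedent spans. As a quick sanity check, one might verify that every reported span actually matches the input text. This is only a sketch under the assumption that all offsets are 0-based and end-exclusive; `mismatched_spans` is a hypothetical helper, and the result list is copied from the example above.

```python
def mismatched_spans(text, coref_results):
    """Return every span whose offsets do not reproduce its 'text' field.

    Hypothetical helper, not part of wrapperCoreference. Checks both the
    mention itself and the antecedent spans under 'fullInformation'
    (treated as optional, since the neural output above does not have it).
    """
    bad = []
    for mention in coref_results:
        spans = [mention] + mention.get('fullInformation', [])
        for span in spans:
            if text[span['start']:span['end']] != span['text']:
                bad.append(span)
    return bad


text = 'My sister has a dog. She loves him.'
# Copied from the CoreNLPCoref example output above.
coref_results = [
    {'start': 31, 'end': 34, 'text': 'him', 'resolved': 'a dog',
     'fullInformation': [{'start': 14, 'end': 19, 'text': 'a dog'}]},
    {'start': 21, 'end': 24, 'text': 'She', 'resolved': 'My sister',
     'fullInformation': [{'start': 0, 'end': 9, 'text': 'My sister'}]},
]

# An empty list means every offset pair matches the input text.
print(mismatched_spans(text, coref_results))
```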



## Combining the output with Entity Linking

You can use the [nifwrapper](https://github.com/henryrosalesmendez/nifwrapper) library in order to merge the coreference outputs with Entity Linking annotations.
```python

from wrapperCoreference import WrapperCoreference
from nifwrapper import *

#---- Obtaining coreferences
wc = WrapperCoreference()
corefResults = wc.NeuralCoref(u'My sister has a dog. She loves him.')
#corefResults = [{'start': 21, 'end': 24, 'text': 'She', 'resolved': 'My sister'}, {'start': 31, 'end': 34, 'text': 'him', 'resolved': 'a dog'}]


#---- Obtaining Entity Linking results
# inline NIF corpus creation
wrp = NIFWrapper()
doc = NIFDocument("https://example.org/doc1")
#--
# "My sister has a dog." is 20 characters long, so the sentence spans offsets 0 to 20
sent = NIFSentence("https://example.org/doc1#char=0,20")
sent.addAttribute("nif:beginIndex","0","xsd:nonNegativeInteger")
sent.addAttribute("nif:endIndex","20","xsd:nonNegativeInteger")
sent.addAttribute("nif:isString","My sister has a dog.","xsd:string")
sent.addAttribute("nif:broaderContext",["https://example.org/doc1"],"URI LIST")


#-- 
a1 = NIFAnnotation("https://example.org/doc1#char=14,19", "14", "19", ["https://en.wikipedia.org/wiki/ExambleDogUri"], ["dbo:FamilyRelations"])
a1.addAttribute("nif:anchorOf","a dog","xsd:string")
sent.pushAnnotation(a1)
doc.pushSentence(sent)

#--
sent2 = NIFSentence("https://example.org/doc1#char=21,35")
sent2.addAttribute("nif:isString","She loves him.","xsd:string")
sent2.addAttribute("nif:broaderContext",["https://example.org/doc1"],"URI LIST")
sent2.addAttribute("nif:beginIndex","21","xsd:nonNegativeInteger")
sent2.addAttribute("nif:endIndex","35","xsd:nonNegativeInteger")
doc.pushSentence(sent2)
#--
wrp.pushDocument(doc)

#---- Combining EL annotations with coreferences 
wrp.extendsDocWithCoref(corefResults, doc.uri)

print(wrp.toString())
```


