Set up neuralcoref
The package is outdated and unmaintained, so installing it in a current environment takes a few workarounds
!git clone https://github.com/huggingface/neuralcoref
Cloning into 'neuralcoref'...
remote: Enumerating objects: 772, done.
remote: Counting objects: 100% (24/24), done.
remote: Compressing objects: 100% (16/16), done.
remote: Total 772 (delta 10), reused 16 (delta 7), pack-reused 748 (from 1)
Receiving objects: 100% (772/772), 67.85 MiB | 9.70 MiB/s, done.
Resolving deltas: 100% (407/407), done.
Updating files: 100% (151/151), done.
%cd /content/neuralcoref
/content/neuralcoref
# Strip ', msvccompiler' from setup.py in place.
!sed -i 's/, msvccompiler//' setup.py
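For anyone who prefers to do the same patch from Python rather than the shell, here is a minimal sketch. The helper names `drop_msvccompiler` and `patch_setup` are made up for this example, and the import line in the comment is only an assumed illustration of the kind of line the sed expression targets, so check your own copy of setup.py. One difference from the sed command: without the `g` flag sed removes only the first match on each line, while `str.replace` removes every match.

```python
from pathlib import Path


def drop_msvccompiler(source: str) -> str:
    """Remove every ', msvccompiler' from the given text."""
    # Assumed example of a targeted line:
    #   from distutils import ccompiler, msvccompiler
    return source.replace(", msvccompiler", "")


def patch_setup(path: str = "setup.py") -> bool:
    """Rewrite the file in place and report whether anything changed."""
    file = Path(path)
    original = file.read_text()
    patched = drop_msvccompiler(original)
    if patched != original:
        file.write_text(patched)
    return patched != original
```

Calling `patch_setup()` from inside the cloned repository should return `True` the first time and `False` on a rerun, which makes the step safe to repeat.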
%pip install Cython==0.29.36
Collecting Cython==0.29.36
  Downloading Cython-0.29.36-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl.metadata (3.1 kB)
Downloading Cython-0.29.36-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (1.9 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.9/1.9 MB 16.8 MB/s eta 0:00:00
Installing collected packages: Cython
  Attempting uninstall: Cython
    Found existing installation: Cython 3.0.11
    Uninstalling Cython-3.0.11:
      Successfully uninstalled Cython-3.0.11
Successfully installed Cython-0.29.36
%pip install -r requirements.txt
Collecting spacy<3.0.0,>=2.1.0 (from -r requirements.txt (line 1)) Downloading spacy-2.3.9-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (15 kB) Requirement already satisfied: cython>=0.25 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 2)) (0.29.36) Requirement already satisfied: pytest in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 3)) (7.4.4) Requirement already satisfied: murmurhash<1.1.0,>=0.28.0 in /usr/local/lib/python3.10/dist-packages (from spacy<3.0.0,>=2.1.0->-r requirements.txt (line 1)) (1.0.10) Requirement already satisfied: cymem<2.1.0,>=2.0.2 in /usr/local/lib/python3.10/dist-packages (from spacy<3.0.0,>=2.1.0->-r requirements.txt (line 1)) (2.0.8) Requirement already satisfied: preshed<3.1.0,>=3.0.2 in /usr/local/lib/python3.10/dist-packages (from spacy<3.0.0,>=2.1.0->-r requirements.txt (line 1)) (3.0.9) Collecting thinc<7.5.0,>=7.4.1 (from spacy<3.0.0,>=2.1.0->-r requirements.txt (line 1)) Downloading thinc-7.4.6-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (23 kB) Requirement already satisfied: blis<0.8.0,>=0.4.0 in /usr/local/lib/python3.10/dist-packages (from spacy<3.0.0,>=2.1.0->-r requirements.txt (line 1)) (0.7.11) Collecting wasabi<1.1.0,>=0.4.0 (from spacy<3.0.0,>=2.1.0->-r requirements.txt (line 1)) Downloading wasabi-0.10.1-py3-none-any.whl.metadata (28 kB) Collecting srsly<1.1.0,>=1.0.2 (from spacy<3.0.0,>=2.1.0->-r requirements.txt (line 1)) Downloading srsly-1.0.7-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (13 kB) Collecting catalogue<1.1.0,>=0.0.7 (from spacy<3.0.0,>=2.1.0->-r requirements.txt (line 1)) Downloading catalogue-1.0.2-py2.py3-none-any.whl.metadata (13 kB) Requirement already satisfied: tqdm<5.0.0,>=4.38.0 in /usr/local/lib/python3.10/dist-packages (from spacy<3.0.0,>=2.1.0->-r requirements.txt (line 1)) (4.66.6) Requirement already satisfied: setuptools in /usr/local/lib/python3.10/dist-packages 
(from spacy<3.0.0,>=2.1.0->-r requirements.txt (line 1)) (75.1.0) Requirement already satisfied: numpy>=1.15.0 in /usr/local/lib/python3.10/dist-packages (from spacy<3.0.0,>=2.1.0->-r requirements.txt (line 1)) (1.26.4) Collecting plac<1.2.0,>=0.9.6 (from spacy<3.0.0,>=2.1.0->-r requirements.txt (line 1)) Downloading plac-1.1.3-py2.py3-none-any.whl.metadata (2.3 kB) Requirement already satisfied: requests<3.0.0,>=2.13.0 in /usr/local/lib/python3.10/dist-packages (from spacy<3.0.0,>=2.1.0->-r requirements.txt (line 1)) (2.32.3) Requirement already satisfied: iniconfig in /usr/local/lib/python3.10/dist-packages (from pytest->-r requirements.txt (line 3)) (2.0.0) Requirement already satisfied: packaging in /usr/local/lib/python3.10/dist-packages (from pytest->-r requirements.txt (line 3)) (24.1) Requirement already satisfied: pluggy<2.0,>=0.12 in /usr/local/lib/python3.10/dist-packages (from pytest->-r requirements.txt (line 3)) (1.5.0) Requirement already satisfied: exceptiongroup>=1.0.0rc8 in /usr/local/lib/python3.10/dist-packages (from pytest->-r requirements.txt (line 3)) (1.2.2) Requirement already satisfied: tomli>=1.0.0 in /usr/local/lib/python3.10/dist-packages (from pytest->-r requirements.txt (line 3)) (2.0.2) Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests<3.0.0,>=2.13.0->spacy<3.0.0,>=2.1.0->-r requirements.txt (line 1)) (3.4.0) Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests<3.0.0,>=2.13.0->spacy<3.0.0,>=2.1.0->-r requirements.txt (line 1)) (3.10) Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests<3.0.0,>=2.13.0->spacy<3.0.0,>=2.1.0->-r requirements.txt (line 1)) (2.2.3) Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests<3.0.0,>=2.13.0->spacy<3.0.0,>=2.1.0->-r requirements.txt (line 1)) (2024.8.30) Downloading 
spacy-2.3.9-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.9 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.9/4.9 MB 36.8 MB/s eta 0:00:00
Downloading catalogue-1.0.2-py2.py3-none-any.whl (16 kB)
Downloading plac-1.1.3-py2.py3-none-any.whl (20 kB)
Downloading srsly-1.0.7-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (369 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 369.2/369.2 kB 24.1 MB/s eta 0:00:00
Downloading thinc-7.4.6-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.0 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.0/1.0 MB 26.0 MB/s eta 0:00:00
Downloading wasabi-0.10.1-py3-none-any.whl (26 kB)
Installing collected packages: wasabi, plac, srsly, catalogue, thinc, spacy
  Attempting uninstall: wasabi
    Found existing installation: wasabi 1.1.3
    Uninstalling wasabi-1.1.3:
      Successfully uninstalled wasabi-1.1.3
  Attempting uninstall: srsly
    Found existing installation: srsly 2.4.8
    Uninstalling srsly-2.4.8:
      Successfully uninstalled srsly-2.4.8
  Attempting uninstall: catalogue
    Found existing installation: catalogue 2.0.10
    Uninstalling catalogue-2.0.10:
      Successfully uninstalled catalogue-2.0.10
  Attempting uninstall: thinc
    Found existing installation: thinc 8.2.5
    Uninstalling thinc-8.2.5:
      Successfully uninstalled thinc-8.2.5
  Attempting uninstall: spacy
    Found existing installation: spacy 3.7.5
    Uninstalling spacy-3.7.5:
      Successfully uninstalled spacy-3.7.5
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
confection 0.1.5 requires srsly<3.0.0,>=2.4.0, but you have srsly 1.0.7 which is incompatible.
en-core-web-sm 3.7.1 requires spacy<3.8.0,>=3.7.2, but you have spacy 2.3.9 which is incompatible.
weasel 0.4.1 requires srsly<3.0.0,>=2.4.3, but you have srsly 1.0.7 which is incompatible.
Successfully installed catalogue-1.0.2 plac-1.1.3 spacy-2.3.9 srsly-1.0.7 thinc-7.4.6 wasabi-0.10.1
%pip install -e .
Obtaining file:///content/neuralcoref Preparing metadata (setup.py) ... done Requirement already satisfied: numpy>=1.15.0 in /usr/local/lib/python3.10/dist-packages (from neuralcoref==4.0) (1.26.4) Collecting boto3 (from neuralcoref==4.0) Downloading boto3-1.35.55-py3-none-any.whl.metadata (6.7 kB) Requirement already satisfied: requests<3.0.0,>=2.13.0 in /usr/local/lib/python3.10/dist-packages (from neuralcoref==4.0) (2.32.3) Requirement already satisfied: spacy<3.0.0,>=2.1.0 in /usr/local/lib/python3.10/dist-packages (from neuralcoref==4.0) (2.3.9) Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests<3.0.0,>=2.13.0->neuralcoref==4.0) (3.4.0) Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests<3.0.0,>=2.13.0->neuralcoref==4.0) (3.10) Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests<3.0.0,>=2.13.0->neuralcoref==4.0) (2.2.3) Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests<3.0.0,>=2.13.0->neuralcoref==4.0) (2024.8.30) Requirement already satisfied: murmurhash<1.1.0,>=0.28.0 in /usr/local/lib/python3.10/dist-packages (from spacy<3.0.0,>=2.1.0->neuralcoref==4.0) (1.0.10) Requirement already satisfied: cymem<2.1.0,>=2.0.2 in /usr/local/lib/python3.10/dist-packages (from spacy<3.0.0,>=2.1.0->neuralcoref==4.0) (2.0.8) Requirement already satisfied: preshed<3.1.0,>=3.0.2 in /usr/local/lib/python3.10/dist-packages (from spacy<3.0.0,>=2.1.0->neuralcoref==4.0) (3.0.9) Requirement already satisfied: thinc<7.5.0,>=7.4.1 in /usr/local/lib/python3.10/dist-packages (from spacy<3.0.0,>=2.1.0->neuralcoref==4.0) (7.4.6) Requirement already satisfied: blis<0.8.0,>=0.4.0 in /usr/local/lib/python3.10/dist-packages (from spacy<3.0.0,>=2.1.0->neuralcoref==4.0) (0.7.11) Requirement already satisfied: wasabi<1.1.0,>=0.4.0 in /usr/local/lib/python3.10/dist-packages 
(from spacy<3.0.0,>=2.1.0->neuralcoref==4.0) (0.10.1) Requirement already satisfied: srsly<1.1.0,>=1.0.2 in /usr/local/lib/python3.10/dist-packages (from spacy<3.0.0,>=2.1.0->neuralcoref==4.0) (1.0.7) Requirement already satisfied: catalogue<1.1.0,>=0.0.7 in /usr/local/lib/python3.10/dist-packages (from spacy<3.0.0,>=2.1.0->neuralcoref==4.0) (1.0.2) Requirement already satisfied: tqdm<5.0.0,>=4.38.0 in /usr/local/lib/python3.10/dist-packages (from spacy<3.0.0,>=2.1.0->neuralcoref==4.0) (4.66.6) Requirement already satisfied: setuptools in /usr/local/lib/python3.10/dist-packages (from spacy<3.0.0,>=2.1.0->neuralcoref==4.0) (75.1.0) Requirement already satisfied: plac<1.2.0,>=0.9.6 in /usr/local/lib/python3.10/dist-packages (from spacy<3.0.0,>=2.1.0->neuralcoref==4.0) (1.1.3) Collecting botocore<1.36.0,>=1.35.55 (from boto3->neuralcoref==4.0) Downloading botocore-1.35.55-py3-none-any.whl.metadata (5.7 kB) Collecting jmespath<2.0.0,>=0.7.1 (from boto3->neuralcoref==4.0) Downloading jmespath-1.0.1-py3-none-any.whl.metadata (7.6 kB) Collecting s3transfer<0.11.0,>=0.10.0 (from boto3->neuralcoref==4.0) Downloading s3transfer-0.10.3-py3-none-any.whl.metadata (1.7 kB) Requirement already satisfied: python-dateutil<3.0.0,>=2.1 in /usr/local/lib/python3.10/dist-packages (from botocore<1.36.0,>=1.35.55->boto3->neuralcoref==4.0) (2.8.2) Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil<3.0.0,>=2.1->botocore<1.36.0,>=1.35.55->boto3->neuralcoref==4.0) (1.16.0) Downloading boto3-1.35.55-py3-none-any.whl (139 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 139.2/139.2 kB 4.0 MB/s eta 0:00:00 Downloading botocore-1.35.55-py3-none-any.whl (12.7 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12.7/12.7 MB 48.9 MB/s eta 0:00:00 Downloading jmespath-1.0.1-py3-none-any.whl (20 kB) Downloading s3transfer-0.10.3-py3-none-any.whl (82 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 82.6/82.6 kB 6.0 MB/s eta 0:00:00 Installing collected packages: 
jmespath, botocore, s3transfer, boto3, neuralcoref
  Running setup.py develop for neuralcoref
Successfully installed boto3-1.35.55 botocore-1.35.55 jmespath-1.0.1 neuralcoref-4.0 s3transfer-0.10.3
!python -m spacy download en_core_web_sm
DEPRECATION: https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.3.1/en_core_web_sm-2.3.1.tar.gz#egg=en_core_web_sm==2.3.1 contains an egg fragment with a non-PEP 508 name pip 25.0 will enforce this behaviour change. A possible replacement is to use the req @ url syntax, and remove the egg fragment. Discussion can be found at https://github.com/pypa/pip/issues/11617 Collecting en_core_web_sm==2.3.1 Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.3.1/en_core_web_sm-2.3.1.tar.gz (12.0 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12.0/12.0 MB 38.4 MB/s eta 0:00:00 Preparing metadata (setup.py) ... done Requirement already satisfied: spacy<2.4.0,>=2.3.0 in /usr/local/lib/python3.10/dist-packages (from en_core_web_sm==2.3.1) (2.3.9) Requirement already satisfied: murmurhash<1.1.0,>=0.28.0 in /usr/local/lib/python3.10/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.0.10) Requirement already satisfied: cymem<2.1.0,>=2.0.2 in /usr/local/lib/python3.10/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (2.0.8) Requirement already satisfied: preshed<3.1.0,>=3.0.2 in /usr/local/lib/python3.10/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (3.0.9) Requirement already satisfied: thinc<7.5.0,>=7.4.1 in /usr/local/lib/python3.10/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (7.4.6) Requirement already satisfied: blis<0.8.0,>=0.4.0 in /usr/local/lib/python3.10/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (0.7.11) Requirement already satisfied: wasabi<1.1.0,>=0.4.0 in /usr/local/lib/python3.10/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (0.10.1) Requirement already satisfied: srsly<1.1.0,>=1.0.2 in /usr/local/lib/python3.10/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.0.7) Requirement already satisfied: catalogue<1.1.0,>=0.0.7 in /usr/local/lib/python3.10/dist-packages (from 
spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.0.2) Requirement already satisfied: tqdm<5.0.0,>=4.38.0 in /usr/local/lib/python3.10/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (4.66.6) Requirement already satisfied: setuptools in /usr/local/lib/python3.10/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (75.1.0) Requirement already satisfied: numpy>=1.15.0 in /usr/local/lib/python3.10/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.26.4) Requirement already satisfied: plac<1.2.0,>=0.9.6 in /usr/local/lib/python3.10/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (1.1.3) Requirement already satisfied: requests<3.0.0,>=2.13.0 in /usr/local/lib/python3.10/dist-packages (from spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (2.32.3) Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests<3.0.0,>=2.13.0->spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (3.4.0) Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests<3.0.0,>=2.13.0->spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (3.10) Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests<3.0.0,>=2.13.0->spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (2.2.3) Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests<3.0.0,>=2.13.0->spacy<2.4.0,>=2.3.0->en_core_web_sm==2.3.1) (2024.8.30) Building wheels for collected packages: en_core_web_sm Building wheel for en_core_web_sm (setup.py) ... 
done
  Created wheel for en_core_web_sm: filename=en_core_web_sm-2.3.1-py3-none-any.whl size=12047089 sha256=b1292e2d6515f75310cde14dd4377388d5dcc2747c07f157d8ad13240f4181eb
  Stored in directory: /root/.cache/pip/wheels/4f/1f/0e/16fae4b01d2d87454e0f484e58c48793efcf237f0894c1c4bd
Successfully built en_core_web_sm
Installing collected packages: en_core_web_sm
  Attempting uninstall: en_core_web_sm
    Found existing installation: en-core-web-sm 3.7.1
    Uninstalling en-core-web-sm-3.7.1:
      Successfully uninstalled en-core-web-sm-3.7.1
Successfully installed en_core_web_sm-2.3.1
✔ Download and installation successful
You can now load the model via spacy.load('en_core_web_sm')
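Since the steps above swap out several preinstalled packages, it can be worth confirming that the downgrade took effect before importing anything. The sketch below reads the installed versions through the standard library's `importlib.metadata`. The function names are hypothetical, and the expected major versions (spaCy 2.x, Cython 0.x) are an assumption read off the pins used in this walkthrough, not an official compatibility matrix.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def outside_expected_range(versions):
    """Return the names whose major version differs from the assumed target."""
    # Assumed targets, taken from the pins in this walkthrough.
    expected_major = {"spacy": 2, "Cython": 0}
    bad = []
    for name, major in expected_major.items():
        version = versions.get(name)
        if version is None or int(version.split(".")[0]) != major:
            bad.append(name)
    return bad


# An empty list means both packages are at the assumed major versions.
print(outside_expected_range(installed_versions(["spacy", "Cython"])))
```

If the list is not empty, rerunning the install cells (and possibly restarting the runtime so the new versions are picked up) is the first thing to try.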
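import spacy
import neuralcoref

# Load the spaCy 2.x English model and add neuralcoref to its pipeline.
nlp = spacy.load('en_core_web_sm')
neuralcoref.add_to_pipe(nlp)

# Print every coreference cluster found in the document.
doc1 = nlp('My sister has a dog. She loves him.')
print(doc1._.coref_clusters)

# Print the cluster attached to each named entity.
doc2 = nlp('Angela lives in Boston. She is quite happy in that city.')
for ent in doc2.ents:
    print(ent._.coref_cluster)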
100%|██████████| 40155833/40155833 [00:01<00:00, 34990939.06B/s]
[My sister: [My sister, She], a dog: [a dog, him]]
Boston: [Boston, that city]
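Each cluster in the output above pairs a main mention with the other mentions that refer to it. To make concrete what one can do with that information, here is a small pure-Python sketch that substitutes the main mention for every other mention. It is not neuralcoref's own code: `resolve_mentions` and the offset-based cluster format are hypothetical, chosen so the example runs without the library. neuralcoref itself exposes a resolved string as `doc._.coref_resolved`, which is likely the better choice in practice.

```python
def resolve_mentions(text, clusters):
    """Replace each listed mention with its cluster's main mention.

    clusters is a list of (main_text, [(start, end), ...]) pairs,
    where start and end are character offsets into text.
    """
    replacements = []
    for main_text, spans in clusters:
        for start, end in spans:
            replacements.append((start, end, main_text))
    # Work from the end of the string so earlier offsets stay valid.
    for start, end, main_text in sorted(replacements, reverse=True):
        text = text[:start] + main_text + text[end:]
    return text


text = "My sister has a dog. She loves him."
she = text.index("She")
him = text.index("him")
clusters = [
    ("My sister", [(she, she + len("She"))]),
    ("a dog", [(him, him + len("him"))]),
]
print(resolve_mentions(text, clusters))
# prints: My sister has a dog. My sister loves a dog.
```

Replacing from right to left is the one design choice that matters here: substituting a longer string early in the text would otherwise shift every later offset.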