Python spaCy: creating an nlp document fails with "Argument 'string' has incorrect type"
I'm relatively new to Python and NLP, and I'm trying to process a CSV file with spaCy. I'm able to load the file fine using pandas, but when I attempt to process it with spaCy's nlp function, the script errors out approximately 5% of the way through the file's contents.
My code block is as follows:
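    import pandas as pd
    import spacy

    # Load the reviews; the text to parse is in the 'comments' column
    df = pd.read_csv('./reviews.washington.dc.csv')
    nlp = spacy.load('en')

    # Stream each comment through the spaCy pipeline and print its text
    for parsed_doc in nlp.pipe(iter(df['comments']), batch_size=1, n_threads=4):
        print(parsed_doc.text)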
I've also tried:
    df['parsed'] = df['comments'].apply(nlp)
with the same result.
The traceback I'm receiving is:
    Traceback (most recent call last):
      File "/users/john/downloads/spacy_load.py", line 11, in <module>
        for parsed_doc in nlp.pipe(iter(df['comments']), batch_size=1, n_threads=4):
      File "/usr/local/lib/python3.6/site-packages/spacy/language.py", line 352, in pipe
        for doc in stream:
      File "spacy/syntax/parser.pyx", line 239, in pipe (spacy/syntax/parser.cpp:8912)
      File "spacy/matcher.pyx", line 465, in pipe (spacy/matcher.cpp:9904)
      File "spacy/syntax/parser.pyx", line 239, in pipe (spacy/syntax/parser.cpp:8912)
      File "spacy/tagger.pyx", line 231, in pipe (spacy/tagger.cpp:6548)
      File "/usr/local/lib/python3.6/site-packages/spacy/language.py", line 345, in <genexpr>
        stream = (self.make_doc(text) for text in texts)
      File "/usr/local/lib/python3.6/site-packages/spacy/language.py", line 293, in <lambda>
        self.make_doc = lambda text: self.tokenizer(text)
    TypeError: Argument 'string' has incorrect type (expected str, got float)
Can anyone shed light on why this is happening, and how I might work around it? I've tried various workarounds from the site to no avail. try/except blocks have had no effect, either.
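One likely reading of "expected str, got float" is that some rows in the comments column are empty: pandas loads an empty CSV cell as NaN, which is a float, and the tokenizer only accepts str. That is an assumption about this particular file, not something the traceback proves. Below is a minimal sketch of filtering such values out before they reach spaCy. The helper name clean_texts and the sample list are made up for illustration, and the list stands in for df['comments'] so the snippet runs without the CSV or a spaCy model.

```python
import math


def clean_texts(values):
    """Yield only usable text from an iterable, skipping missing values.

    Hypothetical helper: skips None and float NaN (how pandas represents
    an empty CSV cell) and converts everything else to str.
    """
    for value in values:
        if value is None:
            continue
        if isinstance(value, float) and math.isnan(value):
            continue
        yield str(value)


# Stand-in for df['comments']; the NaN mimics an empty CSV cell.
sample_comments = ["Great place", float("nan"), "Would stay again", None]
cleaned = list(clean_texts(sample_comments))
print(cleaned)  # ['Great place', 'Would stay again']
```

With the real data, the same idea would be nlp.pipe(clean_texts(df['comments'])), or in pandas terms df['comments'].dropna().astype(str). Counting the gaps first with df['comments'].isnull().sum() would confirm whether missing values are really the cause. This may also explain why try/except around the loop body had no effect: the error appears to be raised while nlp.pipe produces the next document (the for line in the traceback), not inside the body of the loop.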