Python spaCy Create nlp Document - Argument 'string' has incorrect type


I'm relatively new to Python and NLP, and I'm trying to process a CSV file with spaCy. I'm able to load the file fine using pandas, but when I attempt to process it with spaCy's nlp function, the script errors out approximately 5% of the way through the file's contents.

The code block follows:

    import pandas as pd
    df = pd.read_csv('./reviews.washington.dc.csv')

    import spacy
    nlp = spacy.load('en')

    for parsed_doc in nlp.pipe(iter(df['comments']), batch_size=1, n_threads=4):
        print(parsed_doc.text)

I've also tried:

df['parsed'] = df['comments'].apply(nlp) 

with the same result.

The traceback I'm receiving is:

    Traceback (most recent call last):
      File "/users/john/downloads/spacy_load.py", line 11, in <module>
        for parsed_doc in nlp.pipe(iter(df['comments']), batch_size=1, n_threads=4):
      File "/usr/local/lib/python3.6/site-packages/spacy/language.py", line 352, in pipe
        for doc in stream:
      File "spacy/syntax/parser.pyx", line 239, in pipe (spacy/syntax/parser.cpp:8912)
      File "spacy/matcher.pyx", line 465, in pipe (spacy/matcher.cpp:9904)
      File "spacy/syntax/parser.pyx", line 239, in pipe (spacy/syntax/parser.cpp:8912)
      File "spacy/tagger.pyx", line 231, in pipe (spacy/tagger.cpp:6548)
      File "/usr/local/lib/python3.6/site-packages/spacy/language.py", line 345, in <genexpr>
        stream = (self.make_doc(text) for text in texts)
      File "/usr/local/lib/python3.6/site-packages/spacy/language.py", line 293, in <lambda>
        self.make_doc = lambda text: self.tokenizer(text)
    TypeError: Argument 'string' has incorrect type (expected str, got float)

Can anyone shed light on why this is happening, and how I might work around it? I've tried various workarounds from this site to no avail. try/except blocks have had no effect, either.
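
One possible explanation, consistent with the "expected str, got float" message, is that some cells in the comments column are empty: pandas stores a missing cell as NaN, which is a float, so the tokenizer receives a non-string value. Whether that is what happens in this particular file is an assumption. The sketch below uses a hypothetical helper, clean_texts, and a small hand-made list in place of df['comments'] to show the idea of passing only real strings onward.

```python
def clean_texts(values):
    """Yield only the values that are real strings, skipping anything else."""
    for value in values:
        if isinstance(value, str):
            yield value


# Stand-in for df['comments'] (an assumption for illustration):
# float("nan") mimics what pandas stores for an empty cell.
comments = ["Great place", float("nan"), "Would stay again"]

texts = list(clean_texts(comments))
print(texts)
```

In the script above, the same generator could wrap the column before it reaches spaCy, for example nlp.pipe(clean_texts(df['comments']), ...). If the cause is indeed missing values, df['comments'].dropna() or df['comments'].fillna('') are pandas-side alternatives; note that filtering rows this way changes which row each parsed document corresponds to.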

