python - Find an element in a list by comparing each entry with a string, using a percentage match from SequenceMatcher

June 15, 2015

I am trying to compare each element of a list with a string. So far I have used the NLTK library, but I am now required to use difflib's SequenceMatcher instead. Here is the script with the NLTK library:

import nltk
import nltk.corpus
import nltk.tokenize
import nltk.stem.snowball
import string
import re

newinputlist = 'cars (2006)'

newlistmovies = ['adult world (2013)', 'trolls (2016)', 'cars (2006)',
                 'harry potter , prisoner of azkaban (2004)',
                 'the sex monster (1999)', 'pitch perfect 2 (2015)',
                 'avengers: age of ultron (2015)', 'jurrasic world (2015)']

stopwords = nltk.corpus.stopwords.words('english')
stopwords.extend(string.punctuation)
stopwords.append('')
stemmer = nltk.stem.snowball.SnowballStemmer('english')

def get_match_ratio(s1, s2):
    # Tokenize both strings in lower case, then stem every token.
    tokens_s1 = [token for token in nltk.word_tokenize(s1.lower())]
    tokens_s2 = [token for token in nltk.word_tokenize(s2.lower())]
    stems_s1 = [stemmer.stem(token) for token in tokens_s1]
    stems_s2 = [stemmer.stem(token) for token in tokens_s2]

    # Shared stems divided by all distinct stems (Jaccard index).
    ratio = len(set(stems_s1).intersection(stems_s2)) / float(len(set(stems_s1).union(stems_s2)))
    return ratio

similarity = [[item, get_match_ratio(newinputlist, item)] for item in newlistmovies]

# Keep only the titles whose ratio is above 0.5.
itemmatch = [x[0] for x in similarity if x[1] > 0.5]
print(itemmatch)

Output:

'cars (2006)' # percent value : 1.0

Any ideas on how to write this using only difflib?
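One possible way to do this with difflib alone is sketched below. It is not a drop-in equivalent of the NLTK script: there is no stemming, tokens come from a plain `str.split()` (an assumption made here to avoid NLTK), and `SequenceMatcher.ratio()` is computed as 2*M/T over the two sequences rather than as a set intersection over union, so the percentages will generally differ from the ones above. The variable names are reused from the question.

```python
from difflib import SequenceMatcher

newinputlist = 'cars (2006)'

newlistmovies = ['adult world (2013)', 'trolls (2016)', 'cars (2006)',
                 'harry potter , prisoner of azkaban (2004)',
                 'the sex monster (1999)', 'pitch perfect 2 (2015)',
                 'avengers: age of ultron (2015)', 'jurrasic world (2015)']

def get_match_ratio(s1, s2):
    # Compare the two titles word by word. SequenceMatcher accepts any
    # sequences of hashable items, so lists of words work as well as strings.
    words_s1 = s1.lower().split()
    words_s2 = s2.lower().split()
    return SequenceMatcher(None, words_s1, words_s2).ratio()

def get_char_ratio(s1, s2):
    # Character-level alternative: more tolerant of typos, but titles that
    # only share a year pattern such as " (20" also score fairly high.
    return SequenceMatcher(None, s1.lower(), s2.lower()).ratio()

similarity = [(item, get_match_ratio(newinputlist, item)) for item in newlistmovies]

# Keep only the titles whose ratio is above 0.5, as in the original script.
itemmatch = [item for item, ratio in similarity if ratio > 0.5]
print(itemmatch)
```

With the word-level comparison only 'cars (2006)' passes the 0.5 threshold. The character-level variant is worth considering if misspelled titles should still match, but the threshold would probably need to be higher, since 'trolls (2016)' already scores above 0.5 against 'cars (2006)' at character level.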
