How to 'use' this model in our own project ? #56

Open
GeekyPM07 opened this issue Jun 29, 2018 · 14 comments

Comments

@GeekyPM07

GeekyPM07 commented Jun 29, 2018

I can't find a way to incorporate this model into my code. I just need to get sentiment scores on some feedback. As this model is pre-trained, it would be a great help. How do I do this? Thanks!

@GeekyPM07 GeekyPM07 changed the title from "How to 'use' this model in our own project? I just need to get sentiment scores on some Feedbacks." to "How to 'use' this model in our own project ?" on Jun 29, 2018
@pgurazada

pgurazada commented Jan 31, 2019

I have used this code in a recent project on customer feedback. Here is what I did.

  • Cloned the repo to the code folder of my project
  • Opened a new file, sentiment-extractor.py, in this repo folder and filled in the following code:
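import pandas as pd
from encoder import Model  # encoder.py comes with the cloned repo

# Load the pre-trained model
sentiment_model = Model()

# All the feedback text was placed in a pandas data frame;
# the path is a placeholder for a CSV file with a 'samples' column
data_df = pd.read_csv('/path/to/samples/')

# The model takes a plain list of strings
samples = list(data_df['samples'])

# One row of features per sample
text_features = sentiment_model.transform(samples)

# Column 2388 holds the sentiment score for each sample
sentiment_scores = text_features[:, 2388]

data_df['sentiment_scores'] = sentiment_scores

# Placeholder for the output CSV path
data_df.to_csv('/path/to/output_dir')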

Note that the code runs significantly faster on a GPU.

@massimosclaw

massimosclaw commented Jul 15, 2019

@pgurazada I'm a total beginner to python and pandas trying to apply this code to my csv. If possible, would love any help whatsoever as to why I'm getting this error.

I replaced '/path/to/samples/' with a file named 'samples.csv' in the same directory.

Warning (from warnings module):
  File "/Volumes/Transcend/sentimentneuron/generating-reviews-discovering-sentiment/sentiment-extractor.py", line 14
    warnings.warn(msg, category=DeprecationWarning)
DeprecationWarning: sklearn.externals.joblib is deprecated in 0.21 and will be removed in 0.23. Please import this functionality directly from joblib, which can be installed with: pip install joblib. If this warning is raised when loading pickled models, you may need to re-serialize those models with scikit-learn 0.21+.

Warning (from warnings module):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/sklearn/externals/joblib/__init__.py", line 15
    warnings.warn(msg, category=DeprecationWarning)
DeprecationWarning: sklearn.externals.joblib is deprecated in 0.21 and will be removed in 0.23. Please import this functionality directly from joblib, which can be installed with: pip install joblib. If this warning is raised when loading pickled models, you may need to re-serialize those models with scikit-learn 0.21+.
WARNING:tensorflow:From /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2657, in get_loc
    return self._engine.get_loc(key)
  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'samples'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Volumes/Transcend/sentimentneuron/generating-reviews-discovering-sentiment/sentiment-extractor.py", line 23, in <module>
    samples = list(data_df['samples'])
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/frame.py", line 2927, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2659, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'samples' 

@massimosclaw

@pgurazada I should mention I also don't know how to convert csv to pandas dataframe

@pgurazada

@massimosclaw

Firstly, pd.read_csv() will read the csv file into a pandas data frame. So you will not need any extra work on that. Secondly, it appears to me that your data frame does not have a column named 'samples' and the subsetting step data_df['samples'] fails (hence the KeyError). Kindly check if you have a column named 'samples' in your csv.

I understand where you are coming from. I am sure this link would help you grok pandas better - https://tomaugspurger.github.io/modern-1-intro.html
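
The column check described above can be sketched as follows. This is a minimal illustration, not the original poster's code: the inline CSV and the column name 'feedback' are hypothetical stand-ins for a file whose text column is not called 'samples'.

```python
import io
import pandas as pd

# Hypothetical stand-in for samples.csv: the text column is named 'feedback',
# so data_df['samples'] would raise the KeyError shown in the traceback.
csv_text = "date,feedback\n2019-07-15,Great service\n2019-07-16,Too slow\n"
data_df = pd.read_csv(io.StringIO(csv_text))

# Print the column names pandas actually found in the file
print(list(data_df.columns))

text_column = 'feedback'  # assumed name; replace with the column in your file
if 'samples' not in data_df.columns:
    # Rename so the rest of the script can keep using data_df['samples']
    data_df = data_df.rename(columns={text_column: 'samples'})

samples = list(data_df['samples'])
```

Renaming the column in the data frame is one option; editing the header row of the CSV itself, or changing 'samples' in the script, would work equally well.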

@massimosclaw

massimosclaw commented Jul 17, 2019

@pgurazada
Thank you so much! And thank you for the resource, will definitely check it out. Now it's giving me this error.

Traceback (most recent call last):
  File "/Volumes/Transcend/sentimentneuron/generating-reviews-discovering-sentiment/sentiment-extractor.py", line 10, in <module>
    text_features = model.transform(samples)
NameError: name 'model' is not defined

It seems to me this means the (variable?) 'model' hasn't been given a value? I'm not sure what value to give it though...

@massimosclaw

massimosclaw commented Jul 17, 2019

I saw in someone else's example: https://github.com/ModelDepot/Sentiment-Neuron-Demonstration/blob/master/Sentiment_Neuron.ipynb that they defined model = Model(), and noticed you defined it as sentiment_model.

So I also tried replacing

text_features = model.transform(samples)
with

text_features = sentiment_model.transform(samples)
And got this:

Traceback (most recent call last):
  File "/Volumes/Transcend/sentimentneuron/generating-reviews-discovering-sentiment/sentiment-extractor.py", line 12, in <module>
    text_features = sentiment_model.transform(samples)
  File "/Volumes/Transcend/sentimentneuron/generating-reviews-discovering-sentiment/encoder.py", line 156, in transform
    xs = [preprocess(x) for x in xs]
  File "/Volumes/Transcend/sentimentneuron/generating-reviews-discovering-sentiment/encoder.py", line 156, in <listcomp>
    xs = [preprocess(x) for x in xs]
  File "/Volumes/Transcend/sentimentneuron/generating-reviews-discovering-sentiment/utils.py", line 53, in preprocess
    text = html.unescape(text)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/html/__init__.py", line 130, in unescape
    if '&' not in s:
TypeError: argument of type 'float' is not iterable

@pgurazada

@massimosclaw Thank you for pointing out the sentiment_model error. I have now corrected my original comment. It seems there is a problem with the data types in your column: the encoder is failing when it tries to iterate over a value that is a float rather than a string. Would you be able to check whether there are any missing values? Sanity checks on whether all the strings are encoded properly would also help.

@massimosclaw

massimosclaw commented Jul 17, 2019

@pgurazada No prob!

Unfortunately, I don't know how to check if there are any missing values (or what that means exactly... empty cells?) as well as data sanity checks on whether the strings are all encoded properly... don't know what that means either. Anything in particular I should search for to learn more about that?

Will do some googling around with those terms...
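
One way to check for missing values is sketched below. It assumes the text lives in a column named 'samples' and uses a small inline CSV as a hypothetical stand-in for the real file. pandas reads an empty cell as NaN, which is a float, and that is consistent with the "argument of type 'float' is not iterable" error above.

```python
import io
import pandas as pd

# Hypothetical CSV with one empty 'samples' cell (row with id 2)
csv_text = "id,samples\n1,Loved it\n2,\n3,Not great\n"
data_df = pd.read_csv(io.StringIO(csv_text))

# Count the missing (NaN) entries in the text column
n_missing = int(data_df['samples'].isna().sum())
print(n_missing)

# Drop rows with no text, then make sure every remaining value is a string
clean_df = data_df.dropna(subset=['samples']).copy()
clean_df['samples'] = clean_df['samples'].astype(str)

samples = clean_df['samples'].tolist()
```

Dropping the rows keeps the scores aligned with clean_df rather than the original data_df, so any scores should be written back to clean_df.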

@massimosclaw

@pgurazada I finally managed to get it to work just by deleting all other columns which contained links, dates, times, and other data. Thank you so much again for the help.

Wanted to ask one last question... how do you get the code to run on your GPU? As it takes a long time to run on my CPU.
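
Instead of deleting the other columns from the file by hand, pandas can be asked to load only the text column. This is a sketch under assumptions: the inline CSV and its 'link' and 'date' columns are hypothetical, and the thread does not say why removing the extra columns resolved the error, so this may not address the same cause.

```python
import io
import pandas as pd

# Hypothetical CSV with extra columns next to the feedback text
csv_text = (
    "link,date,samples\n"
    "http://example.com/1,2019-07-17,Very helpful staff\n"
    "http://example.com/2,2019-07-18,Would not recommend\n"
)

# usecols tells read_csv to keep only the listed columns
data_df = pd.read_csv(io.StringIO(csv_text), usecols=['samples'])

samples = data_df['samples'].tolist()
```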

@pgurazada

> @pgurazada I finally managed to get it to work just by deleting all other columns which contained links, dates, times, and other data. Thank you so much again for the help.
>
> Wanted to ask one last question... how do you get the code to run on your GPU? As it takes a long time to run on my CPU.

I am so sorry, I missed this for some reason. On a GPU, predictions are about 10x faster. I was using the standard Colaboratory GPUs.

@divyag11

divyag11 commented Sep 5, 2019

Could you kindly tell me why you took index 2388 in "sentiment_scores = text_features[:, 2388]"? What is it used for?

@pgurazada

@divyag11 Please check out the original paper from OpenAI. This is the activation of the sentiment neuron.
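
To illustrate what the indexing does, here is a small sketch with a random NumPy array standing in for the real output of transform(). The width of 4096 is an assumption based on the 4096-unit model described in the paper, and the values are random, so they carry no sentiment meaning.

```python
import numpy as np

SENTIMENT_UNIT = 2388  # index used in the snippet earlier in this thread

# Stand-in for the model output: one row per sample, one column per unit.
# The width of 4096 is assumed, not taken from this thread.
rng = np.random.default_rng(0)
text_features = rng.standard_normal((3, 4096)).astype(np.float32)

# [:, 2388] keeps every row and picks the single column for that unit,
# giving one score per input sample
sentiment_scores = text_features[:, SENTIMENT_UNIT]
print(sentiment_scores.shape)  # (3,)
```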

@divyag11

divyag11 commented Sep 9, 2019

Okay, thanks for your reply.

4 participants