TypeError: expected string or bytes-like object (I am getting this error)

Posted by 5kgi1eie on 2021-08-20 in Java


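This is the code where I get the error. I have an Excel sheet with two columns: column A contains Urdu text, and column B contains the text label, True or False.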

import collections

import nltk
import pandas as pd

counter = collections.Counter()
maxlen = 0
c = 0
df = pd.read_excel(INPUT_FILE)
df.columns = ['Text', 'label']
label = df['label']
sent = df['Text']

# print(sent)

content = df
for index, row in df.iterrows():
    # print(row['Text'], row['label'])
    words = [x.lower() for x in nltk.word_tokenize(row['Text'])]  # the error is raised here
    if len(words) > maxlen:
        maxlen = len(words)
        c = c + 1
    for word in words:
        counter[word] += 1
print(len(counter))
print(maxlen)

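I have tried different solutions, but all in vain. Please suggest some code or a solution based on my code and my problem. I am working with Urdu text; maybe that is the issue.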

python nlp

Source: https://stackoverflow.com/questions/68311708/typeerror-expected-string-or-bytes-like-object-i-am-getting-this-error

1 Answer

weylhg0b1#

"word_tokenize" expects a string as input, and you appear to be passing an object of a different type.
See:
https://www.nltk.org/api/nltk.tokenize.html

word_tokenize(s)
Tokenize a string to split off punctuation other than periods
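
To make the point above concrete, here is a minimal sketch of one way to find and skip the offending rows. It rests on an assumption the question does not confirm: that some cells in the Text column are not strings (for example, pandas reads an empty Excel cell as NaN, which is a float, and a numeric cell as a number). The helpers is_text and count_words are hypothetical names, the cells list stands in for df['Text'], and str.split stands in for nltk.word_tokenize so the example runs without NLTK data.

```python
import collections
import re


def is_text(value):
    """Return True only for real, non-blank strings."""
    return isinstance(value, str) and value.strip() != ""


def count_words(cells, tokenize):
    """Count tokens over `cells`, skipping anything that is not a string.

    `tokenize` is any callable mapping a str to a list of tokens,
    e.g. nltk.word_tokenize in the question's code.
    Returns (counter, maxlen, skipped), where `skipped` holds the
    (position, value) pairs that were not usable strings.
    """
    counter = collections.Counter()
    maxlen = 0
    skipped = []
    for position, cell in enumerate(cells):
        if not is_text(cell):
            skipped.append((position, cell))
            continue
        words = [w.lower() for w in tokenize(cell)]
        maxlen = max(maxlen, len(words))
        counter.update(words)
    return counter, maxlen, skipped


# Stand-in for df['Text'] (assumed data): one empty cell read as NaN
# and one numeric cell mixed in with the text.
cells = ["یہ ایک جملہ ہے", float("nan"), 2021, "Hello World hello"]

# A regex call on the float raises the same kind of TypeError that
# the question reports from word_tokenize.
try:
    re.findall(r"\S+", cells[1])
except TypeError as exc:
    error_message = str(exc)

counter, maxlen, skipped = count_words(cells, str.split)
print(len(counter), maxlen, [position for position, _ in skipped])
```

With the real DataFrame, df['Text'].map(type).value_counts() should show which types are actually present. If the non-string cells turn out to be empty, dropping them with df.dropna(subset=['Text']) before the loop is one option; Urdu text by itself is a normal str and should not trigger this error.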

Answered on 2021-08-20
