Python — how to use the Google Cloud Natural Language API on a large BigQuery table (or any other topic-modelling resource)?

rjee0c15 · posted 2021-09-08 · tagged: Java

As mentioned in the title, I have a BigQuery table with 18 million rows, nearly half of which are useless, and I need to assign a topic/niche to each row based on one important column (it contains details about a product and a website). I have already tested the NLP API on a sample of 10,000 rows and it does work, but my standard approach is simply to iterate over newarr (the column of important details I get by querying the BigQuery table), sending one cell at a time, waiting for the API's response, and appending it to a results array.
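For reference, newarr is built with something along these lines (the project, dataset, table and column names below are placeholders, not my real ones):

from google.cloud import bigquery

bq = bigquery.Client()

# Pull only the column that matters; roughly half the rows are useless,
# so they can be filtered out in the query itself.
query = """
    SELECT details                             -- placeholder column name
    FROM `my-project.my_dataset.products`      -- placeholder table
    WHERE details IS NOT NULL
"""
newarr = [row.details for row in bq.query(query).result()]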
Ideally, I want to do this for all 18 million rows in as little time as possible. My quota has been raised to 3,000 API requests per minute, which is the maximum I can get, but I cannot figure out how to actually send 3,000 rows per minute when I am sending them one at a time.

# current approach: one synchronous API call per row
i = 0
for x in newarr:
    i += 1
    results.append(sample_classify_text(x))
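The closest I have come to what I want is a rough sketch like the one below, which pushes the rows through a thread pool in batches of 3,000 and then waits out the rest of the minute. I am not sure this is the right way to stay inside the quota, and the worker count is just a guess:

import time
from concurrent.futures import ThreadPoolExecutor

REQUESTS_PER_MINUTE = 3000   # my current quota
MAX_WORKERS = 50             # guess, not tuned

results = []
with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
    for start in range(0, len(newarr), REQUESTS_PER_MINUTE):
        batch = newarr[start:start + REQUESTS_PER_MINUTE]
        t0 = time.monotonic()
        # map() runs sample_classify_text concurrently and keeps the input order
        results.extend(pool.map(sample_classify_text, batch))
        # if the batch finished in under a minute, sleep off the remainder
        elapsed = time.monotonic() - t0
        if elapsed < 60:
            time.sleep(60 - elapsed)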

sample_classify_text is the function taken straight from the documentation:


# This function returns the content category for a piece of text.

from google.cloud import language_v1

def sample_classify_text(text_content):
    """
    Classifying Content in a String

    Args:
      text_content The text content to analyze. Must include at least 20 words.
    """

    client = language_v1.LanguageServiceClient()

    # text_content = 'That actor on TV makes movies in Hollywood and also stars in a variety of popular new TV shows.'

    # Available types: PLAIN_TEXT, HTML
    type_ = language_v1.Document.Type.PLAIN_TEXT

    # Optional. If not specified, the language is automatically detected.
    # For list of supported languages:
    # https://cloud.google.com/natural-language/docs/languages
    language = "en"
    document = {"content": text_content, "type_": type_, "language": language}

    response = client.classify_text(request = {'document': document})
    # return response.categories
    # Loop through the classified categories returned from the API.
    # See the predefined taxonomy of categories:
    # https://cloud.google.com/natural-language/docs/categories
    # Each category also has a confidence score (how certain the classifier is
    # that the category represents the provided text), but only the name of
    # the first category is returned here.
    for category in response.categories:
        # Get the name of the category representing the document.
        return format(category.name)
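One thing I suspect is hurting me: sample_classify_text builds a new LanguageServiceClient on every call. A variant that reuses a single client and backs off when the quota is hit (RESOURCE_EXHAUSTED) might look roughly like this; the retry settings are just my guesses:

from google.api_core import exceptions, retry
from google.cloud import language_v1

# One shared client instead of one per request.
shared_client = language_v1.LanguageServiceClient()

# Back off and retry when the per-minute quota is exceeded.
quota_retry = retry.Retry(
    predicate=retry.if_exception_type(exceptions.ResourceExhausted),
    initial=1.0,      # seconds before the first retry
    maximum=60.0,     # cap between retries
    deadline=300.0,   # give up after 5 minutes
)

def classify_one(text_content):
    document = {
        "content": text_content,
        "type_": language_v1.Document.Type.PLAIN_TEXT,
        "language": "en",
    }
    response = shared_client.classify_text(
        request={"document": document}, retry=quota_retry
    )
    # Same behaviour as sample_classify_text: return the first category name.
    return response.categories[0].name if response.categories else None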

No answers yet.
