python-3.x: Most common character in a string

Posted by des4xlb0 4 months ago in Python

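Write a function that takes a string consisting of alphabetic characters as an input argument and returns the most common character. Ignore whitespace, i.e. do not count any whitespace as a character. Note that capitalization does not matter here, i.e. a lowercase character is equal to an uppercase character. In case of a tie between certain characters, return the last character that has the most count.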

Here is the updated code:

def most_common_character (input_str):
    input_str = input_str.lower()
    new_string = "".join(input_str.split())
    print(new_string)
    length = len(new_string)
    print(length)
    count = 1
    j = 0
    higher_count = 0
    return_character = ""
    for i in range(0, len(new_string)):
        character = new_string[i]
        while (length - 1):
            if (character == new_string[j + 1]):
                count += 1
            j += 1
            length -= 1    
            if (higher_count < count):
                higher_count = count
    return (character)     

#Main Program
input_str = input("Enter a string: ")
result = most_common_character(input_str)
print(result)

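Above is my code. I am getting a "string index out of bound" error and I cannot understand why. Also, the code only checks the occurrences of the first character; I am confused about how to proceed to the next character and take the maximum count.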

The error I get when I run my code:

> Your answer is NOT CORRECT Your code was tested with different inputs.
> For example when your function is called as shown below:
> 
> most_common_character ('The cosmos is infinite')
> 
> ############# Your function returns ############# e The returned variable type is: type 'str'
> 
> ######### Correct return value should be ######## i The returned variable type is: type 'str'
> 
> ####### Output of student print statements ###### thecosmosisinfinite 19


python-3.x

Source: https://stackoverflow.com/questions/35594767/most-common-character-in-a-string

4 Answers

h9vpoimq1#

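You can use a regular expression pattern to find all the characters. \w matches any alphanumeric character and the underscore; for ASCII text this is equivalent to the set [a-zA-Z0-9_]. A + after [\w] means to match one or more repetitions.
Finally, use Counter to tally them, then use most_common(1) to get the entry with the highest count.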

from collections import Counter
import re

s = "Write a function that takes a string consisting of alphabetic characters as input argument and returns the most common character. Ignore white spaces i.e. Do not count any white space as a character. Note that capitalization does not matter here i.e. that a lower case character is equal to a upper case character. In case of a tie between certain characters return the last character that has the most count"

>>> Counter(c.lower() for c in re.findall(r"\w", s)).most_common(1)
[('t', 46)]

Handling a tie is a bit trickier.

def top_character(some_string):
    # r"\w" (no +) yields one entry per character rather than one per word
    joined_characters = re.findall(r"\w", some_string.lower())
    d = Counter(joined_characters)
    top_characters = [c for c, n in d.most_common() if n == max(d.values())]
    if len(top_characters) == 1:
        return top_characters[0]
    reversed_characters = joined_characters[::-1]  
    for c in reversed_characters:
        if c in top_characters:
            return c

>>> top_character(s)
't'

>>> top_character('the the')
'e'

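Applying the approach above to the sentence 'The cosmos is infinite', you can see that 'i' occurs more often than 'e' (the output of your function):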

>>> Counter(c.lower() for c in "".join(re.findall(r"[\w]+", 'The cosmos is infinite'))).most_common(3)
[('i', 4), ('s', 3), ('e', 2)]

You can see the problem in this block of your code:

for i in range(0, len(new_string)):
    character = new_string[i]
    ...
return (character)

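You are iterating over the sentence and assigning each letter to the variable character, which is never reassigned anywhere else. When the function returns, character therefore always holds the last character of the string.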
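
A minimal sketch of one way around this, not the asker's code: count every character in a dict, then scan the cleaned string from the end and return the first character whose count equals the maximum. The name most_common_character_sketch is hypothetical, and the sketch assumes that "last" means the tied character whose final occurrence comes latest in the string.

```python
def most_common_character_sketch(input_str):
    # Lowercase and drop all whitespace, as the question requires.
    cleaned = "".join(input_str.lower().split())
    if not cleaned:
        return ""  # assumption: empty input yields an empty string

    # Count every character in one pass.
    counts = {}
    for ch in cleaned:
        counts[ch] = counts.get(ch, 0) + 1

    best = max(counts.values())
    # Scan from the end so that a tie resolves to the character seen last.
    for ch in reversed(cleaned):
        if counts[ch] == best:
            return ch


print(most_common_character_sketch("The cosmos is infinite"))  # i
```

Because the best count is found before the backward scan starts, the returned value no longer depends on whichever character the loop happened to visit last.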


cu6pst1q2#

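Actually, your code is almost correct. You need to move count, j and length inside the for i in range(0, len(new_string)) loop, because they have to start over on each iteration. Also, when count is greater than higher_count, you need to save that character as return_character and return it instead of character, because character = new_string[i] always ends up as the last character of the string.
I don't see why you use j+1 and while length-1. After correcting them, the code now also covers the tie case.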

def most_common_character (input_str):
    input_str = input_str.lower()
    new_string = "".join(input_str.split())
    higher_count = 0
    return_character = ""

    for i in range(0, len(new_string)):
        count = 0
        length = len(new_string)
        j = 0
        character = new_string[i]
        while length > 0:
            if (character == new_string[j]):
                count += 1
            j += 1
            length -= 1    
            if (higher_count <= count):
                higher_count = count
                return_character = character
    return (return_character)


lf5gs5x23#

If we ignore the "tie" requirement, collections.Counter() works:

from collections import Counter
from itertools import chain

def most_common_character(input_str):
    return Counter(chain(*input_str.casefold().split())).most_common(1)[0][0]

Example:

>>> most_common_character('The cosmos is infinite')
'i'
>>> most_common_character('ab' * 3)
'a'

To return the last character that has the most count, we could use collections.OrderedDict:

from collections import Counter, OrderedDict
from itertools import chain
from operator import itemgetter

class OrderedCounter(Counter, OrderedDict):
    pass

def most_common_character(input_str):
    counter = OrderedCounter(chain(*input_str.casefold().split()))
    return max(reversed(counter.items()), key=itemgetter(1))[0]

Example:

>>> most_common_character('ab' * 3)
'b'

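Note: this solution assumes that max() returns the first character that has the most count (hence the reversed() call, to get the last one) and that all characters are single Unicode code points. In general, you might want to use the \X regular expression (supported by the regex module) to extract "user-perceived characters" (eXtended grapheme clusters) from a Unicode string.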
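
The tie behaviour of max() that this note relies on can be checked in isolation. The following is a small sketch with made-up data (the pairs list is hypothetical and stands in for the counter's items in first-occurrence order): max() returns the first maximal item it encounters, so reversing the input flips which tied item wins.

```python
from operator import itemgetter

# Counts in first-occurrence order, as a counter built from 'ababab' would hold them.
pairs = [("a", 3), ("b", 3)]

# max() keeps the first maximal item it encounters.
first_max = max(pairs, key=itemgetter(1))
# Reversing the input makes the last tied item the first one encountered.
last_max = max(reversed(pairs), key=itemgetter(1))

print(first_max[0])  # a
print(last_max[0])   # b
```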


wfsdck304#

from collections import Counter

def most_common_character(s):
    char_count = Counter(s)
    return char_count.most_common(1)[0][0]

# Example
string21 = "programming is fun"
common_char = most_common_character(string21)
print("Most Common Character:", common_char)

Output: r
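
Note that Counter(s) on the raw string also counts spaces and treats upper and lower case as different characters, which the question asks to avoid. A minimal sketch of normalizing the input first, assuming non-empty input; the name normalized_most_common is hypothetical. It still does not implement the question's tie rule, since most_common() orders equal counts by first occurrence on Python 3.7 and later.

```python
from collections import Counter


def normalized_most_common(s):
    # Hypothetical helper: lowercase and remove whitespace before counting.
    # Assumes the cleaned string is non-empty; otherwise [0] raises IndexError.
    cleaned = "".join(s.lower().split())
    return Counter(cleaned).most_common(1)[0][0]


print(normalized_most_common("The cosmos is infinite"))  # i
```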
