pandas: create a new column from another column by converting its values to float

bksxznpy  posted 6 months ago in Other
```python
import pandas as pd
pd.set_option('display.max_colwidth', None)
jeopardy = pd.read_csv('jeopardy.csv')
jeopardy.rename(columns=lambda x: x.strip(), inplace=True)

print(jeopardy.head())
```
This is the output:

When I try to create the new column, I get an error that doesn't make sense to me.

```python
jeopardy["Value_float"] = jeopardy['Value'].apply(lambda x: float(x[1:].replace(',',"")))
```


When I run this code, I get this error:

```
TypeError: 'float' object is not subscriptable
```


Why does this happen?

q3aa0525  1#

Try this:

```python
import pandas as pd

pd.set_option('display.max_colwidth', None)

jeopardy = pd.DataFrame(
    [
        [4680, "2004-12-31", "Jeopardy!", "HISTORY", "$200", "For the last 8 years of his life, Galileo was under house arrest for espousing this man's theory", "Copernicus"],
        [4680, "2004-12-31", "Jeopardy!", "ESPN's TOP ALL-TIME ATHLETES", "$200", "No. 2: 1912 Olympian; football star at Carlisle Indian School; 6 MLB seasons with the Reds, Giants & Braves", "Jim Thorpe"],
        [4680, "2004-12-31", "Jeopardy!", "EVERYBODY TALKS ABOUT IT...", "$200", "The city of Yuma in this state has a record average of 4,055 hours of sunshine each year", "Arizona"],
        [4680, "2004-12-31", "Jeopardy!", "THE COMPANY LINE", "$1,300.50", 'In 1963, live on "The Art Linkletter Show", this company served its billionth burger', "McDonald's"],
        [4680, "2004-12-31", "Jeopardy!", "EPITAPHS & TRIBUTES", "$ 1,200 ", "Signer of the Dec. of Indep., framer of the Constitution of Mass., second President of the United States", "John Adams"],
    ],
    columns=["Show Number", "Air Date", "Round", "Category", "Value", "Question", "Answer"]
)
jeopardy.rename(columns=lambda x: x.strip(), inplace=True)

# Convert values to float by first removing the "$" sign and the "," (thousands separator).
# Then call `pandas.to_numeric()` with `errors="coerce"` to convert the values to numeric;
# values that fail to convert become `NaN`.
jeopardy["Value_float"] = pd.to_numeric(jeopardy['Value'].str.replace("$", "", regex=False).str.replace(",", "", regex=False), errors="coerce")

# Find the values that failed to convert to float
errors = jeopardy[jeopardy["Value_float"].isna()]

jeopardy
```

Output:

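| Show Number | Air Date | Round | Category | Value | Question | Answer | Value_float |
| --|--|--|--|--|--|--|--|
| 4680 | 2004-12-31 | Jeopardy! | HISTORY | $200 | For the last 8 years of his life, Galileo was under house arrest for espousing this man's theory | Copernicus | 200 |
| 4680 | 2004-12-31 | Jeopardy! | ESPN's TOP ALL-TIME ATHLETES | $200 | No. 2: 1912 Olympian; football star at Carlisle Indian School; 6 MLB seasons with the Reds, Giants & Braves | Jim Thorpe | 200 |
| 4680 | 2004-12-31 | Jeopardy! | EVERYBODY TALKS ABOUT IT... | $200 | The city of Yuma in this state has a record average of 4,055 hours of sunshine each year | Arizona | 200 |
| 4680 | 2004-12-31 | Jeopardy! | THE COMPANY LINE | $1,300.50 | In 1963, live on "The Art Linkletter Show", this company served its billionth burger | McDonald's | 1300.5 |
| 4680 | 2004-12-31 | Jeopardy! | EPITAPHS & TRIBUTES | $ 1,200 | Signer of the Dec. of Indep., framer of the Constitution of Mass., second President of the United States | John Adams | 1200 |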
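
As for why the original `apply` call fails: one likely cause, assuming the `Value` column in `jeopardy.csv` contains missing entries, is that pandas stores those as `NaN`, which is a Python `float`, so `x[1:]` fails on those cells. The sketch below reproduces the error with a small hypothetical Series (the data and the `strip_dollar` helper are made up for illustration, not taken from the real file) and shows one way to find the offending cells.

```python
import pandas as pd

# Hypothetical sample: the second entry is a missing value, stored as float NaN.
values = pd.Series(["$200", float("nan"), "$1,300.50"])

def strip_dollar(x):
    # Same logic as the lambda in the question: slice off "$", drop commas.
    return float(x[1:].replace(",", ""))

try:
    values.apply(strip_dollar)
except TypeError as exc:
    # Slicing works on strings but not on the float NaN.
    error_message = str(exc)
    print(error_message)

# Locate the cells that are not strings by checking each cell's Python type.
non_strings = values[~values.apply(lambda x: isinstance(x, str))]
print(non_strings)
```

If `non_strings` is non-empty on the real column, that would explain the `TypeError`; the `str.replace` plus `pd.to_numeric` approach above avoids it because the `.str` methods leave missing values as `NaN` instead of raising.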

nkkqxpd9  2#

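Maybe something like this would work? It lets numbers pass through (converting them to float if they aren't floats already), but strips the unwanted characters from strings: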

```python
import re

def to_float(x):
    if isinstance(x, str):
        x = re.sub('[$,%]', '', x)
    return float(x)
```
Example:

```python
>>> v = ['$2,000.00', '12', 1, 1_500, '-$14']
>>> pd.Series(v).apply(to_float)
0    2000.0
1      12.0
2       1.0
3    1500.0
4     -14.0
```
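
To connect this back to the question, here is a small sketch of `to_float` applied to a column that contains a missing value. The sample Series is hypothetical and only assumes that missing entries show up as float `NaN`, as they usually do after `pd.read_csv`.

```python
import re
import pandas as pd

def to_float(x):
    # Copied from the answer above: clean up strings, pass numbers through.
    if isinstance(x, str):
        x = re.sub('[$,%]', '', x)
    return float(x)

# Hypothetical column with one missing value (float NaN) in the middle.
value = pd.Series(["$200", float("nan"), "$ 1,200 "])

# NaN is not a string, so it skips the regex step and float(NaN) stays NaN.
# float() also ignores the leading and trailing spaces left in " 1200 ".
converted = value.apply(to_float)
print(converted)
```

Note that a string that is not numeric after cleaning (for example the literal text `"None"`) would still raise a `ValueError` here, so `pd.to_numeric(..., errors="coerce")` may be the safer choice when the column can contain such entries.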
