Need help simplifying a data frame in R

Posted by eanckbw9 5 months ago in R

I have a data frame that I would like to make easier to read, but I can't think of a way to do it. This is what I have:

             timestamp participant participant_name  name handicap sportsbooks.bookie_key odds     bookie prop_id unique_id
1  2024-01-02T05:39:12           1      Joel Embiid  Over     34.5                 Points -116   pinnacle       1       1.1
2  2024-01-02T05:39:12           1      Joel Embiid Under     34.5                 Points -114   pinnacle       1       1.1
3  2024-01-02T07:00:24           1      Joel Embiid  Over     34.5                 Points -120    fanduel       1       1.1
4  2024-01-02T07:00:24           1      Joel Embiid Under     34.5                 Points -106    fanduel       1       1.1
5  2024-01-02T02:35:05           1      Joel Embiid  Over     34.5                 Points -135 draftkings       1       1.1
6  2024-01-02T02:35:05           1      Joel Embiid Under     34.5                 Points  105 draftkings       1       1.1
7  2024-01-02T04:43:05           1      Joel Embiid  Over     52.5                 Points -110 draftkings       1       1.1
8  2024-01-02T04:43:05           1      Joel Embiid Under     52.5                 Points -120 draftkings       1       1.1
9  2024-01-02T07:26:10           1      Joel Embiid  Over     34.5                 Points -120     betmgm       1       1.1
10 2024-01-02T04:42:14           1      Joel Embiid Under     34.5                 Points -115     betmgm       1       1.1
11 2024-01-02T07:26:12           1      Joel Embiid  Over     33.5                 Points -135   barstool       1       1.1
12 2024-01-02T07:26:12           1      Joel Embiid Under     33.5                 Points  105   barstool       1       1.1
13 2024-01-01T23:16:02           1      Joel Embiid  Over     34.5                 Points -115   barstool       1       1.1
14 2024-01-01T23:16:02           1      Joel Embiid Under     34.5                 Points -115   barstool       1       1.1
15 2024-01-02T04:43:07           1      Joel Embiid  Over     35.5                 Points  100   barstool       1       1.1
16 2024-01-02T04:43:07           1      Joel Embiid Under     35.5                 Points -130   barstool       1       1.1
17 2024-01-01T23:40:17           1      Joel Embiid  Over     52.5                 Points -110   barstool       1       1.1
18 2024-01-01T23:40:17           1      Joel Embiid Under     52.5                 Points -120   barstool       1       1.1

This is a partially filled-in example of how I would like it to look:

      pinnacle   fanduel draftkings betmgm barstool
33.5      <NA>      <NA>       <NA>     NA     <NA>
34.5 -116/-114 -120/-106   -135/105     NA     <NA>
35.5      <NA>      <NA>       <NA>     NA 100/-130
52.5      <NA>      <NA>  -110/-120     NA     <NA>


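My end goal is to put the odds ("Over odds"/"Under odds") into a data frame wherever they exist, and leave the remaining cells as NA.
I really don't know where to start, and this problem has been stuck in my head for a few hours.
Data: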

data = structure(list(
  timestamp = c("2024-01-02T05:39:12", "2024-01-02T05:39:12",
                "2024-01-02T07:00:24", "2024-01-02T07:00:24",
                "2024-01-02T02:35:05", "2024-01-02T02:35:05",
                "2024-01-02T04:43:05", "2024-01-02T04:43:05",
                "2024-01-02T07:26:10", "2024-01-02T04:42:14",
                "2024-01-02T07:26:12", "2024-01-02T07:26:12",
                "2024-01-01T23:16:02", "2024-01-01T23:16:02",
                "2024-01-02T04:43:07", "2024-01-02T04:43:07",
                "2024-01-01T23:40:17", "2024-01-01T23:40:17"),
  participant = c(1L, 1L, 1L, 1L, 1L, 1L,
                  1L, 1L, 1L, 1L, 1L, 1L,
                  1L, 1L, 1L, 1L, 1L, 1L),
  participant_name = c("JoelEmbiid", "JoelEmbiid", "JoelEmbiid",
                       "JoelEmbiid", "JoelEmbiid", "JoelEmbiid",
                       "JoelEmbiid", "JoelEmbiid", "JoelEmbiid",
                       "JoelEmbiid", "JoelEmbiid", "JoelEmbiid",
                       "JoelEmbiid", "JoelEmbiid", "JoelEmbiid",
                       "JoelEmbiid", "JoelEmbiid", "JoelEmbiid"),
  name = c("Over", "Under", "Over", "Under", "Over", "Under",
           "Over", "Under", "Over", "Under", "Over", "Under",
           "Over", "Under", "Over", "Under", "Over", "Under"),
  handicap = c(34.5, 34.5, 34.5, 34.5, 34.5, 34.5,
               52.5, 52.5, 34.5, 34.5, 33.5, 33.5,
               34.5, 34.5, 35.5, 35.5, 52.5, 52.5),
  sportsbooks.bookie_key = c("Points", "Points", "Points",
                             "Points", "Points", "Points",
                             "Points", "Points", "Points",
                             "Points", "Points", "Points",
                             "Points", "Points", "Points",
                             "Points", "Points", "Points"),
  odds = c(-116L, -114L, -120L, -106L, -135L, 105L,
           -110L, -120L, -120L, -115L, -135L, 105L,
           -115L, -115L, 100L, -130L, -110L, -120L),
  bookie = c("pinnacle", "pinnacle", "fanduel", "fanduel",
             "draftkings", "draftkings", "draftkings", "draftkings",
             "betmgm", "betmgm", "barstool", "barstool",
             "barstool", "barstool", "barstool", "barstool",
             "barstool", "barstool"),
  prop_id = c(1L, 1L, 1L, 1L, 1L, 1L,
              1L, 1L, 1L, 1L, 1L, 1L,
              1L, 1L, 1L, 1L, 1L, 1L),
  unique_id = c(1.1, 1.1, 1.1, 1.1, 1.1, 1.1,
                1.1, 1.1, 1.1, 1.1, 1.1, 1.1,
                1.1, 1.1, 1.1, 1.1, 1.1, 1.1)
), class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6",
                                       "7", "8", "9", "10", "11", "12",
                                       "13", "14", "15", "16", "17", "18"))

Answer 1 (6kkfgxo0)

I recreated your data frame with only the necessary columns:

# Reduced copy of the question's data: first seven rows, needed columns only
aa <- data.frame(handicap = c(34.5, 34.5, 34.5, 34.5, 34.5, 34.5, 52.5),
                 odds = c(-116, -114, -120, -106, -135, 105, -110),
                 bookie = c("pinnacle", "pinnacle", "fanduel", "fanduel", "draftkings", "draftkings", "draftkings"))

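First, I group by bookie and handicap and paste the odds values of each group together, keeping only the unique values. The last step is to pivot wider, taking the column names from the bookie column and the cell values from the pasted odds: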

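library(dplyr)
library(tidyr)

aa %>%
  group_by(bookie, handicap) %>%
  # collapse the odds of each bookie/handicap group into a single string
  summarize(odds = paste(unique(odds), collapse = " / "), .groups = "drop") %>%
  pivot_wider(names_from = bookie, values_from = odds)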
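One caveat with pasting `unique(odds)`: when the Over and Under odds are equal (as for barstool at 34.5 in the question's data), the pair collapses to a single value. A minimal sketch of one way around this, assuming each bookie/handicap pair has at most one Over row and one Under row, is to spread Over and Under into their own columns first. The data frame `props` and the column `label` are hypothetical names used only for this illustration:

```r
library(dplyr)
library(tidyr)

# Hypothetical reduced copy of the question's data (only the columns used here)
props <- data.frame(
  name     = rep(c("Over", "Under"), times = 4),
  handicap = c(34.5, 34.5, 34.5, 34.5, 35.5, 35.5, 52.5, 52.5),
  odds     = c(-116, -114, -115, -115, 100, -130, -110, -120),
  bookie   = rep(c("pinnacle", "barstool", "barstool", "draftkings"), each = 2)
)

wide <- props %>%
  # one row per bookie/handicap, with separate Over and Under columns
  pivot_wider(id_cols = c(bookie, handicap),
              names_from = name, values_from = odds) %>%
  # build the "over/under" label explicitly, so equal odds are not merged
  mutate(label = paste(Over, Under, sep = "/")) %>%
  select(bookie, handicap, label) %>%
  # one column per bookie; missing combinations are filled with NA
  pivot_wider(names_from = bookie, values_from = label) %>%
  arrange(handicap)

print(wide)
```

If a pair is missing its Over or Under row, `paste()` would write the literal text "NA" into the label, so such rows may need separate handling.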



Answer 2 (knpiaxh1)

If you want to stick with base R, one possible solution is to aggregate and then reshape the data:

new = 
  with(data, aggregate(x = list(OR = odds), 
                       by = list(handicap = handicap, bookie = bookie), 
                       FUN = \(o) paste(unique(o), collapse = "/"), drop = FALSE)) |>
  reshape(idvar = "handicap", timevar = "bookie",  direction = "wide")


> new
  handicap OR.barstool OR.betmgm OR.draftkings OR.fanduel OR.pinnacle
1     33.5    -135/105      <NA>          <NA>       <NA>        <NA>
2     34.5        -115 -120/-115      -135/105  -120/-106   -116/-114
3     35.5    100/-130      <NA>          <NA>       <NA>        <NA>
4     52.5   -110/-120      <NA>     -110/-120       <NA>        <NA>


Moving a column into the row names is generally not recommended, but it can be done like this:

# strip the literal "OR." prefix (the dot is escaped so it is not a regex wildcard)
colnames(new) = sub("^OR\\.", "", colnames(new))
rownames(new) = new$handicap
new$handicap = NULL


Then:

> new
      barstool    betmgm draftkings   fanduel  pinnacle
33.5  -135/105      <NA>       <NA>      <NA>      <NA>
34.5      -115 -120/-115   -135/105 -120/-106 -116/-114
35.5  100/-130      <NA>       <NA>      <NA>      <NA>
52.5 -110/-120      <NA>  -110/-120      <NA>      <NA>
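A layout with the handicaps as row names can probably also be reached in one step with `tapply()`, which cross-tabulates two grouping vectors and returns a matrix. The following is only a sketch on a hypothetical reduced data frame `props`, not part of the answer above. It pastes the odds without `unique()`, so equal Over/Under odds stay visible as a pair:

```r
# Hypothetical reduced copy of the question's data (only the columns used here)
props <- data.frame(
  handicap = c(34.5, 34.5, 34.5, 34.5, 35.5, 35.5, 52.5, 52.5),
  odds     = c(-116L, -114L, -115L, -115L, 100L, -130L, -110L, -120L),
  bookie   = rep(c("pinnacle", "barstool", "barstool", "draftkings"), each = 2)
)

# Cross-tabulate handicap (rows) by bookie (columns). Each cell holds the
# group's odds pasted in their original row order; empty cells become NA.
odds_table <- with(props,
                   tapply(odds, list(handicap, bookie), paste, collapse = "/"))

print(odds_table)
```

The result is a character matrix rather than a data frame; `as.data.frame(odds_table)` converts it while keeping the handicaps as row names.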
