scipy: extracting leaf colors from a dendrogram

okxuctiv, posted 8 months ago in Other

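I cannot get the leaf colors from the scipy dendrogram dictionary. As stated in the documentation and in a GitHub issue, the color_list key in the dendrogram dictionary refers to the links, not to the leaves. It would be nice to have another key that refers to the leaves, because you sometimes need it to color other kinds of plots, such as the scatter plot in the example below.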

import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

# DATA EXAMPLE
x = np.array([[ 5, 3],
              [10,15],
              [15,12],
              [24,10],
              [30,30],
              [85,70],
              [71,80]])

# DENDROGRAM
plt.figure()
plt.subplot(121)
z = linkage(x, 'single')
d = dendrogram(z)

# COLORED PLOT
# This is what I would like to achieve. Colors are assigned manually by looking
# at the dendrogram, because I failed to get it from d['color_list'] (it refers 
# to links, not observations)
plt.subplot(122)
points = d['leaves']
colors = ['r','r','g','g','g','g','g']
for point, color in zip(points, colors):
    plt.plot(x[point, 0], x[point, 1], 'o', color=color)

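In this example, assigning the colors manually looks easy, but I am dealing with huge datasets. So until this new feature (leaf colors) is available in the dictionary, I am trying to infer it from the information the dictionary currently contains, but so far I have run out of ideas. Can anyone help me?
Thanks.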

aoyhnmkz1#

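As of scipy 1.7.1, this feature has been implemented: the dendrogram function also returns an entry 'leaves_color_list' in its output dictionary, which makes this task easy.
Here is working code for the OP's example (see the "NEW CODE" part at the end).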

import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

# DATA EXAMPLE
x = np.array([[ 5, 3],
              [10,15],
              [15,12],
              [24,10],
              [30,30],
              [85,70],
              [71,80]])

# DENDROGRAM
plt.figure()
plt.subplot(121)
z = linkage(x, 'single')
d = dendrogram(z)

# COLORED PLOT
# d['leaves_color_list'][k] is the color of the k-th leaf, in the same order
# as d['leaves'], so both lists can be passed to scatter together.
plt.subplot(122)

# NEW CODE
plt.scatter(x[d['leaves'], 0], x[d['leaves'], 1], color=d['leaves_color_list'])
plt.show()
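
Because 'leaves_color_list' follows the dendrogram's leaf order rather than the row order of x, it can help to reorder the colors once and reuse them in any other plot. The following is a minimal sketch, assuming a scipy version that provides 'leaves_color_list'; the name point_colors is hypothetical, and no_plot=True is used only so that nothing is drawn.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

x = np.array([[5, 3], [10, 15], [15, 12], [24, 10],
              [30, 30], [85, 70], [71, 80]])

z = linkage(x, 'single')
# no_plot=True builds the result dictionary without drawing the tree
d = dendrogram(z, no_plot=True)

# point_colors (hypothetical name): point_colors[i] is the color of row i of x.
# d['leaves'][k] is the row index of the k-th leaf, and
# d['leaves_color_list'][k] is the color of that same leaf.
point_colors = [None] * len(x)
for leaf, color in zip(d['leaves'], d['leaves_color_list']):
    point_colors[leaf] = color
```

With the colors in row order, a call such as plt.scatter(x[:, 0], x[:, 1], color=point_colors) no longer needs to index x by d['leaves'].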

von4xj4u2#

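The following approach seems to work. The dictionary returned by dendrogram contains 'color_list' with the colors of the links, while 'icoord' and 'dcoord' hold the x and y coordinates, respectively, used to draw those links. Wherever a link starts at a leaf, its x-position is 5, 15, 25, ... Testing for these x-positions therefore leads from a link back to the corresponding point, and lets us assign the link's color to that point.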

import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

# DATA EXAMPLE
x = np.random.uniform(0, 10, (20, 2))

# DENDROGRAM
plt.figure()
plt.subplot(121)
z = linkage(x, 'single')
d = dendrogram(z)
plt.yticks([])

# COLORED PLOT
plt.subplot(122)
points = d['leaves']
colors = ['none'] * len(points)
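for xs, ys, c in zip(d['icoord'], d['dcoord'], d['color_list']):
    for xi, yi in zip(xs, ys):
        # A leaf sits at height 0 and at x = 5, 15, 25, ...; the height test
        # skips internal nodes whose x also happens to end in 5 (e.g. 35).
        if yi == 0 and xi % 10 == 5:
            colors[(int(xi) - 5) // 10] = c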
for point, color in zip(points, colors):
    plt.plot(x[point, 0], x[point, 1], 'o', color=color)
    plt.text(x[point, 0], x[point, 1], f' {point}')
plt.show()

PS: This post about matching points with their clusters may also be relevant.
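
As a rough illustration of matching points with their clusters without reading colors at all, the linkage can be cut into flat clusters with fcluster. This is only a sketch, under the assumption that the dendrogram was drawn with its default color_threshold, which the scipy documentation gives as 0.7*max(Z[:,2]); the names threshold and labels are chosen here for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

x = np.array([[5, 3], [10, 15], [15, 12], [24, 10],
              [30, 30], [85, 70], [71, 80]])

z = linkage(x, 'single')

# Assumption: mirror dendrogram's default color_threshold
threshold = 0.7 * z[:, 2].max()

# labels[i] is the flat-cluster id of row i of x (ids start at 1)
labels = fcluster(z, t=threshold, criterion='distance')
```

The labels are already in row order, so they can be passed as c=labels to plt.scatter. One difference to keep in mind: a point that only merges above the threshold becomes its own cluster here, whereas the dendrogram draws its link in the above-threshold color, and merges exactly at the threshold may be treated differently by the two functions.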
