Problem description and background
Task 2 - Multivariate Statistical Analysis
Generate 1000 5-dimensional data points from the multivariate normal distribution (with a mean of your choice and a covariance matrix that is not just diagonal).
Perform PCA to obtain the first 3 principal components (PCs). Visualize them.
Change the covariance structure of your data (put a different covariance matrix).
Visualize again.
Try the same with a starting dimension of 10.
If you like, you may take real data sets instead.
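The first two steps of the task could be sketched roughly as follows. This is only a minimal sketch under assumptions: the mean vector `mu`, the AR(1)-style covariance `Sigma` (entries `0.7^|i - j|`), and the seed are arbitrary choices made for illustration, not values given in the assignment. It uses `mvrnorm()` from MASS to draw correlated samples and base R `prcomp()` for the PCA.

```r
library(MASS)  # provides mvrnorm()

set.seed(42)   # arbitrary seed, only for reproducibility

n  <- 1000
mu <- c(0, 2, -1, 5, 3)   # assumed mean vector (any choice works)

# Assumed covariance: Sigma[i, j] = 0.7^|i - j|.
# Symmetric, positive definite, with non-zero off-diagonal entries.
Sigma <- 0.7^abs(outer(1:5, 1:5, "-"))

# Draw 1000 observations from N(mu, Sigma); result is a 1000 x 5 matrix
X <- mvrnorm(n, mu = mu, Sigma = Sigma)
colnames(X) <- paste0("V", 1:5)

# PCA on the centered (unscaled) data
pca    <- prcomp(X, center = TRUE, scale. = FALSE)
scores <- pca$x[, 1:3]    # scores on the first three principal components

summary(pca)              # standard deviation and variance share per PC

# Pairwise scatter plots of PC1, PC2, PC3
pairs(scores, pch = 20, col = rgb(0, 0, 1, 0.3),
      main = "First three principal components")
```

Passing a different `Sigma` to the same code is enough to see how the PC scatter plots change with the covariance structure.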
Relevant code
library(MASS)
library(Matrix)
library(ggplot2)
library(tidyverse)
library(shape)
library(RColorBrewer)
library(ggpubr)
N <- 10000  # number of points per class

# rnorm() is univariate, so each class mean is a single number
# (the same value is used for both x and y).
mean1 <- -100
mean2 <- 100
mean3 <- 0
mean4 <- 30

# x and y are drawn independently within each class
x1 <- rnorm(N, mean = mean1, sd = 3)
y1 <- rnorm(N, mean = mean1, sd = 6)
x2 <- rnorm(N, mean = mean2, sd = 4)
y2 <- rnorm(N, mean = mean2, sd = 5)
x3 <- rnorm(N, mean = mean3, sd = 5)
y3 <- rnorm(N, mean = mean3, sd = 4)
x4 <- rnorm(N, mean = mean4, sd = 6)
y4 <- rnorm(N, mean = mean4, sd = 3)

# Stack the four classes into one long data frame
data2 <- data.frame(
  x = c(x1, x2, x3, x4),
  y = c(y1, y2, y3, y4),
  class = rep(c("A", "B", "C", "D"), each = N)
)
# Scatter plot of x vs y with marginal density plots per class
ggscatterhist(
  data2, x = "x", y = "y",
  shape = 21, color = "grey", fill = "class", size = 1, alpha = 0.1,
  palette = c("#FFCA99", "#75D3FF", "#FC4E07", "#264DFF", "#A50021"),
  margin.plot = "density",
  margin.params = list(fill = "class", color = "black", size = 1),
  legend = c(0.9, 0.15),
  ggtheme = theme_minimal()
)
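Note that the script above draws x and y independently with `rnorm()`, so within each class the covariance matrix is diagonal. For the remaining steps of the task (a different covariance structure and a starting dimension of 10), something along these lines might work. It is a sketch under assumptions: `pca_from_sigma()` and `var_share()` are hypothetical helper names introduced here, and both covariance matrices are arbitrary illustrative choices.

```r
library(MASS)  # provides mvrnorm()

# Hypothetical helper: simulate n points from N(0, Sigma) and run PCA on them
pca_from_sigma <- function(Sigma, n = 1000) {
  X <- mvrnorm(n, mu = rep(0, nrow(Sigma)), Sigma = Sigma)
  prcomp(X)
}

# Hypothetical helper: share of total variance captured by the first three PCs
var_share <- function(pca) sum(pca$sdev[1:3]^2) / sum(pca$sdev^2)

set.seed(1)  # arbitrary seed, only for reproducibility
p <- 10      # starting dimension

# Structure A (assumed): every pair of variables has correlation 0.8
sigma_a <- matrix(0.8, p, p)
diag(sigma_a) <- 1

# Structure B (assumed): t(A) %*% A plus a small ridge,
# which is positive definite by construction
A <- matrix(rnorm(p * p), p, p)
sigma_b <- crossprod(A) + diag(0.1, p)

pca_a <- pca_from_sigma(sigma_a)
pca_b <- pca_from_sigma(sigma_b)

# Compare how much variance the first three PCs explain under each structure
print(c(A = var_share(pca_a), B = var_share(pca_b)))

# Side-by-side scatter plots of PC1 vs PC2
op <- par(mfrow = c(1, 2))
plot(pca_a$x[, 1:2], pch = 20, main = "Equicorrelated")
plot(pca_b$x[, 1:2], pch = 20, main = "Random positive definite")
par(op)
```

With the equicorrelated matrix most of the variance should fall on the first PC, while the random matrix typically spreads it over several components, which makes the effect of the covariance structure visible in the plots.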