  • 1 answer · 19 views

PyCharm crashes the program as soon as it reaches the face-recognition step, exiting with "Process finished with exit code -1073740791 (0xC0000409)". Could someone take a look and tell me how I should change my configuration?

Acceptance rate 66.7% · 29 days ago
  • 0 answers · 7 views

import torch
import numpy as np

wav_data1 = torch.tensor(np.arange(0, 102.4, 0.01))

How can I split this wav_data1 using a window of 256 with an overlap of 128, turning it into 32*128?
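
One way to frame a 1-D signal with a fixed window and overlap is shown below with NumPy's `sliding_window_view` (`torch.Tensor.unfold(0, 256, 128)` behaves the same way on a tensor). Note that 10,240 samples with a 256-sample window and a hop of 128 yield 79 frames of 256, not 32×128, so the exact 32*128 shape would need different window/hop values:

```python
import numpy as np

# 10,240 samples, matching np.arange(0, 102.4, 0.01)
signal = np.arange(0, 102.4, 0.01)

window = 256
hop = 128  # hop = window - overlap = 256 - 128

# All length-256 windows, then keep every 128th start position.
frames = np.lib.stride_tricks.sliding_window_view(signal, window)[::hop]
print(frames.shape)  # (79, 256)
```

The second half of each frame equals the first half of the next frame, which is what a 128-sample overlap means.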

  • 2 answers · 17 views

I'm new to CNNs and want to run "Wide-Area Crowd Counting via Ground-Plane Density Maps and Multi-View Fusion CNNs". The repository is https://github.com/zqyq/Wide-Area-Crowd-Counting_CVPR2019 . Could anyone walk me through the steps? Thanks.

  • 0 answers · 14 views

I want to use sklearn's random forest classifier to auto-classify land cover for a whole county. I put the training set in one matrix and the test set in another, but the data is so large that fit() runs out of memory. I'd like to slice the data or train in batches (I've randomly generated N 512×512-pixel tiles from the county image and want to train on them batch by batch, as in deep learning), but I don't know what feature supports this. I did try warm_start, but its limitation is that it only works when every batch contains the same set of class labels, and I can't guarantee that each of the N 512×512 tiles contains every land-cover class.
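
Random forests in sklearn have no true incremental API (no `partial_fit`), but estimators that do support `partial_fit`, such as `SGDClassifier`, let you declare the full label set up front via `classes=`, which directly removes the "every batch must contain every class" restriction. A sketch with hypothetical pixel features standing in for the 512×512 tiles:

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
all_classes = np.array([0, 1, 2])   # the global land-cover label set
clf = SGDClassifier(random_state=0)

for batch in range(5):              # stand-in for the N 512x512 tiles
    X = rng.normal(size=(200, 4))   # hypothetical per-pixel features
    # The first batch deliberately lacks class 2 -- that is fine here.
    y = rng.integers(0, 2 if batch == 0 else 3, size=200)
    # Declaring classes= means a batch need not contain every label.
    clf.partial_fit(X, y, classes=all_classes)

print(clf.predict(rng.normal(size=(3, 4))))
```

If a random forest specifically is required, another (cruder) workaround sometimes used is training one small forest per tile and concatenating their `estimators_`, but that only works cleanly when every tile has seen the same classes.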

  • 0 answers · 9 views

Below are the stargan-v2 installation requirements; above, the Python version shown is 3.6.7. When I install the packages in the figure below, I get a Python version error. Why is that? The error is as follows:

  • 2 answers · 20 views

What I really want to know is how to write the general form SARIMA(p, d, q)(P, D, Q)s, specifically how to enter it in the Equation Estimation dialog in EViews. As in the red box in the figure (I wrote SARIMA(1, 0, 1)(0, 0, 1)42, but I'm not sure it's right); flow is the name of the series.

  • 2 answers · 109 views

import numpy as np
from tensorflow.keras.layers import (Input, ZeroPadding2D, Conv2D, BatchNormalization,
                                     MaxPooling2D, AveragePooling2D, Flatten, Dense, add)
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import SGD

seed = 7
np.random.seed(seed)  # make the random numbers generated later reproducible

def Conv2d_BN(x, nb_filter, kernel_size, strides=(1, 1), padding='same', name=None):
    if name is not None:
        bn_name = name + '_bn'
        conv_name = name + '_conv'
    else:
        bn_name = None
        conv_name = None
    x = Conv2D(nb_filter, kernel_size, padding=padding, strides=strides, activation='relu', name=conv_name)(x)
    x = BatchNormalization(axis=3, name=bn_name)(x)
    return x

def Conv_Block(inpt, nb_filter, kernel_size, strides=(1, 1), with_conv_shortcut=False):
    x = Conv2d_BN(inpt, nb_filter=nb_filter[0], kernel_size=(1, 1), strides=strides, padding='same')
    x = Conv2d_BN(x, nb_filter=nb_filter[1], kernel_size=(3, 3), padding='same')
    x = Conv2d_BN(x, nb_filter=nb_filter[2], kernel_size=(1, 1), padding='same')
    if with_conv_shortcut:
        shortcut = Conv2d_BN(inpt, nb_filter=nb_filter[2], strides=strides, kernel_size=kernel_size)
        x = add([x, shortcut])
        return x
    else:
        x = add([x, inpt])
        return x

inpt = Input(shape=(224, 224, 3))  # expects batches of 224*224*3 inputs
x = ZeroPadding2D((3, 3))(inpt)
# NOTE: kernel_size is (95, 95) here, but the comment says 7*7 -- the standard
# ResNet stem uses a 7x7 convolution, so (95, 95) looks like a typo.
x = Conv2d_BN(x, nb_filter=64, kernel_size=(95, 95), strides=(2, 2), padding='valid')  # learn 64 feature maps with 7*7 kernels
x = MaxPooling2D(pool_size=(3, 3), strides=(2, 2), padding='same')(x)
x = Conv_Block(x, nb_filter=[64, 64, 256], kernel_size=(3, 3), strides=(1, 1), with_conv_shortcut=True)
x = Conv_Block(x, nb_filter=[64, 64, 256], kernel_size=(3, 3))
x = Conv_Block(x, nb_filter=[64, 64, 256], kernel_size=(3, 3))
x = Conv_Block(x, nb_filter=[128, 128, 512], kernel_size=(3, 3), strides=(2, 2), with_conv_shortcut=True)
x = Conv_Block(x, nb_filter=[128, 128, 512], kernel_size=(3, 3))
x = Conv_Block(x, nb_filter=[128, 128, 512], kernel_size=(3, 3))
x = Conv_Block(x, nb_filter=[128, 128, 512], kernel_size=(3, 3))
x = Conv_Block(x, nb_filter=[256, 256, 1024], kernel_size=(3, 3), strides=(2, 2), with_conv_shortcut=True)
x = Conv_Block(x, nb_filter=[256, 256, 1024], kernel_size=(3, 3))
x = Conv_Block(x, nb_filter=[256, 256, 1024], kernel_size=(3, 3))
x = Conv_Block(x, nb_filter=[256, 256, 1024], kernel_size=(3, 3))
x = Conv_Block(x, nb_filter=[256, 256, 1024], kernel_size=(3, 3))
x = Conv_Block(x, nb_filter=[256, 256, 1024], kernel_size=(3, 3))
x = Conv_Block(x, nb_filter=[512, 512, 2048], kernel_size=(3, 3), strides=(2, 2), with_conv_shortcut=True)
x = Conv_Block(x, nb_filter=[512, 512, 2048], kernel_size=(3, 3))
x = Conv_Block(x, nb_filter=[512, 512, 2048], kernel_size=(3, 3))
x = AveragePooling2D(pool_size=(5, 5))(x)
# x = Dropout(0.9)(x)
x = Flatten()(x)
x = Dense(25, activation='softmax')(x)
model = Model(inputs=inpt, outputs=x)
sgd = SGD(decay=0.0001, momentum=0.9)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
model.summary()

The network is as above. Merely running up to this point already fills more than 11 GB of my 12 GB of GPU memory. The next step,

print('Training ------------')
# training the model; add shuffle=True, or it may overfit. As long as validation
# accuracy rises along with training accuracy, the model is fine.
# model_load('my_model_resnet.h5')
model.fit(X_train, y_train, validation_split=0.2, shuffle=True, epochs=50, batch_size=64)

fails with:

NotFoundError: No algorithm worked!
  [[node model/conv2d/Relu (defined at <ipython-input-17-25b86b2dbd84>:5) ]] [Op:__inference_train_function_12948]
Function call stack: train_function

This is on Ubuntu with an RTX 3060; the GPU driver, CUDA, and cuDNN are all installed. Why does this happen, and why is GPU memory full before training even starts? Any help would be appreciated.
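
One thing worth knowing here: TensorFlow reserves essentially all GPU memory at startup by default, so "memory full before training" does not by itself mean the model needs 11 GB, and "No algorithm worked" is a common symptom of cuDNN failing to allocate workspace. A sketch of the usual mitigation (run before building any model); no expected output is shown since the effect depends on the GPU:

```python
import tensorflow as tf

# Ask TensorFlow to allocate GPU memory on demand instead of
# reserving the whole card up front.
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)
```

If the error persists after this, lowering batch_size (e.g. from 64 to 16) is the next thing to try.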

  • 0 answers · 6 views

Does anyone know how to swap your own dataset into stargan-v2? If you can do it, quote me a price.

  • 3 answers · 23 views

Random forest and similar models, called straightforwardly from sklearn with the same parameters and the same dataset, started giving much worse results one day. I've been searching for a long time and can't find the cause.

def Preprocess(CSV1, CSV2, shuffle=True, SDAE=False, MMN=False, Smote=False):
    df1 = pd.read_csv(CSV1)        # first dataset
    X_train = df1.iloc[:, :-1]     # all columns except the last
    y_train = df1.iloc[:, -1]      # the last column
    df2 = pd.read_csv(CSV2)        # second dataset
    X_test = df2.iloc[:, :-1]      # all columns except the last
    y_test = df2.iloc[:, -1]       # the last column
    if Smote:
        smo = SMOTE(sampling_strategy='auto', random_state=10)
        X_train, y_train = smo.fit_resample(X_train, y_train)
    if MMN:
        # scale features to [0, 1] directly with the library function
        X_train = preprocessing.minmax_scale(X_train, feature_range=(0, 1), axis=0, copy=True)
        X_test = preprocessing.minmax_scale(X_test, feature_range=(0, 1), axis=0, copy=True)
    if SDAE:
        X_train, X_test = score.SDAE(X_train, X_test)
    return X_train, X_test, y_train, y_test

def metric_standards(y_test, y_predict, y_0=None, cal_weight=None):
    # unweighted metrics
    accuracy = metrics.accuracy_score(y_test, y_predict)
    precision = metrics.precision_score(y_test, y_predict, zero_division="warn")
    recall = metrics.recall_score(y_test, y_predict)
    f1_scroe = metrics.f1_score(y_test, y_predict)
    if y_0 is not None:
        false_positive_rate, true_positive_rate, thresholds = metrics.roc_curve(y_test, y_0)
        roc_auc = metrics.auc(false_positive_rate, true_positive_rate)  # AUC
        y_pred_class = true_positive_rate > thresholds
    else:
        roc_auc = 0
    return accuracy, precision, recall, f1_scroe, roc_auc

def ranforest(X_train, X_test, y_train, y_test, n_estimators=100, random_state=66, n_jobs=-1):
    # NOTE: random_state is printed below but never passed to the classifier,
    # so every run builds a different forest -- a likely cause of results
    # suddenly changing. Pass random_state=random_state to make runs reproducible.
    cls = RandomForestClassifier(n_estimators=96, max_depth=17, min_samples_split=43,
                                 min_samples_leaf=5, n_jobs=n_jobs)
    # SeleFea(cls, X_train, y_train)
    cls.fit(X_train, y_train)
    y_pre_proba = cls.predict_proba(X_test)
    y_predict = cls.predict(X_test)
    y_0 = list(y_pre_proba[:, 1])
    print('n_estimators = {}, random_state = {}'.format(n_estimators, random_state))
    accuracy, precision, recall, f1_scroe, roc_auc = metric_standards(y_test, y_predict, y_0)
    return accuracy, precision, recall, f1_scroe, roc_auc

  • 1 answer · 28 views

For example, with multiple similar constraints: for each of the 5 numbers i in a list, I need f(1, 2, 3, 4, 5) >= 0. Can I build the constraints with a loop inside, or in some other way?
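
Assuming this is about scipy.optimize.minimize (the question doesn't name the library): yes, a list of constraint dicts can be built in a loop or comprehension. The `i=i` default argument is essential — without it, every lambda would close over the final value of i.

```python
import numpy as np
from scipy.optimize import minimize

# Toy problem: minimize sum(x^2) subject to x[i] >= 1 for each i in 0..4.
cons = [{'type': 'ineq', 'fun': lambda x, i=i: x[i] - 1.0} for i in range(5)]

res = minimize(lambda x: np.sum(x**2), x0=np.zeros(5), constraints=cons)
print(res.x)  # each component driven to ~1.0
```

The same pattern works for any f: replace `x[i] - 1.0` with `f(x, i)`, where `'ineq'` means the returned value must be non-negative.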

  • 0 answers · 4 views

I saw this in the machine-learning "red book" and didn't understand it; hoping someone can explain.

  • 4 answers · 39 views

Say I have a set of data: 60 20 36 78 95 35 10 6 68 63 82 30. I want to select the values greater than 50 and output the row and column each one sits in. How do I do that?
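
Assuming the data is a NumPy array (reshaped here to 3×4 — an assumption, since the question doesn't give the shape), np.argwhere returns the (row, column) index of every element matching a condition:

```python
import numpy as np

data = np.array([60, 20, 36, 78, 95, 35, 10, 6, 68, 63, 82, 30]).reshape(3, 4)

# (row, col) of every element greater than 50
idx = np.argwhere(data > 50)
for r, c in idx:
    print(f"row {r}, col {c}: {data[r, c]}")
```

`np.nonzero(data > 50)` gives the same information as two separate row/column arrays, which is convenient for fancy indexing.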

  • 5 answers · 249 views

import random
import numpy as np
import sympy

arry11 = np.zeros((20, 20))

def func1_V2(data_distance):  # compute V2
    V1 = np.zeros(20)
    W = np.zeros(20)
    a = np.zeros((20, 20))
    b = np.zeros((20, 20))
    c = np.zeros(20)
    V2 = np.zeros(20)
    for i in range(20):
        for j in range(20):
            a[i][j] = 0.2061 * sympy.exp((-10.229) * (data_distance[i][j]) - 1)
            V1[i] = V1[i] + a[i][j]
    for i in range(20):
        for j in range(20):
            b[i][j] = 1.79 * sympy.exp((-2 * 4.036) * (data_distance[i][j]) - 1)
            V2[i] = V2[i] + b[i][j]
    return V2

def func_distance(arry11):  # distance between atoms
    data_distance = np.zeros((20, 20))
    for i in range(20):
        # NOTE: j in range(20 - i) fills data_distance[i][j] for j < 20-i,
        # which is not the upper triangle; j probably should run over range(i, 20).
        for j in range(20 - i):
            data_distance[i][j] = sympy.sqrt((arry11[i][0] - arry11[j][0]) ** 2
                                             + (arry11[i][1] - arry11[j][1]) ** 2
                                             + (arry11[i][2] - arry11[j][2]) ** 2)
    return data_distance

def func_gradient(data_distance):  # compute the gradient
    V2 = func1_V2(data_distance)
    r1 = np.zeros((20, 20))
    r2 = np.zeros((20, 20))
    r3 = np.zeros((20, 20))
    for i in range(20):
        for j in range(20):
            r1[i][j] = 0.2061 * (-10.229) * sympy.exp((-10.229) * (data_distance[i][j] - 1))
            r2[i][j] = 0.5 * V2[i] * 1.79 * (-2 * 4.036) * sympy.exp((-2 * 4.036) * (data_distance[i][j] - 1))
            r3[i][j] = r1[i][j] - r2[i][j]
    return r3

def func1_Gupta(data_distance):  # compute the Gupta potential
    V1 = np.zeros(20)
    a = np.zeros((20, 20))
    b = np.zeros((20, 20))
    c = np.zeros(20)
    V2 = np.zeros(20)
    V3 = np.zeros(20)
    V_W = 0
    for i in range(20):
        for j in range(20):
            a[i][j] = 0.2061 * sympy.exp((-10.229) * (data_distance[i][j]) - 1)
            V1[i] = V1[i] + a[i][j]
    for i in range(20):
        for j in range(20):
            b[i][j] = 1.79 * sympy.exp((-2 * 4.036) * (data_distance[i][j]) - 1)
            V2[i] = V2[i] + b[i][j]
    for i in range(20):
        V3[i] = sympy.sqrt(V2[i])
    for i in range(20):
        c[i] = V1[i] - V3[i]
    for i in range(20):
        V_W = V_W + c[i]
    return V_W

def DFP(arry11):  # local minimization
    a = sympy.symbols("a")
    r = np.zeros((20, 20))
    r1 = np.zeros((20, 20))
    H = np.eye(20)
    Nolmprove = 0
    data_distance1 = func_distance(arry11)
    while True:
        r1 = func_gradient(data_distance1)
        p = -np.dot(H, r1)
        # NOTE: adding the sympy symbol `a` turns arry11 into an array of symbolic
        # expressions; this is the likely source of the "type error" when the
        # caller later uses the returned array as numbers.
        for i in range(20):
            for j in range(3):
                arry11[i][j] = arry11[i][j] + a
        alpha = func1_Gupta(arry11)
        difyL_a = sympy.diff(alpha, a)
        aa = sympy.solve([difyL_a], [a])
        a = aa.get(a)
        arry11_1 = arry11
        arry11 = arry11 + np.dot(a, p)
        arry11_2 = arry11
        data_distance2 = func_distance(arry11)
        r2 = func_gradient(data_distance2)
        for i in range(20):
            for j in range(20):
                r[i][j] = r2[i][j] - r1[i][j]
        arry11_3 = arry11_2 - arry11_1
        # the original wrote `transpose(r)` without the np. prefix, a NameError
        H1 = (np.dot(arry11_3, np.transpose(arry11_3)) / np.dot(np.transpose(arry11_3), r)
              - np.dot(H, np.dot(r, np.transpose(np.dot(H, r)))) / np.dot(np.transpose(r), np.dot(H, r)))
        H = H + H1
        Nolmprove = Nolmprove + 1
        if Nolmprove == 40:
            return arry11

if __name__ == '__main__':
    print("Starting!")
    arry11 = np.zeros((20, 20))
    for i in range(20):
        for j in range(20):
            arry11[i][j] = random.uniform(-2, 2)
    data_distance_1 = np.zeros(20)
    arry11 = DFP(arry11)
    arry11_1 = arry11  # NOTE: an alias, not a copy; use arry11.copy() to snapshot
    data_distance = func_distance(arry11)
    Gupta1 = func1_Gupta(data_distance)
    max_1 = 0
    max_distance = 0
    x = 0
    y = 0
    z = 0
    x_list = np.zeros(20)
    y_list = np.zeros(20)
    z_list = np.zeros(20)
    while True:
        for i in range(20):
            for j in range(3):
                arry11[i][j] = arry11[i][j] + random.uniform(-0.5, 0.5)
        for i in range(20):
            x_list[i] = arry11[i][0]
            x = x + x_list[i]
            y_list[i] = arry11[i][1]
            y = y + y_list[i]
            z_list[i] = arry11[i][2]
            z = z + z_list[i]
        x = x / 20
        y = y / 20
        z = z / 20
        xyz = np.zeros(3)
        # NOTE: the original reused the 20x20 data_distance array here and then
        # compared the WHOLE array with `>=`, which fails; a separate 1-D array
        # of centroid distances is used instead.
        center_dist = np.zeros(20)
        for i in range(20):
            center_dist[i] = sympy.sqrt((arry11[i][0] - x) ** 2 + (arry11[i][1] - y) ** 2 + (arry11[i][2] - z) ** 2)
            if center_dist[i] >= max_distance:
                max_distance = center_dist[i]
                xyz[0] = arry11[i][0]
                xyz[1] = arry11[i][1]
                xyz[2] = arry11[i][2]
                a = i
        arry11[a][0] = xyz[0] + random.uniform(-0.3, 0.3)
        arry11[a][1] = xyz[1] + random.uniform(-0.3, 0.3)
        arry11[a][2] = xyz[2] + random.uniform(-0.3, 0.3)
        arry11 = DFP(arry11)
        data_distance = func_distance(arry11)
        Gupta2 = func1_Gupta(data_distance)
        if Gupta2 <= Gupta1:
            Gupta1 = Gupta2
            arry11_1 = arry11
            max_1 = max_1 + 1
            continue
        else:
            arry11 = arry11_1  # revert to the previous configuration
        if max_1 == 30:
            print(arry11)
            break

After the subroutine returns the array, the main program gets a type error when it uses that data. Could someone explain why?

  • 3 answers · 26 views

Training a model with yolov3: coco_classes.txt and voc_classes.txt were both edited before training, yet this error appeared. My model has already finished training; I don't know which step went wrong.

  • 2 answers · 20 views

Using PyCharm to find the minimum of a cosine curve by gradient descent, I get:

line 13, in <module>
    while deta_h < error_rate:
TypeError: '<' not supported between instances of 'tuple' and 'float'

Here is the code. How do I fix it? I'm a beginner and close to tears.

import math
import numpy as np

def f(x):
    # NOTE: the original returned a comma-separated pair, i.e. a tuple --
    # that tuple is what breaks the `<` comparison below. Return one value.
    return 0.03693 * math.sin(1.165 * x - 1.538)

def h(x):
    return 1.165 * 0.03693 * math.cos(1.165 * x - 1.538)

a = 4.32517219917013  # starting point (initial x)
step = 0.01           # step size
count = 0             # iteration counter
deta_h = h(a)         # change in a between updates (initialised to the starting gradient)
error_rate = 1e-5     # threshold

# the original compared `<`, which exits immediately; iterate while the
# change is still ABOVE the threshold
while np.abs(deta_h) > error_rate:
    b = a - step * h(a)  # update a into a new variable
    deta_h = np.abs(h(b) - h(a))
    count += 1
    a = b - step * h(b)

y = f(b)
print('iterations: %d' % count)
print(a)
print(y)
print('minimum at (%f, %f)' % (a, y))

  • 3 answers · 35 views

I want to group this table by hotel name and room type, drop the review date, and merge the review texts together. With pandas aggregation I can only get the number of reviews per group. Does anyone have a method?
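
A sketch with pandas groupby/agg, using hypothetical English column names (hotel, room_type, review_date, review) in place of the original Chinese headers: passing `" ".join` as the aggregation function concatenates the texts instead of counting them.

```python
import pandas as pd

df = pd.DataFrame({
    "hotel":       ["A", "A", "B"],
    "room_type":   ["double", "double", "single"],
    "review_date": ["2021-01-01", "2021-01-02", "2021-01-03"],
    "review":      ["great", "clean", "noisy"],
})

# Drop the date, group by hotel + room type, and join the review texts.
merged = (df.drop(columns="review_date")
            .groupby(["hotel", "room_type"], as_index=False)["review"]
            .agg(" ".join))
print(merged)
```

Any string-combining callable works in place of `" ".join`, e.g. `lambda s: " | ".join(s)` for a visible separator between reviews.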

  • 1 answer · 33 views

Traceback (most recent call last):
  line 24, in <module>
    deta_h = np.abs(h(b) - h(a))
  line 12, in h
    - 0.000888 * 3 * x ** 2 + 0.03487 * 2 * x + 0.00239
OverflowError: (34, 'Result too large')

I'm a beginner; how should I fix this?

import numpy as np

def f(x):
    return -0.0007348 * x ** 8 - 0.0003244 * x ** 7 + 0.004126 * x ** 6 + 0.001076 * x ** 5 \
           - 0.00814 * x ** 4 - 0.0008882 * x ** 3 + 0.03487 * x ** 2 + 0.002393 * x + 0.1168

def h(x):
    # overflows here: with a starting x of ~1.1e8, x ** 7 is astronomically large
    return -0.0007348 * 8 * x ** 7 - 0.0003244 * 7 * x ** 6 + 0.004126 * 6 * x ** 5 \
           + 0.001076 * 5 * x ** 4 - 0.00814 * 4 * x ** 3 \
           - 0.000888 * 3 * x ** 2 + 0.03487 * 2 * x + 0.00239

# NOTE: for an 8th-degree polynomial, gradient descent diverges immediately from
# a starting point this far out; start near the region of interest instead.
a = 113109036.345448  # starting point (initial x)
step = 1              # step size -- also far too large for coefficients of this scale
count: int = 0        # iteration counter
deta_h = h(a)         # change between updates (initialised to the starting gradient)
print(deta_h)
error_rate = 1e-5     # threshold

while deta_h < error_rate:  # should this be `>`? as written the loop exits as soon as the change exceeds the threshold
    b = a - step * h(a)     # update a into a new variable
    deta_h = np.abs(h(b) - h(a))  # is this part also wrong?
    count += 1
    a = b - step * h(b)

y = f(b)
print('iterations: %d' % count)
print(a)
print(y)
print('minimum at (%f, %f)' % (a, y))

  • 2 answers · 30 views

I've recently been working on machine learning problems. In a Python virtual environment I installed matplotlib, but after import the module can't be found. The error is shown below. I first used pip install matplotlib, then tried the Tsinghua and Douban mirrors; neither worked. Asking for advice.

  • 1 answer · 35 views

Help needed: I'm working on something like a binary classification algorithm. Through learning I obtain an indicator, call it A, and my decision rule is A > T → class 1, A < T → class 2. But the classification accuracy is too low. So I looked for improvements, and by mathematical derivation I found that A factors in its computation as A = B * C (derivation omitted). Experiments show that B and C each work as binary-classification indicators too (B > T1 → class 1, B < T1 → class 2; likewise for C), and that B alone classifies much better than either A or C. So I replaced the product with a linear combination, A1 = k1*B + k2*C (k1, k2 are tunable parameters), and this A1 classifies much better than the original A. My question: how do I explain why A1 is better than A? What theory applies here? Any pointers appreciated.

  • 1 answer · 29 views

From a textbook I learned the iris k-means example, and I want to apply the same for-loop over cluster counts to my own multi-dimensional data to see which number of clusters is most reasonable. In the book's example, iris_data and iris_target are both available directly; my source data can simply be referred to as data, but how do I obtain the corresponding label data?

from sklearn.metrics import fowlkes_mallows_score
for i in range(2, 7):
    # build and train the model
    kmeans = KMeans(n_clusters=i, random_state=123).fit(iris_data)
    score = fowlkes_mallows_score(iris_target, kmeans.labels_)
    print('FMI score for %d clusters on iris: %f' % (i, score))

This is my source data (the NaNs have already been handled and can be ignored):

                    incomeperperson  internetuserate  urbanrate
country
Afghanistan                     NaN         3.654122      24.04
Albania                 1914.996551        44.989947      46.72
Algeria                 2231.993335        12.500073      65.22
Andorra                21943.339900        81.000000      88.92
Angola                  1381.004268         9.999954      56.70
...                             ...              ...        ...
Vietnam                  722.807559        27.851822      27.84
West Bank and Gaza              NaN        36.422772      71.90
Yemen, Rep.              610.357367        12.349750      30.64
Zambia                   432.226337        10.124986      35.42
Zimbabwe                 320.771890        11.500415      37.34

Here is my modified version, but it raises: "labels_true must be 1D: shape is (182, 3)". I have no label set; I thought the labels were the country names. Is that understanding wrong?

num = data.iloc[:, 1:]
print(num)
from sklearn.metrics import fowlkes_mallows_score
for i in range(2, 7):
    # build and train the model
    kmeans = KMeans(n_clusters=i, random_state=123).fit(num)
    # NOTE: fowlkes_mallows_score compares predicted labels against TRUE class
    # labels (a 1-D array); passing the feature matrix num here is what raises
    # the error. Without ground-truth labels, an internal metric such as
    # sklearn.metrics.silhouette_score(num, kmeans.labels_) is appropriate instead.
    score = fowlkes_mallows_score(num, kmeans.labels_)
    print('FMI score for %d clusters: %f' % (i, score))

  • 0 answers · 11 views

I have four datasets, a, b, c, and d: a is the smallest, b is medium, c is a bit larger than b, and d is the largest. Running the same Q-learning on each of the four, the convergence curves for a and d trend upward as expected, but b and c trend downward instead. Is this an algorithm problem? I've checked the states and actions many times and they should be correct. Can convergence really be affected by the dataset?

  • 1 answer · 23 views

Consider the region Z where 0 ≤ x < 1, 0 ≤ y < 1, and a binary classification problem Z → {true, false}: the label is true if x < 0.5 and false if x >= 0.5. As training data, pick n points at random from the true region and n points at random from the false region. As a measure of generalization, pick 100 points at random from Z and compute the rate at which the estimate matches the true value.
(1) With n = 2, run 100 trials to obtain the generalization performance of 1-NN (mean and standard deviation of the match rate).
(2) With n = 10, run 100 trials to obtain the generalization performance of 1-NN (mean and standard deviation).
(3) With n = 10, but with one tenth of the training labels flipped as noise, compare the generalization performance of 1-NN and 3-NN over 100 trials and report the results.
Looking for a design approach and code.
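
A sketch of the experiment with sklearn's KNeighborsClassifier (labels encoded 1 = true, 0 = false; the noise step for part (3) flips one tenth of the training labels):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

def trial(n, k=1, noise=False):
    # n points from each half of the unit square; true (1) iff x < 0.5
    X_true = rng.uniform([0.0, 0.0], [0.5, 1.0], size=(n, 2))
    X_false = rng.uniform([0.5, 0.0], [1.0, 1.0], size=(n, 2))
    X = np.vstack([X_true, X_false])
    y = np.array([1] * n + [0] * n)
    if noise:
        # flip one tenth of the 2n training labels
        flip = rng.choice(2 * n, size=max(1, 2 * n // 10), replace=False)
        y[flip] = 1 - y[flip]
    clf = KNeighborsClassifier(n_neighbors=k).fit(X, y)
    # generalization: 100 random points from Z, scored against the true rule
    X_test = rng.uniform(size=(100, 2))
    y_test = (X_test[:, 0] < 0.5).astype(int)
    return (clf.predict(X_test) == y_test).mean()

for n, k, noise in [(2, 1, False), (10, 1, False), (10, 1, True), (10, 3, True)]:
    rates = [trial(n, k, noise) for _ in range(100)]
    print(f"n={n}, {k}-NN, noise={noise}: "
          f"mean={np.mean(rates):.3f}, std={np.std(rates):.3f}")
```

With label noise, 3-NN's majority vote should typically give a higher mean match rate than 1-NN, which is the comparison part (3) asks for.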

  • 1 answer · 12 views

Traceback (most recent call last):
  File "E:/Con-GAE-main/src/train.py", line 114, in <module>
    train_loader = DataLoader(train_dataset, **params)
  File "E:\Anaconda3\lib\site-packages\torch_geometric\data\dataloader.py", line 43, in __init__
    super(DataLoader,
  File "E:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 266, in __init__
    sampler = RandomSampler(dataset, generator=generator)  # type: ignore
  File "E:\Anaconda3\lib\site-packages\torch\utils\data\sampler.py", line 102, in __init__
    if not isinstance(self.num_samples, int) or self.num_samples <= 0:
  File "E:\Anaconda3\lib\site-packages\torch\utils\data\sampler.py", line 110, in num_samples
    return len(self.data_source)
TypeError: 'NoneType' object cannot be interpreted as an integer

import os
import sys
import time
import math
import random
import pickle
import argparse
from random import shuffle

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
from torch.utils import data
import torchvision.transforms as transforms
from sklearn.metrics import roc_curve, auc
from torch_geometric.data import InMemoryDataset, Dataset, Data, DataLoader

from data_util import ConTrafficGraphDataset as trafficDataset
from model import ConGAE, ConGAE_t, ConGAE_sp, deepConGAE

parser = argparse.ArgumentParser()
# model
parser.add_argument('--model', default='ConGAE', help='Model type: ConGAE, ConGAE_t, ConGAE_sp, deepConGAE')
# training parameters
parser.add_argument('--randomseed', type=int, default=1)
parser.add_argument('--train_epoch', type=int, default=150, help='number of training epochs')
parser.add_argument('--lr', default=5e-5, help='learning rate')
parser.add_argument('--dropout_p', default=0.2, help='drop out rate')
parser.add_argument('--adj_drop', default=0.2, help='edge dropout rate')
parser.add_argument('--verbal', default=False, type=bool, help='print loss during training')
# 2-layer ConGAE parameters
parser.add_argument('--input_dim', type=int, default=4, help='input feature dimension')
parser.add_argument('--n_nodes', type=int, default=50, help='total number of nodes in the graph')
parser.add_argument('--node_dim1', type=int, default=300, help='node embedding dimension of the first GCN layer')
parser.add_argument('--node_dim2', type=int, default=150, help='node embedding dimension of the second GCN layer')
parser.add_argument('--encode_dim', type=int, default=150, help='final graph embedding dimension of the Con-GAE encoder')
parser.add_argument('--hour_emb', type=int, default=100, help='hour embedding dimension')
parser.add_argument('--week_emb', type=int, default=100, help='week embedding dimension')
parser.add_argument('--decoder', type=str, default='concatDec', help='decoder type: concatDec, bilinearDec')
# deepConGAE parameters
parser.add_argument('--hidden_list', nargs="*", type=int, default=[300, 150], help='the node embedding dimension of each layer of GCN')
parser.add_argument('--decode_dim', type=int, default=150, help='the node embedding dimension at decoding')
# files
parser.add_argument('--log_dir', default='../log/', help='directory to save model')
args = parser.parse_args()
print(args)

# Reproducibility
np.random.seed(seed=args.randomseed)
random.seed(args.randomseed)
torch.manual_seed(args.randomseed)

result_dir = args.log_dir + 'results/'
if not os.path.exists(result_dir):
    os.makedirs(result_dir)

# ## load data
dirName = "../data/selected_50_orig/"
with open(dirName + 'partition_dict', 'rb') as file:
    partition = pickle.load(file)
# item_d: which time slice each id corresponds to
with open(dirName + 'item_dict', 'rb') as file:
    item_d = pickle.load(file)
node_X = np.load(dirName + 'node_X.npy')
node_posx = np.mean(node_X[:, :2], 1)
node_posy = np.mean(node_X[:, 2:], 1)
node_X = torch.from_numpy(node_X).float()
tt_min, tt_max = np.load(dirName + 'tt_minmax.npy')
start_time = 0 + 24 * 23
end_time = 23 + 24 * 27

# reset partition
all_data = partition['train'] + partition['val']
partition_test = all_data[350:750]  # includes NFL, 400 points, 30% are NFL
partition_val = all_data[:150]
partition_train = all_data[150:350] + all_data[750:]  # the rest
source_dir = dirName  # full sample size (~2000)

# Parameters
params = {'batch_size': 10, 'shuffle': True, 'num_workers': 0}
params_val = {'batch_size': 10, 'shuffle': False, 'num_workers': 0}
root = '../data/selected_50_pg/root/'

# data loaders
# NOTE: the traceback ends in len(self.data_source) returning None, i.e. the
# dataset's length comes back as None -- check that trafficDataset
# (ConTrafficGraphDataset in data_util.py) implements its len()/__len__
# correctly and that partition_train is non-empty before wrapping it.
train_dataset = trafficDataset(root, partition_train, node_X, item_d, source_dir)
test_dataset = trafficDataset(root, partition_test, node_X, item_d, source_dir)
val_dataset = trafficDataset(root, partition_val, node_X, item_d, source_dir)
train_loader = DataLoader(train_dataset, **params)
test_loader = DataLoader(test_dataset, **params_val)
val_loader = DataLoader(val_dataset, **params_val)

# ## load model
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
if args.model == 'ConGAE_sp':
    model = ConGAE_sp(args.input_dim, args.node_dim1, args.node_dim2, args.dropout_p, args.adj_drop,
                      decoder=args.decoder, n_nodes=args.n_nodes)
if args.model == 'ConGAE_t':
    model = ConGAE_t(args.input_dim, args.node_dim1, args.node_dim2, args.dropout_p, args.adj_drop,
                     decoder=args.decoder, hour_emb=args.hour_emb, week_emb=args.week_emb, n_nodes=args.n_nodes)
if args.model == 'ConGAE':
    model = ConGAE(input_feat_dim=args.input_dim, node_dim1=args.node_dim1, node_dim2=args.node_dim2,
                   encode_dim=args.encode_dim, hour_emb=args.hour_emb, week_emb=args.week_emb, n_nodes=args.n_nodes)
if args.model == 'deepConGAE':
    model = deepConGAE(args.input_dim, hidden_list=args.hidden_list, encode_dim=args.encode_dim,
                       decode_dim=args.decode_dim, dropout=args.dropout_p, adj_drop=args.adj_drop,
                       hour_emb=args.hour_emb, week_emb=args.week_emb, n_nodes=args.n_nodes)
model.float()

# specify optimizer
optimizer = torch.optim.Adam(model.parameters(), lr=args.lr)
criterion = nn.MSELoss()

def calc_rmse(recon_adj, adj, tt_min, tt_max):
    adj = adj * (tt_max - tt_min) + tt_min
    recon_adj = recon_adj * (tt_max - tt_min) + tt_min
    rmse = criterion(recon_adj, adj)
    return torch.sqrt(rmse)

def train(epoch, train_loader, test_loader, best_val):
    model.train()
    train_loss = 0
    loss_train = []
    loss_val = []
    for graph_data in train_loader:
        graph_data = graph_data.to(device)
        optimizer.zero_grad()
        if args.model == 'ConGAE_sp':
            recon = model(graph_data.x, graph_data.edge_index, graph_data.edge_attr)
        else:
            recon = model(graph_data.x, graph_data.edge_index, graph_data.edge_attr, graph_data.hour, graph_data.week)
        loss = criterion(recon, graph_data.edge_attr)
        loss.backward()
        optimizer.step()
        loss_train.append(loss.item())
    for graph_val in val_loader:
        # evaluation
        model.eval()
        graph_val = graph_val.to(device)
        with torch.no_grad():
            if args.model == 'ConGAE_sp':
                recon_val = model(graph_val.x, graph_val.edge_index, graph_val.edge_attr)
            else:
                recon_val = model(graph_val.x, graph_val.edge_index, graph_val.edge_attr, graph_val.hour, graph_val.week)
            mse_val = criterion(recon_val, graph_val.edge_attr)
            loss_val.append(mse_val.item())
    loss_train = sum(loss_train) / len(loss_train)
    loss_val = sum(loss_val) / len(loss_val)
    # print results
    if args.verbal and epoch % 15 == 0:
        print('Train Epoch: {} loss: {:e} val_loss: {:e}'.format(epoch, loss_train, loss_val))
        rmse = math.sqrt(loss_val) * (tt_max - tt_min)
        print('validation travel time rmse mean: {:e}'.format(rmse))
    # early-stopping
    if loss_val < best_val:
        torch.save({
            'epoch': epoch,
            'model_state_dict': model.state_dict(),
            'optimizer_state_dict': optimizer.state_dict(),
        }, model_path)
        best_val = loss_val
    return loss_train, loss_val, best_val

# ## Train
loss_track = []
val_track = []
model = model.to(device)
n_epochs = args.train_epoch
start = time.time()
best_val = float('inf')  # for early stopping
model_path = args.log_dir + args.model + '.pt'
lr_decay_step_size = 100
for epoch in range(1, n_epochs + 1):
    train_loss, val_loss, best_val = train(epoch, train_loader, val_loader, best_val)
    loss_track.append(train_loss)
    val_track.append(val_loss)
    if epoch % lr_decay_step_size == 0:
        for param_group in optimizer.param_groups:
            param_group['lr'] = 0.5 * param_group['lr']
print("time for {} epochs: {:.3f} min".format(n_epochs, (time.time() - start) / 60))

# plot learning curve
plt.plot(np.array(loss_track), label='training')
plt.plot(np.array(val_track), label='validation')
plt.title("loss")
plt.xlabel("# epoch")
plt.ylabel("MSE loss")
plt.legend()
# plt.ylim(0.4, 1)
# plt.show()
plt.savefig(result_dir + args.model + "_training_curve.png")

# save args config
with open(args.log_dir + args.model + '_args.pkl', 'wb') as fp:
    pickle.dump(args, fp)

  • 1 answer · 19 views

In my machine-learning setup, the dataset contains multiple sub-packages, and after training, the predicted results come out segmented by those sub-datasets. How should I train or design things so that the result is several separate curves (one per test set), each starting from 0, rather than one long polyline joined together?

  • 4 answers · 22 views

K-means clustering, spectral clustering, and the Viola-Jones algorithm: how can I compare and analyze these three to find their respective strengths in different application domains? And if applied to oil-leak detection in pipeline inspection, how should each be optimized?

  • 1 answer · 13 views

Is there a reasonably complete KNN program that analyzes a dataset you import yourself? Everything I can find uses the datasets bundled with sklearn. I want to do a simple analysis with KNN and have already built a dataset, but when I modify those examples to load my own data they throw errors, and I don't know whether my code or my data format is the problem. So: is there a KNN program that works from your own imported data rather than sklearn's built-in datasets? I'd also like to know the format requirements for the dataset.
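
A minimal sketch of KNN on your own data. The expected format is simply a 2-D numeric feature table plus a label column; a small inline table stands in for a file here, and the commented pd.read_csv line (with a hypothetical file name) shows how a CSV with feature columns followed by a final label column would be loaded instead:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# df = pd.read_csv("my_data.csv")  # hypothetical: feature columns + final label column
df = pd.DataFrame({
    "feat1": [1.0, 1.2, 0.9, 8.0, 8.2, 7.9, 1.1, 8.1],
    "feat2": [0.5, 0.4, 0.6, 5.0, 5.1, 4.9, 0.5, 5.0],
    "label": [0, 0, 0, 1, 1, 1, 0, 1],
})

X = df.drop(columns="label").values  # every row = one sample, every column = one numeric feature
y = df["label"].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

clf = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
print("accuracy:", clf.score(X_test, y_test))
```

The usual format pitfalls are non-numeric feature columns (encode them first) and missing values (drop or impute them before fit).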

  • 3 answers · 21 views

Could someone take a look and tell me what's going on?

  • 2 answers · 20 views

Say I have the array [1, 2, 3, 4, 5, 6]. How do I add 1 to n random elements of it? For example, turning the original array into [2, 2, 4, 4, 5, 6], where elements 0 and 2 happened to be chosen and incremented.
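
A sketch with NumPy: choice with replace=False picks n distinct indices, and fancy indexing increments exactly those positions:

```python
import numpy as np

rng = np.random.default_rng()
a = np.array([1, 2, 3, 4, 5, 6])
n = 2

idx = rng.choice(len(a), size=n, replace=False)  # n distinct random positions
a[idx] += 1
print(idx, a)
```

With a plain Python list, `random.sample(range(len(a)), n)` gives the indices and a small loop does the increments.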