Does PCA dimensionality reduction affect machine-learning accuracy?

I ran five methods on both the raw data and the PCA-transformed data, evaluating each by resubstitution on the training set and by prediction on held-out samples. SVM and the neural network changed little, but the success rates of XGBoost, AdaBoost, and Bayes actually dropped. Is that because these methods are not suited to PCA dimensionality reduction?
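
For reference, a minimal sketch of the kind of side-by-side comparison described, assuming a scikit-learn setup; the dataset, the choice of 10 components, and the use of `GradientBoostingClassifier` as a stand-in for XGBoost are all placeholders:

```python
# Hypothetical reproduction of the experiment: each classifier is scored
# on the raw features and on PCA-reduced features via 5-fold CV.
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)   # stand-in dataset

models = {
    "SVM": SVC(),
    "NeuralNet": MLPClassifier(max_iter=2000),
    "AdaBoost": AdaBoostClassifier(),
    "GradBoost": GradientBoostingClassifier(),  # stand-in for XGBoost
    "Bayes": GaussianNB(),
}

for name, model in models.items():
    raw = make_pipeline(StandardScaler(), model)
    reduced = make_pipeline(StandardScaler(), PCA(n_components=10), model)
    acc_raw = cross_val_score(raw, X, y, cv=5).mean()
    acc_pca = cross_val_score(reduced, X, y, cv=5).mean()
    print(f"{name:10s} raw={acc_raw:.3f}  pca={acc_pca:.3f}")
```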

2 Answers

It depends on the correlation structure of your data. After PCA reduction, the components with little correlation get discarded, and whatever information they carried is lost with them, so accuracy can suffer to some degree.
From another angle, though, if the computation shrinks substantially, training becomes more efficient; within a fixed time and cost budget, that efficiency gain can actually buy you a better final result.
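
To make the "lost information" concrete, a small sketch (scikit-learn assumed, stand-in dataset): ask PCA for a target fraction of explained variance and report how much gets thrown away.

```python
# Sketch: keep enough components to explain 95% of the variance,
# then report the dimensionality kept and the variance discarded.
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_breast_cancer(return_X_y=True)   # stand-in dataset
X = StandardScaler().fit_transform(X)        # PCA is scale-sensitive

pca = PCA(n_components=0.95).fit(X)          # float = variance fraction to keep
print("kept", pca.n_components_, "of", X.shape[1], "dimensions")
print("discarded variance: %.1f%%" % (100 * (1 - pca.explained_variance_ratio_.sum())))
```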

An analogy: suppose we are mining for gold. Do you find more gold by first prospecting for ore and then digging there, or by digging aimlessly? Look at it this way:
If you had unlimited time and no regard for cost, digging up the entire planet, ore or no ore, would obviously yield the most gold, because even barren ground contains trace amounts, and even seawater holds dissolved gold.
In reality we do not have infinitely many miners or infinite time, so prospecting first and digging at the mine yields far more gold at a given cost than digging at random.

PCA is a data-preprocessing step; it has no direct relationship to which model you use. Possible reasons for the drop in success rate:
1. The models need to be retrained and retuned on the transformed data, which you did not do.
2. The reduced dimensionality and the dataset size may no longer suit the models you chose.
3. PCA discards some information (though normally the number of components is chosen so that discriminative performance is preserved).
...among other causes.
What is certain, though, is that PCA is applicable to every one of the model algorithms you listed.
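
To illustrate point 1 above, a sketch of retuning after PCA: put PCA and the model in one pipeline and search the number of components jointly with the model's hyperparameters (the grid values here are hypothetical):

```python
# Sketch: grid-search n_components together with AdaBoost's settings,
# so the PCA pipeline gets the same tuning effort as the raw model did.
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)   # stand-in dataset

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA()),
    ("clf", AdaBoostClassifier()),
])
grid = {
    "pca__n_components": [5, 10, 20],
    "clf__n_estimators": [50, 100, 200],
    "clf__learning_rate": [0.5, 1.0],
}
search = GridSearchCV(pipe, grid, cv=5).fit(X, y)
print(search.best_params_, "cv accuracy = %.3f" % search.best_score_)
```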
