duanbogan5878 2016-10-27 02:56
54 views
Accepted

Cheapest way to classify HTTP POST objects

I can use SciPy to classify text on my machine, but I need to categorize string objects from HTTP POST requests at or near real time. What algorithms should I research if my goals are high concurrency, near-real-time output, and a small memory footprint? I figured I could get by with the Support Vector Machine (SVM) implementation in Go, but is that the best algorithm for my use case?


1 answer

  • doukezi4606 2016-10-27 03:26

    Yes, an SVM with a linear kernel should be a good starting point. You can use scikit-learn (which wraps liblinear, I believe) to train your model. Once trained, the model is simply a list of feature:weight pairs for each category you want to classify into. Something like this (suppose you have only 3 classes):

    class1[feature1] = weight11
    class1[feature2] = weight12
    ...
    class1[featurek] = weight1k    ------- for class 1
    
    ... different <feature, weight> ------ for class 2
    ... different <feature, weight> ------ for class 3 , etc
    
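    The weight tables above can be sketched as plain nested dictionaries. The class names, feature names, and weight values below are made-up placeholders; in practice the weights would come out of the trained linear model:

    ```python
    # Per-class weight tables: one {feature: weight} dict per class.
    # All values here are hypothetical; a real table would be exported
    # from the trained linear SVM.
    weights = {
        "class1": {"feature1": 0.8, "feature2": -0.3, "feature3": 1.2, "feature5": 0.4},
        "class2": {"feature1": -0.5, "feature3": 0.1, "feature5": 0.9},
        "class3": {"feature2": 0.7, "feature3": -0.2},
    }

    # Looking up the weight of feature3 for class1:
    print(weights["class1"]["feature3"])  # 1.2
    ```

    A feature absent from a class's table simply has weight 0 for that class, which is what makes the representation compact.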

    At prediction time you don't need scikit-learn at all; you can use whatever language your server backend runs to do a linear computation. Suppose a specific POST request contains the features (feature3, feature5); then what you need to do looks like this:

    linear_score[class1] = 0
    linear_score[class1] += lookup weight of feature3 in class1
    linear_score[class1] += lookup weight of feature5 in class1
    
    linear_score[class2] = 0
    linear_score[class2] += lookup weight of feature3 in class2
    linear_score[class2] += lookup weight of feature5 in class2
    
    ..... same thing for class3
    pick class1, or class2 or class3 whichever has the highest linear_score
    
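    The scoring loop above can be sketched in a few lines of Python (the weight values are hypothetical; features missing from a class's table contribute zero):

    ```python
    # Hypothetical per-class weight tables produced by training.
    weights = {
        "class1": {"feature3": 1.2, "feature5": 0.4},
        "class2": {"feature3": 0.1, "feature5": 0.9},
        "class3": {"feature3": -0.2},
    }

    def predict(active_features):
        # Sum the weights of the request's active features for each class,
        # treating absent features as weight 0, then take the argmax.
        scores = {
            cls: sum(table.get(f, 0.0) for f in active_features)
            for cls, table in weights.items()
        }
        return max(scores, key=scores.get)

    print(predict(["feature3", "feature5"]))  # class1: 1.2 + 0.4 = 1.6, the highest
    ```

    Each request only touches the handful of features it actually contains, so the work per request is proportional to the request size, not the vocabulary size.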

    One step further: if you have some way to assign each feature a weight (e.g., the tf-idf score of its token), then your prediction becomes:

    linear_score[class1] += class1[feature3] x feature_weight[feature3]
    so on and so forth.
    

    Note that feature_weight[feature k] is usually different for each request. Since the number of active features in a request is much smaller than the total number of features considered (think 50 tokens versus an entire vocabulary of 1 million tokens), prediction should be very fast. Once your model is ready, I can imagine the prediction being implemented on top of a simple key-value store (e.g., Redis).
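    With per-request feature weights such as tf-idf, the scoring becomes a sparse dot product over only the request's active features. A sketch, again with made-up weight and tf-idf values:

    ```python
    # Hypothetical per-class weight tables from training.
    weights = {
        "class1": {"feature3": 1.2, "feature5": 0.4},
        "class2": {"feature3": 0.1, "feature5": 0.9},
    }

    def predict_weighted(feature_weight):
        # feature_weight maps each active feature in THIS request to its
        # tf-idf score. Iterating over the request's features (not the
        # whole vocabulary) keeps the dot product sparse and fast.
        scores = {
            cls: sum(table.get(f, 0.0) * w for f, w in feature_weight.items())
            for cls, table in weights.items()
        }
        return max(scores, key=scores.get)

    # A request whose tokens yield these (hypothetical) tf-idf scores:
    print(predict_weighted({"feature3": 2.0, "feature5": 0.5}))
    # class1: 1.2*2.0 + 0.4*0.5 = 2.6 beats class2: 0.1*2.0 + 0.9*0.5 = 0.65
    ```

    The per-class tables could just as well live in a key-value store keyed by (class, feature), with the same loop doing one lookup per active feature.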

    This answer was accepted as the best answer by the asker.

