Mysql:如何实时运行繁重的分析查询

我正在运行一个使用mysql数据库的crm应用程序。 我的应用程序在mysql中生成大量数据。 现在我想给我的客户一个报告部分,管理员可以在其中查看实时报告,他们应该能够实时过滤。 基本上我希望我的数据尽可能快地切片和切块。
我已经使用mysql和php实现了报告。 但是现在由于数据太多,查询需要花费太多时间而且页面无法加载。 经过几次阅读后,我遇到了一些名词,如Nosql,mongoDb,cassandra,OLAP,hadoop等,但我很困惑可以选择。 是否有任何机制可以将我的数据从mysql传输到nosql,我可以运行我的报告查询和服务我的客户保持我的mysql数据库原样? </ p>
</ div>

展开原文

原文

I am running a crm application which uses mysql database. My application generating lots of data in mysql. Now i want to give my customer a reporting section where admin can view real time report, they should be able to filter at real time. Basically i want my data to be slice and dice at real time fast as possible. I have implemented the reporting using mysql and php. But now as data is too much query takes too much time and page does not load. After few read i came across few term like Nosql, mongoDb , cassandra , OLAP , hadoop etc but i was confuse which to choose. Is there any mechanism which would transfer my data from mysql to nosql on which i can run my reporting query ans serve my customer keeping my mysql database as it is ?

dqb14659
dqb14659 我没有数据说我使用nosql数据库报告的速度比原始mysql快多少。但是nosql确保可扩展且快速。
5 年多之前 回复
doutou19761022
doutou19761022 出于好奇,与原始的mysql数据库相比,使用nosql数据库报告的速度有多快?它似乎不太可能那么多。
5 年多之前 回复
douzhu3367
douzhu3367 链接相关:stackoverflow.com/q/2476280/544342
5 年多之前 回复

2个回答



您使用哪种数据库/数据存储技术报告无关紧要:您仍需设计它以提取您需要的信息 有效率的。 </ p>

通过从MySQL切换到MongoDB或其他可扩展的键/值存储系统之一来提高性能就像通过建设铁路解决行人交通拥堵一样。 它需要做很多工作才能帮助解决问题。 我建议你首先尝试在MySQL中更好地工作。</ p>

首先,您需要仔细查看报告系统中哪些SQL查询导致问题。 您可以通过添加索引或进行其他重构来优化其性能。 这应该是你迈出的第一步。 MySQL的查询日志很慢。 看它。 </ p>

其次,您可以向MySQL添加资源(RAM,更快的磁盘等),并且您可以调整它以获得更高的性能。 有一本名为High Performance MySQL的书为这样做提供了一种合理的方法。</ p>

第三,许多需要在忙碌的应用程序中添加报告功能的人使用MySQL复制。 也就是说,它们配置一个或两个从MySQL服务器以接受来自主服务器的所有数据的副本。</ p>

http://dev.mysql.com/doc/refman/5.5/en/replication-howto.html </ p> \ n

然后,他们使用从服务器或服务器来运行报告查询。 奴隶通常在主人后面几秒或几分钟(也就是说,他们稍微过时了)。 但它通常足以让用户产生实时报告的错觉。</ p>

请注意,如果您使用MongoDB或其他技术,您还必须复制数据。 </ p>
</ div>

展开原文

原文

It doesn't matter what database / datastore technology you use for reporting: you still will have to design it to extract the information you need efficiently.

Improving performance by switching from MySQL to MongoDB or one of the other scalable key/value store systems is like solving a pedestrian traffic jam by building a railroad. It's going to take a lot of work to make it help the situation. I suggest you try getting things to work better in MySQL first.

First of all, you need to take a careful look at which SQL queries in your reporting system are causing trouble. You may be able to optimize their performance by adding indexes or doing other refactoring. That should be your first step. MySQL has a slow query log. Look at it.

Secondly, you may be able to add resources (RAM, faster disks, etc) to MySQL, and you may be able to tune it for higher performance. There's a book called High Performance MySQL that offers a sound methodology for doing this.

Thirdly, many people who need to add a reporting function to their busy application use MySQL replication. That is, they configure one or two slave MySQL servers to accept copies of all data from the master server.

http://dev.mysql.com/doc/refman/5.5/en/replication-howto.html

They then use the slave server or servers to run reporting queries. The slaves are ordinarily a few seconds or minutes behind the master (that is, they're slightly out of date). But it usually is good enough to give users the illusion of real-time reporting.

Notice that if you use MongoDB or some other technology you will also have to replicate your data.

douhui4699
douhui4699 你想说我应该在不同的服务器上保存不同的客户端数据并处理并动态指向它们吗?
5 年多之前 回复
douding7189
douding7189 这种应用程序有助于在几个月后从主要事务表中迁移已关闭票证的记录。 它还有助于分片 - 通过在不同的数据库服务器上保留不同客户的记录来扩展。
5 年多之前 回复
dongping4901
dongping4901 。 谢谢!! 我有一个票务和在线聊天应用程序。 门票可以通过邮件,聊天,移动和离线等方式打开。您可以说360客户服务。随着我们添加客户,门票数量大幅增加。 有代理人负责处理此票证。我们的报告主要基于此票证和代理商。
5 年多之前 回复
dtp0760
dtp0760 众所周知,MySQL可以很好地扩展。 (各种NoSQL解决方案也是如此。)你还没有解释你的应用程序,因此我不敢预测你会遇到什么样的瓶颈。 在过去的工作中,我们构建了一个单独的报告实例(已复制),它提供了直到昨天的报告,并从事务生产实例中提取了最新的数据。 有各种各样的方法可以向用户提供实时数据,并且没有魔法银子弹,特别是在大型系统中。 建立它,让您的用户,解决您的问题!
5 年多之前 回复
dsaf32131
dsaf32131 你建议我应该选择什么? 因为我的php与mysql报告不起作用?
5 年多之前 回复
dongshi7350
dongshi7350 我也不知道复制是如何影响性能的,请记住,正如许多人所了解的那样,在从属服务器上运行查询实际上并没有那么大的帮助,尤其是对于MongoDB,因为MongoDB的结果不需要IO或物理结果集 提供你的内存容量非常轻。 很多关于MySQL和MongoDB的人都发现,对Slave的runnig查询会导致过多的复杂和糟糕的过时陈旧数据。
5 年多之前 回复
dqlxtv1452
dqlxtv1452 我忘记了多少次这样说:MongoDB不是一个关键的价值商店
5 年多之前 回复
drt3751
drt3751 谢谢奥利! 我得到了你想说的。你的意思是通过添加主从模型,mysql集群和调优来从mysql本身获得更好和最好。 与你达成一致意见。 一旦我获得更多客户端,我的应用程序数据将会快速增长。 我上面的mysql建议是否仍然有效? 或者我需要稍后切换到nosql(我仍然需要设计它来有效地提取你需要的信息)? 那为什么不现在呢?
5 年多之前 回复



我将抛​​出此链接供您阅读实际给出某些用例的内容: http://www.mongodb.com/use-cases/real-time-analytics 但我会代表一个 更传统的MongoDB设置。</ p>

我已经使用MySQL和MongoDB进行分析,我发现MongoDB更适合,如果不需要一点点黑客来使它运行良好。< / p>

在检索分析数据时,MongoDB的优点在于它不需要IO /内存每次都写出单独的结果集。 这使得对副本集的单个成员的读取具有极高的可伸缩性,因为您只需将分析集合添加到工作集(也称为内存),并直接使用批处理响应(这是驱动程序的默认实现)。</ p >

因此,MongoDB复制在读/写方面很少有优势,而实际上在MySQL中我发现它也没有。 如果它确实那么你正在做错误的查询,无论如何都不会扩展; 此时你将memcache安装到你的数据库服务器上,看起来,无论如何,你都会以NoSQL的方式从内存中提供过时的数据......呐喊,我猜。</ p>

好的,所以我们 有一些基本的想法; 是时候谈论那个黑客了。 为了从MongoDB中获得最佳速度,并且由于它没有JOIN,您需要展平数据,这样就不需要任何结果集。</ p>

那里 对此有很多策略,但我在这里提到的是: http ://docs.mongodb.org/ecosystem/use-cases/pre-aggregated-reports/ 预先汇总的报告。 这种方法在SQL技术中也很有效,因为它与逻辑分割表基本相同,可以在大型表上更快更轻松地查询。</ p>

你做的就是你得到你的 分析数据,将其分成每天或每月(或两者)的命名,然后以非规范化的方式聚合这些范围的数据,基本上是一行。</ p>

之后,您可以直接从集合中显示报表,而无需使用结果集进行快速查询。</ p>

稍后您可以添加地图缩减步骤以创建更好的分析但是 到目前为止,我还没有必要,我已经完成了没有这种需要的基于视频的完整视频。</ p>

这应该可以让你开始。</ p>
</ div>

展开原文

原文

I will throw this link out there for you to read which actually gives certain use cases: http://www.mongodb.com/use-cases/real-time-analytics but I will speak for a more traditional setup of just MongoDB.

I have used both MySQL and MongoDB for analytical purposes and I find MongoDB better suited, if not needing a little bit of hacking to get it working well.

The great thing about MongoDB when it comes to retreiving analytical data is that it does not require the IO/memory to write out a separate result set each time. This makes reads on a single member of a replica set extremely scalable since you just add your analytical collections to the working set (a.k.a memory) and serve straight from those using batch responses (this is the default implementation of the drivers).

So with MongoDB replication rarely gives an advantage in terms of read/write, and in reality with MySQL I have found it does not either. If it does then you are doing the wrong queries which will not scale anyway; at which point you install memcache onto your database servers and, look, you have stale data being served from memory in a NoSQL fashion anyway...whoop, I guess.

Okay, so we have some basic ideas set out; time to talk about that hack. In order to get the best possible speed out of MongoDB, and since it does not have JOINs, you need to flatten your data so that no result set will even be needed your side.

There are many tactics for this, but the one I will mention here is: http://docs.mongodb.org/ecosystem/use-cases/pre-aggregated-reports/ pre-aggregated reports. This method also works well in SQL techs since it essentially is the in the same breath as logically splitting tables to make queries faster and lighter on a large table.

What you do is you get your analytical data, split it into a demomination such as per day or month (or both) and then you aggregate your data across those ranges in a de-normalised manner, essentially, all one row.

After this you can show reports straight from a collection without any need for a result set making for some very fast querying.

Later on you could add a map reduce step to create better analytics but so far I have not needed to, I have completed full video based anlytics without such need.

This should get you started.

Csdn user default icon
上传中...
上传图片
插入图片
抄袭、复制答案,以达到刷声望分或其他目的的行为,在CSDN问答是严格禁止的,一经发现立刻封号。是时候展现真正的技术了!
立即提问