dousu1900 2014-10-29 08:59

Which database to use and how to speed it up [closed]

I have a project to store and process users' expenses. There will be only two operations on the database: INSERT and SELECT. The database may receive millions of entries per day (depending on the number of users, and on whether a user is a company, etc.).

Common queries (a schema and indexing sketch for these follows the list):

  1. Show the expenses from date x to date y. (most common)
  2. Query (1) with additional filters applied.
  3. Show expenses on a particular item to date. (queries span the whole table)
  4. Show all expenses to date. (rare)
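
As a point of reference, here is a minimal MySQL sketch of a table and the indexes that could serve these four queries; the table and column names (expenses, user_id, item, amount, expense_date) are assumptions for illustration, not details from the question:

-- Hypothetical schema; all names are assumptions for illustration.
CREATE TABLE expenses (
    id           BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    user_id      INT NOT NULL,
    item         VARCHAR(100) NOT NULL,
    amount       DECIMAL(10,2) NOT NULL,
    expense_date DATE NOT NULL,
    KEY idx_user_date (user_id, expense_date),  -- serves queries 1 and 2
    KEY idx_user_item (user_id, item)           -- serves query 3
);

-- Query 1: expenses from date x to date y for one user.
SELECT * FROM expenses
WHERE user_id = 42
  AND expense_date BETWEEN '2014-10-01' AND '2014-10-29';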

1: I'm confused about which database to use: SQL, NoSQL, or a combination of the two (and if a combination, how would that work in my case?). I need a comparison based on query speed over large amounts of data.

2: Since the database may gain millions of records per day, a query can return millions of rows that must be transferred from the server to the client. (In my case there is extra overhead: the database server is remote from the web server, so results travel from the database server to the web server and then to the client.) How can I make this faster?
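
Beyond paging, one common way to shrink the transfer is to aggregate on the database server and ship summaries instead of raw rows. A sketch, reusing the hypothetical expenses table above:

-- Daily totals for a date range: a few hundred rows cross the wire
-- instead of millions of raw records.
SELECT expense_date, SUM(amount) AS daily_total
FROM expenses
WHERE user_id = 42
  AND expense_date BETWEEN '2014-01-01' AND '2014-10-29'
GROUP BY expense_date;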

3: If I choose MySQL, which is better: (a) dumping all the data into one big table, or (b) creating a table per day (named by date), each holding a smaller amount of data? (I assumed (b) would be faster for a date-range query, since I would know exactly which tables to read instead of scanning one big table for the matching dates.)
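
A middle ground between (a) and (b) is MySQL's built-in RANGE partitioning: one logical table, physically split by date, so a date-range query reads only the matching partitions (partition pruning). A sketch against the hypothetical table above; note that MySQL requires every unique key, including the primary key, to contain the partitioning column, hence the primary-key change:

-- Make the primary key include the partitioning column.
ALTER TABLE expenses
    DROP PRIMARY KEY,
    ADD PRIMARY KEY (id, expense_date);

-- Partition by month (could equally be by day).
ALTER TABLE expenses
PARTITION BY RANGE (TO_DAYS(expense_date)) (
    PARTITION p2014_10 VALUES LESS THAN (TO_DAYS('2014-11-01')),
    PARTITION p2014_11 VALUES LESS THAN (TO_DAYS('2014-12-01')),
    PARTITION pmax     VALUES LESS THAN MAXVALUE
);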

4: For now I'm testing with MySQL. (The test data is already there; I use a Python script to parse it and load it into MySQL, and I can adapt the script to any type of database.) I tried query (4) from the list above. While roughly 13 million result rows were being transferred from the database server, my web server (which also acts as the client during testing) more or less hung. So I used a loop in my PHP code to fetch 1000 rows at a time, like the sample below (assuming an open mysqli connection $mysqli):

$i = 0;
do {
    // Fetch one page of 1000 rows; <Table> is the placeholder table name.
    $result = $mysqli->query("SELECT * FROM <Table> LIMIT $i, 1000");
    // ... process the rows here ...
    $i += 1000;
} while ($result && $result->num_rows > 0);

It's still slow, but at least the system no longer hangs during the transfer. But how does LIMIT behave here: does it pick up where the previous page left off (the first 1000 rows at i=0, then the next 1000 at i=1000, and so on), or does it start from the beginning each time and skip the first i rows before returning the next 1000, which would obviously be much slower? (I searched online for how LIMIT works, but everything I found discusses LIMIT with ORDER BY, not how to page through data with LIMIT and what that costs in performance.)
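
For what it's worth, MySQL's LIMIT i, 1000 behaves the second way: the server generates and discards the first i rows on every iteration, so each page costs more than the last. Keyset (seek) pagination avoids this by resuming from the last row seen; a sketch using the hypothetical auto-increment id column from the table above:

-- Keyset pagination: resume from the last id seen instead of
-- making the server skip i rows on every page.
SELECT * FROM expenses
WHERE id > :last_seen_id   -- bind the last id from the previous page; start at 0
ORDER BY id
LIMIT 1000;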

P.S. I'm not a database pro, just a beginner, so I'm asking for experts' suggestions before starting the project.

1 answer

  • douxiong2999 2014-10-29 09:35

    If you have millions of entries per day, I think you should go for a NoSQL database; it will be faster and more efficient at handling big data. I suggest Elasticsearch, since you only perform INSERT and SELECT operations on a huge amount of data. It has good documentation and is easy enough to use. I think it will serve you well.
