I have a Golang utility that reduces the number of data points per day in a table of historic data.
The table holds between 20 and 400 records per day, and in total there are at least 100 million records.
The utility trims the data down to n records per day for everything prior to a given date (n can range from 1 to 300 records per day).
The method I am using is as follows:
STEP 1:
CREATE TABLE main_table_tmp LIKE main_table;
STEP 2:
ALTER TABLE main_table_tmp ADD COLUMN timekey INT;
STEP 3:
INSERT INTO main_table_tmp
SELECT * FROM (
    SELECT *, FLOOR(UNIX_TIMESTAMP(column_name)/((1440/2)*60)) AS timekey
    FROM main_table
    WHERE column_name <= '2018-01-01'
    GROUP BY timekey
) m
UNION ALL
(SELECT *, 0 AS timekey FROM main_table WHERE column_name > '2018-01-01');
STEP 4:
ALTER TABLE main_table_tmp DROP COLUMN timekey;
DROP TABLE main_table;
RENAME TABLE main_table_tmp TO main_table;
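For reference, the hardcoded 2 in the STEP 3 divisor is what produces 2 records per day; the utility substitutes the requested n into that expression. A minimal sketch of how I build it (timekeyExprFor is just an illustrative helper name, and it only needs fmt):

// timekeyExprFor mirrors the STEP 3 expression, with n (the requested
// records per day, 1..300) substituted for the hardcoded 2.
func timekeyExprFor(column string, n int) string {
    return fmt.Sprintf("FLOOR(UNIX_TIMESTAMP(%s)/((1440/%d)*60))", column, n)
}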
I am executing the above statements from Go:
func somefuncname() {
    // ----
    // ----
    // ----
    q := "CREATE TABLE " + *tablename + "_tmp LIKE " + *tablename
    // DDL returns no result set, so Exec is more appropriate here than Query
    // (Query leaves an open rows handle behind).
    if _, err := db.Exec(q); err != nil {
        fmt.Println(err)
        return
    }
    //--ALTER ADD timekey
    //--INSERT INTO SELECT *....
    //--ALTER DROP timekey, DROP table and rename
}
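The remaining statements are issued the same way. Roughly, the whole flow looks like the sketch below; trimTable is an illustrative name and not my exact code, and table, column, cutoff and n stand in for the values that my utility reads from flags (hence the *tablename above):

import (
    "database/sql"
    "fmt"
)

// trimTable is a simplified sketch of the flow: it issues the six statements
// in order and stops on the first error.
func trimTable(db *sql.DB, table, column, cutoff string, n int) error {
    steps := []string{
        "CREATE TABLE " + table + "_tmp LIKE " + table,
        "ALTER TABLE " + table + "_tmp ADD COLUMN timekey INT",
        fmt.Sprintf(`INSERT INTO %[1]s_tmp
            SELECT * FROM (
                SELECT *, FLOOR(UNIX_TIMESTAMP(%[2]s)/((1440/%[3]d)*60)) AS timekey
                FROM %[1]s
                WHERE %[2]s <= '%[4]s'
                GROUP BY timekey
            ) m
            UNION ALL
            (SELECT *, 0 AS timekey FROM %[1]s WHERE %[2]s > '%[4]s')`,
            table, column, n, cutoff),
        "ALTER TABLE " + table + "_tmp DROP COLUMN timekey",
        "DROP TABLE " + table,
        "RENAME TABLE " + table + "_tmp TO " + table,
    }
    for _, stmt := range steps {
        // DDL and bulk INSERTs return no rows, so Exec is used throughout.
        if _, err := db.Exec(stmt); err != nil {
            return err
        }
    }
    return nil
}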
The current response time of this query is very slow.
Some of the results:
Total records: 2 million
Execution time: 180 seconds
This is on a machine with 16 GB of RAM; it is even slower when deployed on lower-grade hardware.
Steps I have taken to resolve this:
Looked into the indexes on all the tables and tried removing them before running the utility. Dropping the indexes made the utility only about 5 seconds faster, which is not much.
Executed the utility in stages: if the total record count crosses 1 million, I run the utility on 1 million records at a time (sketched below).
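The staged execution is done on the Go side. A rough sketch of the idea, under the assumption of an auto-increment id column; the column name, function name and batch layout are illustrative, not my exact code:

const batchSize = 1000000 // run the thinning 1 million rows at a time

// trimInBatches assumes main_table_tmp already exists with the timekey
// column (STEPs 1 and 2); rows after the cutoff are copied over in one
// final statement that is not shown here.
func trimInBatches(db *sql.DB, table, column, cutoff string, n int) error {
    var minID, maxID sql.NullInt64
    err := db.QueryRow(
        "SELECT MIN(id), MAX(id) FROM "+table+" WHERE "+column+" <= ?", cutoff,
    ).Scan(&minID, &maxID)
    if err != nil || !minID.Valid {
        return err // query failed, or there is nothing before the cutoff
    }
    for start := minID.Int64; start <= maxID.Int64; start += batchSize {
        end := start + batchSize - 1
        // Same SELECT as STEP 3, restricted to one id range per iteration.
        // A timekey bucket that straddles two ranges can contribute one row
        // from each range, so the thinning is approximate at the boundaries.
        stmt := fmt.Sprintf(`INSERT INTO %[1]s_tmp
            SELECT * FROM (
                SELECT *, FLOOR(UNIX_TIMESTAMP(%[2]s)/((1440/%[3]d)*60)) AS timekey
                FROM %[1]s
                WHERE %[2]s <= '%[4]s' AND id BETWEEN %[5]d AND %[6]d
                GROUP BY timekey
            ) m`, table, column, n, cutoff, start, end)
        if _, err := db.Exec(stmt); err != nil {
            return err
        }
    }
    return nil
}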
But after all these efforts, it looks like the main problem is the query itself.
It is just not fast enough. I need a way to increase the efficiency of the query.
Any help appreciated, thank you guys!!