I have built two proof-of-concept projects for a RESTful API in Go that returns a large amount of JSON data when queried for a specific ID. I built the first with MongoDB, then a modified second version to support Postgres. The MongoDB version queries by the ID inside the JSON itself; for Postgres I simply used a table with two columns: a primary key on the ID and a jsonb column holding all the JSON data. I assumed this would be faster than indexing into the JSON data and selecting that way; since I always return the entire JSON document, querying via the JSON contents seemed unnecessary.
I had heard a lot of positive things about Postgres being 2.8-3x faster than MongoDB at handling JSON data, and my initial results seemed to bear that out. I built a tester that sequentially performs a GET against the API for every object in the database (roughly 30k), prints each response, and records the total time taken. Postgres finished all 30k in 1.5 minutes; MongoDB took 6. I then wrote a second test, run the same way, but instead querying 100 funds at a time: it waits for all 100 to be processed (printed), then feeds in the next batch of 100 IDs, repeating until every fund has been queried and returned. This is quite easy to do in Go using goroutines. MongoDB again finished in roughly 6 minutes... but Postgres took 45. Postgres's slowness was very noticeable in the console output.
How can going from sequential to concurrent execution of these simple SELECT statements be so incredibly slow? Is there some major thing I am missing? I increased the buffer sizes in the Postgres config, tried the pg_prewarm extension on the table to load it into memory so it would be faster, and created a CREATE INDEX CONCURRENTLY index on the ID column. I am relatively new to Postgres, having primarily used MySQL in the past. The sequential tests showed Postgres is definitely much faster at handling the data, but when scaled up to high demand its performance completely went out the window. This is a small microservice API to be used internally to request data for a website that can be under heavy load at times.
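For reference, the tuning steps above amount to running the statements below. This is a sketch wrapping them in an illustrative `warmup` helper (the helper and its `run` callback are my own names, not library API), issued one at a time since CREATE INDEX CONCURRENTLY cannot run inside a transaction:

```go
package main

import "fmt"

// warmup runs the tuning statements in order, stopping at the first
// error. run is expected to execute one SQL statement, e.g. a thin
// wrapper around (*sql.DB).Exec.
func warmup(run func(stmt string) error) error {
	stmts := []string{
		`CREATE EXTENSION IF NOT EXISTS pg_prewarm`,
		`SELECT pg_prewarm('public.tblname')`, // pull the table into shared buffers
		`CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_id ON public.tblname (id)`,
	}
	for _, s := range stmts {
		if err := run(s); err != nil {
			return fmt.Errorf("%s: %w", s, err)
		}
	}
	return nil
}

func main() {
	// Dry run that just prints the statements. Against a live database
	// this would be:
	//   warmup(func(s string) error { _, err := db.Exec(s); return err })
	warmup(func(s string) error {
		fmt.Println(s)
		return nil
	})
}
```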
Is there anything I could try to fix this, or does Postgres simply plateau this way? Thanks for any help!
Here's the table creation code in Postgres if this helps diagnose anything:
(The ID is a string, so text is intentional.)
CREATE TABLE public.tblname
(
id text NOT NULL,
data jsonb,
CONSTRAINT tblname_pkey PRIMARY KEY (id)
)
WITH (
OIDS=FALSE
);
CREATE INDEX idx_id
ON public.tblname
USING btree
(id COLLATE pg_catalog."default");
Edit: The query is a simple select that looks like:
SELECT data FROM tblname WHERE id= $1
$1 is where the ID string is passed in from Go. I am also using the pq Postgres driver library for the project; it mainly extends Go's built-in database/sql library. https://github.com/lib/pq
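The Go side of the query is essentially the following (a simplified sketch, not my exact handler: the connection string and the `getData` name are illustrative, and it needs a live database and the github.com/lib/pq module to actually run):

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/lib/pq" // registers the "postgres" driver with database/sql
)

// getData fetches the jsonb payload for one ID. db is the shared
// *sql.DB pool; every concurrent request goes through it.
func getData(db *sql.DB, id string) ([]byte, error) {
	var data []byte
	err := db.QueryRow(`SELECT data FROM tblname WHERE id = $1`, id).Scan(&data)
	return data, err
}

func main() {
	// Connection string is illustrative; adjust for your environment.
	db, err := sql.Open("postgres", "postgres://user:pass@localhost/mydb?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	data, err := getData(db, "some-id")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(data))
}
```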