doushengyou2617 2015-05-05 12:09
浏览 30

PHP / MYSQL网站全文搜索查询带过滤器

I have a conceptual programming/database (PHP/MYSQL) question for a project I'm working on. Just wondering if anyone can help with a solution. I've been racking my brains for ages trying to work out the best way to solve this issue and am pretty sure I've looked everywhere for an answer on SO.

I can't be particularly verbose about the specifics of what the site does due to the concept being the property of someone else. But I'll try to explain what I can...

I have a project that will involve a large amount of data being quickly queried. We're talking a minimum of 2 Million Records...This in itself isn't a massive issue, but the way I need to query and interact with this data is.

Take the following as an example work flow for this site.

1) User Visits Website

2) User Logs In

3) User Clicks on the 'Random' Link - A bit like Google's 'I'm feeling Lucky'

4) User is randomly displayed an item from the database that he/she hasn't seen before from the 2Mil+ List of items in the db. This should have the ability to filter by item location/type.

5) User gives 'feedback' on the item displayed, then clicks next

6) the next random item in the table is displayed and the process repeats

The storage of 2 million records isn't the problem. The real problem is, how do I hold the feedback data so it is then not included in Generate Link system from that point forth for that specific User.

That is, how would I do this in such a way that the user doesn't have to wait a long time between items for the query to complete. I understand that Splits and Joins could be used, but this still doesn't seem very fast when I've tried it.

To be 100% Clear as an example, this is the closest I've come to a solution before grinding to a halt.

Say for instance we have five tables:

item_list
- ItemId (int, inc, PK)
- ItemName (varchar)
- ItemCity (int, rel)
- ItemCountry (int, rel)

item_city
- LocationId (int, inc, PK)
- LocationName (varchar)

item_country
- LocationId (int, inc, PK)
- LocationName (varchar)

user_list
- UserId (int, inc, PK)
- UserName (varchar)

user_input
- InputId (int, inc, PK)
- UserId (int, rel)
- ItemId (int, rel)
- ItemFeedback (text)

When a user gives feedback on an item within item_list, a record is added to the user_input table with the relevant UserId and ItemId. I could then do a select on item_list where ItemId is not in the user_input table against their UserId. This strikes me as a bad way to go though. If you have 2 million items, that's 1 user with 2 mil records in the user_input table if he completes every item...However, I expect at least 50 users which equates to 100 million records if every user gives feedback on every item. Same goes for a comma separated list of values.

I just have no idea how to achieve this without massive slowdowns or lag as soon as you reach 1000+ pieces of feedback from a user.

I really hope this makes sense let me know if you have any questions and I will try my hardest to answer them.

Thanks -Dave

  • 写回答

0条回答 默认 最新

    报告相同问题?

    悬赏问题

    • ¥15 Python中的request,如何使用ssr节点,通过代理requests网页。本人在泰国,需要用大陆ip才能玩网页游戏,合法合规。
    • ¥100 为什么这个恒流源电路不能恒流?
    • ¥15 有偿求跨组件数据流路径图
    • ¥15 写一个方法checkPerson,入参实体类Person,出参布尔值
    • ¥15 我想咨询一下路面纹理三维点云数据处理的一些问题,上传的坐标文件里是怎么对无序点进行编号的,以及xy坐标在处理的时候是进行整体模型分片处理的吗
    • ¥15 CSAPPattacklab
    • ¥15 一直显示正在等待HID—ISP
    • ¥15 Python turtle 画图
    • ¥15 stm32开发clion时遇到的编译问题
    • ¥15 lna设计 源简并电感型共源放大器