drgzmmy6379 2015-07-23 17:22
Viewed 95 times

Algorithm to run at most one concurrent thread per unique ID

I have a Go web application that needs to execute a given section of code in at most one goroutine per unique ID. The scenario is that requests come in with various IDs, each representing a sort of transaction. A certain subset of the operations on these must be guaranteed to run only "one at a time" for a given ID (and other competing requests should block until the prior one working on/for that ID is done).

I can think of a few ways to do this, but the bookkeeping seems tricky: keep a global mutex to lock access to a map of which concurrent requests are in flight, use a per-ID mutex or counter from there, make sure it doesn't deadlock, and then garbage-collect (or carefully reference-count) old request entries. I can do this, but it sounds error-prone.

Is there a pattern or something in the standard library that can be easily used to good effect in this case? Didn't see anything obvious.

EDIT: One thing I think was confusing in my explanation above is the use of the word "transaction". In my case each of these does not need an explicit close - it's just an identifier to associate multiple operations with. Since I don't have an explicit "close" or "end" concept to these, I might receive 3 requests within the same second and each operation takes 2 seconds - and I need to serialize those because running them concurrently will wreak havoc; but then I might get a request a week later with that same ID and it would be referring to the same set of operations (the ID is just the PK on a table in a database).


3 Answers

  • dousa2794 2015-07-23 18:28

    You've got a good start with the locked global map. You can have a worker per "transaction" and handlers send requests to them over channels, using a locked map to keep track of the channels. Workers can close transactions when they receive a special request. You don't want dangling transactions to become a problem, so you should probably arrange to get an artificial close request sent after a timeout.

    That isn't the only way, though it might be convenient. If you only need to make certain requests wait while their transaction is being worked on elsewhere, there is probably a construction with a map of *sync.Mutexes, rather than channels talking to worker goroutines, that has better resource use. (There's now code for that approach, more or less, in bgp's answer.)
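
    For reference, here is a minimal sketch of that map-of-mutexes approach, with reference counting so entries can be garbage-collected once nobody holds or waits for them. The names (idLocks, refLock) are mine, not from any answer:

    ```go
    package main

    import (
        "fmt"
        "sync"
    )

    // idLocks serializes work per ID. Each entry is reference-counted
    // so it can be deleted from the map once no goroutine holds or
    // waits for it.
    type idLocks struct {
        mu    sync.Mutex
        locks map[int]*refLock
    }

    type refLock struct {
        mu   sync.Mutex
        refs int
    }

    func newIDLocks() *idLocks {
        return &idLocks{locks: map[int]*refLock{}}
    }

    // Lock blocks until the caller holds the lock for id.
    func (l *idLocks) Lock(id int) {
        l.mu.Lock()
        rl := l.locks[id]
        if rl == nil {
            rl = &refLock{}
            l.locks[id] = rl
        }
        rl.refs++
        l.mu.Unlock()
        rl.mu.Lock()
    }

    // Unlock releases the lock for id, removing the map entry when no
    // other goroutine is using it.
    func (l *idLocks) Unlock(id int) {
        l.mu.Lock()
        rl := l.locks[id]
        rl.refs--
        if rl.refs == 0 {
            delete(l.locks, id)
        }
        l.mu.Unlock()
        rl.mu.Unlock()
    }

    func main() {
        locks := newIDLocks()
        var wg sync.WaitGroup
        counter := 0
        for i := 0; i < 5; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                locks.Lock(42) // all goroutines contend on the same ID
                counter++      // safe: serialized per ID
                locks.Unlock(42)
            }()
        }
        wg.Wait()
        fmt.Println("counter:", counter)
        fmt.Println("entries left:", len(locks.locks))
    }
    ```

    Note the ordering in Unlock: the refcount is decremented and the entry deleted under the map lock *before* releasing the per-ID mutex, so a waiter that already incremented refs never sees its entry disappear, while a fresh caller after deletion simply creates a new one.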

    An example of the channel approach is below; besides serializing work within each transaction, it demonstrates how you might do graceful shutdown with close and a sync.WaitGroup for a setup like this, and timeouts. It's on the Playground.

    package main
    
    import (
        "fmt"
        "log"
        "sync"
        "time"
    )
    
    // Req represents a request. In real use, if there are many kinds of requests, it might be or contain an interface value that can point to one of several different concrete structs.
    type Req struct {
        id      int
        payload string // just for demo
        // ...
    }
    
    // Worker represents worker state.
    type Worker struct {
        id   int
        reqs chan *Req
        // ...
    }
    
    var tasks = map[int]chan *Req{}
    var tasksLock sync.Mutex
    
    const TimeoutDuration = 100 * time.Millisecond // to demonstrate; in reality higher
    
    // for graceful shutdown, you probably want to be able to wait on all workers to exit
    var tasksWg sync.WaitGroup
    
    func (w *Worker) Work() {
        defer func() {
            tasksLock.Lock()
            delete(tasks, w.id)
            if r := recover(); r != nil {
                log.Println("worker panic (continuing):", r)
            }
            tasksLock.Unlock()
            tasksWg.Done()
        }()
        for req := range w.reqs {
            // ...do work...
            fmt.Println("worker", w.id, "handling request", req)
            if req.payload == "close" {
                fmt.Println("worker", w.id, "quitting because of a close req")
                return
            }
        }
        fmt.Println("worker", w.id, "quitting since its channel was closed")
    }
    
    // Handle dispatches the Request to a Worker, creating one if needed.
    func (r *Req) Handle() {
        tasksLock.Lock()
        defer tasksLock.Unlock()
        id := r.id
        reqs := tasks[id]
        if reqs == nil {
        // making a buffered channel here would let you queue up
        // n tasks for a given ID before the Handle() call
        // blocks
            reqs = make(chan *Req)
            tasks[id] = reqs
            w := &Worker{
                id:   id,
                reqs: reqs,
            }
            tasksWg.Add(1)
            go w.Work()
            time.AfterFunc(TimeoutDuration, func() {
                tasksLock.Lock()
                if reqs := tasks[id]; reqs != nil {
                    close(reqs)
                    delete(tasks, id)
                }
                tasksLock.Unlock()
            })
        }
        // you could close(reqs) if you get a request that means
        // 'end the transaction' with no further info. I'm only
        // using close for graceful shutdown, though.
        reqs <- r
    }
    
    // Shutdown asks the workers to shut down and waits.
    func Shutdown() {
        tasksLock.Lock()
        for id, w := range tasks {
            close(w)
            // delete so timers, etc. won't see a ghost of a task
            delete(tasks, id)
        }
        // must unlock b/c workers can't finish shutdown
        // until they can remove themselves from maps
        tasksLock.Unlock()
        tasksWg.Wait()
    }
    
    func main() {
        fmt.Println("Hello, playground")
        reqs := []*Req{
            {id: 1, payload: "foo"},
            {id: 2, payload: "bar"},
            {id: 1, payload: "baz"},
            {id: 1, payload: "close"},
            // worker 2 will get closed because of timeout
        }
        for _, r := range reqs {
            r.Handle()
        }
        time.Sleep(75 * time.Millisecond)
        r := &Req{id: 3, payload: "quux"}
        r.Handle()
        fmt.Println("worker 2 should get closed by timeout")
        time.Sleep(75 * time.Millisecond)
        fmt.Println("worker 3 should get closed by shutdown")
        Shutdown()
    }
    
