weixin_39929253
2020-12-02 09:57

Envelope rate limits

Notes:

  • Expected response example: https://github.com/getsentry/sentry-java/blob/main/sentry/src/test/java/io/sentry/transport/HttpTransportTest.kt#L146-L161
  • Existing 429 handling should remain

HTTP Responses

Rate limits are communicated to downstream clients (both Relay and SDKs) via status codes and a custom response header. For regular rate limit responses, we emit a [429](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/429) status code and specify a Retry-After header.

Changes

  • We include a special response header containing a list of all rate limits in addition to other information.

To inform downstream clients (both Relay and SDKs) about the rate limits that applied, a custom response header is appended:

```
X-Sentry-Rate-Limits: *quota_limit*, *quota_limit*, ...
```

Each quota_limit has the form retry_after:categories:scope:... with the following parameters:

  • retry_after: Number of seconds until this rate limit expires.
  • categories: Semicolon separated list of categories. If empty, this limit applies to all categories.
  • scope: The scope that this limit applies to. Can be ignored by SDKs.
  • More parameters can be added in the future.

The header may contain spaces which need to be ignored. Relay will only emit spaces after , to separate quota_limits.
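
The format above can be parsed with a few string splits. The following is a minimal sketch, not the SDK's implementation: the names `QuotaLimit` and `parse_rate_limits` are hypothetical, and skipping malformed entries is an assumption the text does not state.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QuotaLimit:
    retry_after: int        # seconds until this limit expires
    categories: List[str]   # empty list: the limit applies to all categories
    scope: Optional[str]    # may be ignored by SDKs


def parse_rate_limits(header: str) -> List[QuotaLimit]:
    """Parse an X-Sentry-Rate-Limits header value into QuotaLimit entries."""
    limits = []
    for entry in header.split(","):
        # Spaces carry no meaning, so strip them from every field.
        fields = [field.strip() for field in entry.split(":")]
        try:
            retry_after = int(fields[0])
        except ValueError:
            continue  # assumption: silently skip malformed or empty entries
        categories = []
        if len(fields) > 1:
            categories = [c.strip() for c in fields[1].split(";") if c.strip()]
        scope = fields[2] if len(fields) > 2 and fields[2] else None
        # fields[3:] are parameters added in the future and are ignored here.
        limits.append(QuotaLimit(retry_after, categories, scope))
    return limits
```

Calling `parse_rate_limits("60::organization")` yields one entry with an empty category list, which stands for "all categories".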

```json
{
  "": 60,
  "event": 10002700,
  "transaction": 10000060,
  "security": 10009000
}
```

Example response (formatted for readability):

```http
HTTP/1.1 429 Too Many Requests
Retry-After: 2700
X-Sentry-Rate-Limits:
  60:transaction:key,
  2700:default;error;security:organization
```

Example response (with empty categories list):

```http
HTTP/1.1 429 Too Many Requests
Retry-After: 60
X-Sentry-Rate-Limits: 60::organization, 2700::organization
```

  • [Accepted] Option 1: Header-based solution
  • [Discarded] Option 2: Header-based solution
  • [Discarded] Option 3: JSON response body solution

Client Enforcement

Clients are expected to honor 429 responses and rate limits communicated from Sentry by stopping data transmission until the rate limit has expired. Events, transactions and sessions during this period are to be discarded.

Stage 1: Parse Response Headers

To detect rate limits in a response from Sentry, apply the following steps:

  1. On every response, look for X-Sentry-Rate-Limits. If present, parse it and immediately go to Stage 2.
  2. On 429 responses, look for Retry-After. If present, treat it like scope=key and categories=[] and go to Stage 2. It is allowed but not required to do this on other status codes.
  3. Otherwise, there are no rate limits communicated by Sentry.
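
The three steps above could be sketched as a single function. This is an illustration under assumptions: `detect_rate_limits` and the tuple layout are hypothetical names, headers are assumed to arrive as a plain mapping, and only the delta-seconds form of `Retry-After` is handled (an HTTP-date value is treated as absent).

```python
from typing import List, Mapping, Optional, Tuple

# (retry_after_seconds, categories, scope); empty categories = all categories
Limit = Tuple[int, List[str], Optional[str]]


def _to_int(value: str) -> Optional[int]:
    """Return the integer value of a header field, or None if it is not one."""
    try:
        return int(value.strip())
    except ValueError:
        return None


def detect_rate_limits(status_code: int, headers: Mapping[str, str]) -> List[Limit]:
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}

    # Step 1: X-Sentry-Rate-Limits wins on every response, whatever the status.
    raw = lowered.get("x-sentry-rate-limits")
    if raw is not None:
        limits: List[Limit] = []
        for entry in raw.split(","):
            fields = [f.strip() for f in entry.split(":")]
            seconds = _to_int(fields[0])
            if seconds is None:
                continue  # assumption: skip malformed entries
            cats = [c.strip() for c in fields[1].split(";") if c.strip()] if len(fields) > 1 else []
            scope = fields[2] if len(fields) > 2 and fields[2] else None
            limits.append((seconds, cats, scope))
        return limits

    # Step 2: on 429, fall back to Retry-After as scope=key, categories=[].
    if status_code == 429 and "retry-after" in lowered:
        seconds = _to_int(lowered["retry-after"])
        if seconds is not None:
            return [(seconds, [], "key")]

    # Step 3: no rate limits were communicated.
    return []
```

The returned list is the input to Stage 2; an empty list means there is nothing to enforce.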

Stage 2: Determine Rate Limits

The exact behavior depends on the feature level of the Client:

  • Clients are generally allowed to ignore any dimension (such as category or scope) if they do not support it. However, if they have explicit support, they must obey them. For example, an SDK without support for tracing may ignore transaction, but SDKs with tracing must also implement and obey the transaction category.
  • SDKs that only support one event type can ignore X-Sentry-Rate-Limits and use the Retry-After header to determine rate limit expiration.
  • SDKs that support multiple event types must parse the categories dimension. For each category, they should maintain a separate limit, and another separate limit for the implicit "all" category. They can ignore the scope, as long as they only support one DSN per instance.
  • Proxy clients like Relay should create one limit-bucket per (category, scope) combination since they both handle multiple categories and scopes.
  • Categories and scopes that are unknown to the client should be ignored. The limit still applies to the known categories. Clients should expect that more will be added in the future.
  • Limits where all categories are unknown must be ignored. Do not apply unknown categories to a default category. Note that this is distinct from limits with no category, which implicitly apply to all categories (think of this as a magic category).
  • Always keep the maximum rate limit if multiple rate limits reference the same bucket. If a new rate limit is shorter than an already cached rate limit, then keep the longer one.
  • Unknown dimensions must be ignored, i.e., all additional colons and text after the scope.
  • X-Sentry-Rate-Limits returned in 200 OK responses should be treated like on 429 responses. Sentry may choose to respond with 200 OK regardless of a rate limit or may choose to inform a client proactively about a rate limit that is unrelated to the current request. This happens specifically for reject-all quotas to prevent clients from sending requests.
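
For an SDK that supports multiple event types but only one DSN, the bucket rules above might be tracked as follows. This is a sketch, not the SDK's implementation: `RateLimitTracker`, the `KNOWN_CATEGORIES` set, and the use of an empty string as the key for the implicit "all" category are assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# Assumption: categories this hypothetical client understands.
KNOWN_CATEGORIES = frozenset({"default", "error", "transaction", "security", "session"})
ALL = ""  # bucket key for limits without a category (apply to everything)

Limit = Tuple[int, List[str], Optional[str]]  # (retry_after, categories, scope)


class RateLimitTracker:
    def __init__(self,
                 known_categories: Iterable[str] = KNOWN_CATEGORIES,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._known = set(known_categories)
        self._clock = clock
        self._expiry: Dict[str, float] = {}  # bucket -> expiry on the clock

    def update(self, limits: Iterable[Limit]) -> None:
        now = self._clock()
        for retry_after, categories, _scope in limits:  # scope ignored: one DSN
            if categories:
                # Unknown categories are dropped; if none remain, the whole
                # limit is ignored rather than applied to a default bucket.
                buckets = [c for c in categories if c in self._known]
            else:
                buckets = [ALL]
            expires = now + retry_after
            for bucket in buckets:
                # Keep the longer limit when two limits hit the same bucket.
                if bucket not in self._expiry or expires > self._expiry[bucket]:
                    self._expiry[bucket] = expires

    def is_limited(self, category: str) -> bool:
        now = self._clock()
        never = float("-inf")
        latest = max(self._expiry.get(ALL, never), self._expiry.get(category, never))
        return latest > now
```

Before sending, the client would call `is_limited(category)` for each item and discard the items for which it returns `True`. A proxy such as Relay would key the buckets by (category, scope) instead of by category alone.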

This question originates from the open-source project getsentry/sentry-dotnet

10 answers

  • weixin_39995439 5 months ago

    Codecov Report

    Merging #553 into main will decrease coverage by 0.28%. The diff coverage is 77.06%.

    ```diff
    @@            Coverage Diff             @@
    ##             main     #553      +/-   ##
    ==========================================
    - Coverage   85.34%   85.06%   -0.29%
    ==========================================
      Files         138      142       +4
      Lines        3357     3461     +104
      Branches      752      775      +23
    ==========================================
    + Hits         2865     2944      +79
    - Misses        294      310      +16
    - Partials      198      207       +9
    ```

    | Impacted Files | Coverage Δ | |
    |---|---|---|
    | src/Sentry/Internal/Http/RateLimitCategory.cs | 45.45% <45.45%> (ø) | |
    | test/Sentry.Testing/HttpClientExtensions.cs | 72.72% <69.23%> (-5.06%) | ↓ |
    | src/Sentry/Internal/Http/HttpTransport.cs | 84.05% <80.55%> (-5.42%) | ↓ |
    | test/Sentry.Testing/FakeHttpMessageHandler.cs | 87.50% <87.50%> (ø) | |
    | src/Sentry/Internal/Http/RateLimit.cs | 100.00% <100.00%> (ø) | |
    | src/Sentry/Protocol/EnvelopeItem.cs | 60.71% <100.00%> (+0.71%) | ↑ |
    | test/Sentry.Testing/EmptySerializable.cs | 100.00% <100.00%> (ø) | |
    | test/Sentry.Testing/SentryResponses.cs | 100.00% <100.00%> (ø) | |
    | ... and 2 more | | |

    Continue to review full report at Codecov.

    Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Powered by Codecov. Last update 5d76498...8bde8b7.

  • weixin_39956182 5 months ago

    Sessions are fully independent of events and attachments.

    Attachments are tied to an event. If the event is dropped, then so should be the attachment. However, you must keep an event if the attachment is dropped.

    Headers don't matter in that case. If you remove an event (and its attachments), you can leave the event ID in the header. It will not cause any harm and will be ignored. If you want to be super clean, then remove it, or don't even write it there in the first place.

    2 and 4 are optimizations. So you can send the attachment and we will drop it at a later point, but it actually does get persisted temporarily. So it is highly encouraged to drop attachments with their events. The same goes for user feedback.

  • weixin_39929253 5 months ago

    TBH, still don't understand how categories work. I wrote some tests to document it, so maybe you can suggest which test cases should be added if I missed something. Otherwise the PR is ready for review. Still needs refactoring though.

  • weixin_39929253 5 months ago

    -garcia

    1. The documentation refers to the individual item rate limits as "quote-limits". Should we use that name as well?
    2. Is category the same as type on envelope item?

  • weixin_39929253 5 months ago

    > You mean quota-limits? Or maybe the docs are wrong since quote would be incorrect.

    Yes, quota. That was a typo. I meant quota limit as opposed to rate limit.

  • weixin_39929253 5 months ago

    Another interesting question.

    Suppose we have an envelope that was created from a single event and an attachment. Because of that, it has the event_id header set. It turns out the event category was rate-limited, so the event had to be discarded. What should happen to the header?

  • weixin_39929253 5 months ago

    Also, it appears we will not actually retry envelopes, so I guess we will have to drop RetryAfterHandler and implement a short-circuit mechanism in HttpTransport.

  • weixin_39929253 5 months ago

    > They are different. Not sure there's official documentation of known mappings at this point. I'd start with this: https://github.com/getsentry/sentry-cocoa/blob/fc02d71795aa047382332dbc198aac40b59852fa/Tests/SentryTests/Networking/RateLimits/SentryRateLimitCategoryMapperTests.swift

    I can't seem to understand from this how, for example, default maps to an item type exactly. Also, in the tests it seems that error maps to event, but in the docs there's an example that uses event as a category name.

  • weixin_39956182 5 months ago

    > Suppose we have an envelope that was created from a single event and an attachment. Because of that, it has the event_id header set. It turns out the event category was rate-limited, so the event had to be discarded. What should happen to the header?

    I don't think it matters in this case; Relay should figure this out. I'll double check.

    > Also, it appears we will not actually retry envelopes, so I guess we will have to drop RetryAfterHandler and implement a short-circuit mechanism in HttpTransport.

    We don't have any retries. RetryAfterHandler is used to drop events for the duration of the Retry-After header. Can we just keep that in the pipeline? I believe it's documented that we still need to continue with that and drop everything if we hit 429, or am I misunderstanding?

    > I can't seem to understand from this how, for example, default maps to an item type exactly. Also, in the tests it seems that error maps to event, but in the docs there's an example that uses event as a category name.

    event.type = default and event.type = error are both events. I assume the category for the rate limit is event then, and if it's default or error it should match. That should be the case in Cocoa and the other SDKs, shouldn't it?

    > Yes, quota. That was a typo. I meant quota limit as opposed to rate limit.

    I assume the docs will change the wording; please keep it consistent with the other SDKs for now.

  • weixin_39929253 5 months ago

    > We don't have any retries. RetryAfterHandler is used to drop events for the duration of the Retry-After header. Can we just keep that in the pipeline? I believe it's documented that we still need to continue with that and drop everything if we hit 429, or am I misunderstanding?

    Ah, okay. I couldn't figure out the behavior from the tests and didn't look too deeply in the implementation, but if it drops events, then we can keep it.
