Document consistency model for repo data

Max Ignatenko 2024-03-17 15:21:54 +00:00
parent 0cc11e75b2
commit 346aa7c539
1 changed files with 116 additions and 0 deletions

# Data consistency model
## Available operations
### Repo fetch
We can fetch a full copy of a repo. Each commit contains a `rev` - a string value
that is strictly increasing with each new commit.
### Consuming firehose
We can stream new events from each PDS. Every event comes with a cursor value -
an integer that is strictly increasing and scoped to a PDS. Events also contain a
repo-specific `rev`, which is the same value that a full repo fetch returns.
## Metadata fields
### PDS
* `Cursor` - last cursor value received from this PDS.
* `FirstCursorSinceReset` - earliest cursor from which we have an uninterrupted
  sequence of records up to now.
### Repo
* `LastIndexedRev` - last `rev` recorded during the most recent full repo re-index
  * We have all records up to and including this `rev`
* `FirstRevSinceReset` - first `rev` seen on the firehose since the most recent
  cursor reset.
  * Changes only when an event for this repo is received, so on its own it
    doesn't guarantee that we have all subsequent records
* `FirstCursorSinceReset` - copy of the PDS field with the same name.
  * If `Repo`.`FirstCursorSinceReset` >= `PDS`.`FirstCursorSinceReset` and the
    PDS's firehose is live, then we do have all records since
    `FirstRevSinceReset`
* `LastFirehoseRev` - last `rev` seen on the firehose
  * Currently recorded, but not used for anything
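
As a rough illustration, the metadata above could be modelled as two small records. This is only a sketch: the Python names, the use of `None` for "not set", and `0` as a stand-in for an unset cursor are assumptions, not the actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PDS:
    # Last cursor value received from this PDS (0 = nothing received yet; assumption).
    cursor: int = 0
    # Earliest cursor from which we have an uninterrupted sequence of events.
    first_cursor_since_reset: int = 0


@dataclass
class Repo:
    # `rev` recorded during the most recent full re-index, if any.
    last_indexed_rev: Optional[str] = None
    # First `rev` seen on the firehose since the most recent cursor reset.
    first_rev_since_reset: Optional[str] = None
    # Copy of PDS.first_cursor_since_reset taken when the field above was set.
    first_cursor_since_reset: int = 0
    # Last `rev` seen on the firehose (recorded, currently unused).
    last_firehose_rev: Optional[str] = None


def firehose_coverage_valid(repo: Repo, pds: PDS, firehose_live: bool) -> bool:
    """True when we hold every record since repo.first_rev_since_reset."""
    return (
        firehose_live
        and repo.first_rev_since_reset is not None
        and repo.first_cursor_since_reset >= pds.first_cursor_since_reset
    )
```

The helper `firehose_coverage_valid` is hypothetical; it only restates the condition attached to `FirstCursorSinceReset`.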
## Guarantees
* Up to and including `LastIndexedRev` - all records have been indexed.
* If `Repo`.`FirstCursorSinceReset` >= `PDS`.`FirstCursorSinceReset`:
  * Starting from and including `FirstRevSinceReset` - we have indexed all newer
    records
  * Consequently, if `LastIndexedRev` >= `FirstRevSinceReset` - we have a
    complete copy of the repo
* If `Repo`.`FirstCursorSinceReset` < `PDS`.`FirstCursorSinceReset`:
  * There was a cursor reset, so we might be missing some records after
    `FirstRevSinceReset`
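
The guarantees combine into a single completeness check. The sketch below assumes that `rev` strings can be compared lexicographically and that unset fields are represented as `None`; the function name and its parameters are illustrative only.

```python
from typing import Optional


def has_complete_copy(
    last_indexed_rev: Optional[str],
    first_rev_since_reset: Optional[str],
    repo_first_cursor: int,
    pds_first_cursor: int,
) -> bool:
    """Return True if the indexed data is known to cover the whole repo."""
    # Without both revs there is nothing to compare.
    if last_indexed_rev is None or first_rev_since_reset is None:
        return False
    # A cursor reset happened after FirstRevSinceReset was recorded,
    # so firehose coverage since that rev may have gaps.
    if repo_first_cursor < pds_first_cursor:
        return False
    # The full fetch reaches at least the point where gap-free
    # firehose coverage starts (assumes lexicographic rev ordering).
    return last_indexed_rev >= first_rev_since_reset
```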
## Operations
### Indexing a repo
* Fetch the repo
* Upsert all fetched records
* Set `LastIndexedRev` to `rev` of the fetched repo
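
The three steps can be sketched as follows. The `fetch_repo` callable, the dictionary used as a record store, and the metadata dictionary are all hypothetical stand-ins for the real fetcher and database.

```python
from typing import Callable, Dict, Tuple

# Hypothetical shape: a fetch returns (rev, {record_key: record_value}).
FetchResult = Tuple[str, Dict[str, dict]]


def index_repo(
    did: str,
    fetch_repo: Callable[[str], FetchResult],
    store: Dict[Tuple[str, str], dict],
    repo_meta: Dict[str, str],
) -> str:
    """Fetch a repo, upsert its records, then record the fetched rev."""
    rev, records = fetch_repo(did)
    for key, value in records.items():
        # Upsert: insert new records, overwrite existing ones.
        store[(did, key)] = value
    # Updated last, so a failure during upserts leaves the old value in place.
    repo_meta["LastIndexedRev"] = rev
    return rev
```

Writing `LastIndexedRev` only after all upserts have finished is a choice made in this sketch, so that the field never claims more than what has actually been stored.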
### Connecting to firehose
* If the first message is `#info`, our cursor is too old:
  * Update PDS's `FirstCursorSinceReset` to the value supplied in the `#info`
    message

Workaround for a buggy relay that doesn't send `#info`:

* If the first message has a cursor value that is different from `Cursor`+1:
  * Assume there was a cursor reset and update PDS's `FirstCursorSinceReset` to
    the value provided in the message
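
A minimal sketch of the connection logic, covering both the `#info` case and the workaround. The message is reduced to a type string and a cursor value, which is an assumed shape, and `on_first_message` is a hypothetical name.

```python
from dataclasses import dataclass


@dataclass
class PDS:
    cursor: int
    first_cursor_since_reset: int


def on_first_message(pds: PDS, msg_type: str, msg_cursor: int) -> bool:
    """Handle the first firehose message; return True if a reset was detected."""
    if msg_type == "#info":
        # Our cursor is too old: uninterrupted history now starts here.
        pds.first_cursor_since_reset = msg_cursor
        return True
    if msg_cursor != pds.cursor + 1:
        # Workaround: no `#info` was sent, but there is a gap in the sequence.
        pds.first_cursor_since_reset = msg_cursor
        return True
    return False
```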
### Receiving event on firehose
* Check that the event is coming from the correct PDS for a given repo
  * TODO: maybe drop this and just check the signature
* Process the event normally
* If `Repo`.`FirstCursorSinceReset` >= `PDS`.`FirstCursorSinceReset`:
  * No metadata updates needed for the repo
* If `Repo`.`FirstCursorSinceReset` < `PDS`.`FirstCursorSinceReset`:
  * Set repo's `FirstRevSinceReset` to the event's `rev` and
    `FirstCursorSinceReset` to `PDS`.`FirstCursorSinceReset`
* If the `tooBig` flag is set on the message (MST diff was larger than PDS's size
  limit, so some records were dropped):
  * Set repo's `FirstRevSinceReset` to the event's `rev` and
    `FirstCursorSinceReset` to `PDS`.`FirstCursorSinceReset`
  * Note: `FirstCursorSinceReset` might stay the same, but moving
    `FirstRevSinceReset` forward will likely trigger repo reindexing
* Update `LastFirehoseRev` to the event's `rev`
* Update PDS's `Cursor` to the value provided in the message
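
The metadata updates for a single event might look like the sketch below. It leaves out the PDS check and the actual event processing, and the field and function names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PDS:
    cursor: int
    first_cursor_since_reset: int


@dataclass
class Repo:
    first_rev_since_reset: Optional[str] = None
    first_cursor_since_reset: int = 0
    last_firehose_rev: Optional[str] = None


def update_metadata(
    repo: Repo, pds: PDS, rev: str, cursor: int, too_big: bool = False
) -> None:
    """Apply the per-event metadata updates after the event was processed."""
    stale = repo.first_cursor_since_reset < pds.first_cursor_since_reset
    if stale or too_big:
        # Gap-free firehose coverage for this repo (re)starts at this event.
        repo.first_rev_since_reset = rev
        repo.first_cursor_since_reset = pds.first_cursor_since_reset
    repo.last_firehose_rev = rev
    pds.cursor = cursor
```

The two conditions share a branch here because they trigger the same assignments.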
### Listing repos
* Fetch a list of repos from a PDS. Response also includes the last `rev` for
  every repo.
* For each repo:
  * If `FirstRevSinceReset` is not set:
    * Set `FirstRevSinceReset` to received `rev`
    * Set `FirstCursorSinceReset` to the PDS's `FirstCursorSinceReset`
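
For one listed repo, the update could be sketched like this, assuming an unset `FirstRevSinceReset` is stored as `None`. `on_repo_listed` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Repo:
    first_rev_since_reset: Optional[str] = None
    first_cursor_since_reset: int = 0


def on_repo_listed(repo: Repo, listed_rev: str, pds_first_cursor: int) -> None:
    """Initialise firehose bookkeeping for a repo discovered via listing."""
    if repo.first_rev_since_reset is None:
        repo.first_rev_since_reset = listed_rev
        repo.first_cursor_since_reset = pds_first_cursor
```

Repos that already have `FirstRevSinceReset` set are left untouched.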
### Repo migrating to a different PDS
TODO

Currently we simply reset `FirstRevSinceReset`.
### Finding repos that need indexing
* Repo index is incomplete and needs to be indexed if one of these is true:
  * `LastIndexedRev` is not set
  * `LastIndexedRev` < `FirstRevSinceReset`
  * `Repo`.`FirstCursorSinceReset` < `PDS`.`FirstCursorSinceReset`
    * TODO: avoid reindexing the repo forever if there are no new firehose
      events for it.
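
The selection rule can be expressed as a predicate. This sketch compares `LastIndexedRev` against `FirstRevSinceReset`, as the Guarantees section implies, and assumes that `rev` strings order lexicographically and that unset fields are `None`. It does not address the TODO about repos that never receive new events.

```python
from typing import Optional


def needs_indexing(
    last_indexed_rev: Optional[str],
    first_rev_since_reset: Optional[str],
    repo_first_cursor: int,
    pds_first_cursor: int,
) -> bool:
    """Return True if the repo should be (re)indexed with a full fetch."""
    if last_indexed_rev is None:
        # Never indexed.
        return True
    if first_rev_since_reset is not None and last_indexed_rev < first_rev_since_reset:
        # Possible gap between the last full fetch and firehose coverage.
        return True
    # A cursor reset happened that this repo has not caught up with.
    return repo_first_cursor < pds_first_cursor
```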