Commit Graph

51 Commits (1d68d70a418422bd14a2b5766d8a1c94854dd060)

Author SHA1 Message Date
Max Ignatenko 1d68d70a41 Avoid the lister getting stuck on a single broken PDS 2024-05-06 21:17:42 +01:00
Max Ignatenko f315059994 Avoid querying too few repos from each single PDS 2024-05-06 21:14:18 +01:00
Max Ignatenko c364822818 Tweak rate limit a little bit 2024-04-14 13:07:24 +01:00
Max Ignatenko bc24c23afc Switch to batch inserts 2024-04-14 12:45:05 +01:00
Max Ignatenko c5f3a55ac8 Add a target for waiting until PLC mirror catches up 2024-04-13 19:42:36 +01:00
Max Ignatenko d6b5850827 Add plc mirror
If you want to avoid sending all requests to https://plc.directory while
it's catching up for the first time, do
`docker compose up -d --build plc` and wait for it to catch up before
updating all other containers.
2024-04-13 16:45:02 +01:00
Max Ignatenko c17730c11f Drop the hack for detecting cursor reset from non-compliant servers 2024-04-10 23:32:21 +01:00
Max Ignatenko 4b40c5919b Don't update LastFirehoseRev more than once a day 2024-04-07 15:06:13 +01:00
Max Ignatenko 52dd38f11b Don't attribute context cancellation to firehose record content 2024-04-06 22:14:58 +01:00
Max Ignatenko ecf2fc57d8 Disconnect from firehoses in parallel when shutting down 2024-04-06 21:59:21 +01:00
Max Ignatenko 1358bc3f08 Export a counter of firehose connection errors 2024-04-06 21:55:36 +01:00
Max Ignatenko ff0ea08296 Implement commit signature validation 2024-04-06 21:50:51 +01:00
Max Ignatenko 7c09c37a51 Pass the correct context to Consumer.Start so that it will actually stop
when signalled
2024-03-29 10:16:22 +00:00
Max Ignatenko 1abe505ef9 Add support for discovering new PDSs from relays 2024-03-28 20:55:02 +00:00
Max Ignatenko c919050833 Keep the set of running consumers up to date 2024-03-28 20:02:48 +00:00
Max Ignatenko 337f3ef2b8 Delete commented out code 2024-03-28 18:46:39 +00:00
Max Ignatenko fc5307a971 Add a bit more logging for cursor values 2024-03-28 16:00:28 +00:00
Max Ignatenko 5d3d562ecc Stop the ticker when goroutine exits 2024-03-28 16:00:28 +00:00
Max Ignatenko ffa2faa420 Properly escape null character in the consumer too 2024-03-24 12:43:52 +00:00
Max Ignatenko 693ae1ba0a Bump the limit on stored bad records to 500 2024-03-24 12:32:13 +00:00
Max Ignatenko bae23a62d0 Don't try to upsert zero records 2024-03-20 13:31:14 +00:00
Max Ignatenko e8c816a3a3 Update FirstCursorSinceReset even if we received zero new records 2024-03-17 21:22:34 +00:00
Max Ignatenko 328a676e2a Fix potential infinite loop for inactive repos after a cursor reset 2024-03-17 19:36:12 +00:00
Max Ignatenko 4c41389e9b Detect CAR files with zero blocks and handle them accordingly 2024-03-17 17:42:09 +00:00
Max Ignatenko 638bdcf515 Add a check for zero bytes fetched from PDS
And log additional info when we're failing to extract rev from repo
bytes
2024-03-17 17:15:23 +00:00
Max Ignatenko 553894dc6a Switch to partial repo fetches 2024-03-17 15:34:26 +00:00
Max Ignatenko 0cc11e75b2 Avoid updating records when they didn't actually change 2024-03-17 14:16:25 +00:00
Max Ignatenko 9c51a4621f Start recording last rev for each repo 2024-03-13 22:28:54 +00:00
Max Ignatenko 44d2b25951 Fix nil pointer deref 2024-03-13 16:38:58 +00:00
Max Ignatenko e150f1da90 Populate FirstRevSinceReset if empty 2024-03-13 11:32:23 +00:00
Max Ignatenko 87d510e67a Check repos against PDS cursor resets, instead of waiting for a first new event for them on firehose 2024-03-13 10:39:41 +00:00
Max Ignatenko 57aa4731e5 Properly update repo consistency metadata
Previously these two queries did nothing at all -_-

Note that this will trigger a full re-index of every repo seen on the
firehose.
2024-03-13 10:36:41 +00:00
Max Ignatenko ddde20a014 Fix comparison with NULL when listing PDSs 2024-02-23 15:22:22 +00:00
Max Ignatenko 0f191b2609 Increment exported vars for every language in a post 2024-02-23 11:12:39 +00:00
mathan 78a17bf238 Merge branch 'main' of github.com:uabluerail/indexer 2024-02-22 18:56:19 -08:00
mathan db425b1d5f Fix view in migration. Add by lang metric to consumer. 2024-02-22 18:54:29 -08:00
Max Ignatenko a20ddf0717 Fix the fucking regexp 2024-02-22 18:16:43 +00:00
Max Ignatenko a28199fb92 Handle the new #identity message 2024-02-22 12:17:21 +00:00
Max Ignatenko 8f32c494f7 Add whitelist for PDS hosts and update repo PDS pointer on appropriate occasions 2024-02-22 12:17:21 +00:00
mathan 600dac7694 Added metric for posts by lang (olivčykom). 2024-02-21 18:09:12 -08:00
Max Ignatenko b2003530ba Handle repos with unknown PDS 2024-02-21 09:35:25 +00:00
Max Ignatenko a0934c360e Fix division by zero 2024-02-19 18:34:31 +00:00
Max Ignatenko 758c5fe5e6 Add quick&dirty quarantine logic for bad records 2024-02-19 17:06:19 +00:00
Max Ignatenko 1d25842b78 Add a few more metrics 2024-02-18 17:23:54 +00:00
Max Ignatenko 1038ca3bea Add AtRev column to only overwrite records with a newer version 2024-02-17 14:29:45 +00:00
Max Ignatenko 1d3c6edf0a Fix panic on closing already closed channel 2024-02-17 12:26:57 +00:00
Max Ignatenko 04d521f58c Increase request timeout; the default 30 seconds is not enough to download some repos 2024-02-16 16:47:36 +00:00
Max Ignatenko 4626b8b9ca Limit the number of retries when failing to index a repo 2024-02-16 15:23:34 +00:00
Max Ignatenko 8313c74482 Remove deleted_at 2024-02-15 20:29:08 +00:00
Max Ignatenko 8561d90caf A bit of query optimization 2024-02-15 18:39:29 +00:00