33 Commits (ffa2faa4208fbdefadbfc39a4e3d91030646c04f)

Author SHA1 Message Date
Max Ignatenko ffa2faa420 Properly escape null character in the consumer too 2024-03-24 12:43:52 +00:00
Max Ignatenko 693ae1ba0a Bump the limit on stored bad records to 500 2024-03-24 12:32:13 +00:00
Max Ignatenko bae23a62d0 Don't try to upsert zero records 2024-03-20 13:31:14 +00:00
Max Ignatenko e8c816a3a3 Update FirstCursorSinceReset even if we received zero new records 2024-03-17 21:22:34 +00:00
Max Ignatenko 328a676e2a Fix potential infinite loop for inactive repos after a cursor reset 2024-03-17 19:36:12 +00:00
Max Ignatenko 4c41389e9b Detect CAR files with zero blocks and handle them accordingly 2024-03-17 17:42:09 +00:00
Max Ignatenko 638bdcf515 Add a check for zero bytes fetched from PDS 2024-03-17 17:15:23 +00:00
    And log additional info when we're failing to extract rev from repo
    bytes
Max Ignatenko 553894dc6a Switch to partial repo fetches 2024-03-17 15:34:26 +00:00
Max Ignatenko 0cc11e75b2 Avoid updating records when they didn't actually change 2024-03-17 14:16:25 +00:00
Max Ignatenko 9c51a4621f Start recording last rev for each repo 2024-03-13 22:28:54 +00:00
Max Ignatenko 44d2b25951 Fix nil pointer deref 2024-03-13 16:38:58 +00:00
Max Ignatenko e150f1da90 Populate FirstRevSinceReset if empty 2024-03-13 11:32:23 +00:00
Max Ignatenko 87d510e67a Check repos against PDS cursor resets, instead of waiting for the first new event for them on the firehose 2024-03-13 10:39:41 +00:00
Max Ignatenko 57aa4731e5 Properly update repo consistency metadata 2024-03-13 10:36:41 +00:00
    Previously these two queries did nothing at all -_-

    Note that this will trigger a full re-index of every repo seen on the
    firehose.
Max Ignatenko ddde20a014 Fix comparison with NULL when listing PDSs 2024-02-23 15:22:22 +00:00
Max Ignatenko 0f191b2609 Increment exported vars for every language in a post 2024-02-23 11:12:39 +00:00
mathan 78a17bf238 Merge branch 'main' of github.com:uabluerail/indexer 2024-02-22 18:56:19 -08:00
mathan db425b1d5f Fix view in migration. Add by lang metric to consumer. 2024-02-22 18:54:29 -08:00
Max Ignatenko a20ddf0717 Fix the fucking regexp 2024-02-22 18:16:43 +00:00
Max Ignatenko a28199fb92 Handle the new #identity message 2024-02-22 12:17:21 +00:00
Max Ignatenko 8f32c494f7 Add whitelist for PDS hosts and update repo PDS pointer on appropriate occasions 2024-02-22 12:17:21 +00:00
mathan 600dac7694 Added metric for posts by lang (olivčykom). 2024-02-21 18:09:12 -08:00
Max Ignatenko b2003530ba Handle repos with unknown PDS 2024-02-21 09:35:25 +00:00
Max Ignatenko a0934c360e Fix division by zero 2024-02-19 18:34:31 +00:00
Max Ignatenko 758c5fe5e6 Add quick&dirty quarantine logic for bad records 2024-02-19 17:06:19 +00:00
Max Ignatenko 1d25842b78 Add a few more metrics 2024-02-18 17:23:54 +00:00
Max Ignatenko 1038ca3bea Add AtRev column to only overwrite records with a newer version 2024-02-17 14:29:45 +00:00
Max Ignatenko 1d3c6edf0a Fix panic on closing already closed channel 2024-02-17 12:26:57 +00:00
Max Ignatenko 04d521f58c Increase request timeout; the default 30 seconds is not enough to download some repos 2024-02-16 16:47:36 +00:00
Max Ignatenko 4626b8b9ca Limit the number of retries when failing to index a repo 2024-02-16 15:23:34 +00:00
Max Ignatenko 8313c74482 Remove deleted_at 2024-02-15 20:29:08 +00:00
Max Ignatenko 8561d90caf A bit of query optimization 2024-02-15 18:39:29 +00:00
Max Ignatenko 63a767d890 Import 2024-02-15 16:10:39 +00:00