Max Ignatenko
f315059994
Avoid querying too few repos from each single PDS
2024-05-06 21:14:18 +01:00
Max Ignatenko
c364822818
Tweak rate limit a little bit
2024-04-14 13:07:24 +01:00
Max Ignatenko
bc24c23afc
Switch to batch inserts
2024-04-14 12:45:05 +01:00
Max Ignatenko
c5f3a55ac8
Add a target for waiting until PLC mirror catches up
2024-04-13 19:42:36 +01:00
Max Ignatenko
d6b5850827
Add plc mirror
...
If you want to avoid sending all requests to https://plc.directory while
it's catching up for the first time, do
`docker compose up -d --build plc` and wait for it to catch up before
updating all other containers.
2024-04-13 16:45:02 +01:00
Max Ignatenko
c17730c11f
Drop the hack for detecting cursor reset from non-compliant servers
2024-04-10 23:32:21 +01:00
Max Ignatenko
4b40c5919b
Don't update LastFirehoseRev more than once a day
2024-04-07 15:06:13 +01:00
Max Ignatenko
52dd38f11b
Don't attribute context cancellation to firehose record content
2024-04-06 22:14:58 +01:00
Max Ignatenko
ecf2fc57d8
Disconnect from firehoses in parallel when shutting down
2024-04-06 21:59:21 +01:00
Max Ignatenko
1358bc3f08
Export a counter of firehose connection errors
2024-04-06 21:55:36 +01:00
Max Ignatenko
ff0ea08296
Implement commit signature validation
2024-04-06 21:50:51 +01:00
Max Ignatenko
7c09c37a51
Pass the correct context to Consumer.Start so that it will actually stop
...
when signalled
2024-03-29 10:16:22 +00:00
Max Ignatenko
1abe505ef9
Add support for discovering new PDSs from relays
2024-03-28 20:55:02 +00:00
Max Ignatenko
c919050833
Keep the set of running consumers up to date
2024-03-28 20:02:48 +00:00
Max Ignatenko
337f3ef2b8
Delete commented out code
2024-03-28 18:46:39 +00:00
Max Ignatenko
fc5307a971
Add a bit more logging for cursor values
2024-03-28 16:00:28 +00:00
Max Ignatenko
5d3d562ecc
Stop the ticker when goroutine exits
2024-03-28 16:00:28 +00:00
Max Ignatenko
ffa2faa420
Properly escape null character in the consumer too
2024-03-24 12:43:52 +00:00
Max Ignatenko
693ae1ba0a
Bump the limit on stored bad records to 500
2024-03-24 12:32:13 +00:00
Max Ignatenko
bae23a62d0
Don't try to upsert zero records
2024-03-20 13:31:14 +00:00
Max Ignatenko
e8c816a3a3
Update FirstCursorSinceReset even if we received zero new records
2024-03-17 21:22:34 +00:00
Max Ignatenko
328a676e2a
Fix potential infinite loop for inactive repos after a cursor reset
2024-03-17 19:36:12 +00:00
Max Ignatenko
4c41389e9b
Detect CAR files with zero blocks and handle them accordingly
2024-03-17 17:42:09 +00:00
Max Ignatenko
638bdcf515
Add a check for zero bytes fetched from PDS
...
And log additional info when we're failing to extract rev from repo
bytes
2024-03-17 17:15:23 +00:00
Max Ignatenko
553894dc6a
Switch to partial repo fetches
2024-03-17 15:34:26 +00:00
Max Ignatenko
0cc11e75b2
Avoid updating records when they didn't actually change
2024-03-17 14:16:25 +00:00
Max Ignatenko
9c51a4621f
Start recording last rev for each repo
2024-03-13 22:28:54 +00:00
Max Ignatenko
44d2b25951
Fix nil pointer deref
2024-03-13 16:38:58 +00:00
Max Ignatenko
e150f1da90
Populate FirstRevSinceReset if empty
2024-03-13 11:32:23 +00:00
Max Ignatenko
87d510e67a
Check repos against PDS cursor resets, instead of waiting for a first new event for them on firehose
2024-03-13 10:39:41 +00:00
Max Ignatenko
57aa4731e5
Properly update repo consistency metadata
...
Previously these two queries did nothing at all -_-
Note that this will trigger a full re-index of every repo seen on the
firehose.
2024-03-13 10:36:41 +00:00
Max Ignatenko
ddde20a014
Fix comparison with NULL when listing PDSs
2024-02-23 15:22:22 +00:00
Max Ignatenko
0f191b2609
Increment exported vars for every language in a post
2024-02-23 11:12:39 +00:00
mathan
78a17bf238
Merge branch 'main' of github.com:uabluerail/indexer
2024-02-22 18:56:19 -08:00
mathan
db425b1d5f
Fix view in migration. Add by lang metric to consumer.
2024-02-22 18:54:29 -08:00
Max Ignatenko
a20ddf0717
Fix the fucking regexp
2024-02-22 18:16:43 +00:00
Max Ignatenko
a28199fb92
Handle the new #identity message
2024-02-22 12:17:21 +00:00
Max Ignatenko
8f32c494f7
Add whitelist for PDS hosts and update repo PDS pointer on appropriate occasions
2024-02-22 12:17:21 +00:00
mathan
600dac7694
Added metric for posts by lang (olivčykom).
2024-02-21 18:09:12 -08:00
Max Ignatenko
b2003530ba
Handle repos with unknown PDS
2024-02-21 09:35:25 +00:00
Max Ignatenko
a0934c360e
Fix division by zero
2024-02-19 18:34:31 +00:00
Max Ignatenko
758c5fe5e6
Add quick&dirty quarantine logic for bad records
2024-02-19 17:06:19 +00:00
Max Ignatenko
1d25842b78
Add a few more metrics
2024-02-18 17:23:54 +00:00
Max Ignatenko
1038ca3bea
Add AtRev column to only overwrite records with a newer version
2024-02-17 14:29:45 +00:00
Max Ignatenko
1d3c6edf0a
Fix panic on closing already closed channel
2024-02-17 12:26:57 +00:00
Max Ignatenko
04d521f58c
Increase request timeout: the default 30 seconds is not enough to download some repos
2024-02-16 16:47:36 +00:00
Max Ignatenko
4626b8b9ca
Limit the number of retries when failing to index a repo
2024-02-16 15:23:34 +00:00
Max Ignatenko
8313c74482
Remove deleted_at
2024-02-15 20:29:08 +00:00
Max Ignatenko
8561d90caf
A bit of query optimization
2024-02-15 18:39:29 +00:00
Max Ignatenko
63a767d890
Import
2024-02-15 16:10:39 +00:00