From 955e1e3e0a0ab40bef24f91037707a43ec44bdd5 Mon Sep 17 00:00:00 2001
From: Ethan Dalool
Date: Fri, 6 Nov 2020 23:08:08 -0800
Subject: [PATCH] Add note about update_items overwriting properties.

---
 README.md | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 70734e3..4f89228 100644
--- a/README.md
+++ b/README.md
@@ -7,7 +7,15 @@ Please `pip install requests` and `pip install voussoirkit`.
 
 According to the [HN API docs](https://github.com/HackerNews/API) there is no enforced ratelimit, so just use a `threads` count that seems polite.
 
-To get started, just run `python hnarchive.py update` and it will start from 1. In the future, you can run `update` on a cronjob or use `livestream` to get new items forever. Note, `update` always starts from the highest ID in the database. If you use `get` to get a range of IDs that is ahead of your update schedule, your next `update` will miss the skipped IDs.
+To get started, just run `python hnarchive.py update` and it will start from 1. In the future, you can run `update` on a cronjob or use `livestream` to get new items forever.
+
+Notes:
+
+- `update` always starts from the highest ID in the database. If you use `get` to get a range of IDs that is ahead of your update schedule, your next `update` will miss the skipped IDs.
+
+- `update_items` will overwrite previously fetched data with the new properties. Please know that HN moderators occasionally migrate comments between threads, adjust thread titles, etc. HN has a tight window in which authors can edit their own posts so you can expect actual item texts to remain pretty static outside of moderator action.
+
+  The exception is if an item is deleted and comes back as `None` from the server, then hnarchive keeps the old data.
 
 Here are all of the subcommands:
 