Replace "> timesearch.py" with "python timesearch.py"

The bracket was meant to look like the command prompt, but Linux
users are more used to $ or #, and people who do not use the
command line often may be confused by it altogether. This change
should make it a little clearer what's going on.
master
Ethan Dalool 2020-04-22 23:25:42 -07:00
parent 30f0fed6c3
commit b98f096fd4
2 changed files with 34 additions and 34 deletions

@ -26,38 +26,38 @@ Timesearch is a collection of utilities for archiving subreddits.
### This package consists of:
- **get_submissions**: If you try to page through `/new` on a subreddit, you'll hit a limit at or before 1,000 posts. Timesearch uses the pushshift.io dataset to get information about very old posts, and then queries the reddit api to update their information. Previously, we used the `timestamp` cloudsearch query parameter on reddit's own API, but reddit has removed that feature and pushshift is now the only viable source for initial data.
`> timesearch.py get_submissions -r subredditname <flags>`
`> timesearch.py get_submissions -u username <flags>`
`python timesearch.py get_submissions -r subredditname <flags>`
`python timesearch.py get_submissions -u username <flags>`
- **get_comments**: Similar to `get_submissions`, this tool queries pushshift for comment data and updates it from reddit.
`> timesearch.py get_comments -r subredditname <flags>`
`> timesearch.py get_comments -u username <flags>`
`python timesearch.py get_comments -r subredditname <flags>`
`python timesearch.py get_comments -u username <flags>`
- **livestream**: get_submissions+get_comments is great for starting your database and getting the historical posts, but it's not the best for staying up-to-date. Instead, livestream monitors `/new` and `/comments` to continuously ingest data.
`> timesearch.py livestream -r subredditname <flags>`
`> timesearch.py livestream -u username <flags>`
`python timesearch.py livestream -r subredditname <flags>`
`python timesearch.py livestream -u username <flags>`
- **get_styles**: Downloads the stylesheet and CSS images.
`> timesearch.py get_styles -r subredditname`
`python timesearch.py get_styles -r subredditname`
- **get_wiki**: Downloads the wiki pages, sidebar, etc. from /wiki/pages.
`> timesearch.py get_wiki -r subredditname`
`python timesearch.py get_wiki -r subredditname`
- **offline_reading**: Renders comment threads into HTML via markdown.
Note: I'm currently using the [markdown library from pypi](https://pypi.python.org/pypi/Markdown), and it doesn't do reddit's custom markdown like `/r/` or `/u/`, obviously. So far I don't think anybody really uses o_r so I haven't invested much time into improving it.
`> timesearch.py offline_reading -r subredditname <flags>`
`> timesearch.py offline_reading -u username <flags>`
`python timesearch.py offline_reading -r subredditname <flags>`
`python timesearch.py offline_reading -u username <flags>`
- **index**: Generates plaintext or HTML lists of submissions, sorted by a property of your choosing. You can order by date, author, flair, etc. With the `--offline` parameter, you can make all the links point to the files you generated with `offline_reading`.
`> timesearch.py index -r subredditname <flags>`
`> timesearch.py index -u username <flags>`
`python timesearch.py index -r subredditname <flags>`
`python timesearch.py index -u username <flags>`
- **breakdown**: Produces a JSON file indicating which users make the most posts in a subreddit, or which subreddits a user posts in.
`> timesearch.py breakdown -r subredditname` <flags>
`> timesearch.py breakdown -u username` <flags>
`python timesearch.py breakdown -r subredditname` <flags>
`python timesearch.py breakdown -u username` <flags>
- **merge_db**: Copy all new data from one timesearch database into another. Useful for syncing or merging two scans of the same subreddit.
`> timesearch.py merge_db --from filepath/database1.db --to filepath/database2.db`
`python timesearch.py merge_db --from filepath/database1.db --to filepath/database2.db`
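
The **breakdown** entry in the list above describes a JSON file showing which users make the most posts in a subreddit. As a rough illustration of that idea only, and not timesearch's actual code, here is a minimal sketch. The `breakdown_by_author` function and the shape of the `submissions` records are hypothetical; the real tool reads from its own database.

```python
import collections
import json


def breakdown_by_author(submissions):
    '''
    Count posts per author, most prolific first.

    `submissions` is a hypothetical iterable of dicts that each have an
    "author" key; this is an assumption made for the example.
    '''
    counts = collections.Counter(s['author'] for s in submissions)
    # most_common() orders by descending count; equal counts keep the
    # order in which the authors were first seen.
    return dict(counts.most_common())


if __name__ == '__main__':
    sample = [
        {'author': 'alice'},
        {'author': 'bob'},
        {'author': 'alice'},
    ]
    print(json.dumps(breakdown_by_author(sample), indent=4))
```

Returning a plain dict keeps the result directly serializable with `json.dumps`, which matches the JSON output the entry describes.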
### To use it

@ -16,13 +16,13 @@ The subreddit archiver
The basics:
1. Collect a subreddit's submissions
> timesearch.py get_submissions -r subredditname
python timesearch.py get_submissions -r subredditname
2. Collect the comments for those submissions
> timesearch.py get_comments -r subredditname
python timesearch.py get_comments -r subredditname
3. Stay up-to-date
> timesearch.py livestream -r subredditname
python timesearch.py livestream -r subredditname
Commands for collecting:
@ -47,7 +47,7 @@ Commands for processing:
{offline_reading}
TO SEE DETAILS ON EACH COMMAND, RUN
> timesearch.py <command>
python timesearch.py <command>
'''.lstrip()
MODULE_DOCSTRINGS = dict(
@ -59,8 +59,8 @@ breakdown:
Automatically dumps into a <database>_breakdown.json file
in the same directory as the database.
> timesearch.py breakdown -r subredditname <flags>
> timesearch.py breakdown -u username <flags>
python timesearch.py breakdown -r subredditname <flags>
python timesearch.py breakdown -u username <flags>
flags:
-r "test" | --subreddit "test":
@ -77,8 +77,8 @@ get_comments='''
get_comments:
Collect comments on a subreddit or comments made by a user.
> timesearch.py get_comments -r subredditname <flags>
> timesearch.py get_comments -u username <flags>
python timesearch.py get_comments -r subredditname <flags>
python timesearch.py get_comments -u username <flags>
flags:
-s "t3_xxxxxx" | --specific "t3_xxxxxx":
@ -110,7 +110,7 @@ get_styles='''
get_styles:
Collect the stylesheet, and css images.
> timesearch.py get_styles -r subredditname
python timesearch.py get_styles -r subredditname
'''.strip(),
get_submissions='''
@ -118,8 +118,8 @@ get_submissions:
Collect submissions from the subreddit across all of history, or
Collect submissions by a user (as many as possible).
> timesearch.py get_submissions -r subredditname <flags>
> timesearch.py get_submissions -u username <flags>
python timesearch.py get_submissions -r subredditname <flags>
python timesearch.py get_submissions -u username <flags>
-r "test" | --subreddit "test":
The subreddit to scan. Mutually exclusive with username.
@ -149,15 +149,15 @@ get_wiki='''
get_wiki:
Collect all available wiki pages.
> timesearch.py get_wiki -r subredditname
python timesearch.py get_wiki -r subredditname
'''.strip(),
index='''
index:
Dump submission listings to a plaintext or HTML file.
> timesearch.py index -r subredditname <flags>
> timesearch.py index -u username <flags>
python timesearch.py index -r subredditname <flags>
python timesearch.py index -u username <flags>
flags:
-r "test" | --subreddit "test":
@ -220,8 +220,8 @@ livestream='''
livestream:
Continuously collect submissions and/or comments.
> timesearch.py livestream -r subredditname <flags>
> timesearch.py livestream -u username <flags>
python timesearch.py livestream -r subredditname <flags>
python timesearch.py livestream -u username <flags>
flags:
-r "test" | --subreddit "test":
@ -253,7 +253,7 @@ merge_db='''
merge_db:
Copy all new posts from one timesearch database into another.
> timesearch merge_db --from redditdev1.db --to redditdev2.db
python timesearch.py merge_db --from redditdev1.db --to redditdev2.db
flags:
--from:
@ -269,8 +269,8 @@ offline_reading='''
offline_reading:
Render submissions and comment threads to HTML via Markdown.
> timesearch.py offline_reading -r subredditname <flags>
> timesearch.py offline_reading -u username <flags>
python timesearch.py offline_reading -r subredditname <flags>
python timesearch.py offline_reading -u username <flags>
flags:
-s "t3_xxxxxx" | --specific "t3_xxxxxx":