# gallery-dl-archive-manager
Scripts to manage a (currently Twitter-only) archive using gallery-dl. Much of the code grew out of a need to augment pre-existing, outdated archives that were originally created with the `twittermediadownloader` browser extension.
## Config
This repo uses its own `config.json` in order to save media in the same format as `twittermediadownloader`. The scripts depend on the media being saved in this format.
## Scripts
### `run-downloadDb.js`
Runs a full download of all users listed in the `db.json` of the archive (the provided `--path`). If `db.json` is not present, one will be created. If any user ends on skipped media during the `/media` check, the `/search` check will be skipped.

Args:
- `--site={"twitter"|"bluesky"}`
- `--path={/path/to/your/archive}`
- `--threads={#}`
- `--args={gallery-dl args}`
- `--usersPerBatch={#}`
- `--waitTime={#}`
- `--skipMediaAfter={#}`
- `--skipSearchAfter={#}`

Example:
- `node run-downloadDb.js --path=/mnt/data/archive --threads=3 --args="-r 2.5M --no-skip"` runs a full download (`/media`, followed by `/search` starting from the oldest file pulled from `/media`) of all users in `/mnt/data/archive/db.json`, limiting concurrent download threads to 3. The additional args `-r 2.5M --no-skip` are passed to the gallery-dl binary being executed; for the sake of example, they limit the download rate to 2.5M and download all files without skipping.

Passing `--usersPerBatch={#}` and `--waitTime={#}` together activates a batching mechanism: the userList in `db.json` is split into chunks of `usersPerBatch` users, and the script waits `waitTime` seconds between batches in order to throttle downloads. Without this, downloading 100+ users in a short amount of time could introduce problems, whereas batches of ~30 users with ~5 minutes between them tend to avoid problems.
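
The batching described above can be sketched roughly as follows. This is only an illustration under assumptions, not the repo's actual implementation: `chunkUsers`, `sleep`, `runInBatches`, and the `downloadBatch` callback are hypothetical names introduced here.

```javascript
// Hypothetical helper: split a list of usernames into batches of `size` users.
function chunkUsers(users, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('usersPerBatch must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < users.length; i += size) {
    batches.push(users.slice(i, i + size));
  }
  return batches;
}

// Resolve after `seconds` seconds.
const sleep = (seconds) =>
  new Promise((resolve) => setTimeout(resolve, seconds * 1000));

// Hypothetical driver: `downloadBatch` stands in for whatever actually
// starts the gallery-dl downloads for one batch of users.
async function runInBatches(users, usersPerBatch, waitTime, downloadBatch) {
  const batches = chunkUsers(users, usersPerBatch);
  for (let i = 0; i < batches.length; i++) {
    await downloadBatch(batches[i]);
    // Wait between batches, but not after the last one.
    if (i < batches.length - 1) {
      await sleep(waitTime);
    }
  }
}
```

With these assumed helpers, `--usersPerBatch=30 --waitTime=300` would correspond to something like `runInBatches(users, 30, 300, downloadBatch)`.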
### `run-downloadUsers.js`
Adds new user(s) to the db and initiates a full download similar to `run-downloadDb.js`. If `db.json` is not present, one will be created. If any user ends on skipped media during the `/media` check, the `/search` check will be skipped.

Args:
- `--users={comma,separated,userlist}`
- `--site={"twitter"|"bluesky"}`
- `--path={/path/to/your/archive}`
- `--threads={#}`
- `--args={gallery-dl args}`
- `--skipMediaAfter={#}`
- `--skipSearchAfter={#}`
### `run-convertDb.js`
Converts `db.json` to the latest version. See `./lib/schema.js` for the full `db.json` schema.

Args:
- `--path={/path/to/your/archive}`

Historical Versions:
- v0: simple array of users with `user`, `lastUpdated`, `lastError` fields
- v1 (**CURRENT**): object with `version` and `userList` fields, where `userList` contains key-value entries: the key is the username and the value is an informational object about that username.
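
As a rough sketch of what a v0 to v1 conversion could look like: the function name `convertV0ToV1` is hypothetical, and both the contents of the per-user object (here just `lastUpdated` and `lastError` carried over from v0) and the `version` value of `1` are assumptions. The authoritative shape is whatever `./lib/schema.js` defines.

```javascript
// Hypothetical conversion from the v0 array to the v1 object layout.
// Assumption: the v1 per-user object keeps the v0 `lastUpdated` and
// `lastError` fields; the real schema may hold more or different fields.
function convertV0ToV1(v0Users) {
  const userList = {};
  for (const { user, lastUpdated, lastError } of v0Users) {
    // The username becomes the key; the remaining fields become the value.
    userList[user] = { lastUpdated, lastError };
  }
  // Assumption: v1 is marked with the number 1 in the `version` field.
  return { version: 1, userList };
}
```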
## Args
Standard args:
### `--path={/path/to/your/archive}`
The path to the archive. This is a parent directory with a list of child directories which correspond to users.
### `--threads={#}`
Maximum number of concurrent download threads. At most this many gallery-dl downloads run at any given time; the remaining users are queued.
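
One common way to get this kind of limit is a small worker pool over a shared queue. The sketch below is an assumption about how such a limit could work, not a description of this repo's code; `runWithLimit` and the `task` callback (which would start gallery-dl for one user) are hypothetical names.

```javascript
// Hypothetical concurrency limiter: at most `threads` tasks run at once,
// and the remaining users wait in the queue until a worker is free.
async function runWithLimit(users, threads, task) {
  const queue = [...users];
  const results = [];

  // Each worker keeps pulling the next queued user until none are left.
  async function worker() {
    while (queue.length > 0) {
      const user = queue.shift();
      results.push(await task(user));
    }
  }

  const workerCount = Math.min(threads, users.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results; // in completion order, not input order
}
```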
### `--args={gallery-dl args}`
Additional args to pass to gallery-dl. See [gallery-dl CLI options](https://github.com/mikf/gallery-dl/blob/master/docs/options.md#selection-options) for reference. Note that these are not currently checked against the args this repo already passes, so duplicates are possible.
### `--skipMediaAfter={#}`
Appends `-A #` to the gallery-dl args during the `/media` round, which stops the download early after # skipped media files.
### `--skipSearchAfter={#}`
Appends `-A #` to the gallery-dl args during the `/search` round, which stops the download early after # skipped media files.
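
To illustrate how the two options could feed into the per-round gallery-dl args, here is a minimal sketch. The helper name `withAbortAfter` and the idea of holding the args as an array are assumptions made for this example.

```javascript
// Hypothetical helper: return a copy of the gallery-dl args with `-A <n>`
// appended when a skip limit was given, or an unchanged copy otherwise.
function withAbortAfter(baseArgs, skipAfter) {
  if (skipAfter === undefined || skipAfter === null) {
    return [...baseArgs];
  }
  return [...baseArgs, '-A', String(skipAfter)];
}

// Example: args for each round, given --skipMediaAfter=20 and no --skipSearchAfter.
const userArgs = ['-r', '2.5M'];
const mediaArgs = withAbortAfter(userArgs, 20);
const searchArgs = withAbortAfter(userArgs, undefined);
```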
## TODO
### `run-renameUser.js`
Should rename an existing user in the db, optionally renaming their existing archive and its contents if `--full=true`.

Args:
- `--from={'username'}`
- `--to={'username'}`
- `--full={true|false}`
- `--path={/path/to/your/archive}`
- `--args={gallery-dl args}`