gallery-dl-archive-manager/README.md
2024-02-19 12:32:30 -05:00

# gallery-dl-archive-manager
Scripts to manage a (currently Twitter-only) archive using gallery-dl. Much of the code grew out of a need to augment pre-existing, outdated archives that were originally created with the `twittermediadownloader` browser extension.
## Config
This repo uses its own `config.json` in order to save media in the same format as `twittermediadownloader`. The scripts depend on the media being saved in this format.
## Scripts
### `node run-initDb.js`
Initializes a user database from existing folders. Useful if you have a pre-existing archive of users.
Args:
- `--path={/path/to/your/archive}`
Example:
- `node run-initDb.js --path=/mnt/data/archive` will read all child directories in `/mnt/data/archive` (e.g. `/mnt/data/archive/userA`, `/mnt/data/archive/userB`, etc.) and create a `db.json` file in `/mnt/data/archive` listing the users.
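As a rough illustration of the behaviour described above, the sketch below lists the child directories of an archive and writes a `db.json` next to them. It is not the repo's actual implementation: the function name `initDb` and the per-user fields (`lastUpdated`, `lastError`, both starting as `null`) are assumptions, and the real schema lives in `./lib/schema.js`.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper (not the repo's code): builds a db from existing user folders.
function initDb(archivePath) {
  // Every child directory of the archive is treated as one user.
  const users = fs
    .readdirSync(archivePath, { withFileTypes: true })
    .filter((entry) => entry.isDirectory())
    .map((entry) => entry.name)
    .sort();

  // Assumed v1-style layout; the per-user fields here are placeholders.
  const db = { version: 1, userList: {} };
  for (const user of users) {
    db.userList[user] = { lastUpdated: null, lastError: null };
  }

  fs.writeFileSync(path.join(archivePath, 'db.json'), JSON.stringify(db, null, 2));
  return db;
}
```

Plain files in the archive root (including `db.json` itself on a second run) are ignored because only directories are kept.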
### `node run-downloadDb.js`
Runs a full download of all users listed in the `db.json` of the archive (the provided `--path`).
Args:
- `--path={/path/to/your/archive}`
- `--threads={#}`
- `--args={gallery-dl args}`
Example:
- `node run-downloadDb.js --path=/mnt/data/archive --threads=3 --args="-r 2.5M --no-skip"` will run a full download (`/media`, followed by `/search` starting from the oldest file pulled from `/media`) of all the users in the `/mnt/data/archive/db.json` file, limiting concurrent download threads to 3. It will pass the additional args `-r 2.5M --no-skip` to the gallery-dl binary being executed; for the sake of example, these limit the download rate to 2.5M and download all files without skipping.
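The two-phase order described above (`/media` first, then `/search` for anything older) could be sketched as a function that returns the URLs to hand to gallery-dl for one user. Everything here is an assumption for illustration: the helper name `buildDownloadTargets`, the `from:` / `until:` search query, and passing the oldest file date in as a `Date` (how the repo derives that date from the downloaded files is not shown in this README).

```javascript
// Hypothetical helper: returns the URLs for a full download of one user, in order.
function buildDownloadTargets(user, oldestMediaDate) {
  // Phase 1: the user's media timeline.
  const mediaUrl = `https://twitter.com/${user}/media`;

  // Phase 2: a search limited to posts before the oldest file pulled in phase 1.
  // The query format is an assumption, not taken from the repo.
  const until = oldestMediaDate.toISOString().slice(0, 10); // YYYY-MM-DD in UTC
  const query = encodeURIComponent(`from:${user} until:${until}`);
  const searchUrl = `https://twitter.com/search?q=${query}`;

  return [mediaUrl, searchUrl];
}
```

Keeping URL construction separate from process spawning makes the ordering easy to check without touching the network.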
### `run-downloadUsers.js`
Should add the given users to the db and initiate a full download for each, similar to `run-downloadDb.js`.
Args:
- `--path={/path/to/your/archive}`
- `--users={comma,separated,userlist}`
- `--threads={#}`
### `run-convertDb.js`
Converts `db.json` to the latest version. See `./lib/schema.js` for full db.json schema.
Args:
- `--path={/path/to/your/archive}`
Historical Versions:
- v0: simple array of users with `user`, `lastUpdated`, `lastError` fields
- v1 (**CURRENT**): object with `version` and `userList` fields. `userList` contains key-value entries where the key is the username and the value is an object holding information about that user.
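To make the difference between the two versions concrete, here is a minimal sketch of a v0 to v1 conversion. It assumes the v1 value object simply carries over the remaining v0 fields (`lastUpdated`, `lastError`); the function name `convertV0ToV1` is hypothetical, and `./lib/schema.js` remains the authority on the real shape.

```javascript
// Hypothetical converter: v0 (array of users) -> v1 (object keyed by username).
function convertV0ToV1(db) {
  // Anything that is not an array is assumed to be v1 already and returned as-is.
  if (!Array.isArray(db)) return db;

  const userList = {};
  for (const { user, ...info } of db) {
    // The username becomes the key; the other v0 fields become the value.
    userList[user] = info;
  }
  return { version: 1, userList };
}
```

For example, `[{ user: 'userA', lastUpdated: null, lastError: null }]` becomes `{ version: 1, userList: { userA: { lastUpdated: null, lastError: null } } }`.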
## Args
Standard args:
### `--path={/path/to/your/archive}`
The path to the archive. This is a parent directory with a list of child directories which correspond to users.
### `--threads={#}`
Max number of concurrent download threads. At most this many gallery-dl downloads will run at any given time; the remaining users are queued until a slot frees up.
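One common way to get this "run N at a time, queue the rest" behaviour in Node is a small pool of async runners that pull from a shared index. The sketch below shows that pattern; it is not necessarily how this repo does it, and `runWithLimit` and its `worker` callback are hypothetical names. In practice the worker would start a gallery-dl process for one user.

```javascript
// Hypothetical helper: runs worker(item) for every item, at most `limit` at once.
async function runWithLimit(items, limit, worker) {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError('limit must be a positive integer');
  }

  const results = new Array(items.length);
  let next = 0; // index of the next queued item

  // Each runner takes the next queued item as soon as its current one finishes.
  async function runner() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index]);
    }
  }

  const runners = Array.from({ length: Math.min(limit, items.length) }, () => runner());
  await Promise.all(runners);
  return results; // same order as `items`
}
```

With `--threads=3`, this would be called as `runWithLimit(users, 3, downloadUser)`, where `downloadUser` is whatever async function performs one user's download.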
### `--args={gallery-dl args}`
Additional args to pass to gallery-dl. See [gallery-dl CLI options](https://github.com/mikf/gallery-dl/blob/master/docs/options.md#selection-options) for reference. Note that these args aren't currently checked against the ones this repo already passes to gallery-dl, so duplicates or conflicts are possible.
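For reference, the `--key=value` convention used by the standard args can be parsed in a few lines. This is only a sketch under assumptions: the name `parseCliArgs`, the default of 1 thread, and the naive whitespace split of `--args` (which does not handle quoted values containing spaces) are not taken from the repo.

```javascript
// Hypothetical parser for `--key=value` style arguments.
function parseCliArgs(argv) {
  const parsed = {};
  for (const arg of argv) {
    // Split on the first "=" only, so values may contain "=" themselves.
    const match = /^--([^=]+)=(.*)$/s.exec(arg);
    if (match) parsed[match[1]] = match[2];
  }

  return {
    path: parsed.path,
    // Assumed default: a single download thread when --threads is omitted.
    threads: parsed.threads ? Number.parseInt(parsed.threads, 10) : 1,
    // Naive split on whitespace; good enough for flags like "-r 2.5M".
    galleryDlArgs: parsed.args ? parsed.args.trim().split(/\s+/) : [],
  };
}
```

A script would typically call it as `parseCliArgs(process.argv.slice(2))`. The shell strips the quotes around `--args="-r 2.5M"` before Node sees it, so the value arrives as the single element `--args=-r 2.5M`.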
## TODO
### `run-updateDb.js`
Should pull from the user database and update the archive without doing a full download. The DB should save a `lastUpdated` field for each user, which should then be used as the start date for the `/search` API. This is the preferred approach if it's been a long time since a user was last updated and/or the user has uploaded a significant amount of media since `lastUpdated`.
Note: if you've updated the DB recently, it may be faster to run `node run-downloadDb.js` with `--args="-A {#}"` so that only the `/media` check runs instead of the `/search` check. `-A {#}` makes gallery-dl abort the current user after `{#}` consecutive files were skipped.
Args:
- `--path={/path/to/your/archive}`
- `--threads={#}`
- `--args={gallery-dl args}`
### `run-renameUser.js`
Should rename an existing user in the db, optionally renaming their existing archive and its contents if `--full=true`.
Args:
- `--from={'username'}`
- `--to={'username'}`
- `--full={true|false}`
- `--path={/path/to/your/archive}`
- `--args={gallery-dl args}`
### Rename Detection
Should detect renames when running `/search`. Occasionally `/media` will fail because the user was renamed, but `/search` will still return results, causing a full download from `/search` and adding the user to the db without notice. Instead, this should stop the download and print an error stating that the user has been renamed from `oldUsername` to `newUsername`. Because the command will have already finished `/media` and thus be halfway through the process, the rename itself should be done manually after the command has finished.