gallery-dl-archive-manager
Scripts to manage a (currently Twitter-only) archive using gallery-dl. Much of the code came from a need to augment pre-existing, outdated archives that were originally created with the twittermediadownloader browser extension.
Config
This repo uses its own config.json in order to save media in the same format as twittermediadownloader. The scripts depend on the media being saved in this format.
Scripts
node run-initDb.js
Initializes a user database from existing folders. Useful if you have a pre-existing archive of users.
Args:
--path={/path/to/your/archive}
Example:
node run-initDb.js --path=/mnt/data/archive
will read all child directories in /mnt/data/archive (e.g. /mnt/data/archive/userA, /mnt/data/archive/userB, etc.) and create a db.json file in /mnt/data/archive listing the users.
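The initialization step described above can be sketched roughly as follows. This is a minimal illustration, not the repo's actual implementation: the function names and the `{ users: [{ name }] }` shape of db.json are assumptions made for the example.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// List the immediate child directories of the archive; each one is treated as a user.
function listUsers(archivePath) {
  return fs.readdirSync(archivePath, { withFileTypes: true })
    .filter((entry) => entry.isDirectory())
    .map((entry) => entry.name)
    .sort();
}

// Write db.json into the archive root.
// The { users: [{ name }] } schema is hypothetical, chosen only for this sketch.
function initDb(archivePath) {
  const db = { users: listUsers(archivePath).map((name) => ({ name })) };
  fs.writeFileSync(path.join(archivePath, 'db.json'), JSON.stringify(db, null, 2));
  return db;
}

// Usage against a throwaway directory instead of a real archive.
const demo = fs.mkdtempSync(path.join(os.tmpdir(), 'archive-'));
fs.mkdirSync(path.join(demo, 'userA'));
fs.mkdirSync(path.join(demo, 'userB'));
const db = initDb(demo);
console.log(db.users.map((u) => u.name));
```

Only directories are picked up, so a db.json left over from an earlier run is not mistaken for a user.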
node run-downloadDb.js
Runs a full download of all users listed in the db.json of the archive (the provided --path).
Args:
--path={/path/to/your/archive}
--threads={#}
--args={gallery-dl args}
Example:
node run-downloadDb.js --path=/mnt/data/archive --threads=3 --args="-r 2.5M --no-skip"
will run a full download (/media followed by /search, starting from the oldest file pulled from /media) of all the users in the /mnt/data/archive/db.json file, limiting concurrent download threads to 3. It will pass the additional args -r 2.5M --no-skip to the gallery-dl bin being executed; these limit the download rate to 2.5M and download all files without skipping (for the sake of example).
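One way to picture the thread limit is a small promise pool that keeps at most N downloads in flight and lets the remaining users wait in line. The sketch below is an assumption about how such a limit could work, not the repo's code; `runWithLimit` and `fakeDownload` are hypothetical names, and `fakeDownload` stands in for spawning gallery-dl.

```javascript
// Run `worker` over `items`, never more than `limit` at a time.
async function runWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each lane keeps pulling the next queued item until none are left.
  async function lane() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index]);
    }
  }

  const lanes = Array.from({ length: Math.min(limit, items.length) }, lane);
  await Promise.all(lanes);
  return results;
}

// Stand-in for a gallery-dl run; it only tracks how many "downloads" overlap.
let active = 0;
let peak = 0;
async function fakeDownload(user) {
  active += 1;
  peak = Math.max(peak, active);
  await new Promise((resolve) => setTimeout(resolve, 10));
  active -= 1;
  return `${user}: done`;
}

const finished = runWithLimit(['userA', 'userB', 'userC', 'userD', 'userE'], 3, fakeDownload);
finished.then((results) => console.log(results, 'peak concurrency:', peak));
```

Results come back in the same order as the input list, regardless of which download finishes first.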
Args
Standard args:
--path={/path/to/your/archive}
The path to the archive. This is a parent directory whose child directories each correspond to a user.
--threads={#}
Max number of concurrent download threads. At most this many gallery-dl downloads will run at a given time; the remaining users are queued.
--args={gallery-dl args}
Additional args to pass to gallery-dl. Note that these aren't currently checked against the args this repo already passes to gallery-dl, so duplicates are possible.
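As an illustration of how the three standard args might be read, here is a minimal sketch of a `--key=value` parser. The helper names, the default of one thread, and the whitespace-based splitting of `--args` are all assumptions for the example; the scripts themselves may parse their arguments differently.

```javascript
// Parse `--key=value` tokens into an object; tokens without that shape are ignored.
function parseCliArgs(argv) {
  const parsed = {};
  for (const token of argv) {
    const match = /^--([^=]+)=(.*)$/s.exec(token);
    if (match) parsed[match[1]] = match[2];
  }
  return parsed;
}

// Validate and normalize the standard args.
function readOptions(argv) {
  const raw = parseCliArgs(argv);
  if (!raw.path) throw new Error('--path is required');

  // Defaulting to a single thread is an assumption made for this sketch.
  const threads = raw.threads === undefined ? 1 : Number.parseInt(raw.threads, 10);
  if (!Number.isInteger(threads) || threads < 1) {
    throw new Error('--threads must be a positive integer');
  }

  // Naive whitespace split; quoted values containing spaces are not handled.
  const galleryDlArgs = raw.args ? raw.args.trim().split(/\s+/).filter(Boolean) : [];
  return { path: raw.path, threads, galleryDlArgs };
}

// The shell strips the quotes, so --args="-r 2.5M" arrives as one token.
const options = readOptions(['--path=/mnt/data/archive', '--threads=3', '--args=-r 2.5M']);
console.log(options);
```

In a real script the input would come from `process.argv.slice(2)`.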
TODO
run-addUsers.js
Should add a new user to the db and initiate a full download, similar to run-downloadDb.js.
Args:
--path={/path/to/your/archive}
--users={string array of user(s)}
--threads={#}
run-updateDb.js
Should pull from the user database and update the archive without doing a full download.
Args:
--path={/path/to/your/archive}
--mode={search|media}
search: The DB should save with a lastUpdated field. This should be used as a date for the /search API. Preferred if it's been a long time since an update has happened for a user and/or the user has uploaded a significant amount of media since lastUpdated.
media: This should run /media and stop after hitting skipped files, as skipped files indicate hitting lastUpdated. Preferred in normal circumstances.
--threads={#}
--args={gallery-dl args}
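Since run-updateDb.js is still a TODO, the following is only a sketch of how the two modes could translate into a gallery-dl invocation. Everything in it is an assumption: the `buildUpdatePlan` name, the user record shape, the URL formats, the use of gallery-dl's `--abort N` option (stop after N consecutive skipped files, to be verified against the gallery-dl documentation), and the threshold of 3.

```javascript
// Build the target URL and extra gallery-dl args for one user.
// user: { name, lastUpdated } is a hypothetical db.json record shape.
function buildUpdatePlan(user, mode, extraArgs = []) {
  if (mode === 'search') {
    if (!user.lastUpdated) {
      throw new Error(`No lastUpdated recorded for ${user.name}`);
    }
    // Use the stored date as the lower bound of the search query.
    const since = new Date(user.lastUpdated).toISOString().slice(0, 10);
    const query = encodeURIComponent(`from:${user.name} since:${since}`);
    return { url: `https://twitter.com/search?q=${query}`, args: [...extraArgs] };
  }

  if (mode === 'media') {
    // Assumed flag: stop once several already-downloaded files are skipped in a row.
    return {
      url: `https://twitter.com/${user.name}/media`,
      args: ['--abort', '3', ...extraArgs],
    };
  }

  throw new Error(`Unknown mode: ${mode}`);
}

const user = { name: 'userA', lastUpdated: '2023-05-01T12:00:00Z' };
const searchPlan = buildUpdatePlan(user, 'search');
const mediaPlan = buildUpdatePlan(user, 'media', ['-r', '2.5M']);
console.log(searchPlan.url);
console.log(mediaPlan.url, mediaPlan.args);
```

Keeping the plan as plain data makes it easy to log or test before anything is handed to a child process.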