gallery-dl-archive-manager/README.md
2024-02-09 23:17:32 -05:00

gallery-dl-archive-manager

Scripts to manage a (currently Twitter-only) archive using gallery-dl. Much of the code grew out of a need to augment pre-existing, outdated archives that were originally created with the twittermediadownloader browser extension.

Config

This repo uses its own config.json in order to save media in the same format as twittermediadownloader. The scripts depend on the media being saved in this format.

Scripts

node run-initDb.js

Initializes a user database from existing folders. Useful if you have a pre-existing archive of users.

Args:

  • --path={/path/to/your/archive}

Example:

  • node run-initDb.js --path=/mnt/data/archive will read all child directories in /mnt/data/archive (e.g. /mnt/data/archive/userA, /mnt/data/archive/userB, etc.) and create a db.json file in /mnt/data/archive listing the users.
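As a rough illustration of the step described above, the sketch below scans the archive's child directories and writes a db.json next to them. The function name initDb and the `{ users: [{ name }] }` shape are assumptions made for this example; the real schema written by run-initDb.js is not documented here.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical sketch: the real db.json schema is not documented in this README.
function initDb(archivePath) {
  // Treat every immediate child directory as one user; plain files are ignored.
  const users = fs
    .readdirSync(archivePath, { withFileTypes: true })
    .filter((entry) => entry.isDirectory())
    .map((entry) => entry.name)
    .sort();

  // Assumed shape: { users: [{ name: "userA" }, { name: "userB" }] }
  const db = { users: users.map((name) => ({ name })) };
  fs.writeFileSync(
    path.join(archivePath, "db.json"),
    JSON.stringify(db, null, 2)
  );
  return db;
}
```

Calling initDb("/mnt/data/archive") on the layout from the example would list userA and userB and leave their folders untouched.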

node run-downloadDb.js

Runs a full download of all users listed in the db.json of the archive (the provided --path).

Args:

  • --path={/path/to/your/archive}
  • --threads={#}
  • --args={gallery-dl args}

Example:

  • node run-downloadDb.js --path=/mnt/data/archive --threads=3 --args="-r 2.5M --no-skip" will run a full download (/media, followed by /search starting from the oldest file pulled from /media) of all the users in the /mnt/data/archive/db.json file, with at most 3 concurrent download threads. It passes the additional args -r 2.5M --no-skip to the gallery-dl binary being executed; purely for the sake of example, these limit the download rate to 2.5M and download all files without skipping.
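The two-pass download for a single user could be assembled roughly as follows. This is only a sketch: buildFullDownloadCommands is a hypothetical helper, and the use of Twitter's from: and until: search operators for the /search pass is an assumption about how "starting from the oldest pulled file" might be expressed.

```javascript
// Hypothetical helper: how the real scripts assemble their commands is not documented.
function buildFullDownloadCommands(user, extraArgs, oldestMediaDate) {
  // Naive whitespace split of the --args string; quoted values with spaces are not handled.
  const trimmed = extraArgs.trim();
  const extra = trimmed === "" ? [] : trimmed.split(/\s+/);

  // Pass 1: the user's media timeline.
  const mediaUrl = `https://twitter.com/${user}/media`;

  // Pass 2 (assumption): search for posts older than the oldest file pass 1 produced.
  const query = `from:${user} until:${oldestMediaDate}`;
  const searchUrl = `https://twitter.com/search?q=${encodeURIComponent(query)}`;

  return [
    ["gallery-dl", ...extra, mediaUrl],
    ["gallery-dl", ...extra, searchUrl],
  ];
}
```

In practice the second command can only be built after the first has finished, since oldestMediaDate comes from the files the /media pass saved.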

Args

Standard args:

--path={/path/to/your/archive}

The path to the archive. This is a parent directory with a list of child directories which correspond to users.

--threads={#}

Maximum number of concurrent download threads. At most this many gallery-dl downloads run at any given time; the remaining users are queued.
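One common way to get this queueing behaviour in Node is a small pool of async "lanes" that pull from a shared list. The sketch below is illustrative only; runWithLimit and worker are hypothetical names, and the real scripts may limit concurrency differently.

```javascript
// Runs `worker` over `items` with at most `limit` calls in flight at once.
// Items beyond the limit wait until a lane becomes free.
async function runWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next queued item

  async function lane() {
    // Each lane keeps taking the next queued item until none are left.
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index]);
    }
  }

  // Start up to `limit` lanes; never more lanes than items.
  const lanes = Array.from({ length: Math.min(limit, items.length) }, () => lane());
  await Promise.all(lanes);
  return results;
}
```

Here worker would be an async function that starts one gallery-dl process for a user and resolves when that process exits, so --threads=3 maps to runWithLimit(users, 3, worker).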

--args={gallery-dl args}

Additional args to pass to gallery-dl. Note that these are not currently checked against the args this repo already passes to gallery-dl, so duplicates are possible.
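All of the scripts take their options in --key=value form. A minimal parser for that form might look like the sketch below; parseArgs is a hypothetical name and the repo may well use a library instead.

```javascript
// Parses ["--path=/x", "--threads=3"] into { path: "/x", threads: "3" }.
// Values stay strings; anything not in --key=value form is ignored.
function parseArgs(argv) {
  const options = {};
  for (const arg of argv) {
    // Split on the first "=" only, so values may themselves contain "=".
    const match = /^--([^=]+)=(.*)$/s.exec(arg);
    if (match) {
      options[match[1]] = match[2];
    }
  }
  return options;
}
```

A script would call parseArgs(process.argv.slice(2)) and convert threads with Number.parseInt(options.threads, 10). Because the shell removes the quotes, --args="-r 2.5M" arrives as the single string "-r 2.5M".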

TODO

run-addUsers.js

Should add one or more new users to the db and initiate a full download, similar to run-downloadDb.js.

Args:

  • --path={/path/to/your/archive}
  • --users={string array of user(s)}
  • --threads={#}

run-updateDb.js

Should pull from the user database and update the archive without doing a full download.

Args:

  • --path={/path/to/your/archive}
  • --mode={search|media}
    • search: The DB should store a lastUpdated field, which should be used as the start date for the /search API. Preferred if a user has not been updated in a long time and/or has uploaded a significant amount of media since lastUpdated.
    • media: This should run /media and stop after hitting skipped files, since skipped files indicate that lastUpdated has been reached. Preferred in normal circumstances.
  • --threads={#}
  • --args={gallery-dl args}
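Since run-updateDb.js is still a TODO, the following is only a sketch of how the two modes might translate into gallery-dl commands. buildUpdateCommand, the user object shape, the since: search operator, and the use of gallery-dl's -A/--abort N option (stop after N consecutive skipped files) with a threshold of 3 are all assumptions for illustration.

```javascript
// Hypothetical helper for the planned run-updateDb.js; nothing here exists in the repo yet.
function buildUpdateCommand(user, mode, extraArgs = []) {
  if (mode === "search") {
    // Assumption: lastUpdated is an ISO 8601 timestamp; since: takes a YYYY-MM-DD date (UTC here).
    const since = new Date(user.lastUpdated).toISOString().slice(0, 10);
    const query = `from:${user.name} since:${since}`;
    const url = `https://twitter.com/search?q=${encodeURIComponent(query)}`;
    return ["gallery-dl", ...extraArgs, url];
  }
  if (mode === "media") {
    // Assumption: abort once 3 consecutive files are skipped as already downloaded.
    const url = `https://twitter.com/${user.name}/media`;
    return ["gallery-dl", "--abort", "3", ...extraArgs, url];
  }
  throw new Error(`Unknown mode: ${mode}`);
}
```

Note that passing --no-skip through --args would defeat the media mode in this sketch, because no files would ever be skipped.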