Diffstat (limited to 'docs/configuration.rst')
-rw-r--r-- docs/configuration.rst | 172
1 files changed, 148 insertions, 24 deletions
diff --git a/docs/configuration.rst b/docs/configuration.rst
index c606c6c..32a529a 100644
--- a/docs/configuration.rst
+++ b/docs/configuration.rst
@@ -217,7 +217,7 @@ extractor.*.user-agent
----------------------
=========== =====
Type ``string``
-Default ``"Mozilla/5.0 (X11; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"``
+Default ``"Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0"``
Description User-Agent header value to be used for HTTP requests.
Note: This option has no effect on `pixiv` and
@@ -300,8 +300,9 @@ extractor.*.retries
-------------------
=========== =====
Type ``integer``
-Default ``5``
-Description Number of times a failed HTTP request is retried before giving up.
+Default ``4``
+Description Maximum number of times a failed HTTP request is retried before
+ giving up or ``-1`` for infinite retries.
=========== =====
@@ -333,6 +334,22 @@ Description Controls whether to verify SSL/TLS certificates for HTTPS requests.
=========== =====
+extractor.*.download
+--------------------
+=========== =====
+Type ``bool``
+Default ``true``
+Description Controls whether to download media files.
+
+ Setting this to ``false`` won't download any files, but all other
+ functions (postprocessors_, `download archive`_, etc.)
+ will be executed as normal.
+=========== =====
+
+.. _postprocessors: `extractor.*.postprocessors`_
+.. _download archive: `extractor.*.archive`_
+
+
extractor.*.image-range
-----------------------
=========== =====
@@ -381,6 +398,40 @@ Description Like `image-filter`__, but applies to delegated URLs
__ `extractor.*.image-filter`_
+extractor.*.image-unique
+------------------------
+=========== =====
+Type ``bool``
+Default ``false``
+Description Ignore image URLs that have been encountered before during the
+ current extractor run.
+=========== =====
+
+
+extractor.*.chapter-unique
+--------------------------
+=========== =====
+Type ``bool``
+Default ``false``
+Description Like `image-unique`__, but applies to delegated URLs
+ like manga-chapters, etc.
+=========== =====
+
+__ `extractor.*.image-unique`_
+
+
+extractor.*.date-format
+----------------------------
+=========== =====
+Type ``string``
+Default ``"%Y-%m-%dT%H:%M:%S"``
+Description Format string used to parse ``string`` values of
+ `date-min` and `date-max`.
+
+ See |strptime|_ for a list of formatting directives.
+=========== =====
+
+
Extractor-specific Options
==========================
@@ -737,24 +788,9 @@ Description Retrieve additional comments by resolving the ``more`` comment
extractor.reddit.date-min & .date-max
-------------------------------------
=========== =====
-Type ``integer`` or ``string``
+Type |Date|_
Default ``0`` and ``253402210800`` (timestamp of |datetime.max|_)
Description Ignore all submissions posted before/after this date.
-
- * If this is an ``integer``, it represents the date as UTC timestamp.
- * If this is a ``string``, it will get parsed according to date-format_.
-=========== =====
-
-
-extractor.reddit.date-format
-----------------------------
-=========== =====
-Type ``string``
-Default ``"%Y-%m-%dT%H:%M:%S"``
-Description An explicit format string used to parse the ``string`` values of
- `date-min and date-max`_.
-
- See |strptime|_ for a list of formatting directives.
=========== =====
@@ -831,6 +867,15 @@ Description Download blog avatars.
=========== =====
+extractor.tumblr.date-min & .date-max
+-------------------------------------
+=========== =====
+Type |Date|_
+Default ``0`` and ``null``
+Description Ignore all posts published before/after this date.
+=========== =====
+
+
extractor.tumblr.external
-------------------------
=========== =====
@@ -877,6 +922,15 @@ Description A (comma-separated) list of post types to extract images, etc. from.
=========== =====
+extractor.twitter.content
+-------------------------
+=========== =====
+Type ``bool``
+Default ``false``
+Description Extract tweet text as ``content`` metadata.
+=========== =====
+
+
extractor.twitter.retweets
--------------------------
=========== =====
@@ -945,6 +999,16 @@ Description Enable/Disable this downloader module.
=========== =====
+downloader.*.mtime
+------------------
+=========== =====
+Type ``bool``
+Default ``true``
+Description Use |Last-Modified|_ HTTP response headers
+ to set file modification times.
+=========== =====
+
+
downloader.*.part
-----------------
=========== =====
@@ -992,7 +1056,8 @@ downloader.*.retries
=========== =====
Type ``integer``
Default `extractor.*.retries`_
-Description Number of retries during file downloads.
+Description Maximum number of retries during file downloads
+ or ``-1`` for infinite retries.
=========== =====
@@ -1240,6 +1305,23 @@ Description Custom format string to build content of metadata files.
=========== =====
+mtime
+-----
+
+Set file modification time according to its metadata
+
+mtime.key
+---------
+=========== =====
+Type ``string``
+Default ``"date"``
+Description Name of the metadata field whose value should be used.
+
+ This value must either be a UNIX timestamp or a
+ |datetime|_ object.
+=========== =====
+
+
ugoira
------
@@ -1375,6 +1457,19 @@ Description Path of the SQLite3 database used to cache login sessions,
=========== =====
+ciphers
+-------
+=========== =====
+Type ``bool`` or ``string``
+Default ``true``
+Description * ``true``: Update urllib3's default cipher list
+ * ``false``: Leave the default cipher list as is
+ * Any ``string``: Replace urllib3's default ciphers with these
+ (See `SSLContext.set_ciphers() <https://docs.python.org/3/library/ssl.html#ssl.SSLContext.set_ciphers>`__
+ for details)
+=========== =====
+
+
API Tokens & IDs
================
@@ -1479,6 +1574,20 @@ Custom Types
============
+Date
+----
+=========== =====
+Type ``string`` or ``integer``
+Examples * ``"2019-01-01T00:00:00"``
+ * ``"2019"`` with ``"%Y"`` as date-format_
+ * ``1546297200``
+Description A |Date|_ value represents a specific point in time.
+
+ * If given as ``string``, it is parsed according to date-format_.
+ * If given as ``integer``, it is interpreted as UTC timestamp.
+=========== =====
+
+
Path
----
=========== =====
@@ -1508,7 +1617,7 @@ Logging Configuration
=========== =====
Type ``object``
-Example .. code::
+Examples .. code::
{
"format": "{asctime} {name}: {message}",
@@ -1517,10 +1626,21 @@ Example .. code::
"encoding": "ascii"
}
+ {
+ "level": "debug",
+ "format": {
+ "debug" : "debug: {message}",
+ "info" : "[{name}] {message}",
+ "warning": "Warning: {message}",
+ "error" : "ERROR: {message}"
+ }
+ }
+
Description Extended logging output configuration.
* format
- * Format string for logging messages
+ * General format string for logging messages
+ or a dictionary with format strings for each loglevel.
In addition to the default
`LogRecord attributes <https://docs.python.org/3/library/logging.html#logrecord-attributes>`__,
@@ -1587,16 +1707,18 @@ Description An object with the ``name`` of a post-processor and its options.
.. |verify| replace:: ``verify``
.. |mature_content| replace:: ``mature_content``
.. |webbrowser.open()| replace:: ``webbrowser.open()``
+.. |datetime| replace:: ``datetime``
.. |datetime.max| replace:: ``datetime.max``
+.. |Date| replace:: ``Date``
.. |Path| replace:: ``Path``
+.. |Last-Modified| replace:: ``Last-Modified``
.. |Logging Configuration| replace:: ``Logging Configuration``
.. |Postprocessor Configuration| replace:: ``Postprocessor Configuration``
.. |strptime| replace:: strftime() and strptime() Behavior
.. _base-directory: `extractor.*.base-directory`_
.. _skipped: `extractor.*.skip`_
-.. _`date-min and date-max`: `extractor.reddit.date-min & .date-max`_
-.. _date-format: extractor.reddit.date-format_
+.. _date-format: `extractor.*.date-format`_
.. _deviantart.metadata: extractor.deviantart.metadata_
.. _.netrc: https://stackoverflow.com/tags/.netrc/info
@@ -1604,12 +1726,14 @@ Description An object with the ``name`` of a post-processor and its options.
.. _requests.request(): https://docs.python-requests.org/en/master/api/#requests.request
.. _timeout: https://docs.python-requests.org/en/latest/user/advanced/#timeouts
.. _verify: https://docs.python-requests.org/en/master/user/advanced/#ssl-cert-verification
+.. _Last-Modified: https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.29
.. _`Requests' proxy documentation`: http://docs.python-requests.org/en/master/user/advanced/#proxies
.. _format string: https://docs.python.org/3/library/string.html#formatstrings
.. _format strings: https://docs.python.org/3/library/string.html#formatstrings
.. _strptime: https://docs.python.org/3/library/datetime.html#strftime-strptime-behavior
.. _mature_content: https://www.deviantart.com/developers/http/v1/20160316/object/deviation
.. _webbrowser.open(): https://docs.python.org/3/library/webbrowser.html
+.. _datetime: https://docs.python.org/3/library/datetime.html#datetime-objects
.. _datetime.max: https://docs.python.org/3/library/datetime.html#datetime.datetime.max
.. _Authentication: https://github.com/mikf/gallery-dl#5authentication
.. _youtube-dl: https://github.com/ytdl-org/youtube-dl
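
Note on the new ``Date`` type: the patch says that a ``string`` value is parsed according to ``date-format`` and an ``integer`` value is interpreted as a UTC timestamp. The sketch below illustrates that rule with Python's standard ``datetime`` module. It is not taken from the gallery-dl sources: the helper name ``parse_date`` is hypothetical, and treating a parsed string as UTC is an assumption, since the patch does not state which timezone applies to strings.

```python
from datetime import datetime, timezone

# Default taken from the documented ``extractor.*.date-format`` option.
DEFAULT_DATE_FORMAT = "%Y-%m-%dT%H:%M:%S"


def parse_date(value, date_format=DEFAULT_DATE_FORMAT):
    """Hypothetical helper: turn a ``Date`` config value into a UTC timestamp.

    * ``int``: returned unchanged (already a UTC timestamp).
    * ``str``: parsed with ``datetime.strptime`` using ``date_format``.
      Assumption: the resulting naive datetime is treated as UTC.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool):
        raise TypeError("Date must be a string or an integer")
    if isinstance(value, int):
        return value
    if isinstance(value, str):
        parsed = datetime.strptime(value, date_format)
        return int(parsed.replace(tzinfo=timezone.utc).timestamp())
    raise TypeError("Date must be a string or an integer")


if __name__ == "__main__":
    print(parse_date("2019-01-01T00:00:00"))
    print(parse_date("2019", "%Y"))
    print(parse_date(1546297200))
```

Under the UTC assumption, the two string examples from the ``Date`` section map to the same instant, while an integer passes through untouched.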