| field | value |
|---|---|
| author | 2025-05-05 01:18:58 -0400 |
| committer | 2025-05-05 01:18:58 -0400 |
| commit | c679cd7a13bdbf6896e53d68fe2093910bc6625a |
| tree | 6047abcc55283d7e631b7a73039865417a303428 |
| parent | 4a18b5837c1dd82f5964afcfc3fecc53cd97e79c |
New upstream version 1.29.6 (tag: upstream/1.29.6)
27 files changed, 590 insertions(+), 262 deletions(-)
```diff
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 182d685..2d352f5 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,41 +1,22 @@
-## 1.29.5 - 2025-04-26
+## 1.29.6 - 2025-05-04
 ### Extractors
 #### Additions
-- [bluesky] add `video` extractor ([#4438](https://github.com/mikf/gallery-dl/issues/4438))
-- [instagram] add `followers` extractor ([#7374](https://github.com/mikf/gallery-dl/issues/7374))
-- [itaku] add `stars` extractor ([#7411](https://github.com/mikf/gallery-dl/issues/7411))
-- [pictoa] add support ([#6683](https://github.com/mikf/gallery-dl/issues/6683) [#7409](https://github.com/mikf/gallery-dl/issues/7409))
-- [twitter] add `followers` extractor ([#6331](https://github.com/mikf/gallery-dl/issues/6331))
+- [manganelo] support `nelomanga.net` and mirror domains ([#7423](https://github.com/mikf/gallery-dl/issues/7423))
 #### Fixes
-- [architizer] fix `project` extractor ([#7421](https://github.com/mikf/gallery-dl/issues/7421))
-- [bluesky:likes] fix infinite loop ([#7194](https://github.com/mikf/gallery-dl/issues/7194) [#7287](https://github.com/mikf/gallery-dl/issues/7287))
-- [deviantart] fix `401 Unauthorized` errors for for multi-image posts ([#6653](https://github.com/mikf/gallery-dl/issues/6653))
-- [everia] fix `title` extraction ([#7379](https://github.com/mikf/gallery-dl/issues/7379))
-- [fanbox] fix `comments` extraction
-- [fapello] stop pagination on empty results ([#7385](https://github.com/mikf/gallery-dl/issues/7385))
-- [kemonoparty] fix `archives` option ([#7416](https://github.com/mikf/gallery-dl/issues/7416) [#7419](https://github.com/mikf/gallery-dl/issues/7419))
-- [pixiv] fix `user_details` requests not being cached ([#7414](https://github.com/mikf/gallery-dl/issues/7414))
-- [pixiv:novel] handle exceptions during `embeds` extraction ([#7422](https://github.com/mikf/gallery-dl/issues/7422))
-- [subscribestar] fix username & password login
-- [wikifeet] support site redesign ([#7286](https://github.com/mikf/gallery-dl/issues/7286) [#7396](https://github.com/mikf/gallery-dl/issues/7396))
+- [deviantart] unescape `\'` in JSON data ([#6653](https://github.com/mikf/gallery-dl/issues/6653))
+- [kemonoparty] revert to using default creator posts endpoint ([#7438](https://github.com/mikf/gallery-dl/issues/7438) [#7450](https://github.com/mikf/gallery-dl/issues/7450) [#7462](https://github.com/mikf/gallery-dl/issues/7462))
+- [pixiv:novel] fix `embeds` extraction by using AJAX API ([#7422](https://github.com/mikf/gallery-dl/issues/7422) [#7435](https://github.com/mikf/gallery-dl/issues/7435))
+- [scrolller] fix exception for albums with missing media ([#7428](https://github.com/mikf/gallery-dl/issues/7428))
+- [twitter] fix `404 Not Found ()` errors ([#7382](https://github.com/mikf/gallery-dl/issues/7382) [#7386](https://github.com/mikf/gallery-dl/issues/7386) [#7426](https://github.com/mikf/gallery-dl/issues/7426) [#7430](https://github.com/mikf/gallery-dl/issues/7430) [#7431](https://github.com/mikf/gallery-dl/issues/7431) [#7445](https://github.com/mikf/gallery-dl/issues/7445) [#7459](https://github.com/mikf/gallery-dl/issues/7459))
 #### Improvements
-- [bluesky:likes] use `repo.listRecords` endpoint ([#7194](https://github.com/mikf/gallery-dl/issues/7194) [#7287](https://github.com/mikf/gallery-dl/issues/7287))
-- [gelbooru] don't hardcode image server domains ([#7392](https://github.com/mikf/gallery-dl/issues/7392))
-- [instagram] support `/share/` URLs ([#7241](https://github.com/mikf/gallery-dl/issues/7241))
-- [kemonoparty] use `/posts-legacy` endpoint ([#6780](https://github.com/mikf/gallery-dl/issues/6780) [#6931](https://github.com/mikf/gallery-dl/issues/6931) [#7404](https://github.com/mikf/gallery-dl/issues/7404))
-- [naver] support videos ([#4682](https://github.com/mikf/gallery-dl/issues/4682) [#7395](https://github.com/mikf/gallery-dl/issues/7395))
-- [scrolller] support album posts ([#7339](https://github.com/mikf/gallery-dl/issues/7339))
-- [subscribestar] add warning for missing login cookie
-- [twitter] update API endpoint query hashes ([#7382](https://github.com/mikf/gallery-dl/issues/7382) [#7386](https://github.com/mikf/gallery-dl/issues/7386))
-- [weasyl] use `gallery-dl` User-Agent header ([#7412](https://github.com/mikf/gallery-dl/issues/7412))
+- [kemonoparty] add `endpoint` option ([#7438](https://github.com/mikf/gallery-dl/issues/7438) [#7450](https://github.com/mikf/gallery-dl/issues/7450) [#7462](https://github.com/mikf/gallery-dl/issues/7462))
+- [tumblr] improve error message for dashboard-only blogs ([#7455](https://github.com/mikf/gallery-dl/issues/7455))
+- [weasyl] support `/view/` URLs ([#7469](https://github.com/mikf/gallery-dl/issues/7469))
 #### Metadata
-- [deviantart:stash] extract more metadata ([#7397](https://github.com/mikf/gallery-dl/issues/7397))
-- [moebooru:pool] replace underscores in pool names ([#4646](https://github.com/mikf/gallery-dl/issues/4646))
-- [naver] fix recent `date` bug ([#4682](https://github.com/mikf/gallery-dl/issues/4682))
+- [chevereto] extract `date` metadata ([#7437](https://github.com/mikf/gallery-dl/issues/7437))
+- [civitai] implement retrieving `model` and `version` metadata ([#7432](https://github.com/mikf/gallery-dl/issues/7432))
+- [manganelo] extract more metadata
 ### Post Processors
-- [ugoira] restore `keep-files` functionality ([#7304](https://github.com/mikf/gallery-dl/issues/7304))
-- [ugoira] support `"keep-files": true` + custom extension ([#7304](https://github.com/mikf/gallery-dl/issues/7304))
-- [ugoira] use `_ugoira_frame_index` to detect `.zip` files
+- [directory] add `directory` post processor ([#7432](https://github.com/mikf/gallery-dl/issues/7432))
 ### Miscellaneous
-- [util] auto-update Chrome version
-- use internal version of `re.compile()` for extractor patterns
+- [job] do not reset skip count when `skip-filter` fails ([#7433](https://github.com/mikf/gallery-dl/issues/7433))
diff --git a/PKG-INFO b/PKG-INFO
--- a/PKG-INFO
+++ b/PKG-INFO
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: gallery_dl
-Version: 1.29.5
+Version: 1.29.6
 Summary: Command-line program to download image galleries and collections from several image hosting sites
 Home-page: https://github.com/mikf/gallery-dl
 Download-URL: https://github.com/mikf/gallery-dl/releases/latest
@@ -133,9 +133,9 @@ Standalone Executable
 Prebuilt executable files with a Python interpreter and
 required Python packages included are available for
-- `Windows <https://github.com/mikf/gallery-dl/releases/download/v1.29.5/gallery-dl.exe>`__
+- `Windows <https://github.com/mikf/gallery-dl/releases/download/v1.29.6/gallery-dl.exe>`__
   (Requires `Microsoft Visual C++ Redistributable Package (x86) <https://aka.ms/vs/17/release/vc_redist.x86.exe>`__)
-- `Linux <https://github.com/mikf/gallery-dl/releases/download/v1.29.5/gallery-dl.bin>`__
+- `Linux <https://github.com/mikf/gallery-dl/releases/download/v1.29.6/gallery-dl.bin>`__

 Nightly Builds
diff --git a/README.rst b/README.rst
--- a/README.rst
+++ b/README.rst
@@ -77,9 +77,9 @@ Standalone Executable
 Prebuilt executable files with a Python interpreter and
 required Python packages included are available for
-- `Windows <https://github.com/mikf/gallery-dl/releases/download/v1.29.5/gallery-dl.exe>`__
+- `Windows <https://github.com/mikf/gallery-dl/releases/download/v1.29.6/gallery-dl.exe>`__
   (Requires `Microsoft Visual C++ Redistributable Package (x86) <https://aka.ms/vs/17/release/vc_redist.x86.exe>`__)
-- `Linux <https://github.com/mikf/gallery-dl/releases/download/v1.29.5/gallery-dl.bin>`__
+- `Linux <https://github.com/mikf/gallery-dl/releases/download/v1.29.6/gallery-dl.bin>`__

 Nightly Builds
diff --git a/data/man/gallery-dl.1 b/data/man/gallery-dl.1
index 7a6c97d..a50a0c0 100644
--- a/data/man/gallery-dl.1
+++ b/data/man/gallery-dl.1
@@ -1,4 +1,4 @@
-.TH "GALLERY-DL" "1" "2025-04-26" "1.29.5" "gallery-dl Manual"
+.TH "GALLERY-DL" "1" "2025-05-04" "1.29.6" "gallery-dl Manual"
 .\" disable hyphenation
 .nh
diff --git a/data/man/gallery-dl.conf.5 b/data/man/gallery-dl.conf.5
index d329d9c..ba2e048 100644
--- a/data/man/gallery-dl.conf.5
+++ b/data/man/gallery-dl.conf.5
@@ -1,4 +1,4 @@
-.TH "GALLERY-DL.CONF" "5" "2025-04-26" "1.29.5" "gallery-dl Manual"
+.TH "GALLERY-DL.CONF" "5" "2025-05-04" "1.29.6" "gallery-dl Manual"
 .\" disable hyphenation
 .nh
 .\" disable justification (adjust text to left margin only)
@@ -2167,12 +2167,12 @@
 It is possible to use \f[I]"all"\f[] instead of listing all values separately.
 .IP "Example:" 4
 .br
-* "generation"
+* "generation,version"
 .br
-* ["generation"]
+* ["generation", "version"]
 .IP "Description:" 4
-Extract additional \f[I]generation\f[] metadata.
+Extract additional \f[I]generation\f[] and \f[I]version\f[] metadata.
 Note: This requires 1 additional HTTP request per image or video.
@@ -3659,12 +3659,47 @@
 Extract a user's direct messages as \f[I]dms\f[] metadata.
 Extract a user's announcements as \f[I]announcements\f[] metadata.
+.SS extractor.kemonoparty.endpoint
+.IP "Type:" 6
+\f[I]string\f[]
+
+.IP "Default:" 9
+\f[I]"posts"\f[]
+
+.IP "Description:" 4
+API endpoint to use for retrieving creator posts.
+
+\f[I]"legacy"\f[]
+Use the results from
+.br
+\f[I]/v1/{service}/user/{creator_id}/posts-legacy\f[]
+Provides less metadata, but is more reliable at returning all posts.
+.br
+Supports filtering results by \f[I]tag\f[] query parameter.
+.br
+\f[I]"legacy+"\f[]
+Use the results from
+.br
+\f[I]/v1/{service}/user/{creator_id}/posts-legacy\f[]
+to retrieve post IDs
+and one request to
+.br
+\f[I]/v1/{service}/user/{creator_id}/post/{post_id}\f[]
+to get a full set of metadata for each.
+\f[I]"posts"\f[]
+Use the results from
+.br
+\f[I]/v1/{service}/user/{creator_id}\f[]
+Provides more metadata, but might not return a creator's first/last posts.
+.br
+
+
 .SS extractor.kemonoparty.favorites
 .IP "Type:" 6
 \f[I]string\f[]
 .IP "Default:" 9
-\f[I]artist\f[]
+\f[I]"artist"\f[]
 .IP "Description:" 4
 Determines the type of favorites to be downloaded.
@@ -7523,6 +7558,22 @@
 after \f[I]N\f[] consecutive files compared as equal.
 Only compare file sizes.
 Do not read and compare their content.
+.SS directory.event
+.IP "Type:" 6
+.br
+* \f[I]string\f[]
+.br
+* \f[I]list\f[] of \f[I]strings\f[]
+
+.IP "Default:" 9
+\f[I]"prepare"\f[]
+
+.IP "Description:" 4
+The event(s) for which \f[I]directory\f[] format strings are (re)evaluated.
+
+See \f[I]metadata.event\f[] for a list of available events.
+
+
 .SS exec.archive
 .IP "Type:" 6
 .br
@@ -9049,6 +9100,8 @@
 Compare versions of the same file and replace/enumerate them on mismatch
 .br
 (requires \f[I]downloader.*.part\f[] = \f[I]true\f[] and \f[I]extractor.*.skip\f[] = \f[I]false\f[])
 .br
+\f[I]directory\f[]
+Reevaluate \f[I]directory\f[] format strings
 \f[I]exec\f[]
 Execute external commands
 \f[I]hash\f[]
diff --git a/docs/gallery-dl.conf b/docs/gallery-dl.conf
index b8b46d6..2df1ec3 100644
--- a/docs/gallery-dl.conf
+++ b/docs/gallery-dl.conf
@@ -379,6 +379,7 @@
             "comments"   : false,
             "dms"        : false,
             "duplicates" : false,
+            "endpoint"   : "posts",
             "favorites"  : "artist",
             "files"      : ["attachments", "file", "inline"],
             "max-posts"  : null,
diff --git a/gallery_dl.egg-info/PKG-INFO b/gallery_dl.egg-info/PKG-INFO
index 7d3e9ca..0c2a61e 100644
--- a/gallery_dl.egg-info/PKG-INFO
+++ b/gallery_dl.egg-info/PKG-INFO
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: gallery_dl
-Version: 1.29.5
+Version: 1.29.6
 Summary: Command-line program to download image galleries and collections from several image hosting sites
 Home-page: https://github.com/mikf/gallery-dl
 Download-URL: https://github.com/mikf/gallery-dl/releases/latest
@@ -133,9 +133,9 @@ Standalone Executable
 Prebuilt executable files with a Python interpreter and
 required Python packages included are available for
-- `Windows <https://github.com/mikf/gallery-dl/releases/download/v1.29.5/gallery-dl.exe>`__
+- `Windows <https://github.com/mikf/gallery-dl/releases/download/v1.29.6/gallery-dl.exe>`__
   (Requires `Microsoft Visual C++ Redistributable Package (x86) <https://aka.ms/vs/17/release/vc_redist.x86.exe>`__)
-- `Linux <https://github.com/mikf/gallery-dl/releases/download/v1.29.5/gallery-dl.bin>`__
+- `Linux <https://github.com/mikf/gallery-dl/releases/download/v1.29.6/gallery-dl.bin>`__

 Nightly Builds
diff --git a/gallery_dl.egg-info/SOURCES.txt b/gallery_dl.egg-info/SOURCES.txt
index a6afdd7..5dc17bd 100644
--- a/gallery_dl.egg-info/SOURCES.txt
+++ b/gallery_dl.egg-info/SOURCES.txt
@@ -28,6 +28,7 @@
 gallery_dl/option.py
 gallery_dl/output.py
 gallery_dl/path.py
 gallery_dl/text.py
+gallery_dl/transaction_id.py
 gallery_dl/update.py
 gallery_dl/util.py
 gallery_dl/version.py
@@ -147,7 +148,6 @@
 gallery_dl/extractor/lynxchan.py
 gallery_dl/extractor/mangadex.py
 gallery_dl/extractor/mangafox.py
 gallery_dl/extractor/mangahere.py
-gallery_dl/extractor/mangakakalot.py
 gallery_dl/extractor/manganelo.py
 gallery_dl/extractor/mangapark.py
 gallery_dl/extractor/mangaread.py
@@ -261,6 +261,7 @@
 gallery_dl/postprocessor/__init__.py
 gallery_dl/postprocessor/classify.py
 gallery_dl/postprocessor/common.py
 gallery_dl/postprocessor/compare.py
+gallery_dl/postprocessor/directory.py
 gallery_dl/postprocessor/exec.py
 gallery_dl/postprocessor/hash.py
 gallery_dl/postprocessor/metadata.py
diff --git a/gallery_dl/extractor/__init__.py b/gallery_dl/extractor/__init__.py
index 9a7ca53..2da471e 100644
--- a/gallery_dl/extractor/__init__.py
+++ b/gallery_dl/extractor/__init__.py
@@ -105,7 +105,6 @@ modules = [
     "mangadex",
     "mangafox",
     "mangahere",
-    "mangakakalot",
     "manganelo",
     "mangapark",
     "mangaread",
diff --git a/gallery_dl/extractor/chevereto.py b/gallery_dl/extractor/chevereto.py
index 600d231..dc963c5 100644
--- a/gallery_dl/extractor/chevereto.py
+++ b/gallery_dl/extractor/chevereto.py
@@ -78,6 +78,8 @@ class CheveretoImageExtractor(CheveretoExtractor):
             "id"   : self.path.rpartition(".")[2],
             "url"  : url,
             "album": text.extr(extr("Added to <a", "/a>"), ">", "<"),
+            "date" : text.parse_datetime(extr(
+                '<span title="', '"'), "%Y-%m-%d %H:%M:%S"),
             "user" : extr('username: "', '"'),
         }
diff --git a/gallery_dl/extractor/civitai.py b/gallery_dl/extractor/civitai.py
index 034a3c2..de8f86c 100644
--- a/gallery_dl/extractor/civitai.py
+++ b/gallery_dl/extractor/civitai.py
@@ -10,6 +10,7 @@
 from .common import Extractor, Message
 from .. import text, util, exception
+from ..cache import memcache
 import itertools
 import time
@@ -49,10 +50,11 @@ class CivitaiExtractor(Extractor):
             if isinstance(metadata, str):
                 metadata = metadata.split(",")
             elif not isinstance(metadata, (list, tuple)):
-                metadata = ("generation",)
+                metadata = ("generation", "version")
             self._meta_generation = ("generation" in metadata)
+            self._meta_version = ("version" in metadata)
         else:
-            self._meta_generation = False
+            self._meta_generation = self._meta_version = False

     def items(self):
         models = self.models()
@@ -77,9 +79,12 @@ class CivitaiExtractor(Extractor):
                     post["publishedAt"], "%Y-%m-%dT%H:%M:%S.%fZ")
                 data = {
                     "post": post,
-                    "user": post["user"],
+                    "user": post.pop("user"),
                 }
-                del post["user"]
+                if self._meta_version:
+                    data["version"] = version = self.api.model_version(
+                        post["modelVersionId"]).copy()
+                    data["model"] = version.pop("model")
                 yield Message.Directory, data
                 for file in self._image_results(images):
@@ -94,6 +99,18 @@
                 if self._meta_generation:
                     image["generation"] = self.api.image_generationdata(
                         image["id"])
+                if self._meta_version:
+                    if "modelVersionId" in image:
+                        version_id = image["modelVersionId"]
+                    else:
+                        post = image["post"] = self.api.post(
+                            image["postId"])
+                        post.pop("user", None)
+                        version_id = post["modelVersionId"]
+                    image["version"] = version = self.api.model_version(
+                        version_id).copy()
+                    image["model"] = version.pop("model")
+
                 image["date"] = text.parse_datetime(
                     image["createdAt"], "%Y-%m-%dT%H:%M:%S.%fZ")
                 text.nameext_from_url(url, image)
@@ -464,6 +481,7 @@ class CivitaiRestAPI():
         endpoint = "/v1/models/{}".format(model_id)
         return self._call(endpoint)

+    @memcache(keyarg=1)
     def model_version(self, model_version_id):
         endpoint = "/v1/model-versions/{}".format(model_version_id)
         return self._call(endpoint)
@@ -504,7 +522,7 @@ class CivitaiTrpcAPI():
         self.root = extractor.root + "/api/trpc/"
         self.headers = {
             "content-type"    : "application/json",
-            "x-client-version": "5.0.542",
+            "x-client-version": "5.0.701",
             "x-client-date"   : "",
             "x-client"        : "web",
             "x-fingerprint"   : "undefined",
@@ -576,6 +594,7 @@
         params = {"id": int(model_id)}
         return self._call(endpoint, params)

+    @memcache(keyarg=1)
     def model_version(self, model_version_id):
         endpoint = "modelVersion.getById"
         params = {"id": int(model_version_id)}
diff --git a/gallery_dl/extractor/deviantart.py b/gallery_dl/extractor/deviantart.py
index ae475e2..37f57fe 100644
--- a/gallery_dl/extractor/deviantart.py
+++ b/gallery_dl/extractor/deviantart.py
@@ -868,7 +868,9 @@ x2="45.4107524%" y2="71.4898596%" id="app-root-3">\
             yield self.api.deviation(deviation_uuid)

     def _unescape_json(self, json):
-        return json.replace('\\"', '"').replace("\\\\", "\\")
+        return json.replace('\\"', '"') \
+                   .replace("\\'", "'") \
+                   .replace("\\\\", "\\")

 class DeviantartUserExtractor(DeviantartExtractor):
diff --git a/gallery_dl/extractor/kemonoparty.py b/gallery_dl/extractor/kemonoparty.py
index 79070ee..4893f19 100644
--- a/gallery_dl/extractor/kemonoparty.py
+++ b/gallery_dl/extractor/kemonoparty.py
@@ -317,11 +317,25 @@ class KemonopartyUserExtractor(KemonopartyExtractor):
         KemonopartyExtractor.__init__(self, match)

     def posts(self):
+        endpoint = self.config("endpoint")
+        if endpoint == "legacy":
+            endpoint = self.api.creator_posts_legacy
+        elif endpoint == "legacy+":
+            endpoint = self._posts_legacy_plus
+        else:
+            endpoint = self.api.creator_posts
+
         _, _, service, creator_id, query = self.groups
         params = text.parse_query(query)
-        return self.api.creator_posts_legacy(
-            service, creator_id,
-            params.get("o"), params.get("q"), params.get("tag"))
+        return endpoint(service, creator_id,
+                        params.get("o"), params.get("q"), params.get("tag"))
+
+    def _posts_legacy_plus(self, service, creator_id,
+                           offset=0, query=None, tags=None):
+        for post in self.api.creator_posts_legacy(
+                service, creator_id, offset, query, tags):
+            yield self.api.creator_post(
+                service, creator_id, post["id"])["post"]

 class KemonopartyPostsExtractor(KemonopartyExtractor):
@@ -525,9 +539,10 @@ class KemonoAPI():
         endpoint = "/file/" + file_hash
         return self._call(endpoint)

-    def creator_posts(self, service, creator_id, offset=0, query=None):
+    def creator_posts(self, service, creator_id,
+                      offset=0, query=None, tags=None):
         endpoint = "/{}/user/{}".format(service, creator_id)
-        params = {"q": query, "o": offset}
+        params = {"q": query, "tag": tags, "o": offset}
         return self._pagination(endpoint, params, 50)

     def creator_posts_legacy(self, service, creator_id,
diff --git a/gallery_dl/extractor/mangakakalot.py b/gallery_dl/extractor/mangakakalot.py
deleted file mode 100644
index 9fc8681..0000000
--- a/gallery_dl/extractor/mangakakalot.py
+++ /dev/null
@@ -1,92 +0,0 @@
-# -*- coding: utf-8 -*-
-
-# Copyright 2020 Jake Mannens
-# Copyright 2021-2023 Mike Fährmann
-#
-# This program is free software; you can redistribute it and/or modify
-# it under the terms of the GNU General Public License version 2 as
-# published by the Free Software Foundation.
-
-"""Extractors for https://mangakakalot.tv/"""
-
-from .common import ChapterExtractor, MangaExtractor
-from .. import text
-import re
-
-BASE_PATTERN = r"(?:https?://)?(?:ww[\dw]?\.)?mangakakalot\.tv"
-
-
-class MangakakalotBase():
-    """Base class for mangakakalot extractors"""
-    category = "mangakakalot"
-    root = "https://ww8.mangakakalot.tv"
-
-
-class MangakakalotChapterExtractor(MangakakalotBase, ChapterExtractor):
-    """Extractor for manga chapters from mangakakalot.tv"""
-    pattern = BASE_PATTERN + r"(/chapter/[^/?#]+/chapter[_-][^/?#]+)"
-    example = "https://ww6.mangakakalot.tv/chapter/manga-ID/chapter-01"
-
-    def __init__(self, match):
-        self.path = match.group(1)
-        ChapterExtractor.__init__(self, match, self.root + self.path)
-
-    def metadata(self, page):
-        _     , pos = text.extract(page, '<span itemprop="title">', '<')
-        manga , pos = text.extract(page, '<span itemprop="title">', '<', pos)
-        info  , pos = text.extract(page, '<span itemprop="title">', '<', pos)
-        author, pos = text.extract(page, '. Author:', ' already has ', pos)
-
-        match = re.match(
-            r"(?:[Vv]ol\. *(\d+) )?"
-            r"[Cc]hapter *([^:]*)"
-            r"(?:: *(.+))?", info or "")
-        volume, chapter, title = match.groups() if match else ("", "", info)
-        chapter, sep, minor = chapter.partition(".")
-
-        return {
-            "manga"   : text.unescape(manga),
-            "title"   : text.unescape(title) if title else "",
-            "author"  : text.unescape(author).strip() if author else "",
-            "volume"  : text.parse_int(volume),
-            "chapter" : text.parse_int(chapter),
-            "chapter_minor": sep + minor,
-            "lang"    : "en",
-            "language": "English",
-        }
-
-    def images(self, page):
-        return [
-            (url, None)
-            for url in text.extract_iter(page, '<img data-src="', '"')
-        ]
-
-
-class MangakakalotMangaExtractor(MangakakalotBase, MangaExtractor):
-    """Extractor for manga from mangakakalot.tv"""
-    chapterclass = MangakakalotChapterExtractor
-    pattern = BASE_PATTERN + r"(/manga/[^/?#]+)"
-    example = "https://ww6.mangakakalot.tv/manga/manga-ID"
-
-    def chapters(self, page):
-        data = {"lang": "en", "language": "English"}
-        data["manga"], pos = text.extract(page, "<h1>", "<")
-        author, pos = text.extract(page, "<li>Author(s) :", "</a>", pos)
-        data["author"] = text.remove_html(author)
-
-        results = []
-        for chapter in text.extract_iter(page, '<div class="row">', '</div>'):
-            url  , pos = text.extract(chapter, '<a href="', '"')
-            title, pos = text.extract(chapter, '>', '</a>', pos)
-            data["title"] = title.partition(": ")[2]
-            data["date"] , pos = text.extract(
-                chapter, '<span title=" ', '"', pos)
-
-            chapter, sep, minor = url.rpartition("/chapter-")[2].partition(".")
-            data["chapter"] = text.parse_int(chapter)
-            data["chapter_minor"] = sep + minor
-
-            if url[0] == "/":
-                url = self.root + url
-            results.append((url, data.copy()))
-        return results
diff --git a/gallery_dl/extractor/manganelo.py b/gallery_dl/extractor/manganelo.py
index 232b98d..5e92aee 100644
--- a/gallery_dl/extractor/manganelo.py
+++ b/gallery_dl/extractor/manganelo.py
@@ -1,107 +1,128 @@
 # -*- coding: utf-8 -*-

+# Copyright 2020 Jake Mannens
+# Copyright 2021-2025 Mike Fährmann
+#
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License version 2 as
 # published by the Free Software Foundation.

-"""Extractors for https://manganato.com/"""
+"""Extractors for https://www.mangakakalot.gg/ and mirror sites"""

-from .common import ChapterExtractor, MangaExtractor
-from .. import text
-import re
+from .common import BaseExtractor, ChapterExtractor, MangaExtractor
+from .. import text, util

-BASE_PATTERN = (
-    r"(?:https?://)?"
-    r"((?:chap|read|www\.|m\.)?mangan(?:at|el)o"
-    r"\.(?:to|com))"
-)

+class ManganeloExtractor(BaseExtractor):
+    basecategory = "manganelo"

-class ManganeloBase():
-    category = "manganelo"
-    root = "https://chapmanganato.com"
-    _match_chapter = None

-    def __init__(self, match):
-        domain, path = match.groups()
-        super().__init__(match, "https://" + domain + path)
-
-    def _init(self):
-        if self._match_chapter is None:
-            ManganeloBase._match_chapter = re.compile(
-                r"(?:[Vv]ol\.?\s*(\d+)\s?)?"
-                r"[Cc]hapter\s*(\d+)([^:]*)"
-                r"(?::\s*(.+))?").match
-
-    def _parse_chapter(self, info, manga, author, date=None):
-        match = self._match_chapter(info)
-        if match:
-            volume, chapter, minor, title = match.groups()
-        else:
-            volume = chapter = minor = ""
-            title = info
-
-        return {
-            "manga"   : manga,
-            "author"  : author,
-            "date"    : date,
-            "title"   : text.unescape(title) if title else "",
-            "volume"  : text.parse_int(volume),
-            "chapter" : text.parse_int(chapter),
-            "chapter_minor": minor,
-            "lang"    : "en",
-            "language": "English",
-        }

+BASE_PATTERN = ManganeloExtractor.update({
+    "nelomanga": {
+        "root"   : "https://www.nelomanga.net",
+        "pattern": r"(?:www\.)?nelomanga\.net",
+    },
+    "natomanga": {
+        "root"   : "https://www.natomanga.com",
+        "pattern": r"(?:www\.)?natomanga\.com",
+    },
+    "manganato": {
+        "root"   : "https://www.manganato.gg",
+        "pattern": r"(?:www\.)?manganato\.gg",
+    },
+    "mangakakalot": {
+        "root"   : "https://www.mangakakalot.gg",
+        "pattern": r"(?:www\.)?mangakakalot\.gg",
+    },
+})

-class ManganeloChapterExtractor(ManganeloBase, ChapterExtractor):
-    """Extractor for manga chapters from manganelo.com"""
-    pattern = BASE_PATTERN + r"(/(?:manga-\w+|chapter/\w+)/chapter[-_][^/?#]+)"
-    example = "https://chapmanganato.com/manga-ID/chapter-01"
+class ManganeloChapterExtractor(ManganeloExtractor, ChapterExtractor):
+    """Extractor for manganelo manga chapters"""
+    pattern = BASE_PATTERN + r"(/manga/[^/?#]+/chapter-[^/?#]+)"
+    example = "https://www.mangakakalot.gg/manga/MANGA_NAME/chapter-123"
+
+    def __init__(self, match):
+        ManganeloExtractor.__init__(self, match)
+        self.gallery_url = self.root + self.groups[-1]

     def metadata(self, page):
         extr = text.extract_from(page)
-        extr('class="a-h"', ">")
-        manga = extr('title="', '"')
-        info = extr('title="', '"')
-        author = extr("- Author(s) : ", "</p>")
-        return self._parse_chapter(
-            info, text.unescape(manga), text.unescape(author))
+        data = {
+            "date"        : text.parse_datetime(extr(
+                '"datePublished": "', '"')[:19], "%Y-%m-%dT%H:%M:%S"),
+            "date_updated": text.parse_datetime(extr(
+                '"dateModified": "', '"')[:19], "%Y-%m-%dT%H:%M:%S"),
+            "manga_id"    : text.parse_int(extr("comic_id =", ";")),
+            "chapter_id"  : text.parse_int(extr("chapter_id =", ";")),
+            "manga"       : extr("comic_name =", ";").strip('" '),
+            "lang"        : "en",
+            "language"    : "English",
+        }
+
+        chapter_name = extr("chapter_name =", ";").strip('" ')
+        chapter, sep, minor = chapter_name.rpartition(" ")[2].partition(".")
+        data["chapter"] = text.parse_int(chapter)
+        data["chapter_minor"] = sep + minor
+        data["author"] = extr(". Author:", " already has ").strip()
+
+        return data

     def images(self, page):
-        page = text.extr(
-            page, 'class="container-chapter-reader', 'class="container')
+        extr = text.extract_from(page)
+        cdns = util.json_loads(extr("var cdns =", ";"))[0]
+        imgs = util.json_loads(extr("var chapterImages =", ";"))
+
+        if cdns[-1] != "/":
+            cdns += "/"
+
         return [
-            (url, None)
-            for url in text.extract_iter(page, '<img src="', '"')
-            if not url.endswith("/gohome.png")
-        ] or [
-            (url, None)
-            for url in text.extract_iter(
-                page, '<img class="reader-content" src="', '"')
+            (cdns + path, None)
+            for path in imgs
         ]

-class ManganeloMangaExtractor(ManganeloBase, MangaExtractor):
-    """Extractor for manga from manganelo.com"""
+class ManganeloMangaExtractor(ManganeloExtractor, MangaExtractor):
+    """Extractor for manganelo manga"""
     chapterclass = ManganeloChapterExtractor
-    pattern = BASE_PATTERN + r"(/(?:manga[-/]|read_)\w+)/?$"
-    example = "https://manganato.com/manga-ID"
+    pattern = BASE_PATTERN + r"(/manga/[^/?#]+)$"
+    example = "https://www.mangakakalot.gg/manga/MANGA_NAME"

-    def chapters(self, page):
-        results = []
-        append = results.append
+    def __init__(self, match):
+        ManganeloExtractor.__init__(self, match)
+        self.manga_url = self.root + self.groups[-1]

+    def chapters(self, page):
         extr = text.extract_from(page)
+
         manga = text.unescape(extr("<h1>", "<"))
-        author = text.remove_html(extr("</i>Author(s) :</td>", "</tr>"))
-
-        extr('class="row-content-chapter', '')
-        while True:
-            url = extr('class="chapter-name text-nowrap" href="', '"')
-            if not url:
-                return results
-            info = extr(">", "<")
-            date = extr('class="chapter-time text-nowrap" title="', '"')
-            append((url, self._parse_chapter(info, manga, author, date)))
+        author = text.remove_html(extr("<li>Author(s) :", "</a>"))
+        status = extr("<li>Status :", "<").strip()
+        update = text.parse_datetime(extr(
+            "<li>Last updated :", "<").strip(), "%b-%d-%Y %I:%M:%S %p")
+        tags = text.split_html(extr(">Genres :", "</li>"))[::2]
+
+        results = []
+        for chapter in text.extract_iter(page, '<div class="row">', '</div>'):
+            url, pos = text.extract(chapter, '<a href="', '"')
+            title, pos = text.extract(chapter, '>', '</a>', pos)
+            date, pos = text.extract(chapter, '<span title="', '"', pos)
+            chapter, sep, minor = url.rpartition("/chapter-")[2].partition("-")
+
+            if url[0] == "/":
+                url = self.root + url
+            results.append((url, {
+                "manga"   : manga,
+                "author"  : author,
+                "status"  : status,
+                "tags"    : tags,
+                "date_updated": update,
+                "chapter" : text.parse_int(chapter),
+                "chapter_minor": (sep and ".") + minor,
+                "title"   : title.partition(": ")[2],
+                "date"    : text.parse_datetime(date, "%b-%d-%Y %H:%M"),
+                "lang"    : "en",
+                "language": "English",
+            }))
+        return results
diff --git a/gallery_dl/extractor/pixiv.py b/gallery_dl/extractor/pixiv.py
index dfed1aa..c063216 100644
--- a/gallery_dl/extractor/pixiv.py
+++ b/gallery_dl/extractor/pixiv.py
@@ -866,16 +866,6 @@ class PixivNovelExtractor(PixivExtractor):
         embeds = self.config("embeds")
         covers = self.config("covers")

-        if embeds:
-            headers = {
-                "User-Agent"    : "Mozilla/5.0",
-                "App-OS"        : None,
-                "App-OS-Version": None,
-                "App-Version"   : None,
-                "Referer"       : self.root + "/",
-                "Authorization" : None,
-            }
-
         novels = self.novels()
         if self.max_posts:
             novels = itertools.islice(novels, self.max_posts)
@@ -935,15 +925,12 @@
             if desktop:
                 try:
-                    novel_id = str(novel["id"])
-                    url = "{}/novel/show.php?id={}".format(
-                        self.root, novel_id)
-                    data = util.json_loads(text.extr(
-                        self.request(url, headers=headers).text,
-                        "id=\"meta-preload-data\" content='", "'"))
-                    images = (data["novel"][novel_id]
-                              ["textEmbeddedImages"]).values()
-                except Exception:
+                    body = self._request_ajax("/novel/" + str(novel["id"]))
+                    images = body["textEmbeddedImages"].values()
+                except Exception as exc:
+                    self.log.warning(
+                        "%s: Failed to get embedded novel images (%s: %s)",
+                        novel["id"], exc.__class__.__name__, exc)
                     images = ()

                 for image in images:
diff --git a/gallery_dl/extractor/scrolller.py b/gallery_dl/extractor/scrolller.py
index f97fa14..7bfc550 100644
--- a/gallery_dl/extractor/scrolller.py
+++ b/gallery_dl/extractor/scrolller.py
@@ -56,7 +56,12 @@ class ScrolllerExtractor(Extractor):
         files = []
         for num, media in enumerate(album, 1):
-            src = max(media["mediaSources"], key=self._sort_key)
+            sources = media.get("mediaSources")
+            if not sources:
+                self.log.warning("%s/%s: Missing media file",
+                                 post.get("id"), num)
+                continue
+            src = max(sources, key=self._sort_key)
             src["num"] = num
             files.append(src)
         return files
diff --git a/gallery_dl/extractor/tumblr.py b/gallery_dl/extractor/tumblr.py
index 6f2114e..a2cce83 100644
--- a/gallery_dl/extractor/tumblr.py
+++ b/gallery_dl/extractor/tumblr.py
@@ -474,8 +474,14 @@ class TumblrAPI(oauth.OAuth1API):
                     board = False

                 if board:
-                    self.log.info("Run 'gallery-dl oauth:tumblr' "
-                                  "to access dashboard-only blogs")
+                    if self.api_key is None:
+                        self.log.info(
+                            "Ensure your 'access-token' and "
+                            "'access-token-secret' belong to the same "
+                            "application as 'api-key' and 'api-secret'")
+                    else:
+                        self.log.info("Run 'gallery-dl oauth:tumblr' "
+                                      "to access dashboard-only blogs")
                     raise exception.AuthorizationError(error)
                 raise exception.NotFoundError("user or post")
diff --git a/gallery_dl/extractor/twitter.py b/gallery_dl/extractor/twitter.py
index e2fe000..896bf28 100644
--- a/gallery_dl/extractor/twitter.py
+++ b/gallery_dl/extractor/twitter.py
@@ -1069,6 +1069,7 @@ class TwitterImageExtractor(Extractor):

 class TwitterAPI():
+    client_transaction = None

     def __init__(self, extractor):
         self.extractor = extractor
@@ -1101,6 +1102,7 @@
             "x-csrf-token": csrf_token,
             "x-twitter-client-language": "en",
             "x-twitter-active-user": "yes",
+            "x-client-transaction-id": None,
             "Sec-Fetch-Dest": "empty",
             "Sec-Fetch-Mode": "cors",
             "Sec-Fetch-Site": "same-origin",
@@ -1503,12 +1505,38 @@
         self.extractor.cookies.set(
             "gt", guest_token, domain=self.extractor.cookies_domain)

+    @cache(maxage=10800)
+    def _client_transaction(self):
+        self.log.info("Initializing client transaction keys")
+
+        from .. import transaction_id
+        ct = transaction_id.ClientTransaction()
+        ct.initialize(self.extractor)
+
+        # update 'x-csrf-token' header (#7467)
+        csrf_token = self.extractor.cookies.get(
+            "ct0", domain=self.extractor.cookies_domain)
+        if csrf_token:
+            self.headers["x-csrf-token"] = csrf_token
+
+        return ct
+
+    def _transaction_id(self, url, method="GET"):
+        if self.client_transaction is None:
+            TwitterAPI.client_transaction = self._client_transaction()
+        path = url[url.find("/", 8):]
+        self.headers["x-client-transaction-id"] = \
+            self.client_transaction.generate_transaction_id(method, path)
+
     def _call(self, endpoint, params, method="GET", auth=True, root=None):
         url = (root or self.root) + endpoint

         while True:
-            if not self.headers["x-twitter-auth-type"] and auth:
-                self._authenticate_guest()
+            if auth:
+                if self.headers["x-twitter-auth-type"]:
+                    self._transaction_id(url, method)
+                else:
+                    self._authenticate_guest()

             response = self.extractor.request(
                 url, method=method, params=params,
diff --git a/gallery_dl/extractor/weasyl.py b/gallery_dl/extractor/weasyl.py
index ed2a395..9f6b021 100644
--- a/gallery_dl/extractor/weasyl.py
+++ b/gallery_dl/extractor/weasyl.py
@@ -72,7 +72,7 @@ class WeasylExtractor(Extractor):

 class WeasylSubmissionExtractor(WeasylExtractor):
     subcategory = "submission"
-    pattern = BASE_PATTERN + r"(?:~[\w~-]+/submissions|submission)/(\d+)"
+    pattern = BASE_PATTERN + r"(?:~[\w~-]+/submissions|submission|view)/(\d+)"
     example = "https://www.weasyl.com/~USER/submissions/12345/TITLE"

     def __init__(self, match):
diff --git a/gallery_dl/job.py b/gallery_dl/job.py
index bea35e3..a88f536 100644
--- a/gallery_dl/job.py
+++ b/gallery_dl/job.py
@@ -496,8 +496,6 @@ class DownloadJob(Job):
             self._skipcnt += 1
             if self._skipcnt >= self._skipmax:
                 raise self._skipexc()
-        else:
-            self._skipcnt = 0

     def download(self, url):
         """Download 'url'"""
diff --git a/gallery_dl/postprocessor/__init__.py b/gallery_dl/postprocessor/__init__.py
index 7837b06..dd44a8a 100644
--- a/gallery_dl/postprocessor/__init__.py
+++ b/gallery_dl/postprocessor/__init__.py
@@ -11,6 +11,7 @@
 modules = [
     "classify",
     "compare",
+    "directory",
    "exec",
     "hash",
     "metadata",
diff --git a/gallery_dl/postprocessor/directory.py b/gallery_dl/postprocessor/directory.py
new file mode 100644
index 0000000..ed8c02e
--- /dev/null
+++ b/gallery_dl/postprocessor/directory.py
@@ -0,0 +1,30 @@
+# -*- coding: utf-8 -*-
+
+# Copyright 2025 Mike Fährmann
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License version 2 as
+# published by the Free Software Foundation.
+
+"""Trigger directory format string evaluation"""
+
+from .common import PostProcessor
+
+
+class DirectoryPP(PostProcessor):
+
+    def __init__(self, job, options):
+        PostProcessor.__init__(self, job)
+
+        events = options.get("event")
+        if events is None:
+            events = ("prepare",)
+        elif isinstance(events, str):
+            events = events.split(",")
+        job.register_hooks({event: self.run for event in events}, options)
+
+    def run(self, pathfmt):
+        pathfmt.set_directory(pathfmt.kwdict)
+
+
+__postprocessor__ = DirectoryPP
diff --git a/gallery_dl/transaction_id.py b/gallery_dl/transaction_id.py
new file mode 100644
index 0000000..25f1775
--- /dev/null
+++ b/gallery_dl/transaction_id.py
@@ -0,0 +1,246 @@
+# -*- coding: utf-8 -*-
+
+# Copyright 2025 Mike Fährmann
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License version 2 as
+# published by the Free Software Foundation.
```
+ +# Adapted from iSarabjitDhiman/XClientTransaction +# https://github.com/iSarabjitDhiman/XClientTransaction + +# References: +# https://antibot.blog/posts/1741552025433 +# https://antibot.blog/posts/1741552092462 +# https://antibot.blog/posts/1741552163416 + +"""Twitter 'x-client-transaction-id' header generation""" + +import math +import time +import random +import hashlib +import binascii +import itertools +from . import text, util +from .cache import cache + + +class ClientTransaction(): + __slots__ = ("key_bytes", "animation_key") + + def __getstate__(self): + return (self.key_bytes, self.animation_key) + + def __setstate__(self, state): + self.key_bytes, self.animation_key = state + + def initialize(self, extractor, homepage=None): + if homepage is None: + homepage = extractor.request("https://x.com/").text + + key = self._extract_verification_key(homepage) + if not key: + extractor.log.error( + "Failed to extract 'twitter-site-verification' key") + + ondemand_s = text.extr(homepage, '"ondemand.s":"', '"') + indices = self._extract_indices(ondemand_s, extractor) + if not indices: + extractor.log.error("Failed to extract KEY_BYTE indices") + + frames = self._extract_frames(homepage) + if not frames: + extractor.log.error("Failed to extract animation frame data") + + self.key_bytes = key_bytes = binascii.a2b_base64(key) + self.animation_key = self._calculate_animation_key( + frames, indices[0], key_bytes, indices[1:]) + + def _extract_verification_key(self, homepage): + pos = homepage.find('name="twitter-site-verification"') + beg = homepage.rfind("<", 0, pos) + end = homepage.find(">", pos) + return text.extr(homepage[beg:end], 'content="', '"') + + @cache(maxage=36500*86400, keyarg=1) + def _extract_indices(self, ondemand_s, extractor): + url = ("https://abs.twimg.com/responsive-web/client-web" + "/ondemand.s." 
+ ondemand_s + "a.js") + page = extractor.request(url).text + pattern = util.re_compile(r"\(\w\[(\d\d?)\],\s*16\)") + return [int(i) for i in pattern.findall(page)] + + def _extract_frames(self, homepage): + return list(text.extract_iter( + homepage, 'id="loading-x-anim-', "</svg>")) + + def _calculate_animation_key(self, frames, row_index, key_bytes, + key_bytes_indices, total_time=4096): + frame = frames[key_bytes[5] % 4] + array = self._generate_2d_array(frame) + frame_row = array[key_bytes[row_index] % 16] + + frame_time = 1 + for index in key_bytes_indices: + frame_time *= key_bytes[index] % 16 + frame_time = round_js(frame_time / 10) * 10 + target_time = frame_time / total_time + + return self.animate(frame_row, target_time) + + def _generate_2d_array(self, frame): + split = util.re_compile(r"[^\d]+").split + return [ + [int(x) for x in split(path) if x] + for path in text.extr( + frame, '</path><path d="', '"')[9:].split("C") + ] + + def animate(self, frames, target_time): + curve = [scale(float(frame), is_odd(index), 1.0, False) + for index, frame in enumerate(frames[7:])] + cubic = cubic_value(curve, target_time) + + color_a = (float(frames[0]), float(frames[1]), float(frames[2])) + color_b = (float(frames[3]), float(frames[4]), float(frames[5])) + color = interpolate_list(cubic, color_a, color_b) + color = [0.0 if c <= 0.0 else 255.0 if c >= 255.0 else c + for c in color] + + rotation_a = 0.0 + rotation_b = scale(float(frames[6]), 60.0, 360.0, True) + rotation = interpolate_value(cubic, rotation_a, rotation_b) + matrix = rotation_matrix_2d(rotation) + + result = ( + hex(round(color[0]))[2:], + hex(round(color[1]))[2:], + hex(round(color[2]))[2:], + float_to_hex(abs(round(matrix[0], 2))), + float_to_hex(abs(round(matrix[1], 2))), + float_to_hex(abs(round(matrix[2], 2))), + float_to_hex(abs(round(matrix[3], 2))), + "00", + ) + return "".join(result).replace(".", "").replace("-", "") + + def generate_transaction_id(self, method, path, + 
keyword="obfiowerehiring", rndnum=3): + bytes_key = self.key_bytes + + now = int(time.time()) - 1682924400 + bytes_time = ( + (now ) & 0xFF, # noqa: E202 + (now >> 8) & 0xFF, # noqa: E222 + (now >> 16) & 0xFF, + (now >> 24) & 0xFF, + ) + + payload = "{}!{}!{}{}{}".format( + method, path, now, keyword, self.animation_key) + bytes_hash = hashlib.sha256(payload.encode()).digest()[:16] + + num = random.randrange(256) + result = bytes( + byte ^ num + for byte in itertools.chain( + (0,), bytes_key, bytes_time, bytes_hash, (rndnum,)) + ) + return binascii.b2a_base64(result).rstrip(b"=\n") + + +# Cubic Curve + +def cubic_value(curve, t): + if t <= 0.0: + if curve[0] > 0.0: + value = curve[1] / curve[0] + elif curve[1] == 0.0 and curve[2] > 0.0: + value = curve[3] / curve[2] + else: + value = 0.0 + return value * t + + if t >= 1.0: + if curve[2] < 1.0: + value = (curve[3] - 1.0) / (curve[2] - 1.0) + elif curve[2] == 1.0 and curve[0] < 1.0: + value = (curve[1] - 1.0) / (curve[0] - 1.0) + else: + value = 0.0 + return 1.0 + value * (t - 1.0) + + start = 0.0 + end = 1.0 + while start < end: + mid = (start + end) / 2.0 + est = cubic_calculate(curve[0], curve[2], mid) + if abs(t - est) < 0.00001: + return cubic_calculate(curve[1], curve[3], mid) + if est < t: + start = mid + else: + end = mid + return cubic_calculate(curve[1], curve[3], mid) + + +def cubic_calculate(a, b, m): + m1 = 1.0 - m + return 3.0*a*m1*m1*m + 3.0*b*m1*m*m + m*m*m + + +# Interpolation + +def interpolate_list(x, a, b): + return [ + interpolate_value(x, a[i], b[i]) + for i in range(len(a)) + ] + + +def interpolate_value(x, a, b): + if isinstance(a, bool): + return a if x <= 0.5 else b + return a * (1.0 - x) + b * x + + +# Rotation + +def rotation_matrix_2d(deg): + rad = math.radians(deg) + cos = math.cos(rad) + sin = math.sin(rad) + return [cos, -sin, sin, cos] + + +# Utilities + +def float_to_hex(numf): + numi = int(numf) + + fraction = numf - numi + if not fraction: + return hex(numi)[2:] + + result = ["."] 
+ while fraction > 0.0: + fraction *= 16.0 + integer = int(fraction) + fraction -= integer + result.append(chr(integer + 87) if integer > 9 else str(integer)) + return hex(numi)[2:] + "".join(result) + + +def is_odd(num): + return -1.0 if num % 2 else 0.0 + + +def round_js(num): + floor = math.floor(num) + return floor if (num - floor) < 0.5 else math.ceil(num) + + +def scale(value, value_min, value_max, rounding): + result = value * (value_max-value_min) / 255.0 + value_min + return math.floor(result) if rounding else round(result, 2) diff --git a/gallery_dl/version.py b/gallery_dl/version.py index af4acf5..d40dacd 100644 --- a/gallery_dl/version.py +++ b/gallery_dl/version.py @@ -6,5 +6,5 @@ # it under the terms of the GNU General Public License version 2 as # published by the Free Software Foundation. -__version__ = "1.29.5" +__version__ = "1.29.6" __variant__ = None diff --git a/test/test_postprocessor.py b/test/test_postprocessor.py index 8b073b4..76e728c 100644 --- a/test/test_postprocessor.py +++ b/test/test_postprocessor.py @@ -173,6 +173,24 @@ class ClassifyTest(BasePostprocessorTest): self.assertEqual(self.pathfmt.realpath, path + "/file.foo") +class DirectoryTest(BasePostprocessorTest): + + def test_default(self): + self._create() + + path = os.path.join(self.dir.name, "test") + self.assertEqual(self.pathfmt.realdirectory, path + "/") + self.assertEqual(self.pathfmt.realpath, path + "/file.ext") + + self.pathfmt.kwdict["category"] = "custom" + self._trigger() + + path = os.path.join(self.dir.name, "custom") + self.assertEqual(self.pathfmt.realdirectory, path + "/") + self.pathfmt.build_path() + self.assertEqual(self.pathfmt.realpath, path + "/file.ext") + + class ExecTest(BasePostprocessorTest): def test_command_string(self): diff --git a/test/test_results.py b/test/test_results.py index 28db6c3..6e04e1d 100644 --- a/test/test_results.py +++ b/test/test_results.py @@ -239,7 +239,11 @@ class TestExtractorResults(unittest.TestCase): key = key[1:] if key 
not in kwdict: continue + path = "{}.{}".format(parent, key) if parent else key + if key.startswith("!"): + self.assertNotIn(key[1:], kwdict, msg=path) + continue self.assertIn(key, kwdict, msg=path) value = kwdict[key] @@ -272,8 +276,11 @@ class TestExtractorResults(unittest.TestCase): elif test.startswith("type:"): self.assertEqual(test[5:], type(value).__name__, msg=path) elif test.startswith("len:"): - self.assertIsInstance(value, (list, tuple), msg=path) - self.assertEqual(int(test[4:]), len(value), msg=path) + cls, _, length = test[4:].rpartition(":") + if cls: + self.assertEqual( + cls, type(value).__name__, msg=path + "/type") + self.assertEqual(int(length), len(value), msg=path) else: self.assertEqual(test, value, msg=path) else: |
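The final obfuscation step of `generate_transaction_id` in the new `gallery_dl/transaction_id.py` can be exercised in isolation. The sketch below mirrors that method's payload assembly (SHA-256 of `"METHOD!path!time" + keyword + animation_key`, truncated to 16 bytes, then XORed with one random byte and base64-encoded); the `key_bytes` and `animation_key` arguments passed in here are dummy stand-ins, since the real values are derived from x.com's homepage, and the function name `transaction_id` is ours, not part of the library:

```python
import time
import random
import hashlib
import binascii
import itertools

EPOCH_OFFSET = 1682924400  # constant subtracted from time.time() in the diff


def transaction_id(method, path, key_bytes, animation_key,
                   keyword="obfiowerehiring", rndnum=3):
    now = int(time.time()) - EPOCH_OFFSET
    # timestamp as 4 little-endian bytes
    bytes_time = tuple((now >> (8 * i)) & 0xFF for i in range(4))

    # SHA-256 over "METHOD!path!time" + keyword + animation key,
    # truncated to its first 16 bytes
    payload = "{}!{}!{}{}{}".format(
        method, path, now, keyword, animation_key)
    bytes_hash = hashlib.sha256(payload.encode()).digest()[:16]

    # XOR every byte with one random key byte; the leading 0 makes the
    # first output byte equal to the XOR key itself, so a server (or a
    # test) can undo the obfuscation
    num = random.randrange(256)
    result = bytes(
        byte ^ num
        for byte in itertools.chain(
            (0,), key_bytes, bytes_time, bytes_hash, (rndnum,)))
    return binascii.b2a_base64(result).rstrip(b"=\n")
```

Decoding the result and XORing with its own first byte recovers the plaintext blob: a zero byte, the key bytes, 4 time bytes, the 16-byte hash, and the trailing `rndnum`.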

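The new `directory` postprocessor added above (`gallery_dl/postprocessor/directory.py`) re-evaluates the directory format string against the current metadata when its hook fires. A minimal configuration sketch, assuming gallery-dl's standard `postprocessors` config syntax; the `"event"` value shown is the default from the code (`options.get("event")` falling back to `("prepare",)`, with comma-separated strings split into multiple events):

```json
{
    "extractor": {
        "postprocessors": [
            {
                "name": "directory",
                "event": "prepare"
            }
        ]
    }
}
```

This matches the accompanying `DirectoryTest` in `test/test_postprocessor.py`, where triggering the hook after changing `kwdict["category"]` moves `realdirectory` to the new path.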