zcooldl package

Submodules

zcooldl.cli module

Console script for zcooldl.

zcooldl.utils module

zcooldl.utils.mkdirs_if_not_exist(dir)

Create the directory if it does not exist.

Parameters:	dir (str) – Directory path; nested (multi-level) paths are supported.
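
As an illustration of the behaviour described for mkdirs_if_not_exist, here is a minimal sketch built on the standard library. It is an assumption about how such a helper could be written, not the package's actual source; the temporary directory and the nested path are made up for the demonstration.

```python
import os
import tempfile


def mkdirs_if_not_exist(dir):
    """Create `dir`, including any missing parent directories."""
    if not os.path.exists(dir):
        # exist_ok=True tolerates another thread creating the directory
        # between the check above and this call.
        os.makedirs(dir, exist_ok=True)


# Hypothetical multi-level path inside a throwaway temporary directory.
nested = os.path.join(tempfile.mkdtemp(), "a", "b", "c")
mkdirs_if_not_exist(nested)
mkdirs_if_not_exist(nested)  # the second call does nothing
```

Calling the helper twice is safe, which is the property a multi-threaded downloader would rely on.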

zcooldl.utils.parse_resources(ids, names, collections)

Parse usernames or IDs.

Parameters:	ids (str) – Comma-separated user IDs.
	names (str) – Comma-separated usernames.
Return list:	A list containing User data.

zcooldl.utils.retry(exceptions, tries=3, delay=1, backoff=2, logger=None)

Retry calling the decorated function using an exponential backoff.

Parameters:	exceptions – The exception to check; may be a tuple of exceptions.
	tries – Number of times to try (not retry) before giving up.
	delay – Initial delay between retries, in seconds.
	backoff – Backoff multiplier (e.g. a value of 2 doubles the delay on each retry).
	logger – Logger to use. If None, print.
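
The parameters above can be read as the following sketch of an exponential-backoff decorator. The signature matches the documented one, but the body is an assumption written for illustration, not the package's source; the flaky function and the calls list exist only for the demonstration.

```python
import functools
import time


def retry(exceptions, tries=3, delay=1, backoff=2, logger=None):
    """Call the wrapped function up to `tries` times, sleeping between attempts."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            remaining, wait = tries, delay
            while remaining > 1:
                try:
                    return func(*args, **kwargs)
                except exceptions as exc:
                    msg = f"{exc!r}, retrying in {wait} seconds..."
                    if logger is not None:
                        logger.warning(msg)
                    else:
                        print(msg)
                    time.sleep(wait)
                    remaining -= 1
                    wait *= backoff  # grow the delay for the next attempt
            # Final attempt: any exception now propagates to the caller.
            return func(*args, **kwargs)

        return wrapper

    return decorator


calls = []


@retry(ValueError, tries=3, delay=0, backoff=2)
def flaky():
    """Hypothetical function that fails twice before succeeding."""
    calls.append(1)
    if len(calls) < 3:
        raise ValueError("transient failure")
    return "ok"


result = flaky()
```

With tries=3 the function is called at most three times in total, which matches the "try (not retry)" wording of the tries parameter.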

zcooldl.utils.safe_filename(filename)

Remove illegal characters from a filename.

Parameters:	filename (str) – The filename.
Return str:	A valid filename.
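
The documentation does not list which characters count as illegal. The sketch below assumes the characters that Windows file systems reject, plus ASCII control characters; both the character set and the regular expression are assumptions for illustration, not the package's actual rule.

```python
import re

# Assumption: characters rejected by Windows file systems, plus control characters.
_ILLEGAL_CHARS = re.compile(r'[\\/:*?"<>|\x00-\x1f]')


def safe_filename(filename):
    """Drop characters that are commonly invalid in filenames."""
    return _ILLEGAL_CHARS.sub("", filename).strip()


cleaned = safe_filename('a/b:c*d?"e<f>g|h.jpg')
```

Removing rather than replacing the characters keeps the sketch in line with the wording of the description above.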

zcooldl.utils.sort_records(records: Iterable[T_co], order: dict)

Sort records according to a custom ordering.

Parameters:	records (Iterable) – The records to sort.
	order (dict) – The custom ordering.
Returns:

zcooldl.zcooldl module

class zcooldl.zcooldl.Scrapy(type, author, title, objid, index, url)

Bases: tuple

author: Alias for field number 1

index: Alias for field number 4

objid: Alias for field number 3

title: Alias for field number 2

type: Alias for field number 0

url: Alias for field number 5
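
A tuple base class with "Alias for field number N" attributes is what collections.namedtuple produces, so the record can be pictured with the sketch below. The field order follows the listing above; the field values are hypothetical and chosen only for the demonstration.

```python
from collections import namedtuple

# Field order follows the documented field numbers 0 to 5.
Scrapy = namedtuple("Scrapy", "type author title objid index url")

# Hypothetical task record; none of these values come from the package.
task = Scrapy(
    type="image",
    author="someone",
    title="Sample topic",
    objid="123",
    index=0,
    url="https://example.com/a.jpg",
)

# Named tuples are immutable; _replace returns a modified copy.
updated = task._replace(index=1)
```

Each field can be read by name (task.url) or by position (task[5]).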

class zcooldl.zcooldl.ZCoolScraper(user_id=None, username=None, collection=None, destination=None, max_pages=None, spec_topics=None, max_topics=None, max_workers=None, retries=None, redownload=None, overwrite=False, thumbnail=False)

Bases: object

download_image(scrapy)

Download an image and save it locally.

Parameters:	scrapy – The data object that records the task information.
Return Scrapy:	The data object that records the task information.

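fetch_all(initialized: bool = False): Crawl home pages and topics concurrently, and update the status.

fetch_images(): Take the topics to crawl from the task queue and process them with multiple threads to obtain the images to download.

fetch_topics(): Take the home pages to crawl from the task queue and process them with multiple threads to obtain the topics to crawl.

generate_pages(): Generate the home-page crawl tasks according to the maximum number of pages to download.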

parse_collection_topics(topics: List[dict], offset: int = 0)

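parse_images(scrapy)

Crawl a topic, obtain its objid, then call the API directly; extract the image URLs and related information from the response and add the image download tasks to the task queue.

Parameters:	scrapy – The data object that records the task information.
Return Scrapy:	The data object that records the task information.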

parse_objid(url: str, is_collection: bool = False) → str

Parse the objid from a topic page.

Parameters:	url – URL of the topic or collection.
Returns:	objid

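parse_topics(scrapy)

Crawl a home page, parse all of its topics, and add the topic crawl tasks to the task queue.

Parameters:	scrapy – The data object that records the task information.
Return Scrapy:	The data object that records the task information.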

reload_records(file)

Read the failed downloads from a local download record.

Parameters:	file (str) – Path of the download record file.
Return str:	The username.

run_scraper(): Download all images using multiple threads, then save the records and exit the program.

save_records()

Save the records of successful and failed downloads to a local file.

Return str:	Path of the record file.

search_id_by_username(username)

Look up a user ID by the user's nickname.

Parameters:	username (str) – The user's nickname.
Return int:	The user ID.

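show_download_status(interval=0.5, end=None)

Intended to run in a background thread; displays the status while downloading.

Parameters:	interval (int) – Status update interval, in seconds.
	end (function) – Used to control when the thread exits.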

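show_fetch_status(interval=0.5, end=None)

Intended to run in a background thread; displays the status while crawling.

Parameters:	interval (int) – Status update interval, in seconds.
	end (function) – Used to control when the thread exits.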
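
The two status methods share one pattern: a background thread reports progress every interval seconds until the end callable says to stop. The sketch below shows that pattern in isolation. It assumes end is a zero-argument callable that returns a true value once the work is done; show_status, counter, and lines are hypothetical names, not part of the package.

```python
import threading
import time

counter = {"finished": 0}   # shared progress state (hypothetical)
lines = []                  # collected status lines instead of terminal output
done = threading.Event()


def show_status(interval=0.5, end=None):
    """Record a status line every `interval` seconds until end() is true."""
    # Assumption: with end=None the loop never stops on its own.
    while not (end and end()):
        lines.append(f"finished: {counter['finished']}")
        time.sleep(interval)
    # One last report so the final state is always shown.
    lines.append(f"finished: {counter['finished']} (final)")


reporter = threading.Thread(
    target=show_status,
    kwargs={"interval": 0.01, "end": done.is_set},
    daemon=True,
)
reporter.start()

for _ in range(3):          # stand-in for the real download work
    counter["finished"] += 1
    time.sleep(0.02)

done.set()                  # signal the reporter to exit
reporter.join(timeout=2)
```

Passing a bound method such as done.is_set as end keeps the reporter decoupled from how the main thread tracks completion.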

zcooldl.zcooldl.get_session()

Lets threads obtain the same Session, which reduces the number of TCP connections and speeds up requests.

Return requests.Session:	The session.

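zcooldl.zcooldl.session_request(url: str, method: str = 'GET') → requests.models.Response

Request data using the session. The function is wrapped with the retry decorator, so it retries when a network exception causes an error.

Parameters:	url (str) – Target request URL.
	method (str) – Request method.
Return requests.Response:	The response data.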

Module contents

Top-level package for ZCool Downloader.

class zcooldl.ZCoolScraper(user_id=None, username=None, collection=None, destination=None, max_pages=None, spec_topics=None, max_topics=None, max_workers=None, retries=None, redownload=None, overwrite=False, thumbnail=False)