I finally made Knight rank, with 50-odd threads posted: half in the tech section, half in the New Era section.
Thanks to V大 for adding 10 points to my last tech post.
How to find pictures for the New Era section puzzled me for a long time; in the end I worked out a method of my own, which I'm sharing here.
If anything breaks the rules, please delete this post.
Using the overseas site Pornpics as an example, here is the picture-finding workflow.
Prerequisite: a proxy or VPN that can reach blocked sites.
1. 提取图集链接
Quote:
Use a Python script to extract the image links. A gallery must contain at least 20 pictures and must not duplicate any gallery used before, to avoid reposting the same set.
Quote:
#!/usr/bin/python3
import requests
from bs4 import BeautifulSoup
import json
import logging

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
# create file handler which logs even debug messages
fh = logging.FileHandler('pornpic.log')
fh.setLevel(logging.DEBUG)
# create console handler with a higher log level
ch = logging.StreamHandler()
ch.setLevel(logging.WARNING)
# create formatter and add it to the handlers
formatter = logging.Formatter('%(asctime)s - %(message)s')
fh.setFormatter(formatter)
ch.setFormatter(formatter)
# add the handlers to the logger
logger.addHandler(fh)
logger.addHandler(ch)
logger.info('You can find this written in pornpic.log')

proxies = {'http': 'http://127.0.0.1:1080', 'https': 'http://127.0.0.1:1080'}


class Random_PornPic():
    def __init__(self, url):
        self.url = url
        self.history = self.load_local_his()
        self.gallery_info = dict()
        # history format: {"gallery_url": gallery_info}
        # gallery_info format: {'name': name,
        #                       'length': length,
        #                       'gallery_pics': gallery_pics}

    def parse_gallery(self, url):
        if url == "":
            return []
        logger.info("Parsing the gallery")
        try:
            r = requests.get(url, proxies=proxies, verify=False)
            assert r.status_code == 200
            soup = BeautifulSoup(r.text, 'lxml')
            result = soup.select(".thumbwook > .rel-link")
            return [x['href'] for x in result]
        except Exception:
            logger.info("Parse Gallery Failed")
            return []

    # reject duplicates and galleries with too few pictures
    def check_useful(self, url, limit=20):
        self.history = self.load_local_his()
        if url in self.history:
            return False
        gallery_pics = self.parse_gallery(url)
        name = url.split('/')[-2]
        self.gallery_info = {'name': name,
                             'length': len(gallery_pics),
                             'gallery_pics': gallery_pics}
        self.history[url] = self.gallery_info
        self.save_local_his()
        if self.gallery_info['length'] >= limit:
            logger.info(url + " --- VALID")
            return True
        return False

    def load_local_his(self):
        try:
            with open('./history.json', 'r', encoding='utf-8') as f_toc:
                data = json.load(f_toc)
        except Exception:
            data = dict()
        return data

    def save_local_his(self):
        with open('./history.json', 'w', encoding='utf-8') as f_toc:
            json.dump(self.history, f_toc, ensure_ascii=False, indent=4)

    def gen_gallery(self):
        if self.check_useful(self.url):
            logger.info("Found a qualifying gallery")
            with open(self.gallery_info['name'], 'w', encoding='utf-8') as f:
                f.write('\n'.join(self.gallery_info['gallery_pics']))
            logger.warning("Gallery saved successfully")
            print(self.gallery_info)
        else:
            print("Duplicate gallery, or too few pictures")


if __name__ == "__main__":
    tmp_url = input("url is:\n")
    Random_PornPic(tmp_url).gen_gallery()
Running the script above prompts you for the URL of the page you want to extract from, e.g.:
https://www.pornpics.com/galleries/nubilesnet-atenas-andrade-90359301/
It then outputs the links to every picture in the gallery.
2. Upload to an image host
The extracted links point at overseas servers and may be unreachable from inside China, so the pictures need to be re-uploaded to an image host. Here I use https://www.privacypic.com/ as an example.
It lets you upload pictures directly from links: paste the image links extracted in step 1 into the text box and upload.
You will get the image-host addresses back; choose the detailed BBCode format.
Then copy the addresses the image host returns.
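If the host turns out to offer an upload-by-URL API, this step could be scripted too. Below is only a sketch: privacypic does not document a public API, so the endpoint and key here are pure assumptions (they follow the v1 API shape that many Chevereto-based image hosts expose) and would have to be confirmed by inspecting the site's own upload requests.

```python
import requests

# Hypothetical endpoint and key -- NOT confirmed for privacypic; this is
# the Chevereto-style v1 upload API that many image hosts happen to use.
API_URL = "https://www.privacypic.com/api/1/upload"
API_KEY = "your_api_key_here"

def build_payload(image_url, api_key=API_KEY):
    # Chevereto's v1 API accepts the image as 'source' (a remote URL).
    return {'key': api_key, 'source': image_url, 'format': 'json'}

def upload_by_url(image_url):
    # POST the payload and return the hosted image's URL from the JSON reply.
    r = requests.post(API_URL, data=build_payload(image_url), timeout=60)
    r.raise_for_status()
    return r.json()['image']['url']
```

Treat this purely as a starting point for your own analysis of the host's upload traffic.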
3. Fix the formatting and pick a title
There must be a blank line between every two image links.
I use the Notepad++ text editor to make the change in bulk; any other editor will do.
Then give the gallery a fitting title, and it is ready to post.
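The Notepad++ step can also be done with a few lines of Python. A minimal sketch (it assumes the step-2 output, one image address per line, is passed in as a string):

```python
def space_links(text):
    # One address per line in; a blank line between every two addresses out,
    # which is the spacing the forum post needs.
    links = [line.strip() for line in text.splitlines() if line.strip()]
    return '\n\n'.join(links)

# Example: three consecutive lines become three spaced paragraphs.
demo = "https://a.example/1.jpg\nhttps://a.example/2.jpg\nhttps://a.example/3.jpg"
print(space_links(demo))
```

Read the file returned by the image host, run it through `space_links`, and paste the result into the post editor.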
——————
Of course, the three steps can be merged into one and fully automated.
Uploading to the image host can be done through its API (image-host APIs are generally not public, so you have to reverse-engineer them yourself, which is a bit of a hassle), after which you can directly obtain three qualifying gallery link sets. (Only 3 posts are allowed per day.)
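Chained together, the whole pipeline reduces to a couple of functions. A sketch, where `upload_to_host` is a placeholder for whatever upload call your analysis of the image host's API turns up:

```python
def format_bbcode(pic_urls):
    # One blank line between every two [img] tags, as the forum expects.
    return '\n\n'.join('[img]' + u + '[/img]' for u in pic_urls)

def make_post(gallery_pics, upload_to_host):
    # upload_to_host: a callable mapping a source image URL to the hosted
    # URL -- it stands in for the reverse-engineered upload API.
    hosted = [upload_to_host(u) for u in gallery_pics]
    return format_bbcode(hosted)

# With a dummy pass-through uploader, two links become a post body:
print(make_post(['http://x/1.jpg', 'http://x/2.jpg'], lambda u: u))
```

Feed it the `gallery_pics` list the scraper below produces, and the return value is a ready-to-paste post body.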
———–
For example, fetch random Pornpics galleries and keep only those with at least 20 pictures.
Quote:
import requests
from bs4 import BeautifulSoup
import json
import time
import logging

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
# create file handler which logs even debug messages
fh = logging.FileHandler('pornpic.log')
fh.setLevel(logging.DEBUG)
# create console handler with a higher log level
ch = logging.StreamHandler()
ch.setLevel(logging.WARNING)
# create formatter and add it to the handlers
formatter = logging.Formatter('%(asctime)s - %(message)s')
fh.setFormatter(formatter)
ch.setFormatter(formatter)
# add the handlers to the logger
logger.addHandler(fh)
logger.addHandler(ch)
logger.info('You can find this written in pornpic.log')

# set up the proxy
proxies = {'http': 'http://127.0.0.1:1080', 'https': 'http://127.0.0.1:1080'}


class Random_PornPic():
    def __init__(self, gallery_num):
        self.number = gallery_num
        self.history = self.load_local_his()
        self.gallery_info = dict()
        # history format: {"gallery_url": gallery_info}
        # gallery_info format: {'name': name,
        #                       'length': length,
        #                       'gallery_pics': gallery_pics}

    def parse_gallery(self, url):
        if url == "":
            return []
        logger.info("Parsing the gallery")
        try:
            r = requests.get(url, proxies=proxies, verify=False)
            assert r.status_code == 200
            soup = BeautifulSoup(r.text, 'lxml')
            result = soup.select(".thumbwook > .rel-link")
            return [x['href'] for x in result]
        except Exception:
            logger.info("Parse Gallery Failed")
            return []

    # reject duplicates and galleries with too few pictures
    def check_useful(self, url, limit=20):
        self.history = self.load_local_his()
        if url in self.history:
            return False
        gallery_pics = self.parse_gallery(url)
        name = url.split('/')[-2]
        self.gallery_info = {'name': name,
                             'length': len(gallery_pics),
                             'gallery_pics': gallery_pics}
        self.history[url] = self.gallery_info
        self.save_local_his()
        if self.gallery_info['length'] >= limit:
            logger.info(url + " --- VALID")
            return True
        return False

    def get_random_gallery(self):
        try:
            r = requests.get("https://www.pornpics.com/random/index.php",
                             proxies=proxies, verify=False, timeout=30)
            if r.status_code == 200:
                url = r.json()['link']
                logger.info("get random gallery: " + url)
                time.sleep(30)  # be gentle with the server
                return url
            return ""
        except Exception:
            logger.warning("Fail to get random gallery")
            time.sleep(300)  # back off for five minutes on failure
            return ""

    def load_local_his(self):
        try:
            with open('./history.json', 'r', encoding='utf-8') as f_toc:
                data = json.load(f_toc)
        except Exception:
            data = dict()
        return data

    def save_local_his(self):
        with open('./history.json', 'w', encoding='utf-8') as f_toc:
            json.dump(self.history, f_toc, ensure_ascii=False, indent=4)

    def gen_gallery(self):
        i = 0
        while i < self.number:
            url = self.get_random_gallery()
            if url and self.check_useful(url):
                logger.info("Found a qualifying gallery")
                with open(self.gallery_info['name'], 'w', encoding='utf-8') as f:
                    f.write('\n'.join(self.gallery_info['gallery_pics']))
                logger.warning("Gallery saved successfully")
                i = i + 1
        logger.info("Generated " + str(self.number) + " galleries")


if __name__ == "__main__":
    # generate 3 galleries
    Random_PornPic(3).gen_gallery()
[ This post was last edited by 牛河 on 2020-06-05 08:16 ]
