Domain Name: cucas.cn
ROID: 20091020s10001s02250747-cn
Domain Status: ok
Registrant: 北京问程教育科技有限公司
Registrant Contact Email: [email protected]
Sponsoring Registrar: 阿里云计算有限公司(万网)
Name Server: dns17.hichina.com
Name Server: dns18.hichina.com
Registration Time: 2009-10-20 17:57:59
Expiration Time: 2027-10-20 17:57:59
DNSSEC: unsigned
# ============================================
# Default rule: disallow all crawlers
# ============================================
User-agent: *
Disallow: /

# ============================================
# Exceptions: allowed crawlers
# ============================================
User-agent: *
Disallow: /search
Disallow: /search/
Disallow: /*?q=
Disallow: /*?keyword=
Disallow: /index/search
Disallow: /index/search/
Disallow: /find
Disallow: /find/

# 1. All official Google crawlers
User-agent: Googlebot
Allow: /
Crawl-delay: 5

User-agent: Googlebot-Image
Allow: /
Crawl-delay: 5

User-agent: Googlebot-Mobile
Allow: /
Crawl-delay: 5

User-agent: Googlebot-News
Allow: /
Crawl-delay: 5

User-agent: Googlebot-Video
Allow: /
Crawl-delay: 5

User-agent: AdsBot-Google
Allow: /  # Google Ads crawler
Crawl-delay: 5

User-agent: AdsBot-Google-Mobile
Allow: /  # Google mobile ads crawler
Crawl-delay: 5

User-agent: Mediapartners-Google
Allow: /  # Google AdSense
Crawl-delay: 5

User-agent: GoogleOther
Allow: /  # Other Google crawlers
Crawl-delay: 5

User-agent: Google-InspectionTool
Allow: /  # Google inspection tool
Crawl-delay: 5

# 2. ChatGPT-related crawlers
User-agent: GPTBot
Allow: /
Crawl-delay: 5  # A longer delay can be set for ChatGPT

User-agent: ChatGPT-User
Allow: /
Crawl-delay: 5

# Other OpenAI-related crawlers
User-agent: OpenAI
Allow: /
Crawl-delay: 4

User-agent: OpenAI-GPT
Allow: /
Crawl-delay: 4

User-agent: OpenAI-User
Allow: /
Crawl-delay: 4

# Known malicious crawlers
User-agent: MJ12bot
User-agent: AhrefsBot
User-agent: SemrushBot
User-agent: DotBot
User-agent: MegaIndex
User-agent: SISTRIX
User-agent: ZoominfoBot
User-agent: Seekport
User-agent: KomodiaBot
User-agent: Applebot
Disallow: /

# Website scanners
User-agent: Nessus
User-agent: Nmap
User-agent: sqlmap
User-agent: w3af
User-agent: owasp
Disallow: /

# Data collectors
User-agent: DataForSeoBot
User-agent: 80legs
User-agent: masscan
User-agent: masscanner
Disallow: /

User-agent: Sogou web spider
Disallow: /
Crawl-delay: 30
# Block repeated language parameters
Disallow: /*?*&lang=[^&]+&lang=[^&]+
# Block pagination parameters (more precise)
Disallow: /*?*[&?]page=[0-9]+
Disallow: /*?*[&?]page%3D[0-9]+
Disallow: /*?*[&?]p=[0-9]+
Disallow: /*?*[&?]pg=[0-9]+
# Block complex pagination patterns (such as those seen in the logs)
Disallow: /*_page[0-9]+_
Disallow: /*_page[0-9]+_page[0-9]+

Sitemap: https://www.cucas.cn/sitemap.xml
Sitemap: https://city.cucas.cn/sitemap.xml
Sitemap: https://feature.cucas.cn/sitemap.xml
Sitemap: https://forum.cucas.cn/sitemap.xml
Sitemap: https://news.cucas.cn/sitemap.xml
Sitemap: https://scholarship.cucas.cn/sitemap.xml
Sitemap: https://school.cucas.cn/sitemap.xml
Sitemap: https://m.cucas.cn/sitemap.xml
Sitemap: https://mbbs.cucas.cn/sitemap.xml
Sitemap: https://chinambbs.cucas.cn/sitemap.xml
Sitemap: https://ranking.cucas.cn/sitemap.xml
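To see how a crawler might interpret rules like these, the sketch below feeds a small excerpt of the robots.txt above to Python's standard `urllib.robotparser`. The `RULES` excerpt, the `build_parser` helper, and the `SomeOtherBot` agent name are illustrative assumptions, not part of the record. Each crawler applies its own matching logic, so this shows only how Python's parser reads the rules, not how Googlebot or any other crawler behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical excerpt: a few groups copied from the robots.txt above.
RULES = """\
User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /
Crawl-delay: 5

User-agent: MJ12bot
User-agent: AhrefsBot
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(RULES)
url = "https://www.cucas.cn/"

# Googlebot has its own group with "Allow: /".
print("Googlebot:", parser.can_fetch("Googlebot", url))
# AhrefsBot shares a group whose only rule is "Disallow: /".
print("AhrefsBot:", parser.can_fetch("AhrefsBot", url))
# An agent with no group of its own falls back to the "*" group.
print("SomeOtherBot:", parser.can_fetch("SomeOtherBot", url))
# Crawl-delay is reported per matching group.
print("Googlebot delay:", parser.crawl_delay("Googlebot"))
```

Two caveats apply when running this against the full file. `urllib.robotparser` matches paths by plain prefix, so wildcard entries such as `/*?q=` and the regex-style pagination patterns are unlikely to behave as their comments suggest under this parser. It also keeps only the first `User-agent: *` group, so the second `*` group with the `/search` and `/find` rules is ignored here.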
| Position | Phrase | Page | Fragment |
|---|---|---|---|
| 3 | / |