update swagger
This commit is contained in:
201
README.md
@@ -1,131 +1,128 @@
|
||||
[English](README_EN.md)
|
||||
## API 文档
|
||||
|
||||
[](https://opensource.org/licenses/MIT)
|
||||
[](https://github.com/orz-ai/hot_news/actions)
|
||||
[](https://github.com/orz-ai/hot_news/)
|
||||
[](https://news.orz.ai/docs)
|
||||
### Swagger UI
|
||||
|
||||
# 每日热点新闻 API
|
||||
This project is built with FastAPI. The Swagger UI documentation is built in and becomes available as soon as the service starts.
|
||||
|
||||
- 线上地址:[热点速览](https://news.orz.ai/)
|
||||
- 前端项目戳这里:[热点速览 - 前端项目](https://github.com/orz-ai/hot_news_front)
|
||||
**访问地址:**
|
||||
- **Swagger UI**: http://localhost:8000/docs
|
||||
- **ReDoc**: http://localhost:8000/redoc
|
||||
- **OpenAPI Schema**: http://localhost:8000/openapi.json
|
||||
|
||||
## 概述
|
||||
### 功能特性
|
||||
|
||||
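The Daily Hot News API provides near-real-time trending news from multiple platforms. The data refreshes automatically about every 30 minutes. Use the API to retrieve trending headlines together with their URLs and scores.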
|
||||
✅ **交互式文档** - 可直接在浏览器中测试 API
|
||||
✅ **实时验证** - 自动验证请求参数和数据格式
|
||||
✅ **响应示例** - 每个接口都提供示例响应
|
||||
✅ **详细注释** - 包含完整的中文说明和使用场景
|
||||
|
||||
- **基础 URL**: `https://orz.ai/api/v1/dailynews`
|
||||
### API 分类
|
||||
|
||||
## 支持平台
|
||||
#### 1. 每日热点新闻 (Daily News)
|
||||
|
||||
我们目前支持以下平台的热点内容获取:
|
||||
| Endpoint | Description | Parameters |
|------|------|------|
| `GET /api/v1/dailynews/` | Get trending news from a single platform | `platform` (required), `date` (optional) |
| `GET /api/v1/dailynews/all` | Get trending news from all platforms | `date` (optional) |
| `GET /api/v1/dailynews/multi` | Get trending news from several platforms | `platforms` (required), `date` (optional) |
| `GET /api/v1/dailynews/search` | Search news | `keyword` (required), `platforms`, `date`, `limit` |
|
||||
|
||||
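| No. | Platform | Platform code | Content type | Status |
| ---- | --------------- | ------------- | ------------------------ | ---- |
| 1 | Baidu Hot Search (百度热搜) | baidu | Social trends, entertainment, events | ✅ |
| 2 | SSPAI (少数派) | shaoshupai | Technology, gadgets, lifestyle | ✅ |
| 3 | Weibo Hot Search (微博热搜) | weibo | Social media trends, entertainment, events | ✅ |
| 4 | Zhihu Hot List (知乎热榜) | zhihu | Q&A, in-depth content, social trends | ✅ |
| 5 | 36Kr (36氪) | 36kr | Tech startups, business news | ✅ |
| 6 | 52pojie (吾爱破解) | ftpojie | Technology, software, security | ✅ |
| 7 | Bilibili (哔哩哔哩) | bilibili | Video, anime, games, lifestyle | ✅ |
| 8 | Douban (豆瓣) | douban | Books, film, music, culture, discussion | ✅ |
| 9 | Hupu (虎扑) | hupu | Sports, games, gadgets | ✅ |
| 10 | Baidu Tieba (百度贴吧) | tieba | Interest communities, topic discussion | ✅ |
| 11 | Juejin (掘金) | juejin | Programming, technical articles | ✅ |
| 12 | Douyin (抖音) | douyin | Short-video trends, entertainment | ✅ |
| 13 | V2EX | vtex | Technology, programming, creative work | ✅ |
| 14 | Jinri Toutiao (今日头条) | jinritoutiao | News, trending events | ✅ |
| 15 | Stack Overflow | stackoverflow | Programming Q&A, technical discussion | ✅ |
| 16 | GitHub Trending | github | Open-source projects, programming languages | ✅ |
| 17 | Hacker News | hackernews | Tech news, startups, programming | ✅ |
| 18 | Sina Finance (新浪财经) | sina_finance | Financial news, stock market information | ✅ |
| 19 | Eastmoney (东方财富) | eastmoney | Financial information, investing | ✅ |
| 20 | Xueqiu (雪球) | xueqiu | Stock investing, finance community | ✅ |
| 21 | CLS (财联社) | cls | Financial news flashes, market updates | ✅ |
| 22 | Tencent News (腾讯网) | tenxunwang | General news, entertainment, technology | ✅ |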
|
||||
**支持的平台(22+):**
|
||||
- 综合资讯:百度、微博、知乎、抖音、今日头条
|
||||
- 科技:GitHub、HackerNews、掘金、36Kr、少数派
|
||||
- 财经:雪球、东方财富
|
||||
- 社区:贴吧、虎扑、豆瓣、V2EX
|
||||
- 视频:哔哩哔哩
|
||||
|
||||
## 使用方法
|
||||
#### 2. 网站工具 (Website Meta)
|
||||
|
||||
- **方法**: `GET`
|
||||
- **参数**:
|
||||
- `platform`: 指定平台。支持的平台有:
|
||||
- [x] baidu
|
||||
- [x] shaoshupai
|
||||
- [x] ......
|
||||
| 接口 | 描述 | 参数 |
|
||||
|------|------|------|
|
||||
| `GET /api/v1/tools/website-meta/` | 获取网站元数据 | `url`(必需) |
|
||||
|
||||
- **请求示例**:
|
||||
```shell
|
||||
GET https://orz.ai/api/v1/dailynews/?platform=baidu
|
||||
```
|
||||
**返回信息:**
|
||||
- 标题、描述、关键词
|
||||
- Open Graph 标签(Facebook)
|
||||
- Twitter Card 标签
|
||||
- Favicon 图标地址
|
||||
|
||||
- **响应示例**:
|
||||
```json
|
||||
{
|
||||
"status": "200",
|
||||
"data": [
|
||||
{
|
||||
"title": "32岁'母单'女孩:6年相亲百人",
|
||||
"url": "https://www.baidu.com/s?word=32%E5%B2%81%E2%80%9C%E6%AF%8D%E5%8D%95%E2%80%9D%E5%A5%B3%E5%AD%A9%EF%BC%9A6%E5%B9%B4%E7%9B%B8%E4%BA%B2%E7%99%BE%E4%BA%BA&sa=fyb_news",
|
||||
"score": "4955232",
|
||||
"desc": ""
|
||||
},
|
||||
{
|
||||
"title": "女高中生被父母退学:打工卖包子",
|
||||
"url": "https://www.baidu.com/s?word=%E5%A5%B3%E9%AB%98%E4%B8%AD%E7%94%9F%E8%A2%AB%E7%88%B6%E6%AF%8D%E9%80%80%E5%AD%A6%EF%BC%9A%E6%89%93%E5%B7%A5%E5%8D%96%E5%8C%85%E5%AD%90&sa=fyb_news",
|
||||
"score": "100000",
|
||||
"desc": "近日,一名高二女生被父母强制辍学去广东打工卖包子,引发热议。26日,当地教育局回应:已经妥善处理了,女生已复学。"
|
||||
}
|
||||
],
|
||||
"msg": "success"
|
||||
}
|
||||
```
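The `score` field in the response above is a string, so sorting on it directly would compare text rather than numbers. A minimal sketch, assuming the envelope shape shown above (`status`, `data`, `msg`); the helper name `top_items` and the sample values are hypothetical and not part of the project:

```python
import json

# Sample payload shaped like the response example above (values are made up).
SAMPLE = '''
{
  "status": "200",
  "data": [
    {"title": "item A", "url": "https://example.com/a", "score": "95000", "desc": ""},
    {"title": "item B", "url": "https://example.com/b", "score": "4955232", "desc": ""}
  ],
  "msg": "success"
}
'''


def top_items(payload: dict, limit: int = 10) -> list:
    """Return up to `limit` items ordered by numeric score, highest first."""
    if payload.get("status") != "200":
        raise ValueError(f"unexpected status: {payload.get('status')!r}")

    def numeric_score(item: dict) -> int:
        # Scores arrive as strings; treat missing or non-numeric values as 0.
        raw = str(item.get("score", ""))
        return int(raw) if raw.isdigit() else 0

    return sorted(payload["data"], key=numeric_score, reverse=True)[:limit]


ranked = top_items(json.loads(SAMPLE))
print([item["title"] for item in ranked])
```

Comparing the raw strings would rank `"95000"` above `"4955232"`; converting to integers first gives the intended order.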
|
||||
#### 3. 数据分析 (Analysis)
|
||||
|
||||
## 注意事项
|
||||
| Endpoint | Description | Parameters |
|------|------|------|
| `GET /api/v1/analysis/trend` | Aggregated trend analysis | `date`, `type` |
| `GET /api/v1/analysis/platform-comparison` | Platform comparison analysis | `date` |
| `GET /api/v1/analysis/cross-platform` | Cross-platform trend analysis | `date`, `refresh` |
| `GET /api/v1/analysis/advanced` | Advanced analysis | `date`, `refresh` |
| `GET /api/v1/analysis/prediction` | Trend prediction | `date` |
| `GET /api/v1/analysis/keyword-cloud` | Keyword cloud | `date`, `refresh`, `platforms`, `category`, `keyword_count` |
| `GET /api/v1/analysis/data-visualization` | Data visualization analysis | `date`, `platforms`, `refresh` |
| `GET /api/v1/analysis/trend-forecast` | Trend forecast analysis | `date`, `time_range`, `refresh` |
|
||||
|
||||
- 此 API 仅供合法使用。`任何非法使用均不受支持`,且由用户自行负责。
|
||||
- 本 API 提供的数据仅供参考,不应作为新闻的主要来源。
|
||||
#### 4. 健康检查 (Health)
|
||||
|
||||
## 速率限制
|
||||
| 接口 | 描述 |
|
||||
|------|------|
|
||||
| `GET /health` | 检查服务运行状态 |
|
||||
|
||||
目前此 API `没有明确的速率限制`,但请合理使用以避免服务器过载。
|
||||
### 使用示例
|
||||
|
||||
## 免责声明
|
||||
#### 示例 1:获取微博热搜
|
||||
|
||||
本 API 提供的信息可能并非始终准确或最新。用户应在依赖这些信息之前从其他平台进行验证。
|
||||
```bash
|
||||
curl "http://localhost:8000/api/v1/dailynews/?platform=weibo&date=2024-01-15"
|
||||
```
|
||||
|
||||
#### 示例 2:批量获取多平台新闻
|
||||
|
||||
## Telegram机器人
|
||||
[链接](https://t.me/SpaceWatcherBot)
|
||||
```bash
|
||||
curl "http://localhost:8000/api/v1/dailynews/multi?platforms=weibo,baidu,zhihu&date=2024-01-15"
|
||||
```
|
||||
|
||||
你可以直接使用机器人或添加到你的群组中。如果你想自己部署,你需要在环境变量中设置好 `TG_BOT_TOKEN`,再执行下面的命令:`python3 news_tg_bot.py`
|
||||
#### 示例 3:搜索相关新闻
|
||||
|
||||
## 网站基础信息接口
|
||||
```bash
|
||||
curl "http://localhost:8000/api/v1/dailynews/search?keyword=AI&platforms=weibo,zhihu&limit=10"
|
||||
```
|
||||
|
||||
[https://orz.ai/api/v1/tools/website-meta/?url=https://v2ex.com/](https://orz.ai/api/v1/tools/website-meta/?url=https://v2ex.com/)
|
||||
#### 示例 4:获取网站元数据
|
||||
|
||||
使用方法:`GET`
|
||||
```shell
|
||||
GET https://orz.ai/api/v1/tools/website-meta/?url=https://v2ex.com/
|
||||
```bash
|
||||
curl "http://localhost:8000/api/v1/tools/website-meta/?url=https://www.example.com"
|
||||
```
|
||||
|
||||
#### 示例 5:获取热点趋势分析
|
||||
|
||||
```bash
|
||||
curl "http://localhost:8000/api/v1/analysis/trend?type=main&date=2024-01-15"
|
||||
```
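The curl examples above can also be assembled programmatically. A minimal sketch using only the standard library; `BASE_URL` matches the local address used in the examples, and the helper name `build_url` is an assumption made for illustration:

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000"  # local address used in the curl examples


def build_url(path: str, **params) -> str:
    """Join BASE_URL, an endpoint path, and the non-empty query parameters."""
    # Drop parameters left as None so optional ones are simply omitted.
    query = {key: value for key, value in params.items() if value is not None}
    url = f"{BASE_URL}{path}"
    # safe="," keeps comma-separated platform lists readable, as in the examples.
    return f"{url}?{urlencode(query, safe=',')}" if query else url


print(build_url("/api/v1/dailynews/", platform="weibo", date="2024-01-15"))
print(build_url("/api/v1/dailynews/multi", platforms="weibo,baidu,zhihu", date=None))
```

Letting `urlencode` do the escaping also handles keywords that contain spaces or non-ASCII characters, which is easy to get wrong when concatenating strings by hand.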
|
||||
|
||||
### 响应格式
|
||||
|
||||
所有接口统一返回 JSON 格式:
|
||||
|
||||
```json
|
||||
{
|
||||
"status": "200",
|
||||
"data": {
|
||||
"meta_info": {
|
||||
"title": "V2EX",
|
||||
"description": "创意工作者的社区。讨论编程、设计、硬件、游戏等令人激动的话题。",
|
||||
"keywords": "",
|
||||
"author": "",
|
||||
"og:title": "",
|
||||
"og:description": "",
|
||||
"og:image": "/static/icon-192.png",
|
||||
"og:url": "https://v2ex.com/",
|
||||
"twitter:card": "",
|
||||
"twitter:title": "",
|
||||
"twitter:description": "",
|
||||
"twitter:image": "/static/icon-192.png"
|
||||
},
|
||||
"favicon_url": "https://v2ex.com/static/icon-192.png"
|
||||
},
|
||||
"msg": "Success"
|
||||
"data": {...},
|
||||
"msg": "success"
|
||||
}
|
||||
```
|
||||
|
||||
**Status codes:**
- `200`: success
- `404`: resource not found or invalid parameter
- `500`: internal server error
|
||||
|
||||
### 数据更新频率
|
||||
|
||||
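- **Trending news data**: refreshed every 30 minutes
- **Analysis data**: generated on demand; the cache can be force-refreshed
- **Website metadata**: fetched on the first request, then cached for 60 seconds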
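Because trending data changes only every 30 minutes, a client can hold on to responses for a while instead of requesting them again. A minimal sketch of a client-side TTL cache; the class name `TTLCache` and the injectable `clock` are assumptions for illustration and not part of the project:

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock(), value)


# 30 minutes mirrors the refresh interval stated above.
news_cache = TTLCache(ttl_seconds=30 * 60)
news_cache.set("weibo:2024-01-15", [{"title": "example"}])
print(news_cache.get("weibo:2024-01-15"))
```

Keying entries by platform and date keeps a cached answer for one day from being returned for another.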
|
||||
|
||||
### 注意事项
|
||||
|
||||
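1. **Data freshness**: all data is for reference only and is not real-time
2. **Lawful use**: respect each target site's robots.txt
3. **Request rate**: keep request rates reasonable to avoid triggering anti-crawling measures
4. **Caching**: most data is cached, so repeated requests do not add load on the target sites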
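One way to keep request rates reasonable on the client side is to enforce a minimum interval between calls. A minimal sketch; the class name `Throttle` and the 0.2 second interval are arbitrary choices for illustration, since the source states no concrete limit:

```python
import time
from typing import Callable, Optional


class Throttle:
    """Enforce a minimum interval between consecutive calls to `wait()`."""

    def __init__(self, min_interval: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call: Optional[float] = None  # None until the first call

    def wait(self) -> float:
        """Sleep if the previous call was too recent; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last_call is not None:
            delay = max(0.0, self._min_interval - (now - self._last_call))
        if delay > 0:
            self._sleep(delay)
        self._last_call = now + delay
        return delay


throttle = Throttle(min_interval=0.2)  # roughly five requests per second at most
for platform in ("weibo", "baidu"):
    throttle.wait()
    print("would request", platform)
```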
|
||||
|
||||
339
TROUBLESHOOTING.md
Normal file
@@ -0,0 +1,339 @@
|
||||
# 运行错误解决方案
|
||||
|
||||
## 错误汇总
|
||||
|
||||
根据日志,您遇到了以下几类错误:
|
||||
|
||||
### 1. ValueError: 'ellipsis' object is not iterable ✅ 已修复
|
||||
|
||||
**错误信息:**
|
||||
```
|
||||
ValueError: [TypeError("'ellipsis' object is not iterable"), TypeError('vars() argument must have __dict__ attribute')]
|
||||
```
|
||||
|
||||
**原因:**
|
||||
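The fix recorded here changes how `...` (Ellipsis) is passed to `Query`: from a positional argument to `default=...`. Note that FastAPI also accepts the positional form `Query(...)` as the marker for a required parameter, so if the error persists, look for other Ellipsis literals that end up in the OpenAPI schema, such as a `{...}` placeholder inside a `responses` example.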
|
||||
|
||||
**修复内容:**
|
||||
- ✅ `app/api/v1/web_tools.py` - get_meta 函数的 url 参数
|
||||
- ✅ `app/api/v1/daily_news.py` - search_news 函数的 keyword 参数
|
||||
|
||||
**修改前:**
|
||||
```python
|
||||
url: str = Query(
|
||||
..., # ❌ 错误
|
||||
description="..."
|
||||
)
|
||||
```
|
||||
|
||||
**修改后:**
|
||||
```python
|
||||
url: str = Query(
|
||||
default=..., # ✅ 正确
|
||||
description="..."
|
||||
)
|
||||
```
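For background on the error message itself: in Python, `...` is the `Ellipsis` object, and a placeholder written as `{...}` is a set containing it rather than an empty dict. Such a value cannot be serialized to JSON, which is one possible way for an Ellipsis to surface in an error like the one above (an assumption; the log does not confirm where the object came from). A minimal standard-library sketch:

```python
import json

placeholder = {...}  # a set literal with one element: the Ellipsis object

print(type(placeholder).__name__)   # set
print(Ellipsis in placeholder)      # True

try:
    json.dumps({"charts_data": placeholder})
    serializable = True
except TypeError as exc:
    # The standard json module rejects sets (and Ellipsis) outright.
    serializable = False
    print("not JSON serializable:", exc)

# An empty dict, or real sample data, is a safe placeholder instead.
print(json.dumps({"charts_data": {}}))
```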
|
||||
|
||||
---
|
||||
|
||||
### 2. Redis 连接失败 ⚠️ 需要启动 Redis 服务
|
||||
|
||||
**错误信息:**
|
||||
```
|
||||
Error parsing JSON: Error 10061 connecting to localhost:6379. 由于目标计算机积极拒绝,无法连接。
|
||||
```
|
||||
|
||||
**原因:**
|
||||
Redis 服务未启动或配置不正确。项目依赖 Redis 进行数据缓存。
|
||||
|
||||
**解决方案:**
|
||||
|
||||
#### Windows 系统
|
||||
|
||||
**方式 A: 使用 Windows Subsystem for Linux (WSL)**
|
||||
```bash
|
||||
# 1. 安装 Redis (如果未安装)
|
||||
sudo apt-get update
|
||||
sudo apt-get install redis-server
|
||||
|
||||
# 2. 启动 Redis
|
||||
sudo service redis-server start
|
||||
|
||||
# 3. 验证 Redis 是否运行
|
||||
redis-cli ping
|
||||
# 应返回:PONG
|
||||
```
|
||||
|
||||
**方式 B: 使用 Docker(推荐)**
|
||||
```bash
|
||||
# 1. 安装 Docker Desktop for Windows
|
||||
# 下载地址:https://www.docker.com/products/docker-desktop
|
||||
|
||||
# 2. 运行 Redis 容器
|
||||
docker run -d -p 6379:6379 --name redis redis:latest
|
||||
|
||||
# 3. 验证
|
||||
docker ps | grep redis
|
||||
```
|
||||
|
||||
**方式 C: 使用 Windows 原生版本**
|
||||
```bash
|
||||
# 1. 下载 Redis for Windows
|
||||
# GitHub: https://github.com/microsoftarchive/redis/releases
|
||||
|
||||
# 2. 解压后运行
|
||||
redis-server.exe
|
||||
|
||||
# 3. 测试连接
|
||||
redis-cli.exe
|
||||
```
|
||||
|
||||
#### 配置修改(可选)
|
||||
|
||||
如果 Redis 不在默认端口 6379,需要修改配置文件:
|
||||
|
||||
**文件:** `config/config.yaml`
|
||||
```yaml
|
||||
redis:
|
||||
host: localhost # 或您的 Redis 服务器 IP
|
||||
port: 6379 # 或您的 Redis 端口
|
||||
db: 0
|
||||
password: "" # 如果有密码
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### 3. MySQL 数据库连接问题 ⚠️ 需要配置
|
||||
|
||||
**检查配置文件:** `config/config.yaml`
|
||||
|
||||
确保 MySQL 配置正确:
|
||||
```yaml
|
||||
database:
|
||||
host: localhost
|
||||
user: root
|
||||
password: your_password
|
||||
db: hot_news
|
||||
charset: utf8mb4
|
||||
autocommit: true
|
||||
```
|
||||
|
||||
**创建数据库:**
|
||||
```sql
|
||||
CREATE DATABASE hot_news CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### 4. 爬虫抓取失败 ⚠️ 正常现象(首次运行)
|
||||
|
||||
**错误信息:**
|
||||
```
|
||||
2026-03-26 23:39:25 - app - INFO - crawler shaoshupai failed. 0 news fetched
|
||||
2026-03-26 23:39:31 - app - INFO - crawler weibo failed. 0 news fetched
|
||||
2026-03-26 23:39:41 - app - INFO - crawler zhihu failed. 0 news fetched
|
||||
2026-03-26 23:39:43 - app - INFO - crawler 36kr failed. 0 news fetched
|
||||
```
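To see at a glance which crawlers failed, log lines in the format above can be scanned with a regular expression. A minimal sketch, assuming every failure line follows the `crawler <name> failed. <n> news fetched` pattern shown in the sample; the function name `failed_crawlers` is hypothetical:

```python
import re

# Matches lines such as: "... - app - INFO - crawler weibo failed. 0 news fetched"
FAILED = re.compile(r"crawler (\S+) failed\. (\d+) news fetched")

LOG = """\
2026-03-26 23:39:25 - app - INFO - crawler shaoshupai failed. 0 news fetched
2026-03-26 23:39:31 - app - INFO - crawler weibo failed. 0 news fetched
2026-03-26 23:39:41 - app - INFO - crawler zhihu failed. 0 news fetched
2026-03-26 23:39:43 - app - INFO - crawler 36kr failed. 0 news fetched
"""


def failed_crawlers(log_text: str) -> list:
    """Return crawler names that reported a failure, in order of appearance."""
    return [match.group(1) for match in FAILED.finditer(log_text)]


print(failed_crawlers(LOG))
```

If every crawler appears in the list, a shared dependency such as Redis or the network is a more likely culprit than any single site.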
|
||||
|
||||
**原因分析:**
|
||||
|
||||
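1. **Redis not connected** - without the cache there is nowhere to store the crawled data
2. **Network issues** - some sites may require a proxy or a way around their anti-crawling measures
3. **First run** - initialization may take some time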
|
||||
|
||||
**解决步骤:**
|
||||
|
||||
### 第一步:启动必需服务
|
||||
|
||||
```bash
|
||||
# 1. 启动 MySQL
|
||||
# Windows 服务管理器中启动,或使用命令
|
||||
net start MySQL80 # 根据您的服务名调整
|
||||
|
||||
# 2. 启动 Redis
|
||||
# 如果使用 Docker
|
||||
docker start redis
|
||||
|
||||
# 如果使用 WSL
|
||||
sudo service redis-server start
|
||||
|
||||
# 如果使用原生 Windows 版本
|
||||
redis-server.exe
|
||||
```
|
||||
|
||||
### 第二步:验证连接
|
||||
|
||||
```bash
|
||||
# 测试 MySQL 连接
|
||||
mysql -h localhost -u root -p
|
||||
# 输入密码后应能登录
|
||||
|
||||
# 测试 Redis 连接
|
||||
redis-cli ping
|
||||
# 应返回:PONG
|
||||
```
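When `redis-cli` or the `mysql` client is not installed, a plain TCP check can at least tell whether anything is listening on the expected port (Windows error 10061 means the connection was refused). A minimal sketch; the helper name `port_is_open` is hypothetical, and ports 6379 and 3306 are the Redis and MySQL defaults:

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused (e.g. WinError 10061) and timeouts.
        return False


for name, port in (("Redis", 6379), ("MySQL", 3306)):
    state = "reachable" if port_is_open("localhost", port) else "not reachable"
    print(f"{name} on port {port}: {state}")
```

An open port only shows that some process is listening; authentication and database existence still need the client checks above.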
|
||||
|
||||
### 第三步:重新启动应用
|
||||
|
||||
```bash
|
||||
cd e:\hot_news-main
|
||||
python run.py
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 完整的启动流程
|
||||
|
||||
### 前置条件检查清单
|
||||
|
||||
- [ ] Python 3.8+ 已安装
|
||||
- [ ] MySQL 数据库已安装并运行
|
||||
- [ ] Redis 服务已安装并运行
|
||||
- [ ] Chrome/Chromium 浏览器已安装(用于 Selenium)
|
||||
- [ ] 依赖包已安装:`pip install -r requirements.txt`
|
||||
|
||||
### 标准启动步骤
|
||||
|
||||
```bash
|
||||
# 1. 启动 MySQL
|
||||
net start MySQL80
|
||||
|
||||
# 2. 启动 Redis(选择一种方式)
|
||||
# Docker 方式
|
||||
docker start redis
|
||||
|
||||
# WSL 方式
|
||||
wsl sudo service redis-server start
|
||||
|
||||
# 3. 验证服务
|
||||
redis-cli ping # 应返回 PONG
|
||||
|
||||
# 4. 启动应用
|
||||
cd e:\hot_news-main
|
||||
python run.py
|
||||
```
|
||||
|
||||
### 访问 Swagger 文档
|
||||
|
||||
启动成功后,访问:
|
||||
- **Swagger UI**: http://localhost:8000/docs
|
||||
- **ReDoc**: http://localhost:8000/redoc
|
||||
- **API 测试**: http://localhost:8000/health
|
||||
|
||||
---
|
||||
|
||||
## 常见问题排查
|
||||
|
||||
### Q1: Redis 无法启动
|
||||
|
||||
**Windows Docker 方式:**
|
||||
```bash
|
||||
# 检查 Docker 是否运行
|
||||
docker ps
|
||||
|
||||
# 如果 Docker Desktop 未启动,请先启动 Docker Desktop
|
||||
```
|
||||
|
||||
**WSL 方式:**
|
||||
```bash
|
||||
# 检查 WSL 状态
|
||||
wsl status
|
||||
|
||||
# 重启 Redis 服务
|
||||
wsl sudo service redis-server restart
|
||||
```
|
||||
|
||||
### Q2: MySQL 连接被拒绝
|
||||
|
||||
**解决方法:**
|
||||
```sql
|
||||
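-- Log in to MySQL as root first (run this in a shell, not at the SQL prompt):
--   mysql -u root -p

-- Grant privileges on the hot_news database to the local root account.
-- MySQL 8.0 no longer accepts IDENTIFIED BY inside GRANT, so set the password separately.
ALTER USER 'root'@'localhost' IDENTIFIED BY 'your_password';
GRANT ALL PRIVILEGES ON hot_news.* TO 'root'@'localhost';
FLUSH PRIVILEGES;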
|
||||
```
|
||||
|
||||
### Q3: 爬虫仍然失败
|
||||
|
||||
**检查项:**
|
||||
1. 网络连接是否正常
|
||||
2. 是否需要代理(某些网站可能限制访问)
|
||||
3. Chrome 浏览器是否已安装
|
||||
4. ChromeDriver 版本是否匹配
|
||||
|
||||
**临时方案:**
|
||||
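If some platforms keep failing, you can ignore them for now; the other platforms still work. The project supports 22+ platforms, and a few unavailable ones do not affect overall functionality.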
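The same tolerance can be applied on the client side: collect whatever succeeds and record the rest. A minimal sketch in which `fetchers` maps platform codes to zero-argument callables; both the mapping and the `collect` helper are hypothetical stand-ins, not the project's crawler interface:

```python
from typing import Callable, Dict, List, Tuple

Fetcher = Callable[[], List[dict]]


def collect(fetchers: Dict[str, Fetcher]) -> Tuple[Dict[str, List[dict]], Dict[str, str]]:
    """Run every fetcher; return (results by platform, error message by platform)."""
    results: Dict[str, List[dict]] = {}
    errors: Dict[str, str] = {}
    for platform, fetch in fetchers.items():
        try:
            results[platform] = fetch()
        except Exception as exc:  # one failing platform must not stop the others
            errors[platform] = str(exc)
    return results, errors


def broken() -> List[dict]:
    raise RuntimeError("blocked by anti-crawling measures")


results, errors = collect({
    "weibo": lambda: [{"title": "example"}],
    "zhihu": broken,
})
print(sorted(results), sorted(errors))
```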
|
||||
|
||||
---
|
||||
|
||||
## 快速测试脚本
|
||||
|
||||
创建测试文件 `test_connection.py`:
|
||||
|
||||
```python
|
||||
"""
|
||||
测试数据库和缓存连接
|
||||
使用方法:python test_connection.py
|
||||
"""
|
||||
|
||||
import sys
|
||||
|
||||
# 测试 Redis 连接
|
||||
print("🔴 测试 Redis 连接...")
|
||||
try:
|
||||
import redis
|
||||
r = redis.Redis(host='localhost', port=6379, db=0)
|
||||
r.ping()
|
||||
print("✅ Redis 连接成功!")
|
||||
except Exception as e:
|
||||
print(f"❌ Redis 连接失败:{e}")
|
||||
print(" 请启动 Redis 服务:")
|
||||
print(" - Docker: docker run -d -p 6379:6379 --name redis redis:latest")
|
||||
print(" - WSL: sudo service redis-server start")
|
||||
sys.exit(1)
|
||||
|
||||
# 测试 MySQL 连接
|
||||
print("\n🔵 测试 MySQL 连接...")
|
||||
try:
|
||||
import pymysql
|
||||
conn = pymysql.connect(
|
||||
host='localhost',
|
||||
user='root',
|
||||
password='your_password', # 修改为您的密码
|
||||
database='hot_news',
|
||||
charset='utf8mb4'
|
||||
)
|
||||
print("✅ MySQL 连接成功!")
|
||||
conn.close()
|
||||
except Exception as e:
|
||||
print(f"❌ MySQL 连接失败:{e}")
|
||||
print(" 请检查:")
|
||||
print(" 1. MySQL 服务是否启动")
|
||||
print(" 2. 数据库 hot_news 是否存在")
|
||||
print(" 3. 用户名密码是否正确")
|
||||
sys.exit(1)
|
||||
|
||||
print("\n✅ 所有连接测试通过!可以启动应用了。")
|
||||
print("\n启动命令:python run.py")
|
||||
```
|
||||
|
||||
运行测试:
|
||||
```bash
|
||||
python test_connection.py
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 总结
|
||||
|
||||
当前已修复的错误:
|
||||
- ✅ ValueError: ellipsis 对象错误
|
||||
|
||||
需要您手动处理的错误:
|
||||
- ⚠️ 启动 Redis 服务(参考上述方案)
|
||||
- ⚠️ 配置 MySQL 数据库(如未配置)
|
||||
- ⚠️ 爬虫失败(启动 Redis 后通常会恢复正常)
|
||||
|
||||
按照上述步骤操作后,系统应该可以正常运行。如有其他问题,请查看具体错误日志。
|
||||
@@ -11,15 +11,44 @@ from app.core import cache
|
||||
|
||||
router = APIRouter()
|
||||
|
||||
@router.get("/trend")
|
||||
async def get_trend_analysis(date: Optional[str] = None, type: str = "main"):
|
||||
@router.get(
|
||||
"/trend",
|
||||
summary="热点聚合分析",
|
||||
description="分析各平台热点数据的共性和差异,提取共同关键词、跨平台热点话题等",
|
||||
response_description="返回热点分析结果",
|
||||
responses={
|
||||
200: {"description": "成功获取热点聚合分析"},
|
||||
500: {"description": "分析过程出错"}
|
||||
}
|
||||
)
|
||||
async def get_trend_analysis(
|
||||
date: Optional[str] = Query(
|
||||
default=None,
|
||||
description="指定日期,格式为 YYYY-MM-DD,默认为当天",
|
||||
example="2024-01-15"
|
||||
),
|
||||
type: str = Query(
|
||||
default="main",
|
||||
description="分析类型:main(主题分析), platform(平台对比), cross(跨平台热点), advanced(高级分析)",
|
||||
enum=["main", "platform", "cross", "advanced"]
|
||||
)
|
||||
):
|
||||
"""
|
||||
获取热点聚合分析
|
||||
**热点聚合分析**
|
||||
|
||||
分析各平台热点数据的共性和差异,提取共同关键词、跨平台热点话题等
|
||||
对多个平台的热点数据进行聚合分析,识别共同话题和传播趋势。
|
||||
|
||||
- **date**: 可选,指定日期,格式为YYYY-MM-DD,默认为当天
|
||||
- **type**: 分析类型,可选值为 main(主题分析), platform(平台对比), cross(跨平台热点), advanced(高级分析),默认为main
|
||||
**分析类型说明:**
|
||||
- `main`: 主题分析 - 提取核心热点话题和关键词
|
||||
- `platform`: 平台对比 - 比较不同平台的特点和差异
|
||||
- `cross`: 跨平台热点 - 识别在多个平台同时出现的话题
|
||||
- `advanced`: 高级分析 - 提供更深层次的数据洞察
|
||||
|
||||
**应用场景:**
|
||||
- 舆情监控
|
||||
- 热点追踪
|
||||
- 内容运营决策
|
||||
- 市场趋势分析
|
||||
"""
|
||||
try:
|
||||
if not date:
|
||||
@@ -45,14 +74,37 @@ async def get_trend_analysis(date: Optional[str] = None, type: str = "main"):
|
||||
"date": date or datetime.now(pytz.timezone('Asia/Shanghai')).strftime("%Y-%m-%d")
|
||||
}
|
||||
|
||||
@router.get("/platform-comparison")
|
||||
async def get_platform_comparison(date: Optional[str] = None):
|
||||
@router.get(
|
||||
"/platform-comparison",
|
||||
summary="平台对比分析",
|
||||
description="分析各平台热点数据的特点、热度排行、更新频率等,比较不同平台间的异同",
|
||||
response_description="返回平台对比分析结果",
|
||||
responses={
|
||||
200: {"description": "成功获取平台对比分析"}
|
||||
}
|
||||
)
|
||||
async def get_platform_comparison(
|
||||
date: Optional[str] = Query(
|
||||
default=None,
|
||||
description="指定日期,格式为 YYYY-MM-DD,默认为当天",
|
||||
example="2024-01-15"
|
||||
)
|
||||
):
|
||||
"""
|
||||
获取平台对比分析
|
||||
**平台对比分析**
|
||||
|
||||
分析各平台热点数据的特点、热度排行、更新频率等,比较不同平台间的异同
|
||||
对比分析不同平台的特点、热度分布和内容特征。
|
||||
|
||||
- **date**: 可选,指定日期,格式为YYYY-MM-DD,默认为当天
|
||||
**分析维度:**
|
||||
- 数据统计:各平台新闻数量、平均热度
|
||||
- 更新频率:数据更新速度
|
||||
- 热度排行:按平台热度排序
|
||||
- 特征分析:各平台的内容特点
|
||||
|
||||
**适用场景:**
|
||||
- 平台选择决策
|
||||
- 投放策略制定
|
||||
- 用户画像分析
|
||||
"""
|
||||
try:
|
||||
if not date:
|
||||
@@ -78,15 +130,62 @@ async def get_platform_comparison(date: Optional[str] = None):
|
||||
"date": date or datetime.now(pytz.timezone('Asia/Shanghai')).strftime("%Y-%m-%d")
|
||||
}
|
||||
|
||||
@router.get("/cross-platform")
|
||||
async def get_cross_platform_analysis(date: Optional[str] = None, refresh: bool = False):
|
||||
@router.get(
|
||||
"/cross-platform",
|
||||
summary="跨平台热点分析",
|
||||
description="分析在多个平台上出现的热点话题,以及热点的传播路径",
|
||||
response_description="返回跨平台热点分析结果",
|
||||
responses={
|
||||
200: {
|
||||
"description": "成功获取跨平台热点分析",
|
||||
"content": {
|
||||
"application/json": {
|
||||
"example": {
|
||||
"status": "success",
|
||||
"date": "2024-01-15",
|
||||
"cross_platform_topics": [
|
||||
{
|
||||
"topic": "某明星事件",
|
||||
"platforms": ["weibo", "douyin", "baidu"],
|
||||
"first_platform": "weibo",
|
||||
"spread_path": ["weibo", "zhihu", "baidu", "douyin"],
|
||||
"heat_trend": "rising"
|
||||
}
|
||||
],
|
||||
"topic_count": 15,
|
||||
"avg_platforms_per_topic": 3.5
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
)
|
||||
async def get_cross_platform_analysis(
|
||||
date: Optional[str] = Query(
|
||||
default=None,
|
||||
description="指定日期,格式为 YYYY-MM-DD,默认为当天",
|
||||
example="2024-01-15"
|
||||
),
|
||||
refresh: bool = Query(
|
||||
default=False,
|
||||
description="是否强制刷新缓存"
|
||||
)
|
||||
):
|
||||
"""
|
||||
获取跨平台热点分析
|
||||
**跨平台热点分析**
|
||||
|
||||
分析在多个平台上出现的热点话题,以及热点的传播路径
|
||||
识别在多个平台同时出现的热点话题,分析传播路径和发展趋势。
|
||||
|
||||
- **date**: 可选,指定日期,格式为YYYY-MM-DD,默认为当天
|
||||
- **refresh**: 可选,是否强制刷新缓存,默认为False
|
||||
**分析内容:**
|
||||
- 跨平台话题识别
|
||||
- 首发平台判断
|
||||
- 传播路径追踪
|
||||
- 热度趋势分析
|
||||
|
||||
**价值:**
|
||||
- 发现全网热点
|
||||
- 追踪舆情走向
|
||||
- 把握传播规律
|
||||
"""
|
||||
try:
|
||||
if not date:
|
||||
@@ -114,15 +213,61 @@ async def get_cross_platform_analysis(date: Optional[str] = None, refresh: bool
|
||||
"date": date or datetime.now(pytz.timezone('Asia/Shanghai')).strftime("%Y-%m-%d")
|
||||
}
|
||||
|
||||
@router.get("/advanced")
|
||||
async def get_advanced_analysis(date: Optional[str] = None, refresh: bool = False):
|
||||
@router.get(
|
||||
"/advanced",
|
||||
summary="高级分析",
|
||||
description="提供更深入的热点分析,包括关键词云图、情感分析、热点演变趋势等",
|
||||
response_description="返回高级分析结果",
|
||||
responses={
|
||||
200: {
|
||||
"description": "成功获取高级分析",
|
||||
"content": {
|
||||
"application/json": {
|
||||
"example": {
|
||||
"status": "success",
|
||||
"date": "2024-01-15",
|
||||
"keyword_cloud": [{"word": "AI", "weight": 95}],
|
||||
"sentiment_analysis": {
|
||||
"positive": 45,
|
||||
"neutral": 40,
|
||||
"negative": 15
|
||||
},
|
||||
"evolution_trend": [
|
||||
{"hour": "00:00", "heat": 5000},
|
||||
{"hour": "12:00", "heat": 8500}
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
)
|
||||
async def get_advanced_analysis(
|
||||
date: Optional[str] = Query(
|
||||
default=None,
|
||||
description="指定日期,格式为 YYYY-MM-DD,默认为当天",
|
||||
example="2024-01-15"
|
||||
),
|
||||
refresh: bool = Query(
|
||||
default=False,
|
||||
description="是否强制刷新缓存"
|
||||
)
|
||||
):
|
||||
"""
|
||||
获取高级分析
|
||||
**高级分析**
|
||||
|
||||
提供更深入的热点分析,包括关键词云图、情感分析、热点演变趋势等
|
||||
提供深度的数据分析功能,包括词云、情感分析和演变趋势。
|
||||
|
||||
- **date**: 可选,指定日期,格式为YYYY-MM-DD,默认为当天
|
||||
- **refresh**: 可选,是否强制刷新缓存,默认为False
|
||||
**分析模块:**
|
||||
- 关键词云图:可视化展示高频词汇
|
||||
- 情感分析:正/中/负向情感分布
|
||||
- 演变趋势:时间维度上的热度变化
|
||||
- 深度洞察:数据背后的规律
|
||||
|
||||
**适用场景:**
|
||||
- 深度报告生成
|
||||
- 趋势研究
|
||||
- 舆情分析
|
||||
"""
|
||||
try:
|
||||
if not date:
|
||||
@@ -150,14 +295,60 @@ async def get_advanced_analysis(date: Optional[str] = None, refresh: bool = Fals
|
||||
"date": date or datetime.now(pytz.timezone('Asia/Shanghai')).strftime("%Y-%m-%d")
|
||||
}
|
||||
|
||||
@router.get("/prediction")
|
||||
async def get_trend_prediction(date: Optional[str] = None):
|
||||
@router.get(
|
||||
"/prediction",
|
||||
summary="热点趋势预测",
|
||||
description="基于历史数据预测热点话题的发展趋势,包括上升趋势、下降趋势、持续热门话题等",
|
||||
response_description="返回热点预测结果",
|
||||
responses={
|
||||
200: {
|
||||
"description": "成功获取热点预测",
|
||||
"content": {
|
||||
"application/json": {
|
||||
"example": {
|
||||
"status": "success",
|
||||
"date": "2024-01-15",
|
||||
"rising_topics": [
|
||||
{"topic": "新技术发布", "trend": "rising", "confidence": 0.85}
|
||||
],
|
||||
"declining_topics": [
|
||||
{"topic": "旧闻", "trend": "declining", "confidence": 0.75}
|
||||
],
|
||||
"sustained_hot_topics": [
|
||||
{"topic": "持续热点", "trend": "stable", "duration": "3 天"}
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
)
|
||||
async def get_trend_prediction(
|
||||
date: Optional[str] = Query(
|
||||
default=None,
|
||||
description="指定日期,格式为 YYYY-MM-DD,默认为当天",
|
||||
example="2024-01-15"
|
||||
)
|
||||
):
|
||||
"""
|
||||
获取热点趋势预测
|
||||
**热点趋势预测**
|
||||
|
||||
基于历史数据预测热点话题的发展趋势,包括上升趋势、下降趋势、持续热门话题等
|
||||
基于历史数据和算法模型预测热点话题的未来发展趋势。
|
||||
|
||||
- **date**: 可选,指定日期,格式为YYYY-MM-DD,默认为当天
|
||||
**预测类型:**
|
||||
- 上升趋势:热度正在上涨的话题
|
||||
- 下降趋势:热度逐渐消退的话题
|
||||
- 持续热门:长期保持高热度的话题
|
||||
|
||||
**输出指标:**
|
||||
- 趋势方向:rising/declining/stable
|
||||
- 置信度:预测的可信程度 (0-1)
|
||||
- 持续时间:话题预计持续的时间
|
||||
|
||||
**应用价值:**
|
||||
- 提前布局内容
|
||||
- 把握流量机会
|
||||
- 规避过时话题
|
||||
"""
|
||||
try:
|
||||
if not date:
|
||||
@@ -183,18 +374,89 @@ async def get_trend_prediction(date: Optional[str] = None):
|
||||
"date": date or datetime.now(pytz.timezone('Asia/Shanghai')).strftime("%Y-%m-%d")
|
||||
}
|
||||
|
||||
@router.get("/keyword-cloud")
|
||||
async def get_keyword_cloud(date: Optional[str] = None, refresh: bool = False, platforms: Optional[str] = None, category: Optional[str] = None, keyword_count: int = 200):
|
||||
@router.get(
|
||||
"/keyword-cloud",
|
||||
summary="关键词云图",
|
||||
description="提取热点数据中的关键词,按不同类别(科技、娱乐、社会等)进行分类,用于生成词云",
|
||||
response_description="返回关键词云图数据",
|
||||
responses={
|
||||
200: {
|
||||
"description": "成功获取关键词云图",
|
||||
"content": {
|
||||
"application/json": {
|
||||
"example": {
|
||||
"status": "success",
|
||||
"date": "2024-01-15",
|
||||
"keyword_clouds": {
|
||||
"科技": [
|
||||
{"word": "人工智能", "weight": 95},
|
||||
{"word": "大模型", "weight": 88}
|
||||
],
|
||||
"娱乐": [
|
||||
{"word": "电影", "weight": 75},
|
||||
{"word": "明星", "weight": 70}
|
||||
]
|
||||
},
|
||||
"total_keywords": 200
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
)
|
||||
async def get_keyword_cloud(
|
||||
date: Optional[str] = Query(
|
||||
default=None,
|
||||
description="指定日期,格式为 YYYY-MM-DD,默认为当天",
|
||||
example="2024-01-15"
|
||||
),
|
||||
refresh: bool = Query(
|
||||
default=False,
|
||||
description="是否强制刷新缓存"
|
||||
),
|
||||
platforms: Optional[str] = Query(
|
||||
default=None,
|
||||
description="指定平台,多个平台用逗号分隔,如 baidu,weibo",
|
||||
example="baidu,weibo"
|
||||
),
|
||||
category: Optional[str] = Query(
|
||||
default=None,
|
||||
description="指定分类,如科技、娱乐等",
|
||||
example="科技"
|
||||
),
|
||||
keyword_count: int = Query(
|
||||
default=200,
|
||||
ge=50,
|
||||
le=1000,
|
||||
description="返回的关键词数量,范围 50-1000"
|
||||
)
|
||||
):
|
||||
"""
|
||||
获取关键词云图数据
|
||||
**关键词云图**
|
||||
|
||||
提取热点数据中的关键词,按不同类别(科技、娱乐、社会等)进行分类,用于生成词云
|
||||
从热点数据中提取高频关键词,并按类别分类,可用于可视化展示。
|
||||
|
||||
- **date**: 可选,指定日期,格式为YYYY-MM-DD,默认为当天
|
||||
- **refresh**: 可选,是否强制刷新缓存,默认为False
|
||||
- **platforms**: 可选,指定平台,多个平台用逗号分隔,如"baidu,weibo"
|
||||
- **category**: 可选,指定分类,如"科技"、"娱乐"等
|
||||
- **keyword_count**: 可选,返回的关键词数量,默认为200
|
||||
**参数说明:**
|
||||
- `platforms`: 限定分析的平台范围
|
||||
- `category`: 筛选特定分类的关键词
|
||||
- `keyword_count`: 返回的关键词数量
|
||||
|
||||
**分类体系:**
|
||||
- 科技:互联网、数码、AI 等
|
||||
- 娱乐:影视、明星、综艺等
|
||||
- 社会:民生、时事等
|
||||
- 体育:赛事、运动员等
|
||||
- 财经:经济、金融、股市等
|
||||
|
||||
**数据格式:**
|
||||
每个关键词包含:
|
||||
- `word`: 词语本身
|
||||
- `weight`: 权重值(基于词频和重要性)
|
||||
|
||||
**应用场景:**
|
||||
- 数据可视化
|
||||
- 内容标签生成
|
||||
- 主题挖掘
|
||||
"""
|
||||
try:
|
||||
if not date:
|
||||
@@ -228,16 +490,74 @@ async def get_keyword_cloud(date: Optional[str] = None, refresh: bool = False, p
|
||||
"date": date or datetime.now(pytz.timezone('Asia/Shanghai')).strftime("%Y-%m-%d")
|
||||
}
|
||||
|
||||
@router.get("/data-visualization")
|
||||
async def get_data_visualization(date: Optional[str] = None, refresh: bool = False, platforms: str = None):
|
||||
@router.get(
|
||||
"/data-visualization",
|
||||
summary="数据可视化分析",
|
||||
description="提供热点数据的可视化分析,包括主题热度分布图",
|
||||
response_description="返回数据可视化分析结果",
|
||||
responses={
|
||||
200: {
|
||||
"description": "成功获取数据可视化分析",
|
||||
"content": {
|
||||
"application/json": {
|
||||
"example": {
|
||||
"status": "success",
|
||||
"date": "2024-01-15",
|
||||
"theme_distribution": {
|
||||
"科技": 35,
|
||||
"娱乐": 25,
|
||||
"社会": 20,
|
||||
"财经": 15,
|
||||
"体育": 5
|
||||
},
|
||||
"platform_heatmap": {
|
||||
"weibo": {"morning": 80, "afternoon": 95, "evening": 100},
|
||||
"zhihu": {"morning": 60, "afternoon": 75, "evening": 85}
|
||||
},
|
||||
"charts_data": {...}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
)
|
||||
async def get_data_visualization(
|
||||
date: Optional[str] = Query(
|
||||
default=None,
|
||||
description="指定日期,格式为 YYYY-MM-DD,默认为当天",
|
||||
example="2024-01-15"
|
||||
),
|
||||
refresh: bool = Query(
|
||||
default=False,
|
||||
description="是否强制刷新缓存"
|
||||
),
|
||||
platforms: Optional[str] = Query(
|
||||
default=None,
|
||||
description="指定要分析的平台,多个平台用逗号分隔",
|
||||
example="baidu,weibo,douyin"
|
||||
)
|
||||
):
|
||||
"""
|
||||
获取数据可视化分析
|
||||
**数据可视化分析**
|
||||
|
||||
提供热点数据的可视化分析,包括主题热度分布图
|
||||
提供用于前端可视化的结构化数据分析结果。
|
||||
|
||||
- **date**: 可选,指定日期,格式为YYYY-MM-DD,默认为当天
|
||||
- **refresh**: 可选,是否强制刷新缓存,默认为False
|
||||
- **platforms**: 可选,指定要分析的平台,多个平台用逗号分隔,例如:baidu,weibo,douyin
|
||||
**可视化类型:**
|
||||
- 主题热度分布图:饼图/柱状图数据
|
||||
- 平台热力图:时间段热度对比
|
||||
- 趋势线图:时间序列数据
|
||||
- 排行榜:TOP N 数据
|
||||
|
||||
**输出格式:**
|
||||
数据格式适合常见图表库(ECharts、Chart.js 等)直接使用。
|
||||
|
||||
**支持的平台筛选:**
|
||||
可以指定部分平台进行分析,减少数据量。
|
||||
|
||||
**典型应用:**
|
||||
- Dashboard 展示
|
||||
- 数据报表
|
||||
- 实时监控大屏
|
||||
"""
|
||||
try:
|
||||
if not date:
|
||||
@@ -270,16 +590,86 @@ async def get_data_visualization(date: Optional[str] = None, refresh: bool = Fal
|
||||
"date": date or datetime.now(pytz.timezone('Asia/Shanghai')).strftime("%Y-%m-%d")
|
||||
}
|
||||
|
||||
@router.get("/trend-forecast")
|
||||
async def get_trend_forecast(date: Optional[str] = None, refresh: bool = False, time_range: str = "24h"):
|
||||
@router.get(
|
||||
"/trend-forecast",
|
||||
summary="热点趋势预测分析",
|
||||
description="分析热点话题的演变趋势,预测热点的发展方向",
|
||||
response_description="返回热点趋势预测分析结果",
|
||||
responses={
|
||||
200: {
|
||||
"description": "成功获取热点趋势预测",
|
||||
"content": {
|
||||
"application/json": {
|
||||
"example": {
|
||||
"status": "success",
|
||||
"date": "2024-01-15",
|
||||
"time_range": "24h",
|
||||
"forecast": [
|
||||
{
|
||||
"topic": "热门话题 A",
|
||||
"current_heat": 8500,
|
||||
"predicted_heat": 9500,
|
||||
"trend": "rising",
|
||||
"confidence": 0.82
|
||||
},
|
||||
{
|
||||
"topic": "热门话题 B",
|
||||
"current_heat": 6000,
|
||||
"predicted_heat": 4500,
|
||||
"trend": "declining",
|
||||
"confidence": 0.75
|
||||
}
|
||||
],
|
||||
"key_insights": "预计未来 24 小时内,科技类话题将持续升温..."
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
)
|
||||
async def get_trend_forecast(
|
||||
date: Optional[str] = Query(
|
||||
default=None,
|
||||
description="指定日期,格式为 YYYY-MM-DD,默认为当天",
|
||||
example="2024-01-15"
|
||||
),
|
||||
refresh: bool = Query(
|
||||
default=False,
|
||||
description="是否强制刷新缓存"
|
||||
),
|
||||
time_range: str = Query(
|
||||
default="24h",
|
||||
description="预测时间范围:24h(24 小时), 7d(7 天), 30d(30 天)",
|
||||
enum=["24h", "7d", "30d"]
|
||||
)
|
||||
):
|
||||
"""
|
||||
获取热点趋势预测分析
|
||||
**热点趋势预测分析**
|
||||
|
||||
分析热点话题的演变趋势,预测热点的发展方向
|
||||
基于时间序列分析预测热点话题的未来发展趋势。
|
||||
|
||||
- **date**: 可选,指定日期,格式为YYYY-MM-DD,默认为当天
|
||||
- **refresh**: 可选,是否强制刷新缓存,默认为False
|
||||
- **time_range**: 可选,预测时间范围,可选值为 24h(24小时), 7d(7天), 30d(30天),默认为24h
|
||||
**预测时间范围:**
|
||||
- `24h`: 短期预测,适合快速变化的话题
|
||||
- `7d`: 中期预测,适合持续性话题
|
||||
- `30d`: 长期预测,适合重大事件
|
||||
|
||||
**输出内容:**
|
||||
- 当前热度值
|
||||
- 预测热度值
|
||||
- 趋势方向(上升/下降/平稳)
|
||||
- 置信度评分
|
||||
- 关键洞察总结
|
||||
|
||||
**预测维度:**
|
||||
- 整体热度走势
|
||||
- 分话题发展趋势
|
||||
- 平台表现预测
|
||||
- 新兴话题发现
|
||||
|
||||
**业务价值:**
|
||||
- 内容策划参考
|
||||
- 资源投放决策
|
||||
- 风险预警
|
||||
"""
|
||||
try:
|
||||
if not date:
|
||||
@@ -288,7 +678,7 @@ async def get_trend_forecast(date: Optional[str] = None, refresh: bool = False,
|
||||
# 验证时间范围参数
|
||||
valid_time_ranges = ["24h", "7d", "30d"]
|
||||
if time_range not in valid_time_ranges:
|
||||
time_range = "24h" # 默认使用24小时
|
||||
time_range = "24h" # 默认使用 24 小时
|
||||
|
||||
# 从缓存中获取数据
|
||||
cache_key = f"analysis:trend_forecast:{date}:{time_range}"
|
||||
@@ -311,4 +701,4 @@ async def get_trend_forecast(date: Optional[str] = None, refresh: bool = False,
|
||||
"message": str(e),
|
||||
"date": date or datetime.now(pytz.timezone('Asia/Shanghai')).strftime("%Y-%m-%d"),
|
||||
"time_range": time_range
|
||||
}
|
||||
}
|
||||
|
||||
@@ -4,7 +4,8 @@ from datetime import datetime
|
||||
from typing import List, Dict, Any, Optional
|
||||
|
||||
import pytz
|
||||
from fastapi import APIRouter
|
||||
from fastapi import APIRouter, Query, Path
|
||||
from fastapi.responses import JSONResponse
|
||||
|
||||
from app.core import cache
|
||||
from app.services import crawler_factory
|
||||
@@ -13,8 +14,43 @@ from app.utils.logger import log
|
||||
router = APIRouter()
|
||||
|
||||
|
||||
@router.get("/")
|
||||
def get_hot_news(date: str = None, platform: str = None):
|
||||
@router.get(
|
||||
"/",
|
||||
summary="获取单个平台的热门新闻",
|
||||
description="从指定平台获取特定日期的热门新闻数据",
|
||||
response_description="返回包含新闻列表的 JSON 对象",
|
||||
responses={
|
||||
200: {"description": "成功获取新闻数据"},
|
||||
404: {"description": "平台不存在"}
|
||||
}
|
||||
)
|
||||
def get_hot_news(
|
||||
date: Optional[str] = Query(
|
||||
default=None,
|
||||
description="日期,格式为 YYYY-MM-DD,默认为当天(北京时间)",
|
||||
example="2024-01-15"
|
||||
),
|
||||
platform: Optional[str] = Query(
|
||||
default=None,
|
||||
description=f"平台代码,可选值:{', '.join(crawler_factory.keys())}",
|
||||
example="weibo"
|
||||
)
|
||||
):
|
||||
"""
|
||||
**获取指定平台的热门新闻**
|
||||
|
||||
根据指定的平台和日期获取热门新闻列表。数据来源于缓存,每 30 分钟更新一次。
|
||||
|
||||
**参数说明:**
|
||||
- `platform`: 必需,平台标识符(如:baidu, weibo, zhihu, github 等)
|
||||
- `date`: 可选,查询日期,默认当天
|
||||
|
||||
**支持的平台:**
|
||||
- 综合资讯:百度、微博、知乎、抖音等
|
||||
- 科技:GitHub、HackerNews、掘金、少数派等
|
||||
- 财经:雪球、东方财富等
|
||||
- 社区:贴吧、虎扑、豆瓣等
|
||||
"""
|
||||
if platform not in crawler_factory.keys():
|
||||
return {
|
||||
"status": "404",
|
||||
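The `date` parameter is documented as `YYYY-MM-DD`, defaulting to the current day in Beijing time. The hunks shown here do not include any format validation, so the following is only a sketch of how the default and a strict format check could be combined. `resolve_date` is a hypothetical helper, and the fixed UTC+8 offset is an assumption standing in for `pytz`.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: Beijing time modelled as a fixed UTC+8 offset.
BEIJING = timezone(timedelta(hours=8))


def resolve_date(date: Optional[str] = None) -> str:
    """Hypothetical helper: return a validated YYYY-MM-DD string."""
    if not date:
        # No date supplied: use the current day in Beijing time.
        return datetime.now(BEIJING).strftime("%Y-%m-%d")
    # strptime raises ValueError when the text is not a real calendar date
    # in this format, so callers can map that to an error response.
    parsed = datetime.strptime(date, "%Y-%m-%d")
    return parsed.strftime("%Y-%m-%d")


print(resolve_date("2024-01-15"))
```

Rejecting a malformed date early would avoid looking up a cache key that can never exist.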
@@ -41,16 +77,33 @@ def get_hot_news(date: str = None, platform: str = None):
|
||||
}
|
||||
|
||||
|
||||
@router.get("/all")
|
||||
def get_all_platforms_news(date: str = None):
|
||||
@router.get(
    "/all",
    summary="Get hot news for all platforms",
    description="Fetch hot news for every supported platform in a single call",
    response_description="Returns a JSON object containing the news of all platforms",
    responses={
        200: {"description": "News data for all platforms retrieved successfully"}
    }
)
def get_all_platforms_news(
    date: str = Query(
        default=None,
        description="Date in YYYY-MM-DD format; defaults to the current day",
        example="2024-01-15"
    )
):
    """
    Get hot news for all platforms
    **Get hot news for all platforms**

    Args:
    date: date in YYYY-MM-DD format; defaults to the current day
    Fetches hot news for every supported platform in a single call; suited to cases that need the full data set.

    Returns:
    A dict with the news of all platforms, keyed by platform name, with the news list as value
    **Response data:**
    Returns a dict keyed by platform name; each value is that platform's news list

    **Notes:**
    - The payload is large; request it only when needed
    - Some platforms may have no cached data and return an empty array
    """
|
||||
if not date:
|
||||
date = datetime.now(pytz.timezone('Asia/Shanghai')).strftime("%Y-%m-%d")
|
||||
@@ -76,17 +129,41 @@ def get_all_platforms_news(date: str = None):
|
||||
}
|
||||
|
||||
|
||||
@router.get("/multi")
|
||||
def get_multi_platforms_news(date: str = None, platforms: str = None):
|
||||
@router.get(
    "/multi",
    summary="Get hot news for multiple platforms",
    description="Fetch hot news for a chosen set of platforms in one call",
    response_description="Returns a JSON object containing the news of the requested platforms",
    responses={
        200: {"description": "Multi-platform news data retrieved successfully"},
        404: {"description": "Invalid platforms parameter"}
    }
)
def get_multi_platforms_news(
    date: str = Query(
        default=None,
        description="Date in YYYY-MM-DD format; defaults to the current day",
        example="2024-01-15"
    ),
    platforms: str = Query(
        default=None,
        description="Comma-separated platform list, e.g. weibo,baidu,zhihu",
        example="weibo,baidu,zhihu"
    )
):
    """
    Get hot news for multiple platforms
    **Get hot news for multiple platforms**

    Args:
    date: date in YYYY-MM-DD format; defaults to the current day
    platforms: comma-separated platform list, e.g. "weibo,baidu,zhihu"
    Fetches hot news for a chosen set of platforms in one call; more flexible than the `/all` endpoint.

    Returns:
    A dict with the news of the requested platforms, keyed by platform name, with the news list as value
    **Parameters:**
    - `platforms`: required; comma-separated platform list
    - `date`: optional; query date

    **Example:**
    ```
    /multi?platforms=weibo,baidu,zhihu&date=2024-01-15
    ```
    """
|
||||
if not date:
|
||||
date = datetime.now(pytz.timezone('Asia/Shanghai')).strftime("%Y-%m-%d")
|
||||
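The `platforms` parameter arrives as one comma-separated string such as `weibo,baidu,zhihu`. The handler body that parses it is not part of this hunk, so the following is a sketch of one way to split, trim, de-duplicate and validate the value. `parse_platforms` and `KNOWN_PLATFORMS` are hypothetical names; in the real code the allowed codes would presumably come from `crawler_factory.keys()`.

```python
from typing import Iterable, List, Optional

# Assumption: an illustrative subset standing in for crawler_factory.keys().
KNOWN_PLATFORMS = frozenset({"baidu", "weibo", "zhihu", "bilibili"})


def parse_platforms(raw: Optional[str],
                    known: Iterable[str] = KNOWN_PLATFORMS) -> List[str]:
    """Hypothetical helper: turn 'a, b,a' into an ordered list of valid codes."""
    if not raw:
        return []
    known_set = set(known)
    seen = set()
    result = []
    for code in raw.split(","):
        code = code.strip().lower()
        # Skip empty fragments, unknown codes and repeats.
        if code and code in known_set and code not in seen:
            seen.add(code)
            result.append(code)
    return result


print(parse_platforms("weibo, baidu,weibo,unknown,"))
```

An empty result is a natural trigger for the documented 404 "invalid platforms parameter" response.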
@@ -131,19 +208,62 @@ def get_multi_platforms_news(date: str = None, platforms: str = None):
|
||||
}
|
||||
|
||||
|
||||
@router.get("/search")
|
||||
def search_news(keyword: str, date: str = None, platforms: str = None, limit: int = 20):
|
||||
@router.get(
    "/search",
    summary="Search news",
    description="Search hot news across platforms by keyword",
    response_description="Returns a JSON object containing the search results",
    responses={
        200: {"description": "Search results retrieved successfully"},
        404: {"description": "No valid platform"}
    }
)
def search_news(
    keyword: str = Query(
        default=...,
        description="Search keyword",
        example="AI"
    ),
    date: str = Query(
        default=None,
        description="Date in YYYY-MM-DD format; defaults to the current day",
        example="2024-01-15"
    ),
    platforms: str = Query(
        default=None,
        description="Comma-separated platform list; defaults to all platforms",
        example="weibo,baidu,zhihu"
    ),
    limit: int = Query(
        default=20,
        ge=1,
        le=100,
        description="Maximum number of results, between 1 and 100",
        example=20
    )
):
    """
    Search news
    **Search news**

    Args:
    keyword: search keyword
    date: date in YYYY-MM-DD format; defaults to the current day
    platforms: comma-separated platform list, e.g. "weibo,baidu,zhihu"; defaults to all platforms
    limit: maximum number of results; defaults to 20
    Searches the news of the given platforms and date for content matching the keyword.

    Returns:
    A dict of search results with the status code, data, message, total result count and returned result count
    **Parameters:**
    - `keyword`: required; search keyword
    - `date`: optional; search date
    - `platforms`: optional; restricts the search to these platforms
    - `limit`: optional; number of results, default 20, maximum 100

    **Search logic:**
    1. Fetch news data from each platform
    2. Match the keyword against titles
    3. Group by platform and sort by rank
    4. Return at most the requested number of results

    **Response fields:**
    - `source`: source platform of the news item
    - `rank`: rank
    - `category`: category
    - `sub_category`: sub-category
    """
|
||||
if not date:
|
||||
date = datetime.now(pytz.timezone('Asia/Shanghai')).strftime("%Y-%m-%d")
|
||||
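The four search steps listed in the docstring (fetch, match titles, group and sort, truncate) can be sketched over plain dictionaries. This is not the commit's implementation. The `title` key, case-insensitive matching and deriving the rank from list position are all assumptions, and `search_titles` is a hypothetical name.

```python
from typing import Any, Dict, List


def search_titles(news_by_platform: Dict[str, List[Any]],
                  keyword: str,
                  limit: int = 20) -> List[Dict[str, Any]]:
    """Hypothetical sketch of the documented search steps."""
    needle = keyword.casefold()
    matches = []
    # Step 1: the caller has already fetched the per-platform news lists.
    for platform, items in news_by_platform.items():
        for position, item in enumerate(items, start=1):
            if not isinstance(item, dict):
                continue  # skip malformed cache entries
            title = str(item.get("title", ""))
            # Step 2: keyword match against the title (case-insensitive here).
            if needle in title.casefold():
                matches.append({"source": platform, "rank": position, "title": title})
    # Step 3: group by platform, then order by rank within each platform.
    matches.sort(key=lambda m: (m["source"], m["rank"]))
    # Step 4: return at most `limit` results.
    return matches[:limit]


sample = {
    "weibo": [{"title": "AI model released"}, "bad", {"title": "Weather"}],
    "baidu": [{"title": "Other"}, {"title": "Open AI news"}],
}
print(search_titles(sample, "ai"))
```

Sorting on the `(source, rank)` tuple keeps each platform's hits contiguous, which matches the "group by platform and sort by rank" wording without a separate grouping pass.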
@@ -184,7 +304,7 @@ def search_news(keyword: str, date: str = None, platforms: str = None, limit: in
|
||||
if not isinstance(item, dict):
|
||||
continue
|
||||
|
||||
# 处理rank字段
|
||||
# 处理 rank 字段
|
||||
rank_value = ""
|
||||
if "rank" in item and item["rank"]:
|
||||
rank_value = str(item["rank"]).replace("#", "")
|
||||
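The hunk above strips a leading `#` from the `rank` field and keeps it as a string. A small sketch of that normalisation, plus a numeric sort key built on top of it, is shown below. The function names are hypothetical, and the sort key is an addition for illustration rather than something the commit contains.

```python
from typing import Any, Dict, Union


def normalise_rank(item: Dict[str, Any]) -> str:
    """Mirror of the hunk: '#3' -> '3'; missing or falsy rank -> ''."""
    rank_value = ""
    if "rank" in item and item["rank"]:
        rank_value = str(item["rank"]).replace("#", "")
    return rank_value


def rank_sort_key(item: Dict[str, Any]) -> Union[int, float]:
    """Hypothetical helper: numeric ranks first, unranked items last."""
    text = normalise_rank(item).strip()
    return int(text) if text.isdecimal() else float("inf")


items = [{"rank": "#10"}, {"rank": 2}, {}]
print(sorted(items, key=rank_sort_key))
```

Note one consequence of the truthiness check: a rank of `0` is treated as missing, because `0` is falsy in Python.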
@@ -292,4 +412,3 @@ def _get_subcategory_for_platform(platform: str) -> str:
|
||||
"shaoshupai": "数码"
|
||||
}
|
||||
return subcategories.get(platform, "其他")
|
||||
|
||||
|
||||
@@ -9,15 +9,61 @@ from app.utils.logger import log
|
||||
|
||||
import requests
|
||||
from bs4 import BeautifulSoup
|
||||
from fastapi import APIRouter
|
||||
from fastapi import APIRouter, Query
|
||||
from fastapi.responses import JSONResponse
|
||||
|
||||
from app.core import cache
|
||||
|
||||
router = APIRouter()
|
||||
|
||||
|
||||
@router.get("/")
|
||||
def get_meta(url: str = None):
|
||||
@router.get(
    "/",
    summary="Get website metadata",
    description="Extract the metadata of a given web page, including title, description, keywords, Open Graph tags, Twitter Card tags and favicon",
    response_description="Returns a JSON object containing the website metadata",
    responses={
        200: {"description": "Website metadata retrieved successfully"},
        404: {"description": "URL parameter missing or page unreachable"}
    }
)
def get_meta(
    url: str = Query(
        default=...,
        description="URL of the web page whose metadata should be fetched",
        example="https://www.example.com"
    )
):
    """
    **Get website metadata**

    Extracts the metadata of a given web page. Standard meta tags, the Open Graph protocol and the Twitter Card protocol are supported.

    **Features:**
    - Detects and extracts the page title, description and keywords
    - Supports the Open Graph protocol (Facebook)
    - Supports the Twitter Card protocol
    - Locates the favicon URL automatically
    - Built-in caching, so the same URL is not fetched repeatedly

    **Extracted fields:**
    - `title`: page title
    - `description`: page description
    - `keywords`: page keywords
    - `author`: author information
    - `og:*`: Open Graph fields
    - `twitter:*`: Twitter Card fields
    - `favicon_url`: site icon URL

    **Caching:**
    - The first request actually fetches the page
    - Later requests are served from the Redis cache (TTL: 60 seconds)
    - The `cache` field in the response indicates whether the result came from the cache

    **Notes:**
    - Some sites have anti-scraping measures; cloudscraper is used to work around them
    - Dynamically rendered content may not be captured in full
    """
|
||||
if not url:
|
||||
return {
|
||||
"status": "404",
|
||||
|
||||
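The fields listed in the `get_meta` docstring (title, standard meta tags, `og:*`, `twitter:*`, favicon) can be pulled out of an HTML string with the standard library alone. The commit itself imports `BeautifulSoup`, so the sketch below deliberately swaps in `html.parser` to stay dependency-free. `MetaExtractor` and `extract_meta` are hypothetical names.

```python
from html.parser import HTMLParser
from typing import Dict, List


class MetaExtractor(HTMLParser):
    """Collects the title, meta name/property pairs and the favicon link."""

    def __init__(self) -> None:
        super().__init__()
        self.meta: Dict[str, str] = {}
        self._title_parts: List[str] = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attr = {key: (value or "") for key, value in attrs}
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Open Graph uses `property`; standard and Twitter tags use `name`.
            key = attr.get("property") or attr.get("name")
            if key and "content" in attr:
                self.meta.setdefault(key.lower(), attr["content"])
        elif tag == "link" and "icon" in attr.get("rel", "").lower().split():
            self.meta.setdefault("favicon_url", attr.get("href", ""))

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self._title_parts.append(data)


def extract_meta(html: str) -> Dict[str, str]:
    parser = MetaExtractor()
    parser.feed(html)
    parser.close()
    title = "".join(parser._title_parts).strip()
    if title:
        parser.meta["title"] = title
    return parser.meta


page = ('<html><head><title> Example </title>'
        '<meta name="description" content="Demo page">'
        '<meta property="og:title" content="OG Example"/>'
        '<link rel="shortcut icon" href="/favicon.ico"></head></html>')
print(extract_meta(page))
```

As the docstring warns, a parser like this only sees the served HTML, so tags injected by client-side JavaScript would be missed.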
63
app/main.py
@@ -56,7 +56,35 @@ app = FastAPI(
|
||||
title=app_config.title,
|
||||
description=app_config.description,
|
||||
version=app_config.version,
|
||||
lifespan=lifespan
|
||||
lifespan=lifespan,
|
||||
    # Swagger/OpenAPI configuration
    docs_url="/docs",  # Swagger UI path
    redoc_url="/redoc",  # ReDoc path
    openapi_url="/openapi.json",  # OpenAPI schema path
    openapi_tags=[
        {
            "name": "Daily News",
            "description": "Daily hot news data endpoints, covering 22+ platforms (Baidu, Weibo, Zhihu, GitHub, and others)"
        },
        {
            "name": "Website Meta",
            "description": "Website metadata extraction tool; returns the title, description, icon and similar information for a given URL"
        },
        {
            "name": "Analysis",
            "description": "News trend analysis and forecasting"
        },
        {
            "name": "Health",
            "description": "Health check endpoint"
        }
    ],
    openapi_extra={
        "externalDocs": {
            "description": "Project source repository",
            "url": "https://github.com/hot-news/hot_news-main"
        }
    }
|
||||
)
|
||||
|
||||
# Add CORS middleware
|
||||
@@ -83,8 +111,39 @@ app.include_router(web_tools.router, prefix="/api/v1/tools/website-meta", tags=[
|
||||
app.include_router(analysis.router, prefix="/api/v1/analysis", tags=["Analysis"])
|
||||
|
||||
# Health check endpoint
|
||||
@app.get("/health", tags=["Health"])
|
||||
@app.get(
    "/health",
    summary="Health check",
    description="Check whether the API service is running; returns the service status and version number",
    response_description="Returns service health information",
    responses={
        200: {
            "description": "Service is running normally",
            "content": {
                "application/json": {
                    "example": {
                        "status": "healthy",
                        "version": "1.0.0"
                    }
                }
            }
        }
    },
    tags=["Health"]
)
async def health_check():
    """
    **Health check endpoint**

    Monitors the running state of the service. Typical uses:
    - Load balancer health checks
    - Liveness probes on container orchestration platforms
    - Status checks from monitoring systems

    **Response fields:**
    - `status`: service status (healthy/unhealthy)
    - `version`: current service version
    """
|
||||
return {"status": "healthy", "version": app_config.version}
|
||||
|
||||
# When this file is run directly
|
||||
|
||||
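A load balancer or monitoring job consuming `/health` only needs to fetch the JSON body and compare the `status` field. The sketch below shows one way to do that with the standard library. `is_healthy` and `fetch_health` are hypothetical names, and the default URL assumes the service listens on `localhost:8000`, as the test scripts in this commit do. The fetch function is injectable so the check can be exercised without a running server.

```python
import json
import urllib.request
from typing import Callable


def fetch_health(url: str) -> bytes:
    """Fetch the raw response body; raises OSError subclasses on failure."""
    with urllib.request.urlopen(url, timeout=3) as response:
        return response.read()


def is_healthy(url: str = "http://localhost:8000/health",
               fetch: Callable[[str], bytes] = fetch_health) -> bool:
    """Hypothetical probe: True only for a JSON body with status == 'healthy'."""
    try:
        payload = json.loads(fetch(url))
    except (OSError, ValueError):
        # Connection errors (URLError is an OSError) or a non-JSON body.
        return False
    return isinstance(payload, dict) and payload.get("status") == "healthy"


# Exercise the probe with a stubbed response instead of a live server.
print(is_healthy(fetch=lambda url: b'{"status": "healthy", "version": "1.0.0"}'))
```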
100
test_connection.py
Normal file
@@ -0,0 +1,100 @@
|
||||
"""
|
||||
Test the database and cache connections and FastAPI application loading.
Usage: python test_connection.py
"""

import sys

print("=" * 60)
print("🔍 Hot News Main - system connection test")
print("=" * 60)

# Test 1: FastAPI application loading
print("\n📋 Test 1: FastAPI application loading...")
try:
    from app.main import app
    print("✅ FastAPI application loaded successfully!")
    print(f" - Application title: {app.title}")
    print(f" - Swagger UI: /docs")
    print(f" - ReDoc: /redoc")
except Exception as e:
    print(f"❌ FastAPI application failed to load: {e}")
    import traceback
    traceback.print_exc()
    sys.exit(1)

# Test 2: Redis connection
print("\n📋 Test 2: Redis connection...")
try:
    import redis
    r = redis.Redis(host='localhost', port=6379, db=0)
    r.ping()
    print("✅ Redis connection succeeded!")
    print(" - Host: localhost")
    print(" - Port: 6379")
except Exception as e:
    print(f"❌ Redis connection failed: {e}")
    print("\n💡 Possible fixes:")
    print(" Option 1 (Docker):")
    print(" docker run -d -p 6379:6379 --name redis redis:latest")
    print("\n Option 2 (WSL):")
    print(" wsl sudo service redis-server start")
    print("\n Option 3 (native Windows):")
    print(" Download and run redis-server.exe")

# Test 3: MySQL connection
print("\n📋 Test 3: MySQL connection...")
try:
    # Try to read the settings from the configuration
    from app.core.config import get_config
    config = get_config()

    import pymysql
    conn = pymysql.connect(
        host=config.database.host,
        user=config.database.user,
        password=config.database.password,
        database=config.database.db,
        charset=config.database.charset
    )
    print("✅ MySQL connection succeeded!")
    print(f" - Host: {config.database.host}")
    print(f" - Database: {config.database.db}")
    print(f" - User: {config.database.user}")
    conn.close()
except Exception as e:
    print(f"❌ MySQL connection failed: {e}")
    print("\n💡 Possible fixes:")
    print(" 1. Check that the MySQL service is running")
    print(" 2. Check the database settings in config/config.yaml")
    print(" 3. Create the database: CREATE DATABASE hot_news CHARACTER SET utf8mb4;")

# Test 4: OpenAPI schema generation
print("\n📋 Test 4: OpenAPI schema generation...")
try:
    schema = app.openapi()
    paths = list(schema['paths'].keys())
    print(f"✅ OpenAPI schema generated successfully!")
    print(f" - Number of API paths: {len(paths)}")
    print(f" - Sample paths:")
    for path in paths[:5]:
        print(f" • {path}")
    if len(paths) > 5:
        print(f" ... and {len(paths) - 5} more paths")
except Exception as e:
    print(f"❌ OpenAPI schema generation failed: {e}")
    import traceback
    traceback.print_exc()

print("\n" + "=" * 60)
print("📊 Test summary")
print("=" * 60)
print("""
✅ The FastAPI application has been fixed and loads correctly
⚠️ Redis and MySQL must be started manually (if they are not running)

🚀 Next steps:
1. If Redis/MySQL is not running, see TROUBLESHOOTING.md to start the services
2. Start the application: python run.py
3. Open the Swagger docs: http://localhost:8000/docs
""")
|
||||
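`test_connection.py` needs the `redis` and `pymysql` packages just to find out whether the services are up. A lighter pre-check is to test whether the TCP port accepts connections at all. The sketch below uses only the standard library. `port_is_open` is a hypothetical helper, and it only proves that something is listening, not that it is actually Redis or MySQL.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or name resolution failed.
        return False


# 6379 is the Redis port used by the script above.
print("Redis port reachable:", port_is_open("localhost", 6379))
```

The same call with MySQL's default port 3306 (an assumption, since the script reads its database settings from `config/config.yaml`) would give a quick signal before attempting a full login.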
34
test_openapi.py
Normal file
@@ -0,0 +1,34 @@
|
||||
"""
|
||||
Test OpenAPI schema generation.
Usage: python test_openapi.py
"""

try:
    from app.main import app
    import json

    # Generate the OpenAPI schema
    schema = app.openapi()

    print("✅ OpenAPI schema generated successfully!")
    print(f"\n📊 Number of API paths: {len(schema['paths'])}")
    print(f"\n📝 Available API paths:")

    for path in sorted(schema['paths'].keys()):
        methods = list(schema['paths'][path].keys())
        print(f" - {path} [{', '.join(methods)}]")

    # Save the schema to a file
    with open('openapi_schema.json', 'w', encoding='utf-8') as f:
        json.dump(schema, f, ensure_ascii=False, indent=2)

    print(f"\n💾 Schema saved to: openapi_schema.json")
    print(f"\n🌐 After starting the service, open:")
    print(f" - Swagger UI: http://localhost:8000/docs")
    print(f" - ReDoc: http://localhost:8000/redoc")
    print(f" - OpenAPI JSON: http://localhost:8000/openapi.json")

except Exception as e:
    print(f"❌ Error: {e}")
    import traceback
    traceback.print_exc()
|
||||
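`test_openapi.py` treats every key under a path as an HTTP method. In an OpenAPI document a path item may also carry non-method keys such as `parameters` or `summary`, so a slightly stricter summary filters on the known method names. The sketch below works on a plain dictionary, so it can be tried without importing the app. `summarise_paths` is a hypothetical helper.

```python
from typing import Any, Dict, List

# HTTP method keys allowed in an OpenAPI path item object.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def summarise_paths(schema: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each path to its sorted, upper-cased HTTP methods."""
    summary: Dict[str, List[str]] = {}
    for path, operations in sorted(schema.get("paths", {}).items()):
        summary[path] = sorted(m.upper() for m in operations if m in HTTP_METHODS)
    return summary


example_schema = {
    "paths": {
        "/health": {"get": {}},
        "/api/v1/dailynews/": {"get": {}, "parameters": []},
    }
}
for path, methods in summarise_paths(example_schema).items():
    print(f"{path} [{', '.join(methods)}]")
```

With the real application, `summarise_paths(app.openapi())` would be the equivalent call.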