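The previous post showed how to build an interactive music player on top of the Baidu Music API. This time we add a news-reading feature, which is not hard to implement. If you have some experience with web crawlers, you can extend the same idea to many other features: pick the information you care about and have the smart speaker read it out to you.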
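The basic logic is simple: write a crawler function that scrapes the headlines from a Baidu News page, pass the headline text to Baidu speech synthesis to generate audio, and then call the player to play it. Here is the code: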
# -*- coding: utf-8 -*-
import os
import re

try:
    from urllib2 import urlopen          # Python 2
except ImportError:
    from urllib.request import urlopen   # Python 3

from voice import Voice
import voiceAPI

voice = Voice()
baiduAPI = voiceAPI.BaiDuAPI()
baiduAPI.getToken()


def dialogue(text):
    # Synthesize the text with Baidu TTS and play it;
    # if synthesis fails, play a local error clip instead.
    try:
        url = baiduAPI.voiceSynthesis(text)
    except Exception:
        url1 = "audio/cnnerror.mp3"
        os.system('mpg123 "%s"' % url1)
        return
    voice.playVoice(url)


def broadcast_news(url, div_id="instant-news"):
    # Fetch one Baidu News channel page and read out every headline
    # found in the block whose id starts with div_id.
    response = urlopen(url)
    html = response.read().decode('utf-8')
    pattern_of_instant_news = re.compile('<div id="%s.*?</div>' % div_id, re.S)
    blocks = re.findall(pattern_of_instant_news, html)
    if not blocks:
        # Block not found (the page layout may have changed); nothing to read.
        return
    pattern_of_news = re.compile('<li><a.*?>(.*?)</a></li>', re.S)
    news_list = re.findall(pattern_of_news, blocks[0])
    for news in news_list:
        # print(news)
        dialogue(news)  # read the headline aloud


def getNews_guonei():  # domestic news
    broadcast_news("http://news.baidu.com/guonei")


def getNews_guoji():  # international news
    broadcast_news("http://news.baidu.com/guoji")


def getNews_mil():  # military news
    broadcast_news("http://news.baidu.com/mil")


def getNews_caijing():  # finance news
    broadcast_news("http://news.baidu.com/finance")


def getNews_yule():  # entertainment news
    broadcast_news("http://news.baidu.com/ent")


def getNews_tech():  # technology news
    broadcast_news("http://news.baidu.com/tech")


def getNews_net():  # internet news (this page uses a different block id)
    broadcast_news("http://news.baidu.com/internet", div_id="internet_news")
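To see what the two regular expressions in the news code do without touching the network, here is a minimal sketch that runs them against a small hand-written page. The sample HTML and the helper name `extract_headlines` are assumptions made for illustration; the real Baidu News markup may differ and can change at any time.

```python
import re

# Hypothetical sample imitating the structure the regexes expect.
SAMPLE_HTML = """
<div id="header">Baidu News</div>
<div id="instant-news" class="column">
<ul>
<li><a href="http://example.com/1" target="_blank">Headline one</a></li>
<li><a href="http://example.com/2" target="_blank">Headline two</a></li>
</ul>
</div>
<div id="footer"><ul><li><a href="#">About</a></li></ul></div>
"""


def extract_headlines(html, div_id="instant-news"):
    """Return the link texts inside the first <div> whose id starts with div_id."""
    # Stage 1: cut out the block, from the opening <div id="..."> to the first </div>.
    block_pattern = re.compile('<div id="%s.*?</div>' % re.escape(div_id), re.S)
    blocks = block_pattern.findall(html)
    if not blocks:
        return []
    # Stage 2: capture the text of every <li><a ...>text</a></li> in that block.
    item_pattern = re.compile('<li><a.*?>(.*?)</a></li>', re.S)
    return [title.strip() for title in item_pattern.findall(blocks[0])]


print(extract_headlines(SAMPLE_HTML))
```

Because the first pattern stops at the first `</div>`, a block that contains nested `<div>` elements would be cut short; an HTML parser is more robust if that becomes a problem.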
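All that remains is to call the news functions from the main program. While we are at it, here is a joke crawler as well. How to get the smart speaker to read the jokes aloud is left as an exercise; following the same approach as the news broadcast, it should be easy to do.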
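As for calling the news functions from the main program, one possible wiring is a keyword lookup on the recognized speech text. This is only a sketch: `build_dispatcher`, the trigger phrases, and the stand-in handlers are all hypothetical, and how the recognized text reaches this code depends on the main program from the earlier posts.

```python
def build_dispatcher(handlers):
    """Return dispatch(text): run the first handler whose keyword appears in text."""
    def dispatch(command_text):
        for keyword, handler in handlers:
            if keyword in command_text:
                handler()
                return keyword
        return None  # no keyword matched; the caller decides what to do
    return dispatch


# Stand-ins so the sketch runs on its own. In the real program the handlers
# would be getNews_guonei, getNews_tech and the other news functions.
def fake_guonei():
    print("reading domestic news")


def fake_tech():
    print("reading tech news")


# Hypothetical trigger phrases; use whatever your speech recognition returns.
dispatch = build_dispatcher([
    ("国内新闻", fake_guonei),   # "domestic news"
    ("科技新闻", fake_tech),     # "tech news"
])

dispatch("播放科技新闻")  # "play tech news": calls fake_tech
```

Keeping the phrase-to-function table in one list means adding a new channel is a one-line change.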
Qiushibaike joke crawler listing:
# -*- coding: utf-8 -*-
import time

import requests
from bs4 import BeautifulSoup

# Pretend to be a browser
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
headers = {'User-Agent': user_agent}


def get_url(start, end):
    # Build the URL of every page from start to end (inclusive) and crawl it
    for num in range(start, end + 1):
        url = 'http://www.qiushibaike.com/text/page/' + str(num) + '/'
        get(url)
        time.sleep(3)  # pause between requests


def get(url):
    # Print the jokes with more than 1000 "funny" votes and more than 10 comments
    wb = requests.get(url, headers=headers)
    soup = BeautifulSoup(wb.text, 'lxml')
    contents = soup.select('a > div[class="content"] > span')
    marks = soup.select('div > span > i')
    comments = soup.select('div > span > a > i')
    for content, mark, comment in zip(contents, marks, comments):
        if int(mark.get_text()) > 1000 and int(comment.get_text()) > 10:
            print(content.get_text())
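The joke crawler only prints what it finds. One way to have the jokes read aloud instead is to collect the filtered texts and hand each one to a speaking function. The sketch below assumes the scraped data has already been reduced to `(text, votes, comments)` tuples; `pick_jokes`, `speak_all`, and the sample data are hypothetical, and `speak` stands in for the `dialogue` function from the news code.

```python
def pick_jokes(items, min_votes=1000, min_comments=10):
    """Keep the text of items whose vote and comment counts exceed the thresholds."""
    return [text.strip() for text, votes, comments in items
            if votes > min_votes and comments > min_comments]


def speak_all(texts, speak):
    """Pass each text to speak() in order and return how many were spoken."""
    for text in texts:
        speak(text)
    return len(texts)


# Made-up sample data: (text, funny votes, comment count).
sample = [
    (" First joke ", 2500, 40),
    ("Second joke", 800, 50),    # too few votes
    ("Third joke", 3000, 5),     # too few comments
]

# print stands in for dialogue here so the sketch runs without the speaker.
speak_all(pick_jokes(sample), print)
```

Passing the speaking function as a parameter keeps the filter testable on a desktop machine, with `dialogue` swapped in on the Raspberry Pi.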
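That wraps up the news broadcast feature; nothing beats the satisfaction of going home and building it yourself. Voice is not the only way to wake the smart speaker. You can also trigger it from a timer or a sensor: for example, add a timer that automatically plays your favorite program at a set time, or add an infrared sensor that starts a conversation when someone approaches. On the Raspberry Pi you can customize the interaction however you like; the only limit is your imagination.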
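The timer idea can be sketched with nothing but the standard library: compute how long to wait until the next occurrence of a given time, sleep, then call the broadcast function. The names `seconds_until` and `run_daily` are hypothetical, and the `sleep` and `clock` parameters exist only so the logic can be exercised without actually waiting.

```python
import datetime
import time


def seconds_until(now, hour, minute):
    """Seconds from `now` to the next occurrence of hour:minute."""
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        # That time has already passed today, so aim for tomorrow.
        target += datetime.timedelta(days=1)
    return (target - now).total_seconds()


def run_daily(hour, minute, action, repeats=1,
              sleep=time.sleep, clock=datetime.datetime.now):
    """Wait until hour:minute, call action(), and do this `repeats` times."""
    for _ in range(repeats):
        sleep(seconds_until(clock(), hour, minute))
        action()


# Example with a fixed timestamp: 30 minutes before 07:00.
print(seconds_until(datetime.datetime(2024, 1, 1, 6, 30), 7, 0))
```

On the speaker, a call such as `run_daily(7, 0, getNews_guonei)` in a background thread would read the domestic news once at the next 07:00, assuming `getNews_guonei` is imported from the news code.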