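– Andrew Ng (吴恩达): Chief Scientist at Baidu; head of "Baidu Brain" and "Google Brain"; tenured professor in the Computer Science and Electrical Engineering departments at Stanford University; director of the Artificial Intelligence Laboratory; co-founder of Coursera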

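– Xu Wei (徐伟): Distinguished Scientist at Baidu IDL; formerly head of Facebook's large-scale recommendation platform and senior researcher at NEC Labs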

– Han Xu (韩旭): Professor at the University of Missouri; Principal Scientist at Baidu's Silicon Valley AI Lab













【Andrew Ng】Regarding the second question: yes, there's been a lot of hype about Deep Learning. I think it is creating tremendous value today, letting us turn the huge amounts of data we have into huge amounts of value. I'm also confident that deep learning will keep creating a lot of value in the next few years; we still have far too many ideas and too few people to work on them. But we're also very far from "human level intelligence," and do not yet see any clear path to get there. I think some of the hype has been a bit irresponsible. AI will have a large impact on society as a whole, so I think it's important that all of us have a clear understanding of what's coming, but also what is not, so that we can plan appropriately.





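【Xu Wei】So far, one fairly successful example of unsupervised learning is word embedding, although many people do not consider word embedding to be deep learning. Word embedding also differs from traditional unsupervised learning: it actually uses context as supervision. I think that for unsupervised learning to succeed, it will in practice rely more on this kind of weak supervision.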
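To illustrate what "using context as supervision" can mean, here is a minimal sketch (not code from the speakers) of how skip-gram style training pairs can be derived from raw text without manual labels. The function name `skipgram_pairs`, the `window` parameter, and the sample sentence are hypothetical choices made for this example.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) training pairs from a list of tokens.

    Each word's neighbors within `window` positions act as its "labels",
    so the supervision signal comes from the text itself.
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


# Made-up sentence, used only to show the shape of the output.
sentence = "deep learning turns data into value".split()
pairs = skipgram_pairs(sentence, window=1)
print(pairs[:3])
```

A word embedding model would then be trained to predict the second element of each pair from the first; that training step is omitted here.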

【Andrew Ng】I agree with Xu Wei. Despite all the value created by Deep Learning, most of it currently comes from supervised learning, meaning learning relatively simple A->B mappings. For example, perhaps A is an email, and B is whether or not it is spam. That's a spam filter. Or perhaps A is an image, and B is an object label. That's object recognition. With a lot of labeled data (i.e., (A, B) pairs) and a big enough network, you can prove that a deep learning algorithm can learn almost any function to a very high level of accuracy. One of the most exciting recent breakthroughs is that Deep Learning algorithms can now learn A->B mappings where B isn't just 0/1 or an integer (like the examples above), but can be something very complex, like a sentence.
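The spam-filter case of an A->B mapping can be sketched in a few lines. This is an illustration only: a single-layer logistic regression stands in for the large neural network discussed above, and the four training emails, the learning rate, and all function names are made up for the example.

```python
import math

# Toy (A, B) pairs: A is an email text, B is 1 for spam and 0 otherwise.
# The data is invented purely for illustration.
data = [
    ("win money now", 1),
    ("cheap money offer", 1),
    ("meeting agenda for monday", 0),
    ("lunch on monday", 0),
]

vocab = sorted({word for text, _ in data for word in text.split()})


def featurize(text):
    """Map an email (A) to a bag-of-words count vector over `vocab`."""
    words = text.split()
    return [words.count(word) for word in vocab]


def predict(weights, bias, x):
    """Return the estimated probability that B = 1 (spam)."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


weights, bias = [0.0] * len(vocab), 0.0
learning_rate = 0.5
for _ in range(200):  # plain stochastic gradient descent on the log loss
    for text, label in data:
        x = featurize(text)
        err = predict(weights, bias, x) - label
        weights = [w - learning_rate * err * xi for w, xi in zip(weights, x)]
        bias -= learning_rate * err

p_spam = predict(weights, bias, featurize("money offer now"))
p_ham = predict(weights, bias, featurize("monday meeting"))
print(p_spam > 0.5, p_ham < 0.5)
```

The same pattern (featurize A, predict B, adjust parameters to reduce the error on labeled pairs) carries over to larger models; only the function being fit becomes more expressive.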



【Andrew Ng】For example, if A is an image, and B is a caption, then that's image captioning. (The first paper to do this was by Xu Wei and his colleagues. :-)) Or if A is an English sentence, and B is a French sentence, then that's machine translation, which was done by Ilya Sutskever and others. Or if A is an (Image, Question) pair and B is an answer, that's Image Question Answering (also by Xu Wei!). Supervised learning has been very successful for both translation and speech recognition. Our most successful approach at Baidu on speech recognition has been to use a very large neural network, and to learn an A->B mapping directly where A is an audio clip and B is the text transcript. Tony, who is next to me, led the team working on the Mandarin version of this, and we believe this is now the world's best Mandarin speech recognition system.





【Andrew Ng】I think speech recognition will move toward end-to-end learning. We are finding that the dataset size is one of the biggest drivers of performance. I find some of the recent work on attention models also promising. We were also very heavily influenced by Alex Graves's work on CTC.


Two challenges remain. The first is transcribing long utterances: we surpass human-level performance for short phrases taken out of context, but we are still much worse than human-level performance when there's more context, such as in long conversations. The second major challenge is understanding the content of the text, rather than only transcribing it.


But I'm excited about building a speech-enabled world. Just as (thanks to Steve Jobs) the smartphone touchscreen fundamentally changed how we interact with computers, I think that speech will also fundamentally transform how we interact with computers in the next few years.
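For readers unfamiliar with the CTC (connectionist temporal classification) work mentioned above: a CTC-trained network emits one label per audio frame, including a special "blank" label, and the simplest decoding collapses repeated labels and then removes blanks. The sketch below shows only that collapsing rule. It is not Baidu's system, and the `_` blank symbol, the function name, and the frame sequence are assumptions made for illustration.

```python
from itertools import groupby

BLANK = "_"  # hypothetical symbol standing in for the CTC blank label


def ctc_greedy_decode(frame_labels):
    """Collapse runs of repeated labels, then drop blanks.

    `frame_labels` is the most likely label for each audio frame.
    A blank between two identical labels keeps them as separate letters.
    """
    collapsed = [label for label, _ in groupby(frame_labels)]
    return "".join(label for label in collapsed if label != BLANK)


# Made-up per-frame outputs for a short utterance.
frames = list("hh_e_ll_llo__")
text = ctc_greedy_decode(frames)
print(text)
```

This mapping from many frames to a shorter transcript is what lets a network be trained end to end on (audio clip, transcript) pairs without frame-level alignments.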




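【Xu Wei】For problems like ImageNet, the results of current deep residual networks are already very good. We are also seeing very deep models work well on some NLP tasks. For video content, computing power is still a major bottleneck: even with only millions of training examples, training takes several days.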







【Andrew Ng】I'm fortunate to have gotten to work on several different cars before. The Baidu one is the 4th car I've helped build. But this is the first time in my life that I've felt we see a clear path to making autonomous vehicles a reality. Just for fun, here are pictures of some other cars I have worked on!


【Andrew Ng】Here in China, 500 people a day die in car accidents. If we can make autonomous driving a reality just one day sooner, that's an extra 500 people whose lives we save. This will be one of the most important applications of AI in this decade. Why do I think we now have a clear shot at making this a reality? We now have very sophisticated deep learning algorithms that are performing far better than ever before. We were fortunate that Lin Yuanqing joined Baidu a few months ago and is now leading a big part of this effort. But in addition to that, we have also developed a unique strategy that is different from that of most others working on autonomous driving. We call this strategy TRAIN TERRAIN (铁轨战略).



【Andrew Ng】We hope to have commercial autonomous driving services by 2018, and be in mass production by 2020. Here are the key elements of the TRAIN TERRAIN strategy:


1. Don't try to roll out autonomous vehicles everywhere all at once. Instead, start from a small region (such as a shuttle route or small city), and grow from there.


2. Realize that computer-driven cars are not the same thing as human-driven cars. They behave differently: they never drive drunk, but they don't understand a policeman's hand gestures. Make sure people in the "autonomy enabled" regions have realistic expectations.


3. Design autonomous cars to be clearly recognizable, so that people can immediately spot them for what they are.


4. Make the behavior of the autonomous cars highly predictable. Predictability, even more than cleverness, leads to safety.


5. Implement modest infrastructure changes in the autonomy enabled regions to make sure the cars understand what they need to do. For example, give emergency workers a clear way (such as a wireless beacon) to communicate with the car. Make sure the roads are well maintained, with clearly painted lines. And so on. With these changes, I think we can safely put autonomous cars on the roads soon.


6. We are rapidly growing our teams in both the Beijing and US (Silicon Valley) offices. Thanks to our unique strategy, we've been thrilled by the number of people applying to join us to work on this grand mission of saving 500 lives per day!


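【Lei Ming】Autonomous driving, a rather sci-fi concept, will be all over the streets in just a few years. The pace of technological progress is truly remarkable. Let's talk about robots next: Google now wants to sell Boston Dynamics, which is quite a shock!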

【Andrew Ng】Yes, I agree! I want to say something to all the young people reading this. I think we're at a unique moment in history where AI can really change the world. If you know how to use or apply AI, you can be in a position where the decisions you make today will really change how the world is in 10 years. There will be thousands or millions of people who would have lost their lives but for your efforts. Or you can transform entire industries, and help countless people. That's why I'm really excited about AI. If you are young and considering what career path to pursue, I hope that you will consider learning about AI, and joining the AI research and development community at Baidu or elsewhere, so that we can all work together to make the world a better place!



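【Andrew Ng】I'm glad to hear that so many people are interested in machine learning. The problem right now is that there are too many ideas and opportunities in machine learning, but too few people who can work on them. The world needs more AI people!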



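【Xu Wei】Robots built for very specific functions should still see rapid development and adoption. But human-like robots are still fairly far off. The self-driving car is probably the most important kind of special-purpose robot. In the next few years, it will still be hard for home robots to really do useful work for people.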


【Andrew Ng】There's a lot of exciting work in robotic applications right now that focuses on specific narrow/vertical applications. Other than autonomous cars, I see exciting work in precision agriculture, automated power plant inspection, automated picking (for ecommerce fulfillment), automated security robots, and so on. Most of these robots have hardware and software designed to carry out that particular task, and so do not look like humanoid robots. I think truly general-purpose robots (other than robot arms in factory automation) are still a little further away.


【Lei Ming】What do you think about Amazon Echo? It's pretty popular in the US now.


【Andrew Ng】I have an Amazon Echo in my home. I think it is a nice start to home automation using voice commands, but it is still the very early days of a new industry. I've been impressed by Amazon's work, but it's still too early to say whether this will be the right design in the long term.


【Andrew Ng】But I do think that in the future, we should be able to talk to all the devices in our homes and have them understand and respond to us. I hope that a few decades from now, I will have grandchildren who are mystified as to how, back in 2016, if you were to go home and say something to your microwave oven, it would just sit there and rudely ignore you!




【Andrew Ng】AI is changing so rapidly that I think all of us who work in AI have to keep on learning. Once again, I want to say something to the young people reading this. Every Saturday, you will have a choice: you can either watch TV, or you can study. If you study, it turns out that there will be almost no short-term reward. The following Monday, you won't be that much better at your job, and your boss almost certainly won't know you spent all day studying, nor tell you "good job." So, you have almost nothing to show for your day of hard work. But here's the secret: if you study hard not just for one weekend, but do so weekend after weekend… for a year… then you will become great at it. I think studying has almost no short-term rewards. But the long-term rewards are huge!



【Andrew Ng】So one of the challenges, which I hope many readers will rise to, is to keep ourselves motivated and to keep learning and studying, week after week, year after year.


【Xu Wei】Speaking of reward, a reward that is delayed for a long time is also something that deep reinforcement learning currently finds hard to handle.
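One simple way to see part of the difficulty with long-delayed rewards is the discounted return used in many reinforcement learning formulations: a reward arriving T steps in the future contributes only gamma^T to the return at the start. The sketch below is an illustration under assumed values (the 52-step episode, the single final reward, and `gamma=0.9` are all made up), and it shows only this discounting effect, not the full credit-assignment problem.

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * G_{t+1} for every step, working backward."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Hypothetical episode: 51 steps with no reward, then one payoff at the end.
rewards = [0.0] * 51 + [1.0]
returns = discounted_returns(rewards, gamma=0.9)

print(returns[-1])  # return at the step where the reward arrives
print(returns[0])   # return at the first step, equal to 0.9 ** 51
```

With these assumed numbers, the signal seen at the first step is well under one percent of the final reward, which gives a rough sense of why early actions are hard to credit.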



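1. How heavily do end-to-end methods depend on data? How can machine learning get better at learning from and processing data, rather than requiring all data to be manually labeled before it can be used? If this problem cannot be solved, how can AI truly evolve?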


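2. Deep learning applied to natural language processing still does not seem to work very well. One argument goes: images are what you see directly, whereas language is something humans have already abstracted. In particular, when deep learning is applied to Tieba forum posts or to review data on e-commerce sites, the results seem to be worse than hand-crafted rules and traditional models. What do you think?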



【Andrew Ng】Thank you for your question. Yes, Deep Learning is the best algorithm for a lot of the most important tasks that we use in production systems, including search, recommendations, and others. The general pattern is as follows: if you have a relatively small training set, then the performance depends more on your skill at hand-engineering features, and deep learning won't have a significant advantage over SVMs, Boosting, or Decision Trees. But in the regime of big data, where you have a massive labeled training set, supervised deep learning is more likely to do well. This is partially because deep learning algorithms are very "high capacity" (say high VC dimension, if you know what that is). This lets them exploit very large datasets better than most other algorithms. They are also more scalable than, say, an SVM with a non-linear kernel. This lets us build the systems needed to train them on huge datasets. To help visualize all this, here is a cartoon plot that explains how I think of the overall trends of the performance of DL vs. more traditional algorithms.



【Xu Wei】This is actually one of deep learning's strengths: it can take high-dimensional sparse data directly as input, and what it learns is something similar to an embedding.

Thus, my question is whether it is possible to apply some methods in NLP to deal with biology problems, and which ones might be most promising?

【Xu Wei】The most widely used models for NLP are recurrent models. Perhaps they are already used for biology problems.
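Xu Wei's remark that a network fed high-dimensional sparse input learns something like an embedding can be made concrete: multiplying a one-hot vector by the first layer's weight matrix simply selects one row of that matrix, and that row plays the role of an embedding. The sketch below is illustrative only; the four-word vocabulary, the weight values, and the function names are hypothetical.

```python
# Hypothetical tiny vocabulary and first-layer weights, for illustration only.
vocab = {"cat": 0, "dog": 1, "fish": 2, "bird": 3}
W = [
    [0.1, 0.2],
    [0.3, 0.4],
    [0.5, 0.6],
    [0.7, 0.8],
]  # shape: vocabulary size x embedding size


def one_hot(index, size):
    """Sparse input: all zeros except a single 1 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def dense_layer(x, weights):
    """Full matrix product x @ W, touching every row of W."""
    cols = len(weights[0])
    return [sum(x[i] * weights[i][j] for i in range(len(x))) for j in range(cols)]


def embedding_lookup(index, weights):
    """Equivalent shortcut: read the one row that the sparse input selects."""
    return list(weights[index])


idx = vocab["fish"]
same = dense_layer(one_hot(idx, len(vocab)), W) == embedding_lookup(idx, W)
print(same)
```

Because the two computations agree, training the first layer on sparse inputs amounts to learning one vector per input feature.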


5. Do you know of any work being done where AI actually helps extend and train human intelligence? Take the example of Google's AlphaGo. Would it be nice if professional Go players could benefit from AlphaGo's reasoning about a game?


【Andrew Ng】We saw this happen after Garry Kasparov lost to Deep Blue. Human chess players are now far better through learning from and also partnering with computer chess players. I've heard of this starting in Go as well, but that feels like it's at an earlier stage. But more generally, I see a lot of opportunities for computers to supplement the human brain. I am especially excited about online education. I think MOOCs like Coursera and open.163.com have been a great start. I hope that online education becomes more adaptive and flexible over time, and that computers can really help customize our learning experiences, the way a personal tutor might.


6. For medical imaging, it's very difficult to collect large-scale, accurate, well-labeled data. How can we get better performance?


【Andrew Ng】There's a lot of low-hanging fruit today in deep learning on problems with a lot of data. If you don't have a lot of data, in the short term you might end up having to rely on more traditional engineering methods (including careful feature design). But looking slightly further out, I'm excited about other forms of learning, including transfer learning, semi-supervised learning and unsupervised learning (and quite possibly ones we haven't imagined yet), that would help us do well even on small amounts of data. There's a lot of active research on these topics at Baidu and elsewhere. I don't think any of us feel like we have the right algorithms yet, but I'm seeing a lot of progress each year.


【Xu Wei】Humans have an amazing ability to learn from a small amount of data, partly thanks to their modeling capability and partly thanks to their ability to learn from other humans. Current deep learning still lacks these abilities.


7. Will deep learning networks evolve to develop logical thinking? Or is logical thinking so different in nature from deep learning methods that we need a different method to complement deep learning networks?


【Xu Wei】Right now, there is no good way to evolve a large deep learning model (there's work on evolving small models). So whether we will have deep learning models capable of handling logical reasoning will depend on the new models designed by researchers. But I do believe this can be achieved with deep learning models, as evidenced by the rapid progress in NLP using deep learning.



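【Andrew Ng】Thank you all for your enthusiasm, and for still chatting with Xu Wei and me at such a late hour. We hope to have more opportunities in the future to exchange ideas with China's AI community, and we also hope to have the chance to support the development of AI in China!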


