EPISODE 139: Marketer of the Month Podcast with Martin Riedmiller

Hey there! Welcome to the Marketer Of The Month blog!

We recently interviewed Martin Riedmiller for our monthly podcast, ‘Marketer of the Month’! We had some amazingly insightful conversations with Martin, and here’s what we discussed:

1. Reinforcement Learning Core: How AI learns through trial-and-error with feedback.

2. DeepMind’s Notable Projects: Atari project and teaching robots complex behaviors.

3. AI Challenges: Scaling software and team coordination hurdles persist.

4. Applications of Reinforcement Learning: Promising in controlling systems autonomously.

5. Ethical Considerations in AI: Precise goal-setting vital to prevent unintended outcomes.

6. Future AI Trends: Exploring human intelligence, merging language models with controllers.

About our host:

Dr. Saksham Sharda is the Chief Information Officer at Outgrow.co. He specializes in data collection, analysis, filtering, and transfer by means of widgets and applets. Interactive, cultural, and trending widgets designed by him have been featured on TrendHunter, Alibaba, ProductHunt, New York Marketing Association, FactoryBerlin, Digimarcon Silicon Valley, and at The European Affiliate Summit.

About our guest:

Martin Riedmiller is a Research Scientist and Team Lead of the Controls Team at Google DeepMind. His profound interest lies in intelligent machines capable of autonomous learning from scratch, focusing on neural networks as mathematical brain models. Throughout his journey, he has consistently pushed the boundaries of machine learning, from pioneering neural forecasting systems for self-driving cars to groundbreaking reinforcement learning algorithms.

How AI Learns to Learn: Google DeepMind’s Control Team Lead Martin Riedmiller on AI Ethics

The Intro!

Saksham Sharda: Hi, everyone. Welcome to another episode of Outgrow’s Marketer of the Month. I’m your host, Dr. Saksham Sharda, and I’m the creative director at Outgrow.co. This month, we are going to interview Martin Riedmiller, who is a Research Scientist and Team Lead of the Controls Team at Google DeepMind.

Martin Riedmiller: Great to be here. Thank you.

The Rapid Fire Round!

Saksham Sharda: So let’s start with the rapid-fire round. We’re recording. The first one is, at what age do you want to retire?

Martin Riedmiller: 65.

Saksham Sharda: Okay. How long does it take you to get ready in the mornings?

Martin Riedmiller: 10 minutes.

Saksham Sharda: Most embarrassing moment of your life?

Martin Riedmiller: When we received our first child.

Saksham Sharda: Okay. You can also pass the question if you do not want to answer.

Saksham Sharda: Favorite color?

Martin Riedmiller: Blue.

Saksham Sharda: What time of day are you most inspired?

Martin Riedmiller: Usually in the mornings.

Saksham Sharda: How many hours of sleep can you survive on?

Martin Riedmiller: Eight.

Saksham Sharda: Fill in the blank. An upcoming technology trend is ____.

Martin Riedmiller: AI in robots.

Saksham Sharda: The city in which the best kiss of your life happened?

Martin Riedmiller: Pass.

Saksham Sharda: Pick one. Mark Zuckerberg or Elon Musk.

Martin Riedmiller: Elon Musk.

Saksham Sharda: The biggest mistake of your career?

Martin Riedmiller: I made many of them, but I don’t regret them.

Saksham Sharda: How do you relax?

Martin Riedmiller: By doing sports.

Saksham Sharda: How many cups of coffee do you drink per day?

Martin Riedmiller: About four.

Saksham Sharda: A habit of yours that you hate?

Martin Riedmiller: I’m very often very nervous.

Saksham Sharda: The most valuable skill you’ve learned in life?

Martin Riedmiller: To be honest.

Saksham Sharda: Your favorite Netflix show?

Martin Riedmiller: We like Suits a lot.

The Big Questions!

Saksham Sharda: Well, that was the end of the rapid-fire round. Moving on to the longer questions, which you can answer with as much ease and time as you like. The first one is, can you share a bit about your background and how you got started in the field of artificial intelligence and reinforcement learning?

Martin Riedmiller: Yeah. I was always fascinated by machines doing things autonomously. During my studies, I looked into the field of robotics, but towards the end of my studies, I came across two favorite books, Parallel Distributed Processing by Rumelhart and McClelland, that explained all about neural networks. It was fascinating from the first moment, and with that, I got into machine learning and learning in neural networks. I did my master’s thesis in that field, on a fast supervised learning algorithm. Then, very early in my career, I moved to the even more fascinating field of reinforcement learning, which was the subject of my Ph.D. thesis. Later, I did my postdoc on reinforcement learning and multi-agent systems. When I became a professor, I looked more into data-efficient reinforcement learning to make it even more practical, and I was doing this more with my team. And later in my career, I was approached by DeepMind, where I could finally also do reinforcement learning at scale.

Saksham Sharda: So what would you say are the pivotal moments in your journey that shaped your expertise in this field?

Martin Riedmiller: I think it was very early in my career, when we did our first experiments and saw the connection between reinforcement learning and dynamic programming. There is a very nice optimization paradigm behind that, and it is grounded in mathematical concepts. I was fascinated, and I thought, this might take us a very long way. So this was one of the moments. Another big moment was when we finally came up with a very data-efficient reinforcement learning method, and this worked on a real device, from scratch. So without going the sim-to-real route, but being so data-efficient that you could actually learn on the real device. I think these were probably the most inspiring and exciting moments in my research career.

Saksham Sharda: DeepMind is known for its groundbreaking work on AI. Could you give us an overview of some of the most exciting projects and developments you’ve been involved in during your time there?

Martin Riedmiller: Yeah. The very first project was the Atari project. That was when I was still part-time at the university, and I was very much fascinated by the ambition from the very first moment. At the university, we were also tackling a couple of video games, but partitioning the problem first, going from perception to a latent representation, and then from the latent representation to the control. And at DeepMind, there was immediately the idea: no, let’s do it end to end. And let’s not only solve one game, but let’s solve 50 games at once and in a very short amount of time. So this was a very fascinating project. Then, when we started with the robots, we did the Learning by Playing work, where we could show that robots, by playing around, could learn highly complex behaviors by themselves and very data-efficiently. So this was also a very good moment. And then of course there was the fusion work, where reinforcement learning was applied to a real fusion reactor. That was something with a high risk, because there was also an external partner involved, with a lot of experts, so nobody had all the knowledge at any point in time. But it was a big teamwork project, and to bring this to a success and a Nature publication was also a very big moment in my career at DeepMind.

Saksham Sharda: And so what were the biggest challenges while doing these projects? Are there any memorable obstacles that you faced?

Martin Riedmiller: Yeah, the challenge is always the scale; writing software is not so easy anymore. I used to write a lot of code by myself when I was at the university, and I always tried to keep it simple. When you do things at scale, these simple things often do not work, so you also have a lot of technical things that you have to solve. And of course, as always, bringing a lot of people together with different backgrounds, different ambitions, and different styles of working is a challenge in all the projects I’ve encountered so far. But on the other hand, all these challenging things are in the end also rewarding, because you’re doing things that are not possible in another location, with other people, or on a smaller scale.

Saksham Sharda: So reinforcement learning is a fascinating area within AI. Could you explain it in very simple terms for our listeners who might not be familiar with the context?

Martin Riedmiller: Yeah. Probably the simplest way of learning is learning by examples, where a teacher provides you with all the examples. Reinforcement learning takes this away and asks the learning system to learn the policy, the way it behaves, completely by itself. So it’s only told what to achieve, but not how; there are no examples of how to achieve it. This is at the same time also very fascinating, because as an engineer or as an outsider, you just provide the goal, and then you see how the system slowly starts learning and in the end probably solves a task that you were not even able to solve by yourself, just out of self-improvement. And I also have an example of that. In my Ph.D. thesis, I was looking at the double cart-pole. That’s a cart system where you want to balance a pole, but there are two poles, and you want to balance both poles at the same time. For a single pole, you can kind of imagine as a human how you would do it, but for two poles, it’s kind of tricky. In the end, the learning system figured this out and also came up with a very nice policy. It first brought the second pole in line with the first pole, and then swung up both poles at the same time. I found this a very elegant solution, which in hindsight is kind of obvious, but for me, it wasn’t the solution that I would have come up with immediately, and the learning system did. I think these are exactly the moments that we are after, and that keep up our excitement in this kind of research.
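
As a rough illustration of the idea described above (the system is told only what to achieve, never how), here is a minimal sketch of tabular Q-learning, a standard textbook reinforcement learning method. It is not the method used in the double cart-pole work; the corridor environment, the function names `train_corridor` and `greedy_policy`, and all parameter values are assumptions made up for this example. The agent starts at the left end of a short corridor and receives a reward only when it reaches the right end.

```python
import random


def train_corridor(n_states=6, episodes=500, alpha=0.5, gamma=0.9,
                   epsilon=0.2, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor (illustrative only).

    The only feedback is a reward of 1.0 on reaching the last cell;
    the agent is never shown an example of how to get there.
    """
    rng = random.Random(seed)
    moves = (-1, +1)                      # action 0 = left, action 1 = right
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):
        state = 0
        for _ in range(100):              # cap on episode length
            # Trial and error: explore sometimes, otherwise act greedily,
            # breaking ties at random so the untrained agent still moves around.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                best = max(q[state])
                action = rng.choice(
                    [i for i, v in enumerate(q[state]) if v == best])

            nxt = min(max(state + moves[action], 0), goal)
            reward = 1.0 if nxt == goal else 0.0

            # Q-learning update: move the estimate toward
            # reward + discounted value of the best next action.
            target = reward if nxt == goal else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])

            state = nxt
            if state == goal:
                break
    return q


def greedy_policy(q):
    """Read off the learned behavior for every non-goal cell."""
    return ["left" if row[0] > row[1] else "right" for row in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train_corridor()))
```

With these settings, the learned policy should settle on moving right in every cell, even though the code never specifies that moving right is the way to reach the goal.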

Saksham Sharda: So what do you think are the most promising applications of reinforcement learning in the near future, both in academia and industry?

Martin Riedmiller: I was always striving to bring reinforcement learning to control system applications, and control systems is a very broad field, from controlling the temperature in a room, to controlling cars or the engine in a car, to controlling a whole body, like complete robots. I think in the future, when we understand reinforcement learning even more, we will see more and more applications of very complicated robots doing very complicated things, instead of everything being programmed, as is currently the state of the art in industry. I think this will unleash a lot of potential applications in industry. But even if we go away from robots to just controllers that control devices, there are cases where it’s very difficult to come up with a simulation because the physics is not completely understood, there are some uncertainties, or it’s not clear how to come to an optimal solution. I think there’s a really broad range of applications that can be unleashed if reinforcement learning works very efficiently, robustly, and reliably.

Saksham Sharda: Are there any ethical considerations or challenges associated with implementing reinforcement learning in certain applications?

Martin Riedmiller: Yeah, of course. The first ethical consideration is that this is a very powerful technique. In principle, when you can tell the system just the what, and it figures out the how by itself, there are a lot of dangers. One is that you can give it the wrong goals, and it goes for the wrong goals and solves a task that wouldn’t be possible otherwise, so there can be bad intentions in its use. But even if you have good intentions, when the goal is not precisely formulated or has some holes in it, we see even now that the system is very creative in coming up with solutions that we haven’t thought of. So safety is a big concern. You usually want to have your reinforcement learning within a safety envelope, so that you make sure that no matter what the agent tries to do, the system always stays in a safe mode. But I think the more powerful the system finally gets, the more important it is to also look at the ethical side and to be very sure that if you give a certain goal to the system, there’s no way that the agent can come up with solutions that are very dangerous or ethically not adequate for humanity.
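
To make the safety envelope idea a little more concrete, here is a small sketch under stated assumptions. It is not DeepMind’s approach, just one simple way to picture it, borrowing the room-temperature example mentioned earlier. The function name `shield`, the temperature limits, and the step limit are all hypothetical. Whatever change the learning agent proposes, the wrapper passes on only the closest change that keeps the room inside the allowed range.

```python
def shield(proposed_delta, temperature, low=18.0, high=24.0, max_delta=1.0):
    """Clamp an agent's proposed temperature change to a safety envelope.

    Hypothetical example: `low`, `high`, and `max_delta` are made-up limits.
    Assumes the current temperature is already inside [low, high].
    """
    # First limit how large a single change may be.
    delta = max(-max_delta, min(max_delta, proposed_delta))
    # Then make sure the resulting temperature stays within [low, high].
    delta = max(low - temperature, min(high - temperature, delta))
    return delta


if __name__ == "__main__":
    # The agent asks for +5 degrees at 23.5; only the safe part is applied.
    print(shield(5.0, 23.5))
    # A small change well inside the envelope passes through unchanged.
    print(shield(-0.25, 20.0))
```

The agent can still explore freely inside the envelope; the wrapper only intervenes when a proposal would take the system outside the safe range.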

Saksham Sharda: So how do you see the future of AI research evolving, and what role do you think reinforcement learning will play in shaping that future?

Martin Riedmiller: So I hope, of course, and I’m kind of convinced, that reinforcement learning will play a very important role. With the current advent of large language models and their big success, it is kind of going into the background, because of course, what we’ve seen in these large language models is astonishing and already unleashes a lot of applications. However, I think if we want to understand how intelligence works, we also need to understand how systems can build up behavior from raw experience and how they can learn by themselves, and not just by being given a large corpus of language or hand-curated data. So I think this current trend is very important, because we see that if we have a lot of experience, we can generalize and get unexpected generalization effects. This is a very good thing that happens at scale. But then we will really go back and want to understand how we can make sense of all the experience and how to turn it into real and beneficial behaviors. I think there’s probably no way that a human can show all the behaviors that are possible within a certain context. There we need the power of reinforcement learning or similar techniques, so that the systems can learn autonomously by themselves.

Saksham Sharda: Are there any interdisciplinary areas where reinforcement learning could intersect with other AI subfields to drive innovation?

Martin Riedmiller: I think one prominent area is cognitive science, of course: understanding what the concepts in cognitive science are, seeing what we can learn from there, and bringing them to our agents to make them more efficient or to get ideas for sub-components of these systems. And probably also vice versa: AI systems can help to detect new things in physics. They can have a lot of impact on drug design, for example, in medicine and in biology. So I expect that as we understand AI more and more, we will be getting more used to assistants that will also help us make discoveries in all areas of science. We are seeing the first signs of this already, and there are quite a couple of publications out there in different disciplines, but this is something where AI will definitely have a huge impact in the coming years.

Saksham Sharda: So you’ve had experience both in academia and industry. What are some of the key differences and similarities between these two environments when it comes to AI research and development?

Martin Riedmiller: So my industrial experience is very exclusive: being at DeepMind, which from the very first moment was a research company set up to solve intelligence. And later, by joining Google, we had a lot of potential in computing and all that. So this is a very special situation for industry, and for me, the switch from academia to industry was a big improvement in basically all my capabilities: not being responsible for writing grants to hire Ph.D. students, but just having a team of Ph.D. students or even postdocs to do research, and not being restricted to a three- or five-year time plan, but being able to make longer-term and more ambitious plans. These were a lot of the benefits that I encountered when I switched from academia to a very exclusive industrial environment, of course always with the idea that this AI will finally also have some benefits for humanity and will bring up tools, but also with a research ambition to understand things. The other side is that in academia, you have a lot of freedom. You can always switch your research topic completely. At DeepMind, I would probably not be able to do research on art or physics; my research area is restricted to everything that has to do with AI, which admittedly is still very large.

Saksham Sharda: So, how do you manage the balance between academic curiosity and practical application when working in both settings?

Martin Riedmiller: So for me, that is the area where I always felt very comfortable. I always wanted to do academic research and understand how to build these systems to solve a certain task. From the very beginning in reinforcement learning, I always had this dream that at some point the controllers that we developed would be in a car, for example, or would be working in an actual robot, because that was driving my motivation. So I would say for me, this is a very natural setting.

Saksham Sharda: DeepMind is known for its collaborations with various organizations and industries. Can you share any interesting insights or lessons learned from these collaborations?

Martin Riedmiller: So, the most recent and most prominent one for me is the collaboration with the Swiss Plasma Center on the fusion work that we did. It was amazing, because this collaboration gave us access, first, to a real fusion reactor, which of course you cannot build so easily as a company, and, on the other side, to a huge team of experts with huge expertise, people who are also very open to encountering new methods like, for example, AI. So this was a very positive experience at a very high level of expertise.

Saksham Sharda: And are there any examples of a collaboration that resulted in unexpected discoveries or outcomes? Any interesting story from a collaboration?

Martin Riedmiller: So yeah, I think all the collaborations that I had at the university were kind of interesting. My very first experience with collaboration was in the finance industry. We were very early. We used neural networks to predict the exchange rate between the dollar and, at that time, the D-Mark, and they used it in a real-world system. For me, it was always, we try, and we just give them the prediction that we have, and we actually could get to a reasonable prediction rate for these kinds of changes in the rate. But then they said, yeah, prediction rates are nice, but if we don’t see the money, we don’t believe it. Why don’t we take a million dollars and just put it on that system? For me at that time, and still, a million euros or a million dollars was a lot of money. But they were just saying, okay, this is much more convincing. And then they put the money in, and fortunately, they also made a profit off of it. This was one of the early examples where I at least got to know how the financial industry works and how people think in different contexts.

Saksham Sharda: Has there been anything more interesting in terms of risk investment?

Martin Riedmiller: Yeah. So, we had a lot of applications, for example, also with the automotive industry, where we were able to train controllers that came up with policies that they had only discovered very recently. The learning system came up with a solution that was kind of surprising: even though they had done research in that area for 30 or 40 years, they had only discovered that kind of policy two or three years earlier. So this was a moment where the power of reinforcement learning started to shine, and where I thought, okay, there might be a path forward.

Saksham Sharda: So what about trends or developments in AI and reinforcement learning? Are there any that you’re personally excited about?

Martin Riedmiller: Yeah, I’m excited that we see these scaling effects in large models, because if this is also true for more raw data, for data other than language, then that would mean that at a certain point, with a certain amount of collected experience, all of a sudden the system can do things that we had never imagined. So there’s a magic thing that pops up, which we can just profit from but don’t have to code in. This is a very exciting development. I’m not so sure how much data we would need, and whether it’s so easy or whether some pieces are still missing. I’m more convinced that some pieces are still missing before we see that in reinforcement learning as well. But I’m really very excited about this trend of bringing large language models together with these low-level controllers, and then understanding a bit more how the acquisition of intelligence potentially works.

Saksham Sharda: So the last question for you is of a personal kind, what would you be doing in your life, if not this right now?

Martin Riedmiller: That’s a very difficult question. If I didn’t work in that area, I would probably just read books in that area or code a bit myself. I liked coding a lot when I did it actively, from a very early age. Otherwise, I would probably just be doing music, playing music in some kind of bar. I think that would be a kind of different career. I’m probably not good enough for that, so I would only have a very small audience there. But at least I could imagine having some fun in that area.

Let’s Conclude!

Saksham Sharda: Thanks, everyone, for joining us for this month’s episode of Outgrow’s Marketer of the Month. That was Martin Riedmiller, who is a Research Scientist and Team Lead of the Controls Team at Google DeepMind.

Martin Riedmiller: Pleasure. Thanks for having me.

Saksham Sharda: Check out the website for more details, and we’ll see you once again next month with another Marketer of the Month.

Muskan

Muskan is a Marketing Analyst at Outgrow. She is working on multiple areas of marketing. On her days off though, she loves exploring new cafes, drinking coffee, and catching up with friends.
