Event-Based Camera Chips Are Here, What’s Next?

This month, Sony begins shipping its first high-resolution event-based camera chips. Ordinary cameras record complete scenes at regular intervals, even though most of the pixels in those frames do not change from one frame to the next. The pixels in event-based cameras — a technology inspired by animal vision systems — respond only when they detect a change in the amount of light falling on them. As a result, they use little power and generate much less data, while capturing motion particularly well. The two Sony chips – the 0.92-megapixel IMX636 and the smaller 0.33-megapixel IMX637 – combine Prophesee’s event-based circuits with Sony’s 3D chip-stacking technology to produce chips with the smallest event-based pixels on the market. Prophesee CEO Luca Verre explains what comes next for this neuromorphic technology.
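To make the frame-versus-event distinction concrete, here is a minimal Python sketch; the function name, threshold value, and frame-differencing approach are illustrative assumptions rather than Prophesee’s actual pipeline. The idea it shows: a pixel emits an event only when its log intensity has changed by more than a contrast threshold since the last event it produced.

```python
import numpy as np

def frames_to_events(frames, timestamps, threshold=0.2):
    """Emit (t, x, y, polarity) events whenever a pixel's log intensity
    changes by more than `threshold` since that pixel's last event.

    Illustrative only: a real event pixel makes this comparison in analog
    circuitry, asynchronously and per pixel, not by differencing frames.
    """
    log_ref = np.log(frames[0].astype(np.float64) + 1e-6)  # last "memorized" level per pixel
    events = []
    for frame, t in zip(frames[1:], timestamps[1:]):
        log_now = np.log(frame.astype(np.float64) + 1e-6)
        delta = log_now - log_ref
        ys, xs = np.nonzero(np.abs(delta) > threshold)
        for y, x in zip(ys, xs):
            events.append((t, int(x), int(y), 1 if delta[y, x] > 0 else -1))
            log_ref[y, x] = log_now[y, x]  # reset the reference only where an event fired
    return events
```

A static scene therefore produces no output at all, while a moving edge produces a dense stream of time-stamped events along its path.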

Luca Verre on:

The role of event-based sensors in cars

Progress in future chips

The long road to this milestone

IEEE Spectrum: When announcing the IMX636 and 637, Sony highlighted industrial machine-vision applications for them. But when we spoke last year, augmented reality and automotive applications seemed top of mind.

Prophesee CEO and co-founder Luca Verre

Luca Verre: The scope is wider than just industrial. In the automotive industry, we are actually very active in non-safety-related applications. In April, we announced a partnership with Xperi, which developed an in-cabin driver-monitoring solution. [Car makers want in-cabin monitoring of the driver to ensure they are attending to driving even when a car is in autonomous mode.] Safety-related applications [such as sensors for autonomous driving] are not covered by the IMX636, because they would require a design developed for functional safety, which this one is not. However, a number of OEMs and Tier 1 suppliers are evaluating it, fully aware that the sensor cannot be put into mass production as it is. They test it because they want to evaluate the performance of the technology and then potentially push us and Sony to redesign it in line with safety requirements. Car safety is still an area of interest, but more long term. In any case, if any of this evaluation work leads to a product-development decision, it would take quite a few years [before it appears in a car].


IEEE Spectrum: What’s next for this sensor?

Luca Verre: For the next generation, we are working along three axes. One axis is the reduction of the pixel pitch. Together with Sony, we made great strides in shrinking the pixel pitch from 15 micrometers in Generation 3 down to 4.86 micrometers with Generation 4. But of course there is still a lot of room for improvement, by using a more advanced technology node or by using the now-mature stacking technology with double or triple stacks. [The sensor is a photodiode chip stacked onto a CMOS chip.] You have the photodiode process, which is 90 nanometers, and then the intelligent part, the CMOS part, was developed at 40 nanometers, which is not necessarily a very aggressive node. By going to more aggressive nodes like 28 or 22 nanometers, the pixel pitch will shrink a lot.

[Diagram: a conventional front-lit pixel, 15 µm, splits its area between the photodiode and the relative-change detector; the advanced stacked pixel, 4.86 µm, devotes the full top layer to the photodiode, with the relative-change detector on a layer beneath it.]
Conventional versus stacked pixel design. Prophesee

The benefits are clear: it is an advantage in terms of cost; it reduces the optical format of the camera module, which also reduces cost at the system level; and it allows integration into devices with tighter space constraints. The other related advantage is, of course, that for the same silicon area you can fit in more pixels, so the resolution increases. Event-based technology does not necessarily need to follow the same race we still see in conventional [color camera chips]; we are not shooting for tens of megapixels. It is not necessary for machine vision, unless you are considering some very niche, exotic applications.
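As a rough illustration of the resolution point, the back-of-envelope Python below holds the active area fixed and varies the pitch; the 1280 × 720 Generation 4 geometry is used as the reference, and the 3 µm entry is purely a hypothetical future pitch.

```python
# Pixel count versus pitch for a fixed active area.
# Reference geometry: a 1280 x 720 array at the 4.86-um Generation 4 pitch.
# The 3.0-um entry is a purely hypothetical future pitch.
width_px, height_px, ref_pitch_um = 1280, 720, 4.86
active_area_mm2 = (width_px * ref_pitch_um / 1000) * (height_px * ref_pitch_um / 1000)

for pitch_um in (15.0, 4.86, 3.0):  # Generation 3, Generation 4, hypothetical
    megapixels = active_area_mm2 / (pitch_um / 1000) ** 2 / 1e6
    print(f"{pitch_um:5.2f} um pitch -> ~{megapixels:.2f} megapixels in {active_area_mm2:.1f} mm^2")
```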

The second axis is the further integration of processing. It is possible to integrate more processing capability inside the sensor to make it even smarter than it is today. Today, it is a smart sensor in the sense that it processes the changes [in a scene]. It also formats these changes to make them more compatible with conventional [system-on-chip] platforms. But you can push this reasoning further and think about doing some of the local processing inside the sensor [that is now done in the SoC processor].
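One way to picture that formatting step is sketched below: asynchronous events are packed into a fixed record layout and accumulated into per-polarity count maps that a frame-oriented SoC pipeline can consume. The record layout and function here are assumptions for illustration, not the sensor’s actual output format.

```python
import numpy as np

# Hypothetical packed event record: timestamp, coordinates, polarity.
EVENT_DTYPE = np.dtype([("t", np.uint64),   # timestamp in microseconds
                        ("x", np.uint16),
                        ("y", np.uint16),
                        ("p", np.int8)])    # +1 = brighter, -1 = darker

def events_to_histogram(events, width, height, t_start, t_end):
    """Accumulate events from [t_start, t_end) into per-polarity count maps,
    producing a frame-like tensor a conventional vision pipeline can ingest."""
    hist = np.zeros((2, height, width), dtype=np.uint16)
    window = events[(events["t"] >= t_start) & (events["t"] < t_end)]
    for ev in window:
        channel = 0 if ev["p"] > 0 else 1
        hist[channel, ev["y"], ev["x"]] += 1
    return hist
```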

The third is related to power consumption. The sensor is already low power by design, but if we want to reach an extremely low power level, there is still a way to optimize it. If you look at the Generation 4 IMX636, power is not necessarily what was optimized; what was optimized is throughput: the ability to respond to many changes in the scene and to time-stamp them with extremely high precision. So in extreme situations where the scene changes a lot, the sensor has a power consumption similar to a conventional image sensor, even though the time precision is much higher. You can argue that in those situations you are running at the equivalent of 1,000 frames per second or even more, so it is normal to consume as much as a sensor running at 10 or 100 frames per second. [A lower-power] sensor can be very appealing, especially for consumer or wearable devices, where we know that features related to eye tracking, attention monitoring, and eyelid tracking are becoming very relevant.
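A rough data-rate comparison helps ground the 1,000-frames-per-second remark; the event size and event rates below are assumed round numbers for illustration, not measured IMX636 figures.

```python
# Event-stream versus frame-stream data rates (assumed round numbers).
width, height = 1280, 720
bytes_per_event = 8                   # e.g. packed timestamp + x + y + polarity

for events_per_s in (1e5, 1e7, 1e8):  # quiet scene ... almost every pixel firing
    mb_per_s = events_per_s * bytes_per_event / 1e6
    print(f"{events_per_s:.0e} events/s -> {mb_per_s:8.1f} MB/s")

# 8-bit monochrome frames read out at the "equivalent" 1,000 frames per second
print(f"1,000 fps frame readout -> {width * height * 1000 / 1e6:8.1f} MB/s")
```

In a quiet scene the event stream is orders of magnitude lighter; only when nearly every pixel is firing constantly does it approach the cost of a 1,000-frames-per-second readout, which matches the power behavior described above.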


IEEE Spectrum: Is it just a matter of using a more advanced semiconductor technology to get to lower power?

Luca Verre: Certainly, using a more aggressive technology will help, but I think only marginally. What will really help is to have a wake-up mode programmed into the sensor. For example, you might have an array where only a few active pixels are always on, and the rest is completely shut down. Then, when you have reached a certain critical mass of events, you wake up everything else.
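A minimal sketch of that wake-up scheme, with an arbitrary event threshold and window length standing in for whatever real silicon would use, might look like this:

```python
from collections import deque

class WakeUpController:
    """Watches an always-on subset of pixels and powers up the full array
    once a 'critical mass' of events arrives within a short time window.
    Threshold and window values are arbitrary placeholders."""

    def __init__(self, event_threshold=50, window_us=10_000):
        self.event_threshold = event_threshold
        self.window_us = window_us
        self.recent = deque()          # timestamps of recent events, microseconds
        self.full_array_on = False

    def on_event(self, timestamp_us):
        """Call for each event from the always-on pixels; returns array state."""
        self.recent.append(timestamp_us)
        while self.recent and timestamp_us - self.recent[0] > self.window_us:
            self.recent.popleft()
        if not self.full_array_on and len(self.recent) >= self.event_threshold:
            self.full_array_on = True  # wake up the rest of the pixel array
        return self.full_array_on
```

On real silicon this logic would live on the sensor’s own logic die rather than in host software; the sketch only captures the counting-and-threshold idea.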

IEEE Spectrum: How has the journey been from concept to commercial product?

Luca Verre: For me, it has been a seven-year journey. For my co-founder, CTO Christoph Posch, it has been even longer, since he started the research in 2004. To be honest with you, I thought the time to market would be shorter. Over time, I realized that [the journey] was much more complex, for various reasons. The first and most important reason was the lack of an ecosystem. We are the pioneers, but being pioneers also has some drawbacks. You are alone in front of everyone else, and you have to bring friends with you, because as a technology supplier you deliver only part of the solution. When I laid out the story of Prophesee in the beginning, it was the story of a sensor with great benefits. And I thought, naively to be honest, that the benefits were so clear and straightforward that everyone would jump at it. But in reality, even though everyone was interested, they also saw the challenge of integrating it: building an algorithm, putting the sensor inside a camera module, interfacing with a system on chip, building an application. What we did at Prophesee over time was work more at the system level. Of course, we kept developing the sensor, but we also developed more and more software assets.

At present, more than 1,500 unique users are experimenting with our software. We also built an evaluation camera and a development kit by connecting our sensor to an SoC platform. Today, we are able to give the ecosystem not only the sensor but much more than that: we can provide tools that make the benefits clear. The second part is more fundamental to the technology. When we started seven years ago, we had a chip that was huge; it had a pixel pitch of 30 micrometers. Of course, we knew that the road to high-volume applications was very long, and that we had to get there using a stacked technology. That is why we convinced Sony to work with us four years ago. At the time, backside-illuminated 3D stacking technology was becoming available, but it was not widely accessible. [In backside illumination, light travels through the back of the silicon to reach photosensors on the front. Sony’s 3D stacking technology moves the logic that reads and controls the pixels to a separate silicon chip, allowing pixels to be more densely packed.] The two largest image-sensor companies, Sony and Samsung, have their own processes internally. [Other companies’ technologies] are not as advanced as the two market leaders’. But we managed to develop a relationship with Sony, access their technology, and manufacture the first industrial-quality, commercially available, backside-illuminated 3D-stacked event-based sensor, at a size compatible with high-volume consumer applications. We knew from the beginning that this had to be done, but the way to get there was not necessarily clear. Sony does not do this very often. The IMX636 is the only [sensor] Sony has developed together with an external partner. For me, that is a reason to be proud, because I think they believed in us: in our technology and in our team.

