Welcome to the Camera and Photos group lab. My name is Sergey, and I'm part of the developer relations team here at Apple. Today I'm joined by a panel of experts from the Camera and Photos engineering teams. I'll let them introduce themselves. Matt, let's start with you.

Hi, everyone. I'm Matt Tekoff, and I work on the Photos Frameworks team.

I'm Brad Ford. I work in camera software and have had the good fortune of working on every single iPhone in my 25 years now.

Hi, my name is Ivan Cabero-Villabunde. I work in camera software, as Brad does, and I specialize in the still capture pipeline.

My name is Davide Conchon. I work on a team that does file formats, compression, and RAW. I'm lucky enough to have been working at Apple for 19 years now, and I still love it. Every day there is something new to learn. It's amazing.

I'm Jake. I work on camera performance. I've not been here for 19 years, just five. Not too bad.

In addition to those on screen, there is a team behind the scenes helping us triage the questions. We are so excited to talk about building exceptional experiences using the camera and Photos frameworks. Today we want to focus on questions that benefit everyone watching. For code-specific questions, or if we somehow didn't get to your question today, go to the developer forums to continue the conversation. You can also use Feedback Assistant to file bugs or submit feedback requests. To get started, I would like to ask all our panelists a question. The Camera and Photos apps on the iPhone are packed with features. There are so many of them, and I'm curious: what is your favorite feature in the Camera and Photos apps? Matt, maybe you can start.

Sure. There are a lot. Honestly, the one I love the most is probably Live Photos. Especially with kids and nieces and nephews who live far away, you get this slice-of-time photo plus video and sound.
It really brings those moments to life. So Live Photos is probably my favorite.

Yeah, it's great. Brad, what about you?

Live Photos is kind of the OG, isn't it? It's an Easter egg. My mom didn't know for years that she was taking Live Photos. She said, "How do I turn on Live Photos?" and I said, "You already have Live Photos in your photo roll, Mom." Like a lot of us, I spend a lot of time on my Mac in meetings, so I love the system-wide video effects like Portrait, Studio Light, and Center Stage. The fact that you get them for free in any app as a user is very powerful. Recently we've added background replacement and gestures, which are whimsical and delightful and add value to conferences. So I really love those, and all the new features on the iPhone 17 front-facing camera, which I hope we get to answer some questions about.

Yeah, for sure. Thank you, Brad. Ivan, what about you?

The one that's near and dear to my heart is what we refer to as opportunistic depth capture, or portrait in photo. This is the feature that I introduced a few years back where, as long as there's a person in the scene, you'll see a little f in the Camera app. When you take a photo, we capture depth for you, and you can add the portrait effect afterwards, if you so desire, with all the controls that you want. For that release, we did a ton of work to make the depth processing a lot more efficient, available at all kinds of zoom levels, and able to take advantage of deferred processing. These are all things that developers can adopt in their camera applications as well.

That's great. Davide?

For me, it has to be ProRAW, for two reasons. Number one, I've been working on that feature for so long, you cannot even imagine. The second reason is that it is a vehicle for the rest of us to access features from the camera that are usually just for the pros, the people who know how to edit.
With the ProRAW format, which is available in Camera and even to third parties, you can collect the full spectrum of what the iPhone can give us in terms of image quality. And then, if you like, you have the latitude to embellish these images or modify them the way you want. Even though images coming out of the iPhone are definitely amazing, every one of us has slightly different taste, and ProRAW allows for that.

Yeah, makes sense. Thank you for sharing. Jake, what about you?

I've been having a lot of fun with the new Tele on the iPhone 17 Pro. The zoom on it is insane. It'll go to something like 40x. We were at Yosemite a few months ago, hiked up to the top of Glacier Point, and I was zooming in all the way through the valley. It was incredible.

Were you able to see?

Yeah, you can actually see people. They kind of look like ants from that far away, but it was pretty cool.

I should try that.

A lot of fun.

Thank you for sharing your personal stories, everyone. With that, I want to jump straight into questions, because we have so many of them and I want to leave as much time as possible to answer them. The first question is from a user with the name Florent in INF. The question is: when Reframe or Clean Up is used to modify a photo, does iOS 27 tag it with any metadata or content credentials so someone can tell the image was edited by AI? Davide?

I would like to take this one, just because we work with metadata every day. The answer is yes. The metadata in the file is modified, with IPTC metadata together with EXIF. The IPTC metadata is updated based on which AI modification you're doing: spatial, reframe, or cleanup.
So yes, there is a way to understand from the metadata what has been done.

That's great.

Just to add a little to that: in the Photos app, when you make one of these edits, the info panel (on iOS, the panel you get when you swipe up) shows at the bottom which edit was used.

That's great. Thank you for sharing. The second question is about keywords in Photos: keywords were mentioned as a feature coming to Photos. Will there be an API for third-party apps to use or edit keywords?

I can take this one as well. The feature I think Ben's referring to is on iOS now. It has existed on the Mac for a long time in the Photos app. You can see and edit keywords in the info panel, the UI I was just mentioning, and there's also UI for managing these keywords and searching by them. They will be reflected in IPTC metadata when you export. But there's currently no PhotoKit-level API for fetching or querying based on these keywords.

Gotcha. OK, thank you. The next question is from a user with the name sjk_27: when showing thumbnails of images in a lazy grid view that also has a matched geometry effect to a full-image detail view when tapped, what's the optimal way to load the thumbnail images for performance? For example, a respective smaller size stored with the original, or using some CI filter to scale them down?

I can take this one. On the Core Graphics side, there is a property you can set when opening images to request thumbnails. Internally, Core Graphics will work out whether the image already has a thumbnail that can be used, or it will decode and scale down the main image as fast as possible. On the Core Image side, we have ways to request a scale factor when we open images. Depending on your UI, you can ask for the scale factor to be as small as you wish.
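A minimal sketch of the Core Graphics side of this answer, using ImageIO. It assumes the thumbnail option meant here is `kCGImageSourceCreateThumbnailFromImageIfAbsent`, which uses an embedded thumbnail when the file has one and otherwise decodes and downsamples the main image. The function name `loadThumbnail` and the pixel size passed by the caller are illustrative and not from the session.

```swift
import Foundation
import ImageIO

/// Returns a downsampled CGImage for a grid cell, or nil if the file cannot be read.
/// `maxPixelSize` is the longest edge in pixels (point size times screen scale).
func loadThumbnail(at url: URL, maxPixelSize: Int) -> CGImage? {
    // Do not cache the full-size decode; only the thumbnail is needed.
    let sourceOptions = [kCGImageSourceShouldCache: false] as CFDictionary
    guard let source = CGImageSourceCreateWithURL(url as CFURL, sourceOptions) else {
        return nil
    }

    let thumbnailOptions: [CFString: Any] = [
        // Use an embedded thumbnail if present, otherwise create one from the main image.
        kCGImageSourceCreateThumbnailFromImageIfAbsent: true,
        // Upper bound for the longest edge of the result.
        kCGImageSourceThumbnailMaxPixelSize: maxPixelSize,
        // Apply the EXIF orientation so the thumbnail is upright.
        kCGImageSourceCreateThumbnailWithTransform: true,
        // Decode now (typically on a background queue) rather than at first draw.
        kCGImageSourceShouldCacheImmediately: true
    ]
    return CGImageSourceCreateThumbnailAtIndex(source, 0, thumbnailOptions as CFDictionary)
}
```

Note that an embedded thumbnail can be smaller than `maxPixelSize`; if the grid needs a guaranteed size, `kCGImageSourceCreateThumbnailFromImageAlways` is the stricter option, at the cost of always decoding the main image.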
Requesting that scale factor internally instructs Core Image to scale the image down as early as possible in the process, so that everything that happens afterwards is as fast as possible.

OK, that's great. Thank you. We have one more question from Ben, on Deferred Start. Maybe, Jake, you can take that.

Yeah, absolutely. The question is: can Deferred Start cause issues where the user may attempt to capture a photo before the capture photo output has attached to the session? It's a good question. I actually talk about this exact point in my WWDC session this year, "Build a responsive camera app that launches quickly." With deferred start, basically what you're doing is pushing out initialization. If you only do that then, to the asker's point, you may still miss the shot. The photo capture output has an isResponsiveCaptureEnabled property. If you set that to true while also deferring the output, the system will add some buffering. So you can launch quickly and get the capture even if we haven't fully initialized the output. You don't end up missing that moment. We still queue up that shot.

This is great. And by the way, that session is awesome. If you haven't watched it, please go ahead and find it in the Developer app. It's called, let me find the name, "Build a responsive camera app that launches quickly," and it has a cool domino effect.

Yeah, you should watch me play with dominoes for 20 minutes.

And just for background, for anyone who hasn't watched it and isn't sure what we're talking about: the Deferred Start API was introduced in iOS 26. It's a way of telling your AVCaptureSession outputs which of them should start up deferred and which are most important to start right away. So usually you would say, I want the preview to start first, and everything else can be deferred.

Yeah, exactly. It's all about that fast launch experience.

Perfect. All right, the next question is from Yeleon.
I hope I pronounced the name correctly. What is the best way to get the depth map and the image, both for a live viewfinder and for the taken image, to create a nice 3D effect with both?

I can take that.

Go ahead.

Thanks for the question, Yeleon. There are two parts to this: the preview side and the still capture side. For the preview side, you want to use a depth data output alongside your preview stream, and you want to use the AVCaptureDataOutputSynchronizer so that it's synchronized with the RGB stream. On the still capture side, you want to enable depth on the AVCapturePhotoOutput. That ensures you get depth with the still capture, so you can then use filters, in Core Image for example, to add blur based on the depth captured into the still.

There are levels to it. If all you're interested in is a live depth effect in your preview, there's a shortcut. You can turn on the cinematic video capture API, which we introduced last year and which lets you replicate exactly what we do in Cinematic mode in the Camera app. If you use a video preview layer and set cinematic video capture enabled on your device input, you get it for free. So if you just want something cheap and easy, you've got that. What Ivan talked about is absolutely right. If you want access to the actual depth samples and want to do it yourself, then use a video data output plus a depth data output, and a data output synchronizer to make sure you get the depth and the video for the same timestamp in the same callback. Then you can render them together using a Core Image filter.

Makes sense.

Thanks, Brad. And the specific API to look for on the still capture side is in the AVCapturePhotoSettings for the AVCapturePhotoOutput: set isDepthDataDeliveryEnabled to true.

OK, it's good to know.
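The two halves of that answer can be sketched as follows. This is a hedged outline rather than the panel's own code: it assumes the outputs have already been added to a session whose device supports depth, and the names `makeDepthPhotoSettings`, `DepthPreviewReceiver`, `render`, and the queue label are hypothetical.

```swift
import AVFoundation

// Still capture side: opt in to depth on the output, then on each capture request.
func makeDepthPhotoSettings(for photoOutput: AVCapturePhotoOutput) -> AVCapturePhotoSettings {
    // Only reports true once the output is attached to a session with a depth-capable device.
    if photoOutput.isDepthDataDeliverySupported {
        photoOutput.isDepthDataDeliveryEnabled = true
    }
    let settings = AVCapturePhotoSettings()
    // Requesting depth in the settings is only valid when the output has it enabled.
    settings.isDepthDataDeliveryEnabled = photoOutput.isDepthDataDeliveryEnabled
    return settings
}

// Preview side: receive video and depth for the same timestamp in one callback.
final class DepthPreviewReceiver: NSObject, AVCaptureDataOutputSynchronizerDelegate {
    let videoOutput = AVCaptureVideoDataOutput()
    let depthOutput = AVCaptureDepthDataOutput()
    private var synchronizer: AVCaptureDataOutputSynchronizer?
    private let queue = DispatchQueue(label: "example.depth.preview") // hypothetical label

    /// Call after both outputs have been added to the session.
    func startSynchronizing() {
        let sync = AVCaptureDataOutputSynchronizer(dataOutputs: [videoOutput, depthOutput])
        sync.setDelegate(self, queue: queue)
        synchronizer = sync
    }

    func dataOutputSynchronizer(_ synchronizer: AVCaptureDataOutputSynchronizer,
                                didOutput synchronizedDataCollection: AVCaptureSynchronizedDataCollection) {
        guard
            let video = synchronizedDataCollection.synchronizedData(for: videoOutput)
                as? AVCaptureSynchronizedSampleBufferData,
            let depth = synchronizedDataCollection.synchronizedData(for: depthOutput)
                as? AVCaptureSynchronizedDepthData,
            !video.sampleBufferWasDropped,
            !depth.depthDataWasDropped,
            let pixelBuffer = CMSampleBufferGetImageBuffer(video.sampleBuffer)
        else { return }

        // Both buffers belong to the same capture time and can be rendered together,
        // for example with a Core Image filter driven by the depth map.
        render(image: pixelBuffer, depth: depth.depthData.depthDataMap)
    }

    // Hypothetical rendering hook; the actual effect is up to the app.
    private func render(image: CVPixelBuffer, depth: CVPixelBuffer) {}
}
```

Pass the returned settings to `capturePhoto(with:delegate:)`; the resulting `AVCapturePhoto` then carries a `depthData` value when depth was delivered.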
Thank you. So much useful info. The next question, from Eric, is on photo quality prioritization: when photo quality prioritization is set to quality, the AVCaptureDevice often overrides the manual exposure duration and ISO settings that were set when capturing a photo. Is there a way to find out ahead of time what the final exposure settings will be before the photo is captured? You want to take it?

Sure. From an API standpoint, it's actually not just quality but also balanced. Because we're running processing algorithms, most of which include fusion and need to capture different exposures, such as an underexposed image alongside a normally exposed one, we do not honor any of the manual controls. The manual controls, in fact, are only supported when you capture with speed. If it looks like balanced or quality are respecting your settings, that's because you're getting lucky. In general there's no guarantee: the capture stack makes its own determination, depending on the scene, the overall brightness, and so on, about how to expose. Whenever you choose balanced or quality, you're essentially handing the decision about how to expose over to us. And if you really need that level of control, you should use the manual controls with speed.

Yeah, we realize it's a little opaque. You signal to the framework which quality prioritization you're willing to take: I'm willing to wait, or I want a balance, or I just want it as fast as possible. But we use that as a hint. Depending on which format you're using, if it's a photo format, we are going to use our best fusion algorithms and we're going to ignore your manual settings. For some of the video formats, where we choose not to use those fusion algorithms because we don't want to disrupt movies and preview, you may still get a speed capture, and so your manual settings might still be preserved.
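A hedged sketch of the combination described here, manual exposure together with speed prioritization. The function name and its parameters are illustrative; it assumes the device and photo output are already part of a running session, and that the caller keeps the delegate alive until the capture finishes.

```swift
import AVFoundation

/// Locks a custom exposure, then captures with `.speed` so the capture stack
/// does not substitute its own fusion exposures for the manual values.
func captureWithManualExposure(device: AVCaptureDevice,
                               photoOutput: AVCapturePhotoOutput,
                               duration: CMTime,
                               iso: Float,
                               delegate: AVCapturePhotoCaptureDelegate) throws {
    guard device.isExposureModeSupported(.custom) else { return }

    // Clamp the requested values to what the active format allows;
    // out-of-range values raise an exception.
    let format = device.activeFormat
    let clampedISO = min(max(iso, format.minISO), format.maxISO)
    let clampedDuration = CMTimeMinimum(CMTimeMaximum(duration, format.minExposureDuration),
                                        format.maxExposureDuration)

    try device.lockForConfiguration()
    device.setExposureModeCustom(duration: clampedDuration, iso: clampedISO) { _ in
        // Runs once the new exposure has taken effect.
        let settings = AVCapturePhotoSettings()
        settings.photoQualityPrioritization = .speed
        photoOutput.capturePhoto(with: settings, delegate: delegate)
    }
    device.unlockForConfiguration()
}
```

With `.balanced` or `.quality` in the settings instead, the same call sequence compiles and runs, but per the panel the manual values are not guaranteed to be honored.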
So the rule is: if you're using photo mode, quality and balanced are always going to override your manual settings. If you really care about having exactly your shutter speed, ISO, and so on preserved, then you should definitely use speed for your photo quality prioritization.

That's great. And I believe this was pretty well covered in the "Implement High Resolution Photo Capture" session this year.

Yeah, great session. Check it out and find more answers there.

Great. The next question is on the PHAsset API: can you elaborate on PHAsset original resource choice and what it's used for? There is not much documentation.

Yes. This is a new API for an existing feature that has been in the Photos app for a long time, and it's related to RAW plus JPEG. It's very common for DSLR cameras to have a setting where you shoot RAW plus a JPEG or, these days, a HEIC-compressed image alongside the RAW image. When those are imported into the Photos app, you'll see badges that say RAW or R+J. When you go into edit in the Photos app, there's a way to choose whether you're making your edits on top of the RAW image or on top of the JPEG, and users can swap between them. That's what this original resource choice is. It says whether the compressed image or the RAW image is the original, and which one edits are applied on top of. It also determines which resource we source smaller image derivatives, thumbnails, and so on from. There's a handful of APIs there: one around the PHContentEditingInput source, for choosing which one, and a change request on assets for toggling which resource is the original. I believe those are the main ones.

Great. Thank you. The next question is on Bayer RAW capture preview.
Will iOS 27 support a linear scene-referred preview stream for Bayer RAW capture via AVCaptureVideoDataOutput or AVCaptureVideoPreviewLayer, without tone mapping or computational processing, so it can match a linear CIRAWFilter DNG conversion? They also linked a related Feedback Assistant report, which is great. Thank you for submitting that feedback. Who wants to take this question?

I can take this. Today the only way to get scene-referred linear data in camera capture is through the log format. We have two flavors of it: Log, and Log 2, introduced last year, which has an improved gamut. Drilling deeper into the question, I'm assuming the developer would like to have raw data coming out of camera capture and be able to use the same filters that we use for, say, still images. That is not available today. Raw frames coming out of camera capture, when capturing ProRes RAW for instance, do come with certain metadata, but that metadata is not compatible with the metadata that CIRAWFilter would need to render the image. But I kind of like the idea, and thank you for the feedback request. I believe it's definitely something worth exploring for the future.

Yeah, that's great, Davide. Thank you. I really love the next question. I think it's more holistic, more high level. Maybe everyone can take a turn and give their own suggestion. The question is: for an app that lives and dies by its capture experience, what are your must-have recommendations? Jake, do you want to go first?

I'm going to be biased here, but I'm going to say you want to have the most performant app possible. When you have a capture app, people are using it to capture life's moments, and if you miss the shot, sometimes you can't get that moment back.
So keep things fluid: you launch quickly, you get that shot. That means responsive capture, which I think we had a session about in 2023, and then Mohin talked about that and deferred processing in the most recent session. To me, those are your top three things: have a fast launch, use responsive capture, and use deferred processing. Can't go wrong with that.

Sure. Thank you. Davide?

I think: make it pro. Add any capabilities that are available, to reach maybe the part of the world that would like to interact with those assets with more personal flair.

So, keying off of that, I'm going to say maybe don't make it pro. "Lives or dies by its capture experience" is one of these things that is vague enough that we're not really sure what you mean. An optimal capture experience for something intended to capture video for a social network is very, very different from a pro photography application. You can have a photography application that leans into being super fun; remember Hipstamatic when it first launched. So it really depends on what you mean by the capture experience. That said, I will echo the performance points. Users really don't like to wait. Performance is critical. Reliability is critical. There are lots of things you can do to make your app more reliable and more stable, and we have tons of tooling for you to analyze your performance and track the kinds of issues you should be addressing. But the takeaway, I would say 100%, is to focus on the user experience based on who you think your users are going to be. Who is this for?

Thank you.

We realize that AVFoundation is not an easy framework to get into. It's huge. I think we're the second largest framework in iOS after UIKit. So it can be daunting. You look at all of these properties, all of these classes, and you're like, where do I start?
My recommendation is to not start from scratch. Use the sample code, because there are best practices that you might miss out on if you just peruse the documentation and then start writing. A common problem we see is that people use their AVCaptureSession on the main thread, not realizing that it blocks to do certain lengthy operations. You would know from our sample code that you need a dedicated serial queue for talking to AVCaptureSession. Right there, that goes to performance and responsiveness. Don't lock up your UI thread doing work that's meant to be done on a background thread. Also, whether it's pro or not pro, figure out what differentiates your app from others. Camera apps are a dime a dozen. We give you so many tools, everything from pro use cases like genlock and locked frame duration, which we just gave you last year, to very simply recording a movie file. It's all there for you. But what are you going to do in your app that differentiates it, that draws people to it? Is it a social aspect? Is it a key feature, a key value add that you've got? We are providing the toolbox, but we don't provide everything for you.

Makes sense. Thank you, Brad.

I'd be remiss if I didn't talk about the Photos framework side of it. The camera app is great for taking photos, and then those photos need to go somewhere. The integration with PhotoKit, both PhotosUI and the Photos framework, starts with user permissions. By default, your app does not have any access to save photos to or read them back from the photo library, and you can gradually increase that. You can request to just save assets. That's a very simple request that most users are fine with. Then, if you need to read things back, you can upgrade your permissions through prompts.
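The "just save assets" step can be sketched with PhotoKit's add-only access level. This is a minimal illustration under stated assumptions: the function name `saveCapturedPhoto` and the error type are hypothetical, and the app's Info.plist is assumed to contain an `NSPhotoLibraryAddUsageDescription` entry.

```swift
import Foundation
import Photos

enum SaveError: Error {
    case notAuthorized // hypothetical error type for this sketch
}

/// Saves encoded photo data (for example from AVCapturePhoto.fileDataRepresentation())
/// while asking only for permission to add to the library, not to read it.
func saveCapturedPhoto(_ data: Data) async throws {
    // Prompts the first time; later calls return the stored decision.
    let status = await PHPhotoLibrary.requestAuthorization(for: .addOnly)
    guard status == .authorized else { throw SaveError.notAuthorized }

    try await PHPhotoLibrary.shared().performChanges {
        let request = PHAssetCreationRequest.forAsset()
        request.addResource(with: .photo, data: data, options: nil)
    }
}
```

If the app later needs to read assets back, requesting `.readWrite` with the same call is the upgrade path the answer refers to.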
There's a fairly tight coupling, and there are performance considerations, around capture to review, whether that's in your app or someone's jumping to the Photos app.

Yeah, that's great advice. I think you're one. I like the pro, not-pro discussion. I think we should give the battle for him, yeah.

Oh, I love the pro apps, too. I have some near to my heart. I just thought it would be interesting to offer some contrast and look at the more fun apps as well.

It's super helpful. 100%. All right, the next question is: what is the optimal and recommended way to stream video and audio simultaneously? My concern is that there will be a delay in transmission and the video and audio will not sync. Specifically, what is the most optimal way to stream the audio, and is it safe to stream the video frame by frame?

I can take this one.

Sure.

Audio and video synchronization is no joke. You should pay attention to it. Don't rely on incidental or coincidental sync. Video and audio on modern iPhones are synced from the same clock. So if you don't handle your AV sync, you might think you're okay and ship an app that is usually in sync, except when you run it on an iPad with an external camera, and then the audio and video suddenly drift out of sync. The first recommendation I would have is to use AVCaptureSession for both your audio and your video. In other words, attach a device input for the camera you're interested in and a device input for the mic you're interested in. Then, if you use an audio data output and a video data output, the AVCaptureSession already does the hard work of synchronizing those two sources, so that the PTSs of what comes out of those outputs are already on the same timeline. Whatever you want to do with them, stream them, write them to a movie file, whatever, they're already in sync.
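A minimal sketch of that first recommendation: one AVCaptureSession with both a camera and a microphone input, feeding a video data output and an audio data output. The names `StreamTap`, `makeStreamingSession`, and `SetupError` are hypothetical, default devices are assumed, and what happens to each buffer (encoding, packetizing, sending) is left out.

```swift
import AVFoundation

enum SetupError: Error { case missingDevice, cannotConfigure } // hypothetical

/// Receives both audio and video buffers; the two delegate protocols share one method.
final class StreamTap: NSObject,
                       AVCaptureVideoDataOutputSampleBufferDelegate,
                       AVCaptureAudioDataOutputSampleBufferDelegate {
    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        // Timestamps from both outputs are already on the session's shared timeline.
        let pts = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
        let kind = output is AVCaptureVideoDataOutput ? "video" : "audio"
        print(kind, CMTimeGetSeconds(pts))
    }
}

func makeStreamingSession(tap: StreamTap, queue: DispatchQueue) throws -> AVCaptureSession {
    guard let camera = AVCaptureDevice.default(for: .video),
          let microphone = AVCaptureDevice.default(for: .audio) else {
        throw SetupError.missingDevice
    }
    let session = AVCaptureSession()
    session.beginConfiguration()
    defer { session.commitConfiguration() }

    let cameraInput = try AVCaptureDeviceInput(device: camera)
    let microphoneInput = try AVCaptureDeviceInput(device: microphone)
    let videoOutput = AVCaptureVideoDataOutput()
    let audioOutput = AVCaptureAudioDataOutput()
    videoOutput.setSampleBufferDelegate(tap, queue: queue)
    audioOutput.setSampleBufferDelegate(tap, queue: queue)

    guard session.canAddInput(cameraInput) else { throw SetupError.cannotConfigure }
    session.addInput(cameraInput)
    guard session.canAddInput(microphoneInput) else { throw SetupError.cannotConfigure }
    session.addInput(microphoneInput)
    guard session.canAddOutput(videoOutput) else { throw SetupError.cannotConfigure }
    session.addOutput(videoOutput)
    guard session.canAddOutput(audioOutput) else { throw SetupError.cannotConfigure }
    session.addOutput(audioOutput)
    return session
}
```

The session should then be started from a dedicated serial queue, not the main thread, and `queue` passed to the outputs should be a serial queue so buffers arrive in order.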
If for some reason you need to use a different audio API, such as AURemoteIO, then you're going to need to do a bit more work and delve into clocks like CMClock. Every device on our system is backed by a time source, a CMClock. The video will be on one clock and the audio on a different clock, and once you get samples from those two sources, it's important that you synchronize them using these clocks. There is a low-level API in Core Media called CMSyncConvertTime, or something like that, which lets you say: from this clock to this clock, give me the PTS. You give it the PTS in the source timeframe, and it gives you the output PTS. What you usually wind up doing is keeping your audio time, because you don't want to have to rate convert the audio, and synchronizing your video to your audio clock. You get a different timestamp for the video buffer that came out, and now both are on the audio timeline. As for the concerns about transmission, if you're streaming over a network, you have to rely on the timestamps, since you had them in sync before you sent them. When you get them on the receiving side, the audio and video have a coherent timeline, and then you need to take care of synchronization on the playback side. We have plenty of APIs in the rest of AVFoundation for that, such as AVPlayer and AVSampleBufferDisplayLayer, that can take care of synchronization for the playback portion.

Sounds straightforward.

Yeah, it's totally easy.

Very easy. Is audio ramping easier to perceive than video ramping or something?

I think the science tells us that people perceive changes in audio sample rate more readily than micro-adjustments in video time. So generally the better thing to do is adjust the video to the audio rather than vice versa.
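The clock conversion mentioned above exists in Core Media as `CMSyncConvertTime`. The sketch below wraps it to move a video timestamp onto the audio timeline. How the two clocks are obtained is an assumption outside this sketch: the video clock might come from an `AVCaptureInputPort`'s `clock` property, while the audio clock depends on the audio API in use. The wrapper name is hypothetical.

```swift
import CoreMedia

/// Re-expresses a video presentation timestamp on the audio clock's timeline,
/// leaving audio timestamps untouched so the audio never needs rate conversion.
func videoTimestampOnAudioTimeline(_ videoPTS: CMTime,
                                   videoClock: CMClock,
                                   audioClock: CMClock) -> CMTime {
    CMSyncConvertTime(videoPTS, from: videoClock, to: audioClock)
}

// Illustration only: converting between the host clock and itself.
let hostClock = CMClockGetHostTimeClock()
let now = CMClockGetTime(hostClock)
let converted = videoTimestampOnAudioTimeline(now, videoClock: hostClock, audioClock: hostClock)
print(CMTimeGetSeconds(converted))
```

In a real pipeline the converted time would replace the video buffer's PTS before the buffer is muxed or packetized alongside the audio.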
Unless you're going to sample rate convert the audio.

Audio is much more important for the experience. Just a tiny glitch in audio and people will hear it.

Yeah, it's super sensitive. One thing: what does PTS stand for?

Oh, presentation timestamp.

Thank you. I think everyone was asking that question. All right, we've got some really focused questions coming up. So, Davide, your turn. The next question is from a user with the username from number 16: why does ProRAW support 48-megapixel output from a quad Bayer sensor, whereas native Bayer RAW output is limited to the binned resolution?

Yes. To answer this question, maybe we need to differentiate Bayer RAW from ProRAW. A Bayer RAW file contains Bayer data, while ProRAW has gone through a debayering step and a Photonic Engine merge of multiple images to get that output. At that point, ProRAW no longer cares what the format of the sensor is. It may be Bayer, it may be Quadra. It doesn't really matter, because the data that comes out is RGB, already linearized. So as to why ProRAW can do it and Bayer can't: it's because ProRAW is linear, so we are able to do that. The ability to produce Bayer RAW for a Quadra sensor is not yet available. If developers would like this capability, I would highly suggest sending a feedback request, because if many developers want it, it could be something we look at as well.

Yeah, so please submit that through Feedback Assistant. We actually read them.

And if I can add a little to that: the other aspect is that this raw output wouldn't be Bayer. It would be quad Bayer. That's the thing. The raw Bayer output from a quad Bayer sensor is the binned sensor output. We bin the quad pixels into Bayer. And the other aspect is that this is an ecosystem question. We can't give you quad Bayer DNG raw output without giving you a way to decode it.
So it has to come with support in CIRAWFilter, et cetera. It's a heavier lift than it sounds.

Yes, and maybe to add even to that: debayering data from a Bayer sensor is a technology and a set of skills that have been developed for 25 years now. Debayering Quadra is a completely different beast. It's way more complicated than one may think.

Interesting. Thank you for sharing. All right, we have one more question for the pro camp: what is your recommended process to generate and write ISO-conformant gain maps and gain map metadata for HEIC and JPEG on iOS?

Yes, wonderful question. Without going into the details of the API, I'm going to send the developer directly to a WWDC talk from two years ago where exactly this question is drilled into. We have two major frameworks that can handle gain maps and ISO gain map data. One is Core Graphics, and the other is Core Image. For both, David, a colleague of ours, shows exactly what the APIs are and how inputs need to be prepared to create outputs that are gain map compatible. And there is a third way. The specification for gain maps has been added to both the HEIF spec and JPEG. Apple has been working to add those specs, and they're very, very clear. So that's another way to understand what this metadata does and why the image is divided in two: what is your SDR RGB content, and what is the HDR addition to it. That's a third way to go about it, if the developer likes. But again, for APIs on the system, that talk will give you all the information you need, and even power that you cannot imagine. The Core Image side can control almost everything about the output: what the look is, what headroom is used, how the data is merged together. So you will have fun going through that presentation.

That's great. Thank you for referring to that presentation.
The next question is about Deferred Start. People really do care about launching apps quickly. The question is: Deferred Start says to hold the photo output back. The new high-res guidance says to warm it early with setPreparedPhotoSettingsArray, a contract a decade older than deferral. What's the supported composition? Does prepare queue past deferral or force a start? And what does each reserve?

Yeah, I can start with that. The way I view deferred start, it's really all about getting that launch up and getting preview running. You're basically moving the initialization from before preview to after preview. So in terms of warming with the prepared photo settings array, you can still do that after preview is running. I don't think it makes much of a difference with deferred start. Deferred start is just moving that initialization out.

One way to think of it is to imagine a graph of objects with branches going out for preview, for photos, for movies, or whatever you're making. Deferred start just says we don't need to resolve all of the objects, buffer pools, et cetera, on all of the branches in order to start. We just have to get preview rolling as quickly as possible. The others can get done when they get done. Whereas the prepared settings array, which you're right has been there for a long time, is a way for you to tell the photo output up front: here's the worst it's going to be. This is the craziest thing that I might ask for, quality plus whatever other features. That lets us pre-allocate for the worst case in the still image pipeline. But you can do that at any time, and you can re-prepare at any time. It's helpful if you call setPreparedPhotoSettingsArray before you start your session to tell us what the worst case is going to be. But I think these two APIs are kind of orthogonal to one another. Would you say so?

Yeah, I would agree.
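The "tell us the worst case up front" half of this answer can be sketched with `setPreparedPhotoSettingsArray(_:completionHandler:)`. This is an illustrative outline: the function name `prepareWorstCase` is hypothetical, quality prioritization stands in for whatever an app's most expensive capture really is, and the deferred start properties are deliberately left out.

```swift
import AVFoundation

/// Describes the most demanding capture the app might request so the photo
/// output can pre-allocate for it. Returns the settings that were prepared.
@discardableResult
func prepareWorstCase(on photoOutput: AVCapturePhotoOutput) -> AVCapturePhotoSettings {
    // The output must allow the highest prioritization the app may later ask for.
    photoOutput.maxPhotoQualityPrioritization = .quality

    let worstCase = AVCapturePhotoSettings()
    worstCase.photoQualityPrioritization = .quality

    photoOutput.setPreparedPhotoSettingsArray([worstCase]) { prepared, error in
        // `prepared` reports whether resources for these settings were readied.
        if let error {
            print("Preparation failed:", error)
        } else {
            print("Prepared:", prepared)
        }
    }
    return worstCase
}
```

Per the discussion, this can be called before the session starts or again later to re-prepare. A fresh `AVCapturePhotoSettings` instance is still needed for each actual capture, since settings objects cannot be reused across captures.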
It is interesting, though, that the photo output quality will impact launch time if you don't use deferred start. With speed, the launch is actually going to be faster than with quality, because, to your point, with quality we have to do all these heavy allocations. So yeah, I agree. I don't think they collide.

And they can complement one another.

Yeah, exactly.

Makes sense. I like it when people agree. Let's shift gears a little bit and talk about Photos for a while. The first question I have is: will adjusting PHAsset's new rating property require a photo library change request?

Yep. It's a PHAssetChangeRequest. You'll see the rating property there, which you can change per PHAsset, and there's a new enum with values of unset and 1 through 5. That's how you can modify PHAsset ratings via the API.

OK, thank you. The second question I have is: is there a native way to obtain the metadata of the original file type (PNG, JPEG, et cetera) from the photo selected? And they clarified: I ask this because every time a photo gets imported from the Photos picker, it seems to only show as PNG. And a follow-up on that: does it mean it's getting converted before being saved to our app folder?

Gotcha. There are a few things you should check. When using Transferable to set up the data, be sure you're setting the UTType; it sounds like maybe you're leaving it as a default image type. So specifying the UTType is one thing to look out for. I believe there's some sample code on developer.apple.com for the Photos picker, from a session a few years ago, that should cover this. It sounds like you're getting PNG out, so I wouldn't necessarily expect this, but there can also be some conversion that happens in the Photos picker. For example, if the user has disabled captions or location from being given to your app via the picker, we may convert from some formats like RAW. But it sounds like PNG is the output.
So I would double check the UTType for Transferable. Makes sense. Thank you.
And the third question I have: is there a way to learn about the adjustment plist format that the Apple Photos app writes? And when exporting that file, is there a way to import those adjustments back into Photos? So we do not have API for that. If that's something that's desirable for your app, definitely file a feedback request. There's no way to decode that plist, unfortunately. Okay, makes sense, thank you.
Let's switch gears again. Let's talk about some pro features again. What is your recommended process to generate and write ISO conf, oh, we already answered this. Sorry, my bad. Oh, actually, there's one really high-level generic question you can go to. What are the biggest mistakes a developer can make when building camera-heavy apps on iPhone? Quitting your day job. A couple of really common ones would be, like I said earlier: AVCaptureSession is meant to be called on a background thread, on a dedicated serial queue. By design, it blocks and waits when it does a long reconfiguration of the graph. This is clearly documented. Hopefully you've read the documentation. So the most naive thing would be to just create an AVCaptureSession, start adding things to it, and call startRunning on the main thread. It will block your UI. You will get little spinners, and people will give you one star. So don't do that. The other one is when you're reconfiguring your session, usually you're going to change more than one thing at a time. If you just set one property on it, it doesn't really matter if you call beginConfiguration and commitConfiguration, because you're just doing one thing. But imagine you're changing from a photo mode to a video mode, or you're changing from one high-resolution active format to a lower one. Usually you're going to do several steps there.
Implicitly, the AVCaptureSession will reconfigure its underlying graph every time you set a property that causes a disruptive change. The way to prevent it from doing this each and every time you set a property, on the way to getting to what you really want to do, is to call beginConfiguration. Think of it like an ATM: this is the start of the transaction. I'm going to do a bunch of stuff, but hold on. Don't do anything until I'm done and I commit at the end. And if you do it that way, then whether you do one operation or 20 operations, the graph is not going to reevaluate and reconfigure itself until you say commit. So those are the two that come to mind as rookie mistakes, if you're not using those two features. Any other ones, guys? I was thinking about video data output, if you're using that to render preview. I think sometimes it's pretty easy: you get the frame data and you're all excited to do some processing, and you could end up actually dropping frames if you're doing too much heavy lifting in there. So using the preview layer, AVCaptureVideoPreviewLayer, to render preview, if that's your only motivation, is maybe a better option. Right, right. Yeah, the only real reason to use a video data output for preview is if you need to interact with the buffers in some way: if you need to get metadata from them, or draw on them, or meter them for histograms, or something like that. But yeah, our video preview layer is really, really efficient. I don't think you're going to do better than the optimized path that we have. That's great, yeah, thank you, Jake and Brad. Any more?
Let's move on to the next question. So the question is about a zoom, scroll, and pan feature. Any suggestions on a close-to-native-performance way to achieve a zoom, scroll, and pan feature for photo images? For example, up to only its original max resolution, without pixelating. And Davide, you want to take this?
So I believe the best way is actually to use Core Image. Core Image, as a framework, is actually built for supporting this kind of UI movement, zoom, and pan. And it does that by caching almost everything that happens throughout a decoding pipeline. I do believe there is a talk or a document on the developer portal that explains exactly this and shows how to set up a CIFilter for opening images, which then allows you to modify what you do. For instance, for a zoom, decode only this rectangle. Everything is cached as you move this rectangle around. So it's a really powerful framework for doing exactly this. One thing to note, just because I come from the ProRAW workflow: on the RAW side, Core Image also has its counterpart, CIRAWFilter. It has the same capabilities. So even if you open 100-megapixel images and then try to pan around, all these capabilities of Core Image are still present. Your scrolling and your zooming will still stay very, very smooth because of this caching capability that Core Image has. It's great to know. Yeah. If I can echo something, riffing off of that: the region of interest, the ROI management in Core Image, is really one of the places where it shines. If you, for example, have an image that has a number of heavy operations applied to it, and then you zoom in a lot and move around, it's only going to recompute the area that you're zoomed into, even if the image is many times larger. So it will do all kinds of optimizations like that to make sure that you're not wasting cycles on things that are not going to impact your application. Exactly. Yeah, thank you.
Got some really highly upvoted questions. Let's go to those. So, how many times can I use the intelligence feature? And also, when using it, does it affect or change my picture quality? Who can take this? This is about the Siri camera, yeah? Yep. I believe so.
I don't know if you want to take it. Yeah, sure. I'm happy to. So on the limit, you can use it as much as you want on the device. Those captures don't go to the user's normal photo library. They get saved to the Siri app. And as far as quality, they're at screen resolution and aspect ratio, so you're not getting the full-blown image quality that you get in photo mode, for example. So don't use it if you're trying to get the most beautiful photos. It's there in order to have a conversation with Siri about that photo. I believe there is some limit on the number of conversational things you can say or ask about a photo. Makes sense. Thank you for that.
Next question. The new spatial reframe feature uses 3D modeling to reconstruct a scene from a different angle. As a developer, can we access the pipeline via an API, or is it locked to the Photos app? So, yeah, there's no developer API for using that reframe capability. There are some parts of the ARKit framework that allow for 3D scene capture, reconstruction, and manipulation, but it's not an out-of-the-box Photos reframe edit solution. Right. And if you're interested in live captures, it's not exactly spatial features, but we do have depth-sensing cameras that you can use. On the Pro phones, we have the LiDAR sensing camera. So we do have an AVCaptureDevice that uses the LiDAR depth camera as well as the RGB camera and fuses them together to give you depth. And then if you're using the front-facing camera, the TrueDepth camera, there's infrared that you can use for capturing depth. But that's a good feature to request in Feedback Assistant. Yeah, sure. So file that Feedback Assistant request.
Next question is about rotation handling. When can we get the rotation issue just handled by the camera capture?
Since on iOS and macOS and iPadOS, it always seems to handle orientation differently, I spend so much time tweaking this for any new app and always get it wrong. Geometry is hard. Yes. I can't tell you how many times I've taken a piece of paper and drawn a smiley face, and then drawn it in all the different orientations, and then flipped it around. It's just reality. You're dealing with an ID, an industrial design, a device that's got a camera in it. The camera is physically mounted a certain way. A different camera in the same ID might be mounted a different way. One of them might be portrait. One might be landscape. There's really no getting around the need for dealing with orientation. It's not something that we can authoritatively do correctly for you, because we don't know your intent. We could say, well, gravity is here, so probably they want this to be up, but maybe not. Maybe it's an always-portrait app and that's the wrong thing to do. So this is why we don't just automatically change it and pull the rug out from under you, and you do need to deal with rotation. But the good news is we have some good tools that we've introduced in the last maybe three years. AVCaptureDevice.RotationCoordinator is something that can help you either dial in what the correct rotation angle should be for preview, or keep something horizon-level upright. So we have those two flavors that you can ask it for. And based on that, you can know how you need to rotate the images afterwards. And then you can also just tell the framework to do it for you. By talking to your video capture connection, you can set the videoRotationAngle on it, and then we will automatically rotate it for you. The way that we rotate it may be different depending on the output. If you're dealing with a video data output and you tell us to rotate the video, we will physically rotate the buffers. If you're using a photo output, we don't have to physically rotate them.
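The two flavors of angle and the connection property mentioned here can be sketched roughly as follows. This is a minimal Swift example, assuming iOS 17 or later and an existing device, preview layer, and output connection; the helper function names are invented for illustration.

```swift
import AVFoundation

// Hypothetical helper: keeps the preview upright using the coordinator's
// horizon-level preview angle.
@available(iOS 17.0, macOS 14.0, *)
func updatePreviewRotation(previewLayer: AVCaptureVideoPreviewLayer,
                           coordinator: AVCaptureDevice.RotationCoordinator) {
    let angle = coordinator.videoRotationAngleForHorizonLevelPreview
    if let connection = previewLayer.connection,
       connection.isVideoRotationAngleSupported(angle) {
        connection.videoRotationAngle = angle
    }
}

// Hypothetical helper: asks the framework to rotate captured output so it
// stays horizon-level, by setting the angle on the output's connection.
@available(iOS 17.0, macOS 14.0, *)
func updateCaptureRotation(connection: AVCaptureConnection,
                           coordinator: AVCaptureDevice.RotationCoordinator) {
    let angle = coordinator.videoRotationAngleForHorizonLevelCapture
    if connection.isVideoRotationAngleSupported(angle) {
        connection.videoRotationAngle = angle
    }
}

// Creating the coordinator for a given camera and preview layer.
@available(iOS 17.0, macOS 14.0, *)
func makeRotationCoordinator(device: AVCaptureDevice,
                             previewLayer: AVCaptureVideoPreviewLayer)
    -> AVCaptureDevice.RotationCoordinator {
    AVCaptureDevice.RotationCoordinator(device: device, previewLayer: previewLayer)
}
```

In practice an app would typically observe the coordinator's angle properties and call helpers like these whenever the values change, rather than reading them only once.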
We'll use EXIF data to tell the viewer to rotate it on playback. And same thing with movies. It would be great if, for the asker of this question, who was it, Joshua Arzenich? No, Michael Rowan. Great question. The new iPhone 17s, for the first time, have a front-facing camera that's oriented differently than on previous iPhones, which has caused some people some consternation. A member of my team, Tracy, talked about that in a session about the new Center Stage camera, and it's called Support the Center Stage Front Camera in Your iOS App. There is a portion of that session dedicated to the vagaries of rotation and how to do it correctly, so that you're defended against future changes to ID and rotation. You can write the code now, and it'll be correct in the future. That's great. Thank you.
The next question is about 24-megapixel image capture, from Joshua Arzensek. I hope I pronounced it well. Is it possible to capture 24-megapixel images with depth data on deferred photo capture? Yes. We covered 24 megapixels in one of the sessions this year, regarding capturing high-resolution images. And there are a couple of details that you need to follow. First, 24-megapixel processing is time-consuming enough that we require that you opt into deferred processing. And second, you should make sure that you opt into the max photo dimensions, and that the device you've selected actually supports 24 megapixels. But if you do that, and you select quality prioritization as well as enable depth data delivery, you will get captures with depth. Great. Great. So many good questions. Yeah, really good questions. Thank you so much for submitting all these questions.
We have a really good one here. Are the group labs shot on iPhone? Also, what APIs and technologies can you use to build such multi-camera streaming apps? Nice. The answer is yes. I'm staring at an iPhone, right?
There are many iPhones on tripods. We're seeing iPhones everywhere. Last year we did so much with iPhone 17's front camera. We called it the year of the selfie internally. But it was also the year of other cool stuff. It was probably the biggest year we've ever had for pro video. We had a number of new features introduced for pro settings. You can now use your iPhone and get amazing quality out of it, like the ProRes and ProRAW that we've talked about, but also functionality for when you want to do a multicam shoot. There were two new features that were specifically great for this. One of them is locked frame duration. If you're working in a pro environment, you need to make sure that your frame rate is exactly 29.97, no deviation ever. It's not enough to just try to set your min and max frame rate to the same thing. We introduced a locked frame duration API that ensures that the video is rock solid at 29.97, or 24, or 25, or 60, or whatever you need it to be, and that the audio is synchronized to that. Furthermore, there is an extension to that, which is that you can ask for an external sync source to be the one that multiple iPhones synchronize to. That's called genlock, and it's something that's used in pro environments all the time to make sure that all of the cameras are synced up, so that when you put them all in a timeline, there's no tearing. Everything is at exactly the same start time. So using a third-party Blackmagic ProDock, you can plug one of those into an iPhone. You can use an external genlock generator. You can put them all on that same time source, and then all the recordings will be perfectly synchronized when you bring them into Final Cut later. And then lastly, we also introduced timecode generation. So you can now get timecode from an external source through your AVCaptureSession as an output. And you can associate timecode with video frames that you store to a movie.
So in addition to having metadata, video, and audio, you've also got a timecode track, which makes it much easier in post to say, go to timecode number whatever. So, yeah, we're using it here. There's also a ton of API that you can use for that, too. Yeah, 100%. Absolutely. Maybe just to plug the AVPro Video Storage API that we just released this week. If you're using ProRes for your 4K 30 video captures, we're writing a ton of data to disk. And AVPro Video Storage really helps to give you deterministic file write speeds, so that you're not dropping any frames. We talked a little bit about it in the session that I had this week, and there's a lot of documentation on it. It's, I think, a pretty cool feature for people to start using. Totally. That's great. Thank you.
And I think we have time for just one more question. Let's just give it a brief answer. Is there a niche new API that may not be talked about so much? I think Jake just hit it there, that brand new one that we introduced. It's a great performance optimization. If you are interested in ProRes, if you want to capture to ProRes, you really should use this new ProVideo Storage API. It gives you a pre-allocated file on your disk that you can write to. Even if your phone is old and it's got a fragmented disk, you can be assured that it's not going to drop frames. So we're retrofitting older phones with the same API. You can use it, and we encourage you to use it. It may have worked okay before, but now you can ensure that it will work well for years to come. It's great for a professional setup like this. Yeah. The settings UI is actually pretty nice for it too. Users can control exactly how much of their disk space they actually want to dedicate to this. Another great one, Davide, is ProRAW. We have a new RAW 9 engine in the 27 version of the OS that I think is going to blow everybody's minds, because it's an ML solution for problems like debayering and denoising RAW files from third-party cameras, and the outputs are outstanding. Yeah, and unfortunately we won't be able to double-click on that one today, but there is a great session by David Hayward, our veteran on the camera team, on enhanced RAW image processing with Core Image. So anyone who is interested in that, please watch that. It talks about the difference between RAW 8 and RAW 9.
And that's about all the time we have today for this group lab. Thank you, everyone, for joining us today. And big thanks to all our panelists for such an insightful conversation. I hope you enjoyed it as well. As I mentioned earlier, if you have more questions, the developer forums are a great place to get them answered. And don't forget, you can file bugs or feature requests with Feedback Assistant. We actually do read them. And speaking of feedback, we'll send you a survey link via email. Your participation in this survey is so important for us. It will help us make these events better in the future. So please fill out that form. And with that, thank you again, and hope you have a great WWDC.