Ep. 32 — A culture of experimentation — Lessons learned from creating TwinCam
Intentional practice helps us get better, whether it's basketball, painting, or public speaking. The same strategy applies to software engineering.
If we regularly explore new APIs and toolsets and tinker with them, we get better at turning ideas into products efficiently.
Experiments are essential and fun…
TwinCam is one of these experiments I ran recently. iOS's AVFoundation framework has opened up and now allows developers to use multiple cameras simultaneously (front, back, wide-angle, and depth).
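The multi-camera capability mentioned above is exposed through `AVCaptureMultiCamSession`. Here is a minimal sketch of wiring up the front and back cameras at once, assuming iOS 13+ and a device that supports multi-cam capture; error handling and output configuration are trimmed for brevity.

```swift
import AVFoundation

let session = AVCaptureMultiCamSession()

func configureTwoCameras() throws {
    // Not every device supports simultaneous capture; check first.
    guard AVCaptureMultiCamSession.isMultiCamSupported else { return }

    session.beginConfiguration()
    defer { session.commitConfiguration() }

    // Back wide-angle camera (the landmark view)
    if let back = AVCaptureDevice.default(.builtInWideAngleCamera,
                                          for: .video, position: .back) {
        let backInput = try AVCaptureDeviceInput(device: back)
        if session.canAddInput(backInput) { session.addInput(backInput) }
    }

    // Front camera (the selfie view)
    if let front = AVCaptureDevice.default(.builtInWideAngleCamera,
                                           for: .video, position: .front) {
        let frontInput = try AVCaptureDeviceInput(device: front)
        if session.canAddInput(frontInput) { session.addInput(frontInput) }
    }
}
```

In a production multi-cam app, Apple recommends adding inputs and outputs with explicit connections (`addInputWithNoConnections`) so you can control which camera feeds which output, but the simplified form above captures the idea.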
I got curious. 🤔 What if I combined the images from two cameras: the front camera recording my face, and the back camera capturing the landmark in front of me? Could I fuse them together in real time?
There is only one way to find out… Just Do It.
The resulting real-time experience is pretty interesting.
TwinCam is not too hard to build with today's machine-learning toolkits. In this experiment, I used Google's ML Kit Selfie Segmentation (so I can try the same idea on Android too). iOS also has a native Vision framework that can do it just as efficiently.
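For the segmentation step, here is a sketch of what the ML Kit call looks like on iOS, assuming the `MLKitSegmentationSelfie` CocoaPod; the option and class names follow the ML Kit quickstart as I recall them, so treat them as an approximation rather than a verified API listing.

```swift
import MLKitVision
import MLKitSegmentationSelfie
import AVFoundation

// Stream mode is tuned for video: it trades a little accuracy
// for frame-to-frame latency and temporal smoothing.
let options = SelfieSegmenterOptions()
options.segmenterMode = .stream
let segmenter = Segmenter.segmenter(options: options)

func process(frame: CMSampleBuffer) {
    let visionImage = VisionImage(buffer: frame)
    segmenter.process(visionImage) { mask, error in
        guard let mask = mask, error == nil else { return }
        // mask.buffer is a CVPixelBuffer of per-pixel foreground
        // confidence (0...1). Use it as an alpha matte to composite
        // the front-camera selfie over the back-camera frame.
    }
}
```

The mask is just a confidence map, so the compositing itself is ordinary image math: threshold or blend each selfie pixel over the landmark frame according to its confidence value.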
Experiments are fun because you can change direction as you see fit. For TwinCam, the size of the self-image matters a lot, so I added a slider that lets me adjust it on the fly.
Here are my top four takeaways from building this:
- You can do a lot of heavy lifting (e.g. object detection, image transforms) on every preview frame in real time without lag, even with multiple camera streams active at the same time on older devices like the iPhone X. Modern phones are very powerful.
- You don't have to learn ML concepts like hyperparameter tuning to create cool experiences. Toolkits like ARKit and ML Kit are developer-friendly and require zero ML experience. A sample app is all you need to get going.
- Code that executes at very high frequency requires different debugging techniques. Early versions of TwinCam crashed mysteriously in production (SIGSEGV), and only on certain devices. It took me many nights to reproduce the crashes and locate the root cause. I couldn't rely on good old print statements because they would generate too much noise and too many side effects. (Side note: the logic error was in the camera-initialization code, not even in the critical per-frame, high-frequency processing path.)
- Physical device farms like BrowserStack work really well. The free tier was good enough for my exploration, and there are plugins for tools like fastlane to automate your build/deploy/test workflow.
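To make the first and third points above concrete: the per-frame work happens inside AVFoundation's sample-buffer delegate, and instead of `print()` you can trace it with signposts, which Instruments can visualize without flooding the console or perturbing timing. A minimal sketch, with the subsystem name being an illustrative placeholder:

```swift
import AVFoundation
import os.signpost

final class FrameProcessor: NSObject,
                            AVCaptureVideoDataOutputSampleBufferDelegate {
    // .pointsOfInterest makes these intervals show up in Instruments.
    private let log = OSLog(subsystem: "dev.example.twincam",
                            category: .pointsOfInterest)

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        let id = OSSignpostID(log: log)
        os_signpost(.begin, log: log, name: "frame", signpostID: id)
        // ...segmentation + compositing for this frame goes here...
        os_signpost(.end, log: log, name: "frame", signpostID: id)
    }
}
```

Signposts are cheap enough to leave in per-frame code, which is exactly where print-style logging falls apart.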
I hope you found this post useful. You can download TwinCam for free here.
A few other experiments, if you are curious; the technical lessons are usually captured on my blog (https://billylo.medium.com):
- Sidekick: turns your iPhone into a dashcam with maps and navigation, speed-camera alerts, fuel-cost calculations, and toll-road optimization.
- Plane Above Me: What’s that plane flying above me? A 737 flying to Frankfurt?
- How Deep: shows water depth and altitudes anywhere.
- Travel Shopping Buddy: aim your camera at a price tag and it shows you the tax-inclusive amount in your home currency for easy comparison.
- BuzzWatch: to help the hearing impaired, this Apple Watch app listens for important sounds (e.g. car horns) and promptly alerts the wearer with wrist vibrations. (Open-sourced.)
- SmartAlert: An Android app that works like an alarm monitoring service, except it’s free.
- SuperRoute: generates routes using Google Maps, Apple Maps, MapQuest, HERE, OpenStreetMap, Mapbox, and Azure in one click, so you can choose the one you like best.
- MyCommunityWatch.org: notifies you by email when a major crime is reported in your area (data sourced from police services).
- Try Something New: every time a new restaurant opens in your neighbourhood, this website emails you its details so you can decide whether it's worth trying! (Powered by Google Maps and Yelp.)