Amazon Go and the Emergence of Sentient Buildings: How It Works and What Its Impact Will Be
on Apr 12, 2018
Amazon Go embodies very impressive advances in computer vision, sensor fusion, and deep learning combined to create what is probably the simplest and fastest bricks and mortar shopping experience available today. Here we discuss how they do it and its broader applicability and impact.
It was 2013 and I was mesmerized by the huge construction project across the street from my Dad’s 14th floor apartment at the north end of downtown Seattle. It was awe-inspiring to see thousands of people, machines, and materials execute a choreographed dance within such a confined space. Turns out they were building the Doppler Tower, aka Amazon Tower 1, part of Amazon’s new Seattle headquarters complex covering three city blocks, including two new towers over 500 feet tall. And these were only a fraction of the 12 million square feet, across 40 buildings in Seattle, that Amazon could occupy by 2022. Their headquarters complex has completely changed the nature of the north end of the city. But not as much as Amazon has completely changed the nature of supply chains!
Figure 1 - Amazon's New HQ—Tower 2 (Future Location of Amazon Go) and Spheres Under Construction in 2015
When it comes to disruption of business as usual for supply chains, Amazon towers above the rest. They are often spoken about by competitors with a combination of reverence and fear. Amazon has achieved their almost mythic status by continually innovating at a pace that others find hard to keep up with. They have continually upped the ante on ease-of-use, selection, rapid delivery, and the overall user experience. And they have branched out into other areas such as dominating the cloud infrastructure market with their AWS cloud services, a broad array of logistics/transportation services and assets, and bricks and mortar retailing with their Whole Foods acquisition and Amazon Go.
I visited the Amazon Go store while I was in Seattle recently. It is located at the very headquarters complex that I watched being built five years ago, on the first floor of Day 1, aka Amazon Tower II. That is currently the only Amazon Go store in the world, although some sources1 say Amazon is planning to open as many as six new Go stores this year. Amazon declined to comment on whether and when they will open new stores, saying only “It’s too early to speculate on that. Right now, we’re laser focused on customers and delivering good food to them fast.”
Figure 2 - Amazon Go Store is a Novel Attraction for Tourists (but Routine for the Recurring Shoppers)
The Amazon Go store in Seattle is about 1,800 square feet. I might describe it as a checkoutless convenience store, with a good selection of prepared-onsite fresh food. Amazon described it to me this way: “We think of Amazon Go as the fastest place to grab hand-made meals and snacks, essential groceries, and fresh, seasonal meal kits to go. No checkout. No waiting in line. Just good food, fast. It’s the first store with Just Walk Out Technology,2 a brand new technology allowing customers to grab what they want and just walk out.”
The User Experience: Getting Started
Downloading and installing the ~11MB Amazon Go app was quick. Once installed, you log in using your regular Amazon account ID and password. The first time you log in, the app takes you through a 9-step animated tutorial which explains how to scan yourself into the store and how to shop. It took about a minute to go through the whole tutorial. I was impressed by how clear and easy-to-understand the tutorial was, especially with the simple animated graphics showing you exactly what to do. Since I had more than one credit card on my account with Amazon, it also asked me which one I wanted to use.
Figure 3 - Screenshots of Animated Startup Tutorial, Step 2 (of 9): How to Let In Friends and Family
Next the app displays a 2D barcode to identify you, which you use to get into the store. After initial setup, the 2D barcode is the first thing displayed when you run the app (i.e. no login required). That fits with Amazon’s relentless pursuit of removing friction from the shopping process (“don’t make me log in to my account just to get in your store!”), but it also means anyone who physically possesses your phone can use it to shop, without knowing your Amazon ID and password, if they can get into your phone.3 The app does have an optional setting to make it require a login upon launch, though it doesn’t do so by default. Also, the app was smart enough to prevent me from taking a screenshot of the 2D barcode, I assume so I wouldn’t share it on social media, possibly letting the whole world shop on my account. They redraw a new and different 2D barcode on the phone every 30 seconds, which I would imagine provides some degree of protection against people being able to shop on my account just by taking a picture of my barcode.4
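Amazon has not disclosed how the rotating barcode is generated. As a minimal sketch of one common approach (a time-windowed code in the spirit of one-time-password schemes), assume the phone and the gate share a per-account secret and have roughly synchronized clocks. The names `entry_code` and `gate_accepts`, the secret, and the drift allowance are all hypothetical; only the 30-second interval comes from what I observed.

```python
import hashlib
import hmac
import struct

STEP_SECONDS = 30  # matches the 30-second redraw interval observed in the app


def entry_code(secret: bytes, account_id: str, now: float,
               step: int = STEP_SECONDS) -> str:
    """Derive a short-lived code from a shared secret and the current time window.

    Hypothetical scheme: the payload encoded in the 2D barcode would carry this value.
    """
    window = int(now // step)  # same value for every instant inside one window
    message = account_id.encode("utf-8") + struct.pack(">Q", window)
    return hmac.new(secret, message, hashlib.sha256).hexdigest()[:16]


def gate_accepts(secret: bytes, account_id: str, presented: str, now: float,
                 step: int = STEP_SECONDS, drift_windows: int = 1) -> bool:
    """Gate-side check: accept codes from the current window or its neighbors.

    The neighbor allowance (an assumption) tolerates small clock differences.
    """
    for offset in range(-drift_windows, drift_windows + 1):
        expected = entry_code(secret, account_id, now + offset * step, step)
        if hmac.compare_digest(expected, presented):
            return True
    return False
```

Under a scheme like this, a photo of someone's barcode would stop working within a minute or so, which is consistent with the protection described above, though the real system may work quite differently.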
Entering the Store
Figure 4 - Scanning in to Enter the Store
Getting into the store is somewhat similar to entering a mass transit system that has RFID readers with mechanized ‘saloon doors’ (i.e. small swinging doors), except Amazon Go’s doors are clear plastic and you use your phone instead of an RFID card. It is also somewhat like scanning a 2D barcode boarding pass at an airport gate. So, for many people this will feel familiar. At or near the store entrance (outside of the entry gates) there were always one or two orange-shirted Amazon associates helping people with any issues. The most common task was simply telling shoppers that they needed to download the app and explaining how it worked. But I also saw them help one fellow whose phone battery had died (they keep chargers for common phones handy), and they were happy to answer other simple questions. My guess is that over time (i.e. over years, not months), once stores have become established in a geography and more people get used to the routine of scanning themselves in, they may be able to reduce the need for constantly staffing the entrance.
Buying and Paying
You buy and pay for things by picking them up and walking out of the store. That’s it. Thus, Amazon Go epitomizes the company’s obsession with removing friction from shopping.
How It Works: The Basics
Amazon Go embodies some stunning recent advancements in technology, particularly in sophisticated computer vision, sensor fusion, and deep learning. Amazon is understandably fairly tight-lipped about how this all works. They don’t want to make it easy for competitors to reverse engineer their technology. As a result, a fair amount of what has been written about how they perform this magic is speculation. In this article, we will endeavor to make clear distinctions between direct observations and speculation.
The easy part for the system is identifying you … or at least identifying your phone. The system uses the 2D barcode from the Amazon Go app on your phone to unambiguously link whoever has the phone5 to your Amazon account when they scan into the store.
Now comes the hard part. From the moment you scan and enter into the store until you leave, the system needs to keep track of you, where you are, what items you pick up (and how many) and what items you set down. Thus, it keeps a running tally of exactly what items you walk out of the store with. That is a truly amazing feat, IMHO.
I’ve seen a lot of different assertions on what kinds of sensors Amazon is using in their store, with people talking about everything from lasers to infrared to RFID. Amazon has confirmed they are not using RFID for tracking items in their Go store.6 One thing’s for sure, there is a lot going on in that store.
Figure 5 - Ceiling of Amazon Go— A Lot Going on Up There!
Figure 6 - Rows of Cameras and Lighting Mounted on Outer Edge of Each Shelf, Pointing at Shelf Below
About 200 cameras in the ceiling, all pointed at different angles. I’m guessing that part of the secret sauce is getting these positioned properly.
About 120 units of another device in the ceiling, also deliberately pointed at a variety of angles. These boxes have heat sinks on them and I’d estimate they are approximately 9” x 6” x 2”. I don’t know what these are or why they would need a heat sink.
A few thousand cameras mounted to the shelves. Under each shelf, at the front of the shelf, is a row of lights and cameras pointing down at the items on the shelf below. I roughly estimated the store has about 1,000 linear feet of shelving with perhaps four or so cameras per foot of shelf.7
An unknown number of scales. At least some of the shelves have scales in them to help detect when items are taken off or put back on the shelf (Amazon confirmed this for me). When I pushed down on the shelves, none of them felt like a typical scale (i.e. there was no give at all in any of the shelves). But I’m not an expert on scales, so there may be a type of scale that does not move when you push on it.
Half a dozen or so squarish boxes about 12” x 12” x 3”, possibly WiFi8 access points?
About 10 big cylinders, maybe 18” in diameter, 18” tall with 8 square baffle-like openings on the bottom. These devices are a mystery to me and may not even have anything to do with the sensing system.
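The shelf-camera figure in the list above is a back-of-the-envelope estimate, and the arithmetic can be written out as a small sketch. The inputs are my rough observations (about 1,000 linear feet of shelving, about four cameras per foot, and an error band of 50% either way); the function name is just for illustration.

```python
def shelf_camera_estimate(linear_feet: float, cameras_per_foot: float,
                          relative_error: float) -> tuple[float, float, float]:
    """Return (low, central, high) camera counts for the shelving."""
    central = linear_feet * cameras_per_foot
    return (central * (1 - relative_error), central, central * (1 + relative_error))


# Rough observed inputs: ~1,000 ft of shelving, ~4 cameras per foot, +/- 50%.
low, central, high = shelf_camera_estimate(1000, 4, 0.5)
print(f"Shelf cameras: about {central:,.0f} (plausibly {low:,.0f} to {high:,.0f})")
```

Adding the roughly 200 ceiling cameras does not change the picture much: the shelves account for the overwhelming majority of cameras in the store.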
Figure 7 - Every Shelf Has an Ethernet Connection
Computer Vision, Sensor Fusion, and Deep Learning
Amazon is using a combination of computer vision, sensor fusion, and deep learning to accomplish the task of tracking individuals and what they pick up or put down. Specifically, they are inferring pose for articulated motion analysis in crowded scenarios. In other words, they are continuously analyzing the video streams to build a model of the position and pose of each person in the store: what position each person’s limbs and hands are in and how they are moving through space. This involves modeling the human body and constraining its various joints to feasible positions. That is hard enough when there is only one person in the picture. It becomes really hard when there are many people, because body parts are easily hidden behind one another, making it difficult to figure out what is what.9 I assume that is why there are so many cameras, to ensure there is no configuration of people that will fool the system.
Figure 8 – Example Articulated Pose Recognition (Single Person, Front View, Limited or No Occlusion)
They use deep learning algorithms to recognize products on and off the shelves, even in the presence of occlusion (blockages of the image). Image recognition is an area where we’ve seen tremendous advances from deep learning. Amazon can recognize an item even when it is partially hidden.
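To make "sensor fusion" a little more concrete, here is a deliberately simplified sketch of how a shelf scale reading could be combined with image recognition scores. This is speculation for illustration only: Amazon has not described its algorithm, and the `Product` catalog, the `fuse_shelf_event` function, the weights, and the tolerance are all hypothetical. The idea is that the weight change rules out products that cannot explain it, and the vision score breaks ties among those that remain.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Product:
    sku: str
    unit_weight_g: float  # hypothetical catalog weight per item


def fuse_shelf_event(vision_scores: dict[str, float], weight_delta_g: float,
                     catalog: dict[str, Product], tolerance_g: float = 5.0,
                     max_qty: int = 5) -> Optional[tuple[str, int]]:
    """Return the (sku, quantity) consistent with both sensors, or None."""
    removed_g = -weight_delta_g  # a negative scale delta means weight left the shelf
    best: Optional[tuple[str, int]] = None
    best_score = 0.0
    for sku, score in vision_scores.items():
        unit = catalog[sku].unit_weight_g
        qty = round(removed_g / unit)
        if qty < 1 or qty > max_qty:
            continue  # this product cannot explain the weight change
        if abs(removed_g - qty * unit) > tolerance_g:
            continue  # nearest whole quantity still misses the measured weight
        if score > best_score:  # among consistent products, trust the cameras
            best, best_score = (sku, qty), score
    return best


catalog = {"soup": Product("soup", 300.0), "soda": Product("soda", 355.0)}
# The cameras slightly favor soup, but 710 g leaving the shelf only fits two sodas.
print(fuse_shelf_event({"soup": 0.55, "soda": 0.45}, -710.0, catalog))
```

The point of the design is that neither sensor has to be perfect on its own: a partially hidden item can still be resolved if only one candidate matches the weight that left the shelf.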
It is not clear how well the system knows if a product is handed from one person to another. Their stated policy, which appears right there in the introductory tutorial, is that the person who takes an item off the shelf pays for it, even if they hand it to another person who walks out with it. However, a shopper doesn’t pay for an item if they pick it up but then set it down before leaving the store. So, perhaps knowing that someone set an item down is an easier problem to solve than keeping track of who handed an item to whom.
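The billing rules described here amount to a running tally per shopper, which can be sketched in a few lines. This is only a model of the stated policy, not Amazon's implementation; the `StoreSession` class and its method names are hypothetical, and it assumes the hard part (deciding who took or returned what) has already been done by the sensing system.

```python
from collections import Counter, defaultdict


class StoreSession:
    """Tracks a virtual cart per shopper under the policy stated in the tutorial."""

    def __init__(self) -> None:
        self._carts: defaultdict[str, Counter] = defaultdict(Counter)

    def took_from_shelf(self, shopper_id: str, sku: str, qty: int = 1) -> None:
        # Whoever takes the item off the shelf is charged for it.
        self._carts[shopper_id][sku] += qty

    def returned_to_shelf(self, shopper_id: str, sku: str, qty: int = 1) -> None:
        # Setting an item back down removes it from that shopper's tally.
        cart = self._carts[shopper_id]
        cart[sku] = max(0, cart[sku] - qty)
        if cart[sku] == 0:
            del cart[sku]

    def handed_to(self, from_shopper: str, to_shopper: str, sku: str) -> None:
        # Stated policy: a hand-off does not move the charge, so nothing changes.
        return None

    def receipt(self, shopper_id: str) -> dict[str, int]:
        return dict(self._carts[shopper_id])


session = StoreSession()
session.took_from_shelf("alice", "sandwich", 2)
session.took_from_shelf("alice", "soda")
session.returned_to_shelf("alice", "soda")
session.handed_to("alice", "bob", "sandwich")
print(session.receipt("alice"))  # Alice still pays for both sandwiches
print(session.receipt("bob"))    # Bob walks out with one but is charged nothing
```

Notice that the hand-off rule is a no-op in the tally, which may be exactly why the policy is written that way: the system never has to detect hand-offs at all.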
Regarding facial recognition, Amazon said “We don’t identify customers through facial recognition. We use the QR code generated by our app to associate customers with their account when they enter the store.” So, it is clear that they do not use facial recognition to identify customers as they come into the store … they don’t ‘remember’ what someone looks like from one visit to the next. It is less clear whether they use facial recognition after a shopper has entered the store, during a single shopping visit, to distinguish one shopper from the other shoppers. In any case, privacy would not be compromised, as Amazon is not associating the face with an identity.
When you step back and think about the different layers of intelligence here (how Amazon engineers have gotten the system to recognize people and their poses; how the store system stitches together inputs from thousands of different devices to form a cohesive understanding of each person and every item in the store; how they are able to deal with so many different potential scenarios and challenges of occlusion; all of the work that must have gone into testing, tweaking, tuning, and refining to get it to be nearly foolproof), it is extremely impressive.
Implications for Fixtures, Shelving, Packaging, and Selection
It is amazing what Amazon is doing with this technology, but that doesn’t necessarily mean this solution is ready for all kinds of products and store formats. Here we speculate about its limitations. For example, the current setup has a dense array of cameras on every shelf which we assume is probably required to make this all work. That setup may be difficult to duplicate in other settings, such as in a clothing store with lots of hanging items on racks or in the bulk produce section of a grocery store or the lumber section of a home improvement store. It is also unclear whether the system of scales on the shelves can reliably handle and price food that is sold by the pound.
Figure 9 - Adjustable Dividers on Shelving
Shelves are compartmentalized, with adjustable plastic dividers between each section. The adjustment slots are about 3/8” apart, allowing fairly fine-grained adjustments to the width of each slot. The entire store had these dividers with only a single row of items per slot in all cases. So, there were no piles of produce as you might see in a typical grocery store. This makes for a very neat looking and well-organized store, but its real purpose may be to enable more reliable recognition of individual items and tracking of when they are removed. It may also have something to do with the arrangement of scales and making them work more reliably (again speculating on these last two points).
Potential Impact on Loss Prevention and Shrink
It is interesting to consider the impact these systems will have on loss prevention and shrink. This system should eliminate sweethearting, which is where cashiers give items to people they know without charging them (often by doing ‘fake’ swipes over the barcode reader to make it appear like they are ringing up the items without actually incurring a charge). It seems like it may also make many conventional forms of shoplifting or employee theft more difficult … possibly much more difficult. People will always try to find a way to fool the system, so it remains to be seen how foolproof the system is.
Nevertheless, it could have a substantial impact on shrink. The average shrink rate for US retail last year was about 1.4% of revenue. About 2/3 of this loss, or just under 1% of revenue on average, is due to theft (external and internal) with the other 1/3 from administrative errors, vendor fraud, and ‘unknown.’ Considering that the average net profit for convenience stores is about 1.8%, if Amazon can eliminate most of that theft, they will have increased net profits by about 50% … and that is before considering any other impacts of this technology on profitability.
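The arithmetic behind that estimate can be checked with a short sketch. The inputs are the averages quoted above (1.4% shrink, two-thirds of it theft, 1.8% net margin); the simplifying assumption, which is mine, is that every dollar of theft avoided drops straight to net profit. The function name and the 80% scenario are illustrative.

```python
def profit_uplift(shrink_rate: float, theft_share: float, net_margin: float,
                  fraction_eliminated: float) -> float:
    """Relative increase in net profit if a fraction of theft is eliminated.

    All rates are fractions of revenue. Assumes recovered theft losses
    flow directly to net profit (a simplification).
    """
    theft_loss = shrink_rate * theft_share        # theft as a fraction of revenue
    recovered = theft_loss * fraction_eliminated  # what the store gets back
    return recovered / net_margin


for fraction in (1.0, 0.8):
    uplift = profit_uplift(0.014, 2 / 3, 0.018, fraction)
    print(f"Eliminating {fraction:.0%} of theft raises net profit by about {uplift:.0%}")
```

Eliminating all theft gives an uplift of roughly 52%, and eliminating 80% of it gives roughly 41%, so "about 50%" is the upper end of what "most of that theft" would deliver.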
The Financial Model: Upfront Capital Costs, Store Velocity, Staffing, and Profitability
The Amazon Go model changes the dynamics of the store’s financial model. There is a higher up-front capital expense compared with a typical convenience store. How much higher? We can guess (my gut says it is somewhere between $100K and $500K), but you shouldn’t put too much stock in that guesstimate. It is really hard to say with any level of confidence without more visibility into all the pieces involved. Regardless, there is some non-trivial additional upfront capital expense involved in rolling out one of these stores.
Figure 10 - In Traditional Stores, Checkout Lines Consume Space and Shoppers' Time
On the other side of the equation, an Amazon Go store should have a higher velocity of sales than a traditional store. There is no wait time in checkout lines, so people get in and out of the store faster. That means a given size store can support more traffic, more sales per hour. Another way to look at it is the space that would normally be needed for registers and lines can instead be dedicated to more shelves and products.10 Therefore, with more products in the same footprint, the store should produce higher sales per square foot.
Of course, that assumes that the same products with the same price and profit margins will be sold in an Amazon Go store as compared with an equivalent convenience store. But this is not your typical convenience store. A significant portion of the Amazon Go store’s products are fresh prepared-onsite items and meal kits that you won’t find in your typical 7-Eleven or service station convenience store. These are higher margin products that should further enhance the revenue per square foot advantage of these stores.
The degree of labor savings is more of a question mark. Yes, there are no cashiers. But the higher velocity means the shelves need to be stocked more frequently, requiring slightly more labor. The alcoholic beverage section of the store has a dedicated employee to check IDs. There are one or two people out in front of the entrance gates at all times, helping people to download the app and answering questions. And there are quite a few people preparing the fresh food (roughly the same amount of labor may be needed for offsite fresh food preparation, but that would not count as in-store labor). So, it is possible, maybe even likely, that this format actually requires more labor than the same size convenience store.
In summary, there is a significant additional upfront investment in capital required to put up one of these stores, but it pays off over time by higher revenue and profit per square foot. Labor costs may not be lower. Rather the higher profit is likely achieved by a combination of lower shrink, higher velocity of sales, and higher margin products. A more precise analysis and estimation of these financial impacts is beyond the scope of this article, but it would make an interesting study for the future.
Impact on the Broader Retail Landscape
While the technology in Amazon Go may be revolutionary, I believe the impact of these changes across retail will be evolutionary. It will take time. Amazon first needs to learn from their own efforts and indications are they will not roll out a lot of stores until they are ready. Furthermore, at this stage the technology seems only doable in small format, convenience-type stores. Eventually it will expand to larger grocery stores, but that may take many years.
Figure 11 - In Many Retail Settings (Such as Luxury Apparel) the Speed of Getting in and Out is Not the First Consideration
In most retail sectors, the speed of getting in and out of the store is not the first consideration. Shopper experience is key and that may mean personal attention, unique and engaging in-store experiences, opportunities for learning and discovery, attractive selections of merchandise, highly knowledgeable sales staff, and exuding exclusivity and class. None of that is accomplished by a grab and go approach. Yes, even luxury clothing stores may eventually adopt checkoutless technology, but it is pretty far down the list for them and it will likely be many years (maybe even a decade or two) before it becomes common. The impact of Amazon’s entry is likely to be felt soonest in convenience stores and next in grocery. This includes online grocery; Amazon Go may actually displace some online grocery shopping for shoppers who place a premium on convenience and instant gratification.
There is the open question of where competitors would get the technology to deliver a checkoutless store. My working assumption is that Amazon will not be sharing this technology with competing retailers, much as they do not share their warehouse automation technology. I could be wrong. Amazon is more than happy to sell AWS services to competing retailers, but AWS does not offer Amazon as direct a competitive advantage as the Go store seems to. We do see a number of startups aimed at providing Amazon Go-like technology and capabilities to competing retailers. However, it is really early and hard to tell which of these will succeed until we see some serious pilots and rollouts.
New Data Being Generated and How It Will Be Used
The Amazon Go store will generate a tremendous amount of new data. Amazon has revolutionized the use of online shopping data, following your every move online to make recommendations and try to get you to buy more. It would only make sense for Amazon to add into the mix the in-store behaviors of each shopper, as observed by the system running the Amazon Go store, to gain a fuller understanding of the shopper’s combined online and offline shopping persona, tastes, and shopping patterns. While other retailers have been striving to provide a unified omnichannel experience, Amazon has the opportunity to have a truly seamless connection between their physical stores and online shopping.
The Amazon Go data should also have a lot of value to CPG manufacturers, as well as to Amazon’s own merchants, planners, and store designers. They have the opportunity to observe customer engagement, items considered but not bought, shopping sequences, path to purchase, impact of different store layouts, impact of different planograms, product codependencies, influence of offline shopping on online shopping (and vice versa), and more. Of course, privacy concerns will need to be considered in all of this as well. It will be very interesting to see where Amazon goes with all this new data.
The Emergence of Sentient Buildings
Looking beyond the impact on retail, Amazon Go represents the emergence of the ‘Sentient Building’ that understands the actions and intentions of its occupants. This kind of awareness could be used for many different things. It might become a safety feature where a building ‘knows’ when one of its occupants is in danger, or has been injured, or an unauthorized intruder has entered. Or it might help with evacuation in an emergency or direct emergency personnel to save trapped individuals. It could be used in a hospital setting to instantly locate key personnel or pieces of equipment that are needed in a hurry. The sentient home could help monitor elderly people to ensure their needs are met and/or they are compliant with treatment regimens. If they opted in, someone might want their building to act almost as a virtual life coach, gently reminding them to take a break and step away from their computer to take a walk.
Some of these scenarios (especially that last one) require a high level of emotional intelligence or empathy to get right. There are technology companies attempting to provide that kind of emotional intelligence, such as Emotion Research Lab (based in Spain), which refers to itself as the ‘artificial empathy’ company (I assume that is AE to complement AI). They have algorithms that monitor facial muscles to determine the emotional state of a person. There are a handful of other companies working on emotion recognition as well.11 That will eventually become a key part of a building successfully anticipating and responding to the needs of its occupants. And not just buildings, but vehicles and other environments. Emotion Research Lab says they have the ability to measure attentiveness, which might be a great safety feature for ensuring that drivers and pilots of trucks, trains, planes, and ships are paying attention, potentially saving lives.
Thus, Amazon Go augurs more than a new way of shopping. It is ushering in new ways of interacting with our environment, with more intelligent buildings and vehicles. This technology has the potential to radically invade our personal space and privacy, sensing our every move, mood, and intention. That is a scary thought and a deep concern that needs to be addressed, which we may cover in future articles. One can only hope these technologies are used for good, to make things better for mankind.
For more on automation, AI, and the future of work, read Thinking Machines in this issue.
4 I notice the barcode looks completely different each time it is regenerated. It is not clear whether there is some sort of time synch going on between the phone and the cloud application used to control the entry doors, or whether the phone sends a new ‘fingerprint’ every 30 seconds (this seems less likely), or some other scheme. In any case, it seems this feature adds a layer of security to the system compared to a static barcode.
5 As stated earlier, whoever has the phone (if the person can access and run the Amazon Go App) can shop at an Amazon Go store and all purchases they make will be charged to the Amazon account associated with the phone. An unprotected phone (or weakly protected phone) with Amazon Go is thereby almost like having a wallet full of cash, up to your credit card limit. People need to understand this and set up a strong password or biometrics or other means of preventing unauthorized access to their phones. Not to lecture, but really, you should be doing that anyway.
6 As I noted in an earlier post, grocery is a low margin business where many items have just pennies of profit. Even the lowest cost UHF RFID tags at high volumes, once you add up all the costs it takes to produce, commission/serialize, apply, and test them, are about $0.07 - $0.12 each. In contrast, using video analytics, sensors, and machine learning adds $0.00 of variable cost to each item. There are also challenges using RFID around liquids and metals, which would make something like a can of soup or soda difficult to reliably read. Companies have found some ways of overcoming those limitations up to a point, but probably not good enough for the performance Amazon needs here.
7 Doing the math, that comes to about 4,000 cameras just on the shelving, keeping in mind that number might be off by 50% or so either way.
8 These also looked a bit like RTLS RFID readers I’ve seen, but that seems unlikely, unless they are using RFID tags to track employees.
9 Even we humans get fooled by occlusions (the technical term used by image analysts to refer to an obstructed view of part of what you’re trying to analyze). You may have seen some of those wacky, disorienting pictures where two people’s legs appear swapped, or someone appears to be wearing another person’s hair, or legs seem to disappear, illusions created just by the angle and timing of the picture.
10 A portion of that space is still needed for the entry/exit gates, so you aren’t reclaiming 100% of the checkout space.
11 My impression is that we are very early in the development of those kinds of emotion detecting technologies, so the initial uses for them will likely be quite restricted to a specific set of use cases matching their capabilities. But like all of these technologies, emotion detection will get better over time.