Facebook has quietly acquired AI.Reverie, a New York-based startup that creates synthetic data for training machine learning models, VentureBeat has learned. In an apparent nod to the HBO show Westworld, where visitors to an amusement park encounter hordes of artificially intelligent robots, the purchase was made through a holding company called Dolores Acquisition Sub, Inc., named after a character in the show.
A Facebook spokesman confirmed the acquisition when contacted for comment.
AI.Reverie was launched in 2017 by a founding team that included Daeil Kim, Joey Tran and Paul Walborsky. Kim was formerly a computer scientist at The New York Times, where he spearheaded NYT Español’s audience acquisition strategy by developing AI solutions to optimize the brand’s purchase funnel. Walborsky, former president and CEO of tech media brand Gigaom, had been an SVP at The Times, where he was responsible for a team that led the publisher’s international expansion.
AI.Reverie offered APIs and a platform that procedurally generated fully annotated synthetic videos and images for AI systems. Synthetic data, often used in conjunction with real-world data to develop and test AI algorithms, has come into vogue as companies embrace digital transformation during the pandemic. In a recent survey of managers, 89% of respondents said synthetic data will be crucial to staying competitive. And according to Gartner, by 2030, synthetic data will overshadow real data in AI models.
While synthetic data closely mirrors real data, mathematically or statistically, the jury is still out on its effectiveness. A paper published by researchers at Carnegie Mellon outlines the challenges of simulation that hamper real-world development, including problems with reproducibility and the so-called “reality gap,” where simulated environments do not adequately represent reality.
Other research suggests that synthetic data may be just as good for training a model as data based on actual events or people. For example, Nvidia researchers have demonstrated a way to use data created in a virtual environment to train robots to pick up objects such as cans of soup, a mustard bottle, and a box of Cheez-Its in the real world.
In a study published by AI.Reverie in 2019, the company claimed that fine-tuning a model trained on synthetic data with only 10% of the real data achieved performance on par with a model trained entirely on real data. “We … allow big minds everywhere to test the value of synthetic data for themselves,” Kim said in an earlier statement.
AI.Reverie – which competed with startups like Tonic, Delphix, Mostly AI, Hazy, Gretel.ai and Cvedia, among others – had a long history of military and defense contracts.
In 2019, the company announced a strategic alliance with Booz Allen Hamilton with the introduction of Modzy at Nvidia’s GTC DC conference. Through Modzy – a platform for managing and implementing AI models – AI.Reverie launched a weapons detection model that could apparently spot ammunition, explosives, artillery, firearms, missiles and magazines from “multiple perspectives.”
In 2020, AI.Reverie was awarded a $1.5 million research grant by AFWERX, a technical incubator arm of the U.S. Air Force, to build AI algorithms for the 7th Bomb Wing at Dyess Air Force Base. In a statement, Kim said AI.Reverie would create synthetic images to train computer vision algorithms for navigation, which would normally require hand-tagged images.
The company further described the first phase of its work in a press release: “The Department of Defense is looking to AI.Reverie to accelerate reconnaissance to the speed required in a contingency environment … hard-to-reach places … AI.Reverie’s synthetic data platform … [generates] millions of fully annotated, richly varied images – fast and at a low price. AI.Reverie aims to generate images across the electromagnetic spectrum that will allow soldiers to more accurately identify objects and make life-saving decisions.”
The contract closely followed AI.Reverie’s work with CosmiQ Works to release RarePlanes, a dataset containing tens of thousands of real and synthetic satellite scenes and annotations of various aircraft types. CosmiQ Works, which focuses on creating AI technologies for geospatial applications, was founded in 2015 by In-Q-Tel, an investment company that connects tech companies with the American intelligence community.
In 2021, AI.Reverie received a contract for the U.S. Air Force’s Advanced Battle Management System (ABMS), which aims to create a military network providing the technical infrastructure to connect various platforms and sensors. ABMS also aims to apply AI to data from the network to help analyze information and support decision making.
“We are honored that the Air Force chose AI.Reverie to support its Advanced Battle Management System,” Kim said at the time. “We believe that in partnership with AI.Reverie, the Air Force will have a significant opportunity to improve mission-critical vision algorithms that secure military benefits and keep our troops safe.”
Investment in synthetic data
Prior to the acquisition, AI.Reverie had attracted $10 million in funding from Compound, In-Q-Tel, Resolute Ventures, SGInnovate, TechNexus and Triphammer Ventures. It claimed to have government agencies and Fortune 500 companies as customers in retail, smart cities, industry and agriculture, with use cases including airport simulation, weapons detection, non-cash purchases and delivery bots. But Facebook’s play appears to be for the company’s synthetic data generation technology rather than its customer base.
Although Facebook has not revealed in detail how, or if, it uses synthetic data for computer vision, researchers at the company have used synthetic data to train models like M2M-100, which can translate between 100 languages without relying on English data. Synthetic data could be used to improve the performance of computer vision algorithms on the Facebook platform that detect hate speech, or to develop intelligent assistants in virtual reality (VR) and augmented reality (AR) environments such as the social network Horizon Worlds.
As the pandemic accelerates the trend toward stricter regulation and management of data protection, synthetic data gives Facebook another advantage: compliance. The company has historically trained computer vision algorithms on videos and images from its products (e.g. Instagram) and other sources, but synthetic data technologies like AI.Reverie’s could reduce Facebook’s dependence on actual user and third-party data.
In 2020, a Lithuanian company called Planner 5D sued Facebook for allegedly stealing thousands of files from Planner 5D’s software, which was made available through a partnership with Princeton for participants in Facebook’s 2019 Scene Understanding and Modeling Challenge for computer vision researchers. Planner 5D claimed that Princeton, Facebook and Oculus, Facebook’s VR-focused hardware and software division, could have benefited from the training data taken from it.
Recently, a federal judge approved a $650 million privacy settlement over Facebook’s use of facial recognition. The lawsuit alleged that the company’s Tag Suggestions tool, which scanned faces in photos and offered suggestions about who the people might be, stored biometric data without users’ consent in violation of Illinois law.
According to a Privitar survey, 51% of consumers are not comfortable sharing their personal information. And in a Veritas report, 53% of respondents said they would spend more money with trusted organizations, and 22% said they would spend up to 25% more with a company that takes data protection seriously.