Roundup Let’s catch up on recent bits and bytes from the world of machine-learning.
There’s a GPT-3! OpenAI teased its latest text-generating AI engine GPT-3, a bigger and better version of its predecessor, in a paper on arXiv.
The giant autoregressive language model has a whopping 175 billion parameters, making it more than a hundred times larger than GPT-2. It was trained on four datasets scraped from the internet and from book archives, some measuring more than 500GB of text.
OpenAI didn’t reveal too much about the hardware it used to process all that data to train its software, but did mention that it was Nvidia “V100 GPUs on part of a high-bandwidth cluster provided by Microsoft,” and that it “consumed several thousand petaflop/s-days of compute during pre-training”. That sounds like the super-cluster Redmond said it spun up for OpenAI, announced at this year’s Microsoft Build conference.
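For a sense of scale, a petaflop/s-day is the work done by a machine sustaining 10^15 floating-point operations per second for a full day. The conversion can be sketched roughly as follows. The function name is made up for this example, and the 3,000 figure is only an illustrative stand-in for "several thousand", not a number taken from the quote above.

```python
# One petaflop/s-day: 10**15 floating-point operations per second,
# sustained for 24 hours.
PETAFLOP_PER_SECOND = 10**15
SECONDS_PER_DAY = 24 * 60 * 60


def petaflop_s_days_to_flops(pf_days):
    """Convert petaflop/s-days into a total count of floating-point operations."""
    return pf_days * PETAFLOP_PER_SECOND * SECONDS_PER_DAY


# A single petaflop/s-day.
one_unit = petaflop_s_days_to_flops(1)

# Assumed, illustrative value standing in for "several thousand".
illustrative = petaflop_s_days_to_flops(3000)

print(f"1 petaflop/s-day     = {one_unit:.3e} operations")
print(f"3000 petaflop/s-days = {illustrative:.3e} operations")
```

In other words, a single petaflop/s-day is already about 8.64 x 10^19 operations, so a training run measured in thousands of them lands in the region of 10^23.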
Since GPT-3 is trained on such a large corpus of text that encompasses all sorts of information, it’s able to do things including machine translation, question answering, and filling in blank words to complete sentences. Interestingly, it’s also able to do simple calculations, like basic addition and subtraction.
The performance for different tasks varies, however, and it’s not as good as models that have been specifically designed for narrow applications, such as translation or math problem solving. GPT-3 is also not always good at answering questions. Although it’s decent at trivia, it’s flummoxed by more complicated questions, such as: “Why is the sky blue?”
The teasing of GPT-3 was a much more muted affair compared to the fanfare of GPT-2, which was accompanied by an official blog post and multiple press articles that described it as something that was “too dangerous to release.”
“Since the release of GPT-2 there has been no discernible difference in operations that may see potential gains by using language models,” according to the paper.
“The assessment was that language models may not be worth investing significant resources in because there has been no convincing demonstration that current language models are significantly better than current methods for generating text, and because methods for ‘targeting’ or ‘controlling’ the content of language models are still at a very early stage.”
OpenAI hasn’t published the code for its model, though it did release a few samples generated from a smaller version of GPT-3 and shared a dataset to teach machines how to unscramble words and how to perform simple arithmetic. That’s available on GitHub here.
Microsoft has laid off about 50 journalists in America working on its Microsoft News and MSN teams, replacing them with AI software to automatically pick and choose articles and headlines to push to netizens, according to Business Insider and The Verge. On top of this, 27 journos in the UK – employed at PA Media, previously known as the Press Association, for Microsoft’s news service – were also axed, the Guardian reports, in favor of automated editors.
Amazon is about to snap up Zoox: Amazon is reportedly in “advanced talks” to buy self-driving car startup Zoox for less than its valuation of $3.2bn.
Zoox, founded in 2014 and based in Foster City, California, laid off its contractors tasked with testing its autonomous vehicles in April amid the coronavirus pandemic. The shelter-in-place orders in the US state, to curb the spread of the bio-nasty, made testing difficult: sitting close to co-workers in the confined space of a car is not recommended.
With Zoox facing an uncertain future, Amazon saw an opportunity to gobble up the smaller startup at a lower price, the Wall Street Journal first reported.
Delivery is a big part of Amazon’s business, and it’s not surprising that it, too, wants to invest in autonomous vehicles to ferry people’s packages and food around. By getting rid of human drivers, Amazon can cut operating costs.
It may not be such a bad deal for Zoox, either, even if it’s probably being acquired for less than it wanted; Amazon has the money and resources to keep its efforts at building a self-driving car alive. Other startups, such as Starsky Robotics, which ran out of cash, have had to drop out of the race completely.
A Lite BERT language model: Amazon has also released the training code for ALBERT, which stands for A Lite BERT, a language model based on Google’s BERT, for developers on Amazon SageMaker, its machine-learning platform in AWS.
By releasing ALBERT, Amazon is giving developers a chance to use a language model that can, among other things, improve search results for recommendation systems, provide better machine translation, or generate and summarize text at a fraction of the cost. ALBERT has fewer parameters than BERT, making it a smaller model that’s cheaper and easier to train.
“The scripts use mixed-precision training and accelerated linear algebra to complete training in under 24 hours (five times faster than without these optimizations), which allows data scientists to iterate faster and bring their models to production sooner,” members of Amazon’s AWS Deep Learning team said.
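Mixed-precision training, mentioned in the quote above, runs much of the arithmetic in 16-bit floats for speed, which means working around their limited range. The sketch below is not Amazon's code. It only illustrates, using Python's standard struct module, why a very small gradient vanishes in half precision and how scaling it up first (loss scaling, a common companion technique) preserves it. The gradient value and scale factor are assumptions chosen for illustration.

```python
import struct


def to_float16(x):
    """Round a Python float to IEEE 754 half precision and back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


tiny_gradient = 1e-8   # hypothetical gradient value
loss_scale = 1024.0    # hypothetical scale factor (a power of two)

# Stored directly in half precision, the value is too small to
# represent and underflows to zero.
direct = to_float16(tiny_gradient)

# Scaled up before the conversion, the value survives; dividing by the
# same factor afterwards (in higher precision) recovers the gradient.
scaled = to_float16(tiny_gradient * loss_scale)
recovered = scaled / loss_scale

print("direct float16:   ", direct)
print("scaled, unscaled: ", recovered)
```

The recovered value is only approximately equal to the original, which is the trade-off: a small rounding error in exchange for cheaper arithmetic.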
ALBERT comes pretrained on 16GB of text scraped from all English-language Wikipedia articles, plus 11,000 books. The code is available on GitHub here, and here are more instructions on how to use ALBERT in SageMaker. ®