cryptonerdcn

Last week's (4.16~4.23) AI news overview:

This week was relatively quiet, with more companies entering the LLM race. Google is working hard to catch up with OpenAI on the product side, and Snapchat has unveiled its own AI chatbot.

Now let me review the big AI news from last week.

April 17th
Kunlun Universe launched its hundred-billion-scale large language model "Tiangong" and started internal beta testing.

Developed jointly by Kunlun Universe and the AI team Qidian Zhiyuan, "Tiangong" is a dual hundred-billion-scale large language model that its makers say is comparable to ChatGPT. It is also Kunlun Universe's second generative AI product, following its AI drawing tool "Tiangong Qiaohui". In December 2022, Kunlun Universe released its AIGC series of algorithms and models, covering multi-modal AI content generation in areas such as images, music, text, and programming. According to Kunlun Universe, the current version of "Tiangong" supports text conversations of over 10,000 words and more than 20 rounds of user interaction.


It is reported that the project has already received hundreds of millions of RMB in investment and has built an R&D team of several hundred people, with further investment planned.

Internal testing address:
https://tiangong.kunlun.com/

April 18th
Meta released DINOv2.


DINOv2 is a new high-performance, self-supervised computer vision model (self-supervised means the model learns from unlabeled data, without human annotations). DINOv2 achieves outstanding results on several computer vision benchmarks, such as image classification, object detection, and segmentation, thanks to a self-distillation training method that encourages the model to focus on the salient regions of an image while ignoring the background. It can learn from any collection of images, and its features work across tasks without fine-tuning.

Demo address: https://dinov2.metademolab.com/

Paper: https://arxiv.org/abs/2304.07193

GITHUB: https://github.com/facebookresearch/dinov2
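As a toy illustration of the self-distillation pattern used by DINO-family models — a "student" network is trained by gradient descent while a "teacher" is kept as an exponential moving average (EMA) of the student's weights — here is a minimal sketch. All weights, rates, and the pretend "training" step are made-up values, not DINOv2's actual code:

```python
# Toy sketch of self-distillation with an EMA teacher, the pattern behind
# DINO-family models. This is NOT DINOv2's training code; the numbers and
# the fake "training" update are illustrative assumptions only.

def ema_update(teacher, student, momentum=0.99):
    """Move the teacher weights toward the student with an EMA."""
    return [momentum * t + (1.0 - momentum) * s for t, s in zip(teacher, student)]

# Start with identical toy "weights".
student = [0.0, 0.0]
teacher = [0.0, 0.0]

for step in range(100):
    # Pretend gradient descent pulls the student toward target weights [1.0, 2.0].
    student = [s + 0.1 * (tgt - s) for s, tgt in zip(student, [1.0, 2.0])]
    # The teacher smoothly tracks the student, never updated by gradients.
    teacher = ema_update(teacher, student)

print(student)  # close to [1.0, 2.0]
print(teacher)  # lags behind the student, smoothed by the EMA
```

The smoothed teacher provides stable targets for the student, which is what lets this family of models learn from unlabeled images without collapsing.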

April 19th
Aydar Bulatov and others released a technique that uses the RMT (Recurrent Memory Transformer) to scale Transformers to over 1 million tokens.


This technical report introduces the use of recurrent memory to extend the context length of BERT, one of the most effective Transformer-based models in natural language processing. By leveraging the Recurrent Memory Transformer architecture, they successfully increased the model's effective context length to an unprecedented 2 million tokens while maintaining high memory-retrieval accuracy. The method stores and processes both local and global information, and uses recurrence to pass information between segments of the input sequence.

During inference, the model effectively leveraged memory across 4,096 segments with a total length of 2,048,000 tokens, far exceeding the largest input sizes reported for Transformer models (64K tokens for CoLT5 and 32K tokens for GPT-4). In their experiments, this enhancement kept the base model's memory footprint at 3.6 GB.

Paper address: https://arxiv.org/abs/2304.11062

GITHUB: https://github.com/booydar/t5-experiments/tree/scaling-report
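The segment-recurrent idea can be sketched in a few lines of toy code: the long input is chopped into fixed-size segments, and a small memory state is carried from one segment to the next. The segment size of 500 tokens below matches 2,048,000 / 4,096 from the numbers above; the averaging "model" and the EMA-style memory update are stand-ins, not the paper's actual architecture:

```python
# Toy sketch of the Recurrent Memory Transformer (RMT) idea: process a long
# input segment by segment, passing a recurrent memory between segments.
# The summary function and memory update rule are illustrative assumptions.

def process_segment(segment, memory):
    """Stand-in for one Transformer forward pass over [memory; segment].
    Here we just fold the segment's average into the running memory."""
    summary = sum(segment) / len(segment)
    return 0.9 * memory + 0.1 * summary  # blend old memory with new summary

def rmt_process(tokens, segment_size=500):
    """Walk over `tokens` in segments, carrying memory across segments."""
    memory = 0.0
    n_segments = 0
    for start in range(0, len(tokens), segment_size):
        memory = process_segment(tokens[start:start + segment_size], memory)
        n_segments += 1
    return memory, n_segments

# 4,096 segments of 500 tokens each gives the 2,048,000-token total
# reported in the paper.
tokens = [1.0] * (4096 * 500)
memory, n_segments = rmt_process(tokens)
print(n_segments)   # 4096
print(len(tokens))  # 2048000
```

Because each step only sees one segment plus the fixed-size memory, compute and memory per step stay constant no matter how long the full input is — which is how the 3.6 GB figure stays flat at 2 million tokens.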

April 20th
Stability AI released its own open-source language model, StableLM.

Stability AI, the maker of the famous image-generation tool Stable Diffusion, announced the release of its own LLM: StableLM. The alpha versions have 3 billion and 7 billion parameters and perform well for their size (for comparison, GPT-3 has 175 billion parameters); models with 15 billion to 65 billion parameters will follow. "Developers are free to inspect, use, and adapt our StableLM base models for commercial or research purposes, subject to the terms of the CC BY-SA-4.0 license." (One thing to note: although the base models are under a Creative Commons license, the fine-tuned models are under a Non-Commercial Creative Commons license, which means they cannot be used commercially.)

GITHUB: https://github.com/stability-AI/stableLM/

On the same day, Snapchat launched AI chatbot functionality for all users worldwide.


The chatbot, called My AI, lets users hold conversations with an AI agent. It can answer questions, tell jokes, play games, and send Snaps; it also learns from users' preferences and behaviors and occasionally sends them Snaps based on their interests. Snapchat says My AI is not meant to replace human interaction but to enhance it and make it more fun and engaging. My AI is powered by a large language model and can generate natural-language responses and images. Snapchat states that My AI complies with privacy and data-protection laws, and users can opt out of the feature at any time.

April 21st
Google's AI Bard can now write code, supporting more than 20 programming languages, and can also help debug and explain code.


On the same day, Fudan University's Natural Language Processing Laboratory released its new MOSS model, China's first open-source large language model with ChatGPT-style plugin-augmented capabilities.


MOSS is an open-source dialogue language model that supports both Chinese and English as well as a range of plugins. The moss-moon series models have 16 billion parameters; at FP16 precision they can run on a single A100/A800 or on two 3090 graphics cards, and at INT4/8 precision on a single 3090. The MOSS base model was pre-trained on roughly 700 billion Chinese, English, and code tokens, and gained its multi-turn dialogue ability and multi-plugin support through dialogue instruction fine-tuning, plugin-augmented learning, and human preference training.

The MOSS model comes from the team led by Professor Qiu Xipeng at Fudan University's Natural Language Processing Laboratory; its name comes from the AI in the movie "The Wandering Earth".

Apply for trial: https://moss.fastnlp.top

GITHUB: https://github.com/OpenLMLab/MOSS
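As a rough sanity check on the hardware claims above, here is a back-of-the-envelope estimate of the weight memory for a 16-billion-parameter model at different precisions. This is a sketch only: real memory use also includes activations, the KV cache, and framework overhead, so these numbers are lower bounds.

```python
# Back-of-the-envelope weight-memory estimate for a 16B-parameter model.
# Weights only: activations, KV cache, and framework overhead come on top.

def weight_memory_gb(n_params, bits_per_param):
    """Memory needed just for the weights, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9

n_params = 16e9  # moss-moon parameter count

print(weight_memory_gb(n_params, 16))  # 32.0 GB -> one A100/A800 or two 24 GB 3090s
print(weight_memory_gb(n_params, 8))   # 16.0 GB -> fits a single 24 GB 3090
print(weight_memory_gb(n_params, 4))   # 8.0 GB  -> fits comfortably on a 3090
```

These figures line up with the claim that FP16 needs an A100/A800 or a pair of 3090s, while INT4/8 quantization brings the model within reach of a single consumer GPU.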

If you found this article helpful, please subscribe and share. You can also follow my Twitter, where I bring you more news about Web3, Layer 2, AI, and Japan:


https://twitter.com/cryptonerdcn
