Gear Up Your AI: Fine-Tuning LLMs
February 7, 2022

Clarifai Release 8.1

The Clarifai platform has now been updated to Release 8.1! We've added a number of incredible NLP models for text processing, generation, detection, and inference. 

Model: T5-base fine-tuned on SQuAD for Question Generation

Details of T5

Many transfer learning techniques for NLP have appeared recently. Some work almost identically, differing only in their datasets or optimizers, yet they achieve different results. Can we then say that the technique with better results is actually the better one? Given the current landscape of transfer learning for NLP, the Text-to-Text Transfer Transformer (T5) aims to explore what works best and how far we can push the tools we already have.

The T5 model was presented in "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer" by Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. Here is the abstract:


Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.
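To make the "text-to-text format" in the abstract concrete, here is a minimal sketch of how different tasks can be cast as plain input strings. The translation and summarization prefixes follow the examples given in the T5 paper. The helper names, and the answer/context layout for question generation, are assumptions for illustration; check the model card of the specific checkpoint for the exact format it expects.

```python
# Sketch: in a text-to-text setup, every task is "string in, string out".
# The task is selected by a short prefix at the start of the input string.

def translation_input(text, source="English", target="German"):
    # Prefix style used as an example in the T5 paper.
    return f"translate {source} to {target}: {text}"


def summarization_input(text):
    # Prefix style used as an example in the T5 paper.
    return f"summarize: {text}"


def question_generation_input(answer, context):
    # Hypothetical layout for an answer-aware question generation model;
    # the real checkpoint may expect a different template.
    return f"answer: {answer} context: {context}"


examples = [
    translation_input("That is good."),
    summarization_input("State authorities dispatched emergency crews on Tuesday."),
    question_generation_input("Paris", "Paris is the capital of France."),
]
for line in examples:
    print(line)
```

Because every task shares the same string-to-string interface, the same model, loss, and decoding procedure can be reused across all of them.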


Model: t5-base-en-generate-headline

Generate your own headlines for text content. The model has been trained on a collection of 500k articles with headings. Its purpose is to create a one-line heading suitable for the given article.


Model: Text Summarization distilbart-cnn-12-6

Summarization is the task of condensing a piece of text into a shorter version, reducing its size while preserving the key information and the meaning of the content. Because manual summarization is time-consuming and laborious, automating the task is becoming increasingly popular and is a strong motivation for academic research.


Model: Chinese Poem GPT2 Model

The model is used to generate classical Chinese poems. You can download the model either from the GPT2-Chinese GitHub page, or via HuggingFace under the name gpt2-chinese-poem.


Because the parameter skip_special_tokens is used during generation, special tokens such as [SEP] and [UNK] are deleted, so the output of the Hosted inference API may not be displayed properly.

You can use the model directly with a pipeline for text generation.
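The pipeline usage mentioned above might look like the following sketch. It assumes the Hugging Face transformers library is installed and that the checkpoint is published under the id "uer/gpt2-chinese-poem" (the text here only gives the short name, so treat the full id as an assumption). The helper names are hypothetical, and nothing is downloaded until generate_poem is actually called.

```python
MODEL_ID = "uer/gpt2-chinese-poem"  # assumed Hugging Face model id


def join_tokens(text):
    # The tokenizer emits characters separated by spaces; remove them
    # so the poem reads as continuous Chinese text.
    return text.replace(" ", "")


def generate_poem(prompt, max_length=50):
    # Imported lazily so this module loads even without transformers installed.
    from transformers import BertTokenizer, GPT2LMHeadModel, TextGenerationPipeline

    tokenizer = BertTokenizer.from_pretrained(MODEL_ID)
    model = GPT2LMHeadModel.from_pretrained(MODEL_ID)
    generator = TextGenerationPipeline(model=model, tokenizer=tokenizer)
    # The pipeline returns a list of dicts with a "generated_text" key.
    outputs = generator(prompt, max_length=max_length, do_sample=True)
    return join_tokens(outputs[0]["generated_text"])
```

A call such as generate_poem("[CLS]梅 山 如 积 翠 ，") would download the weights on first use and return one sampled continuation; because sampling is enabled, the output differs from run to run.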

Model: CodeBERT fine-tuned for Insecure Code Detection 

Details of CodeBERT

We present CodeBERT, a bimodal pre-trained model for programming language (PL) and natural language (NL). CodeBERT learns general-purpose representations that support downstream NL-PL applications such as natural language code search, code documentation generation, etc. We develop CodeBERT with Transformer-based neural architecture, and train it with a hybrid objective function that incorporates the pre-training task of replaced token detection, which is to detect plausible alternatives sampled from generators. This enables us to utilize both bimodal data of NL-PL pairs and unimodal data, where the former provides input tokens for model training while the latter helps to learn better generators. We evaluate CodeBERT on two NL-PL applications by fine-tuning model parameters. Results show that CodeBERT achieves state-of-the-art performance on both natural language code search and code documentation generation tasks. Furthermore, to investigate what type of knowledge is learned in CodeBERT, we construct a dataset for NL-PL probing, and evaluate in a zero-shot setting where parameters of pre-trained models are fixed. Results show that CodeBERT performs better than previous pre-trained models on NL-PL probing.

Details of the downstream task (code classification) - Dataset

Given a piece of source code, the task is to identify whether it is insecure code that may expose software systems to attack, for example through resource leaks, use-after-free vulnerabilities, or DoS attacks. We treat the task as binary classification (0/1), where 1 stands for insecure code and 0 for secure code.
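As a rough illustration of the 0/1 labeling described above, the sketch below turns a pair of raw classifier scores (logits) into a label and a probability. The function names and the example logits are hypothetical; in practice the logits would come from the fine-tuned CodeBERT classifier, which is not loaded here.

```python
import math

# Label convention from the task description: 1 = insecure, 0 = secure.
LABELS = {0: "secure", 1: "insecure"}


def softmax(logits):
    # Subtract the maximum for numerical stability before exponentiating.
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def classify(logits):
    # Return the predicted label and its probability for one code snippet.
    probs = softmax(logits)
    index = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[index], probs[index]


label, confidence = classify([0.2, 2.3])  # made-up logits for illustration
print(label, round(confidence, 3))
```

Reporting the probability alongside the label makes it possible to apply a stricter threshold than 0.5 when false alarms are costly.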


Model: Natural Language Inference nli-distilroberta-base

The model was trained using the SentenceTransformers CrossEncoder class on the SNLI and MultiNLI datasets. For a given sentence pair, it outputs three scores corresponding to the labels contradiction, entailment, and neutral. The pre-trained model can be used through SentenceTransformers, or directly with the Transformers library (without SentenceTransformers) via AutoModelForSequenceClassification.
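The three-score output can be mapped to labels as in the sketch below. It assumes the scores arrive in the order contradiction, entailment, neutral, as listed above. The scoring helper assumes the sentence-transformers package and the model id "cross-encoder/nli-distilroberta-base"; both the id and the helper names are assumptions for illustration, and the model is only loaded if that helper is called.

```python
# Assumed output order of the three scores for each sentence pair.
LABEL_ORDER = ["contradiction", "entailment", "neutral"]


def labels_from_scores(score_rows):
    # Pick the highest-scoring label for each sentence pair.
    labels = []
    for row in score_rows:
        if len(row) != len(LABEL_ORDER):
            raise ValueError("expected one score per label")
        best = max(range(len(row)), key=row.__getitem__)
        labels.append(LABEL_ORDER[best])
    return labels


def score_pairs(pairs, model_id="cross-encoder/nli-distilroberta-base"):
    # Imported lazily so the label mapping works without the library installed.
    from sentence_transformers import CrossEncoder

    model = CrossEncoder(model_id)
    # predict() returns one row of three scores per (premise, hypothesis) pair.
    return model.predict(pairs).tolist()


# Made-up scores for two sentence pairs, for illustration only.
print(labels_from_scores([[0.1, 3.2, -1.0], [4.0, -2.0, 0.5]]))
```

With the library installed, labels_from_scores(score_pairs([("A man is eating pizza", "A man eats something")])) would run the full path from sentence pair to label.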


Model: disease-ner-ncbi NER model

A BioBERT model fine-tuned on the NER task with the BC5CDR-diseases and NCBI-diseases corpora, along with selected PubTator annotations from the LitCovid dataset. It was fine-tuned for use in the datummd/bionlp system.

The Clarifai platform 8.1 includes a number of incredible NLP models for text processing, generation, detection, and inference.