How to run the OpenAI GPT-2 text generator on macOS

Earlier this month, OpenAI released the final model in GPT-2’s staged release. If you don’t know what GPT-2 is, you should check out OpenAI’s blog post about it. To summarize the post, GPT-2 (which stands for Generative Pretrained Transformer 2, the successor to OpenAI’s original GPT model) is a language model trained to predict the next word given the previous words in a passage of text. Given a prompt of one or two sentences, the model can generate a lengthy continuation of synthetic text. In their original blog post, OpenAI cited concerns about malicious applications of the model as the reason for not releasing the fully trained version. OpenAI has since published a follow-up post announcing the release of the largest version of GPT-2, after finding no strong evidence that the model was being misused.

The largest and final release of the GPT-2 model has 1.5B parameters, but for this tutorial, I’m going to be using the model with 345M parameters. Let’s get started!

Getting Started

The first thing you’re going to need is Python 3. If you don’t already have it installed, you can install it using Homebrew.
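
Here’s a minimal sketch of that step. It assumes Homebrew itself is already installed and that the Homebrew package is available under the name python3. It checks for an existing Python 3 first, so nothing gets reinstalled if you already have one.

```shell
# Install Python 3 with Homebrew only if it is not already on the PATH.
# Assumes Homebrew itself is already installed.
if command -v python3 >/dev/null 2>&1; then
  echo "Found $(python3 --version)"
else
  brew install python3
fi
```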

Downloading GPT-2

Next, download GPT-2 from OpenAI’s GitHub repository and jump into the newly created directory.

Creating a Virtual Environment

After downloading GPT-2, we’re going to create a virtual environment inside the GPT-2 directory. Then we’re going to install TensorFlow and GPT-2’s other dependencies inside that virtual environment.
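
A rough sketch of both steps is below, with a few assumptions worth flagging. The directory name venv is arbitrary. This tutorial doesn’t state a TensorFlow version, so the `tensorflow<2` constraint is my assumption, based on the GPT-2 code targeting the TensorFlow 1.x API. I’m also assuming the remaining dependencies are listed in a requirements.txt file at the root of the repository. TensorFlow 1.x wheels are only published for older Python 3 releases, so pip may not find a match on a newer interpreter.

```shell
# Run from inside the gpt-2 directory.
python3 -m venv venv        # "venv" is just a conventional directory name
. venv/bin/activate         # same as "source venv/bin/activate"

# Assumption: no TensorFlow version is given in the tutorial. The GPT-2
# code targets the TensorFlow 1.x API, so stay below 2.0.
pip install 'tensorflow<2'

# Assumption: the other dependencies are listed in requirements.txt
# at the root of the repository.
pip install -r requirements.txt
```

When you’re done experimenting, running `deactivate` leaves the virtual environment.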

Downloading the Model

Next, we’re going to download the language model with 345M parameters.

Running the Model

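After the model has finished downloading, we need to open the sampling script in src/ and set model_name to 345M and top_k to 40. The model_name setting selects which downloaded model to load, and a top_k of 40 restricts each sampling step to the 40 most likely next tokens.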

Then we can run the model.

It might take a few seconds for the model to set up, depending on the speed of your machine. Once the model is ready, you should see something that looks like this.

Now the model is ready to test! Let’s see what it comes up with when I input the first two and a half sentences of this tutorial.

Here’s what the model produced from that prompt.

It’s not the worst writing I’ve seen! However, it’s definitely not the most coherent passage of text, and it’s not the direction I was planning to take this article. I’m sure that the model with 1.5B parameters would produce a more realistic synthetic passage, but I don’t think models like this are ready to put writers out of work just yet.
