
Can we write code automatically with GPT-3?

 Introduction:

OpenAI released the first versions of GPT-3 in 2020. We wrote a few text generation articles at that time and tested how to create a Tinder bio using GPT-3. If you want to know more about what GPT-3 is, what OpenAI is, or how the server looks, read the Tinder bio article.

In this article, we will explore code generation with OpenAI models.

Several blogs and exploratory write-ups have already noted that GPT-3 can even solve LeetCode problems. We will explore how well the OpenAI model can "code" and whether prompt tuning improves or changes its performance.

Basic coding:

We will look at how GPT-3 performs on a few data structure and algorithm coding tasks.

(a) Merge sort with python:

First, with a 200-word limit, it couldn't complete the code.

Prompt:

Write sample code for merge sort in python.

 

def merge(arr, l, m, r):
    n1 = m - l + 1
    n2 = r- m

    # create temp arrays
    L = [0] * (n1)
    R = [0] * (n2)

    # Copy data to temp arrays L[] and R[]
    for i in range(0 , n1):
        L[i] = arr[l + i]

    for j in range(0 , n2):
        R[j] = arr[m + 1 + j]

    # Merge the temp arrays back into arr[l..r]
    i = 0     # Initial index of first subarray
    j = 0     # Initial index of second subarray
    k = l     # Initial index of merged subarray

    while i < n1 and j < n2 :
        if L[i] <= R[j]:
            arr[k] = L[i]
            i += 1
        else:
            arr[k] = R[j]
            j += 1
        k += 1

    # Copy the remaining elements of L[], if there
    # are any
    while i < n1:
        arr[k] = L[i]
        i += 1
        k += 1

    # Copy the remaining elements of R[], if there
    # are any
    while j < n2:
        arr[k] = R[j]
        j += 1
        k += 1

# l is for left index and r is right index of the
# sub-array of arr to be sorted
def mergeSort(arr,l,r):
    if l < r:

        # Same as (l+r)/2, but avoids overflow for
        # large l and h
        m = (l+(r-1))/2

        # Sort first and second halves
        mergeSort(arr, l, m)
        mergeSort(arr, m+1, r)
        merge(arr, l, m, r)
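One detail in the output above is worth checking before running it. In Python 3, `/` is true division and returns a float, so `m = (l+(r-1))/2` cannot be used as a list index. Here is a minimal sketch of the midpoint with floor division; the helper name `midpoint` is ours and is not part of the generated output.

```python
def midpoint(l, r):
    # Floor division keeps the result an int, so it can be used as an index.
    return (l + (r - 1)) // 2


print(midpoint(0, 6))        # 2
print((0 + (6 - 1)) / 2)     # 2.5, a float: arr[2.5] would raise TypeError
```

Replacing `/` with `//` in `mergeSort` is the only change this suggests; the rest of the generated logic is left as it is.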

Next, we try a higher temperature with a similar length limit. This time it comes up with only a compact, pseudocode-like version, and its merge step is not correct. So the model does not reliably produce concrete, working code.

def merge_sort(ls):

    if len(ls) <= 1:
        return ls

    left, right = ls[:len(ls)//2], ls[len(ls)//2:]

    return merge(merge_sort(left), merge_sort(right))

def merge(left, right):

    result = []

    while left and right:
        result.append(left.pop(0) if left[0] <= right[0] else right.pop(0))

    return left or right or result
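The compact version above is close to working. As written, `merge` returns only the leftover elements, because `left or right or result` evaluates to the first non-empty list. A minimal corrected sketch, keeping the same structure, appends the leftovers to `result` instead:

```python
def merge(left, right):
    result = []
    # Repeatedly move the smaller head element into result.
    while left and right:
        result.append(left.pop(0) if left[0] <= right[0] else right.pop(0))
    # At most one of the two lists still holds elements, already sorted.
    return result + left + right


def merge_sort(ls):
    if len(ls) <= 1:
        return ls
    mid = len(ls) // 2
    # Slices are copies, so popping inside merge does not touch the input.
    return merge(merge_sort(ls[:mid]), merge_sort(ls[mid:]))


print(merge_sort([38, 27, 43, 3, 9, 82, 10]))  # [3, 9, 10, 27, 38, 43, 82]
```

Note that `pop(0)` on a Python list is linear in the list length, so this form is fine for illustration but not the fastest way to merge.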

 

So a higher temperature can also produce bad results like this. Next, we try a lower temperature on another problem and run it several times to see whether it produces real code every time or whether something else comes up again.

(b) Bubble sort with python:

Prompt:

Write sample code for bubble sort with python.

Temperature is set to 0.25. 

Try 1:

 

def bubbleSort(arr):
    n = len(arr)

    # Traverse through all array elements
    for i in range(n):

        # Last i elements are already in place
        for j in range(0, n-i-1):

            # traverse the array from 0 to n-i-1
            # Swap if the element found is greater
            # than the next element
            if arr[j] > arr[j+1] :
                arr[j], arr[j+1] = arr[j+1], arr[j]

# Driver code to test above
arr = [64, 34, 25, 12, 22, 11, 90]

bubbleSort(arr)

print ("Sorted array is:")
for i in range(len(arr)):
    print ("%d" %arr[i]),

Try 2:

Generates the same.

Try 3:

def bubble_sort(lst):
last    for i in range(len(lst)):
        for j in range(len(lst)-1-i):
            if lst[j] > lst[j+1]:
                lst[j], lst[j+1] = lst[j+1], lst[j]
    return lst

So, even with the same parameter settings, some generations may come out malformed.
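The third attempt has the right logic; only the stray `last` token before the first loop makes it invalid Python. Below is a sketch with that token removed. The early exit when a pass makes no swaps is our addition and was not in the generated output.

```python
def bubble_sort(lst):
    n = len(lst)
    for i in range(n):
        swapped = False
        # After i passes, the last i elements are already in place.
        for j in range(n - 1 - i):
            if lst[j] > lst[j + 1]:
                lst[j], lst[j + 1] = lst[j + 1], lst[j]
                swapped = True
        # No swaps in a full pass means the list is sorted.
        if not swapped:
            break
    return lst


print(bubble_sort([64, 34, 25, 12, 22, 11, 90]))  # [11, 12, 22, 25, 34, 64, 90]
```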

Let's try another problem, with an even lower temperature.

(c) Linked list with python:

 

class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

class LinkedList:
    def __init__(self):
        self.head = None

    def insert_at_beginning(self, data):
        node = Node(data)
        if self.head is None:
            self.head = node
        else:
            node.next = self.head
            self.head = node

    def insert_at_end(self, data):
        node = Node(data)
        if self.head is None:
            self.head = node
        else:
            current = self.head
            while current.next:
                current = current.next
            current.next = node

    def delete_at_beginning(self):
        if self.head is None:
            return
        self.head = self.head.next

    def delete_at_end(self):
        if self.head is None:
            return
        current = self.head
        while current.next.next:
            current = current.next
        current.next = None

    def traverse(self):
        if self.head is None:
            return
        current = self.head
        while current:
            print(current.data)
            current = current.next
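One edge case in the output above: on a list with a single node, `delete_at_end` evaluates `current.next.next` while `current.next` is `None`, which raises `AttributeError`. A minimal sketch of the guard follows. The classes are trimmed down for illustration, and the `to_list` helper is ours, not part of the generated output.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None


class LinkedList:
    def __init__(self):
        self.head = None

    def insert_at_end(self, data):
        node = Node(data)
        if self.head is None:
            self.head = node
            return
        current = self.head
        while current.next:
            current = current.next
        current.next = node

    def delete_at_end(self):
        if self.head is None:
            return
        # Guard added here: a single-node list simply becomes empty.
        if self.head.next is None:
            self.head = None
            return
        current = self.head
        while current.next.next:
            current = current.next
        current.next = None

    def to_list(self):
        # Hypothetical helper for inspection only.
        items = []
        current = self.head
        while current:
            items.append(current.data)
            current = current.next
        return items


ll = LinkedList()
ll.insert_at_end(1)
ll.delete_at_end()       # no AttributeError on a single-node list
print(ll.to_list())      # []
```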

Keeping the same parameters, we regenerate the output. With this high temperature, we definitely expect a different output. Let's see what happens.

 

class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

class LinkedList:
    def __init__(self):
        self.head = None

    def insert_at_head(self, data):
        new_node = Node(data)
        new_node.next = self.head
        self.head = new_node

    def insert_at_tail(self, data):
        new_node = Node(data)
        if self.head == None:
            self.head = new_node
            return
        current_node = self.head
        while current_node.next != None:
            current_node = current_node.next
        current_node.next = new_node

    def insert_at_position(self, data, position):
        if position == 0:
            self.insert_at_head(data)
            return
        current_node = self.head
        current_position = 0
        while current_node != None and current_position < position:
            current_position += 1
            current_node = current_node.next
        if current_node == None:
            print("Invalid position!")
            return
        new_node = Node(data)
        new_node.next = current_node.next
        current_node.next = new_node

    def delete_at_position(self, position):
        if self.head == None:
            print("List is empty!")
            return
        if position == 0:
            self.head = self.head.next
            return
        current_node = self.head
        current_position = 0
        while current_node != None and current_position < position:
            current_position += 1
            current_node = current_node.next
        if current_node == None:
            print("Invalid position!")
            return
        current_node.next = current_node.next.next

    def print_list(self):
        if self.head == None:
            print("List is empty!")
            return
        current_node = self.head
        while current_node != None:
            print(current_node.data, end=" ")
            current_node = current_node.next
        print()
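The position-based methods above walk to the node at index `position` and then modify the link after it, so they act one slot later than the index suggests. A minimal sketch that stops at the node before the target instead is shown below. The `_node_before` and `to_list` helpers and the boolean return values are our assumptions for illustration, not part of the generated output.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None


class LinkedList:
    def __init__(self):
        self.head = None

    def _node_before(self, position):
        """Return the node at index position - 1, or None if out of range."""
        current = self.head
        for _ in range(position - 1):
            if current is None:
                return None
            current = current.next
        return current

    def insert_at_position(self, data, position):
        if position < 0:
            return False
        new_node = Node(data)
        if position == 0:
            new_node.next = self.head
            self.head = new_node
            return True
        prev = self._node_before(position)
        if prev is None:
            return False  # position is past the end of the list
        new_node.next = prev.next
        prev.next = new_node
        return True

    def delete_at_position(self, position):
        if position < 0 or self.head is None:
            return False
        if position == 0:
            self.head = self.head.next
            return True
        prev = self._node_before(position)
        if prev is None or prev.next is None:
            return False  # nothing to delete at that index
        prev.next = prev.next.next
        return True

    def to_list(self):
        items, current = [], self.head
        while current:
            items.append(current.data)
            current = current.next
        return items


ll = LinkedList()
ll.insert_at_position("a", 0)
ll.insert_at_position("c", 1)
ll.insert_at_position("b", 1)
print(ll.to_list())  # ['a', 'b', 'c']
```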

 

Interestingly, it produces a different and more complete implementation of the linked list, and it seems to have captured the basic shape of the data structure. The position-based methods deserve a check before use: as written, `insert_at_position` places the new node after the node at the given index. We are not measuring how much of this comes from memorization, but this kind of code looks like something recalled entirely from memory. So we will now leave the basics and move on to simple programs.

We won't go straight into long, text-heavy programming questions, since those need a lot of reasoning. Instead, we will use questions that test simple implementations in simple languages. Let's try, for example, Fibonacci series questions.

(a) Write a simple Fibonacci program:

 

write code for Fibonacci series generation in python.

 

nterms = int(input("How many terms? "))

# first two terms
n1, n2 = 0, 1
count = 0

# check if the number of terms is valid
if nterms <= 0:
   print("Please enter a positive integer")
elif nterms == 1:
   print("Fibonacci sequence upto",terms,":")
   print(n1)
else:
   print("Fibonacci sequence:")
   while count < nterms:
       print(n1)
       nth = n1 + n2
       # update values
       n1 = n2
       n2 = nth
       count += 1

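So it writes reasonable code using the simple recurrence. There is one slip: the `nterms == 1` branch prints `terms`, which is never defined, so that branch would raise a NameError. It does read more like a script than a function, but we are only checking the logic here.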
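The same recurrence reads more cleanly as a function that returns the sequence, which also removes the need for a separate one-term branch. This is a minimal sketch; the function name `fibonacci` and the choice to raise `ValueError` for invalid input are our assumptions.

```python
def fibonacci(nterms):
    """Return the first nterms Fibonacci numbers, starting from 0."""
    if nterms <= 0:
        raise ValueError("nterms must be a positive integer")
    sequence = []
    n1, n2 = 0, 1
    for _ in range(nterms):
        sequence.append(n1)
        # Advance the pair: (F(k), F(k+1)) -> (F(k+1), F(k+2)).
        n1, n2 = n2, n1 + n2
    return sequence


print(fibonacci(1))  # [0]
print(fibonacci(8))  # [0, 1, 1, 2, 3, 5, 8, 13]
```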

Language problems:

Let's now try a few problems stated in plain language and see whether GPT-3 can handle them.

1. Write a program that asks the user for their name and greets them with their name.

Prompt:

Write a program that asks the user for their name and greets them with their name.  
---
Output:
name = input("What is your name? ")  
print("Hello, " + name + "!")
 
2. 

Write a program that asks the user for a number n and gives them the possibility to choose between computing the sum and computing the product of 1,…,n.

def compute_sum(n):
    total = 0
    for i in range(1, n+1):
        total = total + i
    return total

def compute_product(n):
    product = 1
    for i in range(1, n+1):
        product = product * i
    return product

def main():
    n = int(input("Enter a number: "))
    choice = input("Do you want to compute the sum or the product? ")
    if choice == "sum":
        result = compute_sum(n)
    elif choice == "product":
        result = compute_product(n)
    else:
        print("Invalid choice")
        return
    print("The result is", result)

main()

 

Okay. So it can solve simple logical problems too. Let's now step the logic up a bit and try problems that take two or three lines to state.

3.

Write a guessing game where the user has to guess a secret number. After every guess the program tells the user whether their number was too large or too small. At the end the number of tries needed should be printed. It counts only as one try if they input the same number multiple times consecutively.

 

I'm thinking of a number between 1 and 100.

Input:

50
Too low!
75
Too high!
63
Too high!
62
Too low!
61
Correct! You needed 5 tries.

__________________________________________________

!! This is the first time no code came out. But as you may notice, the word "program" or "code" is not mentioned in the prompt, so let's add it to help the model understand and see whether it still makes the mistake.

___________________________________________________

Write a program for a guessing game where the user has to guess a secret number. After every guess the program tells the user whether their number was too large or too small. At the end, the number of tries needed should be printed. It counts only as one try if they input the same number multiple times consecutively.

 

secret = 17
tries = 0
while True:
    guess = int(input("Guess the secret number: "))
    tries += 1
    if guess == secret:
        print("You guessed it! The secret number was", secret)
        print("It took you", tries, "tries to guess the secret number.")
        break
    elif guess > secret:
        print("Too high!")
    else:
        print("Too low!")

__________________________________________

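This time it does produce a program. Note, though, that it counts every input as a try, so the rule about consecutive repeated guesses is not implemented. Let's move on to something a bit more subtle.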
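The rule from the prompt about repeated guesses can be handled by remembering the previous guess and counting a try only when the guess changes. Below is a minimal sketch. It replays a list of guesses instead of reading from `input()` so that it can be run non-interactively; the function name `count_tries` is ours.

```python
def count_tries(guesses, secret):
    """Replay guesses in order; return (tries, feedback messages)."""
    tries = 0
    previous = None
    feedback = []
    for guess in guesses:
        # The same number entered several times in a row counts once.
        if guess != previous:
            tries += 1
        previous = guess
        if guess == secret:
            feedback.append("Correct!")
            break
        feedback.append("Too high!" if guess > secret else "Too low!")
    return tries, feedback


tries, feedback = count_tries([50, 50, 10, 10, 10, 17], secret=17)
print(tries)     # 3
print(feedback)
```

In an interactive version, the same `previous` check would sit inside the `while True` loop, just before `tries += 1`.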

Easy level problems:

___________________________________________
Question (prompt):

Implement a data structure for graphs that allows modification (insertion, deletion). It should be possible to store values at edges and nodes. It might be easiest to use a dictionary of (node, edgelist) to do this.  
 
Output:
 
A graph is a collection of nodes and edges. Nodes are connected by edges. Edges can have values assigned to them. It would be possible to store a graph in a dictionary where the keys are the nodes and the values are the edges. To insert a new node, simply add a new key to the dictionary. To insert a new edge, add a new key-value pair to the dictionary. To delete a node, simply delete the key from the dictionary. To delete an edge, delete the key-value pair from the dictionary.

___________________________________________

So "implement a data structure" also doesn't trigger the model to write code. Just to check, we will explicitly tell the model to write a program for it.
 ___________________________________________

Write a program to implement a data structure for graphs that allows modification (insertion, deletion). It should be possible to store values at edges and nodes. It might be easiest to use a dictionary of (node, edgelist) to do this.

A graph is a data structure that consists of a set of nodes (vertices) and a set of edges connecting them. There are many different ways to represent a graph, but one common way is to use a dictionary of (node, edgelist) pairs, where edgelist is a list of the nodes that are connected to the node.
For example, the following dictionary represents a graph with three nodes (A, B, and C) and three edges (A-B, B-C, and C-A): graph = { 'A' : ['B', 'C'], 'B' : ['C'], 'C' : ['A'] }
This dictionary can be used to represent any graph, no matter how complicated.
 
_____________________________________________
 
So this is still not working. We will reuse the prompt but cut the "a data structure for" part and see whether the model writes an implementation.
______________________________________________

Write a program to implement graphs that allows modification (insertion, deletion). It should be possible to store values at edges and nodes. It might be easiest to use a dictionary of (node, edgelist) to do this.

 

def graph(nodes, edges):
    graph = {}
    for node in nodes:
        graph[node] = []
    for edge in edges:
        graph[edge[0]].append(edge[1])
        graph[edge[1]].append(edge[0])
    return graph

def add_node(graph, node):
    if node in graph:
        print("Node already in graph")
    else:
        graph[node] = []

def add_edge(graph, edge):
    if edge[0] in graph and edge[1] in graph:
        graph[edge[0]].append(edge[1])
        graph[edge[1]].append(edge[0])
    else:
        print("Nodes not in graph")

def delete_node(graph, node):
    if node in graph:
        del graph[node]
        for key in graph:
            for i in range(len(graph[key])):
                if graph[key][i] == node:
                    del graph[key][i]
                    break
    else:
        print("Node not in graph")

def delete_edge(graph, edge):
    if edge[0] in graph and edge[1] in graph:
        for i in range(len(graph[edge[0]])):
            if graph[edge[0]][i] == edge[1]:
                del graph[edge[0]][i]
                break
        for i in range(len(graph[edge[1]])):
            if graph[edge[1]][i] == edge[0]:
                del graph[edge[1]][i]
                break
    else:
        print("Nodes not in graph")

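___________________________________________

So, yeah. This finally produced code. Note that it handles insertion and deletion of nodes and edges but does not store values at them, as the prompt asked.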
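To cover the part of the prompt about storing values at nodes and edges, one option is to keep a dictionary of node values and a dictionary of per-node neighbor maps. This is a minimal sketch for an undirected graph, not what GPT-3 produced; the class name `Graph` and its attribute names are our assumptions.

```python
class Graph:
    def __init__(self):
        self.node_values = {}  # node -> value stored at the node
        self.edges = {}        # node -> {neighbor: value stored at the edge}

    def add_node(self, node, value=None):
        self.node_values[node] = value
        self.edges.setdefault(node, {})

    def add_edge(self, a, b, value=None):
        if a not in self.edges or b not in self.edges:
            raise KeyError("both endpoints must be in the graph")
        # Undirected: record the edge value in both directions.
        self.edges[a][b] = value
        self.edges[b][a] = value

    def delete_edge(self, a, b):
        self.edges[a].pop(b, None)
        self.edges[b].pop(a, None)

    def delete_node(self, node):
        # Remove every edge that touches the node, then the node itself.
        for neighbor in list(self.edges[node]):
            self.edges[neighbor].pop(node, None)
        del self.edges[node]
        del self.node_values[node]


g = Graph()
g.add_node("A", value=1)
g.add_node("B", value=2)
g.add_edge("A", "B", value=5)
print(g.edges["B"]["A"])  # 5
```

Using a neighbor map instead of a list also avoids duplicate edges and makes deletion a single dictionary operation.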

Conclusion:

In short, GPT-3 can code like a beginner when it gets no extra help beyond an explicit prompt that asks for a program. But getting it to write code for more complicated tasks, such as problems stated in plain language, will be challenging. We will need more prompt engineering, as well as ideas for automatically breaking a long problem down into pieces that GPT-3 can handle. I will write a follow-up blog on this topic. Stay tuned.
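Several prompts above returned prose instead of code. When running many prompts, a quick automatic check can help to spot those cases. Here is a minimal sketch using Python's standard `ast` module. The function name `looks_like_python` and the rule that bare non-call expressions do not count as code are our assumptions, and the check is a heuristic only: it says nothing about whether the code is correct.

```python
import ast


def looks_like_python(text):
    """Heuristic: does text parse as Python and contain a real statement?"""
    try:
        tree = ast.parse(text)
    except SyntaxError:
        return False

    def is_code(node):
        # A lone word such as "Hello" parses as an expression statement,
        # so only count expression statements that are function calls.
        if isinstance(node, ast.Expr):
            return isinstance(node.value, ast.Call)
        return True

    return any(is_code(node) for node in tree.body)


print(looks_like_python('name = input("What is your name? ")'))          # True
print(looks_like_python("A graph is a collection of nodes and edges."))  # False
```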
