We wanted to do something with text generation
We like poetry :D
Haikus are short and stick to a strict outline
Haikus are fun! And it's interesting to see a machine create art
We used the Haikuzao dataset to train our AI model. During data preprocessing, after condensing each poem into a single string in a list, we had to clean some of the poems because their content was inappropriate.
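Here is a minimal sketch of what a cleaning step like this could look like, assuming each poem is already one string in a list and that "cleaning" means dropping any poem that contains a blocked word. The names `BLOCKLIST` and `clean_poems`, the placeholder words, and the sample poems are all hypothetical and are not taken from our actual code.

```python
import re

# Hypothetical placeholder words; a real blocklist would be much longer.
BLOCKLIST = {"badword", "rudeword"}

def clean_poems(poems):
    """Return only the poems that contain no blocklisted word."""
    kept = []
    for poem in poems:
        # Lowercase and extract words so matching ignores case and punctuation.
        words = set(re.findall(r"[a-z']+", poem.lower()))
        if words.isdisjoint(BLOCKLIST):
            kept.append(poem)
    return kept

# Each poem is one string, with " / " marking the line breaks.
poems = [
    "Morning fog lifts slow / the pond remembers the moon / a frog clears its throat",
    "This one has a BadWord / so it should be taken out / of the training set",
]
cleaned = clean_poems(poems)
print(len(cleaned))
```

A word-level blocklist is only one possible approach. It is easy to audit, but it can miss poems that are inappropriate without using any listed word.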
We used GPT-2 by OpenAI for our model! GPT is a deep learning language model that can generate many kinds of human-like text, from poems to stories, by predicting one token at a time.
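To give a feel for token-by-token prediction, here is a toy sketch. It is not GPT-2: it swaps in a simple bigram counter that only looks at the single previous token, whereas GPT-2 uses a neural network over the whole previous context. What the two share is the loop: predict the next token, append it, repeat. The function names and the tiny corpus are made up for illustration.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each token, which tokens follow it in the training text."""
    tokens = text.split()
    follows = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        follows[current][nxt] += 1
    return follows

def generate(follows, start, max_tokens=6):
    """Greedy generation: repeatedly append the most frequent next token."""
    output = [start]
    for _ in range(max_tokens - 1):
        candidates = follows.get(output[-1])
        if not candidates:
            break  # no known continuation for the last token
        next_token, _count = candidates.most_common(1)[0]
        output.append(next_token)
    return " ".join(output)

corpus = "the frog jumps in the pond and the frog jumps out"
model = train_bigrams(corpus)
print(generate(model, "the", max_tokens=3))
```

A real generator usually samples from the predicted probabilities instead of always taking the top choice, which is why the same prompt can produce different poems.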
We used these articles to help us understand GPT-2!
GPT is a language model that specializes in predicting the next word from the previous context, using masked self-attention. (Other models, such as BERT, use unmasked self-attention, which looks in both directions and allows "fill in the blank" situations.)
Masked self-attention prevents the model from seeing tokens to the right of the position being calculated. (In the image above, that is position 4.)
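The masking idea can be sketched in a few lines of plain Python. This is an illustration under simplifying assumptions, not GPT-2's actual code: the scores are made up, there is a single attention pattern, and positions are counted from 1 in the comments to match the "position 4" example. The function name `causal_attention_weights` is hypothetical.

```python
import math

def causal_attention_weights(scores):
    """Apply a causal mask and a softmax to a square matrix of attention scores.

    scores[i][j] is how strongly position i wants to look at position j.
    Positions j > i (to the right) are masked out before the softmax.
    """
    n = len(scores)
    weights = []
    for i in range(n):
        # Keep scores for positions 0..i; everything to the right becomes -inf.
        masked = [scores[i][j] if j <= i else float("-inf") for j in range(n)]
        top = max(masked)
        exps = [math.exp(s - top) for s in masked]  # exp(-inf) is 0.0
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Made-up scores for a 5-token sequence; every score is equal for simplicity.
scores = [[1.0] * 5 for _ in range(5)]
weights = causal_attention_weights(scores)

# Row index 3 is the 4th position: it can attend to tokens 1-4 but never token 5.
print([round(w, 2) for w in weights[3]])
```

Setting the masked scores to negative infinity before the softmax gives them a weight of exactly zero, so tokens to the right cannot influence the prediction.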
Each layer is a decoder block. The model stacks 12 of these layers, and each layer has 12 different attention mechanisms (heads), for a total of 144 (12 x 12) attention patterns. These help the AI distribute importance across the relationships between individual words in a sentence.
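The head count can be worked through in a short sketch. It assumes the smallest GPT-2 configuration (12 layers, 12 heads per layer, 768 numbers per token), which matches the 12 x 12 count above but is an assumption about which model size was used. The helper `split_into_heads` is a hypothetical name, and it slices a plain list for simplicity. In the real model the slicing is applied to the projected query, key, and value vectors.

```python
NUM_LAYERS = 12        # decoder layers (assumed: smallest GPT-2 configuration)
HEADS_PER_LAYER = 12   # attention heads in each layer
HIDDEN_SIZE = 768      # assumed length of each token's vector in that configuration

def split_into_heads(vector, num_heads):
    """Split one token's vector into equal slices, one per attention head."""
    head_size = len(vector) // num_heads
    return [vector[h * head_size:(h + 1) * head_size] for h in range(num_heads)]

token_vector = [0.0] * HIDDEN_SIZE  # placeholder values
slices = split_into_heads(token_vector, HEADS_PER_LAYER)

# Every (layer, head) pair produces its own attention pattern.
total_patterns = NUM_LAYERS * HEADS_PER_LAYER
print(total_patterns, len(slices), len(slices[0]))
```

Because each head works on its own slice, different heads are free to focus on different relationships between words.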
Picking the function of our generator! We knew we wanted to do something in line with text generation. We ended up choosing to create a haiku generator.
Gathering data! We found multiple Kaggle datasets but ended up using the Haikuzao dataset on GitHub.
Understanding GPT (Generative Pre-trained Transformer), the core of our model.
Training our model on the Haikuzao dataset, modifying settings to adjust the poems.
Cleaning some of the haikus in the dataset.
Combining the post-processing and the model to create our final haiku generator.
Designing a website to house our product and express our ambition.
Errors and mistakes in code / debugging
cocalcg8 server being slow, and only tolerating one person running the code at once
Cocalc3 being slow
The AI is trilingual, and some of the haikus in the dataset were not PG-13
Hi, I'm Asher. I like finding new projects to work on with friends. From cybersecurity to conlanging, I like to do it all. I love to code and have been doing it for years, having started off with Python.
Hi, I'm a rising sophomore. I speak both French and English! I love math, computer science, and web design. When I'm not coding, I'm drawing or hanging out with my two cats, Helo and Starbuck.
Hi, I'm Jay, and I like setting and achieving new goals, especially difficult ones. There is no better feeling than triumph and working to achieve it, especially when working with a team as great as this one.
Hi, I am about to go into tenth grade, and some of the things I like to do are playing soccer and basketball and, of course, coding. For this project I helped create the website and the text generation section.
Hi, I'm Justin, and I served as the team's product manager and instructor! I am passionate about education and hold a particular interest in fairness in machine learning.
Hi, I’m Anabelle. I'll be starting my freshman year in high school this fall. I hope to go on to have a job in programming. I enjoy reading, hanging out with friends, and learning new things about AI!
Hi, I'm Owen. I will be starting my freshman year in autumn of 2021. My hobbies are math, physics, and playing the piano for fun or practice. I enjoy hikes, having fun with friends, and science.