How Trexel Bot Worked

a screenshot of trexel bot's twitter profile

Trexel bot was a project I made in February of 2021. It was code that took lines from the transcripts of the podcast Stellar Firma, specifically Trexel Geistman’s lines, and processed them to randomly generate new lines similar to what Trexel said in the podcast. In this post I’ll explain why I shut the bot down, share its best tweets, and describe how it worked.

This post is broken into sections based on what you may want to read. Here’s the breakdown.

  1. Why I shut it down
  2. Some of the best quotes
  3. How the bot worked

Why I shut it down

I spoke briefly about this in a Twitter thread, but I’ll go into more detail here.

Maintaining Trexel bot was tiring, to be honest. Many Twitter bots generate their quotes and post them automatically with no human checking. This is efficient and means those bots can run with little to no maintenance. However, Trexel bot’s engine was incredibly flawed and had a habit of producing content that was mostly gibberish. This meant that for the bot to post good content, each quote had to be checked and approved.

The checking process I developed was a simple text-based program. You were presented with a quote; if you liked it, you could approve it, otherwise you could skip it. This would continue until you pressed end. However, this process is very repetitive, and it gets boring fast when 95% of the content produced is gibberish.

an image of a console, quotes are approved as they come through
An image of how Trexel bot’s quotes are approved
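
The approval loop described above can be sketched in a few lines of Python. This is an illustration rather than the original program: the function name `review_quotes` and the `y` / `end` commands are assumptions, and the post does not say what language the real tool was written in.

```python
from typing import Callable, Iterable, List


def review_quotes(
    quotes: Iterable[str],
    ask: Callable[[str], str] = input,
) -> List[str]:
    """Show quotes one at a time and return the ones the reviewer approved.

    Hypothetical commands: 'y' approves, 'end' stops, anything else skips.
    """
    approved = []
    for quote in quotes:
        answer = ask(f"{quote}\nApprove? [y / n / end]: ").strip().lower()
        if answer == "end":
            break
        if answer == "y":
            approved.append(quote)
    return approved
```

Calling `review_quotes(generated_quotes)` prompts on the console. Passing a different `ask` function makes the loop easy to drive from a script instead of the keyboard.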

With all this, and the initial excitement of a new project long gone, I can’t find the motivation to keep generating good quotes for the bot.

I considered upgrading to a more refined existing engine, such as GPT-2. However, Stellar Firma is no longer being produced, so putting all the effort into this would most likely be wasted, since there’s not likely to be a steady stream of people finding the bot and enjoying its content.

These factors combined make me believe the only reasonable decision I have left is to stop generating quotes for Trexel bot and shut it down.

I’m sad to do this, as the bot has produced some very good quotes. I’ve gotten some laughs from content made by the bot, and I’ve seen people share the quotes and laugh. It felt good.

However, it’s just not feasible, and the bot hasn’t really produced any good content like that in months now, so I’m shutting it down. In a way I feel relieved as this has been on my mind for a while, as I’ve wanted to clean up old projects that weren’t really worth maintaining any more, so I can focus on more important things.

I want to thank everyone who shared and liked Trexel bot’s quotes, who helped share the good content the bot produced. I also want to thank @reefsharkivist on Twitter for the amazing profile picture of the bot!

Some of the best quotes

I have a few personal favorite quotes from the bot, which I’m going to list here. Feel free to comment with your own!

Note: While none of these tweets are really NSFW, some may look a bit questionable out of context. This is more of an indication to make sure no one looks over your shoulder, rather than a viewer discretion advisory.

Tweet from @BotTrexel -

"TREXEL: gender."
I think this was the most popular post the bot ever made.

Tweet from @BotTrexel -

"TREXEL: I 'm high-roading you on this, Knifeplay."
This led to me messaging my friends to ask why this tweet was more popular than usual, and to an awkward conversation about the definition of ‘Knifeplay’.

Tweet from @BotTrexel -

"TREXEL : Oh, wow, a ducky!"
An oddly wholesome tweet by the bot, compared to some of its other content.

Tweet from @BotTrexel -

"TREXEL : Yes ,I am late ."
An incredibly accurate tweet, closely representing Trexel.

Tweet from @BotTrexel -

"cw // nsfw

TREXEL : harder, so much cream!"
Eventually the bot posted so much questionable stuff that I had to add a content warning filter.

How the bot worked

The bot used a very simple method. It would process the transcript one line at a time, extracting different types of vocabulary from each line. These included:

  • Singular nouns
  • Plural nouns
  • Singular verbs
  • Plural verbs

It kept a record of these in a CSV file. After extracting this info, it would replace each of these words with a corresponding token, such as %p_verb for plural verbs or %s_noun for singular nouns. It would keep a record of each of these tokenized sentences, which would be used later on in the generation process.
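
To make the extraction step concrete, here is a minimal Python sketch. It is not the bot's actual code. The post does not say which part-of-speech tagger was used, so the sketch substitutes a tiny hand-written lookup table (`TOY_TAGS`) for one, and the two-column CSV layout is likewise an assumption.

```python
import csv
import io
import re

# Toy stand-in for a part-of-speech tagger (assumption): word -> token kind.
TOY_TAGS = {
    "notes": "p_noun",
    "entities": "p_noun",
    "house": "s_noun",
}

WORD_RE = re.compile(r"[A-Za-z']+")


def tokenize_line(line, tags=TOY_TAGS):
    """Return (template, extracted), where extracted is a list of (kind, word)."""
    extracted = []

    def swap(match):
        word = match.group(0)
        kind = tags.get(word.lower())
        if kind is None:
            return word  # not a tracked word type, leave it in place
        extracted.append((kind, word.lower()))
        return "%" + kind

    return WORD_RE.sub(swap, line), extracted


def vocab_to_csv(extracted):
    """Serialize the de-duplicated vocabulary as CSV text (hypothetical layout)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["kind", "word"])
    writer.writerows(sorted(set(extracted)))
    return buffer.getvalue()


template, extracted = tokenize_line("I have the notes! I have the notes!")
print(template)  # I have the %p_noun! I have the %p_noun!
print(vocab_to_csv(extracted))
```

A real tagger would replace the lookup table, but the shape of the step stays the same: each tagged word is swapped for its token and remembered for later.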

The generation process took the list of tokenized sentences and replaced the tokens in them with already extracted words. For example, the tokenized sentence I like this %s_noun would have %s_noun replaced with an already extracted singular noun. So if the singular noun *house* was extracted from the script, the sentence could become I like this house.
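
The substitution step might look roughly like the Python sketch below. The names `generate` and `vocab` are illustrative, not taken from the bot's source. One design choice is a guess: the worked example in this post shows the same word filling every slot in a sentence, so the sketch draws one word per token type and reuses it, though the original may simply have picked at random each time.

```python
import random
import re

# Matches the placeholder tokens described above, e.g. %s_noun or %p_verb.
TOKEN_RE = re.compile(r"%([sp]_(?:noun|verb))")


def generate(template, vocab, rng=random):
    """Fill each placeholder in `template` with a word from `vocab`.

    `vocab` maps a token kind such as "s_noun" to a list of extracted words.
    One word is drawn per kind and reused for every slot of that kind.
    """
    chosen = {}

    def fill(match):
        kind = match.group(1)
        words = vocab.get(kind)
        if not words:
            return match.group(0)  # nothing extracted yet, keep the token
        if kind not in chosen:
            chosen[kind] = rng.choice(words)
        return chosen[kind]

    return TOKEN_RE.sub(fill, template)


vocab = {"s_noun": ["house"], "p_noun": ["entities"]}
print(generate("I like this %s_noun", vocab))  # I like this house
print(generate("I have the %p_noun! I have the %p_noun!", vocab))
# I have the entities! I have the entities!
```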

Example

The following example is from the transcript of episode 1 of Stellar Firma.

The original line in the transcript is as follows.

[crosstalk] I have the notes! I have the notes! I have the notes!

After processing this line and tokenizing it, it looks like this.

crosstalk ] I have the %p_noun! I have the %p_noun! I have the %p_noun!

Finally, when tried in the generator, it produced this result.

crosstalk ] I have the entities! I have the entities! I have the entities!

An interesting observation about the bot’s processing is its tendency to trip over punctuation and interrupted words. This can be seen in the example above, where the bot cuts off the opening bracket from the line. The transcripts showed a character being interrupted in conversation by cutting their line off in the transcript as well, like this.

DAVID: I— I—
TREXEL: David, I don’t need to tell you. [David continues stuttering] David, do I need to tell you how to submit a brief?
DAVID: Eeeeehhh…

This confused the bot, as it would frequently end up processing a lot of these cut-off words and sorting them into the vocab files. This became so much of a problem (because the characters interrupt each other so much in the podcast) that I eventually added a blacklist function to the processor.

Blacklist function

This blacklist functionality prevented the bot from using these cut-off words, based on a number of predefined regular expressions in a text file (blacklist.txt). This presented challenges of its own. I would frequently get tripped up by Unicode characters that looked similar to ASCII characters. In some cases I would write regular expressions to block these ASCII patterns, then become confused as to why the words were still showing up, not realizing the characters were Unicode.
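
One way to catch these lookalike characters before writing a pattern is to print the code point and name of anything outside ASCII. The helper below is a small Python sketch for that purpose (`describe_chars` is a made-up name, not part of the bot):

```python
import unicodedata


def describe_chars(text):
    """Return (char, code point, Unicode name) for every non-ASCII character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
        if ord(ch) > 127
    ]


# An interrupted word and a curly apostrophe, as found in the transcripts.
for row in describe_chars("I\u2014 don\u2019t"):
    print(row)
# ('—', 'U+2014', 'EM DASH')
# ('’', 'U+2019', 'RIGHT SINGLE QUOTATION MARK')
```

If a word that should be blocked shows up in this output, the regular expression needs the Unicode character rather than the ASCII hyphen or apostrophe it resembles.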

If you intend to use the bot for another script that’s written differently, you’ll likely run into the same issue. The best option is simply to copy and paste the characters into a regular expression tool like RegExr and build the expressions from there. For example, if I wanted to block words ending in — because I knew the bot wouldn’t process those words correctly, I would use a regex like —{1,}$. This would stop all words ending in that character from being counted, and would reduce the chance of the bot misinterpreting such a word and using it later on in the generator.
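
As a rough illustration of how such a blacklist could be applied, the Python sketch below compiles one regular expression per line and drops any word that matches. The one-pattern-per-line format for blacklist.txt and the function names are assumptions, and the patterns are inlined here so the example runs without the file.

```python
import re


def load_blacklist(lines):
    """Compile one regular expression per non-empty line."""
    return [re.compile(line.strip()) for line in lines if line.strip()]


def is_blacklisted(word, patterns):
    """True if any blacklist pattern matches somewhere in the word."""
    return any(pattern.search(word) for pattern in patterns)


# "\u2014" is the em dash used in the transcripts for interrupted words.
patterns = load_blacklist(["\u2014{1,}$"])
words = ["I\u2014", "notes", "brief"]
kept = [word for word in words if not is_blacklisted(word, patterns)]
print(kept)  # ['notes', 'brief']
```

With a real file, `load_blacklist(open("blacklist.txt", encoding="utf-8"))` would read the same patterns. Opening the file as UTF-8 matters here, since the patterns themselves contain non-ASCII characters.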

Recalling all of this, I see now that the generator was very hacky by nature, and thus accumulated a lot of technical debt during development. This only increased with the number of episodes that rolled out over time, which increased the data the bot had and the potential for something to go wrong. I admit this was also a factor in encouraging me to shut down the bot, as re-working this into a better engine was going to be even more difficult given my constant hacky fixes and accrued technical debt.

The source code of the engine is available for use. You are free to use it for any script and create another Twitter bot if you wish to do so. If you do, please link back to this blog post (or the source code of the bot, if you prefer). If you know someone who would benefit from this, please share it.
