Programming A2Z - Week 2

Regex Connections

Regular expressions: super powerful, definitely not something that comes easily to me! After toying around with Regex Golf and Crosswords a bit, I want to say I have a decent idea of what they’re capable of doing, but the syntax is still a bit beyond me at the moment.

That’s a shame, because I’m really into the idea that I’ve got in mind for a Regex game. Have you played Connections?

This is the first thing the Times game section has done post-Wordle acquisition that seems to be capturing any of that magic. The format is pretty simple but the puzzle crafting is, in a word, devious: Players must sort 16 words into 4 groups of 4, each defined by their common connection.

These connection categories can be just about anything and double meanings abound. The 4 are ranked by difficulty (per the puzzle constructor’s judgment) and players get their guess results formatted as a sequence of emoji blocks. You can only make mistakes 4 times before failing. One nice thing: any previously made incorrect guesses can’t be accidentally submitted again. Words can also be shuffled around on the page to aid players in discovering potential connections.

Here are the words from September 22nd’s puzzle:

FLOAT, MALT, SHAKE, SUNDAE

CONCRETE, FIRM, SOLID, TANGIBLE

GLASS, OLD, SIGNS, SPLIT

DASH, HOVER, KEY, STAR

They’re sorted into their categories, in order of difficulty. The first two are pretty obvious: the first row are all things you can order at a soda fountain, and the second are all synonyms for “set.” The third seems completely unrelated at first glance unless you’re a fan of twists: they’re all M. Night Shyamalan movie titles. The last group is all prefixes for nouns ending in “board.” Note the tricky combo across the easiest and hardest categories: you’re meant to wonder if float and hover are supposed to be sorted together.

My idea for a Regex version involves the same puzzle structure, but each category is defined by an expression that keys to those 4 words (and ideally no more than 1 or 2 words in other categories). Here’s an example category:

BUTTE, SHUTTLE, SERVIETTE, BUTTERCUP

At the moment, what I’ve got for that category is “ttl?e” as the unifying expression. That works, but it would also catch “PEANUT BUTTER,” another entry I’m including in my working draft of the puzzle. Design-wise I see 3 options here:

  1. Make the category PEANUT BUTTER belongs to obvious enough that a reasonably careful player won’t fall for lumping it in with the words like BUTTERCUP and BUTTE.
  2. Instead of “ttl?e,” define the category by an even stricter expression that doesn’t catch PEANUT BUTTER.
  3. Delete PEANUT BUTTER from the puzzle and go another route.
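
For option 2, here is a rough sketch in JavaScript (since p5 is the plan) of what a stricter expression could look like. The pattern `^\w*ttl?e` is only one guess at a fix, not the only one: it assumes the "tte" or "ttle" has to appear inside the first word of the entry, which rules out anything where it shows up after a space. The variable names are made up for illustration.

```javascript
// The four category words plus the entry that sneaks in.
const tteWords = ["BUTTE", "SHUTTLE", "SERVIETTE", "BUTTERCUP"];
const decoy = "PEANUT BUTTER";

// Original expression: matches "tte" or "ttle" anywhere in the string.
const loose = /ttl?e/i;

// Assumed stricter version: only word characters may come before the
// "tt", starting from the beginning, so a space earlier in the entry
// blocks the match.
const strict = /^\w*ttl?e/i;

const candidates = [...tteWords, decoy];
const looseHits = candidates.filter((w) => loose.test(w));
const strictHits = candidates.filter((w) => strict.test(w));

console.log(looseHits);  // all five entries, PEANUT BUTTER included
console.log(strictHits); // only BUTTE, SHUTTLE, SERVIETTE, BUTTERCUP
```

The trade-off is that the expression gets harder for a player to guess, so option 1 or 3 may still be the better puzzle design.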

Here are the other three categories I’m working with. Some of the expressions I’m not quite sure how to write.

.*but

ATTRIBUTE, TRIBUTE, DEBUT, ABUTMENT
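
One thing worth checking with this expression: `.*` is allowed to match nothing at all, so `.*but` behaves like a plain "contains but" test and also catches BUTTE, BUTTERCUP, and PEANUT BUTTER. The sketch below compares it with a tighter guess, `^\w+but`, which assumes the intent is "at least one letter before the but, all within the first word." That reading of the category is an assumption, and the variable names are invented for the example.

```javascript
const butWords = ["ATTRIBUTE", "TRIBUTE", "DEBUT", "ABUTMENT"];
const otherEntries = ["BUTTE", "BUTTERCUP", "PEANUT BUTTER"];

// As drafted: ".*" may match zero characters, so this is "contains but".
const draft = /.*but/i;

// Assumed tighter version: one or more word characters, starting at the
// beginning of the entry, must come directly before "but".
const tighter = /^\w+but/i;

const draftLeaks = otherEntries.filter((w) => draft.test(w));
const tighterLeaks = otherEntries.filter((w) => tighter.test(w));
const tighterKeepsCategory = butWords.every((w) => tighter.test(w));

console.log(draftLeaks);           // BUTTE, BUTTERCUP, PEANUT BUTTER
console.log(tighterLeaks);         // empty
console.log(tighterKeepsCategory); // true
```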

\w{6}+\s|\-\w{6}+

[This doesn’t work, but what I want is six characters, followed by a space or hyphen, followed by six characters]

PEANUT BUTTER, SEARCH ENGINE, ABSENT-MINDED, BLUISH-PURPLE
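
A possible working version of that description, sketched in JavaScript: `^\w{6}[\s-]\w{6}$`. Two likely reasons the draft misbehaves are that an unbracketed `|` splits the whole pattern into two alternatives ("six characters and a space" OR "a hyphen and six characters"), and that `{6}` already means "exactly six," so the extra `+` isn't needed. Putting the space and hyphen in a character class keeps the choice local to that one position. The anchors `^` and `$` are an added assumption so that longer words can't match partway through.

```javascript
// Exactly six word characters, then one space or hyphen, then exactly six more.
const sixSix = /^\w{6}[\s-]\w{6}$/;

const pairs = ["PEANUT BUTTER", "SEARCH ENGINE", "ABSENT-MINDED", "BLUISH-PURPLE"];
const nonPairs = ["BUTTERCUP", "COUNTERATTACK", "TABLECLOTH"];

const allPairsMatch = pairs.every((w) => sixSix.test(w));
const anyNonPairMatches = nonPairs.some((w) => sixSix.test(w));

console.log(allPairsMatch);     // true
console.log(anyNonPairMatches); // false
```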

[words with 10 or more letters, no non-alphabet characters]

COUNTERATTACK, FABRICATOR, JACKASSERY, TABLECLOTH
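
That description translates fairly directly. A minimal sketch, assuming "letters" means A through Z only and that case shouldn't matter: `^[a-z]{10,}$` with the case-insensitive flag. Run against all 16 draft words from this post, it should pick out just this category, since the two-part entries contain a space or hyphen and everything else is nine letters or shorter.

```javascript
// Ten or more letters from start to end, nothing else allowed.
const longWord = /^[a-z]{10,}$/i;

// The full draft board from the post.
const board = [
  "BUTTE", "SHUTTLE", "SERVIETTE", "BUTTERCUP",
  "ATTRIBUTE", "TRIBUTE", "DEBUT", "ABUTMENT",
  "PEANUT BUTTER", "SEARCH ENGINE", "ABSENT-MINDED", "BLUISH-PURPLE",
  "COUNTERATTACK", "FABRICATOR", "JACKASSERY", "TABLECLOTH",
];

const longHits = board.filter((w) => longWord.test(w));
console.log(longHits); // COUNTERATTACK, FABRICATOR, JACKASSERY, TABLECLOTH
```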

I’ve started dusting off what little p5 knowledge I have to implement the game. Just getting the layout of buttons has already proven challenging, and to get to a point that only replicates the functionality of the original game would be a lot of work for me.

But in my opinion the really cool way to implement this game would be not to group the categories into their predefined sets, but to allow any solution involving 4 unique regular expressions that each match only a unique group of 4 words in the set. I think that would mean building regular expression parsing into the game, which is way beyond me at the moment.
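
The parsing part may be less daunting than it sounds, because JavaScript's built-in `RegExp` constructor can compile a string typed by the player, so the game itself would only need to check the results. Below is a minimal sketch of such a check. The function name `isValidSolution` and the rules it enforces (four distinct expressions, each matching exactly four words, no word claimed twice) are one interpretation of the idea above, not a finished design.

```javascript
// Hypothetical helper: do these four pattern strings solve the board?
function isValidSolution(words, patternSources) {
  if (patternSources.length !== 4) return false;
  if (new Set(patternSources).size !== 4) return false; // must be four distinct expressions

  const claimed = new Set();
  for (const source of patternSources) {
    let re;
    try {
      re = new RegExp(source, "i"); // the built-in engine parses the player's input
    } catch (err) {
      return false; // not a valid regular expression
    }
    const hits = words.filter((w) => re.test(w));
    if (hits.length !== 4) return false; // each expression must match exactly four words
    for (const w of hits) {
      if (claimed.has(w)) return false; // groups may not overlap
      claimed.add(w);
    }
  }
  return claimed.size === words.length; // every word ends up in some group
}

const draftBoard = [
  "BUTTE", "SHUTTLE", "SERVIETTE", "BUTTERCUP",
  "ATTRIBUTE", "TRIBUTE", "DEBUT", "ABUTMENT",
  "PEANUT BUTTER", "SEARCH ENGINE", "ABSENT-MINDED", "BLUISH-PURPLE",
  "COUNTERATTACK", "FABRICATOR", "JACKASSERY", "TABLECLOTH",
];

// One set of expressions that happens to work for this draft board.
const tightAttempt = ["^\\w*ttl?e", "^\\w+but", "^\\w{6}[\\s-]\\w{6}$", "^[a-z]{10,}$"];
// The looser "ttl?e" also matches PEANUT BUTTER, so this attempt fails.
const looseAttempt = ["ttl?e", "^\\w+but", "^\\w{6}[\\s-]\\w{6}$", "^[a-z]{10,}$"];

console.log(isValidSolution(draftBoard, tightAttempt)); // true
console.log(isValidSolution(draftBoard, looseAttempt)); // false
```

A check like this accepts any grouping the player can describe, not just the constructor's intended one, which is the point of the variant; whether that makes the puzzle too easy is a separate design question.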