Write an AI to generate crossword puzzles.
$ python generate.py data/structure1.txt data/words1.txt output.png
██████████████
███████M████R█
█INTELLIGENCE█
█N█████N████S█
█F██LOGIC███O█
█E█████M████L█
█R███SEARCH█V█
███████X████E█
██████████████
How might you go about generating a crossword puzzle? Given the structure of a crossword puzzle (i.e., which squares of the grid are meant to be filled in with a letter), and a list of words to use, the problem becomes one of choosing which words should go in each vertical or horizontal sequence of squares. We can model this sort of problem as a constraint satisfaction problem. Each sequence of squares is one variable, for which we need to decide on its value (which word in the domain of possible words will fill in that sequence).
Complete the implementation of enforce_node_consistency
, revise
, ac3
, assignment_complete
, consistent
, order_domain_values
, selected_unassigned_variable
, and backtrack
in generate.py
so that your AI generates complete crossword puzzles if it is possible to do so.
The enforce_node_consistency
function should update self.domains
such that each variable is node consistent.
- Recall that node consistency is achieved when, for every variable, each value in its domain is consistent with the variable’s unary constraints. In the case of a crossword puzzle, this means making sure that every value in a variable’s domain has the same number of letters as the variable’s length.
- To remove a value x from the domain of a variable
v
, sinceself.domains
is a dictionary mapping variables to sets of values, you can callself.domains[v].remove(x)
. - No return value is necessary for this function.
The revise
function should make the variable x
arc consistent with the variable y
.
x
andy
will both beVariable
objects representing variables in the puzzle.- Recall that
x
is arc consistent withy
when every value in the domain ofx
has a possible value in the domain ofy
that does not cause a conflict. (A conflict in the context of the crossword puzzle is a square for which two variables disagree on what character value it should take on.) - To make
x
arc consistent withy
, you’ll want to remove any value from the domain ofx
that does not have a corresponding possible value in the domain ofy
. - Recall that you can access
self.crossword.overlaps
to get the overlap, if any, between two variables. - The domain of
y
should be left unmodified. - The function should return
True
if a revision was made to the domain ofx
; it should returnFalse
if no revision was made.
The ac3
function should, using the AC3 algorithm, enforce arc consistency on the problem. Recall that arc consistency is achieved when all the values in each variable’s domain satisfy that variable’s binary constraints.
- Recall that the AC3 algorithm maintains a queue of arcs to process. This function takes an optional argument called
arcs
, representing an initial list of arcs to process. Ifarcs
isNone
, your function should start with an initial queue of all of the arcs in the problem. Otherwise, your algorithm should begin with an initial queue of only the arcs that are in the list arcs (where each arc is a tuple(x, y)
of a variablex
and a different variabley
). - Recall that to implement AC3, you’ll revise each arc in the queue one at a time. Any time you make a change to a domain, though, you may need to add additional arcs to your queue to ensure that other arcs stay consistent.
- You may find it helpful to call on the
revise
function in your implementation ofac3
. - If, in the process of enforcing arc consistency, you remove all of the remaining values from a domain, return
False
(this means it’s impossible to solve the problem, since there are no more possible values for the variable). Otherwise, returnTrue
.
The assignment_complete
function should (as the name suggests) check to see if a given assignment
is complete.
- An
assignment
is a dictionary where the keys areVariable
objects and the values are strings representing the words those variables will take on. - An assignment is complete if every crossword variable is assigned to a value (regardless of what that value is).
- The function should return
True
if the assignment is complete and returnFalse
otherwise.
The consistent
function should check to see if a given assignment
is consistent.
- An
assignment
is a dictionary where the keys areVariable
objects and the values are strings representing the words those variables will take on. Note that the assignment may not be complete: not all variables will necessarily be present in the assignment. - An
assignment
is consistent if it satisfies all of the constraints of the problem: that is to say, all values are distinct, every value is the correct length, and there are no conflicts between neighboring variables. - The function should return
True
if the assignment is consistent and returnFalse
otherwise.
The order_domain_values
function should return a list of all of the values in the domain of var
, ordered according to the least-constraining values heuristic.
var
will be aVariable
object, representing a variable in the puzzle.- Recall that the least-constraining values heuristic is computed as the number of values ruled out for neighboring unassigned variables. That is to say, if assigning
var
to a particular value results in eliminatingn
possible choices for neighboring variables, you should order your results in ascending order ofn
. - Note that any variable present in
assignment
already has a value, and therefore shouldn’t be counted when computing the number of values ruled out for neighboring unassigned variables. - For domain values that eliminate the same number of possible choices for neighboring variables, any ordering is acceptable.
- Recall that you can access
self.crossword.overlaps
to get the overlap, if any, between two variables. - It may be helpful to first implement this function by returning a list of values in any arbitrary order (which should still generate correct crossword puzzles). Once your algorithm is working, you can then go back and ensure that the values are returned in the correct order.
- You may find it helpful to
sort
a list according to a particularkey
: Python contains some helpful functions for achieving this.
The select_unassigned_variable
function should return a single variable in the crossword puzzle that is not yet assigned by assignment
, according to the minimum remaining value heuristic and then the degree heuristic.
- An
assignment
is a dictionary where the keys areVariable
objects and the values are strings representing the words those variables will take on. You may assume that the assignment will not be complete: not all variables will be present in the assignment. - Your function should return a
Variable
object. You should return the variable with the fewest number of remaining values in its domain. If there is a tie between variables, you should choose among whichever among those variables has the largest degree (has the most neighbors). If there is a tie in both cases, you may choose arbitrarily among tied variables. - It may be helpful to first implement this function by returning any arbitrary unassigned variable (which should still generate correct crossword puzzles). Once your algorithm is working, you can then go back and ensure that you are returning a variable according to the heuristics.
- You may find it helpful to
sort
a list according to a particularkey
: Python contains some helpful functions for achieving this.
The backtrack
function should accept a partial assignment
assignment as input and, using backtracking search, return a complete satisfactory assignment of variables to values if it is possible to do so.
- An
assignment
is a dictionary where the keys areVariable
objects and the values are strings representing the words those variables will take on. The input assignment may not be complete (not all variables will necessarily have values). - If it is possible to generate a satisfactory crossword puzzle, your function should reutrn the complete assignment: a dictionary where each variable is a key and the value is the word that the variable should take on. If no satisfying assignment is possible, the function should return
None
. - If you would like, you may find that your algorithm is more efficient if you interleave search with inference (as by maintaining arc consistency every time you make a new assignment). You are not required to do this, but you are permitted to, so long as your function still produces correct results. (It is for this reason that the
ac3
function allows anarcs
argument, in case you’d like to start with a different queue of arcs.)
You should not modify anything else in generate.py
other than the functions the specification calls for you to implement, though you may write additional functions and/or import other Python standard library modules. You may also import numpy
or pandas
, if familiar with them, but you should not use any other third-party Python modules. You should not modify anything in crossword.py
.