Wednesday, November 3, 2010

More Perl 5 - Comparing "split range" vs "test in loop"

Although the difference I found between the split range and the test within loop implementations were small, it seems to me it's wrong.

So I boiled the test down to a bar minimum. I created a benchmark test consisting of nothing but the loop. I had to do something within the loop, so I stored the number in a variable.


test8 => sub { for ( 0..8 ) {
next if $_ == 8;
my $j = $_;
} },
range0 => sub { for ( 0..-1,1..8) {
my $j = $_;
} },



As the names and the code suggest, I actually had 9 test and 9 range instances, for each value in 0..8. While there was a little variation within the sets, the difference between sets was significant! Using a split range was 60% faster than testing within the loop. The test achieved 376923 .. 390809 iterations per second, compared to 596918 .. 606710 for the split range, when tested for 10 seconds per version. Testing just test4 vs range4 for a minute gave similar results:


Rate test4 range4
test4 386996/s -- -35%
range4 593520/s 53% --



How about if the loops are larger? If we go from 0 to 80 or 0 to 800 instead of just 0 to 8, the test should get worse, since it's repeating the test over and over, while the split range has no extra work to do.


0-8,000
Rate test4 range4
test4 540/s -- -39%
range4 886/s 64% --

0-80,000
Rate test4 range4
test4 55.4/s -- -38%
range4 88.7/s 60% --



And in fact the relative speed improves linearly as the loops size grows by powers of ten from an upper limit of 8, to 8,000, after which the improvement decreases ... guess that's log(N). But by 80,000 it's only doing 50 or 100 reps a second, accuracy may be fading.

So the question is, in isolated testing, the range is definitely better than the test within the loop. Why does the test perform better in the otherwise identical program? After all, all that was happening within the loop was hash and array de-referencing, and an integer comparison.


return # collision
if $val == $self->{grid}[$row][$c];



Using -d:FProf, I see that BruteForceTest.pl took 0.030 seconds to run a total of 5450 row, column and block tests, while BruteForceExplicitList.pl took 0.026 seconds for 5450 calls to the unified test. using the timings from the first test, above, the explicit list loop should have been 9 ms while the repeated test should be 14 ms. Presumable the 16ms is the hash and array de-referencing and the comparison.

But the test wrapper, cell_value_ok used to be a trivial call to the row, column and block tests, but now it constructs a few arrays. That must be why it has gone from 0.006 ms to 0.015 ms.

Of course the real lesson is that profiling programs that take 1/10 second overall is a waste of time

Friday, October 22, 2010

Sudoku in Perl5 #2 - Optimizing the Perl5 implementation

The test in BruteForceTest irritated me .. iterating across the whole row, testing each cell to see if it was the cell we didn't want to test against. And same for each row of the column, and each cell of the block. All those repeated tests must add up to a lot of wasted time, right?

So I came up with BruteForceRange, which doesn't iterate over the whole row, only the portions to the left and right of the current cell. In the column, it iterates over the cells above and below ... the block test is more complicated, we'll get to it.

The program is mostly the same as BruteForceTest. The first difference is the name changes:

package BruteForceRange;

Instead of specifying 0..8 for the row or column for loop ...

for my $c ( 0 .. 8 ) {
next CELL # don't test against self.
if $c == $col;

the new version specifies a portion to the left, and a portion to the right. This is possible because ranges in Perl must increment, a left boundary greater than the right one is an empty range.

for my $c ( 0..$col-1,$col+1..8 ) {

We no longer need a test, let alone 8 tests that fail. Test_col() is similar, iterating over the rows above and below, but not the actual one we're testing against. test_block() is more complicated. Instead of the former nested block:

for my $r ( $baserow .. $baserow + 2 ) {
CELL:
for my $c ( $basecol .. $basecol + 2 ) {
next CELL # don't test against self.
if $r == $row and $c == $col;

We still have to iterate over all the rows. If we're not in the test row, scan all the columns, but in the test row, scan only the rows above and below the test column. The ternary operator : test ? value-if-test-was-true : value-if-test-was-false is the compact solution. Using a two-part range within the ternary operator generates an error, unless you add parentheses to delimit the list ... or maybe it's to clarify the bounds of the if-true portion of the operator.

for my $r ( $baserow .. $baserow + 2 ) {
CELL:
for my $c ( $r == $row
? ($basecol..$col-1,$col+1..$basecol + 2)
: $basecol .. $basecol + 2 ) {

If the ternary operator combines with the split range freaks you out, an alternative would be to calculate the range between the loops and store the lements in an array:

for my $r ( $baserow .. $baserow + 2 ) {
CELL:
my @columns = ( $r == $row
? ($basecol..$col-1,$col+1..$basecol + 2)
: $basecol .. $basecol + 2
);
for my $c ( @columns ) {


I won't show you the driver program I used while testing and verifying the module, it's just like the other one only with slightly different names. More interesting is a program to compare the results:

use BruteForceRange;
use BruteForceTest;
use Benchmark qw/:all/;

my $ARGS = { input =>
'010056207'
. '000700005'
. '000300498'
. '000200380'
. '006000700'
. '051007000'
. '684002000'
. '100003000'
. '305460020'
};

cmpthese( -60, {
BruteForceTest => sub { BruteForceTest->new( $ARGS )->solve; },
BruteForceRange => sub { BruteForceRange->new( $ARGS )->solve; },
}
);

new() returns an object for the specified puzzle, which is then told to solve itself. Since we're not going to look at the results, but only solve the problem over and over to see how fast it is, that's the simplest arrangement, and keeps the cmpthese call short. If the first argument to cmpthese is a positive number, it's a count of how many runs to perform; a negative number is the number of seconds to spend running the specified block ... in this case, each version is tested over and over for 15 seconds.

The results? It's a tie! In fact, the superfluous tests are a fraction faster that than the split range, 27.7 reps/second compared to 27.4 reps/second. That's on a 2.66 GHz Mac Pro tower with one Nehalem ( i7 ) processor and 12 GB of 1066 MHz DD3 memory sitting around doing nothing.

bash-3.2$ perl5.10 compare.pl
Rate BruteForceRange BruteForceTest
BruteForceRange 27.4/s -- -1%
BruteForceTest 27.7/s 1% --
bash-3.2$

Thursday, October 21, 2010

Sudoku in Perl5 #1 - a brute-force solution

Having written the Perl6 Sudoku solver, I decided to go back to the Perl5 version I wrote when I was experimenting with Moose, but I couldn't find where I put it. While waiting for my senile memory to function, I implemented a brute-force solution. The basic concept of the brute-force approach is that you begin with a grid, some cells of which have been defined by the puzzle designer. I begin with the first empty cell ( call it A) and set it to '1'. I check to see if that conflicts with any already specified cells in the same row, in the same column, in the same block. If there's a conflict, I try the next element in the list of possible values : 2, 3..9. If a value does not conflict with existing cells, then on to the next cell. If all values create a conflict, then an earlier cell must have the wrong value, so back up. By recursively calling a routine over and over, backing up is easy, just return. If we run out of cells to the right, advance to the next row; when we run out of rows, the puzzle is solved. At that point, return a true value, and all the recursion unrolls in victory.

I call this version BruteForceTest, because, when testing the row, column and block, you don't want to compare a cell against itself. Thee simple way to avoid that is to iterate over the whole list, and test whether we are comparing the cell against itself.


BruteForceTest.pm





package BruteForceTest;

use warnings;
use strict;

sub new {
my ($class) = shift;
my $self = bless {}, $class;

$self->_init(@_);
return $self;
}

sub _init {
my $self = shift;
my ($args) = @_;

die( __PACKAGE__
. "constructors requires arg 'input'.\n" )
unless exists $args->{input}
and defined $args->{input};
my (@digits) = ( $args->{input} =~ m{(\d)}xmsg );
die( __PACKAGE__
. q{constructor arg 'input' must be 81 }
. ' digits, 1..9, with 0 for empty cells. '
. 'Line breaks and spaces are allowed.'
. "\n"
) unless 81 == scalar @digits;
for my $row ( 0 .. 8 ) {
for my $col ( 0 .. 8 ) {
my $digit = shift @digits;
$self->{set}{$row}{$col}++
if $digit >= 1 and $digit <= 9;
$self->{grid}[$row][$col] = $digit;
}
}
}


I've never used a copy constructor in 12 years of Perl programming, so I don't bother with that ref $class or $class stuff, but I do move all the initialization into an _init method, so subclasses can override. In the last couple of years I've begun to shift the object off @_; that way, I can x the argument list in the debugger, without having a flood of junk from $self.

My constructor expects a single argument, a hash ref, with the key input and a single string as the value. Then I extract all the digits from the string into an array. I considered a couple of options, considered allowing multi-line input, considered accepting a dot , '.', or space for a zero, but then decided sticking to digits meant I could grep the digits out of a longer string. I also considered a routine to read from a file, but my priority is to compare the different versions from a single driver program, so I didn't really need that. Any file reader would go in main() rather than in the individual solver modules.

Having verified the right number of digits, I scan through the set, assigning them one by one into an array of arrays. If the value is a zero, it's still waiting to be solved. Digits one through nine are values set by the designer, pre-conditions i must not alter during the solving process, so I also set use the row and column indices to set a hash, indicating protected cells.

sub solve {
my $self = shift;
my ( $r, $c ) = @_;

$r ||= 0;
$c ||= 0;

return 1 if $r >= 9;
return $self->solve( successor( $r, $c ) )
if $self->{set}{$r}{$c}; # skip pre-spec sell
VALUE:
for my $value ( 1 .. 9 ) {
$self->{grid}[$r][$c] = $value;
next VALUE
unless $self->cell_value_ok( $r, $c );
return 1
if $self->solve( successor( $r, $c ) )
}
$self->{grid}[$r][$c] = 0;
return;
}

I'm going to call solve() recursively, to process each next cell. The first call, from main(), shouldn't be encumbured by arguments, but the default values are trivial to assign. At one time I would have tested for the end of the puzzle when incrementing the indices ... that's the way I was taught, but that seems so C, so Pascal, now. It's simple to check the row index at the top of solve(), and more in the style of functional languages. If the index is too large, we've reached the end of the puzzle, and can return the victory signal.

If the current cell is one of the pre-specified cells, all we need do is solve the rest of the puzzle, and return the success or failure of that effort. Otherwise, try the values one through nine, in the current cell. If cell_value_ok() detects a conflict, try the next value in the list, but if it's alright for now, see if the rest of the puzzle is solvable with the current conditions. If we get a failure back, try the next value. If the last value fails, we must have gone wrong earlier, so set the current cell back to zero, and back up a space. Returning true from solve() indicates success, false indicates failure.

sub cell_value_ok {
my $self = shift;
my ( $r, $c ) = @_;

return unless $self->test_row( $r, $c );
return unless $self->test_col( $r, $c );
return unless $self->test_block( $r, $c );
return 1;
}
sub test_row {
my $self = shift;
my ( $row, $col ) = @_;

my $val = $self->{grid}[$row][$col];

CELL:
for my $c ( 0 .. 8 ) {
next CELL # don't test against self.
if $c == $col;
return # collision
if $val == $self->{grid}[$row][$c];
}
return 1; # ok
}

cell_value_ok returns success if there is no conflict between the value in cell $r, $c and the other values in the row, in the column, and in the block. test_row() demonstrates the concept: For each cell in the row, skipping over column $c, return a failure if the values match. If no conflict are present, return success. test_col() looks the same, except for swapping row and column. I considered merging the two routines into one, with some indication of which way to process, but merging test_block as well would be difficult. A little redundancy only makes the program slightly larger, but this module is so tiny, it can't take more than a single block to read it in. So there's no advantage besides clean coding, to merging, and there may be some performance to be gained.

sub test_col {
my $self = shift;
my ( $row, $col ) = @_;

my $val = $self->{grid}[$row][$col];

CELL:
for my $r ( 0 .. 8 ) {
next CELL # don't test against self.
if $r == $row;
return # collision
if $val == $self->{grid}[$r][$col];
}
return 1; # ok
}

sub test_block {
my $self = shift;
my ( $row, $col ) = @_;

my $baserow = 3 * int( $row / 3 ); # 0, 3 or 6
my $basecol = 3 * int( $col / 3 );

my $val = $self->{grid}[$row][$col];

for my $r ( $baserow .. $baserow + 2 ) {
CELL:
for my $c ( $basecol .. $basecol + 2 ) {
next CELL # don't test against self.
if $r == $row and $c == $col;
return # collision
if $val == $self->{grid}[$r][$c];
}
}
return 1; # ok
}

test_block() is a bit more complicated. Each cell is in one of 9 sub-blocks, the first row and first column of the sub-block can only be zero, three or six. So divide the cell's row or column number by 3 and keep only the integer part, multiply by three again, and you have the first row/column value. Just consider that row and the next two, that column and the next two, and you've coverd the sub-block.

sub successor {
my ( $r, $c ) = @_;

if ( $c == 8 ) {
$r++;
$c = 0;
}
else {
$c++;
}
return ( $r, $c );
}

Figuring out the next cell is fairly simple. If we've reached column 8, it's time to advance to the first column of the next row, otherwise just go to the next column.

sub print_answer {
my $self = shift;

print "\n\n";
for my $r ( 0 .. 8 ) {
for my $c ( 0 .. 8 ) {
print $self->{grid}[$r][$c];
print q{ }
if $c % 3 == 2;
}
print "\n";
print "\n"
if $r % 3 == 2;
}
}
1;

Finally, a routine to display the grid as it is at the moment. Blank spaces and empty rows division the matrix into sub-blocks, cleaner than using lines. Admittedly, lines, and dots or zeroes or underscores, rather than empty spaces, for unsolved cells, would be helpful if debugging a partially complete solution, so you can determine the location of a value more easily. But for simple display purposes, less is more.

BruteForceTest.pl




use BruteForceTest;

use warnings;
use strict;

my $puzzle = BruteForceTest->new(
{
input => (
'010056207'
. '000700005'
. '000300498'
. '000200380'
. '006000700'
. '051007000'
. '684002000'
. '100003000'
. '305460020'
),
}
);
$ouzzle->print_answer;
$puzzle->solve;
$puzzle->print_answer;

The main-line is to demonstrate that the program works ...
Create a puzzle with a particular value. Print out the initial conditions, solve the puzzle, and print the result. The display looks like this, running it with time to estimate performance:

010 056 207
000 700 005
000 300 498

000 200 380
006 000 700
051 007 000

684 002 000
100 003 000
305 460 020



418 956 237
923 784 615
567 321 498

749 215 386
236 849 751
851 637 942

684 192 573
192 573 864
375 468 129

real 0m0.139s
user 0m0.041s
sys 0m0.005s

So overall it takes 1/7 seconds of clock time, of which 1/25 seconds is real and only 5/1000 seconds is spent running the CPU. This is my desktop, not the laptop which spent 42 seconds running the Perl6 version, so direct comparisons are pointless, but obviously there are some efficiencies still to be implemented in Perl6.

Thursday, September 30, 2010

Exploring Perl6

I've been exploring Perl6 since Rakudo-star was released in late July. There's a lot of exciting things to learn about Perl6, and you have to start early to figure things out.

I've been working on implementing a simple sudoku solver, as something to do.

My conclusion is that I'm delighted with what's been done, and what's coming ... and frustrated waiting for what's coming to turn into usable features!


Sudoku




In case you've been on Mars the last few years, sudoku is a puzzle with nine rows and nine columns. That block is also divided into 3x3 regions ... like a tic-tac-toe board subdivided into tic-tac-toe boards. When solved, each row contains the digits 1 to 9, without repetition; so does each column, and each block. Each puzzle begins with a certain number of cells filled in, your job is to figure out the rest.


You know that cell A1 can't be a 1, because there's a one already in that row, it can't be a 2 or 4 or 6 or 7 or 8. From the other cells in the column, you know it can't be a 3, either. From the other cells in the same block, you know 5 is out of the question. By a process of elimination, it must be a 9. That means none of the empty cells in that row, that column, that block can be a 9, either. That might be enough to resolve some other cells.

Sometimes that sort of elimination is not enough. You wind up with a pair of cells that might be either of a couple of values, so you need to use other information to determine what value they actually are. For example, you know there has to be a 1 somewhere in block abc/789. I can't be in column A, because of A4; it can't be in column B because of B1. B7 & C9 are occupied, so it must be in C8

My program isn't that smart, it only solves trivially easy sudoku using elimination. The goal is not really to solve sudoku, though it does that faster than I do, the goal is to explore Perl6.


sudoku.pl



My main program looks like this:
#!./perl6
use Grid;

sub test ( Str $file ) {
my $grid = Grid.new;
$grid.load( $file );
$grid.display;
$grid.show-queue;
while $grid.has-stuff-to-process and not $grid.solved {
$grid.process-next;
}
$grid.display;
$grid.show-queue
}
test( $_ ) for @*ARGS;

Starting at the top, I don't have Perl6 actually installed, it's just in a neighbouring directory, so I provide a local link to the executable, to simplify paths ... actually, because I wasn't using a hashbang line during testing, just the link.

In theory you're supposed to have a use v6 to activate Perl6 mode, but so far that doesn't seem necessary. The idea is the same binary will handle both Perl 5 and 6, and certain keywords will differentiate, but that's still in the future.

I include the Grid module, which will be describe later. For now, it's magic that does stuff.

A subroutine is still a subroutine, though there are also methods and multi-methods and other stuff. Perl is now one of the Real Languages™ ... you declare subroutine arguments between the subroutine name and the opening curly. You don't HAVE to, you have a number of options, but I usually prefer to.

Skipping over the details of the routine, for the moment, I invoke the test routine once for each command line argument. The arguments will specify files containing Sudoku game definitions, so you can run one, or several as a batch.

The Sudoku game needs a playing surface, the Grid, and you just know it's going to consist of Cells. You have options about declaring variables, you can give a little information, or a lot. In the finest Perl tradition, you can have a scalar with holds a Dog at one moment, an integer a little later, then a string ... but the language needs to keep track of what is possible. the more wide open your options, the greater the run-time load. Specific requirements helps prevent errors and makes things faster, because the system knows your variable will never handle a different type.

In this case, I leave it flexible, but immediately assign it a grid object. Like most languages, methods are invoked by joining the object name and method name by a dot. Sometimes there's an advantage to being conventional. Loading the initial game board consist of resolving the cells which are specified at the start. Resolving a cell sets its value, but also adds that cells row, column and value to a queue. As the grid is loaded, the queue is primed. Once loading is complete, I display the initial board condition, and the contents of the queue.

Note that variable names can now contain a hyphen or apostrophe so long as the next character is alphabetic, So we no longer need to be jealous of LISP programmers, and our tests can contain both can() and can't() tests.

Then it's time to search for a solution. So long as there are elements queued for processing, and the game has not been solved, process the next queued element. Applying values may resolve other cells in the row, column or block, which will lead to additional queue entries.

If the puzzle does get solved or the queue runs out, I display the final board, and show what is still left in the queue.

Update: Thank you Jonathan Worthington .. just discovered if you have a subroutine MAIN, Perl6 will automatically convert between command line arguments and subroutine args, and will even display a brief USAGE message if there's an error. Goodbye, Getopts::Long, and thanks for all the fishing.

Grid.pm


use Cell;

class Grid {
has @!cells[9][9] of Cell;
has @!queue;
has $!remaining is ro;

The playing board is a Grid of Cells. Details coming later, but basically, each cell keeps track of a set of possible values. Initially, it can be anything from 1 to 9, gradually those possibilities get whittles away until it is resolved to an exact value.

A class now gets a keyword of its own. package and class must have a bracketed scope, no longer "from here till the next package.

Attributes are declares with has and use a twigil. The first character defines the type of variable, the second is a dot for public attributes, exclamation mark for private ones. Perl6 is supposed to support multi-dimensional arrays, using [$i;$j;$k] indexing, but it isn't there, yet, so I just use an array of arrays.

When a cell is resolved, that is, when you figure out the exact value for a cell, you want to remove that value from the set of possibilities
for the other cells in that row, in that column, and in that sub-block. Resolving a single cell may resolve several neighbours, so I have a queue of resolutions waiting to be processed.

Initially, there are 81 cells waiting to be resolved. Loading the initial state resolves some of them and primes the queue. When there are no more cells waiting to the resolved, the puzzle is done.

  submethod BUILD {
for 0..8 -> $r {
for 0..8 -> $c {
@!cells[$r][$c] = Cell.new(9);
}
}
$!remaining = 81;
}

Besides subroutines, there are now methods, multi-methods (methods overloaded by signature), submethods, and some others. I don't understand what a submethod is, yet, but the constructor is one, and has the name BUILD. Each element of the Grid gets a new Cell, which may have any value in the range 1..9.

The grid has rows 1->9 and columns 1->9. Eventually Perl6 will allow you to refer to the array elements using the names you want, by putting the index in curly braces: @array{1}, but for the moment, you're forced to use traditional FORTRAN-like zero-based indices: @array[0]. I should use symbolic names or even helper routines to map from grid row and column names to array indices, but I haven't bothered.
 method load( Str $file ) {
my $init = open $file, :r
err die( "Could not open file '$file' $*ERR" );
for $init.lines -> $line {
my ( $r, $c, $v ) = $line.split(' ');
@!cells[$r-1][$c-1].assign( $v.Int );
@!queue.push( [$r, $c, $v] );
}
}

The first step in solving a puzzle is to load the initial conditions onto the board. Note that open now returns the file handle, like subroutines should. As well, the conditions associated with opening the file, in that case that we are reading the file, become adverbs, modifying the routine. Special variables are all upper-case, and frequently have a star or question mark as the second character of the twigil. There is far less punctuation or one-character variables. Unfortunately, many don't work yet.

The initial conditions file consist of rows, each row containing the row number, column number and value for a cell, separated by spaces.

I assign the value to the cell, and add it to the queue for processing related cells.
 method display {
for 0..8 -> $r {
for 0..8 -> $c {
print @!cells[$r][$c].is || '.';
}
print "\n";
}
print "\n";
}

To display the grid, print the value of each resolved cell, or a dot if it isn't resolve yet. Note the Perl6 for loop syntax, no parentheses, specifying the loop variable name in the pointy block. While Perl6 has become verbose where it has benefits, the philosophy of economy remains ... the act of declaring the loop variable is enough, no my or other superfluous keywords.
 method show-queue {
my $rstr = 'r ';
my $cstr = 'c ';
my $vstr = 'v ';
for @!queue.values -> @elem {
for @elem -> $r, $c, $v {
$rstr ~= $r;
$cstr ~= $c;
$vstr ~= $v;
}
}
say $rstr; say $cstr; say $vstr; say '';
}
For debugging, I found it useful to display the queue. Rather than display it vertically, as a stack, I rotate it so it only consumes 3 rows of display. I iterate over the queue elements, appending the row, column, and value components tot he appropriate string buffer, and then print out the buffers, with a blank line separator.

method has-stuff-to-process {
return 0 < @!queue.elems();
}
method process-next {
self.can't-be( @!queue.shift );
}
method solved {
return $!remaining == 0;
}
It's generally a bad idea for the user to access internal components. That would prevent you from altering the implementation. Instead you want to provide methods which make sense in the problem space. In this case, the queue is a problem space component, but we still want to control how the user uses it. So we prevent the user from directly manipulating the queue, just permitting him to see whether it is empty or not. Similarly, you don't want the top level to take elements from the queue and remove them from the grid, though that is what my solution did, for a long time. Better to externalize the processing of an element, but conceal the details. Similarly, the user can detect whether the puzzle has been solved, but not how that information is determined.
 method clear-and-queue ( @r, @c, @skip, $v ) {
for @r -> $r {
for @c -> $c {
next if $r == @skip[0] && $c == @skip[1];
@!cells[$r-1][$c -1].clear($v.Int);
if @!cells[$r-1][$c-1].changed {
@!queue.push( [ $r, $c, @!cells[$r-1][$c-1].is ] );
$!remaining--;
}
}
}
}
method cant-be ( @set ) {
for @set -> $r, $c, $v {
self.clear-and-queue( [1..9], [$c], [$r, $c], $v);
self.clear-and-queue( [$r], [1..9], [$r, $c], $v);
my $rt = 1 + 3 * floor(( $r - 1 ) / 3);
my $ct = 1 + 3 * floor(( $c - 1 ) / 3);
self.clear-and-queue( [$rt..$rt+2], [$ct..$ct+2], [$r, $c], $v);
}
}
}
When we fetch an element from the queue, we know that value cannot appear anywhere in the specified row, or the specified column, or the sub-block containing that row and column. clear-and-queue() is the routine which processes a range or cells. it takes an array of row numbers, and array of column numbers and a value to drop from those cells. But we want to skip over the actual cell which generated the queue element, so the actual row and column numbers are present as a pair, @skip. We iterate over the row values and the column values, and except at the generating cell location, drop the specified value from the list of possibilities. If the cell becomes resolved to a single value, indicated by setting the changed value, we add that cell to the queue, and decrement the count of cells to process.

Processing a column consists of passing the values 1..9 for the row array, and the actual column number for the column array. Processing a row is the revere; the row array has a single element, the column ranges over 1..9. Processing a block is a bit tricky, involving integer arithmetic to determine the row and column of the top left corner, and processing that row / column and the next two.


Cell.pm



class Cell {

has %!values is Keyset;
has Int $!resolves-to = 0;
has Bool $!changed = False;
Since each sudoku row, column and sub-block contain each of the digits 1..9 without repetition, any cell in a new, uninitialized sudoku grid can contain any of those values. A KeyHash uses the keys of a hash to represent a set of values; aKeySet restricts them to having boolean values. If a value becomes false, the associated key disappears from the KeySet. This is perfect for representing the possible values of a sudoku cell. Unfortunately, it doesn't work, yet. The good news is we get a chance to use list reductions.

Since KeySets don't work properly, the private %!values attribute will have the possible cell values as keys, with a hash value of 1 if the digit is still possible, and a zero if it has been eliminated

If the Cell has been resolved to a single digit, that value will be stored in the $!resolves to attribute, to eliminate querying the hash. At the time the cell is resolved, $!changed will be set; but automatically cleared when the outer program accesses the value. As a result, it will only be true once.
submethod BUILD {
%!values{$_} = True for 1..9;
}
Initialze the possible values.

method has-value( Int $n ) {
return %!values{$n};
}
Determine whether a value is among the possibilities for the cell.

method assign( Int $v ) {
%!values{$_} = False for 1..9;
%!values{$v} = True;
self.test-whether-resolved;
}

Loading an initial state for the sudoku board involves specifying a single value for a cell, so we need an assign() method. Eliminate all possible values, and then assign the single one we want. Of course the test-whether-resolved will find it resolved.
method set( Int $n ) {
%!values{$n}++;
}
The set() method is useful for testing, to add values to the set of possibilities. In sudoku solving, the list of possibilities only ever decreases.
method test-whether-resolved {
return unless ( 1 == [+] %!values.values );
$!resolves-to = first { %!values{$_} }, 1..9;
$!changed++;
}

The values for the hash will contain a one for tags that are set, zero for those that have been dropped. So %!values.values will be a list of zeros and ones. Performing a list reduction iterates over the list applying the specified operator, like a map-reduce. In this case, it returns the number of elements which are still set. If only one element is set, the cell has been solved, otherwise we exit the method.

first() is a list method which returns the first true value, like a grep which terminates after a success.

The $!changed attribute is set, to indicate the resolution is new. Since the variable is not available outside the Cell, only a changed() method, the value can only become true, once, in test-whether-resolved(), and is cleared the rest of the time.
method clear( Int $n ) {
return unless %!values{$n};
%!values{$n}--;
self.test-whether-resolved;
}
When we process the cells of the grid, any value should only be dropped once. To be sure, we only clear the value if it is set, then we check whether that change resolved the cell.
  method key-count {
return [+] %!values.values;
}
method changed {
return unless $!changed;
$!changed = False;
return True;
}
method is {
return $!resolves-to;
}
}
key-count is another testing routine, it reports the number of remaining values. The changed() method is the external interface to determine whether the cell has become solved. If the cell has been solved, the $!changed attribute is cleared, and TRUE is returned to the user. The flag is set, once, in test-whether resolved and is cleared by reading.

Disclaimer: While writing up this report, I made some changes. Some sloppiness that I tolerated while fooling around with the code became too embarrassing to stand by in a write-up. I changed the code to test it, and then edited the code in the write-up by hand. So The code does run,and gives the correct results, but there is some possibility of errors in the code I have in the file.

Wednesday, September 1, 2010

It didn't happen if you don't have photos

Visited Niagara Falls last weekend. Everyone was busy proving they had been there.










Sunday, July 18, 2010

Tall Ships

There was a gathering of tall ships in Toronto Canada / Independence Day weekend, and while I found some interesting shots of rigging and hardware, I especially liked this rope hanging off the stern of the German ship, so swimmers could climb back aboard. The ship was steady, being tied to the dock, hardly any motion at all, so the movement of the water expressed itself relative to the monkey's fist knot at the bottom of the gently swaying rope.






This one is scaled down; click on the picture to go to my photo website gallery of animations and see it full size.

Tuesday, July 6, 2010

Toronto's Gay Pride parade, last Sunday, July 4th.


How do they arrange a HOT day, year after year? Do they have special arrangements with the man upstairs? or is it the hot guy downstairs? Either way, it was a day for super-soakers, especially the humunguous one on the fire enging bringing up the rear.