Tag Archives: algorithm

JenPop and Genetic Algorithms on the loose

Hey there! I’ve committed a new class on JenPop, and it’s called “Individual”. It was made to represent the rules of typical genetic algorithms:

I’ve read a bit more on the subject and figured that genetic algorithms, although being able to achieve good solutions with little computing, are only suitable for problems where brute-forcing isn’t feasible. For instance, they do a nice job on combinatorial problems such as the classic knapsack problem, where you have a list of items with value v and weight w, and you must figure what’s the maximum value you can carry in a knapsack of capacity wc by combining different amounts of those items.

And that problem was chosen to serve as the first example. At this kind of problem, a naive approach can easily become unfeasible as the input list grows. With an evolutionary approach there’s a bigger probability that it will result in a better solution, given the same time limits. And even if the solution isn’t better, it will be a decent solution. The more time you let it roll, bigger is the chance of getting a great result.

For the classic scenario with a knapsack with wc = 15, and a (v,w) item list of (4, 12), (2, 2), (2, 1), (1, 1), (10, 4), I was able to achieve the following results:

30 iterations: Best result was { 0, 0, 4, 0, 2 } (v 28, w 12)
100 iterations: Best result was { 0, 3, 0, 9, 0 } (v 15, w 15)
500 iterations: Best result was { 0, 7, 1, 0, 0 } (v 16, w 15)
1000 iterations: Best result was { 0, 3, 4, 4, 0 } (v 18, w 14)
3000 iterations: Best result was { 0, 1, 1, 5, 1 } (v 19, w 12)
5000 iterations: Best result was { 0, 0, 5, 2, 2 } (v 32, w 15)

You can see that while none of those runs delivered the optimal solution, which would be { 0, 0, 3, 0, 3 } (v 36, w 15), they got close enough to provide good solutions. Also, it becomes clear that randomism takes a nice part in this, giving us the chance of achieving a great solution with only 30 iterations. These results also show that the more you let it run, bigger is the chance of getting better results.

Anyway, see you! 😀

Code Jam 2008 Round 1A, Problem C: Numbers

Hello again! This one is one hell of a tricky problem. No wonder it’s called “Numbers”:

In this problem, you have to find the last three digits before the decimal point for the number (3 + √5)n.

For example, when n = 5, (3 + √5)5 = 3935.73982… The answer is 935.

For n = 2, (3 + √5)2 = 27.4164079… The answer is 027.

Wow, that can’t be hard, eh? Indeed! It’s really just an expression, isn’t it? Of course it is! But once you try, you get to see something beautiful happening:

$ python
>>> import math
>>> 3+math.sqrt(5)
>>> (3+math.sqrt(5)) ** 18
>>> (3+math.sqrt(5)) ** 19
>>> (3+math.sqrt(5)) ** 20

If I was clear enough, you can see where I’m going. If not, the answers for n = 18, 19 and 20 should be 607, 263 and 151. But, it gets worse as the exponent grows:

>>> (3+math.sqrt(5)) ** 30
>>> (3+math.sqrt(5)) ** 40
>>> (3+math.sqrt(5)) ** 50

And those results should be 647, 551 and 247, respectively. As you can now clearly see, python can’t naturally handle those big floats, what a bummer! Well, the first thing to handle is the √5 value, since 2.23606797749979 does not give enough sampling to work with the needed precision. Looking a bit, I’ve found that python has a nice Decimal module that can deal with big numbers. So, let’s use it:

>>> from decimal import *
>>> Decimal(5).sqrt()

Hmm… not as much as I was expecting. Digging a little deeper, there’s a way to increase the precision, so things might get better:

>>> from decimal import *
>>> getcontext().prec = 256
>>> Decimal(5).sqrt()

Much better indeed! Let’s test the previous cases that way!

>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(18)
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(19)
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(20)
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(30)
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(40)
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(50)

Whoa, now that’s improvement! Since the small input ranges from 2 to 30, we’re covered, yay! To the large input! The first case is 910062006… much bigger! But we have faith in our solution!

>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(910062006)

Aaaaaand it fails miserably. You say increase the precision? Go!

>>> getcontext().prec = 1024
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(910062006)
>>> getcontext().prec = 1000000000
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(910062006)

It is still computing that last command… maybe it’ll finish before Christmas! 😀 So… nothing we can do about it. Not in a natural way at least. It turns out that Code Jam organizers knew about this (that’s the whole point, indeed) and my approach, even if it’s able to solve the small input, is not the correct one. From the contest analysis:

The key in solving the problem is a mathematical concept called conjugation. In our problem, we simply note that (3 – √5) is a nice conjugate for (3 + √5). Let us define

(1)     α := 3 + √5,   β := 3 – √5,   and Xn := αn + βn.

We first note that Xn is an integer. This can be proved by using the binomial expansion. If you write everything down you’ll notice that the irrational terms of the sums cancel each other out.


Another observation is that βn < 1, so Xn is actually the first integer greater than αn. Thus we may just focus on computing the last three digits of X.

Whoa, there’s a ton of math symbols there. What it is trying to explain is that there’s another way around, and we can write a simple algorithm to calculate only what we need. Taking that math apart, this suggested approach is quite interesting:

Solution C. [the periodicity of 3 digits]

For this problem, we have another approach based on the recurrence (7). Notice that we only need to focus on the last 3 digits of Xn, which only depends on the last 3 digits of the previous two terms. The numbers eventually become periodic as soon as we have (Xi, Xi+1) and (Xj, Xj+1) with the same last 3 digits, where i < j. It is clear that we will enter a cycle no later than 106 steps. In fact, for this problem, you can write some code and find out that the cycle has the size 100 and starts at the 3rd element in the sequence. So to solve the problem we can just brute force the results for the first 103 numbers and if n is bigger than 103 return the result computed for the number (n – 3) % 100 + 3.

Hmm… so that means that 103 restarts the cycle?

>>> getcontext().prec = 1024
>>> str((Decimal(3) + Decimal(5).sqrt()) ** Decimal(3))
>>> str((Decimal(3) + Decimal(5).sqrt()) ** Decimal(103))
>>> str((Decimal(3) + Decimal(5).sqrt()) ** Decimal(4))
>>> str((Decimal(3) + Decimal(5).sqrt()) ** Decimal(104))

Yay! To the git-pushed solution!

from decimal import *                            # Improved math

getcontext().prec = 1024                         # Adjust the precision
base = (Decimal(3) + Decimal(5).sqrt())          # Calculate the base

for case in range(1,int(raw_input())+1):         # For all test cases
  exp = (int(raw_input()) - 3) % 100 + 3         # Get the recurrent exponent

  num = str(base ** Decimal(exp))                # Calculate
  ans = num[:num.find('.')]                      # Get only the integer part

  print "Case #%d: %03d" % (case, int(ans[-3:])) # Zero-lead the last 3 digits

One thing that’s really interesting about this is that I actually solved this (of course, not from scratch) while posting! I was already OK with the idea of only solving the small input, but reading the analysis brought some new light into this problem. Blogging FTW! See ya!

Floating point precision, bad numbers

Hello there! While still trying to find a good way to solve the 2008 Round-1A-C problem, Numbers, I came across something that I still have to adventure a bit on: Floating Point number precision. I mean, is there a way (even at expense of time, CPU and memory usage) to fully reproduce what calculations “by hand” can offer? It might sound silly, but beside Code Jam challenges, as a programmer it has never bothered me. But since we need precise results to solve those kind of problems, I’ve found myself in need of tools, because the ones I got are clearly not OK (or I don’t know how to use them correctly).

So, I’ll try to spend some time this week to try to make an expensive but precise way to multiply floating point numbers, so that I can solve the problems without the need to use some external calculator (I’ve done before, it’s at my Code Jam github repository). Wish me luck! 🙂

Code Jam 2008 Round 1A, Problem A: Minimum Scalar Product

Hey there, since things are getting a little more calm, I can focus once more to solve some nice Code Jam problems. And here we go, from the first official tournament round: Problem A: Minimum Scalar Product. It says:

You are given two vectors v1=(x1,x2,…,xn) and v2=(y1,y2,…,yn). The scalar product of these vectors is a single number, calculated as x1y1+x2y2+…+xnyn.

Suppose you are allowed to permute the coordinates of each vector as you wish. Choose two permutations such that the scalar product of your two new vectors is the smallest possible, and output that minimum scalar product.

There’s a catch on the text, that made me quite confused. In case you didn’t notice, it’s that damn TWO PERMUTATIONS part. Despite getting the sample results right, when the bigger test cases came, I wasn’t able to get the smallest results right! Because, guess why, I was only doing two value permutations, because to me, the text says so!

Anyway, took a look into the results and saw that participants were simply not considering this limitation and outputting the minimum possible value anyway and to hell with it. If Google thinks this would be the correct approach, who am I to judge! 😀 Just sort both arrays, invert one of them, iterate calculating the MSP and you’re done! Python code, as usual:

for case in range(1,int(raw_input())+1): # For all test cases
  size = int(raw_input())                # Since we're informed, save array size
  x = map(int, raw_input().split(" "))   # Get all X parameters
  y = map(int, raw_input().split(" "))   # Get all Y parameters

  x.sort()                               # Sort both vectors
  y.sort()                               # And sort-reverse Y
  y.reverse()                            # So we can match the bigger/smaller values

  msp = 0
  for i in range(size):                  # For all the array's len
    msp += x[i]*y[i]                     # Sum "forward" on X and "backwards" on Y

  print "Case #%d: %d" % (case, msp)     # Print the sum

And that’s it! Quick and easy, despite the text hiccup.  Githubbin’ it. See you!

Difficult CodeJam problem and Free Software Documentary

Well, it’s been a tough weekend. Many new things were installed at home, lots of drilling and weightlifting (oh my arms…) but feels nicer after all. 🙂 More places for the kittens to hide.

That said, I’m evolving on Africa 2010 CodeJam D problem, but as soon as I get closer to solving, another challenge appears. This time, I’m having trouble finding an heuristic to determine the right choice to make, of all the results. Perhaps I’m still getting many results and not comparing correctly to the “solved group” list. I’ll for sure find out eventually.

Also, watched InProprietario, a brazilian documentary on free software (lots of Stallman) and, beside being very crude on effects, they used some nice movie references to cleverly show some software concepts. If you can understand portuguese and would like to know more about Free Software, I highly recommend it. 🙂

Code Jam 2010 Africa – Problem C

Whoa, that was a tough problem. Ironically it’s called Qualification Round:

You’ve just advanced from the Qualification Round of Google Code Jam Africa 2010, and you want to know how many of your fellow contestants advanced with you. To give yourself a challenge, you’ve decided only to look at how many people solved each problem.

The Qualification Round consisted of P problems; the ith problem was fully solved by Scontestants. Contestants had to solve C problems in order to advance to the next round. Your job is to figure out, using only that information, the maximum number of contestants who could have advanced.

Well, at first it seems pretty simple, but there is quite some math involved. The first case of the example input is easy to solve, but the second will most likely get you scratching your head thinking “where the #$%* that number came from” (at least I did). 🙂 Well the secret is that you should not look at the inputs as they are, but instead, find the maximum amount of solved problem combinations there are.

I studied a bit but still could not solve it, and surrendered by checking the contest analysis. It was quite complicated to understand, until they wrote about a clever solution from the winning candidate: it consisted in lowering the possible scores to max at the division of the scores by the amount of needed problems. Just do it (don’t ask why) until there are no scores over the result of that division. That same result will be the answer. Creepy! Python code time:

for case in range(1,int(raw_input())+1):          # For all test cases
    info = map(int, raw_input().split(" "))       # Get test case information
    problems = info[0]                            # Fetch number of problems
    needed = info[1]                              # Fetch number of needed problems to pass
    results = info[2:]                            # Fetch results for each problem

    qualified = 0                                 # Initialize our results

    if (problems == needed):                      # For this case, we can expect
        results.sort()                            # as many contestants as the
        qualified = results[len(results)-needed]  # least solved problem

    else:                                         # For this case, we use a solution
        out = False                               # similar to the winning one
        while not out:
            out = True
            qualified = sum(results)/needed       # Reduce the number of possible qualifiers
            for i in range(len(results)):         # And for all results
                if results[i] > qualified:        # Limit their values to the current qualifiers
                    results[i] = qualified
                    out = False                   # Until they all "match" the qualifiers number

    print "Case #%d: %s" % (case, str(qualified)) # Report results

Interestingly enough, this code is smaller than the last, and solves a more complicated problem (at least in my opinion). I don’t think I could have done a better job on this one. 🙂 That’s the beauty in learning!

CodeJam Quickie: 2010 Africa – Problem B

Howdy! This time, problems started getting more and more complex, and that’s good, because I like the challenge! The next problem is Get to Work:

You work for a company that has E employees working in town T. There are N towns in the area where the employees live. You want to ensure that everyone will be able to make it to work. Some of the employees are drivers and can drive P passengers. A capacity of P == 1 indicates that the driver can only transport themselves to work. You want to ensure that everyone will be able to make it to work and you would like to minimize the number of cars on the road.

You want to calculate the number of cars on the road, with these requirements:

  • Every employee can get to town T.
  • The only way an employee may travel between towns is in a car belonging to an employee.
  • Employees can only take rides from other employees that live in the same town.
  • The minimum number of cars is used.

Find whether it is possible for everyone to make it to work, and if it is, how many cars will end up driving to the office.

Whoa, that’s way more complex than the other problems, but still, not that hard! What we need to do is: check who are drivers and passengers for each town (since they can only help same town buddies), if the total capacity of cars from a town is inferior to the number of passengers, the problem cannot be solved and so, reported as “IMPOSSIBLE”. If we can carry everyone, start filling the cars from the “minivan” (can carry more passengers) to the “mini cooper”. 😀 That way we can make sure there are fewer cars on the road.

Some rough, but commented, and validated (!) python code follows:

for case in range(1,int(raw_input())+1):                  # For all test cases
    town_len, office = map(int, raw_input().split(" "))   # Get amount of towns and office

    carriers = [[] for x in xrange(town_len)]             # Create new lists for every town
    capacities = town_len * [0]                           # Initialize "capacity of transportation"
    payload = town_len * [0]                              # Initialize "cargo" for each town

    for i in range(int(raw_input())):                     # Get and go through employee list
        town, capacity = map(int, raw_input().split(" ")) # Get the current employee info
        if town == office:                                # If employee is from office town
            continue                                      #   No need to commute, ignore

        town -= 1                                         # Adjust "town" to zero-based index
        payload[town] += 1                                # Employee needs to commute, count

        if capacity != 0:                                 # If employee has a car
            capacities[town] += capacity                  # Add to our capacities for the town
            carriers[town].append(capacity)               # Append its car capacity

    resultStr = ""                                        # Create our result string

    for i in range(town_len):                             # For all towns
        if capacities[i] < payload[i]:                    # If we can't carry everyone
            resultStr = "IMPOSSIBLE "                     #   declare it impossible, give up

        cars = 0                                          # We can carry, start with 0 cars
        carriers[i].sort()                                # Sort and reverse the car list
        for car in carriers[i]:                           # So we can start from the bigger ones
            if (payload[i] <= 0):                         # If we're carrying everybody already
                break                                     #  we're done for this town

            payload[i] -= car                             # Else, fill a car, report as being used
            cars += 1

        resultStr += str(cars)+" "                        # Add this town filled vehicles

    print "Case #%d: %s" % (case, resultStr[:-1])         # Report results, remove last space

As the last problem, the same can be told about this code. There are many ways to improve its performance, but this way it can already solve the large inputs in an instant. Bitbangers apart, it solves the problem and everybody gets happy. Also, I think it is quite clear to understand, even more with the comments. 🙂 Bye!