Tag Archives: dynamic programming

JenPop and Genetic Algorithms on the loose

Hey there! I’ve committed a new class on JenPop, and it’s called “Individual”. It was made to represent the rules of typical genetic algorithms:

I’ve read a bit more on the subject and figured that genetic algorithms, although being able to achieve good solutions with little computing, are only suitable for problems where brute-forcing isn’t feasible. For instance, they do a nice job on combinatorial problems such as the classic knapsack problem, where you have a list of items with value v and weight w, and you must figure what’s the maximum value you can carry in a knapsack of capacity wc by combining different amounts of those items.

And that problem was chosen to serve as the first example. At this kind of problem, a naive approach can easily become unfeasible as the input list grows. With an evolutionary approach there’s a bigger probability that it will result in a better solution, given the same time limits. And even if the solution isn’t better, it will be a decent solution. The more time you let it roll, bigger is the chance of getting a great result.

For the classic scenario with a knapsack with wc = 15, and a (v,w) item list of (4, 12), (2, 2), (2, 1), (1, 1), (10, 4), I was able to achieve the following results:

30 iterations: Best result was { 0, 0, 4, 0, 2 } (v 28, w 12)
100 iterations: Best result was { 0, 3, 0, 9, 0 } (v 15, w 15)
500 iterations: Best result was { 0, 7, 1, 0, 0 } (v 16, w 15)
1000 iterations: Best result was { 0, 3, 4, 4, 0 } (v 18, w 14)
3000 iterations: Best result was { 0, 1, 1, 5, 1 } (v 19, w 12)
5000 iterations: Best result was { 0, 0, 5, 2, 2 } (v 32, w 15)

You can see that while none of those runs delivered the optimal solution, which would be { 0, 0, 3, 0, 3 } (v 36, w 15), they got close enough to provide good solutions. Also, it becomes clear that randomism takes a nice part in this, giving us the chance of achieving a great solution with only 30 iterations. These results also show that the more you let it run, bigger is the chance of getting better results.

Anyway, see you! 😀

Code Jam 2008 Round 1A, Problem C: Numbers

Hello again! This one is one hell of a tricky problem. No wonder it’s called “Numbers”:

In this problem, you have to find the last three digits before the decimal point for the number (3 + √5)n.

For example, when n = 5, (3 + √5)5 = 3935.73982… The answer is 935.

For n = 2, (3 + √5)2 = 27.4164079… The answer is 027.

Wow, that can’t be hard, eh? Indeed! It’s really just an expression, isn’t it? Of course it is! But once you try, you get to see something beautiful happening:

$ python
>>> import math
>>> 3+math.sqrt(5)
>>> (3+math.sqrt(5)) ** 18
>>> (3+math.sqrt(5)) ** 19
>>> (3+math.sqrt(5)) ** 20

If I was clear enough, you can see where I’m going. If not, the answers for n = 18, 19 and 20 should be 607, 263 and 151. But, it gets worse as the exponent grows:

>>> (3+math.sqrt(5)) ** 30
>>> (3+math.sqrt(5)) ** 40
>>> (3+math.sqrt(5)) ** 50

And those results should be 647, 551 and 247, respectively. As you can now clearly see, python can’t naturally handle those big floats, what a bummer! Well, the first thing to handle is the √5 value, since 2.23606797749979 does not give enough sampling to work with the needed precision. Looking a bit, I’ve found that python has a nice Decimal module that can deal with big numbers. So, let’s use it:

>>> from decimal import *
>>> Decimal(5).sqrt()

Hmm… not as much as I was expecting. Digging a little deeper, there’s a way to increase the precision, so things might get better:

>>> from decimal import *
>>> getcontext().prec = 256
>>> Decimal(5).sqrt()

Much better indeed! Let’s test the previous cases that way!

>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(18)
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(19)
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(20)
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(30)
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(40)
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(50)

Whoa, now that’s improvement! Since the small input ranges from 2 to 30, we’re covered, yay! To the large input! The first case is 910062006… much bigger! But we have faith in our solution!

>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(910062006)

Aaaaaand it fails miserably. You say increase the precision? Go!

>>> getcontext().prec = 1024
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(910062006)
>>> getcontext().prec = 1000000000
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(910062006)

It is still computing that last command… maybe it’ll finish before Christmas! 😀 So… nothing we can do about it. Not in a natural way at least. It turns out that Code Jam organizers knew about this (that’s the whole point, indeed) and my approach, even if it’s able to solve the small input, is not the correct one. From the contest analysis:

The key in solving the problem is a mathematical concept called conjugation. In our problem, we simply note that (3 – √5) is a nice conjugate for (3 + √5). Let us define

(1)     α := 3 + √5,   β := 3 – √5,   and Xn := αn + βn.

We first note that Xn is an integer. This can be proved by using the binomial expansion. If you write everything down you’ll notice that the irrational terms of the sums cancel each other out.


Another observation is that βn < 1, so Xn is actually the first integer greater than αn. Thus we may just focus on computing the last three digits of X.

Whoa, there’s a ton of math symbols there. What it is trying to explain is that there’s another way around, and we can write a simple algorithm to calculate only what we need. Taking that math apart, this suggested approach is quite interesting:

Solution C. [the periodicity of 3 digits]

For this problem, we have another approach based on the recurrence (7). Notice that we only need to focus on the last 3 digits of Xn, which only depends on the last 3 digits of the previous two terms. The numbers eventually become periodic as soon as we have (Xi, Xi+1) and (Xj, Xj+1) with the same last 3 digits, where i < j. It is clear that we will enter a cycle no later than 106 steps. In fact, for this problem, you can write some code and find out that the cycle has the size 100 and starts at the 3rd element in the sequence. So to solve the problem we can just brute force the results for the first 103 numbers and if n is bigger than 103 return the result computed for the number (n – 3) % 100 + 3.

Hmm… so that means that 103 restarts the cycle?

>>> getcontext().prec = 1024
>>> str((Decimal(3) + Decimal(5).sqrt()) ** Decimal(3))
>>> str((Decimal(3) + Decimal(5).sqrt()) ** Decimal(103))
>>> str((Decimal(3) + Decimal(5).sqrt()) ** Decimal(4))
>>> str((Decimal(3) + Decimal(5).sqrt()) ** Decimal(104))

Yay! To the git-pushed solution!

from decimal import *                            # Improved math

getcontext().prec = 1024                         # Adjust the precision
base = (Decimal(3) + Decimal(5).sqrt())          # Calculate the base

for case in range(1,int(raw_input())+1):         # For all test cases
  exp = (int(raw_input()) - 3) % 100 + 3         # Get the recurrent exponent

  num = str(base ** Decimal(exp))                # Calculate
  ans = num[:num.find('.')]                      # Get only the integer part

  print "Case #%d: %03d" % (case, int(ans[-3:])) # Zero-lead the last 3 digits

One thing that’s really interesting about this is that I actually solved this (of course, not from scratch) while posting! I was already OK with the idea of only solving the small input, but reading the analysis brought some new light into this problem. Blogging FTW! See ya!

CodeJam trouble, hard problems ahead!

Well, since we’re mostly finished with the company project, I can now put some of my attention here again. I’ts been a while! I’ve been trying to crack some hard (for me at least) CodeJam problems, from probabilities to geometry and I’ve not been able to move on much, must study a lot more. In the meantime, I’ve checked out some of the solutions looking for light and being completely outstanded by the work of some coders out there.

I know most of them are perhaps too familiar with the problems and/or can crack really hard math problems, but that’s SO above every coder I’ve ever known! It’s exciting! 😀

I’m committing some new problems at GitHub and will start talking about them tomorrow.

Difficult CodeJam problem and Free Software Documentary

Well, it’s been a tough weekend. Many new things were installed at home, lots of drilling and weightlifting (oh my arms…) but feels nicer after all. 🙂 More places for the kittens to hide.

That said, I’m evolving on Africa 2010 CodeJam D problem, but as soon as I get closer to solving, another challenge appears. This time, I’m having trouble finding an heuristic to determine the right choice to make, of all the results. Perhaps I’m still getting many results and not comparing correctly to the “solved group” list. I’ll for sure find out eventually.

Also, watched InProprietario, a brazilian documentary on free software (lots of Stallman) and, beside being very crude on effects, they used some nice movie references to cleverly show some software concepts. If you can understand portuguese and would like to know more about Free Software, I highly recommend it. 🙂

Code Jam 2010 Africa – Problem C

Whoa, that was a tough problem. Ironically it’s called Qualification Round:

You’ve just advanced from the Qualification Round of Google Code Jam Africa 2010, and you want to know how many of your fellow contestants advanced with you. To give yourself a challenge, you’ve decided only to look at how many people solved each problem.

The Qualification Round consisted of P problems; the ith problem was fully solved by Scontestants. Contestants had to solve C problems in order to advance to the next round. Your job is to figure out, using only that information, the maximum number of contestants who could have advanced.

Well, at first it seems pretty simple, but there is quite some math involved. The first case of the example input is easy to solve, but the second will most likely get you scratching your head thinking “where the #$%* that number came from” (at least I did). 🙂 Well the secret is that you should not look at the inputs as they are, but instead, find the maximum amount of solved problem combinations there are.

I studied a bit but still could not solve it, and surrendered by checking the contest analysis. It was quite complicated to understand, until they wrote about a clever solution from the winning candidate: it consisted in lowering the possible scores to max at the division of the scores by the amount of needed problems. Just do it (don’t ask why) until there are no scores over the result of that division. That same result will be the answer. Creepy! Python code time:

for case in range(1,int(raw_input())+1):          # For all test cases
    info = map(int, raw_input().split(" "))       # Get test case information
    problems = info[0]                            # Fetch number of problems
    needed = info[1]                              # Fetch number of needed problems to pass
    results = info[2:]                            # Fetch results for each problem

    qualified = 0                                 # Initialize our results

    if (problems == needed):                      # For this case, we can expect
        results.sort()                            # as many contestants as the
        qualified = results[len(results)-needed]  # least solved problem

    else:                                         # For this case, we use a solution
        out = False                               # similar to the winning one
        while not out:
            out = True
            qualified = sum(results)/needed       # Reduce the number of possible qualifiers
            for i in range(len(results)):         # And for all results
                if results[i] > qualified:        # Limit their values to the current qualifiers
                    results[i] = qualified
                    out = False                   # Until they all "match" the qualifiers number

    print "Case #%d: %s" % (case, str(qualified)) # Report results

Interestingly enough, this code is smaller than the last, and solves a more complicated problem (at least in my opinion). I don’t think I could have done a better job on this one. 🙂 That’s the beauty in learning!

CodeJam Quickie: 2010 Africa – Qualification A

I didn’t knew that this contest existed! 😀 Well, found it on the front page, looking to be solved, so, let’s go! Its summary:

You receive a credit C at a local store and would like to buy two items. You first walk through the store and create a list L of all available items. From this list you would like to buy two items that add up to the entire value of the credit. The solution you provide will consist of the two integers indicating the positions of the items in your list (smaller number first).

It’s quite simple, and by smaller number first, they mean the position of the items, not its value. From that, the solution is trivial (already validated for the inputs):


for case in range(1,int(raw_input())+1):      # For all test cases

    credit = int(raw_input())                 # Get the test's credit
    raw_input()                               # Discard the number of items
    strItems = raw_input().split(" ")         # Get the items as strings
    items = []
    for item in strItems:
        items.append(int(item))               # Convert the items to integers

    out = False
    i = 0
    j = i+1
    for i in range(0,len(items)-1):           # first item can be any of the items, but the last
        for j in range(i+1,len(items)):       # the second will be any of the items after the first
            if items[i] + items[j] == credit:
                out = True                    # Found, get out
        if out: break

    print "Case #%d: %d %d" % (case, i+1, j+1)# Reports results

Pretty easy! 😀 See you next post!

CodeJam Online Competition for Veterans 2013 C: Ocean View

Gotchas all around! 😀 In fact, this time I’ve struggled with the algorithm to solve the problem. My first attempt did work, but could only solve correctly the small input of the problem, entitled Ocean View. It’s not complicated as a problem, but this time you need to think a bit more to get there:

Ocean View is a small town on the edge of a small lake, populated by people with high self-esteem. There is only one street in this town, Awesome Boulevard, running away from the lake on the west towards the hill in the east. All of the houses in Ocean View are situated along one side of Awesome Boulevard and are numbered starting at #1 on the edge of the lake all the way up to #N at the foot of the hill.

Each resident of Ocean View wants to be able to see the lake. Unfortunately, some of the houses may be blocking the view for some of the higher numbered houses. House #A blocks the view for house #B whenever A is smaller than B, but house #A is as tall as or taller than house #B.

Tired of hearing complaints about obstructed views, the Supreme Ruler of Ocean View has decided to solve the problem using his favorite tool of governance — violence. He will order his guards to destroy some of the houses on Awesome Boulevard in order to ensure that every remaining house has a view of the lake. Of course, if he destroys too many houses, he might have a rebellion to deal with, so he would like to destroy as few houses as possible.

What is the smallest number of houses that need to be destroyed in order to ensure that every remaining house has an unobstructed view of the lake?

It needs a little bit of thinking to understand (not if you have previously worked on it), but the general concept is: you have to find the longest increasing subsequence in a list of numbers. Yes, I didn’t know how “classic” it was, it even has its own Wikipedia entry. My lack of knowledge led me to my first solution, coded in a way that it would just solve the problem, even if not optimized:

for case in range(1,int(raw_input())+1):                #For all test cases
        size = int(raw_input())                         #Get the case size
        heights = []
        paths = []

        for i in raw_input().split(" "):                #Split the case values by spaces
                heights.append(int(i))                  #Store its values as integers

        paths.append([heights[0]])                      #Starts first list with first value
        for i in range(1, size):                        #Iterate from second member on
                for path in paths:                      #For every path
                        if heights[i] > max(path):      #Is the current height bigger than my maximum?
                                path.append(heights[i]) #Append its value, so it will be crescent
                                path.append(0)          #Else, ignore it

                paths.append(heights[:i+1])             #Create a new path with the heights so far
                cur_num = paths[-1][-1]                 #Get the current height

                for j in reversed(range(0, i)):         #Iterate down to the first height
                        if cur_num > heights[j]:        #If the next value is smaller, keep it
                                cur_num = heights[j]
                                paths[-1][j] = 0        #If it's not smaller, ignore it

        min_houses = size
        for path in paths:
                min_houses = min(min_houses, path.count(0)) #Get the path with least ignored members

        print "Case #%d: %d" % (case, min_houses)

As I said, not a good example of resource usage, but “works”. The thing is, while brainstorming a bit with a friend (who introduced me to the Dynamic Programming concept), I’ve found that my first attempt had a little bug (didn’t took time to debug it though): it will ignore possibilities with subsequent “out-of-order” nodes:

list = [ 1, 2, 4, 3, 6, 5, 7, 8 ]
path[0] = [ 1, 2, 0, 3, 6, 0, 7, 8 ]
path[1] = [ 1, 2, 4, 0, 6, 0, 7, 8 ]
path[2] = [ 1, 2, 4, 0, 0, 5, 7, 8 ]

As you can see, there’s the not covered possibility [ 1, 2, 0, 3, 0, 5, 7, 8], enough to invalidate the solution. As it was already a bad solution, I decided to use the wikipedia algorithm to solve the problem:

 L = 0
 for i = 1, 2, ... n:
    binary search for the largest positive j ≤ L
      such that X[M[j]] < X[i] (or set j = 0 if no such value exists)
    P[i] = M[j]
    if j == L or X[i] < X[M[j+1]]:
       M[j+1] = i
       L = max(L, j+1)
for i in range(size):                                          #For all the sequence
                j = 0
                for k in range(length+1):                              #For all the sequence until the current length
                        if heights[pointers[k]] < heights[i]:          #Compare the values on pointers to the current value
                                j = max(j, k)                          #Get the maximum value index

                if j == length or heights[i] < heights[pointers[j+1]]: #If the value is a "keeper"
                        pointers[j+1] = i                              #Keeps its index
                        length = max(length, j+1)                      #Update the length

But, that algorithm has a minor bug: when “length == size”, the index keeping will fail, because j+1 exceeds the pointers list size. So, I’ve added a check for “j < size” before it. The bug only happens at the last iteration anyway, we don’t need to store its index anyway. Perhaps it can be “avoided” with a bigger than “size” pointer list.

The final, validated, solution becomes:

for case in range(1,int(raw_input())+1):                               #For all test cases
        size = int(raw_input())                                        #Get the case size
        heights = []
        pointers = [0] * size                                          #Create the pointer list
        length = 0                                                     #Sequence length

        for i in raw_input().split(" "):                               #Split the case values by spaces
                heights.append(int(i))                                 #Store its values as integers

        for i in range(size):                                          #For all the sequence
                j = 0
                for k in range(length+1):                              #For all the sequence until the current length
                        if heights[pointers[k]] < heights[i]:          #Compare the values on pointers to the current value
                                j = max(j, k)                          #Get the maximum value index

                if j == length or heights[i] < heights[pointers[j+1]]: #If the value is a "keeper"
                        if j < size-1:
                                pointers[j+1] = i                      #Keeps its index
                        length = max(length, j+1)                      #Update the length

        print "Case #%d: %d" % (case, size-length)

Until the next Jam!