Tag Archives: open source software

JenPop and Genetic Algorithms on the loose

Hey there! I’ve committed a new class on JenPop, and it’s called “Individual”. It was made to represent the rules of typical genetic algorithms:

I’ve read a bit more on the subject and figured that genetic algorithms, although being able to achieve good solutions with little computing, are only suitable for problems where brute-forcing isn’t feasible. For instance, they do a nice job on combinatorial problems such as the classic knapsack problem, where you have a list of items with value v and weight w, and you must figure what’s the maximum value you can carry in a knapsack of capacity wc by combining different amounts of those items.

And that problem was chosen to serve as the first example. At this kind of problem, a naive approach can easily become unfeasible as the input list grows. With an evolutionary approach there’s a bigger probability that it will result in a better solution, given the same time limits. And even if the solution isn’t better, it will be a decent solution. The more time you let it roll, bigger is the chance of getting a great result.

For the classic scenario with a knapsack with wc = 15, and a (v,w) item list of (4, 12), (2, 2), (2, 1), (1, 1), (10, 4), I was able to achieve the following results:

30 iterations: Best result was { 0, 0, 4, 0, 2 } (v 28, w 12)
100 iterations: Best result was { 0, 3, 0, 9, 0 } (v 15, w 15)
500 iterations: Best result was { 0, 7, 1, 0, 0 } (v 16, w 15)
1000 iterations: Best result was { 0, 3, 4, 4, 0 } (v 18, w 14)
3000 iterations: Best result was { 0, 1, 1, 5, 1 } (v 19, w 12)
5000 iterations: Best result was { 0, 0, 5, 2, 2 } (v 32, w 15)

You can see that while none of those runs delivered the optimal solution, which would be { 0, 0, 3, 0, 3 } (v 36, w 15), they got close enough to provide good solutions. Also, it becomes clear that randomism takes a nice part in this, giving us the chance of achieving a great solution with only 30 iterations. These results also show that the more you let it run, bigger is the chance of getting better results.

Anyway, see you! 😀

Advertisement

New project at work = more free time = more hobby development

So, at last, I’ve joined the new project team, reading documentation and “first steps” to develop the new application. They seemed pretty organized so far, and I was able to solve all the hickups I’ve had when starting to install the development environment. Since I started by the end of tthe day, I didn’t have much time to do it and might finish tomorrow.

Since forever, my HabitRPG (they started running ads now?) is telling me that I must do four tasks (and I plan to kill them tomorrow):
– Put Carlin’s Explosion on GitHub;
– Establish a JenPop goal;
– Commit new code into it;
– Put Clonix (my attempt at a 2d physics game engine) on GitHub.

I’ll try to create some new tasks (and honor them, FFS) to better document and maintain those projects. But let’s see how things go. Anyway, tomorrow’s a big day so, see you then!

Code Jam 2008 Round 1A, Problem C: Numbers

Hello again! This one is one hell of a tricky problem. No wonder it’s called “Numbers”:

In this problem, you have to find the last three digits before the decimal point for the number (3 + √5)n.

For example, when n = 5, (3 + √5)5 = 3935.73982… The answer is 935.

For n = 2, (3 + √5)2 = 27.4164079… The answer is 027.

Wow, that can’t be hard, eh? Indeed! It’s really just an expression, isn’t it? Of course it is! But once you try, you get to see something beautiful happening:

$ python
>>> import math
>>> 3+math.sqrt(5)
5.23606797749979
>>> (3+math.sqrt(5)) ** 18
8751751364607.995
>>> (3+math.sqrt(5)) ** 19
45824765067264.016
>>> (3+math.sqrt(5)) ** 20
239941584945152.1

If I was clear enough, you can see where I’m going. If not, the answers for n = 18, 19 and 20 should be 607, 263 and 151. But, it gets worse as the exponent grows:

>>> (3+math.sqrt(5)) ** 30
3.7167066517539914e+21
>>> (3+math.sqrt(5)) ** 40
5.757196418599164e+28
>>> (3+math.sqrt(5)) ** 50
8.917924847980422e+35

And those results should be 647, 551 and 247, respectively. As you can now clearly see, python can’t naturally handle those big floats, what a bummer! Well, the first thing to handle is the √5 value, since 2.23606797749979 does not give enough sampling to work with the needed precision. Looking a bit, I’ve found that python has a nice Decimal module that can deal with big numbers. So, let’s use it:

>>> from decimal import *
>>> Decimal(5).sqrt()
Decimal('2.236067977499789696409173669')

Hmm… not as much as I was expecting. Digging a little deeper, there’s a way to increase the precision, so things might get better:

>>> from decimal import *
>>> getcontext().prec = 256
>>> Decimal(5).sqrt()
Decimal('2.236067977499789696409173668731276235440618359611525724270897245410520925637804899414414408378782274969508176150773783504253267724447073863586360121533452708866778173191879165811276645322639856580535761350417533785003423392414064442086432539097252592627229')

Much better indeed! Let’s test the previous cases that way!

>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(18)
Decimal('8751751364607.992147917157027451941369199107402191688475662016738729585793189576921000417819488022521008035769458236805623355117964654519012995149203145651482621970904733675616603829159073699152262190237686134096053140643178581342535160175294280596644277408')
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(19)
Decimal('45824765067263.99400154247292878012837438073861701581979206241047287979476762412992360313819778117994498296523948378426713929520188369997784294269662668000818303924272345525577601783001726960434941938734208353272624615875651599971618368131896320524727780674')
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(20)
Decimal('239941584945151.9954175862094628730047694880020933281648497263958823604254329864718576171579087349895858656483590697583803423507394435817910056755829474974431677475727217968321896916634673228294874675631017566599732643899663816729269614472126021090970897308')
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(30)
Decimal('3716706651753989275647.999689800241818180639346264792151927705378585007996556972979408800792479361060761238505878803342888805078901553987993893381697765526226152915257781651542284398399012963211473811148141385735168384084704178517042635758936907558116236798')
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(40)
Decimal('57571964185991590695825047551.99997900148385229515959986036977505867324065830938072692611318337057286921280055552228787014811837140299777965403896929514221392361243371590607709492852574169582705605198217499438837728596606637811005617061688138760057438182845')
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(50)
Decimal('891792484798041273405092327480885247.9999985785363502863489704179083057190576105613126358868072802757567654991706904766380400266097425885364586125135675324533868405035416842283484903727056986268545269819514444772213000354546446009165968589958054059389753136')

Whoa, now that’s improvement! Since the small input ranges from 2 to 30, we’re covered, yay! To the large input! The first case is 910062006… much bigger! But we have faith in our solution!

>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(910062006)
Decimal('8.919448541446674973277872746171245933275138630248130907008894180215290388336023868239234075453030387703632363584380715738200630546564329205017138690712399728915139680036686573669248915790961561461272596637583821147468503082573913305207639264660422490421272E+654339383')

Aaaaaand it fails miserably. You say increase the precision? Go!

>>> getcontext().prec = 1024
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(910062006)
Decimal('8.919448541446674973277872746171245933275138630248130907008894180215290388336023868239234075453030387703632363584380715738200630546564329205017138690712399728915139680036686573669248915790961561461272596637583821147468503082573913305207639264660422298653727910480913903028526252731111441440780088518055499910461897770689934875196042744680956819167861668586410285893670912335938531514972687567521690574174307478919726903450127130115796805326307689705722233029462717164687160362204672599178459426195515468743692514680427999432359495315725269307474956180961814633487158821252498531057196075527072557513060268997873150760403767129373225601570326857102899047033394123502152602095967110573983281744795256688834356089540911527223279951661861220253786982201657061127666645472375405207936633238083128248926879097462312848916323276895759197636621186062828933585137465868233553517455227951169710940268113754834340838971865855889844812381822071748466341030321954673834742153256457611174551211610695251648386140804719763671673855871318598E+654339383')
>>> getcontext().prec = 1000000000
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(910062006)

It is still computing that last command… maybe it’ll finish before Christmas! 😀 So… nothing we can do about it. Not in a natural way at least. It turns out that Code Jam organizers knew about this (that’s the whole point, indeed) and my approach, even if it’s able to solve the small input, is not the correct one. From the contest analysis:

The key in solving the problem is a mathematical concept called conjugation. In our problem, we simply note that (3 – √5) is a nice conjugate for (3 + √5). Let us define

(1)     α := 3 + √5,   β := 3 – √5,   and Xn := αn + βn.

We first note that Xn is an integer. This can be proved by using the binomial expansion. If you write everything down you’ll notice that the irrational terms of the sums cancel each other out.

(2)     

Another observation is that βn < 1, so Xn is actually the first integer greater than αn. Thus we may just focus on computing the last three digits of X.

Whoa, there’s a ton of math symbols there. What it is trying to explain is that there’s another way around, and we can write a simple algorithm to calculate only what we need. Taking that math apart, this suggested approach is quite interesting:

Solution C. [the periodicity of 3 digits]

For this problem, we have another approach based on the recurrence (7). Notice that we only need to focus on the last 3 digits of Xn, which only depends on the last 3 digits of the previous two terms. The numbers eventually become periodic as soon as we have (Xi, Xi+1) and (Xj, Xj+1) with the same last 3 digits, where i < j. It is clear that we will enter a cycle no later than 106 steps. In fact, for this problem, you can write some code and find out that the cycle has the size 100 and starts at the 3rd element in the sequence. So to solve the problem we can just brute force the results for the first 103 numbers and if n is bigger than 103 return the result computed for the number (n – 3) % 100 + 3.

Hmm… so that means that 103 restarts the cycle?

>>> getcontext().prec = 1024
>>> str((Decimal(3) + Decimal(5).sqrt()) ** Decimal(3))
'143.5541752799932702850935573994008395340997875075688231766687118531366696204097567812612610681210327990242616368247610721361045671823063636347635238890704866837369015421401333059608526503244754105771443632133610811201095485572500621467658412511120829640713240415845568781018115768290851039517558845033513247514268365293660766111722141944646319577243016028721322545978616614879992775936489738916811506453866403999620857004610560179974673677933515377954018029702474909866234538867118832726784429359876544843489516123361404699859563411214642284786110581426213620665031024800019372553994881602971764760169511058165201173143835473121683945817118282695117541643928648522370457633421861064234721885791403592429823579671023036790457862064776462707851047085181514532031422824259553334558658255398529105918381922193776593783203964365329603283979620022727823878012615898529089431909156945831424556056508947493929464193144955007377329721849300605964711369201940159834308432696751549585124838734830577613026312919174891129200931587878779'
>>> str((Decimal(3) + Decimal(5).sqrt()) ** Decimal(103))
'114167750723954075118506909152243184891636055812752354180398291472449798143.9999999999990991848908257663970808178511998207346190405521007125450976083864218743136022578058573724307353453054670910157005516822203809922099339999229730921853127935894210481666862007195197029531387123303773842565764868785959347563566969625431650825652899007545905552208934231424236695732218626707231897771500030743392334437529104775115629703470534218197873009424685334262220812905849999596760216536020750239094398005209138812883762761775627540601688798157222970582315547073108170909501918275307282209081948400177828835529550843580664303363677674984334542177398232468766365174239306558484619915875621147584977655598495728316487075831729616962367129848658475956289730636910101894486413362652617806007673400475350422005999251984236600004538944250397180445073308202368906957039259759659759171424708223454172257556499138474886346739987815229848744135302653447948550170944394443810710713254841099059181243617896686076589736820366175425243225873342031095'
>>> str((Decimal(3) + Decimal(5).sqrt()) ** Decimal(4))
'751.6594202199646689967411763468544075540238844147363216775107372289675155071512231016216206076354221948773735933299956287145489777071084090825085004176200550896187330962356998562944764142034959055300079068701456758805751299255628262705206665683384355613744512183189236100345107783526967957467183936425944549449908917791719022086541245209393177780525834150786943366387737228119962073666571129313260408882798620998009499274205440944867036809150955734258594655937993276797731329052373871815618254139351860428319959647647374674262707908876871995127080552487621508491412880200101705908473128415601764990889933055367306159005136233888840715539870984149367093630625404742444902575464770587232289900404868860256573793272870943149903775840076429216217997197202951293164969827362655006432955840842277806071505091517327117361820812917980417240893005119321075359566233467277719517523073965614978919296671974343129687014011013788730981039708828181314734688310185839130119271657945635321905403357860532468388142825668178428304890836363588'
>>> str((Decimal(3) + Decimal(5).sqrt()) ** Decimal(104))
'597790103628874365020709731601266597564608080186061135989016966589686218751.9999999999993118384917497799741823573220669410860125985943311312283719331038685561426361821548229163694601416604360587022213664498323168042300721157774333881227779113388111518769679330555901306111571209984513448987672187738925735173356845673100783866660482880393505457702913561140292994651458434392515606071335149084501581993111115430324514143366418268949755088875955878162140123915031739484539675175716477662141640774955323553027076342238845088259277403081453735076960587935422087441774083950451065986208934485511159638170169172967026214058721899896878428807562496126156550849295831738165499903388396019937912975536800221981964584521002823218293972856046263138381804404409768650023404607326059948575617153491987096854714709481244777860640664620489694413768067522712403081007060606347733537888589375255218262079999570621589681646821848243321335219275048192774559691932753194312696801857829138795089010920052048921614192197989676705158844531382420174'

Yay! To the git-pushed solution!

from decimal import *                            # Improved math

getcontext().prec = 1024                         # Adjust the precision
base = (Decimal(3) + Decimal(5).sqrt())          # Calculate the base

for case in range(1,int(raw_input())+1):         # For all test cases
  exp = (int(raw_input()) - 3) % 100 + 3         # Get the recurrent exponent

  num = str(base ** Decimal(exp))                # Calculate
  ans = num[:num.find('.')]                      # Get only the integer part

  print "Case #%d: %03d" % (case, int(ans[-3:])) # Zero-lead the last 3 digits

One thing that’s really interesting about this is that I actually solved this (of course, not from scratch) while posting! I was already OK with the idea of only solving the small input, but reading the analysis brought some new light into this problem. Blogging FTW! See ya!

Floating point precision, bad numbers

Hello there! While still trying to find a good way to solve the 2008 Round-1A-C problem, Numbers, I came across something that I still have to adventure a bit on: Floating Point number precision. I mean, is there a way (even at expense of time, CPU and memory usage) to fully reproduce what calculations “by hand” can offer? It might sound silly, but beside Code Jam challenges, as a programmer it has never bothered me. But since we need precise results to solve those kind of problems, I’ve found myself in need of tools, because the ones I got are clearly not OK (or I don’t know how to use them correctly).

So, I’ll try to spend some time this week to try to make an expensive but precise way to multiply floating point numbers, so that I can solve the problems without the need to use some external calculator (I’ve done before, it’s at my Code Jam github repository). Wish me luck! 🙂

Code Jam 2008 Round 1A, Problem B: Milkshakes

Another day, another problem. Milkshakes it is! And it’s pretty straightforward:

You own a milkshake shop. There are N different flavors that you can prepare, and each flavor can be prepared “malted” or “unmalted. So, you can make 2N different types of milkshakes.

Each of your customers has a set of milkshake types that they like, and they will be satisfied if you have at least one of those types prepared. At most one of the types a customer likes will be a “malted” flavor.

You want to make N batches of milkshakes, so that:

  • There is exactly one batch for each flavor of milkshake, and it is either malted or unmalted.
  • For each customer, you make at least one milkshake type that they like.
  • The minimum possible number of batches are malted.

Find whether it is possible to satisfy all your customers given these constraints, and if it is, what milkshake types you should make.

If it is possible to satisfy all your customers, there will be only one answer which minimizes the number of malted batches.

It is a classic Satisfiability problem, when you have a finite set of resources, receive several requests such as “want one of these” and have to decide which resource to hand to each request, in a way that every request gets attended. The “malted” can symbolize a cost-increasing modification of a resource, that we want to keep at minimum. In a case that two or more requests share a same need for a resource, they agree on sharing it, but only if the resource in on the same wanted state.

Being an NP-complete problem, there are conflicts of interest that we won’t be able to solve, and those we can inform as being “IMPOSSIBLE”. Python code as usual:

for case in range(1,int(raw_input())+1): # For all test cases
  shakes = int(raw_input()) * [0] # Get shake size and create unmalted list
  customers = [] # Start empty customer list

  for i in range(int(raw_input())): # For all customers
    custList = map(int, raw_input().split(" ")) # Get the preferences
    customer = []
    for j in range(custList.pop(0)): # First value is number of shakes
      flavor = custList.pop(0)-1 # First pair element is flavor index
      malted = custList.pop(0) # Second element is malted preference
      customer.append(flavor, malted) # Add preference to customer

    customers.append(customer) # When done, add customer to list

  impossible = False
  solved = False
  while not impossible and not solved: # While not finished
    redo = False
    for customer in customers: # Examine all customers
      unsatisfied = []
      for flavor, malt in customer: # Examine all their preferences
        if shakes[flavor] == malt: # If satisfied, move to next customer
          unsatisfied = []
          break
        else: # If unsatisfied, take note of it
          unsatisfied.append([flavor, malt])

      for flavor, malt in unsatisfied: # Check unsatisfied flavors
        if malt == 1 and shakes[flavor] == 0: # Look for a possible malted preference
          shakes[flavor] = 1 # Attend the malted preference
          redo = True # Restart checking customers
          break

      if redo:
        break

      if len(unsatisfied) > 0: # If we've reached here, all insatisfactions are unmalted
        impossible = True # Then we can't solve it
        break

    if not redo: # If we don't need to look into customers again
      solved = True # Problem was solved (might still be impossible)

  result = "IMPOSSIBLE" if impossible else " ".join(map(str, shakes)) # Decide result
  print "Case #%d: %s" % (case, result) # Print the result

Validated, committed and pushed. 🙂 Another day, another problem. Cheers!

Code Jam 2008 Round 1A, Problem A: Minimum Scalar Product

Hey there, since things are getting a little more calm, I can focus once more to solve some nice Code Jam problems. And here we go, from the first official tournament round: Problem A: Minimum Scalar Product. It says:

You are given two vectors v1=(x1,x2,…,xn) and v2=(y1,y2,…,yn). The scalar product of these vectors is a single number, calculated as x1y1+x2y2+…+xnyn.

Suppose you are allowed to permute the coordinates of each vector as you wish. Choose two permutations such that the scalar product of your two new vectors is the smallest possible, and output that minimum scalar product.

There’s a catch on the text, that made me quite confused. In case you didn’t notice, it’s that damn TWO PERMUTATIONS part. Despite getting the sample results right, when the bigger test cases came, I wasn’t able to get the smallest results right! Because, guess why, I was only doing two value permutations, because to me, the text says so!

Anyway, took a look into the results and saw that participants were simply not considering this limitation and outputting the minimum possible value anyway and to hell with it. If Google thinks this would be the correct approach, who am I to judge! 😀 Just sort both arrays, invert one of them, iterate calculating the MSP and you’re done! Python code, as usual:

for case in range(1,int(raw_input())+1): # For all test cases
  size = int(raw_input())                # Since we're informed, save array size
  x = map(int, raw_input().split(" "))   # Get all X parameters
  y = map(int, raw_input().split(" "))   # Get all Y parameters

  x.sort()                               # Sort both vectors
  y.sort()                               # And sort-reverse Y
  y.reverse()                            # So we can match the bigger/smaller values

  msp = 0
  for i in range(size):                  # For all the array's len
    msp += x[i]*y[i]                     # Sum "forward" on X and "backwards" on Y

  print "Case #%d: %d" % (case, msp)     # Print the sum

And that’s it! Quick and easy, despite the text hiccup.  Githubbin’ it. See you!

CodeJam trouble, hard problems ahead!

Well, since we’re mostly finished with the company project, I can now put some of my attention here again. I’ts been a while! I’ve been trying to crack some hard (for me at least) CodeJam problems, from probabilities to geometry and I’ve not been able to move on much, must study a lot more. In the meantime, I’ve checked out some of the solutions looking for light and being completely outstanded by the work of some coders out there.

I know most of them are perhaps too familiar with the problems and/or can crack really hard math problems, but that’s SO above every coder I’ve ever known! It’s exciting! 😀

I’m committing some new problems at GitHub and will start talking about them tomorrow.

Difficult CodeJam problem and Free Software Documentary

Well, it’s been a tough weekend. Many new things were installed at home, lots of drilling and weightlifting (oh my arms…) but feels nicer after all. 🙂 More places for the kittens to hide.

That said, I’m evolving on Africa 2010 CodeJam D problem, but as soon as I get closer to solving, another challenge appears. This time, I’m having trouble finding an heuristic to determine the right choice to make, of all the results. Perhaps I’m still getting many results and not comparing correctly to the “solved group” list. I’ll for sure find out eventually.

Also, watched InProprietario, a brazilian documentary on free software (lots of Stallman) and, beside being very crude on effects, they used some nice movie references to cleverly show some software concepts. If you can understand portuguese and would like to know more about Free Software, I highly recommend it. 🙂

Project hosting, tough choice

Howdy 🙂

it’s 10PM once again, and I’ve decided to restart posting. Really it’s been really hard to focus energy into posting content. Let’s see how it goes. I was planning a deep analysis into many different services and their offerings, guided by this wikipedia entry, but, as a colleague once said: “who DOESN’T use GitHub nowadays”?

I love Free and Open Source Software and advocate its use, but I don’t really care if the host I’m using holds proprietary software. I work in a closed-source business company (in fact, we use plenty Open Source software to produce our proprietary products), and I don’t think every software MUST be free, but COULD, with a different business model.

I’d like to have a defect control system with it (such as Bugzilla) and I’d love to have my projects featuring a nice front page introduction, perhaps even some examples of its use. A wiki system would also be nice, but I could live without it. I don’t need a build system, mailing lists or forums for my projects. There are also lots of alternatives for team management, so I don’t need it (especially on single-man projects such as mine).

So, not only GitHub matches my needs, but also Sourceforge, Google Code (without web-page hosting), Launchpad (without web and wiki), JavaForge, GNU Savannah and some other less known choices. But, it’s friends’ good experience with it that guides my choice this time, and I can always change if there’s a better solution. 🙂

To GitHub!