Tag Archives: code language

Got regex mojo? Not so fast!

If you think you’re a regex person (not following what XKCD has told us), and that most problems can be solved with simple (yet always too simple) regular expressions, I’ve got a challenge for you: http://regex.alf.nu/.

I know a bit about regexes (I use them quite a lot with vi’s search, but that’s not ECMAScript), and I’m stuck at abba. 😀 Gotta study a bit more, or learn how to think again. I could match all the “should not match” words, so it would be a matter of inverting the match, but I couldn’t find a way to invert the search… thanks, ECMAScript! 🙂
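For what it’s worth, one common way to invert a match without any “invert” switch is an anchored negative lookahead, a syntax that ECMAScript and Python’s re module share. The Python 3 sketch below assumes the abba level wants words that contain no “xyyx” run; the word lists and the accepts helper are made up for illustration, and this is not claimed to be the site’s intended answer.

```python
import re

# Succeeds only when NO "xyyx" run (like "abba" or "noon") occurs in the string.
# (?!...) is a negative lookahead: it passes only if the inner pattern fails.
NO_ABBA = re.compile(r"^(?!.*(.)(.)\2\1)")

def accepts(word):
    """Hypothetical helper: True when the word has no 'xyyx' pattern."""
    return NO_ABBA.search(word) is not None

# Illustrative word lists (assumptions, not taken from the challenge site).
should_match = ["abab", "banana", "regex"]
should_not_match = ["abba", "noon", "teeth"]

for word in should_match + should_not_match:
    print(word, accepts(word))
```

The inner pattern is exactly what you would write to match the “should not match” words; wrapping it in ^(?!.* … ) flips the result.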

I hope you have better regex mojo, you’ll need it!


Code Jam 2008 Round 1A, Problem C: Numbers

Hello again! This one is one hell of a tricky problem. No wonder it’s called “Numbers”:

In this problem, you have to find the last three digits before the decimal point for the number (3 + √5)^n.

For example, when n = 5, (3 + √5)^5 = 3935.73982… The answer is 935.

For n = 2, (3 + √5)^2 = 27.4164079… The answer is 027.

Wow, that can’t be hard, eh? Indeed! It’s really just an expression, isn’t it? Of course it is! But once you try, you get to see something beautiful happening:

$ python
>>> import math
>>> 3+math.sqrt(5)
5.23606797749979
>>> (3+math.sqrt(5)) ** 18
8751751364607.995
>>> (3+math.sqrt(5)) ** 19
45824765067264.016
>>> (3+math.sqrt(5)) ** 20
239941584945152.1

If I was clear enough, you can see where I’m going. If not, the answers for n = 18, 19 and 20 should be 607, 263 and 151. But, it gets worse as the exponent grows:

>>> (3+math.sqrt(5)) ** 30
3.7167066517539914e+21
>>> (3+math.sqrt(5)) ** 40
5.757196418599164e+28
>>> (3+math.sqrt(5)) ** 50
8.917924847980422e+35

And those results should be 647, 551 and 247, respectively. As you can now clearly see, Python’s floats can’t handle numbers this big: a float only keeps about 15 to 17 significant digits, and these results have more digits than that before the decimal point. What a bummer! Well, the first thing to handle is the √5 value, since 2.23606797749979 doesn’t carry enough digits for the precision we need. Looking around a bit, I found that Python has a nice decimal module that can deal with big numbers. So, let’s use it:

>>> from decimal import *
>>> Decimal(5).sqrt()
Decimal('2.236067977499789696409173669')

Hmm… not as much as I was expecting. Digging a little deeper, there’s a way to increase the precision, so things might get better:

>>> from decimal import *
>>> getcontext().prec = 256
>>> Decimal(5).sqrt()
Decimal('2.236067977499789696409173668731276235440618359611525724270897245410520925637804899414414408378782274969508176150773783504253267724447073863586360121533452708866778173191879165811276645322639856580535761350417533785003423392414064442086432539097252592627229')

Much better indeed! Let’s test the previous cases that way!

>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(18)
Decimal('8751751364607.992147917157027451941369199107402191688475662016738729585793189576921000417819488022521008035769458236805623355117964654519012995149203145651482621970904733675616603829159073699152262190237686134096053140643178581342535160175294280596644277408')
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(19)
Decimal('45824765067263.99400154247292878012837438073861701581979206241047287979476762412992360313819778117994498296523948378426713929520188369997784294269662668000818303924272345525577601783001726960434941938734208353272624615875651599971618368131896320524727780674')
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(20)
Decimal('239941584945151.9954175862094628730047694880020933281648497263958823604254329864718576171579087349895858656483590697583803423507394435817910056755829474974431677475727217968321896916634673228294874675631017566599732643899663816729269614472126021090970897308')
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(30)
Decimal('3716706651753989275647.999689800241818180639346264792151927705378585007996556972979408800792479361060761238505878803342888805078901553987993893381697765526226152915257781651542284398399012963211473811148141385735168384084704178517042635758936907558116236798')
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(40)
Decimal('57571964185991590695825047551.99997900148385229515959986036977505867324065830938072692611318337057286921280055552228787014811837140299777965403896929514221392361243371590607709492852574169582705605198217499438837728596606637811005617061688138760057438182845')
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(50)
Decimal('891792484798041273405092327480885247.9999985785363502863489704179083057190576105613126358868072802757567654991706904766380400266097425885364586125135675324533868405035416842283484903727056986268545269819514444772213000354546446009165968589958054059389753136')

Whoa, now that’s improvement! Since the small input ranges from 2 to 30, we’re covered, yay! To the large input! The first case is 910062006… much bigger! But we have faith in our solution!

>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(910062006)
Decimal('8.919448541446674973277872746171245933275138630248130907008894180215290388336023868239234075453030387703632363584380715738200630546564329205017138690712399728915139680036686573669248915790961561461272596637583821147468503082573913305207639264660422490421272E+654339383')

Aaaaaand it fails miserably. You say increase the precision? Go!

>>> getcontext().prec = 1024
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(910062006)
Decimal('8.919448541446674973277872746171245933275138630248130907008894180215290388336023868239234075453030387703632363584380715738200630546564329205017138690712399728915139680036686573669248915790961561461272596637583821147468503082573913305207639264660422298653727910480913903028526252731111441440780088518055499910461897770689934875196042744680956819167861668586410285893670912335938531514972687567521690574174307478919726903450127130115796805326307689705722233029462717164687160362204672599178459426195515468743692514680427999432359495315725269307474956180961814633487158821252498531057196075527072557513060268997873150760403767129373225601570326857102899047033394123502152602095967110573983281744795256688834356089540911527223279951661861220253786982201657061127666645472375405207936633238083128248926879097462312848916323276895759197636621186062828933585137465868233553517455227951169710940268113754834340838971865855889844812381822071748466341030321954673834742153256457611174551211610695251648386140804719763671673855871318598E+654339383')
>>> getcontext().prec = 1000000000
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(910062006)

It is still computing that last command… maybe it’ll finish before Christmas! 😀 So… nothing we can do about it. Not in a natural way at least. It turns out that Code Jam organizers knew about this (that’s the whole point, indeed) and my approach, even if it’s able to solve the small input, is not the correct one. From the contest analysis:

The key in solving the problem is a mathematical concept called conjugation. In our problem, we simply note that (3 – √5) is a nice conjugate for (3 + √5). Let us define

(1)     α := 3 + √5,   β := 3 – √5,   and X_n := α^n + β^n.

We first note that X_n is an integer. This can be proved by using the binomial expansion. If you write everything down you’ll notice that the irrational terms of the sums cancel each other out.

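(2)     X_n = Σ_{i=0..n} C(n,i)·3^(n−i)·(√5)^i + Σ_{i=0..n} C(n,i)·3^(n−i)·(−√5)^i = 2·Σ_{even i} C(n,i)·3^(n−i)·5^(i/2)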

Another observation is that β^n < 1, so X_n is actually the first integer greater than α^n. Thus we may just focus on computing the last three digits of X_n.
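To make the conjugation argument a bit more concrete, here is a small Python 3 sketch (the function names are mine, not from the contest analysis). It tracks (3 + √5)^n exactly as a pair of integers (a, b) meaning a + b√5. The conjugate power is then a − b√5, so the sum of the two is 2a, an integer, and the integer part of (3 + √5)^n is 2a − 1. It loops n times, so it only suits small exponents.

```python
def power_pair(n):
    """Return integers (a, b) with (3 + sqrt(5))**n == a + b*sqrt(5)."""
    a, b = 1, 0                                # (3 + sqrt(5))**0 == 1
    for _ in range(n):
        # (a + b*sqrt5) * (3 + sqrt5) = (3a + 5b) + (a + 3b)*sqrt5
        a, b = 3 * a + 5 * b, a + 3 * b
    return a, b

def last_three_digits(n):
    """Last three digits of floor((3 + sqrt(5))**n), valid for n >= 1."""
    a, _ = power_pair(n)
    x_n = 2 * a                                # alpha^n + beta^n, sqrt(5) parts cancel
    return "%03d" % ((x_n - 1) % 1000)         # 0 < beta^n < 1, so floor is X_n - 1

for n in (2, 5, 18, 19, 20, 30):
    print(n, last_three_digits(n))
```

Since everything here is plain integer arithmetic, no precision setting is involved at all.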

Whoa, there’s a ton of math symbols there. What it is trying to explain is that there’s another way around, and we can write a simple algorithm to calculate only what we need. Taking that math apart, this suggested approach is quite interesting:

Solution C. [the periodicity of 3 digits]

For this problem, we have another approach based on the recurrence (7). Notice that we only need to focus on the last 3 digits of X_n, which only depends on the last 3 digits of the previous two terms. The numbers eventually become periodic as soon as we have (X_i, X_i+1) and (X_j, X_j+1) with the same last 3 digits, where i < j. It is clear that we will enter a cycle no later than 10^6 steps. In fact, for this problem, you can write some code and find out that the cycle has the size 100 and starts at the 3rd element in the sequence. So to solve the problem we can just brute force the results for the first 103 numbers and if n is bigger than 103 return the result computed for the number (n – 3) % 100 + 3.
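The “write some code and find out” part can be sketched like this in Python 3. Recurrence (7) is not reproduced in this post, so the sketch assumes it is X_n = 6·X_(n−1) − 4·X_(n−2) with X_0 = 2 and X_1 = 6, which follows from α + β = 6 and α·β = 4. The function names are mine.

```python
MOD = 1000

def find_cycle():
    """Return (start, length) of the cycle of (X_i, X_i+1) pairs modulo 1000."""
    seen = {}
    prev, cur = 2, 6                           # X_0 and X_1 (assumed recurrence)
    i = 0
    while (prev, cur) not in seen:
        seen[(prev, cur)] = i
        prev, cur = cur, (6 * cur - 4 * prev) % MOD
        i += 1
    start = seen[(prev, cur)]                  # index where the repeated pair first appeared
    return start, i - start

def last_three_digits(n):
    """Last three digits of floor((3 + sqrt(5))**n), valid for n >= 1."""
    start, length = find_cycle()
    if n > start:                              # fold big exponents back into the cycle
        n = (n - start) % length + start
    prev, cur = 2, 6
    for _ in range(n):
        prev, cur = cur, (6 * cur - 4 * prev) % MOD
    return "%03d" % ((prev - 1) % MOD)         # prev holds X_n mod 1000; floor is X_n - 1

print(find_cycle())
print(last_three_digits(3), last_three_digits(103))
```

Note that exponents below the cycle start are left untouched, because those terms are not part of the cycle.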

Hmm… so that means that 103 restarts the cycle?

>>> getcontext().prec = 1024
>>> str((Decimal(3) + Decimal(5).sqrt()) ** Decimal(3))
'143.5541752799932702850935573994008395340997875075688231766687118531366696204097567812612610681210327990242616368247610721361045671823063636347635238890704866837369015421401333059608526503244754105771443632133610811201095485572500621467658412511120829640713240415845568781018115768290851039517558845033513247514268365293660766111722141944646319577243016028721322545978616614879992775936489738916811506453866403999620857004610560179974673677933515377954018029702474909866234538867118832726784429359876544843489516123361404699859563411214642284786110581426213620665031024800019372553994881602971764760169511058165201173143835473121683945817118282695117541643928648522370457633421861064234721885791403592429823579671023036790457862064776462707851047085181514532031422824259553334558658255398529105918381922193776593783203964365329603283979620022727823878012615898529089431909156945831424556056508947493929464193144955007377329721849300605964711369201940159834308432696751549585124838734830577613026312919174891129200931587878779'
>>> str((Decimal(3) + Decimal(5).sqrt()) ** Decimal(103))
'114167750723954075118506909152243184891636055812752354180398291472449798143.9999999999990991848908257663970808178511998207346190405521007125450976083864218743136022578058573724307353453054670910157005516822203809922099339999229730921853127935894210481666862007195197029531387123303773842565764868785959347563566969625431650825652899007545905552208934231424236695732218626707231897771500030743392334437529104775115629703470534218197873009424685334262220812905849999596760216536020750239094398005209138812883762761775627540601688798157222970582315547073108170909501918275307282209081948400177828835529550843580664303363677674984334542177398232468766365174239306558484619915875621147584977655598495728316487075831729616962367129848658475956289730636910101894486413362652617806007673400475350422005999251984236600004538944250397180445073308202368906957039259759659759171424708223454172257556499138474886346739987815229848744135302653447948550170944394443810710713254841099059181243617896686076589736820366175425243225873342031095'
>>> str((Decimal(3) + Decimal(5).sqrt()) ** Decimal(4))
'751.6594202199646689967411763468544075540238844147363216775107372289675155071512231016216206076354221948773735933299956287145489777071084090825085004176200550896187330962356998562944764142034959055300079068701456758805751299255628262705206665683384355613744512183189236100345107783526967957467183936425944549449908917791719022086541245209393177780525834150786943366387737228119962073666571129313260408882798620998009499274205440944867036809150955734258594655937993276797731329052373871815618254139351860428319959647647374674262707908876871995127080552487621508491412880200101705908473128415601764990889933055367306159005136233888840715539870984149367093630625404742444902575464770587232289900404868860256573793272870943149903775840076429216217997197202951293164969827362655006432955840842277806071505091517327117361820812917980417240893005119321075359566233467277719517523073965614978919296671974343129687014011013788730981039708828181314734688310185839130119271657945635321905403357860532468388142825668178428304890836363588'
>>> str((Decimal(3) + Decimal(5).sqrt()) ** Decimal(104))
'597790103628874365020709731601266597564608080186061135989016966589686218751.9999999999993118384917497799741823573220669410860125985943311312283719331038685561426361821548229163694601416604360587022213664498323168042300721157774333881227779113388111518769679330555901306111571209984513448987672187738925735173356845673100783866660482880393505457702913561140292994651458434392515606071335149084501581993111115430324514143366418268949755088875955878162140123915031739484539675175716477662141640774955323553027076342238845088259277403081453735076960587935422087441774083950451065986208934485511159638170169172967026214058721899896878428807562496126156550849295831738165499903388396019937912975536800221981964584521002823218293972856046263138381804404409768650023404607326059948575617153491987096854714709481244777860640664620489694413768067522712403081007060606347733537888589375255218262079999570621589681646821848243321335219275048192774559691932753194312696801857829138795089010920052048921614192197989676705158844531382420174'

Yay! To the git-pushed solution!

from decimal import *                            # Improved math

getcontext().prec = 1024                         # Adjust the precision
base = (Decimal(3) + Decimal(5).sqrt())          # Calculate the base

for case in range(1,int(raw_input())+1):         # For all test cases
  n = int(raw_input())                           # Read the exponent
  exp = n if n < 3 else (n - 3) % 100 + 3        # Fold n >= 3 into the cycle (n = 2 is not in it)

  num = str(base ** Decimal(exp))                # Calculate
  ans = num[:num.find('.')]                      # Get only the integer part

  print "Case #%d: %03d" % (case, int(ans[-3:])) # Zero-lead the last 3 digits

One thing that’s really interesting about this is that I actually solved this (of course, not from scratch) while posting! I was already OK with the idea of only solving the small input, but reading the analysis brought some new light into this problem. Blogging FTW! See ya!

Floating point precision, bad numbers

Hello there! While still trying to find a good way to solve the 2008 Round 1A Problem C, Numbers, I came across something I still have to explore a bit: floating-point precision. I mean, is there a way (even at the expense of time, CPU and memory usage) to fully reproduce what calculations “by hand” can offer? It might sound silly, but outside of Code Jam challenges, it has never bothered me as a programmer. But since we need precise results to solve those kinds of problems, I’ve found myself in need of better tools, because the ones I’ve got are clearly not OK (or I don’t know how to use them correctly).

So, I’ll try to spend some time this week building an expensive but precise way to multiply floating-point numbers, so that I can solve the problems without needing some external calculator (I’ve done that before, it’s in my Code Jam GitHub repository). Wish me luck! 🙂
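As a partial answer, Python’s standard library already does exact arithmetic for integers and rationals: int has arbitrary precision and fractions.Fraction keeps exact numerators and denominators, at the cost of speed and memory. Irrational values such as √5 still can’t be stored exactly this way. A short Python 3 sketch (variable names are just for illustration):

```python
from fractions import Fraction

# Binary floats cannot represent 0.1 exactly, so rounding error creeps in.
float_sum_is_exact = (0.1 + 0.2 == 0.3)

# Fractions store exact numerator/denominator pairs, like working by hand.
fraction_sum_is_exact = (Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10))

# Python integers never overflow, so big powers keep every digit.
big = 3 ** 200
digits = len(str(big))

print(float_sum_is_exact, fraction_sum_is_exact, digits)
```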

Code Jam 2008 Round 1A, Problem A: Minimum Scalar Product

Hey there, since things are getting a little calmer, I can focus once more on solving some nice Code Jam problems. And here we go, from the first official tournament round: Problem A: Minimum Scalar Product. It says:

You are given two vectors v1=(x1,x2,…,xn) and v2=(y1,y2,…,yn). The scalar product of these vectors is a single number, calculated as x1y1+x2y2+…+xnyn.

Suppose you are allowed to permute the coordinates of each vector as you wish. Choose two permutations such that the scalar product of your two new vectors is the smallest possible, and output that minimum scalar product.

There’s a catch in the text that got me quite confused. In case you didn’t notice, it’s that damn TWO PERMUTATIONS part. Despite getting the sample results right, when the bigger test cases came, I wasn’t able to get the smallest results right! Guess why: I was only doing two value permutations, because to me, that’s what the text says!

Anyway, took a look into the results and saw that participants were simply not considering this limitation and outputting the minimum possible value anyway and to hell with it. If Google thinks this would be the correct approach, who am I to judge! 😀 Just sort both arrays, invert one of them, iterate calculating the MSP and you’re done! Python code, as usual:

for case in range(1,int(raw_input())+1): # For all test cases
  size = int(raw_input())                # Since we're informed, save array size
  x = map(int, raw_input().split(" "))   # Get all X parameters
  y = map(int, raw_input().split(" "))   # Get all Y parameters

  x.sort()                               # Sort both vectors
  y.sort()                               # And sort-reverse Y
  y.reverse()                            # So we can match the bigger/smaller values

  msp = 0
  for i in range(size):                  # For all the array's len
    msp += x[i]*y[i]                     # Sum "forward" on X and "backwards" on Y

  print "Case #%d: %d" % (case, msp)     # Print the sum
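If you don’t trust the sort-and-reverse trick, it can be sanity-checked against a brute force on tiny inputs. This is a Python 3 sketch (unlike the Python 2 above), with function names of my own; the brute force only permutes one vector, which is enough because reordering both vectors is the same as reordering one against a fixed other.

```python
from itertools import permutations

def min_scalar_product(x, y):
    """Greedy answer: pair the smallest x with the largest y, and so on."""
    return sum(a * b for a, b in zip(sorted(x), sorted(y, reverse=True)))

def brute_force(x, y):
    """Try every ordering of y against a fixed x (only viable for tiny inputs)."""
    return min(sum(a * b for a, b in zip(x, p)) for p in permutations(y))

x, y = [1, 3, -5], [-2, 4, 1]                  # example vectors for illustration
print(min_scalar_product(x, y), brute_force(x, y))
```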

And that’s it! Quick and easy, despite the text hiccup.  Githubbin’ it. See you!

CodeJam Quickie: 2010 Africa – Problem B

Howdy! This time, problems started getting more and more complex, and that’s good, because I like the challenge! The next problem is Get to Work:

You work for a company that has E employees working in town T. There are N towns in the area where the employees live. You want to ensure that everyone will be able to make it to work. Some of the employees are drivers and can drive P passengers. A capacity of P == 1 indicates that the driver can only transport themselves to work. You want to ensure that everyone will be able to make it to work and you would like to minimize the number of cars on the road.

You want to calculate the number of cars on the road, with these requirements:

  • Every employee can get to town T.
  • The only way an employee may travel between towns is in a car belonging to an employee.
  • Employees can only take rides from other employees that live in the same town.
  • The minimum number of cars is used.

Find whether it is possible for everyone to make it to work, and if it is, how many cars will end up driving to the office.

Whoa, that’s way more complex than the other problems, but still, not that hard! What we need to do is check who the drivers and passengers are for each town (since they can only help same-town buddies). If the total capacity of the cars from a town is less than the number of people who need a ride, the problem cannot be solved, so we report “IMPOSSIBLE”. If we can carry everyone, start filling the cars from the “minivan” (can carry more passengers) to the “mini cooper”. 😀 That way we can make sure there are fewer cars on the road.

Some rough, but commented, and validated (!) python code follows:

for case in range(1,int(raw_input())+1):                  # For all test cases
    town_len, office = map(int, raw_input().split(" "))   # Get amount of towns and office

    carriers = [[] for x in xrange(town_len)]             # Create new lists for every town
    capacities = town_len * [0]                           # Initialize "capacity of transportation"
    payload = town_len * [0]                              # Initialize "cargo" for each town

    for i in range(int(raw_input())):                     # Get and go through employee list
        town, capacity = map(int, raw_input().split(" ")) # Get the current employee info
        if town == office:                                # If employee is from office town
            continue                                      #   No need to commute, ignore

        town -= 1                                         # Adjust "town" to zero-based index
        payload[town] += 1                                # Employee needs to commute, count

        if capacity != 0:                                 # If employee has a car
            capacities[town] += capacity                  # Add to our capacities for the town
            carriers[town].append(capacity)               # Append its car capacity

    resultStr = ""                                        # Create our result string

    for i in range(town_len):                             # For all towns
        if capacities[i] < payload[i]:                    # If we can't carry everyone
            resultStr = "IMPOSSIBLE "                     #   declare it impossible, give up
            break

        cars = 0                                          # We can carry, start with 0 cars
        carriers[i].sort()                                # Sort and reverse the car list
        carriers[i].reverse()
        for car in carriers[i]:                           # So we can start from the bigger ones
            if (payload[i] <= 0):                         # If we're carrying everybody already
                break                                     #  we're done for this town

            payload[i] -= car                             # Else, fill a car, report as being used
            cars += 1

        resultStr += str(cars)+" "                        # Add this town filled vehicles

    print "Case #%d: %s" % (case, resultStr[:-1])         # Report results, remove last space
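The per-town logic is the heart of it, so here it is pulled out on its own as a Python 3 sketch. The function name and its None-for-impossible convention are my own choices for illustration, not part of the solution above.

```python
def cars_needed(commuters, capacities):
    """Fewest cars for one town, or None if not everyone fits.

    commuters  -- number of employees in this town who must travel
    capacities -- seats (driver included) per employee, 0 for non-drivers
    """
    if sum(capacities) < commuters:
        return None                            # this town makes the case IMPOSSIBLE
    cars = 0
    remaining = commuters
    for seats in sorted(capacities, reverse=True):   # biggest cars first
        if remaining <= 0:
            break
        remaining -= seats
        cars += 1
    return cars

print(cars_needed(5, [1, 4, 2]))               # fills the 4-seater, then the 2-seater
print(cars_needed(2, [1, 0]))                  # only one seat for two people
```

Taking the biggest cars first is what keeps the count minimal: any other choice of the same number of cars carries no more people.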

As with the last problem, the same can be said about this code: there are many ways to improve its performance, but as it is, it already solves the large inputs in an instant. Bit bangers aside, it solves the problem and everybody is happy. Also, I think it is quite easy to understand, even more so with the comments. 🙂 Bye!

CodeJam Quickie: 2010 Africa – Problem A

Hey there! So, another day, another problem! This time, it is called Odd Man Out, and it is quite simple to “get”:

You are hosting a party with G guests and notice that there is an odd number of guests! When planning the party you deliberately invited only couples and gave each couple a unique number C on their invitation. You would like to single out whoever came alone by asking all of the guests for their invitation numbers.

Well, just get the number that’s not duplicated! There are many ways to do this, be it counting the occurrences of every member of the list and finding the “odd man”, or the way I did, which is basically some push and pop into a new list. Python code follows:

for case in range(1,int(raw_input())+1):        # For all test cases
    raw_input()                                 # Discard the guest list size
    guests = raw_input().split(" ")             # Get the list of guest IDs

    guestList = []                              # Create an empty guest list

    for guest in guests:                        # For all guests
        if guestList.count(guest) > 0:          # If it's already on the final list
            guestList.remove(guest)             #    if is a couple, remove
        else:                                   # If it's not in the list yet
            guestList.append(guest)             #    Add its ID

    print "Case #%d: %s" % (case, guestList[0]) # Reports results
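For the curious, here are two of those other ways as a Python 3 sketch: the counting approach mentioned above, and an XOR trick that relies on every paired number cancelling itself out. Function names are mine, and the XOR version assumes the invitation codes are parsed as integers (the code above keeps them as strings).

```python
from collections import Counter
from functools import reduce
from operator import xor

def odd_one_by_count(codes):
    """Count every code and return the one seen an odd number of times."""
    counts = Counter(codes)
    return next(code for code, seen in counts.items() if seen % 2 == 1)

def odd_one_by_xor(codes):
    """XOR all integer codes together: pairs cancel, the loner remains."""
    return reduce(xor, codes)

guests = [3, 4, 7, 4, 3]                       # illustrative guest list
print(odd_one_by_count(guests), odd_one_by_xor(guests))
```

Both run in linear time, while the list count/remove version is quadratic in the worst case.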

Needless to say, I bet there are many ways to improve this algorithm, but for this problem it performs decently, so no worries (unless you’re a hardcore bit banger). 🙂 See you tomorrow!

CodeJam Quickie: 2010 Africa – Qualification C

Whoa! Keeping up with the Africa 2010 contest, enter the Qualification C problem: T9 Spelling! It was a very fun problem and says:

The Latin alphabet contains 26 characters and telephones only have ten digits on the keypad. We would like to make it easier to write a message to your friend using a sequence of keypresses to indicate the desired characters. The letters are mapped onto the digits as shown below. To insert the character B for instance, the program would press 22. In order to insert two characters in sequence from the same key, the user must pause before pressing the key a second time. The space character ' ' should be printed to indicate a pause. For example, 2 2 indicates AA whereas 22 indicates B.

Reminds me of the days of texting on those old indestructible Nokias! Anyway, it’s not hard to do, but it requires some data organization. I chose to create a map from each letter to a list containing its key and the number of keystrokes (including space, which is key 0), so I can simply look letters up and append the sequence right along. Also, to know when to pause, I check the last character printed so far: if it is the same key, I print a space before the new sequence.

Python code follows, as usual:

# Build a map with the number to write and its repetition count
t9 = {
        "a" : ["2", 1], "b" : ["2", 2], "c": ["2", 3],
        "d" : ["3", 1], "e" : ["3", 2], "f": ["3", 3],
        "g" : ["4", 1], "h" : ["4", 2], "i": ["4", 3],
        "j" : ["5", 1], "k" : ["5", 2], "l": ["5", 3],
        "m" : ["6", 1], "n" : ["6", 2], "o": ["6", 3],
        "p" : ["7", 1], "q" : ["7", 2], "r": ["7", 3], "s": ["7", 4],
        "t" : ["8", 1], "u" : ["8", 2], "v": ["8", 3],
        "w" : ["9", 1], "x" : ["9", 2], "y": ["9", 3], "z": ["9", 4],
        " " : ["0", 1],
}

for case in range(1,int(raw_input())+1): # For all test cases
    message = raw_input()                # Get the list of words

    attr = t9[message[0]]                # Start with the first character
    t9str = attr[0]*attr[1]              # Initialize t9str with first T9 "word"

    for i in range(1,len(message)):      # From the second character on
        attr = t9[message[i]]            # Get the map values
        if t9str[-1] == attr[0]:         # If it's the same number as the last
            t9str += " "                 # Add a space

        t9str = t9str+attr[0]*attr[1]    # Append the new T9 "word"

    print "Case #%d: %s" % (case, t9str) # Reports results
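The hand-typed map could also be generated from the keypad layout. Here is a Python 3 sketch of that idea, with names of my own (KEYPAD, PRESSES, t9), which stores the full keypress string per character instead of a key and a count.

```python
# Keypad layout: each key and the characters it cycles through.
KEYPAD = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
          "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz", "0": " "}

# Character -> keypress string, e.g. "b" -> "22", "s" -> "7777", " " -> "0".
PRESSES = {letter: key * (position + 1)
           for key, letters in KEYPAD.items()
           for position, letter in enumerate(letters)}

def t9(message):
    """Translate a lowercase message into keypresses, pausing on repeated keys."""
    out = []
    for char in message:
        presses = PRESSES[char]
        if out and out[-1][-1] == presses[0]:  # same key as the previous character
            out.append(" ")                    # a space stands for the pause
        out.append(presses)
    return "".join(out)

print(t9("hi"))
print(t9("hello world"))
```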

It was the hardest of the Qualification problems, but still very easy to develop. 🙂 Fun! See ya tomorrow!