# Got regex mojo? Not so fast!

If you think you’re a regex person (not following what XKCD has told us), and that most problems can be solved with simple (yet always too simple) regular expressions, I’ve got a challenge for you: http://regex.alf.nu/.

I know a bit of regexes (use them quite a lot with VI’s search, but it’s not ECMAscript), and I’m stuck at abba. 😀 Gotta study a bit more, or learn how to think again. I could match all the “should not match” words, would be a matter of inverting the match, but I couldn’t find a way to invert the search… thanks, ECMAscript! 🙂

I hope you have better regex mojo, you’ll need it!

# Code Jam 2008 Round 1A, Problem C: Numbers

Hello again! This one is one hell of a tricky problem. No wonder it’s called “Numbers”:

In this problem, you have to find the last three digits before the decimal point for the number (3 + √5)n.

For example, when n = 5, (3 + √5)5 = 3935.73982… The answer is 935.

For n = 2, (3 + √5)2 = 27.4164079… The answer is 027.

Wow, that can’t be hard, eh? Indeed! It’s really just an expression, isn’t it? Of course it is! But once you try, you get to see something beautiful happening:

\$ python
>>> import math
>>> 3+math.sqrt(5)
5.23606797749979
>>> (3+math.sqrt(5)) ** 18
8751751364607.995
>>> (3+math.sqrt(5)) ** 19
45824765067264.016
>>> (3+math.sqrt(5)) ** 20
239941584945152.1

If I was clear enough, you can see where I’m going. If not, the answers for n = 18, 19 and 20 should be 607, 263 and 151. But, it gets worse as the exponent grows:

>>> (3+math.sqrt(5)) ** 30
3.7167066517539914e+21
>>> (3+math.sqrt(5)) ** 40
5.757196418599164e+28
>>> (3+math.sqrt(5)) ** 50
8.917924847980422e+35

And those results should be 647, 551 and 247, respectively. As you can now clearly see, python can’t naturally handle those big floats, what a bummer! Well, the first thing to handle is the √5 value, since 2.23606797749979 does not give enough sampling to work with the needed precision. Looking a bit, I’ve found that python has a nice Decimal module that can deal with big numbers. So, let’s use it:

>>> from decimal import *
>>> Decimal(5).sqrt()
Decimal('2.236067977499789696409173669')

Hmm… not as much as I was expecting. Digging a little deeper, there’s a way to increase the precision, so things might get better:

>>> from decimal import *
>>> getcontext().prec = 256
>>> Decimal(5).sqrt()
Decimal('2.236067977499789696409173668731276235440618359611525724270897245410520925637804899414414408378782274969508176150773783504253267724447073863586360121533452708866778173191879165811276645322639856580535761350417533785003423392414064442086432539097252592627229')

Much better indeed! Let’s test the previous cases that way!

>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(18)
Decimal('8751751364607.992147917157027451941369199107402191688475662016738729585793189576921000417819488022521008035769458236805623355117964654519012995149203145651482621970904733675616603829159073699152262190237686134096053140643178581342535160175294280596644277408')
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(19)
Decimal('45824765067263.99400154247292878012837438073861701581979206241047287979476762412992360313819778117994498296523948378426713929520188369997784294269662668000818303924272345525577601783001726960434941938734208353272624615875651599971618368131896320524727780674')
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(20)
Decimal('239941584945151.9954175862094628730047694880020933281648497263958823604254329864718576171579087349895858656483590697583803423507394435817910056755829474974431677475727217968321896916634673228294874675631017566599732643899663816729269614472126021090970897308')
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(30)
Decimal('3716706651753989275647.999689800241818180639346264792151927705378585007996556972979408800792479361060761238505878803342888805078901553987993893381697765526226152915257781651542284398399012963211473811148141385735168384084704178517042635758936907558116236798')
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(40)
Decimal('57571964185991590695825047551.99997900148385229515959986036977505867324065830938072692611318337057286921280055552228787014811837140299777965403896929514221392361243371590607709492852574169582705605198217499438837728596606637811005617061688138760057438182845')
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(50)
Decimal('891792484798041273405092327480885247.9999985785363502863489704179083057190576105613126358868072802757567654991706904766380400266097425885364586125135675324533868405035416842283484903727056986268545269819514444772213000354546446009165968589958054059389753136')

Whoa, now that’s improvement! Since the small input ranges from 2 to 30, we’re covered, yay! To the large input! The first case is 910062006… much bigger! But we have faith in our solution!

>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(910062006)
Decimal('8.919448541446674973277872746171245933275138630248130907008894180215290388336023868239234075453030387703632363584380715738200630546564329205017138690712399728915139680036686573669248915790961561461272596637583821147468503082573913305207639264660422490421272E+654339383')

Aaaaaand it fails miserably. You say increase the precision? Go!

>>> getcontext().prec = 1024
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(910062006)
Decimal('8.919448541446674973277872746171245933275138630248130907008894180215290388336023868239234075453030387703632363584380715738200630546564329205017138690712399728915139680036686573669248915790961561461272596637583821147468503082573913305207639264660422298653727910480913903028526252731111441440780088518055499910461897770689934875196042744680956819167861668586410285893670912335938531514972687567521690574174307478919726903450127130115796805326307689705722233029462717164687160362204672599178459426195515468743692514680427999432359495315725269307474956180961814633487158821252498531057196075527072557513060268997873150760403767129373225601570326857102899047033394123502152602095967110573983281744795256688834356089540911527223279951661861220253786982201657061127666645472375405207936633238083128248926879097462312848916323276895759197636621186062828933585137465868233553517455227951169710940268113754834340838971865855889844812381822071748466341030321954673834742153256457611174551211610695251648386140804719763671673855871318598E+654339383')
>>> getcontext().prec = 1000000000
>>> (Decimal(3) + Decimal(5).sqrt()) ** Decimal(910062006)

It is still computing that last command… maybe it’ll finish before Christmas! 😀 So… nothing we can do about it. Not in a natural way at least. It turns out that Code Jam organizers knew about this (that’s the whole point, indeed) and my approach, even if it’s able to solve the small input, is not the correct one. From the contest analysis:

The key in solving the problem is a mathematical concept called conjugation. In our problem, we simply note that (3 – √5) is a nice conjugate for (3 + √5). Let us define

(1)     α := 3 + √5,   β := 3 – √5,   and Xn := αn + βn.

We first note that Xn is an integer. This can be proved by using the binomial expansion. If you write everything down you’ll notice that the irrational terms of the sums cancel each other out.

(2)

Another observation is that βn < 1, so Xn is actually the first integer greater than αn. Thus we may just focus on computing the last three digits of X.

Whoa, there’s a ton of math symbols there. What it is trying to explain is that there’s another way around, and we can write a simple algorithm to calculate only what we need. Taking that math apart, this suggested approach is quite interesting:

Solution C. [the periodicity of 3 digits]

For this problem, we have another approach based on the recurrence (7). Notice that we only need to focus on the last 3 digits of Xn, which only depends on the last 3 digits of the previous two terms. The numbers eventually become periodic as soon as we have (Xi, Xi+1) and (Xj, Xj+1) with the same last 3 digits, where i < j. It is clear that we will enter a cycle no later than 106 steps. In fact, for this problem, you can write some code and find out that the cycle has the size 100 and starts at the 3rd element in the sequence. So to solve the problem we can just brute force the results for the first 103 numbers and if n is bigger than 103 return the result computed for the number (n – 3) % 100 + 3.

Hmm… so that means that 103 restarts the cycle?

>>> getcontext().prec = 1024
>>> str((Decimal(3) + Decimal(5).sqrt()) ** Decimal(3))
'143.5541752799932702850935573994008395340997875075688231766687118531366696204097567812612610681210327990242616368247610721361045671823063636347635238890704866837369015421401333059608526503244754105771443632133610811201095485572500621467658412511120829640713240415845568781018115768290851039517558845033513247514268365293660766111722141944646319577243016028721322545978616614879992775936489738916811506453866403999620857004610560179974673677933515377954018029702474909866234538867118832726784429359876544843489516123361404699859563411214642284786110581426213620665031024800019372553994881602971764760169511058165201173143835473121683945817118282695117541643928648522370457633421861064234721885791403592429823579671023036790457862064776462707851047085181514532031422824259553334558658255398529105918381922193776593783203964365329603283979620022727823878012615898529089431909156945831424556056508947493929464193144955007377329721849300605964711369201940159834308432696751549585124838734830577613026312919174891129200931587878779'
>>> str((Decimal(3) + Decimal(5).sqrt()) ** Decimal(103))
'114167750723954075118506909152243184891636055812752354180398291472449798143.9999999999990991848908257663970808178511998207346190405521007125450976083864218743136022578058573724307353453054670910157005516822203809922099339999229730921853127935894210481666862007195197029531387123303773842565764868785959347563566969625431650825652899007545905552208934231424236695732218626707231897771500030743392334437529104775115629703470534218197873009424685334262220812905849999596760216536020750239094398005209138812883762761775627540601688798157222970582315547073108170909501918275307282209081948400177828835529550843580664303363677674984334542177398232468766365174239306558484619915875621147584977655598495728316487075831729616962367129848658475956289730636910101894486413362652617806007673400475350422005999251984236600004538944250397180445073308202368906957039259759659759171424708223454172257556499138474886346739987815229848744135302653447948550170944394443810710713254841099059181243617896686076589736820366175425243225873342031095'
>>> str((Decimal(3) + Decimal(5).sqrt()) ** Decimal(4))
'751.6594202199646689967411763468544075540238844147363216775107372289675155071512231016216206076354221948773735933299956287145489777071084090825085004176200550896187330962356998562944764142034959055300079068701456758805751299255628262705206665683384355613744512183189236100345107783526967957467183936425944549449908917791719022086541245209393177780525834150786943366387737228119962073666571129313260408882798620998009499274205440944867036809150955734258594655937993276797731329052373871815618254139351860428319959647647374674262707908876871995127080552487621508491412880200101705908473128415601764990889933055367306159005136233888840715539870984149367093630625404742444902575464770587232289900404868860256573793272870943149903775840076429216217997197202951293164969827362655006432955840842277806071505091517327117361820812917980417240893005119321075359566233467277719517523073965614978919296671974343129687014011013788730981039708828181314734688310185839130119271657945635321905403357860532468388142825668178428304890836363588'
>>> str((Decimal(3) + Decimal(5).sqrt()) ** Decimal(104))
'597790103628874365020709731601266597564608080186061135989016966589686218751.9999999999993118384917497799741823573220669410860125985943311312283719331038685561426361821548229163694601416604360587022213664498323168042300721157774333881227779113388111518769679330555901306111571209984513448987672187738925735173356845673100783866660482880393505457702913561140292994651458434392515606071335149084501581993111115430324514143366418268949755088875955878162140123915031739484539675175716477662141640774955323553027076342238845088259277403081453735076960587935422087441774083950451065986208934485511159638170169172967026214058721899896878428807562496126156550849295831738165499903388396019937912975536800221981964584521002823218293972856046263138381804404409768650023404607326059948575617153491987096854714709481244777860640664620489694413768067522712403081007060606347733537888589375255218262079999570621589681646821848243321335219275048192774559691932753194312696801857829138795089010920052048921614192197989676705158844531382420174'

Yay! To the git-pushed solution!

from decimal import *                            # Improved math

getcontext().prec = 1024                         # Adjust the precision
base = (Decimal(3) + Decimal(5).sqrt())          # Calculate the base

for case in range(1,int(raw_input())+1):         # For all test cases
exp = (int(raw_input()) - 3) % 100 + 3         # Get the recurrent exponent

num = str(base ** Decimal(exp))                # Calculate
ans = num[:num.find('.')]                      # Get only the integer part

print "Case #%d: %03d" % (case, int(ans[-3:])) # Zero-lead the last 3 digits

One thing that’s really interesting about this is that I actually solved this (of course, not from scratch) while posting! I was already OK with the idea of only solving the small input, but reading the analysis brought some new light into this problem. Blogging FTW! See ya!

# Floating point precision, bad numbers

Hello there! While still trying to find a good way to solve the 2008 Round-1A-C problem, Numbers, I came across something that I still have to adventure a bit on: Floating Point number precision. I mean, is there a way (even at expense of time, CPU and memory usage) to fully reproduce what calculations “by hand” can offer? It might sound silly, but beside Code Jam challenges, as a programmer it has never bothered me. But since we need precise results to solve those kind of problems, I’ve found myself in need of tools, because the ones I got are clearly not OK (or I don’t know how to use them correctly).

So, I’ll try to spend some time this week to try to make an expensive but precise way to multiply floating point numbers, so that I can solve the problems without the need to use some external calculator (I’ve done before, it’s at my Code Jam github repository). Wish me luck! 🙂

# Code Jam 2008 Round 1A, Problem A: Minimum Scalar Product

Hey there, since things are getting a little more calm, I can focus once more to solve some nice Code Jam problems. And here we go, from the first official tournament round: Problem A: Minimum Scalar Product. It says:

You are given two vectors v1=(x1,x2,…,xn) and v2=(y1,y2,…,yn). The scalar product of these vectors is a single number, calculated as x1y1+x2y2+…+xnyn.

Suppose you are allowed to permute the coordinates of each vector as you wish. Choose two permutations such that the scalar product of your two new vectors is the smallest possible, and output that minimum scalar product.

There’s a catch on the text, that made me quite confused. In case you didn’t notice, it’s that damn TWO PERMUTATIONS part. Despite getting the sample results right, when the bigger test cases came, I wasn’t able to get the smallest results right! Because, guess why, I was only doing two value permutations, because to me, the text says so!

Anyway, took a look into the results and saw that participants were simply not considering this limitation and outputting the minimum possible value anyway and to hell with it. If Google thinks this would be the correct approach, who am I to judge! 😀 Just sort both arrays, invert one of them, iterate calculating the MSP and you’re done! Python code, as usual:

for case in range(1,int(raw_input())+1): # For all test cases
size = int(raw_input())                # Since we're informed, save array size
x = map(int, raw_input().split(" "))   # Get all X parameters
y = map(int, raw_input().split(" "))   # Get all Y parameters

x.sort()                               # Sort both vectors
y.sort()                               # And sort-reverse Y
y.reverse()                            # So we can match the bigger/smaller values

msp = 0
for i in range(size):                  # For all the array's len
msp += x[i]*y[i]                     # Sum "forward" on X and "backwards" on Y

print "Case #%d: %d" % (case, msp)     # Print the sum

And that’s it! Quick and easy, despite the text hiccup.  Githubbin’ it. See you!

# CodeJam Quickie: 2010 Africa – Problem B

Howdy! This time, problems started getting more and more complex, and that’s good, because I like the challenge! The next problem is Get to Work:

You work for a company that has E employees working in town T. There are N towns in the area where the employees live. You want to ensure that everyone will be able to make it to work. Some of the employees are drivers and can drive P passengers. A capacity of P == 1 indicates that the driver can only transport themselves to work. You want to ensure that everyone will be able to make it to work and you would like to minimize the number of cars on the road.

You want to calculate the number of cars on the road, with these requirements:

• Every employee can get to town T.
• The only way an employee may travel between towns is in a car belonging to an employee.
• Employees can only take rides from other employees that live in the same town.
• The minimum number of cars is used.

Find whether it is possible for everyone to make it to work, and if it is, how many cars will end up driving to the office.

Whoa, that’s way more complex than the other problems, but still, not that hard! What we need to do is: check who are drivers and passengers for each town (since they can only help same town buddies), if the total capacity of cars from a town is inferior to the number of passengers, the problem cannot be solved and so, reported as “IMPOSSIBLE”. If we can carry everyone, start filling the cars from the “minivan” (can carry more passengers) to the “mini cooper”. 😀 That way we can make sure there are fewer cars on the road.

Some rough, but commented, and validated (!) python code follows:

for case in range(1,int(raw_input())+1):                  # For all test cases
town_len, office = map(int, raw_input().split(" "))   # Get amount of towns and office

carriers = [[] for x in xrange(town_len)]             # Create new lists for every town
capacities = town_len * [0]                           # Initialize "capacity of transportation"
payload = town_len * [0]                              # Initialize "cargo" for each town

for i in range(int(raw_input())):                     # Get and go through employee list
town, capacity = map(int, raw_input().split(" ")) # Get the current employee info
if town == office:                                # If employee is from office town
continue                                      #   No need to commute, ignore

town -= 1                                         # Adjust "town" to zero-based index
payload[town] += 1                                # Employee needs to commute, count

if capacity != 0:                                 # If employee has a car
capacities[town] += capacity                  # Add to our capacities for the town
carriers[town].append(capacity)               # Append its car capacity

resultStr = ""                                        # Create our result string

for i in range(town_len):                             # For all towns
if capacities[i] < payload[i]:                    # If we can't carry everyone
resultStr = "IMPOSSIBLE "                     #   declare it impossible, give up
break;

carriers[i].sort()                                # Sort and reverse the car list
carriers[i].reverse()
for car in carriers[i]:                           # So we can start from the bigger ones
break                                     #  we're done for this town

payload[i] -= car                             # Else, fill a car, report as being used
cars += 1

resultStr += str(cars)+" "                        # Add this town filled vehicles

print "Case #%d: %s" % (case, resultStr[:-1])         # Report results, remove last space

As the last problem, the same can be told about this code. There are many ways to improve its performance, but this way it can already solve the large inputs in an instant. Bitbangers apart, it solves the problem and everybody gets happy. Also, I think it is quite clear to understand, even more with the comments. 🙂 Bye!

# CodeJam Quickie: 2010 Africa – Problem A

Hey there! So, another day, another problem! This time, it is called Odd Man Out, and it is quite simple to “get”:

You are hosting a party with G guests and notice that there is an odd number of guests! When planning the party you deliberately invited only couples and gave each couple a unique number C on their invitation. You would like to single out whoever came alone by asking all of the guests for their invitation numbers.

Well, just get the number that’s not duplicated! There are many ways to do this, be it counting the occurrences of every member of the list and finding the “odd man”, or the way I did, which is basically some push and pop into a new list. Python code follows:

for case in range(1,int(raw_input())+1):        # For all test cases
raw_input()                                 # Discard the guest list size
guests = raw_input().split(" ")             # Get the list of guest IDs

guestList = []                              # Create an empty guest list

for guest in guests:                        # For all guests
if guestList.count(guest) > 0:          # If it's already on the final list
guestList.remove(guest)             #    if is a couple, remove
else:                                   # If it's not in the list yet

print "Case #%d: %s" % (case, guestList[0]) # Reports results

Needless to say, I bet there are many ways to improve this algorithm, but for this problem it performs decently, so no worries (unless you’re a hardcore bit banger). 🙂 See you tomorrow!

# CodeJam Quickie: 2010 Africa – Qualification C

Whoa! Keeping up with the Africa 2010 contest, enter the Qualification C problem: T9 Spelling! It was a very fun problem and says:

The Latin alphabet contains 26 characters and telephones only have ten digits on the keypad. We would like to make it easier to write a message to your friend using a sequence of keypresses to indicate the desired characters. The letters are mapped onto the digits as shown below. To insert the character B for instance, the program would press 22. In order to insert two characters in sequence from the same key, the user must pause before pressing the key a second time. The space character ' ' should be printed to indicate a pause. For example, 2 2 indicates AA whereas 22 indicates B.

Reminds me of those days of texting through that old indestructible Nokias! Anyway, it’s not hard to do, but requires some data disposition. I chose to create a map to lists containing keys and the amount of keystrokes for every letter (including space, which would be 0), in order to simply match letters and print the sequence right along. Also, to know when to pause, I would check the last printed character before, so I can print a space and then the sequence.

Python code followsas usual:

# Build a map with the number to write and its repetition count
t9 = {
"a" : ["2", 1], "b" : ["2", 2], "c": ["2", 3],
"d" : ["3", 1], "e" : ["3", 2], "f": ["3", 3],
"g" : ["4", 1], "h" : ["4", 2], "i": ["4", 3],
"j" : ["5", 1], "k" : ["5", 2], "l": ["5", 3],
"m" : ["6", 1], "n" : ["6", 2], "o": ["6", 3],
"p" : ["7", 1], "q" : ["7", 2], "r": ["7", 3], "s": ["7", 4],
"t" : ["8", 1], "u" : ["8", 2], "v": ["8", 3],
"w" : ["9", 1], "x" : ["9", 2], "y": ["9", 3], "z": ["9", 4],
" " : ["0", 1],
}

for case in range(1,int(raw_input())+1): # For all test cases
message = raw_input()                # Get the list of words

t9str = attr[0]*attr[1]              # Initialize t9str with first T9 "word"

for i in range(1,len(message)):      # From the second character on
attr = t9[message[i]]            # Get the map values
if t9str[-1] == attr[0]:         # If it's the same number as the last
t9str += " "                 # Add a space

t9str = t9str+attr[0]*attr[1]    # Append the new T9 "word"

print "Case #%d: %s" % (case, t9str) # Reports results

It was the hardest of the Qualification problems, but still very easy to develop. 🙂 Fun! See ya tomorrow!

# CodeJam Real Quickie: 2010 Africa – Qualification B

Hello there! Well, I can’t say anything about this problem other than it’s just super easy. Allow me to show you:

Given a list of space separated words, reverse the order of the words.

If that first sentence didn’t kill it for you, I don’t know what will. Anyway, the python code (validated, of course):

for case in range(1,int(raw_input())+1):           # For all test cases
words = raw_input().split(" ")                 # Get the list of words
words.reverse()                                # Reverse word list
print "Case #%d: %s" % (case, " ".join(words)) # Reports results, add spaces

Can’t get easier than that. And I actually had to learn that reverse() does not return the reversed list… in a perfect world, it could be solved within the print line, with:

" ".join(raw_input().split(" ").reverse())

But, maybe next time… 😀 See ya, warned you it was real quick!

# CodeJam Quickie: 2010 Africa – Qualification A

I didn’t knew that this contest existed! 😀 Well, found it on the front page, looking to be solved, so, let’s go! Its summary:

You receive a credit C at a local store and would like to buy two items. You first walk through the store and create a list L of all available items. From this list you would like to buy two items that add up to the entire value of the credit. The solution you provide will consist of the two integers indicating the positions of the items in your list (smaller number first).

It’s quite simple, and by smaller number first, they mean the position of the items, not its value. From that, the solution is trivial (already validated for the inputs):

for case in range(1,int(raw_input())+1):      # For all test cases

credit = int(raw_input())                 # Get the test's credit
raw_input()                               # Discard the number of items
strItems = raw_input().split(" ")         # Get the items as strings
items = []
for item in strItems:
items.append(int(item))               # Convert the items to integers

out = False
i = 0
j = i+1
for i in range(0,len(items)-1):           # first item can be any of the items, but the last
for j in range(i+1,len(items)):       # the second will be any of the items after the first
if items[i] + items[j] == credit:
out = True                    # Found, get out
break
if out: break

print "Case #%d: %d %d" % (case, i+1, j+1)# Reports results

Pretty easy! 😀 See you next post!